Com Sci 230

Homeworks for Spring 1998

Nachos Project #2
Due 29 April 1998
Thread Synchronization

Last modified: Tue Apr 21 21:35:57 CDT

Project #2 is due on Wednesday, 29 April 1998, at 7:30 AM. Use the submit-project Project_2 command to hand in your work, following the revised procedures from Project #1. Submit as often as you like up until the deadline. I will evaluate only your last submission, which will wipe out all the previous ones in my directory. Please submit only work that is complete in terms of organization, internal documentation, and explanatory writing. You may, however, submit work that implements only a part of the project step, or that has errors in functionality, as long as you explain the incompleteness and/or errors.

Basic goals

In this project, you must take the thread system defined in the initial version of Nachos, complete the synchronization primitives, and use these primitives to solve several problems in concurrent programming. Since we are not yet using the MIPS instruction interpreter, and are not assigning user address spaces, there are no user programs involved in this assignment. But, to test the synchronization primitives, you will need to write functions that behave like user programs, at least as far as their concurrent behavior is concerned.

Testing your concurrent programs

In order to test your concurrent programs, you need to run them with a lot of different interleaved orderings. To support such testing, nachos with the option -rs provides pseudorandom timer interrupts to jumble up the order of execution of concurrent threads. But, because of the two-level structure of Nachos, the way in which nachos provides these interrupts is somewhat artificial looking, and requires some programming effort from you in order to let it happen.

nachos -rs starts a simulated timer, and sets interrupts for pseudorandomly chosen intervals. Whenever a timer interrupt is triggered, the initial Nachos code calls TimerInterruptHandler in system.cc, which Yields the CPU. But, "time" in Nachos is simulated. The clock ticks once whenever there is a call to Interrupt::OneTick in interrupt.cc. In simulated MIPS code, OneTick is called after every simulated machine instruction. But, in non-MIPS code, it may never get called unless the code explicitly causes it. In the initial Nachos code, OneTick is called from the OS only when interrupts are re-enabled, after having been off. The call interrupt->Enable(), which sets the interrupt state to IntOn, will only cause a clock tick when the previous interrupt state was IntOff.

In order to test your concurrent programming thoroughly, you may need to add some clock ticks to provide more opportunities for timer interrupts. OneTick appears to assume, without checking, that interrupts are enabled. Make sure that you only tick the clock in portions of your code where interrupts are guaranteed to be enabled. Insert interrupt->OneTick() wherever there is a chance that the interleaving of another thread could cause trouble.
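
As a rough illustration of ticking the clock only where interrupts are enabled, here is a minimal sketch. MockInterrupt and TickIfEnabled are hypothetical names invented for this example; the mock only imitates the interrupt level and a tick counter, and it is not the Nachos Interrupt class.

```cpp
#include <cassert>

enum IntStatus { IntOff, IntOn };

// Hypothetical mock of the simulated interrupt controller (illustration only).
class MockInterrupt {
public:
    IntStatus getLevel() const { return level_; }
    IntStatus SetLevel(IntStatus now) {   // returns the previous level
        IntStatus old = level_;
        level_ = now;
        return old;
    }
    void OneTick() { ++ticks_; }          // the mock only counts ticks
    int Ticks() const { return ticks_; }
private:
    IntStatus level_ = IntOn;
    int ticks_ = 0;
};

// Hypothetical helper: advance the simulated clock only when interrupts
// are enabled, and report whether a tick actually happened.
inline bool TickIfEnabled(MockInterrupt* interrupt) {
    if (interrupt->getLevel() != IntOn) return false;
    interrupt->OneTick();
    return true;
}
```

In real project code the same idea could be expressed as an assertion on the interrupt level just before the call to OneTick, so that a misplaced tick is caught during testing.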

When you are ready to do a concurrent test, and have compiled your code, give the command nachos -rs seed, where seed is any integer. seed provides a starting value for the pseudorandom number generator. Every time you call the same version of nachos with the same seed, you will see the same results. This is a huge advantage for debugging. In order to test thoroughly, you should run the same program with several different seeds. You may also decide to create specific patterns of interleaving using explicit calls to Yield.
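
The following toy model, written in standard C++ rather than Nachos, illustrates why a fixed seed helps debugging. Two pretend threads A and B each do a fixed number of work units, and a seeded pseudorandom draw after each unit decides whether a "timer interrupt" switches threads. SimulateInterleaving and the 1-in-4 interrupt chance are assumptions made up for this sketch.

```cpp
#include <cassert>
#include <random>
#include <string>

// Toy model (not Nachos code): returns the order in which threads A and B
// ran their work units, for a given seed.
std::string SimulateInterleaving(unsigned seed, int steps) {
    std::mt19937 rng(seed);               // same seed, same sequence of draws
    int remaining[2] = {steps, steps};
    int current = 0;
    std::string trace;
    while (remaining[0] > 0 || remaining[1] > 0) {
        if (remaining[current] == 0) current = 1 - current;  // finished: run the other
        trace += (current == 0 ? 'A' : 'B');
        --remaining[current];
        if (rng() % 4 == 0) current = 1 - current;  // pretend timer interrupt: yield
    }
    return trace;
}
```

Calling SimulateInterleaving twice with the same seed returns the same trace, which is the property that makes a failing run repeatable. Different seeds may or may not produce different traces, which is why several seeds are worth trying.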

General requirements

These requirements hold for Project #2 and for all subsequent project steps.

In all of the following tasks, you must write well structured and documented C++ code, defining the key concepts in terms of classes. Use the formatting conventions in the initial Nachos code for indentation, placement of comments, etc. Use the conventions described in the C++ Introduction for your variable names.

It is your responsibility to provide good demonstration inputs that make it clear to me how your code works. You should put as much effort into the demonstration as into the code itself. I will not be impressed by large numbers of redundant tests, but only by systematically convincing tests that exercise the key features of your code.

Provide a general description of each solution, approximately 1/2 page to 1 page, as well as good internal documentation.

Special requirements for this assignment

Do not use busy-waiting loops that hold the CPU. But, you should use loops that Sleep on every iteration when there is good reason to expect that they will not usually waste a lot of CPU time. Loops that Yield are sometimes defensible, but the reasoning is more delicate. There are good solutions both with and without Yielding loops.

All synchronization commands that you implement must follow the Mesa rules, where threads do not release the CPU after they signal other threads.

Your solutions to synchronization problems must not make assumptions about the scheduler. In particular, they must not depend on the fact that the initial Nachos scheduler uses a FIFO queue.

When you create a queue as part of the implementation of synchronization commands, do not let the correctness of the synchronization depend on the queueing order. This is important because, with the Mesa interpretation of synchronization commands, the order in which threads get service is not always the same as the order in which they are taken off of the queue as a data structure.

I will give maximum credit only for solutions that do not normally stop a thread when the logical conditions for continued computing hold. If you write very short sequences of code holding a lock, then you can assume that Yielding the CPU while holding the lock is a highly infrequent, abnormal, behavior. But, you should never let one thread wait on another that is Sleeping, unless there is no logically correct alternative.

Please discuss these requirements in class, outside of class in study groups, and online. I do not require you to understand them at first reading, but I do require you to understand them as a result of your project work.

The project itself

Use semaphores to implement bounded buffers. A bounded buffer is a queue with a fixed maximum length. When the queue is not empty, a Read operation reads and removes the item at the head of the queue. When the queue is not full, a Write operation adds an item to the tail of the queue. When the conditions for a Read or Write operation are not satisfied, the thread executing the operation must wait until they are. Any number of threads may Read, and any number of threads may Write, to the same buffer.
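
To make the bounded buffer description concrete, here is one possible shape for it, sketched in standard C++ rather than Nachos. The Semaphore class is a stand-in built on std::mutex and std::condition_variable, and the names emptySlots_, fullSlots_, and mutex_ are assumptions of this sketch. It shows the usual three-semaphore structure, not a required design.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>

// Stand-in counting semaphore with P/V operations (not the Nachos class).
class Semaphore {
public:
    explicit Semaphore(int initial) : value_(initial) {}
    void P() {                                  // wait until value > 0, then decrement
        std::unique_lock<std::mutex> guard(m_);
        while (value_ == 0) cv_.wait(guard);    // re-test after every wakeup
        --value_;
    }
    void V() {                                  // increment and wake one waiter
        std::lock_guard<std::mutex> guard(m_);
        ++value_;
        cv_.notify_one();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int value_;
};

class BoundedBuffer {
public:
    explicit BoundedBuffer(int capacity)
        : emptySlots_(capacity), fullSlots_(0), mutex_(1) {}

    void Write(int item) {
        emptySlots_.P();        // wait while the queue is full
        mutex_.P();             // short critical section
        items_.push_back(item);
        mutex_.V();
        fullSlots_.V();         // one more item available to Read
    }
    int Read() {
        fullSlots_.P();         // wait while the queue is empty
        mutex_.P();
        int item = items_.front();
        items_.pop_front();
        mutex_.V();
        emptySlots_.V();        // one more slot available to Write
        return item;
    }
private:
    Semaphore emptySlots_;      // counts free slots
    Semaphore fullSlots_;       // counts stored items
    Semaphore mutex_;           // protects items_
    std::deque<int> items_;
};
```

With a capacity of 2, two Writes followed by a Read return the first item written. A third Write before any Read would block until some thread Reads.
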
Implement Locks and ConditionVariables. I made a slight change to the interface for ConditionVariables, to better reflect the fact that each is associated permanently with a single lock. synch.h-alt1, synch.cc-alt1, and synchlist.cc-alt1 in the code/threads directory incorporate this change.

The amount of code that you need to write is very small, and it can all be created by appropriate modifications to synch.h and synch.cc. Although in principle condition variables may be implemented using semaphores as building blocks, it is better to program them from scratch. Look at the semaphore code for ideas, particularly regarding the Mesa rules.

Locks are rather simple. l->Acquire() waits until the Lock l is FREE, then sets it to BUSY. If a thread t holding the Lock l executes l->Release(), it sets the state of l to FREE, and wakes up one thread queued up by Acquire, if there is one. l->Release() has no effect if the thread executing it does not hold the Lock l. In the spirit of Nachos, the thread made ready by Release does not necessarily get the lock yet. It must try again, and if it misses, it goes back on the queue.
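
The Lock behavior just described can be sketched roughly as follows. This is standard C++, not Nachos: the internal std::mutex plays the role that disabling interrupts plays in the Nachos semaphore code, the condition variable plays the role of the queue of sleeping threads, and IsHeldByCurrentThread is a helper added only so the example can be checked.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

class Lock {
public:
    void Acquire() {
        std::unique_lock<std::mutex> guard(m_);  // stands in for disabling interrupts
        while (busy_) freed_.wait(guard);        // a woken thread re-tests and may wait again
        busy_ = true;
        holder_ = std::this_thread::get_id();
    }
    void Release() {
        std::lock_guard<std::mutex> guard(m_);
        if (!busy_ || holder_ != std::this_thread::get_id())
            return;                              // caller does not hold the Lock: no effect
        busy_ = false;
        freed_.notify_one();                     // make one waiter ready; it must still compete
    }
    bool IsHeldByCurrentThread() {               // helper for the example only
        std::lock_guard<std::mutex> guard(m_);
        return busy_ && holder_ == std::this_thread::get_id();
    }
private:
    std::mutex m_;
    std::condition_variable freed_;
    bool busy_ = false;                          // false = FREE, true = BUSY
    std::thread::id holder_;
};
```

The while loop in Acquire is what lets the woken thread miss the lock and go back to waiting: being made ready by Release does not transfer ownership.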

ConditionVariables are described in the text, in conjunction with monitors, but Tom Anderson's notes from Berkeley are better. A monitor is a class, with a Lock implicitly associated. In order to use members of the monitor class, a thread must hold the associated Lock. A ConditionVariable c associated with a Lock l is essentially a queue. A thread t1 holding l may execute c->Wait(), which releases l, blocks t1, and adds t1 to the queue for c. A thread t2 that holds l may execute c->Signal(), which releases one other thread from the c queue, if there is one, or t2 may execute c->Broadcast(), which releases all of the threads on the c queue. For our purposes, Signal and Broadcast never block (assuming that l is held), and do not cause t2 to relinquish the lock. The final action of Wait is to Acquire l again, and t1 may block again in order to accomplish this.
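
The Wait, Signal, and Broadcast behavior described here might look roughly like this in standard C++. This is a sketch under stated assumptions, not the Nachos implementation: the Lock is a minimal stand-in without a holder check, the ticket counters are one of several ways to represent the queue, and DemoWaitAndSignal is a made-up usage example.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Minimal Lock stand-in: FREE/BUSY state guarded by a standard mutex.
class Lock {
public:
    void Acquire() {
        std::unique_lock<std::mutex> guard(m_);
        while (busy_) freed_.wait(guard);      // re-test after every wakeup
        busy_ = true;
    }
    void Release() {
        std::lock_guard<std::mutex> guard(m_);
        busy_ = false;
        freed_.notify_one();
    }
private:
    std::mutex m_;
    std::condition_variable freed_;
    bool busy_ = false;
};

class ConditionVariable {
public:
    explicit ConditionVariable(Lock* lock) : lock_(lock) {}

    void Wait() {                              // caller must hold the associated Lock
        std::unique_lock<std::mutex> guard(m_);
        unsigned long ticket = nextTicket_++;  // join the queue before giving up the Lock
        lock_->Release();
        while (ticket >= released_) wakeup_.wait(guard);
        guard.unlock();
        lock_->Acquire();                      // may block again: Mesa rules
    }
    void Signal() {                            // never blocks on the queue, keeps the Lock
        std::lock_guard<std::mutex> guard(m_);
        if (released_ < nextTicket_) {         // at least one thread is waiting
            ++released_;
            wakeup_.notify_all();              // only the released ticket passes its test
        }
    }
    void Broadcast() {
        std::lock_guard<std::mutex> guard(m_);
        released_ = nextTicket_;               // release every current waiter
        wakeup_.notify_all();
    }
private:
    Lock* lock_;
    std::mutex m_;
    std::condition_variable wakeup_;
    unsigned long nextTicket_ = 0;             // tickets handed out so far
    unsigned long released_ = 0;               // tickets released so far
};

// Usage: the waiter re-tests its condition in a while loop after Wait returns.
bool DemoWaitAndSignal() {
    Lock l;
    ConditionVariable c(&l);
    bool ready = false;
    bool sawReady = false;
    std::thread waiter([&] {
        l.Acquire();
        while (!ready) c.Wait();
        sawReady = ready;
        l.Release();
    });
    l.Acquire();
    ready = true;
    c.Signal();                                // does not give up l
    l.Release();
    waiter.join();
    return sawReady;
}
```

The demo works whichever thread runs first. If the signal arrives before the waiter has started, Signal finds no one waiting and has no effect, and the waiter sees ready already set and never calls Wait.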

A single Lock may have many ConditionVariables, but each ConditionVariable is permanently associated with precisely one Lock. This is why I changed the ConditionVariable interface to take the Lock as a parameter to its constructor, rather than to each Wait, Signal, and Broadcast.

With ConditionVariables, we still need Acquire and Release for the beginning and end of our interaction with a given Lock. ConditionVariables are an extension of the Lock functionality, not a replacement for it.
Implement message passing with Send and Receive, using condition variables and locks. A port identifies a channel over which messages may be sent. Send(port,message) sends the given message to the given port, and waits for it to be Received. Receive(port) waits for someone to Send(port,message) a message, then returns the value of the message. There is no constraint on the order of Sends and Receives.
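
One simple way to picture the Send and Receive handshake is the single-slot sketch here, in standard C++. std::mutex and std::condition_variable stand in for the Lock and ConditionVariables of this project. The Port class, its member names, and DemoSendReceive are assumptions of the sketch, and a Port object plays the role of the port argument. Send always waits for an acknowledgement, so the sketch makes no attempt to minimize how often threads sleep.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

class Port {
public:
    // Deposit a message, then wait until some Receive has taken it.
    void Send(int message) {
        std::unique_lock<std::mutex> guard(lock_);
        while (full_) slotFree_.wait(guard);        // another message is still in the slot
        slot_ = message;
        full_ = true;
        unsigned long mine = ++sent_;               // sequence number of this message
        messageReady_.notify_one();
        while (received_ < mine) messageTaken_.wait(guard);
    }
    // Wait for a message, take it, and let its sender continue.
    int Receive() {
        std::unique_lock<std::mutex> guard(lock_);
        while (!full_) messageReady_.wait(guard);
        int message = slot_;
        full_ = false;
        ++received_;
        messageTaken_.notify_all();                 // senders re-test their sequence number
        slotFree_.notify_one();                     // one waiting sender may use the slot
        return message;
    }
private:
    std::mutex lock_;
    std::condition_variable slotFree_, messageReady_, messageTaken_;
    int slot_ = 0;
    bool full_ = false;
    unsigned long sent_ = 0;                        // messages deposited so far
    unsigned long received_ = 0;                    // messages taken so far
};

// Usage: one thread receives two messages while another sends them.
int DemoSendReceive() {
    Port port;
    int sum = 0;
    std::thread receiver([&] {
        int first = port.Receive();
        int second = port.Receive();
        sum = first + second;
    });
    port.Send(40);                                  // returns only after being Received
    port.Send(2);
    receiver.join();
    return sum;
}
```

Every wait sits inside a while loop that re-tests its condition, so the sketch does not depend on which waiting thread is woken first.
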
Com Sci 330 only. The implementation of message passing above includes a handshake in which only one thread needs to Sleep. That's pretty good, but we can do better. If both threads come along sort of at the same time, we should be able to do a handshake with no Sleeping at all. A key issue is the interpretation of "sort of at the same time."
1. Implement a nonblocking handshake, in which two threads can synchronize with no Sleeping on a queue. To do this, the threads cannot synchronize on a point in their code; rather they must synchronize on an interval. Implement Reach(party) and Drop(party), corresponding roughly to reaching out for a hand to shake, and dropping that hand after shaking. party can be 0 or 1. Whenever a thread executes Reach(0), followed somewhat later by Drop(0), it must shake hands with another thread that executes Reach(1) followed somewhat later by Drop(1). "Shake hands" means that the intervals between Reach and Drop must overlap. Whenever the luck of the scheduler causes those intervals to overlap, neither thread should Sleep on a queue. While the interval between Acquire and Release of a lock should be as short as possible, the interval between Reach and Drop should be as long as possible, for best performance. Discuss very briefly how long the interval must be to get a benefit from nonblocking handshake, and how this differs on a uniprocessor vs. a multiprocessor.
2. Extend your nonblocking handshake to implement a nonblocking message exchange, where two threads exchange messages synchronously. The message to send must be provided in a SendExchange command, corresponding to Reach. The message received must not be used until the thread executes a ReceiveExchange command, corresponding to Drop. To accomplish this thing perfectly, each thread in an exchange must write into the other's data, at a location given by a pointer passed to the crucial operation. Although we haven't started implementing user address spaces, your description should distinguish data that should be in a user address space from data that should be in the kernel's address space.

Big Hint

For all of this project assignment, even the Com Sci 330 portion, good solutions to the problems above are small, simple, and elegant, involving tens of lines of code at most. If you are developing voluminous or complicated code, then you are on the wrong track.

Hot Tips

The secret to successful software work is to never let your code get out of control. It's much better to have code that you understand, accomplishing something less than the requirements of the problem, than to have code that takes a wild shot at a solution. Never add or change more than 6-12 lines of code without making sure you understand the results. This includes the DEBUG statements, which you should always write along with the function that contains them.

Recompile all the time. I spend a lot more time fixing little syntactic and type-checking errors than I do working out the right computation steps. The C++ compiler messages are somewhat obtuse, but if you have changed only a few lines, you have a fighting chance to discover a problem.

As soon as you have implemented one feature, and compiled it successfully, test it immediately, and write a description. By no means should you ever continue developing untested software. Look for a short simple test that reveals the essential workings of your program. From the beginning, you should spend as much time testing as coding the solutions to the problems.

When you are done with a feature, go back and test it thoroughly, work out the tests that will demonstrate your work most effectively to an independent critic, such as me, and polish your description of that feature.