Com Sci 231

Homeworks for the Current Quarter

Nachos Project #4: Due 30 January 1996

Project #4 is due on Tuesday, 30 January. For the advanced quarter, I am willing to be more flexible about project contents and collection dates, as long as we negotiate in a way that builds good experience and produces a good product. If flexibility turns into sloppiness, I'll go back to the rigid collection times of 230/330.

Basic goals

The initial Nachos file system is very primitive. Essentially all it provides is reading and writing of the simulated disk. There is no synchronization, so only one thread can use the system safely. Files have a small, fixed size, and are laid out with one level of indirection through a fixed-size header. There is no hierarchical directory structure. Data in the file system may be destroyed when the system crashes.
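
For concreteness, the stock header looks roughly like the sketch below (simplified from filesys/filehdr.h; treat the details as approximate, since your tree may differ, but the constants assume the simulated disk's 128-byte sectors and 4-byte ints):

    // Roughly the stock header from filesys/filehdr.h.  With 128-byte
    // sectors and 4-byte ints, NumDirect comes out to 30, so no file
    // can ever exceed 3840 bytes.

    const int SectorSize  = 128;
    const int NumDirect   = (SectorSize - 2 * (int)sizeof(int)) / (int)sizeof(int);
    const int MaxFileSize = NumDirect * SectorSize;

    class FileHeader {
      public:
        int ByteToSector(int offset) {             // the one level of indirection:
            return dataSectors[offset / SectorSize];
        }
      private:
        int numBytes;                  // file length in bytes, fixed at creation
        int numSectors;                // number of data sectors
        int dataSectors[NumDirect];    // direct pointers to the data sectors
    };

Since the header must fit in a single sector, the fixed dataSectors array is exactly the limit that the first two tasks below ask you to remove.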

Your task is to improve the system in at least the following three ways:

  1. Allow files to grow (optionally, to shrink as well) dynamically after they have been created. For this purpose, you need to introduce some sort of linked-block structure, instead of the contiguous allocation in the initial system (a sketch of one possibility follows this list).
  2. Allow files to have arbitrary sizes, as long as the sum of all files fits on the disk. There may be almost nothing to this after doing the previous part.
  3. Add synchronization so that any number of threads may use the file system in parallel. Individual Read and Write system calls must behave as if they were atomic (you can't make them truly atomic in time while keeping acceptable performance). If a Write completes before a Read begins, the Read must see the material written by the Write. If the executions of a Read and a Write overlap, then the system may behave as if either one came first, but atomicity requires that the Read see either all or none of the data from the Write. (A locking sketch follows the next paragraph.)
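
Here is one possible shape for a growable header: keep a few direct pointers and add a single indirect sector of further pointers. This is only a sketch; every name in it (GrowableHeader, AllocSector, and so on) is hypothetical, with AllocSector standing in for your free-sector bitmap and the sector calls standing in for SynchDisk.

    // A hypothetical growable header: a few direct pointers plus one
    // indirect sector of further pointers.  Assumes indirect starts
    // out as -1.  A real version would also free sectors on failure
    // paths and on Shrink; both are omitted here.

    const int SectorSize    = 128;
    const int NumDirectPtrs = 4;                             // direct pointers in the header
    const int PtrsPerBlock  = SectorSize / (int)sizeof(int); // pointers per indirect sector

    extern int  AllocSector();                       // assumed: free sector number, or -1
    extern void ReadSector(int sector, int *data);   // assumed: raw sector access
    extern void WriteSector(int sector, int *data);

    class GrowableHeader {
      public:
        // Extend the file to cover newBytes bytes (caller ensures it grows),
        // allocating sectors one at a time; false means the disk filled up.
        bool Grow(int newBytes) {
            int need = (newBytes + SectorSize - 1) / SectorSize;
            while (numSectors < need) {
                if (numSectors >= NumDirectPtrs + PtrsPerBlock)
                    return false;                    // would need double indirection
                int s = AllocSector();
                if (s == -1)
                    return false;                    // out of disk space
                if (numSectors < NumDirectPtrs) {
                    dataSectors[numSectors] = s;     // still fits in a direct pointer
                } else {
                    if (indirect == -1) {            // first overflow: add the indirect sector
                        indirect = AllocSector();
                        if (indirect == -1)
                            return false;            // (a real version would free s here)
                    }
                    int block[PtrsPerBlock];
                    ReadSector(indirect, block);
                    block[numSectors - NumDirectPtrs] = s;   // record the new sector
                    WriteSector(indirect, block);
                }
                numSectors++;
            }
            numBytes = newBytes;
            return true;
        }
      private:
        int numBytes;                     // file length in bytes, now mutable
        int numSectors;                   // data sectors currently allocated
        int dataSectors[NumDirectPtrs];   // direct pointers
        int indirect;                     // sector of pointers, -1 until needed
    };

The other obvious choice is a fully linked list of sectors, each data sector holding the number of the next; that makes growth trivial, but every Read at an offset must walk the chain.
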
I'm not sure of the best order in which to do these. At first, I contemplated doing synchronization first (as listed in the Berkeley assignment). On reflection, it appears that the best form for synchronization in a linked system may be different from that in a system with contiguous allocation. So, I moved synchronization to be last (as in the Stanford assignment). If you discover a good reason, you may change the order. Please discuss your ideas in this regard openly in class and/or online.
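
The simplest route to the required atomicity is one lock per open file, held for the whole duration of each Read or Write. The skeleton below only sketches the idea: the Lock and OpenFile declarations stand in for threads/synch.h and filesys/openfile.h, and SynchFile and its members are hypothetical names, not part of the stock sources.

    // One lock per file: holding it across the whole operation makes
    // each Read or Write appear atomic to every other thread.

    class Lock {                     // stands in for the Nachos Lock
      public:
        void Acquire();
        void Release();
    };

    class OpenFile {                 // stands in for the stock OpenFile
      public:
        int ReadAt(char *into, int numBytes, int position);
        int WriteAt(char *from, int numBytes, int position);
    };

    class SynchFile {                // hypothetical synchronized wrapper
      public:
        int Read(char *into, int numBytes, int position) {
            lock->Acquire();         // no Write can interleave with us,
            int n = file->ReadAt(into, numBytes, position);
            lock->Release();         // so we see all of it or none of it
            return n;
        }
        int Write(char *from, int numBytes, int position) {
            lock->Acquire();
            int n = file->WriteAt(from, numBytes, position);
            lock->Release();
            return n;
        }
      private:
        Lock *lock;                  // one lock per file, shared by all openers
        OpenFile *file;
    };

This serializes everything on a file; a reader-writer lock would let Reads run in parallel, at the cost of trickier code and trickier testing. Note that the lock must belong to the file itself, not to any one OpenFile instance, since two threads may open the same file independently.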

Testing

The worst part of the CS230/330 project work was testing. Nobody succeeded consistently in presenting me with convincing tests; there were only a very few individual instances of pretty good testing. I am not going to be able to help a lot more with this, unless you start into it and ask somewhat specific questions. Testing is a subtle business, and each of our projects has presented radically different challenges for testing. The best general principle is to imagine yourself as an intelligent and unfriendly critic, who has read the code and come to a general understanding, but who assumes that there are some overlooked glitches. Try to come up with a small, clearly structured set of examples that give strong (not perfect) evidence that the code is correct, to someone who has read the code. Remember that the critic does not trust you completely (so, for example, he is likely to try a few examples independently), but he is also not willing to go through the effort of testing on his own, so you need to guide him through precooked examples in detail.

The file system project is probably one of the easier ones to test well. Please at least give it a try. Test individual Reads and Writes that exercise all interesting variations on the data structure that you use to link up varying-length files. Synchronization is probably the hardest part to test. Random testing is helpful for catching some problems, but not very convincing. To be convincing, you need to systematically construct all of the interesting primitive patterns of interleaved operations; what they are will depend on your data structures and on the way in which you accomplish the synchronization. A sketch of one such test follows.
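
For example, here is the shape of one atomicity test, written against the hypothetical SynchFile sketched above. The file starts out all 'A'; the writer overwrites a region with 'B'; an atomic Read must see all 'A' or all 'B', never a mixture. Fork the two functions with Thread::Fork, and rerun under several -rs seeds so the random yields explore different interleavings.

    #include <stdio.h>

    class SynchFile {                    // the synchronized wrapper sketched earlier
      public:
        int Read(char *into, int numBytes, int position);
        int Write(char *from, int numBytes, int position);
    };

    extern SynchFile *theFile;           // assumed: opened on all-'A' contents

    const int Region = 1000;             // spans several sectors

    void Writer(int arg) {
        char buf[Region];
        for (int i = 0; i < Region; i++) buf[i] = 'B';
        theFile->Write(buf, Region, 0);
    }

    void Reader(int arg) {
        char buf[Region];
        theFile->Read(buf, Region, 0);
        bool sawA = false, sawB = false;
        for (int i = 0; i < Region; i++) {
            if (buf[i] == 'A') sawA = true;
            if (buf[i] == 'B') sawB = true;
        }
        if (sawA && sawB)                // a mixture means the Write was not atomic
            printf("FAIL: Read saw a partial Write\n");
    }

A passing run is weak evidence by itself; the convincing part is enumerating the yield points inside your Read and Write paths and showing that each interesting interleaving has actually been exercised.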

Optional work

As long as good work gets done, and presented in a form that I can evaluate, I am willing to negotiate all aspects of the project. Based on your stated preference, we are doing individual project work rather than co-operative work. But I think that you should share code from the previous quarter freely, so that nobody spends too much time redoing the prerequisites for the current project. I encourage open discussion of strategy and design. If you find it convenient to share other ideas, and even code, you may do so as long as you acknowledge your sources. Of course, in order to get a good grade, you must demonstrate some interesting contribution and understanding of your own. I suggest that we do not read-protect directories this quarter, and go on the honor system for acknowledging ideas received.

Here are particular additional improvements that you may decide to include to impress me more and increase your experience:

  1. Hierarchical directory structure. This sounds rather easy at first, and perhaps not too interesting. I think that the main issue is parsing and constructing path names (a parsing sketch follows this list). But perhaps I've overlooked something deeper.
  2. Improve performance by caching and/or intelligent scheduling of disk operations. This problem is open-ended. It may interact in interesting ways with synchronization. It should be accompanied by some performance measurement to determine whether a change is truly an improvement. None of our other projects has taken performance or measurement seriously, and it is likely that the network project will also neglect performance. So, if you would like some experience in performance issues, this is the place to get it.
  3. Make the file system robust under crashes of the Nachos OS. Assume that somebody pulls the plug on the simulated MIPS machine at an arbitrary moment. Arrange the system so that, on startup, the file system always comes up in a sensible state, reflecting all of the changes that were made up to some point just before the crash. To go even further, provide support for restarting programs that manipulate the file system so that they do precisely the work that failed to be permanently recorded before the crash. The robustness issue will dominate our work on networking. So, you can ignore it here on the grounds that you'll learn about it later, or you can use file-system robustness as a warmup for network robustness.
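
To make the path-name issue in item 1 concrete, here is a minimal parsing sketch. Resolve and FindInDirectory are hypothetical names; FindInDirectory stands in for a lookup in the spirit of filesys/directory.h, returning the sector of a component's header, or -1 if the component is absent.

    extern int FindInDirectory(int dirSector, const char *name);  // assumed: -1 if absent

    // Resolve "/a/b/c" to the sector holding c's header, or -1 if any
    // component is missing; repeated and trailing '/' are skipped.
    int Resolve(const char *path, int rootSector) {
        char component[64];
        int cur = rootSector;
        while (*path != '\0' && cur != -1) {
            while (*path == '/')
                path++;                              // skip separators
            if (*path == '\0')
                break;                               // trailing '/' is harmless
            int len = 0;
            while (*path != '/' && *path != '\0' && len < 63)
                component[len++] = *path++;
            component[len] = '\0';
            cur = FindInDirectory(cur, component);   // descend one level
        }
        return cur;
    }

The deeper issues probably hide elsewhere: what "." and ".." mean, whether a directory can be removed while some thread has it open, and how directory updates interact with your synchronization.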

Hot tips

These hot tips are repeated from CS230/330, but they don't seem to have sunk in completely, so please think about them again.

The secret to successful software work is never to let your code get out of control. It's much better to have code that you understand, accomplishing something less than the requirements of the problem, than to have code that takes a wild shot at a solution. Never add or change more than 6-12 lines of code without making sure you understand the results. This includes the DEBUG statements, which you should always write along with the functions containing them.
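
For example, a typical call in the style of the stock utility.h looks like the line below; it compiles in all the time, but prints only when you run nachos with the matching -d flag (the 'f' flag and the variable names here are just illustrative):

    DEBUG('f', "Growing file header: %d -> %d sectors\n", numSectors, need);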

Recompile all the time. I spend a lot more time fixing little syntactic and type-checking errors than I do working out the right computation steps. The C++ compiler's messages are somewhat cryptic, but if you have changed only a few lines, you have a fighting chance to discover a problem.

As soon as you have implemented one feature, and compiled it successfully, test it immediately. By no means should you ever continue developing untested software. Look for a short simple test that reveals the essential workings of your program. From the beginning, you should spend as much time testing as coding the solutions to the problems.

When you are done with a feature, go back and test it thoroughly, and work out the tests that will demonstrate your work most effectively to an independent critic, such as me.