James Bruce
15-712 Paper Review
Monday, October 1st
"Cluster I/O with River: Making the Fast Case Common"

Important Points:
- Software and hardware heterogeneity in a distributed system, especially when transient, causes performance problems for statically balanced algorithms. Compensating for the heterogeneity is difficult, and would not be necessary if dynamic balancing were efficient and easy to use; River is designed to satisfy this goal.
- A dataflow design is a nice abstraction for most I/O-intensive applications, and can be optimized to run efficiently at near-peak performance while maintaining an easy programming interface. Furthermore, the modular structure allows easy-to-use, non-language-based editors (in this case a GUI) to combine modules into flows.
- Distributed queues (DQs) and graduated declustering (GD) are the key to implementing the system efficiently. DQs are inserted by the programmer in a flow to balance producer and consumer pools that may each be heterogeneous, forming the basis of the producer/consumer balancing. GD is a data replication and balancing system for disk I/O. The replication means multiple sources are available for given data, and by varying the balance of rates from the various sources, the throughput to each disk consumer (for reading) or producer (for writing) can be balanced even when individual disk throughput is not.

Deficiencies: I have only one minor complaint about this paper: I would like to see an additional example. Some common algorithms really do not map well onto a dataflow paradigm, such as those that involve random access to a relatively small fraction of a large file. It would have been interesting for the evaluation to include an example that was not as straightforward a mapping. This would stress the generality of the programming model, exploring its potential limitations rather than just its strengths.
Conclusions: This paper presents a very interesting system that uses dataflows as a programming interface for building dynamically balancing algorithms. DQs and GD are clever techniques of more general applicability; they work well because they can balance load without the more explicit inter-server communication that approaches such as work stealing require. They also allow a very clean and simple programming environment to be created, whereas systems based on other interfaces such as MPI do not work as cleanly.
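As a rough illustration of the balancing argument above, the toy simulation below compares a static, round-robin split of records against a pull-style shared queue in which whichever consumer is free takes the next record. It is only a sketch under stated assumptions: it is not River's implementation, it collapses the distributed queue into a single in-process model, and the function names and per-record costs are hypothetical.

```python
import heapq


def static_makespan(n_records, costs):
    """Static split: records are dealt out round-robin, ignoring consumer speed.

    costs[i] is the (hypothetical) time consumer i needs per record.
    Returns the time at which the slowest consumer finishes.
    """
    k = len(costs)
    # Consumer i receives every k-th record; the first (n_records % k) get one extra.
    counts = [n_records // k + (1 if i < n_records % k else 0) for i in range(k)]
    return max(count * cost for count, cost in zip(counts, costs))


def dynamic_makespan(n_records, costs):
    """Pull-style queue: the next record goes to the consumer that frees up first.

    Returns (finish time of the last consumer, records handled per consumer).
    """
    # Heap of (time at which the consumer is next free, consumer index).
    free_at = [(0, i) for i in range(len(costs))]
    heapq.heapify(free_at)
    counts = [0] * len(costs)
    for _ in range(n_records):
        t, i = heapq.heappop(free_at)       # earliest-free consumer pulls a record
        counts[i] += 1
        heapq.heappush(free_at, (t + costs[i], i))
    makespan = max(t for t, _ in free_at)
    return makespan, counts


if __name__ == "__main__":
    costs = [1, 1, 3]  # assumed: the third consumer is three times slower
    print("static :", static_makespan(90, costs))
    print("dynamic:", dynamic_makespan(90, costs))
```

With 90 records and per-record costs of 1, 1, and 3, the static split is held back by the slow consumer and finishes at time 90, while the pull-based version finishes at time 39 because the slow consumer simply ends up taking fewer records. No consumer needs to know anything about the others, which is the property the review contrasts with work stealing.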