Rajesh Krishna Balan

The Design and Implementation of a Log-Structured File System
Mendel Rosenblum, John K. Ousterhout

This paper presents a new filesystem known as the log-structured file system (LFS). The system was designed to speed up writes to disk, and it is based on the observation that as main-memory caches grow, most disk accesses will be writes rather than reads. The filesystem buffers a sequence of disk writes in memory and then writes them to disk as one long sequential write. This eliminates most of the seek time that other filesystems spend finding the place on disk to write each file. The authors also store indexing information in the log itself, so that files can be located later without sequentially searching the entire log.

To work efficiently, LFS requires large amounts of contiguous free disk space in which to write the log. If free space were fragmented, the filesystem would have to scatter the log across the disk, and this would negate its benefits because writes would again need multiple seeks to reach the next free segment. The authors propose a cleaning mechanism that runs in the background and automatically reclaims disk space from segments that are no longer in use. The cleaner also compacts the live data from partially used segments to form larger contiguous regions of free space. This is somewhat similar to garbage collection in modern programming languages.

Finally, the authors evaluate a system employing LFS and show that it performs well compared with conventional UNIX filesystems; their benchmarks compare Sprite LFS against the Unix fast file system (FFS) as implemented in SunOS. The main thing I took away from this paper is that there are always tradeoffs in systems design: LFS improves writes to disk at the cost of having to run an expensive cleaning process.

The major problem with this paper is the overhead associated with the cleaning mechanism. It must run in the background, adds load to the system, and it is unclear how it will perform when disk and/or CPU load is very high. Also, the cleaner needs to identify file access patterns in order to optimize its cleaning policy, and this does not seem easy to figure out.
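
To make the core write path concrete, here is a minimal sketch in Python of the buffering idea described above. The names (LogBuffer, FakeDisk, write_segment, SEGMENT_SIZE) are my own illustrations, not the paper's code, and the segment size is just a plausible value.

    SEGMENT_SIZE = 512 * 1024  # 512 KB segments; illustrative, not the paper's exact value

    class FakeDisk:
        """Stand-in for a raw device; one call = one large sequential write."""
        def write_segment(self, data: bytes):
            print(f"sequential write of {len(data)} bytes")

    class LogBuffer:
        """Accumulates dirty blocks and flushes them as a single segment."""
        def __init__(self, device):
            self.device = device
            self.pending = []        # blocks waiting to be written
            self.pending_bytes = 0

        def write(self, block: bytes):
            self.pending.append(block)
            self.pending_bytes += len(block)
            if self.pending_bytes >= SEGMENT_SIZE:
                self.flush()

        def flush(self):
            # One long sequential write replaces many small seek-and-write
            # operations; this is the core performance idea of LFS.
            self.device.write_segment(b"".join(self.pending))
            self.pending.clear()
            self.pending_bytes = 0

    buf = LogBuffer(FakeDisk())
    for _ in range(200):
        buf.write(b"x" * 4096)   # many small 4 KB file writes...
    buf.flush()                  # ...become a few large sequential writes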
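
The cleaning tradeoff can be made concrete too. The paper's cost-benefit policy rates a segment by ((1 - u) * age) / (1 + u), where u is the fraction of the segment still live (reading the segment costs 1, rewriting its live data costs u, and 1 - u of the segment is freed) and age is the age of the youngest data in it. Below is a sketch of just the candidate-selection step, with a hypothetical Segment record of my own devising:

    from dataclasses import dataclass

    @dataclass
    class Segment:
        utilization: float  # fraction of the segment still holding live data (u)
        age: float          # age of the youngest block in the segment

    def cost_benefit(seg: Segment) -> float:
        # benefit/cost = (free space generated * age) / (read + rewrite cost)
        return (1.0 - seg.utilization) * seg.age / (1.0 + seg.utilization)

    def pick_segments_to_clean(segments, how_many):
        # Clean the highest-ratio segments first: cold, mostly empty
        # segments are the most attractive candidates.
        return sorted(segments, key=cost_benefit, reverse=True)[:how_many]

This policy is also why the cleaner's behavior depends on access patterns, which is exactly the concern raised above: hot segments tend to empty themselves and are cheap to clean later, while cold segments are worth cleaning even at higher utilization.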