Distributed Anonymous Information Retrieval: Freenet

We plan to investigate distributed anonymous retrieval systems by analyzing the Freenet implementation. Freenet divides up files among hosts, encrypts them, and replicates them based on their popularity. Our chief interest is how the system can maintain anonymity while a search can still determine where parts of files reside. How can a user's machine send data to a machine requesting it without explicitly resolving its IP address and forwarding packets that carry its own signature?

We also plan to investigate, and perhaps come up with, some new ways to determine when files get deleted and replicated. The fear is that the system can become a virtual /tmp directory where many people just dump data for convenience. Any such system should be able to clean these types of files out quickly, yet keep important files around.

Freenet is a highly heterogeneous system: its nodes differ widely in capability and connectivity. It is therefore important for a system such as Freenet to take each node's capabilities and Internet connection rate into account.

We will first get the latest CVS snapshot of Freenet and attempt to build our own Freenet system at home. We expect to have the main parts of the system performing as expected in under two weeks. There are a number of papers published on Freenet which are freely accessible from the SourceForge website. We will read these in order to better understand the architecture of the system, and perhaps some of its weaknesses. We expect to encounter a number of obstacles because of the "in development" status of the system, as many of its major features have not yet been implemented.

Key Points
----------

o Scalability
  How well does a system such as Freenet scale with a growing number of hosts? Can the increased message passing between hosts become a serious bottleneck, as it does in other fully distributed systems such as Gnutella?

o Anonymity
  Are Freenet objects truly anonymous? With emerging technologies that may aid IP traceback (IPv6, for example), how do we keep objects from having their origins traced?

o Persistence / Durability
  How does Freenet insert and distribute objects? How does it purge objects? What if an object needs immediate removal, such as a Trojan or virus executable?

o Availability / Caching
  How do more popular objects become more accessible (available)?

o Fault Tolerance
  What happens in the case of a node failure, be it temporary or permanent?
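
To make the purge question under Persistence / Durability concrete, the following is a minimal sketch of one possible policy. It assumes each node keeps a fixed-capacity store and evicts the least recently requested object when space runs out, so that dumped files nobody asks for are the first to go. The class LruDataStore and its methods are hypothetical names chosen for illustration; they are not taken from the Freenet code base, and we have not yet confirmed which policy Freenet actually uses.

```python
from collections import OrderedDict


class LruDataStore:
    """Hypothetical per-node store: evicts the least recently requested key."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Ordered from least recently used (front) to most recently used (back).
        self._items = OrderedDict()

    def insert(self, key, data):
        """Store data under key; return the list of keys evicted to make room."""
        self._items[key] = data
        self._items.move_to_end(key)
        evicted = []
        while len(self._items) > self.capacity:
            old_key, _ = self._items.popitem(last=False)
            evicted.append(old_key)
        return evicted

    def request(self, key):
        """Return the data for key (marking it recently used), or None if absent."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def keys(self):
        """Keys ordered from least to most recently used."""
        return list(self._items)


store = LruDataStore(capacity=2)
store.insert("popular-file", b"...")
store.insert("dumped-file", b"...")
store.request("popular-file")                 # a request keeps this object alive
evicted = store.insert("new-file", b"...")    # the unrequested object is purged
```

Under this assumed policy an object survives only as long as it keeps being requested, which addresses the /tmp concern but offers no way to force immediate removal of a harmful object; that case would need a separate mechanism.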