---------------------------------------------
18-845 Group Project Proposal
Title: PASISizing the Web
Authors: Vijay Pandurangan, Mehmet Bakkaloglu
---------------------------------------------

In recent months, various types of attacks have paralyzed major Web sites, including those of Yahoo!, Microsoft, and CNN. Additionally, hackers have been gaining access to sensitive information stored in monolithic servers throughout the Web. By "PASIS-izing" the Web, we aim to bring the security and availability afforded by PASIS to the World Wide Web.

About PASIS
-----------

The PASIS architecture flexibly and efficiently combines decentralized storage system technologies, data redundancy and encoding, and dynamic self-maintenance to create survivable information storage systems -- that is, storage systems whose availability, confidentiality, and integrity policies can survive component failures and successful malicious attacks.

PASIS systems operate from the fundamental design thesis that no individual service, node, or user can be fully trusted; having some compromised entities is viewed as the common case rather than the exception. By appropriately replicating and distributing data and services across many nodes, PASIS systems protect data, and access to it, from many successful small- and large-scale attacks.

In PASIS, threshold schemes (ways of splitting data into multiple parts) are used to store data. A threshold scheme has three parameters, namely n, m, and p (n >= m >= p). Data is divided into n shares; any m of the shares are sufficient to fully recover the data, and fewer than p shares reveal no information about the data. The n shares formed during encoding are stored on n different storage nodes.

Our Proposal
------------

Storing data in this manner increases the availability and security of data; however, this comes at the cost of degraded performance. We believe one way to reduce this cost is to implement over-requesting.
When reading data, normally only m of the shares are requested, since m shares are sufficient and requesting exactly m puts the least traffic on the network. If one or more of the storage nodes is unavailable, however, additional shares must be requested in a second pass, which slows the response. Thus, it might be wise to request more than m shares in the first pass in order to ensure that at least m shares are received. Our goal is to determine the value of over-requesting policies for various threshold schemes.

A further goal of this effort is to make the system as user-friendly as possible. We will explore various methods of tying PASIS functionality into the Web. Another problem we must address is developing a reasonably fault-tolerant method for mapping PASIS-based URLs to server nodes.
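
To make the over-requesting trade-off described above concrete, the following is a minimal sketch, not part of PASIS itself. It assumes that each storage node responds independently with the same probability (called availability here), which is a simplification, and all function and parameter names are hypothetical. It estimates how likely a first pass is to return at least m shares when k shares are requested, and finds the smallest k that meets a target probability.

```python
from math import comb


def prob_at_least_m(k: int, m: int, availability: float) -> float:
    """Probability that at least m of the k requested shares arrive.

    Assumption (not from the proposal): each node responds independently
    with probability `availability`, so the number of shares received in
    the first pass follows a binomial distribution.
    """
    return sum(
        comb(k, i) * availability**i * (1 - availability) ** (k - i)
        for i in range(m, k + 1)
    )


def smallest_request_count(n: int, m: int, availability: float, target: float):
    """Smallest k with m <= k <= n whose first-pass success probability
    reaches `target`, or None if even requesting all n shares falls short."""
    for k in range(m, n + 1):
        if prob_at_least_m(k, m, availability) >= target:
            return k
    return None


if __name__ == "__main__":
    # Hypothetical scheme: n = 5 shares, any m = 3 recover the data,
    # and each node is assumed to be reachable 90% of the time.
    for k in range(3, 6):
        print(k, round(prob_at_least_m(k, 3, 0.9), 4))
    print(smallest_request_count(5, 3, 0.9, 0.99))
```

Under these assumptions, requesting exactly m = 3 shares succeeds in the first pass about 72.9% of the time, while requesting one extra share raises this to about 94.8%. The price is k - m additional share requests on the network for every read, which is the cost an over-requesting policy must weigh against the avoided second pass.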