Analyzing Hadoop namenode scalability and availability in multi-namenode configurations

Shrikant Mether and Prajakta Karandikar

Abstract:

Hadoop HDFS, in its traditional single-namenode configuration, faced two main issues:

1. The namenode stores all metadata in memory, so the number of files HDFS can hold is limited by the namenode's RAM, and the namenode becomes a bottleneck.
2. The namenode is a single point of failure.

A new architecture called HDFS Federation was introduced to tackle these issues. HDFS Federation tries to achieve scalability by allowing multiple namenodes, each serving part of the global namespace. This architecture addresses the scalability problem but does not provide an automatic failover mechanism to address availability. To improve availability, experiments have been done to make the namenode stateless by storing all metadata in a distributed database instead of the namenode's memory, but the current implementation relies on static partitioning of the namespace.

As part of this project, we plan to:

1. Explore how multi-namenode configurations (HDFS Federation, stateless namenodes) provide better scalability.
2. Implement dynamic namespace partitioning and a failover mechanism for the stateless namenode implementation, and analyze its performance.

References:

HDFS Federation - http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/Federation.html#Key_Benefits
Stateless Namenodes - http://lalith.in/2011/12/15/towards-a-scalable-and-highly-available-namenode/