Hash Join Algorithms in Light of CPU Cache Prefetching
Tuesday February 14, 2006
Hamerschlag Hall D-210
Intel Research Pittsburgh
The latency gap between CPU caches and main memories has
become a major performance problem in database systems.
Recent studies have demonstrated that in commercial
databases, 50% or more of execution time is often wasted on
cache misses even when the data is memory-resident. In light
of this problem, a number of recent studies have focused on
reducing the number of cache misses incurred by database
algorithms. In this talk, I
investigate a different approach: reducing the impact of
cache misses through a technique called CPU cache
prefetching.
Specifically, I will focus on one of the core database
algorithms, the hash join algorithm, and re-examine its
design in light of CPU cache prefetching. One of the major
challenges here is to prefetch for hash table accesses,
which are random in nature. Our solutions, Group
Prefetching and Software-Pipelined Prefetching, exploit
inter-tuple parallelism to overlap the cache misses incurred
while processing multiple tuples. Moreover, we exploit the
two-pass structure of the hash join algorithm, using
information gathered in the earlier pass to further improve
the performance of the later pass.
Our experimental results demonstrate dramatic
performance benefits of our prefetching-enabled techniques
over state-of-the-art cache-friendly hash join algorithms.
Shimin Chen obtained his PhD under the supervision of Prof.
Todd C. Mowry and Prof. Anastassia Ailamaki in Computer
Science from Carnegie Mellon University in December 2005.
He joined Intel Research Pittsburgh after graduation. His
research interests include database systems, computer
architecture, and software systems. He received his
master's degree in 1999 and his bachelor's degree in 1997,
both from Tsinghua University in China.