Associate Professor – CS
| Contact Information | |
|---|---|
| Department | Computer Science |
| Office | 7109 Wean Hall |
| Telephone | (412)-268-7848 |
| Email | natassa@epfl.ch |
| Website | http://www.cs.cmu.edu/~natassa/ |
My research interests are in creating new database management technologies for modern computer hardware and applications, aiming at (a) creating new database software architectures that interact efficiently with modern computation and peripheral devices, and (b) facilitating the usage of database management systems by scientific disciplines.
Over the past two decades, database software has remained fundamentally unchanged, whereas hardware infrastructure has undergone significant changes that have led to faster and more sophisticated processors which access slower and larger main memories. At the same time, I/O latencies in modern database applications are hidden by faster disks and smarter storage managers, while the ever-increasing processor/memory speed gap exposes memory-related latencies. As a result, the exciting advancements in microarchitecture do not yield commensurate database performance improvements. To improve the interaction between the database software and the underlying hardware, I develop techniques that optimize traditional database data placement across the entire storage hierarchy, as well as query processing algorithms that eliminate unnecessary accesses to slow memory and hide latencies caused by unavoidable loads.
Current database software architectures suffer from inherent shortcomings in using the memory hierarchy efficiently. Typical commercial products include a database server which maps queries to threads managed by the underlying operating system. As a result, the operating system's context-switching decisions are oblivious to the state of the query/thread, and therefore cause context thrashing in the memory hierarchy. To solve this problem, I am leading a project that develops a new, staged DBMS architecture that allows for context switching on module boundaries, and for cohort scheduling of requests on a per-module basis. The new system includes a set of independent servers, each of which implements a module of the database software and is oblivious to the rest of the system. An incoming query visits only the servers that it needs, and executes the corresponding modules. This system is not only easier to develop and troubleshoot; it offers virtually unlimited availability and scalability as it can run on top of a chip multiprocessor, a multithreaded processor, or a network of commodity workstations. The lightweight servers can be dynamically replicated to reflect the system's load. This modular, flexible architecture will enable high parallelism and performance, and will easily adapt to new hardware such as chip multiprocessors and simultaneously multithreaded microarchitectures.
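To make the idea of per-module queues and cohort scheduling concrete, here is a minimal sketch in Python. It is an illustration only, not the code of the actual system: the names `Stage`, `run_staged`, and `routes` are hypothetical, the stages are plain functions rather than independent servers, and everything runs in a single thread. Each stage keeps its own queue, the scheduler drains one stage's queue as a cohort before moving on, and a query visits only the stages listed on its route.

```python
from collections import deque


class Stage:
    """One module of the DBMS (hypothetical), with its own request queue."""

    def __init__(self, name, work):
        self.name = name
        self.work = work      # function applied to each query at this stage
        self.queue = deque()  # queries waiting for this module


def run_staged(queries, stages, routes):
    """Run queries through stages, switching only at stage boundaries.

    routes maps each query to the ordered list of stage names it visits.
    Returns the execution order as a list of (stage name, query) pairs.
    """
    by_name = {s.name: s for s in stages}
    remaining = {q: deque(routes[q]) for q in queries}
    trace = []

    def forward(q):
        # Enqueue the query at the next stage on its route, if any.
        if remaining[q]:
            by_name[remaining[q].popleft()].queue.append(q)

    for q in queries:
        forward(q)

    while any(s.queue for s in stages):
        for stage in stages:
            # Cohort scheduling: serve everything queued at this stage
            # together, so the module's code and data are reused while hot.
            cohort = list(stage.queue)
            stage.queue.clear()
            for q in cohort:
                stage.work(q)
                trace.append((stage.name, q))
                forward(q)
    return trace


stages = [Stage(n, lambda q: None) for n in ("parse", "optimize", "execute")]
routes = {"q1": ["parse", "optimize", "execute"], "q2": ["parse", "execute"]}
trace = run_staged(["q1", "q2"], stages, routes)
```

In this toy run both queries are parsed back to back before any other module runs, and `q2` skips the optimizer entirely because that stage is not on its route.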
Scientific experiments in fields such as astronomy and biology typically require accumulating, storing, and processing very large amounts of information. Current scientific data sets are on the order of tens of terabytes, and grow at a fast rate as new experimental results are accumulated. In environments of such scale, query execution performance depends heavily on the underlying database physical design, in particular the set of relations, indexes, and views stored on the disks. In addition, physical database design decisions that rely on accurate descriptions of representative workloads typically yield significantly higher performance improvements. My research efforts focus on three areas:
We currently enable previously impossible experiments involving (a) astronomical databases, used in determining the shape of the universe; (b) seismographic data, used in earthquake forecasting; and (c) environmental databases, used in evaluating the quality of drinking water. The highly interdisciplinary nature of this research is exciting and rewarding, as the results are inspired, evaluated, and used directly by the scientific teams.
On my SCS web site you can find more about my group and projects.
Database systems, internet querying, computer architecture
PhD, 2000
Computer Science
University of Wisconsin, Madison
MS, 1996
Computer Science
University of Rochester
MS, 1993
Electronic and Computer Engineering
TUC
BS, 1990
Computer Engineering and Informatics
University of Patra