Carnegie Mellon University


November 07, 2018

Graph reordering for an optimal future

By Olivia Olshevski

Krista Burns

Graph processing applications are an everyday part of our lives: they make it possible to use GoogleMaps, rank search engine results in order of relevance, and offer recommendations for interesting products on retail websites such as Amazon. Although they have become integral to our online experiences, graph processing applications remains far from efficient. The fundamental challenge in optimizing graph processing applications is that they exhibit an irregular memory access pattern which leads to poor locality in processor caches and can result in up to 80% of the execution time being spent on waiting for data to brought in from main memory. In a paper granted the Best Paper Award at the IEEE International Symposium on Workload Characterization (IISWC) 2018, Vignesh Balaji, a Ph.D. candidate in electrical and computer engineering, and assistant professor Brandon Lucia investigated how the performance of graph processing applications could be improved using a technique called Graph Reordering.

Graph Reordering is a data-layout optimization technique that aims to improve performance of graph applications by making better use of processor caches. Reordering techniques leverage specific properties of the input graphs to modify the layout of graph data to place commonly accessed data close to each other. The modified data layout improves locality in on-chip caches and reduces long-latency main memory accesses and the paper notes performance improvements of up to 3x.

However, graph reordering is a controversial optimization because it requires modifying the layout of graph data before starting any graph computations and, hence, reordering the graph imposes some overhead. For some reordering algorithms, the overhead of modifying the layout can be significant enough to overshadow the benefits offered by reordering. For example, Balaji and Lucia explain how one reordering algorithm, Gorder, took almost 15 hours to reorder the graph which ultimately reduces the run time of a graph application from 12 to 5 seconds. While Gorder produced a speedup of 2.4x the extreme overhead of reordering the graph calls to question the effectiveness of techniques such as Gorder.

The community is divided on the effectiveness of graph reordering as a locality-enhancing optimizations. Some researchers cite the high overheads of reordering techniques like Gorder and conclude that graph reordering always imposes significant overheads and, after accounting for the cost of reordering the graph, never improves performance of graph applications. On the other hand, some researchers argue that graph reordering is an effective optimization since the benefits of reordering the graph can be reaped over multiple executions on the reordered graph.

To address the question of whether or not graph reordering is an optimization, Balaji and Lucia take a more data-driven approach. The authors make three main contributions in the paper. First, by running various combinations of reordering techniques across realistic graph applications and input graphs, they were able to demonstrate the existence of ​Lightweight​ reordering techniques that, unlike Gorder, managed to improve performance of graph applications even after accounting for the cost of reordering the graph. Second, their investigations revealed that the performance improvements from lightweight reordering techniques depended on the specific combination of graph application and the input graph. Finally, the authors proposed a mechanism to perform ​selective lightweight graph reordering only for the combinations of application and input graph that achieve high performance improvements.

To enable selective lightweight graph reordering, Balaji and Lucia observed that the performance improvement from reordering can be linked to easy-to-identify properties of graph applications and input graphs. Based on their analysis of reordering techniques across applications and input graphs, the authors summarized a list of application characteristics and structural properties of input graphs essential for achieving performance improvement from lightweight reordering. The paper introduces a metric called ​packing factor​ that measures structural properties of input graphs and can be used to predict the performance improvements for lightweight reordering. Using the easy-to-compute packing factor metric, Balaji and Lucia were able to demonstrate the effectiveness of selective lightweight reordering which incurred the overhead of graph reordering only when the expected performance improvement from reordering was significant.

Ultimately, Balaji and Lucia shed light on the scenarios when graph reordering is an effective optimization and offer a more promising approach to improve the performance of graph processing applications by selectively performing lightweight graph reordering only for applications and input graphs that are likely to benefit the most from reordering.