Abstract
Modern microprocessors include a sophisticated hierarchy of caches to hide the latency of memory access and thereby speed up data processing. However, multiple cores within a processor usually share the same last-level cache. This sharing can hurt performance, especially in concurrent workloads, whenever a query suffers from cache pollution caused by another query running on the same socket.
In this work, we confirm that this holds true in particular for the different operators of an in-memory DBMS: the throughput of cache-sensitive operators degrades by more than 50%. To remedy this issue, we derive a cache allocation scheme from an empirical analysis of different operators and integrate a cache partitioning mechanism into the execution engine of a commercial DBMS. Finally, we demonstrate that our approach improves the overall system performance by up to 38%.