Abstract
|
Parallel garbage collection seeks to exploit the inherent parallelism of graph tracing by evenly distributing the set of objects in the heap among all available processing resources. Any straightforward implementation, however, suffers from prohibitive overheads, since each access to the worklist of objects and to the objects themselves needs to be protected by synchronization, especially in the case of compacting collectors. For this reason, known parallel collectors sacrifice a great deal of work distribution granularity and scalability to keep the synchronization costs acceptable. In this paper, we present a case study of a different approach. Our parallel compacting collector is based on Cheney's copying algorithm, employs a single worklist, and distributes garbage collection work on an object-by-object basis. This way, it achieves well-balanced work distribution and good scalability. To solve the synchronization problem, we introduce a low-cost multi-core garbage collection coprocessor and take advantage of hardware-supported synchronization. We built an FPGA-based prototype with a single-core main processor supported by a multi-core garbage collection coprocessor. Measurement results show that an 8-core garbage collection coprocessor decreases the duration of garbage collection cycles by a factor of up to 7.4, while a 16-core configuration still achieves a factor of up to 12.1.
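For orientation, the sequential core of Cheney's copying algorithm, on which the collector is based, can be sketched as follows. This is a minimal single-threaded Python sketch under stated assumptions, not the authors' hardware implementation: the names `Obj`, `forward`, `evacuate`, and `cheney_collect` are illustrative, and to-space is modeled as a plain list rather than a contiguous memory region.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False, repr=False)
class Obj:
    """Hypothetical heap object: a label plus outgoing references."""
    name: str
    refs: List["Obj"] = field(default_factory=list)
    forward: Optional["Obj"] = None  # forwarding pointer, set once copied


def cheney_collect(roots):
    """Copy everything reachable from roots into a fresh to-space.

    Returns (new_roots, to_space). Unreachable objects are never copied.
    """
    to_space = []  # copies are appended here in allocation order

    def evacuate(obj):
        # Copy obj on its first visit; later visits return the same copy.
        if obj.forward is None:
            copy = Obj(obj.name, list(obj.refs))  # refs still point to from-space
            obj.forward = copy
            to_space.append(copy)
        return obj.forward

    new_roots = [evacuate(r) for r in roots]

    # Objects between `scan` and the end of to_space are copied but not yet
    # scanned: this region is the single worklist.
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        obj.refs = [evacuate(child) for child in obj.refs]
        scan += 1
    return new_roots, to_space
```

In a parallel variant with one shared worklist, each core would take the next object at `scan` and process it independently. The updates to `scan`, to the end of to-space, and to an object's forwarding pointer are then the shared accesses that need synchronization, which is the cost the abstract proposes to cover with hardware support.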
|
Reference entry
|
Horvath, O.; Meyer, M.
Fine-Grained Parallel Compacting Garbage Collection through Hardware-Supported Synchronization
5th International Symposium on Embedded Multicore SoCs, San Diego, 2010
|