An Analysis of Dag-Consistent Distributed Shared-Memory Algorithms

by Robert D. Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, and Keith H. Randall.

In 8th Annual ACM Symposium on Parallel Algorithms and Architectures
SPAA'96, June 24-26, 1996, Padua, Italy, pp.297-308.
[Full text]


In this paper, we analyze the performance of parallel multithreaded algorithms that use dag-consistent distributed shared memory. Specifically, we analyze execution time, page faults, and space requirements for multithreaded algorithms executed by a work-stealing thread scheduler and the Backer coherence algorithm for maintaining dag consistency. We prove that if the accesses to the backing store are random and independent (the Backer algorithm actually uses hashing), then the expected execution time of a ``fully strict'' multithreaded computation on P processors, each with an LRU cache of C pages, is O(T1(C)/P + m C Tinf) , where T1(C) is the total work of the computation including page faults, Tinf is its critical-path length excluding page faults, and m is the minimum page transfer time. As a corollary to this theorem, we show that the expected number of page faults incurred by a computation executed on P processors, each with an LRU cache of C pages, is F1(C) + O(C P Tinf) , where F1(C) is the number of serial page faults. Finally, we give simple bounds on the number of page faults and the space requirements for ``regular'' divide-and-conquer algorithms. We use these bounds to analyze parallel multithreaded algorithms for matrix multiplication and LU-decomposition.

This page maintained by: Matteo Frigo. Last updated: Sun Jun 21 09:29:38 2009