In 8th Annual ACM Symposium on Parallel Algorithms and Architectures SPAA'96, June 24-26, 1996, Padua, Italy, pp.297-308. [Full text]

Abstract

In this paper, we analyze the performance of
parallel multithreaded algorithms that use dag-consistent distributed
shared memory. Specifically, we analyze execution time, page faults,
and space requirements for multithreaded algorithms executed by a
work-stealing thread scheduler and the Backer coherence algorithm for
maintaining dag consistency. We prove that if the accesses to the
backing store are random and independent (the Backer algorithm
actually uses hashing), then the expected execution time of a ``fully
strict'' multithreaded computation on P
processors, each with an LRU cache of C pages, is
O(T_{1}(C)/P + m C T_{inf}) ,
where T_{1}(C) is the total work of the
computation including page faults, T_{inf}
is its critical-path length excluding page faults, and m
is the minimum page transfer time. As a corollary to this
theorem, we show that the expected number of page faults incurred by a
computation executed on P processors, each with an
LRU cache of C pages, is
F_{1}(C) + O(C P T_{inf}) , where F_{1}(C) is the number of serial page faults.
Finally, we give simple bounds on the number of page faults and the
space requirements for ``regular'' divide-and-conquer algorithms. We
use these bounds to analyze parallel multithreaded algorithms for
matrix multiplication and LU-decomposition.

This page maintained by: Matteo Frigo. Last updated: Sun Jun 21 09:29:38 2009