Execution of functional programs on distributed-memory
multiprocessors gives rise to the
problem of evaluating expressions that are shared between several
Processing Elements (PEs).
One of the main difficulties of solving this problem is that, for a
given shared expression, it
is not known in advance whether realizing the sharing is more cost
effective than duplicating
its evaluation. Realizing the sharing requires coordination between
the sharing PEs to ensure
that the shared expression is evaluated only once. This coordination
involves relatively
high communication costs, and is therefore only worthwhile when the
shared expressions
require much computation time to evaluate. In contrast, when the
shared expression is not
computation intensive, it is more cost effective to duplicate the
evaluation, and thus avoid the
communication overhead costs. This dilemma of deciding whether to
duplicate the work or
to realize the sharing stems from the unknown computation time that
is required to evaluate
a shared expression. This computation time is difficult to estimate
due to unknown run-time
evolution of loops and recursion that may be part of the expression.
This paper presents an
on-line (run-time) algorithm that decides which of the expressions that
are shared between
several PEs should be evaluated only once, and which expressions should
be evaluated locally
by each sharing PE. By applying competitive considerations, the
algorithm manages to exploit
sharing of computation-intensive expressions, while it duplicates the
evaluation of expressions
that require little time to compute. The algorithm accomplishes this
goal even though it has
no a priori knowledge of the amount of computation that is
required to evaluate the shared
expression. We show that this algorithm is competitive with a hypothetical
optimal off-line algorithm, which does have such knowledge, and we prove
that the
algorithm is deadlock free. Furthermore, this algorithm does not require
any programmer intervention, it has low
overhead, and it is designed to run on a wide variety of distributed systems.