← forum

Challenge Two Profiling for parallelization

Xiangyu Zhang, 2008/09/17 11:08 Challenge Two: A recent trend to parallelize a sequential program is to spawn a method call as a separate thread. Devise a profiler that identifies method calls that are amenable to such parallelization. Hint: you ought to consider if dependences between a method and itscontinuation are broken in the parallelized version. You can assume a dependence detector.

I received three proposals.

*Proposal 1
Proposal 2

Generally speaking, there are two situations by calling a function in 
a program. One case is the program after function foo use the return 
value of function foo. And the other case is that foo just does something 
else and the program executed after function foo does not care about the 
execution result of function foo. So, the first case is not amenable to the 
parallelization mentioned in the problem and the second one can be parallelized.

To ensure the whole program execute correctly in such parallelization way,
the profiler should determine whether the foo's continuation is data dependent on 
the return value of function foo. Actually, assume there is a dependence 
detector, we do not care what's the content inside of the function foo.    
We should just pay attention to the return value, which is either in the left
side of the function call, or in the parameters list. We should just determine
whether there are data dependence between the return values and the variable 
after function foo.

In my opinion, there are two steps. First, identify the return value. 
Second, identify whether there is data dependence between the return value
and foo's continuation.

First step, I think that's relatively easy. But there may be some challenge
in the second step because we should take care of the aliasing problem.   
In dynamic analysis, by tracing the real address of the variables, maybe 
that can be solve. However, I am not sure about this point.

Proposal 3

Solution 1:  [Easy and Immediate] If a profiler has to be used to determine
when a method call can be spawned off as a thread, then the profiler should
determine whether there is any data dependence between any statement in the
continuation and the method call - for example, the method call should not
change any member of the class which the continuation uses, the continuation
should not use any return value of the method,  etc! The profiler uses the
data dependence detector (which should also perform alias analysis) to determine 
methods that can be spun off as a thread. I also assume that the data 
dependence detector can detect dependences across nested method calls

i.e. if foo() calls bar(), foo() can be spun off as a thread only if foo()'s
continuation is not data dependent on bar(). This is however, restrictive.
It is rare that a continuation doesn't have any data dependence on the method call!

Solution 2: Another solution can be to use the profiler to determine the
first statement in a continuation with any data dependence with the method
call. Let us call this statement S. Then instrument the code, to (1) make
the method call a Future (2) insert a get() call (of the Java Future API)
before S to wait for the method call to complete and get the return value
(if any). This solution increases concurrency i.e. if foo() is a method
call,this solution's aim is to execute that initial part of foo()'s continuation 
that does not depend on foo(). The fact that this solution needs 
source code isn't an impediment because to make method calls threads, one 
needs to have the source code anyway. This solution is coarse grained, i.e. the 
entire part of the continuation after the first data dependence waits 
for the method call to finish. i.e. if statement 10 of the continuation 
has a variable that is last defined in statement 2 of foo(), statement 10 
waits for foo() to complete ... which may not be necessary, ideally 
statement 10 should wait only for statement 2.

disst/challenge_two_profiling_for_parallelization.txt · Last modified: 2019/08/16 21:33 (external edit)