← forum
Challenge Two Profiling for parallelization
Xiangyu Zhang, 2008/09/17 11:08 Challenge Two: A recent trend to parallelize a sequential program is to spawn a method call as a separate thread. Devise a profiler that identifies method calls that are amenable to such parallelization. Hint: you ought to consider if dependences between a method and itscontinuation are broken in the parallelized version. You can assume a dependence detector.
I received three proposals.
- Proposal 2
Generally speaking, there are two situations by calling a function in a program. One case is the program after function foo use the return value of function foo. And the other case is that foo just does something else and the program executed after function foo does not care about the execution result of function foo. So, the first case is not amenable to the parallelization mentioned in the problem and the second one can be parallelized.
To ensure the whole program execute correctly in such parallelization way, the profiler should determine whether the foo's continuation is data dependent on the return value of function foo. Actually, assume there is a dependence detector, we do not care what's the content inside of the function foo. We should just pay attention to the return value, which is either in the left side of the function call, or in the parameters list. We should just determine whether there are data dependence between the return values and the variable after function foo.
In my opinion, there are two steps. First, identify the return value. Second, identify whether there is data dependence between the return value and foo's continuation.
First step, I think that's relatively easy. But there may be some challenge in the second step because we should take care of the aliasing problem. In dynamic analysis, by tracing the real address of the variables, maybe that can be solve. However, I am not sure about this point.
- Proposal 3
Solution 1: [Easy and Immediate] If a profiler has to be used to determine when a method call can be spawned off as a thread, then the profiler should determine whether there is any data dependence between any statement in the continuation and the method call - for example, the method call should not change any member of the class which the continuation uses, the continuation should not use any return value of the method, etc! The profiler uses the data dependence detector (which should also perform alias analysis) to determine methods that can be spun off as a thread. I also assume that the data dependence detector can detect dependences across nested method calls
i.e. if foo() calls bar(), foo() can be spun off as a thread only if foo()'s continuation is not data dependent on bar(). This is however, restrictive. It is rare that a continuation doesn't have any data dependence on the method call!
Solution 2: Another solution can be to use the profiler to determine the first statement in a continuation with any data dependence with the method call. Let us call this statement S. Then instrument the code, to (1) make the method call a Future (2) insert a get() call (of the Java Future API) before S to wait for the method call to complete and get the return value (if any). This solution increases concurrency i.e. if foo() is a method call,this solution's aim is to execute that initial part of foo()'s continuation that does not depend on foo(). The fact that this solution needs source code isn't an impediment because to make method calls threads, one needs to have the source code anyway. This solution is coarse grained, i.e. the entire part of the continuation after the first data dependence waits for the method call to finish. i.e. if statement 10 of the continuation has a variable that is last defined in statement 2 of foo(), statement 10 waits for foo() to complete ... which may not be necessary, ideally statement 10 should wait only for statement 2.