Description
Hi!
Is your feature request related to a problem? Please describe.
Currently, ImmutableTaskGraph binds input/output objects and device transfer directives at graph construction and snapshot time. This design prevents safe concurrent execution of the same graph instance by multiple Java threads with different inputs/outputs, and forces users to duplicate the graph and re-compile the kernels for each thread/context.
TestMultiThreadedExecutionPlans.java is the existing example for multiple Java threads executing the same TaskGraph.
In that example, test01() and test02() execute the same TaskGraph without changing the input at all, because both threads share the same FloatArray input (and output). Also, each thread creates its own snapshot() of the graph, which means each snapshot has to compile its own kernel.
Likewise, test03() allows two threads to use different inputs, but each thread compiles an entire TaskGraph of its own to allow the inputs to differ.
Describe the solution you'd like
Introduce a thread-safe execution API for ImmutableTaskGraph that allows per-invocation binding of kernel arguments and data transfer objects, such as:
myImmutableTaskGraph.execute(inputA, inputB, outputC)
There are several steps to achieve this:
- Remove per-execution context from ImmutableTaskGraph and make it truly "Immutable" (this involves TornadoTaskGraph as the actual implementation).
- Allow passing Param<T> placeholders to Tasks, device transfers and persistence directives instead of the actual Object... parameters.
- On every execute(Object... params), create an individual execution context and bind the given params to that execution.
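To make the three steps above more concrete, here is a minimal plain-Java sketch of the idea. It does not use TornadoVM at all: Param, ExecutionContext, Kernel and SketchGraph are hypothetical names I made up for illustration, and the "kernel" is just a Java lambda standing in for compiled device code. The only point is that the graph holds placeholders and no data, so a fresh context per execute() call lets two threads share one graph instance.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical placeholder: names an argument slot by position and holds no data.
final class Param<T> {
    final int index;
    Param(int index) { this.index = index; }
}

// Hypothetical per-invocation state: created on every execute(), never shared.
final class ExecutionContext {
    private final Object[] bound;
    ExecutionContext(Object[] bound) { this.bound = bound; }

    @SuppressWarnings("unchecked")
    <T> T resolve(Param<T> p) { return (T) bound[p.index]; }
}

// Stand-in for a compiled kernel; it only sees the context of its own invocation.
interface Kernel {
    void run(ExecutionContext ctx);
}

// Hypothetical immutable graph: no fields are written after construction.
final class SketchGraph {
    private final int arity;
    private final List<Kernel> kernels;

    SketchGraph(int arity, List<Kernel> kernels) {
        this.arity = arity;
        this.kernels = List.copyOf(kernels);
    }

    // Binds the given objects to the placeholders for this call only.
    void execute(Object... params) {
        if (params.length != arity) {
            throw new IllegalArgumentException("expected " + arity + " params, got " + params.length);
        }
        ExecutionContext ctx = new ExecutionContext(params.clone());
        for (Kernel k : kernels) {
            k.run(ctx);
        }
    }
}

class ParamBindingSketch {
    // Builds c = a + b once, in terms of placeholders rather than concrete arrays.
    static SketchGraph vectorAdd() {
        Param<float[]> a = new Param<>(0);
        Param<float[]> b = new Param<>(1);
        Param<float[]> c = new Param<>(2);
        Kernel add = ctx -> {
            float[] x = ctx.resolve(a);
            float[] y = ctx.resolve(b);
            float[] out = ctx.resolve(c);
            for (int i = 0; i < out.length; i++) {
                out[i] = x[i] + y[i];
            }
        };
        return new SketchGraph(3, List.of(add));
    }

    public static void main(String[] args) throws Exception {
        SketchGraph graph = vectorAdd(); // built ("compiled") once
        float[] out1 = new float[2];
        float[] out2 = new float[2];
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // Two threads run the same graph instance with different inputs and outputs.
        Future<?> f1 = pool.submit(() -> graph.execute(new float[]{1, 2}, new float[]{3, 4}, out1));
        Future<?> f2 = pool.submit(() -> graph.execute(new float[]{10, 20}, new float[]{30, 40}, out2));
        f1.get();
        f2.get();
        pool.shutdown();
        System.out.println(out1[0] + " " + out2[0]);
    }
}
```

In a real implementation the context would presumably also have to carry device buffers and transfer state, which this sketch ignores completely.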
Besides enabling multiple threads to execute the same ImmutableTaskGraph at the same time, this approach provides a more "functional" coding pattern, because the parameters are supplied at the .execute() call rather than at graph construction.
The existing API can still be supported by keeping the existing Task(Object...) calls and assigning these parameters as "default" parameters for execution.
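A rough sketch of how such a "default" fallback could behave, again in plain Java with made-up names (DefaultingGraph and bind are hypothetical, not TornadoVM API): objects captured at construction are used when execute() is called without arguments, and per-call arguments replace them otherwise. The kernel here is just a sum over an int array.

```java
import java.util.Arrays;

// Hypothetical graph that remembers construction-time arguments as defaults.
final class DefaultingGraph {
    private final Object[] defaults; // captured once, like today's task(Object...) arguments

    DefaultingGraph(Object... defaults) {
        this.defaults = defaults.clone();
    }

    // Chooses the arguments for one invocation: defaults if none are given.
    Object[] bind(Object... params) {
        if (params.length == 0) {
            return defaults.clone();
        }
        if (params.length != defaults.length) {
            throw new IllegalArgumentException(
                "expected " + defaults.length + " params, got " + params.length);
        }
        return params.clone();
    }

    // Stand-in kernel: writes the sum of args[0] into args[1][0].
    void execute(Object... params) {
        Object[] args = bind(params);
        int[] in = (int[]) args[0];
        int[] out = (int[]) args[1];
        out[0] = Arrays.stream(in).sum();
    }
}

class DefaultParamsSketch {
    public static void main(String[] args) {
        int[] defaultIn = {1, 2, 3};
        int[] defaultOut = new int[1];
        DefaultingGraph graph = new DefaultingGraph(defaultIn, defaultOut);

        graph.execute(); // existing style: uses the construction-time objects

        int[] out = new int[1];
        graph.execute(new int[]{10, 20}, out); // proposed style: per-call binding

        System.out.println(defaultOut[0] + " " + out[0]);
    }
}
```

Note that the no-argument path still shares the default objects between callers, so only the per-call form would be safe for concurrent use with different data.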
Describe alternatives you've considered
- Continue with current approach, duplicating snapshot() and recompiling for each thread/context (inefficient)
Additional context
What do you think about this?
I'm not sure I'll have the time, but I might work on this.
Before starting, I would be happy if someone could verify that this approach can work and point me to the relevant code where execution context is currently handled in TornadoVM.