@@ -149,6 +149,17 @@ defmodule EXLA do
   To increase the stack size of dirty IO threads from 40 kilowords to
   128 kilowords. In a release, you can set this flag in your `vm.args`.
 
+  ## Distribution
+
+  EXLA allows its tensors to be sent across nodes, as long as the parent
+  node (which effectively holds the tensor) keeps a reference to the
+  tensor while it is read by any other node it was sent to.
+
+  The result of `EXLA.compile/3` can also be shared across nodes.
+  On invocation, the underlying executable is automatically serialized
+  and sent to other nodes, without requiring a full recompilation,
+  as long as the same conditions as above apply.
+
   ## Docker considerations
 
   EXLA should run fine on Docker with one important consideration:
@@ -274,11 +285,11 @@ defmodule EXLA do
       [2, 4, 6]
     >
 
-  Results are allocated on the `EXLA.Backend`. Note that the
-  `EXLA.Backend` is asynchronous: operations on its tensors
-  *may* return immediately, before the tensor data is available.
-  The backend will then block only when trying to read the data
-  or when passing it to another operation.
+  The returned function can be sent across nodes, as long as the parent
+  node (which effectively holds the function) keeps a reference to the
+  function while it is invoked by any other node it was sent to. On
+  invocation, the underlying executable is automatically serialized
+  and sent to other nodes, without requiring a full recompilation.
 
   See `jit/2` for supported options.
   """
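The distribution behavior documented by this commit could be exercised roughly as below. This is a hypothetical sketch, not part of the commit: it assumes two already-connected nodes (the remote node name `:"b@host"` is made up) with EXLA installed and configured as the default Nx backend on both, and it cannot run standalone without that cluster.

```elixir
# Sketch only: assumes a cluster of two connected nodes, both with EXLA
# installed and `Nx.default_backend(EXLA.Backend)` configured.

# On the parent node: build a tensor and a jitted function, keeping
# references to both for as long as any remote node may use them.
tensor = Nx.tensor([1, 2, 3])
fun = EXLA.jit(fn t -> Nx.multiply(t, 2) end)

parent = self()

# Send both to a process on the other node. The tensor data stays on the
# parent node; per the docs above, on first invocation the underlying
# executable is serialized and shipped over, with no recompilation there.
Node.spawn(:"b@host", fn ->
  send(parent, {:result, fun.(tensor)})
end)

receive do
  {:result, result} -> IO.inspect(result)
end
```

The key constraint from the docs is lifetime: the parent process must stay alive and hold `tensor` and `fun` while the remote node uses them, otherwise the references they carry become invalid.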