
Executor for Apache Spark #499

Open
rbavery opened this issue Jul 14, 2024 · 2 comments

rbavery commented Jul 14, 2024

Could Spark be added as a supported executor?

Maybe RDD.map or RDD.mapPartitions would be the correct way to map a function over elements, analogous to map_unordered in the Lithops executor.

https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.RDD.mapPartitions.html#pyspark.RDD.mapPartitions
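To illustrate the idea, here is a minimal sketch of a map_unordered-style wrapper built on mapPartitions semantics. Note this is a hypothetical illustration: FakeRDD is a tiny local stand-in for pyspark.RDD (so the snippet runs without a Spark cluster), and map_unordered here is a made-up helper name, not Cubed's or Spark's actual API.

```python
from itertools import chain


class FakeRDD:
    """Minimal local stand-in for pyspark.RDD (hypothetical, not the real API)."""

    def __init__(self, items, num_partitions=2):
        # Distribute the items round-robin into partitions.
        self.partitions = [list(items[i::num_partitions]) for i in range(num_partitions)]

    def mapPartitions(self, f):
        # Like pyspark.RDD.mapPartitions: f receives an iterator over one
        # partition's elements and returns an iterator of results.
        new = FakeRDD([], len(self.partitions))
        new.partitions = [list(f(iter(p))) for p in self.partitions]
        return new

    def collect(self):
        return list(chain.from_iterable(self.partitions))


def map_unordered(rdd, func):
    """Apply func to every element; result order is not guaranteed,
    mirroring the semantics of map_unordered in the Lithops executor."""
    def apply_partition(elements):
        for x in elements:
            yield func(x)
    return rdd.mapPartitions(apply_partition).collect()


results = map_unordered(FakeRDD([1, 2, 3, 4, 5]), lambda x: x * x)
print(sorted(results))  # sorted for display, since ordering is not guaranteed
```

With a real pyspark.RDD the same apply_partition generator would be passed to RDD.mapPartitions directly; the per-partition iterator interface is what makes it a reasonable fit for batching chunk operations.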

To support this, a guess would need to be made up front about the memory reserved for Python UDFs. It sounds like this would currently be done globally, but maybe later it could be done on a per-operator basis?
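For example, the global reservation could presumably be expressed through Spark's existing memory settings; a sketch, where the values and the job script name are placeholders:

```shell
# Sketch: reserve memory for Python workers globally, per executor.
# spark.executor.pyspark.memory (Spark 2.4+) caps the memory available
# to Python processes on each executor; values here are placeholders.
spark-submit \
  --conf spark.executor.memory=4g \
  --conf spark.executor.pyspark.memory=2g \
  my_job.py
```

A per-operator reservation would need something finer-grained than these cluster-wide settings, which is presumably why it is flagged as a later step.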

@tomwhite (Member) commented:

A Spark executor would be a great addition. I just added some notes about implementing a new executor in #498 if you're interested in having a go at this, @rbavery?

rbavery commented Jul 23, 2024

I'm definitely interested, thanks for adding notes. It's possible I won't make quick (or any) progress because of other responsibilities 😬
