
Executor for Apache Spark #499

Open
rbavery opened this issue Jul 14, 2024 · 2 comments

rbavery commented Jul 14, 2024

Could Spark be added as a supported executor?

Maybe RDD.map or RDD.mapPartitions would be the correct way to map a function over elements, analogous to map_unordered in the Lithops executor.

https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.RDD.mapPartitions.html#pyspark.RDD.mapPartitions
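To illustrate the idea, here is a minimal sketch of a map_unordered-style wrapper built on mapPartitions semantics. Note this is a hypothetical illustration: FakeRDD is a tiny local stand-in for pyspark.RDD (so the snippet runs without a Spark cluster), and map_unordered here is a made-up helper name, not Cubed's or Spark's actual API.

```python
from itertools import chain


class FakeRDD:
    """Minimal local stand-in for pyspark.RDD (hypothetical, not the real API)."""

    def __init__(self, items, num_partitions=2):
        # Distribute the items round-robin into partitions.
        self.partitions = [list(items[i::num_partitions]) for i in range(num_partitions)]

    def mapPartitions(self, f):
        # Like pyspark.RDD.mapPartitions: f receives an iterator over one
        # partition's elements and returns an iterator of results.
        new = FakeRDD([], len(self.partitions))
        new.partitions = [list(f(iter(p))) for p in self.partitions]
        return new

    def collect(self):
        return list(chain.from_iterable(self.partitions))


def map_unordered(rdd, func):
    """Apply func to every element; result order is not guaranteed,
    mirroring the semantics of map_unordered in the Lithops executor."""
    def apply_partition(elements):
        for x in elements:
            yield func(x)
    return rdd.mapPartitions(apply_partition).collect()


results = map_unordered(FakeRDD([1, 2, 3, 4, 5]), lambda x: x * x)
print(sorted(results))  # sorted for display, since ordering is not guaranteed
```

With a real pyspark.RDD the same apply_partition generator would be passed to RDD.mapPartitions directly; the per-partition iterator interface is what makes it a reasonable fit for batching chunk operations.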

To support this, a guess would need to be made up front about the memory reserved for Python UDFs. It sounds like this would currently be done globally, but maybe later it could be done on a per-operator basis?
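For example, the global reservation could presumably be expressed through Spark's existing memory settings; a sketch, where the values and the job script name are placeholders:

```shell
# Sketch: reserve memory for Python workers globally, per executor.
# spark.executor.pyspark.memory (Spark 2.4+) caps the memory available
# to Python processes on each executor; values here are placeholders.
spark-submit \
  --conf spark.executor.memory=4g \
  --conf spark.executor.pyspark.memory=2g \
  my_job.py
```

A per-operator reservation would need something finer-grained than these cluster-wide settings, which is presumably why it is flagged as a later step.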

@tomwhite (Member) commented:

A Spark executor would be a great addition. I just added some notes about implementing a new executor in #498 if you're interested in having a go at this, @rbavery?

rbavery commented Jul 23, 2024

I'm definitely interested, thanks for adding notes. It's possible I won't make quick (or any) progress because of other responsibilities 😬
