Spark job dynamic resource allocation with Cook #905
For 1, we already have a priority field on jobs which you can use; it should work for the case you described. For 2, do you mean that you'd want a job to start with a certain percentage of the resources available on a host, regardless of the absolute amount? Or do you want a job to receive as many resources as possible, with the percentage being the minimum amount on a host needed to start the job? |
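For illustration, here is a minimal sketch of setting that priority field when submitting a job through Cook's REST API. The endpoint URL and the job command are hypothetical placeholders, and the field names follow Cook's job-submission schema as I understand it:

```python
# Hedged sketch: submit a Cook job with an explicit priority.
# The host/port and command below are placeholders, not a real deployment.
import uuid
import requests

job = {
    "uuid": str(uuid.uuid4()),
    "command": "spark-submit --class com.example.MyJob my-job.jar",  # placeholder
    "cpus": 4.0,
    "mem": 8192,          # MiB
    "max_retries": 3,
    "priority": 80,       # a weight: higher values are scheduled ahead of lower ones
}

resp = requests.post(
    "http://cook.example.com:12321/rawscheduler",  # hypothetical Cook endpoint
    json={"jobs": [job]},
)
resp.raise_for_status()
```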
Dear, thank you for your fast answer. Regarding 1, do you mean that the priority setting of Cook allows a high-priority job to steal resources from a running job which has lower priority (by killing it)? Regarding 2, allowing a job to start with a certain percentage of the total available resources would be enough. If we set the percentage to 100, then the job is supposed to take all the available resources. Best regards |
Just want to make sure, for 2, are you referring to Spark's dynamic allocation (https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-dynamic-allocation.html), where Spark can request more executors as needed / as resources become available? |
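For reference, Spark's dynamic allocation is enabled through Spark configuration. A minimal PySpark sketch; the executor bounds here are illustrative, not recommendations:

```python
# Enable Spark dynamic allocation so executors are requested and released on demand.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("dynamic-allocation-demo")
    .config("spark.dynamicAllocation.enabled", "true")
    # The external shuffle service is required for dynamic allocation on most
    # cluster managers, so executors can be removed without losing shuffle data.
    .config("spark.shuffle.service.enabled", "true")
    .config("spark.dynamicAllocation.minExecutors", "1")
    .config("spark.dynamicAllocation.maxExecutors", "20")
    .getOrCreate()
)
```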
Dear, thank you for this information. This looks very interesting and seems to provide what we need for expectation 2. By the way, I wonder how the Cook scheduler works with priority. Assume that job A with low priority is running on the Spark cluster, and job B arrives with high priority. If the resources requested by B exceed what is currently available, does Cook send a delete request to the Spark API to kill A in order to release resources for B? Moreover, how long is this deleting action expected to take? Please enlighten me on this aspect. |
Regarding priorities: Job priorities in Cook allow jobs with a higher priority to preempt lower-priority jobs started by the same user. In other words, if I have a bunch of (lower-priority) batch jobs running now that are using all of my resources, and I submit a (higher-priority) Spark job, then Cook will kill some of my batch jobs to make room for the Spark job. The batch jobs that were preempted will be automatically retried again later when I have more resources available. However, if the batch jobs are killed / fail several times, they can run out of available retries, meaning they will no longer be automatically retried by Cook. Note that Cook's priority value is a priority weight value (in contrast to a priority ranking value). E.g., a priority value of 80 on a Cook job is higher priority than a priority value of 1. |
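To make that scenario concrete, here is a hedged sketch of the preemption flow described above, reusing the hypothetical Cook endpoint from the earlier example. Both submissions come from the same user, so the priority-90 Spark job may preempt the priority-10 batch jobs, which Cook then retries out of their max_retries budget:

```python
# Sketch: same-user preemption in Cook. All names and the endpoint are placeholders.
import uuid
import requests

COOK_URL = "http://cook.example.com:12321/rawscheduler"  # hypothetical

def submit(command, cpus, mem, priority):
    """Submit one job to Cook and return its UUID."""
    job_uuid = str(uuid.uuid4())
    job = {
        "uuid": job_uuid,
        "command": command,
        "cpus": cpus,
        "mem": mem,
        "max_retries": 5,  # budget for retries after preemption or failure
        "priority": priority,
    }
    requests.post(COOK_URL, json={"jobs": [job]}).raise_for_status()
    return job_uuid

# Low-priority batch jobs that fill the user's resource quota...
batch_jobs = [submit("run-batch-step.sh", 4.0, 4096, priority=10) for _ in range(10)]

# ...followed by a high-priority interactive Spark job from the same user.
# Cook may kill some of the batch jobs to make room, then retry them later.
spark_job = submit("spark-submit interactive-query.py", 16.0, 32768, priority=90)
```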
Thank you. Best |
Dear,
We are interested in running Spark jobs using Cook on our DC/OS cluster.
There are two types of Spark jobs in our cluster:
1. Batch jobs that have a high workload but do not require a fast response.
2. Interactive jobs, triggered from the user side, which have a strict response-time requirement (on the order of seconds or less).
Expectation:
1. Interactive jobs should take priority over batch jobs, so that they can start quickly even while batch jobs occupy the cluster.
2. A job should be able to start with a certain percentage of the currently available resources, rather than a fixed absolute amount.
So far, we could not find any documentation about Cook that mentions this.
Could you please enlighten us?
Best