feat!: job arrays #174

Draft
wants to merge 4 commits into
base: main
15 changes: 15 additions & 0 deletions snakemake_executor_plugin_slurm/__init__.py
@@ -6,6 +6,7 @@
import atexit
import csv
from io import StringIO
from itertools import groupby
import os
from pathlib import Path
import re
@@ -168,6 +169,20 @@ def warn_on_jobcontext(self, done=None):
    def additional_general_args(self):
        return "--executor slurm-jobstep --jobs 1"

    def run_jobs(self, jobs: List[JobExecutorInterface]):
        for rule_name, group in groupby(jobs, key=lambda job: job.rule.name):
            same_rule_jobs = list(group)  # Materialize the generator
            if len(same_rule_jobs) == 1:
                self.run_job(same_rule_jobs[0])
            else:
                # TODO submit as array
                # share code with run_job

                # TODO in the future: give a hint to the scheduler to select preferably
                # many jobs from the same rule if possible, in order to have
                # more efficient array jobs. This should be somehow tunable, because
                # it might contradict other efficiency goals.
                ...
    def run_job(self, job: JobExecutorInterface):
Comment on lines +172 to 186
💡 Verification agent

❓ Verification inconclusive

Verify implementation aligns with PR objectives

The PR mentions implementing a min_jobs resource for job aggregation, but this resource isn't referenced in the current implementation. Consider how this resource will be integrated with the job grouping logic.


🏁 Script executed:

#!/bin/bash
# Search for any references to min_jobs in the codebase
rg "min_jobs" --type python

Length of output: 58


I'll run a revised shell script using the proper file type flag to search for references to min_jobs in Python files. This will help verify whether the intended resource is referenced anywhere in the codebase.


🏁 Script executed:

#!/bin/bash
# Search for any references to 'min_jobs' in Python files using the correct file type flag.
rg "min_jobs" --type py

Length of output: 23


Action Required: Integrate the min_jobs resource with job aggregation logic

Our verification confirms that there are no references to the min_jobs resource anywhere in the Python codebase. Since the PR explicitly mentions implementing a min_jobs resource for job aggregation, the current job grouping logic in snakemake_executor_plugin_slurm/__init__.py (lines 172-186) does not reflect this change.

  • Location: snakemake_executor_plugin_slurm/__init__.py (within the run_jobs method)
  • Action: Please update the implementation to incorporate the min_jobs logic into the job aggregation process, or provide additional documentation explaining how this resource will be integrated in subsequent changes.
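One possible shape for that integration is sketched below. It is only an illustration under stated assumptions: `FakeJob`, `partition_jobs`, and the `min_jobs` parameter are hypothetical names, not part of the plugin or of Snakemake, and how the real resource would be read from a job is not settled in this PR. Groups with at least `min_jobs` members become array candidates; everything else is submitted individually.

```python
from dataclasses import dataclass
from itertools import groupby
from operator import attrgetter
from typing import List, Tuple


@dataclass
class FakeJob:
    """Hypothetical stand-in for a job: only a rule name and an id."""

    rule_name: str
    jobid: int


def partition_jobs(
    jobs: List[FakeJob], min_jobs: int = 2
) -> Tuple[List[FakeJob], List[List[FakeJob]]]:
    """Split jobs into single submissions and array candidates.

    min_jobs is an assumed threshold: a rule needs at least this many
    ready jobs before they are bundled into one array.
    """
    by_rule = attrgetter("rule_name")
    singles: List[FakeJob] = []
    arrays: List[List[FakeJob]] = []
    # groupby only merges adjacent items, so sort by the same key first.
    for _rule_name, group in groupby(sorted(jobs, key=by_rule), key=by_rule):
        same_rule_jobs = list(group)
        if len(same_rule_jobs) >= min_jobs:
            arrays.append(same_rule_jobs)
        else:
            singles.extend(same_rule_jobs)
    return singles, arrays


jobs = [
    FakeJob("align", 1),
    FakeJob("sort", 2),
    FakeJob("align", 3),
    FakeJob("align", 4),
]
singles, arrays = partition_jobs(jobs, min_jobs=3)
```

With `min_jobs=3`, the three `align` jobs form one array candidate and the lone `sort` job is left for a regular submission.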
🧰 Tools
🪛 Ruff (0.8.2)

173-173: Loop control variable rule_name not used within loop body

Rename unused rule_name to _rule_name

(B007)
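A minimal, self-contained illustration of the rename, using plain strings in place of jobs (the `rules` list is made up for the example). It also shows a property of `itertools.groupby` that matters for the loop on line 173: only adjacent equal keys are merged, so unsorted input yields more, smaller groups.

```python
from itertools import groupby

# Made-up rule names standing in for job.rule.name values.
rules = ["a", "b", "a"]

# The leading underscore marks the loop variable as intentionally unused (B007).
unsorted_sizes = [len(list(group)) for _rule_name, group in groupby(rules)]

# Sorting first makes equal keys adjacent, so they end up in one group.
sorted_sizes = [len(list(group)) for _rule_name, group in groupby(sorted(rules))]
```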

        # Implement here how to run a job.
        # You can access the job's resources, etc.