-
Notifications
You must be signed in to change notification settings - Fork 15
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Use subcommands instead of various scripts, and allow to install lobster locally or system wide. Also try to make the language in the help consistent.
- Loading branch information
Showing
10 changed files
with
293 additions
and
184 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,30 +1,38 @@ | ||
# Dependencies | ||
# Installation | ||
|
||
[PyYaml](http://pyyaml.org/wiki/PyYAML) is a pre-requisite. Install it | ||
locally after executing `cmsenv` in a release which is suppossed to be used | ||
with lobster: | ||
## Dependencies | ||
|
||
cd /tmp | ||
wget -O - http://pyyaml.org/download/pyyaml/PyYAML-3.10.tar.gz|tar xzf - | ||
cd PyYAML-3.10/ | ||
python setup.py install --user | ||
cd .. | ||
rm -rf PyYAML-3.10/ | ||
### CClab tools | ||
|
||
See [instructions on github](https://github.com/cooperative-computing-lab/cctools) | ||
of the Notre Dame Cooperative Computing Lab to obtain versions of | ||
`parrot` and `work_queue`. | ||
|
||
### Setuptools | ||
|
||
Install the python `setuptools`, if not already present, with | ||
|
||
wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py -O - | python - --user | ||
|
||
*At ND* | ||
|
||
wget --no-check-certificate https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py -O - | python - --user | ||
|
||
Now `lobster` can be installed, and any further python dependencies will be | ||
installed into your `~/.local` directory. | ||
|
||
## Setup | ||
|
||
Install the `argparse` module, if not available for your python (normally | ||
2.7 and up): | ||
Install lobster itself with | ||
|
||
cd /tmp | ||
wget -O - http://argparse.googlecode.com/files/argparse-1.2.1.tar.gz|tar xzf - | ||
cd argparse-1.2.1/ | ||
git clone [email protected]:matz-e/lobster.git | ||
cd lobster | ||
python setup.py install --user | ||
cd .. | ||
rm -rf argparse-1.2.1/ | ||
|
||
# Setting up your environment | ||
and `lobster` will be installed as `~/.local/bin/lobster`. Add it to your | ||
path with | ||
|
||
export PYTHONPATH=$PYTHONPATH:/afs/nd.edu/user37/ccl/software/cctools-lobster/lib/python2.7/site-packages/ | ||
export PATH=/afs/nd.edu/user37/ccl/software/cctools-lobster/bin:$PATH | ||
export PATH=$PATH:$HOME/.local/bin | ||
|
||
# Running lobster | ||
|
||
|
@@ -43,10 +51,18 @@ CMS): | |
|
||
4. Running lobster | ||
|
||
./lobster.py test/beanprod.yaml | ||
lobster process test/beanprod.yaml | ||
|
||
5. Starting workers --- see below. | ||
|
||
6. Creating summary plots | ||
|
||
lobster plot --outdir <your output directory> <your config/working directory> | ||
|
||
7. Publishing | ||
|
||
lobster publish <labels> <your config/working directory> | ||
|
||
# Submitting workers | ||
|
||
To start 10 workers, 4 cores each, connecing to a lobster instance with id | ||
|
@@ -60,10 +76,51 @@ If the workers get evicted by condor, the memory and disk settings might need | |
adjustment. Check in them`lobster.py` for minimum settings (currently 1100 Mb for | ||
memory, 4 Gb for disk). | ||
|
||
# Monitoring at ND | ||
# Running at ND | ||
|
||
## Setting up your environment | ||
|
||
Use `work_queue` etc from the CC lab: | ||
|
||
export PYTHONPATH=$PYTHONPATH:/afs/nd.edu/user37/ccl/software/cctools-lobster/lib/python2.7/site-packages/ | ||
export PATH=/afs/nd.edu/user37/ccl/software/cctools-lobster/bin:$PATH | ||
|
||
or, for `tcsh` users, | ||
|
||
setenv PYTHONPATH ${PYTHONPATH}:/afs/nd.edu/user37/ccl/software/cctools-lobster/lib/python2.7/site-packages/ | ||
setenv PATH /afs/nd.edu/user37/ccl/software/cctools-lobster/bin:${PATH} | ||
|
||
## Running opportunistically | ||
|
||
The CRC login nodes `opteron`, `newcell`, and `crcfe01` are connected to | ||
the ND opportunistic computing pool. On these, multicore jobs are | ||
preferred and can be run with | ||
|
||
cores=4 | ||
condor_submit_workers -N lobster_<your_id> --cores $cores \ | ||
--memory $(($cores * 1100)) --disk $(($cores * 4500)) 10 | ||
|
||
or, for `tcsh` users, | ||
|
||
set cores=4 | ||
condor_submit_workers -N lobster_<your_id> --cores $cores \ | ||
--memory `dc -e "$cores 1100 *p"` --disk `dc -e "$cores 4500 *p"` 10 | ||
|
||
## Running locally | ||
|
||
To submit 10 workers (= 10 cores) to the T3 at ND, run | ||
|
||
condor_submit_workers -N lobster_<your_id> --cores 1 \ | ||
--memory 1000 --disk 4500 10 | ||
|
||
on `earth`. | ||
|
||
## Monitoring | ||
|
||
* [CMS dasboard](http://dashb-cms-job.cern.ch/dashboard/templates/web-job2/) | ||
* [CMS squid statistics](http://wlcg-squid-monitor.cern.ch/snmpstats/indexcms.html) | ||
* [Condor usage](http://condor.cse.nd.edu/condor_matrix.cgi) | ||
* [NDCMS trends](http://mon.crc.nd.edu/xymon-cgi/svcstatus.sh?HOST=ndcms.crc.nd.edu&SERVICE=trends&backdays=0&backhours=6&backmins=0&backsecs=0&Go=Update&FROMTIME=&TOTIME=) | ||
to monitor squid bandwidth | ||
* [External bandwidth](http://prtg1.nm.nd.edu/sensor.htm?listid=491&timeout=60&id=505&position=0) | ||
* `work_queue_status` on the command line |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,115 +1,5 @@ | ||
#!/usr/bin/env python | ||
import os | ||
import shutil | ||
import time | ||
import yaml | ||
from lobster import cmssw | ||
from lobster import job | ||
from argparse import ArgumentParser | ||
|
||
import work_queue as wq | ||
from lobster.ui import boil | ||
|
||
parser = ArgumentParser(description='A job submission tool for CMS') | ||
parser.add_argument('config_file_name', nargs='?', default='test/lobster.yaml', help='Configuration file to process.') | ||
parser.add_argument('--bijective', '-i', action='store_true', default=False, help='Use a 1-1 mapping for input and output files (process one input file per output file).') | ||
args = parser.parse_args() | ||
|
||
with open(args.config_file_name) as config_file: | ||
config = yaml.load(config_file) | ||
|
||
config['filepath'] = args.config_file_name | ||
if not 'cmssw' in repr(config): | ||
job_src = job.SimpleJobProvider(config) | ||
else: | ||
job_src = cmssw.JobProvider(config) | ||
from ProdCommon.Credential.CredentialAPI import CredentialAPI | ||
cred = CredentialAPI({'credential': 'Proxy'}) | ||
if not cred.checkCredential(Time=60): | ||
cred.ManualRenewCredential() | ||
|
||
wq.cctools_debug_flags_set("all") | ||
wq.cctools_debug_config_file(os.path.join(config["workdir"], "debug.log")) | ||
wq.cctools_debug_config_file_size(1 << 29) | ||
|
||
queue = wq.WorkQueue(-1) | ||
queue.specify_log(os.path.join(config["workdir"], "work_queue.log")) | ||
queue.specify_name("lobster_" + config["id"]) | ||
queue.specify_keepalive_timeout(300) | ||
# queue.tune("short-timeout", 600) | ||
|
||
print "Starting queue as", queue.name | ||
print "Submit workers with: condor_submit_workers -N", queue.name, "<num>" | ||
|
||
payload = 400 | ||
|
||
while not job_src.done(): | ||
stats = queue.stats | ||
|
||
print "Status: Slaves {0}/{1} - Jobs {3}/{4}/{5} - Work {2} [{6}]".format( | ||
stats.workers_busy, | ||
stats.workers_busy + stats.workers_ready, | ||
job_src.work_left(), | ||
stats.tasks_waiting, | ||
stats.tasks_running, | ||
stats.tasks_complete, | ||
time.strftime("%d %b %Y %H:%M:%S", time.localtime())) | ||
|
||
hunger = max(payload - stats.tasks_waiting, 0) | ||
|
||
while hunger > 0: | ||
t = time.time() | ||
jobs = job_src.obtain(50, bijective=args.bijective) | ||
with open(os.path.join(config["workdir"], 'debug_lobster_times'), 'a') as f: | ||
delta = time.time() - t | ||
if jobs == None: | ||
size = 0 | ||
else: | ||
size = len(jobs) | ||
ratio = delta / float(size) if size != 0 else 0 | ||
f.write("CREA {0} {1} {2}\n".format(size, delta, ratio)) | ||
|
||
if jobs == None or len(jobs) == 0: | ||
break | ||
|
||
hunger -= len(jobs) | ||
|
||
for id, cmd, inputs, outputs in jobs: | ||
task = wq.Task(cmd) | ||
task.specify_tag(id) | ||
task.specify_cores(1) | ||
# temporary work-around? | ||
task.specify_memory(1100) | ||
task.specify_disk(4000) | ||
|
||
for (local, remote) in inputs: | ||
if os.path.isfile(local): | ||
task.specify_input_file(str(local), str(remote), wq.WORK_QUEUE_CACHE) | ||
elif os.path.isdir(local): | ||
task.specify_directory(local, remote, wq.WORK_QUEUE_INPUT, | ||
wq.WORK_QUEUE_CACHE, recursive=True) | ||
else: | ||
raise NotImplementedError | ||
|
||
for (local, remote) in outputs: | ||
task.specify_output_file(str(local), str(remote)) | ||
|
||
queue.submit(task) | ||
|
||
print "Waiting for jobs to return..." | ||
task = queue.wait(3) | ||
tasks = [] | ||
while task: | ||
tasks.append(task) | ||
if queue.stats.tasks_complete > 0: | ||
task = queue.wait(1) | ||
else: | ||
task = None | ||
print "Done waiting..." | ||
if len(tasks) > 0: | ||
t = time.time() | ||
job_src.release(tasks) | ||
with open(os.path.join(config["workdir"], 'debug_lobster_times'), 'a') as f: | ||
delta = time.time() - t | ||
size = len(tasks) | ||
ratio = delta / float(size) if size != 0 else 0 | ||
f.write("RECV {0} {1} {2}\n".format(size, delta, ratio)) | ||
boil() |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.