===============================
Release Notes for Pegasus 5.0.7
===============================
We are happy to announce the release of Pegasus 5.0.7, which is a minor
bug fix release for the Pegasus 5.0 branch.
The release can be downloaded from https://pegasus.isi.edu/downloads
This release includes improvements such as:
- users can now specify an input directory with executables to avoid
creating a transformation catalog
- support for SGE clusters in pegasus-init
- support for private tokens while retrieving data from http endpoints
- preference of apptainer executables over singularity
JIRA items
----------------------------
Exhaustive list of features, improvements and bug fixes can be found below.
Pegasus JIRA is accessible at https://jira.isi.edu/
New Features and Improvements
-----------------------------
1) [PM-1926] – pegasus should allow users to specify an input directory with
executables to avoid creating a TC
2) [PM-1888] – Apptainer support
3) [PM-1929] – convenient way to associate profiles for a site in properties file
4) [PM-1931] – support for local SGE cluster in pegasus-init
5) [PM-1933] – support for private-token to curl invocations
6) [PM-1936] – CLONE – support for local SGE cluster in pegasus-init
7) [PM-1928] – update python api to add the -t|--transformations-dir option to the planner
8) [PM-1930] – add a convenience function to python api for adding site profiles
9) [PM-1935] – update the planner to optionally use the credentials file
for http transfers
10) [PM-1937] – support for mixed binary/venv/conda installs
Bugs Fixed
----------
1) [PM-1921] – arm arch string not consistent on linux and mac platforms
2) [PM-1922] – fix recording of download metrics
3) [PM-1927] – kickstart fails to filter out some UTF-8 non-printable characters
===============================
Release Notes for Pegasus 5.0.6
===============================
We are happy to announce the release of Pegasus 5.0.6, which is a minor bug
fix release for the Pegasus 5.0 branch.
The release can be downloaded from:
https://pegasus.isi.edu/downloads
JIRA items
----------------------------
Exhaustive list of features, improvements and bug fixes can be found below.
Pegasus JIRA is accessible at https://jira.isi.edu/
New Features and Improvements
-----------------------------
[PM-1907] Improve stash integration to be osdf:// aware
[PM-1910] Handle dagman no longer inheriting user environment for
the dagman job
[PM-1911] Add support for Arm 64 architecture (aarch64)
[PM-1917] Enable host-wide metrics collection
Bugs Fixed
----------
[PM-1905] File dependencies between sub workflow and compute
jobs broken
[PM-1906] Planner container mount point parsing breaks on . in
the dir name
[PM-1909] request_disk is incorrectly set to MBs instead of KBs
[PM-1913] +DAGNodeRetry for attrib=value assignment breaks on
HTCondor 10.0.x when direct submission is disabled
[PM-1916] Data management between parent compute job and a sub
workflow job broken
[PM-1918] Inplace cleanup broken when a sub workflow job and a
parent compute job has a data dependency
===============================
Release Notes for Pegasus 5.0.5
===============================
We are happy to announce the release of Pegasus 5.0.5, which is a minor bug
fix release for the Pegasus 5.0 branch. This release corrects a build/packaging
problem in 5.0.4 that resulted in the planner not finding all classes.
We invite our users to give it a try.
The release can be downloaded from:
https://pegasus.isi.edu/downloads
JIRA items
----------------------------
Exhaustive list of features, improvements and bug fixes can be found below.
Pegasus JIRA is accessible at https://jira.isi.edu/
Bugs Fixed
----------
[PM-1904] Incomplete clean between ant targets
===============================
Release Notes for Pegasus 5.0.4
===============================
We are happy to announce the release of Pegasus 5.0.4, which is a minor bug
fix release for the Pegasus 5.0 branch. This release has some important
updates, namely:
- Support for HTCondor 10.2 series
- Improved sub workflow file handling
We invite our users to give it a try.
The release can be downloaded from:
https://pegasus.isi.edu/downloads
JIRA items
----------------------------
Exhaustive list of features, improvements and bug fixes can be found below.
Pegasus JIRA is accessible at https://jira.isi.edu/
New Features and Improvements
-----------------------------
[PM-1890] pegasus-analyzer should show failing jobs
[PM-1891] pegasus-analyzer should traverse all sub workflows
[PM-1898] File dependencies for sub workflow jobs - differentiate
inputs for planner use and those for sub workflow
[PM-1899] update python api and json schema to expose forPlanning
boolean attribute with files in uses section
[PM-1900] update java wf api to support forPlanner attribute for files
Bugs Fixed
----------
[PM-1895] handle condor_submit updated way of specifying environment
in the .dag.condor.sub file
[PM-1893] need to explicitly mount sharedfilesystem dir into container
when using shared filesystem as staging site for nonsharedfs
[PM-1894] worker package transfer into application containers
[PM-1896] In pegasus lite scripts worker package strict check is
turned off
[PM-1897] update pegasus-configure-glite to use BLAHPD_LOCATION
===============================
Release Notes for Pegasus 5.0.3
===============================
We are happy to announce the release of Pegasus 5.0.3, which is a minor bug fix release for the Pegasus 5.0
branch. This release has some important updates, namely:
- Support for deep LFNs in CondorIO mode
If you are using bypass of staging for input files, then support for deep LFNs depends
on the associated HTCondor ticket 1325, which will be fixed in HTCondor release 10.1.0.
- Per Job Symlinking
- New Containers exercise in the Pegasus Tutorial
We invite our users to give it a try.
The release can be downloaded from:
https://pegasus.isi.edu/downloads
JIRA items
----------------------------
Exhaustive list of features, improvements and bug fixes can be found below.
Pegasus JIRA is accessible at https://jira.isi.edu/
New Features and Improvements
-----------------------------
1) [PM-1873] – Add a containers focussed exercise to the tutorial
2) [PM-1875] – Support deep LFN’s in CondorIO mode
3) [PM-1806] – Fix dest filenames for transformations
4) [PM-1815] – Prevent pegasus-lite failure when user passes -w to docker
5) [PM-1871] – Remove the “version” parameter from worker package transformation in TC documentation
6) [PM-1879] – Per job symlinking
7) [PM-1885] – Allow bypass of input files in CondorIO mode to be similar to behavior for nonsharedfs
8) [PM-1876] – implement moveto support in pegasus-transfer
9) [PM-1877] – mimic transfer_output_remaps in pegasus-lite-local.sh for local universe jobs
Bugs Fixed
----------
1) [PM-1809] – Job fails when a container has a pre-existing group with the same gid as the
one being created with groupadd
2) [PM-1864] – request_memory and request_disk do not get applied for local universe jobs
3) [PM-1868] – pegasus-init cli tool not working
4) [PM-1872] – Unable to locate executable when pegasus::worker transform is overridden
5) [PM-1878] – hierarchical workflows broken on recent condor install
6) [PM-1883] – users cannot decrease planner logging if pegasus.mode is set to debug
7) [PM-1884] – pegasus-remove does not remove a running workflow
8) [PM-1887] – update broken bamboo tests
===============================
Release Notes for Pegasus 5.0.2
===============================
We are happy to announce the release of Pegasus 5.0.2, which is a minor bug
fix release for the Pegasus 5.0 branch. This release has some important updates,
namely:
- Updated Pegasus Log4J support to 2.17.
- Globus Online Transfers now have support for consent options on endpoints
The release also has important bug fixes related to correctly detecting job
failures for grid universe jobs.
We invite our users to give it a try.
The release can be downloaded from:
https://pegasus.isi.edu/downloads
JIRA items
----------------------------
Exhaustive list of features, improvements and bug fixes can be found below.
New Features and Improvements
-----------------------------
1) [PM-1828] – Doc. mismatch
2) [PM-1817] – pegasus-run with shell code generator
3) [PM-1824] – decaf jobs should be associated with pegasus-exitcode postscript
4) [PM-1825] – update glossary
5) [PM-1826] – way to pass on additional arguments to clustered jobs
6) [PM-1830] – Globus Online transfers required GO consent
7) [PM-1835] – Upcoming changes to DAGMan output logging
8) [PM-1836] – Update Pegasus Log4J support to 2.16
9) [PM-1839] – allow easy clustering of the whole workflow without associating
labels for the jobs
10) [PM-1673] – passing properties as str to Properties() can be error prone,
add some preliminary checks before writing
11) [PM-1759] – user facing class/function args shouldn’t be prefixed
with _ such as _id
12) [PM-1816] – make it easier to add entries to the replica catalog by inferring
site, lfn, pfn from file Path or URL
13) [PM-1822] – improve parsing of value in Mixins._to_mb(value)
14) [PM-1827] – type check pfn in ReplicaCatalog.add_replica()
15) [PM-1831] – planner by default should pick up credentials.conf when pegasus-s3 is used
16) [PM-1834] – ensure that yaml is serialized in a deterministic manner
17) [PM-1858] – pegasus-s3 should pick up PEGASUS_CREDENTIAL environment variable
18) [PM-1859] – Document decaf as a clustering tool in the clustering guide
Bugs Fixed
----------
1) [PM-1763] – validate all strings that will then be used as filenames or
used in sub files
2) [PM-1821] – job failures not detected for grid universe jobs
3) [PM-1823] – Serialization of pegasus.memory results in a floating point no.
4) [PM-1832] – extraneous whitespace in arguments for sub workflow job with java generator
5) [PM-1837] – planner throws null pointer exception when invalid staging site is given
6) [PM-1838] – Intermediate outputs in a clustered job get sent back to staging
site when they are not used by subsequent jobs outside of the cluster
and when stage_out has been set to false for those files
7) [PM-1840] – Workflow class methods will always be None
8) [PM-1851] – PegasusLite submissions to local cluster (Slurm/PBS/etc) unable
to source pegasus-lite-common.sh
===============================
Release Notes for Pegasus 5.0.1
===============================
We are happy to announce the release of Pegasus 5.0.1. Pegasus 5.0.1 is a
minor bug fix release after Pegasus 5.0. We invite our users to give it
a try.
The release features improvements to the Pegasus Python API, including
the ability to statically visualize the abstract and generated executable
workflows. It also has improved support for DECAF, including the ability to
have clustered jobs in a workflow executed using DECAF. This release
improves data access in PegasusLite jobs when the data resides on the local
site and the job runs on a site where the “auxiliary.local” profile is set to
true. Users can now use a new Submit Mapper called Named that allows
you to specify which sub directory a job’s submit files are placed in.
The release also features updated support for submission of jobs using
HubZero Distribute to HPC Clusters and a new pegasus.mode called "debug"
that enables verbose logging throughout the Pegasus stack.
The release can be downloaded from:
https://pegasus.isi.edu/downloads
JIRA items
----------------------------
Exhaustive list of features, improvements and bug fixes can be found below.
New Features and Improvements
-----------------------------
1) [PM-1726] – Update support for HubZero Distribute
2) [PM-1751] – Named Submit Directory Mapper
3) [PM-1798] – instead of the workflow having explicit data flow jobs,
get pegasus to automatically cluster jobs to a decaf
representation
4) [PM-1753] – add Workflow.get_status()
5) [PM-1767] – remove the default arguments, output_sites and cleanup
in SubWorkflow.add_planner_args()
6) [PM-1786] – update usage of threading.Thread.isAlive() to be
is_alive() in python scripts
7) [PM-1788] – Add configuration documentation for hierarchical workflows
8) [PM-1429] – Introduce PEGASUS_ENV variable to define mode of workflow
i.e. development, production, etc
9) [PM-1651] – Add more profile keys in the add_pegasus_profile
10) [PM-1672] – override add_args for SubWorkflow so that args refer to planner args
11) [PM-1706] – sphinx has hardcoded versions
12) [PM-1730] – 5.0.1 Python Api improvements
13) [PM-1733] – expand on checkpointing documentation
14) [PM-1739] – expose panda job submissions similar to how we support BOSCO
15) [PM-1742] – allow a tc to be empty without the planner failing
16) [PM-1743] – allow catalogs to be embedded into workflow when workflow
contains sub workflows
17) [PM-1747] – 031-montage-condor-io-jdbcrc failing
18) [PM-1768] – replace GRAM workflow tests with bosco
19) [PM-1769] – update tests since /nfs/ccg3 is gone now
20) [PM-1771] – pegasus-db-admin upgrade
21) [PM-1780] – Refactor Transfer Engine Code
22) [PM-1787] – auxiliary.local is not considered when triggering symlink in
PegasusLite in nonsharedfs mode
23) [PM-1792] – decaf jobs over bosco
24) [PM-1794] – put in support for additional keys required by decaf
25) [PM-1796] – passing properties to be set for sub workflow jobs
26) [PM-1800] – enable inplace cleanup for hierarchical workflows
27) [PM-1802] – Add support for Debian 11
28) [PM-1803] – use force option when doing a docker rm of the container image
29) [PM-1810] – Extend debug capabilities for pegasus.mode
30) [PM-1811] – add pegasus-keg to worker package
31) [PM-1818] – new pegasus.mode debug
32) [PM-1723] – add_<namespace>_profile() should be plural
33) [PM-1731] – functions that take in File objects as input parameters should also accept strings for convenience
34) [PM-1744] – progress bar from wf.wait() should include “UNRDY” as shown in status output
35) [PM-1755] – catalog write location should be stored upon call to catalog.write()
36) [PM-1757] – add pegasus profile relative.submit.dir
37) [PM-1784] – Refactor Stagein Generator code out of Transfer Engine
38) [PM-1790] – extend site catalog schema to indicate shared file system access for a directory
39) [PM-1791] – update planner to parse sharedFileSystem attribute from site catalog
40) [PM-1797] – use logging over print statements
41) [PM-1804] – add verbose options for development mode
Bugs Fixed
----------
1) [PM-1709] – the yaml handler in pegasus-graphviz needs to handle ‘checkpoint’ link type
2) [PM-1722] – Job node_label attribute is not identified by the planner
3) [PM-1725] – nodeLabel for a job needs to be parsed in yaml handler if it is given
4) [PM-1736] – Pegasus pollutes the job env when getenv=true
5) [PM-1737] – monitord fails on divide by 0 error while computing avg cpu utilization
6) [PM-1745] – time.txt in stats is misformatted
7) [PM-1746] – jobs aborted by dagman, but with kickstart exitcode as 0 are not marked as failed job
8) [PM-1748] – planner fails with NPE on empty workflow
9) [PM-1750] – ensemble mgr workflow priorities need to be reversed
10) [PM-1752] – fix checkpoint.time in add_pegasus_profile
11) [PM-1754] – pegasus-db-admin fails to upgrade database
12) [PM-1761] – pegasus-analyzer showing “failed to send files” error when root cause is exec format error
13) [PM-1762] – pegasus-analyzer showing no error at all when workflow failed based on status output
14) [PM-1764] – fix pegasus-analyzer output typo
15) [PM-1765] – for SubWorkflow jobs, the planner argument, --output-sites, isn’t being set
16) [PM-1766] – for SubWorkflow jobs, the planner argument, --force, isn’t being set
17) [PM-1770] – 041-jdbcrc-performance failing
18) [PM-1772] – db upgrade leaves transient tables
19) [PM-1777] – pegasus-graphviz producing incorrect dot file when redundant edges removed
20) [PM-1779] – Stage out job executed on local instead of remote site (donut)
21) [PM-1783] – bypass input staging in nonsharedfs mode does not work for file URL and auxiliary.local set
22) [PM-1785] – hostnames missing from elasticsearch job data
23) [PM-1789] – Scratch dir GET/PUT operations get overridden
24) [PM-1795] – Output Mapper in conjunction with data dependencies between sub workflow jobs
25) [PM-1799] – json schema validation fails for selector profiles
26) [PM-1820] – Deserializing a YAML transformation files always sets the os.type to linux
===============================
Release Notes for Pegasus 5.0.0
===============================
We are happy to announce the release of Pegasus 5.0. Pegasus 5.0
is a major release of Pegasus and builds upon the beta version
released a couple of months ago. It also includes all features and
bug fixes from the 4.9 branch. We invite our users to give it a
try.
The release can be downloaded from:
https://pegasus.isi.edu/downloads
If you are an existing user, please carefully follow the upgrade
instructions at
https://pegasus.isi.edu/docs/5.0.0/user-guide/migration.html#migrating-from-pegasus-4-9-x-to-pegasus-5-0
Highlights of the Release
--------------------------
1) Reworked Python API:
This new API has been developed from the
ground up so that, in addition to generating the abstract workflow
and all the catalogs, it now allows you to plan, submit, monitor,
analyze and generate statistics of your workflow.
To use this new Python API, refer to the "Moving From
DAX3 to Pegasus.api" guide at
https://pegasus.isi.edu/docs/5.0.0/user-guide/migration.html#moving-from-dax3
2) Adoption of YAML formats:
With Pegasus 5.0, we are adopting YAML to represent all major
catalogs. We have provided catalog converters so that you can
convert your existing catalogs to the new formats. In 5.0, the
following are now represented in YAML:
- Abstract Workflow
- Replica Catalog
- Transformation Catalog
- Site Catalog
- Kickstart Provenance Records
3) Python3 Support:
- All Pegasus tools are Python 3 compliant.
- The 5.0 release requires Python 3 on the workflow submit node.
- Python PIP packages for workflow composition and monitoring
4) Default data configuration
In Pegasus 5.0, the default data configuration has been changed
to condorio. Up to the 4.9.x releases, the default configuration
was sharedfs.
5) Zero configuration required to submit to local HTCondor pool
6) Data Management Improvements
- New output replica catalog that registers outputs including
file metadata such as size and checksums
- Ability to bypass staging of files at a per-file,
executable and container level
- Improved support for hierarchical workflows allows you to create
data dependencies between sub workflow jobs and compute jobs
- Support for staging of generated outputs to multiple output sites
- Support for integrity checking of user executables and application
containers in addition to data
- Support for webdav transfers
- Easier enabling of data reuse by specifying previous workflow
submit directories using the --reuse option to pegasus-plan.
- Stagein transfer jobs are assigned priorities based on the number
of child compute jobs. Details can be found in JIRA ticket 1385
(https://jira.isi.edu/browse/PM-1385).
7) New Jupyter Notebook Based Tutorial
With this release, we are pleased to announce a brand new tutorial
based on a Docker container running interactive Jupyter notebooks.
The tutorial is available as part of the Pegasus documentation.
8) Support for CWL (Common Workflow Language)
The pegasus-cwl-converter command line tool has been developed to
convert a subset of the Common Workflow Language (CWL) to Pegasus’s
native YAML format. Given the following three files: a CWL workflow
file, a workflow inputs specification file, and a transformation
(executable) specification file, pegasus-cwl-converter
will do a best-effort translation. This will also work with CWL
workflow specifications that refer to Docker containers. The entire
CWL language specification is not yet covered by this converter, and
as such we can provide additional support in converting
your workflows into the Pegasus YAML format.
9) Support for triggers in Ensemble Manager
The Pegasus Ensemble Manager is a service that manages collections
of workflows. In this latest release of Pegasus, workflow triggering
functionality has been added to this service. With the ensemble manager
service up and running, the pegasus-em command can now be used to start
workflow triggers. Two triggers are currently supported:
- a cron based trigger: The cron based trigger will, at a given
interval, submit a new workflow to your ensemble.
- a cron based, file pattern trigger: The cron based, file pattern
trigger, much like the cron based trigger, will submit a new
workflow to your ensemble at a given time interval, with the addition
of any new files that are detected based on a given file pattern.
This is useful for automatically processing data as it arrives.
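As a rough illustration of the file pattern idea only (this is not the
Ensemble Manager's implementation), the sketch below shows how files that
match a pattern and have not been reported before could be picked up at each
interval. The function name new_matching_files, the "seen" set and the
example file names are all hypothetical.

```python
import fnmatch
import os
import tempfile


def new_matching_files(directory, pattern, seen):
    """Return names in `directory` that match `pattern` and were not seen before.

    `seen` is a set that is updated in place, so each file is reported once.
    (Hypothetical helper for illustration; not part of Pegasus.)
    """
    matches = {n for n in os.listdir(directory) if fnmatch.fnmatch(n, pattern)}
    fresh = sorted(matches - seen)
    seen.update(fresh)
    return fresh


if __name__ == "__main__":
    seen = set()
    with tempfile.TemporaryDirectory() as d:
        # Files present when the first interval fires.
        open(os.path.join(d, "run1.csv"), "w").close()
        open(os.path.join(d, "notes.txt"), "w").close()
        print(new_matching_files(d, "*.csv", seen))
        # A new file arrives before the second interval fires.
        open(os.path.join(d, "run2.csv"), "w").close()
        print(new_matching_files(d, "*.csv", seen))
```

In a real trigger, the list returned at each interval would become the input
files of the newly submitted workflow.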
10) Improved events for each job reported to AMQP
Historically, the events reported to AMQP endpoints are normalized
events corresponding to the stampede database, which makes
correlation hard. Pegasus now also reports a new job composite event
(stampede.job_inst.composite) to AMQP endpoints that contains
complete information about a job execution.
11) Revamped Documentation
Documentation has been overhauled and broken down into a user guide
and a reference guide. In addition, we have moved to readthedocs style
documentation using restructured text. The documentation can be found
at https://pegasus.isi.edu/docs/5.0.0/index.html.
12) pegasus-statistics reports memory usage and avg cpu utilization
pegasus-statistics now reports memory usage and average cpu utilization
for your jobs in the transformation statistics file (breakdown.txt).
13) PegasusLite Improvements
- Users can now specify environment setup scripts in the site catalog
that need to be sourced to setup the environment before the job is
launched by PegasusLite. More details at
https://pegasus.isi.edu/docs/5.0.0/reference-guide/pegasus-lite.html#setting-the-environment-in-pegasuslite-for-your-job
- Users can get the transfers in PegasusLite to run on DTN nodes while
the jobs run on the compute nodes. More details at
https://pegasus.isi.edu/docs/5.0.0/reference-guide/pegasus-lite.html#specify-compute-job-in-pegasuslite-to-run-on-different-node
14) Credentials existence is checked upfront by the planner
15) Performance improvements for pegasus-rc-client
pegasus-rc-client now does bulk inserts when inserting entries into
a database backed Replica Catalog.
16) Pegasus ensures a consistent UTF8 environment across the full workflow
Details can be found in JIRA ticket 1592.
JIRA items
-----------
Exhaustive list of features, improvements and bug fixes can be found below.
New Features
------------
[PM-603] – Enable workflows to reference local input files directly instead of symlinking them
[PM-1133] – Kickstart should send a heartbeat so that condor can kill stuck jobs
[PM-1156] – PegasusLite to tar up the contents of the cwd in case of job failure
[PM-1278] – stats should include cpu and memory utilization
[PM-1309] – develop a pip package that only contains the DAX API and the catalogs API
[PM-1335] – YAML based transformation catalog
[PM-1339] – construct a default entry for local site if not present in site catalog
[PM-1345] – Support for Shifter at Nersc
[PM-1351] – YAML based kickstart records
[PM-1352] – Build failure on Debian 10 due to mariadb/MySQL-Python incompatibility
[PM-1354] – pegasus-init to support titan tutorial
[PM-1355] – composite records when sending events to AMQP
[PM-1357] – In lite jobs, chirp durations for stage in, stage out of data
[PM-1367] – Support for retrieval from HPSS tape store using commands htar and hsi
[PM-1390] – ensure all machine parseable information is one file associated with job
[PM-1396] – kickstart yaml parser fails because of : unacceptable character #x001b: special characters are not allowed
[PM-1398] – include machine information in job_instance.composite event
[PM-1402] – pegasus-init to support summit as execution env for tutorial
[PM-1411] – create the schema for a YAML based DAX
[PM-1438] – YAML Based Site Catalog
[PM-1461] – Ability to specify a wrapper/launcher for compute jobs in PegasusLite
[PM-1470] – pegasus-graphviz needs a yaml handler
[PM-1493] – YAML Based Replica Catalog
[PM-1501] – Parse YAML DAX files
[PM-1516] – Planner should create a default condorpool compute site if a user does not have it specified
[PM-1528] – update DAX R API to emit new workflow format in yaml
[PM-1529] – set default data configuration to condorio
[PM-1551] – Update JAVA DAX API to generate yaml formatted DAX
[PM-1552] – move to using pegasus lite for cases, where we transfer pegasus-transfer
[PM-1608] – data dependencies between dax jobs and compute jobs in a workflow
[PM-1620] – enable integrity checking for containers
[PM-1681] – Enable easy data reuse from previous runs
[PM-1685] – Python package to clone github repos for pegasus-init in 5.0
[PM-1286] – Deprecate Perl DAX API
[PM-1376] – Add LSF local attributes
[PM-1378] – Handle (copy) HPSS credentials when an environment variable is set
[PM-1382] – Add ppc64le to the known architectures
[PM-1383] – Switching to AMQP 0.9.1 in Pegasus Monitord
[PM-1400] – Remove MacOS .pkg builder
[PM-1401] – Deprecate pegasus-plots
[PM-1412] – Upgrade documentation
[PM-1413] – Upgrade Pegasus Databases to be Unicode Compatible
[PM-1416] – add data collection setup instructions to docs under section 6.7.1.1. Monitord, RabbitMQ, ElasticSearch Example
[PM-1446] – create tests for python client
[PM-1467] – Update jupyter notebook code to use new 5.0 api
[PM-1484] – create a default site catalog
[PM-1485] – change default pegasus.data.configuration from sharedfs to condorio
[PM-1486] – update default filenames for catalogs and workflow
[PM-1495] – checksum.value and checksum.type need to be added as optional rc entry fields
[PM-1510] – stageOut and registerReplica fields in Uses to be omitted for Uses of type input
[PM-1513] – Merge DECAF branch to master to support decaf integration
[PM-1524] – Use entrypoint in docker containers
[PM-1533] – Database Schema Cleanup
[PM-1537] – update existing workflow tests to use yaml
[PM-1547] – for hierarchical workflows, sc and tc cannot be inlined into the workflow file
[PM-1602] – Decaf development for the tess_dense example
[PM-1663] – remove old grid types from schema and add slurm to scheduler type
[PM-1679] – pegasus-s3 mkdir should cleanly exit if bucket already exists and is owned by user
[PM-1689] – pegasus-rc-client delete semantics
[PM-1692] – Add any missing options from pegasus-plan cli to Workflow.plan() and Client.plan()
[PM-1697] – Add major and minor number options to pegasus-version
[PM-643] – Better support for stdout of clustered jobs
[PM-1049] – Jobs should not be retried immediately, but rather delayed for some time
[PM-1170] – statistics should include memory details
[PM-1232] – Allow for multiple output sites
[PM-1235] – Python 2/3 compatible code
[PM-1247] – Revisit release-tools/get-system-python
[PM-1321] – Move transfer staging into the container rather than the host OS
[PM-1323] – pegasus transfer should not try and transfer a file that does not exist
[PM-1324] – pegasus plan should be able to create site scratch directories with unique names
[PM-1329] – pegasus integrity causes LIGO workflows to fail
[PM-1338] – Add support for TACC wrangler to pegasus-init
[PM-1341] – Transition from vendored `configobj` to release
[PM-1344] – update pam usage by pamela
[PM-1349] – Improve error message when jobs fail due to deep dir. structure with depth of > 20.
[PM-1356] – Replace Google CDN with a different CDN as China blocks it
[PM-1359] – Don’t support Chinese character in the file path
[PM-1363] – Condor Configuration MOUNT_UNDER_SCRATCH causes pegasus auxiliary jobs to fail
[PM-1368] – Implement Catch and Release for integrity errors
[PM-1373] – Debian Buster no longer provides openjdk-8-jdk
[PM-1374] – make monitord resilient to dagman logging the debug level in dagman.out
[PM-1375] – Do not run integrity checks on symlinked files
[PM-1385] – Prioritize transfers based on dependencies
[PM-1386] – bypass should be a per-file option
[PM-1391] – Allow properties to be set via environment variables.
[PM-1392] – YAML based braindump file
[PM-1403] – Support POWER9 nodes in PMC
[PM-1417] – add type field in job, dax, and dag
[PM-1418] – No code to handle postgresql backup.
[PM-1427] – remove STAT profile namespace
[PM-1429] – Introduce PEGASUS_ENV variable to define mode of workflow i.e. development, production, etc
[PM-1431] – Remove -0 suffixes from generated code files.
[PM-1432] – Remove deprecated pegasus-plan CLI args
[PM-1459] – Upgrade Java Unit Test Setup
[PM-1462] – Remove pegasus-submit-dag
[PM-1463] – improve insert performance of pegasus-rc-client
[PM-1472] – Update pegasus-s3 regions
[PM-1474] – Extend command line tools with --json option.
[PM-1476] – Explore deprecation/replacement of Perl codebase
[PM-1481] – add status progress bar to python client code
[PM-1489] – Implement a generic data access credential handler
[PM-1490] – Support Webdav for data transfers
[PM-1508] – in the python api, the method signature to manually add dependencies needs to be updated
[PM-1512] – Removed unused code from planner codebase
[PM-1514] – Move to boto3 and port p-s3
[PM-1527] – p-status shows failure when state is unknown
[PM-1532] – python client “plan” should accept --sites as a list and needs a --staging-site flag
[PM-1539] – Support PANDA GAHP – allow condorio for glite style
[PM-1544] – pegasus-transfer logs skips container verification when retrieving from singularity hub
[PM-1545] – handling metadata in 5.0
[PM-1550] – resolve relative input-dir and output-dir options for dax jobs in hierarchical workflows
[PM-1580] – build packages for ubuntu 20 for 5.0
[PM-1581] – pegasus-integrity callout from kickstart
[PM-1584] – 5.0 Python Api Improvements
[PM-1592] – Consistent UTF8 environment across full workflow
[PM-1594] – Short circuit p-transfer in the case of excessive failure in large transfers
[PM-1621] – Basic support for gpus in Docker and Singularity containers
[PM-1622] – /bin/sh compatibility
[PM-1626] – Allow users to specify arbitrary cli arguments for containers
[PM-1647] – Add more profile keys in the add_condor_profile
[PM-1650] – replace --dax with a positional argument
[PM-1651] – Add more profile keys in the add_pegasus_profile
[PM-1654] – pick up workflow api from vendor extensions encoded in the workflow
[PM-1657] – a pre-flight check needs to be done for the python package attrs
[PM-1674] – Deprecate hints profile namespace and use selector namespace
[PM-1687] – disable warnings when validating yaml files
[PM-1691] – Add pre script hook to hub repos
[PM-1236] – Compatibility Module
[PM-1237] – Python 3: Pegasus Dashboard
[PM-1238] – Python 3: Pegasus Monitord
[PM-1239] – Python 2/3 Compatible: Pegasus Transfer
[PM-1240] – Python 3: Pegasus Statistics
[PM-1241] – Python 3: Pegasus DB Admin
[PM-1242] – Python 3: Pegasus Metadata
[PM-1328] – support sharedfs on the compute site as staging site
[PM-1340] – make planner os releases consistent with builds
[PM-1353] – update monitord to parse both xml and yaml based ks records
[PM-1361] – create the schema for a YAML based Site Catalog
[PM-1365] – remove __ from event keys wherever possible
[PM-1371] – Python 3 Compatible: DAX
[PM-1372] – Python 3 Compatible: pegasus-exitcode
[PM-1381] – Associated planner changes to handle LSF sites
[PM-1387] – make the netlogger events consistent with the documentation
[PM-1404] – pegasus tutorial for summit from Kubernetes
[PM-1407] – Python 3: Netlogger Monitord Code
[PM-1408] – Python 3: Pegasus Analyzer
[PM-1410] – create the schema for a YAML based Replica Catalog
[PM-1419] – Java
[PM-1420] – add checksum field for Containers in the TransformationCatalog schema
[PM-1422] – integrate the client code into the dax api
[PM-1423] – refactor and improve tests
[PM-1428] – change dax and dag to subworkflow
[PM-1430] – ensure that api can write out unicode characters in utf-8
[PM-1433] – Remove old SC RC TC catalog versions
[PM-1435] – create a “moving from dax3 to new api” doc
[PM-1436] – implement method chaining using decorators
[PM-1437] – add type annotations
[PM-1439] – Planner support for YAML based Site Catalog
[PM-1441] – Python
[PM-1442] – Integrate Check in CI
[PM-1443] – implement api for pegasus.conf
[PM-1445] – update monitord to parse yaml based brain dump file
[PM-1447] – update pegasus-sc-converter to convert old format catalog to new yaml based one
[PM-1448] – SiteFactory should auto detect version and load the correct implementation
[PM-1450] – Module to load/dump YAML files
[PM-1453] – Module to load/dump Workflow(DAX) files
[PM-1454] – Module to load/dump Replica Catalog files
[PM-1455] – Module to load/dump Transformation Catalog files
[PM-1456] – Module to load/dump Site Catalog files
[PM-1457] – Module to load/dump Properties files
[PM-1460] – add docs to user guide
[PM-1464] – disallow variable expansion while converting site catalog from one format to another
[PM-1465] – Module to load/dump JSON files
[PM-1468] – remove catalog
[PM-1469] – update Pegasus-DAX3-Tutorial.ipynb
[PM-1471] – update jupyter docs
[PM-1473] – fix logging bug
[PM-1475] – Pegasus Plan
[PM-1477] – pegasus-config
[PM-1478] – pegasus-remove
[PM-1479] – pegasus-run
[PM-1480] – pegasus-status
[PM-1482] – Clean up javadoc warnings
[PM-1483] – rename ProfileMixin.add_<ns>() to ProfileMixin.add_profile_<ns>()
[PM-1487] – update python api to write new default filenames
[PM-1491] – Update pegasus-tc-converter to convert from old format to new YAML format
[PM-1492] – support docker/singularity container usage in cwl
[PM-1494] – Parse YAML Based RC files
[PM-1496] – update rc schema so that rc entries can have checksum fields
[PM-1497] – update rc api to support checksum fields
[PM-1498] – update RC docs to mention checksum values
[PM-1499] – Implement YAML Based RC Backend
[PM-1500] – create workflow test
[PM-1502] – add infer_dependencies=True to wf.write
[PM-1503] – in the python api, write out yml in order
[PM-1504] – flatten out the uses section. remove the extra file property aggregation
[PM-1505] – for job arguments, files should be added as strings
[PM-1507] – update schema for stdin, stdout, stderr in jobs s.t. only lfn is used
[PM-1509] – update “type” field options in the AbstractJob schema to be “job”, “pegasusWorkflow”, and “condorWorkflow”
[PM-1511] – Update DAXParser Factory to load the right parser based on content of input dax file
[PM-1515] – prefer catalog entries for SC, RC and TC in the DAX over everything else
[PM-1517] – update default --sites option in python client planner wrapper code
[PM-1518] – Update Replica Factory to auto load RC backend based on type of file
[PM-1519] – update schema for job args s.t. types can be strings and scalars
[PM-1520] – Update Transformation Factory to auto load Transformation backend based on type of file
[PM-1523] – work on replica catalog converter
[PM-1525] – support in planner for compound transformations
[PM-1526] – update python api to write out tr requirements in the format “namespace::name:version”
[PM-1535] – default paths picked up should be logged in the properties file in the submit directory
[PM-1538] – convert test 023-sc4-ssh-http to use yaml
[PM-1540] – allow mount point regex to parse shell variable names
[PM-1541] – change the pegasus-plan invocation via pegasus lite in dagman prescripts to handle new worker package organization
[PM-1542] – convert test 024-sc4-gridftp-http to use yaml
[PM-1543] – convert test 025-sc4-file-http to use yaml
[PM-1546] – add metadata to entry in replica catalog
[PM-1548] – preserve case for property keys when properties python api is used
[PM-1549] – registration of outputs in Pegasus 5.0
[PM-1555] – Chapter 2. Tutorial – Update and review
[PM-1556] – Chapter 3. Installation – Update and review
[PM-1557] – Chapter 4. Creating Workflows – Update and review
[PM-1558] – Chapter 5. Running Workflows – Update and review
[PM-1559] – Chapter 6. Monitoring, Debugging and Statistics – Update and review
[PM-1560] – Chapter 7. Execution Environments – Update and review
[PM-1561] – Chapter 8. Containers – Update and review
[PM-1562] – Chapter 9. Example Workflows – Update and review
[PM-1563] – Chapter 10. Data Management – Update and review
[PM-1564] – Chapter 11. Optimizing Workflows for Efficiency and Scalability – Update and review
[PM-1565] – Chapter 12. Pegasus Service – Update and review
[PM-1566] – Chapter 13. Configuration – Update and review
[PM-1567] – Chapter 14. Submit Directory Details – Update and review
[PM-1568] – Chapter 15. Jupyter Notebooks – Update and review
[PM-1569] – Chapter 16. API Reference – Update and review
[PM-1570] – Chapter Command Line Tools man pages – Update and review
[PM-1571] – rewrite Introduction
[PM-1572] – Chapter “Packages” – Update and review
[PM-1573] – Chapter 17. Useful Tips – Update and review
[PM-1574] – Chapter 18. Funding, citing, and anonymous usage statistics – Update and review
[PM-1575] – Chapter 19. Glossary – Update and review
[PM-1576] – Chapter 20. Tutorial VM – Update and review
[PM-1577] – convert test 045-hierarchy-sharedfs
[PM-1578] – Migration Guide to 5.0
[PM-1579] – convert test 045-hierarchy-sharedfs-b to use yaml
[PM-1582] – ensure checksum and file metadata also appears in the output replica catalog
[PM-1583] – Parse meta files as a RC backend in the planner
[PM-1585] – remove glibc from schema and api as it is not used anymore
[PM-1586] – add built-in support for pathlib.Path objects wherever paths are used
[PM-1587] – _DirectoryType enums should have underscores in name
[PM-1588] – in add_pegasus_profile(), add data_configuration as a kwarg
[PM-1590] – in the Workflow object set infer_dependencies to be True by default
[PM-1591] – Only require site and pfn in Transformation constructor when automatically creating a TransformationSite
[PM-1596] – pegasus-db-admin should use PEGASUS_HOME to discover pegasus-version etc
[PM-1597] – Pegasus Run
[PM-1598] – Output originating from pegasus-tools should be output by workflow object as is (without any log category)
[PM-1599] – Improve exception handling for failed execution of pegasus client commands
[PM-1600] – Update workflow and client python apis to support multiple output sites
[PM-1601] – Client plan input_dir must take in a list of str
[PM-1604] – Fix deprecation warnings
[PM-1605] – Get ensemble manager running on master (Python3)
[PM-1606] – Merge in/factor in code implemented in add-ensemble-triggers branch
[PM-1607] – Add time interval based triggering on a given directory/file pattern
[PM-1609] – register_replica should be set to True by default
[PM-1610] – ensure pegasus-db-admin downgrade works
[PM-1611] – add pegasus-graphviz functionality into the api
[PM-1612] – update monitord to record avg cpu utilization and maxrss
[PM-1613] – update database schema to track maxrss and avg_cpu
[PM-1614] – update planner for revised 5.0 replica catalog format
[PM-1615] – unique constraint failed error when using multiple output sites
[PM-1616] – add checksums for executables
[PM-1617] – add missing integrity check for executables
[PM-1618] – update 039-black-metadata to use python 5.0 api
[PM-1623] – fix 5.0 python auto generated python api documentation
[PM-1624] – convert 032-black-checkpoint
[PM-1625] – Pegasus specific profile for requesting GPU resources
[PM-1627] – workflow uuid, submit dir, submit hostname, root wf id should be accessible from the workflow object
[PM-1628] – files generated by the api should have comments specifying that they have been auto generated by the api
[PM-1629] – Add metadata to container
[PM-1631] – create 032-kickstart-chkpoint-signal-condorio
[PM-1632] – create 032-kickstart-chkpoint-signal-nonsharedfs
[PM-1633] – CLI: manpage pegasus-plan
[PM-1634] – CLI: manpage pegasus-db-admin
[PM-1635] – CLI: manpage pegasus-status
[PM-1636] – CLI: manpage pegasus-remove
[PM-1637] – CLI: manpage pegasus-analyzer
[PM-1638] – CLI: manpage pegasus-statistics
[PM-1639] – CLI: manpage pegasus-run
[PM-1640] – CLI: manpage pegasus-transfer
[PM-1641] – CLI: manpage pegasus-s3
[PM-1642] – CLI: manpage pegasus-integrity
[PM-1643] – CLI: manpage pegasus-rc-converter
[PM-1644] – CLI: manpage pegasus-sc-converter
[PM-1645] – CLI: manpage pegasus-tc-converter
[PM-1646] – update database overview, and update schema picture
[PM-1648] – add unit tests that put/pull to/from aws s3 and ceph
[PM-1649] – remove old config parameters if they are not used
[PM-1652] – update pegasus client api to pass the workflow to be planned at the end
[PM-1655] – the metrics server should pick up key wf_api, and default back to dax_api if not present
[PM-1656] – Add unit tests to ensure YAML is being generated correctly
[PM-1659] – in the python api, add pegasus profile key container.arguments
[PM-1660] – add boolean bypass flag for input files
[PM-1661] – expose bypass parameter for Containers and TransformationSites
[PM-1662] – document behavior in user guide about bypassing file staging
[PM-1665] – expose --randomdir/--randomdir=<path> in client code in the python api
[PM-1666] – get live output from pegasus cli tools when they are called with the client code
[PM-1667] – when client code is called, it should be more obvious what pegasus-<tool> is being called
[PM-1668] – fb-nlp workflow generator creates multiple jobs that create same output file
[PM-1669] – remove “pegasus: <version>” from tc when inlined in a workflow
[PM-1670] – expose a schema validation function that can be used in unit tests
[PM-1682] – add --reuse option to Workflow.plan in python api
[PM-1683] – update section 7.3 supported transfer protocols in data management guide
[PM-1684] – reorg table of contents
[PM-1688] – update pegasus-db-admin as a new ‘trigger’ table has been added to the schema
[PM-1693] – convert test 010-runtime-clustering-CondorIO to use python api
[PM-1694] – convert test 010-runtime-clustering-Non-SharedFS to use python api
[PM-1695] – convert 010-runtime-clustering-SharedFS to use python api
[PM-1696] – convert 010-runtime-clustering-SharedFS Staging and No Kickstart to use python api
[PM-1698] – Release notes for 5.0 release
[PM-1699] – add logo to the documentation pages
[PM-1700] – add SIGINT handler to Client.wait() so that it can exit cleanly
[PM-1701] – planner should get chmod jobs to run locally if compute site has auxiliary.local set
Bugs Fixed
----------
[PM-1150] – Pegasus should verify that required credentials exist before starting a workflow
[PM-1192] – User supplied env setup script for lite
[PM-1199] – Notification naming / meta notifications
[PM-1326] – singularity suffix computed incorrectly
[PM-1327] – bypass input file staging broken for container execution
[PM-1330] – .meta files created even when integrity checking is disabled.
[PM-1332] – monitord is failing on a dagman.out file
[PM-1333] – amqp endpoint errors should not disable database population for multiplexed sinks
[PM-1334] – pegasus dagman is not exiting cleanly
[PM-1336] – pegasus-submitdir is broken
[PM-1346] – Pegasus job checkpointing is incompatible with condorio
[PM-1347] – pegasus will always try and transfer output when a code has checkpointed
[PM-1350] – pegasus is ignoring when_to_transfer_output
[PM-1358] – HTCondor 8.8.0/8.8.1 remaps /tmp, and can break access to x509 credentials
[PM-1360] – planner drops transfer_(in|out)put_files if NoGridStart is used
[PM-1362] – Chinese characters in the file path
[PM-1366] – Pegasus Cluster Label – Job Env Not Picked Up in Containers
[PM-1377] – A + in a tc name breaks pegasus-plan
[PM-1379] – Stage out job fails – wrong src location
[PM-1380] – Support for Singularity Library
[PM-1389] – pegasus.cores causes issues on Summit
[PM-1395] – GLite LSF scripts don’t work as intended on OLCF’s DTNs
[PM-1405] – Is Pegasus supposed to build on 32-bit x86 (Debian i386 Stretch)?
[PM-1409] – Python virtual environments not considered first before system wide installation
[PM-1421] – p-lite generated sh files fail, when used with Docker containers, that make use of USER argument
[PM-1466] – Upgrade Python Package Versions
[PM-1488] – Get condorpool worker machine ipaddr error!!!
[PM-1506] – pegasus-python-wrapper locates executable from PEGASUS_HOME instead of dirname of the exec.
[PM-1521] – Decaf Jobs does not generate valid JSON for Decaf anymore
[PM-1522] – output not being captured in pegasus client code
[PM-1530] – site selector does not map job correctly when using stageable or all mapper with a containerized job
[PM-1531] – pegasus-db-admin fails on macosx on a clean db
[PM-1534] – integrity checking with bypass input staging if checksums are specified in replica catalog