Warehouse PostProcess does not respect destination options #93

Open
TonyB9000 opened this issue Nov 23, 2021 · 1 comment

Comments

@TonyB9000
Contributor

The warehouse state-machine "postprocess" workflow (currently dedicated to climo and timeseries generation) appears to ignore the "-p" (publication destination) option. When the source location (given by "-w") is the pub_root, it writes its products only to the dataset's path under pub_root.

More seriously, unlike "warehouse publish", it makes no attempt to avoid overwriting existing files or directories at the destination, nor does it create a "next higher" version directory for its output. It adds to (and may clobber) files in the existing destination directory.

A related issue is that the behavior appears to differ among atmos timeseries, land timeseries, and climos. Atmos timeseries datafiles are placed into a "v0.*" sequence of fractional-version directories (à la "warehouse") and left there, even though v1 and v2 published directories already exist. (This last may be due to a "silent abort" in the state machine, which failed to advance to a final roll-up of the data.)
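To make the "next higher" version behavior concrete, here is a minimal sketch of how an output step could pick a fresh version directory instead of writing into an existing one. This is not the warehouse code: the function names `next_version_dir` and `prepare_output_dir` are hypothetical, and it assumes published versions are directories named `v1`, `v2`, ... directly under the dataset path, with fractional `v0.*` directories ignored.

```python
import re
from pathlib import Path

# Assumption: whole-number published versions look like "v1", "v2", ...
# Fractional warehouse-style names such as "v0.1" deliberately do not match.
_WHOLE_VERSION = re.compile(r"^v(\d+)$")


def next_version_dir(dataset_dir: Path) -> Path:
    """Return the path of the next whole-number version directory (not created)."""
    highest = 0
    if dataset_dir.is_dir():
        for child in dataset_dir.iterdir():
            match = _WHOLE_VERSION.match(child.name)
            if child.is_dir() and match:
                highest = max(highest, int(match.group(1)))
    return dataset_dir / f"v{highest + 1}"


def prepare_output_dir(dataset_dir: Path) -> Path:
    """Create the next version directory, failing rather than reusing one."""
    target = next_version_dir(dataset_dir)
    # exist_ok=False raises FileExistsError instead of silently adding to
    # (and possibly clobbering) an existing directory.
    target.mkdir(parents=True, exist_ok=False)
    return target
```

With `v0.1`, `v1`, and `v2` already present, `prepare_output_dir` would create `v3` and leave the existing directories untouched.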

@TonyB9000
Contributor Author

A related problem is the unfortunate attempt to be "flexible" about where PostProcess reads its input files and writes its output files (warehouse versus publication). My proposal is to settle on the publication (pub_root) directories as the default location for datasets intended for eventual publication, and to follow such placement immediately with a mapfile-generation step, rather than generating the mapfile in the warehouse and requiring automated editing of its content upon publication. This should reduce the number of "variant" forms of processing and lead to more predictable results. In any case, a generated mapfile should never need to be edited to make it "appear" as if its hashes were produced at a different location.
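As a rough illustration of generating the mapfile in place, the sketch below hashes files where they finally live, so each recorded path is the publication path from the start. The helper names `sha256_of` and `mapfile_lines` are hypothetical, the `*.nc` filter is an assumption, and the line layout is only loosely modeled on ESGF-style mapfiles, so treat the exact fields as an assumption too.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datafiles are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mapfile_lines(dataset_id: str, version_dir: Path) -> List[str]:
    """Build one mapfile line per datafile found in its final location."""
    # Assumption: the version directory is named like "v1", "v2", ...
    version = version_dir.name.lstrip("v")
    lines = []
    for data_file in sorted(version_dir.glob("*.nc")):
        full_path = data_file.resolve()
        size = full_path.stat().st_size
        checksum = sha256_of(full_path)
        # Illustrative field layout only; not taken from the warehouse code.
        lines.append(
            f"{dataset_id}#{version} | {full_path} | {size} | "
            f"checksum={checksum} | checksum_type=SHA256"
        )
    return lines
```

Because the path in each line is resolved at the publication location, nothing in the mapfile needs to be rewritten afterward.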
