Simplify the increasingly convoluted selection mechanism #363

JSKenyon · 2025-02-25T07:08:37Z

The reopening of #226 has made it clear that the current approach to selecting DDIDs and fields is somewhat error-prone. This is due, in part, to the fact that xds_from_table doesn't have an internal filtering mechanism (if we ignore TaQL). This means that in order to provide chunking information to xds_from_table calls, we need to supply the chunking for all the resulting datasets, rather than just the ones we care about. This is not a huge problem in itself, but the current code is somewhat sloppy and will likely cause problems again in the future.

There are two options to consider:

1. Start a PR on dask-ms to include some sort of filter/callable which deselects certain DDIDs/fields in the xds_from_table call.
2. Modify QuartiCal's chunking code to be smarter, e.g. by providing dummy chunks for deselected fields/DDIDs. QuartiCal should also be much more explicit about the mapping from xarray dataset to chunk specification. Currently, this mapping is positional and will likely be the source of future errors.
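As a rough illustration of the second option, the sketch below keys chunk specifications on (FIELD_ID, DATA_DESC_ID) rather than on list position, and fills in a dummy specification for deselected partitions. It is not QuartiCal or dask-ms code: `build_chunk_specs`, `PartitionKey` and the dummy value `{"row": 100000}` are hypothetical names and values chosen for the example, and it assumes the partition keys are known in the same order in which the datasets will be returned.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical key identifying one partition: (FIELD_ID, DATA_DESC_ID).
PartitionKey = Tuple[int, int]
ChunkSpec = Dict[str, object]


def build_chunk_specs(
    partition_keys: Sequence[PartitionKey],
    selected_chunks: Dict[PartitionKey, ChunkSpec],
    dummy_chunks: Optional[ChunkSpec] = None,
) -> List[ChunkSpec]:
    """Return one chunk spec per partition, in partition order.

    Selected partitions get their explicit spec, looked up by key.
    Deselected partitions get a dummy spec so that the list still has
    one entry per dataset.
    """
    if dummy_chunks is None:
        # Assumed placeholder; any valid spec would do for unused datasets.
        dummy_chunks = {"row": 100000}

    # Fail loudly if a spec refers to a partition that does not exist,
    # instead of silently misaligning the list.
    unknown = set(selected_chunks) - set(partition_keys)
    if unknown:
        raise KeyError(f"No partition for selected keys: {sorted(unknown)}")

    return [dict(selected_chunks.get(key, dummy_chunks)) for key in partition_keys]


# Example: three partitions, only (FIELD_ID=0, DATA_DESC_ID=1) is selected.
keys = [(0, 0), (0, 1), (1, 0)]
selected = {(0, 1): {"row": (10, 10, 5)}}
specs = build_chunk_specs(keys, selected)
```

The resulting list could then be passed as the chunks argument, with the key lookup making it explicit which specification belongs to which dataset. A misspelled or stale key raises an error rather than shifting every subsequent entry by one.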

JSKenyon self-assigned this Feb 25, 2025

JSKenyon mentioned this issue Feb 26, 2025

Add a workaround for selection in the presence of partial columns #364

Merged
