Plan / implement solution for handling multiple versions of products in search results #91

Closed
jordanpadams opened this issue Sep 15, 2023 · 1 comment · Fixed by nasa-pds-engineering-node/registry-mgr-legacy#16
Assignees
Labels
B14.0 i&t.skip Skip I&T of this task/ticket task

Comments

@jordanpadams
Member

jordanpadams commented Sep 15, 2023

💡 Description

The current implementation attempts to use Solr's collapse functionality to return only the latest version of each product in search results.

https://github.com/NASA-PDS/registry-mgr-legacy/blob/main/src/main/resources/collections/data/solrconfig.xml#L789
https://github.com/NASA-PDS/registry-mgr-legacy/blob/main/src/main/resources/collections/data/solrconfig.xml#L848

A few issues with this method:

  1. It attempts to collapse on version_id as a float, which won't work since PDS4 uses semantic versioning, where each component is compared as an integer (e.g. in PDS4 1.100 > 1.2, but as floats 1.100 < 1.2). We need to figure out some sort of normalized numeric version we can use for this.

  2. Collapse requires all versions of a product to be on the same shard, which is unlikely. We would need to change the installation to 1 shard for this to work.

  3. The search faceting overwrites the fq value, so as soon as someone facets on something, the results are no longer limited to the latest versions.
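
To illustrate issue 1, here is a minimal Python sketch of why a float comparison picks the wrong "latest" version. The helper name `parse_version` is hypothetical and not part of the registry code; it assumes version_id is always dot-separated integers such as `1.100`.

```python
def parse_version(version_id: str) -> tuple[int, ...]:
    """Split a dotted version string into integer components.

    Hypothetical helper: assumes every component is a plain integer.
    """
    return tuple(int(part) for part in version_id.split("."))


# Comparing as floats: "1.100" becomes 1.1, which is smaller than 1.2.
float_says_newer = float("1.100") > float("1.2")

# Comparing component-wise: (1, 100) is greater than (1, 2).
tuple_says_newer = parse_version("1.100") > parse_version("1.2")

print(float_says_newer, tuple_says_newer)
```

The float comparison treats 1.100 as older than 1.2, while the component-wise comparison treats it as newer, which is the ordering described above.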

Alternative options to collapse:

  1. Remove previous versions and only load the latest versions.
  2. Create separate index to support latest-only, and write tool to query data core to get only the latest products.
  3. Create separate field for maintaining latest version, add that to search and archive-filter request handlers. (see provenance in registry-sweepers for possible implementation)
  4. Switch to 1 shard, create a new version_id_normalized field for handling numeric comparison to support Solr collapse, and update add-hierarchy.xsl to calculate and add version_id_normalized to the metadata on load.
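
As a rough sketch of options 3 and 4, the Python below computes a sortable integer for each version and flags the latest version per product. It is only an illustration under stated assumptions: the `lid` and `is_latest` field names, the `MINOR_WIDTH` padding, and both function names are hypothetical, and the real change would live in add-hierarchy.xsl or a separate tool rather than in Python.

```python
MINOR_WIDTH = 6  # assumption: minor versions stay below 10**6


def normalize_version(version_id: str) -> int:
    """Map 'major.minor' to one integer that sorts in version order."""
    major, minor = (int(part) for part in version_id.split("."))
    if minor >= 10 ** MINOR_WIDTH:
        raise ValueError(f"minor version too large to normalize: {version_id}")
    return major * 10 ** MINOR_WIDTH + minor


def mark_latest(docs: list[dict]) -> list[dict]:
    """Return copies of docs with version_id_normalized and is_latest added.

    Each doc is assumed to carry 'lid' and 'version_id' keys.
    """
    # First pass: find the highest normalized version for each lid.
    best: dict[str, int] = {}
    for doc in docs:
        normalized = normalize_version(doc["version_id"])
        if normalized > best.get(doc["lid"], -1):
            best[doc["lid"]] = normalized

    # Second pass: annotate each doc without mutating the input.
    annotated = []
    for doc in docs:
        normalized = normalize_version(doc["version_id"])
        annotated.append({
            **doc,
            "version_id_normalized": normalized,
            "is_latest": normalized == best[doc["lid"]],
        })
    return annotated


docs = [
    {"lid": "urn:nasa:pds:example:a", "version_id": "1.2"},
    {"lid": "urn:nasa:pds:example:a", "version_id": "1.100"},
    {"lid": "urn:nasa:pds:example:b", "version_id": "2.0"},
]
for doc in mark_latest(docs):
    print(doc["lid"], doc["version_id"], doc["is_latest"])
```

A flag like `is_latest` could be filtered on directly, which would not depend on shard placement. The normalized integer alone would still need the single-shard setup for collapse to work.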

Sub-tasks

@jordanpadams jordanpadams self-assigned this Sep 15, 2023
@nutjob4life
Member

Or maybe look to registry-sweepers 🤔

c-suh referenced this issue in nasa-pds-engineering-node/registry-mgr-legacy Oct 4, 2023
jordanpadams referenced this issue in nasa-pds-engineering-node/registry-mgr-legacy Oct 5, 2023
@jordanpadams jordanpadams transferred this issue from nasa-pds-engineering-node/registry-mgr-legacy Jul 23, 2024
3 participants