
[EPIC]: Track migration process and display it in a dashboard #2074

@nfx

Is there an existing issue for this?

  • I have searched the existing issues

Problem statement

There's no easy way to know what still needs to be migrated to Unity Catalog (UC) within a given workspace.

Proposed Solution

Potential challenges

  • [FEATURE] Let crawlers support append to tables #2597
  • However, for migration progress purposes, we need to either overwrite the tables, or add another column with a timestamp and modify the "fetch latest" queries to select the latest snapshot timestamp. Keeping snapshot timestamps would allow us to build a bar-chart widget showing how fast the migration progresses, but we don't really care about that. If we do fetch-latest-timestamp, all our views and dashboards become a bit more complicated, but that's fine. --> Decision: keep the current ucx inventory and store history in a separate table (see proposed solution above).
  • Let's keep the status of migration progress in HMS (for now); we can change this decision in a few weeks. --> Decision: store the migration process in a ucx catalog.
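
To make the "separate history table plus fetch-latest" decision concrete, here is a minimal sketch in plain Python. The `HistoryRecord` shape, its field names, and both helper functions are assumptions made for illustration; they are not the actual ucx schema or API. The idea is that every crawler run appends rows stamped with a snapshot timestamp, dashboards read only the most recent snapshot, and the older snapshots remain available for a progress-over-time chart.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class HistoryRecord:
    # Hypothetical row shape for the history table; not the actual ucx schema.
    snapshot_at: datetime  # when the crawler snapshot was taken
    object_id: str         # e.g. a table's fully qualified name
    migrated: bool         # whether the object was already migrated at that time


def latest_snapshot(history):
    """Return only the records that belong to the most recent snapshot."""
    if not history:
        return []
    latest = max(record.snapshot_at for record in history)
    return [record for record in history if record.snapshot_at == latest]


def progress_by_snapshot(history):
    """Map each snapshot timestamp to the fraction of objects migrated in it."""
    totals = {}
    for record in history:
        done, count = totals.get(record.snapshot_at, (0, 0))
        totals[record.snapshot_at] = (done + int(record.migrated), count + 1)
    return {ts: done / count for ts, (done, count) in sorted(totals.items())}
```

`latest_snapshot` plays the role of the "fetch latest" query, while `progress_by_snapshot` yields the per-snapshot values a bar-chart widget would plot. In a real deployment both would more likely be SQL views over the history table.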

Migration process crawlers

Assessment tasks that make sense to re-run in the migration-progress workflow:

  • crawl_tables
  • assess_jobs - potentially harden the code there as well
  • assess_clusters - potentially harden as well
  • assess_pipelines - potentially harden as well
  • crawl_cluster_policies
  • assess_global_init_scripts

Not to be re-run:

  • crawl_mounts - we already pre-created external locations
  • setup_tacl - we don't need to crawl grants
  • crawl_grants - no need to, I think
  • estimate_table_size_for_migration - most likely not necessary
  • guess_external_locations - we already migrated external locations by this point
  • assess_incompatible_submit_runs - not going to be necessary in September
  • workspace_listing - we are going to analyse only those notebooks that are part of jobs in the scope of static analysis
  • crawl_permissions - we expect permissions to already be migrated
  • crawl_groups - we expect groups to already be migrated
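
The two lists above can be sketched as a simple allow/skip selection. The task names are taken from the lists; the constant names and the `tasks_for_migration_progress` helper are hypothetical and only illustrate how a migration-progress workflow might pick its subset of assessment tasks.

```python
# Task names come from the lists above; the constants and the helper
# function are hypothetical and not part of ucx.
RERUN_ON_MIGRATION_PROGRESS = frozenset({
    "crawl_tables",
    "assess_jobs",
    "assess_clusters",
    "assess_pipelines",
    "crawl_cluster_policies",
    "assess_global_init_scripts",
})

NOT_RERUN = frozenset({
    "crawl_mounts",
    "setup_tacl",
    "crawl_grants",
    "estimate_table_size_for_migration",
    "guess_external_locations",
    "assess_incompatible_submit_runs",
    "workspace_listing",
    "crawl_permissions",
    "crawl_groups",
})


def tasks_for_migration_progress(assessment_tasks):
    """Keep, in their original order, only the tasks marked for re-running.

    Raises ValueError for a task that appears in neither list, so a newly
    added assessment task forces an explicit decision.
    """
    known = RERUN_ON_MIGRATION_PROGRESS | NOT_RERUN
    unknown = [task for task in assessment_tasks if task not in known]
    if unknown:
        raise ValueError(f"Unclassified assessment tasks: {unknown}")
    return [task for task in assessment_tasks if task in RERUN_ON_MIGRATION_PROGRESS]
```

Failing on unclassified tasks is a design choice of this sketch: it avoids silently skipping (or silently re-running) a crawler that was added after the lists were written.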

Additional Context

Labels: EPIC; feat/viz (visualizing UCX progress as a redash/lakeview dashboard); step/assessment (go/uc/upgrade - Assessment Step)
