Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(number-format): adds memory data transfer rates in binary and decimal format #32264

Merged
merged 16 commits into from
Feb 24, 2025

Conversation

tshallenberger
Copy link
Contributor

@tshallenberger tshallenberger commented Feb 14, 2025

SUMMARY

At Yahoo we needed custom number formatters for data transfer rates, in either IEC or Standard format (binary or decimal). This change adds the following NumberFormats:

MEMORY_TRANSFER_RATE_DECIMAL
MEMORY_TRANSFER_RATE_BINARY

Where decimal records rates in terms of gigabyte, terabyte, and petabytes, and binary records rates in terms of gibibytes, tebibytes, and pebibytes. The number formatter expects absolute values in bytes.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

image

TESTING INSTRUCTIONS

  1. Create a Big Number chart for any dataset
  2. Add a metric with custom sql for the value: 123456789
  3. Nav to Customize > Chart Options > Number Format and select either Bytes (SI), Bytes (IEC), Byterate (SI), Byterate (IEC)
  4. You should see either of the following for your selection:

MEMORY_TRANSFER_RATE_DECIMAL: 123.46 MB/s
MEMORY_TRANSFER_RATE_BINARY (IEC): 117.74 MiB/s

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@dosubot dosubot bot added change:frontend Requires changing the frontend explore:control Related to the controls panel of Explore labels Feb 14, 2025
Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Performance Uncached D3 Formatter Creation ▹ view
Readability Redundant Break After Return ▹ view
Functionality Incorrect Zero Value Formatting ▹ view
Documentation Format description inconsistencies ▹ view
Functionality Invalid format patterns for network data formatting ▹ view
Files scanned
File Path Reviewed
superset-frontend/packages/superset-ui-core/src/number-format/index.ts
superset-frontend/packages/superset-ui-core/src/number-format/NumberFormats.ts
superset-frontend/packages/superset-ui-chart-controls/src/utils/D3Formatting.ts
superset-frontend/src/setup/setupFormatters.ts
superset-frontend/packages/superset-ui-core/src/number-format/factories/createNetworkNumberFormatter.ts
superset-frontend/src/explore/controls.jsx

Explore our documentation to understand the languages and file types we support and the files we ignore.

Need a new review? Comment /korbit-review on this PR and I'll review your latest changes.

Korbit Guide: Usage and Customization

Interacting with Korbit

  • You can manually ask Korbit to review your PR using the /korbit-review command in a comment at the root of your PR.
  • You can ask Korbit to generate a new PR description using the /korbit-generate-pr-description command in any comment on your PR.
  • Too many Korbit comments? I can resolve all my comment threads if you use the /korbit-resolve command in any comment on your PR.
  • Chat with Korbit on issues we post by tagging @korbit-ai in your reply.
  • Help train Korbit to improve your reviews by giving a 👍 or 👎 on the comments Korbit posts.

Customizing Korbit

  • Check out our docs on how you can make Korbit work best for you and your team.
  • Customize Korbit for your organization through the Korbit Console.

Feedback and Support

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congrats on making your first PR and thank you for contributing to Superset! 🎉 ❤️

We hope to see you in our Slack community too! Not signed up? Use our Slack App to self-register.

@tshallenberger
Copy link
Contributor Author

/korbit-review

Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Error Handling Uncaught d3Format Error ▹ view
Functionality Inconsistent small number formatting ▹ view
Functionality Missing bounds check for large values ▹ view
Documentation Missing Closing Parentheses ▹ view
Functionality Missing Closing Parenthesis ▹ view
Files scanned
File Path Reviewed
superset-frontend/packages/superset-ui-core/src/number-format/index.ts
superset-frontend/packages/superset-ui-core/src/number-format/NumberFormats.ts
superset-frontend/packages/superset-ui-chart-controls/src/utils/D3Formatting.ts
superset-frontend/src/setup/setupFormatters.ts
superset-frontend/packages/superset-ui-core/src/number-format/factories/createNetworkNumberFormatter.ts
superset-frontend/src/explore/controls.jsx

Explore our documentation to understand the languages and file types we support and the files we ignore.

Need a new review? Comment /korbit-review on this PR and I'll review your latest changes.

Korbit Guide: Usage and Customization

Interacting with Korbit

  • You can manually ask Korbit to review your PR using the /korbit-review command in a comment at the root of your PR.
  • You can ask Korbit to generate a new PR description using the /korbit-generate-pr-description command in any comment on your PR.
  • Too many Korbit comments? I can resolve all my comment threads if you use the /korbit-resolve command in any comment on your PR.
  • Chat with Korbit on issues we post by tagging @korbit-ai in your reply.
  • Help train Korbit to improve your reviews by giving a 👍 or 👎 on the comments Korbit posts.

Customizing Korbit

  • Check out our docs on how you can make Korbit work best for you and your team.
  • Customize Korbit for your organization through the Korbit Console.

Feedback and Support

@tshallenberger
Copy link
Contributor Author

/korbit-review

@sadpandajoe sadpandajoe added the review:checkpoint Last PR reviewed during the daily review standup label Feb 14, 2025
Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Performance Unbounded Cache Growth ▹ view
Error Handling Stack trace lost in error re-throw ▹ view
Functionality Missing Type Safety for Labels Parameter ▹ view
Functionality Invalid siFormatter Parameter for Microsecond Values ▹ view
Files scanned
File Path Reviewed
superset-frontend/packages/superset-ui-core/src/number-format/index.ts
superset-frontend/packages/superset-ui-core/src/number-format/NumberFormats.ts
superset-frontend/packages/superset-ui-chart-controls/src/utils/D3Formatting.ts
superset-frontend/src/setup/setupFormatters.ts
superset-frontend/packages/superset-ui-core/src/number-format/factories/createNetworkNumberFormatter.ts
superset-frontend/src/explore/controls.jsx

Explore our documentation to understand the languages and file types we support and the files we ignore.

Need a new review? Comment /korbit-review on this PR and I'll review your latest changes.

Korbit Guide: Usage and Customization

Interacting with Korbit

  • You can manually ask Korbit to review your PR using the /korbit-review command in a comment at the root of your PR.
  • You can ask Korbit to generate a new PR description using the /korbit-generate-pr-description command in any comment on your PR.
  • Too many Korbit comments? I can resolve all my comment threads if you use the /korbit-resolve command in any comment on your PR.
  • Chat with Korbit on issues we post by tagging @korbit-ai in your reply.
  • Help train Korbit to improve your reviews by giving a 👍 or 👎 on the comments Korbit posts.

Customizing Korbit

  • Check out our docs on how you can make Korbit work best for you and your team.
  • Customize Korbit for your organization through the Korbit Console.

Feedback and Support

@tshallenberger
Copy link
Contributor Author

/korbit-review

Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Performance Inefficient Small Number Formatting Strategy ▹ view
Error Handling Unsafe cache key access ▹ view
Functionality Incorrect Unit Notation in Format Descriptions ▹ view
Files scanned
File Path Reviewed
superset-frontend/packages/superset-ui-core/src/number-format/index.ts
superset-frontend/packages/superset-ui-core/src/number-format/NumberFormats.ts
superset-frontend/packages/superset-ui-chart-controls/src/utils/D3Formatting.ts
superset-frontend/src/setup/setupFormatters.ts
superset-frontend/packages/superset-ui-core/src/number-format/factories/createNetworkNumberFormatter.ts
superset-frontend/src/explore/controls.jsx

Explore our documentation to understand the languages and file types we support and the files we ignore.

Need a new review? Comment /korbit-review on this PR and I'll review your latest changes.

Korbit Guide: Usage and Customization

Interacting with Korbit

  • You can manually ask Korbit to review your PR using the /korbit-review command in a comment at the root of your PR.
  • You can ask Korbit to generate a new PR description using the /korbit-generate-pr-description command in any comment on your PR.
  • Too many Korbit comments? I can resolve all my comment threads if you use the /korbit-resolve command in any comment on your PR.
  • Chat with Korbit on issues we post by tagging @korbit-ai in your reply.
  • Help train Korbit to improve your reviews by giving a 👍 or 👎 on the comments Korbit posts.

Customizing Korbit

  • Check out our docs on how you can make Korbit work best for you and your team.
  • Customize Korbit for your organization through the Korbit Console.

Feedback and Support

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome stuff, I'm sure this will be valuable to many orgs! First pass comments.

Comment on lines 10 to 20
const bytesSILabels = ['Bytes', 'kB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB'];
const bytesIECLabels = [
'Bytes',
'KiB',
'MiB',
'GiB',
'TiB',
'PiB',
'EiB',
'ZiB',
];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there overlap here with the code in createMemoryFormatter.ts? I think ideally we want to avoid duplication and double down on the same terms. Some quick observations:

  • Here we're talking about Bytes, while in memory formatter we have B
  • Here we're only going to ZiB/ZB, while in memory formatter we're going all the way to YiB and for SI to QB.

Comment on lines 38 to 57
const byterateSILabels = [
'Bytes/s',
'kB/s',
'MB/s',
'GB/s',
'TB/s',
'PB/s',
'EB/s',
'ZB/s',
];
const byterateIECLabels = [
'Bytes/s',
'KiB/s',
'MiB/s',
'GiB/s',
'TiB/s',
'PiB/s',
'EiB/s',
'ZiB/s',
];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid duplication, can we just use the bytes above and then slap on a /s after them?

Comment on lines 62 to 82
function siFormatter(decimals: number) {
if (formatterCache.size >= MAX_CACHE_SIZE) {
const iterator = formatterCache.keys();
const first = iterator.next();
if (!first.done) {
formatterCache.delete(first.value);
}
}

try {
if (!formatterCache.has(decimals)) {
formatterCache.set(decimals, d3Format(`.${decimals}s`));
}
return formatterCache.get(decimals);
} catch (error) {
throw new Error(
`Failed to create SI formatter for ${decimals} decimals: ${error.message}`,
{ cause: error },
);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this performance optimization needed? Was there performance issues without this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this was to satisfy the korbit-ai review bot 😅 if you don't think it's really needed I can simplify it back to its original form

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, sorry about our house bot giving you bad advice 🤦 Yes, please roll this back, I didn't quite understand how it would help in this context. So let's keep it simple and add memoization later if it's really needed.

Comment on lines 84 to 118
function formatValue(
value: number,
labels: string[],
base: number,
decimals: number,
) {
if (value === 0) {
return `0 ${labels[0]}`;
}
const absoluteValue = Math.abs(value);
if (absoluteValue >= 1000) {
const i = Math.min(
Math.floor(Math.log(absoluteValue) / Math.log(base)),
labels.length - 1,
);
const parsedVal = parseFloat(
(absoluteValue / Math.pow(base, i)).toFixed(decimals),
);
return `${value < 0 ? '-' : ''}${parsedVal} ${labels[i]}`;
}
if (absoluteValue >= 1) {
return `${float2PointFormatter(value)} ${labels[0]}`;
}
if (absoluteValue >= 0.001) {
return `${float4PointFormatter(value)} ${labels[0]}`;
}
if (absoluteValue > 0.000001) {
return `${siFormatter(decimals)(value * 1000000)}µ ${labels[0]}`;
}
return `${float4PointFormatter(value)} ${labels[0]}`;
}

function formatBytesSI(value: number, decimals: number) {
return formatValue(value, bytesSILabels, 1000, decimals);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a fair amount of logic here that we'd want to have unit tests for. Can we add those to get full test coverage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I'm still in the process of figuring out how to run tests from my dev container, once I get that running I'll see about getting coverage on all new branches

@rusackas
Copy link
Member

I think this approach is great, but I also wonder if you're aware of the Advanced Types feature. It's a bit lesser-known, but this might be an interesting application for it. The original implementation of it was to match different formats of IP addresses to one another (acting as a bit of a translation layer, if you will). Having a "fille size" type might allow you to match data where columns contain different units, and do the appropriate conversions, e.g. 1000KB = 1MiB, or 1024K = 1MB, etc.

Here's the original SIP that explains it better than I'm doing :)
#17852

@michael-s-molina michael-s-molina removed the review:checkpoint Last PR reviewed during the daily review standup label Feb 18, 2025
@rusackas
Copy link
Member

re-running CI 🤞

@tshallenberger
Copy link
Contributor Author

Sorry for the delayed update on the PR: @rusackas we weren't able to get advanced types to work for our particular use case, so I took the comments and revisited the code change. It's still a work in progress but I think the best approach is to simply add an extra config to createMemoryFormatter that specifies if it's a data transfer rate or not, and therefore appends the '/s' (since data transfer rates are usually specified in bits/bytes per second, I don't think we need to worry about supporting per minute or per hour).

so the final setupFormatters.ts would look kinda like:

.registerValue('MEMORY_DECIMAL', createMemoryFormatter({ binary: false }))
.registerValue('MEMORY_BINARY', createMemoryFormatter({ binary: true }))
// new
.registerValue('MEMORY_TRANSFER_RATE_DECIMAL', createMemoryFormatter({ binary: false, transfer: true }))
.registerValue('MEMORY_TRANSFER_RATE_BINARY', createMemoryFormatter({ binary: true, transfer: true }))

This way we heavily re-use the createMemoryFormatter and simplify things a bit.

@villebro Thoughts?

@pull-request-size pull-request-size bot added size/M and removed size/L labels Feb 21, 2025
@tshallenberger tshallenberger changed the title feat(number-format): adds network number formatters for SI and IEC data transfer rates feat(number-format): adds memory data transfer rates in binary and decimal format Feb 21, 2025
@rusackas
Copy link
Member

Thanks for staying on top of this! Re-running CI 🤞

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, LGTM!

@villebro villebro merged commit bc02f05 into apache:master Feb 24, 2025
51 of 52 checks passed
@tshallenberger tshallenberger deleted the feat/network-number-formatter branch February 24, 2025 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
change:frontend Requires changing the frontend explore:control Related to the controls panel of Explore packages size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants