API Requests: Eliminate unmarshalling/marshalling of provided JSON telemetry data blobs #26

Closed
rtamalin opened this issue Jun 27, 2024 · 1 comment

Comments

@rtamalin
Collaborator

Currently we store the provided JSON data blob in the client data store as is (potentially gzip compressed). However, when it comes time to generate a telemetry report JSON payload, we unmarshal the JSON telemetry data blobs into in-memory Go data structures to create in-memory telemetry bundles. These bundles are included in an in-memory telemetry report, which is then marshaled as a JSON telemetry report request payload.

It should be possible to define custom marshal and unmarshal handling, using MarshalJSON and UnmarshalJSON methods, for the payload field of the in-memory telemetry data item data structure. That would eliminate the unnecessary unmarshaling and re-marshaling of the provided JSON data blob.

We should be able to define a type to represent the telemetry data item's payload that implements MarshalJSON and UnmarshalJSON methods. When we create the in-memory representation of telemetry data items, telemetry bundles and telemetry reports, we would just store the JSON blob value untouched as the data item payload. Then, when we marshal the telemetry report to JSON, the MarshalJSON handler for the telemetry data item payload type would just return the JSON data blob as the encoded representation, without touching it.
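A minimal sketch of such a payload type follows. The names `RawPayload`, `DataItem`, `encodeItem` and `decodeItem`, and the `type`/`payload` field names, are hypothetical and only illustrate the idea; they are not taken from the telemetry library. The standard library's `json.RawMessage` already behaves this way, so a custom type may not be needed at all.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// RawPayload is a hypothetical payload type that keeps the encoded JSON
// exactly as provided instead of decoding it into Go data structures.
type RawPayload []byte

// MarshalJSON returns the stored bytes as the encoded representation.
func (p RawPayload) MarshalJSON() ([]byte, error) {
	if p == nil {
		return []byte("null"), nil
	}
	return p, nil
}

// UnmarshalJSON stores a copy of the raw bytes without decoding them.
func (p *RawPayload) UnmarshalJSON(data []byte) error {
	if p == nil {
		return errors.New("RawPayload: UnmarshalJSON on nil pointer")
	}
	*p = append((*p)[:0], data...)
	return nil
}

// DataItem is a hypothetical, simplified telemetry data item.
type DataItem struct {
	Type    string     `json:"type"`
	Payload RawPayload `json:"payload"`
}

// encodeItem wraps a provided blob in a data item and marshals it.
func encodeItem(telemetryType string, blob []byte) (string, error) {
	out, err := json.Marshal(DataItem{Type: telemetryType, Payload: RawPayload(blob)})
	return string(out), err
}

// decodeItem unmarshals a data item; the payload stays in encoded form.
func decodeItem(data []byte) (DataItem, error) {
	var item DataItem
	err := json.Unmarshal(data, &item)
	return item, err
}

func main() {
	out, err := encodeItem("EXAMPLE-TYPE", []byte(`{"version":1,"cpus":4}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(out)

	item, err := decodeItem([]byte(out))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(item.Payload))
}
```

One caveat: `encoding/json` still validates and compacts whatever `MarshalJSON` returns (and, with the default `json.Marshal`, escapes HTML characters), so the blob is embedded without being decoded into Go values, but it is not guaranteed to be byte-for-byte identical unless it was already compact.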

In essence, on the telemetry client side, the provided JSON blob should only ever be unmarshaled into a temporary variable when it is first received by the telemetry library, as part of validating that it encodes a JSON object. Thereafter we shouldn't touch the original encoded representation, except to potentially compress it as part of storing it in the client data store.
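The one-off validation step could look roughly like the sketch below. The function name `validateObject` is an assumption made for illustration; the point is that the decoded value is a throwaway and the original bytes are what gets stored.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// validateObject (hypothetical name) checks that blob encodes a JSON object.
// The decoded map is a temporary and is discarded; the caller keeps using
// the original, untouched bytes.
func validateObject(blob []byte) error {
	var tmp map[string]json.RawMessage
	if err := json.Unmarshal(blob, &tmp); err != nil {
		return fmt.Errorf("blob is not a JSON object: %w", err)
	}
	// A JSON null unmarshals into a map without error and leaves it nil.
	if tmp == nil {
		return errors.New("blob is JSON null, not an object")
	}
	return nil
}

func main() {
	for _, blob := range []string{`{"version":1}`, `[1,2,3]`, `null`, `not json`} {
		fmt.Printf("%-15s -> %v\n", blob, validateObject([]byte(blob)))
	}
}
```

Decoding into `map[string]json.RawMessage` keeps the check shallow: nested values are validated syntactically but are not expanded into Go data structures.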

Similarly, on the server side, when we receive a telemetry report request and unmarshal its JSON payload, we shouldn't touch the JSON data item representation before storing it in the Telemetry DB, unless there is a defined storage transform for that telemetry type.
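A rough sketch of the server-side handling, under stated assumptions: the `report` and `reportItem` shapes, the `storageTransform` signature and the `itemsToStore` helper are all hypothetical and much simpler than a real telemetry report. Items are decoded with `json.RawMessage` so their encoded form passes through unchanged unless a transform is registered for their type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reportItem and report are hypothetical, simplified request shapes.
type reportItem struct {
	Type string          `json:"type"`
	Data json.RawMessage `json:"data"`
}

type report struct {
	Items []reportItem `json:"items"`
}

// storageTransform is a hypothetical per-type hook applied before storage.
type storageTransform func(json.RawMessage) (json.RawMessage, error)

// itemsToStore decodes the request payload and returns the items to store.
// Item data is left in its encoded form unless a transform is defined
// for the item's telemetry type.
func itemsToStore(payload []byte, transforms map[string]storageTransform) ([]reportItem, error) {
	var r report
	if err := json.Unmarshal(payload, &r); err != nil {
		return nil, fmt.Errorf("decoding report: %w", err)
	}
	for i, item := range r.Items {
		tf, ok := transforms[item.Type]
		if !ok {
			continue // no transform defined: store the blob untouched
		}
		data, err := tf(item.Data)
		if err != nil {
			return nil, fmt.Errorf("transforming %q item: %w", item.Type, err)
		}
		r.Items[i].Data = data
	}
	return r.Items, nil
}

func main() {
	payload := []byte(`{"items":[{"type":"A","data":{"version": 1}},{"type":"B","data":{"version": 2}}]}`)
	transforms := map[string]storageTransform{
		"B": func(json.RawMessage) (json.RawMessage, error) {
			return json.RawMessage(`{"redacted":true}`), nil
		},
	}
	items, err := itemsToStore(payload, transforms)
	if err != nil {
		panic(err)
	}
	for _, it := range items {
		fmt.Printf("%s: %s\n", it.Type, it.Data)
	}
}
```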

rtamalin added a commit that referenced this issue Aug 2, 2024
The TelemetryBlob type provides helper methods that can be used to
validate that provided blobs:
* Are valid JSON
* Are JSON objects
* Contain a top-level "version" field
* Are not too big.

The validity of provided JSON blobs is checked when they are received
via a Generate() interface.

Also update client side item handling to leverage the json.RawMessage
type for storing the JSON blobs; this avoids undesirable processing of
the provided JSON blob data, which should remain untouched en route
to long term storage in the SUSE Telemetry service.

Minor restructuring of the client side library, moving limits to its
own subpackage to avoid an import loop when adding the CheckLimits()
helper method to the TelemetryBlob.

Fixes: #41, #26
@rtamalin
Collaborator Author

rtamalin commented Aug 2, 2024

This issue should now be resolved as part of #42

@rtamalin rtamalin closed this as completed Aug 2, 2024