Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Make DASH parsing first pass closer to original XML #1652

Open
wants to merge 1 commit into
base: misc/remove-dash-native
Choose a base branch
from

Conversation

peaBerberian
Copy link
Collaborator

@peaBerberian peaBerberian commented Feb 25, 2025

Based on #1482 (removal of our "native" MPD parser).

Initial problem: DASH Patch Document

I was looking into the implementation of the "MPD Patch Document", which allows for theoretically more optimal MPD updates by only communicating what changed since the last update.
From what I can see, it relies on a special format indicating that some XML attribute or element is either added, removed or updated compared to the last loaded MPD (also an XML).

Any element or attribute seems to be updateable here, which is a problem I would imagine for most DASH players as we generally transform a loaded MPD (the original XML document) before using it, often in a new format that looses a lot of information from the original XML, including the original structure (e.g. was a SegmentTemplate at the Period or at the Representation level?).
One of the strategies, seen in the shaka-player, could be to only consider the most sensible subset of properties that are generally updated, but I was a little afraid this led to a insufficient implementation status for the application or packagers, which may need this for things we did not prepare for yet.

Interestingly, the DASH 5th edition specification has the following sentence on that subject:

To allow for DASH client implementation efficiency, the in-memory MPD need not be stored in full XML structure format, but it must be stored such that all DASH defined identifiers and element order of multiple events of the same type are preserved. See 5.15.3.3 for the XPath addressing modes permitted by this document.

So we have to have a preferably structurally non-destructive and efficient (not re-parsing everything each time and preferably memory-efficient) way to store the previous MPD.

Our MPD "Intermediate Representation"

We had already such a format (parsed XML, but with all the original structural information still here), which we called the "MPD intermediate representation", which is outputed by our first pass of our DASH MPD parsers.

The MPD Intermediate Representation is basically the XML format transformed it into a JS Object, with what were semantically dates and numbers transformed into JavaScript's numbers, what were XML elements into objects and so on.

Here we could e.g. keep this object around when we detect that "MPD patching" is possible,

But there were still small modifications compared to the original structure. Minor ones like the lang attribute becoming language and more major ones like some elements having a special syntax, some children being added to an array and other not etc.

We could also argue that the transformation of the original textual format into numbers (e.g. ISO8601 duration into number of seconds) could be a loss of information, but I cannot think of any case where the format in which an attribute or Element inner content would be important to keep for Patch Documents.

This proposal

So here I propose just to greatly simplify the output of that first pass:

  • All Elements output an object with at most 3 properties: children, attributes and value (for the stringified inner content)
  • The case and exact naming is kept: elements are now in Pascal cases like in the original XML
  • All children are put in arrays as there may technically be multiple ones when parsing

There are still some exceptions to this: some elements (SegmentTimeline, EventStream), still have a special syntax for now because it would not be as straigtforward to have this format for them for example.

This work does not even begin to implement PATCH Documents, but I thought that it was a good standalone work anyway, as it makes the first parsing pass much easier to update: just keep the same structure than in the original XML.

@peaBerberian peaBerberian added the Priority: 3 (Low) This issue or PR has a low priority. label Feb 26, 2025
@peaBerberian peaBerberian force-pushed the misc/simplify-node-parsers branch 2 times, most recently from 8dbe03b to 02ee531 Compare March 3, 2025 15:35
This is to facilitate code understanding as well as a first step to
facilitate DASH PATCH MPD, by having a less destructive first pass
regarding the original XML.
@peaBerberian peaBerberian force-pushed the misc/simplify-node-parsers branch from 02ee531 to 2f369bd Compare March 3, 2025 16:12
@canalplus canalplus deleted a comment from github-actions bot Mar 3, 2025
@canalplus canalplus deleted a comment from github-actions bot Mar 3, 2025
Copy link

github-actions bot commented Mar 3, 2025

Automated performance checks have been performed on commit 2f369bdd52525a59371539a7056f377e1ff6e7cf with the base branch misc/remove-dash-native.

Tests results

✅ Tests have passed.

Performance tests 1st run output

No significative change in performance for tests:

Name Mean Median
loading 19.95ms -> 20.45ms (-0.503ms, z: 2.25566) 27.00ms -> 27.30ms
seeking 15.27ms -> 13.30ms (1.972ms, z: 0.72257) 11.25ms -> 11.10ms
audio-track-reload 26.26ms -> 26.30ms (-0.039ms, z: 0.83618) 38.25ms -> 38.25ms
cold loading multithread 47.70ms -> 47.34ms (0.364ms, z: 7.35938) 69.60ms -> 68.85ms
seeking multithread 13.89ms -> 17.93ms (-4.040ms, z: 0.81445) 10.35ms -> 10.35ms
audio-track-reload multithread 25.96ms -> 25.97ms (-0.012ms, z: 0.92321) 38.10ms -> 38.10ms
hot loading multithread 16.18ms -> 16.32ms (-0.139ms, z: 3.24461) 23.40ms -> 23.70ms

If you want to skip performance checks for latter commits, add the skip-performance-checks label to this Pull Request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: 3 (Low) This issue or PR has a low priority.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant