Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support parsing for m3u and m3u8 files #187

Merged
merged 5 commits into from
Jan 8, 2021
Merged

Conversation

nitika080289
Copy link
Contributor

@nitika080289 nitika080289 commented Jan 6, 2021

Closes #186

While investigating about the m3u files, I found that there can be three types supported. A plain text file, a file with extended directives and file containing hls m3u extensions.
https://en.wikipedia.org/wiki/M3U
Files which are extracted playlists from media players fall into second and third category. I think that recognising a plain text file as m3u just on the basis of file extension might not be the right thing to do and there is no other way of recognising it as per my understanding.
The second and third category files contain a header #EXTM3U which can be used to recognise them. This is what I am currently doing here. So, this logic will not recognise a plain text m3u file but will be able to recognise an extended m3u or hls m3u extension.

Also, the directives vary for the second and third category and seems complicated to extract the information from these files. I will create another ticket for extracting more information from the files and work on it separately

@nitika080289 nitika080289 marked this pull request as ready for review January 6, 2021 13:16
Copy link
Contributor

@linkyndy linkyndy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

I would say, if we don't need more metadata/directives from the file, we can just ignore them for now.

@fabioperrella
Copy link
Contributor

You can also add this new format in Currently supported filetypes on Readme!


FormatParser::Text.new(
format: :m3u,
size: io.size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to be sure: will this read the whole file in order to get the size? Because if it does, it might be wiser not to include it (or implement it in a different way).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size is available once a single read has been performed and will not incur any extra reads

Copy link
Contributor

@julik julik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks fine - I would be careful with using a property called size for a Text, the reason is that it is unclear what the unit of size is. Is it a byte? (in this case it is). Is it a codepoint? (if the data read was a String in UTF8 and we would call size on it it would be). Is it a grapheme cluster? Number of lines? Number of paragraphs?

Also I wonder - do we need to pass the size of the IO to the caller? How is this information going to be used? Maybe we could surface the computed IO size in bytes in all Result types if that information is useful and can avoid an extra HTTP request?


FormatParser::Text.new(
format: :m3u,
size: io.size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size is available once a single read has been performed and will not incur any extra reads

@nitika080289
Copy link
Contributor Author

Everything looks fine - I would be careful with using a property called size for a Text, the reason is that it is unclear what the unit of size is. Is it a byte? (in this case it is). Is it a codepoint? (if the data read was a String in UTF8 and we would call size on it it would be). Is it a grapheme cluster? Number of lines? Number of paragraphs?

Also I wonder - do we need to pass the size of the IO to the caller? How is this information going to be used? Maybe we could surface the computed IO size in bytes in all Result types if that information is useful and can avoid an extra HTTP request?

Thanks for your inputs @julik! I just wanted to add some more information about the file along with format. It would not be used by the previewing service but I thought it might be meaningful information for other services using FormatParser outside of our org. If you think this can be removed for now, I will skip it. We can always look into it again if there are any requests to include size.

@julik
Copy link
Contributor

julik commented Jan 8, 2021

@nitika080289 Let's remove it and replace it later with, say, a bytesize on all Result types?

Copy link
Contributor

@linkyndy linkyndy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! 🚀

@nitika080289 nitika080289 merged commit 5862cc8 into master Jan 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

M3U8 support
4 participants