Support parsing for m3u and m3u8 files #187

nitika080289 · 2021-01-06T11:40:09Z

Closes #186

While investigating m3u files, I found that there are three types that could be supported: a plain text file, a file with extended directives, and a file containing HLS m3u extensions.
https://en.wikipedia.org/wiki/M3U
Playlists exported from media players fall into the second and third categories. I think that recognising a plain text file as m3u purely on the basis of its file extension might not be the right thing to do, and as far as I understand there is no other way of recognising it.
Files in the second and third categories contain a header, #EXTM3U, which can be used to recognise them. This is what I am currently doing here. So this logic will not recognise a plain text m3u file, but it will recognise an extended m3u file or one with HLS m3u extensions.
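
The header check described above can be sketched roughly as follows. This is only an illustration under assumptions: `extended_m3u?` and `M3U_HEADER` are hypothetical names, not the parser added in this PR, and the sketch only checks that the stream begins with the #EXTM3U header.

```ruby
require 'stringio'

# Hypothetical constant: the header shared by extended and HLS m3u files.
M3U_HEADER = '#EXTM3U'.b

# Hypothetical helper: returns true when the IO starts with the header.
# Plain text m3u files have no header, so they are deliberately not matched.
def extended_m3u?(io)
  head = io.read(M3U_HEADER.bytesize)
  return false if head.nil? # empty input

  head.b == M3U_HEADER
end

extended = StringIO.new("#EXTM3U\n#EXTINF:123,Artist - Title\nsong.mp3\n")
plain    = StringIO.new("song.mp3\nother.mp3\n")

puts extended_m3u?(extended) # true
puts extended_m3u?(plain)    # false
```

Only the first few bytes are read, so the check stays cheap even for large playlists.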

Also, the directives differ between the second and third categories, and extracting information from these files seems complicated. I will create another ticket for extracting more information from the files and work on it separately.

linkyndy

Looks good!

I would say, if we don't need more metadata/directives from the file, we can just ignore them for now.

fabioperrella · 2021-01-06T15:07:02Z

You can also add this new format to "Currently supported filetypes" in the README!

linkyndy · 2021-01-07T12:02:56Z

lib/parsers/m3u_parser.rb

+
+    FormatParser::Text.new(
+      format: :m3u,
+      size: io.size


Just want to be sure: will this read the whole file in order to get the size? Because if it does, it might be wiser not to include it (or implement it in a different way).

julik

Everything looks fine. I would be careful with using a property called size for a Text, because it is unclear what the unit of size is. Is it a byte? (In this case it is.) Is it a codepoint? (It would be if the data read were a UTF-8 String and we called size on it.) Is it a grapheme cluster? Number of lines? Number of paragraphs?

Also I wonder - do we need to pass the size of the IO to the caller? How is this information going to be used? Maybe we could surface the computed IO size in bytes in all Result types if that information is useful and can avoid an extra HTTP request?
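
To make the ambiguity concrete, here is a small Ruby sketch showing how the same text yields a different "size" for each unit. The sample string is made up for illustration and is not taken from the PR.

```ruby
# "cafe" followed by a combining acute accent, a newline,
# a rocket emoji, and a final newline.
text = "cafe\u0301\n\u{1F680}\n"

bytes      = text.bytesize                    # size in bytes
codepoints = text.size                        # size in Unicode codepoints
graphemes  = text.each_grapheme_cluster.count # user-perceived characters
lines      = text.lines.count                 # number of lines

puts "bytes: #{bytes}"           # bytes: 12
puts "codepoints: #{codepoints}" # codepoints: 8
puts "graphemes: #{graphemes}"   # graphemes: 7
puts "lines: #{lines}"           # lines: 2
```

A name such as bytesize states the unit outright, which is why it reads less ambiguously than size.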

julik · 2021-01-07T12:49:04Z

size is available once a single read has been performed and will not incur any extra reads
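
One way a reader can know the total size after a single read is to take it from the response to a ranged request. The sketch below assumes that mechanism (a Content-Range style header such as "bytes 0-7/1008"); `RangedReader`, `read_range`, and the fake `fetcher` are hypothetical names for illustration and are not FormatParser's actual API.

```ruby
# Hypothetical reader that learns the total size from the first ranged read.
class RangedReader
  attr_reader :request_count

  # fetcher is any callable taking a byte Range and returning
  # [body, content_range_header].
  def initialize(fetcher)
    @fetcher = fetcher
    @size = nil
    @request_count = 0
  end

  def read_range(range)
    @request_count += 1
    body, content_range = @fetcher.call(range)
    # "bytes 0-7/1008": the total size is the number after the slash.
    @size ||= Integer(content_range[%r{/(\d+)\z}, 1])
    body
  end

  # Reuses the value captured during the first read; issues no request.
  def size
    raise 'size is unknown until the first read' if @size.nil?

    @size
  end
end

# Fake in-memory "server" standing in for an HTTP endpoint.
PLAYLIST = ("#EXTM3U\n" + "track.mp3\n" * 100).b
fetcher = lambda do |range|
  [PLAYLIST.byteslice(range), "bytes #{range.begin}-#{range.end}/#{PLAYLIST.bytesize}"]
end

reader = RangedReader.new(fetcher)
header = reader.read_range(0..7)
puts header.inspect       # "#EXTM3U\n"
puts reader.size          # 1008
puts reader.request_count # 1
```

Calling size after the first read reuses the captured value, so the request count stays at one.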

nitika080289 · 2021-01-08T09:19:46Z

Thanks for your input @julik! I just wanted to add some more information about the file along with the format. It would not be used by the previewing service, but I thought it might be meaningful information for other services using FormatParser outside of our org. If you think this can be removed for now, I will skip it. We can always look into it again if there are any requests to include size.

julik · 2021-01-08T10:21:36Z

@nitika080289 Let's remove it and replace it later with, say, a bytesize on all Result types?

linkyndy

Nice! 🚀

nitika080289 added 2 commits January 6, 2021 12:38

Support m3u format

6ed5bb3

Merge branch 'master' of ssh://github.com/WeTransfer/format_parser in…

49fd3c0

…to add-m3u8-parser

nitika080289 marked this pull request as ready for review January 6, 2021 13:16

nitika080289 requested review from julik, fabioperrella and linkyndy January 6, 2021 13:16

linkyndy reviewed Jan 6, 2021

spec/parsers/m3u_parser_spec.rb (outdated, resolved)

lib/m3u.rb (outdated, resolved)

lib/parsers/m3u_parser.rb (outdated, resolved)

fabioperrella reviewed Jan 6, 2021

lib/parsers/m3u_parser.rb (outdated, resolved)

Implemented review suggestions

cc89558

linkyndy reviewed Jan 6, 2021

lib/parsers/m3u_parser.rb (outdated, resolved)

lib/parsers/m3u_parser.rb (outdated, resolved)

Updated regex and supported formats

ecc9f90

linkyndy reviewed Jan 7, 2021

julik reviewed Jan 7, 2021

Remove size

f3bf637

nitika080289 requested review from julik, linkyndy and fabioperrella January 8, 2021 11:28

linkyndy approved these changes Jan 8, 2021

fabioperrella approved these changes Jan 8, 2021

julik approved these changes Jan 8, 2021

nitika080289 merged commit 5862cc8 into master Jan 8, 2021
