-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Does not return metadata for some mp3 files #163
Labels
Comments
I will try to understand how TinyTag can read this file and maybe replicate the logic to format_parser |
fabioperrella
added a commit
that referenced
this issue
Sep 16, 2020
In MP3Parser, the logic to jump to the next bytes was wrong. To debug it, a puts was added to parse_mpeg_frames method as follows: ``` seek_jmp = sync_bytes_offset_in_4_byte_seq(four_bytes) puts "frame: #{frame_i}, pos: #{io.pos-4}, pos(h): #{(io.pos-4).to_s(16)} four: #{four_bytes}, hexa: #{four_bytes.map {|ii| ii.to_s(16)} }, jump: #{seek_jmp}" if seek_jmp > 0 io.seek(io.pos - 0 + seek_jmp) next end ``` Then the result was: ``` frame: 183, pos: 1474, pos(h): 5c2 four: [0, 255, 251, 148], hexa: ["0", "ff", "fb", "94"], jump: 1 frame: 184, pos: 1479, pos(h): 5c7 four: [0, 0, 0, 0], hexa: ["0", "0", "0", "0"], jump: 4 ``` The expected behavior would be jumping to offset 1475, but in reality we went +4 because we didn't take into account the 4 bytes that were already read at the beginning of the method performing the read. ``` # lib/parsers/mp3_parser.rb:139 data = io.read(4) ``` We fixed to subtract this 4 bytes, as follows: ``` if seek_jmp > 0 io.seek(io.pos - 4 + seek_jmp) next end ``` And the result was: ``` frame: 366, pos: 1474, pos(h): 5c2 four: [0, 255, 251, 148], hexa: ["0", "ff", "fb", "94"], jump: 1 frame: 367, pos: 1475, pos(h): 5c3 four: [255, 251, 148, 196], hexa: ["ff", "fb", "94", "c4"], jump: 0 ``` It was also necessary to change MAX_FRAMES_TO_SCAN from 128 to 500, since the previous value wasn't enough to reach the first valid frame in some files. After this modification, the MP3Parser started to identify PNG as MP3 and because of that it was added a condition to detect the PNG header bytes and skip this parser in that condition. This can happen, because the MP3 file format is actually very lax. Later on we might need to consider adding a confidence score to the MP3 parser and only returning the parsed result if it is above a certain value. Co-authored-by: Julik Tarkhanov <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
For some mp3 files, the MP3Parser class doesn't parse the mpeg frames correctly and because of that it fails when tries to find the
frame_bitrate
.test.mp3.zip
This is an example of file with that problem ☝️ . @linkyndy told me that the owner of the file authorized us to use this file as an example, but I'm not sure if it's necessary to mention his name here, is it?
From what I saw, when the parser finds the sync bytes, it's not getting a valid mpeg_id.
These are the expected values according to the docs:
For this file, the first frame that was found returns the value 01, which is
reserved
.Because of that, it can't find the frame_bitrate and it raises an error
InvalidDeepFetch
.I tested parsing the file with tinytag, which the method
parse_mpeg_frames
was based on , and it worked:The text was updated successfully, but these errors were encountered: