Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mp3_parser to not raise error when a tag frame has an unsupported encoding #161

Merged
merged 15 commits into from
Sep 8, 2020

Conversation

fabioperrella
Copy link
Contributor

closes #160

I found a file which raises an error when calling the method tag.album (in id3v2), so I added a loop to call the member methods in the tag inside attempt_id3_v2_extraction. If some of them raise an error, it will consider the tag as invalid and return nil.

@fabioperrella fabioperrella self-assigned this Aug 28, 2020
@fabioperrella
Copy link
Contributor Author

@julik @linkyndy wdyt about that approach?

I think it's not possible to fix the parser to get the correct value of those attributes, but I'm not sure.

Next week I will download an id3tag editor and see if it recognizes those attributes.

@linkyndy
Copy link
Contributor

I think it's fair to treat it as nil, instead of failing the whole file analysis, given that ID3 tags are somewhat adjacent to MP3s (we can still use the latter even with malformed tags).

I'm curious how would separate editors deal with such tags.

@julik
Copy link
Contributor

julik commented Aug 31, 2020

Looping in @krists as he is the author of the ID3 parsing gem we are using as a dependency. What do you think about this one? Could there be a more generic way of recovering from encoding errors in ID3 parsing?

@fabioperrella
Copy link
Contributor Author

to give more information, this piece of code is trying to fetch the ENCODING_MAP with key 173:

From: /Users/fabioperrella/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/id3tag-0.12.1/lib/id3tag/frames/v2/text_frame.rb:29 ID3Tag::Frames::V2::TextFrame#source_encoding:

    27: def source_encoding
    28:   require 'pry-byebug'; binding.pry
 => 29:   ENCODING_MAP.fetch(get_encoding_byte) { raise UnsupportedTextEncoding }.to_s
    30: end

[1] pry(#<ID3Tag::Frames::V2::TextFrame>)> get_encoding_byte
=> 173

And there is no encoding like that:

ENCODING_MAP = {
  0b0 => Encoding::ISO8859_1,
  0b1 => Encoding::UTF_16,
  0b10 => Encoding::UTF_16BE,
  0b11 => Encoding::UTF_8
}

@krists
Copy link
Contributor

krists commented Sep 1, 2020

Hi!

I would not mind adding a configuration option in the parser where you could choose how to handle this situation - raise error or use fallback encoding. In later case there might be issues that the text might have some unexpected characters. But it is probabbly better than thrown exception.

Anyone would like to create a PR?

@fabioperrella
Copy link
Contributor Author

Hi @krists ! I can create a PR for that, tks!

@fabioperrella fabioperrella marked this pull request as ready for review September 4, 2020 15:26
@fabioperrella
Copy link
Contributor Author

I managed to create an mp3 file and edit it's bytes to simulate an unsupported encoding, like the example that I got from production!

I used the app "Tag Editor Free" to edit the V2 tags and SublimeText to edit the binary file and change the encoding.

@fabioperrella fabioperrella changed the title Fix mp3_parser to not raise error when a member of the tag has invali… Fix mp3_parser to not raise error when a tag frame has an unsupported encoding Sep 4, 2020
@@ -1,6 +1,11 @@
require 'ks'
require 'id3tag'

ID3Tag.configuration do |c|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to save the ID3Tag.configuration and restore it once the parser returns? It might be that the application configures ID3Tag differently than we do 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

humm good point! I think it's possible. I will do it next week!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@julik could you take a look if it is ok now, pls? I used ID3Tag.local_configuration for it!

fabioperrella and others added 2 commits September 7, 2020 15:53
When working on #161 we noticed that the method `TagWrapper#to_h` does not ignore empty strings and because of that, it can replace a id3v1tag by a empty id3v2 tag, so added a condition to skip for empty values.

Example before this change:
```
v1tag.genre = 'Rock'
v2tag.genre = ''

result: ""
```

Example after this change:
```
v1tag.genre = 'Rock'
v2tag.genre = ''

result: "Rock"
```

The mp3 used as a fixture was created by us. We used the tool `id3ed` (brew install id3ed) to edit the id3v1 tags
@fabioperrella fabioperrella requested a review from julik September 7, 2020 15:34
Copy link
Contributor

@julik julik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏 And so cool there is a local_config option

@julik julik merged commit d072478 into master Sep 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ID3Tag::Frames::V2::TextFrame::UnsupportedTextEncoding
5 participants