COREML BERT Crashing on long text

For documents with many words, BERT crashes with the following error:
`Fatal error: 'try!' expression unexpectedly raised an error: App.TokenizerError.tooLong("Token indices sequence length is longer than the specified maximum\nsequence length for this BERT model (784 > 512. Running this\nsequence through BERT will result in indexing errors\".format(len(ids), self.max_len)")`

How can this be solved? Or can BERT only handle passages with fewer words? Could we increase `maxLen` to 1024 or even 2048, or would that not work?
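One workaround I am considering is splitting the token ids into overlapping windows that each fit the 512 limit and running the model once per window. Below is a minimal sketch, assuming the tokens are available as an `[Int]` array. `slidingWindows`, `maxLen`, and `stride` are hypothetical names for illustration, not part of this project's tokenizer. In practice the window size would probably need to be smaller than 512 to leave room for special tokens such as `[CLS]` and `[SEP]`.

```swift
// Hypothetical helper, not part of the project's tokenizer API.
// Splits token ids into overlapping windows of at most `maxLen` tokens,
// advancing by `stride` tokens each step.
func slidingWindows(_ tokens: [Int], maxLen: Int, stride: Int) -> [[Int]] {
    precondition(maxLen > 0 && stride > 0 && stride <= maxLen,
                 "stride must be in 1...maxLen")
    // Short inputs already fit in a single window.
    guard tokens.count > maxLen else { return [tokens] }

    var windows: [[Int]] = []
    var start = 0
    while start < tokens.count {
        let end = min(start + maxLen, tokens.count)
        windows.append(Array(tokens[start..<end]))
        // Stop once the final token has been covered.
        if end == tokens.count { break }
        start += stride
    }
    return windows
}
```

With the 784 tokens from the error above, `slidingWindows(ids, maxLen: 512, stride: 256)` would yield three windows. The per-window results would still have to be combined somehow, for example by keeping the highest-scoring answer, and that part is not shown here.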
