UnicodeDecodeError first time doing anything with Python need some help #69

Despirited · 2025-01-06T09:27:34Z

I exported both my movies and my shows from Trakt. The movies imported successfully, but when I try to import my shows I get this error:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 7868: character maps to <undefined>

I have no idea what the issue is, the CSV I exported is this:
episodes_views.csv

Here is the entire message I got:

Options: Namespace(config='config.ini', input=<_io.TextIOWrapper name='episodes_views.csv' mode='r' encoding='cp1252'>, watched_at=True, rated_at=False, format='imdb', type='episodes', list='history', seen=False, clean=False, verbose=True)
Config file: config.ini
Config: <configparser.ConfigParser object at 0x00000239A28707D0>
Trakt, skipped access token refresh, token is less than 30 days, only 1:06:27.085743
Trakt: {'client_id': '02....05', 'client_secret': '64....33', 'access_token': 'fa....74', 'refresh_token': '1a....ee', 'baseurl': 'https://api.trakt.tv'}
Authorization header: Bearer fa....74
trakt-api-key header: 02....05
Traceback (most recent call last):
  File "C:\Python312\import_trakt.py", line 517, in <module>
    main()
  File "C:\Python312\import_trakt.py", line 449, in main
    read_ids = read_csv(options)
               ^^^^^^^^^^^^^^^^^
  File "C:\Python312\import_trakt.py", line 154, in read_csv
    return list(reader)
           ^^^^^^^^^^^^
  File "C:\Python312\Lib\csv.py", line 116, in __next__
    row = next(self.reader)
          ^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 7868: character maps to <undefined>


xbgmsharp · 2025-01-06T10:08:29Z

Can you try with Python 3.11 or Python 3.10? Python 3.12 is known to have critical changes.

Despirited · 2025-01-06T12:50:56Z

I just tried Python 3.11 and I still get the same error, just at a different position: 7975 instead of 7868.

NanoBitrin · 2025-01-25T00:14:01Z

Using ChatGPT and Python 3.12.8, I found a way to bypass this error.

First, fix the CSV file, because it is badly structured, with too many double quotes.
Create a .py file with the following code and run it:

import csv

input_file = "export_episodes_history.csv"
output_file = "export_episodes_history_cleaned.csv"

with open(input_file, mode='r', encoding='utf-8-sig') as infile, \
     open(output_file, mode='w', encoding='utf-8', newline='') as outfile:
    
    # Read the original CSV
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    
    for row in reader:
        # Fix the header
        if reader.line_num == 1:
            fixed_header = [col.replace('""', '').strip('"') for col in row]
            writer.writerow(fixed_header)
        else:
            # Fix duplicated double quotes in data rows
            fixed_row = [col.replace('""', '"').strip('"') for col in row]
            writer.writerow(fixed_row)

print(f"Cleaned file saved as: {output_file}")

Then run the import command again:

python import_trakt.py -c config.ini -f tmdb -i export_episodes_history_cleaned.csv -l history -t episodes -w

If the error still exists, change line 153 of import_trakt.py from this:

reader = csv.DictReader(options.input, delimiter=',')

to this:

reader = csv.DictReader(open(options.input.name, mode='r', encoding='utf-8-sig'), delimiter=',')

Then try again. It worked for me.
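
For anyone who prefers not to edit line 153 in place, here is a minimal sketch of the same idea as a standalone helper. The name read_rows and the column names in the usage note are hypothetical and not part of import_trakt.py. It relies on the fact that the utf-8-sig codec strips a leading BOM if one is present and otherwise behaves like plain UTF-8, so it should read both kinds of file.

```python
import csv


def read_rows(path, encoding="utf-8-sig"):
    """Read a CSV file into a list of dicts using an explicit encoding.

    Hypothetical helper for illustration only. 'utf-8-sig' drops a leading
    BOM when there is one, so the first column name is not polluted by it.
    """
    # newline="" is what the csv module documentation recommends for files
    # handed to csv readers; the with-block closes the file afterwards.
    with open(path, mode="r", encoding=encoding, newline="") as handle:
        return list(csv.DictReader(handle, delimiter=","))
```

Calling read_rows("export_episodes_history.csv") should then return one dict per row, keyed by whatever header names the export wrote.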

xbgmsharp · 2025-01-25T09:37:21Z

Which operating system are you using? utf-8-sig is needed if the file has a BOM.
The export output is in UTF-8 format: https://github.com/xbgmsharp/trakt/blob/master/export_trakt.py#L158
The import could handle UTF-8 better; the input file is opened directly by the argparse library:
https://github.com/xbgmsharp/trakt/blob/master/import_trakt.py#L355
https://github.com/xbgmsharp/trakt/blob/master/import_trakt.py#L151

The default read encoding depends on the Operating System.
https://docs.python.org/3/glossary.html#term-filesystem-encoding-and-error-handler
https://docs.python.org/3/glossary.html#term-locale-encoding
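
To illustrate the point about the default encoding, here is a small sketch that reports the encoding open() falls back to when none is given, and shows why UTF-8 bytes can fail under cp1252. The sample title is made up for the demonstration and is not taken from the reported CSV. The letter "č" encodes in UTF-8 to the bytes 0xC4 0x8D, and 0x8D has no character assigned in Python's cp1252 table, which matches the byte named in the traceback.

```python
import locale


def default_text_encoding():
    """Return the encoding open() uses in text mode when none is passed."""
    return locale.getpreferredencoding(False)


def try_decode(raw, encoding):
    """Decode raw bytes, returning the text or a short failure message."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"failed at byte {exc.start}: {exc.reason}"


# Made-up sample: "č" becomes the two bytes 0xC4 0x8D in UTF-8.
SAMPLE = "Kovač".encode("utf-8")

print("default encoding:", default_text_encoding())
print("as utf-8 :", try_decode(SAMPLE, "utf-8"))
print("as cp1252:", try_decode(SAMPLE, "cp1252"))
```

On a Windows machine with a Western European locale the first line will typically report cp1252, which is consistent with the encoding='cp1252' shown in the Options line of the report.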

Also, the quotes are needed.

To solve the issue, can you try replacing line 355 with the following code?
https://github.com/xbgmsharp/trakt/blob/master/import_trakt.py#L355

parser.add_argument('-i', '--input',
                    help='CSV file to import, default %(default)s',
                    nargs='?', type=lambda f: open(f, mode='r', encoding='utf-8'),
                    default=None, required=True)
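
The snippet above can be tried in isolation with a sketch like the following. The names open_utf8 and build_parser are hypothetical and only the -i/--input option is reproduced; the real script defines more options. The idea is that argparse calls the type callable with the path string, so the file is opened as UTF-8 regardless of the operating system's locale default.

```python
import argparse


def open_utf8(path):
    """Open a file for reading as UTF-8 instead of the locale default."""
    return open(path, mode="r", encoding="utf-8")


def build_parser():
    """Build a parser with only the -i/--input option, for illustration."""
    parser = argparse.ArgumentParser(description="Encoding test sketch")
    parser.add_argument("-i", "--input",
                        help="CSV file to import",
                        type=open_utf8, required=True)
    return parser
```

For example, build_parser().parse_args(["-i", "episodes_views.csv"]).input would be a text file object whose encoding attribute is 'utf-8'. If the file starts with a BOM, passing encoding="utf-8-sig" instead may be the safer choice.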
