Bug: codevalidator -f crashes and leaves my file empty when using non-ASCII characters #37
Description
Summary
When a YAML file contains a non-ASCII character as well as trailing whitespace, codevalidator -f filename crashes with a (not very useful) error message and leaves the file empty, without even creating a backup copy.
Fortunately I did a git add beforehand.
How to reproduce
I'm using codevalidator 0.8.2, judging from pip show codevalidator (codevalidator itself doesn't have a --version option).
Here is an example file (encoded as UTF-8):
definitions:
  purchase_order:
    type: object
    description: |
      An either sparse or complete representation of a purchase order. 
      (TODO: The definition here is not yet complete – e.g. positions
      are missing.)
This has a trailing space in line 5 and an en dash (–) in line 6. I guess the latter causes codevalidator to crash, while the former is what makes it attempt a fix in the first place.
Here is the output for this file:
$ codevalidator -v -f backend/src/main/resources/api/swagger-purchase-order.yaml
backend/src/main/resources/api/swagger-purchase-order.yaml: contains lines with trailing whitespace
backend/src/main/resources/api/swagger-purchase-order.yaml: Trying to fix notrailingws..
Traceback (most recent call last):
  File "/usr/local/bin/codevalidator", line 9, in <module>
    load_entry_point('codevalidator==0.8.2', 'console_scripts', 'codevalidator')()
  File "/usr/local/lib/python2.7/dist-packages/codevalidator.py", line 953, in main
    fix_files()
  File "/usr/local/lib/python2.7/dist-packages/codevalidator.py", line 878, in fix_files
    fix_file(fname, rules)
  File "/usr/local/lib/python2.7/dist-packages/codevalidator.py", line 866, in fix_file
    fd.write(fixed.encode())
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 191: ordinal not in range(128)
No .pre-cvfix file is created in this case (but one is created if I replace the – with -).
It looks like it is using 'ascii' as the codec instead of a Unicode one.
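A likely explanation, given the python2.7 paths in the traceback: in Python 2, calling .encode() on a byte string first decodes it implicitly with the default codec (ASCII), which is why an encode call ends in a UnicodeDecodeError. The sketch below emulates that implicit step explicitly so it also runs on Python 3. The sample is a reconstruction of the file above assuming two-space indentation, and the helper names are hypothetical, not taken from codevalidator.

```python
# Hypothetical reconstruction of the sample file (two-space indentation assumed),
# including the trailing space at the end of line 5.
SAMPLE = (
    "definitions:\n"
    "  purchase_order:\n"
    "    type: object\n"
    "    description: |\n"
    "      An either sparse or complete representation of a purchase order. \n"
    "      (TODO: The definition here is not yet complete \u2013 e.g. positions\n"
    "      are missing.)\n"
).encode("utf-8")


def strip_trailing_ws(data):
    """Remove trailing spaces and tabs from every line of a byte string."""
    return b"\n".join(line.rstrip(b" \t") for line in data.split(b"\n"))


def ascii_decode_error(data):
    """Return the UnicodeDecodeError raised by an ASCII decode, or None."""
    try:
        # This is the step Python 2 performs implicitly in bytes.encode().
        data.decode("ascii")
    except UnicodeDecodeError as err:
        return err
    return None


fixed = strip_trailing_ws(SAMPLE)
err = ascii_decode_error(fixed)
print(err)
```

Under these assumptions the first offending byte is the 0xe2 that starts the UTF-8 encoding of the en dash, and its offset in the whitespace-stripped content comes out as 191, the same position the traceback reports.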
Expected behaviour
Codevalidator should be able to handle UTF-8 encoded files.
Even if it is not able to fix my file, it should certainly not destroy it (overwriting it with an empty file).
Even if it does that, it should create a backup copy (unless being told not to by --no-backup).
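One way to get all three properties is sketched below; this is not codevalidator's actual code. It reads and writes with an explicit encoding, runs the fix before touching anything on disk, creates the backup first, and only then swaps in the new content via a temporary file. The function name fix_file_safely and its parameters are hypothetical; the .pre-cvfix suffix is the one mentioned above. os.replace requires Python 3.3 or later.

```python
import io
import os
import shutil
import tempfile


def fix_file_safely(path, fix, backup_suffix=".pre-cvfix", encoding="utf-8"):
    """Apply fix(text) to a file without ever leaving it empty.

    Returns True if the file was changed, False otherwise.
    """
    with io.open(path, "r", encoding=encoding, newline="") as fd:
        original = fd.read()

    # Any exception raised here leaves the file on disk untouched.
    fixed = fix(original)
    if fixed == original:
        return False

    # Create the backup before modifying anything.
    shutil.copy2(path, path + backup_suffix)

    # Write to a temporary file in the same directory, then swap it in.
    directory = os.path.dirname(os.path.abspath(path))
    tmp_fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with io.open(tmp_fd, "w", encoding=encoding, newline="") as tmp:
            tmp.write(fixed)
        shutil.copymode(path, tmp_path)
        os.replace(tmp_path, path)
    except Exception:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return True
```

With this structure, a failure while fixing or encoding happens either before the original is opened for writing or while only the temporary file is affected, so the worst case is an unchanged file plus a backup copy.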
Workaround
Make sure only ASCII characters are used in files passed to codevalidator.
And if you are not sure, make a copy of the file beforehand.
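To check a file before passing it to codevalidator, a small script along these lines could be used. It is only a sketch, and find_non_ascii is a hypothetical helper name, not part of codevalidator.

```python
def find_non_ascii(path):
    """Return (line_number, byte_offset) of the first non-ASCII byte, or None.

    Line numbers start at 1; the offset is the 0-based byte position in the line.
    """
    with open(path, "rb") as fd:
        for line_number, line in enumerate(fd, start=1):
            for offset, byte in enumerate(bytearray(line)):
                if byte > 0x7F:
                    return line_number, offset
    return None
```

A result of None means the file is pure ASCII and should not trigger the crash described here.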