The spec appears to only mention text, and the specific binary encoding / charset of that text seems out of scope.
Accordingly it seems to me as though cassava should generally be dealing with Text instead of ByteString, perhaps with a Data.Csv.Utf8 module for just directly treating ByteString values as encoded utf-8 text.
An ASCII delimiter in the undecoded ByteString corresponds to a delimiter in the UTF-8-decoded Text, because UTF-8 never uses ASCII byte values inside a multi-byte sequence. So under UTF-8 encoding there is no risk of misidentifying delimiters.
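As a small illustration of why that holds, here is a sketch that checks the property on a few sample characters. The helper name `noAsciiBytesInMultibyte` is made up for this example and is not part of cassava or text; it only uses `encodeUtf8` from `Data.Text.Encoding`.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical helper (illustration only): True when the character is ASCII,
-- or when every byte of its UTF-8 encoding is >= 0x80. In the second case no
-- byte of the sequence can be mistaken for ',' (0x2C), '"' (0x22) or '\n' (0x0A).
noAsciiBytesInMultibyte :: Char -> Bool
noAsciiBytesInMultibyte c =
  c < '\x80' || B.all (>= 0x80) (TE.encodeUtf8 (T.singleton c))

main :: IO ()
main = print (all noAsciiBytesInMultibyte "é, ü; 日本語 \"quoted\"")
```

This is only a spot check on sample input, not a proof; the general guarantee comes from the UTF-8 encoding rules themselves.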
However, the user is forced to use UTF-8 if there are Text/ShortText/Char fields (cassava assumes UTF-8 for those). To use another text encoding, they need to use ByteString fields and do the ByteString-to-Text conversion separately. Alternatively, they can transcode between UTF-8 and the other text encoding, passing UTF-8-encoded ByteStrings when interfacing with cassava.
I have no idea about the performance characteristics of each alternative, though, including the proposed Data.Csv.Utf8.
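A minimal sketch of the first alternative (ByteString fields, converted afterwards), assuming the input is ISO-8859-1 (Latin-1). The sample data and the name `decodeLatin1Rows` are invented for illustration; `decode` is cassava's and `decodeLatin1` is from `Data.Text.Encoding`. Latin-1 is a single-byte superset of ASCII, so its non-ASCII bytes cannot collide with the delimiter; other encodings may need more care.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import Data.Csv (HasHeader (NoHeader), decode)
import Data.Text (Text)
import qualified Data.Text.Encoding as TE
import qualified Data.Vector as V

-- Sample Latin-1 input (made up for this example): "caf\xE9,3\n".
-- 0xE9 is 'é' in ISO-8859-1 and is not valid UTF-8 on its own.
latin1Csv :: BL.ByteString
latin1Csv = BL.pack [0x63, 0x61, 0x66, 0xE9, 0x2C, 0x33, 0x0A]

-- Parse the text column as a raw ByteString, then decode it as Latin-1
-- outside of cassava instead of relying on its UTF-8 assumption for Text.
decodeLatin1Rows :: BL.ByteString -> Either String (V.Vector (Text, Int))
decodeLatin1Rows input = do
  rows <- decode NoHeader input :: Either String (V.Vector (B.ByteString, Int))
  pure (V.map (\(name, n) -> (TE.decodeLatin1 name, n)) rows)

main :: IO ()
main = case decodeLatin1Rows latin1Csv of
  Left err   -> putStrLn ("parse error: " ++ err)
  Right rows -> print rows
```

The extra pass over the rows is where any performance difference versus direct Text fields would come from; this sketch makes no claim about how large that is.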
To be clear, Data.Csv.Utf8 would just be the current implementation. The module name should make it clear that passing UTF-8 for the ByteString arguments is safe, and that non-UTF-8 arguments can hit edge cases and require additional care.
I am also unsure how a Text-based alternative (Data.Csv, Data.Csv.Text, or whatever it ends up being called) would change performance.
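For concreteness, one naive way a Text-facing entry point could be layered on top of the existing ByteString API is sketched below. `decodeText` is hypothetical and does not exist in cassava; it simply re-encodes the Text as UTF-8 and calls the current `decode`, which is an assumption about how such a module might start out, not a proposal for the final design.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Csv (FromRecord, HasHeader (NoHeader), decode)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import qualified Data.Vector as V

-- Hypothetical wrapper (not part of cassava): accept already-decoded Text,
-- re-encode it as UTF-8, and hand it to the existing ByteString-based parser.
decodeText :: FromRecord a => HasHeader -> T.Text -> Either String (V.Vector a)
decodeText hasHeader = decode hasHeader . BL.fromStrict . TE.encodeUtf8

main :: IO ()
main =
  print (decodeText NoHeader (T.pack "café,3\n")
           :: Either String (V.Vector (T.Text, Int)))
```

A wrapper like this pays for a full re-encoding of the input before parsing, so it says nothing about how a parser written natively against Text would perform.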