While trying to parse an Amazon pay file, I stumbled across this particular problem:

Exception: Unexpected quote at 1:2 (CSV::MalformedCSVError)

The reason: Amazon Pay encodes it’s files in UTF-8 with BOM. The BOM is an optional (in UTF-8 files) marker. You can see it with the tool xxd for instance

xxd apolish_amazon_2018DecMonthlyTransaction.csv | less

image

The three dots at the beginning of the file are the BOM. (EF BB BF in Hex).

Dealing with the BOM

Setting a different file encoding will unfortunately not help, as iconv seems to pass the BOM on unchanged.

There are no “special” encoding variations you could use, see my list for iconv here.

Therefore we need to remove it by reading three bytes.

apf = File.open(my_filename)

discard = apf.gets(3)

puts (discard.inspect)
apf.set_encoding(„UTF-8“, nil)

Further reading: