Problem description
Greetings,
in my quest to have a proper package for daps in official Debian, I have stumbled upon this series of lintian warnings:
W: daps: national-encoding [etc/daps/xep/hyphen/dehyph_rx.tex]
W: daps: national-encoding [etc/daps/xep/hyphen/huhyph_rx.tex]
W: daps: national-encoding [etc/daps/xep/hyphen/ithyph_rx.tex]
W: daps: national-encoding [etc/daps/xep/hyphen/ruhyphal.tex]
The reason for the warning is the following:
A file is not valid UTF-8.
Debian has used UTF-8 for many years. Support for national encodings is being phased out. This file probably appears to users in mangled characters (also called mojibake).
Packaging control files must be encoded in valid UTF-8.
Please convert the file to UTF-8 using iconv or a similar tool.
I can see that this makes perfect sense with respect to the hyphenation files: they are indeed for languages that require diacritical signs not present in us-ascii, for example.
So my question is the following: would it be correct to convert these files to UTF-8 with a command line like this one?
iconv --from-code=ISO_8859-1 --to-code=UTF-8// -o file-new file
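For what it's worth, the conversion could be wrapped as in the sketch below. It writes to a new file and then checks that the result is valid UTF-8 by passing it through a UTF-8 to UTF-8 conversion. The helper name to_utf8 and the output file name are made up for illustration, and the source encoding is an argument because it has to be known for each file rather than assumed.

```shell
#!/bin/sh
# Sketch only: to_utf8 is a hypothetical helper, not part of daps or lintian.
# Usage (file names are examples): to_utf8 ISO-8859-1 dehyph_rx.tex dehyph_rx.tex.utf8
to_utf8() {
    src_enc=$1   # encoding the input is believed to be in
    in=$2        # original file, left untouched
    out=$3       # new UTF-8 file
    iconv -f "$src_enc" -t UTF-8 "$in" > "$out" || return 1
    # Valid UTF-8 survives a UTF-8 to UTF-8 pass byte for byte.
    iconv -f UTF-8 -t UTF-8 "$out" | cmp -s - "$out"
}
```

A zero exit status only says that the output is well-formed UTF-8. It does not say that the source encoding was the right one, so the converted patterns still need a visual check.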
Note: the file huhyph_rx.tex has the following first line:
% ISO8859-2
but when checked with file -i huhyph_rx.tex, the encoding is reported as charset=iso-8859-1 (vim agrees: it reports latin-1, which I believe is a synonym for iso-8859-1).
I take this to mean that the first line of the file is not binding and that we can freely recode these files to UTF-8.
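One caveat worth checking first: ISO-8859-1 and ISO-8859-2 are both single-byte encodings that use the same byte range, so a tool that only looks at the bytes has little to go on when telling them apart, and the charset reported by file -i is a guess. The sketch below shows how a single byte, 0xF5, turns into different UTF-8 sequences depending on the source encoding named on the command line. The helper name utf8_hex is made up for illustration.

```shell
#!/bin/sh
# Sketch only: utf8_hex is a hypothetical helper.
# It reads the single byte 0xF5 (octal 365) as if it were in the encoding
# given as $1 and prints the resulting UTF-8 bytes in hex.
utf8_hex() {
    printf '\365' | iconv -f "$1" -t UTF-8 | od -An -tx1 | tr -d ' \n'
}

utf8_hex ISO-8859-1; echo   # c3b5, i.e. U+00F5 (o with tilde)
utf8_hex ISO-8859-2; echo   # c591, i.e. U+0151 (o with double acute)
```

If the Hungarian patterns show o with tilde after conversion where o with double acute is expected, the % ISO8859-2 comment was right and that file should be converted from ISO-8859-2 instead. The same kind of check applies to the other files, each of which may have its own source encoding.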
If this idea is neither silly nor erroneous, would you make the change upstream so that it is included in the next version? For the time being, unless you object, I'll do the re-encoding myself.
What's your take on this?
Sincerely,
Filippo