Using CUPS to print text files in non-UTF8 charset encoding

May 17th, 2012

At our university department, many people still haven’t migrated to UTF-8 and are still happily using ISO-8859-2, mainly because of the amount of legacy text documents (TeX, …).
Support for non-UTF-8 encodings is slowly waning, though, and CUPS is a prime example. Most of its (shabby anyway) support for non-UTF-8 encodings was removed a few years ago. It is still possible to force CUPS to print text files in a non-UTF-8 encoding if you extract the appropriate files from an ancient version of CUPS (1.2 or some such) to /usr/share/cups/charset/ and print using e.g. lpr -o document-format='text/plain;charset=iso-8859-2'. However, there is simply no support for lpr automatically setting the charset based on your locale.
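
For illustration, here is a minimal sketch of what a hand-made wrapper for that missing piece might look like. The helper name charset_option is hypothetical, and the sketch assumes that the lowercased output of locale charmap is a charset name your CUPS installation accepts, which depends on the charset files you have installed. It only prints the lpr command instead of running it.

```shell
#!/bin/bash
# charset_option is a hypothetical helper, not part of CUPS.
# It builds the document-format value for a given charset name.
charset_option() {
    local charset
    # Lowercase the name, e.g. ISO-8859-2 becomes iso-8859-2.
    charset=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    printf 'text/plain;charset=%s\n' "$charset"
}

# Take the charset from the current locale (falling back to UTF-8 if the
# locale command is unavailable) and show the command that would be run.
current=$(locale charmap 2>/dev/null || echo UTF-8)
echo lpr -o "document-format=$(charset_option "$current")" file.txt
```

This still needs the old charset files in place, so it does not remove the underlying problem.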

We decided that the best way to go is to auto-detect the encoding using the awesome enca package and convert text files from the detected encoding to UTF-8. This should actually be fairly fool-proof in practice, unless you are dealing with an extremely mixed set of languages. Making your own CUPS filter is easy: just change the texttops entries in /etc/cups/mime.convs to textautoencps and create a new /usr/lib/cups/filter/textautoencps file:

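    #!/bin/bash

    # CUPS calls filters as: filter job-id user title copies options [file]
    if [ $# -eq 0 ]; then
        echo >&2 "ERROR: $0 job-id user title copies options [file]"
        exit 1
    fi

    # Read the named file if given, otherwise stdin; convert to UTF-8 and
    # pass the five job arguments on so that texttops reads from its stdin.
    {
        if [ $# -ge 6 ]; then
            cat "$6"
        else
            cat
        fi
    } | enconv -x utf-8 -L czech | /usr/lib/cups/filter/texttops "${@:1:5}"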
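
The edit to the conversion rules can be done by hand, but a sed one-liner works too. The sketch below is only an illustration: rewrite_texttops is a hypothetical helper name, and the sample lines are assumptions about what such entries look like (MIME source type, destination type, cost, filter name), not copied from a real installation. It works on a temporary sample file; try it on a copy of your real rules file before touching the original.

```shell
#!/bin/bash
# rewrite_texttops is a hypothetical helper: it replaces the filter name
# texttops with textautoencps when it is the last field of a rule, and
# leaves comment lines alone. Reads the given file(s), or stdin if none.
rewrite_texttops() {
    sed -E '/^[[:space:]]*#/!s/([[:space:]])texttops[[:space:]]*$/\1textautoencps/' "$@"
}

# Illustrative sample rules (assumed format, not from a real system).
sample=$(mktemp)
printf '%s\n' \
    '# sample conversion rules' \
    'text/plain application/postscript 33 texttops' \
    'image/png application/vnd.cups-raster 100 imagetoraster' > "$sample"

# Print the rewritten rules; only the text/plain line should change.
rewrite_texttops "$sample"
rm -f "$sample"
```

Remember that the new filter file has to be executable and that CUPS needs a restart to pick up the changed rules.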