
Best way to convert text files between character sets?

What is the fastest, easiest tool or method to convert text files between character sets?

Specifically, I need to convert from UTF-8 to ISO-8859-15 and vice versa.

Anything goes: one-liners in your favorite scripting language, command-line tools or other utilities for any OS, web sites, etc.

Best solutions so far:

On Linux/UNIX/OS X/cygwin:

  • GNU iconv, suggested by Troels Arvin, is best used as a filter and seems to be universally available. Example:

    $ iconv -f UTF-8 -t ISO-8859-15 in.txt > out.txt

    As pointed out by Ben, there is an online converter using iconv.

  • GNU recode, suggested by Cheekysoft, converts one or several files in place. Example:

    $ recode UTF8..ISO-8859-15 in.txt
    This one uses shorter aliases:
    $ recode utf8..l9 in.txt

    Recode also supports "surfaces", which can be used to convert between line-ending types and transfer encodings such as Base64:

    Convert newlines from LF (Unix) to CR-LF (DOS):
    $ recode ../CR-LF in.txt

    Base64 encode file:
    $ recode ../Base64 in.txt

    You can also combine them.

    Convert a Base64-encoded UTF-8 file with Unix line endings to a Base64-encoded Latin-1 file with DOS line endings:
    $ recode utf8/Base64..l1/CR-LF/Base64 file.txt
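
To make the layering in that last recode request easier to follow, here is a minimal Python sketch of the same steps done by hand: remove the Base64 surface, decode UTF-8, encode Latin-1, switch to CR-LF, then reapply Base64. The function name is made up for illustration, and the result is not guaranteed to match recode's output byte for byte (for instance, recode may wrap Base64 lines, which this sketch does not).

```python
import base64


def utf8_b64_to_latin1_crlf_b64(data: bytes) -> bytes:
    """Hypothetical helper mirroring: recode utf8/Base64..l1/CR-LF/Base64"""
    raw = base64.b64decode(data)          # strip the Base64 surface
    text = raw.decode("utf-8")            # interpret the payload as UTF-8
    latin1 = text.encode("iso-8859-1")    # re-encode as Latin-1
    # Normalise any existing CR-LF first so lines are not doubled up,
    # then turn every LF into CR-LF.
    crlf = latin1.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    return base64.b64encode(crlf)         # reapply the Base64 surface


sample = base64.b64encode("caf\u00e9\n".encode("utf-8"))
print(base64.b64decode(utf8_b64_to_latin1_crlf_b64(sample)))  # b'caf\xe9\r\n'
```

Characters that do not exist in Latin-1 make `encode` raise `UnicodeEncodeError` here; how recode itself treats them is not covered by this sketch.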

On Windows with PowerShell (Jay Bazuzi):

  • PS C:\> gc -en utf8 in.txt | Out-File -en ascii out.txt

    (No ISO-8859-15 support though; it says that supported charsets are unicode, utf7, utf8, utf32, ascii, bigendianunicode, default, and oem.)
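
Where neither iconv nor recode is at hand, a short script is one possible fallback. The sketch below assumes a Python 3 interpreter is installed. `convert_file` is a hypothetical helper name and the file names in the usage comment are placeholders. It reads the input in chunks and leaves line endings untouched.

```python
def convert_file(src, dst, from_charset="utf-8", to_charset="iso-8859-15"):
    """Hypothetical helper: copy src to dst, changing only the charset."""
    # newline="" disables newline translation on both sides, so CR-LF
    # and LF line endings pass through exactly as they are.
    with open(src, "r", encoding=from_charset, newline="") as fin, \
         open(dst, "w", encoding=to_charset, newline="") as fout:
        # Read in 64 KiB chunks so large files are not loaded at once.
        for chunk in iter(lambda: fin.read(64 * 1024), ""):
            fout.write(chunk)


# Usage (placeholder file names):
#   convert_file("in.txt", "out.txt")                          # UTF-8 -> ISO-8859-15
#   convert_file("out.txt", "back.txt", "iso-8859-15", "utf-8")  # and back
```

A character with no equivalent in the target charset raises `UnicodeEncodeError`. Passing `errors="replace"` to the second `open` call would substitute `?` instead, at the cost of silently losing data.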


Solution:

Stand-alone utility approach:

iconv -f UTF-8 -t ISO-8859-15 in.txt > out.txt

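-f: the encoding to convert from (the input file's encoding)
-t: the encoding to convert to (the output's encoding)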
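
The value given to -t matters more than it may seem, because ISO-8859-1 and ISO-8859-15 differ in a handful of positions, the euro sign among them. The following Python sketch illustrates this with the language's built-in codecs rather than iconv itself. `convert_bytes` is a hypothetical helper whose two charset arguments play the roles of -f and -t.

```python
def convert_bytes(data: bytes, from_charset: str, to_charset: str) -> bytes:
    """Hypothetical helper: decode with the 'from' charset, encode with the 'to' charset."""
    return data.decode(from_charset).encode(to_charset)


euro_utf8 = "\u20ac".encode("utf-8")  # the euro sign as UTF-8 bytes

# ISO-8859-15 has a slot for the euro sign (byte 0xA4).
print(convert_bytes(euro_utf8, "utf-8", "iso-8859-15"))  # b'\xa4'

# ISO-8859-1 does not, so the conversion fails instead of guessing.
try:
    convert_bytes(euro_utf8, "utf-8", "iso-8859-1")
except UnicodeEncodeError as exc:
    print("not representable in ISO-8859-1:", exc.reason)
```

The reverse direction has the same pitfall: byte 0xA4 decodes to the euro sign under ISO-8859-15 but to the generic currency sign under ISO-8859-1, so naming the wrong source charset silently yields different text.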

