Converting files from ISO-8859-1 to UTF-8

With a simple command you can convert a file from ISO-8859-1 encoding to UTF-8 encoding.


Many websites are still encoded in ISO-8859-1. If you want to convert all the text in a file from ISO-8859-1 to UTF-8, you can use the iconv utility, which is included in most Linux distributions.

The following command converts a single file from ISO-8859-1 to UTF-8:

Command
iconv -f iso-8859-1 -t utf-8 file1 > file2
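
To check that a conversion did what you expect, it can help to look at the raw bytes. The sketch below is only an illustration: the file names sample-latin1.txt and sample-utf8.txt are made up, and it assumes printf, iconv and od are available, as they are on most Linux systems. It writes the word "café" in ISO-8859-1, converts it, and dumps the result.

```shell
# Create a small ISO-8859-1 sample: "café" with é stored as the single byte 0xE9
# (sample-latin1.txt and sample-utf8.txt are hypothetical file names)
printf 'caf\351\n' > sample-latin1.txt

# Same command as above, applied to the sample file
iconv -f iso-8859-1 -t utf-8 sample-latin1.txt > sample-utf8.txt

# Dump the converted file as hex bytes; in UTF-8, é is the two bytes c3 a9
od -An -tx1 sample-utf8.txt
```

The converted file is one byte longer than the original, because é takes two bytes in UTF-8 instead of one.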

The script below converts all text files (PHP, JS, CSS, HTML, and TXT, identified by their file extension) within a directory from ISO-8859-1 to UTF-8. Run the script from the parent of the directory you want to convert; it creates a new directory with the same name plus the prefix utf8. in front. For example, running ./iconvall.sh www/ creates utf8.www in the current working directory, in which all text files are converted and all other files are copied unchanged. The directory structure of utf8.www is identical to that of www.

The script looks like this:

iconvall.sh
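#!/bin/bash
# Usage: ./iconvall.sh DIRECTORY  (run from the parent of DIRECTORY)
if [ $# -ne 1 ]
then
        echo "Script requires one argument, the folder to be converted from iso to utf." >&2
        exit 1
fi

# Strip a trailing slash so that "www/" and "www" both map to utf8.www
src="${1%/}"
dest="utf8.$src"

# Recreate the directory structure of the source inside the destination
mkdir -p "$dest" || exit 1
(cd "$src" && find . -type d -print0) | (cd "$dest" && xargs -0 mkdir -p)

# Walk all regular files; NUL-delimited so that names with spaces survive
find "$src" -type f -print0 | while IFS= read -r -d '' i
do
        case "$i" in
        *.php|*.js|*.css|*.html|*.txt)
                # converts text files
                echo "[CONVERT]: $i"
                iconv -f ISO-8859-1 -t UTF-8 "$i" -o "utf8.$i"
                ;;
        *)
                # copies everything else unchanged
                echo "[COPY]: $i"
                cp "$i" "utf8.$i"
                ;;
        esac
done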
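
The script converts every matching file, including files that are already UTF-8, and converting such a file a second time garbles its non-ASCII characters. One possible guard, sketched below, is to test whether a file already decodes cleanly as UTF-8 before converting it. The helper is_valid_utf8 and the demo file names are hypothetical and not part of iconvall.sh; the check relies on iconv exiting with a non-zero status when it meets an invalid byte sequence.

```shell
#!/bin/bash
# Hypothetical helper, not part of iconvall.sh: succeeds when the file
# decodes cleanly as UTF-8 (iconv fails on an invalid byte sequence).
is_valid_utf8() {
        iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Two throwaway demo files holding "café" in each encoding
printf 'caf\351\n' > demo-latin1.txt       # é as the single byte 0xE9
printf 'caf\303\251\n' > demo-utf8.txt     # é as the two bytes 0xC3 0xA9

for f in demo-latin1.txt demo-utf8.txt
do
        if is_valid_utf8 "$f"
        then
                echo "[SKIP]: $f is already valid UTF-8"
        else
                echo "[CONVERT]: $f"
        fi
done
```

Plain ASCII files also pass this check, which is harmless because their bytes are the same in both encodings. The check is a heuristic: an ISO-8859-1 file can, in rare cases, happen to be valid UTF-8 as well.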
