sandra_henrystocker
Unix Dweeb

Finding and fixing typos on Linux

How-To
Oct 25, 2022
5 mins
Linux

The Linux aspell and enchant tools can both ID typos in text files and suggest replacements.

[Image: linux spelling. Credit: Sandra Henry-Stocker]

If you want to check a text file for typos, Linux can help.

It has a couple of tools, including aspell and enchant, that can point out the errors. I’ll also share a script that I put together recently that looks for typos using the system’s words file.

Using aspell

aspell is a very clever tool that will point out typos and make it surprisingly easy to fix them. When used to check a single file, it reverses the text and background colors to highlight misspelled words. You would start it with a command like this:

$ aspell check myfile

If aspell detects no typos, it simply exits. Otherwise, it will open with a display that contains the file text (or just the top lines depending on the length of the file) followed by a list of suggested replacement words and, below that, a list of the commands that you can run. The first typo (or suspected typo) will be displayed with the text and background colors reversed as shown below.

I wish that I could type with my eyes closed and never make a mistake. I don't
like typoze and I think I run into them far more often than I want.

1) depose                               6) typo              

If you want to replace the typo with one of the words listed, just use your keyboard to type the digit to the left of the word you want to select. If it’s the only typo in the file, aspell will make the change and exit. Otherwise, it will move on to the next misspelled word.

You can also replace a typo by typing “r” and then typing the word you want to use to replace it. If it’s a word that is likely to be repeated, you can press “R” instead and replace all instances of the word in the file. You can also decide to ignore what aspell deems a typo. After all, it might be a term that aspell simply doesn’t recognize or an acronym. You can do this one instance at a time by typing "i" or as a group by typing "I".

As a precaution, aspell creates a backup copy (e.g., myfile.bak) of the file you are checking so that you can recover the original text, typos and all, if you find that you changed some words in error.
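
If you want to see what changed before deciding whether to keep it, comparing the backup with the edited file is one option. The sketch below is only an illustration and not part of aspell itself: it builds stand-in copies of myfile and myfile.bak in a temporary directory (so it runs without touching real files), shows the differences with diff, and then restores the original with cp.

```shell
#!/bin/bash
# Stand-in files in a scratch directory; after a real "aspell check myfile"
# session, myfile.bak and myfile would already exist.
workdir=$(mktemp -d)
printf 'I do not like typoze.\n' > "$workdir/myfile.bak"   # text before the fix
printf 'I do not like typos.\n'  > "$workdir/myfile"       # text after the fix

# Show what changed; diff exits non-zero when the files differ,
# so "|| true" keeps that from being treated as a failure.
changes=$(diff "$workdir/myfile.bak" "$workdir/myfile" || true)
echo "$changes"

# Undo the edits by copying the backup over the edited file.
cp "$workdir/myfile.bak" "$workdir/myfile"
restored=$(cat "$workdir/myfile")
echo "$restored"
```

With real files, the two commands that matter are "diff myfile.bak myfile" and "cp myfile.bak myfile".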

You can also use aspell to check the spelling of a group of words. Type “aspell -a” as shown below and you can type a word and see the list of suggested replacements. If aspell responds with an asterisk (*), the word was spelled correctly.

$ aspell -a
@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.8)
typoze
& typoze 17 0: depose, typos, typo's, types, type, typo, topaz, topees, type's, typed, tapes, topee, Topsy, doze, pose, tape, tape's

typos
*

^C

Type ^C to exit as shown above.
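
The same mode can also be used without typing anything interactively by sending words to aspell -a on standard input. In its output, a line beginning with & names the unrecognized word, followed by the number of suggestions, an offset, a colon, and the suggestions themselves. The sketch below trims that output down to the first few suggestions. The helper name first_suggestions is made up for this example, and the sample input is the line from the session above rather than live aspell output, so the sketch runs even where aspell is not installed.

```shell
#!/bin/bash
# first_suggestions (hypothetical helper): read "aspell -a"-style output on
# stdin and print each flagged word with its first N suggestions (default 3).
first_suggestions() {
    local max=${1:-3}
    awk -v max="$max" '
        /^&/ {
            word = $2
            sub(/^[^:]*: /, "")            # drop everything up to "N offset: "
            n = split($0, sugg, ", ")      # suggestions are comma-separated
            out = ""
            for (i = 1; i <= n && i <= max; i++)
                out = out (i > 1 ? ", " : "") sugg[i]
            print word ": " out
        }'
}

# Sample line taken from the aspell -a session shown earlier.
sample="& typoze 17 0: depose, typos, typo's, types, type, typo, topaz, topees, type's, typed, tapes, topee, Topsy, doze, pose, tape, tape's"
result=$(printf '%s\n' "$sample" | first_suggestions 3)
echo "$result"

# With aspell installed, live output could be piped in the same way:
#   echo "typoze" | aspell -a | first_suggestions 3
```

Lines that start with an asterisk (correctly spelled words) are simply ignored by the awk pattern.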

Using enchant

A tool named “enchant” will list the words it considers typos with a command like this:

$ enchant -l myfile
typoze
typoze

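If you expect a lot of typos, you can use a command like the one below to tell you how many times each typo appears in the file. The sort command groups identical words together, which matters because uniq only counts repeated lines that are adjacent:

$ enchant -l myfile | sort | uniq -c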
      2 typoze
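
Keep in mind that uniq -c counts only identical lines that sit next to each other, so when different typos are interleaved it helps to sort the list first. The sketch below shows the difference using a made-up list of flagged words that stands in for enchant -l output (the word "mistaek" is invented for illustration).

```shell
#!/bin/bash
# Made-up list of flagged words, in the order they might appear in a file.
flagged=$(printf '%s\n' typoze mistaek typoze)

# Without sorting, the two "typoze" lines are not adjacent,
# so each one is counted separately.
unsorted=$(echo "$flagged" | uniq -c)
echo "$unsorted"

# Sorting first groups identical words so that each gets a single count.
sorted=$(echo "$flagged" | sort | uniq -c)
echo "$sorted"
```

The unsorted version reports three lines with a count of 1 each, while the sorted version reports one line per distinct word.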

To view the suggested replacements, run a command like this:

$ enchant -a myfile | grep :
& typoze 1 5: typo
& typoze 1 6: typo

Building a spell-checking script

I put a bash script together to see how well I could check the words in a file against the Linux words file (/usr/share/dict/words on my system). The task turned out to be a little trickier than I expected.

I run the script like this:

$ findTypos myfile
typoze
typoze

The script contains a series of commands to find and display a list of the misspelled words. The first group of commands checks to see that a filename has been provided as an argument. If not, it prompts for one.

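#!/bin/bash

# use the filename argument if one was provided; otherwise prompt for it
if [ $# -eq 0 ]; then
    echo -n "file: "
    read -r file
else
    file=$1
fi

dict=/usr/share/dict/words    # the Linux words file

while read -ra line;
do
    for word in "${line[@]}";
    do
        word=$(echo "$word" | tr '[:upper:]' '[:lower:]')    # lowercase
        word=$(echo "$word" | tr -d '[.,?:!"]')    # punct doesn't work for this
        word=$(echo "$word" | sed "s/'s$//; s/’s$//")    # drop possessive 's
        [ -z "$word" ] && continue    # nothing left to check
        grep "^$word$" "$dict" >/dev/null || echo "$word"
    done;
done < "$file"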

The script then runs through each word in the file and runs a tr command to change it to all lowercase to avoid issues with capitalized words. It then uses a second tr command to remove most punctuation marks so that periods, question marks, etc. don’t cling to the words that need to be checked. I didn’t use [:punct:] because it would have removed the apostrophe in words like “isn’t”, but I separately removed the possessive “’s” at the ends of words. The last step was looking for the word in the words file. The ^ and $ characters tell grep to find only the word specified, not words that might include that word.
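
Because the loop starts several processes for every word, it can be slow on long files. As a possible alternative (a sketch, not the script described above), the same cleanup steps can be applied to the whole file at once and the result compared against the dictionary with a single grep. The function name find_typos_batch and its optional second argument for the dictionary path are assumptions made for this illustration, and the demonstration uses a tiny throwaway dictionary rather than the real words file.

```shell
#!/bin/bash
# find_typos_batch FILE [DICT] (hypothetical helper): print each word in FILE
# that has no exact match in DICT (default: /usr/share/dict/words).
find_typos_batch() {
    local file=$1
    local dict=${2:-/usr/share/dict/words}

    tr -s '[:space:]' '\n' < "$file" |      # one word per line
        tr '[:upper:]' '[:lower:]' |        # lowercase everything
        tr -d '.,?:!"' |                    # strip punctuation, keep apostrophes
        sed "s/'s\$//" |                    # drop a trailing possessive 's
        grep -v '^$' |                      # skip empty lines
        sort -u |                           # check each distinct word once
        grep -vxFf "$dict"                  # keep words not found in DICT
    return 0                                # "no typos" is not an error
}

# Small demonstration with a throwaway dictionary and text file.
demo=$(mktemp -d)
printf '%s\n' i do not like and think too much > "$demo/words"
printf 'I do not like typoze.\nI think typoze too much!\n' > "$demo/myfile"
typos=$(find_typos_batch "$demo/myfile" "$demo/words")
echo "$typos"
```

Unlike the loop, this version reports each misspelled word only once because of the sort -u step. Dropping the -u would list every occurrence, though in sorted order rather than in the order the words appear in the file.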

The script, which I call "findTypos", finds typos, but makes no attempt to fix them or suggest replacement words.

Wrap-Up

Detecting misspelled words in your text files can be helpful, especially if you’re preparing your weekly report and your boss is a stickler for grammar. Fortunately, Linux provides a number of ways to help with this.

Sandra Henry-Stocker has been administering Unix systems for more than 30 years. She describes herself as "USL" (Unix as a second language) but remembers enough English to write books and buy groceries. She lives in the mountains in Virginia where, when not working with or writing about Unix, she's chasing the bears away from her bird feeders.

The opinions expressed in this blog are those of Sandra Henry-Stocker and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.
