Credit: Willis Lai / IDG

There are many ways to remove duplicate lines from a text file on Linux, but here are two that involve the awk and uniq commands and that offer slightly different results.

Remove duplicate lines with awk

The first command we’ll examine in this post is a compact, unusual-looking awk command that prints each line only the first time it is encountered. It leaves the first instance of a line intact, “remembers” it, and drops any duplicates encountered afterwards. Here’s an example. Initially, the file looks like this:

    Once upon a time, there was a lovely princess with a foul temper.
    Whenever she went for a walk, she left her castle smiling,
    but if she ran into anyone frowning or arguing with someone else,
    she stopped and made an angry face.
    Continue reading
    If the princess ran into a friend who didn't want to chat with her,
    she stopped and made an angry face.
    Continue reading

The awk command that does this work looks like this:

    $ awk '!x[$0]++' grouchy_princess
    Once upon a time, there was a lovely princess with a foul temper.
    Whenever she went for a walk, she left her castle smiling,
    but if she ran into anyone frowning or arguing with someone else,
    she stopped and made an angry face.
    Continue reading
    If the princess ran into a friend who didn't want to chat with her,

Note that each of the duplicated lines is now displayed only once and in its initial position. The command works because x[$0] counts how many times the current line ($0) has been seen. That count is zero, and the negated test therefore true, only on the first encounter.

In fact, if you simply want to see any duplicated lines, you only need to change the command in a minor way. Just remove the exclamation point (signifying “not”) and you will see only the repeated occurrences:

    $ awk 'x[$0]++' grouchy_princess
    she stopped and made an angry face.
    Continue reading

The only problem with the awk '!x[$0]++' command is that it’s not all that easy to remember. On the other hand, it’s also not that hard to turn the command into a simple script.
Mine looks like this:

    $ cat rmdups
    #!/bin/bash

    awk '!x[$0]++' "$1"

The awk command removes duplicate lines from whatever file is provided as an argument. Quoting "$1" keeps the script working for file names that contain spaces. If you want to save the output to a file instead of displaying it, make it look like this:

    #!/bin/bash

    awk '!x[$0]++' "$1" > "$1-new"

You can run the script shown using a command like “rmdups addresses”. If you use the second version, a file with “-new” added to the original file name will contain the output.

Remove duplicate lines with uniq

If you don’t need to preserve the order of the lines in the file, using the sort and uniq commands will do what you need in a very straightforward way. The sort command sorts the lines in alphanumeric order. The uniq command ensures that sequential identical lines are reduced to one.

    $ sort grouchy_princess | uniq
    but if she ran into anyone frowning or arguing with someone else,
    Continue reading
    If the princess ran into a friend who didn't want to chat with her,
    Once upon a time, there was a lovely princess with a foul temper.
    she stopped and made an angry face.
    Whenever she went for a walk, she left her castle smiling,

In addition, if sorting the contents of your file is helpful, this approach may be ideal. While this technique doesn’t work all that well with fairy tales, it works just fine for lists of meeting attendees, grocery shopping lists, etc.

Because the file name sits between the sort and uniq commands, this pipeline can’t be turned into a simple alias, but it can be turned into a simple script like this:

    #!/bin/bash

    if [ $# -eq 1 ]; then
        if [ -f "$1" ]; then
            sort "$1" | uniq
        fi
    fi

The script verifies that exactly one argument was provided and that it names an existing file before it sorts the file and sends the output to the uniq command.

Wrap-Up

Commands like those shown can be very helpful in cleaning up or verifying the content of text files, particularly lists in which you don’t want any line to show up multiple times.
Turning the commands into a script makes it convenient to call on them whenever they might be helpful.
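
If you'd like to see the two approaches side by side, here is a minimal sketch that wraps each one in a shell function and runs both against a small shopping list. The function names (dedupe_keep_order, dedupe_sorted), the array name "seen" and the sample data are made up for illustration. The awk program is simply a longhand spelling of the '!x[$0]++' one-liner, written out to show the "count, then print only when the count was zero" logic.

```shell
#!/bin/bash
# Illustrative sketch only: function names and sample data are hypothetical.

# Longhand equivalent of awk '!x[$0]++':
# print a line only if it has not been counted yet, then bump its count.
dedupe_keep_order() {
    awk '{ if (seen[$0] == 0) print; seen[$0]++ }' "$@"
}

# Sort first so identical lines become adjacent, then let uniq collapse them.
dedupe_sorted() {
    sort "$@" | uniq
}

# Build a small sample file with repeated entries.
list=$(mktemp)
printf '%s\n' milk apples milk bread apples > "$list"

echo "Original order, duplicates removed:"
dedupe_keep_order "$list"

echo "Sorted, duplicates removed:"
dedupe_sorted "$list"

rm -f "$list"
```

The first function should list milk, apples and bread in the order they first appeared, while the second should list the same three items alphabetically. Because both functions pass "$@" along, they also read from standard input when no file name is given, which makes them usable at the end of a pipe.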