Linux Shell – How To Remove Duplicate Text Lines

I need to sort data from a log file, but there are too many duplicate lines. How do I remove all duplicate lines from a text file under GNU/Linux?

You need to use shell pipes along with the following two Linux command line utilities to sort and remove duplicate text lines:

  1. sort command – Sort lines of text files on Linux and Unix-like systems.
  2. uniq command – Report or omit repeated lines on Linux or Unix.

Removing Duplicate Lines With Sort, Uniq and Shell Pipes

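Use the following syntax to keep one copy of each line:
sort {file-name} | uniq
sort file.log | uniq
To drop every line that occurs more than once, so that only the lines that were never repeated remain, add the -u option to uniq:
sort file.log | uniq -u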

Remove duplicate lines with uniq

Here is a sample test file called garbage.txt displayed using the cat command:
cat garbage.txt
Sample outputs:

this is a test
food that are killing you
wings of fire
we hope that the labor spent in creating this software
this is a test
unix ips as well as enjoy our blog

Removing duplicate lines from a text file on Linux

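Type the following command to print only the lines that are not repeated. Every copy of a duplicated line is dropped, so "this is a test" does not appear in the output at all:
$ sort garbage.txt | uniq -u
Sample output: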

food that are killing you
unix ips as well as enjoy our blog
we hope that the labor spent in creating this software
wings of fire


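  • -u : print only unique lines, that is, lines that are not repeated in the sorted input. To keep one copy of each duplicated line instead, run sort garbage.txt | uniq or sort -u garbage.txt.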
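The difference between plain uniq, uniq -u and uniq -d is easy to miss, so here is a minimal sketch that rebuilds the garbage.txt sample and runs all three. The temporary directory created with mktemp and the variable names are assumptions made for illustration, not part of the original example.

```shell
# Work in a throwaway directory so no real files are touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample file shown above.
cat > garbage.txt <<'EOF'
this is a test
food that are killing you
wings of fire
we hope that the labor spent in creating this software
this is a test
unix ips as well as enjoy our blog
EOF

# Keep one copy of every line: 5 lines remain.
keep_one=$(sort garbage.txt | uniq)

# Keep only the lines that were never repeated: 4 lines remain.
only_unique=$(sort garbage.txt | uniq -u)

# Show only the lines that were repeated, once each: 1 line.
only_dups=$(sort garbage.txt | uniq -d)

printf '%s\n' "$keep_one"
```

If the goal is a de-duplicated log, plain uniq (or sort -u) is usually the variant to reach for, because -u throws away the repeated line entirely.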

Sort file contents on Linux

Let us say you have a file named users.txt:
cat users.txt
Sample outputs:

SXI ADMIN 24/10/72
Martin Lee 12/11/68
Sai Kumar  31/12/84
Marlena Summer 13/05/76
Wendy Lee  04/05/77
Sayali Gite 13/02/76
SXI ADMIN 24/10/72

To sort the file, run:
sort users.txt
Next, to sort by last name (the second field), run:
sort -k 2 users.txt
Want to sort in reverse order? Try:
sort -r users.txt
You can eliminate duplicate entries while sorting the file, run:
sort -k 2 -u users.txt
sort -u users.txt

Without any options, sort compares entire lines and outputs them in the collation order of the current locale. Set LC_ALL=C to get plain ASCII (byte) order. You can control the output with options.
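Key-based sorting and -u interact in a way that is worth checking on a small file first. The sketch below recreates users.txt in a temporary directory (the mktemp location and the variable names are assumptions) and compares two ways of removing duplicates while sorting on the last name only.

```shell
# Work in a throwaway directory so no real files are touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample file shown above.
cat > users.txt <<'EOF'
SXI ADMIN 24/10/72
Martin Lee 12/11/68
Sai Kumar  31/12/84
Marlena Summer 13/05/76
Wendy Lee  04/05/77
Sayali Gite 13/02/76
SXI ADMIN 24/10/72
EOF

# Sort on the second field only (-k 2,2) in byte order, then let uniq
# drop lines that are identical in full: 6 lines remain.
by_last_name=$(LC_ALL=C sort -k 2,2 users.txt | uniq)

# With -u, sort treats lines with equal KEYS as duplicates, so the two
# "Lee" entries collapse into one: only 5 lines remain.
by_key_only=$(LC_ALL=C sort -k 2,2 -u users.txt)

printf '%s\n' "$by_last_name"
```

Using -k 2 without an end field, as in the commands above, makes the key run to the end of the line, so the two Lee entries stay distinct there.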

How to remove duplicate lines on Linux with uniq command

Consider the following file:
cat -n telphone.txt
Sample outputs:

     1	99884123
     2	97993431
     3	81234000
     4	02041467
     5	77985508
     6	97993431
     7	77985509
     8	77985509

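The uniq command compares only adjacent lines, so it removes the 8th line (a repeat of line 7) but keeps line 6, which duplicates line 2 further up the file. The following command places the result in a file called output.txt: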
uniq telphone.txt output.txt
Verify it:
cat -n output.txt
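Because uniq looks at neighbouring lines only, the result depends on whether the input was sorted first. A minimal sketch, assuming a temporary directory and an extra file name (output_sorted.txt) that are not in the original example:

```shell
# Work in a throwaway directory so no real files are touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample phone list, one number per line.
printf '%s\n' 99884123 97993431 81234000 02041467 \
              77985508 97993431 77985509 77985509 > telphone.txt

# uniq alone: only the adjacent pair (lines 7 and 8) is collapsed.
uniq telphone.txt output.txt
adjacent_only=$(grep -c '' output.txt)

# Sort first so that every duplicate becomes adjacent, then run uniq.
sort telphone.txt | uniq > output_sorted.txt
all_removed=$(grep -c '' output_sorted.txt)

# uniq -c prefixes each line with the number of times it occurred.
sort telphone.txt | uniq -c
```

Here adjacent_only is 7 and all_removed is 6, since 97993431 survives twice in the unsorted run.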

How to remove duplicate lines in a .txt file and save result to the new file

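Try either of the following commands. The first keeps one copy of each line. The second keeps only the lines that were never repeated and, through tee, also prints them to the screen:
sort input_file | uniq > output_file
sort input_file | uniq -u | tee output_file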
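A few related ways to save the result can be sketched as follows. The sample data, the temporary directory and the file names in_place_file and output_in_order are assumptions for illustration. The last command uses awk rather than sort and uniq, and is shown only as an alternative when the original line order must be preserved.

```shell
# Work in a throwaway directory so no real files are touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input with two repeated lines.
printf '%s\n' banana apple banana cherry apple > input_file

# Sorted, one copy of each line, written to a new file.
sort -u input_file > output_file

# Same result written back over the source file. The -o option is safe
# for this, whereas "sort file > file" truncates the file before reading it.
cp input_file in_place_file
sort -u -o in_place_file in_place_file

# Alternative (awk, not sort/uniq): keep the first occurrence of each
# line and preserve the original order. "seen" is an arbitrary array name.
awk '!seen[$0]++' input_file > output_in_order

cat output_in_order
```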


The sort command is used to order the lines of a text file and uniq filters duplicate adjacent lines from a text file. These commands have many more useful options. I suggest you read the man pages by typing the following man command:
man sort
man uniq

Posted by: SXI ADMIN

The author is the creator of SXI LLC and a seasoned sysadmin, DevOps engineer, and a trainer for the Linux operating system/Unix shell scripting.
