How can I find duplicate files in Linux?

Learn how to identify duplicate copies of files in your Linux system allowing you to be more organized and save disk space.

By Arnold Robbins

March 21, 2017

Screen from "How can I find duplicate files in Linux?" (source: O'Reilly)

Over the years, any Linux user will tend to collect multiple copies of files in their system. This could be from from unpacking distributions in different areas, copying files for safekeeping, or any number of other reasons. In this training video Arnold Robbins shows you how to clean up your system. He will walk you through a file tree to identify duplicate files by:

Using the find command to walk a directory tree.

Learn faster. Dig deeper. See farther.

Join the O'Reilly online learning platform. Get a free trial today and find answers on the fly, or master something new and useful.

Learn more

Actioning the results with xargs.

Sorting, and then printing the duplicate files found.

All of this is accomplished using a simple 20-line shell script; and is explained in a way that any level of Linux user can understand and replicate.

Get in-depth shell scripting training from Arnold Robbins.

Arnold Robbins is a professional software engineer who has worked with UNIX systems since 1980. The author of more than a dozen O’Reilly titles, including Linux in a Nutshell, Effective awk Programming, and the Bash Pocket Reference, Arnold is a master communicator who holds a BA in Information Science from Yeshiva University and an MS in Computer and Information Science from Georgia Tech.

Post topics: Software Engineering

Post tags: Questions