Bash: compare two directories

In Unix-based systems like Linux and Mac OS X there are a number of ways to compare two directories. The simplest way is to use diff:

diff --brief -rb directory_1 directory_2

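This command compares the files in the two directories and reports which ones differ, without printing the differences themselves. Here --brief reports only whether files differ, -r recurses into subdirectories and -b ignores changes in the amount of whitespace. You can find the full meanings of the options in man diff.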
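If you only need a yes/no answer in a script, diff's exit status is enough: it exits with 0 when nothing differs, 1 when something differs and 2 when it ran into trouble. Here is a minimal sketch of that idea. The function name dirs_match is my own invention for illustration, not something diff provides.

```bash
#!/usr/bin/env bash

# dirs_match DIR1 DIR2 (hypothetical helper name)
# Succeeds (exit status 0) when diff finds no differences between the two
# directories; fails with diff's own exit status (1 or 2) otherwise.
# The report itself is discarded, only the status is kept.
dirs_match() {
    diff --brief -rb "$1" "$2" > /dev/null
}

# Example use:
#   if dirs_match directory_1 directory_2; then
#       echo "directories match"
#   else
#       echo "directories differ (or diff hit an error)"
#   fi
```

Note that this still reads every file, so it is no faster than the plain command above.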

Diff is fine if you’re on a fast drive, or if there aren’t many files and the files aren’t big. Because the command reads the contents of every file, it can take quite some time on a slow external drive.

If you just want to know which files are in one directory but not in the other, comparing file contents is overkill. This little bit of Bash scripting does that instead:

diff <(cd dir1 && find . | sort) <(cd dir2 && find . | sort)

It still uses diff, but compares the file listing of each directory instead of the contents of the files. It’s much faster and perfect for figuring out which files are out of place on my two relatively slow USB drives. (source)
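If you only care about one direction (what is in the first directory but missing from the second), comm can do the same job on the sorted listings. This is a rough sketch rather than a polished tool. The function name only_in_first is made up for illustration, and it assumes Bash (for the process substitution) and file names without newlines in them.

```bash
#!/usr/bin/env bash

# only_in_first DIR1 DIR2 (hypothetical helper name)
# Prints the relative paths that exist under DIR1 but not under DIR2.
# Only names are compared, never file contents.
# comm expects both inputs sorted in the same order it uses itself,
# so sort and comm are all pinned to the C locale.
only_in_first() {
    LC_ALL=C comm -23 \
        <(cd "$1" && find . | LC_ALL=C sort) \
        <(cd "$2" && find . | LC_ALL=C sort)
}

# Example use:
#   only_in_first dir1 dir2    # paths missing from dir2
#   only_in_first dir2 dir1    # paths missing from dir1
```

The -23 option suppresses the lines unique to the second listing and the lines common to both, which leaves only the paths unique to the first.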



5 Replies to “Bash: compare two directories”

  1. I use Robocopy to back up to external drives. very fast. Need to automate it a bit in .bat files though.

    Btw, would be nice to be notified of future comments via twitter seeing as I’ve logged in that way.

    1. Nice, I didn’t know about Robocopy. I use rsync to sync my two ext3 drives but since one of them broke a while ago (and was replaced) I had used the other for some backups of my phone. I never synced those properly and managed to free up about 5GB of extra space by deleting dupes and videos and jpegs I had copied elsewhere 🙂

      I guess the only way to get those notifications would be via an @ reply. The app doesn’t have permission to DM you. I can’t remember if your Twitter email is exposed either but I’ll have to suggest this to the Jetpack guys 🙂

  2. That’s the great thing about *nix systems. A dozen ways to skin the cat.
    I use
    rsync -rvnc --dry-run dir1 dir2
    It also allows me to compare a local dir with a remote over SSH
    rsync -rvnc --dry-run dir1 remote:dir2

    for a dumber version of diff try using comm

    rsnapshot if you want to automate backups. It uses rsync.

  3. What is the fastest way to find the differences between a lot of files in two directories?
    You say “Diff is fine if you’re on a fast drive, if there aren’t many files or the files aren’t big”… what to do if there are many files? High performance is my requirement.
