LINUX.IE, website of the Irish Linux Users' Group

[ILUG] tip of the day (find duplicate files)


Brady, Padraig Padraig.Brady at compaq.com
Tue Sep 5 15:38:09 IST 2000


I just wrote the following script, which finds duplicate files
in the specified directories and their subdirectories.
It's very fast, because it only checksums files that share
their size with at least one other file.

usage:
	finddupe [dir1] [dir2] ...
examples:
	cd dir1;finddupe
	finddupe /usr/bin /bin /sbin /usr/sbin

Note that it requires V2.0 of uniq, which is part of
GNU textutils.

Padraig.

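#!/bin/sh
# September 2000 * Padraig at Brady001.iol.ie
#
# Field selection uses -k and -f in place of the obsolete +N/-N forms,
# which current GNU sort no longer accepts by default.
#
# List every non-empty regular file as "path inode size".
find ${*-.} -xdev -size +0c -type f -printf "%p\0%i\0%s\n" |
# Hide spaces and tabs inside paths so a space only separates fields.
tr ' \t\0' '\0\1 '                                         |
# Sort by size (largest first), then inode; -u drops extra hard links.
sort -k3nr -k2 -u                                          |
# Keep only files whose size matches that of another file.
uniq -f2 -D                                                |
cut -f1 -d' '                                              |
sort                                                       |
# Restore the paths and checksum them, NUL-delimited.
tr '\0\1\n' ' \t\0'                                        |
xargs -0 md5sum                                            |
sort -k1,1                                                 |
tr ' \t' '\1\2'                                            |
# Swap to "path sum" so uniq can skip the path, keep repeated sums, swap back.
sed -e 's/\(^.\{32\}\)..\(.*\)/\2 \1/'                     |
uniq -D -f1                                                |
sed -e 's/\(^.*\) \(.*\)/\2 \1/'                           |
tr '\1\2' ' \t'                                            |
(
# Print one line per group: size and first file from du, then the other files.
psum='no match'
line=''
while read -r sum file; do
  if [ "$sum" != "$psum" ]; then
    if [ -n "$line" ]; then
       printf '%s\n' "$line"
    fi
    line="$(du -b "$file")"
    psum="$sum"
  else
    line="$line $file"
  fi
done

if [ -n "$line" ]; then
  printf '%s\n' "$line"
fi
)                                                          |
# Largest duplicates first.
sort -k1,1 -brn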
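
For anyone who wants to see the core idea in isolation, here is a minimal sketch of the same two-stage approach: keep only files that share a size with another file, then group those by MD5 sum. It is not the script above, just an illustration. The function name list_dupes is made up for this example, and it assumes GNU find, awk, xargs, md5sum, sort and uniq, plus file names without tabs, newlines or backslashes. Unlike the script above, it does not filter out hard links and does not print sizes.

```shell
#!/bin/sh
# Illustrative sketch only (hypothetical helper, not the original finddupe).
# Assumes GNU tools and file names without tabs, newlines or backslashes.

# list_dupes [dir ...]: print "md5sum  path" for every file whose content
# checksum matches that of at least one other file.
list_dupes() {
  # Default to the current directory when no arguments are given.
  [ "$#" -gt 0 ] || set -- .

  # Stage 1: emit "size<TAB>path" and keep paths whose size occurs more than once.
  find "$@" -xdev -type f -size +0c -printf '%s\t%p\n' |
  awk '
    { s = $1 ""; size[NR] = s; path[NR] = substr($0, length(s) + 2); count[s]++ }
    END { for (i = 1; i <= NR; i++) if (count[size[i]] > 1) print path[i] }
  ' |
  # Stage 2: checksum only those candidates, then keep repeated checksums.
  tr '\n' '\0' |
  xargs -0 -r md5sum |
  sort |
  uniq -w32 -D
}
```

Called as `list_dupes /usr/bin /bin`, it prints one line per duplicate file, with files of identical content on adjacent lines. Comparing only the first 32 characters with `uniq -w32` works because an MD5 sum is 32 hexadecimal digits.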



