p.slabiak at linux-services.org writes:
> >Lars Hecking writes:
> >> Timothy Murphy writes:
> >> > I want to determine which files in directory foo/ on remote machine A
> >> > are duplicates of files in directory foo/ on my current machine.
> >> > What is the simplest way to do this, please?
> >>
> >> machine A: md5sum foo/* >foo.md5
> >> copy foo.md5 to machine B
> >> machine B: md5sum 0c foo.md5
> >
> > Typo: s/0c/-c/
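
Note that md5sum -c looks files up by name, so it only reports a match when
a file has the same name on both machines. If the names may differ, one
option is to compare the checksums themselves. A rough sketch, assuming GNU
md5sum and find, and file names without backslashes or newlines; the
function name list_remote_dupes is made up for illustration:

```shell
# list_remote_dupes SUMFILE DIR
#   SUMFILE: md5sum output copied over from machine A (e.g. foo.md5)
#   DIR:     local directory to check (e.g. foo/)
# Prints every regular file directly inside DIR whose MD5 also appears
# in SUMFILE, whatever the file is called on machine A.
list_remote_dupes() {
    sumfile=$1
    dir=$2
    find "$dir" -maxdepth 1 -type f -exec md5sum {} + |
        awk '
            # First input (SUMFILE): remember every remote checksum.
            NR == FNR { seen[$1] = 1; next }
            # Second input (stdin): print local names with a known checksum.
            # The name starts at column 35: 32 hex digits plus 2 separators.
            $1 in seen { print substr($0, 35) }
        ' "$sumfile" -
}
```

Usage on machine B would be something like: list_remote_dupes foo.md5 foo/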
> I have more than 20 GB of MP3s, and some of them were duplicated on my Kanotix PC.
> Here is my solution:
>
> find . -type f -exec md5sum {} \; > /tmp/sum.unsorted
> sort < /tmp/sum.unsorted > /tmp/sum.sorted
> cut -f 1 -d " " /tmp/sum.sorted | uniq -d > /tmp/sum.dupe
> grep -F -f /tmp/sum.dupe /tmp/sum.sorted
>
> That is only an idea (which works)...
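
The same idea also fits in one pipeline without temporary files. This is a
sketch rather than the poster's exact method: it assumes GNU uniq (for the
-w and -D options) and file names without backslashes or newlines, and
find_dupes is a made-up name:

```shell
# find_dupes [DIR]
# Prints the md5sum line of every regular file under DIR (default: .)
# whose checksum occurs more than once, grouped by checksum.
find_dupes() {
    find "${1:-.}" -type f -exec md5sum {} + |
        sort |
        uniq -w 32 -D   # compare only the 32-character checksum; print all repeats
}
```

For example, find_dupes ~/mp3 would list each group of identical files on
adjacent lines. Matching checksums are a strong hint, not proof; cmp can
confirm a pair before anything gets deleted.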
A similar method to deal with that situation is lndir-full
(http://jmason.org/software/scripts/lndir-full.txt):
=head1 NAME
lndir-full - reduce disk usage by linking identical files
=head1 SYNOPSIS
lndir-full dir1 [...]
=head1 DESCRIPTION
This script will descend one or more directory trees provided on the command
line, and will hard-link all identical files to each other.
It won't help Tim, though, as it requires that all files be on
the same filesystem (so that hard links work).
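
For anyone curious, the hard-linking idea can be sketched in a few lines of
shell. This is not the lndir-full code, just an illustration under some
assumptions: GNU md5sum and find, a test(1) that supports -ef, everything on
one filesystem, and file names without backslashes or newlines. The name
link_dupes is made up:

```shell
# link_dupes DIR [...]
# Replaces each file that is byte-identical to an earlier one with a
# hard link to that earlier file.
link_dupes() {
    prev_sum=
    prev_file=
    find "$@" -type f -exec md5sum {} + | sort |
    while IFS= read -r line; do
        sum=${line%% *}      # checksum: everything before the first space
        rest=${line#* }      # drop the checksum and the first space
        file=${rest#?}       # drop the mode character (space or '*')
        if [ "$sum" = "$prev_sum" ] &&
           ! [ "$prev_file" -ef "$file" ] &&
           cmp -s "$prev_file" "$file"; then
            # Same checksum, not yet linked, contents verified: link them.
            ln -f "$prev_file" "$file"
        else
            prev_sum=$sum
            prev_file=$file
        fi
    done
}
```

The cmp step guards against checksum collisions before a file is replaced.
Try it on a scratch copy first, since ln -f overwrites the duplicate.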
--j.