Last week I had to sort a list of several thousand email addresses in a
file, and then break the file into manageable chunks.
To do it I decided to rely on some basic Unix commands present in Linux:
cat, sort and split.
Anyway, here's the little bit of code. It demonstrates the power of the
Unix command line by piping one command into another into another again:
cat list.txt | sort -f -t @ -k 2 | split -l 500 - list
What it does:
There are 3 commands here. Each does one job and hands its results
either to the next command or, in the case of the last command, into
the resulting text files. Information is passed between commands using
the "|" or pipe symbol.
"cat list.txt" prints the contents of the file list.txt to standard
output, which is the screen unless you point it somewhere else (here,
into the pipe).
"sort -f -t @ -k 2" will sort the information it got from the "cat"
command and write out a sorted list.
Parameters: "-f" sorts case insensitively, "-t @" makes "@" the
separator between fields in the input, "-k 2" says to sort on the
second field (i.e. the hostname in the email address).
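To see the sort options on their own, here is a small sketch. The three
addresses are made up for illustration and are not from my real list;
printf just stands in for "cat list.txt".

```shell
# Made-up sample addresses, one per line, fed to sort through a pipe.
sorted=$(printf '%s\n' \
    'alice@zeta.example' \
    'bob@Beta.example' \
    'carol@alpha.example' |
    sort -f -t @ -k 2)   # key = text after "@", compared ignoring case

# Show the result: ordered by hostname, not by the name before the "@".
printf '%s\n' "$sorted"
```

The lines should come out ordered by hostname (alpha, Beta, zeta), even
though the names before the "@" run the other way.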
"split -l 500 - list" will split the input it gets into files, 500 lines
long, with filenames starting with the word list, i.e. listaa, listab,
listac.
Parameters: "-l 500" tells split to start a new file every 500 lines,
the solitary "-" means to read from STDIN, i.e. the previous command's
piped output, and "list" is the prefix for the resulting filenames.
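If you want to try split without touching a real file, something like
this should do. It is only a sketch: the 1200 numbered lines from seq
are a stand-in for real data, and the scratch directory is there so
nothing lands in your current directory.

```shell
# Work in a throwaway directory so the chunk files are easy to clean up.
workdir=$(mktemp -d)
cd "$workdir"

# 1200 dummy lines on STDIN, split into chunks of 500 lines each.
seq 1 1200 | split -l 500 - list

# Two full chunks and one partial chunk should now exist.
ls list*
wc -l list*
```

With 1200 lines of input you get listaa and listab with 500 lines each,
and listac with the remaining 200.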
It's possible to shorten the above command line, but I prefer to keep
it slightly more human readable!
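For the curious, one way to shorten it is to drop "cat", since sort can
read the file itself. The sketch below runs both forms on a made-up
three-line list.txt and compares the results. The chunk size of 2 and
the prefixes "long" and "short" are only there to keep the demo small.

```shell
# Scratch directory with a tiny, made-up list.txt.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 'alice@zeta.example' 'bob@Beta.example' \
    'carol@alpha.example' > list.txt

# The original form: cat feeds sort, sort feeds split.
cat list.txt | sort -f -t @ -k 2 | split -l 2 - long

# The shorter form: sort reads list.txt directly, no cat needed.
sort -f -t @ -k 2 list.txt | split -l 2 - short

# Both forms should produce identical chunks.
cmp longaa shortaa && cmp longab shortab && echo "same output"
```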
A bit longer than I intended, but there are many people on the list who
don't know about all this stuff and what you can do with it.
Donncha.