On Friday 19 March 2004, Declan.Grady at nuvotem.com (Declan Grady) wrote:
>Hi,
>I've written a small perl proggy to split up ascii spoolfiles into separate
>ascii files for archiving.
>e.g. a spoolfile containing say 20 order acknowledgements will be split up
>into 20 separate ascii files, and then converted to pdfs, with a simple html
>list linking to them.
>My next problem is that to get the list of spoolfile names, I was using grep,
>since there are lots of files with the same extension...
>grep ACKNOWLEDGMENT spool*sdy | awk 'BEGIN {FS=":"}{print $1}' > filelist
>
>To have this run from within my perl script, do I just use system("grep ..);
>or is there a more clever 'perl way' to do this ?
>
>I thought of opening every sdy file and checking for the word ACKNOWLEDGMENT
>in the specific line number where it would appear, but I think this would be
>overkill - maybe not, as I guess grep would open every file anyway ?
Yes, you do need to open every file, but it's not overkill - how else could you
examine their contents? But you definitely don't need system("grep ..) from
within perl - perl was designed as a text processing language, and it can do
everything grep and awk do. You could wrap your existing perl code in something
like this:
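foreach my $spool (@ARGV) {
    # three-argument open, and stop with a message if the file can't be read
    open(SPOOL, '<', $spool) or die "Cannot open $spool: $!";
    # read entire spool file into one variable, in a block to localise $/
    {
        local $/; # undef the input record separator to read entire file into one variable
        $_ = <SPOOL>;
    }
    if (/ACKNOWLEDGMENT/) {
        # split $_ into an array of lines (the newlines themselves are dropped)
        my @lines = split /\n/;
        # your code goes here
    }
    close SPOOL;
}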
Observe deliberate use of perl's $_ variable here, which is the default
argument for many perl functions. If I hadn't used $_ there, I'd have had to
create a temporary variable to hold the lines, and then search and split that
variable. There is definitely an argument that using this temporary variable
makes the code clearer, but IMO using $_ saves me from having to have an
otherwise useless temporary variable. It's completely readable to a perl
person anyway :-)
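For comparison, the same loop with a named variable instead of $_ might look
roughly like this. It's only a sketch: slurp_lines_if_match is a helper name
made up for the example, and it assumes each spool file fits comfortably in
memory.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns the lines of
# $path if the file contains $word anywhere, otherwise an empty list.
sub slurp_lines_if_match {
    my ($path, $word) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    # local $/ undefines the input record separator inside the do block,
    # so a single read returns the whole file
    my $contents = do { local $/; <$fh> };
    close $fh;
    return unless defined $contents && $contents =~ /\Q$word\E/;
    # split drops the newlines, as in the $_ version
    return split /\n/, $contents;
}

foreach my $spool (@ARGV) {
    my @lines = slurp_lines_if_match($spool, 'ACKNOWLEDGMENT');
    next unless @lines;
    print "$spool: ", scalar(@lines), " lines\n";
}
```

The price is the extra $contents variable; what you get for it is that the
search and the split both say explicitly what they operate on.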
At the "# your code goes here" you now have the array @lines, containing all
the lines of the spoolfile. It's time for your code to take over, and split
that up into separate files as you mentioned. Bear in mind that @lines
contains lines which don't have trailing \n (because the split removed that)
whereas if you had done something like
    @lines = <SPOOL>;
the lines WOULD have trailing \n.
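If you want to see the difference for yourself, a small test along these lines
should show it. This is just an illustration: the sample text is invented, and
it reads from an in-memory filehandle rather than a real spool file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented sample data standing in for a spool file
my $text = "line one\nline two\n";

# Reading through a filehandle in list context keeps each trailing newline
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @with_newlines = <$fh>;
close $fh;

# chomp removes the trailing newline from every element of the copy
chomp(my @chomped = @with_newlines);

# split on newline never includes the separator in the results
my @split_lines = split /\n/, $text;

printf "readline: first line %s a newline\n",
    ($with_newlines[0] =~ /\n\z/ ? 'ends in' : 'does not end in');
printf "split:    first line %s a newline\n",
    ($split_lines[0] =~ /\n\z/ ? 'ends in' : 'does not end in');
```

So if your existing code expects lines with newlines still attached, either
read with <SPOOL> in list context or add the "\n" back when you print.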
You would then run your program like
    split_spools spool*sdy
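And if all you want is the file list that your grep | awk pipeline produced,
something like the following might do. Treat it as a rough sketch:
files_containing is a name invented for the example, and unlike the pipeline
it lists each file only once, however many lines match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns those of @paths whose contents include $word.
sub files_containing {
    my ($word, @paths) = @_;
    my @matches;
    foreach my $path (@paths) {
        my $fh;
        unless (open $fh, '<', $path) {
            warn "Skipping $path: $!\n";
            next;
        }
        while (my $line = <$fh>) {
            if (index($line, $word) >= 0) {
                push @matches, $path;
                last;    # one hit is enough, no need to read the rest
            }
        }
        close $fh;
    }
    return @matches;
}

# glob expands the pattern much as the shell would
print "$_\n" for files_containing('ACKNOWLEDGMENT', glob('spool*sdy'));
```

Redirect the output to filelist if you still want the file, or just loop over
the returned list inside your script.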
Niall