From: Fergal Daly (fergal at domain esatclear.ie)
Date: Fri 20 Oct 2000 - 01:48:58 IST
At 00:27 20/10/00, Niall O Broin wrote:
>On Thu, Oct 19, 2000 at 07:28:59PM +0100, Fergal Daly wrote:
>
> > while ( $string =~ /\G(.{5})/g )
> > {
> > my $header = $1;
> > print "header = $header\n";
> > my ($MB, $SMB, $LB) = unpack("na1n", $header);
> > print "MB = $MB, SMB = $SMB, LB = $LB\n";
> >
> > $string =~ /\G(.{$LB})/g;
> > my $data = $1;
> > print "data = $data\n";
> > }
>
>As I mentioned, I ended up using something like that, but I didn't use \G. I
>fail to see what \G does for you in either of those regex. Mind you, I had a
>hard time getting my head around \G because the bloody Perl Cookbook, which
>I generally find excellent, has two examples for \G and in the second the \G
>is not necessary - the code works without it, just as mine does, and just as
>the above would, I think. It seems to me that \G is only of use when you're
>doing repeated matches in the same string as per /\G /0/g to change leading
>spaces in a string into zeroes. Mind you, what seems to me to be true of
>regular expressions is not to be relied upon :-)
Yeah, the \G is overkill here. There's no possibility of the patterns
having to go looking for a match anywhere besides immediately after the last
one. There have been times when a regex has worked when, as far as I can
tell, it shouldn't have, and vice versa! The Perl Journal has some excellent
articles on them, including a detailed explanation of the regex state
machine. Another one, by the guy who wrote the code that Yahoo use to replace
any occurrence in a press release of any of the 16000 ticker symbols with a
link to its quote page, contains a non-spurious use of \G.
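To make the "overkill" point concrete, here is a rough sketch that runs the header/data loop from the quoted code both with and without \G and compares the results. The parse_records helper, the sample buffer and the "or last" guard are made up for illustration. The /s flag is also an addition, on the assumption that the data may contain newlines, which "." does not match by default.

```perl
use strict;
use warnings;

# Hypothetical helper: walk a buffer of records laid out as
# [2-byte n][1 char][2-byte n length][length bytes of data].
# $use_G chooses whether both patterns are anchored with \G.
sub parse_records {
    my ($string, $use_G) = @_;
    my $G = $use_G ? '\G' : '';
    my @records;
    while ($string =~ /${G}(.{5})/gs) {
        my ($MB, $SMB, $LB) = unpack("na1n", $1);
        # Stop on a truncated record (a guard the original loop lacks).
        $string =~ /${G}(.{$LB})/gs or last;
        push @records, [$MB, $SMB, $LB, $1];
    }
    return @records;
}

# Made-up sample data: two records, "abc" and "xy".
my $buf = pack("na1n", 1, "A", 3) . "abc" . pack("na1n", 2, "B", 2) . "xy";

for my $use_G (1, 0) {
    my @recs = parse_records($buf, $use_G);
    print $use_G ? "with \\G:    " : "without \\G: ";
    print join(", ", map { "$_->[1]=$_->[3]" } @recs), "\n";
}
```

Both runs should report the same two records, because a pattern like .{5} with /s can always match at the current position and so never has to search further along the string.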
Also, of the two examples in the man page, only one of them needs \G. I
think the main use for \G is when you want to try to match a string against
several expressions in turn, for example in a parser:
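$_ = "aabf 45345 fasdf asdfasdf 32452345 23452345";
while (1)
{
    # \G pins each attempt to where the previous match ended;
    # /c keeps that position when an attempt fails.
    # \s* (not \s+) so the last token, which has no trailing
    # space, still matches.
    if (/\G([a-z]+)\s*/gc)
    {
        print "Got a word - '$1'\n";
    }
    elsif (/\G([0-9]+)\s*/gc)
    {
        print "Got a number - '$1'\n";
    }
    else
    {
        last;
    }
}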
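Without a \G on the first regex, it would skip over '45345' and find 'fasdf'.
The /c matters as well: it stops a failed match from resetting the position,
so the next regex gets its try at the same spot.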
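As a small sketch of that difference, the following runs the same word/number loop twice, once with \G on the word regex and once without. The tokens helper and its "word:"/"number:" labels are invented for the comparison, and it uses \s* rather than \s+ so that the final token, which has no trailing space, is picked up too.

```perl
use strict;
use warnings;

# Hypothetical helper: tokenise $text, with or without \G on the word regex.
sub tokens {
    my ($text, $anchored) = @_;
    my $word = $anchored ? qr/\G([a-z]+)\s*/ : qr/([a-z]+)\s*/;
    my @out;
    while (1) {
        if ($text =~ /$word/gc) {
            push @out, "word:$1";
        }
        elsif ($text =~ /\G([0-9]+)\s*/gc) {
            push @out, "number:$1";
        }
        else {
            last;
        }
    }
    return @out;
}

my $input = "aabf 45345 fasdf asdfasdf 32452345 23452345";
print "anchored:   ", join(" ", tokens($input, 1)), "\n";
print "unanchored: ", join(" ", tokens($input, 0)), "\n";
```

The anchored run reports all six tokens in order. The unanchored run never reports 45345, because the word regex is free to scan past it to 'fasdf' before the number regex gets a chance.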
There was talk of a regex debugger some time ago, or maybe ActiveState
have one in their dev kit, I can't remember, but it'd be handy to be able to
single-step through some of the nastier ones.
Fergal
P.S. I just found that article on regex internals here:
http://www.plover.com/~mjd/perl/Regex/
This archive was generated by hypermail 2.1.6 : Thu 06 Feb 2003 - 13:07:51 GMT