LINUX.IE, website of the Irish Linux Users' Group

[ILUG] Regular Expressions

Brian Foster blf at utvinternet.ie
Wed Apr 2 07:01:50 IST 2008


  | Date: Wed, 2 Apr 2008 00:14:24 +0100
  | From: Pádraig Brady <P at draigBrady.com>
  | 
  | > Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; ru) Opera 8.50
  | 
  | You can't really "AND" multiple things in the same RE
  | so in .htaccess for example you specify each separately to match
  | and the "AND" is implicit (unless you specify [OR]).
  | 
  | If it's always in the same order then you could do:
  | MSIE 6\.0.*ru.*Opera

 I understand what Pádraig is trying to say,
 but he didn't say it as well as he could have.
 it's perfectly possible to "and" REs in an RE.
 most REs are a series of "and"s (of a sort):
 for instance, the RE /ab/ is
 the RE /a/ "AND then followed by" the RE /b/.

 what is quite difficult to do is omit the
 "... then followed by" part of the "and ...".
 the easiest way to say RE /a/ and RE /b/
 (with no ordering, i.e., "ab" and "xbcay" both
 match) is, as Pádraig says, to use separate REs.

 there are two choices.  one possibility is to
 have an RE for each possible permutation; e.g.,
 either /ab/ or /ba/ must match.  (actually,
 the REs would be /a.*b/ and /b.*a/, but the
 `.*' is visual noise, so I'm omitting it.)

 the other is to declare a match only when all
 REs match; e.g., both /a/ and /b/ must match.
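
 a minimal sketch of the two choices, using awk(1) and the
 two sample strings from above.  the function names and the
 extra test strings ("axx", "bxx") are made up for
 illustration; they are not from the original question.

```shell
# choice 1: one RE per permutation; a line matches if ANY permutation matches
by_permutation() { awk '/a.*b/ || /b.*a/'; }

# choice 2: one RE per term; a line matches only if ALL terms match
by_conjunction() { awk '/a/ && /b/'; }

# "ab" and "xbcay" contain both letters; "axx" and "bxx" contain only one
printf '%s\n' ab xbcay axx bxx | by_permutation
printf '%s\n' ab xbcay axx bxx | by_conjunction
```

 both pipelines should print the same two lines ("ab" and
 "xbcay"); the second form just stays readable as the
 number of terms grows.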

 I don't know squid(8) so I don't know how you do
 either method in `squid'.  but in awk(1),
 the second (and much easier) method is:

    /MSIE 6\.0/ && /ru/ && /Opera/  { ... }

 and in sed(1) it is:

   /MSIE 6\.0/{
     /ru/{
       /Opera/{
         ...
       }
     }
   }

 in both examples ‘...’ is what to execute on a
 match; that is, when the line contains all three
 of the (un-)desired substrings.  (efficiency
 freaks would probably order the REs so the least
 likely to match is checked first, and the most
 likely to match is checked last.  (think about it.))
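
 a runnable sketch of both forms, with "print the line" as
 the action.  the sample input is hypothetical: the first
 line is the User-Agent from the question, the other two
 are invented near-misses.  the function names are also
 made up for illustration.

```shell
# hypothetical sample input: one User-Agent string per line
sample_agents() {
  printf '%s\n' \
    'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; ru) Opera 8.50' \
    'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.50' \
    'Mozilla/5.0 (X11; Linux i686; ru) Firefox/2.0'
}

# awk: logical AND of three independent REs; the action prints the line
match_all_awk() {
  awk '/MSIE 6\.0/ && /ru/ && /Opera/ { print }'
}

# sed: nested address blocks; the innermost command prints the line
match_all_sed() {
  sed -n '
/MSIE 6\.0/{
  /ru/{
    /Opera/p
  }
}'
}

sample_agents | match_all_awk
sample_agents | match_all_sed
```

 each pipeline should print only the first sample line.
 note that /ru/ is a loose test (it matches "ru" anywhere
 in the line), so a real filter would probably want
 something tighter.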

 it's possible to do the alternative method of all
 possible permutations in one ERE (Extended RE),
 but it tends to make heads explode: /(ab)|(ba)/
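
 for the record, a sketch of the single-ERE form with the
 `.*' put back in, using grep -E on the same two-letter
 example.  the function name and test strings are made up
 for illustration.

```shell
# one ERE listing every permutation of the two terms
match_permutations() { grep -E '(a.*b)|(b.*a)'; }

# "ab" and "xbcay" contain both letters; "axx" and "bxx" do not
printf '%s\n' ab xbcay axx bxx | match_permutations
```

 two terms need 2 alternatives; three terms (as in the
 MSIE/ru/Opera case) need 3! = 6, which is where the heads
 start to explode.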

cheers!
	-blf-
-- 
“How many surrealists does it take to    |  Brian Foster
 change a lightbulb?  Three.  One calms  |  somewhere in south of France
 the warthog, and two fill the bathtub   |     Stop E$$o (ExxonMobil)!
 with brightly-coloured machine tools.”  |       http://www.stopesso.com


