LINUX.IE, website of the Irish Linux Users' Group

[ILUG] [OT puzzle] Grep regex Answers

Declan Moriarty junk_mail at iol.ie
Thu Jun 29 13:12:32 IST 2006


On Tue, 2006-06-27 at 13:28 +0100, Proinnsias Breathnach wrote:

> > > strings index.dat | grep http | less ?
> > > 
> > > Maybe, just maybe ?
> > 
We were doing very well here...

> > it caught the http all right. No file:// ftp:// or whatever else. That
> > might be enough though.
> 
> easy to fix: 
> 	strings index.dat | grep -e '[http|ftp|file]' | less
This caught the desired stuff along with a huge pile of unwanted guff,
as if the regex were too all-encompassing.  I had lines like
f7,ea,0
as well. Adding -o showed it was triggering on single letters :-/.
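
The single-letter matches are what a bracket expression does: '[http|ftp|file]'
is a set of characters (h, t, p, |, f, i, l, e) and matches any one of them.
A minimal sketch of that behaviour, using made-up sample lines rather than the
real index.dat:

```shell
# Made-up sample lines standing in for `strings index.dat` output.
sample=$(printf '%s\n' 'f7,ea,0' 'http://example.com/' '12345')

# The bracket expression is a set of single characters: h t p | f i l e.
# 'f7,ea,0' is selected because it contains 'f' and 'e'.
matches=$(printf '%s\n' "$sample" | grep -e '[http|ftp|file]')
printf '%s\n' "$matches"

# -o prints each matched piece; for 'f7,ea,0' that is the letters f and e.
pieces=$(printf '%s\n' 'f7,ea,0' | grep -o -e '[http|ftp|file]')
printf '%s\n' "$pieces"
```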

Kevin's offering 

>     strings index.dat | grep -e '\(http|ftp|file\)' | less

was a little too restrictive - it caught nothing :-(.
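
A likely reason it caught nothing: in grep's default (basic) regex syntax a
bare | is an ordinary character, so the pattern asks for the literal text
"http|ftp|file". A small sketch with made-up sample lines, comparing it with
the extended syntax selected by -E:

```shell
# Made-up sample lines standing in for `strings index.dat` output.
sample=$(printf '%s\n' 'http://example.com/' 'ftp://example.org/pub' \
  'file:///tmp/x.html' 'f7,ea,0')

# Basic regex: \( \) group, but the bare | is literal, so nothing matches.
# `|| true` only hides grep's "no match" exit status.
basic=$(printf '%s\n' "$sample" | grep -e '\(http|ftp|file\)' || true)

# Extended regex (-E): | means "or" and the parentheses need no backslashes.
extended=$(printf '%s\n' "$sample" | grep -E -e '(http|ftp|file)')
printf '%s\n' "$extended"
```

GNU grep also accepts \| inside a basic regex, but that is an extension; -E is
the portable spelling.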

In fact, I fiddled away with brackets and escapes to no avail.
Based on this, my feeble attempt

  strings index.dat | grep -e 'http' -e 'ftp' -e 'file' | less

actually more or less did the trick.  I only ever got semi-functional
with Perl regexes, and these have to be POSIX or something.
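
Several -e patterns are simply OR-ed together, so the command above selects
the same lines as a single extended-regex alternation. The sketch below checks
that on made-up sample lines; the stricter pattern requiring "://" is an
illustrative assumption, not something proposed in the thread:

```shell
# Made-up sample lines standing in for `strings index.dat` output.
sample=$(printf '%s\n' 'http://example.com/' 'profile data' \
  'ftp://example.org/pub' 'f7,ea,0')

# Several -e patterns: a line is printed if any one of them matches.
multi=$(printf '%s\n' "$sample" | grep -e 'http' -e 'ftp' -e 'file')

# The same selection written as one extended-regex alternation.
single=$(printf '%s\n' "$sample" | grep -E 'http|ftp|file')

# Requiring "://" after the scheme drops lines such as 'profile data'.
strict=$(printf '%s\n' "$sample" | grep -E '(http|ftp|file)://')
printf '%s\n' "$strict"
```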

To trim the crud, Proinnsias offered 
> Probably easily solved by adding a :
> 	| sed -e 's/ .*$//' 
> before the | less 

This was excellent - it provided the correct search term. It didn't
work, mind you. I got a list of lines saying 
Visited:
and nothing more. So the correct answer seems to have been

  strings index.dat | grep Visited | less

giving me lines like the end of the mail (excuse Evolution's nutty
wrapping). But we'll still leave the award with Proinnsias because he
did most of the work that worked. I haven't the cheek to claim it
anyhow.
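
The sed step left only "Visited:" because 's/ .*$//' deletes everything from
the first space to the end of the line. A hedged sketch of the opposite cut,
assuming every record has the "Visited: <user> at <url>" shape seen in the
sample at the end of the mail and that the user name contains no spaces:

```shell
# One record copied from the sample output at the end of the mail.
line='Visited: administrator at file:///D:/Britanica/cache/info_31_.html'

# The suggested sed: delete from the first space to the end of the line.
cut_short=$(printf '%s\n' "$line" | sed -e 's/ .*$//')

# Deleting the leading "Visited: <user> at " instead keeps the URL.
url=$(printf '%s\n' "$line" | sed -e 's/^Visited: [^ ]* at //')
printf '%s\n%s\n' "$cut_short" "$url"
```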

So the trick to improving that would be to trim the crud (cookies?)
without losing the redirects and subdirs. That is way beyond my HTML
parsing know-how and regex skills, so I'll wade through the crud, and
thank all who offered suggestions.
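
One possible way to trim the crud, offered as a sketch rather than a tested
recipe: let grep -o print only the URLs, stopping each match at the first '?',
'&' or space. That drops the parameters but still picks up URLs embedded in a
query string (the redirects) along with their subdirectories. The records
below are made up, and the pattern is an assumption about what counts as crud:

```shell
# Made-up records in the "Visited: <user> at <url>" shape shown in the mail.
sample=$(printf '%s\n' \
  'Visited: administrator at http://t.example.com/en-ie/default.aspx?ver=7.0&did=1' \
  'Visited: administrator at http://images.example.com/imgres?imgurl=http://www.example.org/gallery/pic.jpg&h=454&w=302')

# -o prints only the matched text. Each match starts at a scheme and stops
# at the first '?', '&' or space; sort -u removes duplicate URLs.
urls=$(printf '%s\n' "$sample" | grep -oE '(https?|ftp|file)://[^&? ]+' | sort -u)
printf '%s\n' "$urls"
```

Embedded URLs that are percent-encoded (http%3A%2F%2F...) would not be caught
by this pattern.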

-- 
        With Best Regards,

        Declan Moriarty.


Visited: administrator at file:///D:/Britanica/cache/info_31_.html
Visited: administrator at about:blank
Visited: administrator at http://rad.msn.com/ADSAdClient31.dll?GetAd?
PG=IMSIRD?SC=HF
Visited: administrator at http://t.msn.com/en-ie/default.aspx?
ver=7.0.0777&did=1&t=7MG0XqkiQB*DQ4BRIgU!wZLJWb!jhSW!lbssD52WRyoW
v2Mg3GOMewEBJYKV4UzNq4xkED3*YTN01GAhg3HNAMDPVnw7dlgL30pAog0Y1aqnlE2pXe2ZCrlWNOoyby!Iez&p=7b05ezVgIDCv6A!8v*GQBs6gRz0SnwG7CYc
NW!eGSxrl3l!
JkeKIxGkXL*LU2KIACRcdkjpOkkxDec1qTNUhiRToEZfnFvYBasLe1eVbtVnK7OSJbNKpQNqgZMNi0UkOyMM9aa7rFMZUY4lp05RJUuc6WRbZN0d
35hAmHca9UOzx0bVeFMbXBo4PVXzKCc5Pqm
Visited:
administrator at http://messenger.yahoo.com/external/client_ad.php?p=409640
Visited: administrator at http://t.msn.com/en-ie/default.aspx?
ver=7.0.0777&did=1
Visited: administrator at file:///C:/WINDOWS/temp/9324124/bill_1.html
Visited: administrator at http://images.google.ie/imgres?
imgurl=https://www1.columbia.edu/sec/itc/hs/medical/pathology/neuropat
hology/02_DD_Developmental/DD-02.jpg&imgrefurl=https://www1.columbia.edu/sec/itc/hs/medical/pathology/neuropathology/02_DD_D
evelopmental/&h=491&w=700&sz=59&tbnid=HIJ2hn3WnTEJ:&tbnh=96&tbnw=138&hl=en&start=1&prev=/images%3Fq%3Ddd%26svnum%3D10%26hl%3
Den%26lr%3D%26safe%3Doff
Visited: administrator at http://images.google.ie/imgres?
imgurl=http://www.salsaruedacongress.com/Salsa%2520Rueda%2520Congress%
2520Of%2520The%2520Americas%25202004%2520Gallery/Miami%2520Hottest%
2520Salsera/Miami%2520Hottest%2520Salsera%2520Big%25209.j
pg&imgrefurl=http://www.salsaruedacongress.com/Miami%2520Hottest%
2520Salseras%2520Winners%2520Gallery.asp&h=454&w=302&sz=85&





