LINUX.IE, website of the Irish Linux Users' Group

[ILUG] [OT puzzle] Grep regex Answers


Proinnsias Breathnach proinnsias at linux.ie
Thu Jun 29 13:57:56 IST 2006


On Thu, Jun 29, 2006 at 01:37:29PM +0100, Proinnsias Breathnach wrote:
> *ah* ... If we'd known the lines started with "Visited: " we could have
> made the job much easier ...
> 
> strings index.dat | grep ^Visited | sed -e 's@\(/.*\) .*$@\1@' | less
> 
> Should do the trick - again, untested and ymmv
> 
> (we're looking for a / followed by a number of characters before a space
> and a number of characters preceding the EoL char ($), and stripping from
> that space up to the EoL; this should do the job.)
> 
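
To see what the quoted sed expression does, here is a minimal sketch run against a made-up line instead of a real index.dat. The sample text and the function name trim_after_space are assumptions for illustration only; the actual strings output may be laid out differently.

```shell
# Hypothetical sample line; real index.dat output may look different.
sample='Visited: user@http://www.example.com/dir/page.html trailing-crud'

# Keep everything from the first "/" up to the last space on the line,
# and drop that space plus whatever follows it.
trim_after_space() {
    sed -e 's@\(/.*\) .*$@\1@'
}

printf '%s\n' "$sample" | grep '^Visited' | trim_after_space
# prints: Visited: user@http://www.example.com/dir/page.html
```

Because .* is greedy, only the text after the last space is removed; a line with several spaces in the trailing part would keep all but its final word.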

Of course, the "crud" as you called it isn't space-delimited (per the
examples you included); rather, it's the stuff after the .html / .aspx etc.
That stuff is often important in knowing what part of a site was visited.

If you really want to trim it, notice that the .html/.aspx is usually
followed by a ? .. so replace the space above with a ? - odds are it's
the first one in the line (is ? a valid filename character under
DOS/Windows?)

so:
	strings index.dat | grep ^Visited | sed -e 's/?.*$//' | less

might well do the job. If not, modify as above to ensure the URL is
kept too ..
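
A small sketch of the ?-trimming idea, again on made-up lines (the sample URLs and the function name strip_query are assumptions, not taken from a real index.dat). In a basic regular expression an unescaped ? is a literal question mark, whereas \? is a GNU sed extension meaning "zero or one of the preceding item", so the sketch uses the unescaped form. The g flag is left out because .*$ already consumes the rest of the line.

```shell
# Hypothetical sample lines; real index.dat output may look different.
strip_query() {
    # Delete from the first literal "?" to the end of the line.
    sed -e 's/?.*$//'
}

printf '%s\n' \
    'Visited: user@http://www.example.com/search.aspx?q=linux&page=2' \
    'Visited: user@http://www.example.com/index.html' |
    grep '^Visited' | strip_query
# prints:
# Visited: user@http://www.example.com/search.aspx
# Visited: user@http://www.example.com/index.html
```

Lines without a ? pass through unchanged, so the URL itself is kept either way.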

P


