On Wed, 19 May 2004, Justin MacCarthy wrote:
> Hi
>
> I want to use wget to give me a list of all URIs referenced on my
> domain (and only my domain), so I can remove old files. I'm sure there
> is a way to do this with wget, but I just can't see it (really bad flu
> today). I don't want to download anything, just list the documents so I
> can clean up a local copy.
>
> Thanks, Justin
Well, maybe you could combine wget with a bit of bash scripting
and sed/awk/perl. Use wget to download a page, then pass the downloaded
page to an awk/perl script that extracts all the URLs/URIs, and save
the resulting list. That might do the trick, if I understand
your problem.
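
The extraction step could look roughly like the sketch below, written in
Python rather than awk/perl. It assumes the page has already been saved
locally by wget, and that the site lives at www.example.com; the names
LinkCollector and same_domain_links, and the script name listlinks.py,
are made up for illustration. It only looks at href and src attributes,
so links built by scripts or hidden in CSS would be missed.

```python
import sys
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the raw values of href and src attributes from a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def same_domain_links(html, base_url, domain):
    """Return sorted, de-duplicated absolute URLs in html whose host is domain."""
    collector = LinkCollector()
    collector.feed(html)
    found = set()
    for link in collector.links:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, link))
        if urlparse(absolute).hostname == domain:
            found.add(absolute)
    return sorted(found)


# Hypothetical usage:
#   python listlinks.py page.html http://www.example.com/ www.example.com
if __name__ == "__main__" and len(sys.argv) == 4:
    path, base_url, domain = sys.argv[1:4]
    with open(path, encoding="utf-8", errors="replace") as fh:
        for url in same_domain_links(fh.read(), base_url, domain):
            print(url)
```

Running this over each downloaded page and merging the output would give
one list of referenced URIs to compare against the files in the local copy.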
Rory