"Kenn Humborg" said:
> That's what I meant. This little proxy would take over the role
> of the spider/crawler in a standard search engine setup. Except
> this time the "spider" is driven by the user's browsing, rather
> than by following links.
> I'd imagine that modern machines would have no trouble indexing
> pages at the speed a normal user reads.
> And, for added coolness, when you install this indexer for the
> first time, it would prime the index by trawling through your
> browser cache.
... and the history database too, to pick up URLs. I think the history db
is longer-lived than the cache...
Sounds cool. But have you worked out whether it's possible to get Netscape
to dump the URLs/content of the pages it's looking at? Or do you mean using
this program as a proxy? That'd slow down browsing a little :(
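For what it's worth, the indexing half of such a proxy could be sketched
roughly like this in Python. This is only a minimal sketch under my own
assumptions: the names TextExtractor, BrowseIndex, add_page and search are
made up for illustration, the proxy is assumed to hand over each (URL, HTML)
pair it sees, and the index lives in memory rather than on disk.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # nesting depth inside script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


class BrowseIndex:
    """Hypothetical in-memory inverted index: word -> set of URLs."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add_page(self, url, html):
        # Assumed to be called by the proxy for every page the user views.
        parser = TextExtractor()
        parser.feed(html)
        parser.close()
        text = " ".join(parser.chunks).lower()
        for word in re.findall(r"[a-z0-9]+", text):
            self.postings[word].add(url)

    def search(self, query):
        # Return the URLs that contain every word in the query.
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return set()
        results = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            results &= self.postings.get(word, set())
        return results
```

Priming from the cache or history would then just be a loop calling
add_page() on whatever pages can be recovered from there; how to read
Netscape's cache format is not covered by this sketch.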
BTW, I think one of the Samba guys (I'm not too clear on the details) did
a cool hack which could cut and paste HTML (including forms) directly from
Netscape, as a kind of annotation/personal log tool. Sounds like complete
magic. Whatever techniques he used for that may also help here.
Also, www.genehack.org and www.jwz.org both contain stuff about a concept
called Intertwingle, which would be a kind of personal indexer for mail...
vaguely related.
Good luck with this, sounds cool,
--j.