Believe it or not, Microsoft has released its own parser for <Ahem> "html"
files. What's more, I think it's currently on its second incarnation. It's
a plug-in for Word rather than a stand-alone app. Bit of useless trivia, that.
You could try out HTML Tidy from http://www.w3.org/People/Raggett/tidy/. It
has lots of options (including a word-2000 flag).
From the site:
"If set, Tidy will go to great pains to strip out all the surplus stuff
Microsoft Word 2000 inserts when you save Word documents as "Web pages". The
default is no."
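
If you'd rather script it than type the command each time, something like
the rough sketch below might do. It assumes a reasonably recent tidy binary
on your PATH that accepts config options in the "--option value" form and
understands -q and -o. The function names and file names are made up for
illustration, so check your own build's help output before trusting it.

```python
import shutil
import subprocess


def build_tidy_command(src, dst):
    """Build the argument list for one tidy run with Word 2000 cleanup on.

    Assumption: the installed tidy accepts "--word-2000 yes" on the
    command line, plus -q (quiet) and -o (output file).
    """
    return [
        "tidy",
        "-q",                   # keep tidy's own chatter to a minimum
        "--word-2000", "yes",   # the option quoted above; its default is no
        "-o", dst,              # write the cleaned markup to this file
        src,                    # the Word-generated "Web page" to clean
    ]


def clean_word_html(src, dst):
    """Run tidy on src and return its exit status.

    Tidy conventionally exits with 0 for a clean run, 1 if it issued
    warnings and 2 if it hit errors, so a 1 here is usually still fine.
    """
    if shutil.which("tidy") is None:
        raise FileNotFoundError("tidy was not found on PATH")
    return subprocess.run(build_tidy_command(src, dst)).returncode
```

Calling clean_word_html("word9.html", "clean.html") would then leave the
original file alone and put tidy's attempt in clean.html, which makes it easy
to compare the two in a browser.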
> Someone sent me a file that looks like Word9 got its hands on it. It's
> covered in XML-ish looking style code. Anyone got a program to get rid of
> this? It does render in Netscape, but it looks horrid.
>> I tried Demoroniser, but it didn't change much...
> "The fool must be beaten with a stick, for an intelligent person
> the merest hint is sufficient" -- Zen Master Greg
> Irish Linux Users' Group: ilug at linux.ie
> http://www.linux.ie/mailman/listinfo/ilug for (un)subscription
> List maintainer: listmaster at linux.ie