Hi,
I'm having trouble using sed to do replacements on some badly tagged
XML. I have a large number of files that are tagged as follows:
<first id="34">
blah blah
<second id="56" name="xyz1">hello hello</second>
<second name="xyz4">hello hello</second>
<second id="16" name="xyz5">hello hello</second>
<first id="3">
blah blah blah
<second>hello hello</second>
<second id="12" name="xyz5">hello hello</second>
The "first" tags have no closing tags at all, and may or may not have
text between the tag and the next tag. What I want to do is remove the
"first" tag and any text up to, but not including the "second" tag.
I've got to the following stage, but don't know how to get it to _not_
blank out the line containing the "second" tag:
sed -e '/<first/,/<second/ s/.*//' file.xml
Any suggestions?
Thanks,
--
Marcus Furlong
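
One possible approach, offered as a minimal sketch rather than a definitive answer: keep the same address range, but delete only those lines inside it that do not themselves contain "<second". This assumes each tag starts on its own line, as in the sample above. The file names sample.xml and cleaned.xml are made up for illustration.

```shell
# Hypothetical sample input, mirroring the layout shown in the question.
cat > sample.xml <<'EOF'
<first id="34">
blah blah
<second id="56" name="xyz1">hello hello</second>
<second name="xyz4">hello hello</second>
<first id="3">
blah blah blah
<second>hello hello</second>
EOF

# For every range that starts at a line matching "<first" and ends at
# the next line matching "<second", delete each line in the range that
# does NOT contain "<second". The closing "<second" line is therefore
# kept, and lines outside any range are printed unchanged.
sed -e '/<first/,/<second/{/<second/!d;}' sample.xml > cleaned.xml
```

Because the "first" lines are deleted with d instead of being emptied with s/.*//, no blank lines are left behind. If a "first" tag and a "second" tag can share a line, this sketch will not split them, and a different approach would be needed.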