need help with sed and regexps

Hello guys
I need help with sed and regular expressions.
I have an input file containing text with HTML formatting.
I have to import this file into another program that respects only the </br> tag.
I need to strip all HTML tags except <br> and its variations before importing this file.
All kinds of br tags have to become </br>.

Something like this:
1. any br variant (e.g. </br>, < br />) ... => </br>
2. "<whatever tag without </br> >" => ""

How could I do it using sed?
 
Code:
sed 's@<\([^
][^<>]*\)>\([^<>]*\)</\1>@\2@g'

Pipe the line into this and it should strip off all HTML tags; the content between the tags will remain intact, and the <br> tags will remain too.

P.S.
Up The Irons!
;)
 
:) 10x \m/
but it didn't work.
Here is a sample file:
Code:
line1<tag1>alabala
blabla</tag2>
line2<tag>blabla
<tag3>text<tag4>blabla


</br>
</br>
< br />

Here is the sed output (nothing changed):
Code:
sed 's@<\([^
][^<>]*\)>\([^<>]*\)</\1>@\2@g' test.txt
line1<tag1>alabala
blabla</tag2>
line2<tag>blabla
<tag3>text<tag4>blabla


</br>
</br>
< br />


I got what I wanted with 3 sed commands.
Code:
sed -e "s:<[^<>]*br[^<>]*>:uniqstring123:g" Export.TXT > out1.txt
sed -e "s:<[^<>]*>::g" out1.txt > out2.txt
sed -e "s:uniqstring123:</br>:g" out2.txt > FINAL.TXT

But my way seems very lame... that's why I need another solution :)
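
One possible way to fold the three passes into a single sed call, as a sketch rather than a definitive answer: sed reads its input line by line, so the pattern space normally never contains a newline. That means a newline can stand in for uniqstring123, and no temp files are needed. This assumes GNU sed (\n in the replacement text is not guaranteed to work elsewhere), and strip_tags_keep_br is just a made-up function name for illustration.

```shell
# Sketch only, assuming GNU sed. strip_tags_keep_br is a hypothetical
# helper name, not something from the thread.
strip_tags_keep_br() {
    # 1) turn every tag containing "br" into a newline placeholder
    # 2) delete every remaining tag
    # 3) turn the placeholder into </br>
    sed -e 's:<[^<>]*br[^<>]*>:\n:g' \
        -e 's:<[^<>]*>::g' \
        -e 's:\n:</br>:g' "$@"
}

# Usage: strip_tags_keep_br Export.TXT > FINAL.TXT
```

Like the three original commands, the first pattern matches any tag that contains the letters "br", so a tag such as <abbr> would also be turned into </br>.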
 