Each file has the same structure, shown in the HTML sample below.
I'm looking to dump only the section between <table> and </table> (including those tags).
I've spent a little while looking for a solution, but it still isn't clear to me what the easiest way to achieve that is.
I tried the following solutions:
http://austinmatzko.com/2008/04/26/sed-multi-line-search-and-replace/
http://www.unix.com/shell-programming-scripting/147347-how-get-one-particular-section-using-awk.html
http://www.unix.com/shell-programming-scripting/66251-remove-html-tags-bash.html
http://www.unix.com/shell-programming-scripting/58479-multiple-line-match-using-sed.html
A good start:
http://www.grymoire.com/Unix/Sed.html#uh-47
HTML:
<html>
<head></head>
<body><p><table>My table here!</table></p>
</body>
</html>
A good start:
Code:
lynx --base --source http://ai-contest.com/rankings.php | less "+/table"
Code:
sed -n '1h;1!H;${;g;s/<h2.*/No title here/g;p;}' sample.php
Code:
perl -0777 -pe 's/\A[^\{]*\{//s; s/\}.*?\{/\n/sg; s/\}[^\}]*\Z//s'