Beastie said: If all you want to do is read them, then uncompress them and read the contents using any text editor.
OJ said: What command would I use to do that?
tar xf file.docx
should do.
OJ said: I wonder if there's a simple (command line is best) program to convert .xml to plain text.
You can use xmlto(1) for that purpose, as follows:
% xmlto txt document.xml
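If the goal is only to read the text of a .docx, one rough alternative is to skip the conversion tool and strip the markup directly, since the visible text sits inside the `<w:t>` elements of word/document.xml. The following is a minimal sketch, assuming unzip(1), tr(1) and sed(1) are installed; the function names `strip_wordml` and `docx_text` are made up for illustration. It does not decode entities such as `&amp;`, and it ignores tabs, tables and other layout.

```shell
#!/bin/sh
# strip_wordml: read WordprocessingML on stdin, write rough plain text on stdout.
# (hypothetical helper name, not part of any tool mentioned in this thread)
strip_wordml() {
    # 1. join all input lines, so tags that wrap across lines stay whole
    # 2. turn every paragraph end (</w:p>) into a newline
    # 3. delete every remaining tag, including the <?xml ...?> declaration
    tr -d '\n' | sed -e 's/<\/w:p>/\
/g' -e 's/<[^>]*>//g'
}

# docx_text: print the text of the .docx file named by $1.
# unzip -p writes the single archive member to stdout without unpacking the rest.
docx_text() {
    unzip -p "$1" word/document.xml | strip_wordml
}
```

With the functions loaded into the shell, `docx_text file.docx | less` should show one paragraph per line. Joining the input lines first is a deliberate simplification: it keeps wrapped tags intact, at the cost of dropping any literal newline that happens to sit inside the text itself.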
OJ said: I tried that with several documents from two different sources and the result is this:
Document /home/ole/tmp/word/document.xml does not validate
cpm said: You need to pass the --skip-validation option or fix the document syntax!
Oops, sorry, I forgot to mention that I already tried that. Perhaps Microsoft has its own proprietary format for XML, since that just gives a .txt file with a great pile of markup. Like this:
<w:document><w:body><w:p><w:pPr><w:jc></w:jc><w:rPr><w:b></w:b><w:i></w:i>
<w:sz></w:sz><w:szCs></w:szCs><w:u></w:u></w:rPr></w:pPr><w:r><w:rPr><w:b></
w:b><w:i></w:i><w:sz></w:sz><w:szCs></w:szCs><w:u></w:u></w:rPr><w:t>Attn:
Residents of </w:t></w:r><w:proofErr></w:proofErr><w:r><w:rPr><w:b></w:b><w:i>
</w:i><w:sz></w:sz><w:szCs></w:szCs><w:u></w:u></w:rPr><w:t>Coalmont</w:t></