Viewing docx files

mdg · Feb 8, 2011

Any suggestions on a docx viewer? Just want to view them, not edit.
OpenOffice is too much and Abiword seems to have trouble with docx.

treepython · Feb 8, 2011

Hello,
LibreOffice could solve your problem. It is a fork of OpenOffice.org, but you don't need Java to use it.

Beastie · Feb 8, 2011

If all you want to do is read them, then uncompress them and read the contents using any text editor.

DutchDaemon · Feb 9, 2011

See if docs.google.com knows what to do with them?

dennylin93 · Feb 9, 2011

LibreOffice is quite big as well, but it somehow manages to compile a lot faster than OpenOffice.org, so it might be worth a try.

rhish · Aug 28, 2012

This turned out to be great advice! Thanks!

Deleted member 9563 · Jun 16, 2014

Re:

Beastie said:
If all you want to do is read them, then uncompress them and read the contents using any text editor.

What command would I use to do that?

bsdkeith · Jun 16, 2014

This site gives some insight into the docx format.
http://pcsupport.about.com/od/fileexten ... cxfile.htm

Beastie · Jun 16, 2014

Re: Re:

OJ said:
Beastie said:

If all you want to do is read them, then uncompress them and read the contents using any text editor.

What command would I use to do that?

tar xf file.docx should do.
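
This works because a .docx file is an ordinary zip archive, and FreeBSD's tar (bsdtar) can read zip archives. Where tar cannot, the same step can be sketched with Python's standard zipfile module. The function names list_docx_members and extract_docx are made up for illustration, and the sketch assumes the file is a valid, unencrypted archive.

```python
import zipfile


def list_docx_members(path):
    """Return the names of the files stored inside a .docx (zip) archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def extract_docx(path, dest_dir):
    """Unpack every member of the archive into dest_dir, much like `tar xf`."""
    with zipfile.ZipFile(path) as archive:
        archive.extractall(dest_dir)


# Example use (hypothetical file names):
#   print(list_docx_members("file.docx"))
#   extract_docx("file.docx", "file_unpacked")
```

The main body text normally ends up in word/document.xml under the destination directory.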

Deleted member 9563 · Jun 17, 2014

Thanks @Beastie! That's really useful. Unfortunately the content shows up as document.xml. So what does one do with that? A text editor shows garbage. Konqueror reads it, but runs some words together. Not bad though, and will do in a pinch. Firefox just displays a mess of markup. I wonder if there's a simple (command line is best) program to convert .xml to plain text.

cpm@ · Jun 17, 2014

OJ said:
I wonder if there's a simple (command line is best) program to convert .xml to plain text.

You can use xmlto(1) for that purpose, as follows:
% xmlto txt document.xml

Deleted member 9563 · Jun 17, 2014

cpm said:
You can use xmlto(1) for that purpose, as follows:
% xmlto txt document.xml

I tried that with several documents from two different sources and the result is this:

Code:

Document /home/ole/tmp/word/document.xml does not validate

cpm@ · Jun 17, 2014

OJ said:
cpm said:

You can use xmlto(1) for that purpose, as follows:
% xmlto txt document.xml

I tried that with several documents from two different sources and the result is this:

Code:

Document /home/ole/tmp/word/document.xml does not validate

You need to pass the --skip-validation option or fix the document syntax.

Deleted member 9563 · Jun 18, 2014

cpm said:
You need to pass the --skip-validation option or fix the document syntax.

Oops, sorry, I forgot to mention that I already tried that. Perhaps Microsoft has their own proprietary format for XML, since that just gives a .txt file with a great pile of markup. Like this:

Code:

<w:document><w:body><w:p><w:pPr><w:jc></w:jc><w:rPr><w:b></w:b><w:i></w:i>
<w:sz></w:sz><w:szCs></w:szCs><w:u></w:u></w:rPr></w:pPr><w:r><w:rPr><w:b></
w:b><w:i></w:i><w:sz></w:sz><w:szCs></w:szCs><w:u></w:u></w:rPr><w:t>Attn:
Residents of </w:t></w:r><w:proofErr></w:proofErr><w:r><w:rPr><w:b></w:b><w:i>
</w:i><w:sz></w:sz><w:szCs></w:szCs><w:u></w:u></w:rPr><w:t>Coalmont</w:t></

cpm@ · Jun 18, 2014

You can strip out all the XML tags from word/document.xml. The -p option makes unzip write the file to the pipe instead of extracting it to disk, e.g.:
% unzip -p document.docx word/document.xml | sed 's#</w:p>#\n\n#g;s#<[^>]*>##g'
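
The same idea (keep the text, drop the markup, break at the end of each paragraph) can also be sketched with an XML parser instead of regular expressions. This is only a rough illustration: docx_to_text is a made-up name, and it assumes the body text lives in w:t elements grouped under w:p paragraphs in word/document.xml, as in the sample above. Tabs, line breaks inside a paragraph, headers and footnotes are ignored.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace that the "w:" prefix in document.xml maps to.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path):
    """Return the text of a .docx file, with a blank line between paragraphs."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across <w:t> elements inside its runs.
        paragraphs.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return "\n\n".join(paragraphs)


# Example use (hypothetical file name):
#   print(docx_to_text("document.docx"))
```

Unlike plain tag stripping, a parser also decodes entities such as &amp;amp; back into the characters they stand for.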
