antiword-xp-rb extracts plain text from docx and odt files. The executables are called ~antiwordxp.rb~ and ~antiodt.rb~, and they work much like "antiword", printing the extracted text to standard output: ~antiwordxp.rb "$file" > file.txt~
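
For converting several files at once, a small wrapper could look like the sketch below. It is only an illustration, not part of antiword-xp-rb: the helpers ~extractor_for~, ~text_path_for~ and ~convert~ are hypothetical names, and it assumes that both executables are on the PATH and that ~antiodt.rb~ accepts a file argument and prints to standard output the same way ~antiwordxp.rb~ does.

```ruby
require "open3"

# Extractor per file extension; the executable names come from the README.
EXTRACTORS = {
  ".docx" => "antiwordxp.rb",
  ".odt"  => "antiodt.rb" # assumption: invoked the same way as antiwordxp.rb
}.freeze

# Hypothetical helper: extractor command for a path, or nil if unsupported.
def extractor_for(path)
  EXTRACTORS[File.extname(path).downcase]
end

# Hypothetical helper: "notes/report.docx" becomes "notes/report.txt".
def text_path_for(path)
  path.delete_suffix(File.extname(path)) + ".txt"
end

# Hypothetical helper: run the extractor and save its standard output
# next to the source file. Returns the output path, or nil if skipped.
def convert(path)
  tool = extractor_for(path)
  return nil unless tool

  text, status = Open3.capture2(tool, path)
  raise "#{tool} failed on #{path}" unless status.success?

  out = text_path_for(path)
  File.write(out, text)
  out
end

# When run as a script, convert every file given on the command line.
if __FILE__ == $PROGRAM_NAME
  ARGV.each { |path| convert(path) }
end
```

Saved under a name of your choice, for example ~batch.rb~, it would be called as ~ruby batch.rb *.docx *.odt~. Files with other extensions are skipped silently.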