read-document-files

Turn local documents into readable text.

Use this skill to convert PDF, DOCX, XLSX, and PPTX files into UTF-8 text, and to export images embedded in DOCX files for separate analysis.

PDF, DOCX, XLSX, PPTX

What It Does

Runs a reusable Python extractor that writes plain-text output, metadata, and DOCX image assets.
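
To give a feel for the DOCX part of that work, here is a minimal sketch using only the Python standard library. It is not the bundled extractor: the function name `extract_docx` and its arguments are hypothetical, and it relies only on the fact that a DOCX file is a ZIP archive whose body text lives in `word/document.xml` and whose embedded images live under `word/media/`.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used by word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx(docx_path, out_dir):
    """Hypothetical helper: return paragraph text and export embedded images."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paragraphs = []
    image_paths = []
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
        for para in root.iter(W_NS + "p"):
            # Join every text run (w:t) inside one paragraph (w:p)
            text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
            paragraphs.append(text)
        for name in archive.namelist():
            # Embedded images are stored under word/media/
            if name.startswith("word/media/") and not name.endswith("/"):
                target = out_dir / Path(name).name
                target.write_bytes(archive.read(name))
                image_paths.append(target)
    return "\n".join(paragraphs), image_paths
```

A call such as `text, images = extract_docx("report.docx", "report_assets")` (both names are placeholders) would return the document text as one string and the list of image files written to disk. Headers, footers, and nested text boxes are ignored here to keep the sketch short.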

Best For

Reading PDF and Office documents before summarization, search, migration, parsing, or agent handoff.

Install

  1. git clone --depth 1 https://github.com/picasso250/skills.git ~/tmp/picasso-skills
  2. cp -r ~/tmp/picasso-skills/read-document-files ~/.agents/skills/
  3. Run the bundled extractor script on a local Office or PDF file path.