PDF · AI translation

Translate PDFs. Layout and fonts survive.

Drop your text-selectable PDF and get it back translated with paragraphs, columns, tables, fonts, and images in the same positions. Output as PDF or editable DOCX — your choice.


What is Native PDF?

You can tell a PDF is native by clicking and dragging — if you can select text, it's native. Native PDFs are generated from Word, InDesign, LaTeX, Google Docs, or similar — the text is real, the structure is parseable. About 75% of business PDFs are native. If you can't select text, it's a scanned PDF, and the pipeline is different — see /translate/pdf-scanned for OCR-based translation.
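The click-and-drag test can be roughly automated. The sketch below assumes you already have the extracted text of each page as a list of strings; the function name and both thresholds are illustrative assumptions, not part of Fily's pipeline.

```python
def looks_native(page_texts, min_chars=20, min_fraction=0.5):
    """Guess whether a PDF is native from its per-page extracted text.

    page_texts:   one string per page (empty when nothing was extractable).
    min_chars:    assumed minimum of non-blank characters for a page to count.
    min_fraction: assumed share of pages that must carry text.
    """
    if not page_texts:
        return False
    pages_with_text = sum(1 for t in page_texts if len(t.strip()) >= min_chars)
    return pages_with_text / len(page_texts) >= min_fraction


native_like = looks_native(["Master Services Agreement between the parties ...",
                            "Section 2. Payment terms and invoicing ..."])
scan_like = looks_native(["", "   ", ""])
```

One way to obtain the page strings is a PDF library such as pypdf (`PdfReader(path).pages`, then `page.extract_text()` on each page). A scan that already carries an OCR text layer would also pass this check, so treat the result as a hint only.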

Why Native PDF is tricky for AI translation

  • Reading order: a PDF page places glyphs at page coordinates in drawing order, which need not match the logical reading order. Multi-column layouts, sidebars, and footnotes can scramble into nonsense if reading order is inferred naively.
  • Text expansion: Spanish typically runs 25–30% longer than English, so text in tight columns overflows and a naive translation comes out clipped.
  • Font substitution: the source font may not contain target-language glyphs (for example, a Helvetica embedded for English text has no Arabic or Chinese glyphs). Substitution must preserve weight, size, and color.
  • Embedded tables: PDF tables are visual constructs (lines + cells of text), not semantic tables. Rows and columns have to be inferred from ruling lines and text positions before the table can be extracted and re-rendered.
  • Footnotes and references: numbered references in main body link to footnote positions on the same page. Translating without preserving links breaks the reference chain.
  • Images with embedded text: a native PDF can still contain image-only diagrams with captions. Those need OCR + translation.
  • Forms: PDF form fields (/Tx text fields, /Btn buttons) sit outside the page body text, so they have to be extracted and translated separately.
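To make the reading-order problem concrete, here is a minimal sketch that compares a naive top-to-bottom sort with a column-aware sort. It assumes text blocks have already been extracted with bounding boxes and that the page has a known number of equal-width columns; the `Block` type and `reading_order` function are hypothetical names for illustration, not Fily's implementation.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    x0: float  # left edge
    y0: float  # top edge, measured from the top of the page
    x1: float  # right edge


def reading_order(blocks, page_width, n_columns=2):
    """Order blocks column by column, top to bottom inside each column."""
    col_width = page_width / n_columns

    def key(b):
        center = (b.x0 + b.x1) / 2
        # Clamp so a block touching the right margin stays in the last column.
        col = min(int(center // col_width), n_columns - 1)
        return (col, b.y0, b.x0)

    return sorted(blocks, key=key)


blocks = [
    Block("right col, para 1", 320, 100, 560),
    Block("left col, para 2", 40, 300, 280),
    Block("left col, para 1", 40, 100, 280),
    Block("right col, para 2", 320, 300, 560),
]

# Naive ordering by vertical position interleaves the two columns.
naive = [b.text for b in sorted(blocks, key=lambda b: (b.y0, b.x0))]
ordered = [b.text for b in reading_order(blocks, page_width=600)]
```

Real pages also need column detection, sidebars, and footnote handling, none of which this fixed-grid sketch attempts.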

How Fily handles Native PDF

  • Block-level extraction: PDF parsed into structured blocks — paragraphs, table cells, headers, footers, captions, footnotes — with positions and reading order preserved.
  • Layout reconstruction: output PDF reuses the source layout where possible. DOCX output reconstructs columns, tables, and styles.
  • Font handling: when the source font lacks coverage for the target language, a target-language font is substituted, matched to the original weight, size, and color where possible.
  • Text-expansion buffer: layout engine adjusts line breaks and paragraph reflow for expansion (ES, FR, DE) and compression (ZH, JA).
  • Image-text handling: image regions with embedded text are detected; OCR runs on those regions separately.
  • Tables: cell-by-cell translation with column structure preserved.
  • Footnotes: reference numbers stay linked; footnote text translated with original numbering.
  • Output format: searchable PDF or DOCX — picked at upload.
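As an illustration of how a layout step might absorb text expansion, the sketch below shrinks the font size until an estimated reflow of the translated text fits its original box. This is a rough model under stated assumptions: the average glyph width (`avg_char_em`), the line spacing (`leading`), the minimum size, and the function itself are hypothetical, and a real engine would measure actual font metrics rather than count characters.

```python
import math


def fit_font_size(text, box_width, box_height, font_size,
                  min_size=6.0, step=0.5, avg_char_em=0.5, leading=1.2):
    """Return the largest size <= font_size at which text fits the box, else None.

    Width is estimated as character count times an assumed average glyph
    width (avg_char_em * size); height as line count times size * leading.
    """
    size = font_size
    while size >= min_size:
        chars_per_line = max(1, int(box_width // (size * avg_char_em)))
        lines = math.ceil(len(text) / chars_per_line)
        if lines * size * leading <= box_height:
            return size
        size -= step
    return None  # does not fit even at min_size; the box must grow or reflow


english = "x" * 100   # stand-in for a 100-character source paragraph
spanish = "x" * 130   # the same paragraph after roughly 30% expansion

size_en = fit_font_size(english, box_width=200, box_height=40, font_size=10.0)
size_es = fit_font_size(spanish, box_width=200, box_height=40, font_size=10.0)
```

Under these assumptions the source paragraph keeps its 10 pt size, while the expanded one drops to 9 pt to stay inside the same box. Returning `None` signals that shrinking alone is not enough and the paragraph needs more room.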

Pipeline: pdf_qa_12step@2.0.0 · pdf_qa_12step_v2@1.0.0

The Native PDF workflow with Fily

1. Upload

Drop your .pdf (single or batch ZIP). Optional: glossary, translation memory (TM), style guide.

2. Process

Fily runs the Native PDF pipeline + 12 QA steps. Typical job: 10–20 minutes.

3. Download

Get the translated file in the format you picked at upload (PDF or DOCX), ready to deliver. An HTML QA report is attached.

Common upload: a 20-page native PDF (legal contract, technical spec, marketing one-pager) with mixed paragraphs, tables, and inline images. Fily delivers a translated PDF that opens identically in Acrobat — same layout, fonts substituted where needed, paragraphs reflowed for text expansion.

Ready to translate a Native PDF file?

No card. No setup. Upload one file and see the output.