
OCR PDF

Add a searchable text layer to scanned or image-based PDF documents using Optical Character Recognition.

Overview

The OCR PDF tool applies Optical Character Recognition to scanned or image-based PDF pages, creating an invisible text layer behind the page images. This makes the document fully searchable, selectable, and accessible without altering its visual appearance. OCR is essential for digitising paper documents and meeting accessibility requirements.

How to Use

  1. Navigate to OCR PDF from the Security & Optimization menu.
  2. Upload your file using drag-and-drop, Browse Files, or cloud storage (Dropbox / Google Drive).
  3. Select the document language or languages for optimal recognition accuracy (defaults to English).
  4. Click OCR to process.
  5. Download the result when processing completes.
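
For batch work outside the web interface, the same steps can be approximated with a command-line OCR tool. The sketch below assumes the open-source OCRmyPDF CLI, which this tool is not confirmed to use; `build_ocr_command` and the file names are hypothetical. It only builds the argument list, and running the command requires OCRmyPDF to be installed.

```python
from typing import List, Sequence


def build_ocr_command(
    input_pdf: str,
    output_pdf: str,
    languages: Sequence[str] = ("eng",),
) -> List[str]:
    """Build an argument list for the OCRmyPDF CLI (an assumed stand-in).

    `languages` holds Tesseract-style codes such as "eng" or "fra".
    """
    if not languages:
        raise ValueError("at least one language is required")
    return [
        "ocrmypdf",
        "--language", "+".join(languages),  # step 3: document language(s)
        "--skip-text",  # leave pages that already contain text untouched
        input_pdf,      # step 2: the file to process
        output_pdf,     # step 5: the searchable result
    ]


# Hypothetical file names; pass the list to subprocess.run(cmd, check=True).
cmd = build_ocr_command("scan.pdf", "scan_ocr.pdf", ["eng", "fra"])
print(cmd)
```

Building the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.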

Options

Option | Description
Language(s) | Select one or more languages that appear in the document. Hold Ctrl (or Cmd on Mac) to select multiple. Choosing the correct language improves recognition accuracy for language-specific characters and dictionaries. Defaults to English.
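
Many OCR engines identify languages by short codes rather than display names. The sketch below assumes Tesseract-style three-letter codes, which this tool is not confirmed to use; `LANGUAGE_CODES` and `language_argument` are hypothetical names, and the table covers only a few languages for illustration.

```python
from typing import Iterable

# Tesseract-style codes (assumption: the tool's engine is not documented).
LANGUAGE_CODES = {
    "English": "eng",
    "French": "fra",
    "German": "deu",
    "Spanish": "spa",
    "Japanese": "jpn",
    "Arabic": "ara",
}


def language_argument(names: Iterable[str] = ("English",)) -> str:
    """Join the codes for the selected languages with '+', dropping repeats."""
    codes = []
    for name in names:
        code = LANGUAGE_CODES[name]  # KeyError for a language not in the table
        if code not in codes:
            codes.append(code)
    if not codes:
        raise ValueError("select at least one language")
    return "+".join(codes)


print(language_argument(["English", "French"]))  # eng+fra
```

The default of English mirrors the documented default of the Language(s) option.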

Supported Languages

The OCR tool supports more than 80 languages, grouped below. For multilingual documents, select all relevant languages from the dropdown.

Common
  • English
  • French
  • German
  • Spanish
  • Italian
  • Portuguese
  • Dutch
  • Russian
  • Chinese (Simplified & Traditional)
  • Japanese
  • Korean
  • Arabic
  • Hebrew
  • Hindi
  • Turkish
  • Polish
  • Greek
  • Thai
  • Vietnamese
European
  • Albanian, Basque, Bosnian, Breton, Bulgarian, Catalan, Corsican, Croatian, Czech, Danish, Estonian, Faroese, Finnish, Frisian, Galician, Hungarian, Icelandic, Irish, Latvian, Lithuanian, Luxembourgish, Macedonian, Maltese, Norwegian, Occitan, Romanian, Scottish Gaelic, Serbian, Serbian (Latin), Slovak, Slovenian, Swedish, Ukrainian, Welsh
Asian
  • Bengali, Cebuano, Filipino, Gujarati, Indonesian, Javanese, Kannada, Malay, Malayalam, Marathi, Nepali, Panjabi, Sundanese, Tamil, Telugu, Urdu
Middle Eastern & African
  • Afrikaans, Azerbaijani, Pashto, Persian, Sindhi, Swahili, Uyghur, Uzbek, Yoruba
Other
  • Esperanto, Haitian Creole, Latin, Maori, Quechua, Tatar, Tongan, Yiddish

Tips & Notes

Tip

Higher-quality scans produce better OCR results. Scan at 300 DPI or higher, and keep pages straight, because skewed text reduces recognition accuracy.
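
Scan resolution can be checked from a page image's pixel width and the paper's physical width. The sketch below is plain arithmetic; `effective_dpi`, `meets_ocr_minimum`, and `MIN_OCR_DPI` are illustrative names, with the threshold taken from the 300 DPI guidance in this tip.

```python
MIN_OCR_DPI = 300  # recommended minimum from the scanning tip


def effective_dpi(width_px: int, width_in: float) -> float:
    """Return dots per inch from pixel width and physical width in inches."""
    if width_in <= 0:
        raise ValueError("physical width must be positive")
    return width_px / width_in


def meets_ocr_minimum(width_px: int, width_in: float) -> bool:
    """True if the scan reaches the recommended minimum resolution."""
    return effective_dpi(width_px, width_in) >= MIN_OCR_DPI


# US Letter is 8.5 in wide, so 2550 px across is exactly 300 DPI.
print(effective_dpi(2550, 8.5))      # 300.0
print(meets_ocr_minimum(1275, 8.5))  # False (150 DPI)
```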

Tip

For multilingual documents, select every language that appears in the document. The OCR engine automatically detects which language applies to each region of text.

Note

Pages that already contain a text layer are skipped to avoid duplicating text. Only image-based pages are processed.
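
The skip rule can be pictured as a filter over the document's pages. This is a minimal sketch, not the tool's implementation; `Page`, `has_text_layer`, and `pages_to_ocr` are hypothetical names, and a real pipeline would detect existing text by inspecting each PDF page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    number: int
    has_text_layer: bool  # True if the page already carries extractable text


def pages_to_ocr(pages: List[Page]) -> List[int]:
    """Return the numbers of image-only pages; pages with text are skipped."""
    return [p.number for p in pages if not p.has_text_layer]


doc = [Page(1, True), Page(2, False), Page(3, False)]
print(pages_to_ocr(doc))  # [2, 3]
```

In a mixed document like this one, only pages 2 and 3 would receive a new text layer, so existing text on page 1 is never duplicated.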

Related tools: Extract Text · PDF to Word