Export Formats

Overview

OCRR supports exporting your processed documents in six formats, each suited to different use cases. You choose the format when saving your documents.

Important: OCRR is primarily designed for PDF exports, which provide the most accurate preservation of document layout. While other formats are supported, they may not maintain the exact layout of the original document.

Supported Formats

PDF (Portable Document Format)

The PDF format is ideal for maintaining the exact visual layout of your input image while adding a searchable text layer.

  • Features:
    • Preserves original document appearance
    • Includes searchable text layer
    • Adjustable resolution (72-600 DPI)
    • Options for contrast and brightness adjustment
    • Black and white conversion option for smaller file size
  • Best for: Archiving, sharing, and professional document distribution

TXT (Plain Text)

Plain text format contains just the extracted text without any formatting or layout information.

  • Features:
    • Simple text content only
    • Smallest file size
    • Universal compatibility
    • Basic line structure preservation
  • Best for: Text extraction, data processing, or when formatting isn't important

Layout Disclaimer: TXT format does not preserve layout or formatting from the original document beyond basic line breaks. It contains only the extracted text content.

RTF (Rich Text Format)

Rich Text Format preserves basic text formatting while maintaining compatibility with most word processors.

  • Features:
    • Preserves font styles and basic layout
    • Detects and emphasizes headers
    • Uses table-based positioning for layout preservation
    • Wide compatibility with word processors
  • Best for: Editable documents with basic formatting needs

Layout Disclaimer: While RTF attempts to preserve layout, it may not match the original document exactly. Complex layouts, columns, or precise positioning may be simplified.

DOCX (Microsoft Word Document)

The DOCX format creates Microsoft Word-compatible documents that retain text formatting and approximate the original layout.

  • Features:
    • Full compatibility with Microsoft Word
    • Preserves text formatting and structure
    • Maintains text positioning
    • Header detection and styling
  • Best for: Documents that need further editing in Microsoft Word

Layout Disclaimer: DOCX format attempts to preserve layout but may not match the original document exactly. Complex layouts, columns, or precise positioning may be simplified or restructured.

HTML (Web Document)

HTML format creates web-ready documents with preserved layout that can be viewed in any browser.

  • Features:
    • Creates complete HTML document with CSS
    • Includes page images
    • Uses responsive layout techniques
    • Print-friendly design
  • Best for: Web publishing, online sharing, or viewing in a browser

Layout Disclaimer: HTML format uses responsive design techniques that may alter the layout to fit different screen sizes. While it attempts to preserve the original appearance, the layout may vary depending on the viewing device.

JSON (Structured Data)

JSON format exports document data in a structured, machine-readable format ideal for developers and data processing.

  • Features:
    • Complete structured representation of document content
    • Includes precise text positioning information
    • Contains metadata about document pages
    • Preserves confidence scores for OCR accuracy
    • Includes font analysis data
  • Best for: Data extraction, custom processing, integration with other applications, or AI/ML workflows

Layout Disclaimer: JSON format is designed for data interchange rather than visual display. While it contains positioning data, it's intended for programmatic use rather than direct viewing.
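
As an illustration of programmatic use, the sketch below loads an export and lists words whose OCR confidence falls below a threshold. The field names used here (pages, number, words, text, confidence, bbox) and the 0-to-1 confidence scale are assumptions made for this example, not OCRR's documented schema. Inspect a real export and adjust the keys before relying on it.

```python
import json

# Hypothetical export content: the keys below are assumed for illustration
# and may not match the structure OCRR actually writes.
sample_export = """
{
  "pages": [
    {
      "number": 1,
      "width": 2550,
      "height": 3300,
      "words": [
        {"text": "Invoice", "confidence": 0.98, "bbox": [120, 90, 410, 150]},
        {"text": "T0tal", "confidence": 0.41, "bbox": [120, 400, 260, 440]}
      ]
    }
  ]
}
"""


def low_confidence_words(export, threshold=0.8):
    """Return (page number, text, confidence) for words below the threshold."""
    flagged = []
    for page in export.get("pages", []):
        for word in page.get("words", []):
            # Treat a missing confidence value as 0.0 so it gets flagged.
            if word.get("confidence", 0.0) < threshold:
                flagged.append(
                    (page.get("number"), word.get("text"), word.get("confidence"))
                )
    return flagged


export = json.loads(sample_export)
flagged = low_confidence_words(export)
for page_number, text, confidence in flagged:
    print(f"page {page_number}: {text!r} (confidence {confidence})")
```

To run this against a file on disk, replace json.loads(sample_export) with json.load on an opened export file.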

Export Settings

When exporting your documents, you can configure various settings to optimize the output:

Image Quality Settings

  • DPI Options:
    • Low (72 DPI) - Smallest file size, good for screen viewing
    • Medium (150 DPI) - Balanced quality and size
    • Standard (300 DPI) - Print quality
    • High (600 DPI) - High-resolution print quality
  • Contrast & Brightness: Adjust image appearance for better readability
  • File Size Optimization: Options to reduce file size and convert to black & white
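
To give a rough sense of how these settings affect file size, the sketch below estimates the uncompressed size of a single Letter-size page image at each DPI preset, in 8-bit grayscale and in 1-bit black and white. This is simple arithmetic, not OCRR's actual encoding: real exports are compressed, so actual files will be smaller, and the function name is made up for this example.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed raster size in bytes for one page image."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Ignores per-row padding, which some image formats add.
    return width_px * height_px * bits_per_pixel // 8


# DPI presets as listed in the export settings.
DPI_PRESETS = {"Low": 72, "Medium": 150, "Standard": 300, "High": 600}

for name, dpi in DPI_PRESETS.items():
    gray = raw_page_bytes(8.5, 11, dpi, bits_per_pixel=8)
    bw = raw_page_bytes(8.5, 11, dpi, bits_per_pixel=1)
    print(f"{name:8} {dpi:3} DPI: grayscale {gray / 1e6:6.2f} MB, "
          f"black & white {bw / 1e6:5.2f} MB")
```

Doubling the DPI quadruples the pixel count, and 1-bit black and white needs one eighth of the raw data of 8-bit grayscale, which is why both settings have a large effect on output size.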

Page Format Options

  • Original: Preserve source document dimensions
  • Letter (8.5×11″): Standard US letter size
  • A4 (210×297mm): Standard international page size
  • A3 (297×420mm): Larger format for detailed documents
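
For reference, the pixel dimensions of a page image follow from the page size and the chosen DPI. The sketch below works this out for the fixed page formats listed above. It is a plain unit conversion (25.4 mm per inch) with rounding to the nearest pixel; OCRR's own rounding may differ, and the names used are hypothetical.

```python
MM_PER_INCH = 25.4

# Page sizes in inches (width, height), converted from the listed dimensions.
PAGE_SIZES_IN = {
    "Letter": (8.5, 11.0),
    "A4": (210 / MM_PER_INCH, 297 / MM_PER_INCH),
    "A3": (297 / MM_PER_INCH, 420 / MM_PER_INCH),
}


def page_pixels(page, dpi):
    """Return (width, height) in pixels for a named page format at a DPI."""
    width_in, height_in = PAGE_SIZES_IN[page]
    return round(width_in * dpi), round(height_in * dpi)


for page in PAGE_SIZES_IN:
    width_px, height_px = page_pixels(page, 300)
    print(f"{page}: {width_px} x {height_px} px at 300 DPI")
```

The Original option has no fixed size, so its pixel dimensions depend on the source document.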

How to Export

  1. Process your document with OCR
  2. Select File → Export Document... or use the keyboard shortcut Command+E
  3. Choose your desired format from the Format dropdown menu
  4. Configure any format-specific settings
  5. Select the destination location and filename
  6. Click Save

For batch export operations, you can also configure export settings in the batch processing window.

