Image Search

How to configure Image Search in a Curiosity Workspace

What is OCR?

Optical Character Recognition (OCR) is a technology that extracts machine-readable text from documents such as scanned paper documents, PDF files, or images captured by a digital camera.

OCR Support in Curiosity

Curiosity Workspaces include OCR capabilities so that text contained in images and scans can be indexed and searched. This includes:

  • Image Documents: Curiosity can process image files such as JPEG, PNG, TIFF, and BMP, extracting their text for indexing and searching.

  • Scanned Documents: Curiosity can extract text from scanned documents, such as image-only PDFs and scanned images.

  • Multi-Language Support: Curiosity can recognize a range of languages, including English, French, Spanish, German, and Portuguese.

Which file types are supported with OCR?

Curiosity supports OCR for the following file types:

  • Images (.png, .jpg, .jpeg, .gif, .tif, .tiff, .bmp, .dng, .webp, .raw, .heic, .heif, .psb, .svg, .odg, .otg, .odi)

  • PDF scans (.pdf files where there are only images in the content)
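As an illustration of the list above, the following minimal sketch checks by extension whether a file is a candidate for OCR before it is uploaded. It is not part of any Curiosity API: the function name `ocr_candidate_kind` and the constant `OCR_IMAGE_EXTENSIONS` are hypothetical, and the extension set is simply copied from the list above. Note that an extension check cannot tell whether a PDF contains only images, so PDFs are reported as candidates only.

```python
from pathlib import Path
from typing import Optional

# Image extensions copied from the supported file types list above.
# The constant and function names are hypothetical, not a Curiosity API.
OCR_IMAGE_EXTENSIONS = {
    ".png", ".jpg", ".jpeg", ".gif", ".tif", ".tiff", ".bmp", ".dng",
    ".webp", ".raw", ".heic", ".heif", ".psb", ".svg", ".odg", ".otg", ".odi",
}


def ocr_candidate_kind(filename: str) -> Optional[str]:
    """Return "image", "pdf", or None based only on the file extension."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    if suffix in OCR_IMAGE_EXTENSIONS:
        return "image"
    if suffix == ".pdf":
        # Only image-only PDFs are listed for OCR; the extension alone
        # cannot confirm that, so this is a candidate, not a guarantee.
        return "pdf"
    return None


if __name__ == "__main__":
    for name in ["scan.TIFF", "report.pdf", "notes.docx"]:
        print(name, ocr_candidate_kind(name))
```

A check like this can be useful for filtering a folder before ingestion, so that unsupported files are flagged early rather than silently skipped.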

Configuring OCR in a Curiosity Workspace

Documentation coming soon...
