Image Search

How to configure Image Search in a Curiosity Workspace

What is OCR?

OCR, or Optical Character Recognition, is a technology that extracts text from different types of documents, like scanned paper documents, PDF files or images captured by a digital camera.

OCR Support in Curiosity

Curiosity Workspaces include OCR capabilities so that text contained in images and scanned documents can be indexed and searched. This includes:

Image Documents: Curiosity can process JPEG, PNG, TIFF, and BMP files, extracting text for indexing and searching.
Scanned Documents: Curiosity can extract text from scanned documents, such as PDFs that contain only images.
Multi-Language Support: Curiosity can recognize text in a range of languages, including English, French, Spanish, German, and Portuguese.

Which file types are supported with OCR?

Curiosity supports OCR for the following file types:

Images (.png, .jpg, .jpeg, .gif, .tif, .tiff, .bmp, .dng, .webp, .raw, .heic, .heif, .psb, .svg, .odg, .otg, .odi)
PDF scans (.pdf files whose content consists only of images)
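If you want to check ahead of time which files in a folder fall under the list above, a small helper like the following may be useful. This is a minimal sketch and not part of any Curiosity API: the function name `ocr_candidate_kind` and the constant `OCR_IMAGE_EXTENSIONS` are hypothetical, and the check looks at the file extension only, so it cannot tell whether a PDF actually contains only images.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported-types list in this page.
# The names below are illustrative and not part of Curiosity itself.
OCR_IMAGE_EXTENSIONS = frozenset({
    ".png", ".jpg", ".jpeg", ".gif", ".tif", ".tiff", ".bmp", ".dng",
    ".webp", ".raw", ".heic", ".heif", ".psb", ".svg", ".odg", ".otg", ".odi",
})


def ocr_candidate_kind(filename: str) -> Optional[str]:
    """Return "image", "pdf", or None, based on the file extension alone."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    if suffix in OCR_IMAGE_EXTENSIONS:
        return "image"
    if suffix == ".pdf":
        # Only PDFs made up of images are listed as OCR targets;
        # the extension by itself cannot confirm that.
        return "pdf"
    return None


if __name__ == "__main__":
    for name in ["scan.TIFF", "report.pdf", "notes.docx"]:
        print(name, "->", ocr_candidate_kind(name))
```

A pre-check like this only narrows down candidates; whether a given file is actually processed with OCR is decided by the workspace.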

Configuring OCR in a Curiosity Workspace

Documentation coming soon...


Last updated 1 year ago
