
Audio and Video Search

How to configure Speech-to-Text in a Curiosity Workspace


Last updated 10 months ago

What is Speech-to-Text?

Speech-to-Text (STT), also known as speech recognition, converts spoken language into written text. Curiosity uses it to turn the spoken content of audio and video files into searchable text.

Speech-to-Text Support in Curiosity

Curiosity has integrated Speech-to-Text capabilities based on the open-source Whisper models. Key features include:

  • Audio File Processing: Curiosity can process audio files in formats like MP3, WAV, and FLAC, converting spoken words into searchable text. Users can search the transcribed speech and jump to the matching position in the audio file.

  • Video File Processing: Curiosity also processes video files in formats like MP4 to make them searchable. Users can search the transcribed speech and jump to the matching position in the video file.

  • Multi-Language Support: Curiosity STT can recognize and transcribe a broad range of languages, including English, French, Spanish, German, and others.

Supported file types

Curiosity supports Speech-to-Text on the following file types:

  • Video files (.mp4, .wmv, .mpeg, .avi, .mkv, .mov, .ogv, .3gp, .m4a, .oga, .weba, .webm, .flv)

  • Audio files (.mp3, .wav, .mka, .wma, .flac, .aac, .aiff)
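When preparing files for upload, it can help to check in advance which ones fall into the lists above. The following is a minimal sketch only: the names `VIDEO_EXTENSIONS`, `AUDIO_EXTENSIONS`, and `stt_category` are hypothetical helpers, not part of any Curiosity API, and the extension sets are copied from the lists above using the same grouping.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the documented lists; the grouping mirrors the docs.
# These constants are illustrative and not part of any Curiosity API.
VIDEO_EXTENSIONS = {
    ".mp4", ".wmv", ".mpeg", ".avi", ".mkv", ".mov", ".ogv",
    ".3gp", ".m4a", ".oga", ".weba", ".webm", ".flv",
}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".mka", ".wma", ".flac", ".aac", ".aiff"}


def stt_category(filename: str) -> Optional[str]:
    """Return "video" or "audio" if the file extension appears in the
    documented Speech-to-Text lists, otherwise None.

    Only the final extension is inspected and the comparison is
    case-insensitive; the file contents are not examined.
    """
    suffix = Path(filename).suffix.lower()
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    return None


if __name__ == "__main__":
    for name in ["all-hands.MP4", "interview.flac", "notes.pdf"]:
        print(name, "->", stt_category(name))
```

The check relies on the file name alone, so a mislabeled file would still pass; it is meant as a quick pre-filter before upload, not as validation of what the workspace will actually accept.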

Configuring Speech-to-Text in a Curiosity Workspace

Documentation coming soon...

Curiosity showing a video (right) opened at the matching position after a Speech-to-Text search