Our models are built to deliver in real time, so you get the best performance and fast transcription in both batch and real-time modes.
Quickly transcribe large quantities of pre-recorded video or audio files. You can easily set up Speechmatics to process thousands of hours of recordings.
Transcribe your pre-recorded files to get the data you need, when you need it. It’s a great way to extract understanding from your audio quickly and efficiently.
We offer low-latency, accurate transcription of live audio streams from meetings, calls, or broadcast events.
You’ll get initial transcriptions in milliseconds, with context-driven accuracy improvements over time. Our real-time transcription uses the same core machine learning models as batch transcription to give you the best accuracy.
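As a rough illustration of how a client might handle early results that are later refined, here is a minimal Python sketch. It assumes the stream delivers provisional ("partial") text that is later replaced by confirmed ("final") text. The message kinds and the `LiveTranscript` class are hypothetical names for this example, not the Speechmatics API.

```python
from dataclasses import dataclass, field


@dataclass
class LiveTranscript:
    """Keeps confirmed text plus the latest provisional text."""

    final_segments: list = field(default_factory=list)
    partial: str = ""

    def on_message(self, kind: str, text: str) -> None:
        # "partial" and "final" are hypothetical message kinds for this sketch.
        if kind == "partial":
            # A newer partial replaces the previous provisional text.
            self.partial = text
        elif kind == "final":
            # A final result confirms the text and clears the provisional part.
            self.final_segments.append(text)
            self.partial = ""
        else:
            raise ValueError(f"unknown message kind: {kind}")

    def display(self) -> str:
        # Show confirmed text first, followed by any provisional text.
        parts = self.final_segments + ([self.partial] if self.partial else [])
        return " ".join(parts)
```

A caption display could call `on_message` for every incoming result and re-render `display()` each time, so viewers see text immediately and corrections as they arrive.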
Deliver for diverse customer needs with support for Cloud and on-prem deployments. Switch seamlessly between the two or combine these modes.
Meet architecture, security and compliance needs by hosting our API in your own environment. Flexibly combine with Cloud if required.
You can deploy Speechmatics using Docker containers or preconfigured virtual appliances. Going on-prem enables you to improve workflow efficiencies and minimize latencies. This helps you target a wider market with diverse customer needs.
Get instant, secure and scalable access to our API, with deployments hosted in multiple global locations.
Avoid the cost and complexity of building a high-availability system from scratch while getting instant access to all our new features, languages and updates.
Partner with Speechmatics to maximize your total addressable market. We deliver for multilingual, multicultural and multinational businesses, with coverage of nearly half the world’s languages across a range of dialects and accents.
We support 48 languages, covering most native languages with unmatched accuracy.
Whether you need Brazilian Portuguese or Canadian French, we have you covered with a single language model that supports all associated accents and dialects.
Transcribe and translate audio to and from English for over 30 languages using a single API call.
Simplify integration and ensure accurate transcription with automatic detection of the language spoken.
The vocabulary used in different contexts and different domains can vary widely. Our customization options allow you to achieve high accuracy with even the most unique words and phrases.
Boost accuracy for proper nouns, acronyms or industry-specific terms by providing a list of custom words.
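A custom word list is typically sent as part of the job configuration. The Python sketch below shows one way to assemble such a configuration. The field names (`additional_vocab`, `content`, `sounds_like`) and the helper `build_transcription_config` are assumptions for illustration, so check the API reference for the exact schema before relying on them.

```python
import json


def build_transcription_config(language, custom_words):
    """Return a job config dict that carries a custom word list.

    custom_words maps each term to a list of optional pronunciation hints.
    Field names are assumed for this sketch, not taken from the source page.
    """
    vocab = []
    for term, hints in custom_words.items():
        entry = {"content": term}
        if hints:
            # Only include pronunciation hints when some were supplied.
            entry["sounds_like"] = list(hints)
        vocab.append(entry)
    return {
        "type": "transcription",
        "transcription_config": {
            "language": language,
            "additional_vocab": vocab,
        },
    }


config = build_transcription_config(
    "en", {"Speechmatics": [], "CEO": ["C.E.O."]}
)
print(json.dumps(config, indent=2))
```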
Increase accuracy for a use-case or domain by using a relevant corpus of textual content to customize default models.
We're developing English language packs optimized for specific industries, with sector-specific terminology. Finance is available now, with more to follow soon.
Our diarization enriches the transcript with accurate speaker labels, so your users can identify every speaker in a conversation.
Track who said what and when with speaker labelling for each word, available for both batch and real-time transcription.
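With a speaker label on each word, a transcript can be folded into readable speaker turns. The Python sketch below assumes each word is a dict with `speaker`, `content`, `start` and `end` keys. That shape is a simplification chosen for this example and may differ from the actual API output.

```python
def speaker_turns(words):
    """Merge consecutive words from the same speaker into turns."""
    turns = []
    for word in words:
        if turns and turns[-1]["speaker"] == word["speaker"]:
            # Same speaker as the previous word: extend the current turn.
            turn = turns[-1]
            turn["text"] += " " + word["content"]
            turn["end"] = word["end"]
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append(
                {
                    "speaker": word["speaker"],
                    "text": word["content"],
                    "start": word["start"],
                    "end": word["end"],
                }
            )
    return turns


# Hypothetical sample data for illustration only.
sample = [
    {"speaker": "S1", "content": "Good", "start": 0.0, "end": 0.3},
    {"speaker": "S1", "content": "morning", "start": 0.3, "end": 0.8},
    {"speaker": "S2", "content": "Hello", "start": 1.0, "end": 1.4},
]
for turn in speaker_turns(sample):
    print(f'{turn["speaker"]} [{turn["start"]:.1f}-{turn["end"]:.1f}]: {turn["text"]}')
```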
Capture exactly what was said, even when there is crosstalk between speakers, with separate transcription on each channel.
Written and spoken conversations vary. From punctuation to the formatting of numbers and dates, our API includes a number of features to accurately transform conversation to transcript.
Identify and correctly format numbers, dates and currencies automatically to improve transcript readability and enable effective post-processing.
Improve readability with language-specific capitalization and punctuation including commas, question marks and exclamation marks.
Aid comprehensibility and compliance by detecting and optionally removing words that are considered profanities or hesitations.
Easily push a variety of media formats to the API and get a rich set of metadata to support your post-processing needs.
Get accurate timestamps for every word in the transcript to allow for post-processing and an improved end-user experience.
Collect confidence scores for every word in the transcript to enable efficient human review and editing.
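Word-level timestamps and confidence scores can be combined to point a human reviewer straight at the uncertain parts of a recording. The Python sketch below assumes each word is a dict with `content`, `start` (in seconds) and `confidence` (0 to 1) keys. The key names and the 0.8 threshold are illustrative assumptions, not values from the API.

```python
def format_timestamp(seconds):
    """Format a time offset in seconds as MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    minutes, rem_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(rem_ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def words_needing_review(words, threshold=0.8):
    """Return (timestamp, word, confidence) for words below the threshold."""
    return [
        (format_timestamp(w["start"]), w["content"], w["confidence"])
        for w in words
        if w["confidence"] < threshold
    ]


# Hypothetical sample data for illustration only.
sample = [
    {"content": "quarterly", "start": 75.5, "confidence": 0.97},
    {"content": "EBITDA", "start": 76.1, "confidence": 0.52},
]
for stamp, word, score in words_needing_review(sample):
    print(f"{stamp}  {word}  (confidence {score:.2f})")
```

The threshold is a tuning choice: a higher value sends more words to review, while a lower one reduces editing effort at the risk of missing errors.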
Minimize the resources needed to prepare audio or video files with support for all major audio and video formats, along with automatic sample rate detection.
Sign up to our free speech-to-text SaaS Portal and we’ll guide you through the integration of our API.