Can I replace words or phrases in the ASR output?
Yes. By using postprocessing dictionaries, you can automatically replace, remove, or reformat words in the ASR output after the transcription is generated. This article explains how they work and how to configure them.
Postprocessing Dictionaries
Postprocessing dictionaries are a powerful tool for transforming ASR output: they can change text formatting, implement word replacements, create custom profanity filter behavior, and remove filler words in different languages. This feature works with all ASR engines and for all languages.
Each entry in a postprocessing dictionary consists of a replacement term, a pipe character (|), and a pattern that matches an entire word. Multiple comma-separated patterns are supported for one replacement term. The following example matches any of three spellings (Zelensky, Zelenskyi, or Zelenski) and replaces it with the officially recognized spelling, Zelenskyy:
Zelenskyy|Zelensky,Zelenskyi,Zelenski
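As a rough illustration of the entry format, the following Python sketch splits an entry at the pipe character and replaces whole words. The function names parse_entry and apply_entry are hypothetical, and splitting words on whitespace is an assumption made for the sketch; it is not a description of how the service tokenizes ASR output.

```python
def parse_entry(line):
    """Split 'replacement|pattern1,pattern2' into (replacement, [patterns])."""
    # Split at the first pipe only; everything after it is the pattern list.
    replacement, _, patterns = line.partition("|")
    return replacement, [p for p in patterns.split(",") if p]


def apply_entry(text, replacement, patterns):
    """Replace each whole word that equals one of the patterns.

    Assumption: words are separated by whitespace and matching is
    case-sensitive. Punctuation attached to a word is not handled here.
    """
    words = [replacement if word in patterns else word for word in text.split()]
    return " ".join(words)


replacement, patterns = parse_entry("Zelenskyy|Zelensky,Zelenskyi,Zelenski")
print(apply_entry("President Zelenski spoke today", replacement, patterns))
# prints: President Zelenskyy spoke today
```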
Patterns cannot include spaces, but replacement terms can. For example:
Mc Mahon|McMahon,MacMahon
Words can be removed by leaving the replacement term blank. For example:
|ah,ahh
Custom handling of profanity can be implemented by censoring terms, replacing them, and/or removing them entirely. For example:
d***|damn
dan*|dang
dagnabbit|dammit
|crap
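To show how several entries, including a blank replacement, might combine, here is a minimal Python sketch that applies the four profanity entries above to a sentence. The names build_lookup and apply_lookup are hypothetical, and the sketch assumes case-sensitive matching on whitespace-separated words; the service's real behavior may differ.

```python
PROFANITY_ENTRIES = ["d***|damn", "dan*|dang", "dagnabbit|dammit", "|crap"]


def build_lookup(entries):
    """Map each pattern to its replacement term ('' means remove the word)."""
    lookup = {}
    for line in entries:
        replacement, _, patterns = line.partition("|")
        for pattern in patterns.split(","):
            if pattern:
                lookup[pattern] = replacement
    return lookup


def apply_lookup(text, lookup):
    """Replace or drop whole words according to the lookup table."""
    kept = []
    for word in text.split():
        new_word = lookup.get(word, word)
        if new_word:  # an empty replacement removes the word entirely
            kept.append(new_word)
    return " ".join(kept)


lookup = build_lookup(PROFANITY_ENTRIES)
print(apply_lookup("well damn that is crap", lookup))
# prints: well d*** that is
```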
Regular expressions are also supported in patterns. This example matches either Syncwords or syncwords and replaces it with the properly cased term SyncWords:
SyncWords|[Ss]yncwords
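The whole-word regular expression matching described above can be approximated in Python with re.fullmatch, which succeeds only when the pattern covers the entire word. This is a sketch under assumptions: the function name apply_regex_entry is hypothetical, and the regular expression dialect used by the service is not stated in this article. The pattern is written as [Ss]yncwords because, in standard regular expression syntax, a pipe inside a character class is a literal character rather than an "or".

```python
import re


def apply_regex_entry(text, replacement, pattern):
    """Replace every whitespace-separated word that the pattern matches in full."""
    compiled = re.compile(pattern)
    return " ".join(
        replacement if compiled.fullmatch(word) else word
        for word in text.split()
    )


print(apply_regex_entry("welcome to syncwords and Syncwords", "SyncWords", "[Ss]yncwords"))
# prints: welcome to SyncWords and SyncWords
```

Because the match must cover the whole word, a longer word such as syncwordsy is left unchanged in this sketch.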
Creating and Using Postprocessing Dictionaries
Postprocessing dictionaries can only be created by admin users in the Organization Settings screen. Each dictionary must have a name and a language. One postprocessing dictionary per language can be set as the default.
Events and services use the default postprocessing dictionary for the source language, if one is set. If no default is set, services and events can be configured to use any of the available postprocessing dictionaries for that language. Postprocessing can also be disabled for any service or event.
Translations reflect any transformations applied to the source-language text.
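The selection rules above can be sketched as a small Python function. Everything in it is hypothetical: select_dictionary, its parameters, and the dictionary names are illustrative only and do not correspond to a documented API. The sketch follows the order in which the rules are described; the article does not say whether an explicitly configured dictionary takes precedence over a default, so that ordering is an assumption.

```python
def select_dictionary(language, defaults, configured=None, disabled=False):
    """Return the name of the dictionary to apply, or None for no postprocessing.

    language   -- source language code of the event or service
    defaults   -- mapping of language code to its default dictionary name
    configured -- dictionary chosen on the event or service, if any
    disabled   -- True when postprocessing is turned off for this event or service
    """
    if disabled:
        return None
    if language in defaults:
        return defaults[language]
    return configured


defaults = {"en": "news-en"}  # hypothetical default dictionary for English
print(select_dictionary("en", defaults))                          # news-en
print(select_dictionary("fr", defaults, configured="sports-fr"))  # sports-fr
print(select_dictionary("en", defaults, disabled=True))           # None
```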
Dictionaries and Postprocessing Dictionaries
Dictionaries and postprocessing dictionaries can be used together, but they serve very different purposes. Dictionaries provide a suggested list of terms to ASR engines, which increases the likelihood that the output matches the included terms with the indicated spelling (although other factors may prevent this). Postprocessing dictionaries apply transformations after text is received from ASR engines and reliably replace matched patterns with their replacement terms.
Limits and Warnings
You can specify up to 1000 replacement terms and patterns in a postprocessing dictionary.
Depending on the number of terms and the complexity of patterns (especially patterns that include regular expressions), output in services and events may be delayed or may fail entirely.
TIP: We strongly recommend testing your postprocessing dictionaries with content to ensure you don't experience any problems or failures.
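Before testing with real content, a local pre-check can catch obvious mistakes. The Python sketch below flags entries with no pipe character, patterns that contain spaces, patterns that do not compile as regular expressions, and dictionaries over the size limit. The function validate_dictionary is hypothetical, and two points are assumptions: the limit is counted here as the total number of patterns, and Python's re module stands in for whatever regular expression engine the service uses. Splitting on commas also means a regular expression that itself contains a comma would be misread by this sketch.

```python
import re

MAX_ITEMS = 1000  # limit stated above; counting patterns is an assumption


def validate_dictionary(lines):
    """Return a list of human-readable problems found in the dictionary lines."""
    problems = []
    pattern_count = 0
    for number, line in enumerate(lines, start=1):
        replacement, separator, patterns = line.partition("|")
        if not separator:
            problems.append(f"line {number}: missing pipe character")
            continue
        for pattern in patterns.split(","):
            pattern_count += 1
            if not pattern:
                problems.append(f"line {number}: empty pattern")
            elif " " in pattern:
                problems.append(f"line {number}: pattern contains a space")
            else:
                try:
                    re.compile(pattern)
                except re.error as exc:
                    problems.append(f"line {number}: invalid regular expression ({exc})")
    if pattern_count > MAX_ITEMS:
        problems.append(f"{pattern_count} patterns exceed the limit of {MAX_ITEMS}")
    return problems


print(validate_dictionary(["Mc Mahon|McMahon,MacMahon", "|ah,ahh"]))
# prints: []
```

A pre-check like this does not replace testing in the product, since it cannot predict delays caused by complex patterns on live output.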