Transcriber

Component/part description

This module is responsible for generating the transcription, generally using a third-party service or API such as IBM Watson.

It is composed of two main components:

  • Audio converter: converts audio or video input to the audio specification required by the STT API.

  • STT SDK: sends the audio to the STT API/service and receives a time-coded transcription.
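As an illustration of the audio converter step, here is a minimal sketch that builds an ffmpeg command to extract and normalise the audio track. It assumes ffmpeg is the conversion tool and that the STT service accepts 16 kHz mono 16-bit PCM WAV; neither is stated in this page, so check the chosen service's documentation for its actual requirements. The function name `build_ffmpeg_args` is hypothetical.

```python
from typing import List


def build_ffmpeg_args(
    input_path: str,
    output_path: str,
    sample_rate: int = 16000,  # assumed target; depends on the STT service
    channels: int = 1,         # assumed mono; depends on the STT service
) -> List[str]:
    """Build an ffmpeg command that extracts and normalises the audio track."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", input_path,          # audio or video input
        "-vn",                     # drop any video stream
        "-ac", str(channels),      # number of audio channels
        "-ar", str(sample_rate),   # sample rate in Hz
        "-c:a", "pcm_s16le",       # 16-bit little-endian PCM
        output_path,
    ]


args = build_ffmpeg_args("interview.mp4", "interview.wav")
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine where ffmpeg is installed.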

With an extra:

  • Speaker diarization can either happen at the STT API level or as a separate module to be interpolated with the transcription.
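When diarization runs as a separate module, its speaker turns have to be merged with the time-coded words. The sketch below shows one possible approach: label each word with the speaker whose turn overlaps it the most. The data shapes (`text`, `start`, `end`, `speaker` keys, times in seconds) and the function name `assign_speakers` are assumptions for illustration, not the project's actual format.

```python
from typing import Dict, List


def assign_speakers(words: List[Dict], turns: List[Dict]) -> List[Dict]:
    """Label each word with the speaker whose turn overlaps it the most."""
    labelled = []
    for word in words:
        best_speaker, best_overlap = None, 0.0
        for turn in turns:
            # Length of the intersection of the two time ranges (negative if disjoint).
            overlap = min(word["end"], turn["end"]) - max(word["start"], turn["start"])
            if overlap > best_overlap:
                best_speaker, best_overlap = turn["speaker"], overlap
        # Words that overlap no turn keep speaker=None.
        labelled.append({**word, "speaker": best_speaker})
    return labelled


words = [
    {"text": "hello", "start": 0.0, "end": 0.4},
    {"text": "there", "start": 0.5, "end": 0.9},
    {"text": "hi", "start": 1.2, "end": 1.5},
]
turns = [
    {"speaker": "S0", "start": 0.0, "end": 1.0},
    {"speaker": "S1", "start": 1.0, "end": 2.0},
]
labelled = assign_speakers(words, turns)
```

If the STT API already returns speaker labels, this step is unnecessary.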

And optional:

  • SRT parsing: allow SRT as input, in case the transcription comes from elsewhere. Can use the srtParserComposer module to refactor.

  • Plain text as input: if you already have the transcription, use something like Gentle to re-align it and generate the transcription JSON.
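To make the SRT input option more concrete, here is a minimal sketch of parsing SRT text into timed segments. It is not the srtParserComposer module; the function name `parse_srt` and the output keys (`start`, `end`, `text`) are assumptions for illustration. Note that SRT only carries cue-level timings, so word-level timecodes would still need an alignment step.

```python
import re
from typing import Dict, List

# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm" (a dot is tolerated as the separator).
TIMECODE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt_text: str) -> List[Dict]:
    """Parse SRT text into a list of {start, end, text} segments (times in seconds)."""
    segments = []
    normalised = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalised):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMECODE.search(line)
            if match:
                groups = match.groups()
                segments.append({
                    "start": _seconds(*groups[:4]),
                    "end": _seconds(*groups[4:]),
                    # Everything after the timecode line is cue text.
                    "text": " ".join(part.strip() for part in lines[i + 1:]),
                })
                break
    return segments


sample = """1
00:00:01,000 --> 00:00:02,500
Hello there.

2
00:00:03,000 --> 00:00:04,250
Second line
wraps here.
"""
segments = parse_srt(sample)
```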

It was initially prototyped as a standalone app to test the quality of speech to text. See Transcriber.

Implementation options considered

NA

Current implementation

See component

What needs refactoring

Perhaps look into the compositor pattern to bring together the components of this module.
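One possible reading of this suggestion is to treat each component as a step with a common interface and compose them into a single pipeline. The sketch below is only that interpretation, not the current implementation: the step functions are hypothetical stand-ins that return canned data, and the `job` dictionary shape is an assumption.

```python
from typing import Callable, Dict, List

Step = Callable[[Dict], Dict]


def compose(steps: List[Step]) -> Step:
    """Chain steps so that each one receives the previous step's output."""
    def run(job: Dict) -> Dict:
        for step in steps:
            job = step(job)
        return job
    return run


# Hypothetical stand-ins for the real components.
def convert_audio(job: Dict) -> Dict:
    return {**job, "audio": job["input"] + ".wav"}


def speech_to_text(job: Dict) -> Dict:
    return {**job, "words": [{"text": "hello", "start": 0.0, "end": 0.4}]}


transcribe = compose([convert_audio, speech_to_text])
result = transcribe({"input": "interview.mp4"})
```

With this shape, optional steps such as diarization or SRT parsing could be added to or removed from the list without changing the other components.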
