[OpenSIPS-Users] OpenSIPS and Speech-to-Text

Bogdan-Andrei Iancu bogdan at opensips.org
Wed Oct 6 09:23:38 EST 2021


Hi Mark,

But with the media_exchange module you can "fork" (as a new SIP call) 
just one of the RTP streams. Of course, the STT engine must be able to 
accept plain SIP calls.
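
A rough, untested sketch of that idea in the script, assuming that 
media_fork_to_uri() from the media_exchange module takes the destination 
URI and the leg to fork, and that the call's media already goes through 
rtp_relay. The STT URI is made up, so check the module docs for your 
version:

```
# assumption: dialog, rtp_relay and media_exchange are loaded, and the
# call's RTP is already anchored through rtp_relay
# hypothetical STT endpoint; "caller" is meant to fork only the
# caller's audio
media_fork_to_uri("sip:stt@stt.example.com:5060", "caller");
```

The engine would then see that one direction as an ordinary incoming 
INVITE.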

Regards,

Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
   https://www.opensips-solutions.com
OpenSIPS eBootcamp 2021
   https://opensips.org/training/OpenSIPS_eBootcamp_2021/

On 9/17/21 3:50 PM, Mark Allen wrote:
> Thanks for that, Johan - I hadn't thought about that aspect. It's all 
> theoretical at the moment, but IBM Voice Gateway, at least, does claim 
> to be able to handle it using SIPREC - so maybe they are confident 
> about their ability to differentiate between caller and callee in a 
> single stream?
>
>     "The voice gateway provides the ability to transcribe caller and
>     callee (e.g. contact-center agent) audio from an active phone call
>     in real time using the SIPREC protocol." -
>     https://www.ibm.com/docs/en/voice-gateway?topic=gateway-about-voice
>
>
> On Fri, 17 Sept 2021 at 10:33, johan <johan at democon.be> wrote:
>
>     The issue with siprec (based on rtpproxy) is that you get only 1
>     stream containing the voice in both directions (caller to callee
>     and callee to caller), which will give the ASR a hard time :-).
>     I know that rtpengine has something similar to siprec, but I
>     don't know the details.
>
>
>     Bottom line, in my opinion, you need to have 2 separate streams
>     before you can start STT.
>
>
>     wkr,
>
>
>     On 17/09/2021 11:04, Mark Allen wrote:
>>     I'm just starting to look at Speech-to-Text (STT) processing for
>>     calls - initially recordings but moving on to real-time. I would
>>     see this working along the lines of either:
>>
>>     - a call is recorded, and when the call ends an event is
>>     triggered to initiate transcription of the recording
>>     - a call starts, the RTP is forked to the STT engine, which
>>     returns a real-time transcription
>>
>>     I can see that with OpenSIPS, the SIPREC and Media Exchange
>>     modules allow for forking of the RTP, providing a means of
>>     sending the data for processing, but is anybody actually doing
>>     this? If so, what has been your experience? Is there a toolset
>>     that works well with this (e.g. IBM Voice Gateway, Google, Amazon
>>     etc)?
>>
>>     _______________________________________________
>>     Users mailing list
>>     Users at lists.opensips.org
>>     http://lists.opensips.org/cgi-bin/mailman/listinfo/users
