Highly accurate speech recognition is one of the most exciting results of the Deep-Learning revolution. Until recently, such accuracy could only be achieved within severely restricted domains, by limiting the ‘vocabulary’ of words to be recognized. Through the magic of Deep Learning, we can now recognize speech in virtually unrestricted domains with sufficient accuracy to support commercial applications.

At Saigen, we develop large-vocabulary recognizers that are optimized for our customers’ specific needs. In a call centre, for example, the requirement may be for highly accurate recognition of certain key elements (such as those related to legal compliance) while maintaining good coverage of the broad range of other topics that may occur in a telephone conversation. A media monitoring company, on the other hand, may need adaptable keyword recognition in several languages for which Deep-Learning-based speech recognition is not available.

Try our speech recognizer

To request a free trial visit

Supported languages: SA English, isiZulu, Sesotho and Afrikaans.


SAIGEN is part of the Alphawave group, which employs 230+ people, 130+ of whom are engineers.

We develop customised large-vocabulary speech-recognition systems for commercial applications. As well-published academics, we have also collaborated with international partners to solve interesting and challenging problems. We were part of the consortium that won the recent IARPA-sponsored Babel program for spoken term detection, we worked with Google on Voice Search for the South African languages, and we were the first to build speech recognition systems in all 11 of South Africa’s official languages.



Speech analytics in the call centre
In a large call centre, thousands of hours of speech are recorded each day. That speech is a potential treasure trove of information on topics such as the following:

Are operators complying with the legal and other requirements of their respective tasks?
Are there identifiable operator behaviours that correlate with successful call outcomes?
Are customers raising common issues that are not known within the rest of the business?

In many call centres such issues are partially addressed by quality-control staff who listen to a small sample of the recorded calls. However, such QC is both expensive and limited in scope, since it is difficult for each QC auditor to keep track of the statistics of subtle patterns that occur in highly variable telephone conversations.

We therefore offer speech-recognition-based speech analytics that is optimized for the dialogues that occur in a particular call centre. This solution, which can be cloud-based or deployed on-premises, is surprisingly cost-effective and can discover patterns with significant business impact in both customer and operator speech turns.

Monitoring web-based and broadcast media
Podcasts, radio and TV broadcasts, user-generated content in social media … there is a massive universe of spoken content that is available to the public but hard to utilize for business purposes. However, by tailoring a speech-recognition platform to transcribe such content, it is possible to develop insights into public perceptions and media coverage on a large scale.

Our ability to monitor spoken media content in a variety of languages is useful for purposes such as verifying advertisement transmissions, analyzing editorial content and understanding social trends.


Dr Charl van Heerden, CEO & Director
Prof Etienne Barnard, Director
Dr Frans Meyer, Non-executive Director
CJ Meyer, Linguist
Pieter Uys, Speech engineer
Felix McGregor, Speech engineer
Arnold Pretorius, Speech engineer
Simone Wills, Speech scientist




35 Brickfield Rd
Brickfield Canvas
Salt River
Cape Town 7925