13.11.2020

Where artificial intelligence surpasses human beings

The Karlsruhe Institute of Technology (KIT) has developed an AI that converts spoken words into text within a very short time. This puts it ahead of humans in speech recognition.

The system was developed by researchers at KIT. In measurements, it converted spoken English into text within 1.75 seconds at an error rate of 5 percent. For comparison, humans reach an error rate of 5.5 percent in such measurements, so in this respect the technology performs better than a human being.

The measurements were based on the standard benchmark “Switchboard conversational corpus”, which contains about 2500 conversations with about 500 speakers.
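To make the comparison between 5 and 5.5 percent more concrete, here is a minimal sketch of how such an error rate is commonly computed. It assumes the figure is the word error rate (WER), the metric conventionally reported on Switchboard: the number of word substitutions, deletions and insertions needed to turn the system output into the reference transcript, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative only and are not taken from the KIT system.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch only; real evaluations also apply benchmark-specific
    text normalisation before scoring.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing in output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one substitution and one deletion in six reference words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.1%}")
```

Under this definition, an error rate of 5 percent means roughly one wrong, missing or extra word per 20 reference words, while 5.5 percent corresponds to roughly one per 18.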

The challenges of spontaneous speech

Spontaneous speech is difficult to recognise because it is unplanned: speakers use filler words such as “uh” and break off sentences midway.

The technology already forms the core of the Lecture Translator, which has been used in lectures in Karlsruhe since 2012. This has allowed the team to gather experience with the challenges of spontaneous speech.

The research team uses neural networks with an encoder-decoder architecture and combines the LSTM (long short-term memory) approach with the transformer approach. Dr. Sebastian Stüker, group leader for multilingual speech recognition, comments: “Our strengths lie in basic technology. […]

In addition, we have modified the minimisation of the loss function in the training of neural networks and thus reduced the latency.”

Spontaneous speech poses challenges to the developed technology. (Source: iStock / skynesher)

This enabled the researchers to reduce the processing time for German speech to 1.3 seconds while staying below the human error rate.

 

The developers already report further successes, but no comparable benchmark exists for these results.

Cover image source: iStock / alvarez
