The German company, which specializes in translation, had for several years lacked a service that could instantly display translated subtitles from audio sources. Many companies, including Google and Microsoft, have already entered this field, and DeepL is now finally filling the gap.
In its blog post, DeepL cites several studies to support its case. According to the NBER (National Bureau of Economic Research), 33.8% of the time spent in meetings is lost to comprehension problems between participants. Axios HQ estimates that this costs businesses up to $54,860 per employee per year, again because of misunderstandings and wasted time.
DeepL of course aims to shrink both this bill and this divide with its new product. Speaking to TechCrunch, the company explains that this has been its most frequent customer request since 2017. Why did it take so long? Because DeepL developed its own language model rather than relying on an existing one such as GPT.
DeepL Voice is therefore aimed primarily at businesses. The new service comes in two versions: one for meetings, the other for conversations. In the first, boxes appear next to the participants and translate what they are saying. DeepL puts heavy emphasis on the time saved when meetings bring together people from all over the world.
In the second, a phone serves as the interface between two people. Here again, the intended use cases are all professional, for example when an employee speaks with a foreign client.
TechCrunch's reporting also reveals several important points. First, DeepL Voice is not available as an API that applications can integrate as they wish. Instead, DeepL works directly with other companies to integrate its technology. For meetings, the only product to benefit so far is Microsoft Teams. There are currently no plans for other integrations, such as a browser extension.
DeepL also says that nothing is saved. Voice data is sent to its servers, but according to the company nothing is stored there, whether for archiving or for training its models. As our colleagues note, not everyone may be comfortable with the idea of everything they say being sent to a company for analysis. DeepL nevertheless sought to reassure users, saying that compliance with the GDPR and other comparable regulations is being worked out with its clients.
With Copilot+ PCs, Microsoft offers automatic translation, but only into a handful of languages for now. That solution, which relies on the machines' NPU, nevertheless has the advantage of running locally.