Oleg Voronko

Voice Recognition in VR/AR for Business. Part 1: Threats


Intro


Reality-altering technology is a fantastic phenomenon that can take you into another dimension in an instant. With VR tools, you can immerse yourself in a jungle full of wild predators or step into a futuristic game like the one in the movie Ready Player One. With AR, you can interact with virtual objects placed in your real surroundings: for example, catch Pokémon right at home or try on new Nike sneakers without going to the store.

The incredible opportunities VR and AR open up for regular users make it seem that little is left to improve. Many people believe cross-reality inventions have reached the peak of their usefulness. However, their application isn’t limited to everyday activities. VR and AR show their real power in tasks related to running a business. When it comes to work and complex projects, it turns out that these technologies still have room to grow. While providing advanced visual, auditory, motor, and tactile sensations, VR and AR largely neglect the crucial fifth element: voice.

To provide a truly natural, well-functioning virtual interface controlled by voice, developers must take great care with the analytical algorithms that interpret the user’s speech and correctly map it to the corresponding actions and events. Since deep learning was introduced into voice recognition, the number of word recognition errors has dropped drastically. Despite all the improvements, these systems still can’t match human-level recognition. Speech recognition can fail in many ways, which fall into three major groups.

Threat #1: Not as smart as a human

The central pitfall voice recognition technology faces when analyzing human speech is misinterpreting the request. This can happen when you say a phrase unclearly or when the words you use have many similar-sounding counterparts. Naturally, such a mismatch results in the wrong action or no reaction at all. Accents and background noise make the problem worse. Recognizers are also not robust enough to reverberation, which leads to missed commands or delayed responses.


Another flaw is the lack of smart algorithms for processing complex word combinations or words that are missing from the database. If you give a command that wasn’t programmed into the system, you won’t get any result. Only with self-learning AI/ML technologies do high-quality interaction with virtual objects and fully free-form conversations become possible.
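To make both failure modes concrete, here is a minimal Python sketch of how a voice interface might map a recognized transcript to a fixed command list. It is an illustration only, not how any particular recognizer works: the command list, the `interpret` function, and the `threshold` and `margin` values are all assumptions, and plain text similarity from the standard `difflib` module stands in for real acoustic scoring.

```python
from difflib import SequenceMatcher

# Hypothetical command vocabulary for a VR scene (an assumption for illustration).
COMMANDS = ["open menu", "close menu", "turn left", "turn right", "turn light"]


def similarity(a: str, b: str) -> float:
    """Text similarity in [0, 1]; a stand-in for a recognizer's own scoring."""
    return SequenceMatcher(None, a, b).ratio()


def interpret(transcript, commands=COMMANDS, threshold=0.8, margin=0.1):
    """Return ("match", cmd), ("ambiguous", [cmd1, cmd2]) or ("unknown", None)."""
    # Normalize case and whitespace before comparing.
    text = " ".join(transcript.lower().split())
    scored = sorted(((similarity(text, c), c) for c in commands), reverse=True)
    best_score, best = scored[0]
    if best_score < threshold:
        # Nothing in the vocabulary is close enough: the "missing command" case.
        return ("unknown", None)
    if len(scored) > 1 and best_score - scored[1][0] < margin:
        # Two commands are nearly equally close: the "similar-sounding" case.
        return ("ambiguous", [best, scored[1][1]])
    return ("match", best)
```

With these assumed settings, a transcript such as "turn night" scores equally against "turn light" and "turn right", so the sketch reports an ambiguity instead of guessing, while a request like "paint the ceiling blue" falls below the threshold and is reported as unknown rather than silently ignored.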

Threat #2: Not suitable for mission-critical actions


Because current voice technologies don’t have a 100 percent recognition rate, we can’t rely on them for tasks of high importance. For example, even if VR gradually makes its way into medicine, and into surgery in particular, controlling a surgical intervention by voice is unacceptable because human life is at stake. Or take a less severe case, such as gaming: misrecognizing a player’s speech can cause a failed mission or a downgrade. That’s how users get frustrated.
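One common way to soften this risk is to never let voice alone trigger an important action. The Python sketch below is a rough illustration under stated assumptions: the `Recognition` type, the `CRITICAL` set, and the 0.9 confidence cutoff are all hypothetical, and the confidence value is assumed to come from whatever recognizer is in use. The second function shows why imperfect accuracy adds up quickly, assuming each command is recognized independently.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    command: str
    confidence: float  # 0.0-1.0, as reported by a hypothetical recognizer


# Hypothetical set of actions that must never run on voice alone.
CRITICAL = {"delete project", "submit order"}


def decide(rec, critical=CRITICAL, min_confidence=0.9):
    """Return "reject", "confirm" or "execute" for a recognized command."""
    if rec.confidence < min_confidence:
        return "reject"   # too uncertain: ask the user to repeat
    if rec.command in critical:
        return "confirm"  # require an explicit, non-voice confirmation
    return "execute"


def chance_all_correct(per_command_accuracy, n_commands):
    """Probability that n independent commands are all recognized correctly."""
    return per_command_accuracy ** n_commands
```

For instance, at an assumed 95 percent per-command accuracy, `chance_all_correct(0.95, 20)` is roughly 0.36, so a session of twenty commands is more likely than not to contain at least one recognition error.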

Threat #3: Inappropriate use of voice recognition technology

The third threat stems not from technical imperfection but from people’s misunderstanding of how to use the technology: misused voice recognition can break VR immersion too. The primary goal of speech in VR is to strengthen the feeling of presence. If you misuse verbal commands, you risk getting no enjoyment from your virtual trip.

For example, imagine that you modeled a 3D prototype of your future house and decided to walk through it in VR. You walk up to a door and, instead of opening it by hand, you say, “Open the door.” Sure, the door will open, but what do you gain in the long run? A door is a material object designed to be interacted with physically, not verbally. Keep such peculiarities in mind during your VR session to decide accurately how and when to give voice commands. This is the key to a complete, natural sense of presence.


© 2019, Phonal Technologies