Feature

Article for Smart Gorillas - Speech technologies: the next generation

Networks & Network Services

Jennifer Axelrad, Vice President Marketing, SVOX

Speech technologies are far from new – for many years now we’ve heard about solutions that can recognise your speech, or conversely convert text into speech.  But today’s technologies are unrecognisable compared with early approaches.  Gone are the stilted conversations, the challenges of making yourself understood by a machine and the feeling that responses are being read out by a robot.

In part, this change comes down to advances in technology – it is now possible to create very natural interactions with speech systems.  For example, natural language understanding eliminates the need for specific voice commands, enabling consumers to use their natural, everyday speech to control their devices instead of being guided through menus.  However, the change is also down to widespread access to speech solutions.

Mobile phones are an integral part of most of our lives – by 2004 there were more mobile phones in the UK than people – and many of us find it hard to operate without them.  Advances in speech, such as the ability to have an email read aloud or to use voice to input a search query while on the move, eyes and hands free, have added a whole new dimension.

Text-to-speech and speech recognition are now regular features of mobile phones and recent reviews of handsets running Android have highlighted that users are impressed with the additional functionality.  This marks a positive shift in consumer attitudes towards speech solutions, leaving behind pre-conceived notions of poor accuracy, quality and ease of use.  Times truly have changed and speech solutions are now recognised as vital, everyday tools for use in cars, on mobile phones and other consumer electronics devices.  Speech technology has moved away from being merely a gimmick and into the mainstream, quickly becoming an essential tool to improve the usability of mobile devices, increase productivity while on the go and assist with driver safety.

Crucially, speech technologies are now to be found embedded in a range of commonly used devices – not just mobile phones, but e-book readers, satellite navigation systems and in-car infotainment systems.  The automotive sector is an obvious user of speech.  Tough laws govern the use of mobile phones while driving, and the ability to change the radio station, redirect the sat nav or phone a restaurant to say you’re running late, all while keeping your hands on the wheel and your eyes on the road, is both more convenient and safer.

Recent research from motoring organisation the RAC revealed that 50% of motorists admitted to checking their phone, 21% were likely to read a social media alert and 31% confessed to texting at the wheel.  As well as the obvious safety implications, these figures really highlight how attached we’ve become to our phones.  Those of us who are addicted to Twitter want to tell the world what we’re doing even if that involves sitting in a traffic jam on the M25.  The ability to do this without endangering ourselves or other road users is a major advancement.

Speech solutions have evolved so significantly that users no longer have to learn specific commands to control their devices.  Researchers have been developing natural speech technology which enables users to interact with their phone, navigation or in-car infotainment system as though they were interacting with another human.  Developments in consumer gadgetry are also adding to the demand for speech.  The sophistication is great: you can run a business from one small device.  But the more complex these devices become, the more difficult it can be to navigate and manage the information held on them.  Similarly, the iPad and other tablet devices look cool and are a whole world away from the heavy old laptops we used to carry, but without traditional keypads or buttons, in practice they can be hard to handle.  As a result, many developers are seeing the benefits of implementing speech technologies to streamline searching and to access music titles, as well as the more traditional uses of making and accepting calls and having texts, e-mails and voice mails read aloud.

This increased acceptance and prevalence of speech solutions reflects the recent advances in the technology.  Much of the improvement has come because developers are finding that accuracy can be enhanced further by focusing on a best-in-class user interface.  Offering users a tailored solution, specifically designed to meet the particular requirements of each device – whether a mobile handset, satnav system, or tablet computer – makes speech technology a practical reality for many of us today.