AT&T connecting your world

AT&T Speech API — Voice Biometrics


Biometrics: identifying, distinctive, measurable characteristics. When you think of biometrics, you might picture the latest action thriller, where the protagonist can gain access to critical information only through fingerprint and retina scans. Movies aside, the amount of personal information accessed through online, digital and mobile logins on any given day is staggering. At the same time, people are more likely to leave home without their wallets than without their cell phones. And let's not forget the growing popularity of using the power of your voice to control your devices.

All of these trends lead to an incredible opportunity for biometrics to become more than just the stuff you see on the big screen, particularly in the area of voice biometrics. And AT&T is making it possible by introducing a Voice Biometrics API into its recently announced Alpha API program. Powered by the AT&T Watson℠ speech engine – and designed to eventually be a part of the broader Speech API – the voice biometrics capability will empower developers to incorporate this type of authentication technology into their apps.


How did the idea hatch?

AT&T Watson℠ speech recognition technology is AT&T's pioneering speech services platform and has powered advanced speech services in the marketplace for decades. It reflects research and development in speech technologies that has led to more than 600 U.S. patents and additional patent applications. AT&T exposed Watson to third-party developers via its Speech API in July 2012. The Speech API launched with seven contexts for developers: Web search, business search, voicemail to text, SMS, question and answer, TV and generic. It has since gained enhancements such as new speech contexts tuned for gaming and social media apps, along with updates to the generic context that enable text-to-speech support in both English and Spanish.
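To make the idea of "contexts" concrete, here is a minimal sketch of how a client might assemble a context-aware speech-to-text request. The endpoint URL, header names and context identifiers below are illustrative assumptions, not the documented AT&T Speech API interface; the point is simply that the caller picks one of the available contexts so recognition is tuned for that domain.

```python
def build_speech_request(context, audio_format="audio/wav", access_token="YOUR_TOKEN"):
    """Assemble the pieces of a hypothetical context-aware speech request.

    NOTE: the URL, header names, and context spellings are placeholders
    invented for illustration; consult the real API docs for actual values.
    """
    # The seven contexts the article lists, in a hypothetical wire spelling.
    contexts = {"WebSearch", "BusinessSearch", "VoicemailToText",
                "SMS", "QuestionAndAnswer", "TV", "Generic"}
    if context not in contexts:
        raise ValueError(f"unknown speech context: {context}")
    return {
        "url": "https://api.example.com/speech/v3/speechToText",  # placeholder
        "headers": {
            "Authorization": f"Bearer {access_token}",  # OAuth-style bearer token
            "Content-Type": audio_format,               # format of the audio payload
            "X-SpeechContext": context,                 # selects recognition tuning
        },
    }
```

A gaming app would pass a gaming-tuned context here, a dictation app the generic one; the audio payload and transport are unchanged, which is what makes adding new contexts cheap for developers.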

As part of AT&T’s Alpha API program – which works to bring new APIs to market faster – new contexts for the Speech API are in development, including voice biometrics, translation capabilities, text-to-speech, inline grammar and more.

The future

While we are continuing to place more of AT&T Watson’s speech capabilities in the cloud, we are also designing them to have multiple use cases that expand the possibilities for what developers can create on top of our intelligence and, ultimately, put into customers’ hands. For voice biometrics, these include:

  • Personalization. Beyond security and authentication, voice biometrics can also be used for personalization. For instance, if an app recognizes your voice, it can pull up your preferences and settings rather than those of other users who share the app or service.
  • Customer Care. In the future this technology could be used to recognize customers who have called customer care before and immediately pull up their account history.
  • Facial recognition. Combining voice with facial recognition not only strengthens the ability to create secure environments but also allows for increased personalization. You can imagine how this type of technology could be combined with in-home systems like AT&T’s Digital Life or used in a connected-car environment.
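The authentication and personalization use cases above both reduce to the same core operation: enroll a voiceprint once, then compare later utterances against it. The toy sketch below assumes voiceprints arrive as fixed-length feature vectors (how a real engine like Watson extracts them is out of scope) and accepts a claimed identity only when cosine similarity clears a threshold; the class, names and threshold value are all invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

class SpeakerVerifier:
    """Toy text-independent verifier: one enrolled voiceprint per user.

    Purely illustrative; a production system would use many enrollment
    samples, calibrated thresholds, and anti-spoofing checks.
    """

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.voiceprints = {}  # user_id -> enrolled feature vector

    def enroll(self, user_id, features):
        self.voiceprints[user_id] = features

    def verify(self, user_id, features):
        """Accept the claimed identity only if similarity clears the threshold."""
        enrolled = self.voiceprints.get(user_id)
        if enrolled is None:
            return False
        return cosine_similarity(enrolled, features) >= self.threshold
```

For the personalization scenario, the app would call `verify` (or score against every enrolled user) before loading a profile; for customer care, the enrolled print would be keyed to the account created on a previous call.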

About the researchers

Ilija Zeljkovic is a principal member of technical staff at AT&T Labs who specializes in research on robust, text-independent speaker identification and verification systems. His recent work focuses on strong authentication for personal devices, in particular on augmenting acoustic features with face and lip features captured by the device. Zeljkovic began working for AT&T in 1990 and worked at two small start-up companies, Telelogue, Inc. and SpeechCycle, before returning to AT&T Labs in 2006.

Amanda Stent is a computational linguistics researcher at AT&T Labs who focuses on natural language generation in interactive systems. Her recent work includes speech-driven applications, assistive technology and adaptation in dialogue systems, including studies of natural language generation and language modeling. Stent was previously a faculty member at Stony Brook University and is co-author of the book “The Princess at the Keyboard: Why Girls Should Become Computer Scientists.”
