AT&T ups the ante in speech recognition

NEW YORK--If you've ever been frustrated using a voice-activated customer agent or have scratched your head while reading an unintelligible voice-to-text message, AT&T says help is on the way.

The company, which has invested more than one million research hours over the past 20 years in speech and language recognition technology, says that it's developed technologies that will not only make these traditional voice-activated services more accurate but will also extend voice activation to other modes of communication.

Earlier this week, AT&T Labs researchers showed off some of the technologies they've been working on at their labs here. Most of the applications showcased are not yet ready for prime-time commercial use, and researchers said they have no idea when these services will find their way into products. But bits and pieces are already in products developed by AT&T and the company's partners.

For decades, AT&T has been at the forefront of speech recognition and natural language technology research. It's developed a core technology platform, known as Watson, which is a cloud-based system of services that not only identifies words but also interprets meaning and context to deliver more accurate results. The system itself is built on servers that model speech and compare it to recorded voices.

Watson is an evolving platform: as it takes in more data, it adapts and learns, so its accuracy keeps improving. It can also cross-reference data, so that speech becomes an input for reaching all kinds of communication and data.

"We are really on the cusp of a technology revolution in speech and language technology," said Mazin Gilbert, executive director of speech and language technology at AT&T Labs. "It's no longer about simply trying to get the words right.
It's about adding intelligence to interpret what is being said and then using that to apply to other modes of communication, such as text or video."

Of course, AT&T is not alone in its quest to develop more intelligent voice-activated technologies. IBM and Microsoft have each invested heavily in this area for years. Microsoft has already incorporated some speech recognition technology into the Xbox Kinect. And Google, a relative newcomer to the field, is also making headway with voice recognition built into its Google Voice product, which is now available on the iPhone. But AT&T's researchers say the intelligence built into the Watson engine sets their applications apart from these others.

One of the demonstrations the company showed at its lab in New York was the iRemote, an application that turns an iPhone or some other smartphone into a voice-activated TV remote. The application allows users to speak normal sentences asking to search for specific shows, actors, or genres.

[Photo credit: Marguerite Reardon/CNET]

For example, someone might ask the app to search for reality shows on Thursday evening, and the app will generate a list of all the reality shows starting at 8 p.m. on Thursday. Users will likely have to scroll through a short list of titles, but the search has been greatly refined from the hundreds of shows that would have to be searched otherwise.

Voice-activated remotes already exist. But AT&T's technology goes far beyond what's currently available, said Michael Johnston, a principal researcher at AT&T Labs. Many of these other applications respond to prerecorded commands. AT&T's application not only identifies words but also uses other principles of language, such as syntax and semantics, to interpret and understand the meaning of the request. The system is designed to get more accurate over time as it learns the speech patterns of large numbers of users.

"The hardest thing in developing a service like this is populating it with a base level of understanding," he said. "Even humans make mistakes in hearing words correctly. But we're able to infer meaning from the way the question was phrased or even by understanding gestures or facial expressions."

[Photo: AT&T researchers turned an iPhone into a voice-activated TV remote. Credit: AT&T Labs]

Eventually, Johnston said, cameras could be used to read lips or gauge facial expressions, which can also help determine the intent of what's being said.

"The vision is that we have something like you'd see in 'Star Trek' or 'Minority Report,'" he said. "You shouldn't have to sit with a keyboard and type anything. Your environment should sense you, and through voice commands or gestures the devices around you should know what you're searching for or be able to initiate some other action for you."

Researchers have also been applying the Watson speech and language framework to mobile devices. Some of AT&T's technology partners, which license AT&T's core speech and language technologies, have already built commercial products.

For example, Vlingo licenses AT&T's Watson core technology and also partners with AT&T on research. Today it offers applications for Android, BlackBerry, Nokia, and iPhone smartphones. The Vlingo apps, which are often used to enable or enhance other applications, allow users to search the Web, find directions, update social networking status, and send emails and text messages to contacts simply by using voice commands.

As touch-screen devices such as the iPad emerge, AT&T has begun introducing physical gestures into the platform. Earlier this year, it introduced a research application for the iPhone that is capable of understanding both the spoken word and physical gestures.
The Speak4It app, which can be downloaded from the iTunes App Store, allows consumers to discover restaurants within a specific area, obtain directions to the nearest gas station, call their local pharmacy, and access information on a variety of local businesses. By pressing the speak button, people can say what they would like to find and have it pinpointed on a Google map. Users can also touch a point on a map and ask, "What's there?" Or they can circle a neighborhood on the map and search for something only in that specific area.

[Photo: This iPad app, which is still being developed, uses synthesized voice technology to read aloud children's stories. Credit: AT&T Labs]

In addition to understanding and correctly interpreting language, AT&T is also developing voice technology that mimics natural voices.

Its AT&T Natural Voices technology builds on text-to-speech technology so that, when text is processed through the AT&T cloud-based service, any communication can be spoken in a variety of languages, including English, German, Spanish, French, or Italian.

The technology works by accessing a database of high-quality recorded sounds that, when melded together by algorithms, create spoken phrases. AT&T demonstrated the technology with an application that reads children's storybooks aloud. The application was downloaded onto an iPad, and it used synthesized voice technology to read the story of Goldilocks and the Three Bears. The application highlighted each word as it was read, with each character speaking in a different voice. While the voices in the story still sound somewhat mechanical, the goal is that over time the voices will match the intonations and speech patterns of natural voices.

"The whole idea behind what we're doing with this voice and multimodal technology is to develop an intelligent virtual agent that is with you all the time, whether you're at home or out in the world," Gilbert said. "When you're out and about, it helps you look for restaurants, it knows to send an SMS to your mother on her birthday, it knows you go to Dunkin' Donuts every day and sends you a virtual coupon on Monday morning, and it can speak to you when you need something read to you."


Open apps faster via Windows' command line.

I saved myself a few keystrokes by installing a donationware utility that I used to assign a keyboard shortcut to the Command Prompt. Start by downloading and installing Clavier+, a keyboard-shortcut utility from Guillaume Ryder. Open the program, click the blue plus sign on the left side of the main screen, and navigate to Accessories > Command Prompt. Click in the Shortcut field, and press your preferred keystroke combination, making sure not to enter one you already use for some other purpose. (One that is available and easy for me to remember is Ctrl+Alt+C.) After you make your selection, click OK, and you'll see your new shortcut in the list at the top of the main Clavier+ window. Click OK once more to close the program, and now you've got access to the Command Prompt via the keyboard.

[Screenshot: Assign the keystroke combination of your choice in Clavier+'s Shortcut dialog box.]

You may be wondering why you can't simply right-click cmd.exe in Windows Explorer (it's in the C:\Windows\System32 folder), choose Create Shortcut, and then assign a keystroke combination to that shortcut by right-clicking it, choosing Properties > Shortcut, and entering the keys in the Shortcut key field. Windows won't let you. I don't know why, and I can't find an explanation for the restriction. No matter what key combination I entered, I couldn't get it to open the Command Prompt window. For some reason, Clavier+ had no problem opening the window via the keystroke combo I assigned. Go figure.

Launch apps from the command line

With the Command Prompt open, type start winword and press Enter to open Microsoft Word, start excel to launch Excel, and start mplayer2 to open the old version of Windows Media Player (start wmplayer launches the newer release).
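If you find yourself typing the same start commands over and over, they can also be scripted. Here is a minimal sketch in Python, assuming Python is installed on the PC. The helper names build_start_command and launch are made up for this example and are not part of Windows or of any tool mentioned in this post.

```python
import os
import subprocess


def build_start_command(app_name):
    """Return the argument list that runs `start <app_name>` through cmd.exe."""
    if not app_name or any(ch.isspace() for ch in app_name):
        raise ValueError("expected a single program name such as 'winword'")
    # `start` is built into cmd.exe, so it has to be run via `cmd /c`.
    # The empty string fills the optional window-title slot that `start`
    # reads from its first quoted argument.
    return ["cmd", "/c", "start", "", app_name]


def launch(app_name):
    """Try to launch the program on Windows; do nothing on other systems.

    Returns True if a launch was attempted, False otherwise.
    """
    if os.name != "nt":
        return False
    subprocess.run(build_start_command(app_name), check=False)
    return True


if __name__ == "__main__":
    # Show the command lines without launching anything.
    for name in ("winword", "excel", "wmplayer"):
        print(subprocess.list2cmdline(build_start_command(name)))
```

Calling launch("winword") on a Windows machine should have the same effect as typing start winword at the prompt. On any other system it simply returns False.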
Here are some other application file names you might find handy:

Internet Explorer: iexplore
Microsoft Outlook: outlook
Microsoft PowerPoint: powerpnt
Windows Explorer: explorer (or press the Windows key and E to open an Explorer window with My Computer highlighted)
Calculator: calc
Magnifier: magnify
Notepad: notepad
Paint: mspaint
Registry Editor: regedit
System Configuration Utility: msconfig
Tweak UI: tweakui
Windows Movie Maker: moviemk
WordPad: write

Most other applications can be launched simply by typing start and their name, such as "firefox", "thunderbird", "photoshop", "acrobat", and "itunes". To close the Command Prompt window, type exit and press Enter.

Of course, you can do much more from the command line than launch applications. The Microsoft TechNet site lists the commands available for system-management tasks, with descriptions of how to use them. But that's a subject for a future post.

Tomorrow: Fun with Microsoft Excel's Lookup function.


Comcast CEO: We are not a dead duck

Battelle, interviewing Roberts onstage, called it "video-on-demand on steroids."

The Associated Press, referencing a briefing this week with executives at Comcast's Philadelphia headquarters, helped fill in some of the details about the service, noting that it would include such popular cable shows as HBO's "Entourage" and AMC's "Mad Men" and for now is being called "On Demand Online."

The AP said Comcast subscribers can initially watch shows and movies only on their home computers after being verified by the cable system. Online viewing, at least in the beginning, will be restricted to those who get Internet service through Comcast, not through competitors like phone companies, the AP said.

Back at Web 2.0 Summit, Roberts also said that Comcast's investments in broadband technology are, in part, what has facilitated the explosion in Web innovation.

"We're going to keep investing, because we believe there are great ideas in this room and in this country and in the world," Roberts said. "In the same way, it's unthinkable that a Google or a Yahoo or a Facebook or a Twitter would be happening if we hadn't made those investments (in broadband infrastructure) 15 years ago."

Battelle asked Roberts why he believes the U.S. lags behind in broadband technology advancements. Roberts replied, "I think that that's just not true."

(The audience laughed uncomfortably.)

"We have the same equipment (as other countries), the same wires, the same infrastructure. Why is the adoption different is a different question. It's not the availability and I don't think it's the lack of speed," he continued. "You get to digital literacy, you get to what language it's in, do you have the right PC or a PC at all... I don't believe the infrastructure providers haven't done enough."

As for Net neutrality, an issue where Comcast has been a frequent villain after imposing bandwidth caps and interfering with peer-to-peer file-sharing software, Roberts was vague.

"We welcome that discussion, that scrutiny, and we're going to be an active participant," he said. "The few limited examples, including our own, that have gotten notoriety usually get dealt with in ten seconds, and changes get made, because this is new technology."

More recently, reports have surfaced in the press that Comcast is in talks with General Electric to obtain a controlling stake in its NBC Universal property. Conveniently, GE chief Jeffrey Immelt was slated to speak later in the afternoon at Web 2.0 Summit.

"You and Jeff Immelt must have finished the NBC deal back in the green room," Battelle joked.

Roberts replied facetiously, "It's all done."