7.11.2005

 

Voice recognition breakthrough

Voice recognition breakthrough: "As for a machine, each variant of a voice is unique. That is why speech recognition programs usually require training. As a result of training, an enormous library is built up in the memory of the silicon brain, where thousands of possible options of pronunciation of the same words (for example, numerals) are stored. Having heard a word, the computer would look through the library and almost certainly something similar to the heard word will be found in it.

The approach suggested by the scientists from the Institute of Radio Engineering and Electronics, at the Russian Academy of Sciences, is rather more human than machine: a computer under the researchers' guidance filters individual peculiarities. It picks out the most basic things and rejects all immaterial ones. As a result, the machine even acquires the ability to discern individual sounds and to put together in its 'mind' familiar words from these sounds.

As a result, 1 kilobyte would be sufficient for a processor to confidently recognize all numerals and some simple commands, however pronounced (although only in Russian at the moment). Several dozen people with far-from-ideal articulation tried to confuse the quick-witted program by pronouncing numerals either in a whisper or in a voice trembling with excitement. However, the computer successfully rejected emotional frequencies as irrelevant."
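For readers who want to picture the conventional "library" approach the quote starts with, here is a minimal sketch. It assumes each spoken word has already been reduced to a short list of numbers; the feature values, the `LIBRARY` contents and the function names are all made up for illustration and are not taken from the press release or from any real recognizer.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(features, library):
    """Return the label whose stored template is closest to `features`."""
    best_label, best_dist = None, float("inf")
    for label, templates in library.items():
        for template in templates:
            d = distance(features, template)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label


# Hypothetical toy library: two stored pronunciations per numeral.
# A trained system would hold thousands of these per word.
LIBRARY = {
    "one": [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1]],
    "two": [[0.1, 0.9, 0.3], [0.2, 0.8, 0.4]],
}

print(recognize([0.85, 0.15, 0.15], LIBRARY))
```

The memory cost grows with every stored pronunciation, which is presumably the contrast the press release is drawing when it talks about 1 kilobyte.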

Comments:
There is no way that humans, in their present state, will be able to create a voice-recognition algorithm robust enough to understand human language in all of its uniqueness and complexity. Self-learning machines are the solution. I believe that Data from Star Trek:TNG is a prime example of how we will finally create a truly voice-interactive software program.
 
While I have zero doubt that voice rec technology is going to improve quickly in the future, I think this particular article should be read with a little skepticism.

It's not entirely clear what they claim to have accomplished here, and my impression is that they're describing a system which is highly accurate in distinguishing number words from each other, despite ambient noise, etc.

Which isn't very impressive or new. Every time you run into one of those "say one for main menu"-type thingies on an automated phone system, it's equivalent.

It certainly doesn't sound "fundamentally new."
 
"I believe that Data from Star Trek:TNG is a prime example of how we will finally create a truly voice-interactive software program."

Voice recognition is a precursor to Data from Star Trek.

I don't know how much of a breakthrough this is either. I find newspapers usually get technical details very, very wrong. All I can trust from the article is that there are Russian scientists working on voice recognition routines.

"The approach [...] is rather more human than machine."
That is a pretty meaningless statement.

"1 kilobyte would be sufficient for a processor..."
And how does that compare to other techniques? (And couldn't it have been phrased better?)

"However, the computer successfully rejected emotional frequencies as irrelevant."
That sounds more machine-like than human-like.

"intended for mass mobile electronic devices."
Why not for any device if it is so revolutionary? Or is there some sort of performance/processing power trade-off compared to other techniques?

"It has turned out that this is a very small part of human speech sounds - only up to 1 kHz."
This makes me suspect that the 1 kilobyte above was confused with the 1 kilohertz frequency band that the algorithm concentrates on.
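To make the "only up to 1 kHz" idea concrete, here is a rough sketch of throwing away everything above a cutoff frequency. This is just a plain DFT low-pass written for illustration; the function name `band_limit`, the 8 kHz sample rate and the test tones are my own assumptions, and nothing here is claimed to be what the researchers actually do.

```python
import cmath
import math


def band_limit(samples, sample_rate, cutoff_hz=1000.0):
    """Zero every DFT bin above cutoff_hz and return the filtered signal."""
    n = len(samples)
    # Forward DFT (naive O(n^2), fine for a short illustration).
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Bin k corresponds to frequency min(k, n - k) * sample_rate / n.
        freq = min(k, n - k) * sample_rate / n
        if freq > cutoff_hz:
            spectrum[k] = 0
    # Inverse DFT; the result is real up to rounding error.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


# Assumed example: a 500 Hz tone mixed with a 3000 Hz tone, sampled at 8 kHz.
RATE = 8000
N = 80
low = [math.sin(2 * math.pi * 500 * t / RATE) for t in range(N)]
high = [math.sin(2 * math.pi * 3000 * t / RATE) for t in range(N)]
filtered = band_limit([a + b for a, b in zip(low, high)], RATE)
```

After filtering, only the 500 Hz tone should be left. Note that a frequency band says nothing about memory size, which is why mixing up kilohertz and kilobytes would matter.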

"SOURCE: informnauka.ru press release"
This is just a press release -- the least informative and least accurate type of information source IMHO. Of course it is going to make grandiose claims.

CiteSeer has no papers by an Anciperov or Antsiperov in its database.

So this is a press release by "scientists" who are going to the media with their results instead of going through the peer-review process. Pons and Fleischmann did that in the late '80s. I suspect this story holds just as much (heavy) water.
 
Damn, these Americans are just jealous of us!

For the Motherland!
 
“We cannot live for ourselves alone. Our lives are connected by a thousand invisible threads, and along these sympathetic fibers, our actions run as causes and return to us as results.”
- Herman Melville

RSS is the way of the Future...
 
Archives © Copyright 2005 by Marshall Brain