Robotic Nation Evidence: Voice recognition breakthrough

7.11.2005

Voice recognition breakthrough

Voice recognition breakthrough: "As for a machine, each variant of a voice is unique. That is why speech recognition programs usually require training. As a result of training, an enormous library is built up in the memory of the silicon brain, where thousands of possible pronunciations of the same words (for example, numerals) are stored. Having heard a word, the computer looks through the library, and almost certainly something similar to the heard word will be found in it.

The approach suggested by the scientists from the Institute of Radio Engineering and Electronics, at the Russian Academy of Sciences, is rather more human than machine: a computer under the researchers' guidance filters individual peculiarities. It picks out the most basic things and rejects all immaterial ones. As a result, the machine even acquires the ability to discern individual sounds and to put together in its 'mind' familiar words from these sounds.

As a result, 1 kilobyte would be sufficient for a processor to confidently recognize all numerals and some simple commands, however pronounced (although only in Russian at the moment). Several dozen people with far-from-ideal articulation tried to confuse the quick-witted program by pronouncing numerals either in a whisper or in a voice trembling with excitement. However, the computer successfully rejected emotional frequencies as irrelevant."
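
For readers who want the "library lookup" idea from the first quoted paragraph made concrete, here is a minimal sketch of nearest-template matching. This is only an illustration of the conventional approach the press release contrasts itself with, not the Russian group's method. The three-number feature vectors, the `LIBRARY` contents, and the `recognize` function are all made up for the example; a real system would extract far richer features from the audio.

```python
import math

# Hypothetical template library: each word maps to several stored
# feature vectors, one per training pronunciation (toy 3-number vectors).
LIBRARY = {
    "one": [(1.0, 0.2, 0.1), (0.9, 0.3, 0.2)],
    "two": [(0.1, 1.0, 0.3), (0.2, 0.8, 0.4)],
    "three": [(0.2, 0.1, 1.0), (0.3, 0.2, 0.9)],
}


def recognize(features, library=LIBRARY):
    """Return the word whose stored template is closest to `features`."""
    best_word, best_dist = None, math.inf
    for word, templates in library.items():
        for template in templates:
            # Euclidean distance between the heard word and one template.
            dist = math.dist(features, template)
            if dist < best_dist:
                best_word, best_dist = word, dist
    return best_word


print(recognize((0.15, 0.9, 0.35)))
```

The point of the sketch is the cost: every extra speaker or speaking style adds more templates to store and search, which is why a claim of fitting numerals and simple commands into 1 kilobyte would be notable if it holds up.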


Comments:

There is no way that humans, in their present state, will be able to create a voice-recognition algorithm robust enough to understand human language in all of its uniqueness and complexity. Self-learning machines are the solution. I believe that Data from Star Trek: TNG is a prime example of how we will finally create a truly voice-interactive software program.

# posted by

Anonymous : 11:25 PM, July 11, 2005

While I have zero doubt that voice rec technology is going to improve quickly in the future, I think this particular article should be read with a little skepticism.

It's not entirely clear what they claim to have accomplished here; and my impression is that they're describing a system which is highly accurate in distinguishing number words from each other, despite ambient noise, etc.

Which isn't very impressive or new. Every time you run into one of those "say one for main menu"-type thingies on an automated phone system, it's equivalent.

It certainly doesn't sound "fundamentally new."

# posted by

gringo : 5:16 AM, July 12, 2005

"I believe that Data from Star Trek: TNG is a prime example of how we will finally create a truly voice-interactive software program."

Voice recognition is a precursor to Data from Star Trek.

I don't know how much of a breakthrough this is either. I find newspapers usually get technical details very, very wrong. All I can trust from the article is that there are Russian scientists working on voice recognition routines.

"The approach [...] is rather more human than machine."
That is a pretty meaningless statement.

"1 kilobyte would be sufficient for a processor..."
And how does that compare to other techniques? (And couldn't it have been phrased better?)

"However, the computer successfully rejected emotional frequencies as irrelevant."
That sounds more machine-like than human-like.

"intended for mass mobile electronic devices."
Why not for any device if it is so revolutionary? Or is there some sort of performance/processing power trade-off compared to other techniques?

"It has turned out that this is a very small part of human speech sounds - only up to 1 kHz."
This makes me suspect that the 1 kilobyte above was confused with the 1 kilohertz frequency band that the algorithm concentrates on.

"SOURCE: informnauka.ru press release"
This is just a press release -- the least informative and least accurate type of information source IMHO. Of course it is going to make grandiose claims.

CiteSeer has no papers by an Anciperov or Antsiperov in its database.

So this is a press release by "scientists" who are going to the media with their results instead of going through the peer-review process. Pons and Fleischmann did that in the late '80s. I suspect this story holds just as much (heavy) water.

# posted by

Anonymous : 11:45 AM, July 12, 2005

Damn, the Americans are just jealous of us!

For the Motherland!

# posted by

Anonymous : 7:22 PM, July 12, 2005
