Defeating Voice Captchas

one of the newest (though by now well-known) tricks in the captcha book is using voice.

if a user cannot make out the letters in the now too-complex captchas, which are forced on us because spammers keep finding ways to defeat captchas, he or she can click on an icon and listen to the text instead. :)

here is the earliest example of it that i know of:
http://www.notonebit.com/projects/killbot/kbaudio.php

that example is a bit amateurish, as the recording is bad and obviously not done by a girl with a sexy voice. still, the noise from the bad microphone can be filtered out or left in entirely. it doesn’t matter.

in this case each letter is played by itself. further, each letter was recorded only once.

so, how many times does one have to refresh the page and listen to the captcha before every letter can be identified by, say, an md5 hash of its audio?
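a minimal sketch of that idea, assuming each letter really is served as its own clip and that the bytes are identical on every request (which is what "recorded only once" suggests, but i have not verified it against the site). the function names, the 36-symbol alphabet and the uniform-random assumption in the estimate are mine, not the site's:

```python
import hashlib


def clip_hash(audio_bytes: bytes) -> str:
    """Fingerprint one letter clip by the md5 of its raw bytes."""
    return hashlib.md5(audio_bytes).hexdigest()


def learn(table: dict, audio_bytes: bytes, letter: str) -> None:
    """Record a clip that a human has labelled once."""
    table[clip_hash(audio_bytes)] = letter


def solve(table: dict, clips: list) -> str:
    """Look up each clip; unknown clips come back as '?'."""
    return "".join(table.get(clip_hash(c), "?") for c in clips)


def expected_clips_needed(alphabet_size: int) -> float:
    """Coupon-collector estimate: the average number of clips one must hear
    before every symbol has turned up at least once, ASSUMING symbols are
    drawn uniformly and independently."""
    return alphabet_size * sum(1.0 / k for k in range(1, alphabet_size + 1))
```

under those assumptions, `expected_clips_needed(36)` gives the average number of single-letter clips to label by hand; divide by the number of letters per captcha for a rough count of page refreshes. after that, `solve` is a plain dictionary lookup.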

even if it was all set in one audio file, even if the audio was played with (shifted to a higher pitch, for example), or even if several different voices greeted us, looking at general similarities in the audio itself would be enough to break this captcha once enough samples (not that many, really) had been harvested.
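to make "general similarities" a little more concrete, here is a crude sketch, not a tested attack. it assumes the audio has already been decoded and cut into one list of samples per letter, and it compares clips by a coarse loudness profile instead of exact bytes. the feature choice (16 slices, cosine similarity) and all names are my own illustration; a real attempt would likely need better features, especially against different voices:

```python
import math


def envelope(samples, bins=16):
    """Coarse loudness profile: mean absolute amplitude in `bins` equal
    slices, scaled to unit length so overall volume does not matter."""
    n = len(samples)
    env = []
    for i in range(bins):
        chunk = samples[i * n // bins:(i + 1) * n // bins]
        env.append(sum(abs(s) for s in chunk) / len(chunk) if chunk else 0.0)
    norm = math.sqrt(sum(e * e for e in env)) or 1.0
    return [e / norm for e in env]


def similarity(a, b):
    """Cosine similarity between the loudness profiles of two clips."""
    return sum(x * y for x, y in zip(envelope(a), envelope(b)))


def classify(samples, templates):
    """Return the label of the harvested template that looks most alike."""
    return max(templates, key=lambda label: similarity(samples, templates[label]))
```

because the profile always has the same number of slices, a clip that is simply louder, quieter, slower or faster still lands close to its template, which is the point: exact hashes stop working once the audio is modified, but the overall shape survives.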

auto-generated voice? that sounds easy to beat, but i am not an audio expert, so “sounds like” will stay as my opinion.

it’s great to finally be able to understand these annoying new captchas, but we are already getting to a point where one can’t understand the recorded speech either, as the captchas become more and more difficult in response to counter-measures from the spammers.

for information on breaking regular text-image captchas, check:
whiteacid’s post
wikipedia

for my post on new comment spam problems:
http://blogs.securiteam.com/index.php/archives/285

update from our friend valdis kletnieks, who points out what voice recognition technology can already do (not that it is really necessary at this time):

“given that voice recognition can currently do up to 160 words/minute continuous speech over a 50k word vocabulary at up to 99% accuracy, on commodity hardware, i would think that recognizing 36 letters/numbers would be a no-brainer.”

http://www.nuance.com/naturallyspeaking/

gadi evron,
ge@beyondsecurity.com.

  • Lee

    Let’s see you break it. I’ve had great success with CAPTCHAS and may try this audio one. Looks good, thanks.

  • http://www.alixaxel.com/ Alix Axel

    What about audio CAPTCHAS like Google’s, where even I can’t understand its meaning?