
Decaptcha 101

So, you probably stumbled upon this blog because you’re having some captcha problems.
 
I understand your frustration. You’ve probably had to decode hundreds of captchas using your bare-naked fingers before you asked yourself, “Is there another way to do this?”
 
The good news is that there is! The important thing here is automation. If you’re a programmer, you can code the automated process yourself. If you’re not a programmer, there are also ways to get code that bypasses that captcha, but it will require an investment that can range up to a few hundred dollars depending on the complexity required. But don’t worry, it can all be done.
 
Getting Your Code Up
 
First, you encounter a form with a CAPTCHA.

You see that there are fields that need to be filled in along with the captcha. You will need to fill those fields with data you have previously stored. The data should preferably be pseudo-random information that you have compiled from around the web and organized in a way that can be recycled. For example, if you need to create a name (first and last) you may combine:
 
First Names
Taylor
Fred
John
Mario
Luigi
 
Last Names
Smith
Rogers
Stevens
Tyler
Jones
 
Then combine them and recycle them. From the examples above you could compile: Mario Stevens, John Jones, Taylor Rogers, etc. You may also append a random alphanumeric string to the combination to create a username that’s unlikely to already exist on the system, for example, mariostevens4810, johnjones998, taylorrogers341. One way to store the data for the fields is to use simple text files with one line per element, e.g. firstname.txt, lastname.txt, field1.txt, field2.txt, etc.
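The combining step above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function names load_lines and make_identity are made up for this example, the suffix is 3 or 4 random digits (to match the sample usernames above), and the name lists are hard-coded here, although load_lines could read them from firstname.txt and lastname.txt instead.

```python
import random
import string
from pathlib import Path


def load_lines(path):
    """Return the non-empty lines of a text file, one element per line."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def make_identity(first_names, last_names, rng=random):
    """Pick a random first/last name pair and derive a username from it.

    The 3-4 digit suffix is an assumption for this sketch; any random
    alphanumeric string would work the same way.
    """
    first = rng.choice(first_names)
    last = rng.choice(last_names)
    suffix_length = rng.randint(3, 4)
    suffix = "".join(rng.choice(string.digits) for _ in range(suffix_length))
    username = f"{first}{last}{suffix}".lower()
    return f"{first} {last}", username


# Hard-coded here for illustration; in practice these could come from
# load_lines("firstname.txt") and load_lines("lastname.txt").
FIRST_NAMES = ["Taylor", "Fred", "John", "Mario", "Luigi"]
LAST_NAMES = ["Smith", "Rogers", "Stevens", "Tyler", "Jones"]

if __name__ == "__main__":
    full_name, username = make_identity(FIRST_NAMES, LAST_NAMES)
    print(full_name, username)
```

Each call returns a full name and a matching lowercase username, so the same lists can be recycled for as many combinations as you need.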
 
Afterwards you will need a way to decaptcha that captcha. (A little tongue twister: Can decaptcher decaptcha a captcha from recaptcha or should I decaptcha that captcha with deathbycaptcha?). Well, there are two ways to decaptcha while keeping things automated:
 
1. Using OCR (Optical Character Recognition) software
2. Using a CAPTCHA solving service with humans who do the decaptcha for you.
 
An OCR program reads the CAPTCHA image (like a human eye), recognizes the characters in the image and returns the text that is written in the CAPTCHA. These programs usually run on your own PC or server and must be custom built for each type of CAPTCHA.
 
CAPTCHA solving services have large teams of people in developing countries (India is the biggest) who get paid to type in the text shown in CAPTCHAs. They can decaptcha practically anything you throw at them. You upload the CAPTCHA image to their server and they return the CAPTCHA text in less than 30 seconds, depending on their load. These services are really cheap, e.g., deathbycaptcha charges $1.39 for 1,000 decaptchas and decaptcher $2 for 1,000 decaptchas.
 
After you have the text returned for your captcha, you enter it in the captcha field along with the data for all the other fields and submit the form. Forms usually have a timeout of 5-10 minutes, which is more than enough to fill in all the fields even with a very slow connection or script.
 
After you have submitted the form, you should log whether the operation was successful. For this, you can use either a database table (if you already possess the know-how to do so) or a CSV (comma-separated values) file. The important thing is to store your data in a way that can be easily accessed even months after you’ve submitted the form.
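As a rough sketch of the CSV option, the snippet below appends one row per submission to a log file. The column names (timestamp, username, form_url, success) and the function name log_submission are assumptions made for this example, not a fixed format; adjust them to whatever you need to look up later.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column layout for this sketch.
LOG_FIELDS = ["timestamp", "username", "form_url", "success"]


def log_submission(log_path, username, form_url, success):
    """Append one submission result to a CSV log, adding a header if the file is new."""
    path = Path(log_path)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=LOG_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "username": username,
            "form_url": form_url,
            "success": "yes" if success else "no",
        })


if __name__ == "__main__":
    log_submission("submissions.csv", "mariostevens4810", "http://example.com/signup", True)
```

Because the file is opened in append mode, every run adds to the same log, and it can be opened months later in any spreadsheet program.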
 
That’s the basic stuff for now. My next article is going to be OCRs vs. captcha solving services, where you will see the pros and cons of each. Look for it in the next few days.
