CAPTCHAs are no longer as effective: this AI needs just 0.05 seconds to solve them

Cristian Rus (@CristianRus4)

Dealing with CAPTCHAs on web pages is something practically nobody likes, but it is necessary. For humans they are annoying because they get in the way of browsing; for bots, because they cannot get past them. The latest advances in machine learning, however, are changing things for the bots: CAPTCHAs are no longer impossible for a machine.

Google introduced reCAPTCHA v3 in October 2018, a system that no longer requires us to prove directly that we are human because it already knows it from how we interact with the web. It is an improvement for the user, but above all it is a preview of something that sooner or later had to come: bots already know how to decipher distorted text.

The AI that creates its own CAPTCHAs to learn how to solve those of others

Until now, other AI systems had already managed to solve CAPTCHAs. The problem is that those artificial intelligences required, like many artificial intelligences, a large database of examples to learn from. However, a new system developed by Lancaster University, Northwest University and Peking University has managed to put an end to this problem: it drastically reduces the number of real CAPTCHAs their AI needs and, on top of that, it is faster and more effective.

In Magnet | At last, a small and heroic robot has managed to deactivate the "I'm not a robot" CAPTCHA

The new system is based on a generative adversarial network (GAN). This means that two neural networks compete with each other to learn and improve. In the specific case of this artificial intelligence, it generates its own CAPTCHAs and solves them: it only needs a handful of real CAPTCHAs to learn how to produce its own examples.


The generative adversarial network approach is especially useful when not enough data is available, as it allows new examples to be created from a smaller number of existing ones. For instance, the researchers indicated that for one of the tests they only needed 500 real CAPTCHAs. Does that seem like a lot? Other artificial intelligences require millions of real CAPTCHAs.
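
To make the adversarial idea a little more concrete, here is a minimal sketch of a generative adversarial network in PyTorch. It is for illustration only, not the researchers' code: the image size, the network shapes and the random tensors standing in for real CAPTCHA images are all assumptions.

# Minimal GAN sketch in PyTorch, for illustration only (not the researchers' system).
# Random tensors stand in for the small set of real CAPTCHA images so the script runs on its own.
import torch
import torch.nn as nn

IMG_DIM = 64 * 64     # assumed size of a flattened grayscale CAPTCHA image
LATENT_DIM = 100      # size of the random noise vector fed to the generator

# Generator: turns random noise into a synthetic "CAPTCHA" image
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_DIM), nn.Tanh(),
)

# Discriminator: tries to tell real CAPTCHAs from generated ones
discriminator = nn.Sequential(
    nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

batch_size = 32
for step in range(200):
    # Train the discriminator on real versus generated images
    real_images = torch.rand(batch_size, IMG_DIM) * 2 - 1   # placeholder for the few hundred real CAPTCHAs
    noise = torch.randn(batch_size, LATENT_DIM)
    fake_images = generator(noise).detach()
    d_loss = (loss_fn(discriminator(real_images), torch.ones(batch_size, 1))
              + loss_fn(discriminator(fake_images), torch.zeros(batch_size, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Train the generator to fool the discriminator
    noise = torch.randn(batch_size, LATENT_DIM)
    g_loss = loss_fn(discriminator(generator(noise)), torch.ones(batch_size, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

# Once the generator mimics real CAPTCHAs well enough, its output (whose text is known)
# can be used as practically unlimited labeled training data for a separate solver network.

The attraction of the adversarial setup is that every generated CAPTCHA comes with its answer for free, which is what allows a solver to be trained from only a few hundred real examples.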

The scheme the new AI follows: learn from a few real CAPTCHAs, generate its own, and ultimately learn to solve them.

The researchers put their AI to the test against 33 different types of distorted-text CAPTCHAs. Eleven of these CAPTCHA schemes are used by the 30 most popular websites on the Internet according to Alexa, including the CAPTCHAs of Google, Microsoft and Baidu.

After the machine had been trained on self-generated CAPTCHAs and a handful of real samples, it was given CAPTCHAs from these sites. On most of them it achieved a success rate above 80%, and in some cases, such as the Blizzard or Megaupload sites, the success was total: 100%.

Some examples of real Google CAPTCHAs and of those the AI generated in order to learn.

Systems that bots still can't fool

The problem with an AI solving CAPTCHAs so easily is that CAPTCHAs stop making sense: they lose their main reason for being. The technique can be abused by bots to bypass security systems and flood websites with fake form submissions or other actions. In this specific case, the researchers point out that their system is easy for others to implement, as well as being far more efficient than other methods.

But as bots get smarter at solving CAPTCHAs, the CAPTCHAs also get harder. Proving that you are human with distorted text is no longer useful, and neither is how you move the cursor, as that little robot showed us. The behavior-based system Google's reCAPTCHA v3 uses can be a good solution, since it relies more on browsing history than on a single, specific action.

Other systems for dealing with automated bots include, for example, verification on another device. Certificates and trust systems in browsers are another option. Or something more direct, such as more complex security layers on websites to block bot access. But for now we will have to keep proving that we are human by deciphering distorted text.

Via | ML Memoirs · More information | Lancaster University · Image | Unsplash
