Machine Learning To Beat CAPTCHAs

During high school I took an enrichment program known as the International Baccalaureate Diploma Program (IBDP). All IB students are required to write a 4,000-word essay, known as the Extended Essay, on a subject of their choosing. Each Extended Essay should contain a significant research component.

I chose to write my Extended Essay on machine learning, specifically on how it can be used to solve CAPTCHAs. I wrote it during the summer of 2016. The essay compared two methods of classifying letters in CAPTCHAs: the K-Nearest-Neighbours (KNN) algorithm and the Classification and Regression Tree (CART) algorithm.

After training these algorithms on a large enough sample, I achieved 75% letter-recognition accuracy using the KNN algorithm.
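
For readers unfamiliar with KNN, here is a minimal sketch of the idea in plain Python. It is not the code from the essay: the 3x3 binary "glyphs", the `TRAIN` set, and the helper names `knn_predict` and `accuracy` are all hypothetical and exist only to illustrate the technique. A letter image is flattened into a vector of pixels, and an unknown letter receives the most common label among its k closest training examples.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (pixel_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k=3):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(knn_predict(train, pixels, k) == label for pixels, label in test)
    return correct / len(test)

# Hypothetical toy data: 3x3 binary glyphs flattened row by row into 9 pixels.
TRAIN = [
    ((1, 1, 1, 0, 1, 0, 0, 1, 0), "T"),
    ((1, 1, 1, 0, 1, 0, 0, 1, 1), "T"),  # T with one stray pixel
    ((1, 1, 0, 0, 1, 0, 0, 1, 0), "T"),  # T with one missing pixel
    ((1, 0, 0, 1, 0, 0, 1, 1, 1), "L"),
    ((1, 0, 0, 1, 0, 0, 1, 1, 0), "L"),  # L with one missing pixel
    ((1, 0, 0, 1, 1, 0, 1, 1, 1), "L"),  # L with one stray pixel
]

# Two unseen, slightly distorted letters.
TEST = [
    ((0, 1, 1, 0, 1, 0, 0, 1, 0), "T"),
    ((1, 0, 0, 1, 0, 0, 0, 1, 1), "L"),
]

if __name__ == "__main__":
    for pixels, label in TEST:
        print(label, "->", knn_predict(TRAIN, pixels))
    print("accuracy:", accuracy(TRAIN, TEST))
```

Real CAPTCHA letters would be much larger images and would first need to be segmented out of the full CAPTCHA, so this only shows the classification step on a toy scale.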

I later presented on this topic to the PEIDevs group in March 2017. The presentation was recorded and is available below.

Downloads

Download the PDF