Summer Research Fellowship Programme of India's Science Academies

Study on handwritten Telugu character recognition with attention networks

Anish M. Rao

Department of Computer Science and Engineering, PES University, Bengaluru 560085

Dr. Atul Negi

School of Computer and Information Sciences, University of Hyderabad, Gachibowli, Hyderabad 500046


Optical Character Recognition (OCR) is a widely studied area of research, and handwritten character recognition is an especially active subdomain. However, most of the research in this field targets English characters. The goal of this project is to design and implement an accurate recognizer for handwritten Telugu characters. The problem is difficult because of writer-dependent variability and the inherent complexity of the Telugu script. The potential for such a system is large because of the amount of handwritten data in Telugu: government forms, local hospital forms, medical histories and prescriptions, tax records, and historical texts all contain handwritten characters. This project compares various methods of character recognition and also creates a modest dataset of handwritten Telugu characters. For the recognizer, the project focuses primarily on deep attention networks and compares their performance on this challenging real-world dataset with that of other, more traditional approaches such as Support Vector Machines (SVMs) and Convolutional Neural Networks (CNNs). Attention networks were chosen because they have improved results in several other tasks, such as speech recognition, image captioning, and irregular text detection. They are also a more intuitive solution to deciphering characters, as they mimic how humans would approach the problem: the network is allowed to focus on certain aspects of the image by assigning context-dependent weights to the features extracted from it. In this work, many such networks were designed and tested. The results of some of the most successful architectures are presented and discussed.

Keywords: deep neural networks, optical character recognition
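
The context-dependent weighting of image features mentioned in the abstract can be illustrated with a minimal sketch of additive soft attention. This is not the architecture used in the project; the function name `soft_attention`, the projection matrices `W_f` and `W_c`, the scoring vector `v`, and all dimensions are assumptions chosen only for illustration, and the weights are random rather than trained.

```python
import numpy as np

def soft_attention(features, context, W_f, W_c, v):
    """Additive soft attention over a set of image feature vectors.

    features: (N, D) array, one D-dimensional feature per image region
    context:  (H,) array, e.g. a recurrent hidden state (hypothetical)
    W_f: (D, A), W_c: (H, A), v: (A,) are illustrative parameters
    """
    # Score each region against the context; the context term broadcasts over regions.
    scores = np.tanh(features @ W_f + context @ W_c) @ v   # shape (N,)
    # Numerically stable softmax turns scores into weights that sum to 1.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # The attended vector is the weighted average of the region features.
    attended = weights @ features                           # shape (D,)
    return attended, weights

# Toy example: a 7x7 feature map flattened to 49 regions of 64 channels.
rng = np.random.default_rng(0)
N, D, H, A = 49, 64, 32, 16
features = rng.normal(size=(N, D))
context = rng.normal(size=H)
W_f = rng.normal(size=(D, A)) * 0.1
W_c = rng.normal(size=(H, A)) * 0.1
v = rng.normal(size=A)

attended, weights = soft_attention(features, context, W_f, W_c, v)
```

Changing `context` changes `weights`, which is what makes the weighting context-dependent: the same image features can be emphasized differently at different steps of recognition.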
