One of the commitments of the Medical Intelligence and Language Engineering (MILE) Laboratory is to develop technology that empowers persons with visual disabilities to access the knowledge available in any printed material in Indian languages. We are working on all the research and development issues involved in fulfilling this goal: mosaicing of colour document images; text extraction from complex colour images, including camera-captured images; document layout analysis; detection of broken and merged characters; OCR technology for Tamil and Kannada; text-to-speech (TTS) conversion in Tamil and Kannada; pitch modification using the discrete cosine transform in the source domain; automated part-of-speech tagging; and phrase prediction and prosody modeling. The demo version of our Tamil OCR is being used by Worth Trust, Chennai, and the Indian Association for the Blind, Madurai, to convert printed books into computer-readable text and Braille. Our Tamil TTS is already being used by some school teachers in Singapore for student assignments.
We are also working on online handwriting recognition (OHWR) in Tamil and Kannada. Our research consortium partners at IIT Madras, IIIT Hyderabad, ISI Kolkata and CDAC Pune are working on Telugu, Malayalam, Bangla and Hindi. Our current technology recognizes words from an unlimited vocabulary with a character accuracy of about 85%. Our industry partners have developed three form-filling applications that integrate the word-level recognition engines. By combining our OHWR engines with our TTS, we have built a preliminary handwritten-word-to-speech demo in both Tamil and Kannada. In collaboration with St. John's Medical College, we plan to explore its use for persons with vocal disability. We are also porting our technologies to the Android-based mobile platform. The required databases, the recognition technology (coded in C) and the natural language processing modules have all been fully developed by us.