Medical Intelligence and Language Engineering Lab

Five distinct scene image word datasets with pixel level annotation

        We have annotated more than 3600 word images at the pixel level from five publicly available standard word image data sets. These word images were originally cropped from camera-captured scene images, born-digital images (BDI) and street view images. The annotated images are available for download. They were binarized using a MATLAB-based semi-automated segmentation tool that we developed, which is also available for download.

        The five data sets that have been annotated at the pixel level are ICDAR 2003, Sign evaluation, Street view, Born-digital and ICDAR 2011.

        Sample word images from the ICDAR 2003 data set are shown below (images are displayed in their original format; different degradations can be observed in each image):

        Word images from the ICDAR 2003 data set, annotated using the MATLAB-based UI tool, are shown below:

        We also recognized all the annotated images using the trial version of Nuance OmniPage OCR. The benchmark word recognition rates obtained on the ICDAR 2003, Sign evaluation, Street view, Born-digital and ICDAR 2011 data sets are 83.9%, 89.3%, 79.6%, 88.5% and 86.7%, respectively. The following paper describes the MATLAB UI tool and the recognition process in more detail:

        D. Kumar, M.N. Anil Prasad and A.G. Ramakrishnan, “Benchmarking recognition results on camera captured word image datasets,” Proc. Workshop on Document Analysis and Recognition (DAR 2012), pp. 100-107, 16 December 2012, IIT Bombay, Mumbai, India.
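        A word recognition rate such as those reported above can be read as the percentage of word images whose OCR output matches the ground-truth text. The following is a minimal Python sketch of that calculation, not the procedure used in the paper. It assumes exact, case-sensitive matching after trimming whitespace (the paper may use a different matching rule), and the function name and sample words are made up for illustration.

```python
def word_recognition_rate(ground_truth, recognized):
    """Percentage of words whose recognized text equals the ground truth.

    Assumption: a word counts as correct only on an exact, case-sensitive
    match after stripping surrounding whitespace.
    """
    if len(ground_truth) != len(recognized):
        raise ValueError("Both lists must have the same length")
    if not ground_truth:
        raise ValueError("At least one word is required")
    correct = sum(
        1
        for truth, guess in zip(ground_truth, recognized)
        if truth.strip() == guess.strip()
    )
    return 100.0 * correct / len(ground_truth)


# Made-up example data, not taken from any of the five data sets.
truth = ["EXIT", "Hotel", "open", "SALE"]
ocr = ["EXIT", "Hotel", "0pen", "SALE"]
rate = word_recognition_rate(truth, ocr)
print(f"{rate:.1f}%")  # 3 of 4 words match, so this prints 75.0%
```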

    Download Benchmarked Data

        By downloading and using the data sets below (or any part of them), you agree to acknowledge their source and cite the above paper in related publications. We would be grateful if you contacted us to let us know how you are using our data sets.

        We hope that researchers worldwide will find these benchmarked images useful for training their classifiers with ground-truth information. The benchmarked data are provided as a single zip file. This zip file has a folder that contains a README file and five zip files, one for each data set. Each individual zip file has two folders and a ground-truth file. One folder contains the test word images and the other contains the ground-truth images.
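        The nested layout described above (an outer zip holding one zip per data set) can be inspected without unpacking everything to disk. The sketch below uses only Python's standard zipfile module. The function name and the archive file name are hypothetical, since the actual file names are not stated here, and the code assumes only that the inner archives carry a .zip extension.

```python
import io
import os
import zipfile


def list_nested_zip(outer_source):
    """Map each inner .zip member of the outer archive to its file names.

    outer_source may be a file path or a binary file-like object.
    """
    contents = {}
    with zipfile.ZipFile(outer_source) as outer:
        for name in outer.namelist():
            if name.lower().endswith(".zip"):
                # Read the inner archive into memory and open it in place.
                inner_bytes = io.BytesIO(outer.read(name))
                with zipfile.ZipFile(inner_bytes) as inner:
                    contents[name] = inner.namelist()
    return contents


if __name__ == "__main__":
    archive = "benchmarked_data.zip"  # hypothetical file name
    if os.path.exists(archive):
        for inner_name, members in list_nested_zip(archive).items():
            print(inner_name, len(members), "entries")
```

        Each key of the returned dictionary is one per-data-set archive, and its value lists the entries found inside, which should include the two image folders and the ground-truth file.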

© 2013 Medical Intelligence and Language Engineering Lab - IISc Campus, Bangalore.