cmaterdb


CMATERdb: The pattern recognition database repository



Due to the closure of Google Code, the cmaterdb repository has been permanently moved to http://www.cmaterju.org/cmaterdb.htm



CMATER

Center for Microprocessor Application for Training Education and Research

Computer Science and Engineering Department, Jadavpur University, Kolkata 700032, INDIA.

CMATERdb is the pattern recognition database repository created at the ‘Center for Microprocessor Applications for Training Education and Research’ (CMATER) research laboratory, Jadavpur University, Kolkata 700032, INDIA. This database is free for all non-commercial uses. Please acknowledge CMATER explicitly whenever you use this database for academic or research purposes. To use some of the databases, you must also cite the relevant research publications listed on this website.

Database description

  • CMATERdb 1: Multiscript handwritten text line extraction database

    • CMATERdb 1.1: Handwritten text lines containing Bangla words
      • CMATERdb 1.1.1: Release 1 consists of 100 grey scale images
      • CMATERgt 1.1.1.1: Line-level ground truth images of CMATERdb 1.1.1
      • CMATERgt 1.1.1.2: Word-level ground truth images of CMATERdb 1.1.1
    • CMATERdb 1.2: Handwritten text lines containing both Bangla and English words
      • CMATERdb 1.2.1: Release 1 consists of 50 grey scale images
      • CMATERgt 1.2.1.1: Line-level ground truth images of CMATERdb 1.2.1
      • CMATERgt 1.2.1.2: Word-level ground truth images of CMATERdb 1.2.1
      • CMATERgt 1.2.1.3: Script-level ground truth images of CMATERdb 1.2.1
      • CMATERdb 1.2.2: Release 2 consists of 150 grey scale images
      • Database details:
      • CMATERgt 1.2.2.3: Script-level ground truth images of CMATERdb 1.2.2
    • CMATERdb 1.3: Handwritten documents containing text (Bangla, English or both) and graphical contents
      • CMATERdb 1.3.1: Release 1 of text/graphics database
      • CMATERgt 1.3.1: Ground truth images of CMATERdb 1.3.1
    • Cite the following papers to use this database:
      • R. Sarkar, N. Das, S. Basu, M. Kundu, M. Nasipuri, D. K. Basu, "CMATERdb1: a database of unconstrained handwritten Bangla and Bangla–English mixed script document image," International Journal on Document Analysis and Recognition, vol. 15, issue 1, pp. 71-83, 2012.
      • S. Basu, C. Chaudhury, M. Kundu, M. Nasipuri, D. K. Basu, "Text Line Extraction from Multi Skewed Handwritten Documents," Pattern Recognition, Elsevier, vol. 40, no. 6, pp. 1825-1839, 2007.
    • CMATERdb 1.4: Handwritten text lines containing Devanagari words
    • CMATERdb 1.5: Handwritten text lines containing both Devanagari and English words
    • CMATERdb 1.6: Handwritten documents containing text (Devanagari, English or both) and graphical contents
    • CMATERdb 1.7: Printed text lines
  • CMATERdb 2: Handwritten word image database

    • CMATERdb 2.1: Bangla word images
      • CMATERdb 2.1.1: Handwritten words for Character Segmentation
      • CMATERdb 2.1.2: Handwritten city name database for holistic word recognition
    • CMATERdb 2.2: Hindi word images
      • CMATERdb 2.2.1: Handwritten words for Character Segmentation
      • CMATERdb 2.2.2: City-names for holistic word recognition
  • CMATERdb 3: Handwritten Indian script character database

    • CMATERdb 3.1: Handwritten Bangla numeral+character database
    • CMATERdb 3.2: Handwritten Devanagari numeral+character database
    • CMATERdb 3.3: Handwritten Arabic numeral+character database
    • CMATERdb 3.4: Handwritten Telugu numeral+character database
      • CMATERdb 3.4.1: Handwritten Telugu numeral database
    • Cite the following papers to use this database:
      • N. Das, R. Sarkar, S. Basu, M. Kundu, M. Nasipuri, and D. K. Basu, "A genetic algorithm based region sampling for selection of local features in handwritten digit recognition application," Applied Soft Computing, vol. 12, pp. 1592-1606, 2012.
      • N. Das, J. M. Reddy, R. Sarkar, S. Basu, M. Kundu, M. Nasipuri, and D. K. Basu, "A statistical–topological feature combination for recognition of handwritten numerals," Applied Soft Computing, vol. 12, pp. 2486-2495, 2012.
      • N. Das, K. Acharya, R. Sarkar, S. Basu, M. Kundu, and M. Nasipuri, "A Novel GA-SVM Based Multistage Approach for Recognition of Handwritten Bangla Compound Characters," in Proceedings of the International Conference on Information Systems Design and Intelligent Applications 2012 (INDIA 2012), Visakhapatnam, India, January 2012, vol. 132, S. Satapathy et al., Eds., Springer Berlin / Heidelberg, 2012, pp. 145-152.
      • N. Das, S. Basu, R. Sarkar, M. Kundu, M. Nasipuri, and D. K. Basu, "Handwritten Bangla Compound character recognition: Potential challenges and probable solution," in 4th Indian International Conference on Artificial Intelligence, Bangalore, 2009, pp. 1901-1913.
      • N. Das, S. Basu, R. Sarkar, M. Kundu, M. Nasipuri, and D. K. Basu, "An Improved Feature Descriptor for Recognition of Handwritten Bangla Alphabet," in International Conference on Signal and Image Processing, Mysore, India, 2009, pp. 451-454.
      • N. Das, K. Acharya, R. Sarkar, S. Basu, M. Kundu, and M. Nasipuri, "A Benchmark Data Base of Isolated Bangla Handwritten Compound Characters," IJDAR (revised version communicated).
  • CMATERdb 4: Camera-captured Business-Card image database

    • CMATERdb 4.1: Release 1 consists of 10 grey scale camera-captured business card images
    • CMATERdb 4.2: Release 2 consists of 20 grey scale camera-captured business card images
    • CMATERdb 4.3: Release 3 consists of 40 grey scale camera-captured business card images
    • Cite the following paper to use this database:
      • A. F. Mollah, S. Basu, M. Nasipuri, "Text/Graphics Separation and Skew Correction of Text Regions of Business Card Images for Mobile Devices," Journal of Computing, vol. 2, issue 2, pp. 96-102.
  • CMATERdb 5: Postal document image database

    • CMATERdb 5.1: Release 1 consists of 50 grey scale images of Indian Postal documents
    • Cite the following papers to use this database:
      • S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, D. K. Basu, "A novel framework for automatic sorting of postal documents with multi-script address blocks," Pattern Recognition, Elsevier, vol. 43, no. 10, pp. 3507-3521, 2010.
  • CMATERdb 6: Document Image Binarization bench-marking dataset

    • CMATERdb 6.1: This release consists of 5 representative color document images. It includes camera-captured as well as scanned documents, old manuscripts as well as new documents, degraded as well as good-conditioned documents, and blurred as well as prominent documents. Ground truth images of the CMATERdb 6.1 dataset are available as CMATERgt 6.1. Cite the following paper as a reference:
    • Ayatullah Faruk Mollah, Subhadip Basu and Mita Nasipuri, "Computationally Efficient Implementation of Convolution-based Locally Adaptive Binarization Techniques", ICIP-2012, CCIS 292, Springer, pp. 159-168 (2012).
  • CMATERdb 7: Camera-captured Vehicle image database

    • CMATERdb 7.1: Release 1 consists of 100 vehicle images captured by outdoor surveillance cameras from 10 camera views under varying weather conditions, along with benchmark localization and recognition performances.
  • ARTeM: Supplementary materials

Developed tools

  • GT Gen 1.1: The ground truth preparation tool developed at CMATER

Project Members

  • Dr. Dipak Kumar Basu, AICTE Emeritus Fellow and former Professor, CSE Dept., JU.
  • Dr. Mita Nasipuri, Professor, CSE Dept., JU.
  • Dr. Mahantapas Kundu, Professor, CSE Dept., JU.
  • Dr. Subhadip Basu, Asst. Professor, CSE Dept., JU.
  • Dr. Nibaran Das, Asst. Professor, CSE Dept., JU.
  • Dr. Ram Sarkar, Asst. Professor, CSE Dept., JU.
  • Dr. Ayatullah Faruk Mollah, Asst. Prof., CSE Dept., AU.
  • Dr. Satadal Saha, Asso. Prof., ECE Dept., MCKVIE.
