CMATER

Center for Microprocessor Application for Training Education and Research

Computer Science and Engineering Department, Jadavpur University, Kolkata 700032, INDIA.


CMATERdb is the pattern recognition database repository created at the ‘Center for Microprocessor Applications for Training Education and Research’ (CMATER) research laboratory, Jadavpur University, Kolkata 700032, INDIA. This database is free for all non-commercial uses. Please acknowledge CMATER explicitly whenever you use this database for academic or research purposes. To use some of the databases, you must also cite the relevant research publications listed on this website.

Database description

  • CMATERdb 1: Multiscript handwritten text line extraction database
    • CMATERdb 1.1: Handwritten text lines containing Bangla words
      • CMATERdb 1.1.1: Release 1 consists of 100 grey scale images
      • CMATERgt 1.1.1.1: Line-level ground truth images of CMATERdb 1.1.1
      • CMATERgt 1.1.1.2: Word-level ground truth images of CMATERdb 1.1.1
    • CMATERdb 1.2: Handwritten text lines containing both Bangla and English words
      • CMATERdb 1.2.1: Release 1 consists of 50 grey scale images
      • CMATERgt 1.2.1.1: Line-level ground truth images of CMATERdb 1.2.1
      • CMATERgt 1.2.1.2: Word-level ground truth images of CMATERdb 1.2.1
      • CMATERgt 1.2.1.3: Script-level ground truth images of CMATERdb 1.2.1
    • CMATERdb 1.3: Handwritten documents containing text (Bangla, English or both) and graphical contents
      • CMATERdb 1.3.1: Release 1 of text/graphics database
      • CMATERgt 1.3.1: Ground truth images of CMATERdb 1.3.1
    • Cite the following papers to use this database:
      • R. Sarkar, N. Das, S. Basu, M. Kundu, M. Nasipuri, D. K. Basu, "CMATERdb1: a database of unconstrained handwritten Bangla and Bangla–English mixed script document image," International Journal on Document Analysis and Recognition (in press); DOI: 10.1007/s10032-011-0148-6
      • S. Basu, C. Chaudhury, M. Kundu, M. Nasipuri, D. K. Basu, "Text Line Extraction from Multi Skewed Handwritten Documents," Pattern Recognition, Elsevier, vol. 40, no. 6, pp. 1825–1839, 2007
    • CMATERdb 1.4: Handwritten text lines containing Devanagari words
    • CMATERdb 1.5: Handwritten text lines containing both Devanagari and English words
    • CMATERdb 1.6: Handwritten documents containing text (Devanagari, English or both) and graphical contents
    • CMATERdb 1.7: Printed text lines
  • CMATERdb 2: Handwritten word image database
  • CMATERdb 3: Handwritten Indian script character database
    • CMATERdb 3.1: Handwritten Bangla numeral+character database
      • CMATERdb 3.1.1: Handwritten Bangla numeral database
      • CMATERdb 3.1.2: Handwritten Bangla basic-character database
      • CMATERdb 3.1.3: Handwritten Bangla compound-character database
        • CMATERdb 3.1.3.2: Handwritten Bangla compound-character grey-level image database
    • CMATERdb 3.2: Handwritten Devanagari numeral+character database
    • CMATERdb 3.3: Handwritten Arabic numeral+character database
    • CMATERdb 3.4: Handwritten Telugu numeral+character database
  • CMATERdb 4: Camera-captured image database
    • CMATERdb 4.1: Release 1 consists of 10 grey scale camera-captured business card images
    • CMATERdb 4.2: Release 2 consists of 20 grey scale camera-captured business card images
    • CMATERdb 4.3: Release 3 consists of 40 grey scale camera-captured business card images
    • Cite the following paper to use this database:
      • A. F. Mollah, S. Basu, M. Nasipuri, "Text/Graphics Separation and Skew Correction of Text Regions of Business Card Images for Mobile Devices," Journal of Computing, vol. 2, no. 2, pp. 96-102
  • CMATERdb 5: Postal document image database
    • CMATERdb 5.1: Release 1 consists of 50 grey scale images of Indian Postal documents
    • Cite the following paper to use this database:
      • S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, D. K. Basu, "A novel framework for automatic sorting of postal documents with multi-script address blocks," Pattern Recognition, Elsevier, vol. 43, no. 10, pp. 3507-3521, 2010
  • CMATERdb 6: Document image binarization benchmarking dataset
    • CMATERdb 6.1: This release consists of 5 representative color document images. It includes camera-captured as well as scanned documents, old manuscripts as well as new documents, degraded as well as well-preserved documents, and blurred as well as sharp documents.
    • CMATERgt 6.1: Ground truth images of CMATERdb 6.1 dataset

Developed tools

  • GT Gen 1.1: The ground truth preparation tool developed at CMATER

Project Members

  • Dr. Dipak Kumar Basu, AICTE Emeritus Fellow and former Professor, CSE Dept., JU.
  • Dr. Mita Nasipuri, Professor, CSE Dept., JU.
  • Dr. Mahantapas Kundu, Professor, CSE Dept., JU.
  • Dr. Subhadip Basu, Senior Lecturer, CSE Dept., JU.
  • Mr. Nibaran Das, Lecturer, CSE Dept., JU.
  • Mr. Ram Sarkar, Lecturer, CSE Dept., JU.
  • Mr. Ayatullah Faruk Mollah, Research Scholar, CSE Dept., JU.