ChecklistBank
Checklist Bank is a Global Name Server Brokerage
This page is under construction and subject to significant revision.
Introduction
Checklist Bank extends the indexing capacity of the GBIF data portal to include taxonomic data sources. It serves to inventory checklist resources, enabling discovery of and access to a wide range of taxonomic and nomenclatural databases, datasets, and initiatives that organise and serve these data. Checklist Bank serves as a dynamic archive of "checklists": summarized lists of taxa or taxon names. Checklist Bank stores checklist data as it was provided by the data publisher. In addition, Checklist Bank attempts to collate different published checklist resources by tying the atomic elements of checklists, the taxon names, to a common names dictionary.
The scope of this checklist index was effectively captured in a poster presented by Dean Pentcheff and Regina Wetzer at the EBiosphere conference in London, June 2009.
Benefits
Diverse taxonomic resources can work together
Requirements
The ChecklistBank Data Model has been developed to accommodate three major requirements.
Main Entities
The description of the main entities in the Checklist Bank Model follows the basic distinction between the core components of the data model.
Checklist
A checklist is the basic unit of Checklist Bank. Checklists:
- Provide definitions of taxa.
- Provide nomenclatural details regarding taxon names.
- Organize and summarize references to taxa that may be defined by combinations of taxonomic, regional, or thematic contexts.
- Link taxon names to vernacular (or common) names.
The Checklist Entity provides basic metadata about the checklist and provides the link to the entry for the checklist within the GBIF Registry (the GBRDS).
Name Usage
A checklist is composed of one or more "Name Usages". This term refers to an instance of a taxon name within a single Checklist resource. In some checklists, each row or entry may refer to an individual taxon. In other checklists, a taxon may be represented by multiple entries: one for the accepted name and one or more name usages representing synonyms or publications asserted to refer to the same taxon. In addition to a taxon name, a name usage may include taxonomic and/or nomenclatural details regarding the use of the name according to the publisher. Each Checklist is composed of uniquely identified Name Usages.
Name String
A "name string" refers to the literal orthography of a taxon name as it is provided by a data publisher. The same name string may occur within multiple Name Usages. A name string may include the taxon name, rank information, authorship, and other annotations. Examples of namestrings:
Checklist Bank stores the exact orthography of a name as provided by the data publisher, with the following exceptions:
- Leading, trailing, and repeated whitespace is removed.
- Commas in front of the year in cited authorship are removed (e.g. "Aotus Illiger, 1811" -> "Aotus Illiger 1811").
When a multi-word name is published in atomised form with no literal combination, a namestring value is generated based on: dwc:genus (dwc:subgenus)? dwc:species dwc:infraspecificRank dwc:infraspecies dwc:authorship
Term
Many data elements in ChecklistBank benefit from the use of controlled or reconciled vocabularies for the terms that may appear in a particular property. Controlled vocabularies are strict lists of permitted terms; reconciled vocabularies arise when terms found in sources are mapped to a controlled list. The Terms Entity in Checklist Bank manages all controlled lists and associated terms for any of the data elements in the schema. Examples of data elements tied to such controlled terms are taxonomic ranks, nomenclatural status, and language and country names. Vocabularies and associated terms are maintained and developed on the GBIF vocabulary server.
Lexical Group
Checklist Bank stores the exact orthography of a name as provided by the data publisher, with the result that the same name may appear with slight (or significant) variation. This can present many problems in evaluating and comparing checklists, in using checklists as organisational schemas for biodiversity data, and in providing clear search and browse interfaces. Checklist Bank addresses this issue by clustering lexically similar names into Lexical groups to enable "fuzzy matching" of names. A Lexical group may contain both correct and incorrect spellings of a name as well as correctly spelled variations of a name.
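As a rough illustration of the two ideas above (the name string exceptions and the clustering of lexically similar names), the following Python sketch normalises whitespace and the comma before the year, then groups names whose normalised forms are similar. It is not Checklist Bank's actual algorithm: the function names, the use of `difflib.SequenceMatcher`, the greedy comparison against the first member of each group, and the 0.9 threshold are all assumptions made for this example.

```python
import re
from difflib import SequenceMatcher


def normalise_name_string(raw: str) -> str:
    """Apply the two documented exceptions to a published name string."""
    # Collapse repeated whitespace and trim leading/trailing whitespace.
    cleaned = re.sub(r"\s+", " ", raw).strip()
    # Remove a comma that directly precedes a four-digit year.
    return re.sub(r",\s*(\d{4})\b", r" \1", cleaned)


def lexical_groups(names, threshold=0.9):
    """Greedily cluster names by similarity (hypothetical method and threshold)."""
    groups = []  # each group is a list of the original name strings
    for name in names:
        key = normalise_name_string(name).lower()
        for group in groups:
            # Compare against the first member of the group only.
            representative = normalise_name_string(group[0]).lower()
            if SequenceMatcher(None, key, representative).ratio() >= threshold:
                group.append(name)
                break
        else:
            # No sufficiently similar group was found: start a new one.
            groups.append([name])
    return groups


if __name__ == "__main__":
    sample = [
        "Aotus Illiger, 1811",
        "Aotus  Illiger 1811",
        "Aotus Iliger 1811",  # deliberately misspelled for illustration
        "Puma concolor",
    ]
    for group in lexical_groups(sample):
        print(group)
```

A greedy pass like this is order-dependent, which is one reason a real system would likely combine it with manual review.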
Lexical groups are assembled by a combination of algorithmic and manually mediated methods.
Nomenclatural Group
Checklist Bank organises lexical groups into larger sets of groups based on nomenclatural relationships. This allows, for example, expanding a search for information using one name to include other names that are derived from the original type (homotypic names). This also allows cross-linking among different classifications that may reference the same name placed in different genera. Nomenclatural groups are derived from data sources that publish nomenclatural information linking a name to an original name.
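To make the search expansion concrete, here is a minimal Python sketch that groups names by the original name they are linked to and then expands a query name to every name in the same group. The input format (a mapping from a name to its original name), the function names, and the sample links are assumptions for illustration only, not a description of how Checklist Bank stores or resolves these relationships.

```python
from collections import defaultdict


def nomenclatural_groups(original_name_of):
    """Group names under the original name each one is linked to.

    `original_name_of` is a hypothetical mapping {name: original name}.
    A name without an entry is treated as its own original name.
    """
    def resolve(name):
        # Follow links to the root, guarding against accidental cycles.
        seen = set()
        while name in original_name_of and name not in seen:
            seen.add(name)
            name = original_name_of[name]
        return name

    groups = defaultdict(set)
    for name in original_name_of:
        root = resolve(name)
        groups[root].update({name, root})
    return dict(groups)


def expand_search(name, groups):
    """Return every name sharing a nomenclatural group with `name`."""
    for members in groups.values():
        if name in members:
            return members
    return {name}


if __name__ == "__main__":
    links = {
        "Puma concolor (Linnaeus 1771)": "Felis concolor Linnaeus 1771",
        "Panthera leo (Linnaeus 1758)": "Felis leo Linnaeus 1758",
    }
    groups = nomenclatural_groups(links)
    print(sorted(expand_search("Felis concolor Linnaeus 1771", groups)))
```

Keying each group on the original name keeps the expansion independent of which genus a given classification places the name in.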