CurrentNubMethodology  
Summary of how the original GBIF taxonomic backbone is built
Updated Sep 3, 2009 by dprem...@gmail.com

This page is under construction and subject to significant revision.

Separate Classes of Resources

The GBIF taxonomic backbone is built from two primary classes of starting resources: Taxonomic sources and Occurrence sources.

Taxonomic Resources

Historically, taxonomic sources have been limited to the Catalogue of Life, with some use of Index Fungorum and the International Plant Names Index (IPNI). Taxonomic sources can be distinguished from Occurrence sources by the following properties, which they provide:

  1. Internally consistent and taxonomically defensible taxonomic hierarchy
  2. An index of accepted species names assumed to be correctly spelled
  3. Names organised into groups based on synonymy, so that a search for one name can be expanded to include all names in the synonymous group.

The two nomenclatural sources, IPNI and Index Fungorum, have historically been used solely for their taxonomic hierarchy information; in previous data formats they provided no synonymy.
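
The synonymy grouping described in item 3 can be sketched as follows. This is a minimal illustration, not the backbone's actual implementation: the function names `build_synonym_index` and `expand_query`, the pair-based input format, and the sample names are all assumptions made for the example.

```python
from collections import defaultdict

# Illustrative sample data (an assumption for this sketch):
# each pair is (name, accepted_name).
SAMPLE_SYNONYMY = [
    ("Puma concolor", "Puma concolor"),
    ("Felis concolor", "Puma concolor"),
    ("Passer domesticus", "Passer domesticus"),
]


def build_synonym_index(pairs):
    """Map every name to the full set of names in its synonymy group."""
    groups = defaultdict(set)
    for name, accepted in pairs:
        groups[accepted].add(accepted)
        groups[accepted].add(name)
    index = {}
    for members in groups.values():
        for member in members:
            index[member] = members
    return index


def expand_query(name, index):
    """Return all names to search for; unknown names expand only to themselves."""
    return sorted(index.get(name, {name}))


index = build_synonym_index(SAMPLE_SYNONYMY)
print(expand_query("Felis concolor", index))
```

A search for any member of a group returns the whole group, while a name absent from the taxonomic source is passed through unchanged.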

Occurrence Resources

Occurrence resources, typically formatted as Darwin Core or ABCD specimen or observational data, provide occurrence details regarding a target taxon, generally a species. Each record identifies the target taxon and thereby supplies a taxon name. In general, the name is accompanied by additional higher taxonomic references that provide a classification. Classification schemes within each dataset may or may not be internally consistent. No synonymy information is provided with occurrence resources.

----

Match Occurrence Data to Taxonomic Resources

GBIF first performs an exact 1:1 match between names that occur within the Catalogue of Life and names that occur within the Occurrence data store. This exact match yields approximately 30% overlap between the two sources. The matched names may be tied to the Catalogue of Life taxonomy with confidence that they match at least the nominal taxon in the Annual Checklist.
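
A rough sketch of this first pass, treating both sources as plain collections of name strings. The function names and the sample names are hypothetical. The page does not say how the overlap percentage is computed, so dividing by the number of distinct occurrence names is an assumption of this example.

```python
def exact_match(backbone_names, occurrence_names):
    """Split distinct occurrence names into exact matches and leftovers."""
    backbone = set(backbone_names)
    occurrence = set(occurrence_names)
    matched = occurrence & backbone
    unmatched = occurrence - backbone
    return matched, unmatched


def overlap_fraction(matched, occurrence_names):
    """Share of distinct occurrence names that matched (assumed denominator)."""
    total = len(set(occurrence_names))
    return len(matched) / total if total else 0.0


# Illustrative names only; the last occurrence name carries a spelling variant.
backbone = ["Puma concolor", "Passer domesticus", "Quercus robur"]
occurrences = ["Puma concolor", "Puma concolor", "Passer domesticus", "Quercus roburr"]

matched, unmatched = exact_match(backbone, occurrences)
print(sorted(matched), sorted(unmatched), overlap_fraction(matched, occurrences))
```

Only character-for-character identical strings match here, so spelling variants and synonyms fall into the unmatched set and remain for later steps.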

Because taxon names in both categories of content (taxonomic and occurrence) carry higher taxonomic information, that information can be compared wherever homonymy is suspected, that is, wherever the same name may be used for different taxa. Because the taxonomy within occurrence data is often unreliable, however, it is frequently difficult to assess whether a name is truly a homonym.
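
One way to picture the comparison of higher taxonomy is the sketch below. It is an assumption-laden illustration rather than the method GBIF uses: the rank list, the dictionary layout, and the three result labels are invented for the example. The genus name Oenanthe serves as the sample because it is used for both a plant genus and a bird genus.

```python
# Ranks compared in this sketch (an assumption, not a GBIF-defined list).
RANKS = ("kingdom", "phylum", "class", "order", "family")


def compare_classification(backbone_taxon, occurrence_taxon):
    """Compare higher taxa at the ranks both records actually populate.

    Returns "consistent", "conflict", or "undetermined" (no shared ranks).
    """
    shared = [
        rank for rank in RANKS
        if backbone_taxon.get(rank) and occurrence_taxon.get(rank)
    ]
    if not shared:
        return "undetermined"
    for rank in shared:
        a = backbone_taxon[rank].strip().lower()
        b = occurrence_taxon[rank].strip().lower()
        if a != b:
            return "conflict"
    return "consistent"


plant = {"kingdom": "Plantae", "family": "Apiaceae"}
bird_record = {"kingdom": "Animalia", "family": "Muscicapidae"}
print(compare_classification(plant, bird_record))
```

A "conflict" result is only a hint of homonymy: given the unreliable classifications in occurrence data, it may equally reflect an error in the occurrence record, and "undetermined" covers records that supply no comparable higher taxa.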

----

Use taxonomy in Occurrence Resources to merge gaps until all taxa are treated

