publishingClassifications  
Options for publishing taxonomic hierarchies using DwC-A
Updated Sep 29, 2010 by wixner@gmail.com

This page is under construction and subject to significant revision.

Options for Publishing Taxonomic Hierarchies

The Darwin Core taxon terms provide several options for publishing taxonomic classifications or hierarchies.

1. Parent taxa may be referenced by literal name ("Mollusca") or by identifier (1234).
2. The taxonomic hierarchy can be represented in a denormalised form, like a typical spreadsheet in which the higher taxa are repeated for each record, or in a normalised form in which each taxon is listed only once.

Examples for each are listed here. Note that the full Darwin Core taxon class includes additional terms for infraspecific taxa and higher taxon ranks not included in these examples:

Denormalised by taxon name (commonly used). In this example each row represents a distinct terminal taxon (a species in this case), and the higher taxa are organised into columns. The advantage of this format is that it is easy to read. The disadvantages are that the same higher taxon is repeated across many rows, which increases the chance of errors, and that no additional information (publication, distribution, description) can be provided about the higher taxa.

taxonID  kingdom   phylum      class     order        family   scientificName
1        Animalia  Chordata    Mammalia  Carnivora    Felidae  Panthera tigris
2        Animalia  Chordata    Mammalia  Carnivora    Felidae  Panthera leo
3        Animalia  Chordata    Mammalia  Carnivora    Felidae  Acinonyx jubatus
4        Animalia  Chordata    Mammalia  Carnivora    Felidae  Panthera pardus
5        Animalia  Arthropoda  Insecta   Hymenoptera  Apidae   Apis mellifera
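
A denormalised table like the one above can be converted mechanically into parent-child rows. The following is a minimal sketch, assuming each input row is a dictionary keyed by the column names shown; the function name `normalise`, the generated integer IDs, and the use of `None` for "no parent" are illustrative choices, not part of Darwin Core. It also assumes that a name is unique within its rank, which does not hold for data containing homonyms.

```python
# Rank columns of the denormalised table, from highest to lowest.
RANKS = ["kingdom", "phylum", "class", "order", "family"]


def normalise(rows):
    """Convert denormalised rows into parent-child records.

    Returns a list of dicts with the keys taxonID, scientificName,
    taxonRank and parentNameUsageID. IDs are newly assigned integers.
    """
    ids = {}      # (rank, name) -> assigned taxonID
    records = []  # output rows, in order of first appearance

    def get_id(rank, name, parent_id):
        key = (rank, name)
        if key not in ids:
            ids[key] = len(ids) + 1
            records.append({
                "taxonID": ids[key],
                "scientificName": name,
                "taxonRank": rank,
                "parentNameUsageID": parent_id,
            })
        return ids[key]

    for row in rows:
        parent = None  # the top of the hierarchy has no parent
        for rank in RANKS:
            name = row.get(rank)
            if name:  # skip ranks that are empty for this row
                parent = get_id(rank, name, parent)
        get_id("species", row["scientificName"], parent)
    return records


if __name__ == "__main__":
    sample = [
        {"kingdom": "Animalia", "phylum": "Chordata", "class": "Mammalia",
         "order": "Carnivora", "family": "Felidae",
         "scientificName": "Panthera tigris"},
        {"kingdom": "Animalia", "phylum": "Chordata", "class": "Mammalia",
         "order": "Carnivora", "family": "Felidae",
         "scientificName": "Panthera leo"},
    ]
    for record in normalise(sample):
        print(record)
```

Because each higher taxon is emitted only the first time it is seen, the repeated values in the spreadsheet collapse into a single row per taxon.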

Normalised by taxon reference (commonly used). This format is also referred to as a "parent-child relationship" or an "adjacency list". Each taxon, including every higher taxon, appears exactly once as its own data row. This allows all taxa to be referenced by ID and additional information about each taxon to be published. A taxon's parent is referenced by its identifier. The taxon at the top of the hierarchy has no parent (its parent reference is empty or 0). In this format every higher taxon must be included as a row in the published data.

taxonID  scientificName    taxonRank  parentNameUsageID
1        Animalia          Kingdom    0
2        Chordata          Phylum     1
3        Arthropoda        Phylum     1
4        Mammalia          Class      2
5        Insecta           Class      3
6        Carnivora         Order      4
7        Felidae           Family     6
8        Panthera tigris   species    7
9        Panthera leo      species    7
10       Acinonyx jubatus  species    7
11       Panthera pardus   species    7
12       Hymenoptera       Order      5
13       Apidae            Family     12
14       Apis mellifera    species    13
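
With an adjacency list, the full classification of any taxon can be recovered by following the parent references up to the root. The sketch below illustrates this on the tiger and lion lineage from the table above; the dictionary layout and the function name `classification` are assumptions made for the example. The loop stops when the parent ID is not a known taxon (such as 0), and it raises an error on a cycle, which would indicate broken data.

```python
# taxonID -> (scientificName, taxonRank, parentNameUsageID)
TAXA = {
    1: ("Animalia", "Kingdom", 0),
    2: ("Chordata", "Phylum", 1),
    4: ("Mammalia", "Class", 2),
    6: ("Carnivora", "Order", 4),
    7: ("Felidae", "Family", 6),
    8: ("Panthera tigris", "species", 7),
    9: ("Panthera leo", "species", 7),
}


def classification(taxon_id, taxa):
    """Return [(rank, name), ...] from the root down to the given taxon."""
    path = []
    seen = set()
    while taxon_id in taxa:
        if taxon_id in seen:
            raise ValueError(f"cycle in parent references at taxonID {taxon_id}")
        seen.add(taxon_id)
        name, rank, parent_id = taxa[taxon_id]
        path.append((rank, name))
        taxon_id = parent_id  # move one level up the hierarchy
    path.reverse()
    return path


if __name__ == "__main__":
    for rank, name in classification(8, TAXA):
        print(f"{rank}: {name}")
```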

Normalised by taxon name (rarely used). This example is the same as above, but the higher taxon is referenced by its literal name instead of by its ID. There are few advantages to this format. The disadvantage is that the relationship between two rows is made by matching name strings instead of IDs. This can be problematic if there are homonyms in the data or if a name is spelled differently in the two places where it occurs.

taxonID  scientificName    taxonRank  parentNameUsage
1        Animalia          Kingdom
2        Chordata          Phylum     Animalia
3        Arthropoda        Phylum     Animalia
4        Mammalia          Class      Chordata
5        Insecta           Class      Arthropoda
6        Carnivora         Order      Mammalia
7        Felidae           Family     Carnivora
8        Panthera tigris   species    Felidae
9        Panthera leo      species    Felidae
10       Acinonyx jubatus  species    Felidae
11       Panthera pardus   species    Felidae
12       Hymenoptera       Order      Insecta
13       Apidae            Family     Hymenoptera
14       Apis mellifera    species    Apidae
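
The two failure modes of name-based references, misspellings and homonyms, can be detected before publishing. The following is a rough sketch, assuming rows are given as (taxonID, scientificName, taxonRank, parentNameUsage) tuples; the function name `check_parent_names` and the labels "unresolved" and "ambiguous" are hypothetical. It reports every parent name that matches no row or more than one row.

```python
from collections import defaultdict


def check_parent_names(rows):
    """Find parent names that match no row or more than one row.

    rows: iterable of (taxonID, scientificName, taxonRank, parentNameUsage).
    Returns a list of (taxonID, parentNameUsage, problem) tuples.
    """
    rows = list(rows)

    # Index every taxonID by its scientific name.
    ids_by_name = defaultdict(list)
    for taxon_id, name, _rank, _parent in rows:
        ids_by_name[name].append(taxon_id)

    problems = []
    for taxon_id, _name, _rank, parent in rows:
        if not parent:
            continue  # top of the hierarchy: nothing to resolve
        matches = ids_by_name.get(parent, [])
        if len(matches) == 0:
            problems.append((taxon_id, parent, "unresolved"))
        elif len(matches) > 1:
            problems.append((taxon_id, parent, "ambiguous"))
    return problems


if __name__ == "__main__":
    sample = [
        (1, "Animalia", "Kingdom", ""),
        (3, "Arthropoda", "Phylum", "Animalia"),
        (5, "Insecta", "Class", "Arthropoda"),
        (12, "Hymenoptera", "Order", "Insecta"),
        (13, "Apidae", "Family", "Hymenoptera"),
        (14, "Apis mellifera", "species", "Apidea"),  # deliberate misspelling
    ]
    for problem in check_parent_names(sample):
        print(problem)
```

An exact string comparison is used on purpose: it is the same matching a consumer of the data would have to perform, so anything flagged here would also fail to link downstream.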

