Nom5AuthoritativeIndexProposal  
A proposed structure for an index of authoritative classifications
Updated Sep 3, 2009 by dprem...@gmail.com

This page is under construction and subject to significant revision. Evaluation of the occurrence data showed that a very large proportion of the occurrences are identified only to a higher-level taxon (e.g. genus). The parsing approach used so far starts by cleaning the raw classification; for the purpose of understanding this index structure, that cleaning can be considered as simple as:

  • Aus | Bus | Cus | Cus sp.

becomes

  • Aus | Bus | Cus
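The cleaning step above can be sketched as follows. This is a minimal illustration only: the function name `clean_classification` and the list of indeterminate markers (`sp.`, `spp.`, `indet.`) are assumptions, since the page does not list the actual cleaning rules.

```python
import re

# Assumption: these suffixes mark an indeterminate identification.
# The real cleaning rules are not specified on this page.
INDETERMINATE = re.compile(r"\b(sp|spp|indet)\.?$", re.IGNORECASE)


def clean_classification(raw: str) -> list[str]:
    """Split a pipe-delimited classification and drop indeterminate trailing names."""
    names = [part.strip() for part in raw.split("|")]
    names = [name for name in names if name]
    # Remove trailing entries such as "Cus sp." that carry no information
    # beyond the higher taxon that precedes them.
    while names and INDETERMINATE.search(names[-1]):
        names.pop()
    return names


cleaned = clean_classification("Aus | Bus | Cus | Cus sp.")
```

Here `cleaned` holds the three remaining names, so the occurrence would be matched against the index at genus level.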

To organise raw occurrence data, we want access to an index built from the merged authoritative management classification, with the following proposed structure. Each key maps to zero or more entries of the form [ Name | author | rank | usageId ]:

  Name -> [ Name | author | rank | usageId ]*
  Name + Author -> [ Name | author | rank | usageId ]*
  Name_abbreviated + Author -> [ Name | author | rank | usageId ]*

Example

Taking the Catalogue of Life as the single authoritative source, the index entries for Puma concolor (Linnaeus, 1771) would be as follows. Each key is followed by its classification, one entry per line:

Puma concolor (Linnaeus, 1771)
  Animalia | | KINGDOM | 994
  Chordata | | PHYLUM | 995
  Mammalia | | CLASS | 996
  Carnivora | | ORDER | 997
  Felidae | | FAMILY | 998
  Puma | | GENUS | 999
  Puma concolor | (Linnaeus, 1771) | SPECIES | 1000

P. concolor (Linnaeus, 1771)
  Animalia | | KINGDOM | 994
  Chordata | | PHYLUM | 995
  Mammalia | | CLASS | 996
  Carnivora | | ORDER | 997
  Felidae | | FAMILY | 998
  Puma | | GENUS | 999
  Puma concolor | (Linnaeus, 1771) | SPECIES | 1000

Puma concolor
  Animalia | | KINGDOM | 994
  Chordata | | PHYLUM | 995
  Mammalia | | CLASS | 996
  Carnivora | | ORDER | 997
  Felidae | | FAMILY | 998
  Puma | | GENUS | 999
  Puma concolor | (Linnaeus, 1771) | SPECIES | 1000

P. concolor
  Animalia | | KINGDOM | 994
  Chordata | | PHYLUM | 995
  Mammalia | | CLASS | 996
  Carnivora | | ORDER | 997
  Felidae | | FAMILY | 998
  Puma | | GENUS | 999
  Puma concolor | (Linnaeus, 1771) | SPECIES | 1000

Puma
  Animalia | | KINGDOM | 994
  Chordata | | PHYLUM | 995
  Mammalia | | CLASS | 996
  Carnivora | | ORDER | 997
  Felidae | | FAMILY | 998
  Puma | | GENUS | 999

Felidae
  Animalia | | KINGDOM | 994
  Chordata | | PHYLUM | 995
  Mammalia | | CLASS | 996
  Carnivora | | ORDER | 997
  Felidae | | FAMILY | 998

Carnivora
  Animalia | | KINGDOM | 994
  Chordata | | PHYLUM | 995
  Mammalia | | CLASS | 996
  Carnivora | | ORDER | 997

Mammalia
  Animalia | | KINGDOM | 994
  Chordata | | PHYLUM | 995
  Mammalia | | CLASS | 996

Chordata
  Animalia | | KINGDOM | 994
  Chordata | | PHYLUM | 995

Animalia
  Animalia | | KINGDOM | 994
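As a rough illustration of how the keys in this example could be derived from a single lineage, here is a minimal Python sketch. The names `Usage`, `abbreviate`, `index_keys` and `build_index` are hypothetical and not part of any existing code base; the usage ids are the ones from the example, and the abbreviation rule (first letter of the genus plus a full stop) is an assumption.

```python
from typing import NamedTuple


class Usage(NamedTuple):
    """One [ Name | author | rank | usageId ] entry."""
    name: str
    author: str
    rank: str
    usage_id: int


def abbreviate(name: str) -> str:
    """Abbreviate the genus of a multi-part name: 'Puma concolor' -> 'P. concolor'."""
    genus, _, rest = name.partition(" ")
    return f"{genus[0]}. {rest}" if rest else name


def index_keys(usage: Usage) -> list[str]:
    """Return every lookup key for one name usage."""
    keys = [usage.name]
    short = abbreviate(usage.name)
    if short != usage.name:
        keys.append(short)
    if usage.author:
        # Add the same keys again with the author string appended.
        keys.extend(f"{key} {usage.author}" for key in list(keys))
    return keys


def build_index(lineage: list[Usage]) -> dict[str, list[Usage]]:
    """Map each key of each taxon to the classification from the root down to that taxon."""
    index = {}
    for depth, usage in enumerate(lineage, start=1):
        for key in index_keys(usage):
            index[key] = lineage[:depth]
    return index


LINEAGE = [
    Usage("Animalia", "", "KINGDOM", 994),
    Usage("Chordata", "", "PHYLUM", 995),
    Usage("Mammalia", "", "CLASS", 996),
    Usage("Carnivora", "", "ORDER", 997),
    Usage("Felidae", "", "FAMILY", 998),
    Usage("Puma", "", "GENUS", 999),
    Usage("Puma concolor", "(Linnaeus, 1771)", "SPECIES", 1000),
]

index = build_index(LINEAGE)
```

With this lineage, `index` contains the ten keys listed in the example, and `index["Felidae"]` is the five-entry classification ending at the family.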

In practice, because of homonyms, each name in the index would map to a Set<Classification>, so that a search for a name such as "Oenanthe" (used for both a plant genus and a bird genus) returns all candidate classifications.
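The set-valued index can be sketched in Python as below. The class name `HomonymIndex` is hypothetical, and the two lineages are abbreviated, illustrative paths rather than records taken from the Catalogue of Life. For brevity the sketch keys each classification only by its terminal name.

```python
from collections import defaultdict


class HomonymIndex:
    """Name -> set of classifications, so that homonyms keep all their candidates."""

    def __init__(self):
        self._entries = defaultdict(set)

    def add(self, classification):
        # A classification is an ordered sequence of names from the root down;
        # it is stored as a tuple so that it can be a member of a set.
        path = tuple(classification)
        self._entries[path[-1]].add(path)

    def lookup(self, name):
        """Return all candidate classifications for a name (empty set if unknown)."""
        return set(self._entries.get(name, ()))


homonyms = HomonymIndex()
# Illustrative, shortened lineages (assumption: not real index records).
homonyms.add(["Plantae", "Apiaceae", "Oenanthe"])
homonyms.add(["Animalia", "Muscicapidae", "Oenanthe"])

candidates = homonyms.lookup("Oenanthe")
```

A caller would then pick among `candidates` using whatever else is known about the occurrence record, for example its kingdom or family.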

Tim has tried to load CoL2009ac into memory in this format, and it does not work well, so the proposal is to look at Lucene or an RDBMS for this. Lucene seems a logical candidate.
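To make the RDBMS option concrete, here is a minimal sketch using SQLite through Python's standard `sqlite3` module. The table name `name_index`, its columns, and the pipe-joined classification string are all assumptions for illustration; no schema has been agreed, and a Lucene index would be organised differently.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) a disk-backed index; ':memory:' is used here only for the demo."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS name_index (
               lookup_key     TEXT    NOT NULL,
               usage_id       INTEGER NOT NULL,
               classification TEXT    NOT NULL,
               PRIMARY KEY (lookup_key, usage_id)
           )"""
    )
    return conn


def add_entry(conn, key, usage_id, classification):
    """Store one key -> classification mapping; exact duplicates are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO name_index VALUES (?, ?, ?)",
        (key, usage_id, classification),
    )


def find_candidates(conn, key):
    """Return all (usage_id, classification) rows for a key, homonyms included."""
    cursor = conn.execute(
        "SELECT usage_id, classification FROM name_index "
        "WHERE lookup_key = ? ORDER BY usage_id",
        (key,),
    )
    return cursor.fetchall()


conn = open_index()
path = "Animalia|Chordata|Mammalia|Carnivora|Felidae|Puma"
add_entry(conn, "Puma", 999, path)
rows = find_candidates(conn, "Puma")
```

The composite primary key lets one lookup key hold several usages, which covers the homonym case, while keeping lookups by key indexed.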

