This page is under construction and subject to significant revision. Evaluation of the occurrence data showed that a very large number of occurrences are identified only to a higher-level taxon (e.g. genus). The parsing approach used so far starts by cleaning the raw classification; for the purpose of understanding this index structure, the cleaning can be thought of as simply:
- Aus | Bus | Cus | Cus sp.
becomes
To organise raw occurrence data, we want access to an index, built from the merged authoritative management classification, with the following proposed structure:
Name -> [ Name | author | rank | usageId ]*
Name + Author -> [ Name | author | rank | usageId ]*
Name_abbreviated + Author -> [ Name | author | rank | usageId ]*
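As a minimal sketch of how the three key forms above could be derived for a single name, assuming the abbreviated form shortens only the first word (the genus) to its initial. The function names `abbreviate` and `index_keys` are hypothetical and not part of any existing code:

```python
from typing import List, Optional


def abbreviate(name: str) -> Optional[str]:
    """Shorten the first word to its initial: 'Puma concolor' -> 'P. concolor'."""
    parts = name.split()
    if len(parts) < 2:
        return None  # a uninomial such as 'Puma' has no abbreviated form
    return f"{parts[0][0]}. {' '.join(parts[1:])}"


def index_keys(name: str, author: str = "") -> List[str]:
    """Return the lookup keys proposed above for one name (assumed rules)."""
    keys = [name]                          # Name
    if author:
        keys.append(f"{name} {author}")    # Name + Author
    short = abbreviate(name)
    if short and author:
        keys.append(f"{short} {author}")   # Name_abbreviated + Author
    return keys


keys = index_keys("Puma concolor", "(Linnaeus, 1771)")
```

Higher taxa without an author, such as "Felidae", yield a single key under these assumptions.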
Example
Considering the Catalogue of Life as a single authoritative source, the following describes the index structure for Puma concolor (Linnaeus, 1771):
Puma concolor (Linnaeus, 1771) | Animalia | | KINGDOM | 994 | Chordata | | PHYLUM | 995 | Mammalia | | CLASS | 996 | Carnivora | | ORDER | 997 | Felidae | | FAMILY | 998 | Puma | | GENUS | 999 | Puma concolor | (Linnaeus, 1771) | SPECIES | 1000 |
P. concolor (Linnaeus, 1771) | Animalia | | KINGDOM | 994 | Chordata | | PHYLUM | 995 | Mammalia | | CLASS | 996 | Carnivora | | ORDER | 997 | Felidae | | FAMILY | 998 | Puma | | GENUS | 999 | Puma concolor | (Linnaeus, 1771) | SPECIES | 1000 |
Puma concolor | Animalia | | KINGDOM | 994 | Chordata | | PHYLUM | 995 | Mammalia | | CLASS | 996 | Carnivora | | ORDER | 997 | Felidae | | FAMILY | 998 | Puma | | GENUS | 999 | Puma concolor | (Linnaeus, 1771) | SPECIES | 1000 |
P. concolor | Animalia | | KINGDOM | 994 | Chordata | | PHYLUM | 995 | Mammalia | | CLASS | 996 | Carnivora | | ORDER | 997 | Felidae | | FAMILY | 998 | Puma | | GENUS | 999 | Puma concolor | (Linnaeus, 1771) | SPECIES | 1000 |
Puma | Animalia | | KINGDOM | 994 | Chordata | | PHYLUM | 995 | Mammalia | | CLASS | 996 | Carnivora | | ORDER | 997 | Felidae | | FAMILY | 998 | Puma | | GENUS | 999 |
Felidae | Animalia | | KINGDOM | 994 | Chordata | | PHYLUM | 995 | Mammalia | | CLASS | 996 | Carnivora | | ORDER | 997 | Felidae | | FAMILY | 998 |
Carnivora | Animalia | | KINGDOM | 994 | Chordata | | PHYLUM | 995 | Mammalia | | CLASS | 996 | Carnivora | | ORDER | 997 |
Mammalia | Animalia | | KINGDOM | 994 | Chordata | | PHYLUM | 995 | Mammalia | | CLASS | 996 |
Chordata | Animalia | | KINGDOM | 994 | Chordata | | PHYLUM | 995 |
Animalia | Animalia | | KINGDOM | 994 |
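The listing above can be reproduced with a short sketch that walks the lineage once and maps every key to the classification path down to that taxon. This is only an illustration under the assumption that each entry is a (name, author, rank, usageId) tuple; `build_index` is a hypothetical helper, and the usageIds are the ones shown in the example:

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, str, str, int]  # (name, author, rank, usageId)

LINEAGE: List[Entry] = [
    ("Animalia", "", "KINGDOM", 994),
    ("Chordata", "", "PHYLUM", 995),
    ("Mammalia", "", "CLASS", 996),
    ("Carnivora", "", "ORDER", 997),
    ("Felidae", "", "FAMILY", 998),
    ("Puma", "", "GENUS", 999),
    ("Puma concolor", "(Linnaeus, 1771)", "SPECIES", 1000),
]


def build_index(lineage: List[Entry]) -> Dict[str, List[Entry]]:
    """Map each name variant to the classification path ending at that taxon."""
    index: Dict[str, List[Entry]] = {}
    for depth, (name, author, _rank, _usage_id) in enumerate(lineage, start=1):
        path = lineage[:depth]          # root down to the current taxon
        keys = [name]
        parts = name.split()
        if len(parts) > 1:              # abbreviated form, e.g. 'P. concolor'
            keys.append(f"{parts[0][0]}. {' '.join(parts[1:])}")
        if author:                      # variants with the author appended
            keys.extend([f"{key} {author}" for key in list(keys)])
        for key in keys:
            index[key] = path
    return index


index = build_index(LINEAGE)
```

A lookup such as `index["P. concolor"]` then returns the full seven-level path, while `index["Felidae"]` returns only the five levels down to the family.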
In practice, because of homonyms, each name in the index would map to a Set<Classification>, so that a search for (e.g.) "Oenanthe" returns all of the candidate classifications.
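A rough sketch of the set-valued index, using "Oenanthe", which is both a bird genus and a plant genus. The classifications are shortened to three levels and the usageIds are invented for illustration only; `add_classification` is a hypothetical helper:

```python
from collections import defaultdict
from typing import DefaultDict, Set, Tuple

Entry = Tuple[str, str, str, int]      # (name, author, rank, usageId)
Classification = Tuple[Entry, ...]     # a tuple is hashable, so it can sit in a set


def add_classification(index: DefaultDict[str, Set[Classification]],
                       classification: Classification) -> None:
    """Register a classification under the name of its lowest taxon."""
    name = classification[-1][0]
    index[name].add(classification)


index: DefaultDict[str, Set[Classification]] = defaultdict(set)

# Shortened, illustrative classifications; the usageIds are made up.
bird = (("Animalia", "", "KINGDOM", 1),
        ("Muscicapidae", "", "FAMILY", 2),
        ("Oenanthe", "", "GENUS", 3))
plant = (("Plantae", "", "KINGDOM", 4),
         ("Apiaceae", "", "FAMILY", 5),
         ("Oenanthe", "", "GENUS", 6))

add_classification(index, bird)
add_classification(index, plant)

candidates = index["Oenanthe"]         # both homonyms are returned
```

The caller would then have to pick between the candidates, for example by comparing the higher taxa supplied with the occurrence record.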
Tim tried loading CoL2009ac into memory in this format and it did not work well, so the proposal is to look at Lucene or an RDBMS for this instead. Lucene seems the logical candidate.