Sunday, August 09, 2015

More Neo4J tests of GBIF taxonomy: Using IPNI to find objective synonyms

Following on from Testing the GBIF taxonomy with the graph database Neo4J I've added a more complex test that relies on linking taxa to names. In this case I've picked some legume genera (Coursetia and Poissonia) where there have been frequent changes of name. By mapping the GBIF taxa to IPNI names (and associated LSIDs) we can build a graph linking taxa to names, and then to objective synonyms (by resolving the IPNI LSIDs and following the links to the basionym), see http://gist.neo4j.org/?4df5af75d42e0f963e5d.

In this example we find species that occur twice in the GBIF taxonomy, which logically should not happen as the names are objective synonyms. We can detect these problems if we have access to nomenclatural data. in this case, because IPNI has tracked the names changes, we can infer that, say, Coursetia heterantha and Poissonia heterantha are synonyms, and hence only one of these should appear in the GBIF classification. This is an example that illustrates the desirability of separating names and taxa, see Modelling taxonomic names in databases.