DwC-DP Conceptual Model: Identification #787

@nielsklazenga

I think there are still a few issues with respect to how the Identification resource is placed in the Darwin Core Conceptual Model and implemented in the Darwin Core Data Package:

  1. To start with, identifications are not entities, as they do not exist independently (there needs to be an organism and a taxon for there to be an identification). In the distinction that is made between Events and Entities in the Conceptual Model, identifications have all the attributes of Events. However, while Events are all about the 'when', 'where' and 'by whom', these attributes are of less importance in identifications, which are more about 'what for' and 'what to'. Therefore, identifications are more accurately categorised as Annotations.
  2. Regarding the 'what for', ultimately identifications are, of course, for organisms, but in the majority of Darwin Core data organisms are not represented by instances of dwc:Organism, but assumed in instances of dwc:Occurrence. Therefore, besides an 'of a' relationship to the Organism, the Conceptual Model should also have an 'of a' relationship to the Occurrence. The 'of a' relationship between Identification and Occurrence is already assumed in the Darwin Core Archive.
  3. Just observing here that the DwC IRI namespace has terms that link dwc:Identification to dwc:Taxon or tcs:TaxonConcept (dwciri:toTaxon) and to dcterms:Agent (dwciri:identifiedBy), but there are no terms that link dwc:Identification to dwc:Organism or dwc:Occurrence (there is also nothing that links dwc:Occurrence to dwc:Event). I think it would be good to define those terms in the dwciri namespace, so that the Darwin Core Conceptual Model reflects Darwin Core. There are so many new proposals that I have lost track, so this might have been proposed already.
  4. An 'of a' relationship between dwc:Identification and dwc:MaterialEntity seems like a no-brainer, but I am not sure if it is needed right now. Duplicate specimens that belong to the same occurrence can have different identification histories, but the same occurrence does not necessarily mean the same instance of dwc:Occurrence. I think it is best to leave it out for now, so that it can be proposed independently later on, at which stage all the implications, including conflicting identifications, can be considered.
  5. I have big issues with the 'based on' relationships in the Conceptual Model. Identifications are based on diagnostic characters, which might be features that can be seen in images or dissimilarity between nucleotide sequences. They are not based on the Images or NucleotideSequences themselves and certainly not on Occurrences (which themselves can be based on images or DNA sequences, so you get chicken-or-egg situations). No matter how people think about the semantics, there is no 'based on' concept in Darwin Core, which the Conceptual Model is for. If people want the concept in Darwin Core, it should be proposed independently of the Darwin Core Data Package and Conceptual Model and then we can discuss how to implement it. Putting it in the Darwin Core Data Package and Conceptual Model now and hoping for the standard to catch up is putting the cart before the horse. Moreover, the 'based on' relationship is a polymorphic (multiple types of object) many-to-many (multiple objects) relationship, which is not super easy to implement in a Data Package. The way it is implemented in the proposed Darwin Core Data Package, as a bunch of many-to-one relationships, means that an identification can be 'based on' a single image and a single nucleotide sequence, but not on multiple images or multiple sequences, while those latter two scenarios are much more likely to happen than the former. Most importantly, an Identification MUST have one and only one 'of a' relationship, but it can be to an Occurrence or to an Organism, so we cannot make occurrenceID or organismID required. Making available four or five more ID terms with no semantics other than what the object is will muddy the water even more for providers.
  6. When there are multiple identifications for the same occurrence or organism record, we need to have a way to indicate which of those identifications is the current or accepted one (it is not necessarily the most recent one). In the Darwin Core Archive this is done by having the current identification in the Occurrence core, while ABCD has the PreferredFlag element. How is this going to be done in the Darwin Core Data Package or RDF?
  7. I am not sure if the TaxonIdentification resource is still meant to be there. It is in conflict with the Conceptual Model, as having a TaxonIdentification resource makes the relationship between Identification and Taxon in the Data Package many-to-many, while in the Conceptual Model it is many-to-one (although it does not say this in so many words).
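
To make the constraints in points 5 and 6 concrete, here is a minimal Python sketch. It is not part of the Darwin Core Data Package; the class, the field names and the `is_preferred` flag (loosely modelled on the ABCD PreferredFlag) are all assumptions made for illustration. It checks that each identification has exactly one 'of a' target, and that each target has exactly one preferred identification.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Identification:
    # All field names are hypothetical, chosen for this sketch only.
    identification_id: str
    taxon_id: str
    occurrence_id: Optional[str] = None  # 'of a' Occurrence
    organism_id: Optional[str] = None    # 'of a' Organism
    is_preferred: bool = False           # assumed flag, cf. ABCD PreferredFlag


def has_single_target(ident: Identification) -> bool:
    """True when exactly one of occurrence_id / organism_id is set."""
    targets = [ident.occurrence_id, ident.organism_id]
    return sum(t is not None for t in targets) == 1


def target_of(ident: Identification) -> Tuple[str, str]:
    """Return (kind, id) of the single 'of a' target, or raise ValueError."""
    if not has_single_target(ident):
        raise ValueError(
            f"{ident.identification_id}: need exactly one 'of a' target"
        )
    if ident.occurrence_id is not None:
        return ("occurrence", ident.occurrence_id)
    return ("organism", ident.organism_id)


def targets_without_single_preferred(
    idents: List[Identification],
) -> List[Tuple[str, str]]:
    """List the targets that do not have exactly one preferred identification."""
    counts = {}
    for ident in idents:
        key = target_of(ident)
        counts[key] = counts.get(key, 0) + int(ident.is_preferred)
    return sorted(key for key, n in counts.items() if n != 1)
```

Because neither ID field can be required on its own, the 'exactly one' rule has to be enforced as a cross-field check like `has_single_target`, which a plain per-field "required" constraint cannot express.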
