Artifact

Definition: A physical or digital entity that is created, collected, modified, or cataloged by an agent.

Description and Use:

  • Artifacts are the products of agent-driven activities, and represent things to which Contributions are made.
  • We are primarily concerned with material or informational artifacts used in or generated by research-related activities. This may include natural specimens and man-made archaeological artifacts that are collected, modified, or cataloged for research purposes (e.g. a dinosaur fossil, an arctic ice core sample, prehistoric human tool fragments).
  • Because the characteristics of Artifacts will vary dramatically across research activities and domains, the CAM model of Artifacts specifies only generic metadata attributes, and leaves domain-specific characteristics to implementations to define as formal extensions of the base Artifact class.

Information Model:

Field | Description | Cardinality | Requirement | Data Type
----- | ----------- | ----------- | ----------- | ---------
id | A unique string that identifies the artifact. | 1..1 | MUST | identifier
type | The high-level class to which the artifact belongs (always set to ‘Artifact’). | 1..1 | MUST | class
label | A free-text name for the artifact. | 0..1 | MAY | string
description | A free-text description of the artifact. | 0..1 | MAY | string
externalID | Additional identifier(s) for the artifact that come from an external system or authority. | 0..m | MAY | identifier
artifactType | A more specific type for the artifact. | 0..m | SHOULD | coding <<Artifact Type>>
dateCreated | The date on which the current version or form of the artifact was completed. | 0..1 | MAY | dateTime
dateModified | The date on which the artifact was last updated or modified. | 0..1 | MAY | dateTime
url | A URL where information about the Artifact can be found. | 0..m | MAY | url
qualifiedContribution | A particular contribution made by an agent to the artifact. | 0..m | MAY | Contribution
influencedBy | A separate artifact that directly or indirectly influenced creation of the artifact of interest. | 0..m | MAY | Artifact
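
As an illustration of the information model above, the following is a minimal Python sketch, not part of the CAM specification. The dataclass layout, the `Coding` and `Contribution` placeholder classes, and the validation in `__post_init__` are assumptions made for this example; only the field names, cardinalities, and requirement levels come from the table. The sample identifiers are the ones cited in the Implementation Notes for the Mouse Genome Informatics publication record.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Coding:
    """Hypothetical stand-in for the CAM 'coding' data type."""
    code: str
    label: Optional[str] = None


@dataclass
class Contribution:
    """Hypothetical placeholder; Contribution is defined elsewhere in CAM."""
    id: str


@dataclass
class Artifact:
    id: str                                                   # 1..1 MUST
    type: str = "Artifact"                                    # 1..1 MUST, always 'Artifact'
    label: Optional[str] = None                               # 0..1 MAY
    description: Optional[str] = None                         # 0..1 MAY
    externalID: List[str] = field(default_factory=list)       # 0..m MAY
    artifactType: List[Coding] = field(default_factory=list)  # 0..m SHOULD
    dateCreated: Optional[datetime] = None                    # 0..1 MAY
    dateModified: Optional[datetime] = None                   # 0..1 MAY
    url: List[str] = field(default_factory=list)              # 0..m MAY
    qualifiedContribution: List[Contribution] = field(default_factory=list)  # 0..m MAY
    influencedBy: List["Artifact"] = field(default_factory=list)             # 0..m MAY

    def __post_init__(self) -> None:
        # Enforce the two MUST-level constraints from the table.
        if not self.id:
            raise ValueError("id is required (1..1 MUST)")
        if self.type != "Artifact":
            raise ValueError("type must always be 'Artifact'")


# Internal identifier in id, external PubMed identifier in externalID.
publication = Artifact(id="J:33382", externalID=["8662814"])
```

A real implementation would likely add domain-specific attributes by subclassing `Artifact`, in line with the extension approach described under Description and Use.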


Examples:

  • The Contribution Role Ontology
  • A poster and abstract submission about the Architecting Attribution project
  • A HeLa cell line
  • A protocol for culturing cell lines
  • A dataset about tetrapod bony lesions
  • A catalog entry for a centrifuge instrument
  • The centrifuge itself
  • The CIViC knowledgebase containing curated information about cancer mutations
  • An individual record from CIViC about the BRAF V600E mutation
  • A dinosaur fossil collected and cataloged from a research site
  • A prehistoric tool fragment specimen collected and cataloged from an archaeological site
  • An ice specimen collected from an arctic glacier.
  • A catalog record describing the ice specimen.

Implementation Notes:

  • Using the Artifact Type Value Set (artifactType)

    • Use of the Artifact Type Value Set that is bound to the artifactType attribute above is RECOMMENDED but not required.
    • Implementations can refine or extend the value set provided as part of the CAM specification, or use their own, as described in the Implementation Guide.
  • Artifact Identifiers (id and externalID)

    • Artifact identifiers can be captured using the id and externalID attributes. The id attribute MUST hold a single identifier that will be used to track/reference the artifact in an implementing system. This can be an internal de novo identifier, or one borrowed from an external resource or registry (e.g. a PMID for a publication, or ISBN for a book).
    • Additional external identifiers for the artifact MAY be captured using the externalID attribute. For example, the publication described in this Mouse Genome Informatics record has an internally minted identifier (J:33382) that may be captured in the id slot, and an external PubMed identifier (8662814) that can be captured in the externalID slot.
  • Typing of Artifacts (type and artifactType)

    • The type attribute MUST be filled with the generic ‘Artifact’ type. To capture a more specific artifact type, implementations can use the artifactType attribute and bind it to a value set of terms suited to their domain and use case. We RECOMMEND using the Artifact Type Value Set provided as part of this specification, which can be used in whole or in part and refined or extended as needed, but ad hoc value sets can be defined and used if desired.
  • Artifact Modification (dateModified)

    • The meaning of ‘modified’ may vary depending on artifact type and context of use, and SHOULD be clarified by a given implementation.
    • For material artifacts, this can include physical alterations or additions that maintain the identity of the artifact. For informational artifacts, this can include updates to content or structure that do not result in a new version with a separate identity in the system of record.
  • Natural and Archaeological Artifacts (dateCreated and dateModified)

    • Many natural or archaeological artifacts originate outside of a research setting, and are only collected and documented as specimens much later (e.g. a dinosaur tooth fossil, or prehistoric tool fragments). Here, dateCreated SHOULD be used to record the date on which such specimens were collected, not the date the collected material originally came into existence (which may have been thousands or millions of years ago). Similarly, dateModified SHOULD be used to record when modifications were last made to the specimen in a research context (e.g. its last cleaning or sample extraction).
    • In cases where natural specimens are observed and documented, but not physically collected or modified, we RECOMMEND describing contributions to a catalog record about the specimen (as there are no contributions to the physical specimen itself to track).
  • Influence Relationships Between Artifacts (influencedBy)

    • The notion of an ‘Influence’ between two artifacts broadly describes scenarios where one is directly or indirectly used in the creation of another. It is based on the PROV notion of influence - but narrower in that it applies here only between two Artifacts.
    • Influences can include derivation or incorporation of material or informational content - e.g. a cell line being derived from a tumor specimen, incorporation of a JPG image into a blog post, or a format translation from a JSON dataset to an RDF version of the dataset. Influences can also cover an artifact providing a source of information used to generate an artifact with entirely separate content - e.g. a dataset on ice core CO2 levels used as evidence for an assertion about the rate of arctic climate change, a microscope/camera used to take images of tissue samples, or a knockout mouse strain used to generate data about blood glucose levels which support a phenotype annotation made on the deleted gene.
    • The CAM defines a single, generic influencedBy attribute to describe the artifact-artifact relationship in such scenarios. But implementations MAY define specializations of this attribute with more specific meaning - e.g. derivedFrom, revisionOf, informedBy, providesEvidenceFor, etc.
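
The influence relationships described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a normative part of CAM: the pared-down `Artifact` class, the `all_influences` helper, and the artifact identifiers are hypothetical. The example treats derivedFrom (one of the specializations named above) as implying the generic influencedBy relationship, and walks the chain to collect both direct and indirect influences.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Artifact:
    """Pared-down Artifact holding only the fields this sketch needs."""
    id: str
    influencedBy: List["Artifact"] = field(default_factory=list)
    # Hypothetical implementation-defined specialization of influencedBy.
    derivedFrom: List["Artifact"] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumption: every derivation is also recorded as a generic influence.
        for source in self.derivedFrom:
            if not any(existing is source for existing in self.influencedBy):
                self.influencedBy.append(source)


def all_influences(artifact: Artifact) -> List[str]:
    """Return ids of all direct and indirect influences, visiting each once."""
    seen: List[str] = []
    stack = list(artifact.influencedBy)
    while stack:
        current = stack.pop()
        if current.id in seen:
            continue  # guard against cycles and repeated paths
        seen.append(current.id)
        stack.extend(current.influencedBy)
    return seen


# Hypothetical identifiers: a cell line derived from a tumor specimen,
# and a dataset generated using that cell line.
tumor = Artifact(id="tumor-specimen-1")
cell_line = Artifact(id="cell-line-1", derivedFrom=[tumor])
dataset = Artifact(id="dataset-1", influencedBy=[cell_line])

print(all_influences(dataset))  # ['cell-line-1', 'tumor-specimen-1']
```

Keeping the generic influencedBy link populated alongside any specialization means that consumers unaware of an implementation's custom attributes can still follow the provenance chain.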