
Data Operations

Cristian Vasquez edited this page Oct 17, 2024 · 13 revisions


Operational metadata makes it possible to efficiently identify data that must be created, updated, or deleted in downstream systems such as CELLAR, and to notify upstream systems about updates in data sources.

Granularity

The operational metadata is available at two levels of granularity:

  • Batch level: The metadata describing the sources, the job, and the data outcomes (partition)
  • Individual level: The metadata specific to a notice
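
The two levels can be pictured as a batch record that contains one record per notice. The following is a minimal sketch only: the class names and every field (job_id, source, status, ontology_version, error) are assumptions made for illustration, not the actual metadata schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class NoticeMetadata:
    """Individual level: metadata specific to one notice (hypothetical fields)."""
    notice_id: str
    status: str                      # assumed values, e.g. "transformed" or "failed"
    ontology_version: str
    error: Optional[str] = None      # filled in only when the transformation failed


@dataclass
class BatchMetadata:
    """Batch level: sources, job, and data outcomes (hypothetical fields)."""
    job_id: str
    source: str
    started_at: datetime
    notices: List[NoticeMetadata] = field(default_factory=list)

    def outcome_counts(self) -> Dict[str, int]:
        """Count the notices in this batch per status."""
        counts: Dict[str, int] = {}
        for notice in self.notices:
            counts[notice.status] = counts.get(notice.status, 0) + 1
        return counts
```

Keeping the notice records inside the batch record lets a daily summary be derived from the batch document alone, while a single notice can still be inspected on its own.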

Use Case Scenarios

Example use cases are:

  • Handling Failed Jobs: When a job fails, metadata offers key information for troubleshooting, such as identifying connectivity issues.
  • Managing Failed Notice Transformations: In case of transformation errors, metadata helps trace problems related to data sources, enrichment, or mapping issues.
  • Updating Notices for New Ontology Versions: As ontologies evolve, notices may need to be reprocessed to use updated mappings and controlled vocabularies.
  • Activating/Deactivating Private Fields: Changes in data privacy policies may require specific notices to be reprocessed to address privacy-related adjustments.
  • Summary of Processed Notices: Each day, the count of processed notices is retrieved.

Operation: Updating a Set of Notices of Interest

To update a set of notices, they first need to be selected (via query) and then scheduled for transformation.

  • Query Mechanism: Users should be able to define queries based on Operational Metadata, allowing notices to be selected by various criteria.
  • Schedule Custom Jobs: Notices identified through these queries can be used to initiate a new transformation job.
  • Prevent Concurrent Processing: During transformation, notices are locked to prevent concurrent processing and race conditions. Once the transformation is completed or fails, the lock is released, enabling further actions.
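
The select, schedule, and lock steps above could be sketched as follows. This is an in-memory illustration under stated assumptions: NoticeLocks, select_notices, and run_custom_job are hypothetical names, the metadata records are plain dictionaries, and a real deployment would likely keep locks in shared storage rather than in process memory.

```python
import threading
from typing import Callable, Dict, Iterable, List


class NoticeLocks:
    """Hypothetical in-memory registry of notices currently being transformed."""

    def __init__(self) -> None:
        self._locked = set()
        self._guard = threading.Lock()  # protects the set itself

    def acquire(self, notice_id: str) -> bool:
        """Lock a notice; return False if it is already locked."""
        with self._guard:
            if notice_id in self._locked:
                return False
            self._locked.add(notice_id)
            return True

    def release(self, notice_id: str) -> None:
        with self._guard:
            self._locked.discard(notice_id)


def select_notices(metadata: Iterable[dict],
                   predicate: Callable[[dict], bool]) -> List[str]:
    """Query step: pick notice IDs whose metadata matches the predicate."""
    return [record["notice_id"] for record in metadata if predicate(record)]


def run_custom_job(notice_ids: Iterable[str],
                   transform: Callable[[str], None],
                   locks: NoticeLocks) -> Dict[str, str]:
    """Transform each selected notice while holding its lock."""
    results: Dict[str, str] = {}
    for notice_id in notice_ids:
        if not locks.acquire(notice_id):
            results[notice_id] = "skipped (locked)"
            continue
        try:
            transform(notice_id)
            results[notice_id] = "transformed"
        except Exception as exc:
            results[notice_id] = f"failed: {exc}"
        finally:
            # Released on success and on failure, so further actions are possible.
            locks.release(notice_id)
    return results
```

For example, selecting with the predicate `lambda m: m["ontology_version"] == "1.0"` and passing the result to run_custom_job would reprocess only the notices mapped with that ontology version.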

Operation: Retrieve metadata

Example: A new batch metadata document is generated for each job triggered daily. This document is accessed via a URL and reported in Teams, with a notification such as "3,000 Notices were transformed".

  • Access via URL: Batch and Notice metadata can be retrieved by dereferencing a URL that follows a predefined pattern.
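
As an illustration of retrieval by URL pattern, the sketch below builds batch and notice metadata URLs and formats the daily notification text. The base URL, the /batch/ and /notice/ path segments, and the "transformed" key are placeholders assumed for this example; the actual pattern is not specified on this page.

```python
from urllib.parse import quote

# Placeholder base URL; the real pattern is an assumption, not the documented one.
BASE_URL = "https://example.org/metadata"


def batch_metadata_url(job_id: str) -> str:
    """Build the URL of a batch metadata document (hypothetical pattern)."""
    return f"{BASE_URL}/batch/{quote(job_id, safe='')}"


def notice_metadata_url(notice_id: str) -> str:
    """Build the URL of a notice metadata document (hypothetical pattern)."""
    return f"{BASE_URL}/notice/{quote(notice_id, safe='')}"


def summary_message(batch: dict) -> str:
    """Format the notification text from an assumed 'transformed' count."""
    return f"{batch['transformed']:,} Notices were transformed"
```

Identifiers are percent-encoded so that an ID containing a slash or space still yields a single, well-formed path segment.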
