Figure out how to archive Babel outputs

We probably want to archive the Babel files generated using the UMLS Level 0 data, to avoid _some_ copyright/licensing issues.

I propose archiving the Babel outputs as the following files:
* config.yaml (used to generate this Babel run): 7K
* compendia.tar.gz (Compendia files): 20G
* synonyms.tar.gz (Synonym files): 34G
* conflation.tar.gz (Conflation files): 171M
* reports.tar.gz (Reports): 124M
* intermediate.tar.gz (Intermediate files): 8.5G
* parquet.tar.gz (Parquet files from duckdb/parquet): 126G
* kgx.tar.gz (KGX files): 27G
* metadata.tar.gz (top-level metadata files): 11K

Alternatively, we could combine the core Babel outputs as:
* babel-outputs.tar.gz (Compendia, synonyms and conflation files): 
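
Either layout could be produced by a small script along these lines. This is only a sketch: the subdirectory names (`compendia`, `synonyms`, `conflation`) and the helpers `make_archive` and `archive_all` are assumptions for illustration, not the actual Babel layout or tooling. It returns each archive's size in bytes so the sizes in the list above can be re-measured for a given run.

```python
import tarfile
from pathlib import Path
from typing import Dict, Sequence

# Hypothetical mapping from archive name to the output subdirectories it holds.
# The directory names are assumptions; adjust them to the real Babel layout.
ARCHIVES: Dict[str, Sequence[str]] = {
    "compendia.tar.gz": ["compendia"],
    "synonyms.tar.gz": ["synonyms"],
    "conflation.tar.gz": ["conflation"],
    # The combined alternative would replace the three entries above with:
    # "babel-outputs.tar.gz": ["compendia", "synonyms", "conflation"],
}


def make_archive(output_root: Path, archive_path: Path, subdirs: Sequence[str]) -> int:
    """Write the given subdirectories into one gzipped tarball; return its size in bytes."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for subdir in subdirs:
            # arcname keeps paths inside the tarball relative to output_root.
            tar.add(output_root / subdir, arcname=subdir)
    return archive_path.stat().st_size


def archive_all(output_root: Path, dest_dir: Path) -> Dict[str, int]:
    """Create every archive listed in ARCHIVES and return {archive name: size in bytes}."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    return {
        name: make_archive(output_root, dest_dir / name, subdirs)
        for name, subdirs in ARCHIVES.items()
    }
```

Keeping the mapping in one dictionary means switching between the per-directory and combined layouts is a one-line change. `dest_dir` should live outside `output_root` so archives are never packed into each other.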

We should not archive the following directories:
* duckdb/duckdbs (DuckDB files): 171G
* duckdb/parquet (Parquet files): 146G
* sapbert-training-data/ (SapBERT training data): 24G
* logs/ (Logs from errors in previous runs): 2.6G
* .snakemake/ (Logs, config, etc.): 16M
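
If we ever archive the whole output directory in one pass, the directories above could be skipped with a `tarfile` filter. A minimal sketch, assuming the excluded paths are relative to the top of the Babel output directory; `skip_excluded` and `archive_everything_else` are hypothetical names, not existing Babel code.

```python
import tarfile
from pathlib import Path, PurePosixPath
from typing import Optional

# Paths from the list above, relative to the (assumed) Babel output root.
EXCLUDED = tuple(
    PurePosixPath(p)
    for p in (
        "duckdb/duckdbs",
        "duckdb/parquet",
        "sapbert-training-data",
        "logs",
        ".snakemake",
    )
)


def skip_excluded(info: tarfile.TarInfo) -> Optional[tarfile.TarInfo]:
    """Return None for excluded paths so tarfile drops them and does not recurse."""
    path = PurePosixPath(info.name)
    for excluded in EXCLUDED:
        if path == excluded or excluded in path.parents:
            return None
    return info


def archive_everything_else(output_root: Path, archive_path: Path) -> None:
    """Archive every top-level entry of output_root except the excluded paths.

    archive_path should be outside output_root, otherwise the archive
    would try to include itself.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for child in sorted(output_root.iterdir()):
            tar.add(child, arcname=child.name, filter=skip_excluded)
```

Because the filter returns `None` for an excluded directory itself, `tarfile` never descends into it, so the large DuckDB and log directories are not even read.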

Figure out how to archive Babel outputs #623
