This repository contains documents and scripts for the creation of a custom medical imaging metadata catalogue. This includes the following processes:
- Statistical metadata generation at different stages of the SMI pipeline (raw, staging, live)
- Derived metadata generation from DICOM metadata (e.g., DICOM-metadata based classification by body part)
- Import of DICOM standard metadata
- Creation of visualisations based on generated metadata via a web interface
| Directory | Contents |
|---|---|
| docs | Metadata and table schemas, overview of the data architecture. |
| metadata_collection | Scripts for the generation and collection of statistical metadata and storage in a MongoDB database. |
| metadata_studies | Derived metadata studies and experiments, including scripts for extracting metadata for analysis. |
| catalogue_ui | Implementation of Flask app as the catalogue UI. |
| modules | Shared document and database manipulation. |
| test | Test deployment of all components in a Docker environment. |
If you want to change the configuration (e.g., front-end serving port, database credentials), source an environment file such as config.env before running the docker-compose command:
$ source test/config.env
$ docker-compose -f test/docker-compose.yml up

Otherwise, the default values from the compose file will be used.
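The override-with-defaults behaviour can be illustrated with a minimal Python sketch of how a script might resolve its settings: values found in the environment (for example, after sourcing config.env) take precedence, and anything missing falls back to a default. The variable names MONGOUSER, MONGOPASS, MYSQLUSER and MYSQLPASS appear in this README; CATALOGUE_PORT, all default values, and the `resolve_settings` helper are hypothetical and not taken from the repository.

```python
import os
from typing import Mapping

# Hypothetical defaults for illustration only; the real defaults are
# defined in test/docker-compose.yml.
DEFAULTS = {
    "MONGOUSER": "mongo_user",
    "MONGOPASS": "mongo_pass",
    "MYSQLUSER": "mysql_user",
    "MYSQLPASS": "mysql_pass",
    "CATALOGUE_PORT": "5000",  # hypothetical variable name
}


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return one value per known setting, preferring the environment."""
    # An empty string is treated the same as an unset variable.
    return {key: env.get(key) or default for key, default in DEFAULTS.items()}


if __name__ == "__main__":
    settings = resolve_settings()
    for key in sorted(settings):
        # Avoid echoing credentials to the terminal.
        shown = "***" if key.endswith("PASS") else settings[key]
        print(f"{key}={shown}")
```

Passing the environment in as a mapping keeps the helper easy to test without touching the real process environment.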
The docker-compose command launches the following containers:
- smi-mariadb: acting as both staging and live database
- smi-mongodb: acting as raw and metadata database
- smi-catalogue: host of catalogue front and back end
It will also perform the following tasks:
- Install requirements.txt and custom modules in the smi-catalogue container.
- Make use of the DICOM metadata schema to generate synthetic documents in a dicom database on smi-mongodb.
- Make use of the table schema to create tables on smi-mariadb and populate them with data from the dicom MongoDB database. To emulate the processed data on the live system, two databases are populated: data_load2, representing the staging database, and smi, representing the live database.
- Perform metadata collection tasks (see documentation for more details).
- Run body part labelling (see documentation for more details).
- Deploy catalogue UI.
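To give a feel for what DICOM-metadata based body part labelling can involve, here is a minimal sketch. It is not the repository's actual method. It assumes each document is a dictionary keyed by DICOM attribute keyword, reads the standard attributes BodyPartExamined and StudyDescription, and matches them against a small keyword table. The `KEYWORDS` table, the sample documents, and the `label_body_part` function are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical keyword table: label -> terms to look for in the metadata text.
KEYWORDS = {
    "head": ("head", "brain", "skull"),
    "chest": ("chest", "thorax", "lung"),
    "abdomen": ("abdomen", "liver", "kidney"),
    "knee": ("knee",),
}


def label_body_part(doc: dict) -> str:
    """Label one document, trying BodyPartExamined before StudyDescription."""
    for field in ("BodyPartExamined", "StudyDescription"):
        # Missing or null attributes are treated as empty text.
        text = str(doc.get(field) or "").lower()
        tokens = set(re.findall(r"[a-z]+", text))
        for label, terms in KEYWORDS.items():
            if tokens.intersection(terms):
                return label
    return "unknown"


# Made-up documents standing in for records from the dicom database.
docs = [
    {"BodyPartExamined": "CHEST", "StudyDescription": "XR Chest PA"},
    {"BodyPartExamined": None, "StudyDescription": "MRI Brain with contrast"},
    {"StudyDescription": "Follow-up"},
]
labels = [label_body_part(d) for d in docs]
counts = Counter(labels)
print(labels)
```

Matching on whole tokens rather than substrings avoids accidental hits inside longer words. Per-label counts such as `counts` are one example of a summary that could be derived from the labels.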
You can manually re-run any of these tasks from the smi-catalogue container. For example:
(smi-catalogue) # . /home/metacat/test/config.env
(smi-catalogue) # cd /home/metacat/metadata_collection
(smi-catalogue) # python3 populate_catalogue.py -d dicom -i -l logs/

You can also manually analyse the MongoDB documents. If you used the default configuration, note that the username and password are in the config.env file:
(smi-mongodb) # mongosh -u <MONGOUSER> -p
<MONGOPASS>

Similarly for the MariaDB container:
(smi-mariadb) # mariadb -u <MYSQLUSER> -p
<MYSQLPASS>

To stop the containers, press CTRL+C and run the following command to clean up:
$ docker-compose -f ./test/docker-compose.yml down