This repository demonstrates usage of an Azure Confidential Clean Room (CCR) for multi-party collaboration.
- Overview
- Samples environment (per collaborator)
- High level execution sequence
- Setting up the consortium
- Publishing data
- Authoring collaboration contract
- Finalizing collaboration contract
- Proposing a governance contract (operator)
- Agreeing upon the contract (litware, fabrikam, contosso, operator)
- Propose ARM template, CCE policy and log collection (operator)
- Accept ARM template, CCE policy and logging proposals (litware, fabrikam, contosso, operator)
- Configure resource access for clean room (litware, fabrikam, contosso)
- Using the clean room
- Governing the cleanroom
- Contributing
- Trademarks
The following aspects of collaboration using clean room infrastructure are demonstrated:
- Collaboration
- Publishing sensitive data such that it can only be consumed within a clean room. (Data Source)
- Producing sensitive data from within the clean room such that it can only be read by the intended party. (Data Sink)
- Configuring an application to consume and generate sensitive data within a clean room. (Data Access)
- Configuring HTTPS endpoints for clients to invoke the application. (API Endpoint)
- Configuring network policy to govern traffic allowed on the application endpoint. (Network Protection)
- Governance
- Authoring and finalizing a governance contract capturing the application to be executed in the clean room and the data to be made available to it. (Contract)
- Agreeing upon ARM templates and confidential computation security policies that will be used for deploying a clean room implementing the contract. (Deployment)
- Authoring and finalizing governed documents queried by clean room applications at runtime. (Document Store)
- Enabling collection of application logs and clean room infrastructure telemetry, and inspecting the same in a confidential manner. (Telemetry)
- Auditing clean room execution. (Audit)
- Configuring federation for clean room identity in a confidential manner. (Identity Provider)
- Setting up a confidential certificate authority for an HTTPS endpoint inside the clean room. (CA)
Quick start demos showcasing basic functionality:
- `cleanroomhello-job` - Confidential access of protected data through a job.
- `cleanroomhello-api` - Confidential access of protected data through an API endpoint.

End to end demos showcasing scenario oriented usage:
- `analytics` - Confidential execution of audited queries on protected datasets using a standalone DB engine residing within the CCR.
- `inference` - Confidential inference from sensitive data using a protected ML model.
- `training` - Confidential fine tuning of a protected ML model on protected datasets.
| | cleanroomhello-job | cleanroomhello-api | analytics | inference | training |
|---|---|---|---|---|---|
| Collaboration | |||||
| Data Source | ✔️ | ❌ | ✔️ | ✔️ | ✔️ |
| Data Sink | ✔️ | ❌ | ❌ | ❌ | ✔️ |
| Data Access | ✔️ | ❌ | ✔️ | ✔️ | ✔️ |
| API Endpoint | ❌ | ✔️ | ✔️ | ✔️ | ❌ |
| Network Protection | ✔️ | ✔️ | ❌ | ❌ | ✔️ |
| Governance | |||||
| Contract | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Deployment | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Document Store | ❌ | ❌ | ✔️ | ❌ | ❌ |
| Telemetry | ✔️ | ❌ | ❌ | ❌ | ✔️ |
| Audit | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Identity Provider | ✔️ | ❌ | ✔️ | ✔️ | ✔️ |
| CA | ❌ | ✔️ | ✔️ | ✔️ | ❌ |
All the demos demonstrate collaborations where one or more of the following parties come together:
- Litware, end to end solution developer publishing applications that execute within the CCR.
- Fabrikam, collaborator owning sensitive dataset(s) and protected AI model(s) that can be consumed by applications inside a CCR.
- Contosso, collaborator owning sensitive dataset(s) that can be consumed by applications inside a CCR.
The following parties are additionally involved in completing the end to end demo:
- Operator, clean room provider hosting the CCR infrastructure.
- Client, consumer invoking the CCR endpoint to gather insights, without any access to the protected data itself.
In all cases, a CCR will be executed to run the application while protecting the privacy of all ingested data, as well as protecting any confidential output. The CCR instance can be deployed by the operator, any of the collaborators or even the client without any impact on the zero-trust promise of the architecture.
All the involved parties need to bring up a local environment to participate in the sample collaborations.
Note
Prerequisites to bring up the environment
- Docker installed locally. Installation instructions here.
- PowerShell installed locally. Installation instructions here.
You can also use GitHub Codespaces to create the local environment which would have the above prerequisites pre-installed.
Each party requires an independent environment. To create such an environment, open a separate PowerShell window for each party and run the following commands:

```powershell
$persona = # Set to one of: "operator", "litware", "fabrikam", "contosso", "client"
./start-environment.ps1 -shareCredentials -persona $persona
```

This creates a separate docker container for each party that contains an isolated environment, while sharing some host volumes across all of them to simplify sharing 'public' configuration details across parties.
Important
The command configures the environment to use a randomly generated resource group name on every invocation. To control the name, or to reuse an existing resource group, pass it in using the -resourceGroup parameter.
Do not use the same resource group name for different personas.
Tip
The `-shareCredentials` switch above enables sharing Azure credentials across the sample environments of all the parties. This brings up a credential proxy container `azure-cleanroom-samples-credential-proxy` that performs a single interactive logon at the outset and serves access tokens to the rest of the containers thereafter.
Note
Prerequisites to initialize the environment
- An Azure subscription with adequate permissions to create resources and manage permissions on these resources.
Once the environment is up, execute the following command to logon to Azure:
```powershell
az login --identity
```

The command shows the subscription that will be used for resource creation by the sample scripts.
Tip
- The command defaults to using shared credentials for logon. To use a different set of credentials, omit the `--identity` switch and follow the device login prompts.
- If another subscription is to be used for creating resources, execute `az account set` to select it before executing the remaining steps.
Post login, initialize the environment for executing the samples by executing the following command from the /home/samples directory:

```powershell
./scripts/initialize-environment.ps1
```

This command creates the resource group and other Azure resources required for executing the samples, such as a storage account, container registry and key vault (Premium).
Important
For running this sample on MSFT (internal) tenants, use the `preProvisionedOIDCStorageAccount` parameter in the above command to specify the name of a pre-provisioned storage account to be used for the OIDC configuration. This storage account has to be whitelisted for federation on managed identities using the following process:
- Create a static website in a storage account in your subscription and save the web URL using the steps outlined here. This static website will be used for sharing OIDC public configuration for the consortium.
- Create an ICM using the template: https://portal.microsofticm.com/imp/v3/incidents/create?tmpl=F332q2 and use that web URL to request the exception. The internal TSG for this exception is present here.
Note
All the steps henceforth assume that you are working in the /home/samples directory of the docker container, and commands are provided relative to that path.
Further details
The following Azure resources are created as part of initialization:
- Key Vault (Premium) for `litware`, `fabrikam` and `contosso` environments to store data encryption keys.
- Storage Account for `litware`, `fabrikam` and `contosso` environments to use as a backing store for clean room input and output.
- Storage Account with public blob access enabled for `litware`, `fabrikam` and `contosso` environments to store federated identity token issuer details.
- Storage Account with shared key access enabled for the `operator` environment to use as a backing store for CCF deployments.
- Container Registry with anonymous pull enabled for the `litware` environment to use as a backing store for clean room applications and network policies.
The sequence diagram below captures the overall flow of execution that happens across the samples being demonstrated. It might be helpful to keep this high level perspective in mind as you run through the steps.
sequenceDiagram
title Collaboration flow
actor m0 as litware
actor m1 as fabrikam
actor m2 as contosso
actor mx as operator
participant CCF as Consortium
participant CACI as Clean Room
actor user as client
Note over m0,CCF: Setting up the consortium
mx->>CCF: Creates consortium
mx->>CCF: Registers members
par
m0->>CCF: Joins consortium
and
m1->>CCF: Joins consortium
and
m2->>CCF: Joins consortium
end
Note over m0,m2: Publishing data
par
m0->>m0: Publishes application
and
m1->>m1: Publishes model(s)/datasets
and
m2->>m2: Publishes datasets
end
Note over m0,mx: Authoring collaboration contract
par
m0->>mx: Litware contract fragment
and
m1->>mx: Fabrikam contract fragment
and
m2->>mx: Contosso contract fragment
end
mx->>mx: Merges fragments into collaboration contract
Note over m0,CCF: Finalizing collaboration contract
mx->>CCF: Proposes collaboration contract
par
m0->>CCF: Accepts contract
and
m1->>CCF: Accepts contract
and
m2->>CCF: Accepts contract
end
mx->>CCF: Proposes ARM deployment<br> template and CCE policy
par
m0->>CCF: Accepts ARM deployment template and CCE policy
and
m1->>CCF: Accepts ARM deployment template and CCE policy
and
m2->>CCF: Accepts ARM deployment template and CCE policy
end
par
m0->>m0: Configures access to resources by clean room
and
m1->>m1: Configures access to resources by clean room
and
m2->>m2: Configures access to resources by clean room
end
Note over mx,user: Using the clean room (API)
mx-)CACI: Deploys clean room
activate CACI
loop Collaboration lifetime
user-)CACI: Invokes clean room application API
CACI->>CCF: Checks execution consent
CACI->>CCF: Gets tokens for Azure resource access
CACI->>CACI: Accesses secrets in Key Vault
CACI->>CACI: Accesses encrypted storage
CACI--)user: API response
end
mx-)CACI: Delete clean room
deactivate CACI
Note over m0,CACI: Using the clean room (Job)
mx-)CACI: Deploys clean room
activate CACI
CACI->>CCF: Checks execution consent
CACI->>CCF: Gets tokens for Azure resource access
CACI->>CACI: Accesses secrets in Key Vault
CACI->>CACI: Accesses encrypted storage
CACI--)mx: Job execution complete
deactivate CACI
Note over m0,CACI: Governing the cleanroom
activate CACI
m0->>CCF: Proposes enable logging
m1->>CCF: Accepts enable logging
m2->>CCF: Accepts enable logging
m0-)CACI: Export application logs
CACI->>CCF: Checks application telemetry consent
CACI->>CACI: Exports application logs to storage
CACI--)m0: Application logs exported
m0->>m0: Reviews application logs
deactivate CACI
Collaboration using a CCR is realized and governed through a consortium created using CCF hosting a Clean Room Governance Service (CGS). All the collaborating parties become participating members in the consortium.
From a confidentiality perspective any of the collaborators or the operator can create the CCF instance without affecting the zero-trust assurances. In these samples, we assume that it was agreed upon that the operator will host the CCF instance. The operator would create the CCF instance and then invite all the collaborators as members into the consortium.
sequenceDiagram
title Consortium creation flow
actor m0 as litware
actor m1 as fabrikam
actor m2 as contosso
actor mx as operator
participant CCF as CCF instance
par
m0->>mx: Share litware identity details
and
m1->>mx: Share fabrikam identity details
and
m2->>mx: Share contosso identity details
end
mx->>CCF: Create CCF instance
CCF-->>mx: CCF created
mx->>CCF: Activate membership
Note over CCF: operator active
mx->>CCF: Deploy governance service
CCF-->>mx: State: Service deployed
loop litware, fabrikam, contosso
mx->>CCF: Propose adding member
CCF-->>mx: Proposal ID
mx->>CCF: Vote for Proposal ID
Note over CCF: member accepted
CCF-->>mx: State: Accepted
end
par
mx->>m0: Share ccfEndpoint URL eg.<br>https://<name>.<region>.azurecontainer.io
m0->>CCF: Verifies state of the consortium
m0->>CCF: Activate membership
Note over CCF: litware active
and
mx->>m1: Share ccfEndpoint URL eg.<br>https://<name>.<region>.azurecontainer.io
m1->>CCF: Verifies state of the consortium
m1->>CCF: Activate membership
Note over CCF: fabrikam active
and
mx->>m2: Share ccfEndpoint URL eg.<br>https://<name>.<region>.azurecontainer.io
m2->>CCF: Verifies state of the consortium
m2->>CCF: Activate membership
Note over CCF: contosso active
end
A CCF member is identified by a public-key certificate used for client authentication and command signing.
Each member of the collaboration creates their member identity by generating their public and private key pair by executing the following command:
```powershell
./scripts/consortium/initialize-member.ps1
```

This shares the certificate (e.g. contosso_cert.pem) and Azure AD tenant ID with the operator. This information is used in subsequent steps to register members with the consortium.
Important
The member’s identity private key generated by the command (e.g. contosso_privk.pem) should not be shared with any other member.
It should be stored on a trusted device (e.g. HSM) and kept private at all times.
Azure CLI commands used
- `az cleanroom governance member keygenerator-sh` - generate consortium member certificate and keys.
The operator (who is hosting the CCF instance) brings up a CCF instance using Confidential ACI by executing this command:
```powershell
./scripts/consortium/start-consortium.ps1
```

Note
In the default sample environment, the containers for all participants have their /home/samples/demo-resources/public mapped to a single host directory, so details about the CCF endpoint would be available to all parties automatically once generated. If the configuration has been changed, the CCF details need to be made available in /home/samples/demo-resources/public of each member before executing subsequent steps.
Azure CLI commands used
- `az cleanroom governance member keygenerator-sh` - generate CCF network operator certificate and keys.
- `az cleanroom ccf network create` - initialize a CCF network using `caci` (Confidential ACI) infrastructure.
- `az cleanroom ccf network transition-to-open` - activate the CCF network.
- `az cleanroom ccf provider configure` - deploy a local container hosting a client for interacting with the CCF network infrastructure.
- `az cleanroom governance client deploy` - deploy a local container hosting a client for interacting with the consortium.
The operator (who is hosting the CCF instance) registers each member of the collaboration with the consortium using the identity details generated above.
```powershell
./scripts/consortium/register-member.ps1
```

Note
In the default sample environment, the containers for all participants have their /home/samples/demo-resources/public mapped to a single host directory, so this identity information would be available to all parties automatically once generated. If the configuration has been changed, the identity details of all other parties need to be made available in /home/samples/demo-resources/public of the operator's environment before running the registration command above.
Tip
To add a member to the consortium, one of the existing members is required to create a proposal for addition of the member, and a quorum of members are required to accept the same.
In the default sample flows, the operator is the only active member of the consortium at the time of inviting the members, allowing a simplified flow where the operator can propose and accept all the collaborators up front. Any additional/out of band member registrations at a later time would require all the active members at that point to accept the new member.
Azure CLI commands used
- `az cleanroom governance member add` - generate a CCF proposal to add a member to the consortium.
- `az cleanroom governance proposal vote` - accept/reject a CCF proposal.
Once the collaborators have been added, they now need to activate their membership before they can participate in the collaboration.
```powershell
./scripts/consortium/confirm-member.ps1
```

With the above steps the consortium creation that drives the creation and execution of the clean room is complete. We now proceed to preparing the datasets and making them available in the clean room.
Tip
The same consortium can be used/reused for executing any/all the sample demos. There is no need to repeat these steps unless the collaborators have changed.
Note
In the default sample environment, the containers for all participants have their /home/samples/demo-resources/public mapped to a single host directory, so details about the CCF endpoint would be available to all parties automatically once generated by the operator. If the configuration has been changed, the CCF details need to be made available in /home/samples/demo-resources/public of each member before executing subsequent steps.
Azure CLI commands used
- `az cleanroom governance member activate` - accept an invitation to join the consortium.
Sensitive data that any of the parties want to bring into the collaboration needs to be encrypted in a manner that ensures the key to decrypt this data will only be released to the clean room environment.
The samples follow an envelope encryption model for encryption of data. For the encryption of the data, a symmetric Data Encryption Key (DEK) is generated. An asymmetric key, called the Key Encryption Key (KEK), is generated subsequently to wrap the DEK. The wrapped DEKs are stored in a Key Vault as a secret and the KEK is imported into an MHSM/Premium Key Vault behind a secure key release (SKR) policy. Within the clean room, the wrapped DEK is read from the Key Vault and the KEK is retrieved from the MHSM/Premium Key Vault following the secure key release protocol. The DEKs are unwrapped within the cleanroom and then used to access the storage containers.
It is assumed that the collaborators have had out-of-band communication and have agreed on the data sets that will be shared. In these samples, the protected data is in the form of one or more files in one or more directories at each collaborator's end.
These dataset(s) in the form of files are encrypted using the KEK-DEK approach and uploaded into the storage account created as part of initializing the sample environment. Each directory in the source dataset corresponds to one Azure Blob storage container, and all files in the directory are uploaded as blobs to Azure Storage using the specified encryption mode: client-side encryption, or server-side encryption with a customer-provided key. Only one symmetric key (DEK) is created per directory (blob storage container).
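The envelope encryption flow can be sketched as follows. This is a toy illustration of the key hierarchy only: a XOR "cipher" stands in for the real AES/RSA primitives, and the local `kek` variable stands in for a key held in Premium Key Vault/MHSM behind a secure key release policy. Do not use this for actual encryption.

```python
import secrets

def xor(data: bytes, key: bytes) -> bytes:
    # Stand-in for a real cipher (e.g. AES-GCM for data, RSA-OAEP for
    # key wrapping). Used here only to show the key hierarchy.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# One symmetric DEK per dataset folder (= one blob container).
dek = secrets.token_bytes(32)

# One KEK wraps the DEK. In the samples the KEK lives in Key Vault/MHSM
# behind a secure key release (SKR) policy, never on a collaborator's disk.
kek = secrets.token_bytes(32)

# Publish side: encrypt each file with the DEK, wrap the DEK with the KEK.
plaintext = b"sensitive dataset row"
blob = xor(plaintext, dek)    # uploaded to Azure Storage
wrapped_dek = xor(dek, kek)   # stored as a Key Vault secret

# Clean room side: SKR releases the KEK only to an attested clean room,
# which unwraps the DEK and decrypts the data for the application.
released_kek = kek            # models a successful secure key release
recovered_dek = xor(wrapped_dek, released_kek)
assert xor(blob, recovered_dek) == plaintext
print("round trip ok")
```

The important property is that only the wrapped DEK and the ciphertext ever leave the collaborator's environment; the KEK is released solely to an attested clean room.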
sequenceDiagram
title Encrypting and uploading data to Azure Storage
actor m1 as Collaborator
participant storage as Azure Storage
loop every dataset folder
m1->>m1: Generate symmetric key (DEK) per folder
m1->>storage: Create storage container for folder
loop every file in folder
m1->>m1: Encrypt file using DEK
m1->>storage: Upload to container
end
end
Tip
Set a variable `$demo` to the name of the demo to be executed (e.g., "cleanroomhello-job") - it is a required input for subsequent steps.

```powershell
$demo = # Set to one of: "cleanroomhello-job", "cleanroomhello-api", "analytics", "inference", "training"
```

The following command initializes datastores and uploads encrypted datasets required for executing the samples:

```powershell
./scripts/data/publish-data.ps1 -demo $demo
```

Note
In some of the samples, this command seeds the environment with data from external sources. As a result, this step could take some time.
Note
The samples currently use server-side encryption for all data sets. However, the clean room infrastructure also supports client-side encryption, which is the recommended encryption mode as it offers a higher level of confidentiality.
Azure CLI commands used
- `az cleanroom datastore add` - initialize a data store. The `--encryption-mode` parameter specifies the encryption mode: `CPK` (server-side encryption) or `CSE` (client-side encryption).
- `az cleanroom datastore upload` - encrypt and upload local data to a data store.
Every party participating in the collaboration authors their respective contract fragment independently. In these samples, the collaborators share their respective fragments with the operator who merges them into a collaboration contract.
The following command initializes the contract fragment for a given demo:

```powershell
./scripts/specification/initialize-specification.ps1 -demo $demo
```

In addition to the contract fragment, this command creates a managed identity that will be used by the clean room to access Azure resources.
Azure CLI commands used
- `az cleanroom config init` - initialize a clean room specification representing the contract fragment.
- `az identity create` - create a managed identity used by the clean room to access Azure resources.
The following command adds details about the datastores to be accessed by the clean room and their mode (source/sink) to the contract fragment:
```powershell
./scripts/specification/add-specification-data.ps1 -demo $demo
```

Tip
During clean room execution, the datasources and datasinks are presented to the application as file system mount points using the Azure Storage Blobfuse driver.
The application reads/writes data from/to these mount point(s) in clear text. Under the hood, the storage system is configured to handle all the cryptography semantics, and transparently decrypts/encrypts the data using the DEK corresponding to each datastore.
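From the application's perspective, then, the datasources and datasinks are just directories. A minimal, hypothetical application body might look like the sketch below; the file layout and summary logic are assumptions for illustration, and the real mount paths come from the contract fragment:

```python
import tempfile
from pathlib import Path

def run(input_mount: Path, output_mount: Path) -> int:
    """Count lines across all input files and write a summary to the sink."""
    output_mount.mkdir(parents=True, exist_ok=True)
    total = 0
    # Blobfuse presents the decrypted blobs as plain files: just read them.
    for f in sorted(input_mount.glob("*.txt")):
        total += len(f.read_text().splitlines())
    # Anything written to the datasink mount is transparently encrypted
    # with that datasink's DEK before it lands in Azure Storage.
    (output_mount / "result.txt").write_text(f"lines={total}\n")
    return total

# Local dry run with throwaway directories standing in for the mounts.
demo_in, demo_out = Path(tempfile.mkdtemp()), Path(tempfile.mkdtemp())
(demo_in / "rows.txt").write_text("a\nb\n")
print(run(demo_in, demo_out))  # 2
```

Note that no cryptography appears in the application code at all; that separation is what lets existing containerized applications run in a clean room largely unchanged.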
Azure CLI commands used
- `az cleanroom config add-datasource` - configure a data source for reading data in the clean room.
- `az cleanroom config add-datasink` - configure a data sink for writing data from the clean room.
The following command adds details about the (containerized) application to be executed within the clean room to the contract fragment:
```powershell
pwsh ./demos/$demo/add-specification-application.ps1
```

Note
For some samples, this command builds and publishes the application to an Azure Container Registry. This could take a long time to complete if the build process has to pull a large number of underlying layers to generate the container image.
The application container is configured to access protected data through the corresponding filesystem mount point for the datasource/datasink. The fragment also includes details about the container image to be executed, such as the container registry, image ID, command, environment variables and requested resources.
Warning
The resources for the application container should be allocated so as not to violate confidential ACI limits as defined here.
Tip
The set of datasource/datasink mount points available to an application is controlled through the --datasinks/--datasources options of the `az cleanroom config add-application` command. These take a list as input, where each value is specified as a `foo=bar` pair:
- `foo` is the name of the datasource/datasink to be accessed.
- `bar` is the path at which the datasource/datasink is to be mounted within the application container.

E.g., `--datasources "fabrikam-input=/mnt/remote/model" "contosso-input=/mnt/remote/dataset" --datasinks "fabrikam-output=/mnt/remote/output"`
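A quick sketch of how such `name=path` values decompose into mount mappings; the parsing logic below is illustrative, not the CLI's actual implementation:

```python
def parse_mounts(values):
    """Split each 'name=path' value into a {name: path} mapping."""
    mounts = {}
    for value in values:
        name, sep, path = value.partition("=")
        if not sep or not name or not path:
            raise ValueError(f"expected 'name=path', got {value!r}")
        mounts[name] = path
    return mounts

datasources = parse_mounts([
    "fabrikam-input=/mnt/remote/model",
    "contosso-input=/mnt/remote/dataset",
])
datasinks = parse_mounts(["fabrikam-output=/mnt/remote/output"])
print(datasources["contosso-input"])  # /mnt/remote/dataset
```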
Tip
To enable traffic to/from the application, the `az cleanroom config network http enable` command is used. This takes the direction of traffic as input, specified as `--direction [inbound/outbound]`, along with an optional policy URL to enforce for requests.
Azure CLI commands used
- `az cleanroom config add-application` - configure the application to be executed within the clean room.
- `az cleanroom config network http enable` - allow traffic to/from the application executing within the clean room.
The following command adds the storage account endpoint details for collecting the application logs:

```powershell
./scripts/specification/add-specification-telemetry.ps1 -demo $demo
```

The actual download of the logs happens later in the flow.
Tip
In these samples, litware provisions the storage resources to be used by the clean room for exporting any telemetry and logs from the clean room during/after execution, and fabrikam and contosso accept the same.
If any party, say fabrikam, were concerned that sensitive information might leak out via logs and hence needed to inspect the log files before the other party gets them, fabrikam could instead configure telemetry using a storage account under their control as the destination for the execution logs. The log files would then be encrypted and written out to Azure Storage with a key in fabrikam's control, who can then download, decrypt and inspect these logs, and share them with litware only if satisfied.
Azure CLI commands used
- `az cleanroom config set-logging` - configure a data sink for exporting application telemetry (if enabled).
- `az cleanroom config set-telemetry` - configure a data sink for exporting infrastructure telemetry (if enabled).
Once the collaborating parties are finished with the above steps, the generated contract fragment for each party captures various details that all parties need to exchange and agree upon before the clean room can be created and deployed. This exchange and agreement is captured formally by the creation of a governance contract hosted in the consortium. This is a YAML document generated by consuming all the contract fragments, capturing the collaboration details in a formal clean room specification.
From a confidentiality perspective, the contract creation and proposal can be initiated by any of the collaborators or the operator without affecting the zero-trust assurances. In these samples, we assume that it was agreed upon that the operator undertakes this responsibility.
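Conceptually, the operator's merge step is a union of the per-party fragments into one specification. The sketch below is a simplified model; the section and field names are illustrative, not the actual clean room specification schema consumed by `az cleanroom config view`:

```python
# Simplified model of merging contract fragments into one specification.
# Section/field names are illustrative, not the real spec schema.
litware_fragment = {
    "applications": [{"name": "demo-app", "image": "litware.azurecr.io/app:1.0"}],
}
fabrikam_fragment = {
    "datasources": [{"name": "fabrikam-input", "path": "/mnt/remote/model"}],
    "datasinks": [{"name": "fabrikam-output", "path": "/mnt/remote/output"}],
}
contosso_fragment = {
    "datasources": [{"name": "contosso-input", "path": "/mnt/remote/dataset"}],
}

def merge(*fragments):
    """Union the sections of each party's fragment into one contract."""
    contract = {}
    for fragment in fragments:
        for section, entries in fragment.items():
            contract.setdefault(section, []).extend(entries)
    return contract

contract = merge(litware_fragment, fabrikam_fragment, contosso_fragment)
print(len(contract["datasources"]))  # 2: one from fabrikam, one from contosso
```

Because each party authors only its own fragment, no collaborator has to reveal resource details to the others ahead of time; the merged document is what gets proposed and voted on.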
sequenceDiagram
title Proposing and agreeing upon a governance contract
actor m0 as litware
actor m1 as fabrikam
actor m2 as contosso
actor mx as operator
participant CCF as CCF instance
par
m0->>mx: Share litware contract fragment
and
m1->>mx: Share fabrikam contract fragment
and
m2->>mx: Share contosso contract fragment
end
mx->>mx: Merge contract fragments
mx->>CCF: Create governance contract
Note over CCF: State: Contract created
mx->>CCF: Propose contract
CCF-->>mx: Proposal ID
Note over CCF: State: Contract proposed
par
m0->>CCF: Get contract proposal details
CCF-->>m0: Proposal details
m0->>m0: Verify contract proposal details
m0->>CCF: Accept contract proposal
and
m1->>CCF: Get contract proposal details
CCF-->>m1: Proposal details
m1->>m1: Verify contract proposal details
m1->>CCF: Accept contract proposal
and
m2->>CCF: Get contract proposal details
CCF-->>m2: Proposal details
m2->>m2: Verify contract proposal details
m2->>CCF: Accept contract proposal
end
Note over CCF: State: Contract Accepted
The operator merges all the contract fragments shared by the collaborators and proposes the resultant clean room specification yaml as the final contract.
```powershell
./scripts/contract/register-contract.ps1 -demo $demo
```

Tip
Set a variable `$contractId` to the contract ID generated above (e.g., "collab-cleanroomhello-job-8a106fb6") - it is a required input for subsequent steps.

```powershell
$contractId = # generated contract ID, e.g. "collab-cleanroomhello-job-8a106fb6"
```

Warning
In the default sample environment, the containers for all participants have their /home/samples/demo-resources/public mapped to a single host directory, so the contract fragments would be available to all parties automatically once generated. If the configuration has been changed, the fragments of all other parties need to be made available in /home/samples/demo-resources/public of the operator's environment before running the command above.
Azure CLI commands used
- `az cleanroom config view` - merge multiple contract fragments into a single clean room specification.
- `az cleanroom governance contract create` - initialize a collaboration contract.
- `az cleanroom governance contract propose` - propose a collaboration contract to the consortium.
The collaborating parties can now query the governance service to get the proposed contract, run their validations and accept or reject the contract.
```powershell
./scripts/contract/confirm-contract.ps1 -contractId $contractId -demo $demo
```

Note
In some demos, after confirming the contract, the command may propose additional documents to the consortium to be associated with the contract. If accepted, these documents are presented to the application by the clean room infrastructure at runtime, and their contents are governed through the consortium.
Tip
While the sample scripts register and accept these documents as part of finalizing the collaboration contract, these steps can also be performed at a later point, including after clean room deployment, as the clean room infrastructure always queries the consortium for the documents at runtime.
Azure CLI commands used
- `az cleanroom governance contract vote` - accept/reject a collaboration contract.
- `az cleanroom governance document create` - initialize a collaboration document.
- `az cleanroom governance document propose` - propose a collaboration document to the consortium.
Once the contract is accepted by all the collaborators, the operator generates the artefacts required for deploying a CCR instance for the contained clean room specification using Azure Confidential Container Instances (C-ACI) and proposes them to the consortium.
```powershell
./scripts/contract/register-deployment-artefacts.ps1 -contractId $contractId
```

Two artefacts are required to deploy C-ACI containers: the C-ACI ARM deployment template, and the Confidential Computing Enforcement Policy (CCE policy) computed for this template.
The command generates these artefacts and proposes them to the governance service: the deployment proposal contains the ARM template (cleanroom-arm-template.json) which can be deployed to instantiate the clean room, and the policy proposal contains the clean room policy (cleanroom-governance-policy.json) which identifies this clean room when it is executing.
In addition to this, the command submits an enable CA proposal to provision a CA cert for HTTPS calls that uniquely identifies this clean room when it is executing, as well as proposals for enabling log collection.
Tip
The samples take advantage of pre-calculated CCE policy fragments when computing the clean room policy. If desired, the policy can be computed afresh by setting the `securityPolicy` parameter to `generate` or `generate-debug`. Note that the command takes longer in this case, as it invokes `az confcom acipolicygen` internally, which takes 10-15 minutes to finish.
Azure CLI commands used
- `az cleanroom governance deployment generate` - generate the deployment template and CCE policy.
- `az cleanroom governance deployment template propose` - propose the deployment template to the consortium.
- `az cleanroom governance deployment policy propose` - propose the CCE policy to the consortium.
- `az cleanroom governance ca propose-enable` - propose enabling root CA functionality to the consortium.
- `az cleanroom governance contract runtime-option propose --option logging` - propose enabling export of application telemetry.
- `az cleanroom governance contract runtime-option propose --option telemetry` - propose enabling export of infrastructure telemetry.
Once the ARM template and CCE policy proposals are available in the consortium, the collaborating parties validate and vote on these proposals. In these samples, we accept these proposals without any verification.
```powershell
./scripts/contract/confirm-deployment-artefacts.ps1 -contractId $contractId
```

Where applicable, any documents proposed to be associated with this contract are also accepted as part of this command.
Azure CLI commands used
- `az cleanroom governance deployment template show` - show deployment template proposal.
- `az cleanroom governance deployment policy show` - show CCE policy proposal.
- `az cleanroom governance ca show` - show enable root CA proposal.
- `az cleanroom governance contract runtime-option get --option logging` - show enable export of application telemetry proposal.
- `az cleanroom governance contract runtime-option get --option telemetry` - show enable export of infrastructure telemetry proposal.
- `az cleanroom governance proposal vote` - accept/reject proposal.
- `az cleanroom governance document show` - show collaboration document.
- `az cleanroom governance document vote` - accept/reject collaboration document.
All the collaborating parties need to give access to the clean room so that the clean room environment can access resources in their respective tenants.
The DEKs that were created for dataset encryption as part of data publishing are now wrapped using a KEK generated for each contract. The KEK is uploaded to Key Vault and configured with a secure key release (SKR) policy, while the wrapped DEK is saved as a secret in Key Vault.
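This is standard envelope encryption: the DEK never rests in Key Vault unencrypted, and only a clean room that satisfies the SKR policy can have the KEK released to unwrap it. A toy illustration of the wrap/unwrap round trip (XOR is a deliberately simplistic stand-in for a real key-wrap algorithm such as AES-KW; the names are hypothetical):

```python
import secrets

def wrap_dek(dek: bytes, kek: bytes) -> bytes:
    """Toy 'wrap': XOR with the KEK. Real systems use AES key wrap
    (RFC 3394) or RSA-OAEP; this only shows the data flow, not the crypto."""
    assert len(dek) == len(kek)
    return bytes(d ^ k for d, k in zip(dek, kek))

unwrap_dek = wrap_dek  # XOR is its own inverse

kek = secrets.token_bytes(32)  # generated per contract, held in Key Vault under an SKR policy
dek = secrets.token_bytes(32)  # per-dataset encryption key created during data publishing
wrapped = wrap_dek(dek, kek)   # the wrapped DEK is what gets stored as a Key Vault secret

# Only a party that obtains the released KEK can recover the DEK.
assert unwrap_dek(wrapped, kek) == dek
```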
The managed identities created earlier as part of authoring the contract are given access to resources, and a federated credential is set up for these managed identities using the CGS OIDC identity provider. This federated credential allows the clean room to obtain an attestation-based managed identity access token during execution.
```powershell
./scripts/contract/grant-deployment-access.ps1 -contractId $contractId -demo $demo
```

> [!IMPORTANT]
> The command configures an OIDC issuer with the consortium at an Azure Active Directory tenant level. In a setup where multiple parties belong to the same tenant, it is important to avoid any race conditions in setting up this OIDC issuer. For such setups, it is recommended that the affected parties execute this command one after the other, not simultaneously.
The flow below is executed by all the collaborators in their respective Azure tenants.
```mermaid
sequenceDiagram
    title Clean room access setup
    actor m0 as Collaborator
    participant akv as Azure Key Vault
    participant storage as Azure Storage
    participant mi as Managed Identity

    m0->>m0: Create KEK
    m0->>akv: Save KEK with SKR policy
    loop every DEK
        m0->>m0: Wrap DEK with KEK
        m0->>akv: Save wrapped-DEK as secret
    end
    m0->>storage: Assign storage account permissions to MI
    m0->>akv: Assign KV permissions to MI
    m0->>m0: Setup OIDC issuer endpoint
    m0->>mi: Setup federated credential on MI
```
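For context, the SKR policy attached to the KEK is a set of conditions over attestation claims that must hold before Key Vault will release the key. An illustrative (not verbatim) policy gating release on the clean room's agreed CCE policy hash might look like the following; the attestation authority URL and claim names here are assumptions about the setup, not taken from the samples:

```json
{
  "version": "1.0.0",
  "anyOf": [
    {
      "authority": "https://sharedneu.neu.attest.azure.net",
      "allOf": [
        {
          "claim": "x-ms-sevsnpvm-hostdata",
          "equals": "<sha256 of the agreed CCE policy>"
        }
      ]
    }
  ]
}
```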
Azure CLI commands used
- `az cleanroom config wrap-deks` - create a KEK with SKR policy, wrap DEKs with the KEK and store in Key Vault.
- `az role assignment create` - configure resource access permissions for clean room managed identity.
- `az cleanroom governance oidc-issuer set-issuer-url` - configure OIDC identity provider for tenant.
- `az identity federated-credential create` - set up federation between managed identity and OIDC issuer.
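Federation itself reduces to a claim-matching exercise: Azure AD exchanges a token from the CGS OIDC issuer for a managed identity token only when the incoming token's issuer, subject, and audience match the registered federated credential. A toy matcher sketching that check (the issuer URL, subject, and claim values below are hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FederatedCredential:
    issuer: str    # the CGS OIDC issuer URL configured for the tenant
    subject: str   # identifies the clean room workload for this contract
    audience: str  # fixed audience used for workload identity federation

def token_accepted(cred: FederatedCredential, claims: dict) -> bool:
    """Toy version of the exchange check: issuer and subject must match
    exactly, and the registered audience must appear in the token's aud."""
    return (claims.get("iss") == cred.issuer
            and claims.get("sub") == cred.subject
            and cred.audience in claims.get("aud", []))

cred = FederatedCredential(
    issuer="https://contoso-cgs.example/oidc",  # hypothetical issuer URL
    subject="cleanroom-contract-1234",          # hypothetical subject
    audience="api://AzureADTokenExchange",
)
good = {"iss": cred.issuer, "sub": cred.subject, "aud": [cred.audience]}
bad = {**good, "sub": "some-other-workload"}
assert token_accepted(cred, good) and not token_accepted(cred, bad)
```

This is why the attestation-gated issuer matters: only a clean room that can obtain a token with the right claims from the CGS issuer gets a managed identity token in return.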
Once the ARM template and CCE policy proposals have been accepted and access has been configured, the party deploying the clean room (the operator in our case) can do so by running the following:
```powershell
./scripts/cleanroom/deploy-cleanroom.ps1 -contractId $contractId
```

Run the following script to wait for the clean room application to start:

```powershell
./scripts/cleanroom/wait-cleanroom.ps1 -contractId $contractId -demo $demo
```

> [!TIP]
> - If the clean room application is being executed as a job, add the `-job` switch to wait for the job to complete.
> - If the clean room application has been configured to start automatically (`--auto-start`), add the `-skipStart` switch to skip that step.
Azure CLI commands used
- `az cleanroom governance ca generate-key` - generate CA cert to be used by all clean room instances implementing this contract.
- `az cleanroom governance ca show` - display CA certificate details.
- `az cleanroom governance deployment template show` - show agreed upon ARM deployment template.
- `az deployment group create` - deploy ARM template.
Clean room applications executing as a job may write their computed outputs to the configured datasinks. In such cases, the (encrypted) output is downloaded from the backing storage account and decrypted locally by the party owning the datasink after the job has completed.

In the case of clean room applications offering API endpoints, the endpoints can be invoked by any of the parties after the clean room application has started.
The application specific output can be viewed by running the following command:
```powershell
pwsh ./demos/$demo/show-output.ps1 -contractId $contractId
```

> [!NOTE]
> Further details of the output for each demo may be found in the corresponding readme files: cleanroomhello-job, cleanroomhello-api, analytics, inference, training.
Azure CLI commands used
- `az cleanroom datastore download` - download and decrypt data to local store.
The application developer (litware) can download the infrastructure telemetry and application logs. These are available post execution in encrypted form. To decrypt and inspect them, run the following:

```powershell
./scripts/governance/show-telemetry.ps1 -demo $demo -contractId $contractId
```

The infrastructure containers emit traces, logs and metrics that are useful for debugging errors, tracing the execution sequence, etc. The telemetry dashboard uses the .NET Aspire Dashboard to display these.
There are different views that are available:
- Traces: Track requests and activities across all the sidecars so that we can see where time is spent and track down specific failures.
- Logs: Record individual operations in the context of a request or activity.
- Metrics: Measure counters and gauges such as successful requests, failed requests, latency etc.
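The correlation the dashboard performs can be sketched in a few lines: every exported record carries the trace id of the request it belongs to, and grouping by that id reconstructs the per-request view across sidecars. The records below are synthetic, purely for illustration:

```python
from collections import defaultdict

# Synthetic OTLP-style records: each one carries the trace id of the
# request it belongs to, regardless of which sidecar emitted it.
records = [
    {"trace_id": "a1", "sidecar": "identity", "msg": "token issued"},
    {"trace_id": "a1", "sidecar": "blobfuse", "msg": "dataset mounted"},
    {"trace_id": "b2", "sidecar": "identity", "msg": "token refresh failed"},
]

by_trace = defaultdict(list)
for rec in records:
    by_trace[rec["trace_id"]].append(rec)

# All activity for request a1, in emission order, across sidecars:
assert [r["sidecar"] for r in by_trace["a1"]] == ["identity", "blobfuse"]
```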
Azure CLI commands used
- `az cleanroom telemetry download` - download and decrypt infrastructure telemetry to local store.
- `az cleanroom logs download` - download and decrypt application telemetry to local store.
All collaborators can check for any audit events raised by the clean room during its execution for the contract by running this command:
```powershell
./scripts/governance/show-audit-events.ps1 -contractId $contractId
```

> [!NOTE]
> The clean room infrastructure currently emits limited audit events, and doesn't offer an endpoint for an application to log audit events either. These limitations are being addressed in a future version of the infrastructure and samples.
Azure CLI commands used
- `az cleanroom governance contract event list` - view logged events.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.


