@gladjohn gladjohn commented Sep 9, 2025

This PR adds a process-wide, identity-scoped cache for the IMDSv2 mTLS client certificate and wires it into the MSI V2 flow.

  • First iteration: cache the certificate.

  • Next iteration: add tests and plug in the mTLS authentication operation to support PoP flows.

@gladjohn gladjohn requested a review from a team as a code owner September 9, 2025 23:56
@AzureAD AzureAD deleted a comment from Robbie-Microsoft Sep 10, 2025
@AzureAD AzureAD deleted a comment from Robbie-Microsoft Sep 10, 2025
.ConfigureAwait(false);

// If using IMDSv2 (mTLS) and the mTLS cert is near expiry, skip the AT cache
if (cachedAccessTokenItem != null && ManagedIdentityClient.s_sourceName == ManagedIdentitySource.ImdsV2)

  1. The condition here does not match the comment. You should only look for a certificate if an mTLS PoP token is requested.
  2. Let's avoid peppering the codebase with if (static.sourceName == IMDSv2). First, it creates possible race conditions, because you don't know whether the source has been evaluated yet (which is the case here; why would I care what the source is if I have a cached token?). Second, there should be one place (Single Responsibility Principle) that decides whether mTLS PoP is supported, e.g. each of the MSI sources should declare this.
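
To make the second point concrete, here is a minimal sketch of letting each source declare its own mTLS PoP support, so callers never compare against a static source name. Every type and member name below is hypothetical and does not come from MSAL or from this PR.

```csharp
using System;

// Hypothetical base type for managed identity sources (not an MSAL type).
internal abstract class ManagedIdentitySourceSketch
{
    // Each source answers for itself; the default is "not supported".
    public virtual bool SupportsMtlsPop => false;
}

// Hypothetical IMDSv2 source that opts in to mTLS PoP.
internal sealed class ImdsV2SourceSketch : ManagedIdentitySourceSketch
{
    public override bool SupportsMtlsPop => true;
}

// Hypothetical source that keeps the default.
internal sealed class AppServiceSourceSketch : ManagedIdentitySourceSketch
{
}

internal static class TokenCachePolicySketch
{
    // The certificate is only relevant when the caller asked for an
    // mTLS PoP token AND the resolved source supports it.
    public static bool ShouldCheckMtlsCert(
        bool mtlsPopRequested,
        ManagedIdentitySourceSketch source)
    {
        return mtlsPopRequested && source != null && source.SupportsMtlsPop;
    }
}
```

With this shape, a bearer-token request served from the access token cache never needs to know which source is active, and the decision about mTLS PoP support lives in one place per source.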

// If using IMDSv2 (mTLS) and the mTLS cert is near expiry, skip the AT cache
if (cachedAccessTokenItem != null && ManagedIdentityClient.s_sourceName == ManagedIdentitySource.ImdsV2)
{
string identityKey = ServiceBundle.Config.ClientId;
@bgavrilMS bgavrilMS Sep 17, 2025

The naming is confusing. Avoid using "identity", as it has too many meanings. How about mtlsCertCacheKey?

}

// Identity-aware check over the mTLS cert cache
internal static bool IsMtlsCertExpiringSoon(string identityKey)

This is not a responsibility of the ManagedIdentityClient.

{
string identityKey = ServiceBundle.Config.ClientId;

if (ManagedIdentityClient.IsMtlsCertExpiringSoon(identityKey))

How about something like if ((cert = cache.GetCertificate(key)) != null && !cert.IsExpiringSoon())? This avoids exposing a new, complex API that does two things at once.

Alternatively, cache.GetCertificate() can simply return null if the certificate is close to expiry. This is what we do for tokens.
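
A minimal sketch of the second option, where the cache itself treats a near-expiry certificate as a miss. The types MtlsCertificateCache, IClock and FixedClock are hypothetical names introduced for illustration; they are not part of MSAL or of this PR, and the clock abstraction only stands in for whatever time service the library uses.

```csharp
using System;
using System.Collections.Concurrent;
using System.Security.Cryptography.X509Certificates;

// Hypothetical clock abstraction, standing in for the library's time service.
internal interface IClock
{
    DateTime UtcNow { get; }
}

// Hypothetical settable clock, useful for tests.
internal sealed class FixedClock : IClock
{
    public FixedClock(DateTime utcNow) => UtcNow = utcNow;
    public DateTime UtcNow { get; set; }
}

// Hypothetical cache: one lookup, and "expiring soon" is reported as a miss.
internal sealed class MtlsCertificateCache
{
    private readonly ConcurrentDictionary<string, X509Certificate2> _certs = new();
    private readonly IClock _clock;
    private readonly TimeSpan _refreshSkew;

    public MtlsCertificateCache(IClock clock, TimeSpan refreshSkew)
    {
        _clock = clock ?? throw new ArgumentNullException(nameof(clock));
        _refreshSkew = refreshSkew;
    }

    public void Set(string mtlsCertCacheKey, X509Certificate2 cert)
    {
        _certs[mtlsCertCacheKey] = cert;
    }

    // Returns null when there is no entry, or when the cached certificate
    // expires within the refresh skew. Callers then mint a new certificate.
    public X509Certificate2 GetCertificate(string mtlsCertCacheKey)
    {
        if (!_certs.TryGetValue(mtlsCertCacheKey, out X509Certificate2 cert))
        {
            return null;
        }

        DateTime notAfterUtc = cert.NotAfter.ToUniversalTime();
        bool expiringSoon = _clock.UtcNow >= notAfterUtc - _refreshSkew;
        return expiringSoon ? null : cert;
    }
}
```

A caller then needs only one call and one null check, which also removes the double cache lookup that a separate "is expiring soon" method forces.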

internal static readonly ConcurrentDictionary<string,
(X509Certificate2 Cert, string ClientId, string TenantId, string Endpoint)> s_miCerts = new();

internal static void ResetSourceForTest()

The name no longer matches the intent: the method now also clears the certificate cache and resets the time service.

{
s_sourceName = ManagedIdentitySource.None;
s_miCerts.Clear();
s_timeService = new TimeService();

The point of the time service is that it should be a singleton. A single instance is injected in all classes that deal with time. That instance can either be the real time or a test time. Instead, you are using many instances, so you are not really sure what you test.

But I agree that a big refactor here could take time. Maybe add a task to use TimeService everywhere in MSAL where DateTime.Now is used.
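
As a rough illustration of the single-instance idea, the sketch below injects one time service into a consumer and swaps it for a controllable fake in tests. The ITimeService shape shown here (a single GetUtcNow method) is an assumption, and SystemTimeService, FakeTimeService and ExpiryChecker are hypothetical names, not MSAL types.

```csharp
using System;

// Assumed shape of the time abstraction; the real MSAL interface may differ.
internal interface ITimeService
{
    DateTime GetUtcNow();
}

// Production implementation backed by the system clock.
internal sealed class SystemTimeService : ITimeService
{
    public DateTime GetUtcNow() => DateTime.UtcNow;
}

// Test double whose time only moves when the test says so.
internal sealed class FakeTimeService : ITimeService
{
    private DateTime _utcNow;

    public FakeTimeService(DateTime startUtc) => _utcNow = startUtc;

    public DateTime GetUtcNow() => _utcNow;

    public void Advance(TimeSpan delta) => _utcNow += delta;
}

// Hypothetical consumer: it receives the shared instance and never
// creates its own clock or reads DateTime.UtcNow directly.
internal sealed class ExpiryChecker
{
    private readonly ITimeService _time;
    private readonly TimeSpan _skew;

    public ExpiryChecker(ITimeService time, TimeSpan skew)
    {
        _time = time ?? throw new ArgumentNullException(nameof(time));
        _skew = skew;
    }

    public bool IsExpiringSoon(DateTime notAfterUtc)
    {
        return _time.GetUtcNow() >= notAfterUtc - _skew;
    }
}
```

Because every consumer holds the same injected instance, advancing the fake once moves time for all of them, so a test exercises exactly the clock the production code reads.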

private const string WindowsHimdsFilePath = "%Programfiles%\\AzureConnectedMachineAgent\\himds.exe";
private const string LinuxHimdsFilePath = "/opt/azcmagent/bin/himds";
internal static ManagedIdentitySource s_sourceName = ManagedIdentitySource.None;
internal static ITimeService s_timeService = new TimeService();

Internal variables are not prefixed with s_ (see MtlsCertRefreshSkew).

}

// Test-only helpers
internal static bool IsCertExpiringSoon(X509Certificate2 cert, DateTime nowUtc, TimeSpan skew)

Why did you bother introducing a time service then?

return nowUtc >= (notAfterUtc - skew);
}

internal static void SetTimeServiceForTest(ITimeService timeService) /* internal - test only usage */

It's already an internal field, so why do you need an extra setter?

internal static ITimeService s_timeService = new TimeService();
internal static readonly TimeSpan MtlsCertRefreshSkew = TimeSpan.FromMinutes(5);

internal static readonly ConcurrentDictionary<string,

Have you considered moving the cache elsewhere? It seems that this class doesn't even use it.

privateKey);
// 2) Try cached binding; refresh if cert expires within 5 minutes
if (!ManagedIdentityClient.s_miCerts.TryGetValue(key, out var binding) ||
ManagedIdentityClient.IsMtlsCertExpiringSoon(key))

Another example of why IsMtlsCertExpiringSoon is a bad API: you perform a cache GET twice even though you already have the value.

internal static readonly TimeSpan MtlsCertRefreshSkew = TimeSpan.FromMinutes(5);

internal static readonly ConcurrentDictionary<string,
(X509Certificate2 Cert, string ClientId, string TenantId, string Endpoint)> s_miCerts = new();

You are caching a lot more than certs, so I am not sure this is a good name.

MsalAccessTokenCacheItem cachedAccessTokenItem = await GetCachedAccessTokenAsync()
.ConfigureAwait(false);

// If using IMDSv2 (mTLS) and the mTLS cert is near expiry, skip the AT cache

In this file, you do not take care of requesting a PoP token from the cache (for which an IAuthenticationOperation is needed).

@gladjohn gladjohn closed this Oct 1, 2025
3 participants