Merged
Changes from 27 commits
35 commits
59a4ea3
ABFS WASB Compatibility testing
anmolanmol1234 Jun 30, 2025
a8e0488
Merge branch 'trunk' of https://github.com/anmolanmol1234/hadoop into…
anmolanmol1234 Jul 1, 2025
92c975c
Fix md5 test
anmolanmol1234 Jul 1, 2025
a161e06
fix unused param
anmolanmol1234 Jul 1, 2025
538a2cd
Test changes
anmolanmol1234 Jul 3, 2025
2bd7ce0
remove unused imports
anmolanmol1234 Jul 3, 2025
6f24907
fix build issue
anmolanmol1234 Jul 3, 2025
6b138c0
merge trunk
anmolanmol1234 Jul 3, 2025
8292192
fix main
anmolanmol1234 Jul 3, 2025
f15c5e6
fix javadocs
anmolanmol1234 Jul 3, 2025
7f8c465
checktsyle fixes
anmolanmol1234 Jul 3, 2025
c42ac75
fix test
anmolanmol1234 Jul 3, 2025
d0f2aea
PR comments
anmolanmol1234 Jul 9, 2025
07c396c
revert config change
anmolanmol1234 Jul 9, 2025
5e6298a
remove unintended changes
anmolanmol1234 Jul 9, 2025
d951cc0
fix comments
anmolanmol1234 Jul 9, 2025
9ae0198
PR comments
anmolanmol1234 Jul 9, 2025
5f48139
unused import
anmolanmol1234 Jul 10, 2025
ef1db63
Merge branch 'trunk' of https://github.com/anmolanmol1234/hadoop into…
anmolanmol1234 Jul 10, 2025
4b4b7a4
PR review comments
anmolanmol1234 Jul 18, 2025
513511a
checkstyle
anmolanmol1234 Jul 18, 2025
dbb743f
Variable name correction
anmolanmol1234 Jul 21, 2025
b28abe2
Merge branch 'trunk' of https://github.com/anmolanmol1234/hadoop into…
anmolanmol1234 Jul 31, 2025
3c249d9
md5 config changes
anmolanmol1234 Aug 1, 2025
4c24330
checkstyle fixes
anmolanmol1234 Aug 5, 2025
1200cb2
Merge branch 'trunk' of https://github.com/anmolanmol1234/hadoop into…
anmolanmol1234 Aug 5, 2025
493484e
Merge branch 'apache:trunk' into HADOOP-19604
anmolanmol1234 Aug 5, 2025
144ba1a
PR commnets
anmolanmol1234 Aug 8, 2025
f8a0a64
Merge branch 'HADOOP-19604' of https://github.com/anmolanmol1234/hado…
anmolanmol1234 Aug 8, 2025
20f3582
null checks
anmolanmol1234 Aug 8, 2025
4fd400b
Resolve merge conflicts
anmolanmol1234 Aug 20, 2025
c5f32be
PR comments
anmolanmol1234 Aug 20, 2025
f9a3529
remove unused import
anmolanmol1234 Aug 20, 2025
53a39c4
Merge branch 'trunk' of https://github.com/anmolanmol1234/hadoop into…
anmolanmol1234 Sep 1, 2025
4e5efa2
fix test
anmolanmol1234 Sep 1, 2025
Original file line number Diff line number Diff line change
@@ -438,6 +438,10 @@ public class AbfsConfiguration{
FS_AZURE_ABFS_ENABLE_CHECKSUM_VALIDATION, DefaultValue = DEFAULT_ENABLE_ABFS_CHECKSUM_VALIDATION)
private boolean isChecksumValidationEnabled;

@BooleanConfigurationValidatorAnnotation(ConfigurationKey =
FS_AZURE_ABFS_ENABLE_FULL_BLOB_CHECKSUM_VALIDATION, DefaultValue = DEFAULT_ENABLE_FULL_BLOB_ABFS_CHECKSUM_VALIDATION)
private boolean isFullBlobChecksumValidationEnabled;

@BooleanConfigurationValidatorAnnotation(ConfigurationKey =
FS_AZURE_ENABLE_PAGINATED_DELETE, DefaultValue = DEFAULT_ENABLE_PAGINATED_DELETE)
private boolean isPaginatedDeleteEnabled;
@@ -1705,6 +1709,10 @@ public void setIsChecksumValidationEnabled(boolean isChecksumValidationEnabled)
this.isChecksumValidationEnabled = isChecksumValidationEnabled;
}

public boolean isFullBlobChecksumValidationEnabled() {
Copilot AI Aug 7, 2025

The getter method isFullBlobChecksumValidationEnabled() lacks a corresponding setter method, which breaks the pattern established by other configuration properties like isChecksumValidationEnabled. This could cause issues for programmatic configuration changes.

Contributor Author

setter is not used anywhere

return isFullBlobChecksumValidationEnabled;
}

public long getBlobCopyProgressPollWaitMillis() {
return blobCopyProgressPollWaitMillis;
}
@@ -356,6 +356,9 @@ public final class ConfigurationKeys {
/** Add extra layer of verification of the integrity of the request content during transport: {@value}. */
public static final String FS_AZURE_ABFS_ENABLE_CHECKSUM_VALIDATION = "fs.azure.enable.checksum.validation";

/** Add extra layer of verification of the integrity of the full blob request content during transport: {@value}. */
public static final String FS_AZURE_ABFS_ENABLE_FULL_BLOB_CHECKSUM_VALIDATION = "fs.azure.enable.full.blob.checksum.validation";

public static String accountProperty(String property, String account) {
return property + DOT + account;
}
@@ -147,6 +147,7 @@ public final class FileSystemConfigurations {
public static final boolean DEFAULT_ENABLE_ABFS_RENAME_RESILIENCE = true;
public static final boolean DEFAULT_ENABLE_PAGINATED_DELETE = false;
public static final boolean DEFAULT_ENABLE_ABFS_CHECKSUM_VALIDATION = false;
public static final boolean DEFAULT_ENABLE_FULL_BLOB_ABFS_CHECKSUM_VALIDATION = false;

/**
* Limit of queued block upload operations before writes
@@ -1073,11 +1073,15 @@ public AbfsRestOperation flush(byte[] buffer,
requestHeaders.add(new AbfsHttpHeader(CONTENT_LENGTH, String.valueOf(buffer.length)));
requestHeaders.add(new AbfsHttpHeader(CONTENT_TYPE, APPLICATION_XML));
requestHeaders.add(new AbfsHttpHeader(IF_MATCH, eTag));
String md5Hash = null;
if (leaseId != null) {
requestHeaders.add(new AbfsHttpHeader(X_MS_LEASE_ID, leaseId));
}
if (blobMd5 != null) {
if (isFullBlobChecksumValidationEnabled() && blobMd5 != null) {
requestHeaders.add(new AbfsHttpHeader(X_MS_BLOB_CONTENT_MD5, blobMd5));
} else {
md5Hash = computeMD5Hash(buffer, 0, buffer.length);
requestHeaders.add(new AbfsHttpHeader(X_MS_BLOB_CONTENT_MD5, md5Hash));
Copilot AI Aug 7, 2025

The variable md5Hash is declared but only used in the else branch. Consider declaring it within the else block to improve code clarity and reduce the scope of the variable.

Contributor Author

used later in catch block hence declared outside

}
final AbfsUriQueryBuilder abfsUriQueryBuilder = createDefaultUriQueryBuilder();
abfsUriQueryBuilder.addQuery(QUERY_PARAM_COMP, BLOCKLIST);
@@ -1103,8 +1107,21 @@ public AbfsRestOperation flush(byte[] buffer,
AbfsRestOperation op1 = getPathStatus(path, true, tracingContext,
contextEncryptionAdapter);
String metadataMd5 = op1.getResult().getResponseHeader(CONTENT_MD5);
if (blobMd5 != null && !blobMd5.equals(metadataMd5)) {
throw ex;
/*
* Validate the response by comparing the server's MD5 metadata against either:
* 1. The full blob content MD5 (if full blob checksum validation is enabled), or
* 2. The full block ID buffer MD5 (fallback if blob checksum validation is disabled)
*/
if (getAbfsConfiguration().isFullBlobChecksumValidationEnabled() && blobMd5 != null) {
Copilot AI Aug 7, 2025

The configuration check getAbfsConfiguration().isFullBlobChecksumValidationEnabled() is performed inside the response validation logic, but this same check was already done earlier in the method. Consider storing the result in a variable to avoid redundant method calls.

Suggested change
if (getAbfsConfiguration().isFullBlobChecksumValidationEnabled() && blobMd5 != null) {
if (fullBlobChecksumValidationEnabled && blobMd5 != null) {

// Full blob content MD5 mismatch — integrity check failed
if (!blobMd5.equals(metadataMd5)) {
throw ex;
}
} else {
// Block ID buffer MD5 mismatch — integrity check failed
if (md5Hash != null && !md5Hash.equals(metadataMd5)) {
throw ex;
}
}
return op;
}
@@ -1423,6 +1423,17 @@ protected boolean isChecksumValidationEnabled() {
return getAbfsConfiguration().getIsChecksumValidationEnabled();
}

/**
* Conditions check for allowing checksum support for write operation.
* Server will support this if client sends the MD5 Hash as a request header.
* For azure stoage service documentation and more details refer to
Contributor

Nit: storage spelling

Contributor Author

taken

* <a href="https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/update">Path - Update Azure Rest API</a>.
* @return true if full blob checksum validation enabled.
*/
protected boolean isFullBlobChecksumValidationEnabled() {
return getAbfsConfiguration().isFullBlobChecksumValidationEnabled();
}

/**
* Compute MD5Hash of the given byte array starting from given offset up to given length.
* @param data byte array from which data is fetched to compute MD5 Hash.
@@ -867,10 +867,9 @@ public AbfsRestOperation flush(final String path,
if (leaseId != null) {
requestHeaders.add(new AbfsHttpHeader(X_MS_LEASE_ID, leaseId));
}
if (isChecksumValidationEnabled() && blobMd5 != null) {
if (isFullBlobChecksumValidationEnabled() && blobMd5 != null) {
requestHeaders.add(new AbfsHttpHeader(X_MS_BLOB_CONTENT_MD5, blobMd5));
}

final AbfsUriQueryBuilder abfsUriQueryBuilder = createDefaultUriQueryBuilder();
abfsUriQueryBuilder.addQuery(QUERY_PARAM_ACTION, FLUSH_ACTION);
abfsUriQueryBuilder.addQuery(QUERY_PARAM_POSITION, Long.toString(position));
@@ -223,7 +223,7 @@ public AbfsOutputStream(AbfsOutputStreamContext abfsOutputStreamContext)
md5 = MessageDigest.getInstance(MD5);
fullBlobContentMd5 = MessageDigest.getInstance(MD5);
} catch (NoSuchAlgorithmException e) {
if (client.isChecksumValidationEnabled()) {
if (isChecksumValidationEnabled()) {
throw new IOException("MD5 algorithm not available", e);
}
}
@@ -464,10 +464,13 @@ public synchronized void write(final byte[] data, final int off, final int lengt
AbfsBlock block = createBlockIfNeeded(position);
int written = bufferData(block, data, off, length);
// Update the incremental MD5 hash with the written data.
getMessageDigest().update(data, off, written);

if (isChecksumValidationEnabled()) {
getMessageDigest().update(data, off, written);
}
// Update the full blob MD5 hash with the written data.
getFullBlobContentMd5().update(data, off, written);
if (isFullBlobChecksumValidationEnabled()) {
getFullBlobContentMd5().update(data, off, written);
}
int remainingCapacity = block.remainingCapacity();

if (written < length) {
@@ -544,7 +547,12 @@ private void uploadBlockAsync(AbfsBlock blockToUpload,
outputStreamStatistics.bytesToUpload(bytesLength);
outputStreamStatistics.writeCurrentBuffer();
DataBlocks.BlockUploadData blockUploadData = blockToUpload.startUpload();
String md5Hash = getMd5();
String md5Hash;
if (getClient().getAbfsConfiguration().getIsChecksumValidationEnabled()) {
Contributor

isChecksumValidationEnabled() can be used here?

Contributor Author

taken

md5Hash = getMd5();
Contributor

We can change it to: String md5Hash = isChecksumValidationEnabled() ? getMd5() : null;

Contributor Author

taken

} else {
md5Hash = null;
}
final Future<Void> job =
executorService.submit(() -> {
AbfsPerfTracker tracker =
@@ -1222,6 +1230,20 @@ public MessageDigest getFullBlobContentMd5() {
return fullBlobContentMd5;
}

/**
* @return true if checksum validation is enabled.
*/
public boolean isChecksumValidationEnabled() {
return getClient().isChecksumValidationEnabled();
}

/**
* @return true if full blob checksum validation is enabled.
*/
public boolean isFullBlobChecksumValidationEnabled() {
return getClient().isFullBlobChecksumValidationEnabled();
}

/**
* Returns the Base64-encoded MD5 checksum based on the current digest state.
* This finalizes the digest calculation. Returns null if the digest is empty.
@@ -95,7 +95,9 @@ protected synchronized AbfsBlock createBlockInternal(long position)
setBlockCount(getBlockCount() + 1);
AbfsBlock activeBlock = new AbfsBlobBlock(getAbfsOutputStream(), position, getBlockIdLength(), getBlockCount());
activeBlock.setBlockEntry(addNewEntry(activeBlock.getBlockId(), activeBlock.getOffset()));
getAbfsOutputStream().getMessageDigest().reset();
if (getAbfsOutputStream().isChecksumValidationEnabled()) {
getAbfsOutputStream().getMessageDigest().reset();
}
setActiveBlock(activeBlock);
}
return getActiveBlock();
@@ -180,7 +180,10 @@ protected synchronized AbfsRestOperation remoteFlush(final long offset,
tracingContextFlush.setIngressHandler(BLOB_FLUSH);
tracingContextFlush.setPosition(String.valueOf(offset));
LOG.trace("Flushing data at offset {} for path {}", offset, getAbfsOutputStream().getPath());
String fullBlobMd5 = computeFullBlobMd5();
String fullBlobMd5 = null;
if (getClient().isFullBlobChecksumValidationEnabled()) {
fullBlobMd5 = computeFullBlobMd5();
}
op = getClient().flush(blockListXml.getBytes(StandardCharsets.UTF_8),
getAbfsOutputStream().getPath(),
isClose, getAbfsOutputStream().getCachedSasTokenString(), leaseId,
@@ -194,7 +197,9 @@ isClose, getAbfsOutputStream().getCachedSasTokenString(), leaseId,
LOG.error("Error in remote flush for path {} and offset {}", getAbfsOutputStream().getPath(), offset, ex);
throw ex;
} finally {
getAbfsOutputStream().getFullBlobContentMd5().reset();
if (getClient().isFullBlobChecksumValidationEnabled()) {
getAbfsOutputStream().getFullBlobContentMd5().reset();
}
}
return op;
}
@@ -221,7 +226,7 @@ protected AbfsRestOperation remoteAppendBlobWrite(String path,
AppendRequestParameters reqParams,
TracingContext tracingContext) throws IOException {
// Perform the remote append operation using the blob client.
AbfsRestOperation op = null;
AbfsRestOperation op;
try {
op = blobClient.appendBlock(path, reqParams, uploadData.toByteArray(), tracingContext);
} catch (AbfsRestOperationException ex) {
@@ -62,7 +62,9 @@ protected synchronized AbfsBlock createBlockInternal(long position)
if (getActiveBlock() == null) {
setBlockCount(getBlockCount() + 1);
AbfsBlock activeBlock = new AbfsBlock(getAbfsOutputStream(), position);
getAbfsOutputStream().getMessageDigest().reset();
if (getAbfsOutputStream().isChecksumValidationEnabled()) {
getAbfsOutputStream().getMessageDigest().reset();
}
setActiveBlock(activeBlock);
}
return getActiveBlock();
@@ -179,7 +179,10 @@ protected synchronized AbfsRestOperation remoteFlush(final long offset,
tracingContextFlush.setIngressHandler(DFS_FLUSH);
tracingContextFlush.setPosition(String.valueOf(offset));
}
String fullBlobMd5 = computeFullBlobMd5();
String fullBlobMd5 = null;
if (getClient().isFullBlobChecksumValidationEnabled()) {
Contributor

Since the check getClient().isFullBlobChecksumValidationEnabled() is common to both the DFS and Blob ingress handlers, we can add it as a common method in the abstract class itself.

Contributor Author

That would not result in any major improvement as it's just called at 2 places

fullBlobMd5 = computeFullBlobMd5();
}
LOG.trace("Flushing data at offset {} and path {}", offset, getAbfsOutputStream().getPath());
AbfsRestOperation op;
try {
@@ -194,7 +197,9 @@ protected synchronized AbfsRestOperation remoteFlush(final long offset,
getAbfsOutputStream().getPath(), offset, ex);
throw ex;
} finally {
getAbfsOutputStream().getFullBlobContentMd5().reset();
if (getClient().isFullBlobChecksumValidationEnabled()) {
getAbfsOutputStream().getFullBlobContentMd5().reset();
}
}
return op;
}
@@ -211,13 +211,18 @@ void createRenamePendingJson(Path path, byte[] bytes)
String blockId = generateBlockId();
String blockList = generateBlockListXml(blockId);
byte[] buffer = blockList.getBytes(StandardCharsets.UTF_8);
String computedMd5 = abfsClient.computeMD5Hash(buffer, 0, buffer.length);
String computedMd5 = null;
if (abfsClient.isFullBlobChecksumValidationEnabled()) {
computedMd5 = abfsClient.computeMD5Hash(buffer, 0, buffer.length);
}

AppendRequestParameters appendRequestParameters
= new AppendRequestParameters(0, 0,
bytes.length, AppendRequestParameters.Mode.APPEND_MODE, false, null,
abfsClient.getAbfsConfiguration().isExpectHeaderEnabled(),
new BlobAppendRequestParameters(blockId, eTag), abfsClient.computeMD5Hash(bytes, 0, bytes.length));
new BlobAppendRequestParameters(blockId, eTag),
abfsClient.isChecksumValidationEnabled() ? abfsClient.computeMD5Hash(
bytes, 0, bytes.length) : null);

abfsClient.append(path.toUri().getPath(), bytes,
appendRequestParameters, null, null, tracingContext);
@@ -2128,7 +2128,7 @@ private void testRenamePreRenameFailureResolution(final AzureBlobFileSystem fs)
Mockito.anyBoolean(), Mockito.nullable(String.class),
Mockito.nullable(String.class), Mockito.anyString(),
Mockito.nullable(ContextEncryptionAdapter.class),
Mockito.any(TracingContext.class), Mockito.anyString());
Mockito.any(TracingContext.class), Mockito.nullable(String.class));
return createAnswer.callRealMethod();
};
RenameAtomicityTestUtils.addCreatePathMock(client,
@@ -43,6 +43,7 @@

import static java.net.HttpURLConnection.HTTP_CONFLICT;
import static org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys.FS_AZURE_ABFS_ENABLE_CHECKSUM_VALIDATION;
import static org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys.FS_AZURE_ABFS_ENABLE_FULL_BLOB_CHECKSUM_VALIDATION;
import static org.apache.hadoop.fs.contract.ContractTestUtils.assertDeleted;
import static org.apache.hadoop.fs.contract.ContractTestUtils.assertIsDirectory;
import static org.apache.hadoop.fs.contract.ContractTestUtils.assertMkdirs;
@@ -113,7 +114,10 @@ public void testReadFile() throws Exception {
boolean[] createFileWithAbfs = new boolean[]{false, true, false, true};
boolean[] readFileWithAbfs = new boolean[]{false, true, true, false};

AzureBlobFileSystem abfs = getFileSystem();
Configuration conf = getRawConfiguration();
conf.setBoolean(FS_AZURE_ABFS_ENABLE_FULL_BLOB_CHECKSUM_VALIDATION, true);
FileSystem fileSystem = FileSystem.newInstance(conf);
Contributor

When creating a new instance, we need to close it. Can you check all the tests in this class and make sure every new instance is closed within the test itself?

Contributor Author

taken
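
One way to satisfy the request above is try-with-resources, since Hadoop's `FileSystem` implements `java.io.Closeable`. The sketch below is a self-contained illustration only: `StandInFileSystem` and `runTestBody` are hypothetical names standing in for the `FileSystem.newInstance(conf)` handle and a test method, and are not part of this PR.

```java
import java.io.Closeable;

// Illustrative only: StandInFileSystem is a hypothetical stand-in, not a Hadoop class.
final class CloseNewInstanceSketch {

  /** Minimal closeable resource that records whether close() was called. */
  static final class StandInFileSystem implements Closeable {
    private boolean closed;

    boolean isClosed() {
      return closed;
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /**
   * Runs a simulated test body against a fresh instance.
   * The instance is closed on both the normal and the failing path.
   */
  static StandInFileSystem runTestBody(boolean failInside) {
    StandInFileSystem handle = null;
    try (StandInFileSystem fs = new StandInFileSystem()) {
      handle = fs;
      if (failInside) {
        throw new IllegalStateException("simulated assertion failure");
      }
    } catch (IllegalStateException expected) {
      // Swallowed here only so the caller can inspect the closed flag.
    }
    return handle;
  }

  public static void main(String[] args) {
    System.out.println("closed after success: " + runTestBody(false).isClosed());
    System.out.println("closed after failure: " + runTestBody(true).isClosed());
  }
}
```

The same shape should apply to each test in the class that creates its own instance, so that a failed assertion does not leak the instance.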

AzureBlobFileSystem abfs = (AzureBlobFileSystem) fileSystem;
// test only valid for non-namespace enabled account
Assume.assumeFalse("Namespace enabled account does not support this test",
getIsNamespaceEnabled(abfs));
@@ -412,7 +416,10 @@ public void testScenario2() throws Exception {
*/
@Test
public void testScenario3() throws Exception {
AzureBlobFileSystem abfs = getFileSystem();
Configuration conf = getRawConfiguration();
conf.setBoolean(FS_AZURE_ABFS_ENABLE_FULL_BLOB_CHECKSUM_VALIDATION, true);
FileSystem fileSystem = FileSystem.newInstance(conf);
AzureBlobFileSystem abfs = (AzureBlobFileSystem) fileSystem;
Assume.assumeFalse("Namespace enabled account does not support this test",
getIsNamespaceEnabled(abfs));
Assume.assumeFalse("Not valid for APPEND BLOB", isAppendBlobEnabled());
@@ -166,7 +166,12 @@ public static void setMockAbfsRestOperationForFlushOperation(
requestHeaders.add(new AbfsHttpHeader(CONTENT_LENGTH, String.valueOf(buffer.length)));
requestHeaders.add(new AbfsHttpHeader(CONTENT_TYPE, APPLICATION_XML));
requestHeaders.add(new AbfsHttpHeader(IF_MATCH, eTag));
requestHeaders.add(new AbfsHttpHeader(X_MS_BLOB_CONTENT_MD5, blobMd5));
if (spiedClient.isFullBlobChecksumValidationEnabled()) {
Contributor

Can be simplified by having a single statement for adding blobMd5 to the request header.
The blobMd5 value can be computed using a ternary operator.

The same approach can be used in production code to make sure exactly one of the two ways is used to compute the MD5 and the variable is never null.

Contributor Author

taken
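
A minimal sketch of the single-statement form suggested above, written outside the Hadoop codebase. `selectBlobMd5` and `md5Base64` are hypothetical helpers, and Base64 encoding of the digest is an assumption about what `computeMD5Hash` returns; the selection rule mirrors the condition in the Blob client's `flush` hunk.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Illustrative only: not part of the PR or of the ABFS client API.
final class BlobMd5HeaderSketch {

  /** Base64-encoded MD5 of the whole buffer (encoding is an assumption). */
  static String md5Base64(byte[] buffer) {
    try {
      MessageDigest digest = MessageDigest.getInstance("MD5");
      return Base64.getEncoder().encodeToString(digest.digest(buffer));
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 algorithm not available", e);
    }
  }

  /**
   * Picks the header value in one expression: the caller-supplied full blob MD5
   * when the feature is enabled and a value exists, otherwise the MD5 of the
   * block list buffer. The result is never null.
   */
  static String selectBlobMd5(boolean fullBlobChecksumEnabled, String blobMd5, byte[] buffer) {
    return (fullBlobChecksumEnabled && blobMd5 != null) ? blobMd5 : md5Base64(buffer);
  }

  public static void main(String[] args) {
    byte[] blockList = "<BlockList/>".getBytes(StandardCharsets.UTF_8);
    System.out.println(selectBlobMd5(true, "precomputed==", blockList));
    System.out.println(selectBlobMd5(false, "precomputed==", blockList));
    System.out.println(selectBlobMd5(true, null, blockList));
  }
}
```

With this shape the header is added exactly once, and the same value can be reused later when comparing against the server's `Content-MD5` metadata.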

requestHeaders.add(new AbfsHttpHeader(X_MS_BLOB_CONTENT_MD5, blobMd5));
} else {
requestHeaders.add(new AbfsHttpHeader(X_MS_BLOB_CONTENT_MD5,
spiedClient.computeMD5Hash(buffer, 0, buffer.length)));
}
final AbfsUriQueryBuilder abfsUriQueryBuilder = spiedClient.createDefaultUriQueryBuilder();
abfsUriQueryBuilder.addQuery(QUERY_PARAM_COMP, BLOCKLIST);
abfsUriQueryBuilder.addQuery(QUERY_PARAM_CLOSE, String.valueOf(false));
@@ -427,7 +427,7 @@ public void testNoNetworkCallsForSecondFlush() throws Exception {
Mockito.any(TracingContext.class));
Mockito.verify(blobClient, Mockito.times(1)).
flush(Mockito.any(byte[].class), Mockito.anyString(), Mockito.anyBoolean(), Mockito.any(), Mockito.any(), Mockito.anyString(), Mockito.any(),
Mockito.any(TracingContext.class), Mockito.anyString());
Mockito.any(TracingContext.class), Mockito.nullable(String.class));
}

/**
@@ -481,6 +481,8 @@ public void testResetCalledOnExceptionInRemoteFlush() throws Exception {
//expected exception
}
// Verify that reset was called on the message digest
Contributor

Can we add a similar test where, with the config disabled, we assert that the methods that compute the MD5 hash are not called at all?

Contributor Author

taken
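
The requested check can be sketched without Mockito by counting digest updates. This is only an illustration of the idea, assuming the write path guards the digest update behind the checksum flag as in the `write` hunk of this PR; `CountingDigest`, `write`, and `updateCallsAfterWrites` are hypothetical names, and the real test would more likely verify a mocked `MessageDigest`.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative only: a hand-rolled stand-in for a Mockito verification.
final class DisabledChecksumSketch {

  static MessageDigest newMd5() {
    try {
      return MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 algorithm not available", e);
    }
  }

  /** Wraps a MessageDigest and counts how many times update() is invoked. */
  static final class CountingDigest {
    private final MessageDigest delegate = newMd5();
    private int updateCalls;

    void update(byte[] data, int off, int len) {
      updateCalls++;
      delegate.update(data, off, len);
    }

    int getUpdateCalls() {
      return updateCalls;
    }
  }

  /** Simulated write path: the digest is touched only when the flag is set. */
  static int write(boolean checksumEnabled, CountingDigest digest, byte[] data) {
    if (checksumEnabled) {
      digest.update(data, 0, data.length);
    }
    return data.length;
  }

  /** Performs the given number of writes and reports the digest update count. */
  static int updateCallsAfterWrites(boolean checksumEnabled, int writes) {
    CountingDigest digest = new CountingDigest();
    byte[] payload = new byte[] {1, 2, 3};
    for (int i = 0; i < writes; i++) {
      write(checksumEnabled, digest, payload);
    }
    return digest.getUpdateCalls();
  }

  public static void main(String[] args) {
    System.out.println("disabled: " + updateCallsAfterWrites(false, 3));
    System.out.println("enabled: " + updateCallsAfterWrites(true, 3));
  }
}
```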

Mockito.verify(mockMessageDigest, Mockito.times(1)).reset();
if (spiedClient.isChecksumValidationEnabled()) {
Copilot AI Aug 7, 2025

The test is checking isChecksumValidationEnabled() but the code change suggests this should be checking isFullBlobChecksumValidationEnabled() since the MD5 reset behavior is now conditional on full blob checksum validation, not general checksum validation.

Suggested change
if (spiedClient.isChecksumValidationEnabled()) {
if (spiedClient.isFullBlobChecksumValidationEnabled()) {

Contributor Author

taken

Mockito.verify(mockMessageDigest, Mockito.times(1)).reset();
Contributor

Nit: assert message.

Contributor Author

taken

}
}
}