Skip to content

Conversation

cachemeifyoucan
Copy link

Improve MappedFileRegionBumpPtr so it can handle being opened with different capacities.

@cachemeifyoucan
Copy link
Author

@swift-ci please test llvm

@cachemeifyoucan cachemeifyoucan force-pushed the eng/PR-mapped-file-bump-ptr branch from c227290 to fe4b0e1 Compare August 28, 2025 20:36
@cachemeifyoucan cachemeifyoucan force-pushed the eng/PR-mapped-file-bump-ptr branch from fe4b0e1 to fa08de2 Compare September 2, 2025 17:42
assert(InitLock.Locked == sys::fs::LockKind::Exclusive);
// If the BumpPtr larger than or equal to the size of the file (it can be
// larger if process is terminated when the out of memory allocation
// happens) and smaller than capacity, this was shrunken by a previous

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Result.H->BumpPtr == FileSize->Size we don't know if it's safe to resize the file. The "expected" case is that this happens when we truncated the file during the last close and now reopen it. But it could also happen if the CAS is already open with a smaller capacity, but has allocated all of that smaller capacity. The assert above is correct that we have "exclusive access" on the InitLock, but that only tells us that we are not racing to initialize it, not whether it is currently open (only exclusively locking SharedLockFD like we do in destroyImpl can determine that).

One possible fix would be to try to lock SharedLockFD exclusively here and if that succeeds we resize but if it fails we don't. The downside is we would have to do that almost every time we reopen the CAS, making it more expensive by several syscalls (unlock, try-lock exclusive, unlock, lock shared), and it would need careful handling since the file could have been grown and re-shrunk during the window where we don't hold the shared lock.

Maybe there is a better fix I haven't thought of.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a good fix for that so.... I am reimplemented it again with a even stricter model.

@cachemeifyoucan cachemeifyoucan force-pushed the eng/PR-mapped-file-bump-ptr branch 2 times, most recently from 5f3599f to c50a1b2 Compare September 4, 2025 22:50
@cachemeifyoucan
Copy link
Author

@swift-ci please test llvm

Improve MappedFileRegionBumpPtr so it can handle being opened with
different capacities. Mismatching different capacities and header
offsets will result in error on creation.
@cachemeifyoucan cachemeifyoucan force-pushed the eng/PR-mapped-file-bump-ptr branch from c50a1b2 to dec5bac Compare September 4, 2025 23:44
@cachemeifyoucan
Copy link
Author

@swift-ci please test llvm

Copy link

@benlangmuir benlangmuir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think storing the capacity in the lock file is not ideal. This hides information that is now necessary for using the CAS in a secondary location. It also complicates opening the file by needing to read an additional file first. I think it would be better to just store the capacity in the header and not worry about the header offset changing. We can do a pread at the header location if we want to avoid needing to fix up the mmap size.


// Return true if succeed to lock the file exclusively.
bool tryLockExclusive() {
assert(!Locked && "can only try to lock if not locked");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inconsistent between lock and tryLock.... I suggest we revert the change to lock and make the caller unlock first if necessary for consistency.


// Release the lock so it will not be unlocked on destruction.
void release() {
Locked = std::nullopt;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also set FD = -1 so that you cannot lock it again by mistake.

uint64_t ExpectedOffset = support::endian::read<uint64_t, endianness::little>(
ConfigBuffer.data() + sizeof(uint64_t));

if (ExpectedCapacity != Capacity)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaning towards ignoring this and just using the existing capacity if the CAS is already open.

@cachemeifyoucan
Copy link
Author

This hides information that is now necessary for using the CAS in a secondary location.

I don't see why this is a problem?

I think it would be better to just store the capacity in the header and not worry about the header offset changing.

We could reserve a real header in the beginning of the file and header offset starts after the reserve space?

We can do a pread at the header location if we want to avoid needing to fix up the mmap size.

We don't have pread. I guess we can readNativeFileSlice, which is the closest we have.

@benlangmuir
Copy link

I don't see why this is a problem?

I find it conceptually bad to store information in the lock file that is not part of the locking mechanism. From a performance perspective it means we need to read an extra file when opening the CAS. I admit these are not huge issues.

We could reserve a real header in the beginning of the file and header offset starts after the reserve space?

This seems fine too, but might need more work to restructure the client code since their own header moves later.

We don't have pread. I guess we can readNativeFileSlice, which is the closest we have.

readNativeFileSlice is implemented using pread when available

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants