
Conversation


@guptask commented Oct 6, 2023

No description provided.

igchor and others added 30 commits July 23, 2023 14:16
Run tests on CI

Run long tests (navy/bench) every day on CI

Run CI on prebuild docker image

Run only centos build on CI

Update docker file used in CI

Centos8 is EOL

Disable failing clang-format-check

Add extra param to build-package.sh

Add scripts for rebuilding/pushing docker images

Taken from: pmem/dev-utils-kit@30794c3

Extend CI to rebuild docker automatically

Update build-cachelib-docker.yml

Do not use shallow clone to make sure Docker rebuild logic works correctly.

Added required packages to install Intel ittapi

Update CI to use intel/CacheLib repo (pmem#17)

Add multi-tier navy benchmark and run it on CI
- fix navy multi-tier config for NUMA bindings

added code coverage support in CacheLib

Adding libdml to CentOS docker image (pmem#53)

only exclude the allocator-test-NavySetupTest and shm-test-test_page_size tests

added perf and numactl to docker packages

---------------------------------------------
One large commit for all CI and code coverage; see above for the change history.
This includes printing:
- allocSize
- allocated memory size
- memory usage fraction
…art 2)

fix for compressed ptr (upstream) -> compress from false to true for different pool sizes. We also use getPoolSize(pid) to get the total size from all pools across allocators.

It also fixes the tiering sizes (pulls in changes from the rebased issue75 commit that did not make it into the upstream commits).

Rebased to use ramCacheSize for the compressed ptr changes that were introduced upstream.
 - Includes later cosmetic changes added by sounak (9cb5c29)
fix for rolling stats (on multi-tier; to be followed by the multi-tier rolling stats implementation in the next commit)
Hot queue iterator for 2Q. It starts at the Hot queue and moves to the Warm queue once the Hot queue is exhausted. Useful for promotion semantics when using 2Q replacement. Rebased onto develop and added some tests.
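
For illustration, a minimal sketch of that fall-through iteration order, with hypothetical queue and iterator types (CacheLib's MM2Q container has its own iterator machinery):

#include <cstddef>
#include <deque>
#include <string>

// Serve Hot entries first, then fall through to Warm. All names here are
// illustrative stand-ins, not CacheLib's actual types.
class HotToWarmIterator {
 public:
  HotToWarmIterator(const std::deque<std::string>& hot,
                    const std::deque<std::string>& warm)
      : hot_(hot), warm_(warm) {}

  bool hasNext() const { return idx_ < hot_.size() + warm_.size(); }

  const std::string& next() {
    // Move to the Warm queue once the Hot queue is exhausted.
    const auto& item =
        idx_ < hot_.size() ? hot_[idx_] : warm_[idx_ - hot_.size()];
    ++idx_;
    return item;
  }

 private:
  const std::deque<std::string>& hot_;
  const std::deque<std::string>& warm_;
  std::size_t idx_{0};
};
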
- transparent item movement
- multi-tier combined locking with exclusive bit (pmem#38), with incRef refactored to support returning the result of markMoving (fails if the item is already moving or the exclusive bit is set)
- add tests for transparent item movement (updated for NUMA bindings, post combined locking)
This would lead to deadlock (e.g. in forEachChainedItem) if the child is moving (e.g. marked by the Slab Release thread).

Instead, treat the moving bit only as a guard against freeing the item, and do all synchronization on the parent.
Background data movement using periodic workers. They attempt to evict/promote items per the given thresholds for each class. This reduces p99 latency, since there is a higher chance that an allocation slot is free in the tier we are allocating in.
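
A minimal sketch of that worker loop, assuming a hypothetical callback-based interface rather than CacheLib's actual background-mover classes:

#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

// A thread wakes up on an interval and runs a callback that checks the
// per-class thresholds and evicts/promotes a batch accordingly.
class PeriodicWorker {
 public:
  PeriodicWorker(std::chrono::milliseconds interval,
                 std::function<void()> checkAndMove)
      : interval_(interval),
        work_(std::move(checkAndMove)),
        thread_([this] {
          while (running_.load()) {
            work_(); // evict/promote per class when thresholds are crossed
            std::this_thread::sleep_for(interval_);
          }
        }) {}

  ~PeriodicWorker() {
    running_ = false;
    thread_.join();
  }

 private:
  std::chrono::milliseconds interval_;
  std::function<void()> work_;
  std::atomic<bool> running_{true};
  std::thread thread_;
};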

fix race in promotion where releaseBackToAllocator
was being called before wakeUpWaiters.

reinsert to mm container on failed promotion
…move to fail

 - updated slab release logic for move failure, but there is still an issue with slab movement; currently investigating.
The assumption for moving items was that once an item is unmarked, no one can add new waiters for it. However, since incrementing the item's ref count was not done under the MoveMap lock, there was a race: the item could be unmarked right after incRef returned incFailedMoving.
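
A self-contained sketch of the fixed ordering, with every type a hypothetical stand-in for CacheLib's internals:

#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// The point: incRef must run while holding the MoveMap lock, so the item
// cannot be unmarked between incRef reporting "failed because moving" and
// the waiter being registered.
enum class IncResult { ok, incFailedMoving };

struct Item {
  bool moving{false};
  IncResult incRef() {
    return moving ? IncResult::incFailedMoving : IncResult::ok;
  }
};

struct MoveMap {
  std::mutex mtx; // per-shard in a real implementation
  std::vector<std::function<void()>> waiters;
  std::mutex& lockFor(Item&) { return mtx; }
  void addWaiter(Item&, std::function<void()> w) {
    waiters.push_back(std::move(w));
  }
};

bool addWaiterIfMoving(Item& item, MoveMap& map, std::function<void()> waiter) {
  std::lock_guard<std::mutex> g(map.lockFor(item)); // lock BEFORE incRef
  if (item.incRef() != IncResult::incFailedMoving) {
    return false; // not moving; the caller proceeds with its reference
  }
  // We still hold the map lock and the item is still marked moving, so the
  // mover cannot have drained the waiter list yet: safe to register.
  map.addWaiter(item, std::move(waiter));
  return true;
}
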
* Fix issue with token creation

* Do not increment evictFail* stats if evictFailConcurrentFill was incremented
updated the gcc version in the Docker image to 12

---------

Co-authored-by: Matt Rae <[email protected]>
- we first check if an item is expired under the mmContainer lock and, if so, mark it for eviction so it is recycled back up to allocateInternalTier
instead of always inserting to the topmost tier
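
A sketch of that expiry check, with hypothetical names standing in for CacheLib's item and lock types:

#include <chrono>
#include <mutex>

// The decision is made under the mmContainer lock; an expired item is
// marked for eviction so the allocation path can recycle its memory
// instead of moving the item to another tier.
struct Item {
  std::chrono::steady_clock::time_point expiryTime;
  bool markedForEviction{false};

  bool isExpired() const {
    return std::chrono::steady_clock::now() >= expiryTime;
  }
};

bool markExpiredForEviction(Item& item, std::mutex& mmContainerLock) {
  std::lock_guard<std::mutex> g(mmContainerLock); // check under the lock
  if (!item.isExpired()) {
    return false;
  }
  item.markedForEviction = true; // recycled back up to allocateInternalTier
  return true;
}
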
* Chained item movement between tiers - currently we sync on the parent item for moving.
 - updated tests accordingly; note that we can no longer swap the parent item if a chained item is being moved for slab release.

* added some debug checks around the chained item handling
* fix slab release behavior when there is no moveCb
Track latency of per-item eviction/promotion between memory tiers
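
A minimal sketch of such per-item timing; recordLatency is a hypothetical sink, since CacheLib records these samples in its own stats machinery:

#include <chrono>

template <typename MoveFn, typename RecordFn>
void timedMove(MoveFn&& moveItem, RecordFn&& recordLatency) {
  const auto start = std::chrono::steady_clock::now();
  moveItem(); // the actual eviction/promotion between memory tiers
  const auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
      std::chrono::steady_clock::now() - start);
  recordLatency(elapsed); // one sample per item moved
}
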
if (!handler.valid()) {
  auto status = handler.get();
  XDCHECK(handler.valid()) << dmlErrStr(status);
  throw std::runtime_error(folly::sformat(
      "DML batch operation failed: {}", dmlErrStr(status))); // message illustrative; the quoted hunk is truncated here
}

How often do we expect this to happen? I'm wondering if it would make sense to return an error code here instead of throwing an exception.


@guptask (Author) commented Oct 9, 2023

I have not seen this error when DML is configured correctly. Theoretically it is possible to hit it if we oversubscribe the DSA work queues, but that is more of a config issue: the DSA vs CPU batch split ratio has to be tuned to prevent this scenario. If, in the near future, I add a self-tuning mechanism for the DSA vs CPU batch split, then it would make sense to use this error scenario to adjust the ratio and proceed rather than throw a runtime exception.
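
A hedged sketch of the error-code alternative discussed here; the helper name is hypothetical, and only handler.get() and result.status from Intel DML's C++ API are assumed:

#include <dml/dml.hpp>

// Surface the DML status to the caller instead of throwing, so a future
// self-tuning mechanism could react (e.g. shrink the DSA share of the
// next batch) rather than abort.
template <typename Handler>
dml::status_code waitForBatch(Handler& handler) {
  auto result = handler.get(); // blocks until the submitted batch completes
  return result.status;        // caller may retry on the CPU path or adjust the ratio
}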

@guptask force-pushed the dsa_eviction_promotion branch 2 times, most recently from c5976af to 59a71ee, on October 9, 2023 08:16
@guptask changed the title from "Background batch data movement - eviction" to "Background batch data movement - refactoring and DML based alternate path" on Oct 9, 2023
auto dmlBatchRatio = isLarge ? config_.largeItemBatchEvictDsaUsageFraction
                             : config_.smallItemBatchEvictDsaUsageFraction;
size_t dmlBatchSize =
    (config_.dsaEnabled && evictionData.size() >= config_.minBatchSizeForDsaUsage)
        ? static_cast<size_t>(evictionData.size() * dmlBatchRatio) // completion illustrative;
        : 0;                                                       // the quoted hunk is truncated here
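
For illustration, how the computed dmlBatchSize could split one eviction batch; the helper is hypothetical and assumes dmlBatchSize <= batch.size():

#include <cstddef>
#include <utility>
#include <vector>

// The first dmlBatchSize items take the DML/DSA path; the rest stay on
// the regular CPU path.
template <typename T>
std::pair<std::vector<T>, std::vector<T>> splitBatch(const std::vector<T>& batch,
                                                     std::size_t dmlBatchSize) {
  std::vector<T> dmlPart(batch.begin(), batch.begin() + dmlBatchSize);
  std::vector<T> cpuPart(batch.begin() + dmlBatchSize, batch.end());
  return {std::move(dmlPart), std::move(cpuPart)};
}
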
Owner

If there is no DSA device, do we use the regular DML software path?

Author

The implicit assumption is that if the dsaEnabled flag is set to true, there is at least one DSA device enabled.

handler = dml::submit<dml::hardware>(dml::batch, sequence);

In the current code, if the dsaEnabled flag is wrongly set to true without any DSA device enabled, this triggers a runtime exception. I could set it to dml::automatic, which would take care of this scenario, but we would lose explicit control over whether a request goes to the software or hardware path.
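
For reference, a minimal standalone illustration of that trade-off using Intel DML's C++ API (a single mem_move rather than the batch used in this PR; buffer setup is illustrative):

#include <cstdint>
#include <vector>
#include <dml/dml.hpp>

int main() {
  std::vector<std::uint8_t> src(4096, 0xAB);
  std::vector<std::uint8_t> dst(4096, 0x00);

  // Explicit hardware path: fails if no DSA device is configured.
  auto hwResult = dml::execute<dml::hardware>(
      dml::mem_move, dml::make_view(src.data(), src.size()),
      dml::make_view(dst.data(), dst.size()));
  if (hwResult.status == dml::status_code::ok) {
    return 0;
  }

  // Automatic path: DML picks hardware when present, software otherwise,
  // at the cost of explicit control over where the request runs.
  auto autoResult = dml::execute<dml::automatic>(
      dml::mem_move, dml::make_view(src.data(), src.size()),
      dml::make_view(dst.data(), dst.size()));
  return autoResult.status == dml::status_code::ok ? 0 : 1;
}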

//
// @param oldItem    Reference to the item being moved
// @param newItemHdl Reference to the handle of the new item being moved into
// @return true if the containers were updated successfully
Owner

Really, all that is left to do (and all this function does) is one thing: update the access container. You could add that to the comments, or change the function name from book keeper to completeAccessContainerUpdate() or something similar.

Author

Changed the name of the book keeper function to completeAccessContainerUpdate().

@byrnedj (Owner) left a comment

It looks really good, thanks. I left two minor comments; once you respond I can merge this.

@guptask force-pushed the dsa_eviction_promotion branch from 59a71ee to eb23469 on October 16, 2023 10:00
@guptask requested a review from byrnedj on October 16, 2023 10:12
@byrnedj force-pushed the bg-improvements-no-pool-rebal branch from 36129e4 to c67df92 on February 28, 2024 21:27
@byrnedj force-pushed the bg-improvements-no-pool-rebal branch from c67df92 to 57eade8 on March 25, 2024 19:20