-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Add tunables to control inode & dentry reclaim behaviour #17746
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Ahh, doesn't compile on 6.17, because of the change to |
I don't know what is the kernel's motivation to give this flexibility to the file system, but I doubt it was just to make user configure it. I won't object too hard, but this is not a kind of tunables I prefer. We should try to be more clever. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the number of performance issues we've seen over the years related to how the ARC+DBUF caches can interact with the kernel's inode and dentry caches I'm okay adding these tunables. They really shouldn't be needed, but I'm sure having them available with help us experiment with better tuning the behavior.
It's nice to have the flexibility of controlling the release of inode and dentry data in memory. I can throw heavy load to test this PR once the compiling error in 6.17 is fixed. |
bd5cce2
to
3c4baf6
Compare
Last push addresses the nits, and adds the support for 6.17 (replaces setting |
@amotin the motivation is that it lets the filesystem set its own caching policy and/or do all the caching itself, because it has a better idea of the cost of reinflating an inode/dentry vs holding it in memory.
Agreed; I would prefer the ARC understand and manage this too; possibly through the dbuf cache, not sure (I have ideas of course). For now, this was intended as the initial hook.
Yeah, I agree. My position on tunables is that they shouldn't exist at all, as in, there should be more sensible and cross-platform ways to configure the system, including self-tuning. But for now they're the thing we have. These I definitely don't want to see used as a general tuning mechanism though; I don't want to have to support them. I wonder if they'd be better named with a |
Skimming the |
Traditionally, unused inodes would be held on the superblock inode cache until the associated on-disk file is removed or the kernel requests reclaim. On filesystems with millions of rarely-used files, this can be a lot of unusable memory. Here we implement the superblock drop_inode method, and add a zfs_delete_inode tunable to control its behaviour. By default it continues the traditional behaviour, but when the tunable is enabled, we signal that the inode should be deleted immediately when the last reference is dropped, rather than cached. This releases the associated data to the dbuf cache and ARC, allowing them to be reclaimed normally. Sponsored-by: Klara, Inc. Sponsored-by: Fastmail Pty Ltd Signed-off-by: Rob Norris <[email protected]>
Traditionally, unused dentries would be cached in the dentry cache until the associated entry is no longer on disk. The cached dentry continues to hold an inode reference, causing the inode to be pinned (see previous commit). Here we implement the dentry op d_delete, which is roughly analogous to the drop_inode superblock op, and add a zfs_delete_dentry tunable to control its behaviour. By default it continues the traditional behaviour, but when the tunable is enabled, we signal that an unused dentry should be freed immediately, releasing its inode reference, and so allowing that inode to be deleted if no longer in use. Sponsored-by: Klara, Inc. Sponsored-by: Fastmail Pty Ltd Signed-off-by: Rob Norris <[email protected]>
3c4baf6
to
6f150e1
Compare
@behlendorf this is good, done! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, if you see it useful for anything.
Traditionally, unused dentries would be cached in the dentry cache until the associated entry is no longer on disk. The cached dentry continues to hold an inode reference, causing the inode to be pinned (see previous commit). Here we implement the dentry op d_delete, which is roughly analogous to the drop_inode superblock op, and add a zfs_delete_dentry tunable to control its behaviour. By default it continues the traditional behaviour, but when the tunable is enabled, we signal that an unused dentry should be freed immediately, releasing its inode reference, and so allowing that inode to be deleted if no longer in use. Sponsored-by: Klara, Inc. Sponsored-by: Fastmail Pty Ltd Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes #17746
Traditionally, unused inodes would be held on the superblock inode cache until the associated on-disk file is removed or the kernel requests reclaim. On filesystems with millions of rarely-used files, this can be a lot of unusable memory. Here we implement the superblock drop_inode method, and add a zfs_delete_inode tunable to control its behaviour. By default it continues the traditional behaviour, but when the tunable is enabled, we signal that the inode should be deleted immediately when the last reference is dropped, rather than cached. This releases the associated data to the dbuf cache and ARC, allowing them to be reclaimed normally. Sponsored-by: Klara, Inc. Sponsored-by: Fastmail Pty Ltd Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17746
Traditionally, unused dentries would be cached in the dentry cache until the associated entry is no longer on disk. The cached dentry continues to hold an inode reference, causing the inode to be pinned (see previous commit). Here we implement the dentry op d_delete, which is roughly analogous to the drop_inode superblock op, and add a zfs_delete_dentry tunable to control its behaviour. By default it continues the traditional behaviour, but when the tunable is enabled, we signal that an unused dentry should be freed immediately, releasing its inode reference, and so allowing that inode to be deleted if no longer in use. Sponsored-by: Klara, Inc. Sponsored-by: Fastmail Pty Ltd Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17746
[Sponsors: Klara, Inc., Fastmail Pty Ltd]
Motivation and Context
This PR came out of customer investigation into the memory reclaim issues in 2.3.0 that were eventually identified as a regression and resolved by #17487.
That change is undoubtedly correct, ie, when the kernel comes asking for memory, we need to give back what we can. We were tackling it from the other side, trying to release things earlier if both the kernel and OpenZFS agree that they won't be needed again.
This PR adds the scaffolding and a pair of tunables for the kernel's inode and dentry reclaim queries. In both cases, the kernel asks the filesystem "I don't need these any more, should I free them or cache them?". We didn't implement these queries at all at all; this PR adds them, with tunables to control simple "always" and "never" responses, with the defaults set to "always" to keep the traditional behaviour.
The intent at this point is to have a couple of extra hooks available for debugging related issues at customer sites (with somewhat finer control that
echo 3 > drop_caches
, the brute force way to achieve same), but also some shapes to allow experimentation with cache control policy (eg #17159).Description
Implement
super_operations.drop_inode()
, called by the kernel when it wants to drop the last inode reference. The return decides if it should destroy the inode immediately, or put it on thes_inodes
list in the superblock. The tunablezfs_delete_inode
decides if the kernel'sgeneric_delete_inode()
orgeneric_drop_inode()
policy functions are called, which signal either an immediate destroy, or (default) cache if the file still exists on disk.Implement
dentry_operations.d_delete()
, called by the kernel when it wants to drop the last dentry reference. The return decides if it should destroy the dentry immediately, or return it to the dentry cache. Thezfs_delete_dentry
tunable directly decides this; the default of0
returns it to the cache. Worth noting that while we don't interact directly with the dentry cache, every active dentry holds an inode reference, which does pin the inode as above.How Has This Been Tested?
Compiled back to 4.19, ZTS run on 6.1.x completed.
I ran both of these "on" (destroy everything immediately) as a daily driver for a month. Mostly unnoticeable on my under-loaded laptop, but at least, nothing crashed or broke.
Types of changes
Checklist:
Signed-off-by
.