ckane (Contributor) commented Jul 7, 2025

Motivation and Context

The problem is described in #17321 - due to some apparent change in Kbuild, our source tree stopped building when the build was done in a separate directory, e.g.:

git clone https://github.com/openzfs/zfs zfs
cd zfs && ./autogen.sh
cd ..
mkdir zfs-build
cd zfs-build
../zfs/configure --with-linux=/usr/src/linux
make gitrev && make -C module/ modules

The make -C module/ modules command would typically fail with messages about not finding any matching targets for the *.o files, usually starting with os/linux/spl/spl-atomic.o because it sorts first in lexical order.

Description

Though I am uncertain what change in Kbuild caused this, I was able to determine that the pattern targets in the Linux Kbuild system are no longer matching the %.o targets in subdirectories. In response, I added some dynamically generated targets to the module/Kbuild file to fill in the missing targets and allow the build to succeed. This works for me at the moment for both in-source and out-of-source builds.

I am not certain how well it works with earlier kernels (like the 6.1.x branch, which probably doesn't have the Kbuild change).

If someone understands better what causes the breakage and can propose a cleaner fix here, I'm happy to implement it instead of mine. Otherwise, I'd recommend accepting this one (if it passes testing) until that mystery can be sorted out, so that things like the Arch AUR zfs-dkms-git package can work again, as well as anything else that builds like this.

How Has This Been Tested?

I have tested the module build on 6.15.x and 6.16-rc kernels.
I have tested both out-of-source and in-source module builds.
I have not yet tested kernels where the out-of-source build was still working; maybe someone with one of these kernels already in place can give this a whirl and approve here.

robn (Member) commented Jul 7, 2025

Oh, interesting. I wonder if that's related to this output change in the build recently. Up to and including 6.12:

  CC [M]  /home/robn/code/zfs/module/zfs/abd.o
  CC [M]  /home/robn/code/zfs/module/zfs/aggsum.o
  CC [M]  /home/robn/code/zfs/module/zfs/arc.o

6.13+:

  CC [M]  zfs/abd.o
  CC [M]  zfs/aggsum.o
  CC [M]  zfs/arc.o

Regardless, I'll have a look over this tonight.
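
If the change in output really does track the kernel version as observed above, a build script could guess which path style to expect from the version string alone. The following is a minimal sketch under that assumption. The function name kbuild_path_style is hypothetical, the 6.13 cutoff comes only from the logs quoted in this comment, and distribution kernels with backported Kbuild changes may behave differently.

```shell
#!/bin/sh
# Hypothetical helper: guess whether Kbuild will print absolute or relative
# object paths for an external module, based only on the kernel version.
# Assumption (taken from the logs above): kernels up to 6.12 print absolute
# paths, kernels from 6.13 on print paths relative to the module directory.
kbuild_path_style() {
    version=$1
    major=${version%%.*}          # "6.15.5-200.fc42" -> "6"
    rest=${version#*.}            # "6.15.5-200.fc42" -> "15.5-200.fc42"
    minor=${rest%%[!0-9]*}        # "15.5-200.fc42"   -> "15"

    if [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -ge 13 ]; }; then
        echo relative
    else
        echo absolute
    fi
}
```

It could be called as kbuild_path_style "$(uname -r)" before parsing build logs. The sketch expects a version string of the form major.minor with an optional suffix and does no validation beyond that.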

robn (Member) commented Jul 7, 2025

Seems to build in- and out-of-dir back to 4.19, but I didn't try very hard yet. I'm guessing it's at least in part related to torvalds/linux@b1992c3772e6, but I haven't had a chance to really think it through either.

(Sorry that's not more helpful, but I'm out of brain today. I'll try to look more tomorrow).

ckane (Contributor, Author) commented Jul 7, 2025

> Seems to build in- and out-of-dir back to 4.19, but I didn't try very hard yet. I'm guessing it's at least in part related to torvalds/linux@b1992c3772e6, but I haven't had a chance to really think it through either.
>
> (Sorry that's not more helpful, but I'm out of brain today. I'll try to look more tomorrow).

Thanks for the testing. That sounds good, as I think the earliest kernel we're committed to supporting is something like 5.13 or 5.18 in the next major release. I tried unwinding the spaghetti in Linux's Kbuild code for much of the weekend myself and ran out of brain too, which is largely the reason I put this band-aid together after coming up empty on the root cause. It might be related to the $(src) refactor, but I'm not sure: I came across that change as well and it looked like a likely culprit, but I wasn't able to nail down the "how?", let alone "what's the proper way to do this?"

I scanned through the other DKMS modules I've got on my system, and the closest one in similarity (with subdir targets under the module build dir) was the vboxhost modules, which still (and for a while now) implement the compilation steps inside the vendor's Kbuild Makefile logic. Though I chalk that behavior up more to VirtualBox probably exposing some details of Oracle's in-house CI/CD and engineering practices than to anything related to the regression we're encountering here.

ckane (Contributor, Author) commented Jul 7, 2025

> Oh, interesting. I wonder if that's related to this output change in the build recently. Up to and including 6.12:
>
>   CC [M]  /home/robn/code/zfs/module/zfs/abd.o
>   CC [M]  /home/robn/code/zfs/module/zfs/aggsum.o
>   CC [M]  /home/robn/code/zfs/module/zfs/arc.o
>
> 6.13+:
>
>   CC [M]  zfs/abd.o
>   CC [M]  zfs/aggsum.o
>   CC [M]  zfs/arc.o
>
> Regardless, I'll have a look over this tonight.

So, reflecting on this comment, I went back to module/Kbuild.in in the source tree and noticed that we're setting src and obj there when KBUILD_EXTMOD is set:

...
ifneq ($(KBUILD_EXTMOD),)
zfs_include = @abs_top_srcdir@/include
icp_include = @abs_srcdir@/icp/include
zstd_include = @abs_srcdir@/zstd/include
ZFS_MODULE_CFLAGS += -include @abs_top_builddir@/zfs_config.h
ZFS_MODULE_CFLAGS += -I@abs_top_builddir@/include
src = @abs_srcdir@
obj = @abs_builddir@
else
...

However, looking at my out-of-source build tree, both @abs_srcdir@ and @abs_builddir@ expanded to the same path (the source tree).

My bad, I was looking at module/Kbuild in the wrong project directory for that last comment.
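
One way to avoid this kind of mix-up is to print what the two substitutions expanded to in a specific generated Kbuild file. The following is a minimal sketch: the function name kbuild_src_obj is hypothetical, and it assumes the generated file contains unindented "src = ..." and "obj = ..." lines like those in the Kbuild.in excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: report what src and obj were set to in a generated
# Kbuild file. Assumes the file contains unindented "src = ..." and
# "obj = ..." lines, as in the Kbuild.in excerpt above.
kbuild_src_obj() {
    kbuild_file=$1
    # Take the first matching assignment of each variable.
    src_dir=$(sed -n 's/^src = //p' "$kbuild_file" | head -n 1)
    obj_dir=$(sed -n 's/^obj = //p' "$kbuild_file" | head -n 1)

    echo "src: $src_dir"
    echo "obj: $obj_dir"
    if [ "$src_dir" = "$obj_dir" ]; then
        echo "same directory (in-source build)"
    else
        echo "different directories (out-of-source build)"
    fi
}
```

With the directory layout from the reproduction steps in the description, this would be run as kbuild_src_obj zfs-build/module/Kbuild, that is, against the file in the build tree rather than anything in the source checkout.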

AttilaFueloep (Contributor) commented:

> Oh, interesting. I wonder if that's related to this output change in the build recently. Up to and including 6.12:

Not sure if it helps, but I noticed recently (#17541) that newer kernels involve an additional make step, changing to the build directory.

make -C /usr/src/kernels/6.15.5-200.fc42.x86_64  \
        ...
  	M="$PWD"  CONFIG_DEBUG_INFO=y CONFIG_ZFS=m modules
  make[5]: Entering directory '/usr/src/kernels/6.15.5-200.fc42.x86_64'
  make[6]: Entering directory '/tmp/zfs-build-zfs-H0flgge8/BUILD/zfs-kmod-2.3.99-build/zfs-kmod-2.3.99 _kmod_build_6.15.5-200.fc42.x86_64/module'
    CC [M]  os/linux/spl/spl-atomic.o

vs.

  make -C /usr/src/kernels/5.14.0-596.el9.x86_64  \
  	 ...
  	M="$PWD"  CONFIG_DEBUG_INFO=y CONFIG_ZFS=m modules
  make[5]: Entering directory '/usr/src/kernels/5.14.0-596.el9.x86_64'
    CC [M]  /tmp/zfs-build-zfs-wS0IOhqi/BUILD/zfs-2.3.99/module/os/linux/spl/spl-atomic.o

ckane (Contributor, Author) commented Jul 14, 2025

> > Oh, interesting. I wonder if that's related to this output change in the build recently. Up to and including 6.12:
>
> Not sure if it helps, but I noticed recently (#17541) that newer kernels involve an additional make step, changing to the build directory.
>
> make -C /usr/src/kernels/6.15.5-200.fc42.x86_64  \
>         ...
>   	M="$PWD"  CONFIG_DEBUG_INFO=y CONFIG_ZFS=m modules
>   make[5]: Entering directory '/usr/src/kernels/6.15.5-200.fc42.x86_64'
>   make[6]: Entering directory '/tmp/zfs-build-zfs-H0flgge8/BUILD/zfs-kmod-2.3.99-build/zfs-kmod-2.3.99 _kmod_build_6.15.5-200.fc42.x86_64/module'
>     CC [M]  os/linux/spl/spl-atomic.o
>
> vs.
>
>   make -C /usr/src/kernels/5.14.0-596.el9.x86_64  \
>   	 ...
>   	M="$PWD"  CONFIG_DEBUG_INFO=y CONFIG_ZFS=m modules
>   make[5]: Entering directory '/usr/src/kernels/5.14.0-596.el9.x86_64'
>     CC [M]  /tmp/zfs-build-zfs-wS0IOhqi/BUILD/zfs-2.3.99/module/os/linux/spl/spl-atomic.o

Yeah, this is what we were talking about above, too. It seems the way targets are defined changed, but after messing with the existing code a bunch I couldn't figure out why, after that change, the Kbuild recipes no longer build these targets automatically with our existing code.

ckane force-pushed the fix-out-of-src-build branch 2 times, most recently from 4ea428f to f7a335b, on July 22, 2025 15:53
behlendorf (Contributor) left a comment

I played about with this a bit myself, but was only able to come up with worse ways to handle this. Unless someone has a better solution for this, I'm good with merging this fix until we figure something better out. @ckane thanks for digging in to this.

@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Jul 23, 2025
ckane (Contributor, Author) commented Jul 24, 2025

> I played about with this a bit myself, but was only able to come up with worse ways to handle this. Unless someone has a better solution for this, I'm good with merging this fix until we figure something better out. @ckane thanks for digging in to this.

You're welcome. I've tried similar things over the past week or so and haven't found a more elegant way to address what is going on, either.

The root-level targets seem to be generated fine by Kbuild (and thus, most kernel modules build fine as they're typically a flat src tree), but the targets further down in submodules seem to not be traversed and generated by Kbuild anymore, even if we put their *.o targets into the correct list.
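
To get a rough idea of which objects fall into the affected category for a given module tree, the C sources below the top level can be listed and mapped to their object names. This is only a sketch: the function name list_subdir_objects is hypothetical, it ignores assembly sources, and it lists every .c file it finds rather than only those named in the Kbuild object lists.

```shell
#!/bin/sh
# Hypothetical helper: list the object files that would correspond to C
# sources located in subdirectories of a module tree (not the top level).
list_subdir_objects() {
    (
        cd "$1" || exit 1
        # -path './*/*' only matches files at least one directory deep.
        find . -path './*/*' -name '*.c' |
            sed -e 's|^\./||' -e 's|\.c$|.o|' |
            LC_ALL=C sort
    )
}
```

Run against the module/ directory of the source tree, the first line of output should correspond to the first object a failing build complains about, since both are in lexical order.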

The linux kernel modules haven't been building successfully when the
build occurs in a separate directory than the source code, which is a
common build pattern in Linux. Was not able to determine the root cause,
but the %.o targets in subdirectories are no longer being matched by the
pattern targets in the Linux Kbuild system. This change fixes the issue
by dynamically creating the missing ones inside our Kbuild.

Signed-off-by: Coleman Kane <[email protected]>
ckane force-pushed the fix-out-of-src-build branch from f7a335b to ad41d15 on July 24, 2025 19:18
@github-actions github-actions bot removed the Status: Accepted Ready to integrate (reviewed, tested) label Jul 24, 2025
ckane (Contributor, Author) commented Jul 24, 2025

Also just now rebased it against latest master

ckane (Contributor, Author) commented Jul 24, 2025

oops sorry @behlendorf looks like that removed the "Accepted" status tag

@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Jul 24, 2025
@behlendorf behlendorf merged commit 5a9b9c7 into openzfs:master Jul 24, 2025
22 of 25 checks passed
amotin pushed a commit to amotin/zfs that referenced this pull request Jul 28, 2025
The linux kernel modules haven't been building successfully when the
build occurs in a separate directory than the source code, which is a
common build pattern in Linux. Was not able to determine the root cause,
but the %.o targets in subdirectories are no longer being matched by the
pattern targets in the Linux Kbuild system. This change fixes the issue
by dynamically creating the missing ones inside our Kbuild.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Coleman Kane <[email protected]>
Closes openzfs#17517
ze-zhang commented:

With Yocto 5.2, using meta-riscv to build the RISC-V version, the problem still exists.

Since Linux 6.13, -f can be used instead of -C, so I changed Makefile.in accordingly. Output:

[Screenshot 2025-07-28 152513]

ze-zhang commented:

So the Kbuild file must be in the same directory that stores the source code?
ChatGPT said:

Yes — in the Linux kernel build system, when building an external module, the Kbuild file (or more commonly, a Makefile containing obj-m := ...) must be placed in the source directory, not the build directory.
✅ Why the Kbuild/Makefile must be in the source directory

When you build an external module using the kernel build system like this:

make -C /lib/modules/$(uname -r)/build M=/path/to/module-src O=/path/to/module-build modules

M=/path/to/module-src tells the kernel to look in this directory for the module’s source files and the corresponding Kbuild/Makefile.

O=/path/to/module-build tells it to put all the build output (e.g. .o, .mod.c, .ko) into that separate directory.

The kernel build system expects to find obj-m := my_module.o in the source dir (i.e. where you point M=...), and it always looks for it there, never in the build directory.
🔁 Recap: file roles
| File | Location | Purpose |
| --- | --- | --- |
| Kbuild / Makefile | Source directory (M=...) | Tells kernel what to build (obj-m :=) |
| Output files | Build directory (O=...) | .o, .ko, .mod.c, etc. go here |

behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Aug 13, 2025
ixhamza pushed a commit to truenas/zfs that referenced this pull request Aug 28, 2025
spauka pushed a commit to spauka/zfs that referenced this pull request Aug 30, 2025
aswild added a commit to aswild/meta-wild-common that referenced this pull request Sep 7, 2025
ZFS can't compile in a separate build directory on newer kernels, which
I hit when building Linux 6.12.

The problem is described in [1] and fixed in [2] but that fix is already
present in 2.3.4 and the problem still persists in Yocto builds.

I don't know what's wrong, it seems to be an issue deep in the guts of
Kbuild, but building in-tree (${S} == ${B}) seems to work so just do
that.

[1] openzfs/zfs#17321
[2] openzfs/zfs#17517
bugclerk pushed a commit to truenas/zfs that referenced this pull request Sep 8, 2025
(cherry picked from commit 7191c1d)