@pwhelan pwhelan commented Aug 22, 2025

Summary

Allow setting FLB_USE_SIMDUTF=No explicitly to disable simdutf. This is the only way to compile Fluent Bit on Ubuntu 22.04 and earlier, or on other Linux distributions with gcc<=12. Otherwise the simdutf library fails to build with the following error:

error: inlining failed in call to 'always_inline' '__m256i _mm256_cmpeq_epi8(__m256i, __m256i)': target specific option mismatch
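
As a rough illustration of how the override might be supplied, the sketch below uses a CMake initial-cache script instead of -D flags on the command line. The file name no-simdutf.cmake is hypothetical, and turning off FLB_UNICODE_ENCODER as well is an assumption based on the walkthrough further down, which says configuration fails when the encoder is requested without simdutf.

```cmake
# no-simdutf.cmake: hypothetical initial-cache script (the file name is illustrative).
# Possible usage from a build directory:  cmake -C ../no-simdutf.cmake ..
# Roughly equivalent to passing -DFLB_USE_SIMDUTF=No -DFLB_UNICODE_ENCODER=Off.

# Pre-seed the cache so that the `if(NOT DEFINED FLB_USE_SIMDUTF)` guard
# in the top-level CMakeLists.txt leaves this value alone.
set(FLB_USE_SIMDUTF No CACHE BOOL "Explicitly disable simdutf")

# Assumption: the Unicode encoder depends on simdutf, so it is switched off too.
set(FLB_UNICODE_ENCODER Off CACHE BOOL "Disable the simdutf-backed Unicode encoder")
```

Whether the second setting is needed depends on the default of FLB_UNICODE_ENCODER in the tree being built; it is included here only to avoid the fatal error described in the review.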


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • [N/A] Example configuration file for the change
  • [N/A] Debug log output from testing the change
  • [N/A] Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Allow external override of SIMDUTF usage via build flag.
    • Expose a build indicator for Unicode encoder support to dependent components.
  • Bug Fixes

    • Build SIMDUTF only when Unicode encoding is enabled and the override is set.
    • Respect pre-defined SIMDUTF settings to prevent unintended changes.
    • Emit a clear fatal error if Unicode encoder is requested without SIMDUTF, preventing misconfiguration.

coderabbitai bot commented Aug 22, 2025

Walkthrough

Respect FLB_USE_SIMDUTF external overrides, only enable the simdutf subdirectory when both FLB_UNICODE_ENCODER and FLB_USE_SIMDUTF are true, define FLB_HAVE_UNICODE_ENCODER when enabling the SIMDUTF path, and emit a fatal error if FLB_UNICODE_ENCODER is requested without FLB_USE_SIMDUTF.

Changes

| Cohort / File(s) | Summary of Changes |
|---|---|
| Build configuration: CMakeLists.txt | Guarded assignment of FLB_USE_SIMDUTF so it is only set when not already defined; require both FLB_UNICODE_ENCODER and FLB_USE_SIMDUTF to add the simdutf subdirectory; define FLB_HAVE_UNICODE_ENCODER via build definitions when SIMDUTF path is enabled; emit a fatal error if FLB_UNICODE_ENCODER is ON while FLB_USE_SIMDUTF is OFF. |

Sequence Diagram(s)

sequenceDiagram
  participant User as User / CMake CLI
  participant CMake as CMake configuration
  participant Build as Build system

  rect rgb(245, 250, 240)
    note over User,CMake: User may pass -D flags
    User->>CMake: -DFLB_UNICODE_ENCODER=ON/OFF\n(optional) -DFLB_USE_SIMDUTF=ON/OFF
    CMake->>CMake: If `FLB_USE_SIMDUTF` NOT DEFINED -> set default
  end

  alt FLB_UNICODE_ENCODER == ON and FLB_USE_SIMDUTF == ON
    CMake->>CMake: add definition `FLB_HAVE_UNICODE_ENCODER`
    CMake->>Build: add_subdirectory(simdutf)
    note over Build: SIMDUTF path and Unicode encoder enabled
  else FLB_UNICODE_ENCODER == ON and FLB_USE_SIMDUTF == OFF
    CMake->>User: fatal_error("FLB_UNICODE_ENCODER requires FLB_USE_SIMDUTF")
    note over CMake,User: Configuration fails
  else otherwise
    CMake-->>Build: skip simdutf
    note over Build: SIMDUTF not built
  end
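
The flow in the diagram can be approximated by a small standalone CMakeLists.txt. This is a sketch under stated assumptions, not the code that was merged: the project name, the default of FLB_UNICODE_ENCODER, the use of check_language() to probe for a C++ compiler, and add_compile_definitions() as a stand-in for the project's FLB_DEFINITION macro and the add_subdirectory() call are all illustrative.

```cmake
cmake_minimum_required(VERSION 3.12)
project(simdutf_gating_sketch C)   # hypothetical project name

# Assumption: the encoder defaults to ON; the real default may differ.
option(FLB_UNICODE_ENCODER "Enable the Unicode encoder" ON)

# Assumption: probe for a C++ compiler without enabling the language yet.
# check_language(CXX) sets CMAKE_CXX_COMPILER when a compiler is found.
include(CheckLanguage)
check_language(CXX)

# Respect an external -DFLB_USE_SIMDUTF=...; only compute a default otherwise.
if(NOT DEFINED FLB_USE_SIMDUTF)
  if(CMAKE_CXX_COMPILER)
    set(FLB_USE_SIMDUTF Yes)
  else()
    set(FLB_USE_SIMDUTF No)
  endif()
endif()

if(FLB_UNICODE_ENCODER AND FLB_USE_SIMDUTF)
  enable_language(CXX)
  set(CMAKE_CXX_STANDARD 11)
  # Stand-in for add_subdirectory(${FLB_PATH_LIB_SIMDUTF} EXCLUDE_FROM_ALL)
  # and FLB_DEFINITION(FLB_HAVE_UNICODE_ENCODER) in the real build.
  add_compile_definitions(FLB_HAVE_UNICODE_ENCODER)
  message(STATUS "simdutf path and Unicode encoder enabled")
elseif(FLB_UNICODE_ENCODER)
  message(FATAL_ERROR "FLB_UNICODE_ENCODER requires FLB_USE_SIMDUTF")
else()
  message(STATUS "simdutf not built")
endif()
```

Configuring this sketch with -DFLB_USE_SIMDUTF=No alone reaches the FATAL_ERROR branch, while adding -DFLB_UNICODE_ENCODER=Off takes the final branch and skips simdutf.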

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I twitch my whiskers at CMake's cue,
Flags set true or false — I chew.
When Unicode wants SIMD to play,
Both must nod to join the fray.
Overrides kept, the build hops through. 🐇

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between b22b583 and 9a33c70.

📒 Files selected for processing (1)
  • CMakeLists.txt (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • CMakeLists.txt
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
CMakeLists.txt (1)

602-607: Action: add strict C++ settings and fix CMake gating inconsistencies for simdutf

Short summary: rg shows no unguarded uses of FLB_HAVE_UNICODE_ENCODER (all usages are behind #ifdef), but CMake gating is inconsistent — some subdirectories/install rules are still enabled by FLB_UNICODE_ENCODER alone. Please apply the fixes below.

Files / changes to make:

  • CMakeLists.txt (top-level, lines ~602-607) — add REQUIRED + disable extensions:
 if(FLB_UNICODE_ENCODER AND FLB_USE_SIMDUTF)
   enable_language(CXX)
-  set (CMAKE_CXX_STANDARD 11)
+  set (CMAKE_CXX_STANDARD 11)
+  set (CMAKE_CXX_STANDARD_REQUIRED ON)
+  set (CMAKE_CXX_EXTENSIONS OFF)
   add_subdirectory(${FLB_PATH_LIB_SIMDUTF} EXCLUDE_FROM_ALL)
   FLB_DEFINITION(FLB_HAVE_UNICODE_ENCODER)
 endif()
  • src/CMakeLists.txt (around line ~349) — gate adding the local simdutf connector on FLB_USE_SIMDUTF too:
-if(FLB_UNICODE_ENCODER)
+if(FLB_UNICODE_ENCODER AND FLB_USE_SIMDUTF)
   add_subdirectory(simdutf)
   set(src
     ${src}
  • include/CMakeLists.txt (around line ~37) — gate header installation the same way:
-if(FLB_UNICODE_ENCODER)
+if(FLB_UNICODE_ENCODER AND FLB_USE_SIMDUTF)
   file(GLOB headers "fluent-bit/simdutf/*.h")
   install(FILES ${headers}
     DESTINATION ${FLB_INSTALL_INCLUDEDIR}/fluent-bit/simdutf/
     COMPONENT headers

Optional: tighten the FLB_COVERAGE CXX flags gating (lines ~433-437) to require FLB_USE_SIMDUTF when setting CXX flags so flags are only set when C++ targets will be built.
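
One way the optional coverage suggestion could look is sketched below. The actual contents of lines ~433-437 are not shown in this review, so the --coverage flag, the option defaults, and the project name are assumptions chosen only to illustrate the gating.

```cmake
cmake_minimum_required(VERSION 3.12)
project(coverage_gating_sketch C)   # hypothetical project name

# Assumed defaults, for illustration only.
option(FLB_COVERAGE        "Build with coverage instrumentation" OFF)
option(FLB_UNICODE_ENCODER "Enable the Unicode encoder"          ON)
option(FLB_USE_SIMDUTF     "Use simdutf (requires C++)"          ON)

if(FLB_COVERAGE)
  # C flags are always needed when coverage is on.
  set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} --coverage")

  # Only touch the C++ flags when C++ targets will actually be built.
  if(FLB_UNICODE_ENCODER AND FLB_USE_SIMDUTF)
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} --coverage")
  endif()
endif()
```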

Reasoning / result of verification:

  • ripgrep found all FLB_HAVE_UNICODE_ENCODER usages guarded by #ifdef (no unconditional references).
  • ripgrep also showed CMake still adds/installs simdutf artifacts when only FLB_UNICODE_ENCODER is set (src/CMakeLists.txt, include/CMakeLists.txt). That creates a mismatch with your top-level change which defines FLB_HAVE_UNICODE_ENCODER only when FLB_USE_SIMDUTF is true — causing incomplete/incorrect C++ target wiring or header installs when FLB_USE_SIMDUTF is off.

Please make the three small CMake edits above to keep gating consistent and to enforce a strict C++ standard.

🧹 Nitpick comments (1)
CMakeLists.txt (1)

135-143: Make FLB_USE_SIMDUTF a cached option with a computed default; normalize bools to ON/OFF.

Intent looks good and the guard respects external -D overrides. To improve UX and consistency with CMake conventions:

  • Expose FLB_USE_SIMDUTF as a CACHE BOOL so it shows up in ccmake/GUI and persists across re-configures.
  • Use canonical ON/OFF values (CMake also accepts Yes/No as boolean constants, but ON/OFF is more idiomatic).
  • Emit a single summary message of the resolved value.

Suggested diff confined to this hunk:

-if (NOT DEFINED FLB_USE_SIMDUTF)
-  if(CMAKE_CXX_COMPILER)
-    message(STATUS "CXX compiler found, enable simdutf.")
-    set(FLB_USE_SIMDUTF Yes)
-  else()
-    message(STATUS "CXX compiler not found, disable simdutf.")
-    set(FLB_USE_SIMDUTF No)
-  endif()
-endif()
+if (NOT DEFINED FLB_USE_SIMDUTF)
+  set(_FLB_USE_SIMDUTF_DEFAULT OFF)
+  if (CMAKE_CXX_COMPILER)
+    set(_FLB_USE_SIMDUTF_DEFAULT ON)
+  endif()
+  option(FLB_USE_SIMDUTF "Use simdutf for Unicode encoding (requires C++)" ${_FLB_USE_SIMDUTF_DEFAULT})
+else()
+  # Surface user-provided value in cache (no FORCE, to respect -D overrides)
+  set(FLB_USE_SIMDUTF "${FLB_USE_SIMDUTF}" CACHE BOOL "Use simdutf for Unicode encoding (requires C++)")
+endif()
+message(STATUS "FLB_USE_SIMDUTF=${FLB_USE_SIMDUTF}")

This keeps override semantics intact and aligns with how other toggles are exposed.

📥 Commits

Reviewing files that changed from the base of the PR and between 54acd76 and ea1961b.

📒 Files selected for processing (1)
  • CMakeLists.txt (2 hunks)

pwhelan commented Aug 22, 2025

Some instances are also showing linking errors:

2025-08-22T21:59:01.0409136Z /usr/bin/ld: ../library/libsimdutf-static.a(simdutf.cpp.o): undefined reference to symbol '_ZSt20__throw_length_errorPKc@@GLIBCXX_3.4'
2025-08-22T21:59:01.0410689Z /usr/bin/ld: /lib/x86_64-linux-gnu/libstdc++.so.6: error adding symbols: DSO missing from command line

I have not yet been able to get to the root cause.

@cosmo0920 cosmo0920 left a comment

Looks reasonable. Thank you!

@pwhelan pwhelan merged commit eb77790 into master Aug 25, 2025
45 checks passed
@pwhelan pwhelan deleted the pwhelan-make-simdutf-optional branch August 25, 2025 14:13
cosmo0920 commented Aug 26, 2025

This patch should be backported to the 4.0 branch as well. Could you handle that, @pwhelan?

pwhelan commented Aug 26, 2025

> This patch should be backported to the 4.0 branch as well. Could you handle that, @pwhelan?

Any chance you can also look at #10786? It makes sure that FLB_UNICODE_ENCODER does not get enabled when FLB_USE_SIMDUTF is disabled.
