[libc][NFC] refactor Cortex memcpy code #148204

Merged
merged 5 commits into from
Jul 16, 2025

Conversation

gchatelet
Contributor

@gchatelet gchatelet commented Jul 11, 2025

This patch is in preparation for the Cortex memset implementation.
It does not change the generated code.

@llvmbot llvmbot added libc bazel "Peripheral" support tier build system: utils/bazel labels Jul 11, 2025
@llvmbot
Member

llvmbot commented Jul 11, 2025

@llvm/pr-subscribers-libc

Author: Guillaume Chatelet (gchatelet)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/148204.diff

4 Files Affected:

  • (modified) libc/src/string/memory_utils/CMakeLists.txt (+1)
  • (added) libc/src/string/memory_utils/arm/common.h (+52)
  • (modified) libc/src/string/memory_utils/arm/inline_memcpy.h (+46-80)
  • (modified) utils/bazel/llvm-project-overlay/libc/BUILD.bazel (+1)
diff --git a/libc/src/string/memory_utils/CMakeLists.txt b/libc/src/string/memory_utils/CMakeLists.txt
index a967247db53f4..633d9f12949d2 100644
--- a/libc/src/string/memory_utils/CMakeLists.txt
+++ b/libc/src/string/memory_utils/CMakeLists.txt
@@ -7,6 +7,7 @@ add_header_library(
     aarch64/inline_memcpy.h
     aarch64/inline_memmove.h
     aarch64/inline_memset.h
+    arm/common.h
     arm/inline_memcpy.h
     generic/aligned_access.h
     generic/byte_per_byte.h
diff --git a/libc/src/string/memory_utils/arm/common.h b/libc/src/string/memory_utils/arm/common.h
new file mode 100644
index 0000000000000..dafd4aaf02343
--- /dev/null
+++ b/libc/src/string/memory_utils/arm/common.h
@@ -0,0 +1,52 @@
+//===-- Common constants and defines for arm --------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_STRING_MEMORY_UTILS_ARM_COMMON_H
+#define LLVM_LIBC_SRC_STRING_MEMORY_UTILS_ARM_COMMON_H
+
+#include "src/__support/macros/attributes.h" // LIBC_INLINE_VAR
+#include "src/string/memory_utils/utils.h"   // CPtr, Ptr, distance_to_align
+
+#include <stddef.h> // size_t
+
+// https://libc.llvm.org/compiler_support.html
+// Support for [[likely]] / [[unlikely]]
+//  [X] GCC 12.2
+//  [X] Clang 12
+//  [ ] Clang 11
+#define LIBC_ATTR_LIKELY [[likely]]
+#define LIBC_ATTR_UNLIKELY [[unlikely]]
+
+#if defined(LIBC_COMPILER_IS_CLANG)
+#if LIBC_COMPILER_CLANG_VER < 1200
+#undef LIBC_ATTR_LIKELY
+#undef LIBC_ATTR_UNLIKELY
+#define LIBC_ATTR_LIKELY
+#define LIBC_ATTR_UNLIKELY
+#endif
+#endif
+
+namespace LIBC_NAMESPACE_DECL {
+
+LIBC_INLINE_VAR constexpr size_t kWordSize = sizeof(uint32_t);
+
+enum class BumpSize : bool { kNo = false, kYes = true };
+enum class BlockOp : bool { kFull = false, kByWord = true };
+
+LIBC_INLINE auto misaligned(CPtr ptr) {
+  return distance_to_align_down<kWordSize>(ptr);
+}
+
+LIBC_INLINE CPtr bitwise_or(CPtr a, CPtr b) {
+  return cpp::bit_cast<CPtr>(cpp::bit_cast<uintptr_t>(a) |
+                             cpp::bit_cast<uintptr_t>(b));
+}
+
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SRC_STRING_MEMORY_UTILS_ARM_COMMON_H
diff --git a/libc/src/string/memory_utils/arm/inline_memcpy.h b/libc/src/string/memory_utils/arm/inline_memcpy.h
index 61efebe29b485..ecf938d9ba3a6 100644
--- a/libc/src/string/memory_utils/arm/inline_memcpy.h
+++ b/libc/src/string/memory_utils/arm/inline_memcpy.h
@@ -10,57 +10,35 @@
 
 #include "src/__support/macros/attributes.h"   // LIBC_INLINE
 #include "src/__support/macros/optimization.h" // LIBC_LOOP_NOUNROLL
+#include "src/string/memory_utils/arm/common.h" // LIBC_ATTR_LIKELY, LIBC_ATTR_UNLIKELY
 #include "src/string/memory_utils/utils.h" // memcpy_inline, distance_to_align
 
 #include <stddef.h> // size_t
 
-// https://libc.llvm.org/compiler_support.html
-// Support for [[likely]] / [[unlikely]]
-//  [X] GCC 12.2
-//  [X] Clang 12
-//  [ ] Clang 11
-#define LIBC_ATTR_LIKELY [[likely]]
-#define LIBC_ATTR_UNLIKELY [[unlikely]]
-
-#if defined(LIBC_COMPILER_IS_CLANG)
-#if LIBC_COMPILER_CLANG_VER < 1200
-#undef LIBC_ATTR_LIKELY
-#undef LIBC_ATTR_UNLIKELY
-#define LIBC_ATTR_LIKELY
-#define LIBC_ATTR_UNLIKELY
-#endif
-#endif
-
 namespace LIBC_NAMESPACE_DECL {
 
 namespace {
 
-LIBC_INLINE_VAR constexpr size_t kWordSize = sizeof(uint32_t);
-
-enum Strategy {
-  ForceWordLdStChain,
-  AssumeWordAligned,
-  AssumeUnaligned,
-};
+template <size_t bytes>
+LIBC_INLINE void copy_assume_aligned(void *dst, const void *src) {
+  constexpr size_t alignment = bytes > kWordSize ? kWordSize : bytes;
+  memcpy_inline<bytes>(assume_aligned<alignment>(dst),
+                       assume_aligned<alignment>(src));
+}
 
-template <size_t bytes, Strategy strategy = AssumeUnaligned>
-LIBC_INLINE void copy_and_bump_pointers(Ptr &dst, CPtr &src) {
-  if constexpr (strategy == AssumeUnaligned) {
-    memcpy_inline<bytes>(assume_aligned<1>(dst), assume_aligned<1>(src));
-  } else if constexpr (strategy == AssumeWordAligned) {
-    static_assert(bytes >= kWordSize);
-    memcpy_inline<bytes>(assume_aligned<kWordSize>(dst),
-                         assume_aligned<kWordSize>(src));
-  } else if constexpr (strategy == ForceWordLdStChain) {
+template <size_t bytes, BlockOp block_op = BlockOp::kFull>
+LIBC_INLINE void copy_block_and_bump_pointers(Ptr &dst, CPtr &src) {
+  if constexpr (block_op == BlockOp::kFull) {
+    copy_assume_aligned<bytes>(dst, src);
+  } else {
     // We restrict loads/stores to 4 byte to prevent the use of load/store
-    // multiple (LDM, STM) and load/store double (LDRD, STRD). First, they may
-    // fault (see notes below) and second, they use more registers which in turn
-    // adds push/pop instructions in the hot path.
-    static_assert((bytes % kWordSize == 0) && (bytes >= kWordSize));
+    // multiple (LDM, STM) and load/store double (LDRD, STRD). First, they
+    // may fault (see notes below) and second, they use more registers which
+    // in turn adds push/pop instructions in the hot path.
+    static_assert(bytes >= kWordSize);
     LIBC_LOOP_UNROLL
-    for (size_t i = 0; i < bytes / kWordSize; ++i) {
-      const size_t offset = i * kWordSize;
-      memcpy_inline<kWordSize>(dst + offset, src + offset);
+    for (size_t offset = 0; offset < bytes; offset += kWordSize) {
+      copy_assume_aligned<kWordSize>(dst + offset, src + offset);
     }
   }
   // In the 1, 2, 4 byte copy case, the compiler can fold pointer offsetting
@@ -72,30 +50,19 @@ LIBC_INLINE void copy_and_bump_pointers(Ptr &dst, CPtr &src) {
   src += bytes;
 }
 
-LIBC_INLINE void copy_bytes_and_bump_pointers(Ptr &dst, CPtr &src,
-                                              const size_t size) {
+template <size_t bytes, BlockOp block_op, BumpSize bump_size = BumpSize::kYes>
+LIBC_INLINE void consume_by_aligned_block(Ptr &dst, CPtr &src, size_t &size) {
   LIBC_LOOP_NOUNROLL
-  for (size_t i = 0; i < size; ++i)
-    *dst++ = *src++;
-}
-
-template <size_t block_size, Strategy strategy>
-LIBC_INLINE void copy_blocks_and_update_args(Ptr &dst, CPtr &src,
-                                             size_t &size) {
-  LIBC_LOOP_NOUNROLL
-  for (size_t i = 0; i < size / block_size; ++i)
-    copy_and_bump_pointers<block_size, strategy>(dst, src);
-  // Update `size` once at the end instead of once per iteration.
-  size %= block_size;
-}
-
-LIBC_INLINE CPtr bitwise_or(CPtr a, CPtr b) {
-  return cpp::bit_cast<CPtr>(cpp::bit_cast<uintptr_t>(a) |
-                             cpp::bit_cast<uintptr_t>(b));
+  for (size_t i = 0; i < size / bytes; ++i)
+    copy_block_and_bump_pointers<bytes, block_op>(dst, src);
+  if constexpr (bump_size == BumpSize::kYes) {
+    size %= bytes;
+  }
 }
 
-LIBC_INLINE auto misaligned(CPtr a) {
-  return distance_to_align_down<kWordSize>(a);
+LIBC_INLINE void copy_bytes_and_bump_pointers(Ptr &dst, CPtr &src,
+                                              size_t size) {
+  consume_by_aligned_block<1, BlockOp::kFull, BumpSize::kNo>(dst, src, size);
 }
 
 } // namespace
@@ -125,20 +92,21 @@ LIBC_INLINE auto misaligned(CPtr a) {
     if (src_alignment == 0)
       LIBC_ATTR_LIKELY {
         // Both `src` and `dst` are now word-aligned.
-        copy_blocks_and_update_args<64, AssumeWordAligned>(dst, src, size);
-        copy_blocks_and_update_args<16, AssumeWordAligned>(dst, src, size);
-        copy_blocks_and_update_args<4, AssumeWordAligned>(dst, src, size);
+        consume_by_aligned_block<64, BlockOp::kFull>(dst, src, size);
+        consume_by_aligned_block<16, BlockOp::kFull>(dst, src, size);
+        consume_by_aligned_block<4, BlockOp::kFull>(dst, src, size);
       }
     else {
       // `dst` is aligned but `src` is not.
       LIBC_LOOP_NOUNROLL
       while (size >= kWordSize) {
-        // Recompose word from multiple loads depending on the alignment.
+        // Recompose word from multiple loads depending on the
+        // alignment.
         const uint32_t value =
             src_alignment == 2
                 ? load_aligned<uint32_t, uint16_t, uint16_t>(src)
                 : load_aligned<uint32_t, uint8_t, uint16_t, uint8_t>(src);
-        memcpy_inline<kWordSize>(assume_aligned<kWordSize>(dst), &value);
+        copy_assume_aligned<kWordSize>(dst, &value);
         dst += kWordSize;
         src += kWordSize;
         size -= kWordSize;
@@ -169,31 +137,33 @@ LIBC_INLINE auto misaligned(CPtr a) {
       if (size < 8)
         LIBC_ATTR_UNLIKELY {
           if (size & 1)
-            copy_and_bump_pointers<1>(dst, src);
+            copy_block_and_bump_pointers<1>(dst, src);
           if (size & 2)
-            copy_and_bump_pointers<2>(dst, src);
+            copy_block_and_bump_pointers<2>(dst, src);
           if (size & 4)
-            copy_and_bump_pointers<4>(dst, src);
+            copy_block_and_bump_pointers<4>(dst, src);
           return;
         }
       if (misaligned(src))
         LIBC_ATTR_UNLIKELY {
           const size_t offset = distance_to_align_up<kWordSize>(dst);
           if (offset & 1)
-            copy_and_bump_pointers<1>(dst, src);
+            copy_block_and_bump_pointers<1>(dst, src);
           if (offset & 2)
-            copy_and_bump_pointers<2>(dst, src);
+            copy_block_and_bump_pointers<2>(dst, src);
           size -= offset;
         }
     }
-  copy_blocks_and_update_args<64, ForceWordLdStChain>(dst, src, size);
-  copy_blocks_and_update_args<16, ForceWordLdStChain>(dst, src, size);
-  copy_blocks_and_update_args<4, AssumeUnaligned>(dst, src, size);
+  // `dst` and `src` are not necessarily both aligned at that point but this
+  // implementation assumes hardware support for unaligned loads and stores.
+  consume_by_aligned_block<64, BlockOp::kByWord>(dst, src, size);
+  consume_by_aligned_block<16, BlockOp::kByWord>(dst, src, size);
+  consume_by_aligned_block<4, BlockOp::kFull>(dst, src, size);
   if (size & 1)
-    copy_and_bump_pointers<1>(dst, src);
+    copy_block_and_bump_pointers<1>(dst, src);
   if (size & 2)
     LIBC_ATTR_UNLIKELY
-  copy_and_bump_pointers<2>(dst, src);
+  copy_block_and_bump_pointers<2>(dst, src);
 }
 
 [[maybe_unused]] LIBC_INLINE void inline_memcpy_arm(void *__restrict dst_,
@@ -210,8 +180,4 @@ LIBC_INLINE auto misaligned(CPtr a) {
 
 } // namespace LIBC_NAMESPACE_DECL
 
-// Cleanup local macros
-#undef LIBC_ATTR_LIKELY
-#undef LIBC_ATTR_UNLIKELY
-
 #endif // LLVM_LIBC_SRC_STRING_MEMORY_UTILS_ARM_INLINE_MEMCPY_H
diff --git a/utils/bazel/llvm-project-overlay/libc/BUILD.bazel b/utils/bazel/llvm-project-overlay/libc/BUILD.bazel
index b13a909770e58..5fa6dc1ee04fa 100644
--- a/utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+++ b/utils/bazel/llvm-project-overlay/libc/BUILD.bazel
@@ -4268,6 +4268,7 @@ libc_support_library(
         "src/string/memory_utils/aarch64/inline_memcpy.h",
         "src/string/memory_utils/aarch64/inline_memmove.h",
         "src/string/memory_utils/aarch64/inline_memset.h",
+        "src/string/memory_utils/arm/common.h",
         "src/string/memory_utils/arm/inline_memcpy.h",
         "src/string/memory_utils/generic/aligned_access.h",
         "src/string/memory_utils/generic/byte_per_byte.h",

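The block-consumption cascade in the diff above (64-byte, then 16-byte, then 4-byte blocks, then the byte tail) can be sketched in isolation. This is a simplified illustration, not the patch itself: `consume_by_block` and `copy_cascade` are hypothetical names, `std::memcpy` stands in for `memcpy_inline`, and the alignment handling and `BlockOp` selection are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the patch's BumpSize flag.
enum class BumpSize : bool { kNo = false, kYes = true };

// Hypothetical, simplified analogue of consume_by_aligned_block.
template <std::size_t bytes, BumpSize bump_size = BumpSize::kYes>
void consume_by_block(unsigned char *&dst, const unsigned char *&src,
                      std::size_t &size) {
  static_assert(bytes > 0, "block size must be non-zero");
  // Copy as many whole `bytes`-sized blocks as fit in `size`.
  const std::size_t blocks = size / bytes;
  for (std::size_t i = 0; i < blocks; ++i) {
    std::memcpy(dst, src, bytes);
    dst += bytes;
    src += bytes;
  }
  // Update `size` once at the end, and only when the caller needs it.
  if constexpr (bump_size == BumpSize::kYes)
    size %= bytes;
}

// Copies `size` bytes by cascading through decreasing block sizes.
void copy_cascade(void *dst_, const void *src_, std::size_t size) {
  auto *dst = static_cast<unsigned char *>(dst_);
  auto *src = static_cast<const unsigned char *>(src_);
  consume_by_block<64>(dst, src, size);
  consume_by_block<16>(dst, src, size);
  consume_by_block<4>(dst, src, size);
  // Fewer than 4 bytes remain once the 4-byte blocks are consumed.
  assert(size < 4);
  // The tail is copied byte by byte; `size` is not needed afterwards,
  // so the final update is skipped.
  consume_by_block<1, BumpSize::kNo>(dst, src, size);
}
```

Making the size update a template flag lets the byte-tail copy reuse the same loop without paying for a `size` update whose result is never read, which is how the diff expresses `copy_bytes_and_bump_pointers` in terms of `consume_by_aligned_block`.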
@gchatelet gchatelet changed the title [libc] refactor Cortex memcpy code in preparation of memset [libc] refactor Cortex memcpy code Jul 11, 2025
@gchatelet gchatelet changed the title [libc] refactor Cortex memcpy code [libc][NFC] refactor Cortex memcpy code Jul 11, 2025
@gchatelet gchatelet requested a review from lntue July 11, 2025 11:16
@gchatelet gchatelet merged commit 7c69c3b into llvm:main Jul 16, 2025
16 of 19 checks passed
@gchatelet gchatelet deleted the arm_improve_memset branch July 16, 2025 08:06
@llvm-ci
Collaborator

llvm-ci commented Jul 16, 2025

LLVM Buildbot has detected a new failure on builder libc-arm32-qemu-debian-dbg running on libc-arm32-qemu-debian while building libc,utils at step 4 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/215/builds/1253

Here is the relevant piece of the build log for the reference
Step 4 (annotate) failure: 'python ../llvm-zorg/zorg/buildbot/builders/annotated/libc-linux.py ...' (failure)
...
  Math tests using MPC will be skipped.


-- check-runtimes does nothing.
-- Configuring done
-- Generating done
-- Build files have been written to: /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/build
@@@BUILD_STEP build libc@@@
Running: ninja libc
[1/5] Building CXX object libc/src/string/CMakeFiles/libc.src.string.memcpy.dir/memcpy.cpp.o
FAILED: libc/src/string/CMakeFiles/libc.src.string.memcpy.dir/memcpy.cpp.o 
/usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_21_0_0_git -D_DEBUG -I/home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc -isystem /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g --target=arm-linux-gnueabihf -DLIBC_QSORT_IMPL=LIBC_QSORT_QUICK_SORT -DLIBC_ADD_NULL_CHECKS -DLIBC_ERRNO_MODE=LIBC_ERRNO_MODE_DEFAULT -fpie -fno-builtin -fno-exceptions -fno-lax-vector-conversions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -ftrivial-auto-var-init=pattern -fno-omit-frame-pointer -Wall -Wextra -Werror -Wconversion -Wno-sign-conversion -Wdeprecated -Wno-c99-extensions -Wno-gnu-imaginary-constant -Wno-pedantic -Wimplicit-fallthrough -Wwrite-strings -Wextra-semi -Wnewline-eof -Wnonportable-system-include-path -Wstrict-prototypes -Wthread-safety -Wglobal-constructors -O3 -DLIBC_COPT_PUBLIC_PACKAGING -std=gnu++17 -MD -MT libc/src/string/CMakeFiles/libc.src.string.memcpy.dir/memcpy.cpp.o -MF libc/src/string/CMakeFiles/libc.src.string.memcpy.dir/memcpy.cpp.o.d -o libc/src/string/CMakeFiles/libc.src.string.memcpy.dir/memcpy.cpp.o -c /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memcpy.cpp
In file included from /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memcpy.cpp:13:
In file included from /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memory_utils/inline_memcpy.h:26:
/home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memory_utils/arm/inline_memcpy.h:39:5: error: static_assert failed
    static_assert(false);
    ^             ~~~~~
/home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memory_utils/arm/inline_memcpy.h:57:5: error: static_assert failed "Invalid BlockOp"
    static_assert(false, "Invalid BlockOp");
    ^             ~~~~~
2 errors generated.
[2/5] Building CXX object libc/src/string/CMakeFiles/libc.src.string.mempcpy.dir/mempcpy.cpp.o
FAILED: libc/src/string/CMakeFiles/libc.src.string.mempcpy.dir/mempcpy.cpp.o 
/usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_21_0_0_git -D_DEBUG -I/home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc -isystem /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g --target=arm-linux-gnueabihf -DLIBC_QSORT_IMPL=LIBC_QSORT_QUICK_SORT -DLIBC_ADD_NULL_CHECKS -DLIBC_ERRNO_MODE=LIBC_ERRNO_MODE_DEFAULT -fpie -fno-builtin -fno-exceptions -fno-lax-vector-conversions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -ftrivial-auto-var-init=pattern -fno-omit-frame-pointer -Wall -Wextra -Werror -Wconversion -Wno-sign-conversion -Wdeprecated -Wno-c99-extensions -Wno-gnu-imaginary-constant -Wno-pedantic -Wimplicit-fallthrough -Wwrite-strings -Wextra-semi -Wnewline-eof -Wnonportable-system-include-path -Wstrict-prototypes -Wthread-safety -Wglobal-constructors -DLIBC_COPT_PUBLIC_PACKAGING -std=gnu++17 -MD -MT libc/src/string/CMakeFiles/libc.src.string.mempcpy.dir/mempcpy.cpp.o -MF libc/src/string/CMakeFiles/libc.src.string.mempcpy.dir/mempcpy.cpp.o.d -o libc/src/string/CMakeFiles/libc.src.string.mempcpy.dir/mempcpy.cpp.o -c /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/mempcpy.cpp
In file included from /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/mempcpy.cpp:12:
In file included from /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memory_utils/inline_memcpy.h:26:
/home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memory_utils/arm/inline_memcpy.h:39:5: error: static_assert failed
    static_assert(false);
    ^             ~~~~~
/home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memory_utils/arm/inline_memcpy.h:57:5: error: static_assert failed "Invalid BlockOp"
    static_assert(false, "Invalid BlockOp");
    ^             ~~~~~
2 errors generated.
[3/5] Building CXX object libc/src/string/CMakeFiles/libc.src.string.memmove.dir/memmove.cpp.o
FAILED: libc/src/string/CMakeFiles/libc.src.string.memmove.dir/memmove.cpp.o 
/usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_21_0_0_git -D_DEBUG -I/home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc -isystem /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g --target=arm-linux-gnueabihf -DLIBC_QSORT_IMPL=LIBC_QSORT_QUICK_SORT -DLIBC_ADD_NULL_CHECKS -DLIBC_ERRNO_MODE=LIBC_ERRNO_MODE_DEFAULT -fpie -fno-builtin -fno-exceptions -fno-lax-vector-conversions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -ftrivial-auto-var-init=pattern -fno-omit-frame-pointer -Wall -Wextra -Werror -Wconversion -Wno-sign-conversion -Wdeprecated -Wno-c99-extensions -Wno-gnu-imaginary-constant -Wno-pedantic -Wimplicit-fallthrough -Wwrite-strings -Wextra-semi -Wnewline-eof -Wnonportable-system-include-path -Wstrict-prototypes -Wthread-safety -Wglobal-constructors -O3 -DLIBC_COPT_PUBLIC_PACKAGING -std=gnu++17 -MD -MT libc/src/string/CMakeFiles/libc.src.string.memmove.dir/memmove.cpp.o -MF libc/src/string/CMakeFiles/libc.src.string.memmove.dir/memmove.cpp.o.d -o libc/src/string/CMakeFiles/libc.src.string.memmove.dir/memmove.cpp.o -c /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memmove.cpp
In file included from /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memmove.cpp:12:
In file included from /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memory_utils/inline_memcpy.h:26:
/home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memory_utils/arm/inline_memcpy.h:39:5: error: static_assert failed
    static_assert(false);
    ^             ~~~~~
/home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/memory_utils/arm/inline_memcpy.h:57:5: error: static_assert failed "Invalid BlockOp"
    static_assert(false, "Invalid BlockOp");
    ^             ~~~~~
2 errors generated.
[4/5] Building CXX object libc/src/string/CMakeFiles/libc.src.string.strcpy.dir/strcpy.cpp.o
FAILED: libc/src/string/CMakeFiles/libc.src.string.strcpy.dir/strcpy.cpp.o 
/usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_21_0_0_git -D_DEBUG -I/home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc -isystem /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g --target=arm-linux-gnueabihf -DLIBC_QSORT_IMPL=LIBC_QSORT_QUICK_SORT -DLIBC_ADD_NULL_CHECKS -DLIBC_ERRNO_MODE=LIBC_ERRNO_MODE_DEFAULT -fpie -fno-builtin -fno-exceptions -fno-lax-vector-conversions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -ftrivial-auto-var-init=pattern -fno-omit-frame-pointer -Wall -Wextra -Werror -Wconversion -Wno-sign-conversion -Wdeprecated -Wno-c99-extensions -Wno-gnu-imaginary-constant -Wno-pedantic -Wimplicit-fallthrough -Wwrite-strings -Wextra-semi -Wnewline-eof -Wnonportable-system-include-path -Wstrict-prototypes -Wthread-safety -Wglobal-constructors -DLIBC_COPT_PUBLIC_PACKAGING -std=gnu++17 -MD -MT libc/src/string/CMakeFiles/libc.src.string.strcpy.dir/strcpy.cpp.o -MF libc/src/string/CMakeFiles/libc.src.string.strcpy.dir/strcpy.cpp.o.d -o libc/src/string/CMakeFiles/libc.src.string.strcpy.dir/strcpy.cpp.o -c /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/strcpy.cpp
In file included from /home/llvm-libc-buildbot/buildbot-worker/libc-arm32-qemu-debian/libc-arm32-qemu-debian-dbg/llvm-project/libc/src/string/strcpy.cpp:12:

gchatelet added a commit that referenced this pull request Jul 16, 2025
gchatelet added a commit that referenced this pull request Jul 16, 2025
Reverts #148204

`libc-arm32-qemu-debian-dbg` is failing, reverting and investigating
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 16, 2025
Reverts llvm/llvm-project#148204

`libc-arm32-qemu-debian-dbg` is failing, reverting and investigating
@llvm-ci
Collaborator

llvm-ci commented Jul 16, 2025

LLVM Buildbot has detected a new failure on builder premerge-monolithic-linux running on premerge-linux-1 while building libc,utils at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/38297

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
...
PASS: lit :: shtest-external-shell-kill.py (98609 of 101601)
PASS: lld :: COFF/duplicate-dwarf.s (98610 of 101601)
PASS: lld :: COFF/defparser.test (98611 of 101601)
PASS: lld :: COFF/dllexport-mingw.s (98612 of 101601)
PASS: lld :: COFF/entry-inference32.test (98613 of 101601)
PASS: lld :: COFF/duplicate.test (98614 of 101601)
PASS: lld :: COFF/entry-drectve.test (98615 of 101601)
PASS: cfi-devirt-lld-thinlto-x86_64 :: cross-dso/icall/dlopen.cpp (98616 of 101601)
PASS: lld :: COFF/entry-weak-external.s (98617 of 101601)
TIMEOUT: MLIR :: Examples/standalone/test.toy (98618 of 101601)
******************** TEST 'MLIR :: Examples/standalone/test.toy' FAILED ********************
Exit Code: 1
Timeout: Reached timeout of 60 seconds

Command Output (stdout):
--
# RUN: at line 1
"/etc/cmake/bin/cmake" "/build/buildbot/premerge-monolithic-linux/llvm-project/mlir/examples/standalone" -G "Ninja"  -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_C_COMPILER=/usr/bin/clang  -DLLVM_ENABLE_LIBCXX=OFF -DMLIR_DIR=/build/buildbot/premerge-monolithic-linux/build/lib/cmake/mlir  -DLLVM_USE_LINKER=lld  -DPython3_EXECUTABLE="/usr/bin/python3.10"
# executed command: /etc/cmake/bin/cmake /build/buildbot/premerge-monolithic-linux/llvm-project/mlir/examples/standalone -G Ninja -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_C_COMPILER=/usr/bin/clang -DLLVM_ENABLE_LIBCXX=OFF -DMLIR_DIR=/build/buildbot/premerge-monolithic-linux/build/lib/cmake/mlir -DLLVM_USE_LINKER=lld -DPython3_EXECUTABLE=/usr/bin/python3.10
# .---command stdout------------
# | -- The CXX compiler identification is Clang 16.0.6
# | -- The C compiler identification is Clang 16.0.6
# | -- Detecting CXX compiler ABI info
# | -- Detecting CXX compiler ABI info - done
# | -- Check for working CXX compiler: /usr/bin/clang++ - skipped
# | -- Detecting CXX compile features
# | -- Detecting CXX compile features - done
# | -- Detecting C compiler ABI info
# | -- Detecting C compiler ABI info - done
# | -- Check for working C compiler: /usr/bin/clang - skipped
# | -- Detecting C compile features
# | -- Detecting C compile features - done
# | -- Looking for histedit.h
# | -- Looking for histedit.h - found
# | -- Found LibEdit: /usr/include (found version "2.11") 
# | -- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11") 
# | -- Found LibXml2: /usr/lib/x86_64-linux-gnu/libxml2.so (found version "2.9.13") 
# | -- Using MLIRConfig.cmake in: /build/buildbot/premerge-monolithic-linux/build/lib/cmake/mlir
# | -- Using LLVMConfig.cmake in: /build/buildbot/premerge-monolithic-linux/build/lib/cmake/llvm
# | -- Linker detection: unknown
# | -- Performing Test LLVM_LIBSTDCXX_MIN
# | -- Performing Test LLVM_LIBSTDCXX_MIN - Success
# | -- Performing Test LLVM_LIBSTDCXX_SOFT_ERROR
# | -- Performing Test LLVM_LIBSTDCXX_SOFT_ERROR - Success
# | -- Performing Test CXX_SUPPORTS_CUSTOM_LINKER
# | -- Performing Test CXX_SUPPORTS_CUSTOM_LINKER - Success
# | -- Performing Test C_SUPPORTS_FPIC
# | -- Performing Test C_SUPPORTS_FPIC - Success
# | -- Performing Test CXX_SUPPORTS_FPIC

gchatelet added a commit that referenced this pull request Jul 17, 2025
The code for `memcpy` is the same as in #148204, but it fixes the build
bot error by using `static_assert(cpp::always_false<decltype(access)>)`
instead of `static_assert(false)` (older compilers fail on
`static_assert(false)` in the `else` branch of an `if constexpr`).
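
A minimal sketch of the dependent-false idiom mentioned above, assuming C++17. The `always_false` helper defined here is a hypothetical stand-in for `cpp::always_false`, whose definition is not shown in this thread, and `stride` is an illustrative function rather than code from the patch.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for cpp::always_false: its value is always false,
// but it depends on a template parameter, so the compiler only evaluates it
// when the enclosing branch is actually instantiated.
template <typename...> inline constexpr bool always_false = false;

enum class BlockOp { kFull, kByWord };

// Illustrative function: returns the access width chosen for a block.
template <BlockOp op> constexpr std::size_t stride(std::size_t bytes) {
  if constexpr (op == BlockOp::kFull) {
    return bytes;
  } else if constexpr (op == BlockOp::kByWord) {
    return 4;
  } else {
    // A plain static_assert(false) here is rejected by older compilers even
    // though this branch is discarded. Making the condition depend on `op`
    // defers the check until an invalid BlockOp is really used.
    static_assert(always_false<std::integral_constant<BlockOp, op>>,
                  "Invalid BlockOp");
  }
}

static_assert(stride<BlockOp::kFull>(16) == 16);
static_assert(stride<BlockOp::kByWord>(64) == 4);
```

Instantiating `stride` with a value outside the two enumerators, such as `static_cast<BlockOp>(2)`, would trigger the assertion with the "Invalid BlockOp" message.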

The code for `memset` is new and vastly improves performance over the
current byte per byte implementation.

Both `memset` and `memcpy` implementations use prefetching for sizes >=
64. This slightly lowers performance for sizes between 64 and 256 but
improves throughput for larger sizes.