Add Mask::count() method to count true elements #490

GrigoryEvko · 2025-11-16T13:44:12Z

Add `Mask::count()` method

Motivation

The Mask API currently provides boolean queries (any(), all()) and index queries (first_set()), but lacks a method to count the number of true elements. This forces users to either convert to arrays and iterate, or manually use to_bitmask().count_ones(), which exposes implementation details.

Current workarounds:

// Option 1: Verbose, requires knowing bitmask representation
let count = mask.to_bitmask().count_ones() as usize;

// Option 2: Inefficient, materializes a [bool; N] array and iterates lane by lane
let count = mask.to_array().iter().filter(|&&x| x).count();

Proposed:

let count = mask.count();

This pattern appears frequently in SIMD code when pre-sizing allocations to avoid reallocation overhead:

// Two-pass filtering: count matches, allocate once, then collect
let mask = values.simd_gt(threshold);
let mut results = Vec::with_capacity(mask.count());
for (i, &val) in values.to_array().iter().enumerate() {
    if mask.test(i) {
        results.push(val);
    }
}
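Because `std::simd` is nightly-only, the same two-pass pattern can be modeled on stable Rust with a plain `[bool; N]` standing in for the mask. This is only a sketch under that assumption: `count_true`, `compact`, and `greater_than` are hypothetical helper names used for illustration, not part of the proposed API.

```rust
/// Hypothetical stand-in for `Mask::count()`: the number of `true` lanes.
fn count_true<const N: usize>(mask: &[bool; N]) -> usize {
    mask.iter().filter(|&&lane| lane).count()
}

/// Lane-wise `values > threshold`, modeling `simd_gt` on a plain array.
fn greater_than<const N: usize>(values: &[f32; N], threshold: f32) -> [bool; N] {
    std::array::from_fn(|i| values[i] > threshold)
}

/// Two-pass filtering: count matches, allocate once, then collect.
fn compact<const N: usize>(values: &[f32; N], mask: &[bool; N]) -> Vec<f32> {
    // Pass 1: reserve room for every selected lane up front,
    // so the pushes below never trigger a reallocation.
    let mut results = Vec::with_capacity(count_true(mask));
    // Pass 2: copy the selected lanes in their original order.
    for (&val, &keep) in values.iter().zip(mask.iter()) {
        if keep {
            results.push(val);
        }
    }
    results
}
```

With `values = [1.0, 5.0, 2.0, 7.0]` and a threshold of `3.0`, the mask selects lanes 1 and 3, so the vector is created with room for two elements before anything is pushed.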

Other common use cases include histogram generation, SQL-style COUNT aggregation, and sparse data analysis.

API Design

impl<T, const N: usize> Mask<T, N>
where
    T: MaskElement,
    LaneCount<N>: SupportedLaneCount,
{
    #[inline]
    #[must_use]
    pub fn count(self) -> usize {
        self.to_bitmask().count_ones() as usize
    }
}
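To make the bitmask-then-popcount idea concrete, here is a minimal model on stable Rust. It assumes lane `i` maps to bit `i` of the bitmask and that there are at most 64 lanes; the free functions `to_bitmask` and `count` below are illustrative stand-ins operating on `[bool; N]`, not the library's implementation.

```rust
/// Packs lane `i` into bit `i` of a `u64` (a model of what a mask-to-bitmask
/// conversion produces). Assumes at most 64 lanes.
fn to_bitmask<const N: usize>(lanes: &[bool; N]) -> u64 {
    assert!(N <= 64, "this model only supports up to 64 lanes");
    lanes
        .iter()
        .enumerate()
        .fold(0u64, |bits, (i, &lane)| bits | ((lane as u64) << i))
}

/// Counts `true` lanes by taking the population count of the bitmask.
fn count<const N: usize>(lanes: &[bool; N]) -> usize {
    to_bitmask(lanes).count_ones() as usize
}
```

The result always agrees with a lane-by-lane `filter(..).count()`, but the counting step itself is a single integer operation on the packed bits.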

Design decisions:

Returns usize - Consistent with Iterator::count() and suitable for array indexing
Named count() not len() - len() implies container size; count() matches the semantic operation (counting true values)
Simple #[must_use] attribute - Follows Vec::len() and slice::len() precedent (no message)
Not const - to_bitmask() uses intrinsics that cannot be const-evaluated

Implementation

The implementation delegates to to_bitmask().count_ones(), which already uses LLVM's llvm.ctpop intrinsic. This compiles to efficient platform-specific instructions:

x86/x86_64: POPCNT (its own CPU feature flag, introduced alongside SSE4.2)
ARM/AArch64: CNT (NEON)
RISC-V: CPOP (Zbb extension)
WebAssembly: i64.popcnt

No platform-specific code is required; LLVM handles optimization for each target.
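For intuition about what the population count computes when no dedicated instruction is available, the following is a textbook SWAR (SIMD-within-a-register) popcount for `u32`. It is a sketch for illustration only: `popcount_swar` is a hypothetical name, and this is not claimed to be the exact sequence LLVM emits for any target.

```rust
/// Textbook SWAR population count for a 32-bit integer.
fn popcount_swar(mut x: u32) -> u32 {
    // Each 2-bit field now holds the number of set bits in that field.
    x = x - ((x >> 1) & 0x5555_5555);
    // Sum adjacent 2-bit counts into 4-bit fields.
    x = (x & 0x3333_3333) + ((x >> 2) & 0x3333_3333);
    // Sum adjacent 4-bit counts into 8-bit fields.
    x = (x + (x >> 4)) & 0x0F0F_0F0F;
    // Add the four byte counts together; the total lands in the top byte.
    x.wrapping_mul(0x0101_0101) >> 24
}
```

In user code there is no reason to write this by hand: `u32::count_ones()` expresses the same operation and leaves instruction selection to the compiler.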

Performance

Benchmarked on x86_64 (Intel Core i7-14700HX, -C target-cpu=native):

Mask size	count()	Manual iteration	Speedup
mask32x4	0.36 ns	0.52 ns	44%
mask32x8	0.45 ns	0.76 ns	69%
mask32x16	1.04 ns	1.15 ns	11%
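The Speedup column appears to be the relative reduction expressed against the `count()` time, i.e. `manual / count - 1`. A small sketch that reproduces the figures from the two timing columns, assuming that definition (`speedup_percent` is a hypothetical helper, not part of the benchmark code):

```rust
/// Relative speedup of `count()` over manual iteration, in percent,
/// assuming speedup = manual_ns / count_ns - 1.
fn speedup_percent(manual_ns: f64, count_ns: f64) -> f64 {
    (manual_ns / count_ns - 1.0) * 100.0
}
```

Rounded to whole percents, the three rows give 44, 69, and 11, matching the table.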

Assembly verification shows the expected codegen (x86_64):

vmovmskps  eax, ymm0    ; Extract mask to integer
popcnt     eax, eax     ; Population count

The operation is branch-free and density-independent: mask16 measured 1.03-1.05 ns at every density tested (0%, 25%, 50%, 75%, 100%), so the cost does not depend on how many elements are true.

Implements a simple, efficient method to count the number of `true` elements in a SIMD mask. This is a common operation needed for:

- Pre-sizing allocations before filtering
- SQL-style COUNT(WHERE ...) operations
- Histogram generation
- Sparse data statistics

Implementation delegates to `to_bitmask().count_ones()`, which compiles to a single POPCNT instruction on x86_64 and equivalent efficient instructions on other platforms (CNT on ARM, CPOP on RISC-V, i64.popcnt on WASM).

Performance: ~0.7ns per operation, O(1) regardless of bit density.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

GrigoryEvko and others added 2 commits November 15, 2025 23:35

Remove stable feature stdarch_x86_avx512

5ba1dc1

The feature stdarch_x86_avx512 has been stable since Rust 1.89.0 and no longer requires a feature gate.
