Document vectorized STL algorithms #5789
Open: AlexGuteniev wants to merge 15 commits into MicrosoftDocs:main from AlexGuteniev:vector-algorithms
+98
−0
Changes from 4 commits
814cd8f Document vectorized STL algorithms (AlexGuteniev)
4272d7e validation errors fix (AlexGuteniev)
cc385c5 Un-nest to make that work (AlexGuteniev)
86a3e29 Typo in file name (AlexGuteniev)
53ae06f Typoes (AlexGuteniev)
a36c5a8 Spelling (AlexGuteniev)
6c10dc3 Complete the lists (AlexGuteniev)
b1600d8 Update docs/standard-library/vectorized-stl-algorithms.md (AlexGuteniev)
becc300 Review comments (AlexGuteniev)
053dac2 Update docs/standard-library/vectorized-stl-algorithms.md (AlexGuteniev)
8f4b39f Review feedback (AlexGuteniev)
6d98a0e STL review feedback (AlexGuteniev)
59f9977 Spelling (AlexGuteniev)
ce5dca4 Global macro (AlexGuteniev)
eff4d93 Link to documentation on how to set macro globally (AlexGuteniev)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
--- | ||
description: "Vectorized STL Algorithms" | ||
title: "Vectorized STL Algorithms" | ||
ms.date: "09/19/2025" | ||
helpviewer_keywords: ["Vector Algorithms", "Vectorization", "SIMD"] | ||
--- | ||
# Vectorized STL Algorithms | ||
|
||
Under certain conditions, STL algorithms process multiple elements at once on a single CPU core instead of one element at a time. This is possible due to SIMD (single instruction, multiple data) instructions. Using this approach instead of the element-wise approach is called vectorization. An implementation that is not vectorized is called scalar. | ||
|
||
The conditions for vectorization are: | ||
- The container or range is contiguous. `array`, `vector`, and `basic_string` are contiguous containers; `span` and `basic_string_view` provide contiguous ranges. | ||
- The target platform has SIMD instructions that implement the particular algorithm efficiently for the particular element types. This is usually true for plain types (such as built-in integers) and simple operations. | ||
- Either of the following: | ||
  - The compiler is capable of emitting vectorized machine code for an implementation written as scalar code (auto-vectorization). | ||
  - The implementation itself is written as vectorized code (manual vectorization). | ||
|
||
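As a rough illustration of the first two conditions, the sketch below calls the same algorithm on a contiguous container of built-in integers and on a node-based container. The helper names are hypothetical, and whether a particular call is actually vectorized depends on the library version, the target, and the CPU. The results are identical either way.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical helper: std::vector<int> is contiguous and holds a built-in
// integer type, so this call is a candidate for vectorization.
std::ptrdiff_t count_zeros_contiguous(const std::vector<int>& values) {
    return std::count(values.begin(), values.end(), 0);
}

// Hypothetical helper: std::list is not contiguous, so the same algorithm
// has to visit the nodes one element at a time.
std::ptrdiff_t count_zeros_list(const std::list<int>& values) {
    return std::count(values.begin(), values.end(), 0);
}
```

Both helpers return the same count for the same sequence of values; only the way the elements are traversed can differ.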
## Auto-vectorization in STL | ||
|
||
See [Auto-Vectorizer](../parallel/auto-parallelization-and-auto-vectorization.md#auto-vectorizer). It applies to the STL implementation code the same way as to user code. | ||
|
||
Algorithms like `transform`, `reduce`, and `accumulate` benefit heavily from auto-vectorization. | ||
|
||
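The sketch below shows the kind of call that relies on auto-vectorization: the library code is an ordinary scalar loop, and the compiler decides whether to emit SIMD instructions for it. The helper names are hypothetical, and no particular code generation is guaranteed.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: doubles every element. The loop inside std::transform
// is plain scalar code that an optimizing compiler may auto-vectorize.
std::vector<int> doubled(const std::vector<int>& input) {
    std::vector<int> output(input.size());
    std::transform(input.begin(), input.end(), output.begin(),
                   [](int x) { return x * 2; });
    return output;
}

// Hypothetical helper: sums the elements into a wider integer type.
long long sum(const std::vector<int>& input) {
    return std::accumulate(input.begin(), input.end(), 0LL);
}
```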
## Manual vectorization in STL | ||
|
||
For x64 and x86 targets, certain algorithms have manual vectorization implemented. This implementation is precompiled and uses runtime CPU dispatch, so it is engaged only on suitable CPUs. | ||
|
||
The manually vectorized algorithms use template metaprogramming to detect suitable element types, so they are vectorized only for simple types, such as the standard integer types. | ||
|
||
Generally, programs either benefit in performance from this manual vectorization or are unaffected by it. If you run into a problem, you can disable manual vectorization by defining the `_USE_STD_VECTOR_ALGORITHMS` macro as 0. It defaults to 1 on x64 and x86, which means that manually vectorized algorithms are enabled by default. | ||
|
||
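A minimal sketch of opting out, assuming the macro is defined before the first standard library header in the translation unit (the same effect should be achievable on the compiler command line with `/D_USE_STD_VECTOR_ALGORITHMS=0`). The helper name is hypothetical. Defining the macro project-wide is likely the safer choice, so that every translation unit sees the same value.

```cpp
// Assumption: the definition has to precede every standard library header.
#define _USE_STD_VECTOR_ALGORITHMS 0

#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: with the macro set to 0, this std::find call uses the
// scalar implementation. The returned position is the same either way.
std::ptrdiff_t index_of(const std::vector<int>& values, int needle) {
    const auto it = std::find(values.begin(), values.end(), needle);
    return it == values.end() ? -1 : it - values.begin();
}
```

Only performance characteristics are expected to change; the observable result of the algorithm stays the same.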
The following algorithms have manual vectorization that is controlled by the `_USE_STD_VECTOR_ALGORITHMS` macro: | ||
- `contains` | ||
- `contains_subrange` | ||
- `find` | ||
- `find_last` | ||
- `find_end` | ||
- `find_first_of` | ||
- `adjacent_find` | ||
- `count` | ||
- `mismatch` | ||
- `search` | ||
- `search_n` | ||
- `swap_ranges` | ||
- `replace` | ||
- `remove` | ||
- `remove_copy` | ||
- `unique` | ||
- `unique_copy` | ||
- `reverse` | ||
- `rotate` | ||
- `is_sorted` | ||
- `is_sorted_until` | ||
- `minmax_element` | ||
- `minmax` | ||
- `lexicographical_compare` | ||
- `lexicographical_compare_three_way` | ||
|
||
In addition to algorithms, the macro controls the manual vectorization of: | ||
- `basic_string` and `basic_string_view` members: | ||
  - `find` | ||
  - `rfind` | ||
  - `find_first_of` | ||
  - `find_first_not_of` | ||
  - `find_last_of` | ||
  - `find_last_not_of` | ||
- `bitset` constructors from string and `bitset::to_string` | ||
|
||
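The following sketch uses a few of the listed algorithms and one of the listed string members on element types that qualify (built-in integers and `char`). The helper names are hypothetical. The code is ordinary standard C++; whether the vectorized path is taken is decided by the library at compile time and at run time.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: reverses a copy of the input, then returns
// {smallest, largest}. The input must not be empty.
std::pair<int, int> min_max_after_reverse(std::vector<int> values) {
    std::reverse(values.begin(), values.end());
    const auto result = std::minmax_element(values.begin(), values.end());
    return {*result.first, *result.second};
}

// Hypothetical helper: position of the first ASCII digit in the text,
// or std::string::npos when there is none.
std::size_t first_digit(const std::string& text) {
    return text.find_first_of("0123456789");
}
```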
## Manually vectorized algorithms for floating point types | ||
|
||
Vectorizing algorithms for floating-point types comes with extra difficulties: | ||
- For floating-point results, the order of operations may matter. Reordering may yield a different result, which can be either more or less precise. Vectorization may require reordering operations, so it may change the result. | ||
- Floating-point types may contain NaN values, which don't behave transitively in comparisons. | ||
- Floating point operations may raise exceptions. | ||
|
||
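The first two difficulties can be reproduced in a few lines. The sketch below assumes IEEE 754 `double` arithmetic evaluated in double precision with default compiler floating-point settings (no fast-math style options); all function names are hypothetical.

```cpp
#include <limits>

// The same three values added in two different orders.
// 1e16 + 1.0 rounds back to 1e16 in IEEE 754 double precision,
// so the left-to-right sum loses both small addends.
double sum_left_to_right() {
    const double big = 1e16;
    return (big + 1.0) + 1.0;
}

// Adding the small values first keeps their contribution.
double sum_small_first() {
    const double big = 1e16;
    return big + (1.0 + 1.0);
}

// Every ordered comparison with NaN is false, so under operator< NaN looks
// "equivalent" to both 1.0 and 2.0, although 1.0 and 2.0 are not equivalent
// to each other. The equivalence is therefore not transitive.
bool nan_breaks_transitivity() {
    const double not_a_number = std::numeric_limits<double>::quiet_NaN();
    const bool one_equiv_nan = !(1.0 < not_a_number) && !(not_a_number < 1.0);
    const bool nan_equiv_two = !(not_a_number < 2.0) && !(2.0 < not_a_number);
    const bool one_equiv_two = !(1.0 < 2.0) && !(2.0 < 1.0);
    return one_equiv_nan && nan_equiv_two && !one_equiv_two;
}
```

Under the stated assumptions the two sums differ, which is why an algorithm that computes new floating-point values cannot be reordered freely.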
The STL deals with the first two difficulties safely. For floating-point types, only `minmax_element`, `minmax`, `is_sorted`, and `is_sorted_until` are manually vectorized. These algorithms: | ||
- Do not compute new floating-point values; they only compare existing values, so a different order of operations does not affect precision. | ||
- As sorting-related algorithms, require transitive element comparisons, so NaNs are not allowed as elements. | ||
|
||
The `_USE_STD_VECTOR_FLOATING_ALGORITHMS` macro controls the use of these vectorized algorithms for floating-point types. Set it to 0 to disable the vectorization. The macro has no effect if `_USE_STD_VECTOR_ALGORITHMS` is set to 0. | ||
|
||
`_USE_STD_VECTOR_FLOATING_ALGORITHMS` defaults to 0 when the `/fp:except` option is set. This avoids problems with floating-point exceptions. | ||
|
||
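A minimal sketch of disabling only the floating-point part while leaving the integer algorithms vectorized. It assumes the macro is defined before the first standard library header in the translation unit; the helper name is hypothetical.

```cpp
// Assumption: the definition has to precede every standard library header.
#define _USE_STD_VECTOR_FLOATING_ALGORITHMS 0

#include <algorithm>
#include <vector>

// Hypothetical helper: with the macro set to 0, std::is_sorted on a range of
// double uses the scalar implementation. The input must not contain NaNs.
bool is_ascending(const std::vector<double>& samples) {
    return std::is_sorted(samples.begin(), samples.end());
}
```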
## See also | ||
|
||
[Auto-Vectorizer](../parallel/auto-parallelization-and-auto-vectorization.md#auto-vectorizer) |