@@ -416,17 +416,16 @@ \subsection{Exchange}
  To see where this might be useful,
  let's tweak our example from \secref{atomicity}:
  instead of displaying the total number of processed files,
- the \textsc{ui} might want to show how many were processed per second.
- We could implement this by having the \textsc{ui} thread read the counter then zero it each second.
+ the \textsc{UI} might want to show how many were processed per second.
+ We could implement this by having the \textsc{UI} thread read the counter then zero it each second.
  But we could get the following race condition if reading and zeroing are separate steps:
  \begin{enumerate}
- \item The \textsc{ui} thread reads the counter.
- \item Before the \textsc{ui} thread has the chance to zero it,
- the worker thread increments it again.
- \item The \textsc{ui} thread now zeroes the counter, and the previous increment
- is lost.
+ \item The \textsc{UI} thread reads the counter.
+ \item Before the \textsc{UI} thread has the chance to zero it,
+ the worker thread increments it again.
+ \item The \textsc{UI} thread now zeroes the counter, and the previous increment is lost.
  \end{enumerate}
- If the \textsc{ui} thread atomically exchanges the current value with zero,
+ If the \textsc{UI} thread atomically exchanges the current value with zero,
  the race disappears.
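As a rough illustration of the exchange described above, here is a minimal sketch.
The names \monobox{processed}, \monobox{record\_file}, \monobox{take\_count}, and \monobox{run\_demo} are hypothetical and do not come from the text;
the polling loop merely stands in for a \textsc{UI} thread that wakes up once per second.

```cpp
#include <atomic>
#include <thread>

// Hypothetical shared counter: files processed since the UI last looked.
std::atomic<int> processed{0};

// Worker side: record one processed file.
void record_file()
{
    processed.fetch_add(1);
}

// UI side: read the counter and zero it in a single atomic step,
// so no increment can slip in between the read and the reset.
int take_count()
{
    return processed.exchange(0);
}

// Runs one worker against a polling "UI" loop and returns the sum of
// everything the UI observed.
int run_demo(int files)
{
    processed.store(0);
    std::thread worker([files] {
        for (int i = 0; i < files; ++i)
            record_file();
    });

    int seen = 0;
    for (int i = 0; i < 1000; ++i)
        seen += take_count(); // poll while the worker may still be running

    worker.join();
    seen += take_count(); // collect whatever arrived after the last poll
    return seen;
}
```

Because each increment is picked up by exactly one exchange, the total returned by \monobox{run\_demo} should equal the number of files, however the two threads interleave.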

  \subsection{Test and set}
@@ -462,10 +461,10 @@ \subsection{Fetch and…}
  all as part of a single atomic operation.
  You might recall from the exchange example that additions by the worker thread must be atomic to prevent races, where:
  \begin{enumerate}
- \item The worker thread loads the current counter value and adds one.
- \item Before that thread can store the value back,
- the \textsc{ui} thread zeroes the counter.
- \item The worker now performs its store, as if the counter was never cleared.
+ \item The worker thread loads the current counter value and adds one.
+ \item Before that thread can store the value back,
+ the \textsc{UI} thread zeroes the counter.
+ \item The worker now performs its store, as if the counter was never cleared.
  \end{enumerate}
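The steps above can be avoided with a single read-modify-write.
The following is only a sketch under assumptions: \monobox{file\_count}, \monobox{count\_file}, and \monobox{run\_workers} are made-up names,
and two workers are used in place of a worker and a \textsc{UI} thread, simply to make lost updates easy to detect.

```cpp
#include <atomic>
#include <thread>

// Hypothetical counter shared between threads.
std::atomic<int> file_count{0};

// Racy version, for contrast. The load and the store are each atomic,
// but another thread can modify the counter between them:
//     file_count.store(file_count.load() + 1);

// Atomic read-modify-write: adds one and returns the value held just before.
int count_file()
{
    return file_count.fetch_add(1);
}

// Two workers each count `per_worker` files; returns the final count.
int run_workers(int per_worker)
{
    file_count.store(0);
    auto work = [per_worker] {
        for (int i = 0; i < per_worker; ++i)
            count_file();
    };
    std::thread a(work), b(work);
    a.join();
    b.join();
    return file_count.load();
}
```

With \monobox{fetch\_add}, no increment can be overwritten by a stale store, so the final count should always be twice \monobox{per\_worker}.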

  \subsection{Compare and swap}
@@ -718,10 +717,10 @@ \subsection{Spurious LL/SC failures}
  Many lockless algorithms use \textsc{CAS} loops like this to atomically update a variable when calculating its new value is not atomic.
  They:
  \begin{enumerate}
- \item Read the variable.
- \item Perform some (non-atomic) operation on its value.
- \item \textsc{CAS} the new value with the previous one.
- \item If the \textsc{CAS} failed, another thread beat us to the punch, so try again.
+ \item Read the variable.
+ \item Perform some (non-atomic) operation on its value.
+ \item \textsc{CAS} the new value with the previous one.
+ \item If the \textsc{CAS} failed, another thread beat us to the punch, so try again.
  \end{enumerate}
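To make the four steps concrete, here is one possible shape for such a loop.
It is a sketch, not the paper's own example: \monobox{atomic\_multiply} is a hypothetical helper,
and \monobox{compare\_exchange\_weak} is chosen on the assumption that an occasional spurious failure is harmless when we retry anyway.

```cpp
#include <atomic>

// Atomically replaces `target` with `target * factor`.
// Returns the value that the successful CAS replaced.
int atomic_multiply(std::atomic<int>& target, int factor)
{
    int expected = target.load();        // 1. Read the variable.
    int desired;
    do {
        desired = expected * factor;     // 2. Non-atomic work on the value we read.
        // 3. The CAS installs `desired` only if `target` still holds `expected`.
        // 4. On failure it reloads `expected` with the current value, and we retry.
    } while (!target.compare_exchange_weak(expected, desired));
    return expected;
}
```

Multiplication has no dedicated fetch-and-modify operation in \monobox{std::atomic}, which is what makes it a reasonable candidate for a \textsc{CAS} loop.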
  If we use \monobox{compare\_exchange\_strong} for this family of algorithms,
  the compiler must emit nested loops:
@@ -988,14 +987,12 @@ \subsection{Acquire-Release}
  Order does not matter when incrementing the reference count since no action is taken as a result.
  However, when we decrement, we must ensure that:
  \begin{enumerate}
- \item All access to the referenced object happens
- \emph{before} the count reaches zero.
- \item Deletion happens \emph{after} the reference count reaches
- zero.\punckern\footnote{This can be optimized even further by
- making the acquire barrier only occur conditionally, when the reference
- count is zero.
- Standalone barriers are outside the scope of this paper,
- since they are almost always pessimal compared to a combined load-acquire or store-release.}
+ \item All access to the referenced object happens \emph{before} the count reaches zero.
+ \item Deletion happens \emph{after} the reference count reaches zero.\punckern\footnote{%
+ This can be optimized even further by making the acquire barrier only occur conditionally,
+ when the reference count is zero.
+ Standalone barriers are outside the scope of this paper,
+ since they are almost always pessimal compared to a combined load-acquire or store-release.}
  \end{enumerate}
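The two requirements above might be sketched as follows.
\monobox{RefCounted}, \monobox{add\_ref}, and \monobox{drop\_ref} are hypothetical names used only for illustration,
and this version uses one combined acquire-release decrement rather than the conditional barrier mentioned in the footnote.

```cpp
#include <atomic>

// Hypothetical reference-counted object; starts with one owner.
struct RefCounted {
    std::atomic<int> refs{1};
};

// Incrementing needs no ordering: nothing is done as a result of it.
void add_ref(RefCounted& obj)
{
    obj.refs.fetch_add(1, std::memory_order_relaxed);
}

// Returns true if the caller just dropped the last reference
// and is therefore responsible for deleting the object.
bool drop_ref(RefCounted& obj)
{
    // Release: our earlier accesses to the object are ordered before the decrement.
    // Acquire: if we see the count hit zero, we also see every other owner's
    // accesses that preceded their own decrements, so deletion comes after them.
    return obj.refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
}
```

A caller would typically write something like \monobox{if (drop\_ref(*p)) delete p;}, keeping the deletion itself outside the atomic operation.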

  Curious readers might be wondering about the difference between acquire-release and sequentially consistent operations.
@@ -1170,33 +1167,31 @@ \section{If concurrency is the question, \texttt{volatile} is not the answer.}
  (This is how most machines ultimately interact with the outside world.)
  \keyword{volatile} implies two guarantees:
  \begin{enumerate}
- \item The compiler will not elide loads and stores that seem ``unnecessary''\quotekern.
- For example, if I have some function:
- \begin{colfigure}
- \begin{minted}[fontsize=\codesize,autogobble]{cpp}
- void write(int *t)
- {
-     *t = 2;
-     *t = 42;
- }
- \end{minted}
- \end{colfigure}
- the compiler would normally optimize it to:
- \begin{minted}[fontsize=\codesize,autogobble]{cpp}
- void write(int *t)
- {
-     *t = 42;
- }
- \end{minted}
- \mintinline{cpp}{*t = 2} is often considered a \introduce{dead store},
- seemingly performing no function.
- However, when \texttt{t} is directed at an \textsc{MMIO} register,
- this assumption becomes unsafe.
- In such cases, each write operation could potentially influence the behavior of the associated hardware.
-
- \item The compiler will not reorder \keyword{volatile}
- reads and writes with respect to other \keyword{volatile} ones
- for similar reasons.
+ \item The compiler will not elide loads and stores that seem ``unnecessary''\quotekern.
+ For example, if I have some function:
+ \begin{colfigure}
+ \begin{minted}[fontsize=\codesize,autogobble]{cpp}
+ void write(int *t)
+ {
+     *t = 2;
+     *t = 42;
+ }
+ \end{minted}
+ \end{colfigure}
+ the compiler would normally optimize it to:
+ \begin{minted}[fontsize=\codesize,autogobble]{cpp}
+ void write(int *t)
+ {
+     *t = 42;
+ }
+ \end{minted}
+ \mintinline{cpp}{*t = 2} is often considered a \introduce{dead store},
+ seemingly performing no function.
+ However, when \texttt{t} is directed at an \textsc{MMIO} register,
+ this assumption becomes unsafe.
+ In such cases, each write operation could potentially influence the behavior of the associated hardware.
+
+ \item The compiler will not reorder \keyword{volatile} reads and writes with respect to other \keyword{volatile} ones for similar reasons.
  \end{enumerate}

  These rules fall short of providing the atomicity and order required for safe communication between threads.