Commit 0e16776
scsi: sg: Fix occasional bogus elapsed time that exceeds timeout
A race condition was found in sg_proc_debug_helper(). It was observed on
a system using an IBM LTO-9 SAS Tape Drive (ULTRIUM-TD9) and monitoring
/proc/scsi/sg/debug every second. A very large elapsed time would
sometimes appear. This is caused by two race conditions.
We reproduced the issue with an IBM ULTRIUM-HH9 tape drive on an x86_64
architecture. A patched kernel was built, and the race condition could
not be observed anymore after the application of this patch. A
reproducer C program utilising the scsi_debug module was also built by
Changhui Zhong and can be viewed here:
https://github.com/MichaelRabek/linux-tests/blob/master/drivers/scsi/sg/sg_race_trigger.c
The first race happens between the reading of hp->duration in
sg_proc_debug_helper() and request completion in sg_rq_end_io(). The
hp->duration member variable may hold either of two types of
information:
#1 - The start time of the request. This value is present while
the request is not yet finished.
#2 - The total execution time of the request (end_time - start_time).
If sg_proc_debug_helper() executes *after* the value of hp->duration was
changed from #1 to #2, but *before* srp->done is set to 1 in
sg_rq_end_io(), a fresh timestamp is taken in the else branch, and the
total execution time (a type #2 value) is subtracted from that timestamp
as if it were a start time (type #1). The result is not a valid elapsed
time.
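The first race can be illustrated with a small userspace model. This is
only a sketch: struct model_req and the model_* functions are
hypothetical stand-ins, not the kernel's data structures, and times are
plain millisecond counters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified request state; not the kernel's struct sg_request. */
struct model_req {
    int done;           /* 0 = still in flight, 1 = completed */
    uint32_t duration;  /* type #1 (start time) in flight, type #2 (elapsed) after */
};

/* Reader side: trusts `done` to decide how `duration` must be interpreted. */
static uint32_t model_displayed_ms(const struct model_req *r, uint32_t now_ms)
{
    if (r->done)
        return r->duration;       /* type #2: already an elapsed time */
    return now_ms - r->duration;  /* type #1: now minus start time */
}

/* Completion, first half: duration switches from type #1 to type #2. */
static void model_store_elapsed(struct model_req *r, uint32_t now_ms)
{
    assert(!r->done);
    r->duration = now_ms - r->duration;
}

/* Completion, second half: the request is marked as finished. */
static void model_mark_done(struct model_req *r)
{
    r->done = 1;
}
```

If the reader runs between model_store_elapsed() and model_mark_done(),
it subtracts a short elapsed time from the current time and reports a
value close to the raw timestamp itself.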
To fix this issue, the value of hp->duration must change under the
protection of the sfp->rq_list_lock in sg_rq_end_io(). Since
sg_proc_debug_helper() takes this lock for reading, the changes to
srp->done and srp->header.duration (the same field as hp->duration)
happen atomically from the perspective of sg_proc_debug_helper() and
the race condition is thus eliminated.
The second race condition happens between sg_proc_debug_helper() and
sg_new_write(). Even though hp->duration is set to the current
timestamp in sg_add_request() under the write lock's protection, it gets
overwritten by a call to get_sg_io_hdr(), which calls copy_from_user()
to copy struct sg_io_hdr from userspace into kernel space. hp->duration
is set to the start time again only later, in sg_common_write(). If
sg_proc_debug_helper() is called between get_sg_io_hdr() and
sg_common_write(), an arbitrary value set by userspace (usually zero)
is used to compute the elapsed time.
To fix this issue, hp->duration must be set to the current timestamp
again after get_sg_io_hdr() returns successfully. A small race window
still exists between get_sg_io_hdr() and setting hp->duration, but this
window is only a few instructions wide and does not result in observable
issues in practice, as confirmed by testing.
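The second race and its fix can be modelled the same way. In this
sketch, memcpy() stands in for get_sg_io_hdr()/copy_from_user(), and
struct model_hdr and the model_* functions are hypothetical names that
do not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down header; the real struct sg_io_hdr has many more fields. */
struct model_hdr {
    int interface_id;   /* stand-in for the other user-supplied fields */
    uint32_t duration;  /* start time while the request is in flight */
};

/* Stand-in for get_sg_io_hdr(): overwrites the whole header with user data. */
static void model_copy_hdr(struct model_hdr *dst, const struct model_hdr *user_src)
{
    assert(dst && user_src);
    memcpy(dst, user_src, sizeof(*dst));
}

/* Before the fix: the earlier start timestamp is lost after the copy. */
static void model_write_unfixed(struct model_hdr *hp,
                                const struct model_hdr *user_src,
                                uint32_t now_ms)
{
    hp->duration = now_ms;         /* stamped when the request is added */
    model_copy_hdr(hp, user_src);  /* duration is now whatever userspace sent */
}

/* After the fix: the timestamp is written again right after the copy. */
static void model_write_fixed(struct model_hdr *hp,
                              const struct model_hdr *user_src,
                              uint32_t now_ms)
{
    hp->duration = now_ms;
    model_copy_hdr(hp, user_src);
    hp->duration = now_ms;         /* re-stamp so readers see a start time */
}
```

With a user-supplied duration of zero, a reader that computes "now
minus duration" after model_write_unfixed() reports the raw current
time, while after model_write_fixed() it reports a plausible elapsed
time.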
Additionally, we fix the format specifier from %d to %u for printing
unsigned int values in sg_proc_debug_helper().
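Why the specifier matters can be shown with a short userspace sketch.
The helper names are hypothetical, and the signed variant uses an
explicit cast to emulate what a %d conversion typically prints for a
large unsigned value on common two's-complement platforms.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Emulates printing an unsigned millisecond count through a signed conversion. */
static int format_as_signed(char *buf, size_t len, uint32_t ms)
{
    assert(buf && len > 0);
    return snprintf(buf, len, "%" PRId32, (int32_t)ms);
}

/* Prints the same value with the matching unsigned conversion. */
static int format_as_unsigned(char *buf, size_t len, uint32_t ms)
{
    assert(buf && len > 0);
    return snprintf(buf, len, "%" PRIu32, ms);
}
```

For values above INT32_MAX the signed variant produces a negative
number, whereas the unsigned variant prints the value as stored.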
Signed-off-by: Michal Rábek <[email protected]>
Suggested-by: Tomas Henzl <[email protected]>
Tested-by: Changhui Zhong <[email protected]>
Reviewed-by: Ewan D. Milne <[email protected]>
Reviewed-by: John Meneghini <[email protected]>
Reviewed-by: Tomas Henzl <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
1 parent d373163 commit 0e16776
1 file changed, 13 insertions(+), 7 deletions(-)