|
Miroslav Benes |
4f656d |
From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
|
|
Miroslav Benes |
4f656d |
Date: Mon, 22 Jun 2020 15:18:15 -0400
|
|
Miroslav Benes |
4f656d |
Subject: ring-buffer: Zero out time extend if it is nested and not absolute
|
|
Miroslav Benes |
4f656d |
Git-commit: 097350d1c6e1f5808cae142006f18a0bbc57018d
|
|
Miroslav Benes |
4f656d |
Patch-mainline: v5.8-rc3
|
|
Miroslav Benes |
4f656d |
References: git-fixes
|
|
Miroslav Benes |
4f656d |
|
|
Miroslav Benes |
4f656d |
Currently the ring buffer makes events that happen in interrupts that preempt
|
|
Miroslav Benes |
4f656d |
another event have a delta of zero. (Hopefully we can change this soon). But
|
|
Miroslav Benes |
4f656d |
this is to deal with the races of updating a global counter with lockless
|
|
Miroslav Benes |
4f656d |
and nesting functions updating deltas.
|
|
Miroslav Benes |
4f656d |
|
|
Miroslav Benes |
4f656d |
With the addition of absolute time stamps, the time extend didn't follow
|
|
Miroslav Benes |
4f656d |
this rule. A time extend can happen if two events happen longer than 2^27
|
|
Miroslav Benes |
4f656d |
nanoseconds appart, as the delta time field in each event is only 27 bits.
|
|
Miroslav Benes |
4f656d |
If that happens, then a time extend is injected with 2^59 bits of
|
|
Miroslav Benes |
4f656d |
nanoseconds to use (18 years). But if the 2^27 nanoseconds happen between
|
|
Miroslav Benes |
4f656d |
two events, and as it is writing the event, an interrupt triggers, it will
|
|
Miroslav Benes |
4f656d |
see the 2^27 difference as well and inject a time extend of its own. But a
|
|
Miroslav Benes |
4f656d |
recent change made the time extend logic not take into account the nesting,
|
|
Miroslav Benes |
4f656d |
and this can cause two time extend deltas to happen moving the time stamp
|
|
Miroslav Benes |
4f656d |
much further ahead than the current time. This gets all reset when the ring
|
|
Miroslav Benes |
4f656d |
buffer moves to the next page, but that can cause time to appear to go
|
|
Miroslav Benes |
4f656d |
backwards.
|
|
Miroslav Benes |
4f656d |
|
|
Miroslav Benes |
4f656d |
This was observed in a trace-cmd recording, and since the data is saved in a
|
|
Miroslav Benes |
4f656d |
file, with trace-cmd report --debug, it was possible to see that this indeed
|
|
Miroslav Benes |
4f656d |
did happen!
|
|
Miroslav Benes |
4f656d |
|
|
Miroslav Benes |
4f656d |
bash-52501 110d... 81778.908247: sched_switch: bash:52501 [120] S ==> swapper/110:0 [120] [12770284:0x2e8:64]
|
|
Miroslav Benes |
4f656d |
<idle>-0 110d... 81778.908757: sched_switch: swapper/110:0 [120] R ==> bash:52501 [120] [509947:0x32c:64]
|
|
Miroslav Benes |
4f656d |
TIME EXTEND: delta:306454770 length:0
|
|
Miroslav Benes |
4f656d |
bash-52501 110.... 81779.215212: sched_swap_numa: src_pid=52501 src_tgid=52388 src_ngid=52501 src_cpu=110 src_nid=2 dst_pid=52509 dst_tgid=52388 dst_ngid=52501 dst_cpu=49 dst_nid=1 [0:0x378:48]
|
|
Miroslav Benes |
4f656d |
TIME EXTEND: delta:306458165 length:0
|
|
Miroslav Benes |
4f656d |
bash-52501 110dNh. 81779.521670: sched_wakeup: migration/110:565 [0] success=1 CPU:110 [0:0x3b4:40]
|
|
Miroslav Benes |
4f656d |
|
|
Miroslav Benes |
4f656d |
and at the next page, caused the time to go backwards:
|
|
Miroslav Benes |
4f656d |
|
|
Miroslav Benes |
4f656d |
bash-52504 110d... 81779.685411: sched_switch: bash:52504 [120] S ==> swapper/110:0 [120] [8347057:0xfb4:64]
|
|
Miroslav Benes |
4f656d |
CPU:110 [SUBBUFFER START] [81779379165886:0x1320000]
|
|
Miroslav Benes |
4f656d |
<idle>-0 110dN.. 81779.379166: sched_wakeup: bash:52504 [120] success=1 CPU:110 [0:0x10:40]
|
|
Miroslav Benes |
4f656d |
<idle>-0 110d... 81779.379167: sched_switch: swapper/110:0 [120] R ==> bash:52504 [120] [1168:0x3c:64]
|
|
Miroslav Benes |
4f656d |
|
|
Miroslav Benes |
4f656d |
Link: https://lkml.kernel.org/r/20200622151815.345d1bf5@oasis.local.home
|
|
Miroslav Benes |
4f656d |
|
|
Miroslav Benes |
4f656d |
Cc: Ingo Molnar <mingo@kernel.org>
|
|
Miroslav Benes |
4f656d |
Cc: Andrew Morton <akpm@linux-foundation.org>
|
|
Miroslav Benes |
4f656d |
Cc: Tom Zanussi <zanussi@kernel.org>
|
|
Miroslav Benes |
4f656d |
Cc: stable@vger.kernel.org
|
|
Miroslav Benes |
4f656d |
Fixes: dc4e2801d400b ("ring-buffer: Redefine the unimplemented RINGBUF_TYPE_TIME_STAMP")
|
|
Miroslav Benes |
4f656d |
Reported-by: Julia Lawall <julia.lawall@inria.fr>
|
|
Miroslav Benes |
4f656d |
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
Miroslav Benes |
4f656d |
Acked-by: Miroslav Benes <mbenes@suse.cz>
|
|
Miroslav Benes |
4f656d |
---
|
|
Miroslav Benes |
4f656d |
kernel/trace/ring_buffer.c | 2 +-
|
|
Miroslav Benes |
4f656d |
1 file changed, 1 insertion(+), 1 deletion(-)
|
|
Miroslav Benes |
4f656d |
|
|
Miroslav Benes |
4f656d |
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
|
|
Miroslav Benes |
4f656d |
index b8e1ca48be50..00867ff82412 100644
|
|
Miroslav Benes |
4f656d |
--- a/kernel/trace/ring_buffer.c
|
|
Miroslav Benes |
4f656d |
+++ b/kernel/trace/ring_buffer.c
|
|
Miroslav Benes |
4f656d |
@@ -2427,7 +2427,7 @@ rb_update_event(struct ring_buffer_per_cpu *cpu_buffer,
|
|
Miroslav Benes |
4f656d |
if (unlikely(info->add_timestamp)) {
|
|
Miroslav Benes |
4f656d |
bool abs = ring_buffer_time_stamp_abs(cpu_buffer->buffer);
|
|
Miroslav Benes |
4f656d |
|
|
Miroslav Benes |
4f656d |
- event = rb_add_time_stamp(event, info->delta, abs);
|
|
Miroslav Benes |
4f656d |
+ event = rb_add_time_stamp(event, abs ? info->delta : delta, abs);
|
|
Miroslav Benes |
4f656d |
length -= RB_LEN_TIME_EXTEND;
|
|
Miroslav Benes |
4f656d |
delta = 0;
|
|
Miroslav Benes |
4f656d |
}
|
|
Miroslav Benes |
4f656d |
|