Miroslav Benes 4f656d
From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Miroslav Benes 4f656d
Date: Mon, 22 Jun 2020 15:18:15 -0400
Miroslav Benes 4f656d
Subject: ring-buffer: Zero out time extend if it is nested and not absolute
Miroslav Benes 4f656d
Git-commit: 097350d1c6e1f5808cae142006f18a0bbc57018d
Miroslav Benes 4f656d
Patch-mainline: v5.8-rc3
Miroslav Benes 4f656d
References: git-fixes
Miroslav Benes 4f656d
Miroslav Benes 4f656d
Currently the ring buffer makes events that happen in interrupts that preempt
Miroslav Benes 4f656d
another event have a delta of zero. (Hopefully we can change this soon). But
Miroslav Benes 4f656d
this is to deal with the races of updating a global counter with lockless
Miroslav Benes 4f656d
and nesting functions updating deltas.
Miroslav Benes 4f656d
Miroslav Benes 4f656d
With the addition of absolute time stamps, the time extend didn't follow
Miroslav Benes 4f656d
this rule. A time extend can happen if two events happen longer than 2^27
Miroslav Benes 4f656d
nanoseconds appart, as the delta time field in each event is only 27 bits.
Miroslav Benes 4f656d
If that happens, then a time extend is injected with 2^59 bits of
Miroslav Benes 4f656d
nanoseconds to use (18 years). But if the 2^27 nanoseconds happen between
Miroslav Benes 4f656d
two events, and as it is writing the event, an interrupt triggers, it will
Miroslav Benes 4f656d
see the 2^27 difference as well and inject a time extend of its own. But a
Miroslav Benes 4f656d
recent change made the time extend logic not take into account the nesting,
Miroslav Benes 4f656d
and this can cause two time extend deltas to happen moving the time stamp
Miroslav Benes 4f656d
much further ahead than the current time. This gets all reset when the ring
Miroslav Benes 4f656d
buffer moves to the next page, but that can cause time to appear to go
Miroslav Benes 4f656d
backwards.
Miroslav Benes 4f656d
Miroslav Benes 4f656d
This was observed in a trace-cmd recording, and since the data is saved in a
Miroslav Benes 4f656d
file, with trace-cmd report --debug, it was possible to see that this indeed
Miroslav Benes 4f656d
did happen!
Miroslav Benes 4f656d
Miroslav Benes 4f656d
  bash-52501   110d... 81778.908247: sched_switch:         bash:52501 [120] S ==> swapper/110:0 [120] [12770284:0x2e8:64]
Miroslav Benes 4f656d
  <idle>-0     110d... 81778.908757: sched_switch:         swapper/110:0 [120] R ==> bash:52501 [120] [509947:0x32c:64]
Miroslav Benes 4f656d
 TIME EXTEND: delta:306454770 length:0
Miroslav Benes 4f656d
  bash-52501   110.... 81779.215212: sched_swap_numa:      src_pid=52501 src_tgid=52388 src_ngid=52501 src_cpu=110 src_nid=2 dst_pid=52509 dst_tgid=52388 dst_ngid=52501 dst_cpu=49 dst_nid=1 [0:0x378:48]
Miroslav Benes 4f656d
 TIME EXTEND: delta:306458165 length:0
Miroslav Benes 4f656d
  bash-52501   110dNh. 81779.521670: sched_wakeup:         migration/110:565 [0] success=1 CPU:110 [0:0x3b4:40]
Miroslav Benes 4f656d
Miroslav Benes 4f656d
and at the next page, caused the time to go backwards:
Miroslav Benes 4f656d
Miroslav Benes 4f656d
  bash-52504   110d... 81779.685411: sched_switch:         bash:52504 [120] S ==> swapper/110:0 [120] [8347057:0xfb4:64]
Miroslav Benes 4f656d
CPU:110 [SUBBUFFER START] [81779379165886:0x1320000]
Miroslav Benes 4f656d
  <idle>-0     110dN.. 81779.379166: sched_wakeup:         bash:52504 [120] success=1 CPU:110 [0:0x10:40]
Miroslav Benes 4f656d
  <idle>-0     110d... 81779.379167: sched_switch:         swapper/110:0 [120] R ==> bash:52504 [120] [1168:0x3c:64]
Miroslav Benes 4f656d
Miroslav Benes 4f656d
Link: https://lkml.kernel.org/r/20200622151815.345d1bf5@oasis.local.home
Miroslav Benes 4f656d
Miroslav Benes 4f656d
Cc: Ingo Molnar <mingo@kernel.org>
Miroslav Benes 4f656d
Cc: Andrew Morton <akpm@linux-foundation.org>
Miroslav Benes 4f656d
Cc: Tom Zanussi <zanussi@kernel.org>
Miroslav Benes 4f656d
Cc: stable@vger.kernel.org
Miroslav Benes 4f656d
Fixes: dc4e2801d400b ("ring-buffer: Redefine the unimplemented RINGBUF_TYPE_TIME_STAMP")
Miroslav Benes 4f656d
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Miroslav Benes 4f656d
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Miroslav Benes 4f656d
Acked-by: Miroslav Benes <mbenes@suse.cz>
Miroslav Benes 4f656d
---
Miroslav Benes 4f656d
 kernel/trace/ring_buffer.c | 2 +-
Miroslav Benes 4f656d
 1 file changed, 1 insertion(+), 1 deletion(-)
Miroslav Benes 4f656d
Miroslav Benes 4f656d
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
Miroslav Benes 4f656d
index b8e1ca48be50..00867ff82412 100644
Miroslav Benes 4f656d
--- a/kernel/trace/ring_buffer.c
Miroslav Benes 4f656d
+++ b/kernel/trace/ring_buffer.c
Miroslav Benes 4f656d
@@ -2427,7 +2427,7 @@ rb_update_event(struct ring_buffer_per_cpu *cpu_buffer,
Miroslav Benes 4f656d
 	if (unlikely(info->add_timestamp)) {
Miroslav Benes 4f656d
 		bool abs = ring_buffer_time_stamp_abs(cpu_buffer->buffer);
Miroslav Benes 4f656d
 
Miroslav Benes 4f656d
-		event = rb_add_time_stamp(event, info->delta, abs);
Miroslav Benes 4f656d
+		event = rb_add_time_stamp(event, abs ? info->delta : delta, abs);
Miroslav Benes 4f656d
 		length -= RB_LEN_TIME_EXTEND;
Miroslav Benes 4f656d
 		delta = 0;
Miroslav Benes 4f656d
 	}
Miroslav Benes 4f656d