Tree - kernel/kernel-source - Pagure for openSUSE

kernel / kernel-source


Blame patches.suse/msft-hv-2311-Drivers-hv-vmbus-Increase-wait-time-for-VMbus-unload.patch

Olaf Hering	f1e50a	`From: Michael Kelley <mikelley@microsoft.com>`
Olaf Hering	f1e50a	`Date: Mon, 19 Apr 2021 21:48:09 -0700`
Olaf Hering	f1e50a	`Patch-mainline: v5.13-rc1`
Olaf Hering	f1e50a	`Subject: Drivers: hv: vmbus: Increase wait time for VMbus unload`
Olaf Hering	f1e50a	`Git-commit: 77db0ec8b7764cb9b09b78066ebfd47b2c0c1909`
Olaf Hering	f1e50a	`References: bsc#1185724`
Olaf Hering	f1e50a
Olaf Hering	f1e50a	`When running in Azure, disks may be connected to a Linux VM with`
Olaf Hering	f1e50a	`read/write caching enabled. If a VM panics and issues a VMbus`
Olaf Hering	f1e50a	`UNLOAD request to Hyper-V, the response is delayed until all dirty`
Olaf Hering	f1e50a	`data in the disk cache is flushed. In extreme cases, this flushing`
Olaf Hering	f1e50a	`can take 10's of seconds, depending on the disk speed and the amount`
Olaf Hering	f1e50a	`of dirty data. If kdump is configured for the VM, the current 10 second`
Olaf Hering	f1e50a	`timeout in vmbus_wait_for_unload() may be exceeded, and the UNLOAD`
Olaf Hering	f1e50a	`complete message may arrive well after the kdump kernel is already`
Olaf Hering	f1e50a	`running, causing problems. Note that no problem occurs if kdump is`
Olaf Hering	f1e50a	`not enabled because Hyper-V waits for the cache flush before doing`
Olaf Hering	f1e50a	`a reboot through the BIOS/UEFI code.`
Olaf Hering	f1e50a
Olaf Hering	f1e50a	`Fix this problem by increasing the timeout in vmbus_wait_for_unload()`
Olaf Hering	f1e50a	`to 100 seconds. Also output periodic messages so that if anyone is`
Olaf Hering	f1e50a	`watching the serial console, they won't think the VM is completely`
Olaf Hering	f1e50a	`hung.`
Olaf Hering	f1e50a
Olaf Hering	f1e50a	`Fixes: 911e1987efc8 ("Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload")`
Olaf Hering	f1e50a	`Signed-off-by: Michael Kelley <mikelley@microsoft.com>`
Olaf Hering	f1e50a	`Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>`
Olaf Hering	f1e50a	`Link: https://lore.kernel.org/r/1618894089-126662-1-git-send-email-mikelley@microsoft.com`
Olaf Hering	f1e50a	`Signed-off-by: Wei Liu <wei.liu@kernel.org>`
Olaf Hering	f1e50a	`Acked-by: Olaf Hering <ohering@suse.de>`
Olaf Hering	f1e50a	`---`
Olaf Hering	f1e50a	`drivers/hv/channel_mgmt.c | 30 +++++++++++++++++++++++++-----`
Olaf Hering	f1e50a	`1 file changed, 25 insertions(+), 5 deletions(-)`
Olaf Hering	f1e50a
Olaf Hering	f1e50a	`diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c`
Olaf Hering	f1e50a	`--- a/drivers/hv/channel_mgmt.c`
Olaf Hering	f1e50a	`+++ b/drivers/hv/channel_mgmt.c`
Olaf Hering	f1e50a	`@@ -755,6 +755,12 @@ static void init_vp_index(struct vmbus_channel *channel)`
Olaf Hering	f1e50a	`free_cpumask_var(available_mask);`
Olaf Hering	f1e50a	`}`
Olaf Hering	f1e50a
Olaf Hering	f1e50a	`+#define UNLOAD_DELAY_UNIT_MS 10 /* 10 milliseconds */`
Olaf Hering	f1e50a	`+#define UNLOAD_WAIT_MS (100*1000) /* 100 seconds */`
Olaf Hering	f1e50a	`+#define UNLOAD_WAIT_LOOPS (UNLOAD_WAIT_MS/UNLOAD_DELAY_UNIT_MS)`
Olaf Hering	f1e50a	`+#define UNLOAD_MSG_MS (5*1000) /* Every 5 seconds */`
Olaf Hering	f1e50a	`+#define UNLOAD_MSG_LOOPS (UNLOAD_MSG_MS/UNLOAD_DELAY_UNIT_MS)`
Olaf Hering	f1e50a	`+`
Olaf Hering	f1e50a	`static void vmbus_wait_for_unload(void)`
Olaf Hering	f1e50a	`{`
Olaf Hering	f1e50a	`int cpu;`
Olaf Hering	f1e50a	`@@ -772,12 +778,17 @@ static void vmbus_wait_for_unload(void)`
Olaf Hering	f1e50a	`* vmbus_connection.unload_event. If not, the last thing we can do is`
Olaf Hering	f1e50a	`* read message pages for all CPUs directly.`
Olaf Hering	f1e50a	`*`
Olaf Hering	f1e50a	`- * Wait no more than 10 seconds so that the panic path can't get`
Olaf Hering	f1e50a	`- * hung forever in case the response message isn't seen.`
Olaf Hering	f1e50a	`+ * Wait up to 100 seconds since an Azure host must writeback any dirty`
Olaf Hering	f1e50a	`+ * data in its disk cache before the VMbus UNLOAD request will`
Olaf Hering	f1e50a	`+ * complete. This flushing has been empirically observed to take up`
Olaf Hering	f1e50a	`+ * to 50 seconds in cases with a lot of dirty data, so allow additional`
Olaf Hering	f1e50a	`+ * leeway and for inaccuracies in mdelay(). But eventually time out so`
Olaf Hering	f1e50a	`+ * that the panic path can't get hung forever in case the response`
Olaf Hering	f1e50a	`+ * message isn't seen.`
Olaf Hering	f1e50a	`*/`
Olaf Hering	f1e50a	`- for (i = 0; i < 1000; i++) {`
Olaf Hering	f1e50a	`+ for (i = 1; i <= UNLOAD_WAIT_LOOPS; i++) {`
Olaf Hering	f1e50a	`if (completion_done(&vmbus_connection.unload_event))`
Olaf Hering	f1e50a	`- break;`
Olaf Hering	f1e50a	`+ goto completed;`
Olaf Hering	f1e50a
Olaf Hering	f1e50a	`for_each_online_cpu(cpu) {`
Olaf Hering	f1e50a	`struct hv_per_cpu_context *hv_cpu`
Olaf Hering	f1e50a	`@@ -800,9 +811,18 @@ static void vmbus_wait_for_unload(void)`
Olaf Hering	f1e50a	`vmbus_signal_eom(msg, message_type);`
Olaf Hering	f1e50a	`}`
Olaf Hering	f1e50a
Olaf Hering	f1e50a	`- mdelay(10);`
Olaf Hering	f1e50a	`+ /*`
Olaf Hering	f1e50a	`+ * Give a notice periodically so someone watching the`
Olaf Hering	f1e50a	`+ * serial output won't think it is completely hung.`
Olaf Hering	f1e50a	`+ */`
Olaf Hering	f1e50a	`+ if (!(i % UNLOAD_MSG_LOOPS))`
Olaf Hering	f1e50a	`+ pr_notice("Waiting for VMBus UNLOAD to complete\n");`
Olaf Hering	f1e50a	`+`
Olaf Hering	f1e50a	`+ mdelay(UNLOAD_DELAY_UNIT_MS);`
Olaf Hering	f1e50a	`}`
Olaf Hering	f1e50a	`+ pr_err("Continuing even though VMBus UNLOAD did not complete\n");`
Olaf Hering	f1e50a
Olaf Hering	f1e50a	`+completed:`
Olaf Hering	f1e50a	`/*`
Olaf Hering	f1e50a	`* We're crashing and already got the UNLOAD_RESPONSE, cleanup all`
Olaf Hering	f1e50a	`* maybe-pending messages on all CPUs to be able to receive new`
