|
Petr Tesarik |
1acccf |
From: Benjamin Block <bblock@linux.ibm.com>
|
|
Petr Tesarik |
1acccf |
Date: Wed, 16 Nov 2022 11:50:37 +0100
|
|
Petr Tesarik |
1acccf |
Subject: scsi: zfcp: Fix double free of FSF request when qdio send fails
|
|
Petr Tesarik |
1acccf |
Git-commit: 0954256e970ecf371b03a6c9af2cf91b9c4085ff
|
|
Petr Tesarik |
1acccf |
Patch-mainline: v6.1-rc6
|
|
Petr Tesarik |
1acccf |
References: git-fixes
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
We used to use the wrong type of integer in 'zfcp_fsf_req_send()' to cache
|
|
Petr Tesarik |
1acccf |
the FSF request ID when sending a new FSF request. This is used in case the
|
|
Petr Tesarik |
1acccf |
sending fails and we need to remove the request from our internal hash
|
|
Petr Tesarik |
1acccf |
table again (so we don't keep an invalid reference and use it when we free
|
|
Petr Tesarik |
1acccf |
the request again).
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
In 'zfcp_fsf_req_send()' we used to cache the ID as 'int' (signed and 32
|
|
Petr Tesarik |
1acccf |
bit wide), but the rest of the zfcp code (and the firmware specification)
|
|
Petr Tesarik |
1acccf |
handles the ID as 'unsigned long'/'u64' (unsigned and 64 bit wide [s390x
|
|
Petr Tesarik |
1acccf |
ELF ABI]). For one this has the obvious problem that when the ID grows
|
|
Petr Tesarik |
1acccf |
past 32 bit (this can happen reasonably fast) it is truncated to 32 bit
|
|
Petr Tesarik |
1acccf |
when storing it in the cache variable and so doesn't match the original ID
|
|
Petr Tesarik |
1acccf |
anymore. The second less obvious problem is that even when the original ID
|
|
Petr Tesarik |
1acccf |
has not yet grown past 32 bit, as soon as the 32nd bit is set in the
|
|
Petr Tesarik |
1acccf |
original ID (0x80000000 = 2'147'483'648) we will have a mismatch when we
|
|
Petr Tesarik |
1acccf |
cast it back to 'unsigned long'. As the cached variable is of a signed
|
|
Petr Tesarik |
1acccf |
type, the compiler will choose a sign-extending instruction to load the 32
|
|
Petr Tesarik |
1acccf |
bit variable into a 64 bit register (e.g.: 'lgf %r11,188(%r15)'). So once
|
|
Petr Tesarik |
1acccf |
we pass the cached variable into 'zfcp_reqlist_find_rm()' to remove the
|
|
Petr Tesarik |
1acccf |
request again all the leading zeros will be flipped to ones to extend the
|
|
Petr Tesarik |
1acccf |
sign and won't match the original ID anymore (this has been observed in
|
|
Petr Tesarik |
1acccf |
practice).
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
If we can't successfully remove the request from the hash table again after
|
|
Petr Tesarik |
1acccf |
'zfcp_qdio_send()' fails (this happens regularly when zfcp cannot notify
|
|
Petr Tesarik |
1acccf |
the adapter about new work because the adapter is already gone during
|
|
Petr Tesarik |
1acccf |
e.g. a ChpID toggle) we will end up with a double free. We unconditionally
|
|
Petr Tesarik |
1acccf |
free the request in the calling function when 'zfcp_fsf_req_send()' fails,
|
|
Petr Tesarik |
1acccf |
but because the request is still in the hash table we end up with a stale
|
|
Petr Tesarik |
1acccf |
memory reference, and once the zfcp adapter is either reset during recovery
|
|
Petr Tesarik |
1acccf |
or shutdown we end up freeing the same memory twice.
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
The resulting stack traces vary depending on the kernel and have no direct
|
|
Petr Tesarik |
1acccf |
correlation to the place where the bug occurs. Here are three examples that
|
|
Petr Tesarik |
1acccf |
have been seen in practice:
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
list_del corruption. next->prev should be 00000001b9d13800, but was 00000000dead4ead. (next=00000001bd131a00)
|
|
Petr Tesarik |
1acccf |
------------[ cut here ]------------
|
|
Petr Tesarik |
1acccf |
kernel BUG at lib/list_debug.c:62!
|
|
Petr Tesarik |
1acccf |
monitor event: 0040 ilc:2 [#1] PREEMPT SMP
|
|
Petr Tesarik |
1acccf |
Modules linked in: ...
|
|
Petr Tesarik |
1acccf |
CPU: 9 PID: 1617 Comm: zfcperp0.0.1740 Kdump: loaded
|
|
Petr Tesarik |
1acccf |
Hardware name: ...
|
|
Petr Tesarik |
1acccf |
Krnl PSW : 0704d00180000000 00000003cbeea1f8 (__list_del_entry_valid+0x98/0x140)
|
|
Petr Tesarik |
1acccf |
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
|
|
Petr Tesarik |
1acccf |
Krnl GPRS: 00000000916d12f1 0000000080000000 000000000000006d 00000003cb665cd6
|
|
Petr Tesarik |
1acccf |
0000000000000001 0000000000000000 0000000000000000 00000000d28d21e8
|
|
Petr Tesarik |
1acccf |
00000000d3844000 00000380099efd28 00000001bd131a00 00000001b9d13800
|
|
Petr Tesarik |
1acccf |
00000000d3290100 0000000000000000 00000003cbeea1f4 00000380099efc70
|
|
Petr Tesarik |
1acccf |
Krnl Code: 00000003cbeea1e8: c020004f68a7 larl %r2,00000003cc8d7336
|
|
Petr Tesarik |
1acccf |
00000003cbeea1ee: c0e50027fd65 brasl %r14,00000003cc3e9cb8
|
|
Petr Tesarik |
1acccf |
#00000003cbeea1f4: af000000 mc 0,0
|
|
Petr Tesarik |
1acccf |
>00000003cbeea1f8: c02000920440 larl %r2,00000003cd12aa78
|
|
Petr Tesarik |
1acccf |
00000003cbeea1fe: c0e500289c25 brasl %r14,00000003cc3fda48
|
|
Petr Tesarik |
1acccf |
00000003cbeea204: b9040043 lgr %r4,%r3
|
|
Petr Tesarik |
1acccf |
00000003cbeea208: b9040051 lgr %r5,%r1
|
|
Petr Tesarik |
1acccf |
00000003cbeea20c: b9040032 lgr %r3,%r2
|
|
Petr Tesarik |
1acccf |
Call Trace:
|
|
Petr Tesarik |
1acccf |
[<00000003cbeea1f8>] __list_del_entry_valid+0x98/0x140
|
|
Petr Tesarik |
1acccf |
([<00000003cbeea1f4>] __list_del_entry_valid+0x94/0x140)
|
|
Petr Tesarik |
1acccf |
[<000003ff7ff502fe>] zfcp_fsf_req_dismiss_all+0xde/0x150 [zfcp]
|
|
Petr Tesarik |
1acccf |
[<000003ff7ff49cd0>] zfcp_erp_strategy_do_action+0x160/0x280 [zfcp]
|
|
Petr Tesarik |
1acccf |
[<000003ff7ff4a22e>] zfcp_erp_strategy+0x21e/0xca0 [zfcp]
|
|
Petr Tesarik |
1acccf |
[<000003ff7ff4ad34>] zfcp_erp_thread+0x84/0x1a0 [zfcp]
|
|
Petr Tesarik |
1acccf |
[<00000003cb5eece8>] kthread+0x138/0x150
|
|
Petr Tesarik |
1acccf |
[<00000003cb557f3c>] __ret_from_fork+0x3c/0x60
|
|
Petr Tesarik |
1acccf |
[<00000003cc4172ea>] ret_from_fork+0xa/0x40
|
|
Petr Tesarik |
1acccf |
INFO: lockdep is turned off.
|
|
Petr Tesarik |
1acccf |
Last Breaking-Event-Address:
|
|
Petr Tesarik |
1acccf |
[<00000003cc3e9d04>] _printk+0x4c/0x58
|
|
Petr Tesarik |
1acccf |
Kernel panic - not syncing: Fatal exception: panic_on_oops
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
or:
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
Unable to handle kernel pointer dereference in virtual kernel address space
|
|
Petr Tesarik |
1acccf |
Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803
|
|
Petr Tesarik |
1acccf |
Fault in home space mode while using kernel ASCE.
|
|
Petr Tesarik |
1acccf |
AS:0000000063b10007 R3:0000000000000024
|
|
Petr Tesarik |
1acccf |
Oops: 0038 ilc:3 [#1] SMP
|
|
Petr Tesarik |
1acccf |
Modules linked in: ...
|
|
Petr Tesarik |
1acccf |
CPU: 10 PID: 0 Comm: swapper/10 Kdump: loaded
|
|
Petr Tesarik |
1acccf |
Hardware name: ...
|
|
Petr Tesarik |
1acccf |
Krnl PSW : 0404d00180000000 000003ff7febaf8e (zfcp_fsf_reqid_check+0x86/0x158 [zfcp])
|
|
Petr Tesarik |
1acccf |
R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
|
|
Petr Tesarik |
1acccf |
Krnl GPRS: 5a6f1cfa89c49ac3 00000000aff2c4c8 6b6b6b6b6b6b6b6b 00000000000002a8
|
|
Petr Tesarik |
1acccf |
0000000000000000 0000000000000055 0000000000000000 00000000a8515800
|
|
Petr Tesarik |
1acccf |
0700000000000000 00000000a6e14500 00000000aff2c000 000000008003c44c
|
|
Petr Tesarik |
1acccf |
000000008093c700 0000000000000010 00000380009ebba8 00000380009ebb48
|
|
Petr Tesarik |
1acccf |
Krnl Code: 000003ff7febaf7e: a7f4003d brc 15,000003ff7febaff8
|
|
Petr Tesarik |
1acccf |
000003ff7febaf82: e32020000004 lg %r2,0(%r2)
|
|
Petr Tesarik |
1acccf |
#000003ff7febaf88: ec2100388064 cgrj %r2,%r1,8,000003ff7febaff8
|
|
Petr Tesarik |
1acccf |
>000003ff7febaf8e: e3b020100020 cg %r11,16(%r2)
|
|
Petr Tesarik |
1acccf |
000003ff7febaf94: a774fff7 brc 7,000003ff7febaf82
|
|
Petr Tesarik |
1acccf |
000003ff7febaf98: ec280030007c cgij %r2,0,8,000003ff7febaff8
|
|
Petr Tesarik |
1acccf |
000003ff7febaf9e: e31020080004 lg %r1,8(%r2)
|
|
Petr Tesarik |
1acccf |
000003ff7febafa4: e33020000004 lg %r3,0(%r2)
|
|
Petr Tesarik |
1acccf |
Call Trace:
|
|
Petr Tesarik |
1acccf |
[<000003ff7febaf8e>] zfcp_fsf_reqid_check+0x86/0x158 [zfcp]
|
|
Petr Tesarik |
1acccf |
[<000003ff7febbdbc>] zfcp_qdio_int_resp+0x6c/0x170 [zfcp]
|
|
Petr Tesarik |
1acccf |
[<000003ff7febbf90>] zfcp_qdio_irq_tasklet+0xd0/0x108 [zfcp]
|
|
Petr Tesarik |
1acccf |
[<0000000061d90a04>] tasklet_action_common.constprop.0+0xdc/0x128
|
|
Petr Tesarik |
1acccf |
[<000000006292f300>] __do_softirq+0x130/0x3c0
|
|
Petr Tesarik |
1acccf |
[<0000000061d906c6>] irq_exit_rcu+0xfe/0x118
|
|
Petr Tesarik |
1acccf |
[<000000006291e818>] do_io_irq+0xc8/0x168
|
|
Petr Tesarik |
1acccf |
[<000000006292d516>] io_int_handler+0xd6/0x110
|
|
Petr Tesarik |
1acccf |
[<000000006292d596>] psw_idle_exit+0x0/0xa
|
|
Petr Tesarik |
1acccf |
([<0000000061d3be50>] arch_cpu_idle+0x40/0xd0)
|
|
Petr Tesarik |
1acccf |
[<000000006292ceea>] default_idle_call+0x52/0xf8
|
|
Petr Tesarik |
1acccf |
[<0000000061de4fa4>] do_idle+0xd4/0x168
|
|
Petr Tesarik |
1acccf |
[<0000000061de51fe>] cpu_startup_entry+0x36/0x40
|
|
Petr Tesarik |
1acccf |
[<0000000061d4faac>] smp_start_secondary+0x12c/0x138
|
|
Petr Tesarik |
1acccf |
[<000000006292d88e>] restart_int_handler+0x6e/0x90
|
|
Petr Tesarik |
1acccf |
Last Breaking-Event-Address:
|
|
Petr Tesarik |
1acccf |
[<000003ff7febaf94>] zfcp_fsf_reqid_check+0x8c/0x158 [zfcp]
|
|
Petr Tesarik |
1acccf |
Kernel panic - not syncing: Fatal exception in interrupt
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
or:
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
Unable to handle kernel pointer dereference in virtual kernel address space
|
|
Petr Tesarik |
1acccf |
Failing address: 523b05d3ae76a000 TEID: 523b05d3ae76a803
|
|
Petr Tesarik |
1acccf |
Fault in home space mode while using kernel ASCE.
|
|
Petr Tesarik |
1acccf |
AS:0000000077c40007 R3:0000000000000024
|
|
Petr Tesarik |
1acccf |
Oops: 0038 ilc:3 [#1] SMP
|
|
Petr Tesarik |
1acccf |
Modules linked in: ...
|
|
Petr Tesarik |
1acccf |
CPU: 3 PID: 453 Comm: kworker/3:1H Kdump: loaded
|
|
Petr Tesarik |
1acccf |
Hardware name: ...
|
|
Petr Tesarik |
1acccf |
Workqueue: kblockd blk_mq_run_work_fn
|
|
Petr Tesarik |
1acccf |
Krnl PSW : 0404d00180000000 0000000076fc0312 (__kmalloc+0xd2/0x398)
|
|
Petr Tesarik |
1acccf |
R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
|
|
Petr Tesarik |
1acccf |
Krnl GPRS: ffffffffffffffff 523b05d3ae76abf6 0000000000000000 0000000000092a20
|
|
Petr Tesarik |
1acccf |
0000000000000002 00000007e49b5cc0 00000007eda8f000 0000000000092a20
|
|
Petr Tesarik |
1acccf |
00000007eda8f000 00000003b02856b9 00000000000000a8 523b05d3ae76abf6
|
|
Petr Tesarik |
1acccf |
00000007dd662000 00000007eda8f000 0000000076fc02b2 000003e0037637a0
|
|
Petr Tesarik |
1acccf |
Krnl Code: 0000000076fc0302: c004000000d4 brcl 0,76fc04aa
|
|
Petr Tesarik |
1acccf |
0000000076fc0308: b904001b lgr %r1,%r11
|
|
Petr Tesarik |
1acccf |
#0000000076fc030c: e3106020001a algf %r1,32(%r6)
|
|
Petr Tesarik |
1acccf |
>0000000076fc0312: e31010000082 xg %r1,0(%r1)
|
|
Petr Tesarik |
1acccf |
0000000076fc0318: b9040001 lgr %r0,%r1
|
|
Petr Tesarik |
1acccf |
0000000076fc031c: e30061700082 xg %r0,368(%r6)
|
|
Petr Tesarik |
1acccf |
0000000076fc0322: ec59000100d9 aghik %r5,%r9,1
|
|
Petr Tesarik |
1acccf |
0000000076fc0328: e34003b80004 lg %r4,952
|
|
Petr Tesarik |
1acccf |
Call Trace:
|
|
Petr Tesarik |
1acccf |
[<0000000076fc0312>] __kmalloc+0xd2/0x398
|
|
Petr Tesarik |
1acccf |
[<0000000076f318f2>] mempool_alloc+0x72/0x1f8
|
|
Petr Tesarik |
1acccf |
[<000003ff8027c5f8>] zfcp_fsf_req_create.isra.7+0x40/0x268 [zfcp]
|
|
Petr Tesarik |
1acccf |
[<000003ff8027f1bc>] zfcp_fsf_fcp_cmnd+0xac/0x3f0 [zfcp]
|
|
Petr Tesarik |
1acccf |
[<000003ff80280f1a>] zfcp_scsi_queuecommand+0x122/0x1d0 [zfcp]
|
|
Petr Tesarik |
1acccf |
[<000003ff800b4218>] scsi_queue_rq+0x778/0xa10 [scsi_mod]
|
|
Petr Tesarik |
1acccf |
[<00000000771782a0>] __blk_mq_try_issue_directly+0x130/0x208
|
|
Petr Tesarik |
1acccf |
[<000000007717a124>] blk_mq_request_issue_directly+0x4c/0xa8
|
|
Petr Tesarik |
1acccf |
[<000003ff801302e2>] dm_mq_queue_rq+0x2ea/0x468 [dm_mod]
|
|
Petr Tesarik |
1acccf |
[<0000000077178c12>] blk_mq_dispatch_rq_list+0x33a/0x818
|
|
Petr Tesarik |
1acccf |
[<000000007717f064>] __blk_mq_do_dispatch_sched+0x284/0x2f0
|
|
Petr Tesarik |
1acccf |
[<000000007717f44c>] __blk_mq_sched_dispatch_requests+0x1c4/0x218
|
|
Petr Tesarik |
1acccf |
[<000000007717fa7a>] blk_mq_sched_dispatch_requests+0x52/0x90
|
|
Petr Tesarik |
1acccf |
[<0000000077176d74>] __blk_mq_run_hw_queue+0x9c/0xc0
|
|
Petr Tesarik |
1acccf |
[<0000000076da6d74>] process_one_work+0x274/0x4d0
|
|
Petr Tesarik |
1acccf |
[<0000000076da7018>] worker_thread+0x48/0x560
|
|
Petr Tesarik |
1acccf |
[<0000000076daef18>] kthread+0x140/0x160
|
|
Petr Tesarik |
1acccf |
[<000000007751d144>] ret_from_fork+0x28/0x30
|
|
Petr Tesarik |
1acccf |
Last Breaking-Event-Address:
|
|
Petr Tesarik |
1acccf |
[<0000000076fc0474>] __kmalloc+0x234/0x398
|
|
Petr Tesarik |
1acccf |
Kernel panic - not syncing: Fatal exception: panic_on_oops
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
To fix this, simply change the type of the cache variable to 'unsigned
|
|
Petr Tesarik |
1acccf |
long', like the rest of zfcp and also the argument for
|
|
Petr Tesarik |
1acccf |
'zfcp_reqlist_find_rm()'. This prevents truncation and wrong sign extension
|
|
Petr Tesarik |
1acccf |
and so can successfully remove the request from the hash table.
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
Fixes: e60a6d69f1f8 ("[SCSI] zfcp: Remove function zfcp_reqlist_find_safe")
|
|
Petr Tesarik |
1acccf |
Cc: <stable@vger.kernel.org> #v2.6.34+
|
|
Petr Tesarik |
1acccf |
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
|
|
Petr Tesarik |
1acccf |
Link: https://lore.kernel.org/r/979f6e6019d15f91ba56182f1aaf68d61bf37fc6.1668595505.git.bblock@linux.ibm.com
|
|
Petr Tesarik |
1acccf |
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
|
|
Petr Tesarik |
1acccf |
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Petr Tesarik |
1acccf |
Acked-by: Petr Tesarik <ptesarik@suse.com>
|
|
Petr Tesarik |
1acccf |
---
|
|
Petr Tesarik |
1acccf |
drivers/s390/scsi/zfcp_fsf.c | 2 +-
|
|
Petr Tesarik |
1acccf |
1 file changed, 1 insertion(+), 1 deletion(-)
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
--- a/drivers/s390/scsi/zfcp_fsf.c
|
|
Petr Tesarik |
1acccf |
+++ b/drivers/s390/scsi/zfcp_fsf.c
|
|
Petr Tesarik |
1acccf |
@@ -751,7 +751,7 @@ static int zfcp_fsf_req_send(struct zfcp
|
|
Petr Tesarik |
1acccf |
struct zfcp_adapter *adapter = req->adapter;
|
|
Petr Tesarik |
1acccf |
struct zfcp_qdio *qdio = adapter->qdio;
|
|
Petr Tesarik |
1acccf |
int with_qtcb = (req->qtcb != NULL);
|
|
Petr Tesarik |
1acccf |
- int req_id = req->req_id;
|
|
Petr Tesarik |
1acccf |
+ unsigned long req_id = req->req_id;
|
|
Petr Tesarik |
1acccf |
|
|
Petr Tesarik |
1acccf |
zfcp_reqlist_add(adapter->req_list, req);
|
|
Petr Tesarik |
1acccf |
|