From 86bc363029faac03de3ca8bad1ae057f321c7a58 Mon Sep 17 00:00:00 2001
From: Coly Li
Date: Apr 13 2024 06:53:45 +0000
Subject: dm verity: don't perform FEC for failed readahead IO (git-fixes).

---

diff --git a/patches.suse/dm-verity-don-t-perform-FEC-for-failed-readahead-IO-0193.patch b/patches.suse/dm-verity-don-t-perform-FEC-for-failed-readahead-IO-0193.patch
new file mode 100644
index 0000000..a0293b7
--- /dev/null
+++ b/patches.suse/dm-verity-don-t-perform-FEC-for-failed-readahead-IO-0193.patch
@@ -0,0 +1,88 @@
+From 0193e3966ceeeef69e235975918b287ab093082b Mon Sep 17 00:00:00 2001
+From: Wu Bo
+Date: Tue, 21 Nov 2023 20:51:50 -0700
+Subject: [PATCH] dm verity: don't perform FEC for failed readahead IO
+Git-commit: 0193e3966ceeeef69e235975918b287ab093082b
+Patch-mainline: v6.7-rc4
+References: git-fixes
+
+We found an issue under Android OTA scenario that many BIOs have to do
+FEC where the data under dm-verity is 100% complete and no corruption.
+
+Android OTA has many dm-block layers, from upper to lower:
+dm-verity
+dm-snapshot
+dm-origin & dm-cow
+dm-linear
+ufs
+
+DM tables have to change 2 times during Android OTA merging process.
+When doing table change, the dm-snapshot will be suspended for a while.
+During this interval, many readahead IOs are submitted to dm_verity
+from filesystem. Then the kverity works are busy doing FEC process
+which cost too much time to finish dm-verity IO. This causes needless
+delay which feels like system is hung.
+
+After adding debugging it was found that each readahead IO needed
+around 10s to finish when this situation occurred. This is due to IO
+Amplification:
+
+dm-snapshot suspend
+erofs_readahead     // 300+ io is submitted
+	dm_submit_bio (dm_verity)
+		dm_submit_bio (dm_snapshot)
+		bio return EIO
+		bio got nothing, it's empty
+	verity_end_io
+	verity_verify_io
+	forloop range(0, io->n_blocks)    // each io->nblocks ~= 20
+		verity_fec_decode
+		fec_decode_rsb
+		fec_read_bufs
+		forloop range(0, v->fec->rsn)  // v->fec->rsn = 253
+			new_read
+			submit_bio (dm_snapshot)
+		end loop
+	end loop
+dm-snapshot resume
+
+Readahead BIOs get nothing while dm-snapshot is suspended, so all of
+them will cause verity's FEC.
+Each readahead BIO needs to verify ~20 (io->nblocks) blocks.
+Each block needs to do FEC, and every block needs to do 253
+(v->fec->rsn) reads.
+So during the suspend interval(~200ms), 300 readahead BIOs trigger
+~1518000 (300*20*253) IOs to dm-snapshot.
+
+As readahead IO is not required by userspace, and to fix this issue,
+it is best to pass readahead errors to upper layer to handle it.
+
+Cc: stable@vger.kernel.org
+Fixes: a739ff3f543a ("dm verity: add support for forward error correction")
+Signed-off-by: Wu Bo
+Reviewed-by: Mikulas Patocka
+Signed-off-by: Mike Snitzer
+Signed-off-by: Coly Li
+
+---
+ drivers/md/dm-verity-target.c | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
+index beec14b6b044..14e58ae70521 100644
+--- a/drivers/md/dm-verity-target.c
++++ b/drivers/md/dm-verity-target.c
+@@ -667,7 +667,9 @@ static void verity_end_io(struct bio *bio)
+ 	struct dm_verity_io *io = bio->bi_private;
+ 
+ 	if (bio->bi_status &&
+-	    (!verity_fec_is_enabled(io->v) || verity_is_system_shutting_down())) {
++	    (!verity_fec_is_enabled(io->v) ||
++	     verity_is_system_shutting_down() ||
++	     (bio->bi_opf & REQ_RAHEAD))) {
+ 		verity_finish_io(io, bio->bi_status);
+ 		return;
+ 	}
+-- 
+2.35.3
+

diff --git a/series.conf b/series.conf
index 79c4f81..06e2b03 100644
--- a/series.conf
+++ b/series.conf
@@ -44898,6 +44898,7 @@
 	patches.suse/wifi-cfg80211-lock-wiphy-mutex-for-rfkill-poll.patch
 	patches.suse/uapi-propagate-__struct_group-attributes-to-the-cont.patch
 	patches.suse/dm-verity-initialize-fec-io-before-freeing-it-7be0.patch
+	patches.suse/dm-verity-don-t-perform-FEC-for-failed-readahead-IO-0193.patch
 	patches.suse/nvme-core-check-for-too-small-lba-shift.patch
 	patches.suse/drm-i915-Call-intel_pre_plane_updates-also-for-pipes.patch
 	patches.suse/drm-amd-display-Include-udelay-when-waiting-for-INBO.patch
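
Note on the change above: the amplification figure quoted in the commit message and the patched condition in verity_end_io() can be modelled in a few lines of standalone C. This is only an illustrative userspace sketch, not kernel code. The names sketch_skip_fec, sketch_fec_reads and SKETCH_REQ_RAHEAD are hypothetical; the real REQ_RAHEAD flag is defined in the kernel's block layer headers, and the bit value used here is a placeholder. The inputs 300, 20 and 253 are the approximate figures from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's REQ_RAHEAD flag. The bit chosen
 * here is a placeholder, not the kernel's actual value. */
#define SKETCH_REQ_RAHEAD (1u << 0)

/* Models the patched condition in verity_end_io(): returns true when a
 * failed bio should be completed with its error right away instead of
 * being handed to the FEC path. */
static bool sketch_skip_fec(int bi_status, bool fec_enabled,
                            bool shutting_down, uint32_t bi_opf)
{
    return bi_status != 0 &&
           (!fec_enabled || shutting_down ||
            (bi_opf & SKETCH_REQ_RAHEAD) != 0);
}

/* Reads sent to the lower device if every block of every failed bio goes
 * through FEC: bios * blocks per bio * reads per block (v->fec->rsn). */
static uint64_t sketch_fec_reads(uint64_t bios, uint64_t blocks_per_bio,
                                 uint64_t rsn)
{
    return bios * blocks_per_bio * rsn;
}
```

With the commit message's figures, sketch_fec_reads(300, 20, 253) gives 1518000. With FEC enabled and no shutdown in progress, sketch_skip_fec() returns true for a failed bio that carries the readahead flag and false for a failed bio without it, which is the behaviour the three added lines introduce.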