|
Anthony Iliopoulos |
1cb8cc |
From 869ae85dae64b5540e4362d7fe4cd520e10ec05c Mon Sep 17 00:00:00 2001
|
|
Anthony Iliopoulos |
1cb8cc |
From: Brian Foster <bfoster@redhat.com>
|
|
Anthony Iliopoulos |
1cb8cc |
Date: Thu, 29 Oct 2020 14:30:48 -0700
|
|
Anthony Iliopoulos |
1cb8cc |
Subject: [PATCH] xfs: flush new eof page on truncate to avoid post-eof
|
|
Anthony Iliopoulos |
1cb8cc |
corruption
|
|
Anthony Iliopoulos |
1cb8cc |
Git-commit: 869ae85dae64b5540e4362d7fe4cd520e10ec05c
|
|
Anthony Iliopoulos |
1cb8cc |
Patch-mainline: v5.10-rc3
|
|
Anthony Iliopoulos |
1cb8cc |
References: git-fixes
|
|
Anthony Iliopoulos |
1cb8cc |
|
|
Anthony Iliopoulos |
1cb8cc |
It is possible to expose non-zeroed post-EOF data in XFS if the new
|
|
Anthony Iliopoulos |
1cb8cc |
EOF page is dirty, backed by an unwritten block and the truncate
|
|
Anthony Iliopoulos |
1cb8cc |
happens to race with writeback. iomap_truncate_page() will not zero
|
|
Anthony Iliopoulos |
1cb8cc |
the post-EOF portion of the page if the underlying block is
|
|
Anthony Iliopoulos |
1cb8cc |
unwritten. The subsequent call to truncate_setsize() will, but
|
|
Anthony Iliopoulos |
1cb8cc |
doesn't dirty the page. Therefore, if writeback happens to complete
|
|
Anthony Iliopoulos |
1cb8cc |
after iomap_truncate_page() (so it still sees the unwritten block)
|
|
Anthony Iliopoulos |
1cb8cc |
but before truncate_setsize(), the cached page becomes inconsistent
|
|
Anthony Iliopoulos |
1cb8cc |
with the on-disk block. A mapped read after the associated page is
|
|
Anthony Iliopoulos |
1cb8cc |
reclaimed or invalidated exposes non-zero post-EOF data.
|
|
Anthony Iliopoulos |
1cb8cc |
|
|
Anthony Iliopoulos |
1cb8cc |
For example, consider the following sequence when run on a kernel
|
|
Anthony Iliopoulos |
1cb8cc |
modified to explicitly flush the new EOF page within the race
|
|
Anthony Iliopoulos |
1cb8cc |
Window:
|
|
Anthony Iliopoulos |
1cb8cc |
|
|
Anthony Iliopoulos |
1cb8cc |
$ xfs_io -fc "falloc 0 4k" -c fsync /mnt/file
|
|
Anthony Iliopoulos |
1cb8cc |
$ xfs_io -c "pwrite 0 4k" -c "truncate 1k" /mnt/file
|
|
Anthony Iliopoulos |
1cb8cc |
...
|
|
Anthony Iliopoulos |
1cb8cc |
$ xfs_io -c "mmap 0 4k" -c "mread -v 1k 8" /mnt/file
|
|
Anthony Iliopoulos |
1cb8cc |
00000400: 00 00 00 00 00 00 00 00 ........
|
|
Anthony Iliopoulos |
1cb8cc |
$ umount /mnt/; mount <dev> /mnt/
|
|
Anthony Iliopoulos |
1cb8cc |
$ xfs_io -c "mmap 0 4k" -c "mread -v 1k 8" /mnt/file
|
|
Anthony Iliopoulos |
1cb8cc |
00000400: cd cd cd cd cd cd cd cd ........
|
|
Anthony Iliopoulos |
1cb8cc |
|
|
Anthony Iliopoulos |
1cb8cc |
Update xfs_setattr_size() to explicitly flush the new EOF page prior
|
|
Anthony Iliopoulos |
1cb8cc |
to the page truncate to ensure iomap has the latest state of the
|
|
Anthony Iliopoulos |
1cb8cc |
underlying block.
|
|
Anthony Iliopoulos |
1cb8cc |
|
|
Anthony Iliopoulos |
1cb8cc |
Fixes: 68a9f5e7007c ("xfs: implement iomap based buffered write path")
|
|
Anthony Iliopoulos |
1cb8cc |
Signed-off-by: Brian Foster <bfoster@redhat.com>
|
|
Anthony Iliopoulos |
1cb8cc |
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Anthony Iliopoulos |
1cb8cc |
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Anthony Iliopoulos |
1cb8cc |
Acked-by: Anthony Iliopoulos <ailiop@suse.com>
|
|
Anthony Iliopoulos |
1cb8cc |
|
|
Anthony Iliopoulos |
1cb8cc |
---
|
|
Anthony Iliopoulos |
1cb8cc |
fs/xfs/xfs_iops.c | 10 ++++++++++
|
|
Anthony Iliopoulos |
1cb8cc |
1 file changed, 10 insertions(+)
|
|
Anthony Iliopoulos |
1cb8cc |
|
|
Anthony Iliopoulos |
1cb8cc |
--- a/fs/xfs/xfs_iops.c
|
|
Anthony Iliopoulos |
1cb8cc |
+++ b/fs/xfs/xfs_iops.c
|
|
Anthony Iliopoulos |
1cb8cc |
@@ -886,6 +886,16 @@
|
|
Anthony Iliopoulos |
1cb8cc |
error = iomap_zero_range(inode, oldsize, newsize - oldsize,
|
|
Anthony Iliopoulos |
1cb8cc |
&did_zeroing, &xfs_iomap_ops);
|
|
Anthony Iliopoulos |
1cb8cc |
} else {
|
|
Anthony Iliopoulos |
1cb8cc |
+ /*
|
|
Anthony Iliopoulos |
1cb8cc |
+ * iomap won't detect a dirty page over an unwritten block (or a
|
|
Anthony Iliopoulos |
1cb8cc |
+ * cow block over a hole) and subsequently skips zeroing the
|
|
Anthony Iliopoulos |
1cb8cc |
+ * newly post-EOF portion of the page. Flush the new EOF to
|
|
Anthony Iliopoulos |
1cb8cc |
+ * convert the block before the pagecache truncate.
|
|
Anthony Iliopoulos |
1cb8cc |
+ */
|
|
Anthony Iliopoulos |
1cb8cc |
+ error = filemap_write_and_wait_range(inode->i_mapping, newsize,
|
|
Anthony Iliopoulos |
1cb8cc |
+ newsize);
|
|
Anthony Iliopoulos |
1cb8cc |
+ if (error)
|
|
Anthony Iliopoulos |
1cb8cc |
+ return error;
|
|
Anthony Iliopoulos |
1cb8cc |
error = iomap_truncate_page(inode, newsize, &did_zeroing,
|
|
Anthony Iliopoulos |
1cb8cc |
&xfs_iomap_ops);
|
|
Anthony Iliopoulos |
1cb8cc |
}
|