Anthony Iliopoulos 1cb8cc
From 869ae85dae64b5540e4362d7fe4cd520e10ec05c Mon Sep 17 00:00:00 2001
Anthony Iliopoulos 1cb8cc
From: Brian Foster <bfoster@redhat.com>
Anthony Iliopoulos 1cb8cc
Date: Thu, 29 Oct 2020 14:30:48 -0700
Anthony Iliopoulos 1cb8cc
Subject: [PATCH] xfs: flush new eof page on truncate to avoid post-eof
Anthony Iliopoulos 1cb8cc
 corruption
Anthony Iliopoulos 1cb8cc
Git-commit: 869ae85dae64b5540e4362d7fe4cd520e10ec05c
Anthony Iliopoulos 1cb8cc
Patch-mainline: v5.10-rc3
Anthony Iliopoulos 1cb8cc
References: git-fixes
Anthony Iliopoulos 1cb8cc
Anthony Iliopoulos 1cb8cc
It is possible to expose non-zeroed post-EOF data in XFS if the new
Anthony Iliopoulos 1cb8cc
EOF page is dirty, backed by an unwritten block and the truncate
Anthony Iliopoulos 1cb8cc
happens to race with writeback. iomap_truncate_page() will not zero
Anthony Iliopoulos 1cb8cc
the post-EOF portion of the page if the underlying block is
Anthony Iliopoulos 1cb8cc
unwritten. The subsequent call to truncate_setsize() will, but
Anthony Iliopoulos 1cb8cc
doesn't dirty the page. Therefore, if writeback happens to complete
Anthony Iliopoulos 1cb8cc
after iomap_truncate_page() (so it still sees the unwritten block)
Anthony Iliopoulos 1cb8cc
but before truncate_setsize(), the cached page becomes inconsistent
Anthony Iliopoulos 1cb8cc
with the on-disk block. A mapped read after the associated page is
Anthony Iliopoulos 1cb8cc
reclaimed or invalidated exposes non-zero post-EOF data.
Anthony Iliopoulos 1cb8cc
Anthony Iliopoulos 1cb8cc
For example, consider the following sequence when run on a kernel
Anthony Iliopoulos 1cb8cc
modified to explicitly flush the new EOF page within the race
Anthony Iliopoulos 1cb8cc
Window: 
Anthony Iliopoulos 1cb8cc
Anthony Iliopoulos 1cb8cc
$ xfs_io -fc "falloc 0 4k" -c fsync /mnt/file
Anthony Iliopoulos 1cb8cc
$ xfs_io -c "pwrite 0 4k" -c "truncate 1k" /mnt/file
Anthony Iliopoulos 1cb8cc
  ...
Anthony Iliopoulos 1cb8cc
$ xfs_io -c "mmap 0 4k" -c "mread -v 1k 8" /mnt/file
Anthony Iliopoulos 1cb8cc
00000400: 00 00 00 00 00 00 00 00  ........
Anthony Iliopoulos 1cb8cc
$ umount /mnt/; mount <dev> /mnt/
Anthony Iliopoulos 1cb8cc
$ xfs_io -c "mmap 0 4k" -c "mread -v 1k 8" /mnt/file
Anthony Iliopoulos 1cb8cc
00000400: cd cd cd cd cd cd cd cd  ........
Anthony Iliopoulos 1cb8cc
Anthony Iliopoulos 1cb8cc
Update xfs_setattr_size() to explicitly flush the new EOF page prior
Anthony Iliopoulos 1cb8cc
to the page truncate to ensure iomap has the latest state of the
Anthony Iliopoulos 1cb8cc
underlying block.
Anthony Iliopoulos 1cb8cc
Anthony Iliopoulos 1cb8cc
Fixes: 68a9f5e7007c ("xfs: implement iomap based buffered write path")
Anthony Iliopoulos 1cb8cc
Signed-off-by: Brian Foster <bfoster@redhat.com>
Anthony Iliopoulos 1cb8cc
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Anthony Iliopoulos 1cb8cc
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Anthony Iliopoulos 1cb8cc
Acked-by: Anthony Iliopoulos <ailiop@suse.com>
Anthony Iliopoulos 1cb8cc
Anthony Iliopoulos 1cb8cc
---
Anthony Iliopoulos 1cb8cc
 fs/xfs/xfs_iops.c |   10 ++++++++++
Anthony Iliopoulos 1cb8cc
 1 file changed, 10 insertions(+)
Anthony Iliopoulos 1cb8cc
Anthony Iliopoulos 1cb8cc
--- a/fs/xfs/xfs_iops.c
Anthony Iliopoulos 1cb8cc
+++ b/fs/xfs/xfs_iops.c
Anthony Iliopoulos 1cb8cc
@@ -886,6 +886,16 @@
Anthony Iliopoulos 1cb8cc
 		error = iomap_zero_range(inode, oldsize, newsize - oldsize,
Anthony Iliopoulos 1cb8cc
 				&did_zeroing, &xfs_iomap_ops);
Anthony Iliopoulos 1cb8cc
 	} else {
Anthony Iliopoulos 1cb8cc
+		/*
Anthony Iliopoulos 1cb8cc
+		 * iomap won't detect a dirty page over an unwritten block (or a
Anthony Iliopoulos 1cb8cc
+		 * cow block over a hole) and subsequently skips zeroing the
Anthony Iliopoulos 1cb8cc
+		 * newly post-EOF portion of the page. Flush the new EOF to
Anthony Iliopoulos 1cb8cc
+		 * convert the block before the pagecache truncate.
Anthony Iliopoulos 1cb8cc
+		 */
Anthony Iliopoulos 1cb8cc
+		error = filemap_write_and_wait_range(inode->i_mapping, newsize,
Anthony Iliopoulos 1cb8cc
+						     newsize);
Anthony Iliopoulos 1cb8cc
+		if (error)
Anthony Iliopoulos 1cb8cc
+			return error;
Anthony Iliopoulos 1cb8cc
 		error = iomap_truncate_page(inode, newsize, &did_zeroing,
Anthony Iliopoulos 1cb8cc
 				&xfs_iomap_ops);
Anthony Iliopoulos 1cb8cc
 	}