List: linux-btrfs
Subject: [PATCH V20 00/19] Allow I/O on blocks whose size is less than page size
From: Chandan Rajendra <chandan () linux ! vnet ! ibm ! com>
Date: 2016-07-04 4:46:20
Message-ID: 1467606879-14181-1-git-send-email-chandan () linux ! vnet ! ibm ! com
Btrfs assumes block size to be the same as the machine's page
size. This would mean that a Btrfs instance created on a 4k page size
machine (e.g. x86) will not be mountable on machines with larger page
sizes (e.g. PPC64/AARCH64). This patchset aims to resolve this
incompatibility.
This patchset continues with the work posted previously at
http://thread.gmane.org/gmane.comp.file-systems.btrfs/57282
I have reverted the upstream commit "btrfs: fix lockups from
btrfs_clear_path_blocking" (f82c458a2c3ffb94b431fc6ad791a79df1b3713e)
since this led to soft-lockups when the patch "Btrfs:
subpagesize-blocksize: Prevent writes to an extent buffer when
PG_writeback flag is set" is applied. During the Btrfs meetup at the
2015 Vault Conference, Chris Mason suggested that he would write up a
suitable locking function to be used when writing dirty pages that map
metadata blocks. Until such a locking function is available, this
patchset temporarily reverts commit
f82c458a2c3ffb94b431fc6ad791a79df1b3713e.
The commits for the Btrfs kernel module can be found at
https://github.com/chandanr/linux/tree/btrfs/subpagesize-blocksize.
To create a filesystem with block size < page size, a patched version
of the Btrfs-progs package is required. The corresponding fixes for
Btrfs-progs can be found at
https://github.com/chandanr/btrfs-progs/tree/btrfs/subpagesize-blocksize.
The patchset is based on the kdave/for-next branch. I have cherry-picked
the following fix from Chris Mason's git tree:
1. Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes
Fstests run status:
1. x86_64
   - With 4k sectorsize, all the tests that succeed with the for-next
     branch at git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git
     also succeed with the patches applied.
   - With 2k sectorsize, generic/027 never seems to complete. In my
     case, the test did not complete even after 45 minutes of run time.
2. ppc64
   - With 4k sectorsize, 16k nodesize and the "nospace_cache" mount
     option, all the tests that succeed with the for-next branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git
     also succeed with the patches applied, except for the scrub and
     compression tests.
   - With 64k sectorsize and nodesize, all the tests that succeed with
     the for-next branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git
     also succeed with the patches applied.
TODO:
1. On ppc64, btrfsck segfaults when checking a filesystem instance
having 2k sectorsize.
2. I am planning to fix scrub & compression via a separate patchset.
Changes from V19:
1. The patchset has been rebased on top of kdave/for-next branch.
2. The patch "Btrfs: subpage-blocksize: extent_clear_unlock_delalloc:
Prevent page from being unlocked more than once" changes the
signatures of the functions "cow_file_range" &
"extent_clear_unlock_delalloc". This patch has now been moved to be
the first patch in the patchset.
3. A new patch "Btrfs: subpage-blocksize: Rate limit scrub error
message" has been added. btrfs/073 invokes the scrub ioctl in a
tight loop. In the subpage-blocksize scenario this results in a lot of
"scrub: size assumption sectorsize != PAGE_SIZE" messages being
printed on the console. Hence this patch rate limits such error
messages.
Changes from V18:
1. The per-page bitmap used to track the block status is now allocated
from a slab cache.
2. The per-page bitmap is allocated and used only in cases where
sectorsize < PAGE_SIZE.
3. The new patch "Btrfs: subpage-blocksize: Disable compression"
disables compression in subpage-blocksize scenario.
Changes from V17:
1. Due to mistakes made during git rebase operations, fixes ended up
in incorrect patches. This patchset gets the fixes in the right
patches.
Changes from V16:
1. The V15 patchset consisted of patches obtained from an incorrect
git branch. Apologies for the mistake. All the entries listed under
"Changes from V15" hold good for V16.
Changes from V15:
1. The invocation of cleancache_get_page() in __do_readpage() assumed
blocksize to be same as PAGE_SIZE. We now invoke cleancache_get_page()
only if blocksize is same as PAGE_SIZE. Thanks to David Sterba for
pointing this out.
2. In __extent_writepage_io() we used to accumulate all the contiguous
dirty blocks within the page before submitting the file offset range
for I/O. In some cases this caused the bio to span more than
one stripe. For example, with a 4K block size, a 64K stripe size
and a 64K page size, assume
- All the blocks mapped by the page are contiguous on the logical
address space.
- The first block of the page is mapped to the second block of the
stripe.
In such a scenario, we would add all the blocks of the page to
bio. This would mean that we would overflow the stripe by one 4K
block. Hence this patchset removes the optimization and invokes
submit_extent_page() for every dirty 4K block.
3. The following patches are newly added:
- Btrfs: subpage-blocksize: __btrfs_lookup_bio_sums: Set offset
when moving to a new bio_vec
- Btrfs: subpage-blocksize: Make file extent relocate code subpage
blocksize aware
- Btrfs: btrfs_clone: Flush dirty blocks of a page that do not map
the clone range
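
The stripe overflow described in item 2 above can be checked with a
little arithmetic. The following is only a userspace sketch; the helper
stripe_overflow() is a hypothetical name used for illustration and is
not part of the patchset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patchset): number of bytes by which
 * a contiguous I/O range that starts at logical address `start` and is
 * `len` bytes long runs past the end of the stripe containing `start`.
 * Returns 0 when the range fits inside that stripe.
 */
static uint64_t stripe_overflow(uint64_t start, uint64_t len,
				uint64_t stripe_len)
{
	assert(stripe_len > 0);

	uint64_t offset_in_stripe = start % stripe_len;
	uint64_t room = stripe_len - offset_in_stripe;

	return len > room ? len - room : 0;
}
```

With a 64K stripe, a 64K page whose first block lands on the second 4K
block of the stripe gives stripe_overflow(4096, 65536, 65536) == 4096,
i.e. the one-block overflow mentioned above. Submitting each 4K block
on its own never overflows, since every block starts on a 4K boundary
and the stripe size is a multiple of 4K.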
Changes from V14:
1. Fix usage of cleancache_get_page() in __do_readpage().
In filesystems which support the subpage-blocksize scenario, a page can
map one or more blocks. Hence cleancache_get_page() should be
invoked only when the page maps a non-hole extent and the block size
being used is equal to the page size. Thanks to David Sterba for
pointing this out.
2. Replace page_read_complete() and page_write_complete() functions
with page_io_complete().
3. Provide more documentation (as part of both commit message and code
comments) about the usage of the per-page
btrfs_page_private->io_lock.
Changes from V13:
1. Enable dedup ioctl to work in subpagesize-blocksize scenario.
Changes from V12:
1. The logic in the function btrfs_punch_hole() has been fixed to
check for the presence of BLK_STATE_UPTODATE flags for blocks in
pages which partially map the file range being punched.
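
To illustrate the kind of per-block check involved, here is a minimal
userspace sketch of a per-page "uptodate" bitmap. It assumes a 64K page
and at most 64 blocks per page; the structure and function names are
hypothetical and do not reflect the actual layout of btrfs_page_private
or the BLK_STATE_UPTODATE handling in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 65536u	/* assumed 64K page, e.g. ppc64 */

/* Hypothetical per-page state: one "uptodate" bit per block. */
struct page_block_state {
	uint32_t blocksize;	/* divides SKETCH_PAGE_SIZE, >= 1024 */
	uint64_t uptodate;	/* bit i set => block i is uptodate */
};

/* Mark the block containing byte `offset` of the page as uptodate. */
static void set_block_uptodate(struct page_block_state *ps, uint32_t offset)
{
	assert(offset < SKETCH_PAGE_SIZE);
	ps->uptodate |= UINT64_C(1) << (offset / ps->blocksize);
}

/*
 * Return true if every block overlapping the byte range
 * [offset, offset + len) of the page is uptodate.
 */
static bool range_uptodate(const struct page_block_state *ps,
			   uint32_t offset, uint32_t len)
{
	assert(len > 0 && offset + len <= SKETCH_PAGE_SIZE);

	uint32_t first = offset / ps->blocksize;
	uint32_t last = (offset + len - 1) / ps->blocksize;

	for (uint32_t i = first; i <= last; i++)
		if (!(ps->uptodate & (UINT64_C(1) << i)))
			return false;
	return true;
}
```

A caller that handles a page only partially covered by the punched
range would ask range_uptodate() about just the blocks it touches,
instead of relying on a single page-wide uptodate flag.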
Changes from V11:
1. Addressed the review comments provided by Liu Bo for version V11.
2. Fixed file defragmentation code to work in subpagesize-blocksize
scenario.
3. Many "hard to reproduce" bugs were fixed.
Chandan Rajendra (19):
Btrfs: subpage-blocksize: Fix whole page read.
Btrfs: subpage-blocksize: Fix whole page write
Btrfs: subpage-blocksize: Make sure delalloc range intersects with the
locked page's range
Btrfs: subpage-blocksize: Define extent_buffer_head
Btrfs: subpage-blocksize: Read tree blocks whose size is < PAGE_SIZE
Btrfs: subpage-blocksize: Write only dirty extent buffers belonging to
a page
Btrfs: subpage-blocksize: Allow mounting filesystems where sectorsize
< PAGE_SIZE
Btrfs: subpage-blocksize: Deal with partial ordered extent
allocations.
Btrfs: subpage-blocksize: Explicitly track I/O status of blocks of an
ordered extent.
Btrfs: subpage-blocksize: btrfs_punch_hole: Fix uptodate blocks check
Btrfs: subpage-blocksize: Prevent writes to an extent buffer when
PG_writeback flag is set
Revert "btrfs: fix lockups from btrfs_clear_path_blocking"
Btrfs: subpage-blocksize: Fix file defragmentation code
Btrfs: subpage-blocksize: Enable dedupe ioctl
Btrfs: subpage-blocksize: btrfs_clone: Flush dirty blocks of a page
that do not map the clone range
Btrfs: subpage-blocksize: Make file extent relocate code subpage
blocksize aware
Btrfs: subpage-blocksize: __btrfs_lookup_bio_sums: Set offset when
moving to a new bio_vec
Btrfs: subpage-blocksize: Disable compression
Btrfs: subpage-blocksize: Rate limit scrub error message
fs/btrfs/ctree.c | 36 +-
fs/btrfs/ctree.h | 6 +-
fs/btrfs/disk-io.c | 167 ++--
fs/btrfs/disk-io.h | 5 +-
fs/btrfs/extent-tree.c | 20 +-
fs/btrfs/extent_io.c | 1687 +++++++++++++++++++++++---------
fs/btrfs/extent_io.h | 147 ++-
fs/btrfs/file-item.c | 7 +-
fs/btrfs/file.c | 106 +-
fs/btrfs/inode.c | 404 ++++++--
fs/btrfs/ioctl.c | 232 +++--
fs/btrfs/locking.c | 24 +-
fs/btrfs/locking.h | 2 -
fs/btrfs/ordered-data.c | 19 +
fs/btrfs/ordered-data.h | 4 +
fs/btrfs/relocation.c | 86 +-
fs/btrfs/root-tree.c | 2 +-
fs/btrfs/scrub.c | 2 +-
fs/btrfs/super.c | 29 +-
fs/btrfs/tests/btrfs-tests.c | 12 +-
fs/btrfs/tests/extent-io-tests.c | 5 +-
fs/btrfs/tests/free-space-tree-tests.c | 79 +-
fs/btrfs/tree-log.c | 2 +-
fs/btrfs/volumes.c | 12 +-
include/trace/events/btrfs.h | 2 +-
25 files changed, 2227 insertions(+), 870 deletions(-)
--
2.5.5