    block: introduce BDRV_MAX_LENGTH · 8b117001
    Vladimir Sementsov-Ogievskiy authored
    
    
    We are going to modify the block layer to work with 64-bit requests. The
    first step is moving to the int64_t type for both offset and bytes
    arguments in all block-request-related functions.
    
    It's mostly safe (when widening signed or unsigned int to int64_t), but
    switching from uint64_t is questionable.
    
    So, let's first establish the set of requests we want to work with.
    First, signed int64_t should be enough, as off_t is signed anyway. Then,
    obviously, offset + bytes should not overflow.
    
    And the most interesting part: (offset + bytes) aligned up should not
    overflow either. Aligned to what alignment? The first thing that comes
    to mind is bs->bl.request_alignment, as we align requests up to this
    alignment. But there is another thing: look at
    bdrv_mark_request_serialising(). It aligns the request up to some given
    alignment. And this parameter may be bdrv_get_cluster_size(), which is
    often a lot greater than bs->bl.request_alignment.
    Note also that bdrv_mark_request_serialising() uses signed int64_t for
    its calculations. So, actually, we already depend on some restrictions.
    
    Happily, bdrv_get_cluster_size() returns int, and
    bs->bl.request_alignment, although it has a 32-bit unsigned type, is
    defined to be a power of 2 less than INT_MAX. So we may establish that
    INT_MAX is the absolute maximum for any kind of alignment that may
    occur with a request.
    
    Note that bdrv_get_cluster_size() is not documented to return a power
    of 2, yet bdrv_mark_request_serialising() behaves as if it does.
    Also, backup uses bdi.cluster_size and is not prepared for it not being
    a power of 2.
    So, let's establish that QEMU supports only power-of-2 clusters and
    alignments.
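
To illustrate why the power-of-2 assumption matters, here is a minimal
sketch, not taken from the QEMU sources. The helper names is_pow2(),
align_up_mask() and align_up_div() are hypothetical. Mask-based rounding is
one common way code ends up silently relying on a power-of-2 alignment:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if x is a positive power of two. */
static bool is_pow2(int64_t x)
{
    return x > 0 && (x & (x - 1)) == 0;
}

/* Mask-based align-up: only correct when align is a power of two. */
static int64_t align_up_mask(int64_t n, int64_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Division-based align-up: correct for any positive align. */
static int64_t align_up_div(int64_t n, int64_t align)
{
    return (n + align - 1) / align * align;
}
```

With a power-of-2 alignment both helpers agree. With an alignment such as
24, the mask-based version returns a value that is not a multiple of the
alignment at all.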
    
    So, alignment can't be greater than 2^30, which is the largest power
    of 2 that fits in a 32-bit int.
    
    Finally, to be safe with calculations and to avoid calculating
    different maximums for different nodes (depending on cluster size and
    request_alignment), let's simply set QEMU_ALIGN_DOWN(INT64_MAX, 2^30)
    as the absolute maximum byte length for QEMU. Actually, it's not much
    less than INT64_MAX: the difference is only 2^30 - 1 bytes.
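
The arithmetic behind this limit can be sketched as follows. The macro and
function names (MAX_ALIGNMENT, ALIGN_DOWN, MAX_LENGTH, align_up_to) are
illustrative stand-ins chosen for this example and are not copied from the
QEMU headers. ALIGN_DOWN is assumed to round down by integer division:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names, modeled on the description above. */
#define MAX_ALIGNMENT    (INT64_C(1) << 30)
#define ALIGN_DOWN(n, m) ((n) / (m) * (m))
#define MAX_LENGTH       ALIGN_DOWN(INT64_MAX, MAX_ALIGNMENT)

/* The limit is itself a multiple of the largest allowed alignment. */
_Static_assert(MAX_LENGTH % MAX_ALIGNMENT == 0,
               "limit must be aligned");
/* It is exactly 2^30 - 1 below INT64_MAX. */
_Static_assert(INT64_MAX - MAX_LENGTH == MAX_ALIGNMENT - 1,
               "limit is close to INT64_MAX");

/*
 * Align n up to align. For 0 <= n <= MAX_LENGTH and
 * 0 < align <= MAX_ALIGNMENT, the intermediate n + align - 1 is at most
 * INT64_MAX, so the addition cannot overflow.
 */
static int64_t align_up_to(int64_t n, int64_t align)
{
    return (n + align - 1) / align * align;
}
```

Because the limit is already a multiple of 2^30, and therefore of every
smaller power of 2, aligning any in-range end offset up to a permitted
alignment never exceeds the limit.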
    
    OK, then, let's apply it to block/io.
    
    Let's consider all block/io entry points of offset/bytes:
    
    There are 4 bytes/offset interface functions: bdrv_co_preadv_part(),
    bdrv_co_pwritev_part(), bdrv_co_copy_range_internal() and
    bdrv_co_pdiscard(). We check them all with bdrv_check_request().
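
A check of this kind could look roughly like the sketch below. This is an
assumption-laden illustration, not the body of bdrv_check_request(): the
name check_request(), the REQ_MAX_LENGTH macro and the choice of -EIO as
the error code are all made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative limit: INT64_MAX rounded down to a multiple of 2^30. */
#define REQ_MAX_LENGTH \
    (INT64_MAX / (INT64_C(1) << 30) * (INT64_C(1) << 30))

/*
 * Hypothetical request validator. Returns 0 if the offset/bytes pair
 * lies within [0, REQ_MAX_LENGTH], or a negative errno value otherwise.
 */
static int check_request(int64_t offset, int64_t bytes)
{
    if (offset < 0 || bytes < 0) {
        return -EIO;
    }
    if (bytes > REQ_MAX_LENGTH) {
        return -EIO;
    }
    /*
     * Written as a subtraction so that offset + bytes is never
     * computed and therefore cannot overflow.
     */
    if (offset > REQ_MAX_LENGTH - bytes) {
        return -EIO;
    }
    return 0;
}
```

The order of the tests matters: bytes is bounded first, so the
subtraction in the last test stays non-negative.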
    
    We also have one entry point with only offset: bdrv_co_truncate().
    Check the offset.
    
    And one public structure: BdrvTrackedRequest. Happily, it has only
    three external users:
    
     file-posix.c: adopted by this patch
     write-threshold.c: only read fields
     test-write-threshold.c: sets obviously small constant values
    
    It would be better to make the structure private and add corresponding
    interfaces. Still, it's not obvious what kind of interface is needed
    for file-posix.c. Let's keep it public but add corresponding
    assertions.
    
    After this patch we'll convert functions in block/io.c to int64_t bytes
    and offset parameters. We can assume that an offset/bytes pair always
    satisfies the new restrictions, and make corresponding assertions where
    needed. If we reach some offset/bytes point in block/io.c that is
    missing bdrv_check_request(), it is considered a bug. Likewise, if
    block/io.c modifies an offset/bytes request, expanding it more than
    aligning up to request_alignment, it's a bug too.
    
    For all I/O requests except discard we keep, for now, the old
    restriction of a 32-bit request length.
    
    The iotest 206 output error message changed, as the test disk size is
    now larger than the new limit. Add one more test case with the new
    maximum disk size to cover the too-big-L1 case.
    
    Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    Message-Id: <20201203222713.13507-5-vsementsov@virtuozzo.com>
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>