Skip to content
Snippets Groups Projects
user avatar
Vladimir Sementsov-Ogievskiy authored
Current scheme of image fleecing looks like this:

[guest]                    [NBD export]
  |                              |
  |root                          | root
  v                              v
[copy-before-write] -----> [temp.qcow2]
  |                 target  |
  |file                     |backing
  v                         |
[active disk] <-------------+

 - On guest writes copy-before-write filter copies old data from active
   disk to temp.qcow2. So fleecing client (NBD export) when reads
   changed regions from temp.qcow2 image and unchanged from active disk
   through backing link.

This patch makes possible new image fleecing scheme:

[guest]                   [NBD export]
   |                            |
   | root                       | root
   v                 file       v
[copy-before-write]<------[snapshot-access]
   |           |
   | file      | target
   v           v
[active-disk] [temp.img]

 - copy-before-write does CBW operations and also provides
   snapshot-access API. The API may be accessed through
   snapshot-access driver.

Benefits of new scheme:

1. Access control: if remote client try to read data that not covered
   by original dirty bitmap used on copy-before-write open, client gets
   -EACCES.

2. Discard support: if remote client do DISCARD, this additionally to
   discarding data in temp.img informs block-copy process to not copy
   these clusters. Next read from discarded area will return -EACCES.
   This is significant thing: when fleecing user reads data that was
   not yet copied to temp.img, we can avoid copying it on further guest
   write.

3. Synchronisation between client reads and block-copy write is more
   efficient. In old scheme we just rely on BDRV_REQ_SERIALISING flag
   used for writes to temp.qcow2. New scheme is less blocking:
     - fleecing reads are never blocked: if data region is untouched or
       in-flight, we just read from active-disk, otherwise we read from
       temp.img
     - writes to temp.img are not blocked by fleecing reads
     - still, guest writes of-course are blocked by in-flight fleecing
       reads, that currently read from active-disk - it's the minimum
       necessary blocking

4. Temporary image may be of any format, as we don't rely on backing
   feature.

5. Permission relation are simplified. With old scheme we have to share
   write permission on target child of copy-before-write, otherwise
   backing link conflicts with copy-before-write file child write
   permissions. With new scheme we don't have backing link, and
   copy-before-write node may have unshared access to temporary node.
   (Not realized in this commit, will be in future).

6. Having control on fleecing reads we'll be able to implement
   alternative behavior on failed copy-before-write operations.
   Currently we just break guest request (that's a historical behavior
   of backup). But in some scenarios it's a bad behavior: better
   is to drop the backup as failed but don't break guest request.
   With new scheme we can simply unset some bits in a bitmap on CBW
   failure and further fleecing reads will -EACCES, or something like
   this. (Not implemented in this commit, will be in future)
   Additional application for this is implementing timeout for CBW
   operations.

Iotest 257 output is updated, as two more bitmaps now live in
copy-before-write filter.

Signed-off-by: default avatarVladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20220303194349.2304213-13-vsementsov@virtuozzo.com>
Signed-off-by: default avatarHanna Reitz <hreitz@redhat.com>
af5bcd77
History

QEMU README

QEMU is a generic and open source machine & userspace emulator and virtualizer.

QEMU is capable of emulating a complete machine in software without any need for hardware virtualization support. By using dynamic translation, it achieves very good performance. QEMU can also integrate with the Xen and KVM hypervisors to provide emulated hardware while allowing the hypervisor to manage the CPU. With hypervisor support, QEMU can achieve near native performance for CPUs. When QEMU emulates CPUs directly it is capable of running operating systems made for one machine (e.g. an ARMv7 board) on a different machine (e.g. an x86_64 PC board).

QEMU is also capable of providing userspace API virtualization for Linux and BSD kernel interfaces. This allows binaries compiled against one architecture ABI (e.g. the Linux PPC64 ABI) to be run on a host using a different architecture ABI (e.g. the Linux x86_64 ABI). This does not involve any hardware emulation, simply CPU and syscall emulation.

QEMU aims to fit into a variety of use cases. It can be invoked directly by users wishing to have full control over its behaviour and settings. It also aims to facilitate integration into higher level management layers, by providing a stable command line interface and monitor API. It is commonly invoked indirectly via the libvirt library when using open source applications such as oVirt, OpenStack and virt-manager.

QEMU as a whole is released under the GNU General Public License, version 2. For full licensing details, consult the LICENSE file.

Documentation

Documentation can be found hosted online at https://www.qemu.org/documentation/. The documentation for the current development version that is available at https://www.qemu.org/docs/master/ is generated from the docs/ folder in the source tree, and is built by Sphinx <https://www.sphinx-doc.org/en/master/>_.

Building

QEMU is multi-platform software intended to be buildable on all modern Linux platforms, OS-X, Win32 (via the Mingw64 toolchain) and a variety of other UNIX targets. The simple steps to build QEMU are:

mkdir build
cd build
../configure
make

Additional information can also be found online via the QEMU website:

Submitting patches

The QEMU source code is maintained under the GIT version control system.

git clone https://gitlab.com/qemu-project/qemu.git

When submitting patches, one common approach is to use 'git format-patch' and/or 'git send-email' to format & send the mail to the qemu-devel@nongnu.org mailing list. All patches submitted must contain a 'Signed-off-by' line from the author. Patches should follow the guidelines set out in the style section <https://www.qemu.org/docs/master/devel/style.html> of the Developers Guide.

Additional information on submitting patches can be found online via the QEMU website

The QEMU website is also maintained under source control.

git clone https://gitlab.com/qemu-project/qemu-web.git

A 'git-publish' utility was created to make above process less cumbersome, and is highly recommended for making regular contributions, or even just for sending consecutive patch series revisions. It also requires a working 'git send-email' setup, and by default doesn't automate everything, so you may want to go through the above steps manually for once.

For installation instructions, please go to

The workflow with 'git-publish' is:

$ git checkout master -b my-feature
$ # work on new commits, add your 'Signed-off-by' lines to each
$ git publish

Your patch series will be sent and tagged as my-feature-v1 if you need to refer back to it in the future.

Sending v2:

$ git checkout my-feature # same topic branch
$ # making changes to the commits (using 'git rebase', for example)
$ git publish

Your patch series will be sent with 'v2' tag in the subject and the git tip will be tagged as my-feature-v2.

Bug reporting

The QEMU project uses GitLab issues to track bugs. Bugs found when running code built from QEMU git or upstream released sources should be reported via:

If using QEMU via an operating system vendor pre-built binary package, it is preferable to report bugs to the vendor's own bug tracker first. If the bug is also known to affect latest upstream code, it can also be reported via GitLab.

For additional information on bug reporting consult:

ChangeLog

For version history and release notes, please visit https://wiki.qemu.org/ChangeLog/ or look at the git history for more detailed information.

Contact

The QEMU community can be contacted in a number of ways, with the two main methods being email and IRC

Information on additional methods of contacting the community can be found online via the QEMU website: