Skip to content
Snippets Groups Projects
user avatar
Maciej S. Szmigiero authored
This driver is like virtio-balloon on steroids: it allows both changing the
guest memory allocation via ballooning and (in the next patch) inserting
pieces of extra RAM into it on demand from a provided memory backend.

The actual resizing is done via ballooning interface (for example, via
the "balloon" HMP command).
This includes resizing the guest past its boot size - that is, hot-adding
additional memory in granularity limited only by the guest alignment
requirements, as provided by the next patch.

In contrast with ACPI DIMM hotplug where one can only request to unplug a
whole DIMM stick this driver allows removing memory from guest in single
page (4k) units via ballooning.

After a VM reboot the guest is back to its original (boot) size.

In the future, the guest boot memory size might be changed on reboot
instead, taking into account the effective size that VM had before that
reboot (much like Hyper-V does).

For performance reasons, the guest-released memory is tracked in a few
range trees, as a series of (start, count) ranges.
Each time a new page range is inserted into such tree its neighbors are
checked as candidates for possible merging with it.

Besides performance reasons, the Dynamic Memory protocol itself uses page
ranges as the data structure in its messages, so relevant pages need to be
merged into such ranges anyway.

One has to be careful when tracking the guest-released pages, since the
guest can maliciously report returning pages outside its current address
space, which later clash with the address range of newly added memory.
Similarly, the guest can report freeing the same page twice.

The above design results in much better ballooning performance than when
using virtio-balloon with the same guest: 230 GB / minute with this driver
versus 70 GB / minute with virtio-balloon.

During a ballooning operation most of time is spent waiting for the guest
to come up with newly freed page ranges, processing the received ranges on
the host side (in QEMU and KVM) is nearly instantaneous.

The unballoon operation is also pretty much instantaneous:
thanks to the merging of the ballooned out page ranges 200 GB of memory can
be returned to the guest in about 1 second.
With virtio-balloon this operation takes about 2.5 minutes.

These tests were done against a Windows Server 2019 guest running on a
Xeon E5-2699, after dirtying the whole memory inside guest before each
balloon operation.

Using a range tree instead of a bitmap to track the removed memory also
means that the solution scales well with the guest size: even a 1 TB range
takes just a few bytes of such metadata.

Since the required GTree operations aren't present in every Glib version
a check for them was added to the meson build script, together with new
"--enable-hv-balloon" and "--disable-hv-balloon" configure arguments.
If these GTree operations are missing in the system's Glib version this
driver will be skipped during QEMU build.

An optional "status-report=on" device parameter requests memory status
events from the guest (typically sent every second), which allow the host
to learn both the guest memory available and the guest memory in use
counts.

Following commits will add support for their external emission as
"HV_BALLOON_STATUS_REPORT" QMP events.

The driver is named hv-balloon since the Linux kernel client driver for
the Dynamic Memory Protocol is named as such and to follow the naming
pattern established by the virtio-balloon driver.
The whole protocol runs over Hyper-V VMBus.

The driver was tested against Windows Server 2012 R2, Windows Server 2016
and Windows Server 2019 guests and obeys the guest alignment requirements
reported to the host via DM_CAPABILITIES_REPORT message.

Acked-by: default avatarDavid Hildenbrand <david@redhat.com>
Signed-off-by: default avatarMaciej S. Szmigiero <maciej.szmigiero@oracle.com>
0d9e8c0b
History

QEMU README

QEMU is a generic and open source machine & userspace emulator and virtualizer.

QEMU is capable of emulating a complete machine in software without any need for hardware virtualization support. By using dynamic translation, it achieves very good performance. QEMU can also integrate with the Xen and KVM hypervisors to provide emulated hardware while allowing the hypervisor to manage the CPU. With hypervisor support, QEMU can achieve near native performance for CPUs. When QEMU emulates CPUs directly it is capable of running operating systems made for one machine (e.g. an ARMv7 board) on a different machine (e.g. an x86_64 PC board).

QEMU is also capable of providing userspace API virtualization for Linux and BSD kernel interfaces. This allows binaries compiled against one architecture ABI (e.g. the Linux PPC64 ABI) to be run on a host using a different architecture ABI (e.g. the Linux x86_64 ABI). This does not involve any hardware emulation, simply CPU and syscall emulation.

QEMU aims to fit into a variety of use cases. It can be invoked directly by users wishing to have full control over its behaviour and settings. It also aims to facilitate integration into higher level management layers, by providing a stable command line interface and monitor API. It is commonly invoked indirectly via the libvirt library when using open source applications such as oVirt, OpenStack and virt-manager.

QEMU as a whole is released under the GNU General Public License, version 2. For full licensing details, consult the LICENSE file.

Documentation

Documentation can be found hosted online at https://www.qemu.org/documentation/. The documentation for the current development version that is available at https://www.qemu.org/docs/master/ is generated from the docs/ folder in the source tree, and is built by Sphinx.

Building

QEMU is multi-platform software intended to be buildable on all modern Linux platforms, OS-X, Win32 (via the Mingw64 toolchain) and a variety of other UNIX targets. The simple steps to build QEMU are:

mkdir build
cd build
../configure
make

Additional information can also be found online via the QEMU website:

Submitting patches

The QEMU source code is maintained under the GIT version control system.

git clone https://gitlab.com/qemu-project/qemu.git

When submitting patches, one common approach is to use 'git format-patch' and/or 'git send-email' to format & send the mail to the qemu-devel@nongnu.org mailing list. All patches submitted must contain a 'Signed-off-by' line from the author. Patches should follow the guidelines set out in the style section of the Developers Guide.

Additional information on submitting patches can be found online via the QEMU website

The QEMU website is also maintained under source control.

git clone https://gitlab.com/qemu-project/qemu-web.git

A 'git-publish' utility was created to make above process less cumbersome, and is highly recommended for making regular contributions, or even just for sending consecutive patch series revisions. It also requires a working 'git send-email' setup, and by default doesn't automate everything, so you may want to go through the above steps manually for once.

For installation instructions, please go to

The workflow with 'git-publish' is:

$ git checkout master -b my-feature
$ # work on new commits, add your 'Signed-off-by' lines to each
$ git publish

Your patch series will be sent and tagged as my-feature-v1 if you need to refer back to it in the future.

Sending v2:

$ git checkout my-feature # same topic branch
$ # making changes to the commits (using 'git rebase', for example)
$ git publish

Your patch series will be sent with 'v2' tag in the subject and the git tip will be tagged as my-feature-v2.

Bug reporting

The QEMU project uses GitLab issues to track bugs. Bugs found when running code built from QEMU git or upstream released sources should be reported via:

If using QEMU via an operating system vendor pre-built binary package, it is preferable to report bugs to the vendor's own bug tracker first. If the bug is also known to affect latest upstream code, it can also be reported via GitLab.

For additional information on bug reporting consult:

ChangeLog

For version history and release notes, please visit https://wiki.qemu.org/ChangeLog/ or look at the git history for more detailed information.

Contact

The QEMU community can be contacted in a number of ways, with the two main methods being email and IRC

Information on additional methods of contacting the community can be found online via the QEMU website: