- May 26, 2016
-
Alexey Kardashevskiy authored
Since a788f227 "memory: Allow replay of IOMMU mapping notifications", all existing IOMMU mappings are replayed when a new VFIO listener is added. However, the base address of an IOMMU memory region (IOMMU MR) is ignored. That is not a problem for the existing user (pseries) with its default 32-bit DMA window starting at 0, but it is if there is another DMA window.

This stores the IOMMU MR's offset_within_address_space and adjusts the IOVA before calling vfio_dma_map/vfio_dma_unmap. As the IOMMU notifier expects an IOVA offset rather than the absolute address, this also adjusts the IOVA in the sPAPR H_PUT_TCE handler before calling the notifier(s).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
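A minimal sketch of the idea, using stand-in types rather than QEMU's actual structures (the field and helper names below are assumptions): the listener records the region's offset_within_address_space when the region is added, and every IOVA coming out of the IOMMU notifier is shifted by that offset before the VFIO map/unmap call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in types; the real code uses QEMU's VFIOGuestIOMMU/IOMMUTLBEntry. */
typedef struct {
    uint64_t iommu_offset;      /* section->offset_within_address_space */
    void *container;
} GuestIommuSketch;

typedef struct {
    uint64_t iova;              /* offset within the IOMMU region */
    uint64_t addr_mask;
    bool mapped;
    void *host_vaddr;
} TlbEntrySketch;

/* Hypothetical wrappers standing in for vfio_dma_map()/vfio_dma_unmap(). */
extern int dma_map(void *container, uint64_t iova, uint64_t size, void *vaddr);
extern int dma_unmap(void *container, uint64_t iova, uint64_t size);

static void iommu_map_notify(GuestIommuSketch *giommu, TlbEntrySketch *tlb)
{
    /* The notifier reports an IOVA relative to the IOMMU region, so add the
     * region's base before talking to VFIO; a DMA window that does not start
     * at 0 then maps to the right place. */
    uint64_t iova = tlb->iova + giommu->iommu_offset;
    uint64_t size = tlb->addr_mask + 1;

    if (tlb->mapped) {
        dma_map(giommu->container, iova, size, tlb->host_vaddr);
    } else {
        dma_unmap(giommu->container, iova, size);
    }
}
```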
-
Alexey Kardashevskiy authored
7532d3cb "vfio: Fix 128 bit handling" added support for 64-bit IOMMU memory regions when they are added to the VFIO address space; however, the removal code cannot cope with them, as int128_get64() will fail on 1<<64. This copies the 128-bit handling from region_add() to region_del().

Since the only machine type that is actually going to use a 64-bit IOMMU is pseries, and it never really removes these regions (instead it dynamically adds/removes subregions), this should cause no behavioural change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
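A self-contained illustration of the overflow, using unsigned __int128 as a stand-in for QEMU's Int128 helpers (int128_add(), int128_get64(), and so on): for a 64-bit window starting at 0, the exclusive end address is 1 << 64 and cannot be narrowed to 64 bits, so region_del() has to carry the same 128-bit arithmetic as region_add() right up to the final conversion.

```c
#include <assert.h>
#include <stdint.h>

int main(void)
{
    unsigned __int128 start = 0;
    unsigned __int128 size  = (unsigned __int128)1 << 64;  /* 64-bit IOMMU window */
    unsigned __int128 end   = start + size;                /* exclusive end: 1 << 64 */

    /* Naively truncating the exclusive end to 64 bits loses everything... */
    assert((uint64_t)end == 0);

    /* ...whereas the inclusive end still fits, which is why the 128-bit
     * arithmetic has to be kept until the very last conversion. */
    uint64_t last = (uint64_t)(end - 1);
    assert(last == UINT64_MAX);

    return 0;
}
```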
-
Alex Williamson authored
Document the usage modes, host primary graphics considerations, usage, and fw_cfg ABI required for IGD assignment with vfio.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
-
Alex Williamson authored
The IGD OpRegion is enabled automatically when running in legacy mode, but it can sometimes be useful in universal passthrough mode as well. Without an OpRegion, output spigots don't work, and even though Intel doesn't officially support physical outputs in UPT mode, it's a useful feature. Note that if an OpRegion is enabled but a monitor is not connected, some graphics features will be disabled in the guest versus a headless system without an OpRegion, where they would work.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
-
Alex Williamson authored
Enable quirks to support SandyBridge and newer IGD devices as primary VM graphics. This requires new vfio-pci device-specific regions added in kernel v4.6 to expose the IGD OpRegion, the shadow ROM, and config space access to the PCI host bridge and LPC/ISA bridge. VM firmware support, SeaBIOS only so far, is also required for reserving memory regions for IGD-specific use.

In order to enable this mode, IGD must be assigned to the VM at PCI bus address 00:02.0, it must have a ROM, it must be able to enable VGA, it must have or be able to create on its own an LPC/ISA bridge of the proper type at PCI bus address 00:1f.0 (sorry, not compatible with Q35 yet), and it must have the above noted vfio-pci kernel features and BIOS. The intention is that to enable this mode, a user simply needs to assign 00:02.0 from the host to 00:02.0 in the VM:

  -device vfio-pci,host=0000:00:02.0,bus=pci.0,addr=02.0

and everything either happens automatically or it doesn't. In the case that it doesn't, we leave error reports, but assume the device will operate in universal passthrough mode (UPT), which doesn't require any of this but has a much narrower window of supported devices, supported use cases, and supported guest drivers.

When using IGD in this mode, the VM firmware is required to reserve some VM RAM for the OpRegion (on the order of several 4k pages) and stolen memory for the GTT (up to 8MB for the latest GPUs). An additional option, x-igd-gms, allows the user to specify some amount of additional memory (the value is a number of 32MB chunks, up to 512MB) that is pre-allocated for graphics use. TBH, I don't know of anything that requires this or makes use of this memory, which is why we don't allocate any by default, but the specification suggests this is not actually a valid combination, so the option exists as a workaround. Please report if it's actually necessary in some environment.

See code comments for further discussion about the actual operation of the quirks necessary to assign these devices.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
-
Alex Williamson authored
Capability probing modifies wmask, which quirks may be interested in changing themselves. Apply our BAR quirks after the capability scan to make this possible.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
-
Alex Williamson authored
Combine VGA discovery and registration. Quirks can have dependencies on BARs, so the quirks are pushed out until after we've scanned the BARs.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
-
Alex Williamson authored
This function returns success if either we setup the VGA region or the host vfio doesn't return enough regions to support the VGA index. This latter case doesn't make any sense. If we're asked to populate VGA, fail if it doesn't exist and let the caller decide if that's important.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
-
Alex Williamson authored
Given a device-specific region type and sub-type, find it. Also clean up the error return point in vfio_get_region_info() so that we always return 0 with a valid pointer or -errno and NULL.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
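A hedged sketch of such a lookup against the kernel's VFIO uAPI (this is not QEMU's helper itself, and error handling is abbreviated): probe each region's info, and when it advertises a capability chain, re-read it with a large enough buffer and walk the chain looking for a VFIO_REGION_INFO_CAP_TYPE entry with the wanted type and subtype.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Returns a malloc'd vfio_region_info for the matching region, or NULL.
 * The caller frees the result. */
static struct vfio_region_info *find_dev_region(int device_fd, unsigned nregions,
                                                uint32_t type, uint32_t subtype)
{
    for (unsigned i = 0; i < nregions; i++) {
        struct vfio_region_info probe = { .argsz = sizeof(probe), .index = i };

        /* First pass: learn the real argsz and whether caps are present. */
        if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &probe) ||
            !(probe.flags & VFIO_REGION_INFO_FLAG_CAPS)) {
            continue;
        }

        struct vfio_region_info *info = calloc(1, probe.argsz);
        if (!info) {
            return NULL;
        }
        info->argsz = probe.argsz;
        info->index = i;
        if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, info)) {
            free(info);
            continue;
        }

        /* Walk the capability chain appended to the region info. */
        for (uint32_t off = info->cap_offset; off; ) {
            struct vfio_info_cap_header *hdr = (void *)((char *)info + off);

            if (hdr->id == VFIO_REGION_INFO_CAP_TYPE) {
                struct vfio_region_info_cap_type *cap = (void *)hdr;

                if (cap->type == type && cap->subtype == subtype) {
                    return info;
                }
            }
            off = hdr->next;
        }
        free(info);
    }
    return NULL;
}
```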
-
Alex Williamson authored
The sparse mmap capability in a vfio region info allows vfio to tell us which sub-areas of a region may be mmap'd. Thus, rather than assuming a single mmap covers the entire region and later frobbing it ourselves for things like the PCI MSI-X vector table, we can read that directly from vfio.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
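A hedged companion sketch (again, not QEMU's code): once a region's info buffer contains a VFIO_REGION_INFO_CAP_SPARSE_MMAP capability, map only the areas the kernel lists instead of mapping the whole region and punching holes in it afterwards.

```c
#include <stdint.h>
#include <sys/mman.h>
#include <linux/vfio.h>

static void map_sparse_areas(int device_fd, const struct vfio_region_info *info,
                             const struct vfio_region_info_cap_sparse_mmap *sparse)
{
    for (uint32_t i = 0; i < sparse->nr_areas; i++) {
        if (!sparse->areas[i].size) {
            continue;   /* zero-sized area: nothing to map */
        }
        /* Each area's offset is relative to the start of the region. */
        void *p = mmap(NULL, sparse->areas[i].size, PROT_READ | PROT_WRITE,
                       MAP_SHARED, device_fd,
                       info->offset + sparse->areas[i].offset);
        if (p == MAP_FAILED) {
            /* Fall back to slow read()/write() access for this area. */
        }
    }
}
```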
-
Peter Maydell authored
Block layer patches

# gpg: Signature made Wed 25 May 2016 18:32:40 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (31 commits)
  blockjob: Remove BlockJob.bs
  commit: Use BlockBackend for I/O
  backup: Use BlockBackend for I/O
  backup: Remove bs parameter from backup_do_cow()
  backup: Pack Notifier within BackupBlockJob
  backup: Don't leak BackupBlockJob in error path
  mirror: Use BlockBackend for I/O
  mirror: Allow target that already has a BlockBackend
  stream: Use BlockBackend for I/O
  block: Make blk_co_preadv/pwritev() public
  block: Convert block job core to BlockBackend
  block: Default to enabled write cache in blk_new()
  block: Cancel jobs first in bdrv_close_all()
  block: keep a list of block jobs
  block: Rename blk_write_zeroes()
  dma-helpers: change BlockBackend to opaque value in DMAIOFunc
  dma-helpers: change interface to byte-based
  block: Propagate .drained_begin/end callbacks
  block: Fix reconfiguring graph with drained nodes
  block: Make bdrv_drain() use bdrv_drained_begin/end()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-
Andreas Färber authored
Move bus type and related APIs to a separate file bus.c. This is a first step in breaking up qdev.c into more manageable chunks.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[AF: Rebased onto osdep.h]
Signed-off-by: Andreas Färber <afaerber@suse.de>
[PMM: added bus.o to link line for test-qdev-global-props]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-
Sergey Fedorov authored
It is not safe to make a direct jump to a TB spanning two pages in system emulation because the mapping for the second page can get changed but we don't take care of direct jumps in this case. However, in user mode emulation this is not the case because there's only static address translation and TBs are always invalidated properly.

Fixes: 5b053a4a ("tcg: Clean up direct block chaining safety checks")
Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Message-id: 1463404380-29302-1-git-send-email-sergey.fedorov@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-
Peter Maydell authored
Andreas stepping down from most maintainer positions

# gpg: Signature made Wed 25 May 2016 16:53:45 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg: aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/maintainers-for-peter:
  MAINTAINERS: Drop Andreas as CPU maintainer
  MAINTAINERS: Drop Andreas as 0.15 maintainer
  MAINTAINERS: Drop Andreas as PReP maintainer
  MAINTAINERS: Drop Andreas as Cocoa maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-
- May 25, 2016
-
Kevin Wolf authored
There is a single remaining user in qemu-img, and another one in a test case, both of which can be trivially converted to using BlockJob.blk instead.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
-
Kevin Wolf authored
This changes the commit block job to use the job's BlockBackend for performing its I/O. job->bs isn't used by the commit code any more afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
-
Kevin Wolf authored
This changes the backup block job to use the job's BlockBackend for performing its I/O. job->bs isn't used by the backup code any more afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
-
Kevin Wolf authored
Now that we pass the job to the function, bs is implied by that.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
-
John Snow authored
Instead of relying on peeking at bs->job, we want to explicitly get a reference to the job that was involved in this notifier callback. Pack the Notifier inside of the BackupBlockJob so we can use container_of to get a reference back to the BackupBlockJob object. This cuts out one more case where we rely unnecessarily on bs->job.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
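A self-contained sketch of the container_of pattern this describes (the types are simplified stand-ins, not QEMU's BackupBlockJob): embedding the Notifier in the job struct lets the write notifier recover its owning job without going through bs->job.

```c
#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct Notifier Notifier;
struct Notifier {
    void (*notify)(Notifier *notifier, void *data);
};

typedef struct BackupJobSketch {
    int cow_requests_in_flight;     /* ...whatever job state is needed... */
    Notifier before_write;          /* embedded, not a pointer */
} BackupJobSketch;

static void backup_before_write_notify(Notifier *notifier, void *opaque)
{
    /* Recover the owning job from the embedded notifier. */
    BackupJobSketch *job = container_of(notifier, BackupJobSketch, before_write);

    job->cow_requests_in_flight++;  /* act on the job directly, no bs->job */
    (void)opaque;
}
```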
-
Kevin Wolf authored
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
-
Kevin Wolf authored
This changes the mirror block job to use the job's BlockBackend for performing its I/O. job->bs isn't used by the mirroring code any more afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
-
Kevin Wolf authored
We had to forbid mirroring to a target BDS that already had a BB attached because the node swapping at job completion would add a second BB, and we didn't support multiple BBs on a single BDS at the time. Now we do, so we can lift the restriction.

As we allow additional BlockBackends for the target, we must expect other users to be sending requests. No requests may be in flight during the graph modification, so we have to drain those users now.

The core part of this patch is a revert of commit 40365552.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
-
Kevin Wolf authored
This changes the streaming block job to use the job's BlockBackend for performing the COR reads. job->bs isn't used by the streaming code any more afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
-
Kevin Wolf authored
Also add trace points now that the function can be directly called.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
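A hedged sketch of how a job might issue a byte-based read through its BlockBackend with the now-public coroutine interface. It is written as it might appear inside QEMU, so the usual QEMU headers are assumed rather than spelled out, and the exact prototypes should be treated as approximate rather than authoritative.

```c
/* Read 'bytes' bytes at 'offset' through the job's BlockBackend as a
 * copy-on-read request, roughly what the streaming job wants to do. */
static int coroutine_fn job_co_read(BlockBackend *blk, int64_t offset,
                                    void *buf, unsigned int bytes)
{
    QEMUIOVector qiov;
    struct iovec iov = {
        .iov_base = buf,
        .iov_len  = bytes,
    };

    qemu_iovec_init_external(&qiov, &iov, 1);

    return blk_co_preadv(blk, offset, bytes, &qiov, BDRV_REQ_COPY_ON_READ);
}
```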
-
Kevin Wolf authored
This adds a new BlockBackend field to the BlockJob struct, which coexists with the BlockDriverState while converting the individual jobs. When creating a block job, a new BlockBackend is created on top of the given BlockDriverState, and it is destroyed when the BlockJob ends. The reference to the BDS is now held by the BlockBackend instead of calling bdrv_ref/unref manually.

We have to be careful when we use bdrv_replace_in_backing_chain() in block jobs because this changes the BDS that job->blk points to. At the moment block jobs are too tightly coupled with their BDS, so moving a job to another BDS isn't easily possible; therefore, we need to just manually undo this change afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
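A hedged sketch of the shape this gives block job creation. The helper names blk_new(), blk_insert_bs(), and blk_unref() match QEMU's, but their exact prototypes and the surrounding code are illustrative assumptions, not the real block_job_create().

```c
/* Illustrative only; the real block_job_create() does much more. */
typedef struct JobSketch {
    BlockBackend *blk;      /* new field: the job's own BlockBackend */
} JobSketch;

static void job_create_sketch(JobSketch *job, BlockDriverState *bs)
{
    job->blk = blk_new();           /* write cache enabled by default,
                                     * see the blk_new() entry below */
    blk_insert_bs(job->blk, bs);    /* the BB now holds the reference to bs */
}

static void job_complete_sketch(JobSketch *job)
{
    blk_unref(job->blk);            /* drops the BDS reference with it */
}
```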
-
Kevin Wolf authored
The existing users of the function are:

1. blk_new_open(), which already enabled the write cache
2. Some test cases that don't care about the setting
3. blockdev_init() for empty drives, where the cache mode is overridden with the value from the options when a medium is inserted

Therefore, this patch doesn't change the current behaviour. It will be convenient, however, for additional users of blk_new() (like block jobs) if the most sensible WCE setting is the default.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
-
Kevin Wolf authored
So far, bdrv_close_all() first removed all root BlockDriverStates of BlockBackends and monitor-owned BDSes, and then assumed that the remaining BDSes must be related to jobs and cancelled these jobs.

This order doesn't work that well any more when block jobs use BlockBackends internally, because then they will lose their BDS before being cancelled. This patch changes bdrv_close_all() to first cancel all jobs and then remove all root BDSes from the remaining BBs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
-
Alberto Garcia authored
The current way to obtain the list of existing block jobs is to iterate over all root nodes and check which ones own a job. Since we want to be able to support block jobs in other nodes as well, this patch keeps a list of jobs that is updated every time one is created or destroyed.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
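A hedged sketch of that bookkeeping using QEMU's qemu/queue.h macros (the struct and field names are stand-ins, not necessarily the patch's): link on creation, unlink on destruction, iterate whenever the list of jobs is wanted.

```c
#include "qemu/queue.h"

typedef struct JobSketch JobSketch;

struct JobSketch {
    QLIST_ENTRY(JobSketch) job_list;    /* membership in the global list */
    /* ...rest of the job state... */
};

static QLIST_HEAD(, JobSketch) all_jobs = QLIST_HEAD_INITIALIZER(all_jobs);

static void job_created(JobSketch *job)
{
    QLIST_INSERT_HEAD(&all_jobs, job, job_list);
}

static void job_destroyed(JobSketch *job)
{
    QLIST_REMOVE(job, job_list);
}

static void job_enumerate(void (*fn)(JobSketch *job))
{
    JobSketch *job;

    QLIST_FOREACH(job, &all_jobs, job_list) {
        fn(job);    /* no need to walk root nodes looking for bs->job */
    }
}
```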
-
Eric Blake authored
Commit 983a1600 changed the semantics of blk_write_zeroes() to be byte-based rather than sector-based, but did not change the name, which is an open invitation for other code to misuse the function. Renaming to pwrite_zeroes() makes it more in line with other byte-based interfaces, and will help make it easier to track which remaining write_zeroes interfaces still need conversion.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
-
Paolo Bonzini authored
Callers of dma_blk_io have no way to pass extra data to the DMAIOFunc, because the original callback and opaque are gone by the time DMAIOFunc is called. On the other hand, the BlockBackend is usually derived from those extra data that you could pass to the DMAIOFunc (in the next patch, that would be the SCSIRequest).

So change DMAIOFunc's prototype, decoupling it from blk_aio_readv and blk_aio_writev's. The new prototype loses the BlockBackend and gains an extra opaque value which, in the case of dma_blk_readv and dma_blk_writev, is of course used for the BlockBackend.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Paolo Bonzini authored
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Kevin Wolf authored
When draining intermediate nodes (i.e. nodes that aren't the root node for at least one of their parents; with node references, the user can always configure the graph to create this situation), we need to propagate the .drained_begin/end callbacks all the way up to the root for the drain to be effective.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
-
Kevin Wolf authored
When changing the BlockDriverState that a BdrvChild points to while the node is currently drained, we must call the .drained_end() parent callback. Conversely, when this means attaching a new node that is already drained, we need to call .drained_begin().

bdrv_root_attach_child() now takes an opaque parameter, which is needed because the callbacks must also be called if we're attaching a new child to the BlockBackend when the root node is already drained, and they need a way to identify the BlockBackend. Previously, child->opaque was set too late and the callbacks would still see it as NULL.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
-
Kevin Wolf authored
Until now, bdrv_drained_begin() used bdrv_drain() internally to drain the queue. This is kind of backwards and caused quiescing code to be duplicated, because bdrv_drained_begin() had to ensure that no new requests come in even after bdrv_drain() returns, whereas bdrv_drain() had to include that code itself because it could be called from other places.

Instead, move the bdrv_drain() code to bdrv_drained_begin() and make bdrv_drain() a simple wrapper around bdrv_drained_begin/end().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
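A hedged sketch of the resulting shape of bdrv_drain() (simplified; the real function carries additional bookkeeping and a coroutine path): draining once is now just the begin/end pair back to back.

```c
/* Simplified sketch; not the verbatim QEMU implementation. */
void bdrv_drain(BlockDriverState *bs)
{
    bdrv_drained_begin(bs);   /* quiesce users and wait for in-flight requests */
    bdrv_drained_end(bs);     /* allow new requests again */
}
```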
-
Kevin Wolf authored
This adds a common function that is called when attaching a new child to a parent, when removing a child from a parent, and when reconfiguring the graph so that an existing child now points to a different node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
-
Hanna Reitz authored
blk_new() cannot fail, so its Error ** parameter has become superfluous.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Hanna Reitz authored
bdrv_close() now asserts that the BDS's refcount is 0, therefore it cannot have any parents and the bdrv_parent_cb_change_media() call is a no-op.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Hanna Reitz authored
The only caller of bdrv_close() left is bdrv_delete(). We may as well assert that, in a way (there are some things in bdrv_close() that make more sense under that assumption, such as the call to bdrv_release_all_dirty_bitmaps(), which in turn assumes that no frozen bitmaps are attached to the BDS).

In addition, being called only in bdrv_delete() means that we can drop bdrv_close()'s forward declaration at the top of block.c.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Hanna Reitz authored
There are no callers to bdrv_open() or bdrv_open_inherit() left that pass a pointer to a non-NULL BDS pointer as the first argument of these functions, so we can finally drop that parameter and just make them return the new BDS.

Generally, the following pattern is applied:

    bs = NULL;
    ret = bdrv_open(&bs, ..., &local_err);
    if (ret < 0) {
        error_propagate(errp, local_err);
        ...
    }

by

    bs = bdrv_open(..., errp);
    if (!bs) {
        ret = -EINVAL;
        ...
    }

Of course, there are only a few instances where the pattern is really pure.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-
Hanna Reitz authored
It is unused now, so we may just as well drop it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
-