  1. Nov 01, 2021
    • migration/dirtyrate: move init step of calculation to main thread · 9865d0f6
      Hyman Huang authored
      
      Since the main thread may "query dirty rate" at any time, it's better
      to move the init step into the main thread so that the synchronization
      overhead between "main" and "get_dirtyrate" can be reduced.
      
      Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
      Message-Id: <109f8077518ed2f13068e3bfb10e625e964780f1.1624040308.git.huangy81@chinatelecom.cn>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
    • migration/dirtyrate: adjust order of registering thread · 15eb2d64
      Hyman Huang authored
      
      Register the get_dirtyrate thread in advance so that both
      page-sampling and dirty-ring modes are covered.
      
      Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
      Message-Id: <d7727581a8e86d4a42fc3eacf7f310419b9ebf7e.1624040308.git.huangy81@chinatelecom.cn>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
    • migration/dirtyrate: introduce struct and adjust DirtyRateStat · 71864ead
      Hyman Huang authored
      
      Introduce "DirtyRateMeasureMode" to specify which method should be
      used to calculate the dirty rate, and introduce "DirtyRateVcpu" to
      store the dirty rate of each vcpu.
      
      Use a union to store the stat data of the specific mode.
      
      Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
      Message-Id: <661c98c40f40e163aa58334337af8f3ddf41316a.1624040308.git.huangy81@chinatelecom.cn>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
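
A minimal C sketch of the layout this commit describes: a mode tag plus a union that holds the statistics of whichever mode is active. The type and field names here (MeasureMode, VcpuRate, PageSamplingStat, DirtyRingStat, RateStat, sum_vcpu_rates) are simplified stand-ins and not QEMU's actual definitions. Summing the per-vcpu rates into a VM-wide figure is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Which method produced the statistics (hypothetical stand-in for
 * DirtyRateMeasureMode). */
typedef enum {
    MEASURE_MODE_PAGE_SAMPLING,
    MEASURE_MODE_DIRTY_RING,
} MeasureMode;

/* Dirty rate of one vcpu (hypothetical stand-in for DirtyRateVcpu). */
typedef struct {
    int id;
    int64_t dirty_rate; /* MB/s */
} VcpuRate;

/* Statistics that only make sense in page-sampling mode (illustrative). */
typedef struct {
    uint64_t total_sample_count;
    uint64_t total_dirty_samples;
} PageSamplingStat;

/* Statistics that only make sense in dirty-ring mode. */
typedef struct {
    int nvcpu;
    VcpuRate *rates;
} DirtyRingStat;

typedef struct {
    MeasureMode mode;   /* selects the valid union member */
    int64_t dirty_rate; /* VM-wide result, MB/s */
    union {
        PageSamplingStat page_sampling;
        DirtyRingStat dirty_ring;
    } u;
} RateStat;

/* Assumption: in dirty-ring mode the VM-wide rate is the sum over vcpus. */
static int64_t sum_vcpu_rates(const RateStat *stat)
{
    assert(stat->mode == MEASURE_MODE_DIRTY_RING);
    int64_t total = 0;
    for (int i = 0; i < stat->u.dirty_ring.nvcpu; i++) {
        total += stat->u.dirty_ring.rates[i].dirty_rate;
    }
    return total;
}
```

Because only one mode is active per measurement, the union keeps the struct small and the mode tag tells readers which member is valid.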
    • memory: make global_dirty_tracking a bitmask · 63b41db4
      Hyman Huang authored
      
      Since the dirty ring has been introduced, there are two methods
      to track the dirty pages of a VM. The word "logging" hints at one
      particular method, so renaming global_dirty_log to
      global_dirty_tracking makes the description more accurate.
      
      Dirty rate measurement may start or stop dirty tracking during
      calculation. This conflicts with migration: if dirty tracking is
      stopped while migration is running, migration will miss dirty
      pages.
      
      Making global_dirty_tracking a bitmask lets both migration and
      dirty rate measurement work correctly. Introduce GLOBAL_DIRTY_MIGRATION
      and GLOBAL_DIRTY_DIRTY_RATE to distinguish what the current dirty
      tracking is for: migration or dirty rate.
      
      Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
      Message-Id: <9c9388657cfa0301bd2c1cfa36e7cf6da4aeca19.1624040308.git.huangy81@chinatelecom.cn>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
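
The bitmask idea can be sketched in a few lines of C. The flag names GLOBAL_DIRTY_MIGRATION and GLOBAL_DIRTY_DIRTY_RATE and the variable global_dirty_tracking come from the commit message. The bit values, GLOBAL_DIRTY_MASK, and the helpers dirty_tracking_start() and dirty_tracking_stop() are assumptions for illustration, not the functions the commit actually changes.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values are assumed; only the names appear in the commit message. */
#define GLOBAL_DIRTY_MIGRATION  (1U << 0)
#define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
#define GLOBAL_DIRTY_MASK \
    (GLOBAL_DIRTY_MIGRATION | GLOBAL_DIRTY_DIRTY_RATE)

/* One bit per user of dirty tracking; zero means tracking is off. */
static unsigned int global_dirty_tracking;

/* Hypothetical helper: returns true when this call is the one that
 * switches tracking on, i.e. no user was active before. */
static bool dirty_tracking_start(unsigned int flags)
{
    assert(flags && !(flags & ~GLOBAL_DIRTY_MASK));
    bool was_off = (global_dirty_tracking == 0);
    global_dirty_tracking |= flags;
    return was_off;
}

/* Hypothetical helper: returns true when this call is the one that
 * switches tracking off, i.e. no user remains afterwards. */
static bool dirty_tracking_stop(unsigned int flags)
{
    assert(flags && !(flags & ~GLOBAL_DIRTY_MASK));
    bool was_on = (global_dirty_tracking != 0);
    global_dirty_tracking &= ~flags;
    return was_on && global_dirty_tracking == 0;
}
```

With this shape, a dirty rate measurement that finishes during a migration clears only its own bit, so tracking stays enabled until migration clears GLOBAL_DIRTY_MIGRATION as well.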
    • KVM: introduce dirty_pages and kvm_dirty_ring_enabled · 7786ae40
      Hyman Huang authored
      
      dirty_pages is used to calculate the dirty rate via the dirty ring.
      When the dirty ring is enabled, kvm-reaper increases dirty_pages
      after gfns have been dirtied.
      
      kvm_dirty_ring_enabled shows whether kvm-reaper is working. The
      dirtyrate thread can use it to check whether the measurement can be
      based on the dirty ring feature.
      
      Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
      Message-Id: <fee5fb2ab17ec2159405fc54a3cff8e02322f816.1624040308.git.huangy81@chinatelecom.cn>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
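
As a rough illustration of how a monotonically increasing page counter such as dirty_pages can yield a rate, the sketch below converts the difference between two counter readings into MB/s. The function name dirty_rate_mbps and its parameters are hypothetical, and the formula is an assumption about how such a counter might be used, not code taken from this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: turn two readings of a dirty-page counter, taken
 * elapsed_ms apart, into a dirty rate in MB/s (1 MB = 2^20 bytes here). */
static uint64_t dirty_rate_mbps(uint64_t start_pages, uint64_t end_pages,
                                uint64_t page_size, uint64_t elapsed_ms)
{
    assert(elapsed_ms > 0);
    /* Unsigned subtraction stays correct even if the counter wrapped. */
    uint64_t dirtied_bytes = (end_pages - start_pages) * page_size;
    /* Scale to bytes per second, then convert to MB. */
    return (dirtied_bytes * 1000 / elapsed_ms) >> 20;
}
```

For example, 25600 pages of 4096 bytes dirtied within 1000 ms is 100 MB of dirtied memory, so the helper reports 100 MB/s.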
    • migration/rdma: Fix out of order wrid · b390afd8
      Li Zhijian authored
      
      destination:
      ../qemu/build/qemu-system-x86_64 -enable-kvm -netdev tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device e1000,netdev=hn0,mac=50:52:54:00:11:22 -boot c -drive if=none,file=./Fedora-rdma-server-migration.qcow2,id=drive-virtio-disk0 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -vga qxl -spice streaming-video=filter,port=5902,disable-ticketing -incoming rdma:192.168.22.23:8888
      qemu-system-x86_64: -spice streaming-video=filter,port=5902,disable-ticketing: warning: short-form boolean option 'disable-ticketing' deprecated
      Please use disable-ticketing=on instead
      QEMU 6.0.50 monitor - type 'help' for more information
      (qemu) trace-event qemu_rdma_block_for_wrid_miss on
      (qemu) dest_init RDMA Device opened: kernel name rxe_eth0 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/rxe_eth0, transport: (2) Ethernet
      qemu_rdma_block_for_wrid_miss A Wanted wrid CONTROL SEND (2000) but got CONTROL RECV (4000)
      
      source:
      ../qemu/build/qemu-system-x86_64 -enable-kvm -netdev tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device e1000,netdev=hn0,mac=50:52:54:00:11:22 -boot c -drive if=none,file=./Fedora-rdma-server.qcow2,id=drive-virtio-disk0 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -vga qxl -spice streaming-video=filter,port=5901,disable-ticketing -S
      qemu-system-x86_64: -spice streaming-video=filter,port=5901,disable-ticketing: warning: short-form boolean option 'disable-ticketing' deprecated
      Please use disable-ticketing=on instead
      QEMU 6.0.50 monitor - type 'help' for more information
      (qemu)
      (qemu) trace-event qemu_rdma_block_for_wrid_miss on
      (qemu) migrate -d rdma:192.168.22.23:8888
      source_resolve_host RDMA Device opened: kernel name rxe_eth0 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/rxe_eth0, transport: (2) Ethernet
      (qemu) qemu_rdma_block_for_wrid_miss A Wanted wrid WRITE RDMA (1) but got CONTROL RECV (4000)
      
      NOTE: we use soft RoCE as the rdma device.
      [root@iaas-rpma images]# rdma link show rxe_eth0/1
      link rxe_eth0/1 state ACTIVE physical_state LINK_UP netdev eth0
      
      This migration cannot complete when an out-of-order (OOO) CQ event occurs.
      The send queue and the receive queue share the same completion queue, and
      qemu_rdma_block_for_wrid() drops the CQ events it is not interested in. But
      an event dropped by one call to qemu_rdma_block_for_wrid() may be the one a
      later call waits for. In that case, qemu_rdma_block_for_wrid() blocks forever.
      
      OOO cases can occur on both the source side and the destination side.
      Blocking forever happens only when SEND and RECV are out of order. OOO between
      'WRITE RDMA' and 'RECV' doesn't matter.
      
      The OOO sequence is shown below:
             source                             destination
            rdma_write_one()                   qemu_rdma_registration_handle()
      1.    S1: post_recv X                    D1: post_recv Y
      2.    wait for recv CQ event X
      3.                                       D2: post_send X     ---------------+
      4.                                       wait for send CQ send event X (D2) |
      5.    recv CQ event X reaches (D2)                                          |
      6.  +-S2: post_send Y                                                       |
      7.  | wait for send CQ event Y                                              |
      8.  |                                    recv CQ event Y (S2) (drop it)     |
      9.  +-send CQ event Y reaches (S2)                                          |
      10.                                      send CQ event X reaches (D2)  -----+
      11.                                      wait recv CQ event Y (dropped by (8))
      
      Although hardware IB worked fine in about a hundred of my runs, the IB
      specification doesn't guarantee the CQ order in such a case.
      
      Here we introduce an independent send completion queue to separate
      ibv_post_send completions from the original mixed completion queue.
      It helps us poll the specific CQE we are really interested in.
      
      Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
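
The hang and the fix can be reproduced with a toy model that does not use libibverbs at all. In the sketch below, ToyCq, cq_push(), and block_for() are hypothetical stand-ins: block_for() discards every event it polls that is not the wanted one, as the commit message describes for qemu_rdma_block_for_wrid(), and returning false when the queue runs dry stands in for blocking forever. The scenario follows steps 8 to 11 of the sequence above on the destination side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { EV_SEND, EV_RECV } EventKind;

#define CQ_CAPACITY 8

/* Toy completion queue: a fixed array consumed from head to tail. */
typedef struct {
    EventKind events[CQ_CAPACITY];
    size_t head;
    size_t tail;
} ToyCq;

static void cq_push(ToyCq *cq, EventKind kind)
{
    assert(cq->tail < CQ_CAPACITY);
    cq->events[cq->tail++] = kind;
}

/* Poll until `wanted` shows up, dropping everything else on the way.
 * Returns false if the queue is exhausted (models blocking forever). */
static bool block_for(ToyCq *cq, EventKind wanted)
{
    while (cq->head < cq->tail) {
        if (cq->events[cq->head++] == wanted) {
            return true;
        }
    }
    return false;
}

/* Shared CQ: recv Y (step 8) completes before send X (step 10). */
static bool destination_completes_shared(void)
{
    ToyCq cq = {0};
    cq_push(&cq, EV_RECV);
    cq_push(&cq, EV_SEND);
    return block_for(&cq, EV_SEND)   /* drops the RECV event */
        && block_for(&cq, EV_RECV);  /* the RECV event is already gone */
}

/* Separate send and recv CQs: same arrival order, nothing is dropped. */
static bool destination_completes_split(void)
{
    ToyCq send_cq = {0}, recv_cq = {0};
    cq_push(&recv_cq, EV_RECV);
    cq_push(&send_cq, EV_SEND);
    return block_for(&send_cq, EV_SEND) && block_for(&recv_cq, EV_RECV);
}
```

In this model the shared queue loses the RECV completion while waiting for SEND, whereas with one queue per direction each wait only ever sees events of its own kind.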
  2. Oct 30, 2021
  3. Oct 29, 2021
    • Merge remote-tracking branch 'remotes/ehabkost/tags/x86-next-pull-request' into staging · a856cce3
      Richard Henderson authored
      
      x86 queue, 2021-10-29
      
      Bug fixes:
      * Remove core-capability in Snowridge CPU model
      
      # gpg: Signature made Fri 29 Oct 2021 12:05:14 PM PDT
      # gpg:                using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
      # gpg:                issuer "ehabkost@redhat.com"
      # gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
      
      * remotes/ehabkost/tags/x86-next-pull-request:
        target/i386: Remove core-capability in Snowridge CPU model
      
      Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
    • qapi: Extend -compat to set policy for unstable interfaces · 57df0dff
      Markus Armbruster authored
      
      New option parameters unstable-input and unstable-output set policy
      for unstable interfaces just like deprecated-input and
      deprecated-output set policy for deprecated interfaces (see commit
      6dd75472 "qemu-options: New -compat to set policy for deprecated
      interfaces").  This is intended for testing users of the management
      interfaces.  It is experimental.
      
      For now, this covers only syntactic aspects of QMP, i.e. stuff tagged
      with feature 'unstable'.  We may want to extend it to cover semantic
      aspects, or the command line.
      
      Note that there is no good way for management applications to detect
      the presence of these new option parameters: they are not visible in the
      output of query-qmp-schema or query-command-line-options.  Tolerable,
      because it's meant for testing.  If running with -compat fails, skip the test.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Acked-by: John Snow <jsnow@redhat.com>
      Message-Id: <20211028102520.747396-10-armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      [Doc comments fixed up]