  1. Jan 20, 2024
  2. Dec 01, 2023
  3. Nov 30, 2023
  4. Nov 15, 2023
  5. Nov 07, 2023
  6. Nov 03, 2023
  7. Nov 02, 2023
  8. Nov 01, 2023
    • migration: per-mode blockers · fa3673e4
      Steve Sistare authored
      
      Extend the blocker interface so that a blocker can be registered for
      one or more migration modes.  The existing interfaces register a
      blocker for all modes, and the new interfaces take a varargs list
      of modes.
      
      Internally, maintain a separate blocker list per mode.  The same Error
      object may be added to multiple lists.  When a blocker is deleted, it is
      removed from every list, and the Error is freed.
      
      No functional change until a new mode is added.
      
      Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      Message-ID: <1698263069-406971-3-git-send-email-steven.sistare@oracle.com>
      fa3673e4
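The per-mode blocker bookkeeping described in fa3673e4 can be pictured with the minimal C sketch below. It is not QEMU's code: `Reason`, `MODE_NORMAL`, `MODE_OTHER`, `MODE_END` and every function name are hypothetical stand-ins chosen for illustration, the lists are fixed-size arrays, and the caller keeps ownership of the reason object (the real commit frees the Error on deletion).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical modes; MODE_END terminates a varargs mode list. */
enum { MODE_NORMAL, MODE_OTHER, MODE_COUNT };
#define MODE_END (-1)
#define MAX_BLOCKERS 16

/* Stand-in for the Error object that explains why migration is blocked. */
typedef struct Reason {
    const char *msg;
} Reason;

/* One blocker list per mode; the same Reason pointer may sit in several. */
static const Reason *blockers[MODE_COUNT][MAX_BLOCKERS];
static size_t nblockers[MODE_COUNT];

static bool add_to_mode(const Reason *r, int mode)
{
    assert(mode >= 0 && mode < MODE_COUNT);
    if (nblockers[mode] == MAX_BLOCKERS) {
        return false;
    }
    blockers[mode][nblockers[mode]++] = r;
    return true;
}

/* Register r for each listed mode; the list must end with MODE_END. */
bool add_blocker_modes(const Reason *r, int mode, ...)
{
    va_list ap;
    bool ok = true;

    va_start(ap, mode);
    while (mode != MODE_END) {
        ok = add_to_mode(r, mode) && ok;
        mode = va_arg(ap, int);
    }
    va_end(ap);
    return ok;
}

/* Register r for all modes, mirroring the pre-existing interface. */
bool add_blocker(const Reason *r)
{
    bool ok = true;

    for (int m = 0; m < MODE_COUNT; m++) {
        ok = add_to_mode(r, m) && ok;
    }
    return ok;
}

/* Remove r from every per-mode list it appears in. */
void del_blocker(const Reason *r)
{
    for (int m = 0; m < MODE_COUNT; m++) {
        size_t out = 0;

        for (size_t i = 0; i < nblockers[m]; i++) {
            if (blockers[m][i] != r) {
                blockers[m][out++] = blockers[m][i];
            }
        }
        nblockers[m] = out;
    }
}

/* A mode is blocked while its list is non-empty. */
bool is_blocked(int mode)
{
    assert(mode >= 0 && mode < MODE_COUNT);
    return nblockers[mode] > 0;
}
```

In this sketch, `add_blocker_modes(&r, MODE_OTHER, MODE_END)` blocks only the hypothetical second mode, and a single `del_blocker(&r)` clears the reason from every list at once.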
    • migration: mode parameter · eea1e5c9
      Steve Sistare authored
      
      Create a mode migration parameter that can be used to select alternate
      migration algorithms.  The default mode is normal, representing the
      current migration algorithm, and does not need to be explicitly set.
      
      No functional change until a new mode is added, except that the mode is
      shown by the 'info migrate' command.
      
      Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      Message-ID: <1698263069-406971-2-git-send-email-steven.sistare@oracle.com>
      eea1e5c9
    • migration: Add tracepoints for downtime checkpoints · 3e5f3bcd
      Peter Xu authored
      This patch is inspired by Joao Martins' patch here:
      
      https://lore.kernel.org/r/20230926161841.98464-1-joao.m.martins@oracle.com
      
      
      
      Add tracepoints for major downtime checkpoints on both src and dst.  They
      share the same tracepoint with a string showing its stage.
      
      Besides the checkpoints in the previous patch, this patch also added
      destination checkpoints.
      
      On src, we have these checkpoints added:
      
        - src-downtime-start: right before vm stops on src
        - src-vm-stopped: after vm is fully stopped
        - src-iterable-saved: after all iterables saved (END sections)
        - src-non-iterable-saved: after all non-iterable saved (FULL sections)
        - src-downtime-stop: migration fully completed
      
      On dst, we have these checkpoints added:
      
        - dst-precopy-loadvm-completes: after loadvm all done for precopy
        - dst-precopy-bh-*: record BH steps to resume VM for precopy
        - dst-postcopy-bh-*: record BH steps to resume VM for postcopy
      
      On dst side, we don't have a good way to trace total time consumed by
      iterable or non-iterable for now.  We can mark it by 1st time receiving a
      FULL / END section, but rather than that let's just rely on the other
      tracepoints added for vmstates to back up the information.
      
      With this patch, one can enable "vmstate_downtime*" tracepoints to
      turn on all the tracepoints needed for downtime measurements.
      
      Drop the loadvm_postcopy_handle_run_bh() tracepoint along the way,
      because it serves the same purpose, but only for postcopy.  We then have
      a unified prefix for all downtime-relevant tracepoints.
      
      Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      Message-ID: <20231030163346.765724-6-peterx@redhat.com>
      3e5f3bcd
    • migration: migration_stop_vm() helper · 93bdf888
      Peter Xu authored
      
      Provide a helper for the non-COLO use case of migration to stop a VM.
      This prepares for adding some downtime-relevant tracepoints to migration,
      which may or may not apply to COLO.
      
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      Message-ID: <20231030163346.765724-5-peterx@redhat.com>
      93bdf888
    • migration: Add per vmstate downtime tracepoints · 3c80f142
      Peter Xu authored
      
      We have a bunch of savevm_section* tracepoints.  They're good for
      analyzing the migration stream, but not always suitable for analyzing
      the migration downtime.  Two major problems:
      
        - savevm_section* tracepoints dump all sections, while we only care
          about the sections that contribute to the downtime
      
        - They don't have an identifier to show the type of section, so there
          is no easy way to filter for downtime information either.
      
      We could add the type into those tracepoints, but this patch keeps them
      untouched and instead adds a bunch of downtime-specific tracepoints, so
      one can enable "vmstate_downtime*" tracepoints and get a full picture of
      how the downtime is distributed across iterative and non-iterative
      vmstate save/load.
      
      Note that here both save() and load() need to be traced, because both of
      them may contribute to the downtime.  The contribution is not a simple "add
      them together", though: consider when the src is doing a save() of device1
      while the dest can be load()ing for device2, so they can happen
      concurrently.
      
      Tracking both sides makes sense because device load() and save() can be
      imbalanced: one device can save() super fast but load() super slow, or
      vice versa.  We can't figure that out without tracing both.
      
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      Message-ID: <20231030163346.765724-4-peterx@redhat.com>
      3c80f142
    • migration: Add migration_downtime_start|end() helpers · e22ffad0
      Peter Xu authored
      
      Unify the three users that record downtime behind the same pair of helpers.
      
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Fabiano Rosas <farosas@suse.de>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      Message-ID: <20231030163346.765724-3-peterx@redhat.com>
      e22ffad0
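As a rough illustration of what a start/end helper pair for downtime recording could look like, here is a minimal C sketch. Only the field names `downtime` and `downtime_start` come from the commit messages in this log; the `DowntimeState` type, the function names and the explicit millisecond timestamps are assumptions made so the example stays self-contained, and the real helpers may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two downtime fields of MigrationState. */
typedef struct DowntimeState {
    int64_t downtime_start; /* timestamp (ms) taken right before the VM stops */
    int64_t downtime;       /* measured downtime (ms), filled in at the end */
} DowntimeState;

/* Record the moment the VM is about to be stopped. */
void downtime_start(DowntimeState *s, int64_t now_ms)
{
    s->downtime_start = now_ms;
}

/* Compute the elapsed downtime once the VM runs again. */
void downtime_end(DowntimeState *s, int64_t now_ms)
{
    assert(now_ms >= s->downtime_start);
    s->downtime = now_ms - s->downtime_start;
}
```

Routing every caller through one such pair means the start timestamp is always recorded, which is the gap the postcopy commit 62f5da7d below closes.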
    • migration: Set downtime_start even for postcopy · 62f5da7d
      Peter Xu authored
      
      Postcopy calculates its downtime separately.  It always sets
      MigrationState.downtime properly, but not MigrationState.downtime_start.
      
      Make postcopy do the same as other modes and properly record the
      timestamp when the VM is going to be stopped.  Drop the temporary variable
      in postcopy_start() along the way.
      
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Fabiano Rosas <farosas@suse.de>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      Message-ID: <20231030163346.765724-2-peterx@redhat.com>
      62f5da7d
    • migration: Check in savevm_state_handler_insert for dups · caa91b3c
      Peter Xu authored
      
      Before finally registering a SaveStateEntry, check for duplicate
      entries.  This notifies us as soon as possible instead of leaving us
      with silent migration failures that can be hard to diagnose.

      For example, without the previous x2apic fixes, this patch generates
      a message like the one below whenever we boot a VM instance with
      "-smp 200,maxcpus=288,sockets=2,cores=72,threads=2", and QEMU bails
      out even before the VM starts:
      
      savevm_state_handler_insert: Detected duplicate SaveStateEntry: id=apic, instance_id=0x0
      
      Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Juan Quintela <quintela@redhat.com>
      Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      Message-ID: <20231020090731.28701-10-quintela@redhat.com>
      caa91b3c
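The duplicate check in caa91b3c amounts to refusing a second registration with the same (id, instance_id) pair. The C sketch below shows that idea in isolation. It is an illustration only: `Entry`, `entry_exists()` and `entry_insert()` are hypothetical names, the registry is a fixed-size array rather than QEMU's handler list, and the function returns false where the commit message says QEMU bails out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 64

/* Hypothetical stand-in for a SaveStateEntry: a name plus an instance id. */
typedef struct Entry {
    const char *idstr;
    unsigned instance_id;
} Entry;

static Entry entries[MAX_ENTRIES];
static size_t nentries;

/* True if an entry with the same id string and instance id is registered. */
static bool entry_exists(const char *idstr, unsigned instance_id)
{
    for (size_t i = 0; i < nentries; i++) {
        if (entries[i].instance_id == instance_id &&
            strcmp(entries[i].idstr, idstr) == 0) {
            return true;
        }
    }
    return false;
}

/* Insert an entry, rejecting duplicates loudly instead of silently. */
bool entry_insert(const char *idstr, unsigned instance_id)
{
    assert(idstr != NULL);
    if (entry_exists(idstr, instance_id)) {
        fprintf(stderr,
                "%s: Detected duplicate SaveStateEntry: id=%s, instance_id=0x%x\n",
                __func__, idstr, instance_id);
        return false;
    }
    if (nentries == MAX_ENTRIES) {
        return false; /* registry full in this fixed-size sketch */
    }
    entries[nentries].idstr = idstr;
    entries[nentries].instance_id = instance_id;
    nentries++;
    return true;
}
```

Failing at registration time puts the error next to its cause, whereas a duplicate that slips through would only surface later as a migration failure.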
    • migration: Hack to maintain backwards compatibility for ppc · 485fb955
      Juan Quintela authored
      
      Current code does:
      - register pre_2_10_vmstate_dummy_icp with "icp/server" and an instance
        depending on the cpu number
      - for newer machines, register vmstate_icp with the "icp/server" name
        and instance 0
      - then unregister "icp/server" for the 1st instance.
      
      This is wrong on many levels:
      - we shouldn't have two VMStateDescriptions with the same name
      - in case this is the only solution that we can come up with, it needs
        to be:
        * register pre_2_10_vmstate_dummy_icp
        * unregister pre_2_10_vmstate_dummy_icp
        * register real vmstate_icp
      
      Created vmstate_replace_hack_for_ppc() with warnings left and right
      that it is a hack.
      
      CC: Cedric Le Goater <clg@kaod.org>
      CC: Daniel Henrique Barboza <danielhb413@gmail.com>
      CC: David Gibson <david@gibson.dropbear.id.au>
      CC: Greg Kurz <groug@kaod.org>
      
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Juan Quintela <quintela@redhat.com>
      Message-ID: <20231020090731.28701-8-quintela@redhat.com>
      485fb955
  9. Oct 31, 2023