    hax: Honor CPUState::halted · 6f38dca6
    Colin Xu authored
    
    QEMU tracks whether a vcpu is halted using CPUState::halted. E.g.,
    after initialization or reset, halted is 0 for the BSP (vcpu 0)
    and 1 for the APs (vcpu 1, 2, ...). A halted vcpu should not be
    handed to the hypervisor to run (e.g. hax_vcpu_run()).
    
    Under HAXM, Android Emulator sometimes boots into a "vcpu shutdown
    request" error while executing in SeaBIOS, with the HAXM driver
    logging a guest triple fault in vcpu 1, 2, ... at RIP 0x3. That is
    ultimately because the HAX accelerator asks HAXM to run those APs
    when they are still in the halted state.
    
    Normally, the vcpu thread for an AP will start by looping in
    qemu_wait_io_event(), until the BSP kicks it via a pair of IPIs
    (INIT followed by SIPI). But because the HAX accelerator does not
    honor cpu->halted, it allows the AP vcpu thread to proceed to
    hax_vcpu_run() as soon as it receives any kick, even if the kick
    does not come from the BSP. It turns out that the emulator has a
    worker thread which periodically kicks every vcpu thread (possibly
    to collect CPU usage data), and if one of these kicks comes before
    those by the BSP, the AP will start execution from the wrong RIP,
    resulting in the aforementioned SMP boot failure.
    
    The solution is inspired by the KVM accelerator (credit to
    Chuanxiao Dong <chuanxiao.dong@intel.com> for the pointer):
    
    1. Get rid of questionable logic that unconditionally resets
       cpu->halted before hax_vcpu_run(). Instead, only reset it at the
       right moments (there are only a few "unhalt" events).
    2. Add a check for cpu->halted before hax_vcpu_run().
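
    The two steps above can be sketched with a toy model. This is not the
    actual QEMU change: ToyCPU, has_work, toy_hypervisor_run() and
    toy_vcpu_exec() are hypothetical names used only for illustration, and
    has_work stands in for whatever real "unhalt" event (such as a SIPI
    from the BSP) is pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's CPUState; only the fields needed here. */
typedef struct ToyCPU {
    int index;      /* 0 = BSP, >0 = AP */
    bool halted;    /* plays the role of CPUState::halted */
    bool has_work;  /* assumption: set when a real "unhalt" event is pending */
    int runs;       /* times the vcpu was handed to the (toy) hypervisor */
} ToyCPU;

/* After reset, the BSP is runnable and every AP is halted. */
static void toy_cpu_reset(ToyCPU *cpu, int index)
{
    assert(cpu != NULL);
    cpu->index = index;
    cpu->halted = (index != 0);
    cpu->has_work = false;
    cpu->runs = 0;
}

/* Stand-in for hax_vcpu_run(): only counts how often it is entered. */
static void toy_hypervisor_run(ToyCPU *cpu)
{
    cpu->runs++;
}

/*
 * Called each time the vcpu thread is kicked. Returns true if the vcpu
 * was handed to the hypervisor.
 */
static bool toy_vcpu_exec(ToyCPU *cpu)
{
    assert(cpu != NULL);
    if (cpu->halted) {
        if (!cpu->has_work) {
            /* Step 2: still halted, so a spurious kick must not run it. */
            return false;
        }
        /* Step 1: clear halted only on a genuine unhalt event. */
        cpu->halted = false;
        cpu->has_work = false;
    }
    toy_hypervisor_run(cpu);
    return true;
}
```

    In this model, a kick delivered to an AP before any unhalt event
    leaves runs at 0, whereas the old behavior (clearing halted
    unconditionally before running) would have entered the hypervisor on
    the first kick from any thread.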
    
    Note that although the non-Unrestricted Guest (!ug_platform) code
    path also forcibly resets cpu->halted, it is left untouched,
    because only the UG code path supports SMP guests.
    
    The patch was first merged into Android Emulator with Change-Id:
    I9c5752cc737fd305d7eace1768ea12a07309d716
    
    Cc: Yu Ning <yu.ning@intel.com>
    Cc: Chuanxiao Dong <chuanxiao.dong@intel.com>
    Signed-off-by: Colin Xu <colin.xu@intel.com>
    Message-Id: <20190610021939.13669-1-colin.xu@intel.com>
cpus.c 67.56 KiB