cpus.c · f056158d694d2adc63ff120ca71c73ae8b14426c · Anton / libtcg

Apr 23, 2018

cpus: Fix event order on resume of stopped guest · f056158d

Markus Armbruster authored Apr 23, 2018



When resume of a stopped guest immediately runs into block device
errors, the BLOCK_IO_ERROR event is sent before the RESUME event.

Reproducer:

1. Create a scratch image
   $ dd if=/dev/zero of=scratch.img bs=1M count=100

   Size doesn't actually matter.

2. Prepare blkdebug configuration:

   $ cat >blkdebug.conf <<EOF
   [inject-error]
   event = "write_aio"
   errno = "5"
   EOF

   Note that errno 5 is EIO.

3. Run a guest with an additional scratch disk, i.e. with additional
   arguments
   -drive if=none,id=scratch-drive,format=raw,werror=stop,file=blkdebug:blkdebug.conf:scratch.img
   -device virtio-blk-pci,id=scratch,drive=scratch-drive

   The blkdebug part makes all writes to the scratch drive fail with
   EIO.  The werror=stop pauses the guest on write errors.

4. Connect to the QMP socket e.g. like this:
   $ socat UNIX:/your/qmp/socket READLINE,history=$HOME/.qmp_history,prompt='QMP> '

   Issue QMP command 'qmp_capabilities':
   QMP> { "execute": "qmp_capabilities" }

5. Boot the guest.

6. In the guest, write to the scratch disk, e.g. like this:

   # dd if=/dev/zero of=/dev/vdb count=1

   Do double-check the device specified with of= is actually the
   scratch device!

7. Issue QMP command 'cont':
   QMP> { "execute": "cont" }

After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.  Good.

After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP.  Not so
good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.

The funny event order confuses libvirt: virsh -r domstate DOMAIN
--reason reports "paused (unknown)" rather than "paused (I/O error)".

The culprit is vm_prepare_start().

    /* Ensure that a STOP/RESUME pair of events is emitted if a
     * vmstop request was pending.  The BLOCK_IO_ERROR event, for
     * example, according to documentation is always followed by
     * the STOP event.
     */
    if (runstate_is_running()) {
        qapi_event_send_stop(&error_abort);
        res = -1;
    } else {
        replay_enable_events();
        cpu_enable_ticks();
        runstate_set(RUN_STATE_RUNNING);
        vm_state_notify(1, RUN_STATE_RUNNING);
    }

    /* We are sending this now, but the CPUs will be resumed shortly later */
    qapi_event_send_resume(&error_abort);
    return res;

When resuming a stopped guest, we take the else branch before we get
to sending RESUME.  vm_state_notify() runs virtio_vmstate_change(),
among other things.  This restarts I/O, which fails right away and
triggers the BLOCK_IO_ERROR event before RESUME has been sent.

Reshuffle vm_prepare_start() to send the RESUME event earlier.
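
To illustrate why the order of the calls matters, here is a minimal
stand-alone model, not the actual patch.  Everything in it is
hypothetical: event_log stands in for the QMP event stream,
model_vm_state_notify() for vm_state_notify() restarting the failing
I/O, and model_main_loop() for the later handling of the pending
vmstop request, which I assume is where STOP gets emitted.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical event log standing in for the QMP event stream. */
static char event_log[128];
/* Set when the failing write requests a vmstop (werror=stop). */
static int pending_stop;

static void record_event(const char *name)
{
    /* Keep room for the separator and the terminating NUL. */
    assert(strlen(event_log) + strlen(name) + 2 <= sizeof(event_log));
    if (event_log[0] != '\0') {
        strcat(event_log, ",");
    }
    strcat(event_log, name);
}

/* Stand-in for vm_state_notify(): restarted I/O fails immediately. */
static void model_vm_state_notify(void)
{
    record_event("BLOCK_IO_ERROR");
    pending_stop = 1;
}

/* Stand-in for the main loop acting on the pending vmstop request. */
static void model_main_loop(void)
{
    if (pending_stop) {
        record_event("STOP");
        pending_stop = 0;
    }
}

/* Order quoted above: state change first, RESUME last. */
const char *model_resume_old(void)
{
    event_log[0] = '\0';
    model_vm_state_notify();
    record_event("RESUME");
    model_main_loop();
    return event_log;
}

/* Reshuffled order: RESUME first, then the state change. */
const char *model_resume_new(void)
{
    event_log[0] = '\0';
    record_event("RESUME");
    model_vm_state_notify();
    model_main_loop();
    return event_log;
}
```

In this model, model_resume_old() returns "BLOCK_IO_ERROR,RESUME,STOP",
the order seen after step 7, and model_resume_new() returns
"RESUME,BLOCK_IO_ERROR,STOP", the order I'd expect.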

Fixes RHBZ 1566153.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180423084518.2426-1-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
