Skip to content
  • Paolo Bonzini's avatar
    050de36b
    coroutine-lock: Reimplement CoRwlock to fix downgrade bug · 050de36b
    Paolo Bonzini authored
    
    
    An invariant of the current rwlock is that if multiple coroutines hold a
    reader lock, all must be runnable. The unlock implementation relies on
    this, choosing to wake a single coroutine when the final read lock
    holder exits the critical section, assuming that it will wake a
    coroutine attempting to acquire a write lock.
    
    The downgrade implementation violates this assumption by creating a
    read lock owning coroutine that is exclusively runnable - any other
    coroutines that are waiting to acquire a read lock are *not* made
    runnable when the write lock holder converts its ownership to read
    only.
    
    More in general, the old implementation had lots of other fairness bugs.
    The root cause of the bugs was that CoQueue would wake up readers even
    if there were pending writers, and would wake up writers even if there
    were readers.  In that case, the coroutine would go back to sleep *at
    the end* of the CoQueue, losing its place at the head of the line.
    
    To fix this, keep the queue of waiters explicitly in the CoRwlock
    instead of using CoQueue, and store for each whether it is a
    potential reader or a writer.  This way, downgrade can look at the
    first queued coroutines and wake it only if it is a reader, causing
    all other readers in line to be released in turn.
    
    Reported-by: default avatarDavid Edmondson <david.edmondson@oracle.com>
    Reviewed-by: default avatarDavid Edmondson <david.edmondson@oracle.com>
    Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
    Message-id: 20210325112941.365238-5-pbonzini@redhat.com
    Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
    050de36b
    coroutine-lock: Reimplement CoRwlock to fix downgrade bug
    Paolo Bonzini authored
    
    
    An invariant of the current rwlock is that if multiple coroutines hold a
    reader lock, all must be runnable. The unlock implementation relies on
    this, choosing to wake a single coroutine when the final read lock
    holder exits the critical section, assuming that it will wake a
    coroutine attempting to acquire a write lock.
    
    The downgrade implementation violates this assumption by creating a
    read lock owning coroutine that is exclusively runnable - any other
    coroutines that are waiting to acquire a read lock are *not* made
    runnable when the write lock holder converts its ownership to read
    only.
    
    More in general, the old implementation had lots of other fairness bugs.
    The root cause of the bugs was that CoQueue would wake up readers even
    if there were pending writers, and would wake up writers even if there
    were readers.  In that case, the coroutine would go back to sleep *at
    the end* of the CoQueue, losing its place at the head of the line.
    
    To fix this, keep the queue of waiters explicitly in the CoRwlock
    instead of using CoQueue, and store for each whether it is a
    potential reader or a writer.  This way, downgrade can look at the
    first queued coroutines and wake it only if it is a reader, causing
    all other readers in line to be released in turn.
    
    Reported-by: default avatarDavid Edmondson <david.edmondson@oracle.com>
    Reviewed-by: default avatarDavid Edmondson <david.edmondson@oracle.com>
    Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
    Message-id: 20210325112941.365238-5-pbonzini@redhat.com
    Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
Loading