    mem-prealloc: optimize large guest startup · 037fb5eb
    bauerchen authored
    
    
    [desc]:
        A VM with a large amount of memory starts slowly when
        -mem-prealloc is used, and the current method can be optimized
        in two places:
    
        1. mmap is used to allocate the thread stacks while the page
        clearing threads are being created, and it tries to take
        mm->mmap_sem for writing. The clearing threads that are already
        running hold it for reading, and this contention makes thread
        creation very slow.
    
        2. The number of pages per thread is calculated poorly. If 64
        threads split 160 hugepages, 63 threads clear 2 pages each and
        1 thread clears 34 pages, so the whole operation is very slow.
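
The imbalance in point 2 can be reproduced with a few lines of C. This is a minimal sketch, assuming the old calculation gave every thread `numpages / nthreads` pages and put the whole remainder on the last thread, which is consistent with the 2 and 34 quoted above. The helper name `old_pages_for_thread` is hypothetical and is not taken from the QEMU source.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: pages handed to thread `i` under the assumed old
 * scheme (even split, entire remainder on the last thread).
 */
static size_t old_pages_for_thread(size_t numpages, size_t nthreads, size_t i)
{
    assert(nthreads > 0 && i < nthreads);

    size_t per_thread = numpages / nthreads;

    if (i == nthreads - 1) {
        /* The last thread also absorbs everything the division left over. */
        return per_thread + numpages % nthreads;
    }
    return per_thread;
}
```

With 160 hugepages and 64 threads, threads 0 to 62 get 2 pages each and thread 63 gets 2 + 32 = 34, so the total time is set by that one thread.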
    
        To solve the first problem, we add a mutex in the thread function
        and start all threads only after all of them have been created.
        To solve the second problem, we spread the remainder across the
        other threads: with 160 hugepages and 64 threads, 32 threads
        clear 3 pages each and 32 threads clear 2 pages each.
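
Both fixes can be sketched together in plain POSIX threads. This is an illustrative sketch, not the code from this commit. It assumes a condition variable is paired with the mutex (the message above mentions only a mutex). All names in it (`pages_for_thread`, `clear_thread`, `clear_all`, `MAX_THREADS`) are hypothetical. A counter increment stands in for touching a page.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_THREADS 64   /* illustrative upper bound, not a QEMU constant */

static pthread_mutex_t start_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t start_cond = PTHREAD_COND_INITIALIZER;
static bool threads_created;

struct clear_args {
    size_t numpages;   /* pages assigned to this thread */
    size_t cleared;    /* pages processed so far */
};

/* Balanced split: the first (numpages % nthreads) threads get one extra page. */
static size_t pages_for_thread(size_t numpages, size_t nthreads, size_t i)
{
    assert(nthreads > 0 && i < nthreads);
    return numpages / nthreads + (i < numpages % nthreads ? 1 : 0);
}

static void *clear_thread(void *opaque)
{
    struct clear_args *args = opaque;

    /* Block until the creator signals that every thread has been created. */
    pthread_mutex_lock(&start_mutex);
    while (!threads_created) {
        pthread_cond_wait(&start_cond, &start_mutex);
    }
    pthread_mutex_unlock(&start_mutex);

    for (size_t p = 0; p < args->numpages; p++) {
        args->cleared++;   /* stand-in for touching one page */
    }
    return NULL;
}

/* Returns the total number of pages processed by the threads that started. */
static size_t clear_all(size_t numpages, size_t nthreads)
{
    pthread_t tids[MAX_THREADS];
    struct clear_args args[MAX_THREADS];
    size_t created = 0, total = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    threads_created = false;   /* no worker threads exist yet */

    for (size_t i = 0; i < nthreads; i++) {
        args[i].numpages = pages_for_thread(numpages, nthreads, i);
        args[i].cleared = 0;
        if (pthread_create(&tids[i], NULL, clear_thread, &args[i]) != 0) {
            break;
        }
        created++;
    }

    /* Creation is finished, so release all waiting threads at once. */
    pthread_mutex_lock(&start_mutex);
    threads_created = true;
    pthread_cond_broadcast(&start_cond);
    pthread_mutex_unlock(&start_mutex);

    for (size_t i = 0; i < created; i++) {
        pthread_join(tids[i], NULL);
        total += args[i].cleared;
    }
    return total;
}
```

Because no worker touches memory until the broadcast, the stack allocations made during thread creation do not compete with page faults from threads that are already running. With 160 pages and 64 threads, `pages_for_thread` gives threads 0 to 31 three pages and threads 32 to 63 two pages.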
    
    [test]:
        320G 84c VM start time can be reduced to 10s
        680G 84c VM start time can be reduced to 18s
    
    Signed-off-by: bauerchen <bauerchen@tencent.com>
    Reviewed-by: Pan Rui <ruippan@tencent.com>
    Reviewed-by: Ivan Ren <ivanren@tencent.com>
    [Simplify computation of the number of pages per thread. - Paolo]
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>