    gitlab: always build container images · c31fa24e
    Daniel P. Berrangé authored
    
    Currently we attempt to skip building container images if the commits do
    not involve changes to the dockerfiles or gitlab CI definitions.
    
    Conceptually this makes sense, but there is a challenge in the real
    world implementation of this in gitlab.
    
    In the case of a CI pipeline triggered from a merge request, GitLab
    knows the common ancestor of the merge request and the main git repo,
    so it can trivially determine if any of the commits associated with
    the MR change the dockerfiles.
    
    In the case of a CI pipeline triggered from a push to a branch, it is
    much more difficult. There is no concept of a common ancestor in this
    case. Instead GitLab looks at the set of commits in the git push event.
    
    On the surface this may sound reasonable, but it doesn't take into
    account that a push event does not always contain the full set of
    patches from a branch.
    
    For example, consider pushing 5 commits, one of which contains a
    dockerfile change. This will trigger a CI pipeline for the
    containers. Now consider you do some more work on the branch and push 3
    further commits, so you now have a branch of 8 commits. For the second
    push GitLab will only look at the 3 most recent commits, because the
    other 5 were already present. Thus GitLab will not realize that the branch has
    dockerfile changes that need to trigger the container build.
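
The difference between the two views can be sketched as a toy model in Python. This is not GitLab's implementation: the function name `changes_rule_matches`, the representation of a commit as a set of file paths, and the paths themselves are assumptions made purely for illustration.

```python
# Toy model of a "changes" rule. Each commit is modelled as the set of
# file paths it touches. All paths below are hypothetical examples.

def changes_rule_matches(commits, path_prefix):
    """Return True if any commit in the examined set touches path_prefix."""
    return any(path.startswith(path_prefix)
               for commit in commits
               for path in commit)

DOCKERFILE_DIR = "tests/docker/dockerfiles/"  # assumed location

# First push: 5 commits, one of which changes a dockerfile.
first_push = [
    {"hw/foo.c"},
    {DOCKERFILE_DIR + "fedora.docker"},
    {"hw/bar.c"},
    {"docs/foo.rst"},
    {"hw/baz.c"},
]
# Second push: 3 further commits, none touching a dockerfile.
second_push = [{"hw/foo.c"}, {"hw/bar.c"}, {"tests/foo.c"}]

# Push event view: only the commits new to each push are examined.
first_result = changes_rule_matches(first_push, DOCKERFILE_DIR)
second_result = changes_rule_matches(second_push, DOCKERFILE_DIR)

# Merge request view: all commits since the common ancestor are examined.
whole_branch_result = changes_rule_matches(first_push + second_push,
                                           DOCKERFILE_DIR)

print(first_result, second_result, whole_branch_result)
```

In this model the second push evaluates to False even though the branch as a whole still carries the dockerfile change.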
    
    This can cause real world problems:
    
     - Push 5 commits to branch "foo", including a dockerfile change
    
        => rebuilds the container images with content from "foo"
        => build jobs run against containers from "foo"
    
     - Refresh your master branch with latest upstream master
    
        => rebuilds the container images with content from "master"
        => build jobs run against containers from "master"
    
     - Push 3 more commits to branch "foo", with no dockerfile change
    
        => no container rebuild triggers
        => build jobs run against containers from "master"
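
The sequence above can be replayed with a minimal simulation in Python. It assumes a registry reduced to a single shared "latest" tag; the names `run_pipeline` and `replay` are hypothetical and do not correspond to any GitLab or docker API.

```python
# Toy model: every pipeline pushes to, and pulls from, one shared "latest" tag.

def run_pipeline(registry, branch, rebuild_containers):
    """Simulate one pipeline; return the branch whose containers the jobs use."""
    if rebuild_containers:
        # The container job overwrites the shared tag with this branch's content.
        registry["latest"] = branch
    return registry["latest"]

def replay(always_build):
    """Replay the three pushes; each entry is (branch, dockerfile change seen)."""
    registry = {}
    pushes = [("foo", True), ("master", True), ("foo", False)]
    return [run_pipeline(registry, branch, always_build or changed)
            for branch, changed in pushes]

conditional = replay(always_build=False)
unconditional = replay(always_build=True)

print(conditional)    # ['foo', 'master', 'master']
print(unconditional)  # ['foo', 'master', 'foo']
```

With the conditional in place, the third pipeline runs branch "foo" against containers built from "master". Building unconditionally keeps the tag in step with the branch being tested.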
    
    The "changes" conditional in gitlab is OK, *provided* your build
    jobs are not relying on any external state from previous builds.
    
    This is NOT the case in QEMU, because we are building container
    images and these are cached. This is a scenario in which the
    "changes" conditional is not usable.
    
    The only other way to avoid this problem would be to use the git
    branch name as the container image tag, instead of always using
    "latest". The downside of this approach is that the user's gitlab
    registry will grow significantly until it starts to trigger
    GitLab's automatic deletion policy.  Every time the user starts
    a new branch they will have to trigger a rebuild of the container
    images. Given this, we might as well just drop the conditional
    and always build the container images. Most of the time docker
    will be able to use the layer cache to avoid the most expensive
    part of the rebuild process (installing all the RPMs/debs/etc).
    
    Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
    Message-Id: <20210216132954.295906-2-berrange@redhat.com>
    Signed-off-by: Thomas Huth <thuth@redhat.com>