ci: Make all job timeouts explicit

Enforce a default job timeout of 1 minute, to make jobs which don't
explicitly specify a timeout insta-fail, rather than potentially hanging
around for an hour.

Container builds get the full hour as they can run long and are not run
in pre-merge context, and LAVA jobs also get the full hour as they have
multiple internal timeout mechanisms which aim to fast-fail jobs once
they actually start. However, as they just queue jobs to an external
host (shared with other projects like KernelCI), these timeouts aren't
reflected into the GitLab CI definitions.
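
As a rough sketch of how the default and the per-job overrides described
above interact in a GitLab CI definition (the job names, the 20m value and
the script below are hypothetical and not part of this change; only the
1m default and the 1h overrides come from the commit itself):

```yaml
default:
  timeout: 1m            # applies to every job that sets no timeout of its own

.example-long-job:       # hypothetical template, for illustration only
  timeout: 1h            # a job-level timeout takes precedence over the default

example-test:            # hypothetical job
  extends: .example-long-job
  script:
    - ./run-tests.sh     # hypothetical script

example-forgotten:       # hypothetical job with no explicit timeout
  script:
    - ./run-tests.sh     # would be killed after the 1m default
```

With this layout, a job that forgets to declare a timeout fails almost
immediately instead of quietly inheriting a long limit.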

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34280>
Daniel Stone
2025-03-29 17:52:10 +00:00
committed by Marge Bot
parent 3fee0ef129
commit 3b6a40af53
3 changed files with 15 additions and 0 deletions

@@ -124,6 +124,7 @@ variables:
   DATA_STORAGE_PATH: data_storage
 default:
+  timeout: 1m # catch any jobs which don't specify a timeout
   id_tokens:
     S3_JWT:
       aud: https://s3.freedesktop.org

@@ -52,6 +52,7 @@
 .container:
   stage: container
+  timeout: 1h
   extends:
     - .container+build-rules
     - .incorporate-templates-commit

@@ -5,6 +5,19 @@ variables:
 .lava-test:
   # Cancel job if a newer commit is pushed to the same branch
   interruptible: true
+  # The jobs themselves shouldn't actually run for an hour, of course.
+  # Jobs are picked up greedily by a GitLab CI runner which is deliberately
+  # overprovisioned compared to the number of available devices. They are
+  # submitted to the LAVA co-ordinator with a job priority which gives
+  # pre-merge priority over everyone else. User-submitted and nightly jobs
+  # can thus spend ages just waiting around in a queue to be run at some
+  # point as they get pre-empted by other things.
+  # Non-queue time has strict timeouts for each stage, e.g. for downloading
+  # the artifacts, booting the device, device setup, running the tests, etc,
+  # which is handled by LAVA itself.
+  # So the only reason we should see anyone bouncing off this timeout is due
+  # to a lack of available devices to run the jobs.
+  timeout: 1h
   variables:
     GIT_STRATEGY: none # testing doesn't build anything from source
     FDO_CI_CONCURRENT: 6 # should be replaced by per-machine definitions