Sporadically a6xx gpu will fail to recover causing the lava job
a660_vk_full to loop on error messages for three hours before timing
out.
A few sporadic error messages may still be recoverable, but when multiple
errors occur over a short period, successful recovery is unlikely. Parse
the logs to look for repeated error messages within a short time period.
If found, cancel the lava job and rerun it.
Also add unit tests for this behaviour.
cc: mesa-stable
Reported-by: Valentine Burley <valentine.burley@gmail.com>
Acked-by: Daniel Stone <daniel.stone@collabora.com>
Reviewed-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Signed-off-by: Deborah Brouwer <deborah.brouwer@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30032>