anv: set ComputeMode.PixelAsyncComputeThreadLimit = 4

Heuristic-based optimization that throttles CCS work (async compute).
Without throttling, background compute work consumes all threads,
diminishing the performance gained by running dispatches in parallel
with 3D work.

The optimization is heuristic-based, so some workloads might still
slow down when using async compute.

Best value: PixelAsyncComputeThreadLimit = 4. On DG2, this
equates to a max CCS thread occupancy of 37.5%.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25508>
Felix DeGrood
2023-09-13 20:56:59 +00:00
committed by Marge Bot
parent 8ff4847b64
commit b561bcd78c

@@ -654,7 +654,10 @@ init_compute_queue_state(struct anv_queue *queue)
                           ANV_PIPE_HDC_PIPELINE_FLUSH_BIT);
    }
-   anv_batch_emit(&batch, GENX(STATE_COMPUTE_MODE), zero);
+   anv_batch_emit(&batch, GENX(STATE_COMPUTE_MODE), cm) {
+      cm.PixelAsyncComputeThreadLimit = 4;
+      cm.PixelAsyncComputeThreadLimitMask = 0x7;
+   }
 #endif
    init_common_queue_state(queue, &batch);
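
The change sets both a value field and a matching mask field. A minimal sketch of why the two go together, assuming the usual masked-write convention in which only the bits enabled by the mask are updated and the rest keep their previous state. The helper name masked_field_update is hypothetical and is not part of anv or genxml; the hardware applies this logic itself when it parses the command.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration only: models a masked field write.
 * Bits set in `mask` take their value from `new_value`; all other bits
 * keep what was in `old_value`. */
static uint32_t
masked_field_update(uint32_t old_value, uint32_t new_value, uint32_t mask)
{
   /* The new value must fit inside the bits the mask enables. */
   assert((new_value & ~mask) == 0);

   return (old_value & ~mask) | (new_value & mask);
}
```

Under this assumption, a mask of 0x7 covers the value 4 (binary 100), so the whole 3-bit limit is rewritten regardless of its earlier contents, while a mask of 0 leaves the field untouched, which would match the previous all-zero emit not changing the limit.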