Commit Graph

24 Commits

Author SHA1 Message Date
Eric Anholt
f49d112a01 v3d: Implement ALPHA_TO_COVERAGE.
There's a convenient "FTOC" instruction for generating the coverage now,
unlike vc4.  This fixes
dEQP-GLES3.functional.multisample.fbo_4_samples.proportionality_alpha_to_coverage
2018-06-20 09:30:46 -07:00
Eric Anholt
e130ada243 v3d: Fix shaders using pixel center W but no varyings.
The docs called this field "uses both center W and centroid W", but
actually it's "do you need center W even if varyings don't obviously call
for it?"

Fixes dEQP-GLES3.functional.shaders.builtin_variable.fragcoord_w
2018-06-15 16:09:39 -07:00
Eric Anholt
76ee9edcb4 broadcom/vc5: Add support for centroid varyings.
It would be nice to share the flags packet emit logic with flat shade
flags, but I couldn't come up with a good way while still using our pack
macros.  We need to refactor this to shader record setup at compile time,
anyway.

Fixes ext_framebuffer_multisample-interpolation * centroid-*
2018-04-26 11:30:22 -07:00
Eric Anholt
d7a015cbc6 broadcom/vc5: Account for InstanceID/VertexID in VPM segment size.
Fixes failure in
GTF-GLES3.gtf.GL3Tests.draw_instanced.draw_instanced_attrib_size
2018-03-22 15:12:21 -07:00
Eric Anholt
baeb6a4b4a broadcom/vc5: Fix up the NIR types of FS outputs generated by NIR-to-TGSI.
Unfortunately TGSI doesn't record the type of the FS output like GLSL
does, but VC5's TLB writes depend on the output's base type.  Just record
the type in the key at variant compile time when we've got a TGSI input
and then fix it up.

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba32i/ui and apparently a
GPU hang that breaks most tests that come after it.
2018-03-21 14:02:34 -07:00
Eric Anholt
00910e3057 broadcom/vc5: Don't annotate dumps with stale live intervals.
As you're debugging register allocation, you may have changed the
intervals and not recomputed yet.  Just skip the dump in that case.
2018-03-19 16:44:20 -07:00
Eric Anholt
facc3c6f58 broadcom/vc5: Add support for register spilling.
Our register spilling support is nice to have since vc4 couldn't at all,
but we're still very restricted due to needing to not spill during a TMU
operation, or during the last segment of the program (which would be nice
to spill a value of, when there's a long-lived value being passed through
with little modification from the start to the end).

We could do better by emitting unspills for the last-segment values just
before the last thrsw, since the last segment is probably not the maximum
interference area.

Fixes GTF uniform_buffer_object_arrays_of_all_valid_basic_types and 3
others.
2018-03-19 16:44:06 -07:00
Eric Anholt
271fc58ba1 broadcom/vc5: Remove redundant last_inst lookup.
The point was to get the MOV, which the MOV_dest already returned.
2018-03-19 16:42:59 -07:00
Eric Anholt
d721348dcd broadcom/vc5: Add cursors to the compiler infrastructure, like NIR's.
This will let me do lowering late in compilation using the same
instruction builder as we use in nir_to_vir.
2018-03-19 16:42:59 -07:00
Eric Anholt
368bab43fd broadcom/vc5: Add support for loading varyings in V3D 4.1.
The LDVARY signal now writes an arbitrary register, so I took out the
magic src register file and replaced it with an instruction with LDVARY
set so we have somewhere to hang a QFILE_TEMP destination for register
allocation.
2018-01-12 21:57:21 -08:00
Eric Anholt
5aaea3c4a0 broadcom/vc5: Add compiler support for V3D 4.x texturing. 2018-01-12 21:56:57 -08:00
Eric Anholt
90269ba353 broadcom/vc5: Use THRSW to enable multi-threaded shaders.
This is a major performance boost on all of V3D, but is required on V3D
4.x where shaders are always either 2- or 4-threaded.
2018-01-12 21:55:30 -08:00
Eric Anholt
a075bb6726 broadcom/vc5: Implement GFXH-1684 workaround.
Apparently the VPM writes need to be flushed out before we end the shader.
2018-01-12 21:55:15 -08:00
Eric Anholt
22a02f3e34 broadcom/vc5: Use the new LDVPM/STVPM opcodes on V3D 4.1.
Now, instead of a magic write register for VPM stores we have an
instruction to do them (which means no packing of other ALU ops into it),
with the ability to reorder the VPM stores due to the offset being baked
into the instruction.

VPM loads also gain the ability to be reordered by packing the row into
the A argument.  They also no longer write to the r3 accumulator, and
instead must be stored to a physical register.
2018-01-12 21:54:33 -08:00
Eric Anholt
dfee62eed3 broadcom/vc5: Add support for V3Dv4 signal bits.
The WRTMUC replaces the implicit uniform loads in the first two texture
instructions.  LDVPM disappears in favor of an ALU op.  LDVARY, LDTMU,
LDTLB, and LDUNIF*RF now write to arbitrary registers, which required
passing the devinfo through to a few more functions.
2018-01-12 21:53:45 -08:00
Eric Anholt
8e5a0ed953 broadcom/vc5: Emit flat shade flags for varying components > 24.
This means that with no flatshading we'll emit the single-byte
ZERO_ALL_FLAT_SHADE_FLAGS, and otherwise emit a set of FLAT_SHADE_FLAGS to
get all the bits we need set.

There's a _SET enum in the packet we could use to possibly set entire
ranges of the bitfield without using another packet, but this at least
fixes the conformance failure.
2018-01-03 14:25:23 -08:00
Eric Anholt
2056e4a777 broadcom/vc5: Emit proper flatshading code for glShadeModel(GL_FLAT).
In updating the simulator, behavior changed slightly so that our old code
wasn't getting glxgears's flatshading interpolated right.  Emit flat
shading code just like we would for a normal flat-shaded varying, by
passing a flag in the shader key for glShadeModel(GL_FLAT) state and
customizing the color inputs based on that.
2018-01-03 14:25:23 -08:00
Eric Anholt
1171f1749d broadcom/vc5: Enable NIR txd lowering on all txd instructions.
Fixes almost all of piglit's arb_shader_texture_lod grad tests, except for
the base -texgrad/texgradcube ones which fail on what appear to be
precision problems.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-12-14 14:36:17 -08:00
Eric Anholt
e717e3e7cd broadcom/vc5: Add lowering for txf_ms to a txf on a 2x2-scaled texture.
The HW has no native sampler support for multisample textures, but since
we only need to support txf_ms and the layout is UIF, we just need to
scale up the texcoords and then add in the sample.

This drops the old TEXTURE_MSAA_ADDR special uniform, since we're treating
MSAA textures as textures, rather than basically texbos like VC4 had to.
2017-10-30 13:31:27 -07:00
Eric Anholt
eecdbaa985 broadcom/vc5: Add PIPE_TEX_WRAP_CLAMP support for linear-filtered textures.
I already had the texture's wrapping set up to use different behavior for
nearest or linear, so we just needed to saturate the coordinates in linear
mode to get the "proper" blend between the edge and border values.
2017-10-30 13:31:16 -07:00
Jason Ekstrand
59fb59ad54 nir: Get rid of nir_shader::stage
It's redundant with nir_shader::info::stage.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-20 12:49:17 -07:00
Eric Anholt
361c5f28bd broadcom/vc5: Fix handling of interp qualifiers on builtin color inputs.
The interpolation qualifier, if specified, is supposed to take precedence
over glShadeModel().
2017-10-10 11:42:05 -07:00
Eric Anholt
732a3a72cb broadcom/compiler: Set up passthrough Z when doing FS discards.
In order to keep early-Z from writing early in a discard shader, you need
to set the "modifies Z" bit in the shader state (which the new
prog_data.discards will indicate).  Then, in the shader we do a TLB write
to make Z passthrough happen (the QPU result is ignored, so we use a NULL
source).
2017-10-10 11:42:05 -07:00
Eric Anholt
ade416d023 broadcom: Add VC5 NIR compiler.
This is a pretty straightforward fork of VC4's NIR compiler to VC5.  The
condition codes, registers, and I/O have all changed, making the backend
hard to share, though their heritage is still recognizable.

v2: Move to src/broadcom/compiler to match intel's layout, rename more
    "vc5" to "v3d", rename QIR to VIR ("V3D IR") to avoid symbol conflicts
    with vc4, use new v3d_debug header, add compiler init/free functions,
    do texture swizzling in NIR to allow optimization.
2017-10-10 11:42:04 -07:00