Commit Graph

1448 Commits

Author SHA1 Message Date
Kenneth Graunke
a62fe34098 i965/vec4: Actually handle atomic op intrinsics.
Embarassingly, someone enabled the ARB_shader_atomic_counter_ops
extension for Gen7+ but never added the intrinsics to the switch
statement in the vec4 backend, so they just hit an unreachable()
call and died.

Fixes: 40dd45d0c6 (i965: Enable ARB_shader_atomic_counter_ops)
Cc: "17.2 17.1 17.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-09-26 15:35:06 -07:00
Timothy Arceri
49e4248a93 i965/nir: export nir_optimize
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-09-26 22:37:02 +10:00
Kenneth Graunke
c9fbe772ba i965: Handle unwritten PSIZ/VIEWPORT/LAYER outputs in vec4 shaders.
This can occur if the shader is capturing some of the values from the
VUE header for transform feedback, but the shader hasn't written all of
them.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-09-21 09:39:27 -07:00
Jason Ekstrand
d496780fb2 intel/eu/validate: Look up types on demand in execution_type()
We are looking up the execution type prior to checking how many sources
we have.  This leads to looking for a type for src1 on MOV instructions
which is bogus.  On BDW+, the src1 register type overlaps with the
64-bit immediate and causes us problems.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-09-12 15:01:00 -07:00
Matt Turner
dff75c7175 i965: Drop unnecessary conditional
Clang doesn't realize that 0 and 1 are the only possibilities, a thinks
lots of variables might be uninitialized.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Topi Pohjolainen
5dd072380a intel/compiler: Cast reg types explicitly
Makes coverity happier.

CID: 1416799
Fixes: c1ac1a3d25 (i965: Add a brw_hw_type_to_reg_type() function)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-28 14:43:39 +03:00
Jason Ekstrand
95f533d922 anv,i965: Move CS shared lowering into anv
Right now, OpenGL uses the GLSL lowering for shared variables and anv
uses NIR to lower them.  For a long time, we've done this weird thing
where we do the NIR lowering unconditionally and then add the SLM sizes
from the two together.  This works because one of them will always be 0
but it's a bit sketchy.  Let's just move the NIR-based lowering into
anv_pipeline and get rid of the sketch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-24 16:34:29 -07:00
Kenneth Graunke
4ffa9f3635 i965: Stop using wm_prog_data->binding_table.render_target_start.
Render target surfaces always start at binding table index 0.
This is required for us to use headerless FB writes, which we
really want to do.  So, we'll never change that.

Given that, it's not necessary to look up a wm_prog_data field
which we already know contains 0.  We can drop the dependency in
brw_renderbuffer_surfaces (Gen4-5)...which was already confusingly
missing from gen6_renderbuffer_surfaces.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
274afad4cd i965: Add a brw_wm_prog_data::has_render_target_reads field.
State upload code should use prog_data rather than poking at shader_info
directly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-23 11:55:17 -07:00
Matt Turner
d37d9f84ac i965: Mark functions static
Cuts 300 bytes of .text

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-21 14:45:44 -07:00
Matt Turner
f30902629c i965/vec4: Use 'class' src_reg, rather than 'struct' src_reg
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-21 14:45:44 -07:00
Matt Turner
a77d5b28ac i965/vec4: Return float from spill_cost_for_type()
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-08-21 14:45:44 -07:00
Matt Turner
a98b1a8922 i965: Optimize reading the destination type
brw_hw_type_to_reg_type() needs to know only whether the file is
BRW_IMMEDIATE_VALUE or not, which is not a valid file for the
destination. gcc and clang will evaluate __builtin_strcmp() at compile
time, so we can use it to pass a constant file for the destination.

   text	   data	    bss	    dec	    hex	filename
7816214	 346248	 420496	8582958	 82f72e	i965_dri.so before
7816070	 346248	 420496	8582814	 82f69e	i965_dri.so after

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
91ef949054 i965: Mark brw_hw_type_to_reg_type() as a pure function
text	   data	    bss	    dec	    hex	filename
7816886	 346248	 420496	8583630	 82f9ce	i965_dri.so before
7816214	 346248	 420496	8582958	 82f72e	i965_dri.so after

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
e07fe89035 i965: Hide the register type hardware encodings
So we stop mixing them with the logical enum.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
4fab67a441 i965: Stop using hardware register types directly
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
c746f1c888 i965: Add brw_hw_reg_type_to_letters() and use it in brw_disasm.c
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
6a2471b501 i965: Move brw_reg_type_letters() as well
And add "to_" to the name for consistency with the other functions in
this file.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
1cb0a7941b i965: Switch to using the logical register types
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
cb2cd462b1 i965: Add functions to abstract access to register types
Previously the brw_inst{,_set}_{dst,src0,src1}_reg_type() functions
provided access to the hardware encodings for the register types. We
often mixed these with the logical BRW_REGISTER_TYPE_* enums (which
themselves used to be the hardware format!) with bad results.

With that functionality now available with the hw_ versions (see
previous commit), we now add functions that take the logical
BRW_REGISTER_TYPE_* enums and convert into the hardware format and vice
versa. To do the conversion we also have to provide the file.

Note the asymmetry between the two functions: the new getter reads the
file from the instruction word, and to ensure that is always set the
setter writes both the file and the type.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
9fb8323328 i965: Rename brw_inst's functions that access the register type
Put hw_ in the name so that it's clear these are the hardware encodings.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
3e379af492 i965: Index brw_hw_reg_type_to_size()'s table by logical type
I'll be transitioning everything to use the logical types.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
c1ac1a3d25 i965: Add a brw_hw_type_to_reg_type() function
Will be used in later commits.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
dbe7dd13dd i965: Use a common table to translate logical to hardware types
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
bfcc9aa829 i965: Extract functions dealing with register types to separate file
I'm going to encapsulate all of the logic dealing with register types in
this file.

Rename the parameters for the hardware encodings from type -> hw_type at
the same time.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
890f863da0 i965: Reverse file/type arguments to register type functions
I think of the initial arguments as "state" and the last as the actual
subject.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
92f787ff86 i965: Add support for disassembling 64-bit integer immediates
After the last patch converted things into enums, I helpfully got a
compiler warning about these missing from the switch statement.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
deae25ce37 i965: Use separate enums for register vs immediate types
The hardware encodings often mean different things depending on whether
the source is an immediate.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
8815b9677f i965: Reorder brw_reg_type enum values
These vaguely corresponded to the hardware encodings, but that is purely
historical at this point. Reorder them so we stop making things "almost
work" when mixing enums.

The ordering has been closen so that no enum value is the same as a
compatible hardware encoding.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
ce6b8627d8 i965: Validate destination restrictions with vector immediates
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
1d79c828d8 i965: Don't let raw-move check be tricked by immediate vector types
UB and B type encodings are the same as UV and VF. Noticed when writing
the following patch.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
48aa6ecb87 i965: Only change type of 0.0f to VF if destination stride == 1
The destination stride must be equivalent to a dword if VF is used.

Also, since the only compaction table entires with "i:vf" have the
destination as "r:f" specifically check that the destination is of type
float.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
56a676eed2 i965: Remove CONT/BREAK from instruction compaction test
These cannot be compacted. A similar mistake was fixed in commit
90eaf01616

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
3d661e6062 i965: Test instruction compaction on all supported Gens
Note that there's no point in testing on G45, since its compaction is
the same as Gen5. Same logic applies to Gen7 variants and low-power
parts.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
9ff7d9b853 i965: Silence signed/unsigned comparison warning
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
eac89911e5 i965: Move compaction "prepass" into brw_eu_compact.c
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
17641f6388 i965: Mark src inst pointer const in compaction code
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Iago Toral Quiroga
81615ad444 intel/compiler: properly size attribute wa_flags array for Vulkan
Mesa will map user defined vertex input attributes to slots
starting at VERT_ATTRIB_GENERIC0 which gives us room for only 16
slots (up to GL_VERT_ATTRIB_MAX). This sufficient for GL, where
we expose exactly 16 vertex attributes for user defined inputs, but
in Vulkan we can expose up to 28 (which are also mapped from
VERT_ATTRIB_GENERIC0 onwards) so we need to account for this when
we scope the size of the array of attribute workaround flags
that is used during the brw_vertex_workarounds NIR pass. This
prevents out-of-bounds accesses in that array for NIR shaders
that use more than 16 vertex input attributes.

Fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.*

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-11 10:41:44 +02:00
Dave Airlie
271fa3a684 intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed.
If dual object compile fails (as seems to happen with virgl a
fair bit, and does piglit even have any tests for it?), we end up
not restarting the pull params, so we call
vec4_visitor::move_uniform_array_access_to_pull_constant
a second time and it runs over the ends of the alloc.

Fixes: tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test
running inside virgl on ivybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-03 16:54:08 +10:00
Matt Turner
858f554078 i965: Fix indentation 2017-08-02 16:49:32 -07:00
Kenneth Graunke
30d6bc470a i965: Set lower_vote_trivial in vector_nir_options_gen6 too.
There's a second struct for Gen6+.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-21 18:09:01 -07:00
Matt Turner
069bf7c907 i965/fs: Match destination type to size for ballot
No use in taking a 64-bit value when we know the high 32-bits are zero.
2017-07-20 16:56:50 -07:00
Matt Turner
1038d385a9 nir: Reduce destination size of ballot intrinsic when possible
Some hardware, like i965, doesn't support group sizes greater than 32.
In that case, we can reduce the destination size of the ballot
intrinsic, which will simplify our code generation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
782ef30451 i965/fs: Implement ARB_shader_ballot operations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
8238930510 i965/fs: Do not move MOVs writing the flag outside of control flow
The implementation of ballotARB() will start by zeroing the flags
register. So, a doing something like

        if (gl_SubGroupInvocationARB % 2u == 0u) {
                ... = ballotARB(true);
		[...]
        } else {
                ... = ballotARB(true);
		[...]
	}

(like fs-ballot-if-else.shader_test does) would generate identical MOVs
to the same destination (the flag register!), and we definitely do not
want to pull that out of the control flow.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Francisco Jerez
f1b7c47913 i965/fs: Handle explicit flag sources in flags_read()
The implementations of the ARB_shader_ballot intrinsics will explicitly
read the flag as a source register.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-20 16:56:49 -07:00
Matt Turner
43ef75b394 nir: Add system values from ARB_shader_ballot
We already had a channel_num system value, which I'm renaming to
subgroup_invocation to match the rest of the new system values.

Note that while ballotARB(true) will return zeros in the high 32-bits on
systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB
variables do not consider whether channels are enabled. See issue (1) of
ARB_shader_ballot.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
ee9fa4ac18 i965/fs: Implement ARB_shader_group_vote operations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Francisco Jerez
93dc736f4e i965/fs: Handle explicit flag destinations in flags_written()
The implementations of the ARB_shader_group_vote intrinsics will
explicitly write the flag as the destination register.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-20 16:56:49 -07:00
Matt Turner
30b72f4126 i965/vec4: Lower ARB_shader_group_vote intrinsics
I don't expect anyone is going to care about using this in vec4 programs
(vertex/tessellation/geometry on Gen6/7), no one has come up with a good
way to implement it much less test it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00