77295 Commits

Author SHA1 Message Date
Jason Ekstrand
4e22cd2e32 nir/spirv: Add support for switch statements 2015-12-29 12:50:31 -08:00
Jason Ekstrand
cf555dc1c2 nir/spirv: A couple simple loop fixes 2015-12-29 12:50:31 -08:00
Jason Ekstrand
303d095f58 nir/spirv: Add an actual CFG data structure
The current data structure doesn't handle much that we couldn't handle
before.  However, this will be absolutely crucial for doing switch
statements.  Also, this should fix structured continues.
2015-12-29 12:50:31 -08:00
Kristian Høgsberg Kristensen
55ca5b0e74 mesa/st: Pad out _mesa_sysval_to_semantic for new SYSTEM_VALUE_* enums
GL_ARB_shader_draw_parameters added two new system values.  This gets us
back to mapping mesa system values to the right TGSI semantics.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-12-29 12:15:01 -08:00
Ilia Mirkin
724134f683 nv50/ir: float(s32 & 0xff) = float(u8), not s8
Make sure to make conversion unsigned when we're ANDing the high bits
away. Fixes corruption in dolphin.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
2015-12-29 15:08:20 -05:00
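
The distinction in the subject line above can be checked in plain C. This is a minimal sketch, not the nv50 compiler code; both helper names are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Correct reading: after masking with 0xff the value lies in 0..255,
 * so it must be converted to float as an unsigned byte. */
static float mask_to_float_u8(int32_t x)
{
    return (float)(uint8_t)(x & 0xff);
}

/* Incorrect reading: treating the masked byte as s8 sign-extends every
 * value >= 0x80. The narrowing conversion to int8_t is
 * implementation-defined in C; common compilers wrap it (200 -> -56). */
static float mask_to_float_s8(int32_t x)
{
    return (float)(int8_t)(uint8_t)(x & 0xff);
}
```

For an input whose low byte is 0xc8, the first helper yields 200.0 while the second yields -56.0 on typical compilers, which is the kind of mismatch the commit describes.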
Kristian Høgsberg Kristensen
581f81860e i965: Reemit vertex state between indirect multi draws
If we're doing an indirect draw, prims[i].basevertex is always 0 and the
real base vertex value is in the indirect parameter buffer. We try to
avoid flagging BRW_NEW_VERTICES if prims[i].basevertex doesn't change,
which then breaks down for indirect draws. Thus, if a program uses base
vertex or base instance, and the draw call is indirect, always flag
BRW_NEW_VERTICES.  A new piglit test,
spec/ARB_shader_draw_parameters/drawid-indirect-vertexid tests this.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen
f9283f2668 nir: Teach nir_opt_algebraic about adding and subtracting the same thing
This optimizes a + b - b to just a. Modest shader-db results (BDW):

  total instructions in shared programs: 7842452 -> 7841862 (-0.01%)
  instructions in affected programs:     61938 -> 61348 (-0.95%)
  total loops in shared programs:        2131 -> 2131 (0.00%)
  helped:                                263
  HURT:                                  0
  GAINED:                                0
  LOST:                                  0

but the optimization turns

  gl_VertexID - gl_BaseVertexARB

into just a reference to SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, which the
i965 hardware supports natively. That means we can avoid using the
internal vertex buffer for gl_BaseVertexARB in this case.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-12-29 10:39:25 -08:00
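
Why a + b - b can be folded to a for integers can be sketched in C. This assumes gl_VertexID is lowered to the zero-based vertex ID plus the base vertex, which the message implies but does not spell out; the function names are hypothetical and this is not the nir_opt_algebraic rule itself.

```c
#include <stdint.h>

/* Assumed lowering: gl_VertexID = zero-based vertex ID + base vertex. */
static uint32_t vertex_id(uint32_t vertex_id_zero_base, uint32_t base_vertex)
{
    return vertex_id_zero_base + base_vertex;
}

/* The shader expression gl_VertexID - gl_BaseVertexARB, left unoptimized.
 * Unsigned arithmetic wraps modulo 2^32, so (a + b) - b == a for every
 * input, including ones where the addition overflows. */
static uint32_t vertex_id_minus_base(uint32_t vertex_id_zero_base,
                                     uint32_t base_vertex)
{
    return vertex_id(vertex_id_zero_base, base_vertex) - base_vertex;
}
```

Because the result always equals the zero-based ID, the add and subtract can be dropped and the base vertex never needs to be fetched for this expression.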
Kristian Høgsberg Kristensen
cddfc2cefa i965: Add support for gl_DrawIDARB and enable extension
We have to break open a new vec4 for gl_DrawIDARB. We've used up all
space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its
own separate vertex buffer anyway.  This is because we point the vb for
base vertex and base instance into the draw parameter BO for indirect
draw calls, but the draw id is generated by mesa in a different buffer.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen
17ebb55a14 i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB
We already have gl_BaseVertexARB in the .x component of the SGVS vec4
and plug gl_BaseInstanceARB into the last free component (.y).

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen
b70616f3e7 i965: Assert that SYSTEM_VALUE_VERTEX_ID gets lowered
fs_visitor::emit_vs_system_value() looks like it's trying to handle
SYSTEM_VALUE_VERTEX_ID, but we should never see that value in the
backend.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen
1a59aeaebd mesa: Add core mesa support for GL_ARB_shader_draw_parameters
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen
42dd2c028d mesa/vbo: Add draw_id field to struct _mesa_prim
The drivers will need this for passing in gl_DrawIDARB. For indirect
multidraw calls, we get the whole prim array, so prim[i].draw_id == i and
the field is redundant. But for non-indirect calls, we get one primitive
at a time and need the draw_id field.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-12-29 10:39:25 -08:00
Aaron Watry
70d8dbc9a1 nir: Remove function overload in control flow test
Fixes make check.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-12-29 09:42:14 -08:00
Jason Ekstrand
bbf99511d0 gen7/8/pipeline: s/vb_used/elements in emit_vertex_input 2015-12-29 09:40:22 -08:00
Nicolai Hähnle
7b8db37abb radeonsi: add RADEON_REPLACE_SHADERS debug option
This option allows replacing a single shader by a pre-compiled ELF object
as generated by LLVM's llc, for example. This can be useful for debugging a
deterministically occurring error in shaders (and has in fact helped find
the causes of https://bugs.freedesktop.org/show_bug.cgi?id=93264).

v2: drop the debug flag, use DEBUG_GET_ONCE_OPTION instead

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-12-29 09:07:04 -05:00
Nicolai Hähnle
7d1fc2cf51 radeonsi: count compilations in si_compile_llvm
This changes the count slightly (because of si_generate_gs_copy_shader), but
this is only relevant for the driver-specific num-compilations query. It sets
the stage for the next commit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-12-29 09:07:01 -05:00
Nicolai Hähnle
4711170239 gallium/util: add DEBUG_GET_ONCE_OPTION
This is analogous to the already existing macros for BOOL, NUM, and FLAGS.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-12-29 09:06:57 -05:00
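
The "get once" pattern behind such a macro can be sketched as follows. This is a standalone illustration built on getenv, not the gallium implementation; the macro name GET_ONCE_STRING_OPTION, the generated function name, and the environment variable are all hypothetical.

```c
#include <stdlib.h>

/* Defines get_option_<suffix>(), which reads the environment variable
 * `name` on the first call only and returns the cached result afterwards.
 * Not thread-safe: the static flag is unsynchronized. */
#define GET_ONCE_STRING_OPTION(suffix, name, dfault)        \
static const char *get_option_##suffix(void)                \
{                                                           \
    static int first = 1;                                   \
    static const char *value;                               \
    if (first) {                                            \
        const char *env = getenv(name);                     \
        value = env ? env : (dfault);                       \
        first = 0;                                          \
    }                                                       \
    return value;                                           \
}

/* Hypothetical option: a string-valued debug setting, default "none". */
GET_ONCE_STRING_OPTION(replace_shaders, "EXAMPLE_REPLACE_SHADERS", "none")
```

Every call to get_option_replace_shaders() after the first returns the same cached pointer, so later changes to the environment are not observed.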
Grazvydas Ignotas
da0e216e06 r600: fix constant buffer size programming
When buffer size is less than 16, zero ends up being programmed as
size, which prevents the hardware from fetching the correct values.
Fix it by combining shift and align so that the value is always
rounded up.

Cc: "11.1 11.0 10.6" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92229
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2015-12-29 09:05:55 -05:00
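
The failure mode described above, a truncating shift followed by a round-up, can be sketched in C. The unit sizes used here (16-byte elements, 16 elements per hardware unit) are illustrative assumptions, not values taken from the r600 source, and both function names are hypothetical.

```c
#include <stdint.h>

/* Two-step version: the shift truncates first, so any size below
 * 16 bytes becomes 0 before the round-up ever runs. */
static uint32_t units_shift_then_align(uint32_t size_bytes)
{
    uint32_t elems = size_bytes >> 4;   /* 1..15 bytes -> 0 elements */
    return (elems + 15) / 16;           /* rounding up 0 still gives 0 */
}

/* Combined version: a single round-up division over the full unit size,
 * so every non-zero size maps to at least one unit. Sizes within 255 of
 * UINT32_MAX would overflow the addition; not handled in this sketch. */
static uint32_t units_round_up_once(uint32_t size_bytes)
{
    return (size_bytes + 255) / 256;
}
```

With an 8-byte buffer the two-step version programs 0 units while the combined version programs 1.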
Kristian Høgsberg Kristensen
fc03723bcd vk: Fill out buffer surface state when updating descriptor set
We can do this when we update the descriptor set instead of on the
fly.
2015-12-28 21:57:56 -08:00
Kristian Høgsberg Kristensen
a00524a216 vk: Unstub VkSemaphore implementation
There really is nothing to do for us here, at least with the current
kernel interface.
2015-12-28 21:57:56 -08:00
Jason Ekstrand
5fab35d090 gen7/pipeline: Actually use inputs_read from the VS for laying out inputs 2015-12-28 18:21:11 -08:00
Jason Ekstrand
b090f9dce1 gen8/pipeline: Actually use inputs_read from the VS for laying out inputs 2015-12-28 18:21:11 -08:00
Jason Ekstrand
3eb108ef87 anv/meta: Fix the pos_out location for the vertex shader 2015-12-28 18:21:11 -08:00
Jason Ekstrand
b005fd62f9 nir/spirv: Add GLSL.std.450.h
It accidentally got removed during the mass rename.
2015-12-28 15:46:22 -08:00
Jason Ekstrand
9c84b6cce0 anv/device: Set device->info sooner in CreateDevice
anv_block_pool_init calls anv_block_pool_grow which checks
device->info.has_llc to see if it needs to set caching parameters.
If we don't set device->info early enough, this reads an undefined value
which is probably 0 and not what we want on llc platforms.

Found with valgrind.
2015-12-28 13:29:01 -08:00
Jason Ekstrand
763176a3e2 nir/lower_returns: Fix a bug in loop lowering 2015-12-28 13:22:09 -08:00
Kenneth Graunke
dfce9759ab docs: Mark ARB_tessellation_shader as done on all i965 platforms.
We now support all Intel GPUs which can do tessellation.

Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-28 13:17:08 -08:00
Kenneth Graunke
381a89cf2a i965: Enable ARB_tessellation_shader on Gen7-7.5.
We've resolved all the GPU hangs, and everything seems to be working.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-28 13:17:05 -08:00
Kenneth Graunke
bd8ab8dedb i965: Don't set interleave or complete on TCS EOT message.
Setting interleave on the TCS EOT message causes Ivybridge hardware to
GPU hang like crazy.  Individual tests would pass, but running even a
simple test like nop.shader_test in a loop would hang within 1-3 runs.
Adding sleep delays worked around the problem, somehow.

Interleave doesn't make much sense given that we only have one patch
URB handle, not two.  Complete doesn't seem useful either.

There's no reason to actually set those bits.  We were just being lazy.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-28 13:17:03 -08:00
Kenneth Graunke
b7793783b3 i965: Release input URB Handles on Gen7/7.5 when TCS threads finish.
Pre-Broadwell hardware requires us to manually release the ICP Handles
by issuing URB read messages with the "Complete" bit set.  We can do
this in pairs to use fewer URB read messages.

Based heavily on work from Chris Forbes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-28 13:17:00 -08:00
Kenneth Graunke
6ceabb72ea i965: Use proper TCS barrier ID bits for Ivybridge/Baytrail.
Gen7 uses bits 15:12 while Gen7.5+ uses bits 16:13.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-28 13:16:57 -08:00
Kenneth Graunke
5898cbae24 i965: Use proper TCS Instance ID bits for Ivybridge/Baytrail.
Gen7 uses 22:16 while Gen7.5+ uses 23:17.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-28 13:16:54 -08:00
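
A generation-dependent bit range like the one in the Instance ID commit above can be sketched as a field extraction. The bit positions (22:16 on Gen7, 23:17 on Gen7.5+) come from the commit message; the helper names, the boolean parameter, and the idea of passing the raw dword are hypothetical, and this is not the i965 code.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the inclusive bit range high:low from a 32-bit dword. */
static uint32_t get_bits(uint32_t dword, unsigned high, unsigned low)
{
    assert(high >= low && high < 32);
    unsigned width = high - low + 1;
    uint32_t mask = (width == 32) ? 0xffffffffu : ((1u << width) - 1u);
    return (dword >> low) & mask;
}

/* Instance ID: bits 22:16 on Gen7, bits 23:17 on Gen7.5 and later. */
static uint32_t tcs_instance_id(uint32_t dword, int is_gen75_or_later)
{
    return is_gen75_or_later ? get_bits(dword, 23, 17)
                             : get_bits(dword, 22, 16);
}
```

Reading a Gen7.5 payload with the Gen7 range returns a value shifted by one bit, which is why the field position has to follow the hardware generation.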
Kenneth Graunke
1245724f72 i965: Port tessellation evaluation shaders to vec4 mode.
This can be used on Broadwell by setting INTEL_SCALAR_TES=0.
More importantly, it will be used for Ivybridge and Haswell.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-28 13:16:48 -08:00
Kenneth Graunke
889d987904 i965: Emit a real 3DSTATE_DS on Gen7.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-28 13:16:45 -08:00
Kenneth Graunke
2c240b05e9 i965: Emit a real 3DSTATE_HS on Gen7.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-28 13:16:34 -08:00
Kenneth Graunke
74b83fe368 i965: Add the TCS/TES state upload atoms to the gen7_atoms list.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-12-28 13:16:19 -08:00
Jason Ekstrand
7aaed91581 nir/spirv: Move to its own directory 2015-12-28 11:49:39 -08:00
Jason Ekstrand
d5fa51bdee Merge remote-tracking branch 'mesa-public/master' into vulkan
This pulls in the removal of nir_function_overload
2015-12-28 10:56:31 -08:00
Jason Ekstrand
d9dcfafacc nir/spirv: Use nir_build_alu for alu instructions 2015-12-28 10:35:31 -08:00
Jason Ekstrand
237f2f2d8b nir: Get rid of function overloads
When Connor originally drafted NIR, he copied the same function+overload
system that GLSL IR had with a few names changed.  However, this
double-indirection is not really needed and has only served to confuse
people.  Instead, let's just have functions which may not have unique names
and may or may not have an implementation.  If someone wants to do overload
resolving, they can have a hash table based function+overload system in the
overload resolving pass.  There's no good reason to keep it in core NIR.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>

ir3 bits are

Reviewed-by: Rob Clark <robclark@gmail.com>
2015-12-28 09:59:53 -08:00
Jason Ekstrand
ea77b384e8 Merge remote-tracking branch 'mesa-public/master' into vulkan
This pulls in tessellation and the store_var changes that go with it.
2015-12-27 23:23:05 -08:00
Jason Ekstrand
f948767471 nir/lower_returns: Better algorithm as per connor 2015-12-27 22:50:45 -08:00
Jason Ekstrand
3489f66056 nir: Add a cursor helper for getting a cursor after any phi nodes 2015-12-27 22:50:14 -08:00
Ilia Mirkin
109c348284 nvc0: don't forget to reset VTX_TMP bufctx slot after blit completion
Also release the scratch allocation if any.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
2015-12-27 21:33:36 -05:00
Ilia Mirkin
28e07fdd4a nv50,nvc0: add a note when converting vertex elements using CPU
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-12-27 19:49:44 -05:00
Jason Ekstrand
c60456dfaa nir/gather_info: Handle multi-slot variables in io bitfields 2015-12-24 00:47:20 -08:00
Jason Ekstrand
bbebd2de13 nir: Add a helper for getting the bitmask for a variable's location 2015-12-24 00:47:20 -08:00
Jason Ekstrand
4ff4310a78 nir/types: Expose glsl_type::count_attribute_slots() 2015-12-24 00:47:19 -08:00
Jason Ekstrand
0bc1b0fd23 nir/lower_return: Do it for real this time 2015-12-24 00:47:19 -08:00
Jason Ekstrand
e1b1d58bec nir/cf: Make extracting or re-inserting nothing a no-op 2015-12-23 23:46:04 -08:00