Age | Commit message (Collapse) | Author |
|
|
|
Only option we have.
|
|
We'll need to force a temporary and mark it as precise.
MSL is a little weird here, but we can piggyback on top of the invariant
float math option here to force fma() operations everywhere.
|
|
Remove all shenanigans with propagation, and only consume nonuniform
qualifiers exactly where needed (last minute).
|
|
|
|
|
|
This was completely unimplemented for some reason.
|
|
|
|
|
|
|
|
|
|
This can make codegen more predictable since ByteAddressBuffer is SRV
and not UAV.
|
|
|
|
|
|
Need to bitcast the unrolled expressions as well.
|
|
This was straightforward to implement in GLSL. The
`ShadingRateInterlockOrderedEXT` and `ShadingRateInterlockUnorderedEXT`
modes aren't implemented yet, because we don't support
`SPV_NV_shading_rate` or `SPV_EXT_fragment_invocation_density` yet.
HLSL and MSL were more interesting. They don't support this directly,
but they do support marking resources as "rasterizer ordered," which
does roughly the same thing. So this implementation scans all accesses
inside the critical section and marks all storage resources found
therein as rasterizer ordered. They also don't support the fine-grained
controls on pixel- vs. sample-level interlock and disabling ordering
guarantees that GLSL and SPIR-V do, but that's OK. "Unordered" here
merely means the order is undefined; that it just so happens to be the
same as rasterizer order is immaterial. As for pixel- vs. sample-level
interlock, Vulkan explicitly states:
> With sample shading enabled, [the `PixelInterlockOrderedEXT` and
> `PixelInterlockUnorderedEXT`] execution modes are treated like
> `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`
> respectively.
and:
> If [the `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`]
> execution modes are used in single-sample mode they are treated like
> `PixelInterlockOrderedEXT` or `PixelInterlockUnorderedEXT`
> respectively.
So this will DTRT for MoltenVK and gfx-rs, at least.
MSL additionally supports multiple raster order groups; resources that
are not accessed together can be placed in different ROGs to allow them
to be synchronized separately. A more sophisticated analysis might be
able to place resources optimally, but that's outside the scope of this
change. For now, we assign all resources to group 0, which should do for
our purposes.
`glslang` doesn't support the `RasterizerOrdered` UAVs this
implementation produces for HLSL, so the test case needs `fxc.exe`.
It also insists on GLSL 4.50 for `GL_ARB_fragment_shader_interlock`,
even though the spec says it needs either 4.20 or
`GL_ARB_shader_image_load_store`; and it doesn't support the
`GL_NV_fragment_shader_interlock` extension at all. So I haven't been
able to test those code paths.
Fixes #1002.
|
|
This extension provides a new operation which causes a fragment to be
discarded without terminating the fragment shader invocation. The
invocation for the discarded fragment becomes a helper invocation, so
that derivatives will remain defined. The old `HelperInvocation` builtin
becomes undefined when this occurs, so a second new instruction queries
the current helper invocation status.
This is only fully supported for GLSL. HLSL doesn't support the
`IsHelperInvocation` operation and MSL doesn't support the
`DemoteToHelperInvocation` op.
Fixes #1052.
|
|
Fix fallout from changes.
There's a bug in glslang that prevents `float16_t`, `[u]int16_t`, and
`[u]int8_t` constants from adding the corresponding SPIR-V capabilities.
SPIRV-Tools, meanwhile, tightened validation so that these constants are
only valid if the corresponding `Float16`, `Int16`, and `Int8` caps are
on. This affects the `16bit-constants.frag` test for GLSL and MSL.
|
|
There is a case where we can deduce a for/while loop, but the continue
block is actually very painful to deal with, so handle that case as
well. Removes an exceptional case.
|
|
|
|
|
|
HLSL and MSL don't support it, so fall back to simpler intrinsics.
|
|
|
|
Facilitates easier mapping from source language to cross-compiled output
in tooling.
|
|
MSL does not seem to have a qualifier for this, but HLSL SM 5.1 does.
glslangValidator for HLSL does not support this, so skip any validation,
but it passes in FXC.
|
|
|
|
Need to emulate these calls for correctness.
|
|
depth2d in MSL only returns float, not float4, even for normal sampling.
We need to conditionally remap-swizzle back to float4.
|
|
|
|
Fairly common pattern in unoptimized SPIR-V. Support this case as well.
|
|
Fix various bugs along the way.
|
|
|
|
|
|
Use correct block-name / other-name aliasing rules.
|
|
A block name cannot alias with any name in its own scope,
and it cannot alias with any other "global" name.
To solve this, we need to complicate the name cache updates a little bit
where we have a "primary" namespace and "secondary" namespace.
|
|
This is required to avoid relying on complex sub-expression elimination
in compilers, and generates cleaner code.
The problem case is if a complex expression is used in an access chain,
like:
Composite comp = buffer[texture(...)];
vec4 a = comp.a + comp.b + comp.c;
Before, we did not have common subexpression tracking for
OpLoad/OpAccessChain, so we easily ended up with code like:
vec4 a = buffer[texture(...)].a + buffer[texture(...)].b + buffer[texture(...)].c;
A good compiler will optimize this, but we should not rely on it, and
forcing texture(...) to a temporary also looks better.
The solution is to add a vector "implied_expression_reads", which works
similarly to expression_dependencies. We also need an extra mechanism in
to_expression which lets us skip expression read checking and do it
later. E.g. for expr -> access chain -> load, we should only trigger
a read of expr when using the loaded expression.
|
|
|
|
Adds support on HLSL SM 5.0, and fixes bug on GLSL.
Makes sure early fragment tests is tested on MSL as well.
|
|
|
|
When trying to validate buffer sizes, we usually need to bail out when
using SpecConstantOps, but for some very specific cases where we allow
unsized arrays currently, we can safely allow "unknown" sized arrays as
well.
This is probably the best we can do, when we have even more difficult
cases than this, we throw a more sensible error message.
|
|
|
|
- Add new Windows support
- Use CMake/CTest instead of Make + shell scripts
- Use --parallel in CTest
- Fix CTest on Windows
- Cleanups in test_shaders.py
- Force specific commit for SPIRV-Headers
- Fix Inf/NaN odd-ball case by moving to ASM
|
|
A lot of changes in spirv-opt output.
Some new invalid SPIR-V was found but most of them were not significant
for SPIRV-Cross, so just marked them as invalid.
|
|
|
|
|
|
|
|
|
|
When the name of an alias global variable collides with a global
declaration, MSL would emit inconsistent names, sometimes with the
naming fix, sometimes without, because names were being tracked in two
separate meta blocks. Fix this by always redirecting parameter naming to
the original base variable as necessary.
|
|
Just apply the offset directly, MSL has no immediate offset parameter.
|
|
MSL would force thread const& which would not work if the input argument
came from a different storage class.
Emit proper non-reference arguments for such values.
|