Age | Commit message (Collapse) | Author |
|
The memory leak is noticeable when using custom bone shapes. When using custom
bone shapes objects could be extracted twice. Where the second extraction can
overwrite data created by the first extraction what causes the memory leak.
Options that have been checked:
1. Use two task graphs phases. One for normal extraction (DST.task_graph) and
the other one will handle extractions that require blocking threads.
2. Keep a list of all objects that needs extraction and only start extraction
when all objects have been populated.
The second would slow performance as the extraction only happens when all
objects have been populated. In the future we might want to go for the second
option when we have the capability to render multiple viewports with a single
populate. As this design isn't clear this patch will implement the first
option.
Reviewed By: Clément Foucault
Differential Revision: https://developer.blender.org/D7969
|
|
|
|
This has been replace by manual model+viewproj transform inside the shader.
|
|
This cleanup use the recent changes in shader interface to allow querying
the binding location a texture should use.
This should aleviate all issue we have with texture state change recompiling
the shaders at drawtime.
All binds are now treated like persistent binds and will stick until a new
shading group bind a different shader. The only difference is that you can
still change it with a new subgroup or same shader shgroup.
Since unbinding can be heavy we only do it when using `--debug-gpu`.
|
|
These are the modifications:
-With DRW modification we reduce the number of passes we need to populate.
-Rename passes for consistent naming.
-Reduce complexity in code compilation
-Cleanup how renderpass accumulation passes are setup, using pass instances.
-Make sculpt mode compatible with shadows
-Make hair passes compatible with SSS
-Error shader and lookdev materials now use standalone materials.
-Support default shader (world and material) using a default nodetree internally.
-Change BLEND_CLIP to be emulated by gpu nodetree. Making less shader variations.
-Use BLI_memblock for cache memory allocation.
-Renderpasses are handled by switching a UBO ref bind.
One major hack in this patch is the use of modified pointer as ghash keys.
This rely on the assumption that the keys will never overlap because the
number of options per key will never be bigger than the pointed struct.
The use of one single nodetree to support default material is also a bit hacky
since it won't support concurent usage of this nodetree.
(see EEVEE_shader_default_surface_nodetree)
Another change is that objects with shader errors now appear solid magenta instead
of shaded magenta. This is only because of code reuse purpose but could be changed
if really needed.
Reviewed By: jbakker
Differential Revision: https://developer.blender.org/D7642
|
|
|
|
This patch uses a graph flow scheduler for creating all mesh batches.
On a Ryzen 1700 the framerate of Spring/020_02A.anim.blend went from 10 fps to 11.5 fps.
For each mesh where batches needs to be updated a sub-graph will be added to the task_graph.
This sub-graph starts with an extract_render_data_node. This fills/converts the required data
from Mesh.
Small extractions and extractions that can't be multi-threaded are grouped in a single
`extract_single_threaded_task_node`.
Other extractions will create a node for each loop exceeding 4096 items. these nodes are
linked to the `user_data_init_task_node`. the `user_data_init_task_node` prepares the userdata
needed for the extraction based on the data extracted from the mesh.
Note: If the `lines` and `lines_loose` are requested, the `lines_loose` sub-buffer is created
as part of the lines extraction. When the lines_loose is only requested the sub-buffer is
created from the existing `lines` buffer. It is assumed that the lines buffer is always
requested before or together with the lines_loose what is always the case (see
`DRW_batch_requested(cache->batch.loose_edges, GPU_PRIM_LINES)` in `draw_cache_impl_mesh.c`).
Reviewed By: Clément Foucault
Differential Revision: https://developer.blender.org/D7618
|
|
This was already the case in most parts of the GPU API.
Use full name for descriptive-comments.
|
|
|
|
This commit is a full refactor of the grease pencil modules including Draw Engine, Modifiers, VFX, depsgraph update, improvements in operators and conversion of Sculpt and Weight paint tools to real brushes.
Also, a huge code cleanup has been done at all levels.
Thanks to @fclem for his work and yo @pepeland and @mendio for the testing and help in the development.
Differential Revision: https://developer.blender.org/D6293
|
|
We detect the case where we need to invert the facing directly inside the
DRWView update and do the appropriate GL calls at draw time.
Fix T63047 Camera with negative scale works only in Cycles Rendered view
Fix T71352 Negative scale camera causes BLI_assert
|
|
This is the unification of all overlays into one overlay engine as described in T65347.
I went over all the code making it more future proof with less hacks and removing old / not relevent parts.
Goals / Acheivements:
- Remove internal shader usage (only drw shaders)
- Remove viewportSize and viewportSizeInv and put them in gloabl ubo
- Fixed some drawing issues: Missing probe option and Missing Alt+B clipping of some shader
- Remove old (legacy) shaders dependancy (not using view UBO).
- Less shader variation (less compilation time at first load and less patching needed for vulkan)
- removed some geom shaders when I could
- Remove static e_data (except shaders storage where it is OK)
- Clear the way to fix some anoying limitations (dithered transparency, background image compositing etc...)
- Wireframe drawing now uses the same batching capabilities as workbench & eevee (indirect drawing).
- Reduced complexity, removed ~3000 Lines of code in draw (also removed a lot of unused shader in GPU).
- Post AA to avoid complexity and cost of MSAA.
Remaining issues:
- ~~Armature edits, overlay toggles, (... others?) are not refreshing viewport after AA is complete~~
- FXAA is not the best for wires, maybe investigate SMAA
- Maybe do something more temporally stable for AA.
- ~~Paint overlays are not working with AA.~~
- ~~infront objects are difficult to select.~~
- ~~the infront wires sometimes goes through they solid counterpart (missing clear maybe?) (toggle overlays on-off when using infront+wireframe overlay in solid shading)~~
Note: I made some decision to change slightly the appearance of some objects to simplify their drawing. Namely the empty arrows end (which is now hollow/wire) and distance points of the cameras/spots being done by lines.
Reviewed By: jbakker
Differential Revision: https://developer.blender.org/D6296
|
|
|
|
|
|
Reviewers: brecht
Differential Revision: D4997
|
|
This reverts commit ce34a6b0d727bbde6ae373afa8ec6c42bc8980ce.
|
|
Reviewers: brecht
Differential Revision: D4997
|
|
The object color property is added as an additional output in
the Object Info node.
Reviewers: brecht
Differential Revision: https://developer.blender.org/D5554
|
|
T68035 by @luzpaz
|
|
|
|
The draw manager used to determine if the view transform should be
applied by checking if the scene was not rendered to an offscreen image.
As the sequencer and texture painting needs to render to an offscreen
image with the view transform applied we need to separate the
`do_color_management` from the `is_image_render`.
Reviewed By: fclem
Maniphest Tasks: T64849
Differential Revision: https://developer.blender.org/D4909
|
|
A lot of drawcalls don't use the object's properties and don't
need a dedicated DRWCallState. We allocate a unique one at
the begining and use it for all calls that uses the default
unit matrix.
|
|
|
|
|
|
|
|
Only update if the view changes.
|
|
This will have multiple benefit.
TODO detail benefits (culling, more explicit, handling of clipping planes)
For now the view usage is wrapped to make changes needed more progressive.
|
|
- Remove DST.frontface and DST.backface.
- Separate uniform update into its own function draw_update_uniforms.
|
|
Adding a constant yields quadratic time complexity which can
have quite a big impact on some scenes.
I used the file from T64901 for testing.
In the test file, the time it took to execute `wm_draw_update`
changed from `0.60s` to `0.51s`.
Reviewers: brecht
Differential Revision: https://developer.blender.org/D4916
|
|
|
|
This is a big change that cleanup a lot of confusing code.
- The instancing/batching data buffer distribution in draw_instance_data.c.
- The selection & drawing code in draw_manager_exec.c
- Prety much every non-meshes object drawing (object_mode.c).
Most of the changes are just renaming but there still a chance a typo might
have sneek through.
The Batching/Instancing Shading groups are replace by DRWCallBuffers. This
is cleaner and conceptually more in line with what a DRWShadingGroup should
be.
There is still some little confusion in draw_common.c where some function
takes shgroup as input and some don't.
|
|
This simplify the rendering logic.
|
|
This is in order to have VAO handled by thoses batches instead of using a
common VAO. Even if the VAO has no importance in these case using a batch
will help when transitioning to Vulkan.
|
|
Goal is still to simplify the draw manager.
|
|
|
|
|
|
|
|
|
|
|
|
The end goal for this is to lower the number of needed matrices.
This also cleanup some uneeded transformation.
|
|
|
|
This reduces the cost of querying the batches to the batch cache.
|
|
This is in order to create less shading group when using duplis.
Data for dupli objects keep in draw manager state until the source object
changes so retrieval is fast.
Note that this system could be extended to all meshes.
|
|
This remove a bit of overhead specially in scene with lots of objects.
|
|
|
|
Comments after code can cause awkward line breaks.
|
|
|
|
Apply clang format as proposed in T53211.
For details on usage and instructions for migrating branches
without conflicts, see:
https://wiki.blender.org/wiki/Tools/ClangFormat
|
|
|
|
release_texture_slots() and release_ubo_slots() were one hotspot when
drawing taking ~9% of total CPU counters for no reason.
This was because of the loops using GPU_max_textures that was overkill and
slow.
Replace those by a simple 64bit bitwise OR operation.
|