git.blender.org/blender.git

Age	Commit message	Author
2022-01-25	Fix depsgraphs sharing IDs via evaluated edit mesh	Sergey Sharybin
	The evaluated mesh is the result of evaluated modifiers and references other evaluated IDs such as materials. It cannot be stored in the EditMesh structure, which is intended to be re-used by many areas. Such sharing was causing ownership errors, leading to bugs like T93855: Cycles crash with edit mode and simultaneous viewport and final render. The proposed solution is to store the evaluated edit mesh and its cage in the object's runtime field. The motivation is as follows: - It avoids ownership problems like the ones in the linked report. - Object level is chosen over mesh level because the evaluated mesh is affected by modifiers, which are on the object level. This patch allows the modifier stack of an object which shares its mesh with an object in edit mode to be properly taken into account (before the change, the modifier stack of the active object was used for all objects sharing the mesh). There is a change in the way copy-on-write is handled in edit mode to allow a proper state update when changing the active scene (or having two windows with different scenes). Previously, copy-on-write was ignored by skipping tagging of the CoW component. Now it is ignored from within the CoW operation callback. This allows updating edit pointers for objects which are not from the current depsgraph and where the edit_mesh was never assigned, in the case when the depsgraph was evaluated prior to the active depsgraph. No user-level changes are expected from the CoW handling changes: they should affect neither performance nor memory consumption. Tested scenarios: - Various modifier configurations of objects sharing a mesh and being part of the same scene. - Steps from the reports: T93855, T82952, T77359. This also fixes T76609, T72733 and perhaps other reports. Differential Revision: https://developer.blender.org/D13824
2022-01-24	GPU subdiv: reduce memory usage for point IBO	Kévin Dietrich
	The point IBO should only have data for coarse vertices (or in general, the vertices in the original mesh). As it is used for displaying the vertices for selection in edit mode, and as it indexes into the VBOs for the positions and edit data, it is itself only indexed by coarse/original vertex index. For the subdivision case, this would allocate space for the final subdivision vertices and reallocate to make room for loose geometry, although only the first coarse-vertex-count entries would be used. Now just allocate the required memory. Also reuse index buffer APIs instead of doing manual work.
2022-01-20	Subdivision: add support for vertex creasing	Kévin Dietrich
	This adds vertex creasing support for OpenSubDiv for modeling, rendering, Alembic and USD I/O. For modeling, vertex creasing follows the edge creasing implementation with an operator accessible through the Vertex menu in Edit Mode, and a parameter in the properties panel. The option in the Subsurf and Multires modifiers to use edge creasing also affects vertex creasing. The vertex crease data is stored as a `CustomData` layer, unlike edge creases which for now are stored in `MEdge`, but will in the future also be moved to a `CustomData` layer. See comments for details on the difference in behavior for the `CD_CREASE` layer between edges and vertices. For Cycles this adds sockets on the Mesh node to hold data about which vertices are creased (one socket for the indices, one for the weights). Viewport rendering of vertex creasing reuses the same color scheme as for edges, and creased vertices are drawn bigger than uncreased vertices. For Alembic and USD, vertex crease support follows the edge crease implementation: creases are always read, but only exported if a `Subsurf` modifier is present on the Mesh. Reviewed By: brecht, fclem, sergey, sybren, campbellbarton Differential Revision: https://developer.blender.org/D10145
2022-01-16	Fix T94865: GPU subdiv crash switching to texpaint area	Kévin Dietrich
	The crash is due to the fact that GPU subdivision extraction routines for edit data (including UVs) only worked for BMesh. However, a Mesh based version is still needed for texture painting. This adds the missing components. This also ensures all data are properly initialized (at least the ones revealed by the bug).
2022-01-16	Cleanup: deduplicate GPU subdiv data extraction loops	Kévin Dietrich
	This puts the loop over the final subdivision quads outside of the mesh iteration callback. This can also allow for easier parallel execution in the future if need be.
2022-01-13	Refactor: Move normals out of MVert, lazy calculation	Hans Goudey
	As described in T91186, this commit moves mesh vertex normals into a contiguous array of float vectors in a custom data layer, how face normals are currently stored. The main interface is documented in `BKE_mesh.h`. Vertex and face normals are now calculated on-demand and cached, retrieved with an "ensure" function. Since the logical state of a mesh is now "has normals when necessary", they can be retrieved from a `const` mesh. The goal is to use on-demand calculation for all derived data, but leave room for eager calculation for performance purposes (modifier evaluation is threaded, but viewport data generation is not). Benefits This moves us closer to a SoA approach rather than the current AoS paradigm. Accessing a contiguous `float3` is much more efficient than retrieving data from a larger struct. The memory requirements for accessing only normals or vertex locations are smaller, and at the cost of more memory usage for just normals, they now don't have to be converted between float and short, which also simplifies code In the future, the remaining items can be removed from `MVert`, leaving only `float3`, which has similar benefits (see T93602). Removing the combination of derived and original data makes it conceptually simpler to only calculate normals when necessary. This is especially important now that we have more opportunities for temporary meshes in geometry nodes. Performance In addition to the theoretical future performance improvements by making `MVert == float3`, I've done some basic performance testing on this patch directly. The data is fairly rough, but it gives an idea about where things stand generally. - Mesh line primitive 4m Verts: 1.16x faster (36 -> 31 ms), showing that accessing just `MVert` is now more efficient. - Spring Splash Screen: 1.03-1.06 -> 1.06-1.11 FPS, a very slight change that at least shows there is no regression. - Sprite Fright Snail Smoosh: 3.30-3.40 -> 3.42-3.50 FPS, a small but observable speedup. 
- Set Position Node with Scaled Normal: 1.36x faster (53 -> 39 ms), shows that using normals in geometry nodes is faster. - Normal Calculation 1.6m Vert Cube: 1.19x faster (25 -> 21 ms), shows that calculating normals is slightly faster now. - File Size of 1.6m Vert Cube: 1.03x smaller (214.7 -> 208.4 MB), Normals are not saved in files, which can help with large meshes. As for memory usage, it may be slightly more in some cases, but I didn't observe any difference in the production files I tested. Tests Some modifiers and cycles test results need to be updated with this commit, for two reasons: - The subdivision surface modifier is not responsible for calculating normals anymore. In master, the modifier creates different normals than the result of the `Mesh` normal calculation, so this is a bug fix. - There are small differences in the results of some modifiers that use normals because they are not converted to and from `short` anymore. Future improvements - Remove `ModifierTypeInfo::dependsOnNormals`. Code in each modifier already retrieves normals if they are needed anyway. - Copy normals as part of a better CoW system for attributes. - Make more areas use lazy instead of eager normal calculation. - Remove `BKE_mesh_normals_tag_dirty` in more places since that is now the default state of a new mesh. - Possibly apply a similar change to derived face corner normals. Differential Revision: https://developer.blender.org/D12770
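The "ensure" pattern described in this commit (derived data computed on demand, cached, and retrievable from a `const` mesh) can be illustrated with a minimal C++ sketch. This is not Blender's code: `LazyNormalMesh`, `Float3`, `ensure_face_normals` and `tag_normals_dirty` are hypothetical names chosen for the illustration, and the real interface is the one documented in `BKE_mesh.h`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <mutex>
#include <vector>

struct Float3 {
  float x, y, z;
};

/* Hypothetical mesh type: face normals are derived data, cached lazily. */
class LazyNormalMesh {
 public:
  std::vector<Float3> positions;
  std::vector<int> tris; /* Three vertex indices per triangle. */

  /* Call after editing `positions` or `tris`. */
  void tag_normals_dirty()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    normals_dirty_ = true;
  }

  /* Usable on a const mesh: the cache is `mutable` because the logical
   * state of the mesh ("has normals when necessary") does not change. */
  const std::vector<Float3> &ensure_face_normals() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (normals_dirty_) {
      face_normals_.clear();
      for (std::size_t i = 0; i + 2 < tris.size(); i += 3) {
        face_normals_.push_back(
            triangle_normal(positions[tris[i]], positions[tris[i + 1]], positions[tris[i + 2]]));
      }
      normals_dirty_ = false;
      computations_++;
    }
    return face_normals_;
  }

  /* Only here to make the caching observable in this sketch. */
  int normal_computations() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return computations_;
  }

 private:
  static Float3 triangle_normal(const Float3 &a, const Float3 &b, const Float3 &c)
  {
    const Float3 e1 = {b.x - a.x, b.y - a.y, b.z - a.z};
    const Float3 e2 = {c.x - a.x, c.y - a.y, c.z - a.z};
    Float3 n = {e1.y * e2.z - e1.z * e2.y, e1.z * e2.x - e1.x * e2.z, e1.x * e2.y - e1.y * e2.x};
    const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    if (len > 0.0f) {
      n.x /= len;
      n.y /= len;
      n.z /= len;
    }
    return n;
  }

  mutable std::mutex mutex_;
  mutable std::vector<Float3> face_normals_;
  mutable bool normals_dirty_ = true; /* A new mesh starts without normals. */
  mutable int computations_ = 0;
};
```

Repeated calls to `ensure_face_normals()` reuse the cached array until `tag_normals_dirty()` is called, which mirrors the commit's note that "dirty" is the default state of a new mesh.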
2022-01-12	BLI: Refactor vector types & functions to use templates	Clément Foucault
	This patch implements the vector types (e.g. `float2`) by making heavy use of templates. All vector functions are now outside of the vector classes (inside the `blender::math` namespace) and are not vector size dependent for the most part. In the ongoing effort to make shaders less GL centric, we are aiming to share more code between GLSL and C++ to avoid code duplication. ####Motivations: - We are aiming to share UBO and SSBO structures between GLSL and C++. This means we will use many of the existing vector types and others we currently don't have (uintX, intX). All these variations were calling for much more code duplication. - Deduplicate existing code which is duplicated for each vector size. - We also want to share small functions. Which means that vector functions should be static and not in the class namespace. - Reduce friction to use these types in new projects due to their incompleteness. - The current state of the `BLI_(float\|double\|mpq)(2\|3\|4).hh` is a bit of a let down. Most classes are incomplete, out of sync with each other, with different code styles, and some functions that should be static are not (e.g. `float3::reflect()`). ####Upsides: - Still support `.x, .y, .z, .w` for readability. - Compact, readable and easily extendable. - All of the vector functions are available for all the vector types and can be restricted to certain types. Also template specialization lets us define exceptions for special classes (like mpq). - With optimization ON, the compiler unrolls the loops and performance is the same. ####Downsides: - Might impact debuggability. Though I would argue that the bugs are rarely caused by the vector class itself (since the operations are quite trivial) but by the type conversions. - Might impact compile time. I did not see a significant impact since the usage is not really widespread. - Functions need to be rewritten to support arbitrary vector length. 
For instance, one can't call `len_squared_v3v3` in `math::length_squared()` and call it a day. - Type cast does not work with the template version of the `math::` vector functions. Meaning you need to manually cast `float *` and `(float *)[3]` to `float3` for the function calls, e.g. `math::distance_squared(float3(nearest.co), positions[i]);` - Some parts might lose readability: `float3::dot(v1.normalized(), v2.normalized())` becoming `math::dot(math::normalize(v1), math::normalize(v2))`. But I propose, when appropriate, to use `using namespace blender::math;` on function local or file scope to increase readability: `dot(normalize(v1), normalize(v2))` ####Consideration: - Include back `.length()` method. It is quite handy and is more C++ oriented. - I considered the GLM library as a candidate for replacement. It felt like too much for what we need and would be difficult to extend / modify to our needs. - I used Macros to reduce code in operators declaration and potential copy paste bugs. This could reduce debuggability and could be reverted. - This touches `delaunay_2d.cc` and the intersection code. I would like to know @howardt's opinion on the matter. - The `noexcept` on the copy constructor of `mpq(2\|3)` is being removed. But according to @JacquesLucke it is not a real problem for now. I would like to give a huge thanks to @JacquesLucke who helped during this and pushed me to reduce the duplication further. Reviewed By: brecht, sergey, JacquesLucke Differential Revision: https://developer.blender.org/D13791
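The design described above (size-independent free functions in a `math` namespace, and the need for explicit casts from raw `float *`) can be sketched as follows. This is a simplified illustration, not the actual BLI headers: `vec_base`, `float3_from_ptr` and the `math` namespace below are stand-ins written for this example, and the `.x, .y, .z, .w` accessors mentioned in the commit are left out.

```cpp
#include <cassert>
#include <cstddef>

/* Hypothetical, simplified stand-in for a templated vector type. */
template<typename T, std::size_t Size> struct vec_base {
  T data[Size];

  T &operator[](std::size_t i)
  {
    return data[i];
  }
  const T &operator[](std::size_t i) const
  {
    return data[i];
  }
};

using float3 = vec_base<float, 3>;
using int2 = vec_base<int, 2>;

/* Explicit conversion helper (an assumption of this sketch): template
 * argument deduction does not apply implicit conversions, so a raw
 * `const float *` has to be wrapped before calling the functions below. */
inline float3 float3_from_ptr(const float *co)
{
  return float3{{co[0], co[1], co[2]}};
}

namespace math {

/* One definition serves every element type and vector size. */
template<typename T, std::size_t Size>
T dot(const vec_base<T, Size> &a, const vec_base<T, Size> &b)
{
  T result = T(0);
  for (std::size_t i = 0; i < Size; i++) {
    result += a[i] * b[i];
  }
  return result;
}

template<typename T, std::size_t Size> T length_squared(const vec_base<T, Size> &a)
{
  return dot(a, a);
}

template<typename T, std::size_t Size>
T distance_squared(const vec_base<T, Size> &a, const vec_base<T, Size> &b)
{
  vec_base<T, Size> delta{};
  for (std::size_t i = 0; i < Size; i++) {
    delta[i] = a[i] - b[i];
  }
  return length_squared(delta);
}

}  // namespace math
```

Because the loops run over a compile-time `Size`, an optimizing compiler can unroll them, which is consistent with the commit's remark that performance is unchanged with optimization on.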
2022-01-12	Revert "BLI: Refactor vector types & functions to use templates"	Clément Foucault
	Includes unwanted changes This reverts commit 46e049d0ce2bce2f53ddc41a0dbbea2969d00a5d.
2022-01-12	BLI: Refactor vector types & functions to use templates	Clément Foucault
	This patch implements the vector types (e.g. `float2`) by making heavy use of templates. All vector functions are now outside of the vector classes (inside the `blender::math` namespace) and are not vector size dependent for the most part. In the ongoing effort to make shaders less GL centric, we are aiming to share more code between GLSL and C++ to avoid code duplication. ####Motivations: - We are aiming to share UBO and SSBO structures between GLSL and C++. This means we will use many of the existing vector types and others we currently don't have (uintX, intX). All these variations were calling for much more code duplication. - Deduplicate existing code which is duplicated for each vector size. - We also want to share small functions. Which means that vector functions should be static and not in the class namespace. - Reduce friction to use these types in new projects due to their incompleteness. - The current state of the `BLI_(float\|double\|mpq)(2\|3\|4).hh` is a bit of a let down. Most classes are incomplete, out of sync with each other, with different code styles, and some functions that should be static are not (e.g. `float3::reflect()`). ####Upsides: - Still support `.x, .y, .z, .w` for readability. - Compact, readable and easily extendable. - All of the vector functions are available for all the vector types and can be restricted to certain types. Also template specialization lets us define exceptions for special classes (like mpq). - With optimization ON, the compiler unrolls the loops and performance is the same. ####Downsides: - Might impact debuggability. Though I would argue that the bugs are rarely caused by the vector class itself (since the operations are quite trivial) but by the type conversions. - Might impact compile time. I did not see a significant impact since the usage is not really widespread. - Functions need to be rewritten to support arbitrary vector length. 
For instance, one can't call `len_squared_v3v3` in `math::length_squared()` and call it a day. - Type cast does not work with the template version of the `math::` vector functions. Meaning you need to manually cast `float *` and `(float *)[3]` to `float3` for the function calls, e.g. `math::distance_squared(float3(nearest.co), positions[i]);` - Some parts might lose readability: `float3::dot(v1.normalized(), v2.normalized())` becoming `math::dot(math::normalize(v1), math::normalize(v2))`. But I propose, when appropriate, to use `using namespace blender::math;` on function local or file scope to increase readability: `dot(normalize(v1), normalize(v2))` ####Consideration: - Include back `.length()` method. It is quite handy and is more C++ oriented. - I considered the GLM library as a candidate for replacement. It felt like too much for what we need and would be difficult to extend / modify to our needs. - I used Macros to reduce code in operators declaration and potential copy paste bugs. This could reduce debuggability and could be reverted. - This touches `delaunay_2d.cc` and the intersection code. I would like to know @howardt's opinion on the matter. - The `noexcept` on the copy constructor of `mpq(2\|3)` is being removed. But according to @JacquesLucke it is not a real problem for now. I would like to give a huge thanks to @JacquesLucke who helped during this and pushed me to reduce the duplication further. Reviewed By: brecht, sergey, JacquesLucke Differential Revision: https://developer.blender.org/D13791
2022-01-12	Revert "BLI: Refactor vector types & functions to use templates"	Clément Foucault
	Reverted because the commit removes a lot of commits. This reverts commit a2c1c368af48644fa8995ecbe7138cc0d7900c30.
2022-01-12	BLI: Refactor vector types & functions to use templates	Clément Foucault
	This patch implements the vector types (e.g. float2) by making heavy use of templates. All vector functions are now outside of the vector classes (inside the blender::math namespace) and are not vector size dependent for the most part. In the ongoing effort to make shaders less GL centric, we are aiming to share more code between GLSL and C++ to avoid code duplication. Motivations: - We are aiming to share UBO and SSBO structures between GLSL and C++. This means we will use many of the existing vector types and others we currently don't have (uintX, intX). All these variations were calling for much more code duplication. - Deduplicate existing code which is duplicated for each vector size. - We also want to share small functions. Which means that vector functions should be static and not in the class namespace. - Reduce friction to use these types in new projects due to their incompleteness. - The current state of the BLI_(float\|double\|mpq)(2\|3\|4).hh is a bit of a let down. Most classes are incomplete, out of sync with each other, with different code styles, and some functions that should be static are not (e.g. float3::reflect()). Upsides: - Still support .x, .y, .z, .w for readability. - Compact, readable and easily extendable. - All of the vector functions are available for all the vector types and can be restricted to certain types. Also template specialization lets us define exceptions for special classes (like mpq). - With optimization ON, the compiler unrolls the loops and performance is the same. Downsides: - Might impact debuggability. Though I would argue that the bugs are rarely caused by the vector class itself (since the operations are quite trivial) but by the type conversions. - Might impact compile time. I did not see a significant impact since the usage is not really widespread. - Functions need to be rewritten to support arbitrary vector length. For instance, one can't call len_squared_v3v3 in math::length_squared() and call it a day. 
- Type cast does not work with the template version of the math:: vector functions. Meaning you need to manually cast float * and (float *)[3] to float3 for the function calls, e.g. math::distance_squared(float3(nearest.co), positions[i]); - Some parts might lose readability: float3::dot(v1.normalized(), v2.normalized()) becoming math::dot(math::normalize(v1), math::normalize(v2)). But I propose, when appropriate, to use using namespace blender::math; on function local or file scope to increase readability: dot(normalize(v1), normalize(v2)) Consideration: - Include back .length() method. It is quite handy and is more C++ oriented. - I considered the GLM library as a candidate for replacement. It felt like too much for what we need and would be difficult to extend / modify to our needs. - I used Macros to reduce code in operators declaration and potential copy paste bugs. This could reduce debuggability and could be reverted. - This touches delaunay_2d.cc and the intersection code. I would like to know @Howard Trickey (howardt)'s opinion on the matter. - The noexcept on the copy constructor of mpq(2\|3) is being removed. But according to @Jacques Lucke (JacquesLucke) it is not a real problem for now. I would like to give a huge thanks to @Jacques Lucke (JacquesLucke) who helped during this and pushed me to reduce the duplication further. Reviewed By: brecht, sergey, JacquesLucke Differential Revision: http://developer.blender.org/D13791
2022-01-07	Cleanup: remove redundant const qualifiers for POD types	Campbell Barton
	MSVC used to warn about const mismatch for arguments passed by value. Remove these as newer versions of MSVC no longer show this warning.
2022-01-06	Fix T94672: incorrect Workbench shadows with GPU subdivision	Kévin Dietrich
	The `lines_adjacency` IBO build in the GPU subdivision case was missing edges at the boundaries of open meshes. As it is used for the shadow pass, the shadows were then not clipped properly. This would also make X-Ray mode render differently in those cases. To fix this, we can simply reuse the buffer finalization routine from the non-subdivision case, as such edges are handled there.
2022-01-06	GPU subdiv: fix wrong data sizes used for lines adjacency IBO	Kévin Dietrich
	Function parameters were mismatched, causing an assertion failure in debug builds.
2022-01-06	Cleanup: spelling in comments	Campbell Barton

2021-12-27	Cleanup: clang tidy	Jacques Lucke
	Use c++ headers; use nullptr; redundant `void` in parameter list; inconsistent parameter name.
2021-12-27	OpenSubDiv: add support for an OpenGL evaluator	Kévin Dietrich
	This evaluator is used in order to evaluate subdivision at render time, allowing for faster renders of meshes with a subdivision surface modifier placed at the last position in the modifier list. When evaluating the subsurf modifier, we detect whether we can delegate evaluation to the draw code. If so, the subdivision is first evaluated on the GPU using our own custom evaluator (only the coarse data needs to be initially sent to the GPU), then buffers for the final `MeshBufferCache` are filled on the GPU using a set of compute shaders. However, some buffers are still filled on the CPU side, if doing so on the GPU is impractical (e.g. the line adjacency buffer used for x-ray, whose logic is hardly GPU compatible). This is done at the mesh buffer extraction level so that the result can be readily used in the various OpenGL engines, without having to write custom geometry or tessellation shaders. We use our own subdivision evaluation shaders, instead of OpenSubDiv's vanilla ones, in order to control the data layout and interpolation. For example, we store vertex colors as compressed 16-bit integers, while OpenSubDiv's default evaluator only works for float types. In order to still access the modified geometry on the CPU side, for use in modifiers or transform operators, a dedicated wrapper type is added: `MESH_WRAPPER_TYPE_SUBD`. Subdivision will be lazily evaluated via `BKE_object_get_evaluated_mesh`, which will create such a wrapper if possible. If the final subdivision surface is not needed on the CPU side, `BKE_object_get_evaluated_mesh_no_subsurf` should be used. Enabling or disabling GPU subdivision can be done through the user preferences (under Viewport -> Subdivision). See patch description for benchmarks. Reviewed By: campbellbarton, jbakker, fclem, brecht, #eevee_viewport Differential Revision: https://developer.blender.org/D12406
2021-12-14	Cleanup: correct unbalanced doxygen groups	Campbell Barton
	Also add groups in some files.
2021-12-08	Cleanup: move public doc-strings into headers for 'draw'	Campbell Barton
	Ref T92709
2021-11-19	Fix T91838 Crash when toggling edit mode on object with geometry node ↵	Clément Foucault
	modifier, but only if the instanced objects material has a normal map assigned. This is only a workaround to avoid the crash. The underlying issue is left unfixed. New report for tracking the underlying issue is T93223.
2021-11-15	Fix T92750: sculpt vertex colors missing in object mode	Kévin Dietrich
	The layers were not aliased properly for usage in the shaders. Regression caused by rB03013d19d167.
2021-10-27	Cleanup: clang-format, clang-tidy, spelling	Campbell Barton

2021-10-27	Revert "Revert "Eevee: support accessing custom mesh attributes""	Germano Cavalcante
	This reverts commit e7fedf6dba5fe2ec39260943361915a6b2b8270a. And also fix a compilation issue on windows. Differential Revision: https://developer.blender.org/D12969
2021-10-26	Revert "Eevee: support accessing custom mesh attributes"	Ray Molenkamp
	This reverts commit 03013d19d16704672f9db93bc62547651b6a5cb8. This commit broke the windows build pretty badly and I don't feel confident landing the fix for this without review. Will post a possible fix in D12969 and we'll take it from there.
2021-10-26	Eevee: support accessing custom mesh attributes	Kévin Dietrich
	This adds generic attribute rendering support for meshes for Eevee and Workbench. Each attribute is stored inside of the `MeshBufferList` as a separate VBO, with a maximum of `GPU_MAX_ATTR` VBOs for consistency with the GPU shader compilation code. Since `DRW_MeshCDMask` is not general enough, attribute requests are stored in new `DRW_AttributeRequest` structures inside of a convenient `DRW_MeshAttributes` structure. The latter is used in a similar manner as `DRW_MeshCDMask`, with the `MeshBatchCache` keeping track of needed, used, and used-over-time attributes. Again, `GPU_MAX_ATTR` is used in `DRW_MeshAttributes` to prevent too many attributes being used. To ensure thread-safety when updating the used attributes list, a mutex is added to the Mesh runtime. This mutex will also be used in the future for other things when other parts of the render pre-processing are multi-threaded. `GPU_BATCH_VBO_MAX_LEN` was increased to 16 in order to accommodate this design. Since `CD_PROP_COLOR` is a valid attribute type, sculpt vertex colors are now handled using this system to avoid complicating things. In the future regular vertex colors will also use this. With this change, bit operations for `DRW_MeshCDMask` now use `uint32_t` (to match the representation now used by the compiler). Due to the difference in behavior for implicit type conversion for scalar types between OpenGL and what users expect (a scalar `s` is converted to `vec4(s, 0, 0, 1)` by OpenGL, vs. `vec4(s, s, s, 1)` in Blender's various node graphs), all scalar types are using a float3 internally for now, which increases memory usage. This will be resolved during or after the EEVEE rewrite as properly handling this involves much deeper changes. Ref T85075 Reviewed By: fclem Maniphest Tasks: T85075 Differential Revision: https://developer.blender.org/D12969
2021-08-23	Cleanup: move the buffer list to 'MeshBufferCache'	Germano Cavalcante
	The cache is used to fill the buffer list.
2021-08-23	Cleanup: rename 'MeshBufferExtractionCache' to 'MeshBufferCache'	Germano Cavalcante
	Matches the existing `MeshBatchCache`.
2021-08-23	Cleanup: rename 'MeshBufferCache' to 'MeshBufferList'	Germano Cavalcante
	`MeshBufferList` is more specific and can avoid confusion with `MeshBufferExtractionCache`.
2021-08-23	Cleanup: Move 'tris_per_mat' member out of 'MeshBufferCache'	Germano Cavalcante
	`MeshBufferCache` is a struct representing a list of buffers. As such, `GPUIndexBuf tris_per_mat` is out of place as it does not represent one of the buffers in the list. In fact this member should be close to `GPUBatch surface_per_mat` as they are related. The code for dependencies between buffer and batch had to be reworked as it relies on the member's position. Differential Revision: https://developer.blender.org/D12227
2021-07-26	Cleanup: Rearrange mesh extraction files	Germano Cavalcante
	In the draw module, it's not easy to identify what its header is, and where the shared functions are. So move `draw_cache_extract_mesh_extractors.c` and `draw_cache_extract_mesh_private.h` to the same folder as the extractors and rename these files to make them more identifiable. Reviewed By: jbakker Differential Revision: https://developer.blender.org/D11991
2021-07-21	Draw Cache: extract tris in parallel ranges	Germano Cavalcante
	The `ibo.tris` extraction is currently multithreaded only if the mesh has a single material. Now we cache a map indicating the index of each polygon after sorting, which allows multithreaded extraction of tris for meshes with several materials. As caching is a heavy operation and was already being performed in multiple threads for triangle offsets, no significant improvements are expected. The benefit will be much greater when we can skip updating the cache while transforming a geometry.
	Profiling:
	| | master: | PATCH: |
	|---|---|---|
	| large_mesh_editing_materials: | Average: 13.855380 FPS | Average: 15.525684 FPS |
	| | rdata 9ms iter 36ms (frame 71ms) | rdata 9ms iter 29ms (frame 64ms) |
	| subdiv_mesh_final_only_materials: | Average: 28.113742 FPS | Average: 28.633599 FPS |
	| | rdata 0ms iter 1ms (frame 36ms) | rdata 0ms iter 1ms (frame 35ms) |
	1.1x overall speedup
	Differential Revision: https://developer.blender.org/D11445
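The idea of a cached per-polygon destination index can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the draw cache code: the function name `poly_sorted_tri_start` and its plain `std::vector` inputs are hypothetical. Once every polygon knows where its first triangle lands in the material-sorted index buffer, polygons can be processed in any order, and therefore by any thread, without coordinating writes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* For each polygon, return the index of its first triangle in a triangle
 * list sorted by material. `poly_material[p]` is the material slot of
 * polygon `p`, `poly_tri_count[p]` its number of triangles. */
std::vector<int> poly_sorted_tri_start(const std::vector<int> &poly_material,
                                       const std::vector<int> &poly_tri_count,
                                       int material_count)
{
  /* Count triangles per material, shifted by one slot. */
  std::vector<int> cursor(material_count + 1, 0);
  for (std::size_t p = 0; p < poly_material.size(); p++) {
    cursor[poly_material[p] + 1] += poly_tri_count[p];
  }
  /* Prefix sum: cursor[m] becomes the first triangle slot of material m. */
  for (int m = 0; m < material_count; m++) {
    cursor[m + 1] += cursor[m];
  }
  /* Hand out slots in polygon order within each material. */
  std::vector<int> start(poly_material.size());
  for (std::size_t p = 0; p < poly_material.size(); p++) {
    int &next_free = cursor[poly_material[p]];
    start[p] = next_free;
    next_free += poly_tri_count[p];
  }
  return start;
}
```

Building the map is sequential in this sketch; the point is that it only depends on topology and material assignment, so it can be kept while vertices are merely transformed.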
2021-07-21	Fix T90017: Bone widget drawing inconsistent with editing	Germano Cavalcante
	The `lines_loose` extractor did not trigger loose geometry caching.
2021-07-03	Cleanup: consistent use of tags: NOTE/TODO/FIXME/XXX	Campbell Barton
	Also use doxy style function reference `#` prefix chars when referencing identifiers.
2021-07-02	Cleanup: compiler & clang-tidy warnings	Campbell Barton

2021-07-02	Cleanup: Clang tidy, remove typedef	Hans Goudey

2021-07-01	Cleanup: Separate each extractor into specific compile units	Germano Cavalcante
	Makes code cleaner and easier to find.
2021-06-24	Cleanup: comment blocks, trailing space in comments	Campbell Barton

2021-06-18	Cleanup: clang-tidy	Campbell Barton

2021-06-15	DrawManager: Cache material offsets.	Jeroen Bakker
	When using multiple materials in a single mesh, most of the time is spent counting the offsets of each material for the sorting. This patch moves the counting of the offsets to render mesh data and caches it as long as the geometry doesn't change. This patch doesn't include multithreading of this code. Reviewed By: mano-wii Differential Revision: https://developer.blender.org/D11612
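A minimal sketch of the caching described here, assuming one entry per polygon and a simple dirty flag. `MaterialOffsetCache`, `ensure` and `tag_geometry_changed` are hypothetical names for this illustration and do not correspond to identifiers in the draw manager.

```cpp
#include <cassert>
#include <vector>

/* Caches the start offset of each material in a material-sorted polygon
 * list, and recomputes it only after the geometry was tagged as changed. */
struct MaterialOffsetCache {
  bool valid = false;
  std::vector<int> offsets; /* material_count + 1 entries once built. */
  int rebuilds = 0;         /* Only here to make the caching observable. */

  void tag_geometry_changed()
  {
    valid = false;
  }

  const std::vector<int> &ensure(const std::vector<int> &poly_material, int material_count)
  {
    if (!valid) {
      offsets.assign(material_count + 1, 0);
      /* Count polygons per material, shifted by one slot. */
      for (const int material : poly_material) {
        offsets[material + 1]++;
      }
      /* Prefix sum turns counts into start offsets. */
      for (int m = 0; m < material_count; m++) {
        offsets[m + 1] += offsets[m];
      }
      valid = true;
      rebuilds++;
    }
    return offsets;
  }
};
```

With materials `{1, 0, 1, 2, 1}` and three slots, the offsets come out as `0, 1, 4, 5`: material 0 starts at 0, material 1 at 1, material 2 at 4, and the last entry is the total count.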
2021-06-11	Refactor: use 'BLI_task_parallel_range' in Draw Cache	Germano Cavalcante
	One drawback to trying to predict the number of threads that will be used in the `task_graph` is that we are only sure of the number when the threads are running. Using `BLI_task_parallel_range` allows the driver to choose the best thread distribution through `parallel_reduce`. The benefit is most evident on hardware with fewer cores. This is the result on a 4-core laptop:
	| | before: | after: |
	|---|---|---|
	| large_mesh_editing: | Average: 5.203638 FPS | Average: 5.398925 FPS |
	| | rdata 15ms iter 43ms (frame 193ms) | rdata 14ms iter 36ms (frame 187ms) |
	Differential Revision: https://developer.blender.org/D11558
2021-06-11	Refactor: Draw Cache: use 'BLI_task_parallel_range'	Germano Cavalcante
	This is an adaptation of {D11488}. A disadvantage of manually setting the iter ranges per thread is that we don't know how many threads are running in the background and so we don't know how to best distribute the ranges. To solve this limitation we can use `parallel_reduce` and thus let the driver choose the best distribution of ranges among the threads. This proved to be especially beneficial for computers with few cores.
	Benchmarking:
	Here's the result on a 4-core laptop:
	| | master: | PATCH: |
	|---|---|---|
	| large_mesh_editing: | Average: 5.203638 FPS | Average: 5.398925 FPS |
	| | rdata 15ms iter 43ms (frame 193ms) | rdata 14ms iter 36ms (frame 187ms) |
	Here's the result on an 8-core PC:
	| | master: | PATCH: |
	|---|---|---|
	| large_mesh_editing: | Average: 15.267482 FPS | Average: 15.906881 FPS |
	| | rdata 9ms iter 28ms (frame 65ms) | rdata 9ms iter 25ms (frame 63ms) |
	| large_mesh_editing_ledge: | Average: 15.145966 FPS | Average: 15.520474 FPS |
	| | rdata 9ms iter 29ms (frame 65ms) | rdata 9ms iter 25ms (frame 64ms) |
	| looptris_test: | Average: 4.001917 FPS | Average: 4.061105 FPS |
	| | rdata 12ms iter 90ms (frame 236ms) | rdata 12ms iter 87ms (frame 230ms) |
	| subdiv_mesh_cage_and_final: | Average: 1.917769 FPS | Average: 1.971790 FPS |
	| | rdata 7ms iter 37ms (frame 261ms) | rdata 7ms iter 31ms (frame 258ms) |
	| | rdata 7ms iter 38ms (frame 252ms) | rdata 7ms iter 33ms (frame 249ms) |
	| subdiv_mesh_final_only: | Average: 6.387240 FPS | Average: 6.591251 FPS |
	| | rdata 3ms iter 25ms (frame 151ms) | rdata 3ms iter 16ms (frame 145ms) |
	| subdiv_mesh_final_only_ledge: | Average: 6.247393 FPS | Average: 6.596024 FPS |
	| | rdata 3ms iter 26ms (frame 158ms) | rdata 3ms iter 16ms (frame 148ms) |
	Notes:
	- The improvement can only be noticed if all extracts are multithreaded.
	- This patch touches different areas of the code, so it can be split into another patch if the idea is accepted.
	These screenshots show how threads behave in a quadcore: Master: {F10164664} Patch: {F10164666}
	Differential Revision: https://developer.blender.org/D11558
2021-06-09	T88352: Use threaded ibo.tris extraction for single material meshes.	Jeroen Bakker
	This patch adds a specific extraction method when the mesh has only one material. This method is multi-threaded. There is a trade-off in this patch as the ibo isn't compressed (it adds restart indexes for hidden faces). So it depends on whether threading is faster than the additional GPU buffer upload.
	# Subdivided cube
	I used a cube subdivided 7 times, modifiers applied. That gives around 400000 faces. The test is selecting some vertices and moving them. During this test the following buffers are updated on each frame:
	* vbo.pos_nor
	* vbo.lnor
	* vbo.edit_data
	* ibo.tris
	* ibo.points
	System info:
	| platform | Linux-5.11.0-7614-generic-x86_64-with-glibc2.33 |
	| renderer | AMD SIENNA_CICHLID (DRM 3.40.0, 5.11.0-7614-generic, LLVM 11.0.1) |
	| vendor | AMD |
	| version | 4.6 (Core Profile) Mesa 21.0.1 |
	| cpu | Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz |
	| compiler | gcc version 10.3.0 |
	Timings have been measured using DEBUG_TIME in `draw_cache_extract_mesh`.
	master: `rdata 8ms iter 45ms (frame 153ms)`
	this patch: `rdata 6ms iter 36ms (frame 132ms)`
	Reviewed By: mano-wii Maniphest Tasks: T88352 Differential Revision: https://developer.blender.org/D11290
2021-06-09	Draw Cache: use threading for Mesh extract lines	Germano Cavalcante
	This is an optimization, but the difference is still not that significant, as some extractions are still done in a single thread.
	Benchmarking:
	| |before:|after:|
	|---|---|---|
	|large_mesh_editing:|Average: 14.246502 FPS|Average: 15.438118 FPS|
	| |rdata 9ms iter 31ms (frame 69ms)|rdata 9ms iter 27ms (frame 65ms)|
	|large_mesh_editing_ledge:|Average: 14.913622 FPS|Average: 15.856538 FPS|
	| |rdata 9ms iter 30ms (frame 67ms)|rdata 9ms iter 26ms (frame 63ms)|
	|looptris_test:|Average: 3.970774 FPS|Average: 4.095200 FPS|
	| |rdata 11ms iter 90ms (frame 235ms)|rdata 12ms iter 87ms (frame 229ms)|
	Reviewed By: jbakker
	Differential Revision: https://developer.blender.org/D11467
2021-06-08	GPU: Thread safe index buffer builders.	Jeroen Bakker
	The current index builder is designed to be used in a single thread. This makes all index buffer extractions single threaded. This patch adds a thread safe solution enabling multithreaded building of index buffers.
	To reduce locking, the solution provides a task/thread local index buffer builder (called a sub builder). When a thread is finished, its thread local index buffer builder can be joined with the initial index buffer builder.
	`GPU_indexbuf_subbuilder_init`: Initializes a sub builder. The index list is shared between the parent and the sub builder, but the counters are local, so updating the counters does not need any locking.
	`GPU_indexbuf_subbuilder_finish`: Merges the information of the sub builder back into the parent builder. Needs to be invoked outside the worker thread, or when sure that all worker threads have finished. Internally the function is not thread safe.
	For testing purposes the extract_points extractor has been migrated to the new API. This required changes to the mesh extractor:
	* When creating tasks, the task number of the current task is stored in ExtractTaskData, together with the total number of tasks.
	* Two functions are added to `MeshExtract`: `task_init` initializes the task specific userdata, and `task_finish` merges the task specific userdata back.
	* A task_id parameter is added to the iteration functions so they can access the correct task data without any need for locking.
	There is no noticeable change in end user performance.
	Reviewed By: mano-wii
	Differential Revision: https://developer.blender.org/D11499
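	The sub builder idea described above can be sketched as follows. This is an illustration under assumptions, not the GPU module's implementation: the struct layout and the names `IndexBuilderSketch`, `subbuilder_init`, `set_point` and `subbuilder_finish` are hypothetical stand-ins for the real API. The storage is shared, the counters are per sub builder, and the merge happens once the workers are done.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

/* Hypothetical builder: shared index storage plus local counters. */
struct IndexBuilderSketch {
  std::vector<std::uint32_t> *data = nullptr; /* Shared between parent and sub builders. */
  std::uint32_t index_len = 0;                /* One past the highest slot written. */
  std::uint32_t index_min = std::numeric_limits<std::uint32_t>::max();
  std::uint32_t index_max = 0;
};

/* A sub builder is a copy of the parent: same storage pointer, own counters. */
static IndexBuilderSketch subbuilder_init(const IndexBuilderSketch &parent)
{
  return parent;
}

/* Write one index into a caller-chosen slot. Only the local counters are
 * updated, so no locking is needed as long as tasks write distinct slots. */
static void set_point(IndexBuilderSketch &builder, std::uint32_t slot, std::uint32_t vert)
{
  assert(builder.data != nullptr && slot < builder.data->size());
  (*builder.data)[slot] = vert;
  builder.index_len = std::max(builder.index_len, slot + 1);
  builder.index_min = std::min(builder.index_min, vert);
  builder.index_max = std::max(builder.index_max, vert);
}

/* Merge the counters back. Not thread safe: call it after the workers finished. */
static void subbuilder_finish(IndexBuilderSketch &parent, const IndexBuilderSketch &sub)
{
  parent.index_len = std::max(parent.index_len, sub.index_len);
  parent.index_min = std::min(parent.index_min, sub.index_min);
  parent.index_max = std::max(parent.index_max, sub.index_max);
}
```

	In this sketch a task would call `subbuilder_init` in its init step, `set_point` while iterating over its range, and `subbuilder_finish` in its finish step, mirroring the `task_init` / `task_finish` pair mentioned above.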
2021-06-08	Revert "Cleanup: use cpp new/delete."	Jeroen Bakker
	This reverts commit 43464c94f4def8689dd99a9e459f5ff77420d27b.
2021-06-08	Cleanup: replace NULL with nullptr.	Jeroen Bakker

2021-06-08	Cleanup: use cpp new/delete.	Jeroen Bakker

2021-06-08	Cleanup: replace typedef structs with structs.	Jeroen Bakker

2021-06-08	Cleanup: Separate compile unit edituv.	Jeroen Bakker

2021-06-08	Cleanup: Separate compile unit lines_adjacency.	Jeroen Bakker