Age | Commit message | Author |
|
|
|
|
|
- Since the element size is known, it's less work to do this inline.
- In a test with a high-poly model, this gave a ~9% overall speedup for the decimate modifier.
|
|
|
|
This commit contains all the changes required to bump the maximum number of
threads in an optimal way. This is needed to avoid possibly unneeded
initialization or data allocation on systems with lower thread counts.
TODO: Still need to review arrays in render data structures from render_types.h,
P.S. We might remove the actual bump of max threads from this patch, so that
when applying it we can do all the preparation work first and then do the
actual bump of max threads.
Reviewers: mont29, campbellbarton
Reviewed By: mont29, campbellbarton
Maniphest Tasks: T43306
Differential Revision: https://developer.blender.org/D1343
|
|
|
|
|
|
Convenient since it's common to normalize then scale.
Since these are inlined, use them for regular normalize with 1.0 length.
|
|
|
|
Two were actual bugs, though they existed only in unused code:
* In Freestyle it was unintentionally copying a scene rather than referencing it.
* In BLI_array_store_is_valid there was use of uninitialized memory.
|
|
|
|
changes in BLI_kdopbvh:
- `BLI_bvhtree_find_nearest_to_ray` now takes is_ray_normalized and scale arguments.
- `BLI_bvhtree_find_nearest_to_ray_angle` has been added (used for perspective view).
changes in BLI_bvhutils:
- `bvhtree_from_editmesh_edges_ex` was added.
changes in math_geom:
- `dist_squared_ray_to_seg_v3` was added.
other changes:
- `do_ray_start_correction` is no longer necessary to snap to verts.
- the depth test is now done in callbacks, simulating the way it was done before.
|
|
|
|
So the error seems to be in cubic_tangent_factor_circle_v3(),
which was introduced with D2001.
I've tweaked the most obvious culprit here - the epsilon factor.
It used to be 10^-7, but I've relaxed it to 10^-5 now,
and it's looking a lot more stable now :)
---------
BTW, about the derivation of the magic 0.390464 factor I briefly subbed back
as a workaround for this bug, see:
http://www.whizkidtech.redprince.net/bezier/circle/
|
|
|
|
|
|
Array search from back to front.
|
|
|
|
|
|
This allows the error threshold for calculating the optimized location to be much lower.
Resolves visible artifacts w/ 1m-tri happy-buddha example.
|
|
- mul_v3_m3v3_db
- mul_m3_v3_db
- negate_v3_db
|
|
The projection functions' argument naming made it hard to tell which vector was projected onto.
|
|
Use to fill an array of bytes with random values.
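As a rough illustration only, a byte-fill helper could look like the sketch below. The `RngSketch` type and all function names are hypothetical, and the generator is a plain 48-bit linear congruential generator using the well-known `drand48` constants, chosen for brevity rather than taken from the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generator state (not the real API). */
typedef struct RngSketch {
    uint64_t state;
} RngSketch;

static void rng_sketch_seed(RngSketch *rng, const uint32_t seed)
{
    rng->state = ((uint64_t)seed << 16) | 0x330EULL;
}

/* 48-bit LCG step; returns the upper 32 bits of the new state. */
static uint32_t rng_sketch_next(RngSketch *rng)
{
    rng->state = (rng->state * 0x5DEECE66DULL + 0xBULL) & 0xFFFFFFFFFFFFULL;
    return (uint32_t)(rng->state >> 16);
}

/* Fill `bytes[0..bytes_len)` with random values, consuming one 32-bit
 * random number per 4 bytes and handling a trailing partial word. */
static void rng_sketch_fill_bytes(RngSketch *rng, unsigned char *bytes, const size_t bytes_len)
{
    size_t i = 0;
    while (i < bytes_len) {
        uint32_t r = rng_sketch_next(rng);
        for (int j = 0; j < 4 && i < bytes_len; j++, i++) {
            bytes[i] = (unsigned char)(r & 0xFF);
            r >>= 8;
        }
    }
}
```

Seeding two generators identically yields identical byte arrays, which keeps tests that rely on random data reproducible.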
|
|
particular scene.
Regression from recent rB2c5dc66d5effd4072f438afb: if the last item of the last chunk of a mempool was valid,
it would not be returned by the mempool iterator step, which would always return NULL in that case.
|
|
|
|
|
|
|
|
|
|
|
|
Failed with chunk merging disabled
|
|
Minor optimization, avoid some checks each iteration.
|
|
|
|
|
|
Useless change in fact, sorry for the noise.
This reverts commit b08473680e141ab6f28f99fc3b1dbbc4add89bed.
|
|
~10% improvement in own tests.
|
|
This reverts commit d5e0e681cea846facb4f2777921f6612be3ee193.
Tsk, these functions return false on a match.
|
|
Code intended to create only one pool by default here, but code in `mempool_maxchunks()` would make it two.
|
|
This also changes freeword to an intptr_t to ensure
that more than the first 4 bytes of a pointer are tested on 64-bit systems.
|
|
|
|
|
|
|
|
This supports in-memory de-duplication,
useful to avoid inefficient memory use when storing multiple, similar arrays.
|
|
|
|
|
|
non-parallelized case.
|
|
|
|
|
|
Together with the extended loop callback and userdata_chunk, this allows performing
cumulative tasks (like aggregation) in a lock-free way: each worker uses its local userdata_chunk
to store temp data, and once all workers have finished, those userdata_chunks are merged in the
finalize callback (called from the calling thread, so no need to lock here either).
Note that this changes how userdata_chunk is handled (now fully from the 'main' thread,
which means a given worker thread will always get the same userdata_chunk, without it
being re-initialized to the init value at the start of each iter chunk).
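The pattern can be sketched with plain pthreads. This is an illustration under simplifying assumptions, not the BLI_task API: the range is split statically into one slice per worker, every name is hypothetical, and the "finalize" step is simply the merge loop that runs on the calling thread after joining.

```c
#include <pthread.h>
#include <stddef.h>

#define SKETCH_NUM_WORKERS 4

/* Per-worker chunk: initialized once by the calling thread and then owned by
 * exactly one worker, so accumulating into it needs no lock. */
typedef struct SumChunkSketch {
    const int *data;
    size_t start, end;
    long long local_sum;
} SumChunkSketch;

static void *sum_worker_sketch(void *arg)
{
    SumChunkSketch *chunk = arg;
    for (size_t i = chunk->start; i < chunk->end; i++) {
        chunk->local_sum += chunk->data[i];
    }
    return NULL;
}

static long long parallel_sum_sketch(const int *data, const size_t len)
{
    pthread_t threads[SKETCH_NUM_WORKERS];
    int started[SKETCH_NUM_WORKERS];
    SumChunkSketch chunks[SKETCH_NUM_WORKERS];
    const size_t step = (len + SKETCH_NUM_WORKERS - 1) / SKETCH_NUM_WORKERS;

    for (int k = 0; k < SKETCH_NUM_WORKERS; k++) {
        size_t start = (size_t)k * step;
        if (start > len) {
            start = len;
        }
        size_t end = start + step;
        if (end > len) {
            end = len;
        }
        chunks[k] = (SumChunkSketch){data, start, end, 0};
        started[k] = (pthread_create(&threads[k], NULL, sum_worker_sketch, &chunks[k]) == 0);
        if (!started[k]) {
            /* Fall back to running this slice on the calling thread. */
            sum_worker_sketch(&chunks[k]);
        }
    }

    /* "Finalize": merge the per-worker chunks from the calling thread only. */
    long long total = 0;
    for (int k = 0; k < SKETCH_NUM_WORKERS; k++) {
        if (started[k]) {
            pthread_join(threads[k], NULL);
        }
        total += chunks[k].local_sum;
    }
    return total;
}
```

Because each chunk is touched by one worker while it runs and by the calling thread only after the join, neither the accumulation nor the merge needs a mutex or an atomic.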
|
|
BLI_task_parallel_range() & co.
|
|
New code is actually much, much better than the first version; using the 'fetch_and_add' atomic op
here allows us to get rid of the loop etc.
The broken CAS issue remains on Windows, to be investigated...
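To show why a fetch-and-add removes the retry loop, here is a small sketch using C11 `<stdatomic.h>`. The struct and function names are hypothetical and the real code may differ: each caller claims the next chunk of the range with a single atomic add, whereas a compare-and-swap version has to loop until its swap succeeds.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state describing the range still to be processed. */
typedef struct RangeStateSketch {
    atomic_size_t next; /* first index not yet claimed */
    size_t stop;        /* one past the last index */
    size_t chunk_size;  /* indices claimed per call */
} RangeStateSketch;

/* Claim the next chunk. Returns false once the range is exhausted.
 * A single atomic_fetch_add hands each caller a distinct start index,
 * so no retry loop is needed. */
static bool range_next_chunk_sketch(RangeStateSketch *state, size_t *r_start, size_t *r_end)
{
    const size_t start = atomic_fetch_add(&state->next, state->chunk_size);
    if (start >= state->stop) {
        /* `next` keeps growing on further calls; this sketch assumes it
         * never gets close to overflowing size_t. */
        return false;
    }
    const size_t remaining = state->stop - start;
    *r_start = start;
    *r_end = start + (remaining < state->chunk_size ? remaining : state->chunk_size);
    return true;
}
```

Each worker thread would call `range_next_chunk_sketch()` in a `while` loop and process `[start, end)` until it returns false.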
|