Age | Commit message (Collapse) | Author |
|
|
|
Also use doxy style function reference `#` prefix chars when
referencing identifiers.
|
|
|
|
|
|
More stupid mistake in recent enhanced reports for file load code,
rB82c17082ba0e left some read-after-free situations.
|
|
|
|
This reverts commit rB3a48147b8ab92, and fixes the issues with linking
etc.
Change compared to previous buggy commit (rBf8d219dfd4c31) is that
new `BlendFileReadReports` reports are now passed to the lowest level
function generating the `FileData` (`filedata_new()`), which ensures
(and asserts) that all code using it does have a valid non-NULL pointer
to a `BlendFileReadReport` data.
Sorry for the noise, it's always when you think a change is trivial and
do not test it well enough that you end up doing those kind of
mistakes...
|
|
|
|
After looking into task isolation issues with Sergey, we couldn't find the
reason behind the deadlocks that we are getting in T87938 and a Sprite Fright
file involving motion blur renders.
There is no apparent place where we adding or waiting on tasks in a task group
from different isolation regions, which is what is known to cause problems. Yet
it still hangs. Either we do not understand some limitation of TBB isolation,
or there is a bug in TBB, but we could not figure it out.
Instead the idea is to use isolation only where we know we need it: when
holding a mutex lock and then doing some multithreaded operation within that
locked region. Three places where we do this now:
* Generated images
* Cached BVH tree building
* OpenVDB lazy grid loading
Compared to the more automatic approach previously used, there is the downside
that it is easy to miss places where we need isolation. Yet doing it more
automatically is also causing unexpected issue and bugs that we found no
solution for, so this seems better.
Patch implemented by Sergey and me.
Differential Revision: https://developer.blender.org/D11603
|
|
If the last decoded frame had the same timestamp as the GOP current
packet, then we would skip over this frame when fast forwarding and we
would seek until the end of the file.
This would could only be triggered reliably in single threaded mode.
Reviewed By: Richard Antalik
Differential Revision: http://developer.blender.org/D11601
|
|
On the Windows platform there will be some errors printed to the
console if the user's thumbnail cache folder doesn't already exist.
While creating those folders there is an attempt to do so multiple
times and so we get errors when trying to create when exists. This
is caused by paths that have hard-coded forward slashes, which causes
our path processing routines to not work correctly. This patch defines
those paths using platform-varying separator characters.
Differential Revision: https://developer.blender.org/D11505
Reviewed by Brecht Van Lommel
|
|
Scanline processor did its own heurestic what didn't scale well when
having a multiple cores. In stead of using our own code this patch will
leave it to TBB to determine how to split the scanlines over the
available threads.
Performance of the IMB_transform before this change was 0.002123s, with
this change 0.001601s. This change increases performance in other areas
as well including color management conversions.
Reviewed By: zeddb
Differential Revision: https://developer.blender.org/D11578
|
|
ffmpeg_generic_seek_workaround did work properly and our start pts
calculation was wrong.
Reviewed By: Richard Antalik
Differential Revision: http://developer.blender.org/D11562
|
|
Because of the added sanity checks in rB14508ef100c9 (D11492), seeking
in proxies would not work correctly any more. This is because it wasn't
working as intended before, but in most cases this wouldn't be
noticeable. However now when the sanity checks are tripped it is very
noticeable that something is wrong
The indexer tried to use dts values for time stamps when we used pts in
our decode functions to get the time positions. This would make it
start in the wrong GOP frames when searching. Now that we enforce no
crossing of GOP frames when decoding after seek, this would lead to
issues.
Now we correctly use pts (or dts if pts is not available) and thus we
don't have any seeking issues because of time stamp format missmatch.
Reviewed By: Richard Antalik
Differential Revision: http://developer.blender.org/D11561
|
|
Reviewed By: Richard Antalik
Differential Revision: http://developer.blender.org/D11567
|
|
When sampling ImBuf can be a char or a float buffer. Current sampling
functions added overhead by checking which kind of buffer was passed
every pixel that was sampled. When performing image processing this
check can be removed outside the inner loop adding 5% of performance
increase in the `IMB_transform` operator.
|
|
Inside the sequencer the cropping and transform of images/buffers were
implemented locally. This reduced the optimizations that a compiler
could do and added confusing code styles. This patch adds
`IMB_transform` to reduce the confusion and increases compiler
optimizations as more code can be inlined and we can keep track of
indices inside the inner loop.
This increases end-user performance by 30% when playing back aa video
in VSE.
Reviewed By: ISS, zeddb
Differential Revision: https://developer.blender.org/D11549
|
|
This change was prompted by D6408 which moves thumbnail extraction into
a shared function that happens use these endian defines but only links
blenlib.
There is no need for these defines to be associated with globals
so move into their own header.
|
|
We need to unref the packet to tell ffmpeg it is ok to free it after
use.
|
|
Under some circumstances using task isolation can cause deadlocks.
Previously, our task pool implementation would run all tasks in an
isolated region. Now using task isolation is optional and can be
turned on/off for individual task pools.
Task pools that spawn new tasks recursively should never enable
task isolation. There is a new check that finds these cases at runtime.
Right now this check is disabled, so that this commit is a pure refactor.
It will be enabled in an upcoming commit.
This fixes T88598.
Differential Revision: https://developer.blender.org/D11415
|
|
We should only check if the new pts value lies inside the duration of
the current frame.
|
|
Fixed the logic for seeking in ffmpeg video files.
The main fix is that we now apply a small offset in ffmpeg_get_seek_pos
to make sure we don't get the frame in front of the seek position when
seeking backward.
The rest of the changes is general cleanup and untangling code.
Reviewed By: Richard Antalik
Differential Revision: http://developer.blender.org/D11492
|
|
|
|
Images with 4:2:2 and 4:4:4 chroma subsampling were blurred when
`SWS_FAST_BILINEAR` interpolation is set for `anim->img_convert_ctx`.
Use `SWS_BILINEAR` interpolation for all movies, as performance is
not impacted by this change.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D11457
|
|
Changes in rBce649c73446e, affected established proxy codec preset.
Presets were not working and all presets were similar to `veryfast`.
Tunes are now working too, so `fastdecode` tune can be used. I have
measured little improvement, but I tested this only on 2 machines and
I have been informed that `fastdecode` tune does influence decoding
performance for some users.
Change preset from `slow` to `veryfast` and add tune `fastdecode`
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D11454
|
|
The issue was two fold. We didn't properly:
1. Initialize the codec default values which would lead to VLC
complaining because of garbage/wrong codec settings.
2.Calculate the time base for the video. FFmpeg would happily accept
this but VLC seems to assume the time base value is at least somewhat
correct and couldn't properly display the frames as the internal time
base was huge. We are talking about 90k ticks (tbn) for one second of
video!
This patch initializes all codecs to use their default values and fixes
the time base calculation so it follows the guidelines from ffmpeg.
Reviewed By: Sergey, Richard Antalik
Differential Revision: http://developer.blender.org/D11426
|
|
Before the FFmpeg commit: github.com/FFmpeg/FFmpeg/commit/1c0885334dda9ee8652e60c586fa2e3674056586
FFmpeg would use deprecated variables to calculate the video fps.
We don't use these deprecated variables anymore, so ensure that the
duration is correct in ffmpeg versions without this fix.
Reviewed By: Sergey, Richard Antalik
Differential Revision: http://developer.blender.org/D11417
|
|
We didn't initialize the scaled proxy frame properly.
This would lead to issues in ffmpeg 4.4 as they are more strict that the API is properly used.
Now we initialize the size and format of the frame.
|
|
Event though in practice this wasn't causing problems as the fixed size
buffers are generally large enough not to truncate text.
Using the result from `snprint` or `BLI_snprintf` to step over a fixed
size buffer allows for buffer overruns as the returned value is the size
needed to copy the entire string, not the number of bytes copied.
Building strings using this convention with multiple calls:
ofs += BLI_snprintf(str + ofs, str_len_max - ofs);
.. caused the size argument to become negative,
wrapping it to a large value when cast to the unsigned argument.
|
|
Includes fixes to misspelled function names.
Ref D11280
|
|
There need to be more cleanup for ffmpeg 4.5 (ffmpeg master branch).
However this now compiles on ffmpeg 4.4 without and deprication
warnings.
Reviewed By: Sergey, Richard Antalik
Differential Revision: http://developer.blender.org/D10338
|
|
|
|
|
|
|
|
|
|
Ref T78710
|
|
|
|
GOP size and quality are adjusted for h264 codec.
These new values are based on result of benchmark on 9 random files:
https://docs.google.com/spreadsheets/d/1nOyUGjoVWUyhQ2y2lAd8VtFfyaY1wQNGj1krCCNbk7Y/edit?usp=sharing
Reducing quality to 50 reduces proxy filesize by about 2x on average
and has no significant impact on decoding performance.
Increasing GOP size from 2 to 10 also reduces proxy filesize 2x-3x
while scrubbing is only about 8% slower. It is still around 100FPS
with 1920x1080 media.
This is unfortunately about 50% slower than MJPEG, but this can be
improved with `fastdecode` tune applied to libx264 encoder
Quite surprisingly h264 codec presets had little influence on proxy
building performance as well as proxy filesize. So far it looks that
FFmpeg does initialize encoder in different way then Blender.
This applies mot only for presets but for tune and profile libx264
setting.
Once this issue is resolved, performance of proxies may be optimized
further.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D10897
|
|
Adds appropriate checks/guards around all the untrusted parameters
which are used for reading from memory.
Validation:
- All the crashing files within the bug have been checked to not causes
crashes any longer>
- A handful of correct .bmp were validated: 3 different files at each
of 1, 4, 8, 24, 32 bpp depth along with a random variety of other 24
bpp files (around 20 in total).
- ~280 million iterations of fuzzing using AFL were completed with 0
crashes. The old code experienced several dozen crashes in first
minutes of running {F8584509}.
Ref D7945
|
|
This was missing from rB19dfb6ea1f6745c0dbc2ce21839c30184b553878.
|
|
This removes a lot of unnecessary code that is generated by
the compiler automatically.
In very few cases, a defaulted destructor in a .cc file is
still necessary, because of forward declarations in the header.
I removed some defaulted virtual destructors, because they are not
necessary, when the parent class has a virtual destructor already.
Defaulted constructors are only necessary when there is another
constructor, but the class should still be default constructible.
Differential Revision: https://developer.blender.org/D10911
|
|
This condition was in contradiction with comment for function
`ffmpeg_generic_seek_workaround()`.
I have noticed, that formats that seeked well used this workaround.
Problem was that I misunderstood code from `av_seek_frame()` - formats
with `read_seek()` function stil don't use generic seeking method.
This is defined in `seek_frame_internal()`
|
|
`ffmpeg_generic_seek_workaround()` applied negative offset for seqrched
packet timestamp, but proxies always start from 0 and timestamp can be
negative.
Limit timestamp value to 0, because `av_seek_frame()` doesn't accept
negative timestamps and returns with error. This prevents seeking from
working correctly.
|
|
Most imbuf loaders already did this, use early exit for the remaining
loaders that didn't.
|
|
|
|
`av_seek_frame()` failed to seek to nearest I-frame. This seems to be
a bug or not implemented feature in FFmpeg. Looks like same issue as
ticket https://trac.ffmpeg.org/ticket/1607 on ffmpeg tracker.
If seeking is done using format specific function (`read_seek2`)
field of `AVInputFormat` is set, `see av_seek_frame()`, use
`av_seek_frame()` function. Otherwise use wrapper that actively searches
for I-frame packet.
Searching is flexible and tries to do minimum amount of work. Currently
it is limited to equivalent of 25 frames, which may not be enough for
some files, but there may be files with no I-frames at all, so it is
best to keep this limit as low as possible. Previously this problem was
masked by preseek, which was hard-coded to 25 frames. This was removed
in rB88604b79b7d1.
If this approach would be unnecessary for some formats, in worst case
file would be seeked 2 times which is very fast, so there will be no
visible impact on performance.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D10845
|
|
|
|
When reading metadata from image files the metadata is trimmed to 1024.
For cryptomatte the metadata can contain json data and should not be
trimmed. Resulting additional checks in the manifest parser for
incomplete json data.
You could argue to add an exception for cryptomatte, but that would
still allows misuse. When the direction of this patch is accepted we
should consider removing `maxlen` from `IDP_AssignString` as it
doesn't seem to be used anywhere else.
Reviewed By: #images_movies, mont29, sergey
Differential Revision: https://developer.blender.org/D10825
|
|
Generalize threading settings in proxy building and use them for encoding
and decoding in general. Check codec capabilities, prefer FF_THREAD_FRAME
threading over FF_THREAD_SLICE and automatic thread count over setting it
explicitly.
ffmpeg-codecs man page suggests that threads option is global and used by
codecs, that supports this option. Form some tests I have done, it seems that
`av_dict_set_int(&codec_opts, "threads", BLI_system_thread_count(), 0)`
has same effect as
```
pCodecCtx->thread_count = BLI_system_thread_count();
pCodecCtx->thread_type = FF_THREAD_FRAME;
```
Looking at `ff_frame_thread_encoder_init()` code, these cases are not
equivalent. It is probably safer to leave threading setup on libavcodec than
setting up each codec threading individually.
From what I have read all over the internet, frame multithreading should be
faster than slice multithreading. Slice multithreading is mainly used for low
latency streaming.
When running Blender with --debug-ffmpeg it complains about
`pCodecCtx->thread_count = BLI_system_thread_count()` that using thread count
above 16 is not recommended. Using too many threads can negatively affect image
quality, but I am not sure if this is the case for decoding as well - see
https://streaminglearningcenter.com/blogs/ffmpeg-command-threads-how-it-affects-quality-and-performance.html
This is fine for proxies but may be undesirable for final renders.
Number of threads is limited by image size, because of size of motion vectors,
so if it is possible let libavcodec determine optimal thread count.
Performance difference:
Proxy building: None
Playback speed: 2x better on 1920x1080 sample h264 file
Scrubbing: Hard to quantify, but it's much more responsive
Rendering speed: None on 1920x1080 sample h264 file, there is improvement with codecs that do support FF_THREAD_FRAME for encoding like MPNG
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D10791
|
|
Split seeking section of `ffmpeg_fetchibuf()` function into multiple
smaller functions.
Conditional statements are moved to own funtions with human readable
names, so code flow is more clear.
To remove one branch of seeking, first frame is now decoded by
scanning, which will do only one iteration. So nothing has technically
changed.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D10638
|