Age | Commit message (Collapse) | Author |
|
ffmpeg_generic_seek_workaround did work properly and our start pts
calculation was wrong.
Reviewed By: Richard Antalik
Differential Revision: http://developer.blender.org/D11562
|
|
Because of the added sanity checks in rB14508ef100c9 (D11492), seeking
in proxies would not work correctly any more. This is because it wasn't
working as intended before, but in most cases this wouldn't be
noticeable. However now when the sanity checks are tripped it is very
noticeable that something is wrong
The indexer tried to use dts values for time stamps when we used pts in
our decode functions to get the time positions. This would make it
start in the wrong GOP frames when searching. Now that we enforce no
crossing of GOP frames when decoding after seek, this would lead to
issues.
Now we correctly use pts (or dts if pts is not available) and thus we
don't have any seeking issues because of time stamp format missmatch.
Reviewed By: Richard Antalik
Differential Revision: http://developer.blender.org/D11561
|
|
Reviewed By: Richard Antalik
Differential Revision: http://developer.blender.org/D11567
|
|
When sampling ImBuf can be a char or a float buffer. Current sampling
functions added overhead by checking which kind of buffer was passed
every pixel that was sampled. When performing image processing this
check can be removed outside the inner loop adding 5% of performance
increase in the `IMB_transform` operator.
|
|
Inside the sequencer the cropping and transform of images/buffers were
implemented locally. This reduced the optimizations that a compiler
could do and added confusing code styles. This patch adds
`IMB_transform` to reduce the confusion and increases compiler
optimizations as more code can be inlined and we can keep track of
indices inside the inner loop.
This increases end-user performance by 30% when playing back aa video
in VSE.
Reviewed By: ISS, zeddb
Differential Revision: https://developer.blender.org/D11549
|
|
This change was prompted by D6408 which moves thumbnail extraction into
a shared function that happens use these endian defines but only links
blenlib.
There is no need for these defines to be associated with globals
so move into their own header.
|
|
We need to unref the packet to tell ffmpeg it is ok to free it after
use.
|
|
Under some circumstances using task isolation can cause deadlocks.
Previously, our task pool implementation would run all tasks in an
isolated region. Now using task isolation is optional and can be
turned on/off for individual task pools.
Task pools that spawn new tasks recursively should never enable
task isolation. There is a new check that finds these cases at runtime.
Right now this check is disabled, so that this commit is a pure refactor.
It will be enabled in an upcoming commit.
This fixes T88598.
Differential Revision: https://developer.blender.org/D11415
|
|
We should only check if the new pts value lies inside the duration of
the current frame.
|
|
Fixed the logic for seeking in ffmpeg video files.
The main fix is that we now apply a small offset in ffmpeg_get_seek_pos
to make sure we don't get the frame in front of the seek position when
seeking backward.
The rest of the changes is general cleanup and untangling code.
Reviewed By: Richard Antalik
Differential Revision: http://developer.blender.org/D11492
|
|
|
|
Images with 4:2:2 and 4:4:4 chroma subsampling were blurred when
`SWS_FAST_BILINEAR` interpolation is set for `anim->img_convert_ctx`.
Use `SWS_BILINEAR` interpolation for all movies, as performance is
not impacted by this change.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D11457
|
|
Changes in rBce649c73446e, affected established proxy codec preset.
Presets were not working and all presets were similar to `veryfast`.
Tunes are now working too, so `fastdecode` tune can be used. I have
measured little improvement, but I tested this only on 2 machines and
I have been informed that `fastdecode` tune does influence decoding
performance for some users.
Change preset from `slow` to `veryfast` and add tune `fastdecode`
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D11454
|
|
The issue was two fold. We didn't properly:
1. Initialize the codec default values which would lead to VLC
complaining because of garbage/wrong codec settings.
2.Calculate the time base for the video. FFmpeg would happily accept
this but VLC seems to assume the time base value is at least somewhat
correct and couldn't properly display the frames as the internal time
base was huge. We are talking about 90k ticks (tbn) for one second of
video!
This patch initializes all codecs to use their default values and fixes
the time base calculation so it follows the guidelines from ffmpeg.
Reviewed By: Sergey, Richard Antalik
Differential Revision: http://developer.blender.org/D11426
|
|
Before the FFmpeg commit: github.com/FFmpeg/FFmpeg/commit/1c0885334dda9ee8652e60c586fa2e3674056586
FFmpeg would use deprecated variables to calculate the video fps.
We don't use these deprecated variables anymore, so ensure that the
duration is correct in ffmpeg versions without this fix.
Reviewed By: Sergey, Richard Antalik
Differential Revision: http://developer.blender.org/D11417
|
|
We didn't initialize the scaled proxy frame properly.
This would lead to issues in ffmpeg 4.4 as they are more strict that the API is properly used.
Now we initialize the size and format of the frame.
|
|
Event though in practice this wasn't causing problems as the fixed size
buffers are generally large enough not to truncate text.
Using the result from `snprint` or `BLI_snprintf` to step over a fixed
size buffer allows for buffer overruns as the returned value is the size
needed to copy the entire string, not the number of bytes copied.
Building strings using this convention with multiple calls:
ofs += BLI_snprintf(str + ofs, str_len_max - ofs);
.. caused the size argument to become negative,
wrapping it to a large value when cast to the unsigned argument.
|
|
Includes fixes to misspelled function names.
Ref D11280
|
|
There need to be more cleanup for ffmpeg 4.5 (ffmpeg master branch).
However this now compiles on ffmpeg 4.4 without and deprication
warnings.
Reviewed By: Sergey, Richard Antalik
Differential Revision: http://developer.blender.org/D10338
|
|
|
|
|
|
|
|
|
|
Ref T78710
|
|
|
|
GOP size and quality are adjusted for h264 codec.
These new values are based on result of benchmark on 9 random files:
https://docs.google.com/spreadsheets/d/1nOyUGjoVWUyhQ2y2lAd8VtFfyaY1wQNGj1krCCNbk7Y/edit?usp=sharing
Reducing quality to 50 reduces proxy filesize by about 2x on average
and has no significant impact on decoding performance.
Increasing GOP size from 2 to 10 also reduces proxy filesize 2x-3x
while scrubbing is only about 8% slower. It is still around 100FPS
with 1920x1080 media.
This is unfortunately about 50% slower than MJPEG, but this can be
improved with `fastdecode` tune applied to libx264 encoder
Quite surprisingly h264 codec presets had little influence on proxy
building performance as well as proxy filesize. So far it looks that
FFmpeg does initialize encoder in different way then Blender.
This applies mot only for presets but for tune and profile libx264
setting.
Once this issue is resolved, performance of proxies may be optimized
further.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D10897
|
|
Adds appropriate checks/guards around all the untrusted parameters
which are used for reading from memory.
Validation:
- All the crashing files within the bug have been checked to not causes
crashes any longer>
- A handful of correct .bmp were validated: 3 different files at each
of 1, 4, 8, 24, 32 bpp depth along with a random variety of other 24
bpp files (around 20 in total).
- ~280 million iterations of fuzzing using AFL were completed with 0
crashes. The old code experienced several dozen crashes in first
minutes of running {F8584509}.
Ref D7945
|
|
This was missing from rB19dfb6ea1f6745c0dbc2ce21839c30184b553878.
|
|
This removes a lot of unnecessary code that is generated by
the compiler automatically.
In very few cases, a defaulted destructor in a .cc file is
still necessary, because of forward declarations in the header.
I removed some defaulted virtual destructors, because they are not
necessary, when the parent class has a virtual destructor already.
Defaulted constructors are only necessary when there is another
constructor, but the class should still be default constructible.
Differential Revision: https://developer.blender.org/D10911
|
|
This condition was in contradiction with comment for function
`ffmpeg_generic_seek_workaround()`.
I have noticed, that formats that seeked well used this workaround.
Problem was that I misunderstood code from `av_seek_frame()` - formats
with `read_seek()` function stil don't use generic seeking method.
This is defined in `seek_frame_internal()`
|
|
`ffmpeg_generic_seek_workaround()` applied negative offset for seqrched
packet timestamp, but proxies always start from 0 and timestamp can be
negative.
Limit timestamp value to 0, because `av_seek_frame()` doesn't accept
negative timestamps and returns with error. This prevents seeking from
working correctly.
|
|
Most imbuf loaders already did this, use early exit for the remaining
loaders that didn't.
|
|
|
|
`av_seek_frame()` failed to seek to nearest I-frame. This seems to be
a bug or not implemented feature in FFmpeg. Looks like same issue as
ticket https://trac.ffmpeg.org/ticket/1607 on ffmpeg tracker.
If seeking is done using format specific function (`read_seek2`)
field of `AVInputFormat` is set, `see av_seek_frame()`, use
`av_seek_frame()` function. Otherwise use wrapper that actively searches
for I-frame packet.
Searching is flexible and tries to do minimum amount of work. Currently
it is limited to equivalent of 25 frames, which may not be enough for
some files, but there may be files with no I-frames at all, so it is
best to keep this limit as low as possible. Previously this problem was
masked by preseek, which was hard-coded to 25 frames. This was removed
in rB88604b79b7d1.
If this approach would be unnecessary for some formats, in worst case
file would be seeked 2 times which is very fast, so there will be no
visible impact on performance.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D10845
|
|
|
|
When reading metadata from image files the metadata is trimmed to 1024.
For cryptomatte the metadata can contain json data and should not be
trimmed. Resulting additional checks in the manifest parser for
incomplete json data.
You could argue to add an exception for cryptomatte, but that would
still allows misuse. When the direction of this patch is accepted we
should consider removing `maxlen` from `IDP_AssignString` as it
doesn't seem to be used anywhere else.
Reviewed By: #images_movies, mont29, sergey
Differential Revision: https://developer.blender.org/D10825
|
|
Generalize threading settings in proxy building and use them for encoding
and decoding in general. Check codec capabilities, prefer FF_THREAD_FRAME
threading over FF_THREAD_SLICE and automatic thread count over setting it
explicitly.
ffmpeg-codecs man page suggests that threads option is global and used by
codecs, that supports this option. Form some tests I have done, it seems that
`av_dict_set_int(&codec_opts, "threads", BLI_system_thread_count(), 0)`
has same effect as
```
pCodecCtx->thread_count = BLI_system_thread_count();
pCodecCtx->thread_type = FF_THREAD_FRAME;
```
Looking at `ff_frame_thread_encoder_init()` code, these cases are not
equivalent. It is probably safer to leave threading setup on libavcodec than
setting up each codec threading individually.
From what I have read all over the internet, frame multithreading should be
faster than slice multithreading. Slice multithreading is mainly used for low
latency streaming.
When running Blender with --debug-ffmpeg it complains about
`pCodecCtx->thread_count = BLI_system_thread_count()` that using thread count
above 16 is not recommended. Using too many threads can negatively affect image
quality, but I am not sure if this is the case for decoding as well - see
https://streaminglearningcenter.com/blogs/ffmpeg-command-threads-how-it-affects-quality-and-performance.html
This is fine for proxies but may be undesirable for final renders.
Number of threads is limited by image size, because of size of motion vectors,
so if it is possible let libavcodec determine optimal thread count.
Performance difference:
Proxy building: None
Playback speed: 2x better on 1920x1080 sample h264 file
Scrubbing: Hard to quantify, but it's much more responsive
Rendering speed: None on 1920x1080 sample h264 file, there is improvement with codecs that do support FF_THREAD_FRAME for encoding like MPNG
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D10791
|
|
Split seeking section of `ffmpeg_fetchibuf()` function into multiple
smaller functions.
Conditional statements are moved to own funtions with human readable
names, so code flow is more clear.
To remove one branch of seeking, first frame is now decoded by
scanning, which will do only one iteration. So nothing has technically
changed.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D10638
|
|
This was included for `FILE *` which isn't used in the header.
Ref D10799
|
|
|
|
|
|
Proxy coded has been changed to h264. Error code is more generic now.
|
|
Use h264 codec for output. This codec produces smaller files, can be
multithreaded and decodes even faster than MJPEG.
Quality setting 0-100 corresponds to "Lowest Quality" to
"Perceptually Lossless" in Blender's h264 encoding presets.
All available cores are used for decoding.
Same goes for decoding but only for codecs that supports this
(h264, vp9 seems to support this option out of th box as well).
Other decoders can probably be optimized in similar way, but threaded
encoding provides significant boost already.
I have tested variety of codecs, and all were transcoded properly.
Reviewed By: sergey, fsiddi
Differential Revision: https://developer.blender.org/D10731
|
|
|
|
Applying negative offset to seek position before scanning doesnn't have
any effect. This change results in 1.5x faster seeking (random frame,
average value) in sample file with 30 frame GOP length.
If I am not mistaken, B frames can have pts that can be less than
pts of I frame that must be decoded. Even in this case though, B frame
packet will be stored after that I frame.
In addition, preseek value is de facto hardcoded so seeking would fail
if it could. This can be hard to spot though.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D10529
|
|
Removal of 'camera frame' around blend file thumbnail images.
Differential Revision: https://developer.blender.org/D10490
Reviewed by Brecht Van Lommel
|
|
|
|
And fix a (harmless) compiler warning.
|
|
|
|
|