Age | Commit message | Author | |
---|---|---|---|
2023-12-15 | use opus_(re)alloc and opus_free for dnn and DRED related functions [opus-ng-fix-alloc] | Michael Klingbeil | |
2023-12-14 | handle extensions in opus_repacketizer_out_range_impl | Michael Klingbeil | |
2023-12-06 | add extensions of the first frame of a multiframe packet | Michael Klingbeil | |
2023-12-06 | Fix RESYNTH bit rot | Jean-Marc Valin | |
2023-11-30 | use vec_avx.h for MSVC builds | Michael Klingbeil | |
2023-11-30 | don't redefine _mm_loadu_si32 on MSVC | Michael Klingbeil | |
2023-11-30 | Defining __SSEx__ macros when needed for MSVC | Jean-Marc Valin | |
2023-11-29 | fix autogen.bat model download | Michael Klingbeil | |
2023-11-29 | Add a script to shrink the DNN models | Jean-Marc Valin | |
Removes float debug weights, as well as useless spaces | |||
2023-11-29 | Fix Windows path | Jean-Marc Valin | |
2023-11-29 | Fix model download path for windows | Jean-Marc Valin | |
2023-11-29 | Opus github ci files | Jean-Marc Valin | |
Use OPUS_DRED instead of NEURAL_FEC | |||
2023-11-29 | Add dotprod support to meson | Jean-Marc Valin | |
Also default to disabling dnn float debugging | |||
2023-11-29 | Trying to fix/update meson build | Jean-Marc Valin | |
Still don't quite know what I'm doing | |||
2023-11-28 | Oops, fix the fixed-point build | Jean-Marc Valin | |
2023-11-28 | Trying to use fma instructions when possible | Jean-Marc Valin | |
Compilers sometimes replace vmlaq*() with fmul+fadd instead of fmla. Trying to use vfmaq*() instead when possible. | |||
2023-11-28 | FARGAN model update | Jean-Marc Valin | |
Finished adversarial training on 800k model. Also, move weights to a new location. | |||
2023-11-28 | Fixes for ARMv7/AArch32 | Jean-Marc Valin | |
1) Enable asm/intrinsics even for floating-point; 2) Make sure ARMv8 asimd enables EDSP/MEDIA/Neon; 3) Add dotp architecture to rtcd table since AArch *can* have dotp | |||
2023-11-28 | Enabling DNN optimizations for ARMv7 | Jean-Marc Valin | |
Adds RTCD tables for compute_activation() and compute_conv2d() | |||
2023-11-28 | Only force auto-vectorization for GCC >= 5.1 | Jean-Marc Valin | |
2023-11-28 | Force vectorization for DNN primitives | Jean-Marc Valin | |
Avoids having to write intrinsics for simple loops | |||
2023-11-27 | Enable floating-point approximations by default | Jean-Marc Valin | |
Enabling only on platforms that have been tested just in case we run into a non-IEEE754 platform where they would break. | |||
2023-11-27 | Fix ARMv7 optimizations for DNN code | Jean-Marc Valin | |
2023-11-26 | First step towards DNN optimization for ARMv7 Neon | Jean-Marc Valin | |
Still missing some intrinsics | |||
2023-11-26 | Fix potential read out of bounds in fargan | Jean-Marc Valin | |
2023-11-25 | Adding dotprod instruction to ARM rtcd | Jean-Marc Valin | |
Used for DNN matrix multiplies | |||
2023-11-25 | Speed up cross-correlation normalization | Jean-Marc Valin | |
2023-11-25 | Use arch-specific celt_inner_prod() for features | Jean-Marc Valin | |
2023-11-25 | Optimize biquad() to reduce dependency chains | Jean-Marc Valin | |
2023-11-24 | Remove process_single_frame() | Jean-Marc Valin | |
Code moved to compute_frame_features() | |||
2023-11-24 | Remove feature writing (fwrite()) from libopus | Jean-Marc Valin | |
2023-11-22 | Using the same condition for enabling rtcd | Jean-Marc Valin | |
for cmake, force PRESUME_SSE4_1 on PRESUME_AVX2 | |||
2023-11-22 | Trying to fix CMake build | Jean-Marc Valin | |
aka banging on it until it builds on my machine. Further improvements welcome | |||
2023-11-21 | Add rtcd for silk_inner_product_FLP() | Jean-Marc Valin | |
2023-11-21 | Start enabling AVX2 silk_inner_product_FLP() | Jean-Marc Valin | |
Not yet with rtcd | |||
2023-11-21 | Avoids AVX2 optimizations being disabled | Jean-Marc Valin | |
2023-11-21 | Use SILK VBR when using CBR with DRED | Jean-Marc Valin | |
DRED will absorb the bitrate variation | |||
2023-11-21 | Misc fixes on previous patch | Jean-Marc Valin | |
Fixes warnings, undefined behaviour, and check-asm failure | |||
2023-11-21 | Optimize NSQ_del_dec() for AVX2 | Victor Ding | |
The optimization is bit-exact with the C function. It speeds up the SILK encoder (floating point) as follows. AMD Zen: complexity 0-5: 0%; complexity 6-7: 3-7%; complexity 8-10: 8-15%. Intel Skylake: complexity 0-5: 0%; complexity 6-7: 14-18%; complexity 8-10: 17-22%. Adapted by Jean-Marc Valin. | |||
2023-11-21 | AVX2 version of silk_inner_product_FLP() | Jean-Marc Valin | |
Not hooked up | |||
2023-11-21 | Remove AVX pitch code for fixed-point | Jean-Marc Valin | |
2023-11-21 | Speeding up transient_analysis() | Jean-Marc Valin | |
Reducing dependency chains | |||
2023-11-20 | Make sure weights files are marked as modified | Jean-Marc Valin | |
2023-11-18 | Speed up silk_warped_autocorrelation_FLP() | Jean-Marc Valin | |
Reducing the dependency chain between tmp1 and tmp2 at the cost of an extra multiply. | |||
2023-11-18 | Add rtcd support for celt_pitch_xcorr_avx2() | Jean-Marc Valin | |
2023-11-18 | Fix non-RTCD case when SSE is not assumed present | Jean-Marc Valin | |
Should never occur on amd64, but it could on 32-bit x86 | |||
2023-11-18 | Use celt_pitch_xcorr_avx2() when guaranteed | Jean-Marc Valin | |
No RTCD yet | |||
2023-11-17 | Adding RTCD for compute_conv2d() | Jean-Marc Valin | |
2023-11-16 | FARGAN model update | Jean-Marc Valin | |
2023-11-16 | Smaller version of fargan | Jean-Marc Valin | |
800k parameters, 600 MFLOPS, with a receptive field of 3 feature vectors |