Age | Commit message | Author |
|
Fixes #42
|
|
pthread_jit_write_protect_np() is declared but cannot be used on the
iOS simulator:
../orc/orccompiler.c:460:3: error: 'pthread_jit_write_protect_np' is unavailable: not available on iOS
Fixes #43
|
|
Other backends would not work or do not make sense
|
|
Use the Windows API FlushInstructionCache() to flush the CPU cache on ARM64
Windows. As a consequence, include windows.h with WIN32_LEAN_AND_MEAN defined.
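A minimal sketch of the idea, not the actual orc code: the wrapper name `sketch_flush_icache` is hypothetical, and the non-Windows fallback to the GCC/Clang builtin `__builtin___clear_cache` is an assumption added only so the example builds elsewhere.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
/* Keep windows.h small, as the commit message describes. */
#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN
#endif
#include <windows.h>
#endif

/* Hypothetical helper: make freshly written JIT code visible to the
 * instruction fetch path. Returns 1 on success, 0 if unsupported or failed. */
static int sketch_flush_icache(void *start, size_t size)
{
  assert(start != NULL);
#ifdef _WIN32
  /* FlushInstructionCache() returns a nonzero BOOL on success. */
  return FlushInstructionCache(GetCurrentProcess(), start, size) != 0;
#elif defined(__GNUC__)
  /* Assumed fallback for GCC/Clang on other platforms. */
  __builtin___clear_cache((char *)start, (char *)start + size);
  return 1;
#else
  (void)size;
  return 0;
#endif
}
```

A JIT would call this once after copying generated code into executable memory and before jumping to it.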
|
|
ARM64 Windows is supported on ARMv8 CPUs only, so just assume that the
NEON and EDSP ARM instructions are available.
|
|
unistd.h and sys/time.h may not be universally available, so only include them
if they were found at configure time.
|
|
On UNIX toolchains cross compiling for Windows, winpthread will be detected and
added as a dependency even though it's not used.
|
|
If the library is compiled statically, the define also needs to be set in the
orc-0.4.pc file so that users of the library (in a UNIX toolchain cross
compiling to Windows) do not import the functions as DLL imports.
|
|
Fixes #40
|
|
.. instead of the deprecated meson.has_exe_wrapper()
|
|
https://gitlab.freedesktop.org/gstreamer/orc/-/issues/27
|
|
Fix shifted outputs when the output array is 8-byte aligned but not 16-byte aligned and the loop shift is 1.
Fixes #32
Signed-off-by: Gaetan Bahl <gaetan.bahl@nxp.com>
|
|
Set the FPCR.FZ bit before running tests using ARM NEON,
in order to make tests pass for most opcodes.
Add a way to check for expected failures in the test suite,
since ARM NEON does not comply with IEEE 754.
Errors are expected when using divf on large numbers and
sqrtf on small numbers.
Fixes #33, #20
Signed-off-by: Gaetan Bahl <gaetan.bahl@nxp.com>
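As a hedged illustration of setting FPCR.FZ (bit 24 of the AArch64 floating-point control register), the sketch below reads, modifies, and writes the register with inline assembly. The function and macro names are hypothetical and this is not the test-suite code itself; on non-AArch64 builds the helper reports that nothing was done.

```c
#include <assert.h>
#include <stdint.h>

/* FZ (flush-to-zero) is bit 24 of FPCR. */
#define SKETCH_FPCR_FZ (UINT64_C(1) << 24)

/* Pure helper: return the FPCR value with FZ set. */
static uint64_t sketch_with_flush_to_zero(uint64_t fpcr)
{
  return fpcr | SKETCH_FPCR_FZ;
}

/* Hypothetical helper: enable flush-to-zero for the calling thread.
 * Returns 1 if the bit was set, 0 on targets where this sketch does nothing. */
static int sketch_enable_flush_to_zero(void)
{
#if defined(__aarch64__) && defined(__GNUC__)
  uint64_t fpcr;
  __asm__ volatile ("mrs %0, fpcr" : "=r" (fpcr));
  fpcr = sketch_with_flush_to_zero(fpcr);
  __asm__ volatile ("msr fpcr, %0" : : "r" (fpcr));
  /* Read back to confirm the write took effect. */
  __asm__ volatile ("mrs %0, fpcr" : "=r" (fpcr));
  assert(fpcr & SKETCH_FPCR_FZ);
  return 1;
#else
  return 0;
#endif
}
```

With FZ set, denormal inputs and results are flushed to zero, so a scalar reference run then treats them the same way as the NEON code under test.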
|
|
This solves an issue where two out of four inputs are not processed
by passing the correct value of vec shift.
Fixes #33, #20.
Signed-off-by: Gaetan Bahl <gaetan.bahl@nxp.com>
|
|
This solves an issue where two out of four input values are not processed
by passing the correct value of vec shift.
Fixes #33, #20.
Signed-off-by: Gaetan Bahl <gaetan.bahl@nxp.com>
|
|
using neon
Setting the correct shift values solves the "out-of-shift" errors
and allows the following operators to compile successfully:
addf, subf, mulf, maxf, minf, cmpeqf, convfl, convlf, addd, subd, muld, divd
Fixes #33, #20, #2.
Signed-off-by: Gaetan Bahl <gaetan.bahl@nxp.com>
|
|
In file included from gstreamer/subprojects/orc/orc/orc.h:7,
from gstreamer/subprojects/orc/orc/orcprogram.h:5,
from gstreamer/subprojects/orc/orc/orccodemem.c:30:
gstreamer/subprojects/orc/orc/orccodemem.c: In function ‘orc_code_region_allocate_codemem_dual_map.constprop’:
gstreamer/subprojects/orc/orc/orcdebug.h:138:3: warning: pointer ‘filename’ may be used after ‘free’ [-Wuse-after-free]
138 | orc_debug_print((level), __FILE__, ORC_FUNCTION, __LINE__, __VA_ARGS__); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gstreamer/subprojects/orc/orc/orcdebug.h:92:26: note: in expansion of macro ‘ORC_DEBUG_PRINT’
92 | #define ORC_WARNING(...) ORC_DEBUG_PRINT(ORC_DEBUG_WARNING, __VA_ARGS__)
| ^~~~~~~~~~~~~~~
gstreamer/subprojects/orc/orc/orccodemem.c:252:5: note: in expansion of macro ‘ORC_WARNING’
252 | ORC_WARNING ("failed to create write map '%s'. err=%i", filename, errno);
| ^~~~~~~~~~~
gstreamer/subprojects/orc/orc/orccodemem.c:234:3: note: call to ‘free’ here
234 | free (filename);
| ^~~~~~~~~~~~~~~
gstreamer/subprojects/orc/orc/orcdebug.h:138:3: warning: pointer ‘filename’ may be used after ‘free’ [-Wuse-after-free]
138 | orc_debug_print((level), __FILE__, ORC_FUNCTION, __LINE__, __VA_ARGS__); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gstreamer/subprojects/orc/orc/orcdebug.h:92:26: note: in expansion of macro ‘ORC_DEBUG_PRINT’
92 | #define ORC_WARNING(...) ORC_DEBUG_PRINT(ORC_DEBUG_WARNING, __VA_ARGS__)
| ^~~~~~~~~~~~~~~
gstreamer/subprojects/orc/orc/orccodemem.c:245:5: note: in expansion of macro ‘ORC_WARNING’
245 | ORC_WARNING("failed to create exec map '%s'. err=%i", filename, errno);
| ^~~~~~~~~~~
gstreamer/subprojects/orc/orc/orccodemem.c:234:3: note: call to ‘free’ here
234 | free (filename);
| ^~~~~~~~~~~~~~~
Fixes: bb5fcb31 ("orccodemem: Report errno during failures to create mmap codemap.")
|
|
Cross and native files are modified versions of the files used in the
gstreamer CI since we use the same Docker image.
Part-of: <https://gitlab.freedesktop.org/gstreamer/orc/-/merge_requests/64>
|
|
The latest image contains VS 2019, and was built in
https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1570
Part-of: <https://gitlab.freedesktop.org/gstreamer/orc/-/merge_requests/63>
|
|
This has a direct impact on bayer2rgb performance. Tested on i.MX8mm aarch64: speed boost of ~17%.
Reason:
The line "loadoffw t, s, -1" results in a silent orc compile error.
Pipeline:
gst-launch-1.0 -v videotestsrc ! video/x-bayer,width=1920,height=1080 ! bayer2rgb ! fpsdisplaysink video-sink=fakesink sync=0
Average performance with fix: 25.21fps
Average performance without fix: 21.60fps
Part-of: <https://gitlab.freedesktop.org/gstreamer/orc/-/merge_requests/62>
|
|
Disable Windows ARM64 support/build for now, because it doesn't work.
Fixes: https://gitlab.freedesktop.org/gstreamer/orc/-/issues/36
Part-of: <https://gitlab.freedesktop.org/gstreamer/orc/-/merge_requests/61>
|
|
orc_executor_set_program() was missed when it was added to
orc_executor_new().
|
|
Fixes the following warning:
WARNING: extract_all_objects called without setting recursive
keyword argument. Meson currently defaults to
non-recursive to maintain backward compatibility but
the default will be changed in the future.
orc-test\meson.build:16:0: ERROR: Fatal warnings enabled, aborting
|
|
|
The build for powerpc with kernel < 4.11 has been broken since version 0.4.30 and
https://gitlab.freedesktop.org/gstreamer/orc/-/commit/a999325abea6a5549d60d99ddeb0271d2aa00235:
FAILED: orc/liborc-0.4.so.0.32.0.p/orccpu-powerpc.c.o
/home/giuliobenetti/autobuild/run/instance-3/output-1/host/bin/powerpc-linux-gcc -Iorc/liborc-0.4.so.0.32.0.p -Iorc -I../orc -I. -I.. -fdiagnostics-color=always -pipe -Wall -Winvalid-pch -std=gnu99 -O3 -DHAVE_CONFIG_H -fvisibility=hidden -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O2 -fPIC -pthread -DORC_ENABLE_UNSTABLE_API -D_GNU_SOURCE -DBUILDING_ORC -MD -MQ orc/liborc-0.4.so.0.32.0.p/orccpu-powerpc.c.o -MF orc/liborc-0.4.so.0.32.0.p/orccpu-powerpc.c.o.d -o orc/liborc-0.4.so.0.32.0.p/orccpu-powerpc.c.o -c ../orc/orccpu-powerpc.c
../orc/orccpu-powerpc.c: In function 'orc_check_powerpc_proc_auxv':
../orc/orccpu-powerpc.c:164:21: error: 'AT_L1D_CACHESIZE' undeclared (first use in this function); did you mean 'AT_DCACHEBSIZE'?
164 | if (buf[i] == AT_L1D_CACHESIZE) {
| ^~~~~~~~~~~~~~~~
| AT_DCACHEBSIZE
../orc/orccpu-powerpc.c:164:21: note: each undeclared identifier is reported only once for each function it appears in
../orc/orccpu-powerpc.c:168:21: error: 'AT_L2_CACHESIZE' undeclared (first use in this function); did you mean 'AT_ICACHEBSIZE'?
168 | if (buf[i] == AT_L2_CACHESIZE) {
| ^~~~~~~~~~~~~~~
| AT_ICACHEBSIZE
../orc/orccpu-powerpc.c:172:21: error: 'AT_L3_CACHESIZE' undeclared (first use in this function); did you mean 'AT_ICACHEBSIZE'?
172 | if (buf[i] == AT_L3_CACHESIZE) {
| ^~~~~~~~~~~~~~~
| AT_ICACHEBSIZE
Indeed, AT_{L1D,L2,L3}_CACHESIZE have only been defined since kernel 4.11 and
https://github.com/torvalds/linux/commit/98a5f361b8625c6f4841d6ba013bbf0e80d08147
Fixes:
- http://autobuild.buildroot.org/results/0821e96cba3e455edd47b87485501d892fc7ac6a
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
Part-of: <https://gitlab.freedesktop.org/gstreamer/orc/-/merge_requests/56>
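One way to avoid this class of build failure is to guard each use of the newer constants with the preprocessor. The sketch below is an assumption about the shape of such a fix, not the merged patch: the function name and the flat (type, value) buffer layout are hypothetical, and `<sys/auxv.h>` is assumed to be where the C library exposes the `AT_*` constants.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__linux__)
#include <sys/auxv.h> /* AT_* constants, where the headers provide them */
#endif

/* Hypothetical helper: scan n_pairs (type, value) auxv entries and return the
 * L1 data cache size, or 0 if the entry is missing or the headers are too
 * old to define AT_L1D_CACHESIZE. */
static unsigned long sketch_find_l1d_cache_size(const unsigned long *buf,
                                                size_t n_pairs)
{
  size_t i;

  assert(buf != NULL || n_pairs == 0);
  for (i = 0; i < n_pairs; i++) {
#ifdef AT_L1D_CACHESIZE
    if (buf[2 * i] == AT_L1D_CACHESIZE)
      return buf[2 * i + 1];
#endif
  }
  return 0;
}
```

With older headers the guarded branch simply compiles away and the caller sees 0, which it can treat as "cache size unknown".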
|
|
Part-of: <https://gitlab.freedesktop.org/gstreamer/orc/-/merge_requests/55>
|
|
Fill in aarch32 opcodes for loadupdb instruction, which is used
by various color space conversion programs. There is likely still
some space for optimization.
|
|
Fill in aarch64 opcodes for loadupdb instruction, which is used
by various color space conversion programs. This is thus far only
available on aarch64, but an arm32 port should be easy.
|
|
Fill in aarch64 opcodes for sqrtf instruction.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for divf instruction.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for double-precision floating point
arithmetic instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Implement support for .flags 2d by adding code for handling the loop
counter. The implementation is very similar to aarch32.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Add support for loading 64-bit constants on aarch64 by emitting a
pc-relative load, a branch past the literal pool, and the constant
itself as a literal pool entry.
Signed-off-by: Marek Vasut <marex@denx.de>
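The sequence described above can be sketched as four 32-bit words. This is an illustration only: the function name is hypothetical, and it assumes the destination is a general-purpose X register and a little-endian target, whereas the real backend may load into a vector register with a different load encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical emitter: write "ldr xRT, <literal>; b <past literal>; .quad value"
 * into out[0..3]. */
static void sketch_emit_load_const64(uint32_t out[4], unsigned rt, uint64_t value)
{
  assert(rt < 31); /* 31 encodes XZR here, which would discard the load */

  /* LDR Xt, label (64-bit literal): 0x58000000 | imm19 << 5 | Rt.
   * The literal sits 8 bytes after this instruction, so imm19 = 8 / 4 = 2. */
  out[0] = UINT32_C(0x58000000) | (UINT32_C(2) << 5) | (uint32_t)rt;

  /* B label: 0x14000000 | imm26. From the branch (offset 4) to the first
   * instruction after the literal (offset 16) is 12 bytes, so imm26 = 3. */
  out[1] = UINT32_C(0x14000000) | UINT32_C(3);

  /* The 64-bit literal, low word first on a little-endian target. */
  out[2] = (uint32_t)(value & UINT32_C(0xffffffff));
  out[3] = (uint32_t)(value >> 32);
}
```

For example, with `rt = 1` and `value = 0x1122334455667788` the emitted words are `0x58000041`, `0x14000003`, `0x55667788`, `0x11223344`. Placing the load on an 8-byte boundary keeps the literal naturally aligned.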
|
|
Add code to store the accumulator register.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for floating point arithmetic instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcode for andn instruction.
This is the bic instruction with reversed operands.
Signed-off-by: Marek Vasut <marex@denx.de>
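The operand reversal can be checked with a scalar model. The sketch below assumes that orc's andn computes `(~a) & b` and uses the architectural definition of BIC, `n & ~m`. The function names are hypothetical and the model works on single bytes rather than vectors.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of BIC Vd, Vn, Vm: clear in n the bits that are set in m. */
static uint8_t sketch_bic(uint8_t n, uint8_t m)
{
  return (uint8_t)(n & ~m);
}

/* andn (assumed to be (~a) & b) expressed as BIC with the operands swapped. */
static uint8_t sketch_andn(uint8_t a, uint8_t b)
{
  return sketch_bic(b, a);
}

/* Exhaustively confirm the identity for every pair of byte values. */
static void sketch_andn_selfcheck(void)
{
  unsigned a, b;

  for (a = 0; a < 256; a++)
    for (b = 0; b < 256; b++)
      assert(sketch_andn((uint8_t)a, (uint8_t)b) == (uint8_t)(~a & b));
}
```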
|
|
This uses the TRN2 instruction twice on the same data, first to expand the
top half-word of the SRC register into the TMP register and then to expand
the top word of the TMP register into the DST register. The following scheme
is implemented: src[ABCDEFGH] -> tmp[ABABEFEF] -> dst[ABABABAB].
Signed-off-by: Marek Vasut <marex@denx.de>
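The two-step scheme above can be reproduced with a small software model of TRN2 on a 64-bit vector. This is a sketch with hypothetical function names, not the emitted code. It models TRN2 as `d[2i] = n[2i+1]`, `d[2i+1] = m[2i+1]` and writes the most significant lane on the left, matching the diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Model of TRN2 Vd, Vn, Vm on a 64-bit vector with `bits`-wide lanes:
 * every even/odd lane pair of the result receives the odd lane of n and
 * the odd lane of m, respectively. */
static uint64_t sketch_trn2(uint64_t n, uint64_t m, unsigned bits)
{
  uint64_t mask, d = 0;
  unsigned lane;

  assert(bits == 8 || bits == 16 || bits == 32);
  mask = (UINT64_C(1) << bits) - 1;
  for (lane = 0; lane < 64 / bits; lane += 2) {
    uint64_t odd_n = (n >> ((lane + 1) * bits)) & mask;
    uint64_t odd_m = (m >> ((lane + 1) * bits)) & mask;
    d |= odd_n << (lane * bits);
    d |= odd_m << ((lane + 1) * bits);
  }
  return d;
}

/* src[ABCDEFGH] -> tmp[ABABEFEF] -> dst[ABABABAB] */
static uint64_t sketch_broadcast_top_halfword(uint64_t src)
{
  uint64_t tmp = sketch_trn2(src, src, 16); /* duplicate odd half-words */
  return sketch_trn2(tmp, tmp, 32);         /* duplicate the top word */
}
```

With `src = 0x1122334455667788` (A = 0x11, B = 0x22, and so on), the intermediate value is `0x1122112255665566` and the result is `0x1122112211221122`.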
|
|
Fill in aarch64 opcodes for swapX instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for div255w instruction.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for mulX instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for cmpX instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for avgX instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for splitX/splatX instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for signX instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for vminX/vmaxX instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for accX instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for loadiX instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|
|
Fill in aarch64 opcodes for subX instructions.
Signed-off-by: Marek Vasut <marex@denx.de>
|