github.com/xiph/speex.git
author jm <jm@0101bb08-14d6-0310-b084-bc0e0c8e3800> 2007-10-22 19:35:03 +0400
committer jm <jm@0101bb08-14d6-0310-b084-bc0e0c8e3800> 2007-10-22 19:35:03 +0400
commit 60171e87a998541d22c6ca3d9de313ff5a5b5c3d
tree 07f8cc2ab846d32ba0ab84613764cc0a91646269 /doc/manual.lyx
parent ff8f16bb834bca2816a69468a32122b30291ac32
info on optimisations
git-svn-id: http://svn.xiph.org/trunk/speex@14035 0101bb08-14d6-0310-b084-bc0e0c8e3800
Diffstat (limited to 'doc/manual.lyx')
-rw-r--r-- doc/manual.lyx 103
1 files changed, 101 insertions, 2 deletions
diff --git a/doc/manual.lyx b/doc/manual.lyx
index e2f0322..671463d 100644
--- a/doc/manual.lyx
+++ b/doc/manual.lyx
@@ -1124,8 +1124,41 @@ CPU optimisation
\end_layout
\begin_layout Standard
-The following functions are usually the first ones you should consider optimisin
-g:
+The single factor that will affect the CPU usage of Speex the most is whether
+ it is compiled for floating-point or fixed-point.
+ If your CPU/DSP does not have a floating-point unit (FPU), then compiling
+ as fixed-point will be orders of magnitude faster.
+ If there is an FPU present, then it is important to test which version
+ is faster.
+ On the x86 architecture, floating-point is
+\series bold
+generally
+\series default
+ faster, but not always.
+ To compile Speex as fixed-point, you need to pass --fixed-point to the
+ configure script or define the FIXED_POINT macro for the compiler.
+\end_layout
+
+\begin_layout Standard
+Other important things to check on some DSP architectures are:
+\end_layout
+
+\begin_layout Itemize
+Make sure the cache is set to write-back mode
+\end_layout
+
+\begin_layout Itemize
+If the chip has SRAM instead of cache, make sure as much code and data as
+ possible are in SRAM, rather than in RAM
+\end_layout
+
+\begin_layout Standard
+If you are going to be writing assembly, then the following functions are
+
+\series bold
+usually
+\series default
+ the first ones you should consider optimising:
\end_layout
\begin_layout Itemize
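Review note on the hunk above: the fixed-point versus floating-point choice can be illustrated with a small inner-product kernel that switches arithmetic on a build macro. This is only a sketch under assumptions. The names `sample_t`, `accum_t`, `MULT` and `inner_prod_sketch` are hypothetical and are not taken from the Speex sources; only the `FIXED_POINT` macro name comes from the text. The fixed-point branch assumes Q15 samples and an arithmetic right shift for negative values, which is implementation-defined in C.

```c
#include <stddef.h>
#include <stdint.h>

/* FIXED_POINT is the macro named in the manual text; everything else
 * in this file is a hypothetical illustration, not Speex code. */
#ifdef FIXED_POINT
typedef int16_t sample_t;  /* Q15: real value = raw / 32768 */
typedef int32_t accum_t;   /* wider accumulator for the products */
/* Q15 * Q15 gives Q30; shifting right by 15 returns to Q15.
 * Assumes >> on a negative value is an arithmetic shift. */
#define MULT(a, b) (((accum_t)(a) * (accum_t)(b)) >> 15)
#else
typedef float sample_t;
typedef float accum_t;
#define MULT(a, b) ((a) * (b))
#endif

/* Sum of element-wise products of x and y over len samples. */
accum_t inner_prod_sketch(const sample_t *x, const sample_t *y, size_t len)
{
    accum_t sum = 0;
    size_t i;
    for (i = 0; i < len; i++)
        sum += MULT(x[i], y[i]);
    return sum;
}
```

Building the same file once with `-DFIXED_POINT` and once without, then timing both on the target, is one way to carry out the "test which version is faster" advice when an FPU is present.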
@@ -1173,6 +1206,21 @@ pitch_xcorr()
\end_layout
+\begin_layout Itemize
+\begin_inset listings
+inline true
+status collapsed
+
+\begin_layout Standard
+
+interp_pitch()
+\end_layout
+
+\end_inset
+
+
+\end_layout
+
\begin_layout Subsection
Memory optimisation
\end_layout
@@ -1211,6 +1259,57 @@ Static codebooks that are not needed for the bit-rates you are using (*_table.c
\end_layout
\begin_layout Standard
+Speex also has several methods for allocating temporary arrays.
+ When using a compiler that supports C99 properly (as of 2007, Microsoft
+ compilers don't, but gcc does), it is best to define VAR_ARRAYS.
+ That makes use of the variable-size array feature of C99.
+ The next best is to define USE_ALLOCA so that Speex can use alloca() to
+ allocate the temporary arrays.
+ Note that on many systems, alloca() is buggy so it may not work.
+ If neither VAR_ARRAYS nor USE_ALLOCA is defined, then Speex falls back
+ to allocating a large
+\begin_inset Quotes eld
+\end_inset
+
+scratch space
+\begin_inset Quotes erd
+\end_inset
+
+ and doing its own internal allocation.
+ The main disadvantage of this solution is that it is wasteful.
+ It needs to allocate enough stack for the worst case scenario (worst bit-rate,
+ highest complexity setting, ...) and by default, the memory isn't shared between
+ multiple encoder/decoder states.
+ Still, if the
+\begin_inset Quotes eld
+\end_inset
+
+manual
+\begin_inset Quotes erd
+\end_inset
+
+ allocation is the only option left, there are a few things that can be
+ improved.
+ By overriding the speex_alloc_scratch() call in os_support.h, it is possible
+ to always return the same memory area for all states
+\begin_inset Foot
+status collapsed
+
+\begin_layout Standard
+In this case, one must be careful with threads
+\end_layout
+
+\end_inset
+
+.
+ In addition to that, by redefining NB_ENC_STACK and NB_DEC_STACK (or
+ similar for wideband), it is possible to only allocate memory for a scenario
+ that is known in advance.
+ In this case, it is important to measure the amount of memory required
+ for the specific sampling rate, bit-rate and complexity level being used.
+\end_layout
+
+\begin_layout Standard
\newpage
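Review note on the memory hunk: the three temporary-array strategies described in the added paragraph can be sketched roughly as follows. This is an illustration under assumptions, not the Speex implementation. `VAR_ARRAYS` and `USE_ALLOCA` are the macro names from the text; `sketch_scratch`, `sketch_push` and `sketch_sum_squares` are hypothetical names invented here. The `<alloca.h>` header is non-standard and its location varies by platform.

```c
#include <stddef.h>
#if defined(USE_ALLOCA) && !defined(VAR_ARRAYS)
#include <alloca.h>  /* non-standard; some platforms declare alloca elsewhere */
#endif

/* Hypothetical caller-provided scratch area used by the fallback path. */
typedef struct {
    char  *base;  /* start of the scratch memory (suitably aligned) */
    size_t used;  /* bytes currently handed out */
    size_t size;  /* total bytes available */
} sketch_scratch;

/* Bump-allocate bytes from the scratch area; returns NULL when full. */
void *sketch_push(sketch_scratch *s, size_t bytes)
{
    size_t aligned = (s->used + 7u) & ~(size_t)7u;  /* 8-byte alignment */
    if (aligned > s->size || bytes > s->size - aligned)
        return NULL;
    s->used = aligned + bytes;
    return s->base + aligned;
}

/* Sum of squares of n samples (n > 0) using a temporary array.
 * Returns -1 if the scratch fallback runs out of space. */
long sketch_sum_squares(const short *in, size_t n, sketch_scratch *s)
{
    size_t i;
    long sum = 0;
#if defined(VAR_ARRAYS)
    long tmp[n];                          /* C99 variable-length array */
    (void)s;
#elif defined(USE_ALLOCA)
    long *tmp = alloca(n * sizeof *tmp);  /* freed when the function returns */
    (void)s;
#else
    size_t mark = s->used;                /* remember the stack position */
    long *tmp = sketch_push(s, n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
#endif
    for (i = 0; i < n; i++)
        tmp[i] = (long)in[i] * in[i];
    for (i = 0; i < n; i++)
        sum += tmp[i];
#if !defined(VAR_ARRAYS) && !defined(USE_ALLOCA)
    s->used = mark;                       /* release the temporary */
#endif
    return sum;
}
```

In the fallback path the scratch area has to be sized for the largest request it will ever see, which is the wastefulness the text mentions. Passing the same `sketch_scratch` to several states mirrors the idea of returning one shared memory area for all states, with the same caveat about threads given in the footnote.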