
gitlab.xiph.org/xiph/opus.git
author    Jan Buethe <jbuethe@amazon.de>  2023-10-07 19:52:38 +0300
committer Jan Buethe <jbuethe@amazon.de>  2023-10-07 19:52:38 +0300
commit    0563d71b255c2ef0cb65aab706ecbd44e0328c8d (patch)
tree      12b894ee47415f8c550c0f9c16920d94b946ad05
parent    8f9a7e23c8067a90013f3e56592360132af47e7b (diff)
updated osce readme
-rw-r--r-- dnn/torch/osce/README.md | 53
1 file changed, 51 insertions(+), 2 deletions(-)
diff --git a/dnn/torch/osce/README.md b/dnn/torch/osce/README.md
index b1475d91..40cf72f8 100644
--- a/dnn/torch/osce/README.md
+++ b/dnn/torch/osce/README.md
@@ -1,7 +1,6 @@
# Opus Speech Coding Enhancement
-This folder hosts models for enhancing Opus SILK. See related Opus repo https://gitlab.xiph.org/xiph/opus/-/tree/exp-neural-silk-enhancement
-for feature generation.
+This folder hosts models for enhancing Opus SILK.
## Environment setup
The code is tested with python 3.11. Conda setup is done via
@@ -12,3 +11,53 @@ The code is tested with python 3.11. Conda setup is done via
`conda activate osce`
`python -m pip install -r requirements.txt`
+
+
+## Generating training data
+The first step is to convert all training items to 16 kHz, 16-bit PCM and then concatenate them. A convenient way to do this is to create a file list and then run
+
+`python scripts/concatenator.py filelist 16000 dataset/clean.s16 --db_min -40 --db_max 0`
+
+which additionally applies random level scaling in the range given by --db_min and --db_max.
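
The file list is simply a text file with one audio path per line. A minimal sketch of producing one (the `corpus` directory and wav names below are made-up stand-ins for real training data, not part of the original recipe):

```shell
# Build a file list for concatenator.py: one path per line.
# "corpus" and the utterance names are placeholders.
mkdir -p corpus/speaker1
touch corpus/speaker1/utt1.wav corpus/speaker1/utt2.wav
find corpus -name '*.wav' | sort > filelist
cat filelist
```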
+
+The second step is to run a patched version of opus_demo in the dataset folder, which produces the coded output together with the corresponding feature files. To build the patched opus_demo binary, check out the exp-neural-silk-enhancement branch and build opus_demo the usual way. Then run
+
+`cd dataset && <path_to_patched_opus_demo>/opus_demo voip 16000 1 9000 -silk_random_switching 249 clean.s16 coded.s16`
+
+The argument to -silk_random_switching specifies the number of frames after which the encoder parameters are switched randomly.
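
Assuming 20 ms SILK frames (an assumption; the frame duration is not stated here), a switching interval of 249 frames corresponds to roughly five seconds of audio between random parameter changes:

```shell
# 249 frames * 20 ms/frame = 4980 ms between random parameter switches
# (the 20 ms frame duration is an assumption, not taken from the text)
echo "$((249 * 20)) ms"   # prints: 4980 ms
```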
+
+## Generating inference data
+Generating inference data is analogous to generating training data. Given an item 'item1.wav', run
+
+`mkdir item1.se && sox item1.wav -r 16000 -e signed-integer -b 16 item1.raw && cd item1.se && <path_to_patched_opus_demo>/opus_demo voip 16000 1 <bitrate> ../item1.raw noisy.s16`
+
+The folder item1.se then serves as input for the test_model.py script or for the --testdata argument of train_model.py and adv_train_model.py.
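
For more than a handful of items, the single-item recipe above can be wrapped in a loop. A hedged sketch, where the opus_demo path and bitrate remain placeholders (the echo lines only print the commands that would run, since neither binary is available here):

```shell
# Create one <item>.se folder per input wav; the sox/opus_demo invocations
# are echoed rather than executed because both paths are placeholders.
for f in item1.wav item2.wav; do
  name="${f%.wav}"        # strip the .wav suffix, e.g. item1.wav -> item1
  mkdir -p "$name.se"
  echo "would run: sox $f -r 16000 -e signed-integer -b 16 $name.raw"
  echo "would run: (cd $name.se && OPUS_DEMO voip 16000 1 BITRATE ../$name.raw noisy.s16)"
done
```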
+
+## Regression loss based training
+Create a default setup for LACE or NoLACE via
+
+`python make_default_setup.py model.yml --model lace/nolace --path2dataset <path2dataset>`
+
+Then run
+
+`python train_model.py model.yml <output folder> --no-redirect`
+
+to run the training script in the foreground, or
+
+`nohup python train_model.py model.yml <output folder> &`
+
+to run it in the background. In the latter case, the output is written to `<output folder>/out.txt`.
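
When training runs in the background, the log can be inspected with standard tools. A small sketch, where `output` stands in for the actual `<output folder>` and the printf merely simulates a log line:

```shell
# Simulate the log file a background run would produce, then inspect it.
# "output" and the log line are placeholders, not real training output.
mkdir -p output
printf 'epoch 1: training loss 0.42\n' >> output/out.txt
tail -n 5 output/out.txt   # use `tail -f output/out.txt` to follow a live run
```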
+
+## Adversarial training (NoLACE only)
+Create a default setup for NoLACE via
+
+`python make_default_setup.py nolace_adv.yml --model nolace --adversarial --path2dataset <path2dataset>`
+
+Then run
+
+`python adv_train_model.py nolace_adv.yml <output folder> --no-redirect`
+
+to run the training script in the foreground, or
+
+`nohup python adv_train_model.py nolace_adv.yml <output folder> &`
+
+to run it in the background. In the latter case, the output is written to `<output folder>/out.txt`.