VoIP Contest: Round 2

General info about the contest is available on @contest. See also: This page in Russian

The task in this round of the VoIP Contest is to build a system for making high-quality audio calls between two participants, using a predefined interface and the testing tools provided.

Everyone is welcome to participate, including contestants who didn’t take part in the first round of the VoIP Contest.

You Will Use

The libtgvoip library. For the purposes of the contest, please use the code at commit f775311.
A public interface for calls between two users in the header file TgVoip.h. The testing client tgvoipcall connects to the library using this interface; the same interface will also be implemented in Telegram mobile clients in the future.
A set of tools and sound samples that can be used for testing calls and evaluating their quality.
You will not need the Telegram API.

The Task

Improve the implementation of voice calls using one of the possible strategies.
Keep the public interface TgVoip.h unchanged.

Conditions

You must only use C++ (except for compilation scripts and testing), with code portability in mind. Bonus points will be awarded if your library can be built for Android.
It is acceptable to modify the data transfer protocol, as well as use an alternative protocol, or even create your own. End-to-End Encryption must be preserved (plaintext voice data must never reach any server; the encryption key must be derived from encryption_key passed to the library). Compatibility with the current implementation will bring bonus points.
It is acceptable to ignore the address and tag parameters of the Telegram Relay server, and instead use your own or a third-party Relay server (STUN, TURN, xirsys, etc). When publishing your submission, we recommend hosting your server in Amsterdam or Central Europe to minimize latency for our judges during testing.
You should keep external dependencies to a minimum.
Third-party code may be used only if it's published under GPL-compatible licenses.
The following functionality is outside of the scope of this contest: P2P (TgVoipEndpointType::Inet, TgVoipEndpointType::Lan), TCP Relay (TgVoipEndpointType::TcpRelay), SOCKS proxy (TgVoipProxy), Data saving mode (TgVoipDataSaving), Traffic stats (TgVoipTrafficStats), as well as the functions setGlobalServerConfig, onSignalBarsUpdated, setMuteMicrophone, and the flag TGVOIP_USE_CUSTOM_CRYPTO. They will not be taken into account or used during testing and may be ignored.
Your library must support TgVoipState, onStateUpdated and TgVoipAudioDataCallbacks.
You must pay close attention to the thread safety of your code, avoid deadlocks, etc.
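One common source of deadlocks is two threads acquiring the same pair of mutexes in opposite order. The following is a minimal sketch of avoiding that with std::lock from the C++ standard library. The Counter type and both functions are hypothetical illustrations and are not taken from libtgvoip or TgVoip.h.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical illustration only: not part of libtgvoip or TgVoip.h.
struct Counter {
    std::mutex mutex;
    long value = 0;
};

// Moves `amount` from one counter to the other while holding both locks.
// std::lock acquires both mutexes with a deadlock-avoidance algorithm, so
// concurrent calls with swapped arguments cannot block each other forever.
inline void transfer(Counter& from, Counter& to, long amount) {
    if (&from == &to) return; // locking the same mutex twice is undefined behaviour
    std::lock(from.mutex, to.mutex);
    std::lock_guard<std::mutex> guardFrom(from.mutex, std::adopt_lock);
    std::lock_guard<std::mutex> guardTo(to.mutex, std::adopt_lock);
    from.value -= amount;
    to.value += amount;
}

// Runs transfers in opposite directions from two threads and returns the
// sum of both counters, which must stay constant.
inline long run_opposing_transfers(int iterations) {
    assert(iterations >= 0);
    Counter a;
    Counter b;
    a.value = 1000;
    b.value = 1000;
    std::thread t1([&] { for (int i = 0; i < iterations; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < iterations; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.value + b.value;
}
```

Calling run_opposing_transfers(10000) should return 2000 and terminate. If each thread instead locked its first argument and then its second with plain lock() calls, the same test could hang.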

Notes

P2P must be disabled using the library parameters.
IPv6 connections are outside the scope of the contest; IPv6 will be unavailable on our server during testing.
We recommend using the UDP protocol.
Submissions will be tested under Debian GNU/Linux 10.1 (buster), x86-64. Kindly ensure that your library works on a clean setup before submitting.
We will only test calls between clients using code from the same submission. Compatibility with the current Telegram implementation is optional (but will positively affect the judges' opinion).
The two clients in a call may be launched on different physical servers.

Your final submission should be a ZIP file with the following structure:

submission.zip
  -> libtgvoip.so - compiled library module
  -> README - build instructions, description of what you implemented
  -> src - a directory with your source code
  -> deb-packages.txt - a text file listing dependencies as Debian package names, one per line

Possible Approaches

There are several ways of approaching the contest task. Each of them has its benefits and may get you a prize, provided a sufficient amount of quality work goes into your submission. Here's what you can do, sorted from the most cautious to the most ambitious:

Find and fix issues in the current implementation of the library. For example, eliminate potential deadlocks or identify network conditions where voice quality or latency can be improved. For each issue you've identified, provide reproduction steps and a detailed description of the solution you implemented.
Rewrite the library, maintaining compatibility with the current clients (leaving network protocols unchanged).
Use third-party libraries and protocols for your implementation, losing compatibility. Optionally, you could also use your own Relay server.

Evaluation Criteria

To win in this round you will need to:

Meet all conditions
Pay special attention to the notes
Keep your code compact and efficient (size matters!)
Minimize external dependencies

All submissions will be tested on the same input data (audio samples and network conditions). Output files will be passed to tgvoiprate. The judges will note the resulting ratings, but they will not be the main criterion in the final scoring.

Test Suite

To help you test your library during development, we provide a selection of tools and sound samples, available for download from GitHub.

These tools allow you to:

Simulate VoIP calls using a binary .so file with the library. Instead of real microphone input, sound samples with speech or silence are used for each of the call participants. Output on the recipient's side is in turn recorded to an audio file.
Programmatically choose network conditions for one of the participants: packet loss, high latency, or limited bandwidth. Programmatically modify these conditions during the call.
Receive a numerical score for the quality of the output audio file on the recipient's side, compared to the original file and the preprocessed file that was actually sent over the network.
Write input and output filenames with respective scores to a CSV file for further analysis.
Control the above using PHP scripts.
Calculate aggregated scores grouped by library version.

The instructions below are relevant for Debian GNU/Linux 10.1 (buster), x86-64.

Preparing your system

1. Install deb-packages:

$ sudo apt-get install php

2. Netem is used to emulate different network conditions, and ip-netns is used to apply these conditions to just one of the two processes (the caller's side). The following commands must be run once before the first launch:

$ sudo bash ./tests/setup-netns.sh
$ sudo tc qdisc list

This will set up the network namespace client1 with the virtual network interface v-peer1, which is needed for the calling software to work. Under normal conditions, you won't need to run the setup script again. If any of the commands results in an error, you may need to install or enable netem or ip-netns on your system.

3. The test scripts often modify network conditions via netem, which requires root access (sudo). It is therefore necessary to disable password prompts for sudo when launching those commands. To do this:

$ whoami
  user
$ sudo visudo

At the end of the file, add this line:

user ALL=(ALL) NOPASSWD: /usr/bin/ip netns exec client1 *

replacing user with the output of the whoami command. Save the file (if you're using the Vim editor, press Esc, then type :wq and Enter).

4. Lastly, you need to set your token to work with the VoIP Contest API. To do this, replace the default value 111222333444:AAABBBCCCDDD in the tests/token.php file with the token value you received from @jobs_bot when joining the second round of the contest.

If you participated in the first round of this contest, you will still need to get a new token from the bot. The one from the previous round will not work.

Call + Rate

The main file that contains the testing script is tests/call.php. It manages the list of library .so files to be tested, the number of call iterations, and the sets of network conditions. The file is pre-filled with a sample scenario; we recommend modifying it according to your specific testing needs.

To launch:

$ php tests/call.php

This will run the complete test scenario. It might take a long time, since each iteration can take about 10-20 seconds depending on the chosen audio sample duration. The folders preprocessed and out will be filled with files containing audio data sent over the network and received by the recipient respectively. Each call will be immediately rated with the results written to a .csv file. If you're accessing the server over SSH, we recommend using nohup/screen to ensure that long testing sessions don't get interrupted by connection losses.

At the end, the script will also output average call ratings for each of the library versions tested, e.g.:

Version stable (3234 ratings)
=============================================
ScoreCombined:      mean 3.063, stddev: 0.924
ScorePreprocess:    mean 4.257, stddev: 0.408
ScoreOutput:        mean 3.39, stddev: 0.942

....

Here, ScorePreprocess shows the degradation of sound quality in the preprocessed file compared to the input sample, on a scale of 1.0-5.0, where 1.0 means complete degradation and 5.0 means unchanged.
ScoreOutput shows the degradation of sound quality on the recipient's side compared to the preprocessed file on the sender's side, on the same 1.0-5.0 scale.
ScoreCombined shows the degradation of sound quality on the recipient's side compared to the original input file. It acts as an aggregated rating for ScorePreprocess and ScoreOutput and uses the same 1.0-5.0 scale.

You can always get these stats from the .csv file later:

$ php tests/mean.php
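If you want to post-process ratings in your own tooling, per-version aggregation can be sketched as follows. This is only an illustration: the Rating and Stats types are hypothetical, the layout of the .csv file is not assumed, and the sketch uses the population standard deviation, which may differ from what tests/mean.php computes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record: one rating for one library version.
struct Rating {
    std::string version;
    double score; // expected range 1.0-5.0
};

struct Stats {
    std::size_t count = 0;
    double mean = 0.0;
    double stddev = 0.0; // population standard deviation (an assumption)
};

// Groups ratings by version and computes count, mean and stddev per group
// using Welford's online algorithm (one pass over the data).
inline std::map<std::string, Stats> aggregate(const std::vector<Rating>& ratings) {
    std::map<std::string, Stats> result;
    std::map<std::string, double> m2; // running sum of squared deviations
    for (const Rating& r : ratings) {
        Stats& s = result[r.version];
        ++s.count;
        const double delta = r.score - s.mean;
        s.mean += delta / static_cast<double>(s.count);
        m2[r.version] += delta * (r.score - s.mean);
    }
    for (auto& entry : result) {
        Stats& s = entry.second;
        assert(s.count > 0); // every group was created by at least one rating
        s.stddev = std::sqrt(m2[entry.first] / static_cast<double>(s.count));
    }
    return result;
}
```

For example, two ratings of 2.0 and 4.0 for the same version give a mean of 3.0 and a population standard deviation of 1.0.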

To clear all accumulated preprocessed and output files as well as the .csv file you can run:

$ sh tests/clean.sh

Suggested usage

One of the possible ways of using the test suite while developing your library could be as follows.

1. Develop your library in a separate directory. Write verbose logs to stdout/stderr during development.
2. Build the library after each significant change, run git commit, and copy the output library .so file into the Test Suite's lib subdirectory with a unique name, e.g. libtgvoip.COMMIT.so. Then add it to the list of libraries to be tested in tests/call.php.
3. Launch a testing session for the changes in the background, while continuing development (since the testing process might take a while).
4. Inspect library output logs (they are written to the corresponding .log files in the out subfolder). If necessary, listen to audio files from the calls with the lowest ratings.
5. Find and fix bugs. Go back to step 2.

Once call quality reaches a satisfactory level, focus on finding new network conditions where quality degrades significantly, e.g. highly variable bandwidth or packet loss (use the ->after() method), and fix the issues where possible.
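When looking for conditions where quality degrades, it can help to log the packet loss actually observed by the receiver alongside the ratings. Below is a minimal sketch of such a counter. It assumes that every packet carries a 32-bit sequence number that increases by one per sent packet and does not wrap around. That is an assumption made for illustration, not a description of the libtgvoip wire format, and LossTracker is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Hypothetical helper, not part of libtgvoip: estimates packet loss from
// the sequence numbers seen by the receiver. Reordered packets are handled;
// duplicates are ignored. Sequence number wrap-around is not handled.
class LossTracker {
public:
    // Records one received packet.
    void onPacket(std::uint32_t seq) {
        if (!seen_.insert(seq).second) return; // duplicate, already counted
        if (seen_.size() == 1) {
            first_ = seq;
            highest_ = seq;
            return;
        }
        if (seq < first_) first_ = seq;
        if (seq > highest_) highest_ = seq;
    }

    // Number of packets expected between the lowest and highest sequence seen.
    std::size_t expected() const {
        return seen_.empty() ? 0 : static_cast<std::size_t>(highest_ - first_) + 1;
    }

    // Number of distinct packets received so far.
    std::size_t received() const { return seen_.size(); }

    // Fraction of expected packets that have not arrived, in [0, 1].
    double lossRate() const {
        const std::size_t total = expected();
        assert(received() <= total);
        if (total == 0) return 0.0;
        return 1.0 - static_cast<double>(received()) / static_cast<double>(total);
    }

private:
    std::unordered_set<std::uint32_t> seen_;
    std::uint32_t first_ = 0;
    std::uint32_t highest_ = 0;
};
```

Feeding the sequence 0, 1, 2, 4, 5, 7 yields 8 expected packets, 6 received, and a loss rate of 0.25. The set of seen numbers grows without bound, so a longer-running version would need a sliding window instead.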