Tutorial: Getting Started

Table of Contents

  1. Setup
  2. File Structure
  3. Licensing
  4. Configuration
  5. API Usage
    1. Core: Chat
    2. Core: Playback
    3. Settings
    4. Logging
    5. Callbacks
    6. Configuration Files
    7. Push-to-talk
    8. Processing

Setup

MDPS core is contained in the mdpslib.js file:

<script type="module" src="mdps/mdpslib.js"></script>

MDPS uses WebRTC for chat functionality, so if you are implementing a chat solution, the WebRTC adapter is also required:

<script src="https://webrtc.github.io/adapter/adapter-latest.js"></script>
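
For example, a page that implements chat would typically include both scripts (paths and URLs as shown above):

<script src="https://webrtc.github.io/adapter/adapter-latest.js"></script>
<script type="module" src="mdps/mdpslib.js"></script>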

File Structure

The following describes the directory structure that resides on the server and provides MDPS support.

mdps
│   mdpslib.js       // the mdps API library
│   mdpsasm.js       // the mdps asm.js
│   license.key      // the license file
│
└───worklet          // the audio worklet js
│       *.js         // worklet process js files
│
└───config           // the configuration directory
│       configfiles  // list of all config files
│       presetfiles  // list of all preset files
│       abc.cfg      // config files
│       def.cfg
│       123.bgvx     // preset files
│       456.bgvx
│
└───docs             // the documentation directory

Licensing

A license file is required to use the MDPS API and is tied to the domain(s) of the server(s). This file resides in the mdps sub-folder on the server. Contact Bongiovi Medical for more details.

Configuration

Library

The library can be configured via an optional configuration file that contains an mdpsConfig JavaScript dictionary:

mdpsConfig = {
  auxDir: '/mdps'
}

This file should be loaded as JavaScript prior to loading mdpslib.js via the script tag:

<script src="mdps.cfg"></script>
<script type="module" src="mdps/mdpslib.js"></script>

Currently, mdpsConfig defines a single setting, auxDir, which is the auxiliary directory location where the config directory and the license.key file reside.

If mdpsConfig is not utilized, default values will be used. The default auxDir location is /mdps.
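
For example, if the config directory and license.key file reside somewhere other than /mdps (the path below is purely illustrative), mdps.cfg might contain:

mdpsConfig = {
  auxDir: '/static/mdps'   // hypothetical non-default location
}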

NOTE that the core MDPS JavaScript files must be located in /mdps off of the web root.

MDPS Settings

Configuration of the MDPS processing settings is specified via a .cfg file; the API provides functions for loading these .cfg files. The configuration is specified as a JavaScript dictionary, for example:

config = {
  presetFile: 'desktop_speakers.bgvx',
  userGain: 0,                  // -10.0 to 4.0 dB
  noiseGateEnabled: 1,          // 0 or 1
  noiseGateThreshold: 0.5,      // 0.0 to 0.9
  noiseGateRelease: 0.999,      // 0.99097 to 0.99981
  noiseGateHold: 80,            // 0 to 99
  noiseGateFloor: 0.1,          // 0.09999 to 0.89125
  noiseGateHyst: 1,             // 1.0 to 2.0
  commFrequency: 0.08,          // 0.020 to 0.2 sec
  unmuteDelay: 0.5,             // 0.0 to 3.0 sec
  micAtten: 0.0,                // 0.0 to 1.0
  muteRampDuration: 0.2         // 0.003 to 0.3 sec (3 to 300 ms)
}

Notice that it also specifies a bgvx (preset) file. By default, both the config and bgvx files reside on the server in the mdps/config sub-folder.

The API also provides a way to get a list of available config and preset files. This list is maintained in two files, named configfiles and presetfiles, within that same folder. The default config list file, for example, looks something like the following:

music_broadcast_mastering.cfg, 0
music_mastering_laptop.cfg, 0
music_virtual_subwoofer.cfg, 0
music_web_mastering.cfg, 0
default.cfg, 1
desktop_speaker_endpoint.cfg, 1
emergency.cfg, 1
full_duplex.cfg, 1
laptop_endpoint.cfg, 1
security_monitor.cfg, 1
speakerphone_endpoint.cfg, 1

Each line specifies a .cfg filename and a type identifier. The type indicates the kind of application the configuration is typically used for (e.g. music, voice). The same type identifiers are also associated with the preset (bgvx) files.
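
The presetfiles list follows the same format. A hypothetical example (aside from desktop_speakers.bgvx, which appears in the configuration above, the file names below are placeholders):

desktop_speakers.bgvx, 0
virtual_subwoofer.bgvx, 0
speakerphone.bgvx, 1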

API Usage

Our APIs make use of the Web Audio APIs, along with the WebRTC APIs for chat (data channel) support. A core API is used to initialize and set up processing. Beyond the core, there are APIs to modify the MDPS settings for the noise gate (microphone signal) and general audio processing (speaker signal), a logging facility, a callback API for receiving relevant notifications, an API for managing config and settings files, an API for managing the push-to-talk feature, and a low-level processing API.

To use the API, simply create a new MDPS object:

import * as mdpsLib from '/mdps/mdpslib.js';
var mdps = new mdpsLib.MDPS();

All API methods can be accessed from this object. Note that some of the API methods are asynchronous and need to be handled accordingly (e.g. wrapped in an anonymous async function).
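
For example, a minimal sketch that wraps the asynchronous initialize() call (documented below) in an anonymous async function:

import * as mdpsLib from '/mdps/mdpslib.js';
var mdps = new mdpsLib.MDPS();

(async () => {
  await mdps.initialize();  // wait for MDPS to finish initializing
  // ... remaining MDPS setup goes here
})();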

Core: Chat

At the most basic level, you can make full use of MDPS for chat functionality through four API calls:

await mdps.initialize();
mdps.setMicStream(stream);
mdps.setSpeakerStream(stream);
mdps.setupDataChannel(peerConnection, isCaller);

setMicStream should be called with the stream returned from a MediaDevices.getUserMedia() call. setSpeakerStream is called with the stream from the RTCPeerConnection's onaddstream callback. setupDataChannel sets up the WebRTC data channels needed to communicate state information between peers, for features like smart microphone attenuation.
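
Putting these together, a minimal sketch might look like the following (the peerConnection, its signaling, and the isCaller flag are assumed to be set up by your own application code):

await mdps.initialize();

// microphone input
var micStream = await navigator.mediaDevices.getUserMedia({ audio: true });
mdps.setMicStream(micStream);

// remote (speaker) stream from the peer connection
peerConnection.onaddstream = function(event) {
  mdps.setSpeakerStream(event.stream);
};

// data channel used to exchange state (e.g. for smart microphone attenuation)
mdps.setupDataChannel(peerConnection, isCaller);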

This set of calls, along with proper configuration of the RTCPeerConnection, will provide basic peer-to-peer, WebRTC-based audio communications with MDPS processing and smart microphone attenuation. See the MDPS Chat Demo sample code for a full implementation.

Core: Playback

Playback involves processing the local speaker or output stream. Playback processing can therefore be attained via two core API calls:

await mdps.initialize();
mdps.setAudioStream(stream);

This assumes the use of the Web Audio APIs. If you are using the HTML5 audio tag, the stream can be set up via the onloadedmetadata event on the audio element:

    audio.onloadedmetadata = function(e) {
        var audioCtx = new (window.AudioContext || window.webkitAudioContext)();
        // get the source node from the audio element
        var sourceNode = audioCtx.createMediaElementSource(this);
        // create a stream from our AudioContext
        var dest = audioCtx.createMediaStreamDestination();
        // connect our audio element's output to the stream
        sourceNode.connect(dest);
        // connect our output stream to MDPS processing
        mdps.setAudioStream(dest.stream);
        this.onloadedmetadata = null; // call only once
    };

Note that this should only be done once - after the audio element is wired in with MDPS processing, any audio source played via that audio element will have its audio processed by MDPS. See the MDPS Playback Demo sample code for a full implementation.

NOTE: As of 8/2018, the method described above will only work in Chrome and Firefox. There are known issues with Safari's Web Audio support that prevent it from properly handling this type of audio processing. If you have a non-Web Audio-based solution and have access to the raw audio data, you can consider using our low-level processing routines.

Settings

The settings API is broken up into 3 sections:

  • State - API for changing the current state or performance settings of the system.
  • Processing - API for changing the processing parameters.
  • NoiseGate - API for changing the noise gate parameters.
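
As a purely illustrative sketch, adjusting a noise gate parameter might look like the following. The setter name here is a hypothetical placeholder (consult the API reference for the actual method names); the parameter and its range come from the configuration dictionary shown earlier:

// hypothetical setter name - see the API reference for the actual call
mdps.setNoiseGateThreshold(0.5);   // 0.0 to 0.9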

Logging

The logging API allows for the specification of the logging level and a log function. Several logging levels are defined; a message is only logged if it is at or above the current log level.

Callbacks

The system allows callback functions to be specified in order to receive notification of relevant events and information, such as mic attenuation/mute states, push-to-talk state changes, diagnostic information, and parameter changes.

Configuration Files

As noted above, configuration files are maintained on the server. There are several API functions for managing and loading these files.

Push-to-talk

Push-to-talk is a feature that enables a half-duplex communication mode, in which one side has exclusive, one-way communication rights. API functions are provided to explicitly enter and exit this mode.

Processing

The core API described earlier handles mic and speaker audio processing at a high level. There are also a number of routines for lower-level processing, which allow audio to be processed at the sample level, independent of WebRTC and Web Audio support.