Audio Glossary

Welcome to the Audio Glossary. This guide explains terms from the world of audio and digital signal processing (DSP). Please note that this glossary is still a work in progress; more terms will be added over time.

AAX (Avid Audio eXtension)

Avid Audio eXtension (AAX) is a proprietary plugin format developed by Avid for use with their Pro Tools Digital Audio Workstation. It succeeded the previous RTAS and TDM formats and comes in two variants: AAX Native (runs on the host CPU) and AAX DSP (runs on dedicated Avid hardware).

ADC (Analog-to-Digital Converter)

An Analog-to-Digital Converter (ADC) is a hardware component that converts continuous analog audio signals (like those from a microphone or instrument) into discrete digital data that a computer can process. The quality of the ADC impacts the fidelity of the digital recording.

AU (Audio Unit)

Audio Unit (AU) is a plugin format developed by Apple for use on macOS and iOS. It allows for real-time audio processing and is used in a variety of audio software, including Logic Pro and GarageBand.

AUv3 (Audio Unit Version 3)

Audio Unit Version 3 (AUv3) is an updated version of the AU plugin format. It provides enhanced features and capabilities, such as increased stability, better performance, out-of-process hosting for plugins, and support for more complex audio processing tasks across macOS and iOS.

Audio File Formats

These define how digital audio data is stored. Common formats include uncompressed types like WAV and AIFF (preserving full quality), lossless compressed types like FLAC (reducing file size without quality loss), and lossy compressed types like MP3 and AAC (reducing file size significantly with some quality loss).

Bit Depth

Bit depth refers to the number of bits used to represent the amplitude of each individual audio sample. Higher bit depths (e.g., 24-bit, 32-bit float) provide greater dynamic range and lower noise floor compared to lower bit depths (e.g., 16-bit).
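
As a rule of thumb, each additional bit contributes roughly 6.02 dB of theoretical dynamic range. A minimal sketch of the arithmetic (function name is illustrative):

```cpp
#include <cstdio>

// Theoretical dynamic range of a fixed-point format: ~6.02 dB per bit.
double dynamicRangeDb(int bits) { return 6.0206 * bits; }

int main() {
    std::printf("16-bit: ~%.0f dB\n", dynamicRangeDb(16)); // ~96 dB
    std::printf("24-bit: ~%.0f dB\n", dynamicRangeDb(24)); // ~144 dB
}
```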

Buffer Size

In digital audio processing, the buffer size determines how many samples are processed in a single block or chunk. Smaller buffer sizes result in lower latency (delay) but require more CPU power, while larger buffers reduce CPU load but increase latency. It’s a critical setting for balancing performance and responsiveness.
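
The latency one buffer adds is simply its length divided by the sample rate. A quick sketch (names are illustrative):

```cpp
#include <cstdio>

// Latency (in milliseconds) contributed by one processing buffer.
double bufferLatencyMs(int bufferSamples, double sampleRate) {
    return 1000.0 * bufferSamples / sampleRate;
}

int main() {
    std::printf("64 samples @ 48 kHz:   %.2f ms\n", bufferLatencyMs(64, 48000.0));   // ~1.33 ms
    std::printf("1024 samples @ 48 kHz: %.2f ms\n", bufferLatencyMs(1024, 48000.0)); // ~21.33 ms
}
```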

Bussing / Routing

Bussing (or Routing) is the process of sending audio signals from one or more tracks or sources to a designated channel, known as a bus. This allows for group processing (applying effects to multiple tracks simultaneously), creating submixes, or managing complex signal flows within a DAW or audio system.

C++

C++ is a high-performance, general-purpose programming language widely used in audio development for creating plugins, DAWs, and standalone audio applications. Frameworks like JUCE are built upon C++, leveraging its speed and control over system resources necessary for real-time audio processing.

Chorus

Chorus is an audio effect that thickens a sound by mixing the original signal with one or more slightly delayed and pitch-modulated copies. This simulates the effect of multiple voices or instruments playing the same part simultaneously, adding richness and width.

CLAP (CLever Audio Plugin)

CLever Audio Plugin (CLAP) is an open standard for audio plugins, developed by u-he and Bitwig, designed to be efficient and flexible. It provides a modern alternative to existing plugin formats, with improved performance, per-voice modulation capabilities, and an extensible design.

Compression (Dynamic Range Compression)

Compression is an audio process that reduces the dynamic range of a signal – the difference between the loudest and quietest parts. It attenuates levels above a certain threshold according to a specified ratio, often used to control peaks, increase perceived loudness, or shape the sonic envelope. Key parameters include Threshold, Ratio, Attack, and Release.
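
To illustrate how threshold and ratio interact, here is a minimal sketch of a compressor's static gain curve (attack and release smoothing deliberately omitted; names are illustrative):

```cpp
#include <cstdio>

// Static gain curve of a downward compressor (attack/release omitted):
// levels above the threshold are reduced according to the ratio.
double compressedLevelDb(double inputDb, double thresholdDb, double ratio) {
    if (inputDb <= thresholdDb) return inputDb;            // below threshold: unchanged
    return thresholdDb + (inputDb - thresholdDb) / ratio;  // above: attenuated
}

int main() {
    // 4:1 ratio, -20 dB threshold: a -8 dB input comes out at -17 dB.
    std::printf("%.1f dB\n", compressedLevelDb(-8.0, -20.0, 4.0));
}
```

A limiter (see Limiting below) is essentially this same curve with a very high ratio, so levels effectively never exceed the threshold.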

Convolution

Convolution is a mathematical operation used in DSP to apply the characteristics of one signal (like an Impulse Response) to another. Its primary use in audio is for Convolution Reverb, where the Impulse Response of a real space or hardware unit is used to realistically simulate that acoustic environment. It’s also fundamental to FIR filter design.
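
A minimal time-domain sketch of the operation (practical convolution reverbs use FFT-based fast convolution, which computes the same result far more efficiently):

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Direct (time-domain) convolution: output length = N + M - 1.
std::vector<double> convolve(const std::vector<double>& x,
                             const std::vector<double>& h) {
    std::vector<double> y(x.size() + h.size() - 1, 0.0);
    for (size_t n = 0; n < x.size(); ++n)
        for (size_t m = 0; m < h.size(); ++m)
            y[n + m] += x[n] * h[m];
    return y;
}

int main() {
    // Convolving with a unit impulse returns the signal unchanged.
    auto y = convolve({1.0, 0.5, 0.25}, {1.0});
    for (double v : y) std::printf("%g ", v);
    std::printf("\n");
}
```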

DAC (Digital-to-Analog Converter)

A Digital-to-Analog Converter (DAC) is a hardware component that converts digital audio data back into an analog electrical signal that can be sent to amplifiers and speakers for listening. The quality of the DAC affects the fidelity of the audio playback.

DAW (Digital Audio Workstation)

A Digital Audio Workstation (DAW) is software used for recording, editing, mixing, and producing audio files. DAWs provide a range of tools for manipulating audio tracks, adding effects, and arranging compositions. Examples include Pro Tools, Ableton Live, and Logic Pro. DAWs can have different workflows, such as linear timeline-based or clip/scene-based paradigms.

Delay

Delay is an audio effect that records an input signal and plays it back after a specified period, creating echoes or repetitions. Key parameters typically include delay time (how long before playback), feedback (how much of the output is fed back into the input for repeating echoes), and mix/wet/dry level (balance between original and delayed signal).
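
A minimal sketch of a feedback delay line built on a circular buffer (class and parameter names are illustrative):

```cpp
#include <cstddef>
#include <vector>

// A minimal feedback delay line.
class Delay {
public:
    Delay(size_t delaySamples, float feedback, float mix)
        : buffer(delaySamples, 0.0f), feedback(feedback), mix(mix) {}

    float process(float input) {
        float delayed = buffer[pos];                  // read the echo
        buffer[pos] = input + delayed * feedback;     // write input + feedback
        pos = (pos + 1) % buffer.size();              // advance circular buffer
        return input * (1.0f - mix) + delayed * mix;  // dry/wet blend
    }

private:
    std::vector<float> buffer;
    size_t pos = 0;
    float feedback, mix;
};
```

Each call processes one sample; with feedback below 1.0 the repeating echoes decay naturally.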

Dithering

Dithering is the process of adding a small amount of low-level random noise to an audio signal before reducing its bit depth (e.g., from 24-bit to 16-bit for CD mastering). This decorrelates the inevitable quantization errors from the signal, so they are heard as a constant, benign noise floor rather than as harmonic distortion.
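
A minimal sketch of TPDF (triangular probability density function) dithering, a common choice, applied while reducing a float sample to 16-bit:

```cpp
#include <cmath>
#include <cstdint>
#include <random>

// Quantize a [-1, 1] float sample to 16-bit, adding TPDF dither
// (the sum of two uniform random values) scaled to about +/-1 LSB.
int16_t ditherTo16Bit(float sample, std::mt19937& rng) {
    std::uniform_real_distribution<float> uni(-0.5f, 0.5f);
    float dither = uni(rng) + uni(rng);              // triangular PDF
    float scaled = sample * 32767.0f + dither;
    float clamped = std::fmax(-32768.0f, std::fmin(32767.0f, scaled));
    return static_cast<int16_t>(std::lrintf(clamped));
}
```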

DSP (Digital Signal Processing)

Digital Signal Processing (DSP) refers to the manipulation of audio signals using digital technology. DSP involves the use of algorithms and mathematical functions to alter audio data, allowing for enhancements like noise reduction, equalization, and effects processing.

Equalization (EQ)

Equalization (EQ) is the process of adjusting the balance between frequency components within an audio signal. It involves boosting or cutting specific frequency ranges (bands) to alter the timbre, correct tonal imbalances, or remove unwanted frequencies. Common EQ types include parametric, graphic, shelving, and bell filters.
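
As an illustration, here is a peaking (bell) band as a biquad, using the coefficient formulas from R. Bristow-Johnson's widely used Audio EQ Cookbook (a sketch; names are illustrative):

```cpp
#include <cmath>

// Peaking ("bell") EQ biquad: boosts or cuts gainDb around centerHz.
class PeakingEq {
public:
    PeakingEq(double centerHz, double gainDb, double q, double sampleRate) {
        double A     = std::pow(10.0, gainDb / 40.0);
        double w0    = 2.0 * 3.141592653589793 * centerHz / sampleRate;
        double alpha = std::sin(w0) / (2.0 * q);
        double a0    = 1.0 + alpha / A;
        b0 = (1.0 + alpha * A) / a0;
        b1 = -2.0 * std::cos(w0) / a0;
        b2 = (1.0 - alpha * A) / a0;
        a1 = b1;                        // same -2*cos(w0)/a0 term
        a2 = (1.0 - alpha / A) / a0;
    }
    float process(float x) {            // direct form I
        double y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
        x2 = x1; x1 = x; y2 = y1; y1 = y;
        return (float)y;
    }
private:
    double b0, b1, b2, a1, a2;
    double x1 = 0, x2 = 0, y1 = 0, y2 = 0;
};
```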

FFT (Fast Fourier Transform)

The Fast Fourier Transform (FFT) is an algorithm that efficiently computes the Discrete Fourier Transform (DFT) of a sequence, or its inverse. FFT is widely used in audio processing to convert a signal from the time domain to the frequency domain, enabling tasks like spectrum analysis, filter design, and pitch shifting.
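
For reference, the DFT that an FFT computes can be written directly as an O(N²) double loop; FFT algorithms (e.g., Cooley–Tukey) produce the identical result in O(N log N):

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(N^2) DFT for reference; an FFT computes the same spectrum
// far faster by recursively splitting the sum into even/odd halves.
std::vector<std::complex<double>> dft(const std::vector<std::complex<double>>& x) {
    const size_t N = x.size();
    const double twoPi = 6.283185307179586;
    std::vector<std::complex<double>> X(N);
    for (size_t k = 0; k < N; ++k)
        for (size_t n = 0; n < N; ++n)
            X[k] += x[n] * std::polar(1.0, -twoPi * k * n / N);
    return X;
}
```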

FIR Filter (Finite Impulse Response Filter)

A Finite Impulse Response (FIR) filter is a type of digital filter whose response to an impulse (a single, brief input sample) eventually settles to zero. FIR filters are known for their stability and ability to have linear phase response (preserving timing relationships between frequencies), but often require more computational power than IIR filters for similar frequency responses.
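
A minimal sketch of an FIR filter: the coefficient array is the filter's impulse response, and each output sample is a weighted sum of recent inputs (names are illustrative):

```cpp
#include <cstddef>
#include <vector>

// FIR filter: once the input stops, the output decays to exactly zero
// after coeffs.size() samples (hence "finite").
class FirFilter {
public:
    explicit FirFilter(std::vector<float> coefficients)
        : coeffs(std::move(coefficients)), history(coeffs.size(), 0.0f) {}

    float process(float input) {
        history[pos] = input;
        float out = 0.0f;
        size_t idx = pos;
        for (float c : coeffs) {
            out += c * history[idx];                          // weighted sum
            idx = (idx == 0) ? history.size() - 1 : idx - 1;  // walk backwards
        }
        pos = (pos + 1) % history.size();
        return out;
    }

private:
    std::vector<float> coeffs, history;
    size_t pos = 0;
};
```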

Filtering

Filtering is the process of removing or enhancing certain frequencies within an audio signal. Filters can be used to reduce noise, isolate specific frequencies, or shape the sound in various ways. Common types of filters include low-pass, high-pass, band-pass, and notch filters. See also IIR Filter and FIR Filter.

Flanger

A Flanger is a modulation effect created by mixing an audio signal with a slightly delayed copy, where the delay time is continuously modulated by a Low-Frequency Oscillator (LFO). This produces characteristic sweeping comb-filtering effects, often described as a “jet plane” sound.
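
A bare-bones sketch (names and default values are illustrative; a real flanger would interpolate fractional delay positions rather than truncating):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Flanger: a short delay (a few ms) swept by an LFO, mixed with the dry
// signal so the moving comb-filter notches create the "jet plane" sweep.
class Flanger {
public:
    Flanger(double sampleRate, double lfoHz = 0.25, double depthMs = 3.0)
        : sampleRate(sampleRate), lfoHz(lfoHz),
          depthSamples(depthMs * 0.001 * sampleRate),
          buffer(size_t(sampleRate * 0.05), 0.0f) {}   // 50 ms is plenty

    float process(float input) {
        const double twoPi = 6.283185307179586;
        buffer[writePos] = input;
        // LFO sweeps the delay between 1 sample and depthSamples.
        double lfo = 0.5 * (1.0 + std::sin(twoPi * lfoHz * n++ / sampleRate));
        size_t delay = 1 + size_t(lfo * depthSamples);
        size_t readPos = (writePos + buffer.size() - delay) % buffer.size();
        float out = 0.5f * (input + buffer[readPos]);  // equal dry/wet mix
        writePos = (writePos + 1) % buffer.size();
        return out;
    }

private:
    double sampleRate, lfoHz, depthSamples;
    std::vector<float> buffer;
    size_t writePos = 0;
    long long n = 0;
};
```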

FM Synthesis (Frequency Modulation Synthesis)

Frequency Modulation (FM) Synthesis is a technique for generating complex sounds where the frequency of one oscillator (the “carrier”) is modulated by the output of another oscillator (the “modulator”). Varying the modulation index and frequencies can create a wide range of timbres, from bell-like tones to complex, inharmonic sounds.
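
A two-operator sketch (strictly speaking this modulates phase, which is how classic "FM" synths like the DX7 actually work; names are illustrative):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Two-operator FM: the modulator wobbles the carrier's phase.
// A higher modulation index produces brighter, more complex spectra.
std::vector<float> renderFm(double carrierHz, double modHz, double index,
                            double sampleRate, size_t numSamples) {
    const double twoPi = 6.283185307179586;
    std::vector<float> out(numSamples);
    for (size_t n = 0; n < numSamples; ++n) {
        double t = n / sampleRate;
        double mod = std::sin(twoPi * modHz * t);                      // modulator
        out[n] = (float)std::sin(twoPi * carrierHz * t + index * mod); // carrier
    }
    return out;
}
```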

Frequency Scaling

Frequency scaling is the process of multiplying the frequency components of an audio signal by a fixed factor. This transformation alters the pitch of the sound, making it higher or lower while preserving the relative harmonic structure (unlike frequency shifting, which adds a fixed offset to every component and makes harmonic sounds inharmonic). Often used interchangeably with basic pitch shifting.

Gain Staging

Gain staging is the practice of managing the level (gain) of an audio signal at each stage of the processing chain (e.g., preamp, EQ, compressor, DAW channel fader). Proper gain staging ensures an optimal signal level, maximizing the signal-to-noise ratio and preventing unwanted clipping or distortion at any point.

Headroom

Headroom is the difference, measured in decibels (dB), between the nominal operating level (average peak level) of an audio signal and the maximum level the system can handle without clipping (distortion). Maintaining sufficient headroom is crucial during recording and mixing to accommodate unexpected peaks.

IIR Filter (Infinite Impulse Response Filter)

An Infinite Impulse Response (IIR) filter is a type of digital filter whose response to an impulse can theoretically continue indefinitely due to internal feedback loops. IIR filters are computationally efficient and can model analog filter designs closely, but they can potentially be unstable and often exhibit non-linear phase response.
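
A one-pole low-pass filter is the simplest example; the single feedback coefficient is what makes the impulse response (theoretically) infinite:

```cpp
#include <cmath>

// One-pole IIR low-pass: an impulse decays exponentially through the
// feedback term but never reaches exactly zero ("infinite" response).
class OnePoleLowpass {
public:
    OnePoleLowpass(double cutoffHz, double sampleRate) {
        // Standard one-pole coefficient; cutoff must be below Nyquist.
        a = std::exp(-2.0 * 3.141592653589793 * cutoffHz / sampleRate);
    }
    float process(float input) {
        state = (1.0 - a) * input + a * state;  // feedback on previous output
        return (float)state;
    }
private:
    double a = 0.0, state = 0.0;
};
```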

Impulse Response (IR)

An Impulse Response (IR) is a recording of how a system (like an acoustic space, amplifier, or effect unit) responds to a brief, sharp input signal (an impulse). IRs capture the linear characteristics (frequency and phase response) of the system and are primarily used in convolution reverb and emulation plugins to recreate those characteristics digitally.

JUCE Audio Framework

The JUCE Audio Framework is a widely used C++ library for developing cross-platform audio applications. JUCE provides a comprehensive set of tools for building audio plugins (VST, AU, AAX, CLAP), standalone applications, and mobile apps. It offers features like real-time audio processing, GUI components, MIDI support, and much more.

Latency

Latency in digital audio refers to the time delay introduced as a signal passes through a system. It’s caused by various factors including ADC/DAC conversion times, buffer sizes used for processing, and the processing time of plugins or software. Low latency is critical for real-time applications like live monitoring and performance.

Limiting

Limiting is an extreme form of dynamic range compression with a very high ratio (often ∞:1 or close to it) and typically a fast attack time. Its primary purpose is to prevent an audio signal’s peak level from exceeding a specified ceiling (the threshold), thus avoiding digital clipping while maximizing perceived loudness.

LV2 (LADSPA Version 2)

LV2 is an open standard for audio plugins and matching hosts, primarily popular on Linux systems but designed to be cross-platform. It is extensible and offers features beyond the older LADSPA standard, such as support for MIDI, custom UIs, and property manipulation.

Max/MSP

Max/MSP is a visual programming language for music and multimedia. It allows users to create interactive audio and visual applications by connecting objects that perform specific functions. Max handles the visual programming and control data, while MSP (Max Signal Processing) is the component dedicated to real-time audio processing.

MIDI (Musical Instrument Digital Interface)

Musical Instrument Digital Interface (MIDI) is a technical standard that allows electronic musical instruments, computers, and other devices to communicate and synchronize with each other. MIDI carries event messages such as note-on and note-off, velocity, pitch bend, timing clocks, and control signals (like MIDI CC). It does not transmit audio itself.

MIDI CC (MIDI Continuous Controller)

MIDI Continuous Controller (MIDI CC) messages are a type of MIDI data used to control various parameters in a MIDI-enabled device or software in real-time. Each CC message has a unique controller number (0-127) that corresponds to a specific parameter, such as volume (CC#7), pan (CC#10), modulation wheel (CC#1), or expression (CC#11).
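
The wire format is compact: a status byte (0xB0 plus the channel number) followed by the controller number and value. A minimal parsing sketch (names are illustrative):

```cpp
#include <cstdint>
#include <cstdio>

// A Control Change message is three bytes:
//   status 0xB0 | channel, controller number (0-127), value (0-127).
struct ControlChange { uint8_t channel, controller, value; };

bool parseControlChange(const uint8_t bytes[3], ControlChange& out) {
    if ((bytes[0] & 0xF0) != 0xB0) return false;  // not a CC message
    out = { uint8_t(bytes[0] & 0x0F), bytes[1], bytes[2] };
    return true;
}

int main() {
    const uint8_t msg[3] = { 0xB0, 7, 100 };  // channel 1, CC#7 (volume) = 100
    ControlChange cc;
    if (parseControlChange(msg, cc))
        std::printf("ch %d, CC#%d = %d\n", cc.channel + 1, cc.controller, cc.value);
}
```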

Modulation Effects

Modulation Effects are a category of audio effects that use a Low-Frequency Oscillator (LFO) or other modulation source to periodically alter a parameter of the sound, such as pitch, amplitude, or filter frequency. Common examples include Chorus, Flanger, Phaser, Tremolo, and Vibrato.

Morphing

Morphing in audio refers to the gradual transformation of one sound into another. This process involves analyzing and interpolating between the characteristics (like spectral content or formants) of two or more sounds, creating a seamless transition between them. Morphing is commonly used in sound design and synthesis.

Normalization

Normalization is the process of adjusting the overall gain of an audio file or selection so that its amplitude reaches a target level, without changing its dynamic range. Peak normalization adjusts gain so the highest peak reaches a specific level (often 0 dBFS), while Loudness normalization adjusts gain based on perceived loudness (e.g., LUFS).
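
A sketch of peak normalization: one gain is computed from the highest peak and applied to every sample, so relative dynamics are untouched (names and the target level are illustrative):

```cpp
#include <cmath>
#include <vector>

// Peak normalization: scale the whole buffer so its largest absolute
// sample hits the target level (e.g. -1 dBFS).
void normalizePeak(std::vector<float>& samples, double targetDbfs = -1.0) {
    float peak = 0.0f;
    for (float s : samples) peak = std::fmax(peak, std::fabs(s));
    if (peak <= 0.0f) return;  // silence: nothing to do
    float gain = (float)(std::pow(10.0, targetDbfs / 20.0) / peak);
    for (float& s : samples) s *= gain;
}
```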

OSC (Open Sound Control)

Open Sound Control (OSC) is a protocol for communication between computers, synthesizers, sensors, and other multimedia devices. Designed as a modern alternative to MIDI, OSC offers higher resolution data, symbolic naming of parameters (address patterns), and is well-suited for transmission over networks (using UDP or TCP).

Panning

Panning is the distribution of a monophonic audio signal into a stereo or multi-channel sound field. By adjusting the relative levels sent to the left and right (or other) channels, panning creates the perception of a sound source’s position within the stereo image.
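
A common implementation is the constant-power pan law, sketched below; the -3 dB center keeps perceived loudness roughly steady as a source moves across the image:

```cpp
#include <cmath>
#include <utility>

// Constant-power pan law: pan ranges from -1 (hard left) to +1 (hard right).
// Center plays both channels at ~0.707 (-3 dB).
std::pair<float, float> panGains(float pan) {
    double angle = (pan + 1.0) * 0.7853981633974483;  // 0 .. pi/2
    return { (float)std::cos(angle), (float)std::sin(angle) };
}
```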

Phaser

A Phaser is a modulation effect that creates sweeping notches in the frequency spectrum of an audio signal. It works by splitting the signal, passing one part through a series of all-pass filters (which shift phase), and mixing it back with the original. Modulating the filter frequencies with an LFO produces the characteristic swirling, ethereal sound.
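
A sketch of the core building block, a first-order all-pass stage; a phaser chains several of these and sweeps their coefficients with an LFO (names are illustrative):

```cpp
// First-order all-pass stage: passes all frequencies at equal gain but
// shifts their phase. Mixing a chain of these with the dry signal
// creates the phaser's notches; an LFO would modulate 'a' over time.
class AllPassStage {
public:
    explicit AllPassStage(float a) : a(a) {}  // -1 < a < 1
    float process(float x) {
        float y = -a * x + x1 + a * y1;       // y[n] = -a*x[n] + x[n-1] + a*y[n-1]
        x1 = x;
        y1 = y;
        return y;
    }
    void setCoefficient(float newA) { a = newA; }
private:
    float a, x1 = 0.0f, y1 = 0.0f;
};
```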

Pitch Transposition

Pitch transposition is the process of changing the pitch of an audio signal without affecting its duration. This can be achieved by shifting the entire frequency spectrum of the sound up or down by a specific musical interval or frequency amount, often using techniques like resampling or phase vocoding.

PureData

PureData (Pd) is an open-source visual programming language for multimedia. Similar to Max/MSP, PureData allows users to create interactive audio, video, and graphical applications by connecting graphical objects, arranged in documents called “patches”, that represent functions or data flow. It’s known for its accessibility and use in experimental music and art.

Quantization

In digital audio, quantization is the process during Analog-to-Digital Conversion (ADC) where the continuous amplitude of an analog signal is mapped to the nearest available discrete value allowed by the digital format’s bit depth. This process introduces a small amount of error known as quantization error or quantization noise.
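
A sketch of the rounding step and the error it introduces (names are illustrative):

```cpp
#include <cmath>
#include <cstdio>

// Quantize a [-1, 1] sample to a given bit depth and report the error.
float quantize(float sample, int bits) {
    float steps = float((1 << (bits - 1)) - 1);  // e.g. 32767 for 16-bit
    return std::round(sample * steps) / steps;
}

int main() {
    float x = 0.123456789f;
    float q = quantize(x, 16);
    std::printf("quantized: %.9f, error: %.9f\n", q, x - q);
}
```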

Reverb (Reverberation)

Reverb, or Reverberation, is the persistence of sound in a space after the original sound source has stopped, caused by numerous reflections off surfaces. Audio effects simulate this phenomenon to add depth, spaciousness, and ambiance. Common types include algorithmic reverb (using mathematical models) and convolution reverb (using Impulse Responses).

Sample Rate

Sample Rate refers to the number of samples (snapshots of amplitude) taken from a continuous analog audio signal per second during digital conversion, measured in Hertz (Hz) or kilohertz (kHz). Common sample rates include 44.1 kHz (CD quality), 48 kHz (common for video), and higher rates like 96 kHz or 192 kHz used for high-resolution audio. Per the Nyquist–Shannon sampling theorem, a system can only represent frequencies up to half its sample rate, so higher sample rates can capture higher frequencies.

Sidechaining

Sidechaining is a technique where the effect processor on one track (e.g., a compressor or gate) is controlled by the audio signal from a different track (the sidechain input). A common use is sidechain compression, where a compressor on a bass track is triggered by the kick drum signal, causing the bass to temporarily duck in volume when the kick hits.
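
A minimal sketch of sidechain ducking: an envelope follower tracks the trigger signal (e.g., the kick) and pushes the main signal's gain down when it gets loud (names and constants are illustrative, not tuned values):

```cpp
#include <cmath>

// Sidechain ducker: the louder the sidechain input, the more the main
// signal is attenuated. Envelope smoothing prevents gain clicks.
class Ducker {
public:
    Ducker(float amount = 0.8f, float smoothing = 0.999f)
        : amount(amount), smoothing(smoothing) {}

    float process(float main, float sidechain) {
        float level = std::fabs(sidechain);  // rectify the trigger
        envelope = smoothing * envelope + (1.0f - smoothing) * level;
        float gain = 1.0f - amount * std::fmin(1.0f, envelope * 4.0f);
        return main * gain;
    }
private:
    float amount, smoothing, envelope = 0.0f;
};
```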

Stems

In a Digital Audio Workstation (DAW), stems are audio tracks or groups of related tracks (e.g., all drum tracks, all vocal tracks) bounced down to single stereo or mono files, often including their associated processing and effects. Stems simplify complex projects for mixing, mastering, collaboration, or creating remixes by consolidating elements while still allowing some level control.

Subtractive Synthesis

Subtractive Synthesis is a common method of sound synthesis that starts with a harmonically rich waveform (like a sawtooth or square wave generated by an oscillator) and then uses filters to subtract or attenuate certain frequencies, shaping the timbre of the sound. Envelopes and LFOs are typically used to modulate the oscillator, filter, and amplifier.
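
The signal path in miniature: a bright sawtooth source feeding a low-pass filter that darkens the tone (a sketch; a real synth would use anti-aliased oscillators, a resonant filter, and envelope modulation):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Subtractive synthesis in miniature: a (naive, aliasing) sawtooth
// oscillator into a one-pole low-pass filter that removes highs.
std::vector<float> renderSaw(double freqHz, double cutoffHz,
                             double sampleRate, size_t numSamples) {
    double phase = 0.0, inc = freqHz / sampleRate;
    double a = std::exp(-2.0 * 3.141592653589793 * cutoffHz / sampleRate);
    double state = 0.0;
    std::vector<float> out(numSamples);
    for (size_t n = 0; n < numSamples; ++n) {
        double saw = 2.0 * phase - 1.0;       // harmonically rich source
        phase += inc; if (phase >= 1.0) phase -= 1.0;
        state = (1.0 - a) * saw + a * state;  // subtract the highs
        out[n] = (float)state;
    }
    return out;
}
```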

Synthesis

Synthesis is the process of creating sound using electronic hardware or software, rather than recording acoustic sounds. Various techniques exist, each manipulating basic components like oscillators, filters, and amplifiers in different ways. See also Subtractive Synthesis, FM Synthesis, Additive Synthesis, Wavetable Synthesis, Granular Synthesis.

Time Scaling

Time scaling (also known as time stretching or compression) is the process of changing the duration (speed) of an audio signal without affecting its pitch. Algorithms like Phase Vocoder or granular techniques are often used to achieve this, allowing audio to be fit to a specific length or synchronized with video.

VST2 (Virtual Studio Technology 2)

Virtual Studio Technology 2 (VST2) is a widely used plugin interface standard developed by Steinberg for integrating virtual instruments and audio effects plugins with DAWs and other host software. While largely superseded by VST3, VST2 plugins remain common.

VST3 (Virtual Studio Technology 3)

Virtual Studio Technology 3 (VST3) is an updated version of the VST plugin standard developed by Steinberg. It introduces improvements over VST2, such as more efficient processing (only processing when audio is present), dynamic I/O configuration, dedicated event buses for note and controller data, and better handling of parameters.

Web Audio API

The Web Audio API is a JavaScript API built into modern web browsers for processing and synthesizing audio directly within web pages. It allows developers to load audio, apply effects, generate sounds, visualize audio, and create complex interactive audio applications that run in the browser without external plugins.
