Core IO and DSP
Audio processing
load(path[, sr, mono, offset, duration, dtype]) | Load an audio file as a floating point time series. |
to_mono(y) | Force an audio signal down to mono. |
resample(y, orig_sr, target_sr[, res_type, ...]) | Resample a time series from orig_sr to target_sr |
get_duration([y, sr, S, n_fft, hop_length, ...]) | Compute the duration (in seconds) of an audio time series or STFT matrix. |
autocorrelate(y[, max_size, axis]) | Bounded auto-correlation |
zero_crossings(y[, threshold, ...]) | Find the zero-crossings of a signal y: indices i such that sign(y[i]) != sign(y[i-1]). |
clicks([times, frames, sr, hop_length, ...]) | Returns a signal with the click signal placed at each specified time. |
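Two entries in the table above can be illustrated without the library. The duration of a time series is its sample count divided by the sampling rate, and a zero-crossing can be read as an index whose sample differs in sign from the previous sample. The sketch below is plain Python under those assumptions; the names `duration_seconds` and `zero_crossing_indices` are hypothetical, and the library's own return types, defaults, and edge handling may differ.

```python
def duration_seconds(y, sr=22050):
    """Duration of a mono signal: number of samples divided by the sampling rate.

    The default sr of 22050 Hz is an assumption made for this sketch.
    """
    return len(y) / sr


def zero_crossing_indices(y):
    """Indices i (i >= 1) where y[i] and y[i - 1] have different signs.

    Zero is treated as non-negative here, which is a choice of this sketch.
    """
    return [i for i in range(1, len(y)) if (y[i] < 0) != (y[i - 1] < 0)]


signal = [1.0, -1.0, -2.0, 3.0, 4.0, -5.0]
print(duration_seconds(signal, sr=2))
print(zero_crossing_indices(signal))
```

Counting the returned indices and dividing by the duration gives a zero-crossing rate, which is a common use of this kind of helper.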
Spectral representations
stft(y[, n_fft, hop_length, win_length, ...]) | Short-time Fourier transform (STFT) |
istft(stft_matrix[, hop_length, win_length, ...]) | Inverse short-time Fourier transform (ISTFT). |
ifgram(y[, sr, n_fft, hop_length, ...]) | Compute the instantaneous frequency (as a proportion of the sampling rate) obtained as the time-derivative of the phase of the complex spectrum as described by [R3]. |
cqt(y[, sr, hop_length, fmin, n_bins, ...]) | Compute the constant-Q transform of an audio signal. |
hybrid_cqt(y[, sr, hop_length, fmin, ...]) | Compute the hybrid constant-Q transform of an audio signal. |
pseudo_cqt(y[, sr, hop_length, fmin, ...]) | Compute the pseudo constant-Q transform of an audio signal. |
fmt(y[, t_min, n_fmt, kind, beta, ...]) | The fast Mellin transform (FMT) [R5] of a uniformly sampled signal y. |
phase_vocoder(D, rate[, hop_length]) | Phase vocoder. |
magphase(D) | Separate a complex-valued spectrogram D into its magnitude (S) and phase (P) components, so that D = S * P. |
logamplitude(S[, ref_power, amin, top_db]) | Log-scale the amplitude of a spectrogram. |
perceptual_weighting(S, frequencies, **kwargs) | Perceptual weighting of a power spectrogram. |
A_weighting(frequencies[, min_db]) | Compute the A-weighting of a set of frequencies. |
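The magnitude and phase split described for `magphase` in the table above (D = S * P) can be sketched in plain Python. The function `split_magnitude_phase` is a hypothetical stand-in that works on nested lists of complex numbers rather than arrays; it assumes S is the elementwise absolute value and P is a unit-modulus complex number carrying the angle, and it is not the library's implementation.

```python
import cmath


def split_magnitude_phase(D):
    """Split a 2-D list of complex values into magnitude S and unit phase P.

    Each element satisfies D[i][j] == S[i][j] * P[i][j] up to rounding error.
    """
    S = [[abs(z) for z in row] for row in D]
    # exp(1j * angle) has modulus 1; for z == 0 the angle is 0, so P is 1.
    P = [[cmath.exp(1j * cmath.phase(z)) for z in row] for row in D]
    return S, P


D = [[3 + 4j, -2j], [1 + 0j, 0j]]
S, P = split_magnitude_phase(D)

# Recombine to check that the product reproduces the input.
rebuilt = [[s * p for s, p in zip(s_row, p_row)] for s_row, p_row in zip(S, P)]
print(S)
print(rebuilt)
```

Keeping the phase as a unit complex number, rather than an angle in radians, means a modified magnitude can be recombined with the original phase by a single multiplication.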
Time and frequency conversion
frames_to_samples(frames[, hop_length, n_fft]) | Converts frame indices to audio sample indices |
frames_to_time(frames[, sr, hop_length, n_fft]) | Converts frame counts to time (seconds) |
samples_to_frames(samples[, hop_length, n_fft]) | Converts sample indices into STFT frames. |
samples_to_time(samples[, sr]) | Convert sample indices to time (in seconds). |
time_to_frames(times[, sr, hop_length, n_fft]) | Converts time stamps into STFT frames. |
time_to_samples(times[, sr]) | Convert timestamps (in seconds) to sample indices. |
hz_to_note(frequencies, **kwargs) | Convert one or more frequencies (in Hz) to the nearest note names. |
hz_to_midi(frequencies) | Get the closest MIDI note number(s) for given frequencies |
midi_to_hz(notes) | Get the frequency (Hz) of MIDI note(s) |
midi_to_note(midi[, octave, cents]) | Convert one or more MIDI numbers to note strings. |
note_to_hz(note, **kwargs) | Convert one or more note names to frequency (Hz) |
note_to_midi(note[, round_midi]) | Convert one or more spelled notes to MIDI number(s). |
hz_to_mel(frequencies[, htk]) | Convert Hz to Mels |
hz_to_octs(frequencies[, A440]) | Convert frequencies (Hz) to (fractional) octave numbers. |
mel_to_hz(mels[, htk]) | Convert mel bin numbers to frequencies |
octs_to_hz(octs[, A440]) | Convert octave numbers to frequencies. |
fft_frequencies([sr, n_fft]) | Alternative implementation of np.fft.fftfreq |
cqt_frequencies(n_bins, fmin[, ...]) | Compute the center frequencies of Constant-Q bins. |
mel_frequencies([n_mels, fmin, fmax, htk]) | Compute the center frequencies of mel bands. |
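Several of the conversions listed above reduce to short formulas. The sketch below uses plain Python and hypothetical names (`frame_to_seconds`, `hz_to_midi_number`, `midi_number_to_hz`). It assumes a frame index maps to time as frame * hop_length / sr, that MIDI note 69 corresponds to A440, and that there are 12 equal-tempered semitones per octave. The defaults of 22050 Hz and a hop of 512 samples are assumptions of this sketch, and the library functions may apply additional offsets or rounding.

```python
import math


def frame_to_seconds(frame, sr=22050, hop_length=512):
    """Time in seconds at the start of a frame, assuming no n_fft offset."""
    return frame * hop_length / sr


def hz_to_midi_number(frequency):
    """Fractional MIDI note number for a frequency in Hz (A440 -> 69)."""
    return 12.0 * math.log2(frequency / 440.0) + 69.0


def midi_number_to_hz(note):
    """Frequency in Hz for a (possibly fractional) MIDI note number."""
    return 440.0 * 2.0 ** ((note - 69.0) / 12.0)


print(frame_to_seconds(43))
print(hz_to_midi_number(880.0))
print(midi_number_to_hz(57))
```

The two MIDI helpers are inverses of each other, so a round trip through both should return the starting value up to floating-point error.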
Pitch and tuning
estimate_tuning([y, sr, S, n_fft, ...]) | Estimate the tuning of an audio time series or spectrogram input. |
pitch_tuning(frequencies[, resolution, ...]) | Given a collection of pitches, estimate its tuning offset (in fractions of a bin) relative to A440=440.0Hz. |
piptrack([y, sr, S, n_fft, hop_length, ...]) | Pitch tracking on thresholded parabolically-interpolated STFT |
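The idea of a tuning offset "in fractions of a bin" can be sketched as follows: convert each pitch to a fractional note number, take its deviation from the nearest whole note, and report the most common deviation. This is a simplified illustration and not the library's algorithm. The name `tuning_offset_sketch`, the choice of 12 bins per octave, and the histogram resolution of 0.01 are all assumptions made for the example.

```python
import math
from collections import Counter


def tuning_offset_sketch(frequencies, resolution=0.01):
    """Estimate a shared tuning offset in fractions of a semitone.

    Returns the lower edge of the most populated deviation bin, a value
    in [-0.5, 0.5), or 0.0 when no positive frequency is given.
    """
    n_bins = int(round(1.0 / resolution))
    counts = Counter()
    for f in frequencies:
        if f <= 0:
            continue  # skip unvoiced or invalid entries
        note = 12.0 * math.log2(f / 440.0) + 69.0
        deviation = note - round(note)  # lies in [-0.5, 0.5]
        index = min(int((deviation + 0.5) / resolution), n_bins - 1)
        counts[index] += 1
    if not counts:
        return 0.0
    best_index, _ = counts.most_common(1)[0]
    return -0.5 + best_index * resolution


# Pitches that are all about a quarter of a semitone sharp of A440 tuning.
sharp = [440.0 * 2.0 ** ((n + 0.255) / 12.0) for n in range(-12, 13)]
print(tuning_offset_sketch(sharp))
```

Using the mode of a histogram rather than the mean keeps a few badly tracked pitches from pulling the estimate away from the dominant deviation.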