Commit b2d736ad authored by ThomasBiberger

add example, update readme and include the new combination of monaural and binaural quality scores

parent bc68ad82
% EXAMPLE for the usage of the combined audio quality model MoBi-Qadd
% as it was used in Biberger et al. (2021). MoBi-Qadd is a modified version of
% MoBi-Q (Fleßner et al., 2019) with a revised combination of monaural and
% binaural objective quality ratings.
% MoBi-Qadd and MoBi-Q are reference-based audio quality models. They use
% monaural cues (intensity and amplitude modulation) and binaural cues
% (interaural level and phase differences (ILDs, IPDs) and interaural vector
% strength (IVS)) to predict subjective audio quality ratings.
%
% 1 ... no difference (test signal = reference signal)
% 0 ... large difference
cur_dir=pwd;
addpath(genpath(cur_dir)); % include all folders from the current path
%% input signals
if verLessThan('matlab','8.0')
    % reference signal (clean)
    [RefSig, fsRef] = wavread('Stimuli/guitar_ref.wav');
    % test signal (processed)
    [TestSig, fsTest] = wavread('Stimuli/guitar_K_lin_5_non9985_ASWILD_3_ASWITD200.wav');
else
    % reference signal (clean)
    [RefSig, fsRef] = audioread('Stimuli/guitar_ref.wav');
    % test signal (processed)
    [TestSig, fsTest] = audioread('Stimuli/guitar_K_lin_5_non9985_ASWILD_3_ASWITD200.wav');
end
% compare sampling frequencies
if fsTest ~= fsRef
    error('signals have different sampling frequencies')
else
    fs = fsTest;
end
%% monaural GPSMq
% The GPSMq is a reference-based audio quality model that uses
% power and envelope power SNRs to predict subjective audio quality ratings.
% A stimulus level of 0 dB full scale rms (dB FS rms, which means 1 in Matlab)
% is assumed to represent a sound pressure level of 100 dB SPL rms.
% Example: stimulus (sound) calibration to 65 dB SPL rms;
% sound_adapt=sound/rms(sound) * 10^((65-100)/20);
% Model output OPM:
% - high values indicate small differences between Test and Reference signal
% - low values indicate large differences between Test and Reference signal
% adjust presentation level, e.g. 65 dB SPL rms
RefSig(:,1)=RefSig(:,1).*10^((8)/20); % Ref. signal, left ch.
RefSig(:,2)=RefSig(:,2).*10^((8)/20); % Ref. signal, right ch.
TestSig(:,1)=TestSig(:,1).*10^((8)/20); % Test. signal, left ch.
TestSig(:,2)=TestSig(:,2).*10^((8)/20); % Test. signal, right ch.
stOut = GPSMqBin(RefSig, TestSig, fs); % monaural model output
%% binaural BAM-Q
% The BAM-Q is a reference-based audio quality model that uses binaural cues
% such as interaural level and phase differences (ILDs, IPDs) and interaural
% vector strength (IVS) to predict subjective audio quality ratings.
% Model input:
% - Stereo signals are required as input [left_ch right_ch]
% - 0 dB FS should correspond to 115 dB SPL.
% - Clean and distorted signals have to be the same length and temporally aligned.
% - The input signals are required to have a duration of at least 0.4
% seconds.
% Model output, binQ:
% 100 ... no difference
% 0 ... large difference
% -X ... even larger difference
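% adjust presentation level, e.g. 65 dB SPL rms; the signals were already
% scaled for GPSMq above (0 dB FS = 100 dB SPL), so -15 dB accounts for the
% BAM-Q calibration (0 dB FS = 115 dB SPL)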
RefSig(:,1)=RefSig(:,1).*10^((-15)/20); % Ref. signal, left ch.
RefSig(:,2)=RefSig(:,2).*10^((-15)/20); % Ref. signal, right ch.
TestSig(:,1)=TestSig(:,1).*10^((-15)/20); % Test. signal, left ch.
TestSig(:,2)=TestSig(:,2).*10^((-15)/20); % Test. signal, right ch.
[binQ, ILDdiff, ITDdiff, IVSdiff] = BAMQpc(RefSig, TestSig, fs);
%% Combine outputs of BAM-Q and GPSMq
% ***********Combination suggested by Biberger et al. (2021)***************
% Speech and music signals with distortions introduced by the hear-through mode
% of hearables (Schepker et al., 2020) were used for optimizing the combination
% of the monaural and the binaural model outputs.
% Model output, obj_meas_tih2021:
% 1 ... no difference
% 0 ... large difference
[obj_meas_tih2021]=combine_binQ_OPM_TiH2021(stOut.OPM_fix(:,1), binQ(:,1));
% *************************************************************************
disp('***********************')
disp('****monaural measures**')
disp('***********************')
disp(stOut)
% SNR_dc: 3.4546 % mainly represents linear, spectral distortions, see Biberger et al. (2018)
% SNR_ac: 2.8818 % mainly represents nonlinear distortions, see Biberger et al. (2018)
% SNR_dc_fix: 3.4546 % includes binaural extension to reduce the sensitivity for IPDs
% SNR_ac_fix: 2.2067 % includes binaural extension to reduce the sensitivity for IPDs
% OPM: 42.8757 % OPM, see Eq. (15) in Fleßner et al. (2019)
% OPM_fix: 43.4770 % OPM, see Eq. (16) in Fleßner et al. (2019); with binaural extension
% to reduce the sensitivity for IPDs;
% OPM_fix IS USED FOR THE OVERALL MEASURE!
disp('***********************')
disp('***binaural measures***')
disp('***********************')
disp(['binQ: ',num2str(binQ)])
disp(['ILDdiff: ', num2str(ILDdiff)])
disp(['ITDdiff: ', num2str(ITDdiff)])
disp(['IVSdiff: ', num2str(IVSdiff)])
% binQ: 47 % binaural overall measure based on ILDs, ITD/IPDs and the IVS
% binQ IS USED FOR THE OVERALL MEASURE!!!
% ILDdiff: 5.9312e+03 % interaural level differences (ILD)
% ITDdiff: 1.5621e-04 % interaural time/phase differences (ITD)
% IVSdiff: 0.0227 % interaural vector strength (IVS)
disp('***********************')
disp('****overall measure***')
disp('***********************')
disp(['overall_measure_Biberger2021: ',num2str(obj_meas_tih2021)])
% overall measure: 0.62108
\ No newline at end of file
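The level handling described in the example script can be sketched as a small helper. This is a hedged illustration and not part of the repository: `calibrate` and `level_dB` are hypothetical names, the stereo tone is a synthetic stand-in for the guitar stimuli, and using one common gain for both channels (so that ILDs stay unchanged) is an assumption about how the calibration formula in the comments should be applied to stereo input.

```matlab
% Hypothetical helper (not part of the toolbox): scale a signal so that its
% overall rms corresponds to target_dB SPL, given that 0 dB FS rms
% corresponds to ref_dB SPL. One common gain is applied to both channels.
calibrate = @(sig, target_dB, ref_dB) ...
    sig ./ sqrt(mean(sig(:).^2)) .* 10^((target_dB - ref_dB) / 20);

% Overall rms level in dB FS across all channels (hypothetical helper).
level_dB = @(x) 20 * log10(sqrt(mean(x(:).^2)));

% Synthetic stereo signal: 1 s, 440 Hz, right channel 6 dB lower.
fs  = 44100;
t   = (0:fs-1).' / fs;
sig = [sin(2*pi*440*t), 0.5 * sin(2*pi*440*t)];

sig_gpsmq = calibrate(sig, 65, 100);  % GPSMq: 0 dB FS rms = 100 dB SPL rms
sig_bamq  = calibrate(sig, 65, 115);  % BAM-Q: 0 dB FS     = 115 dB SPL

fprintf('GPSMq input level: %.1f dB FS rms\n', level_dB(sig_gpsmq));  % -35.0
fprintf('BAM-Q input level: %.1f dB FS rms\n', level_dB(sig_bamq));   % -50.0
```

The 15 dB gap between the two calibrated signals matches the `10^((-15)/20)` factor applied before the BAM-Q call in the script.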
### General remarks
The intrusive audio quality model MoBi-Qadd represents a modified version of the MoBi-Q (Fleßner et al., 2019, see https://gitlab.uni-oldenburg.de/kuxo2262/combinedaudioqualitymodel) to predict overall audio quality for monaurally, binaurally, and combined monaurally and binaurally distorted speech, music, and noise signals as it has been demonstrated in Biberger et al. (2021). It combines monaural audio quality predictions based on GPSMq (Biberger et al., 2018) and binaural audio quality predictions based on BAM-Q (Fleßner et al., 2017) to give overall audio quality predictions. The main differences...
The intrusive audio quality model MoBi-Qadd (Biberger et al., 2021) represents a modified version of the MoBi-Q (Fleßner et al., 2019, see https://gitlab.uni-oldenburg.de/kuxo2262/combinedaudioqualitymodel)
to predict overall audio quality for monaurally, binaurally, and combined monaurally and binaurally distorted speech, music, and noise signals. It combines monaural audio quality predictions based
on GPSMq (Biberger et al., 2018) and binaural audio quality predictions based on BAM-Q (Fleßner et al., 2017) to give overall audio quality predictions. Speech and music signals with
distortions introduced by the hear-through mode of hearables (Schepker et al., 2020) were used for optimizing the combination of the monaural and the binaural model outputs. The MoBi-Qadd quality scores
range from 0 (very strong differences between reference and test signals) to 1 (no perceptible differences).
'Example_MoBiQadd.m' gives a minimal example of how to use MoBi-Qadd in Matlab.
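
The mapping from the two model outputs to the 0 to 1 score can be sketched as follows. This is an illustrative inline version only; in practice, call `combine_binQ_OPM_TiH2021` from this repository. The constants are copied from that function, and the first input pair (OPM_fix = 43.4770, binQ = 47) is taken from the commented output in 'Example_MoBiQadd.m'. The other two pairs are added here as boundary checks.

```matlab
% Inline sketch of the MoBi-Qadd combination (constants copied from
% combine_binQ_OPM_TiH2021.m; variable names here are illustrative).
OPM_fix = [43.4770; 100; 0];   % monaural GPSMq output (OPM_fix)
binQ    = [47;      100; 0];   % binaural BAM-Q output (binQ)

vParams = [0.1188, 38.9817, 0.0192, -20.8263];

% Limit both inputs to 100, then apply one sigmoid per model output.
mon = 1 ./ (1 + exp(-vParams(1) .* (min(OPM_fix, 100) - vParams(2))));
bin = 1 ./ (1 + exp(-vParams(3) .* (min(binQ,    100) - vParams(4))));

% Sum, apply the lower bound, and rescale to the range 0 to 1.
raw   = max(mon + bin, 0.6083);
score = (raw - 0.6083) ./ (1.9098 - 0.6083);

fprintf('%.4f\n', score);   % about 0.6211, 1.0000, 0.0000
```

With both inputs at 100 the sum of the two sigmoids is about 1.9098, and with both at 0 it is about 0.6083, which is consistent with these two constants acting as the upper and lower bounds of the rescaling.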
The GPSMq output provides four submeasures:
The GPSM<sup>q</sup> output provides four submeasures:
- **OPM**: ...objective perceptual measure; based on a combination of 'SNR_dc' and 'SNR_ac'
to which a logarithmic transformation with a lower and upper boundary is applied.
Distortions resulting in SNRs below the lower boundary are assumed to be
@@ -28,12 +31,14 @@ The BAM-Q output provides four submeasures:
- **IVSdiff** ... intermediate IVS measure
A more detailed description of the MoBi-Qadd is given in:
T. Biberger, H. Schepker, F. Denk, and S. D. Ewert, "Instrumental quality predictions and analysis of auditory
cues for algorithms in modern headphone technology", Trends in Hearing, vol. xy, no. xy, pp. xy-xy, 2021.
J.-H. Fleßner, T. Biberger, and S. D. Ewert, "Subjective and Objective Assessment of Monaural and Binaural Aspects of Audio Quality",
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 7, pp. 1112-1125, 2019. https://doi.org/10.1109/TASLP.2019.2904850
......
@@ -27,6 +27,8 @@ function [ obj_meas ] = combine_binQ_OPM(obj_meas_mon, obj_meas_bin)
% --------------------------------------------------------------------------------
obj_meas_mon(obj_meas_mon>100)=100; % limitation of upper boundary -> this is only relevant if reference vs. reference is tested
obj_meas_bin(obj_meas_bin>100)=100; % limitation of upper boundary -> this is only relevant if reference vs. reference is tested
mon_tmp=0.0528.*obj_meas_mon; % for values see Table II in Fleßner et al. (2019)
bin_tmp=0.0078*obj_meas_bin; % for values see Table II in Fleßner et al. (2019)
obj_meas=min(log10(mon_tmp),bin_tmp);
......
function [ obj_meas ] = combine_binQ_OPM_TiH2021(obj_meas_mon, obj_meas_bin)
% combine_binQ_OPM_TiH2021.m Function combines the monaural quality measure (OPM) and the binaural
% quality measure (binQ) to calculate an overall audio quality measure.
@@ -8,14 +7,14 @@ function [ obj_meas ] = combine_binQ_OPM_TiH2021(obj_meas_mon, obj_meas_bin)
% obj_meas_bin: binaural output measure (binQ) of BAM-Q (Fleßner et al., 2017)
%
% OUTPUT:
% obj_meas_bin: struct for stimulus related parameters,e.g.,fs
% obj_meas: combined output measure (see Biberger et al., 2021)
%
% Usage: [ obj_meas ] = combine_binQ_OPM(obj_meas_mon, obj_meas_bin)
% Usage: [obj_meas] = combine_binQ_OPM_TiH2021(obj_meas_mon, obj_meas_bin)
% thomas.biberger@uni-oldenburg.de;
% date: 2019-09-09
% date: 2021-01-14
%
% --------------------------------------------------------------------------------
% Copyright (c) 2017-2019, Jan-Hendrik Fleßner, Thomas Biberger, Stephan D. Ewert,
% Copyright (c) 2017-2021, Jan-Hendrik Fleßner, Thomas Biberger, Stephan D. Ewert,
% University Oldenburg, Germany.
%
% This work is licensed under the
@@ -27,21 +26,17 @@ function [ obj_meas ] = combine_binQ_OPM_TiH2021(obj_meas_mon, obj_meas_bin)
% 94041, USA.
% --------------------------------------------------------------------------------
obj_meas_mon(obj_meas_mon>100)=100; % limitation of upper boundary -> this is only relevant if reference vs. reference is tested
obj_meas_bin(obj_meas_bin>100)=100; % limitation of upper boundary -> this is only relevant if reference vs. reference is tested
vParams=[0.1188, 38.9817,0.0192,-20.8263]; % based on monaural and binaural outputs of MoBi-Q
mon_tmp=1./(1 + exp(-vParams(1).*(obj_meas_mon-vParams(2)))); % apply sigmoid to monaural scores
bin_tmp=1./(1 + exp(-vParams(3).*(obj_meas_bin-vParams(4)))); % apply sigmoid to binaural scores
obj_meas=(mon_tmp+bin_tmp); % combine monaural and binaural scores
obj_meas(obj_meas<0.6083)=0.6083; % ensures that the combined score does not fall below 0.6083;
% such low values (<0.6083) were never
% observed for the distorted signals in
% the databases used so far (not even for
% the anchor signals)
obj_meas=(obj_meas-0.6083)./(1.9098-0.6083); % bound quality scores between 0 and 1
end
\ No newline at end of file