Commit 3274cc4b authored by dualberger

some comments how to use the model in the example file

parent 81d08290
% EXAMPLE for the usage of the combined audio quality model described
% in Flener et al. (2019). It is a reference-based audio quality model that uses
% monaural (intensity and amplitude modulation) and binaural (interaural level and
% phase differences (ILDs, IPDs), interaural vector strength (IVS)) cues to predict subjective audio
% quality ratings.
% Some notes on the Overall quality measure:
% - avoid the case Reference signal = Test signal (which gives an overall quality of 0.78), as
%   it was not considered in the original fitting procedure and is thus not reflected by the back-end parameters
% - any other (perceptually relevant) distortions in the Test signal result in an overall quality < 0.78
cur_dir=pwd;
addpath(genpath(cur_dir)); % include all folders from the current path
%% input signals
if verLessThan('matlab','8.0')
    % reference signal (clean)
    [RefSig, fsRef] = wavread('Stimuli/guitar_ref.wav');
    % test signal (processed)
    [TestSig, fsTest] = wavread('Stimuli/guitar_K_lin_5_non9985_ASWILD_3_ASWITD200.wav');
else
    % reference signal (clean)
    [RefSig, fsRef] = audioread('Stimuli/guitar_ref.wav');
    % test signal (processed)
    [TestSig, fsTest] = audioread('Stimuli/guitar_K_lin_5_non9985_ASWILD_3_ASWITD200.wav');
end
% compare sampling frequencies
if fsTest ~= fsRef
    error('signals have different sampling frequencies')
else
    fs = fsTest;
end
%% monaural GPSMq
% The GPSMq is a reference-based audio quality model that uses
% power and envelope power SNRs to predict subjective audio quality ratings.
% A stimulus level of 0 dB full scale rms (dB FS rms, i.e. an rms of 1 in Matlab)
% is assumed to represent a sound pressure level of 100 dB SPL rms.
% Example: stimulus (sound) calibration to 65 dB SPL rms:
% sound_adapt = sound/rms(sound) * 10^((65-100)/20);
% Model output OPM:
% - high values indicate small differences between Test and Reference signal
% - low values indicate large differences between Test and Reference signal
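% A minimal sketch of the calibration rule above, written as a reusable helper.
% The name calibrateToSPL and the demo variables are hypothetical and not part of
% the toolbox; sqrt(mean(x.^2)) stands in for rms() so the sketch does not rely on
% rms() being available.

```matlab
% Hypothetical helper (not part of the toolbox): scale each channel (column) of x
% so that its rms level equals targetSPL, given that 0 dB FS rms = refSPL dB SPL.
calibrateToSPL = @(x, targetSPL, refSPL) ...
    bsxfun(@rdivide, x, sqrt(mean(x.^2, 1))) .* 10^((targetSPL - refSPL)/20);

% Demo on a synthetic stereo signal (assumption: 1 s of noise at 44.1 kHz)
demoSig = randn(44100, 2);
demoCal = calibrateToSPL(demoSig, 65, 100);

% Level check: 20*log10(rms) + refSPL should give back the target level
demoLevel = 20*log10(sqrt(mean(demoCal.^2, 1))) + 100;
```

% With refSPL = 115 instead of 100, the same helper would follow the BAM-Q
% convention (0 dB FS = 115 dB SPL).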
% adjust presentation level, e.g. 65 dB SPL rms
RefSig(:,1)=RefSig(:,1)./rms(RefSig(:,1)).*10^((65-100)/20); % Ref. signal, left ch.
RefSig(:,2)=RefSig(:,2)./rms(RefSig(:,2)).*10^((65-100)/20); % Ref. signal, right ch.
TestSig(:,1)=TestSig(:,1)./rms(TestSig(:,1)).*10^((65-100)/20); % Test signal, left ch.
TestSig(:,2)=TestSig(:,2)./rms(TestSig(:,2)).*10^((65-100)/20); % Test signal, right ch.
stOut = GPSMqBin(RefSig, TestSig, fs); % monaural model output
%% binaural BAM-Q
% The BAM-Q is a reference-based audio quality model that uses
% binaural cues such as interaural level and phase differences (ILDs, IPDs) and interaural
% vector strength (IVS) to predict subjective audio quality ratings.
% Model input:
% - Stereo signals are required as input [left_ch right_ch]
% - 0 dB FS should correspond to 115 dB SPL.
% - Clean and distorted signals have to be the same length and temporally aligned.
% - The input signals are required to have a duration of at least 0.4
% seconds.
% Model output, binQ:
% 100 ... no difference
% 0 ... large difference
% -X ... even larger difference
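% A minimal sketch of a pre-check for the BAM-Q input requirements listed above
% (stereo, equal length, at least 0.4 s). The name checkBamqInput and the demo
% variables are hypothetical and not part of the toolbox. Temporal alignment is
% not tested here and has to be ensured separately.

```matlab
% Hypothetical helper (not part of the toolbox): returns true only if both
% signals are stereo, have the same number of samples and last at least 0.4 s.
checkBamqInput = @(ref, test, fs) ...
    size(ref, 2) == 2 && size(test, 2) == 2 && ...  % stereo [left_ch right_ch]
    size(ref, 1) == size(test, 1) && ...            % same length
    size(ref, 1)/fs >= 0.4;                         % duration of at least 0.4 s

% Demo on synthetic signals (assumption: 1 s of noise at 44.1 kHz)
fsDemo    = 44100;
refDemo   = randn(fsDemo, 2);
testDemo  = refDemo + 0.01*randn(fsDemo, 2);
okDemo    = checkBamqInput(refDemo, testDemo, fsDemo);                       % valid input
shortDemo = checkBamqInput(refDemo(1:1000,:), testDemo(1:1000,:), fsDemo);   % too short
monoDemo  = checkBamqInput(refDemo(:,1), testDemo(:,1), fsDemo);             % not stereo
```

% In the example script, the same check could be applied to RefSig, TestSig and fs
% before the binaural model is called.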
% adjust presentation level, e.g. 65 dB SPL rms
RefSig(:,1)=RefSig(:,1)./rms(RefSig(:,1)).*10^((65-115)/20); % Ref. signal, left ch.
RefSig(:,2)=RefSig(:,2)./rms(RefSig(:,2)).*10^((65-115)/20); % Ref. signal, right ch.
TestSig(:,1)=TestSig(:,1)./rms(TestSig(:,1)).*10^((65-115)/20); % Test signal, left ch.
TestSig(:,2)=TestSig(:,2)./rms(TestSig(:,2)).*10^((65-115)/20); % Test signal, right ch.
[binQ, ILDdiff, ITDdiff, IVSdiff] = BAMQpc(RefSig, TestSig, fs);
%% Combine outputs of BAM-Q and GPSMq
[obj_meas]=combine_binQ_OPM(stOut.OPM_fix(:,1), binQ(:,1));
disp('***********************')
disp('***monaural measures***')
disp('***********************')
disp(stOut)
% SNR_dc: 3.7852 % mainly represents linear, spectral distortions, see Biberger et al. (2018)
% SNR_ac: 2.8843 % mainly represents nonlinear distortions, see Biberger et al. (2018)
% SNR_dc_fix: 3.7852 % includes binaural extension to reduce the sensitivity for IPDs
% SNR_ac_fix: 2.4026 % includes binaural extension to reduce the sensitivity for IPDs
% OPM: 41.9663 % OPM, see Eq. (15) in Flener et al. (2019)
% OPM_fix: 41.8715 % OPM, see Eq. (16) in Flener et al. (2019); with binaural extension
% to reduce the sensitivity for IPDs;
% OPM_fix IS USED FOR THE OVERALL MEASURE!
disp('***********************')
disp('***binaural measures***')
disp('***********************')
disp(['binQ: ',num2str(binQ)])
disp(['ILDdiff: ', num2str(ILDdiff)])
disp(['ITDdiff: ', num2str(ITDdiff)])
disp(['IVSdiff: ', num2str(IVSdiff)])
% binQ: 47 % binaural overall measure based on ILDs, ITD/IPDs and the IVS
% binQ IS USED FOR THE OVERALL MEASURE!!!
% ILDdiff: 5.7629e+03 % interaural level differences (ILD)
% ITDdiff: 1.5393e-04 % interaural time/phase differences (ITD)
% IVSdiff: 0.0248 % interaural vector strength (IVS)
disp('***********************')
disp('****overall measure****')
disp('***********************')
disp(['overall_measure: ',num2str(obj_meas)])
% overall_measure: 0.34455
\ No newline at end of file
@@ -4,7 +4,8 @@
 % phase differences (ILDs, IPDs), interaural vector strength (IVS)) cues to predict subjective audio
 % quality ratings.
 % Some notes on the Overall quality measure:
-% - if Test signal = Reference signal (no distortions) an overall quality of 0.78 is achieved
+% - it should be avoided to use the case where Reference Signal = Test signal (which gives an overall quality of 0.78) as
+% it was not considered in the original fitting procedure, and thus not reflected by the back end parameters
 % - any other (perceptually relevant) distortions in the Test signal result in an overall quality < 0.78