Commit 888c20a9 authored by dualberger

add some comments in the README

parent 3e127092
### General remarks
The intrusive Combined (monaural and binaural) Audio Quality Model (Fleßner et al., 2019) combines monaural audio quality
predictions based on GPSMq (Biberger et al., 2018) and binaural audio quality predictions based on BAM-Q (Fleßner et al., 2017) to
give overall audio quality predictions.
It requires the clean and the distorted signal as input to estimate perceived binaural quality differences.
Stereo signals ([left channel right channel]) are required as model input. A stimulus level of
0 dB FS corresponds to a sound pressure level of 115 dB SPL.
Clean and distorted signals must have the same length and be temporally aligned. The input signals are required to
have a duration of at least 0.4 seconds.
'Example_combAudioQual.m' gives a minimal example of how to use the Combined Audio Quality Model in Matlab.
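
Independently of that example script, the input requirements stated above can be checked before calling the model. The following is a minimal sketch, not part of the distribution: the variable names, the sampling rate, and the synthetic test signals are assumptions made for illustration, and dB FS is assumed here to mean the RMS level relative to an amplitude of 1. Temporal alignment is not checked.

```matlab
% Sketch: check two stereo signals against the stated input requirements.
% All names and the synthetic signals are illustrative assumptions.
fs   = 44100;                          % sampling rate in Hz (assumed)
t    = (0:round(0.5*fs)-1).' / fs;     % 0.5 s time axis (column vector)
ref  = 0.1 * [sin(2*pi*1000*t), sin(2*pi*1000*t)];           % clean, [left right]
test = ref + 0.001 * [cos(2*pi*3000*t), cos(2*pi*3000*t)];   % distorted

minDur     = 0.4;    % minimum duration in seconds (from the README)
dBSPLat0FS = 115;    % 0 dB FS corresponds to 115 dB SPL (from the README)

isStereo   = size(ref, 2) == 2 && size(test, 2) == 2;
sameLength = size(ref, 1) == size(test, 1);
longEnough = size(ref, 1) / fs >= minDur;

assert(isStereo,   'Both signals must have two columns: [left right].');
assert(sameLength, 'Clean and distorted signals must have the same length.');
assert(longEnough, 'Signals must be at least %.1f s long.', minDur);

% RMS level per channel in dB FS (assumed: re amplitude 1), mapped to dB SPL.
rmsRef     = sqrt(mean(ref.^2, 1));
levelDBFS  = 20 * log10(rmsRef);
levelDBSPL = levelDBFS + dBSPLat0FS;
fprintf('Reference level: %.1f dB SPL (left), %.1f dB SPL (right)\n', levelDBSPL);
```

With these assumptions, a sine of amplitude 0.1 has an RMS level of about -23 dB FS, which maps to roughly 92 dB SPL.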
The GPSMq output provides four submeasures:
- **OPM**: ...objective perceptual measure; based on a combination of 'SNR_dc' and 'SNR_ac'
to which a logarithmic transformation with a lower and an upper boundary is applied.
Distortions resulting in SNRs below the lower boundary are assumed to be
imperceptible, while distortions causing large SNRs that exceed the upper limit are
assumed to lead to a fixed (poor) quality.
- **OPM_raw**: ...identical to 'OPM' but without lower and upper boundary
- **SNR_dc**: ....power-based SNR; based on temporal averaging and combination across auditory channels
- **SNR_ac**: ....envelope-power-based SNR; based on temporal averaging and combination across auditory and
modulation channels
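
The difference between 'OPM' and 'OPM_raw' described above can be illustrated with a bounded logarithmic mapping. This is only a sketch of the general idea and not the GPSMq implementation: the way 'SNR_dc' and 'SNR_ac' are combined is not shown, and the input values, the boundary values, and the choice of `10*log10` are hypothetical placeholders.

```matlab
% Sketch of a bounded logarithmic mapping; NOT the GPSMq implementation.
% Input values and boundaries below are hypothetical placeholders.
snrCombined = [1e-4, 0.05, 2, 500];   % hypothetical combined linear SNRs
lowerBound  = 1e-2;                   % hypothetical lower boundary
upperBound  = 1e2;                    % hypothetical upper boundary

% Without boundaries (analogous in spirit to 'OPM_raw').
rawMeasure = 10 * log10(snrCombined);

% With boundaries (analogous in spirit to 'OPM'): clamp first, then transform.
snrLimited     = min(max(snrCombined, lowerBound), upperBound);
boundedMeasure = 10 * log10(snrLimited);

% First row: unbounded values, second row: bounded values.
disp([rawMeasure; boundedMeasure]);
```

Values inside the two boundaries are identical in both rows. Values outside are held at the nearest boundary, which mirrors the stated assumption that very small SNRs are imperceptible and very large SNRs lead to a fixed (poor) quality.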
The BAM-Q output provides four submeasures:
- **binQ**: ... binaural quality measure; based on a combination of the submeasures that represent differences
between the reference and the test signal for interaural level differences ('ILDdiff'),
interaural time/phase differences ('ITDdiff') and the interaural vector strength ('IVSdiff').
- 100 ... no difference
- 0 ... large difference
- -X ... even larger difference
- **ILDdiff** ... intermediate ILD measure
- **ITDdiff** ... intermediate ITD measure (can be 0 if ITDs are not evaluable)
- **IVSdiff** ... intermediate IVS measure
A more detailed description of the Combined Audio Quality Model is given in:
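J.-H. Fleßner, T. Biberger, and S. D. Ewert, "Subjective and Objective Assessment of Monaural and Binaural Aspects of Audio Quality",
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 7, pp. 1112-1125, 2019. https://doi.org/10.1109/TASLP.2019.2904850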
**Abstract:**
Recently, the binaural auditory-model-based quality
prediction (BAM-Q) was successfully applied to predict binaural
audio quality degradations, while the generalized power-spectrum
model for quality (GPSMq) has been demonstrated to account for a
large variety of monaural signal distortions. For many applications,
a combined monaural and binaural model would be advantageous,
however, the contribution of monaural and binaural quality aspects
to overall (spatial) quality is not conclusively clarified. Thus,
the current study systematically investigated overall audio quality
in a listening experiment for monaural and binaural distortions
on music, speech, and noise, applied either in isolation or in combination.
The resulting database was used for assessing different
methods for combining BAM-Q and GPSMq to joint overall audio
predictions for monaural and binaural signal distortions. It was
investigated, if monaural or binaural quality aspects contribute
stronger to overall audio quality. The results indicate that overall
audio quality depends on the lower quality aspect, either monaural
or binaural.
Authors of the Matlab implementation of the Combined Audio Quality Model:
- jan-hendrik.flessner@jade-hs.de
- thomas.biberger@uni-oldenburg.de
===============================================================================
### License and permissions
===============================================================================
Unless otherwise stated, the Combined Audio Quality Model distribution, including all files, is licensed
under Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International
(CC BY-NC-ND 4.0).
In short, this means that you are free to use and share (copy, distribute and
transmit) the Combined Audio Quality Model distribution under the following conditions:
Attribution - You must attribute the Combined Audio Quality Model distribution by acknowledgement of
the author if it appears or if it was used in any form in your work.
The attribution must not in any way suggest that the author
endorses you or your use of the work.
Noncommercial - You may not use the Combined Audio Quality Model for commercial purposes.
No Derivative Works - You may not alter, transform, or build upon the Combined Audio Quality Model.
Exceptions are the following external Matlab functions (see their respective licence)
that were used within the Combined Audio Quality Model:
- Gammatone filterbank from V. Hohmann (https://zenodo.org/record/2643400#.XQsf5TnVLCM), for details see [1,2]:
[1] Hohmann, V. (2002). Frequency analysis and synthesis using a Gammatone filterbank.
Acta Acustica united with Acustica, 88(3), 433-442.
[2] Herzke, T., & Hohmann, V. (2007). Improved numerical methods for gammatone filterbank analysis and
synthesis. Acta acustica united with acustica, 93(3), 498-500.
- Code snippets from the Dietz model (Authors: Mathias Dietz, Martin Klein-Hennig), for details see:
M. Dietz, S. D. Ewert, and V. Hohmann. Auditory model based direction estimation of concurrent speakers
from binaural signals. Speech Communication, 53(5):592-605, 2011.
- 'MFB2.m' from Stephan D. Ewert and T. Dau
- 'moving_average.m' from Christian Kothe (Code available at Matlab's File Exchange
https://www.mathworks.com/matlabcentral/fileexchange/34567-fast-moving-average)
......@@ -3,11 +3,6 @@
The intrusive Combined (monaural and binaural) Audio Quality Model (Fleßner et al., 2019) combines monaural audio quality
predictions based on GPSMq (Biberger et al., 2018) and binaural audio quality predictions based on BAM-Q (Fleßner et al., 2017) to
give overall audio quality predictions.
It requires the clean and the distorted signal as input to estimate perceived binaural quality differences.
Stereo signals ([left channel right channel]) are required as model input. A stimuli level of
0 dB FS corresponds to a sound pressure level of 115 dB SPL.
Clean and distorted signals have to be the same length and temporal aligned. The input signals are required to
have a duration of at least 0.4 seconds.
'Example_combAudioQual.m' gives a minimal example how to use the Combined Audio Quality Model in Matlab.
......@@ -19,7 +14,6 @@ The GPSMq output provides four submeasures:
Distortions resulting in SNRs below the lower boundary are assumed to be
imperceptible, while distortions causing large SNRs that exceed the upper limit are
assumed to lead to a fixed (poor) quality.
- **OPM_raw**: ...identical to 'OPM' but without lower and upper boundary
- **SNR_dc**: ....power-based SNR; based on temporal averaging and combination across auditory channels
- **SNR_ac**: ....envelope-power-based SNR; based on temporal averaging and combination across auditory and
modulation channels
......@@ -38,23 +32,28 @@ The BAM-Q output provides four submeasures:
A more detailed description of the Combined Audio Quality Model is given in:
J.-H. Fleßner, R. Huber, and S. D. Ewert, "Assessment and Prediction of Binaural Aspects of Audio Quality",
Journal of the Audio Engineering Society, vol. 65, no.11, PP.929-942. 2017. https://doi.org/10.17743/jaes.2017.0037
J.-H. Fleßner, T. Biberger, and S. D. Ewert, "Subjective and Objective Assessment of Monaural and Binaural Aspects of Audio Quality",
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no.7, PP.1112-1125. 2019. https://doi.org/10.1109/TASLP.2019.2904850
**Abstract:**
Binaural or spatial presentation of audio signals has become increasingly important in
consumer sound reproduction, but also for hearing assistive devices like hearing aids, where
signals in both ears might undergo heavy signal processing. Such processing might introduce
distortions to the interaural signal properties that affect perception. Here, an approach for
intrusive binaural auditory-model-based quality prediction (BAM-Q) is introduced. BAM-Q
uses a binaural auditory model as front-end to extract the three binaural features interaural
level difference, interaural time difference, and a measure of interaural coherence. The current
approach focuses on the general applicability (with respect to binaural signal differences) of
the binaural quality model to arbitrary binaural audio signals. Thus, two listening experiments
were conducted to subjectively measure the influence of these binaural features and their
combinations on binaural quality perception. The results were used to train BAM-Q. Two
different hearing aid algorithms were used to evaluate the performance of the model. The
correlations between subjective mean ratings and model predictions are higher than 0.9.
Recently, the binaural auditory-model-based quality
prediction (BAM-Q) was successfully applied to predict binaural
audio quality degradations, while the generalized power-spectrum
model for quality (GPSMq) has been demonstrated to account for a
large variety of monaural signal distortions.For many applications,
a combinedmonaural and binaural model would be advantageous,
however, the contribution of monaural and binaural quality aspects
to overall (spatial) quality is not conclusively clarified. Thus,
the current study systematically investigated overall audio quality
in a listening experiment for monaural and binaural distortions
on music, speech, and noise, applied either in isolation or in combination.
The resulting database was used for assessing different
methods for combining BAM-Q and GPSMq to joint overall audio
predictions for monaural and binaural signal distortions. It was
investigated, if monaural or binaural quality aspects contribute
stronger to overall audio quality. The results indicate that overall
audio quality depends on the lower quality aspect, eithermonaural
or binaural.
Authors of the Matlab implementation of the Combined Audio Quality Model:
......@@ -77,9 +76,9 @@ Attribution - You must attribute the Combined Audio Quality Model distribution b
The attribution must not in any way that suggests that the author
endorse you or your use of the work.
Noncommercial - You may not use Combined Audio Quality Model for commercial purposes.
Noncommercial - You may not use the Combined Audio Quality Model for commercial purposes.
No Derivative Works - You may not alter, transform, or build upon BAM-Q.
No Derivative Works - You may not alter, transform, or build upon the Combined Audio Quality Model.
Exceptions are the following external Matlab functions (see their respective licence)
that were used within the Combined Audio Quality Model: