diff --git a/README.asv b/README.asv
new file mode 100644
index 0000000000000000000000000000000000000000..2d31a7d1df3d03b16ecb659b6e8498f86151a399
--- /dev/null
+++ b/README.asv
@@ -0,0 +1,102 @@
+### General remarks
+
+The intrusive Combined (monaural and binaural) Audio Quality Model (Fleßner et al., 2019) combines monaural audio quality
+predictions based on GPSMq (Biberger et al., 2018) and binaural audio quality predictions based on BAM-Q (Fleßner et al., 2017) to
+give overall audio quality predictions.
+It requires the clean and the distorted signal as input to estimate perceived binaural quality differences.
+Stereo signals ([left channel right channel]) are required as model input. A stimuli level of
+0 dB FS corresponds to a sound pressure level of 115 dB SPL.
+Clean and distorted signals have to be the same length and temporal aligned. The input signals are required to
+have a duration of at least 0.4 seconds.
+
+
+'Example_combAudioQual.m' gives a minimal example how to use the Combined Audio Quality Model in Matlab.
+
+
+The GPSMq output provides four submeasures:
+- **OPM**: ...objective perceptual measure; based on a combination of 'SNR_dc' and 'SNR_ac'
+ to which a logarithmic transformation with an lower and upper boundary is applied.
+ Distortions resulting in SNRs below the lower boundary are assumed to be
+ imperceptible, while distortions causing large SNRs that exceed the upper limit are
+ assumed to lead to a fixed (poor) quality.
+- **OPM_raw**: ...identical to 'OPM' but without lower and upper boundary
+- **SNR_dc**: ....power-based SNR; based on temporal averaging and combination across auditory channels
+- **SNR_ac**: ....envelope-power-based SNR; based on temporal averaging and combination across auditory and
+ modulation channels
+
+The BAM-Q output provides four submeasures:
+- **binQ**: ... binaural quality measure; based on a combination of of the submeasures that represent differences
+ between the reference and the test signal for interaural level differences (ILDdiff),
+ interaural time/phase differences (ITDdiff) and the interaural vector strength ('IVSdiff').
+ - 100 ... no difference
+ - 0 ... large difference
+ - -X ... even larger difference
+- **ILDdiff** ... intermediate ILD measure
+- **ITDdiff** ... intermediate ITD measure (can be 0 if ITDs are not evaluable)
+- **IVSdiff** ... intermediate IVS measure
+
+
+A more detailed description of the Combined Audio Quality Model is given in:
+
+J.-H. Fleßner, T. Biberger, and S. D. Ewert, "S",
+Journal of the Audio Engineering Society, vol. 65, no.11, PP.929-942. 2017. https://doi.org/10.17743/jaes.2017.0037
+
+**Abstract:**
+Recently, the binaural auditory-model-based quality
+prediction (BAM-Q) was successfully applied to predict binaural
+audio quality degradations, while the generalized power-spectrum
+model for quality (GPSMq) has been demonstrated to account for a
+large variety of monaural signal distortions.For many applications,
+a combinedmonaural and binaural model would be advantageous,
+however, the contribution of monaural and binaural quality aspects
+to overall (spatial) quality is not conclusively clarified. Thus,
+the current study systematically investigated overall audio quality
+in a listening experiment for monaural and binaural distortions
+on music, speech, and noise, applied either in isolation or in combination.
+The resulting database was used for assessing different
+methods for combining BAM-Q and GPSMq to joint overall audio
+predictions for monaural and binaural signal distortions. It was
+investigated, if monaural or binaural quality aspects contribute
+stronger to overall audio quality. The results indicate that overall
+audio quality depends on the lower quality aspect, eithermonaural
+or binaural.
+
+
+Authors of the Matlab implementation of the Combined Audio Quality Model:
+
+- jan-hendrik.flessner@jade-hs.de
+- thomas.biberger@uni-oldenburg.de
+
+===============================================================================
+### License and permissions
+===============================================================================
+
+Unless otherwise stated, the Combined Audio Quality Model distribution, including all files is licensed
+under Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International
+(CC BY-NC-ND 4.0).
+In short, this means that you are free to use and share (copy, distribute and
+transmit) the Combined Audio Quality Model distribution under the following conditions:
+
+Attribution - You must attribute the Combined Audio Quality Model distribution by acknowledgement of
+ the author if it appears or if was used in any form in your work.
+ The attribution must not in any way that suggests that the author
+ endorse you or your use of the work.
+
+Noncommercial - You may not use the Combined Audio Quality Model for commercial purposes.
+
+No Derivative Works - You may not alter, transform, or build upon the Combined Audio Quality Model.
+
+Exceptions are the following external Matlab functions (see their respective licence)
+that were used within the Combined Audio Quality Model:
+- Gammatone filterbank from V. Hohmann (https://zenodo.org/record/2643400#.XQsf5TnVLCM), for details see[1,2]:
+ [1] Hohmann, V. (2002). Frequency analysis and synthesis using a Gammatone filterbank.
+ Acta Acustica united with Acustica, 88(3), 433-442.
+ [2] Herzke, T., & Hohmann, V. (2007). Improved numerical methods for gammatone filterbank analysis and
+ synthesis. Acta acustica united with acustica, 93(3), 498-500.
+- Code snipets from the Dietz Modell (Authors: Mathias Dietz, Martin-Klein Hennig), for details see:
+ M. Dietz, S. D. Ewert, and V. Hohmann. Auditory model based direction estimation of concurrent speakers
+ from binaural signals. Speech Communication, 53(5):592-605, 2011.
+- 'MFB2.m' from Stephan D. Ewert and T. Dau
+- 'moving_average.m' from Christian Kothe (Code available at Matlab's File Exchange
+ https://www.mathworks.com/matlabcentral/fileexchange/34567-fast-moving-average)
+ 
\ No newline at end of file
diff --git a/README.md b/README.md
index a102eb1059a2c8ab6db2d024a19e2244c399ab60..ebdad480a01df5ceccbb3364ab6ebf6a99ae2d5a 100644
--- a/README.md
+++ b/README.md
@@ -3,11 +3,6 @@
 The intrusive Combined (monaural and binaural) Audio Quality Model (Fleßner et al., 2019) combines monaural audio quality
 predictions based on GPSMq (Biberger et al., 2018) and binaural audio quality predictions based on BAM-Q (Fleßner et al., 2017) to
 give overall audio quality predictions.
-It requires the clean and the distorted signal as input to estimate perceived binaural quality differences.
-Stereo signals ([left channel right channel]) are required as model input. A stimuli level of
-0 dB FS corresponds to a sound pressure level of 115 dB SPL.
-Clean and distorted signals have to be the same length and temporal aligned. The input signals are required to
-have a duration of at least 0.4 seconds.
 
 
 'Example_combAudioQual.m' gives a minimal example how to use the Combined Audio Quality Model in Matlab.
@@ -19,7 +14,6 @@ The GPSMq output provides four submeasures:
  Distortions resulting in SNRs below the lower boundary are assumed to be
  imperceptible, while distortions causing large SNRs that exceed the upper limit are
  assumed to lead to a fixed (poor) quality.
-- **OPM_raw**: ...identical to 'OPM' but without lower and upper boundary
 - **SNR_dc**: ....power-based SNR; based on temporal averaging and combination across auditory channels
 - **SNR_ac**: ....envelope-power-based SNR; based on temporal averaging and combination across auditory and
  modulation channels
@@ -38,23 +32,28 @@ The BAM-Q output provides four submeasures:
 
 A more detailed description of the Combined Audio Quality Model is given in:
 
-J.-H. Fleßner, R. Huber, and S. D. Ewert, "Assessment and Prediction of Binaural Aspects of Audio Quality",
-Journal of the Audio Engineering Society, vol. 65, no.11, PP.929-942. 2017. https://doi.org/10.17743/jaes.2017.0037
+J.-H. Fleßner, T. Biberger, and S. D. Ewert, "Subjective and Objective Assessment of Monaural and Binaural Aspects of Audio Quality",
+IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no.7, PP.1112-1125. 2019. https://doi.org/10.1109/TASLP.2019.2904850
 
 **Abstract:**
-Binaural or spatial presentation of audio signals has become increasingly important in
-consumer sound reproduction, but also for hearing assistive devices like hearing aids, where
-signals in both ears might undergo heavy signal processing. Such processing might introduce
-distortions to the interaural signal properties that affect perception. Here, an approach for
-intrusive binaural auditory-model-based quality prediction (BAM-Q) is introduced. BAM-Q
-uses a binaural auditory model as front-end to extract the three binaural features interaural
-level difference, interaural time difference, and a measure of interaural coherence. The current
-approach focuses on the general applicability (with respect to binaural signal differences) of
-the binaural quality model to arbitrary binaural audio signals. Thus, two listening experiments
-were conducted to subjectively measure the influence of these binaural features and their
-combinations on binaural quality perception. The results were used to train BAM-Q. Two
-different hearing aid algorithms were used to evaluate the performance of the model. The
-correlations between subjective mean ratings and model predictions are higher than 0.9.
+Recently, the binaural auditory-model-based quality
+prediction (BAM-Q) was successfully applied to predict binaural
+audio quality degradations, while the generalized power-spectrum
+model for quality (GPSMq) has been demonstrated to account for a
+large variety of monaural signal distortions.For many applications,
+a combinedmonaural and binaural model would be advantageous,
+however, the contribution of monaural and binaural quality aspects
+to overall (spatial) quality is not conclusively clarified. Thus,
+the current study systematically investigated overall audio quality
+in a listening experiment for monaural and binaural distortions
+on music, speech, and noise, applied either in isolation or in combination.
+The resulting database was used for assessing different
+methods for combining BAM-Q and GPSMq to joint overall audio
+predictions for monaural and binaural signal distortions. It was
+investigated, if monaural or binaural quality aspects contribute
+stronger to overall audio quality. The results indicate that overall
+audio quality depends on the lower quality aspect, eithermonaural
+or binaural.
 
 
 Authors of the Matlab implementation of the Combined Audio Quality Model:
@@ -77,9 +76,9 @@ Attribution - You must attribute the Combined Audio Quality Model distribution b
  The attribution must not in any way that suggests that the author
  endorse you or your use of the work.
 
-Noncommercial - You may not use Combined Audio Quality Model for commercial purposes.
+Noncommercial - You may not use the Combined Audio Quality Model for commercial purposes.
 
-No Derivative Works - You may not alter, transform, or build upon BAM-Q.
+No Derivative Works - You may not alter, transform, or build upon the Combined Audio Quality Model.
 
 Exceptions are the following external Matlab functions (see their respective licence)
 that were used within the Combined Audio Quality Model: