### General remarks

The intrusive audio quality model MoBi-Qadd (Biberger et al., 2021) is a modified version of MoBi-Q (Fleßner et al., 2019, see https://gitlab.uni-oldenburg.de/kuxo2262/combinedaudioqualitymodel).
It predicts overall audio quality for monaurally, binaurally, and combined monaurally and binaurally distorted speech, music, and noise signals. To do so, it combines monaural audio quality predictions
based on GPSM<sup>q</sup> (Biberger et al., 2018) with binaural audio quality predictions based on BAM-Q (Fleßner et al., 2017). Speech and music signals with
distortions introduced by the hear-through mode of hearables (Schepker et al., 2020) were used to optimize the combination of the monaural and binaural model outputs. The MoBi-Qadd quality scores
range from 0 (very strong differences between reference and test signals) to 1 (no perceptible differences).

'Example_MoBiQadd.m' gives a minimal example of how to use MoBi-Qadd in Matlab.


The GPSM<sup>q</sup> output provides the following submeasures:
- **OPM**:          ... objective perceptual measure; based on a combination of 'SNR_dc' and 'SNR_ac'
                    to which a logarithmic transformation with a lower and an upper boundary is applied.
                    Distortions resulting in SNRs below the lower boundary are assumed to be
                    imperceptible, while distortions causing large SNRs that exceed the upper boundary are
                    assumed to lead to a fixed (poor) quality.
- **SNR_dc**:       ... power-based SNR; based on temporal averaging and combination across auditory channels
- **SNR_ac**:       ... envelope-power-based SNR; based on temporal averaging and combination across auditory and
                    modulation channels
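
The bounded logarithmic transformation described for OPM can be illustrated with a small Python sketch. It is not part of the Matlab distribution: the function name `bounded_log_snr`, the boundary values, and the use of `10*log10` are assumptions chosen for illustration, and the rule that combines 'SNR_dc' and 'SNR_ac' is not reproduced here. The sketch only shows that SNRs below the lower boundary all map to one value and SNRs above the upper boundary all map to another.

```python
import math


def bounded_log_snr(snr_linear, lower=0.1, upper=100.0):
    """Clamp a linear SNR to [lower, upper] and return it in dB.

    `lower` and `upper` are hypothetical placeholder boundaries,
    not the values used by GPSM^q. Following the README, a larger
    SNR here corresponds to a stronger distortion.
    """
    if lower <= 0 or upper <= lower:
        raise ValueError("require 0 < lower < upper")
    # Everything below `lower` or above `upper` is mapped onto the boundary.
    clamped = min(max(snr_linear, lower), upper)
    return 10.0 * math.log10(clamped)


if __name__ == "__main__":
    for snr in (0.001, 0.1, 1.0, 100.0, 1e6):
        print(f"SNR {snr:>10g} -> {bounded_log_snr(snr):6.1f} dB")
```

With these placeholder boundaries, every SNR at or below 0.1 yields the same output, and every SNR at or above 100 yields the same output, which mirrors the "imperceptible" and "fixed (poor) quality" regions described above.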

The BAM-Q output provides four submeasures:
- **binQ**:         ... binaural quality measure; based on a combination of the submeasures that represent differences
                    between the reference and the test signal for interaural level differences ('ILDdiff'),
                    interaural time/phase differences ('ITDdiff'), and the interaural vector strength ('IVSdiff').
    - 100 ... no difference
    - 0   ... large difference
    - -X  ... even larger difference
- **ILDdiff**:      ... intermediate ILD measure
- **ITDdiff**:      ... intermediate ITD measure (can be 0 if ITDs are not evaluable)
- **IVSdiff**:      ... intermediate IVS measure
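
As a reading aid for the binQ scale, the following Python sketch maps a binQ value to the verbal anchors listed above. It is not part of the Matlab distribution: the function name `describe_binq` is hypothetical, the label for values between 0 and 100 is an assumed wording, and treating values above 100 like 100 is also an assumption.

```python
def describe_binq(binq):
    """Return the verbal anchor from the README for a binQ value."""
    if binq >= 100:
        # Assumption: anything at or above 100 is read as "no difference".
        return "no difference"
    if binq > 0:
        # Assumed label for the range between the two documented anchors.
        return "between no difference and large difference"
    if binq == 0:
        return "large difference"
    # Negative values are documented as "even larger difference".
    return "even larger difference"


if __name__ == "__main__":
    for value in (100, 60, 0, -25):
        print(value, "->", describe_binq(value))
```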




A more detailed description of the MoBi-Qadd is given in:

T. Biberger, H. Schepker, F. Denk, and S. D. Ewert, "Instrumental quality predictions and analysis of auditory
cues for algorithms in modern headphone technology", Trends in Hearing, vol. xy, no.xy, PP.xy-xy. 2021. 

J.-H. Fleßner, T. Biberger, and S. D. Ewert, "Subjective and Objective Assessment of Monaural and Binaural Aspects of Audio Quality",
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 7, pp. 1112-1125, 2019. https://doi.org/10.1109/TASLP.2019.2904850


Authors of the Matlab implementation of the MoBi-Qadd: 

- jan-hendrik.flessner@jade-hs.de
- thomas.biberger@uni-oldenburg.de 

===============================================================================
### License and permissions
===============================================================================

Unless otherwise stated, the MoBi-Qadd distribution, including all files, is licensed
under Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International
(CC BY-NC-ND 4.0).
In short, this means that you are free to use and share (copy, distribute and
transmit) the MoBi-Qadd distribution under the following conditions:

Attribution - You must attribute the MoBi-Qadd distribution by acknowledgement of
              the author if it appears or if it was used in any form in your work.
              The attribution must not in any way suggest that the author
              endorses you or your use of the work.

Noncommercial - You may not use the MoBi-Qadd for commercial purposes.
 
No Derivative Works - You may not alter, transform, or build upon the MoBi-Qadd.

Exceptions are the following external Matlab functions (see their respective licences)
that were used within the MoBi-Qadd:
- Gammatone filterbank from V. Hohmann (https://zenodo.org/record/2643400#.XQsf5TnVLCM), for details see [1,2]:
   [1] Hohmann, V. (2002). Frequency analysis and synthesis using a Gammatone filterbank.
       Acta Acustica united with Acustica, 88(3), 433-442.
   [2] Herzke, T., & Hohmann, V. (2007). Improved numerical methods for gammatone filterbank analysis and
       synthesis. Acta Acustica united with Acustica, 93(3), 498-500.
- Code snippets from the Dietz model (Authors: Mathias Dietz, Martin Klein-Hennig), for details see:
      M. Dietz, S. D. Ewert, and V. Hohmann. Auditory model based direction estimation of concurrent speakers
      from binaural signals. Speech Communication, 53(5):592-605, 2011.
- 'MFB2.m' from Stephan D. Ewert and T. Dau
- 'moving_average.m' from Christian Kothe (Code available at Matlab's File Exchange
   https://www.mathworks.com/matlabcentral/fileexchange/34567-fast-moving-average)