2.2 Digits as speech test material

2.2.1 The anatomy of a DTT

In 2015, the International Collegium of Rehabilitative Audiology (ICRA) published official recommendations for developing DTTs in different languages (Akeroyd et al., 2015). The aim of the guidelines was to ensure that the different language versions of DTTs are similar and enable cross-language comparison. According to the recommendations, the digits included in the test should have a balanced number of syllables, and each test list should contain each digit three times at each position within the triplet. A short announcement phrase before each digit is recommended to help focus attention on the triplet. The speaker should use natural intonation, standard pronunciation, and constant vocal effort. Each digit is to be recorded in each position, the material cut, and new triplets formed by combining the position-specific recordings, with equal pauses added between the digits. The ICRA also recommends using masking noise with the same long-term spectrum as the speech material, preferably generated by randomly superimposing the test material.
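The balancing requirement for a test list can be illustrated with a short sketch. It is only one possible construction, not the procedure prescribed by the guidelines: the function name `balanced_triplet_list`, the cyclic-shift approach, and the rule that a digit never repeats within a triplet are assumptions made for this example.

```python
import random


def balanced_triplet_list(digits, repeats=3, rng=None):
    """Build a list in which every digit occurs `repeats` times at each position.

    Assumption for this sketch: a digit is never repeated within one triplet.
    """
    rng = rng or random.Random()
    n = len(digits)
    if n < 3:
        raise ValueError("need at least three distinct digits")
    triplets = []
    for _ in range(repeats):
        order = list(digits)
        rng.shuffle(order)
        # Two distinct, non-zero cyclic shifts: in each round every digit
        # appears exactly once per position and never twice in a triplet.
        shift2, shift3 = rng.sample(range(1, n), 2)
        for k in range(n):
            triplets.append(
                (order[k], order[(k + shift2) % n], order[(k + shift3) % n])
            )
    rng.shuffle(triplets)  # randomize presentation order
    return triplets


if __name__ == "__main__":
    # Hypothetical digit set; the digits actually used depend on the language.
    for triplet in balanced_triplet_list([0, 1, 2, 3, 4, 5, 6, 8, 9])[:5]:
        print(triplet)
```

With nine digits and three repetitions, the sketch yields 27 triplets; other digit sets give lists of other lengths.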

The intelligibility and homogeneity of the test material should be optimized through level corrections to individual digits and, if necessary, by omitting non-optimal recordings. The optimized test material should be evaluated with separate NH participants to confirm that all test lists are equal in intelligibility. The evaluation measurements can be used to assess the mean SRT and slope and their standard deviations for NH listeners, as well as the test-retest reliability. Finally, the test should be validated with HI listeners in a multi-center study by comparing the results to pure tone average (PTA) and to other, validated measures of speech perception in noise. During the test, the digits should be clearly audible to most listeners at the initial SNR. DTTs use an adaptive test procedure where the next presentation level is based on the previous response: after a correct answer, the next triplet is presented at a more negative (i.e., more difficult) SNR, and after a wrong answer the SNR for the next triplet increases. A fixed step size of 2 dB SNR and use of triplet scoring are recommended. The recommendation is to calculate the SRT by averaging the SNRs from the 5th to the last trial (Akeroyd et al., 2015).
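The adaptive rule and the SRT calculation described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the starting SNR of 0 dB, the number of triplets, and all function names are assumptions made for the example.

```python
def triplet_correct(presented, response):
    """Triplet scoring: correct only if all three digits match in order."""
    return tuple(presented) == tuple(response)


def next_snr(snr_db, correct, step_db=2.0):
    """One-up/one-down rule: harder (lower SNR) after a correct answer,
    easier (higher SNR) after a wrong one."""
    return snr_db - step_db if correct else snr_db + step_db


def run_track(responses, start_snr_db=0.0, step_db=2.0):
    """Return the SNR at which each triplet was presented.

    `responses` holds one boolean per triplet (True = whole triplet correct).
    The starting SNR is a hypothetical value chosen for this sketch.
    """
    snrs = []
    snr = start_snr_db
    for correct in responses:
        snrs.append(snr)
        snr = next_snr(snr, correct, step_db)
    return snrs


def estimate_srt(presented_snrs, first_trial=5):
    """Average the presented SNRs from the 5th trial to the last trial."""
    tail = presented_snrs[first_trial - 1:]
    if not tail:
        raise ValueError("track is shorter than first_trial")
    return sum(tail) / len(tail)


if __name__ == "__main__":
    # Hypothetical response pattern for a short eight-triplet track.
    answers = [True, True, True, True, False, True, False, True]
    track = run_track(answers)
    print(track)                # SNR per presented triplet
    print(estimate_srt(track))  # mean of trials 5..8
```

The first four trials are excluded from the average because they mainly serve to bring the track from the clearly audible starting SNR down to the region around the listener's SRT.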

Most DTTs still adhere closely to the ICRA guidelines, but the test construction and protocols are continuously evolving to provide more accurate and reliable results (De Sousa, Swanepoel, et al., 2020). The first versions of DTTs typically only included monosyllabic digits to make the test material as uniform as possible, and to avoid the length of the digit aiding its recognition (Smits, Kapteyn and Houtgast, 2004; Wilson, Burks and Weakley, 2005). However, confusion matrix analysis on the Dutch digits-in-noise (DIN) test confirmed that bisyllabic digits were not recognized more easily due to their length, nor confused more often with each other than with other digits (Smits, Theo Goverts and Festen, 2013). It appears that the application of level corrections to individual digits is sufficient to control for differences in ease of recognition related to the length of an individual digit, and the latest versions of the DTTs have included all digits (Han et al., 2020; Motlagh Zadeh, Silbert, Sternasty, et al., 2021).

The ICRA guidelines recommend optimizing individual digits and combining these into triplets (Akeroyd et al., 2015). Many DTTs use recordings that aim to preserve prosody by using a different recording for each individual digit position in a triplet (Jansen et al., 2010; Zokoll et al., 2012; Vlaming et al., 2014; Han et al., 2020). However, Lyzenga and Smits (2011) showed that for triplets, prosody and coarticulation have a negligible effect on speech recognition. A single recording of each digit (instead of a separate recording for each position in a triplet) can, therefore, be used to construct the triplets, and such test versions report test-retest reliability comparable to that of DTT versions with preserved prosody (Smits, Theo Goverts and Festen, 2013; Potgieter et al., 2016). For some DTTs, the triplets were recorded as a complete set and optimized by applying level corrections to the whole triplet (Ozimek et al., 2009; Watson et al., 2012). In a new test version, the digits were created by artificial-intelligence-driven text-to-speech technology (Kropp et al., 2021).

Some variability exists in the choice of background noise. For some DTTs, the speech-shaped broadband noise is generated by filtering white noise to match the long-term averaged spectrum of the speech material (Smits, Theo Goverts and Festen, 2013; Potgieter et al., 2016; Motlagh Zadeh, Silbert, Sternasty, et al., 2021), while others use noise that is generated by repeatedly superimposing the speech material (Jansen et al., 2010; Zokoll et al., 2012). Equally good test-retest reliabilities have been reported for both background noise types. However, Lyzenga and Smits observed that frozen noise tokens, i.e., using the exact same piece of background noise recording for every presentation, increased learning effects, but only after multiple test runs within the same test session (Lyzenga and Smits, 2011).
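The superposition approach can be illustrated with a small, pure-Python sketch. It is a simplified illustration under stated assumptions, not the procedure of any cited study: the use of circular shifts with random offsets, the number of copies, the RMS matching, and the names `superposition_noise` and `rms` are all choices made for this example, and a real implementation would operate on the recorded digit waveforms.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def superposition_noise(speech, n_copies=100, rng=None):
    """Sum many randomly shifted copies of the speech material.

    Each circular shift keeps the magnitude spectrum of the material, so the
    sum has approximately the same long-term spectrum while the individual
    digits are no longer recognizable. The result is scaled to the RMS level
    of the input (an assumption of this sketch).
    """
    rng = rng or random.Random()
    n = len(speech)
    noise = [0.0] * n
    for _ in range(n_copies):
        offset = rng.randrange(n)  # random start point for this copy
        for i in range(n):
            noise[i] += speech[(i + offset) % n]
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        return noise  # silent input: nothing to scale
    gain = rms(speech) / noise_rms
    return [x * gain for x in noise]


if __name__ == "__main__":
    # Stand-in "speech": a short synthetic waveform instead of real recordings.
    speech = [math.sin(0.3 * i) + 0.5 * math.sin(0.11 * i) for i in range(400)]
    noise = superposition_noise(speech, n_copies=50, rng=random.Random(1))
    print(round(rms(speech), 4), round(rms(noise), 4))
```

Drawing a fresh random segment of such noise for every triplet avoids the frozen-token situation described above, in which the same noise excerpt is reused across presentations.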

Babble noise is commonly used as background noise in sentence-level speech perception tests (Killion, Niquette and Gudmundsen, 2004; Spahr et al., 2012), but only a few DTTs have used it (Wilson and Weakley, 2004; Moore, 2019). Fluctuating background noise is also rarely used as the standard background noise for DTTs (Van den Borre et al., 2021), even though fluctuating background noise has been shown to increase the spread of the SRTs for HI listeners (Wagener and Brand, 2005; Wagener, Brand and Kollmeier, 2006; Goossens et al., 2017).

Some DTT versions have used band-pass filtered noise as the background noise (Vlaming et al., 2014; Vercammen et al., 2018; Denys et al., 2019; Motlagh Zadeh et al., 2019; Motlagh Zadeh, Silbert, Swanepoel, et al., 2021). Studies with other speech perception tests in noise have shown low-pass (LP) filtered background noise to increase test sensitivity to high-frequency HL (Leensen et al., 2011; Jansen et al., 2014). The results for DTTs with filtered background noise have been mixed. Even though the use of LP filtered background noise increased the spread of SRTs for HI listeners, Vercammen et al. (2018) and Denys et al. (2019) detected no significant improvements in test efficiency. However, Vlaming et al. (2014) were able to increase sensitivity to high-frequency HL with the use of an LP filtered background noise. The inconsistent findings might be due to differences in the study population or in the spectral content of the digits. Motlagh Zadeh et al. (Motlagh Zadeh et al., 2019; Motlagh Zadeh, Silbert, Swanepoel, et al., 2021) successfully used LP filtered background noise to improve the DTT's sensitivity to extended high-frequency hearing loss, i.e., HL at 8–20 kHz.

The ICRA recommendations state that the digits should be clearly audible at the starting level. This is often accomplished by presenting practice triplets while instructing the test subject to adjust the volume setting to a comfortable level (Smits, Kramer and Houtgast, 2006; Potgieter et al., 2016; De Sousa et al., 2018). Wagener and Brand (2005) showed that for both NH and HI listeners, the absolute presentation level has little effect on the SRT over a large range of clinically representative presentation levels as long as the stimuli are clearly audible.

Most of the current DTTs use an adaptive test procedure with a fixed step size of 2 dB SNR and triplet scoring, as recommended by the ICRA (Jansen et al., 2010; Zokoll et al., 2012; Smits, Theo Goverts and Festen, 2013; Potgieter et al., 2016; Folmer et al., 2017). New test procedures have also been assessed to improve the efficiency of DTTs. Smits (2017) modelled a test procedure that uses a fixed presentation level and presents only the minimum number of triplets needed to reliably reach the pass/fail criterion. Based on the model calculations, the new test procedure reduced the number of triplets needed for reliable screening results. The drawback was that the procedure only categorizes listeners as pass or fail instead of producing exact SRTs, which would be more informative and better suited for monitoring hearing.

Another method that has been explored to improve the efficiency of DTTs is the use of variable step size and digit scoring. Denys et al. (2019) showed that digit scoring with a variable step size aiming at a 79% recognition score improved the efficiency of the Flemish DTT. However, they noted that digit scoring could slow the convergence of the adaptive procedure to the actual SRT. Therefore, the most efficient test procedure could consist of using triplet scoring for the first triplets to reach the correct SRT range, followed by a switch to digit scoring.

The NHT and its first adaptations to different languages were developed for landline telephones (Smits, Kapteyn and Houtgast, 2004; Jansen et al., 2010; Watson et al., 2012), but landline telephone versions were quickly replaced by internet and mobile device applications (Vlaming et al., 2014; Potgieter et al., 2016; De Sousa et al., 2018; Han et al., 2020; Ceccato et al., 2021). This improved the reliability and precision of the tests, as landline telephone networks had limited the test material bandwidth to 300–3400 Hz. Internet and mobile device applications use a broadband signal, which enables better detection of high-frequency hearing loss (Jansen et al., 2013).

Despite minor developmental and procedural differences, and major linguistic differences between the different language versions of the DTT, studies have reported similar, good reliability measures. For most test versions, the slopes of the intelligibility function are steep, i.e., 15–25%/dB SNR (Jansen et al., 2010; Potgieter et al., 2016; Giguère et al., 2020). A steep intelligibility function enables the detection of minor differences in speech-in-noise perception, and most DTT versions report measurement errors of less than 1.0 dB SNR for NH listeners. HI listeners have shallower psychometric functions (Smits and Houtgast, 2007; Smits and Festen, 2011), and the measurement errors reported for HI listeners have been larger than for NH listeners, though mostly still within 1.2 dB SNR (Smits and Houtgast, 2007; Kaandorp et al., 2015; de Graaff et al., 2018).
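The link between slope and sensitivity can be made concrete with a logistic intelligibility function. The logistic form is a common modelling choice and an assumption here, since the shape of the function is not specified in the text above; the SRT of -10 dB SNR and the slope values in the usage example are hypothetical.

```python
import math


def intelligibility(snr_db, srt_db, slope_per_db):
    """Logistic intelligibility function.

    Returns the proportion correct at `snr_db`. The function equals 0.5 at
    `srt_db`, and `slope_per_db` is its slope at that point, expressed as a
    proportion per dB (e.g. 0.20 for 20 %/dB).
    """
    return 1.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))


if __name__ == "__main__":
    srt = -10.0  # hypothetical SRT in dB SNR
    for slope in (0.20, 0.10):  # steeper vs. shallower function
        gain = intelligibility(srt + 1.0, srt, slope) - 0.5
        print(f"slope {slope:.2f}/dB: +1 dB SNR raises the score by {gain:.3f}")
```

For a steeper function, the same 1 dB change in SNR produces a larger change in the proportion correct, which is why small SRT differences are easier to resolve for listeners with steep functions than for listeners with shallow ones.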

2.2.2 Comparability to other hearing measures