
Participants watched videos of a person using the Flops (left panel of Figure 4.4), paired with the different sounds. The results showed that participants rated natural sounds with a lower spectral centroid more positively than synthetic sounds with a higher spectral centroid (the finding that higher frequencies make sounds less pleasant has been reported in many studies; see Chapter 2).

A selection of these sounds was used in a second experiment, in which users interacted with the Flops to perform several tasks, with the device being more or less usable, and we observed the respective contributions of the sounds and of the usability to the feelings reported by the users. The results showed that the sounds mainly influenced the valence of the emotions (although to a lesser extent than the usability of the interface, and to a lesser extent than in the passive listening experiment), whereas the usability of the device also influenced the arousal and dominance dimensions of the users’ feelings. Overall, these results show that the aesthetics of the sounds contribute to the valence of the user experience of an interface, even when compared to other, larger manipulations. The results also suggested that pleasant sounds made the task feel slightly easier, and left the users feeling more in control. These results therefore argue that pleasant sounds in computer interfaces are desirable! Naturalness (defined here as whether the sounds were recordings of physical events or synthetic beeps) had little influence in comparison to the other factors. Another study conducted by Susini et al. (2012) suggested that what is in fact important is the perceived congruence between the user’s gestures and the resulting sounds.

These findings relate to ideomotor theory (Shin et al., 2010). The central tenet of this theory is that the motor plans specifying an action are associated in memory with the consequences of that action, in a distributed and bidirectional relationship (encompassing any perceivable consequences of an action, proximal or distal, including proprioceptive, haptic, visual, or auditory consequences).

Activating any element of such a distributed representation may activate the whole representation: for example, activating the sensory consequences of an action may activate the motor plans producing that action and thus trigger or prime that action.

Whereas most of the literature has used visual stimuli, our study demonstrates that ideomotor concepts also apply to sounds. In addition, it offers perspectives on the fate of auditory-motor associations: how they are created, and how they persist and disappear over time. Overall, the results demonstrate that the gesture-sound associations mediating auditory-motor priming are extremely plastic: they can be formed very quickly in the context of the experiment and reconfigured just as quickly when the contingent association is violated, even if they had been established through life-long experience (we found little evidence for any special status of ecological associations).

Altogether, these results are strong indications of a type of memory representation that is flexible, in which older items are quickly replaced by new ones, akin to the traditional concept of short-term working memory and consistent with the properties of the dorsal stream of sensory processing.

From auditory-motor associations to sonic interactions

Can auditory-motor associations explain the results of the Spinotron study? The ideomotor theory is completely consistent with these results: manipulating the Spinotron created strong auditory-motor associations in the participants, which in turn helped them control the device. Thinking of the target sound (the regular tick-tack of the ratchet spinning at the target pace) would have activated the corresponding motor plans, without the need to explicitly learn the dynamics of the model controlling the ratchet sound. For the group that received the audio feedback, the target was indicated auditorily during the training phase, and the audio feedback was also present during the test phase; this probably created strong auditory-motor associations.

For the other group, the target was indicated visually during the training phase, but the visual feedback was absent during the test phase (only haptic and proprioceptive feedback was thus present). This probably led to visuo-motor associations that were too weak to be usable. However, our results do not show that auditory feedback would generally be more effective than visual feedback. Interestingly, the Spinotron study showed the same rate of learning whether the sound feedback was generated by an ecological model or by a somewhat more arbitrary model. This is consistent with the idea that auditory-motor associations do not rely on long-term memory but are created on-line and are extremely plastic.

From an applied perspective, these results indicate that sonic interactions can benefit from auditory-motor associations, but that these associations must be created and reinforced on-line, and cannot rely only on past experience (e.g. ecological associations). This gives freedom to the designers!

On the influence of naturalness

The Flops used sounds with different degrees of “naturalness”, ranging from the natural consequences of the actions to arbitrary abstract beeps. Overall, listeners found natural sounds more pleasant, but this result has to be qualified. First, the usability of the interface influences the perception of naturalness: sounds are perceived as less natural when the interface is malfunctioning. Second, “natural” sounds do not make the manipulation feel more natural. It is the perceived congruence between a user’s gesture and the resulting sound that makes the whole experience feel natural.

More generally, the pleasantness of the sounds appears to be a more important factor. Relationships between auditory features and pleasantness (see Chapter 2) still hold in interactive contexts. In particular, sharpness, measured by the spectral centroid, has a strong influence on the emotions felt by the users. This feature therefore appears to be a strong predictor of the users’ reactions to sounds, independently of whether they are passively listening to the sounds or generating the sounds while interacting with an object: high spectral centroids (and/or synthetic sounds) induce negative emotions, and low spectral centroids (and/or natural sounds) induce positive emotions. Such results have been reported in many other studies (see for example Juslin and Timmers, 2010; Kumar et al., 2008; Takada et al., 2010; Västfjäll et al., 2003; Zwicker and Fastl, 1990).
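For the interested reader, the spectral centroid is simply the amplitude-weighted mean frequency of a sound’s magnitude spectrum, so sharper sounds yield higher values. The following minimal Python sketch illustrates the computation; it is not the analysis code used in the studies reported here, and the file name sound.wav and the single whole-signal estimate are assumptions made for the example.

    import numpy as np
    from scipy.io import wavfile

    def spectral_centroid(signal, sample_rate):
        """Amplitude-weighted mean frequency (in Hz) of the magnitude spectrum."""
        magnitudes = np.abs(np.fft.rfft(signal))                 # magnitude spectrum
        frequencies = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
        return float(np.sum(frequencies * magnitudes) / np.sum(magnitudes))

    # Hypothetical usage: "sound.wav" stands in for any recorded sound.
    sample_rate, signal = wavfile.read("sound.wav")
    if signal.ndim > 1:                                          # average stereo channels
        signal = signal.mean(axis=1)
    print(f"Spectral centroid: {spectral_centroid(signal, sample_rate):.1f} Hz")

In practice the centroid is often computed frame by frame and averaged over the sound’s duration; the single whole-signal estimate above is enough to convey the definition.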


Chapter 5

Vocal imitations of everyday sounds

Most of the studies reported so far have collected listeners’ descriptions of sounds (for interpreting perceptual dimensions, categories, or documenting interactions with prototypes). One observation was striking: very often, participants ran out of words and started imitating the sounds, vocally, but also with a lot of gestures, facial expressions, and pantomimes. In fact, French speakers usually have a very limited vocabulary specific to sounds, and thus rely on other linguistic devices to communicate about sounds (Faure, 2000). I conducted several studies over the years to investigate whether vocal imitations can communicate what they imitate, and how they do so.

I supervised the master’s theses of Karine Aura-Rey and Arnaud Dessein, who conducted two initial studies of vocal imitations of everyday sounds. I then spent one year at the University Iuav of Venice with Davide Rocchesso to gather preliminary results confirming the initial intuitions. The results clearly showed that imitations were the most natural way for most people to communicate about their auditory experience, and thus had great potential for sound designers: vocal imitations could be the equivalent of the architect’s sketchpad and pencil for sound designers. We were then lucky enough to be awarded a grant from the European Commission for the SkAT-VG project. During this project, I led the research group focusing on perception and cognition. We studied what listeners imitate and how they imitate it. In particular, I supervised the master’s theses of Ali Jabbari and Hugo Scurto. We also explored how to use vocalizations and gestures to create intuitive tools for sound designers.

5.1 Imitating sounds to communicate them

Imitations are extremely important during infants’ development and learning of skills, customs, and behaviors (Chartrand and Bargh, 1999; Heyes, 2001; Meltzoff and Prinz, 2002), and occur in all sorts of situations in adults (matching postures, mannerisms, facial expressions, phonetic features, etc.). But can human vocalizations convincingly reproduce sounds made by non-human sources?

The short answer is yes. An important piece of evidence comes from linguistics. Although the most commonly agreed-upon view is that the relationship between signifier and signified (i.e. words and their meaning) is arbitrary (de Saussure, 1916), spoken languages also contain numerous instances of sound symbolism, wherein the sound of a word is perceptually evocative of its meaning. Onomatopoeias (standardized words that mimic the sound of the object they refer to), ideophones (words that evoke sounds, movements, colors, or shapes by means of a similarity between the sound of the word and the idea it refers to), and phonesthemes (sublexical units referring to higher-level attributes of meaning, e.g. “gl”, as in “glitter”, “glow”, “gleam”, etc., relates to “vision” and “light”) are classical examples of words evoking some aspects of non-human sounds and other sensory impressions (Sobkowiak, 1990; Assaneo et al., 2011; Schmidtke et al., 2014; Blasi et al., 2016).

Onomatopoeias, ideophones, and phonesthemes are words (or parts of words), and as such are constrained not only by what they refer to, but also by the linguistic system they belong to. Borrowing the words of Rhodes (1994), they are “tame” imitations, as opposed to “wild” imitations, created by speakers on the spot. In comparison, wild vocal imitations have only rarely been studied (see for example Perlman et al., 2015).

In a first case study of wild vocal imitations of sounds, conducted during Karine Aura’s master’s thesis (whose data were published later in Lemaitre et al., 2014), we observed how French speakers discuss sounds. We observed pairs of participants with limited musical or audio expertise. During the study, one participant listened to different series of sounds (in isolation), and then had to communicate one target sound in the series to the other participant. The task of the second participant was to recover the target sound. We did not specify anything about what to say, do, or not do, and the participants were free to discuss as long as they liked. The goal was to observe what they would spontaneously use. We used sounds that were very easy or very difficult to identify. Conversations were manually annotated for the presence of vocal imitations. Here is an example of such a conversation:

- “It sounds as if you would take a piece of corrugated cardboard. First you scrape it, then you tear it off, and it sounds like Rrrrr off the cardboard. You see? First, Fffff, and then Rrrrr.”
- “Oh I see: Fffff then Rrrr.”

Vocal imitations were commonly used: they were present in 59.3% of the conversations.

Imitative gestures were even more frequent: they occurred in 79.6% of the conversations.

