Week 3
This week, the theory track will teach you more about the importance of multimodality in human perception.
The terminology track will focus on performer motion that does not produce sound: sound-modifying and sound-accompanying actions, as well as communicative movements and gestures.
Finally, in the methods track we will take a closer look at “motion capture” systems. Remember to use the dictionary.
3.3 Introduction to Multimodality
Multimodality is a term used to emphasise that our senses work together.
A modality refers to one of the channels through which we receive information, such as audition, vision, taste, balance and proprioception. In most cases the modalities confirm each other: they combine to strengthen our perception of a phenomenon. In particular cases, however, our mind perceives something that is not present in any single modality. An example of this is the McGurk effect, named after the psychologist who first described the phenomenon. You will see a demonstration of this effect in the next video.
Our cognition’s multimodal nature may explain why we easily project features of one modality onto another. When hearing a sound, we assume something about the sound production, even if we have never heard the sound before. What type of material may have caused this sound? What type of action was involved?
Furthermore, musical elements such as phrases, rhythmic patterns, melodies or harmonic progressions may give associations to shapes and movement. The same elements may even induce strong emotional reactions in the perceiver. But why is this so? Move on to learn more.
References
- McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746–748.
3.6 Can Body Movement Teach us Something about Music Perception?
As we have learned previously, the theory of embodied cognition tells us that our involvement with the world is based on the fact that we are humans with physical, moving bodies.
The same holds true for our involvement with music. The paradigm of embodied music cognition suggests that our bodies are integral to how we experience music. As such, we can learn a lot about how we perceive music by studying how we move to music.
Studying music cognition
How would you design an experiment to study people’s perception of music? There are many possible answers to this question. Let us consider two alternative methods.
Using qualitative interviews is one method to study how people experience music. Interviews make people reflect on their own behaviour, and therefore yield valuable subjective data. By asking people how they move to music, we force them to articulate and express how they perceive the music, albeit indirectly. Our experience is that people answer quite differently when asked “How did you move to this song?” rather than “What do you think about this song?”. Asking both questions may also reveal more about their musical experience than they would otherwise have shared.
However, talking about one’s own behaviour may be challenging. Few people have the vocabulary to express the nuances of their own bodily behaviour and perception of music. Furthermore, if the interview occurs after a concert, the person’s long-term memory becomes an important factor in shaping their answer.
A more quantitative approach is to ask people to use a device with a button or slider to indicate their excitement with the music. This removes the language barrier and can be carried out in real time. However, such an apparatus may not be easy to use in most natural music environments, such as a concert hall. Furthermore, by asking people to indicate their excitement, the researcher builds preconditions into the experiment, and many nuances of how people perceive the music are not considered at all.
Yet another method, and the one that we will focus most on in Music Moves, is that of “motion capture”. Moving to music is “easy” for most people, since they do not have to put their experience into words or perform any conscious task beyond moving as they normally would. Moreover, the close connection between perceiving and moving (as discussed in the video on perception: affordances, mirror neurons, etc.) suggests that movement to music may be a bodily expression of how you perceive the music. Marc Leman calls this corporeal articulations, which he discusses thoroughly in the book Embodied music cognition and mediation technology.
Suggested reading
- Leman, Marc. Embodied music cognition and mediation technology. MIT Press, 2008.
3.10 Introduction to Motion, Action and Gesture
This week’s terminology track focuses on actions that do not produce sound directly.
One example is sound-modifying actions, such as the feet of a pianist pressing the pedals of the piano. Another is the sound-accompanying movements one may find when people move to sound.
We may also talk about various types of communicative movements, such as between musicians, between a conductor and the musicians, or between musicians and audience members.
Finally, we may talk about gestures, movements or actions that are used to express some kind of meaning. More on all of this in the next video.
3.13 Introduction to Quantitative Movement Analysis
In this week’s methods track you will learn about quantitative movement analysis.
This typically includes the use of one or more motion capture systems, both camera-based and sensor-based. We will go through these systematically, and will also take a sneak peek into the motion capture lab at the University of Oslo.
3.15 Infrared Marker-based Motion Capture
Infrared marker-based motion capture is one subcategory of optical motion capture systems.
As discussed in the previous video, these systems may be very precise, with high resolution and high recording speeds. This type of technology is widely used for animation purposes in the film and gaming industries, and for medical purposes and rehabilitation. An increasing number of music researchers are now also making use of such systems in their studies of music and movement.
Cameras and markers
Infrared marker-based motion capture systems use reflective markers on the body or on an instrument. Such markers vary in size, and the best size to choose depends on the type of movement to be recorded. For instance, quite small markers are used to capture facial expressions, and larger markers can be used for full-body movement.
In order to “see” the markers, each of the mocap cameras emits infrared light. The light is reflected off the markers and sent back as a two-dimensional image to each camera. The computer can then determine the exact location of each marker in space by combining the images from the cameras, as sketched below.
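The geometric idea can be illustrated with a small sketch. This is not the algorithm used by any particular mocap system (real systems combine many cameras through a full calibration model); it only shows the core principle with two hypothetical cameras, where each camera contributes a ray from its position through the marker's image point, and the marker is estimated as the point closest to both rays.

```python
# Illustrative sketch: locating a marker from two camera rays.
# All positions and directions are hypothetical example values.

def closest_point_on_two_rays(p1, d1, p2, d2):
    """Midpoint of the shortest segment between rays p1 + t*d1 and p2 + s*d2."""
    def dot(a, b): return sum(x * y for x, y in zip(a, b))
    def sub(a, b): return [x - y for x, y in zip(a, b)]
    def add(a, b): return [x + y for x, y in zip(a, b)]
    def scale(v, k): return [x * k for x in v]

    r = sub(p1, p2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, r), dot(d2, r)
    denom = a * c - b * b          # zero only if the rays are parallel
    t = (b * e - c * d) / denom    # parameter of closest point on ray 1
    s = (a * e - b * d) / denom    # parameter of closest point on ray 2
    q1 = add(p1, scale(d1, t))
    q2 = add(p2, scale(d2, s))
    return scale(add(q1, q2), 0.5)

# Two hypothetical cameras both "seeing" a marker at (1, 2, 3):
marker = closest_point_on_two_rays(
    [0, 0, 0], [1, 2, 3],    # camera 1 at the origin, ray towards the marker
    [4, 0, 0], [-3, 2, 3],   # camera 2 further along the X axis
)
print([round(v, 3) for v in marker])  # → [1.0, 2.0, 3.0]
```

With more than two cameras, the same idea generalises to finding the point that best agrees with all the rays, which also makes the estimate robust when one camera's view is blocked.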
Three-dimensional motion capture data
Infrared mocap systems provide three-dimensional position data for each marker. The three dimensions are measured along the axes X, Y, and Z, and the orientation of these axes is determined when the system is calibrated. In a rectangular room, it often makes sense to let the axes run between opposing walls, and from floor to ceiling.
Recording motion data at a rate of 100 Hz means that 100 measurements are made per second, each with 3 data points (X, Y, Z) per marker. Considering that a full-body motion capture may require up to 30 (or even more) markers, we end up with a large amount of data. Software for simple processing and visualisation of the data is usually available from the mocap system provider. However, for music-related research it is often necessary to use analysis software that is tailored for our needs. One such example is the MoCap Toolbox from the University of Jyväskylä in Finland.
Data processing
Various processing of the recorded data is often needed. This may involve small adjustments to correct minor errors in the recording, or transformations of the data to calculate, for instance, the velocity or acceleration of the movement.
Occlusion: Gap-filling
One common problem with motion capture data is that a marker is “lost” during recording. This happens when a marker is occluded, or when it moves out of the cameras’ field of view. Small gaps in the marker data can easily be repaired with so-called “gap filling”, which interpolates between the closest data points to estimate the marker position in the gap, as shown below.
For longer gaps in the data it may be impossible to estimate the marker position accurately. That is why it is important to make the recordings as good as possible in the first place.
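The simplest form of gap filling is linear interpolation, which can be sketched in a few lines. This is only an illustration of the principle, not the algorithm of any particular mocap software (which typically offers spline-based and model-based filling as well). Missing samples are marked as None here.

```python
# Minimal sketch of gap filling by linear interpolation, for one
# coordinate of one marker. Missing samples are represented as None.

def fill_gaps(samples):
    """Fill interior runs of None by interpolating between the
    closest valid samples on either side of each gap."""
    filled = list(samples)
    i = 0
    while i < len(filled):
        if filled[i] is None:
            start = i - 1                      # last valid sample before the gap
            end = i
            while end < len(filled) and filled[end] is None:
                end += 1                       # first valid sample after the gap
            if start >= 0 and end < len(filled):
                gap_len = end - start
                for j in range(i, end):
                    frac = (j - start) / gap_len
                    filled[j] = filled[start] + frac * (filled[end] - filled[start])
            i = end                            # gaps at the edges are left unfilled
        else:
            i += 1
    return filled

print(fill_gaps([0.0, None, None, None, 8.0]))  # → [0.0, 2.0, 4.0, 6.0, 8.0]
```

Note that a gap at the very start or end of a recording has a valid sample on only one side, which is one reason long or edge gaps cannot be repaired reliably.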
Smoothing
Sometimes the recorded mocap data may be noisy, for example containing small random errors caused by poor lighting conditions or a bad calibration of the system. The noise level can be reduced by applying a smoothing filter to the data, as shown below.
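One of the simplest smoothing filters is a moving average, sketched below. Mocap software usually offers more sophisticated filters (for example Savitzky–Golay or Butterworth filters); this example only illustrates the basic idea of replacing each sample with a local average.

```python
# Minimal sketch of smoothing one coordinate with a moving-average filter.

def moving_average(samples, window=3):
    """Centred moving average; the window shrinks at the edges so the
    output has the same length as the input."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed

noisy = [0.0, 1.2, 0.9, 1.1, 2.0]   # hypothetical noisy position samples
print(moving_average(noisy))
```

Smoothing always trades noise reduction against detail: a wider window gives a smoother curve but also flattens genuine fast movements, so the window size should be chosen with the movement of interest in mind.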
Transformations
Finally, after gap-filling and smoothing the data, it may be necessary to transform it in different ways. Here the research question is the most important factor in deciding which types of data processing and transformation are needed. Some popular transformations include:
- Position data is intuitive, and is often useful directly.
- Sometimes we need to look at how fast the position changes. This rate of change (the derivative of position) is what we call velocity. Velocity is closely related to kinetic energy, and may therefore be useful in certain types of analysis.
- Similarly, we may need to look at how fast the velocity changes. This is called acceleration, and is calculated as the derivative of velocity. Acceleration is closely related to force.
- Higher-order derivatives may also be useful. It has been suggested that jerk, the derivative of acceleration, may convey certain motion properties related to affect and emotion.
- We may also use the position data from markers to calculate joint angles, orientation/rotation, periodicities, and so forth.
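The chain from position to velocity, acceleration and jerk can be sketched with simple finite differences. The position values below are hypothetical, and real toolboxes (such as the MoCap Toolbox) combine differentiation with filtering, since each derivative step amplifies noise.

```python
# Sketch of estimating velocity, acceleration and jerk from sampled
# position data by forward differences.

def derivative(samples, rate_hz):
    """Rate of change between consecutive samples (forward difference).
    Each derivative step shortens the series by one sample."""
    return [(b - a) * rate_hz for a, b in zip(samples, samples[1:])]

rate = 100                                   # samples per second (100 Hz)
position = [0.0, 1.0, 3.0, 6.0, 10.0]        # one coordinate, in millimetres
velocity = derivative(position, rate)        # mm per second
acceleration = derivative(velocity, rate)    # mm per second squared
jerk = derivative(acceleration, rate)        # mm per second cubed

print(velocity)      # → [100.0, 200.0, 300.0, 400.0]
print(acceleration)  # → [10000.0, 10000.0, 10000.0]
print(jerk)          # → [0.0, 0.0]
```

In this constructed example the acceleration is constant, so the jerk is zero; with real, noisy data the higher derivatives fluctuate strongly, which is why smoothing before (or during) differentiation matters.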
There are also numerous more advanced processing and transformation techniques in use, and we suggest checking out the MoCap Toolbox to explore these further.
Reference
- Burger, B., & Toiviainen, P. (2013). MoCap Toolbox - A Matlab toolbox for computational analysis of movement data. In Proceedings of the Sound and Music Computing Conference.