This interview was prepared and conducted by Marcelo M. Wanderley and Christian Frisson. The original video interview has been edited for clarity and brevity.
Marcelo: Now is the opportunity to put all this in context. It was the same idea when I invited you to give a keynote speech at the NIME 2003 conference 20 years ago. At that time, you presented your research, spanning from the Ph.D. theses of ACROE’s founders (Claude Cadoz, Annie Luciani, and Jean-Loup Florens) up to 2003.
1. What Drew You to the Field of Force Feedback and Music?
Claude: Well, it’s a question I always ask myself, but there was definitely a beginning. The “you” can actually be divided into two periods: the initial “infernal trio” (me, Jean-Loup Florens, and Annie Luciani) and then the second one, today’s “infernal trio” (me, Annie, and Nicolas Castagné). That’s because if there is one thing that is permanent in all the work at ACROE, it is that there are three of us with the same mindset and the same type of anger when needed.
1.1. The First Encounters
In the beginning, there is a little story about why we met. Annie, Jean-Loup, and I were in the same program at the ENSERG engineering school (now called Phelma). It was in 1969. We received our entire engineering education there, and we shared a common sensitivity to everything related to sound, music, and image synthesis.
Let’s say that it was in 1973 that I really started with Annie and Jean-Loup. At that time, there had already been almost 20 years of computer music with the work of Jean-Claude Risset and Max Mathews, and then the automatic composition side, which interested us less because we were much closer to signal processing, data processing and reconstruction by synthesis. I was still in school, discovering the sounds Risset had made and the whole perspective on sound synthesis.
1.2. Composing the Sound: Sound Synthesis
The idea of sound synthesis, whether additive or by frequency modulation, etc., was captivating; that is to say, to manufacture sound in all its dimensions with the computer. Jean-Claude said: the novelty, now, is that we no longer just compose with sounds but we compose the sound itself, what he describes in his book as “Composer le son” (cf. Risset 2014). It was fascinating because we saw a tremendous perspective for musical creation and composition. This, for us, was the starting point. In these paradigms, there is this formidable opening that was made available by the digital computer.
I immersed myself in Music V. You have to picture the context of the time: I could not get the Fortran code, as Music V was not yet distributed, so I reimplemented it myself on the machines we had. So, we were far from real time.
At the Spoken Communication Laboratory at INP Grenoble, where we were, an electronics engineer had built a digital-to-analog converter. Such a thing could not be bought at the time; it had to be made from scratch. Then, with Jean-Loup, we started working in Assembler. So, we wrote our synthesis programs in Assembler, but at the beginning, it was Music V-type synthesis.
It was this stage, which was the very first, which led us to say: well, that’s great, we’re going to be able to do lots of things. But all the same, it’s not thrilling to make music by typing lines of code. Something is missing, and this is where a whole reflection began, which is at the origin of the program we are running today. That is to say, we told ourselves that what was missing was that we were not in front of the computer in the same way as we would be in front of a musical instrument to make music.
What immediately impressed us, and I must say that we were among the only ones at that time to touch on it, is something that worries everyone without people being able to formulate it: it is the fact that music, when it is composed, must first be written. That’s what people said: you have to write it first before you can play it.
1.3. A Question of Control
We said—in the words of the time, it was still a little naïve—that there are at least two opposite poles for musical creation: the instrumental pole (we take an instrument, we pluck the strings, we see that it does something, and, after a while, we manage to improvise without necessarily having learned the notes being played) and, on the other hand, composition, where we assume that everything is settled from the point of view of how we materialize the sound.
Few people thought of questioning musical notation. Well, it was clear that when we got there, we found enormous contradictions between the potentialities opened up by the very principle of synthesis and the need to go through lines of code, which are very abstract and far from any sensory experience.
Here, there is a missing link in this affair, which is likely related to the fact that computer music had two opposite and quite different original directions: sound synthesis and computer-assisted composition. We found that there was a middle ground that was missing. Before becoming a composer who conceives musical works, one is first of all an instrumentalist, either full-fledged or by proxy, because the instrumental experience is perceived fairly directly when we are musicians.
For example, we do not necessarily need to go far in mastering the instrument: if musical thoughts develop, we can hook them into an instrumental reality. That was a fairly important argument for us, and indeed, for sound synthesis, the first thing missing was an instrumental idea behind it, so we adopted the term “physical model.”
It’s the displacement from the effect towards the cause—even in just this sentence, there are plenty of consequences in many respects. That is to say, we face these problems even today. Indeed, we realize that when we compose and play, we rely on mental representations. Are they abstract representations or concrete physical representations of what to do?
There’s a whole set of questions that are always present which made us approach cognitive sciences and perception issues at a certain time.
1.4. Gestures
Let’s take musical sound for simplicity, i.e., instrumental sound, and leave aside the sounds of nature, which can be treated in the same way but which we don’t need for the moment. Musical sound results from two completely different causes: one is linked to the existence of a permanent object, which has reasonably stable properties over time. By definition, this is the musical instrument, and we are going to think of musical instruments according to this characteristic. And then the other cause is what we do with this instrument. So that’s where I introduced the term “gesture.”
So, fortunately or unfortunately, we have often talked about gesture terminology: what is a gesture? What is meant by gesture? No term is more polysemous than this one; it is a catch-all term.
I use the word gesture according to a precise meaning. This is why I have written about the typology of musical gestures and introduced several essential nuances at this level (cf. Cadoz 1988). There is the empty-handed gesture, the gesture one makes in front of people, in front of a camera, where there is no contact with an object. And then there is the instrumental gesture, which is defined above all by the fact that it is the gesture applied to an instrument. Throughout the duration of the instrumental gesture, there is an inseparable pair: instrumentalist–instrument. Therefore, this opens up very precise lines for analyzing the instrumental gesture.
1.5. The Energy Continuum
You know my favorite terms: the “ergotic,” “semiotic,” and “epistemic” functions of the gesture channel. The ergotic function of the instrumental gesture is responsible for the energy that, in a real (physical, natural) instrumental chain with acoustic instruments, is indeed what reaches the ears of the listeners. This is what I have called “the energy continuum” (cf. Cadoz and Wanderley 2000).
It is a sound phenomenon that is characterized, in the first place, quantitatively, by its vibratory energy. Well, this vibratory energy, which eventually crosses the room to reach the listeners, or just the path between the instrument and the ear, is the means that is necessary for the sound, the music, to be able to exist, to be transmitted, to be communicated, etc.
This is an essential condition, because there is a rupture when we use electronic systems, that is, systems with an energy source external to the human, where the gesture serves only to modulate the energy flow from the electrical or other external source to the ear. In the instrumental situation, by contrast, the energy originates in the human body, passes through the instrument’s physical structure, and radiates as sound. And this energy, the energy that is found in the sound, is provided by the instrumentalist.
I am talking about multi-sensorimotor interaction. This is my diagram with the human, the object, i.e., the instrument, and then the three loops: that of the gestural action to the auditory perception, that of the gestural action to the visual perception, and that of the gestural action to haptic perception (cf. Cadoz et al. 1982, p. 76).
I used a term that scares many people but, in my opinion, is still the only one that holds: “tactilo-proprio-kinesthetic” (cf. Cadoz and Wanderley 2000).
1.6. The Beginnings of Haptics at ACROE
This brings up the three transducers that we need: the electro-acoustic transducer (the loudspeaker), the screen, and then the third one, the “force-feedback gesture transducer” (TGR—“transducteur gestuel rétroactif”).
When we think of the use of these haptic devices, we immediately think of the (piano) keyboard. Then, finally, influenced by the different types of manipulation interfaces with the computer, we imagine levers, sticks, etc.
The first “force-feedback key” developed at ACROE in 1980 was a single (piano-like) key. The motor was very good; the position sensor was an induction sensor. These were Jean-Loup Florens’ techniques; he was unmatched in that.
And so, with this device the size of a music-keyboard key, we had respected the basic dimensions but allowed a much larger vertical movement, perhaps 4 cm at the end of the key.
1.7. The Modular Feedback Keyboard (“Clavier Modulaire Rétroactif”)
So there, we had a key morphology, but it was only a key, whereas on a piano keyboard, for example, we can have 88 keys. This led us to invent a motor from scratch that could be housed in a box the size of an instrument (cf. Cadoz et al. 1990). We made a standardized choice, in a way, of the key width—exactly 13.75 mm—because that is what is inferred from the standard pianos of today.
1.8. High-Quality Response and High-Frequency Simulation
A force feedback gesture system must be both powerful and precise; hence the need to make appropriate motors because, for instance, small electric fish-angling motors are inadequate. The power is very important. It’s like loudspeakers: if one wants them very linear and faithful over a wide bandwidth, they have to be powerful. It’s the same for force feedback systems; indeed, it was necessary to invent a motor capable of providing fairly high power and, what’s more, with very low time constants.
Currently, our systems can operate at audio frequency. Running them at 20 kHz heats them quickly. And it may very well be that at the end of a manipulation, there is a key that goes “through the ceiling”—a coil that burns out—but in any case, the bandwidth is very high, and that is not a luxury.
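To make the bandwidth requirement concrete, here is a schematic Python sketch of a force-feedback inner loop (my own illustration; the sensor and motor functions are stubs standing in for real hardware drivers, which I do not know). At a 44.1 kHz update rate, each pass of the loop has a budget of roughly 23 microseconds, so the motor’s electrical and mechanical time constants must be far smaller than that.

    # Schematic force-feedback inner loop; the device functions are stubs.
    K_VIRTUAL = 500.0          # stiffness of a simulated spring, N/m
    DT = 1.0 / 44100.0         # 44.1 kHz update period: about 23 microseconds

    def read_position():       # stub standing in for the position sensor
        return 0.001           # pretend the key is displaced by 1 mm

    def apply_force(f):        # stub standing in for the motor driver
        pass

    for _ in range(5):         # real firmware would run continuously
        x = read_position()
        f = -K_VIRTUAL * x     # the simplest virtual object: a spring
        apply_force(f)
        # a real loop would now wait out the remainder of the DT period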
We made a certain number of accurate measurements in a playing situation which showed that, for certain phenomena—such as, for example, the simulation of a bow—the manipulation may be perceived as being as truthful as the manipulation of a real violinist’s bow. These characteristics make it possible to obtain a continuous sound with the back-and-forth movement of the bow (cf. Luciani et al. 2009).
At one point, we focused a lot on that. We didn’t understand the phenomenon immediately, but we found it fabulous. This effectively justifies the need for a high bandwidth. It is, in fact, how the instrumentalist makes the sound entirely fluid. Everything happens as if the musician had a perception in his or her fingers, a tactile perception that allows them to change the movement of the bow at precisely the moment when the string itself changes direction, that is, when the string’s velocity is zero.
Maybe these are things that are very important for musicality. I say maybe, but I know very well that this is the case when one handles something like a bow, which has its own physical behavior, its own properties: properties that, in the gestures of the violinist, we perceive through the tactilo-proprio-kinesthetic sense, and that we perceive because we are in an action–perception loop.
1.9. Modularity
A second property of the modular feedback keyboard is modularity itself. Modularity is essential to overcome a peculiarity of the gestural channel, which is the difficulty of having something universal. The force feedback system cannot be universal; one can only imagine specific morphologies relating to particular applications.
This modularity of the modular feedback keyboard means, on the one hand, being able to add as many slices as you want, but on the other hand, being able to escape from the strict morphology of the piano keyboard.
This has led us to invent systems that can be “transformed.” The end effector can be treated separately from the actuator, which means that the modular feedback keyboard has two primary components: the actuators, which are slices 13.75 mm wide, and the intermediary, which may be pulleys or various transmission systems. Indeed, this remains completely vital for having the possibility of carrying out gestures of different kinds with acceptable performance.
This technology nevertheless leads to systems that cost more than a simple computer mouse. This is because it uses highly specialized components such as motors and sensors, but also because it must be very efficient. At the time, it was difficult to get people to admit that such performance requirements were needed for dealing with instrumental gestures.
2. Could You Summarize Your Contributions in Combining Force Feedback and Music?
Claude: Yes, absolutely. It’s the second part of the discussion because everything I have said up to now corresponds to all the analysis, identification, and quantification of the performance of a device that can be part of a multisensory sensorimotor platform.
The next step, which developed in parallel, was to think about the composers, the musicians, in other words, the musical output, because that was, and remains, the objective: to make music, to create with all that.
2.1. The Cordis-Anima Language
The essential point was the development of user environments. You’ve heard of “Genesis” (cf. Castagné and Cadoz 2002) and “Mimesis” (cf. Cadoz et al. 2003). Well, these are two significant chapters of ACROE.
“Cordis-Anima” is the language that allows programming between inputs and outputs, which correspond to a vision of the physical cause of the sound (cf. Cadoz et al. 1993).
Cordis-Anima also started from a desire to do the same thing as Risset did; that is to say, to offer users and musicians a modular computer system with functional blocks, as in Music V.
This property—the modularity of the functional blocks—was fundamental, and we are still entirely reliant upon it today. Obviously, in all the systems, it was a question of transposing this concept onto the objects that would take care of the relation between the gestural input and the tactile, acoustic and visual outputs.
We were looking for an elementary building block, an algorithm that would replace the wavetable oscillator of Music V. That is, a straightforward algorithmic formula that is modular and combinable. The object, even in its simplest form, must be able to respond to the constraint of receiving something of a gestural order and of producing something which will be able to control the sound, the image, etc.
2.2. Mass-Interaction Systems
This led us to the idea that only the notion of “inertial particles” made it possible to address this question at the right level, and therefore, this is how the first algorithm was born (cf. Cadoz et al. 1982).
So, with a recurrent sequence of order 2, it was the constraint of algorithmic optimality that led us, quite independently, to ask how, with the least possible memory usage, we could manufacture an oscillating signal that unfolds over time when we only want to give it a few pieces of information, obviously not 44,100 pieces of information every second.
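As a minimal illustration of that idea (my own sketch, not ACROE’s historical code), an order-2 recurrence needs only two stored samples and a single coefficient to make a sinusoid unfold over time, instead of supplying 44,100 values every second:

    import math

    def oscillator(freq_hz, sample_rate=44100.0, n_samples=64):
        """Digital resonator: y[n] = 2*cos(w)*y[n-1] - y[n-2].
        Two state variables replace the 44,100 samples per second
        that would otherwise have to be supplied explicitly."""
        w = 2 * math.pi * freq_hz / sample_rate
        y2, y1 = 0.0, math.sin(w)   # y[n-2] and y[n-1]: the only memory needed
        out = [y2, y1]
        for _ in range(n_samples - 2):
            y = 2 * math.cos(w) * y1 - y2
            y2, y1 = y1, y
            out.append(y)
        return out                  # a pure sinusoid at freq_hz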
Then, we realized that it was necessary to bring in elements of a dual nature: inertial elements and interactional elements; in other words, the material elements and the connecting elements. From there, the algorithm develops automatically. There is no more choice to be made, and we arrive at the synthesis system by mass-interaction physical modules.
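To make this duality concrete, here is a minimal Python sketch, under my own simplifying assumptions, of a mass-interaction network in the spirit of Cordis-Anima, where the material and link elements are known as MAT and LIA modules. It is a toy illustration, not ACROE’s implementation:

    # Toy mass-interaction network: one mass tied to a near-immobile anchor
    # by a spring-damper link, stepped at a 44.1 kHz simulation rate.
    dt = 1.0 / 44100.0

    class Mass:
        """Inertial (MAT-like) element: integrates the forces it receives."""
        def __init__(self, m, x0=0.0):
            self.m, self.x, self.x_prev, self.force = m, x0, x0, 0.0

        def step(self):
            # Explicit second-order scheme: x[n+1] = 2x[n] - x[n-1] + (dt^2/m)F[n]
            x_next = 2 * self.x - self.x_prev + (dt * dt / self.m) * self.force
            self.x_prev, self.x, self.force = self.x, x_next, 0.0

    class SpringDamper:
        """Interaction (LIA-like) element: exchanges opposite forces."""
        def __init__(self, k, z, a, b):
            self.k, self.z, self.a, self.b = k, z, a, b

        def apply(self):
            stretch = self.a.x - self.b.x
            dstretch = (self.a.x - self.a.x_prev) - (self.b.x - self.b.x_prev)
            f = -self.k * stretch - self.z * dstretch / dt
            self.a.force += f
            self.b.force -= f

    anchor = Mass(m=1e9)            # near-immobile ground
    m1 = Mass(m=1e-3, x0=1e-3)      # a 1 g mass, 'plucked' 1 mm from rest
    link = SpringDamper(k=100.0, z=1e-5, a=m1, b=anchor)

    out = []
    for _ in range(1000):
        link.apply()                # all interactions first...
        m1.step()                   # ...then all masses integrate
        anchor.step()
        out.append(m1.x)            # the mass position is the output signal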
If we remain in the linear case, this formalism allows us to approach an immensity of things. Here and there, some things have become more related to applications, for example, when one wants to do something corresponding to bow friction. One enters into properties of physics that are a little bit more special, which are non-linear, and from there, a whole development takes place.
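As a pointer to what such a non-linearity can look like, here is a generic stick-slip friction curve of the kind found in the physical-modeling literature (an illustrative textbook form with made-up parameter values, not ACROE’s actual bow model):

    import math

    def bow_friction(v_rel, f_normal=1.0, mu_s=0.8, mu_d=0.3, v0=0.1):
        """Generic stick-slip friction: high 'static' friction near zero
        relative bow-string velocity, decaying toward 'dynamic' friction.
        All parameter values here are illustrative, not measured."""
        if v_rel == 0.0:
            return 0.0  # sticking: no sliding, no dissipative force
        mu = mu_d + (mu_s - mu_d) * math.exp(-abs(v_rel) / v0)
        return -math.copysign(mu * f_normal, v_rel)  # opposes the sliding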
Cordis-Anima is a very stable language. It has proven itself for several decades. So that’s a whole body of knowledge, of know-how. The critical element that was missing was Genesis.
2.3. Genesis
Genesis, developed by Nicolas Castagné and myself, is the interface that allows you to practice Cordis-Anima. Genesis is still in good shape today; this is the tool I wanted to have.
There is excellent, very refined know-how today, something that can be used in pedagogy and can be disseminated. And besides, we are still working on it and regularly organize workshops around it.
I will now come back to your question. The development of gestural systems and modeling were born simultaneously and proceeded from the same objectives. Nevertheless, for technological reasons, they still struggle to come together. Creating highly sophisticated things in offline modeling was much easier than developing systems with real-time force feedback, which are operational today.
2.4. Musical Compositions with Real-Time Force Feedback
In 2015, we made the first piece that was created in concert with a force feedback system on stage in real time. Today, we also use the force-feedback system with Genesis and with our 24-speaker dome.
This is part of what we call the “Hélicante platform.” It contains the Genesis modeler, a workstation that allows the real-time simulation of objects of up to approximately 10,000 components, and the ability to diffuse the results over multiple channels. Here, we have 24 speakers, but if we go to the ZKM (Zentrum für Kunst und Medien Karlsruhe), they have 43 speakers.
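(For scale, a back-of-the-envelope figure of my own: if every one of the 10,000 components is stepped at a 44.1 kHz simulation rate, that amounts to roughly 4.4 × 10^8 module updates per second.)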
This technology is exciting for producing things that happen in space without resorting to complex processing, since it is built into the Hélicante station.
The platform has already been used in concert three or four times with creations that are generally quite interesting.
2.5. Gesture Emulation
But you can’t have gestural systems with you all the time. Never mind: we can still make models that will run in real time with force feedback systems. You can develop the models offline; you don’t need to have the system with you all the time, far from it. I call this “gesture emulation.”
So, this introduces a new axis called “TGR metrology and gesture emulation.” Therefore, we can imagine making an offline model that could be tested in all its dimensions before going online with a real TGR (“transducteur gestuel rétroactif”). It’s all modeling based on the Cordis-Anima formalism, achievable in Genesis, which allows us to deal with the representation of gestures and force feedback systems when they are unavailable.
We can approach it simply: what is a TGR key? It’s a mass. Yes, it’s a mass, and we can give it an actual value. Given the moving coil, the aluminum, etc., the inertia of a TGR key, reduced to the point of manipulation, is known to be something like 300 g.
But as we use electricity, we can make negative masses—negative masses are fascinating when they don’t blow up in your face! So, regardless of the degree of modeling that can be achieved, this modeling will allow us to do a certain number of things.
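As a last hedged sketch (reusing the toy Mass and SpringDamper classes from the earlier example; my own illustration, not Genesis itself), a first level of gesture emulation could replace the absent TGR key by a 300 g virtual mass driven by a scripted stand-in gesture:

    # Gesture emulation: the absent TGR key becomes a 0.3 kg virtual mass.
    key = Mass(m=0.300)              # inertia reduced to the manipulation point
    string = Mass(m=1e-3)            # a light vibrating element to be played
    contact = SpringDamper(k=50.0, z=1e-4, a=key, b=string)

    for n in range(1000):
        if n < 50:
            key.force += 1.0         # scripted 'gesture': a brief 1 N press
        contact.apply()
        key.step()
        string.step()
    # A negative key mass (m < 0) is expressible too but, as Cadoz warns,
    # it can make the simulation 'blow up' unless handled with care.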
Comment and Question (Marcelo): If I remember the history of ACROE correctly, it had very striking moments of development: the first key in 1981 (cf. Cadoz et al. 1982), the first TGRs with several keys together towards the end of the 90s, with Leszek Lisowski and Jean-Loup Florens (cf. Cadoz et al. 1990), around the beginning of the 2000s, when you managed to build ten units of eight keys each during the ENACTIVE Network European project (cf. Luciani and Cadoz 2007), and finally, around 2012–2015, real-time performance on stage (cf. Leonard et al. 2012). It was all an incredible, insane effort by a team of three permanent staff plus students. It was an industrial effort.
3. How Can This Type of Development Be Maintained in These Research Structures, Which Are Not Industries? What Are the Possibilities for the Future?
I was talking about the heritage of know-how. An indicator, if you will: I have something like 70,000 models, each of which meets a given need. I spent much time developing and documenting them somewhat systematically to make something that could be passed on. There is much work there.
3.1. Pedagogy and Transmission
There is something that I call “teaching sheets” or “modeling notebooks.” They combine the description of a model with material that is easy to use on an interactive multimedia reading and editing system.
There are no technological obstacles to this. On the other hand, there is quite a lot of work that must be done correctly; otherwise, it doesn’t work and is useless. Pedagogical work can be organized by theme: imagine a small set of sheets in this modeling notebook that focuses, for instance, on the theme of gesture emulation.
It’s straightforward from the point of view of constructing the content. First, you have to introduce the context, then identify the problems to solve, then show how they were solved, associate purely didactic models with them, and finally models from pieces where they have been used.
I find that musical development, the creation of a piece, and the development of models are done in a fused way. I may have a musical idea without considering how I will implement it. If I start implementing it, then I come across questions. Some are solved because the necessary techniques have been known for quite a while, while others still need to be solved. There, I change hats; I become a “modeler” and figure out how to solve the remaining issues.
My permanent concern is pedagogy, to write everything. I write in such a way that the next day I can find what I did, and I discipline myself absolutely for that, no matter what.
3.2. Genesis Workshops
We held our first international convention for Genesis users last June (2022) and will do it again this June (2023). In these conventions, we invite people who want to discover and start working with Genesis.
We did a workshop where people learned how to make small models with force feedback. How could it be understood, disseminated, used and developed? As a solution, we proposed the “Hélicante studios,” which consist of a four-key force feedback system, Genesis, and a third component, the “gesture emulation” mentioned before.
3.3. Marketing vs. Companionship
Going back to your question: for the moment, I have stopped thinking that one day we would have a product that would be marketable. I stopped believing in the idea that one day we were going to establish the company that was going to revolutionize everything, revolutionize the world. That’s not how it’s going to work.
On the other hand, there is a solid group of people around these concepts. This group was formed and has continued to work since our first workshop at the ZKM in 1996. They have the same passion today as at the very beginning. It’s nice when you have such a group that lasts; it’s stimulating. Many things have been done as a result of discussions that took place during the workshops or during situations where we all rolled up our sleeves.
I am currently calling the collective “the companions of the Hélicante.” We will ensure that there is a force feedback system and all the paraphernalia to do real-time modeling and then multi-channel diffusion, because now it is very accessible. So, a Hélicante platform will be built gradually, and then an action program with educational, creative and dissemination dimensions, all the necessary axes of a coherent whole.
And then there are the concerts, where we talk to people. We speak to the public; we talk to the students. We offer them the possibility of experimenting; we are there to help them. These are moments of absolute, deep joy. It’s very creative.
I sincerely believe that the possibility of success in this endeavor depends on rolling up one’s sleeves and coming together collectively—by effectively creating, structuring, cleaning up, clarifying concepts and preparing to be active in pedagogy and creation.
Voilà. Thanks a lot!
Claude Cadoz is a graduate of the Grenoble Institute of Technology, Electronic and Radio Telecommunications Engineering School, 1972.
Claude Cadoz founded ACROE in 1976 with Annie Luciani and Jean-Loup Florens. ACROE is an ICA research group devoted to interdisciplinary research in computer arts, computer science and computer technology. He began pioneering research in force feedback devices, physical modeling and real-time simulations for computer music, computer graphics and computer animation. He developed the basic concepts of modular physical modeling for computer music, first, for sound production and then for music creation, including both the synthesis and composition of sounds.
He has published approximately 35 international papers and 10 invited international talks on physical modeling, human–machine interactions, computer music and gestural interaction. He wrote a book for the general public on virtual reality (1994). He has trained more than 400 graduate-level students and directed 35 PhD theses.
He is the author of two international patents on actuator–sensor technology for haptic devices and an electro-dynamic piano. He created several musical artworks with ACROE’s technologies: ESQUISSE (music and video, created with A. Luciani and J.-L. Florens in 1993), pico..TERA (physical model synthesis, in 2001) (Cadoz 2002), Gaea (physical model synthesis, in 2007), Helios (physical model synthesis and haptic interaction, in 2015), and Quetzalcoatl (physical model synthesis and collaborative haptic interaction, in 2018).