Artificial intelligence (AI) is a well-established discipline of computer science focused on making computers perform tasks that would normally require human intelligence (Russell and Norvig 1995
). Due to the convergence of massive data availability, computational resources and novel deep-learning-based architectures, the machine learning (ML) sub-field of AI has experienced major breakthroughs over the past decade (Goodfellow et al. 2018
). Such great progress has been made that AI technology is now a major driver of global business and investment.1
AI is being deployed in varied practical scenarios, from machine translation to medical diagnosis, detecting fraudulent credit card use, transcribing speech, summarizing sporting events and financial reports, targeted advertising on social media, and autonomous vehicles. The combination of AI and robotics is also widely transforming manufacturing (Economics 2019
). Real-world applications of AI have generated controversy as well, such as tracking people using biometric indicators (e.g., face, speech, gait), prediction of criminal recidivism, judicial sentencing recommendations, drone warfare, predicting protected characteristics (e.g., sexuality, pregnancy), and creating and distributing propaganda (e.g., deep fakes).
AI has also impacted music, from its creation to its distribution. Significant among these are music streaming services using AI for music recommendation and information retrieval, e.g., Spotify2
and in product suggestions for online retailers, such as Amazon.
The application of AI to music creation has appeared in academic halls during the past 60 years, e.g., Hiller and Isaacson; Dannenberg et al.; Fernández and Vico; Sánchez-Quintana et al.; Sturm et al., but AI is now being used visibly in popular forms of music. Recent examples include: Taryn Southern’s 2017 album, “I AM AI”, which features music generated by a commercially developed music AI system; the 2018 album “Hello World”, billed as “the first music album composed by AI + artists”;5
Holly Herndon’s 2019 album, “Proto”;6
and dozens of albums created by the “first-ever algorithm to sign [a] major label deal”.7
Several companies have also been founded recently to capitalize on advancements of AI applied to music content creation, particularly for production music, i.e., music to accompany film, radio and other media. Examples of these companies include Aiva8
for creating soundtracks for advertisements; and Melodrive10
for automatically creating music in video games. Some companies are also devoting resources to creating software for artists exploiting AI technology. Examples include LANDR for mastering music,11
for music composition, and Google’s Project Magenta14
for sound and music synthesis.
The combination of large quantities of recorded music and accompanying data, effective algorithms, and high-power computational hardware is now producing audible progress so quickly that alarms are sounding.15
What does AI mean for music? In so many areas of labour, AI and technology lead to more efficient production lines and increased profit, but also to human redundancy and deskilling; can the same happen in music? Some say that AI will free people of mundane tasks, but this can be “tantamount to liberation from the ability to make a living” (Drott 2019
). How will AI help and harm the various participants contributing to and benefiting from music, e.g., composers, musicians, educators, listeners, and organisations? Should consumers be informed about the involvement of AI in the music they listen to, much the same way ingredients of food products are communicated? And how should this information be presented, and to what level of detail?
A tangible example of many challenges and open questions of music AI is given by the folkrnn
project (Sturm et al. 2016).
This project has built and trained several music AI models on data produced from tens of thousands of transcriptions of folk music available online.17
The resulting models are able to generate novel transcriptions that exhibit many characteristics similar to the original data. An online implementation18
offers users the functionality of generating transcriptions with the click of a button. Figure 1
shows the parameters available to the user, and a notated transcription generated by the AI. Users can submit the transcription to a growing collection of such tunes at another website.19
Material generated by folkrnn models has been used in a variety of ways,20
including an album situated within the style of Irish traditional music (Sturm and Ben-Tal 2018).
The experience of developing and using folkrnn models brings focus to the numerous questions identified above, and also motivates other questions. Who owns the rights to the material generated by folkrnn models, and to any music derived from it? What implications are there that folkrnn models are trained on a collective source of music transcriptions, some of which may be derived from copyright-protected work? How might folkrnn, and tools like it, impact music creators and music communities, traditional or otherwise? What kinds of misuses of folkrnn are possible, and how can its developers guard against these? How is the value of music affected by AI systems that can generate and synthesize millions of “folk music” recordings without the direct input of human musicians? Should a person who uses material generated by folkrnn models in the composition process specify where it came from, i.e., that it was not entirely their own creation? Should music created in such ways, with the “input of AI”, be considered inferior to music created without?
These questions motivate taking a closer look at what is actually occurring in the application of AI to music from two different and complementary perspectives: copyright law and engineering praxis. In Section 2
, we look specifically at copyright law, and how it might address some of the questions surrounding material generated by music AI. The legal perspective is important because the use of AI in music creation is posing novel challenges to intellectual property law, moral rights, and the protection of creative human production. In Section 3
, we look at engineering praxis in AI and ML, and some of the questions arising from their research and development methodologies. This perspective shifts the focus in engineering from developing music AI, and the incentives of doing so, to surveying the impact of such technology and its development, intended or not, and the implicit assumptions made in performing such work. Section 4
returns to folkrnn to discuss aspects of these two perspectives in a more grounded way. The conclusion looks to the future of AI in music, and speculates what it may mean for copyright law and engineering praxis. Overall, this article contributes new perspectives to the interdisciplinary domain of music AI development, which is currently heavily focused on how to create such technology. There are no easy answers to many of the questions raised in this article, but they must continue to be identified and formalized to assist their future discussion.
2. Copyright Law Perspective
As an artistic work, music may be protected by copyright and related or neighbouring rights. While copyright
protects original works, i.e., the musical composition or the lyrics, in the case of music, neighbouring rights
relate to the performance or interpretation, or to the fixation made for the sound recording. Thus in addition to the composer, the writer, musicians, singers and phonogram22
producers also benefit from certain protection.
Focusing on copyright, some countries such as the UK, South Africa, Hong Kong, India, Ireland, and New Zealand have envisaged protection for computer-generated works granted to the person by whom the arrangements necessary for the creation of the work have been undertaken. Thus, in the UK, computer-generated works are defined as works “generated by computer in circumstances such that there is no human author of the work”.23
Note these provisions leave room for ownership being allocated either to the programmer or to the user of the computer program.
In relation to other countries where no specific regime exists (the majority in Europe), it has been questioned whether AI-generated works attract copyright protection. Continental copyright legislation is very much dependent on human-centred concepts, with regard to the beneficiary of protection (i.e., the author), the conditions for protection (e.g., originality), and the rights granted (economic, but also moral rights). According to the Court of Justice of the European Union (CJEU), a work is considered original when it is the expression of the author’s own intellectual creation and his/her free creative choices, the author’s personality, or the author’s personal touch.24
In light of this, a number of scholars conclude that under present law, autonomously AI-generated works might not be eligible for copyright protection, e.g., Buning; Deltorn and Macrez; Lauber-Rönsberg and Hetmank.
Regardless, humans can still have an important involvement in creating music, even if assisted by an AI system. Originality, as a precondition for copyrightability, may be concluded whenever there is a significant creative human contribution to the resulting output. However, it cannot be disregarded that in the future, AI might be able to autonomously compose songs with minimal or insubstantial human intervention.26
Authorship recognition may require an analysis of the operation of the systems and the role of the different actors involved in the process (e.g., the developer, the trainer or the user).27
These cases are the ones that challenge EU copyright law.
Is the lack of copyright protection of AI-generated results adequate from a policy point of view? The response to this question would require further legal and socio-economic analysis.28
If a certain level of protection is considered necessary, adjustments may be needed to the existing framework, either to amend the existing copyright laws or to pass new sui generis rights targeting AI-generated products. One of the obstacles to putting in place a different system of protection for AI-generated products would be determining when, and to what extent, a work is AI-generated.
Another aspect of relevance is the use of copyright works in datasets used for training AI systems. Training state-of-the-art ML models requires large amounts of data, i.e., pre-existing music, such as scores, lyrics and/or audio recordings. When the music used to train a system is protected by copyright,29
permission from the rightholders is required unless an exception applies. In the EU a limited number of exceptions exist. Copyright exceptions traditionally concern a qualified purpose, e.g., quotation, parody, teaching, research, and news reporting. The new European directive on Copyright in the Digital Single Market30
has introduced two exceptions on text and data mining31
that could greatly benefit AI developers. Subject to certain conditions, the first exception32
would allow text and data mining for purposes of scientific research provided the researcher has lawful access to the work. This exception is mandatory and cannot be derogated by contract. The second exception33
would authorize text and data mining of lawfully accessible works and other subject matter beyond scientific purposes, except if rightholders have expressly restricted the use.34
The new provisions could then allow certain reproductions required for automated computational analysis of music hosted on digital repositories if all the required conditions are met. All in all, it must be kept in mind that copyright is about use and not about access; the exceptions can only be invoked when users have legal access to the work or, in the above-mentioned example, to the digital repository. Furthermore, the European member states still need to implement the directive into national law.
Other questions arise from behaviors of music AI systems. The accidental reproduction of copyright-protected work in music generated by an AI system may also raise liability questions. In case a piece of music reproduces a pre-existing work or a phonogram, or part of those, the authorization of the relevant rightholders is normally required. In a recent case dealing with the sampling of 2 seconds in a music composition, the CJEU concluded that taking a sound sample of a phonogram for the purposes of including that sample in another phonogram, even if very short, requires the rightholders’ authorization, unless that sample is included in a modified form unrecognizable to the ear, or the use is made under a copyright exception.35
Last but not least, another question concerns the allocation of responsibility for such infringement, which could relate to both the engineer and the user. How should a music AI engineer safeguard against such behaviors?
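One possible engineering safeguard, offered here only as an illustrative sketch and not as a legal test of infringement, is to screen generated material for long passages shared with the training data. The Python example below assumes melodies are available as lists of MIDI pitch numbers; the function names, the interval representation, and the n-gram length of 4 are all hypothetical choices made for this illustration.

```python
from typing import List, Sequence, Set, Tuple


def intervals(pitches: Sequence[int]) -> List[int]:
    """Convert MIDI pitches to successive intervals, so matches survive transposition."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def ngrams(seq: Sequence[int], n: int) -> Set[Tuple[int, ...]]:
    """Return the set of all contiguous length-n subsequences of seq."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def copied_fraction(generated: Sequence[int],
                    corpus: Sequence[Sequence[int]],
                    n: int = 4) -> float:
    """Fraction of the generated tune's interval n-grams that also occur in the corpus."""
    gen = ngrams(intervals(generated), n)
    if not gen:
        return 0.0  # tune too short to contain any n-gram
    seen: Set[Tuple[int, ...]] = set()
    for tune in corpus:
        seen |= ngrams(intervals(tune), n)
    return len(gen & seen) / len(gen)


# Hypothetical training corpus: a single ascending C major scale.
corpus = [[60, 62, 64, 65, 67, 69, 71, 72]]

transposed_copy = [62, 64, 66, 67, 69, 71, 73, 74]  # same scale, a tone higher
unrelated = [60, 67, 60, 67, 60, 67]                # alternating fifths

print(copied_fraction(transposed_copy, corpus))  # 1.0
print(copied_fraction(unrelated, corpus))        # 0.0
```

A high score only flags material for human review. Whether a shared passage amounts to reproduction of a protected work remains a legal question, and short shared figures are common in any musical tradition.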
3. Engineering Praxis Perspective
Since engineers bring new technologies into the world, it can be argued that they share the responsibility for the resulting outcomes, positive and negative, intended and unintended. The integration of AI into societal and political spheres can appear as fostering objectivity, but the data-centric nature of ML can unintentionally perpetuate discrimination by encoding existing social biases learned from real data used by an engineer for training (Barocas and Selbst 2016
). Even if an engineer has good intentions, their uncritical treatment of data and their resulting systems might become instrumental in producing great social and economic harm. No matter how complex a system might be, engineers must study, evaluate and document its working principles and limitations and make users and others impacted by such a system aware of them. Technology is not ethically neutral (Dusek 2006
), which motivates design procedures that are guided by clear ethical principles (Hand 2018
; The IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems 2017
), as well as employing continuous evaluation to mitigate harmful impact. Bryson and Winfield provide a brief description of and motivations for standardizing the ethical design of AI such that its development benefits humanity. They introduce the standardization efforts of The IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems, and discuss in particular standardizing transparency in autonomous systems. The development and implementation of these practices are still ongoing.
When evaluating the impact of AI, engineers typically work from the narrow perspective of AI performance in limited and controlled contexts. A critical but often overlooked question is what defines a particular system as “good” or “bad” with respect to a use or user, e.g., a music listener, performer, composer or business. Defining meaningful evaluation strategies is difficult for many applications in the music domain, due to the subjectivity of music judgment and preference. This makes evaluation one of the main research challenges in the discipline of music information retrieval (Schedl et al. 2014
; Sturm 2016
). Engineers by and large focus on performance metrics like accuracy, precision, and reliability when developing data-driven AI techniques, and then infer from performance how useful or harmless a system is. Evaluation can instead be framed in the context of specific applications and their users, but this is rarely done (Schedl et al. 2013).
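To make concrete why a single performance figure says little about usefulness, the following Python sketch computes accuracy and precision for a hypothetical binary music tagger (for example, one flagging whether a track belongs to a particular style). The data are invented for illustration, and the convention of returning 0.0 when nothing is predicted positive is an assumption of this sketch.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of items whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Share of items predicted positive that are truly positive."""
    truths_of_predicted = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not truths_of_predicted:
        return 0.0  # assumed convention when no item is predicted positive
    return sum(t == positive for t in truths_of_predicted) / len(truths_of_predicted)


# Invented data: 100 tracks, of which only 5 truly carry the tag.
y_true = [1] * 5 + [0] * 95
# The tagger finds one of the five and raises one false alarm.
y_pred = [1] + [0] * 4 + [1] + [0] * 94

print(accuracy(y_true, y_pred))   # 0.95
print(precision(y_true, y_pred))  # 0.5
```

The tagger is 95% accurate yet misses four of the five tagged tracks, and half of its positive predictions are wrong. Whether that is acceptable depends on the application and its users, which neither number captures.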
Broader perspectives are being advocated for by the developing research fields “fair, accountable and transparent machine learning” (FAT/ML),36
and “explainable AI” or “interpretable AI”. These fields argue for richer notions of evaluating AI performance and use, ones that actually reflect usefulness and real impact. Definitions of these concepts are numerous, and represent a track of research in themselves. In one sense, an AI system is considered to be fair
if it does not discriminate against certain individuals based on a set of protected attributes like gender, race, and age.37 Group fairness
has been proposed as a category of algorithmic fairness (Kleinberg et al. 2017
), which consists of more than 21 metrics to measure discrimination, some of which are contradictory. Transparency
, or a clear explanation of where and how AI has been used to make some decision or product, has also been argued to be a relevant evaluation criterion for AI development and application. A related concept is interpretability
(Doshi-Velez and Kim 2017
) or explainability
(Mittelstadt et al. 2019
): the ability to explain, in an understandable way, how an AI behaves in a particular way, or has arrived at a given result. Metcalf underlines the importance of distinguishing between bias and unfairness. The former is a technical feature of statistical models, and the latter is related to human values that comprise social biases and prejudices about a group of people. Bias and unfairness may occur within an AI system without necessarily impacting its performance in some aspects, but can impact metrics reflecting the values of other participants.
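As a small illustration of how group-fairness measures can disagree, the Python sketch below computes two widely used quantities, the demographic parity gap and the equal opportunity gap, on invented data. These two measures are chosen here for illustration only and are not claimed to be the ones enumerated in the cited work; the group labels and the recommendation scenario are hypothetical.

```python
from typing import Dict, List, Sequence


def _gap(values_by_group: Dict[str, List[int]]) -> float:
    """Largest difference between group-wise mean values (0.0 if no groups)."""
    rates = [sum(v) / len(v) for v in values_by_group.values() if v]
    return max(rates) - min(rates) if rates else 0.0


def demographic_parity_gap(y_pred: Sequence[int], group: Sequence[str]) -> float:
    """Gap in the rate of positive predictions between groups."""
    by_group: Dict[str, List[int]] = {}
    for p, g in zip(y_pred, group):
        by_group.setdefault(g, []).append(p)
    return _gap(by_group)


def equal_opportunity_gap(y_true: Sequence[int], y_pred: Sequence[int],
                          group: Sequence[str]) -> float:
    """Gap in the true-positive rate between groups (only truly positive items count)."""
    by_group: Dict[str, List[int]] = {}
    for t, p, g in zip(y_true, y_pred, group):
        if t == 1:
            by_group.setdefault(g, []).append(p)
    return _gap(by_group)


# Invented example: a recommender decides whether to promote tracks by artists
# from two hypothetical groups, "A" and "B". The groups have different base rates
# of truly relevant tracks, and the recommender happens to be perfectly accurate.
group = ["A", "A", "A", "A", "B", "B", "B", "B"]
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = list(y_true)

print(demographic_parity_gap(y_pred, group))         # 0.25
print(equal_opportunity_gap(y_true, y_pred, group))  # 0.0
```

Here a perfectly accurate system satisfies equal opportunity while violating demographic parity, because the groups differ in base rate. Which measure matters is a question of values and context, not one the code can settle.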
These concepts not only promote the measurement of AI systems in ways that are more relevant to their application, but also motivate new methodologies for engineering, and for addressing the general lack of auditing tools to evaluate the social impact of AI.38
For instance, engineers can use interpretability to find and correct causes of discrimination in an AI system, to better explain its errors and improve operation, and to perform causal inference (Zeng et al. 2017
). One can also see potential uses of such approaches to addressing some questions of copyright law in music AI raised in Section 2
. For instance, explainability can help address questions about how specific music material was generated by a system. Explainability can also improve the deployment of AI systems in the real world, where sample bias and concept drift are common.39
The engineering of AI systems can benefit by working in transparent ways as well, e.g., clarifying the reasons why specific metrics are used and the societal values which underlie them (Glymour and Herington 2019
; Kilbertus et al. 2017
); describing the processes of data collection and use, which may result in discrimination (Hand 2018
); even improving the composition of a research team, where a lack of team diversity can compound the impact of the problems mentioned above and reinforce blind spots (Ruiz et al. 2002
). Working in these ways can help address issues related to the socio-economic impact of AI systems, define new evaluation principles centred around these issues, and prevent and mitigate negative effects, e.g., see Benthall and Haynes.
Discussions about bias, fairness and transparency in the development and application of AI to music are just beginning, e.g., Holzapfel et al.
Music creation is a part of a landscape in which various AI-driven tools are deployed and used for different purposes, e.g., composing, mixing, and streaming. For music streaming services, AI can have considerable positive and negative impact on music recommendation. AI can be a core technology of music recommendation systems designed to provide users with relevant recommendations, and thus can influence music consumption and listening behaviour at large scales. However, such approaches can also incentivise unscrupulous behaviours that take advantage of artists and listeners (Eriksson et al. 2018).
For instance, using AI music generation can greatly increase the revenue of a music streaming business, whereby in-house synthetic “artists” are promoted on playlists thus reducing the revenue distributed to human artists whose music is also streamed. In such cases, artists and listeners might be informed about the way AI systems work for music recommendation (Aguiar and Waldfogel 2018
). It could also be mandated that users be informed about the ways in which AI was used to create the music they are listening to, or even be able to opt out of hearing AI-generated music altogether. Such transparency can serve to empower artists and listeners to challenge AI systems; however, it requires defining or clarifying the degree to which AI is involved in the process. It is tempting to try to enumerate the involvement of AI in various stages of music creation, e.g., composition, performance, production. For each stage, one might indicate the degree of AI involvement with a numerical score. A more fine-grained analysis could describe exactly how AI was involved, e.g., the design of a microphone, the creation of an impulse response, the synthesis of a particular voice, or the use of score following during performance. Consumers could then use such information to make more informed choices, in the same way that labeling food ingredients is used to identify nutritional value or potentially deadly reactions due to allergies.
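Purely to make the labeling idea concrete, the following Python sketch shows what a per-stage record of AI involvement might look like. The class name, the 0-to-1 score scale, and the free-text notes are all hypothetical; the three stages are the ones named above, and no existing labeling standard is implied.

```python
from dataclasses import dataclass, field
from typing import Dict

# The three stages mentioned in the text; any real scheme might need others.
STAGES = ("composition", "performance", "production")


@dataclass
class AIInvolvementLabel:
    """Hypothetical per-recording label: one score in [0, 1] per creation stage."""
    title: str
    scores: Dict[str, float] = field(default_factory=dict)
    notes: Dict[str, str] = field(default_factory=dict)

    def set_stage(self, stage: str, score: float, note: str = "") -> None:
        """Record the declared degree of AI involvement for one stage."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        if not 0.0 <= score <= 1.0:
            raise ValueError("score must lie between 0 and 1")
        self.scores[stage] = score
        if note:
            self.notes[stage] = note

    def summary(self) -> str:
        """One-line, consumer-facing summary; undeclared stages are shown as such."""
        parts = []
        for stage in STAGES:
            if stage in self.scores:
                parts.append(f"{stage}: {self.scores[stage]:.0%}")
            else:
                parts.append(f"{stage}: undeclared")
        return f"{self.title} | " + ", ".join(parts)


label = AIInvolvementLabel("Example Tune")
label.set_stage("composition", 0.8, "melody generated by a model, edited by a person")
print(label.summary())
# Example Tune | composition: 80%, performance: undeclared, production: undeclared
```

Even this toy version has to fix a list of stages and reduce each contribution to a single number, and it leaves open who assigns the scores and on what evidence.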
This approach to transparency, however, faces problems immediately. First, decomposing music creation into stages is not so clear in many instances, and also introduces its own biases, since these stages are not always separable and do not even exist in all music. Second, designating which aspects are more and which are less central to the value consumers attach to music can be debatable. Consider, for instance, an AI application that automatically advances a music score for performing players: certainly useful AI technology, but is it material to the musical value of the recording? Do consumers themselves have enough knowledge about music and its production to evaluate the relevance of such information? Third, an accurate tally of the involvement of AI will also require extensive and sometimes intrusive documentation of all the software and hardware used in the process and its relevance to the finished product—which has the potential to become unwieldy and a hindrance to the creative process, and could even harm artists who do not have institutional backing. Requiring a young, aspiring artist to provide full and accurate documentation of the AI they have used as a condition for making their music available to audiences could be an insurmountable barrier. Investigating the provenance of each product (hardware and software) for traces of AI is perhaps not something many artists would be able or willing to do.
This is not to say that it is too difficult to inform consumers about the way AI has been involved in the creation of a particular music experience, but certainly, research must be conducted to establish exactly the harms that can result if such information is not provided to consumers and in what contexts. In the case of a person with a life-threatening allergy to a particular food ingredient, one benefit of labeling food with ingredients is clear: the avoidance of death. Such a benefit is not yet clear in the case of identifying the involvement of AI in music creation. Furthermore, inundating consumers with information but expecting them to ascertain its relevance to them may not in the end be conducive to informed choice.
Technology is a double-edged sword, with benefits and detriments that deserve to be critically analysed as it is developed, applied, improved and retired. General-purpose technologies such as AI that have the potential to affect economies and societies require study from broad and diverse perspectives, with considerations of questions that touch on law, economy, ecology, and philosophy (Bostrom and Yudkowsky 2014
; Craglia 2018
; Dusek 2006
; Holzapfel et al. 2018
). Relevant initiatives in ML research are centred on developing AI systems that are not just accurate, but also fair, accountable, transparent and interpretable. Such work moves beyond describing just the engineering of technologies to examining their use and misuse, considering their impact, and how such dimensions should be reflected in engineering praxis. This article aims to contribute to this discussion specifically for AI developed and applied to music.
We have looked at two broad perspectives, both timely and motivated by a particular music AI application. The first is from the perspective of copyright law, which arises from questions about the ownership of material generated by AI systems that may have themselves been trained on copyright protected material. Many questions have been identified, most of which cannot be answered at this time due to a lack of precedent; but what is certainly clear is the disruption AI music generation will cause to legal and societal norms as it improves. The second perspective is engineering praxis, which arises from concerns as to the impact of AI technology on music ecosystems and the ways in which such engineers work. Many questions are raised here as well that cannot be answered, but what is clear is that music AI should be developed and used in consultation with music practitioners (Holzapfel et al. 2018
; Sturm et al. 2018).
The impact of music AI on career trajectories and opportunities presents risks not just to job seekers, but is also likely to have adverse effects on innovation in the music ecosystem. The entry-level jobs in recording studios and small media companies that are likely to be replaced by music AI allow aspiring musicians to acquire skills and networking opportunities for embarking on their career. Adapting education curricula to include more sophisticated skills and ways of working alongside and with the support of music AI tools is only part of the answer, and temporary at that, with further technological development. Developers of the technology and the managers who decide on deploying it should consider the longer term effects of their decisions, and ways of mitigating adverse results. At the same time, the current legal framework may need adjustment to best facilitate creative activity and artistic innovation in such an evolving ecosystem.
It might be illustrative to consider a possible endpoint of music AI development: a “complete” music AI, a kind of “musical holodeck” able to generate any possible music, archiving all recorded music (scores, audio and video recordings, images, and other data about the music and its performers, etc.). It knows and understands what music is and how it works, and can not only retrieve all music fulfilling the requests of any listener, but also create any possible music in an instant. A subscriber of this system can ask to hear Led Zeppelin performing a particular ABBA song. Composer Ben-Tal can hear his piano music performed by a long-dead pianist or the Berlin Philharmonic. The system provides any subscriber with limitless access to individualised musical experiences. Will there still be a need for recording engineers, producers, and performers? Any subscriber can now be a composer of their own soundtrack. There is no need to distribute music because all music is (re)created locally. Would such a system divorce music from musical “works”? A system that responds dynamically to the wishes of a subscriber will not need to be centred on structured stretches of organised sound. Music could become an amorphous soundtrack to one’s life – delivering on Erik Satie’s vision of “Furniture Music”: music that is heard but not listened to; music that forms part of everyday life like the objects people have in their home (Templier 1969).
Such a future clearly requires a careful assessment of copyright rules. If a subscriber hears something they like, they can ask to hear something “like it” and the AI should be able to produce a new instance that meets their demands without outright copying. What should then be the approach to the ownership of AI-generated music? We would have to consider the purpose of awarding copyright. Generally speaking, copyright (and related rights) aims to provide creative people (and investors) with a reward in the form of an exclusive, quasi-monopolistic right. In droit d’auteur countries, copyright also reveals an intimate link with the personality of the author. Returning to our music AI, its musical potential is immense and it is likely that most subscribers will only ever discover a tiny sliver of that potential. It is also likely that some subscribers will have the inclination and the aptitude to be much more inventive in their interaction with the system and produce truly unique results through it. How can we enable such subscribers to dedicate their time and effort towards this and have means of sharing the fruits of their intellectual labour with others? Is there any right the designer of such an AI should retain? What would be the impact of maintaining the status quo, or of passing new rights in the market of human-created and/or AI-generated works? As our survey of copyright law indicates, technological developments challenge established norms. The distant future sketched above suggests that there is a need for fundamental re-thinking in this area.
Of course, the inevitability of such a music AI should be taken with a grain of salt. Human creativity can surprise in its ability to incorporate new technologies. For instance, the turntable was designed to facilitate playback at home but has become a performance tool in ways that were not intended by its developers or manufacturer. It is likely that any music AI tools will open opportunities for creative but unintended uses.