
1 July 2018

AI – 5 of 6 Insights

Machine learning and creativity

Authors

Louise Popple

Senior Counsel – Knowledge

Adam Rendle

Partner

Using machine learning, computers now have the ability to 'learn' without being explicitly programmed with any task-specific rules. As a result, AI is already writing news articles, poems and books, creating paintings and artistic works, producing video games, and composing music. The Associated Press uses machine learning (so-called 'robojournalism') to report on 10,000 minor league baseball games and on a wider range of public companies than had previously been possible.

Google announced last year that it was providing funding to the Press Association for an AI project aimed at producing 30,000 local news stories per month (see our article on Robojournalism – AI and the Media). Similarly, Google has taught its AI to write poetry, to predict the next sentence in a book and to hold a conversation. Back in 2012, a team at the University of Malaga taught its software, Iamus, to compose an orchestral piece, which was performed by the London Symphony Orchestra at an event marking the 100th anniversary of Alan Turing's birth. And in 2016, J Walter Thompson Amsterdam taught a computer to paint like Rembrandt by having it study his works. The resulting artwork was, according to experts, completely original yet indistinguishable in style from a genuine Rembrandt. But it has not all been plain sailing. Only last year, Facebook took the decision to shut down its AI chatbots after they appeared to start communicating with one another in their own language.

Interest in using machine learning in the creative industries is only likely to increase, given the demand for fast, smart and original works produced without the need for human endeavour and expense. However, the use of machine learning gives rise to a number of legal issues relating to copyright, defamation, privacy and data protection. Particular copyright issues where machine learning is used for creative tasks include the risk of infringing third party copyright during the machine learning process, and the subsistence and ownership of copyright in works produced by machine learning.

How does machine learning work?

Machine learning can work on the basis that the software 'learns' how to undertake a particular task by considering examples. For example, it might learn how to recognise pictures of cars by being exposed to examples of pictures that have been labelled as containing a car or not containing a car. Crucially, it would not have been programmed with any prior knowledge of cars such as the presence of four wheels, a bonnet, doors, a boot and the like. In a more complex scenario, the examples might be creative works such as books, poetry, pieces of music or paintings.
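By way of illustration only, the short sketch below (in Python, using the scikit-learn library and entirely hypothetical feature data rather than any real system or dataset) shows the basic pattern described above: the software is given labelled examples and infers its own decision rule, without being programmed with any prior knowledge of what a car looks like.

```python
# Illustrative sketch only - hypothetical data, not any specific AI system.
# The model is shown labelled examples ("car" / "not car") and learns its own
# decision rule; no car-specific rules (wheels, doors, etc.) are programmed in.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Pretend each example image has already been reduced to 20 numeric features.
car_examples = rng.normal(loc=1.0, scale=1.0, size=(100, 20))       # labelled "car"
not_car_examples = rng.normal(loc=-1.0, scale=1.0, size=(100, 20))  # labelled "not car"

X = np.vstack([car_examples, not_car_examples])
y = np.array([1] * 100 + [0] * 100)  # 1 = car, 0 = not car

model = LogisticRegression().fit(X, y)  # the 'learning' step

new_image_features = rng.normal(size=(1, 20))
print(model.predict(new_image_features))  # the model applies its learned rule
```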

In most cases, the example works are fed, in digital form, into the AI. Normally, there must be a human who undertakes this task, or (less frequently) a human who instructs the AI on which works to collect. The example works – or parts of them at least – are therefore copied during the machine learning process, albeit in digital form. However, it is also possible for the AI to 'crawl over' example works (if they are freely available online) and analyse them without copying them.

Once the example works have been collected or identified, the AI analyses them, 'learning' how they are produced and constructed. Increasingly, this learning process uses complex artificial neural networks (ANNs). The AI then uses the results of this learning process (whether that be complex statistical models or something else) to produce new - and hopefully original - works.
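Again purely by way of illustration (and deliberately far simpler than the artificial neural networks used in practice), the sketch below builds a small statistical model from some hypothetical example texts and then generates new output from that model alone, rather than from the texts themselves.

```python
# Illustrative sketch only: a tiny word-level statistical (Markov) model,
# a stand-in for the far more complex models / ANNs used in real systems.
import random
from collections import defaultdict

example_works = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]  # hypothetical example works fed into the system

# 'Learning': record which word tends to follow which in the examples.
model = defaultdict(list)
for work in example_works:
    words = work.split()
    for current_word, next_word in zip(words, words[1:]):
        model[current_word].append(next_word)

# 'Creating': generate a new sequence from the statistical model alone.
word = "the"
output = [word]
for _ in range(6):
    if word not in model:
        break
    word = random.choice(model[word])
    output.append(word)

print(" ".join(output))
```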

Can machine learning infringe third party copyright?

There are at least three possible stages in the machine learning process at which an infringing copy of an original work may be made.

The first is at the outset, when the work is ingested. A permanent or semi-permanent copy may result from the ingestion. It should not matter that the example works are not normally stored in their original forms (e.g. as recognisable paintings) but in digital form (i.e. some form of electronic code) – copying (or reproduction) includes storing the work in any medium by electronic means. There may, however, be arguments that what has been stored does not actually represent the protected expression of the work, such that there is no substantial reproduction. It is also not necessary for the whole of the example work to have been copied: copying a substantial part (measured qualitatively, not quantitatively) is sufficient for infringement.

In view of this, it is easy to envisage a number of disputes in the coming years about whether copyright subsists in third party works, whether a 'substantial' part has been taken and whether there has been reproduction in the sense required. This is particularly so where the material sampled is data (such as financial indices) or short works (such as the titles of books) rather than more traditional, full-length creative works, since there is more scope to argue that copyright does not subsist or that a substantial part has not been taken. Likewise, we can expect much discussion, and potentially also disputes, about whether there is and should be infringement where the AI 'crawls over' the example works without actually copying them. Under the existing law, this would not infringe the reproduction right. However, whether the law will change in time to capture such acts remains to be seen.

Following copying for the purpose of storage, the next stage is when further copies of the example works are made during the machine learning process itself. Where that is the case, it would have to be established whether the temporary copies defence applies, which will depend, in part, on whether the original works were themselves lawfully introduced into the AI. Commercial users are unlikely to benefit from the UK's existing exception for data analysis for non-commercial research or from the EU's proposed text and data mining exception, which forms part of the proposed copyright directive under the digital single market agenda.

The final stage is when the results of the machine learning are created. It seems unlikely that the results would be infringing where they did not contain any of the original works themselves but, instead, constituted only statistical models.

If the results which the AI then uses to create a new work do not contain a substantial part of the source works, it seems improbable, although not impossible, that the new works would infringe copyright. Where they are created by reference to the statistical model alone (and nothing in the model constitutes a substantial part of any source work), or are created using ANNs, it seems unlikely that there could have been any copying.

Another way in which copyright might be infringed during the machine learning process is where there are technologies restricting the use of the example works in question, such as Digital Rights Management. Removing such technologies (e.g. to facilitate access, copying or storage) would also constitute an infringement (being an unauthorised circumvention of technological measures).

In view of the above, it may be necessary to obtain permission from, or collaborate with, the owner of the example works. This assumes that public domain works for which copyright has expired cannot be used. Obtaining permission to use such works in this way could be a difficult process. Those supplying the works may require significant remuneration and/or part ownership of (or at least control over the use of) any resulting works. As a result, we are likely to see a number of interesting collaborations – as well as disputes - in the coming years.

Does copyright subsist in AI generated works?

The amount of money invested in AI software capable of performing creative tasks will partly depend on the extent to which the output is protected by IP rights, such as copyright. Without any IP or technological protection measures (such as pay-walls), potential competitors and others could copy and reproduce the results almost instantly without having made any investment.

For a literary, musical or artistic work to be protectable by copyright under the UK Copyright, Designs and Patents Act 1988 (CDPA), it must be original. The old position under UK law (before the influence of EU case law) was that a work was "original" if it "originated" from the creator and the creator had expended more than negligible or trivial skill, labour or judgment in creating it. Under this test, it is more arguable that AI-produced creative works resulting from human skill or effort (e.g. in setting up the AI and other aspects of the production process) would be original. Moreover, UK copyright law recognises 'computer-generated works', defined as works "generated by computer in circumstances such that there is no human author of the work" (s. 178 CDPA).

However, the old UK test for originality no longer applies, at least for works within the scope of the Copyright Directive. Under EU case law, a work is only original if it is the author's "own intellectual creation", including reflecting the author's personality, free and creative choices, and personal touch. While AI can be programmed to make individual creative choices, it is not clear whether this is enough or whether human input is also required. To date, the CJEU has not had to answer whether human creative input is required to satisfy the "own intellectual creation" test. Likewise, neither it nor the UK courts have had to assess the relationship between the EU test for originality and the UK copyright law references to computer-generated works. While the position is not clear-cut, most commentators are of the view that some form of human input is required to satisfy the EU originality test.

This is not just an EU issue, however. The US Copyright Office stated in 2014 that works created by non-humans are not copyrightable, following a dispute about the status of 'selfie' photographs taken by a macaque monkey using equipment belonging to the British nature photographer David Slater.

Certainly, it will be difficult to argue that AI-produced creative works are original under the EU test in circumstances where there is little or no human involvement in the creation process. Indeed, even where a human instructs the AI to produce the work, perhaps even setting certain parameters such as artistic styles, it seems unlikely that the threshold for creativity would be met under existing EU law.

However, where a human is involved after the AI has produced the work (for example, in refining the work) or there is a genuine collaboration, then copyright should subsist in those parts provided that the human exercised their own intellectual creativity. Likewise, copyright might subsist in other elements such as the sound recording (for musical works) which will offer some form of protection.

While it seems that human creativity is required to satisfy the EU originality test, it is possible to argue that computer-generated works are outside the scope of the Copyright Directive and therefore have their own "sui generis" protection in the UK under the CDPA. It could be necessary for there to be a CJEU reference to decide this point before Brexit. As and when the UK leaves the EU, it is possible that the UK courts will interpret existing UK law in a way that establishes a new sui generis right or even brings back the old UK test for originality (although the latter seems unlikely). Alternatively, it may be necessary to legislate for a new sui generis right that expressly protects works created by AI. Whether any of this happens remains to be seen. In the meantime, there is likely to be considerable debate about whether such works should attract copyright protection and the potential impact this would have on human - or traditional - creativity.

Who owns copyright in AI generated works?

Where copyright is, in fact, found to subsist in an AI-produced work, who owns it? For computer-generated works, the author (and therefore first owner) is expressly defined under English law as "the person by whom the arrangements necessary for the creation of the work are undertaken" (s. 9(3) CDPA). In the context of the sort of creative processes discussed above, this could be the person who created the machine learning software, who selects and/or feeds the example works into the AI, who provides the instructions to the AI or who co-ordinates the whole process. This is equivalent to the test used in relation to the authorship of sound recordings.

It could be argued that the arrangements are made by the programmer of the software, since they will have invested considerable skill and time in creating a program capable of learning. However, it does not seem right that such a person should own everything resulting from the use of the AI, particularly when the AI is constantly learning and refining itself and separate creative input may have been necessary to create a new work. Equally, it does not seem right that those who merely operate or implement the AI should own the copyright, since their contribution is likely to be relatively insignificant compared to that of the programmer or the person who invested in bringing about the programmer's work. The position is likely to be fact-sensitive and there could even be scope for a finding of joint authorship. It is more likely that the person(s) higher up the chain of decision-making and investment in making the necessary arrangements will be found to be the author.

In view of the uncertainty, it is sensible for those involved to clarify contractually who owns the copyright (if any) in any works produced by AI software.

Future trends

Going forward, we will only see increased use of AI for creative tasks. This is particularly likely in areas such as news reporting, especially the type of news that involves analysing vast amounts of data, such as financial, weather and sports data. Here, the value of the resulting article is short-lived and can potentially be protected by pay-walls and the provision of a service, so any loopholes in the law might not be critical. However, there is still some way to go on a technological level before AI produces creative content of real merit, such as opinion pieces, poetry and literature.

If you have any questions on this article please contact us.
