Podcast: Download (Duration: 15:21 — 13.1MB)
Should copyright be attributed to original literary and artistic works autonomously generated by AI? How will creators of original material be compensated when their works are used to train natural language generation models?
Intellectual property reform in the age of AI is inevitable, and we need our voices to be heard.
In this solo show:
- How copyright in the age of AI is being discussed amongst governments and global organizations
- Why copyright needs to be international
- How Blockchain technology could be used to facilitate intellectual property management
- Why we must allow the licensing of works in copyright for machine learning in order to combat bias and prejudice
- How licensing could work in a new model
This is the second in my AI episodes based on my new book, Artificial Intelligence, Blockchain, and Virtual Worlds: The Impact of Converging Technologies On Authors and the Publishing Industry.
Check out previous episodes and resources on AI at www.TheCreativePenn.com/future
***
In January 2020, a Chinese court ruled that an AI-written article would be protected by copyright. “The article’s articulation and expression had a ‘certain originality’ and met the legal requirements to be classed as a written work — thus it qualified for copyright protection.” [Venture Beat]
At this stage, there are more questions than answers in the realm of AI and copyright law. During 2020, I’ve worked with Orna Ross, founder of the Alliance of Independent Authors, to prepare submissions for the World Intellectual Property Organization (WIPO) and for the UK Government on artificial intelligence and copyright.
Some questions under discussion include:
• Should copyright be attributed to original literary and artistic works that are autonomously generated by AI or should a human creator be required?
• Should the use of the data subsisting in copyright works without authorization for machine learning constitute an infringement of copyright?
• If the use of the data subsisting in copyright works without authorization for machine learning is considered to constitute an infringement of copyright, what would be the impact on the development of AI and on the free flow of data to improve innovation in AI?
As you can imagine, we’ve had a challenge wrapping our creative brains around this area, but it is critically important for authors and the publishing industry. Intellectual property reform in the age of AI is inevitable, and we need our voices to be heard.
Most author organizations and publishers have shown little interest in submissions like this, but assuming it will all work out for the best is not enough.
If creative voices are not in the room when these issues are discussed, then stronger voices will dominate — and these may not benefit creators and rights-holders. Wherever you are in the world, consider investigating what your government is doing around AI and intellectual property, and get involved.
This whole domain is in flux and I am not a copyright lawyer, so the following are merely ideas for consideration and further discussion.
Copyright needs to be international because the digital world is global
Different jurisdictions make no sense for copyright law since technology spans borders and we publish globally with a few clicks.
This is easy to write but incredibly difficult to manage. The overhaul of copyright for the age of AI is likely to be the moment it happens, because AI is the first truly pervasive global shift in technology.
Blockchain technology should be used for intellectual property management, from creation to licensing, through distribution, payment, and estate management
Many of these technical subjects take a while to get used to, but when the penny drops, the clouds part and you can truly see the potential. I now believe that Blockchain is possibly the most transformational aspect of converging technologies for authors.
Orna Ross, author and founder of the Alliance of Independent Authors, published an article and a positioning paper outlining the potential of Blockchain for Authors back in 2017.
She noted that, “Blockchain not only seamlessly allows direct payments from reader to author but allows income from sales to be effortlessly split at the point of transaction between the author and anyone else involved in the making of the book, including services and booksellers.
Blockchain could allow, therefore, for the evolution of an author-centric business model where creative and commercial value is automatically recognised, registered and compensated and where the author’s smart wallet becomes the first point of payment for everyone, and the financial and informational node for their work. Author smart wallets would become the economic expression of creator’s copyright.”
At the Frankfurt Book Fair in 2019, the co-founders of Content Blockchain presented “a decentralized, global, digital infrastructure for the creative community to discover, register, navigate, offer, sell and license digital media content and otherwise exchange value over the network.” Their International Standard Content Code (ISCC) is comparable to traditional identifiers like the ISBN.
The same session at Frankfurt featured Bookchain.ca, “an online platform built on blockchain, that allows authors and publishers to configure the security, traceability, attribution, and distribution settings (including lending and reselling) of their ebooks and sell them through our catalogue, while protecting and securing the ebooks against theft and piracy.”
Blockchain technology could be used in a truly global, digital, scalable way and transform how intellectual property works. It is perhaps the only way of tracking IP and compensating creators in the fast-moving age of AI.
Creative works in copyright must be used in machine learning models to prevent bias and increase diversity in Natural Language Generation
In the wake of the George Floyd murder and the race activism of mid-2020, many in publishing emphasized the need for diversity in the industry, both in the type of books published and in the people employed who act as gatekeepers and curators.
Changes are ongoing, but with the advance of technology, this diversity also needs to be reflected in AI machine learning models. Currently, only works out of copyright are used to train algorithms (at least officially) and these tend to be predominantly written by dead, white, male, wealthy, Christian, western, and English-language authors. There is nothing wrong with this literature, or this group of people, but they do not represent the diversity of the world and they are out of date in terms of writing style. Models trained on outdated, biased data will generate outdated, biased writing that will perpetuate existing prejudice.
By withholding works in copyright, the industry could prevent the “development of AI and the free flow of data needed to improve innovation,” to borrow the wording of the UK Government’s request for submissions on the topic. It is clearly the priority of governments and the tech industry to advance innovation, and we need our work to be part of AI development so we can shape this possible future. If we don’t actively participate, it could be forced upon us by changes in copyright law.
But if we allow such models to use our work, how will authors and rights-holders be compensated?
Creative works in copyright should be licensed for machine learning, and the original creators compensated with either a one-off license fee or a micro-payment facilitated by Blockchain smart contracts
“It’s rarely obvious what our data can do, or, when fed into a clever algorithm, just how valuable it can be. Nor, in turn, how cheaply we were bought.” Hannah Fry, Hello World: How Algorithms Will Define Our Future and Why We Should Learn To Live With It
There seems to be no existing license for use of books within copyright for training Natural Language Generation algorithms, but this could be a lucrative way to monetize IP for creators and rights-holders. We certainly need to value our data more than we do currently.
For example, I might compile all my J.F. Penn novels and short stories into a dataset, JFPennData2020.
A company with an AI Natural Language Generation model could license my dataset for machine learning for a one-off payment, or, preferably, an agreed micro-payment from downstream products. This could be based on the number of words used in the dataset. If my words contributed 1% of the model’s training data, I would receive a 1% micro-payment from the books or products generated downstream, tracked by Blockchain technology and automatically distributed.
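As a rough illustration of that arithmetic, here is a minimal Python sketch. It assumes the licensee sets aside a pool of money from downstream sales and pays each contributor in proportion to their share of the training words. The function names, the pool, and the figures are all hypothetical, and this is ordinary off-chain code rather than a smart contract.

```python
from fractions import Fraction


def word_share(author_words: int, total_words: int) -> Fraction:
    """Return the fraction of the training dataset contributed by one author."""
    if total_words <= 0 or not 0 <= author_words <= total_words:
        raise ValueError("need total_words > 0 and 0 <= author_words <= total_words")
    return Fraction(author_words, total_words)


def micro_payment_cents(author_words: int, total_words: int, pool_cents: int) -> int:
    """Return the author's cut of a licensing pool, rounded down to whole cents.

    pool_cents is a hypothetical pool of licensing money from downstream
    products; how such a pool would be defined is an open question.
    """
    return int(word_share(author_words, total_words) * pool_cents)


# Hypothetical figures: 2 million words in a 200-million-word training set,
# and a licensing pool of 1,000,000 cents.
share = word_share(2_000_000, 200_000_000)
payment = micro_payment_cents(2_000_000, 200_000_000, 1_000_000)
print(f"share: {float(share):.2%}, payment: {payment} cents")
```

Working in whole cents with exact fractions avoids floating-point rounding, which matters when many small payments have to add up.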
Or, because I can’t generate enough words by myself, I could join together with a group of authors who own and control their IP to create a bigger dataset, for example, IndieThrillerWriters2020.
We might even use that to create synthetic data, which would increase our joint licensing potential with more datasets. We would develop a smart contract between us that controls the percentage paid to each author based on words contributed, and we would license that work to machine learning models. The micro-payments would be split automatically, and the process would be transparent to all. Traditional publishers and other rights-holders could use a similar model, increasing the ongoing value of intellectual property assets.
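The split between authors in a group dataset could follow the same word-count logic. The sketch below is one possible way to divide a single incoming payment so that every cent is accounted for. It assumes a largest-remainder rule for the leftover cents; the author names and word counts are invented for illustration. A real smart contract would run on a blockchain platform, whereas this only shows the calculation.

```python
def split_payment(words_by_author: dict[str, int], payment_cents: int) -> dict[str, int]:
    """Split a payment between authors in proportion to words contributed.

    Each author first gets their share rounded down to whole cents. Any
    leftover cents go, one each, to the authors with the largest remainders,
    so the payouts always add up to payment_cents exactly.
    """
    if payment_cents < 0 or any(words < 0 for words in words_by_author.values()):
        raise ValueError("word counts and payment must not be negative")
    total_words = sum(words_by_author.values())
    if total_words <= 0:
        raise ValueError("the dataset must contain at least one word")

    payouts: dict[str, int] = {}
    remainders: list[tuple[int, str]] = []
    for author, words in words_by_author.items():
        cents, remainder = divmod(words * payment_cents, total_words)
        payouts[author] = cents
        remainders.append((remainder, author))

    leftover = payment_cents - sum(payouts.values())
    # Largest remainder first; ties are broken by name so the result is repeatable.
    for _, author in sorted(remainders, key=lambda r: (-r[0], r[1]))[:leftover]:
        payouts[author] += 1
    return payouts


# Hypothetical group dataset and a payment of 1,001 cents.
group = {"Author A": 500_000, "Author B": 300_000, "Author C": 200_000}
print(split_payment(group, 1001))
```

A deterministic rule for the leftover cents is what would make the process transparent to all: any author could rerun the calculation and get the same answer.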
This kind of process would ensure that copyright does not prevent the development of artificial intelligence, that machine learning models are trained on diverse works, and original creators can continue to be rewarded throughout the publishing supply chain from original products to all kinds of possible derivative works. It could significantly expand the earning potential for creators and rights-holders.
The use of smart contracts in licensing could also transform estate management. If you include heirs and successors in a smart contract, those micro-payments would be redirected to them on the death of the author, cutting out manual administration. Smart contracts could also be used while the author is alive to distribute payments to family members and/or charities, and to reward the ecosystem of editors, cover designers and other professionals involved in creating a finished asset, encouraging further collaboration within the industry.
***
These possibilities represent an entirely new business model for authors and the publishing industry, and of course, there is a long way to go before Blockchain technologies emerge into the mainstream.
But as computer scientist Alan Kay said, “The best way to predict the future is to invent it.” Creating and licensing copyright is the basis of our industry. We need to take an active role to invent its future.
***
I hope you’re inspired by the possibilities of the next decade. Please do leave your comments or questions here on the episode. If you work in these areas and would like to connect, or if you have new ideas that might empower authors and propel our industry forward, please reach out.
I’ll also be speaking professionally and consulting in this area, so if you have business opportunities, I’d love to talk. Contact me here.
You can now get the ebook on all the usual stores, and it covers a lot more ground. Print and audiobook editions are coming soon. I’ll be back with another solo show covering copyright law, blockchain for smart contracts, and micro-payments. Happy writing, and I’ll see you next time.
Pam Harvey says
I’m finding this series of mini-podcasts to be much more interesting than I thought they would be. I now feel…enthralled. I’ll keep my eye/ear out for more from you on this topic – thanks for getting me in!
Joanna Penn says
Thanks, Pam! I’m glad they have piqued your curiosity!
Atulya K Bingham says
Ah this subject is really important. Feeling very grateful you’ve spent so much time researching and envisioning for the future of human creative potential. I also like the short podcast format. Very digestible.
Vivienne says
Gosh! All these new things to think about with AI. Thanks for enlightening us all.
Lisa Machin says
I know I’m late to the party but it’s absolutely fascinating. I guess things have moved on a lot since you wrote this.
Joanna Penn says
Interestingly, no. There are lots of court cases, but nothing on whether training is fair use or not. Blockchain is still being touted as a potential way to do on-chain registration of copyrighted material, so it’s still in flux!