The science fiction magazine Clarkesworld was one of the first to speak out about how badly the sudden rise of artificial intelligence has affected its submissions. After some months to ponder the question, Neil Clarke, the editor in chief and founder of Clarkesworld, has issued a statement that he hopes may guide the positions of publishers, editors and creators alike moving forward.
Here is that statement, unedited:
Neil Clarke | 05/28/2023
I’ve complained that various publishing industry groups have been slow to respond to recent developments in AI, like LLMs. Over the last week, I’ve been tinkering with a series of “belief” statements that other industry folks could sign onto. I’m finally ready to start sharing a draft. Feedback welcome, but if you are here to discourage, move along. Not interested in a fight.
Where We Stand on AI in Publishing
- We believe that AI technologies will likely create significant breakthroughs in a wide range of fields, but that those gains should be earned through the ethical use and acquisition of data.
- We believe that “fair use” exceptions with regards to authors’, artists’, translators’, and narrators’ creative output should not apply to the training of AI technologies, such as LLMs, and that explicit consent to use those works should be required.
- We believe that the increased speed of progress achieved by acquiring AI training data without consent is not an adequate or legitimate excuse to continue employing those practices.
- We believe that AI technologies also have the potential to create significant harm and that to help mitigate some of that damage, the companies producing these tools should be required to provide easily-available, inexpensive (or subsidized), and reliable detection tools.
- We believe that detection and detection-avoidance will be locked in a never-ending struggle similar to that seen in computer virus and anti-virus development, but that it is critically important that detection not continue to be downplayed or treated as a lesser priority than the development of new or improved LLMs.
- We believe that publishers and submission software developers do not have the right to use contracted works in the training of AI technologies without contracts that have clauses explicitly granting those rights.
- We believe that submitting a work for consideration does not entitle a publisher or agent to use it in the training of AI technologies.
- We believe that publishers or agents that utilize AI in the evaluation of submitted works should indicate that in their submission guidelines.
- We believe that publishers or agents should clearly state their position on AI or AI-assisted works in their submission guidelines.
- We believe that publishers should make reasonable efforts to prevent third-party use of contracted works as training data for AI technologies.
- We believe that authors should acknowledge that there are limits to what a publisher can do to prevent the use of their work as training data by a third party that does not respect their right to say “no.”
- We believe that the companies and individuals sourcing data for the training of AI technologies should be transparent about their methods of data acquisition, clearly identify the user-agents of all data scraping tools, and honor robots.txt and other standard bot-blocking technologies.
- We believe that copyright holders should be able to request the removal of their works from databases that have been created from digital or analog sources without their consent.
- We believe that the community should not be disrespectful of people who choose to use AI tools for their own entertainment or productive purposes.
- We believe that individuals using AI tools should be transparent about its involvement in their process when those works are shared, distributed, or made otherwise available to a third party.
- We believe that publishing contracts should include statements regarding all parties’ use, intended use, or decision not to use AI in the development of the work.
- We believe that authors should have the right to decline the use of AI-generated or -assisted cover art or audiobooks on previously contracted works.
- We believe that individuals using AI tools should be respectful of publishers, booksellers, editors, and readers that do not wish to engage with AI-generated or -assisted works.
- We believe that publishers and booksellers should clearly label works generated by or with the assistance of AI tools as a form of transparency to their customers.
- We believe that copyright should not be extended to generated works.
- We believe that governments should craft meaningful legislation that both protects the rights of individuals, promotes the promise of this technology, and specifies consequences for those who seek to abuse it.
- We believe that governments should be seeking advice on this legislation from a considerably wider range of people than just those who profit from this technology.
- We believe that publishers and agents need to recognize that the present state of detection tools for generated art and text is capable of false positives and false negatives and should not be relied upon as the sole source of determination.
Some notes in response to comments:
AI-generated and -assisted are not specifically defined due to the fact that the technology is evolving and any definition would likely create loopholes. Broad terms are meant to encourage transparency and allow individual parties to determine whether or not such uses cross their individual lines. For example, one’s attitude towards AI-assisted research may be different for non-fiction vs. fiction.
In suggesting that AI developers should be required to provide reliable detection tools and that by emphasizing that detection should be a priority, we are stating that the people best equipped to understand and tackle the problem of detection–which can be tackled in several different ways–are the ones developing the AI tools that generate the output. (Some have even suggested that all output should be saved for detection purposes, but we’re not qualified to tell them how to solve the problem.)
It’s a problem of the same scale (or even more complex) as generation. Instead, industry professionals have been quoted saying things like “get used to it” and have placed insufficient effort into addressing the problems they are creating. The pricing concern is to make sure that anyone can access it. It’s going to impact more than just the publishing world.
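As an illustration of the statement's point that data-scraping tools should clearly identify their user-agents and honor robots.txt, here is a minimal sketch of the check a well-behaved crawler could run before fetching a page. It uses Python's standard `urllib.robotparser` module. The robots.txt contents, the `ExampleTrainingBot` user-agent, the `may_fetch` helper and the `magazine.example` URL are all hypothetical names invented for this example, not anything taken from Mr. Clarke's statement or from any real crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a magazine might publish. "ExampleTrainingBot" is an
# invented user-agent name used only for illustration.
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://magazine.example/fiction/some-story"
    print(may_fetch("ExampleTrainingBot", story))  # blocked by the first rule
    print(may_fetch("OrdinaryBrowser", story))     # falls through to the "*" rule
```

A check like this only helps when the crawler chooses to run it and reports its user-agent honestly, which is why the statement asks for transparency as well as compliance.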
We at SCIFI.radio concur with Mr. Clarke’s carefully considered words, with two observations of our own:
- Reliable tools for the detection of AI-written content are likely to remain unavailable for the foreseeable future. Even if such detection could be made reliable, the state of the art is evolving so fast that no tool, human-maintained or otherwise, is ever likely to stay reliable enough to use for very long.
- While it seems instinctively wrong that AIs can be allowed to view and analyze massive amounts of images or information¹ without regard for permission from the original creator(s), there is currently no practical way of licensing or restricting this without also interfering with the rights of human beings to do the same thing. As far as we are aware, there are currently no laws against such data mining anywhere in the world. Any newly invented restrictions must be implemented in a way that lets us clearly and easily distinguish AI-generated work from human work, so that we do not accidentally bar humans from observing, thinking about, and creating new expressions of art, music or science.
Here is a link to the original statement by Mr. Clarke: http://neil-clarke.com/ai-statement/
¹ It should be noted that AI image generators do not store images, only abstract formulae for creating images with user-defined descriptive weighting. They do not copy and paste. Rather, they store predictions of whether successive reductions of a random Gaussian noise field will yield an image that agrees with a user-supplied text description, and they make iterative attempts to resolve the noise into such an image.