We Need a ChatGPT Tax to Compensate Society for AI’s Blunders

It seems the owners of generative artificial intelligence models have been looking to improve them with an eye toward asking for forgiveness rather than permission. Late last month, two lawsuits were filed against OpenAI: one over scraped private data and another over copyrighted books.

If creators of books and other media can’t be readily identified in such lawsuits, they won’t be able to receive compensation. Only a tax on AI can recapture some of the value generated by unidentifiable creators and redistribute it toward projects furthering social good.

AI Generates Externalities

Ameliorating externalities is a major goal of tax policy; we’ve spoken about the concept before in the context of a proposed gambling tax. Simply put, externalities are costs incidental to a transaction that aren’t considered in the exchange between the transaction’s parties.

For example, let’s say I negotiate with you to build a shed on your property. You and I discuss the total cost to clear the site and build the new structure, and we settle on a price. You pay me in advance, I build the shed, and we’re both happy.

The externality arises if we neglect to consider that I sourced the lumber from the local park, where I hacked down trees and operated my sawmill to make the lumber for the project. No one in town even remembers who planted those trees.

If everything remains as is, I’ve externalized the cost of sourcing lumber to the town, and a portion of the profit I made on the project isn’t mine to keep. My contributed value is in the craftsmanship in the shed—the labor—but I’ve pocketed the value derived from the materials as well.

In the case of a large language model such as GPT-4, the algorithm is the labor, but the data it has ingested to “learn” about language is the lumber. That data is critical to the project, and it was created and is owned by unnamed and uncredited masses.

OpenAI banners fly near an installation depicting the legendary “Trojan horse” built entirely out of microelectronic circuit boards and other computer components, at the campus of Tel Aviv University in Tel Aviv on June 5, 2023.

Photographer: Jack Guez/AFP via Getty Images

Bloggers and Bigwigs as Equals

Consider a hypothetical AI that’s been exposed to a swath of copyrighted novels. The remuneration owed to creators shouldn’t be keyed to the market value of those works, because the model’s use of them isn’t, per se, reducing their salability in the marketplace. The works are intangible property and aren’t being reproduced in their entirety by the model.

Remunerating society for the use of these works is the wheelhouse of tax policy, not individual infringement suits or the copyright regime writ large. For the novels with alleged copyright infringements, the case isn’t being made that the harm is caused by ChatGPT reproducing the works without their authors being compensated.

It’s something new: The creative works are being stripped for parts and used to inform the AI’s model. This amounts to a use we haven’t previously had to contemplate, which may remove the issue from the realm of intellectual property law entirely.

Even if that isn’t the case, pursuing ad hoc lawsuits, or even a class action lawsuit, would allow OpenAI or owners of similar models to pay one time and gain access to all the world’s knowledge. It incentivizes them to move quickly and infringe things to maximize the value they can gain in exchange for whatever the settlement would be. For individual creators, the compensation would be de minimis, and the administrative costs would be high.

A Nimble ‘Data Crawl’ Tax Policy

Taxing AI parameters is a starting point, but building an inflexible tax paradigm around current AI technology is folly. The mature technology almost certainly will bear little resemblance to the proliferation of chatbots we see before us. Tax policy will need to remain nimble and prepared to pursue just compensation for the public wherever the model data crawlers go next.

As with a more general data tax, determining the percentage of the value to be taxed for crawled public data will be the hardest part. One possibility is to tax models according to their parameters—which can be thought of as bits of knowledge derived by the model from data it has been shown.

Put differently, it’s a measurement of what an AI has “learned” from what it has read so far, or how big its “brain” is. OpenAI’s GPT-4 is thought to have somewhere in the vicinity of 170 trillion parameters. In considering a parameter tax, we’d be talking about millionths or billionths of a fraction of a penny per parameter if we aren’t interested in running OpenAI and competing model creators out of business.
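To get a feel for the scale involved, here is a rough back-of-the-envelope sketch in Python. It takes the 170 trillion parameter figure quoted above at face value (the true count isn’t public) and applies two purely hypothetical flat rates, one millionth and one billionth of a penny per parameter. The function name, the rates, and the idea of a single flat charge are illustrative assumptions, not a worked-out proposal; whether such a tax would be levied once or annually is left open.

```python
from fractions import Fraction

# Parameter count as quoted in the column; the real figure is not public.
GPT4_PARAMETERS = 170 * 10**12


def parameter_tax_dollars(parameters: int, pennies_per_parameter: Fraction) -> Fraction:
    """Return the tax owed in dollars for a flat per-parameter rate given in pennies."""
    return parameters * pennies_per_parameter / 100


# Hypothetical rates, chosen only to illustrate the order of magnitude.
RATES = {
    "one millionth of a penny": Fraction(1, 10**6),
    "one billionth of a penny": Fraction(1, 10**9),
}

for label, rate in RATES.items():
    owed = parameter_tax_dollars(GPT4_PARAMETERS, rate)
    print(f"{label} per parameter: ${float(owed):,.2f}")
```

Under these assumptions, the bill works out to $1.7 million at a millionth of a penny per parameter and $1,700 at a billionth, which shows how sensitive the revenue is to the rate chosen.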

A counterargument sure to be leveled is that the best way to tax creators of AI models is through rigorous application of existing tax frameworks, such as the corporate income tax. But that doesn’t quite sit right. Not every dollar of corporate profit in, for example, a food canning business is as clearly derived from the work of the collective public as are the profits derived from AI. New frameworks must be imagined, and new methods of compensating the public invented, that keep pace with this new technological frontier.

Look for Leahey’s column on Bloomberg Tax, and follow him on Mastodon at @andrew@esq.social
