Artificial Intelligence or fair use? Is any real 'intelligence' behind AI?

February 14, 2024

Rob Tornoe, digital editor and writer, The Philadelphia Inquirer

It’s easy to get depressed about Washington’s current state of politics. At times, the political divide in our country has never felt starker, with all signs suggesting that democracy itself could be on the ballot in this year’s presidential election.

But something odd is happening on Capitol Hill that is both surprising and reassuring. Sen. Josh Hawley, a Republican from Missouri who famously raised his fist in solidarity with Trump supporters who stormed the Capitol, has been partnering with Connecticut Democratic Sen. Richard Blumenthal on legislation that would help news organizations grapple with quick-moving changes in the tech world.

What brought them together? Artificial intelligence.

Before we get too deep into this column, let’s be clear: there is no real “intelligence” behind AI, at least as far as chatbots and large language models are concerned. Emily Bender, a computational linguist at the University of Washington, has described them as “stochastic parrots,” mimicking words and sentences they’ve been fed without any fundamental understanding of what they mean.

“When we are interacting with these text synthesis machines, if it seems to be making sense, it’s because we’re making sense of it. All the meaning-making is on our side,” Bender told me last year.

Chatbots, such as ChatGPT and Google Bard, have been able to spit out increasingly plausible texts thanks to the massive amount of information they’ve been trained on, which makes them feel coherent and conversational. That’s become a problem for news organizations because these large language models have been feeding in part on content produced by newsrooms for free, without even leaving a tip.

So are AI-powered chatbots covered under fair use laws, or are they stealing copyrighted content at the expense of the newsrooms that paid to produce it?

In December, The New York Times sued OpenAI and Microsoft for copyright infringement, claiming millions of articles were used to train chatbots that are now directly competing as a source of information (and netting billions of dollars in funding).

In an attempt to clarify things, Hawley and Blumenthal have introduced bipartisan legislation to prevent content produced by generative AI from enjoying protection under Section 230 of the Communications Decency Act, which provides immunity to online platforms over third-party content and conversation on their websites. In effect, Section 230 is credited with protecting the internet.

Their legislation would strip “immunity from AI companies in civil claims or criminal prosecutions involving the use or provision of generative AI,” according to a statement from Hawley’s office. Basically, it would hold tech companies accountable for copyright theft and not extend to them the same fair use protections afforded to news organizations for reporting, commentary and criticism.

During last month’s hearing, Blumenthal pointed to the irony that news organizations can be sued for delivering false or defamatory information, while tech companies and social media platforms cannot be sued for providing that same content.

After watching the hearing, I found myself more aligned with the arguments made by News/Media Alliance CEO Danielle Coffey and National Association of Broadcasters President Curtis LeGeyt, who both testified that tech companies should have to license content from news organizations to feed their large language models.

According to LeGeyt, one chatbot was prompted to provide information about the latest news in Parkersburg, West Virginia. A blogger or another newsroom would’ve cited WTAP in their report if they couldn’t confirm the fact independently, but the chatbot generated text that simply copied WTAP’s website nearly word for word. Not only did the station not grant permission to use its content, it wasn’t even made aware of the use.

“I think quite simply, if Congress could clarify that the use of our content and other published content for training and output of AI models is not fair use, then the free market will take care of the rest,” Roger Lynch, the CEO of Condé Nast, told the committee.

The only real contrarian to deliver testimony was Jeff Jarvis, an author, blogger and the director of the Tow-Knight Center for Entrepreneurial Journalism. Jarvis, a longtime defender of an open and free internet, made a compelling point comparing these AI systems to journalists themselves, who read, learn and repurpose information from their competitors to the benefit of their readers.

From Jarvis’ point of view, legislation that walls off tech companies from these rights could become a slippery slope. Take the case of two New York lawyers sanctioned for submitting a legal brief that included false citations generated by ChatGPT. As Jarvis asked, was it ChatGPT’s fault for creating false information, Microsoft’s fault for how it pitched the product, or the lawyers’ fault for presenting false information to the court?

I think Jarvis has a legitimate point. But from my vantage point, there’s a healthy difference between a reporter aggregating something produced by another newsroom, or using a competitor’s work to push forward a story, and an AI-powered chatbot ingesting millions and millions of words produced by a newsroom and repositioning them as something new at a scale human beings can’t replicate.

“The printing press never created anything,” Blumenthal noted. “Ben Franklin ran a printing shop and never told us, ‘It’s not my fault. It’s the printing press that did it.’”

Let’s also not forget about the tendency of chatbots to get things wrong (a “hallucination,” as AI researchers describe it) yet still present their plausibly predicted text as absolute fact, with no editor forcing accountability or corrections.

Back in October, both Google and Microsoft Copilot chatbots falsely claimed there was a ceasefire in Israel. Copilot also confidently gave many incorrect answers to basic questions about elections in Germany and Switzerland, according to a study by the European nonprofits AI Forensics and AlgorithmWatch shared with The Washington Post.

According to the Times, even during Google’s introduction of Bard last year, the chatbot made the untrue statement that the James Webb Space Telescope had captured the first photo of a planet outside our solar system.

The good news for both sides is that it seems relatively easy for tech companies to pay and license the work of news organizations, both new and old. Similar arrangements have been made in other countries. As Coffey pointed out during her testimony, news organizations have spent a lot of money to digitize archives that date back more than a hundred years.

“I would encourage these licensing agreements and arrangements between AI developers and news publishers so that we can avoid protracted litigation,” Coffey said. “That’s not good for either industry.”

“These technologies should be licensing our content. If they are not, Congress should act,” LeGeyt said. “But under current law, they should be doing it.”

None of this means newsrooms shouldn’t embrace AI. News companies worldwide have already found massive benefits from these tools — from transcribing interviews to scouring court records and summarizing lengthy meetings.

LeGeyt cited some examples during his testimony to Congress. One broadcaster is piloting a tool that uses AI to comb through news tips emailed to the station and sent on social media to produce recommendations reporters can verify and turn into stories. Several others use AI tools to translate their work into other languages, allowing their journalism to reach a more diverse audience.

“America’s broadcasters are extremely proud of our role in serving your constituents, and we are eager to embrace AI when it can be harnessed to enhance that critical role,” LeGeyt testified. “However, as we have seen in the cautionary tale of big tech, exploitation of new technologies can undermine local news.”

Rob Tornoe is a cartoonist and columnist for Editor and Publisher, where he writes about trends in digital media. He is also a digital editor and writer for The Philadelphia Inquirer. Reach him at robtornoe@gmail.com.

Courtesy of: https://www.editorandpublisher.com/stories/artificial-intelligence-or-fair-use-is-any-real-intelligence-behind-ai,247853