A model for journalistic copypasta

Updated May 2023

Comments at substack, hacker news.

Here’s a thing that happens sometimes:

Somebody thinks that A causes B.
They gather data and find that A and B are correlated.
They write a paper. In it, they’re careful to only say that A is “associated” with B, and that the data is imperfect, and that they only controlled for certain confounders, and further research is needed, etc.
When the paper is published, the journal puts out a press release. It omits most of those caveats and has a quote from the researcher (now unencumbered by peer review) saying, “B is bad! So this shows the dangers of A!”
The press release goes to a fancy media outlet where an underpaid and harried writer has no time or training to agonize over technicalities. They reassemble the press release into a simulacrum of journalism.
The piece goes to a copy editor who chooses a headline of PEER REVIEWED RESEARCH SHOWS THAT A CAUSES B.
The piece is published. Thousands of people read it. Eventually, the fact that A causes B appears on Wikipedia, using the piece as a citation.
The researcher’s colleagues roll their eyes, but don’t want to make enemies in their tiny gossipy community. Independent writers complain, but they have small audiences, and aren’t considered a reliable source on Wikipedia.

Erik Hoel recently bemoaned that high-prestige outlets like the Guardian somehow get away with this kind of “copypasta”, as well as dark patterns like deliberately making it hard to find the original scientific paper. He also suggested that if independent authors tried any of this nonsense, they’d be crucified, so maybe everyone should follow more independent authors.

I mostly agree, but I’m pessimistic. We should think more about why this happens. Because, unfortunately, I’d bet against any huge changes coming soon.

Why? Trust. People trust places like the Guardian despite their mediocrity, and, I claim, this is not a mistake. The world is full of outright lies, and it’s hard to know who you can trust. The Guardian provides a consistent bare-minimum level of quality across a wide range of topics, which is not easy. So I suspect the market is reasonably efficient: The fancy places provide real value, even when serving up journalistic mush, and readers are making reasonable decisions when they pay attention to them.

Unless the distribution of writing changes somehow to upset that dynamic, I suspect the copypasta will continue.

1.

I’d even go a little further: For complicated subjects, prestigious places are rarely the best sources. To pick an example close to my heart, take air purifiers. There are lots on the market, but to know if they truly work, we need to test them.

The most well-known name in the air purifier game is the Wirecutter. Here is their testing regime:

Generate some smoke in a room.
Measure particles.
Run a purifier for 30 minutes.
Measure particles again.

This isn’t terrible, but it’s not very good. You’re only taking two measurements, so any noise will screw up the final ratio. This leads to them doing things like publishing two different tests of the same air purifier that imply clean air delivery rates that vary by a factor of 2.4.

Meanwhile, Internet People like Smart Air or Jeff Kaufman do tests that show the entire trace of particles over time. Here’s one from me:

[Figure: box fan test]

This averages many measurements and so is much more reliable.
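
To see why the full trace helps, here is a toy simulation in Python. It is only a sketch under assumptions that are mine, not the Wirecutter's: the room is well mixed, so particles decay exponentially at a rate of CADR divided by room volume; each reading carries about 10% multiplicative noise; and the room size, purifier strength, and function names are all made up for illustration. It compares a CADR estimate from just the first and last reading against a least-squares line fit through the log of every reading (one simple way of using many measurements, not necessarily how any of the tests above were analyzed).

```python
import math
import random
import statistics

# All constants below are hypothetical, chosen only for illustration.
ROOM_VOLUME_M3 = 30.0        # assumed room size
TRUE_CADR_M3_PER_H = 200.0   # assumed "true" clean air delivery rate
TEST_MINUTES = 30            # length of the test
NOISE_SD = 0.10              # assumed ~10% multiplicative sensor noise


def simulate_readings(rng, minutes):
    """Noisy particle readings, assuming a well-mixed room (exponential decay)."""
    rate_per_min = TRUE_CADR_M3_PER_H / ROOM_VOLUME_M3 / 60.0
    return [
        1000.0 * math.exp(-rate_per_min * t) * math.exp(rng.gauss(0.0, NOISE_SD))
        for t in minutes
    ]


def cadr_two_point(minutes, readings):
    """Estimate CADR from only the first and last reading."""
    elapsed = minutes[-1] - minutes[0]
    rate_per_min = math.log(readings[0] / readings[-1]) / elapsed
    return rate_per_min * 60.0 * ROOM_VOLUME_M3


def cadr_full_trace(minutes, readings):
    """Estimate CADR from a least-squares line through log(reading) vs. time."""
    logs = [math.log(c) for c in readings]
    t_mean = statistics.fmean(minutes)
    y_mean = statistics.fmean(logs)
    covariance = sum((t - t_mean) * (y - y_mean) for t, y in zip(minutes, logs))
    variance = sum((t - t_mean) ** 2 for t in minutes)
    slope = covariance / variance  # negative: concentration is falling
    return -slope * 60.0 * ROOM_VOLUME_M3


def run_trials(n_trials=500, seed=0):
    """Repeat the simulated test many times and collect both estimates."""
    rng = random.Random(seed)
    minutes = list(range(TEST_MINUTES + 1))  # one reading per minute
    two_point, full_trace = [], []
    for _ in range(n_trials):
        readings = simulate_readings(rng, minutes)
        two_point.append(cadr_two_point(minutes, readings))
        full_trace.append(cadr_full_trace(minutes, readings))
    return two_point, full_trace


if __name__ == "__main__":
    two_point, full_trace = run_trials()
    print("two-point spread: ", round(statistics.stdev(two_point), 1), "m^3/h")
    print("full-trace spread:", round(statistics.stdev(full_trace), 1), "m^3/h")
```

Under these assumptions, both estimates land near the true value on average, but the full-trace estimate should scatter noticeably less from run to run. Real sensors and real rooms are messier than this, so treat the exact numbers as meaningless.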

Now, there’s no special brilliance needed to come up with this kind of test. Frankly, it’s pretty obvious and I’m sure the Wirecutter knows they could do it. They just don’t.

2.

But why? Why are high-prestige places often so mediocre?

In one sense, it’s explained the same way as everything else: $$$

This seems obvious, right? They don’t link to journal articles because if they did you might leave their site and they couldn’t stick as many ads in front of your eyeballs. They use quotes from press releases in a way that looks like they spoke to the person because that allows them to churn out pieces more quickly. They don’t do more reliable tests of air purifiers because that would take more effort and they don’t think anyone cares.

But hold on. Sure, cutting corners saves money, but it also makes their product worse. They could save even more by having their articles written by a team of pet monkeys. They could stick a video ad after each paragraph. So why does the market produce this particular level of quality and user hostility? If people can get better information elsewhere now, why don’t they?

One theory is that it’s because the Guardian does “real” reporting, where they travel to places and file Freedom of Information Act requests and probe their contacts in government for off-the-record information. This is hugely valuable, and perhaps that reputation spills over into their quik-n-ez journalism products.

Probably. I think most people don’t realize how much of what they read is regurgitated press kits. But I don’t think it’s the full explanation.

Because—I’m pretty cynical. I’m in the low tail of the population in terms of how much credibility I give the traditional media versus well-informed randos or original scientific papers. Yet, I still read traditional media, sometimes including obvious copypasta. I even sometimes—please don’t tell anyone—read the Wirecutter.

As far as I can tell, so does everyone else. Why?

3.

You’re driving down a lonely highway and you pull into a small town, hungry and exhausted. You see three restaurants:

[Figure: 3 restaurants]

Subway isn’t good. At least, I don’t know anyone who claims it is. (The Dynomight Biologist is aggrieved at having to inhabit a universe where it even exists.)

If you lived nearby, you’d probably figure out which of the local spots is best. But if you’re just passing through and want to eat something quickly with a minimum risk of food poisoning, you might decide Subway is your best bet.

Why? Well, say one person anywhere gets takeout from Subway and when they open the bag, a bat flies out and attacks their cat. That person could post their story online and cost the company tens of millions in sales. A bat attack would be bad for Jimmy’s, too, but not as bad (Jimmy doesn’t even have tens of millions in sales). A single bad event simply hurts Subway more than it hurts Jimmy’s. They both know this, and we know that they know it.

So some independent restaurants will specialize in making great food that will bring the locals back. Others will specialize in cutting costs and fleecing tourists. But if you are a tourist, it’s hard to tell the difference. So you might just rely on the fact that Subway has more skin in the “not poisoning you” game.

4.

Say you’re standing in a field with many holes of different depths and breadths. You want to find the deepest. You’ll soon notice that you can’t see the depth of narrow holes unless you’re right above them. If you’re in a hurry, you’ll probably pick a broad hole, even though you don’t care about breadth.

With the restaurants, maybe Eat Pizza is terrible, Jimmy’s is good, and Subway is tolerable. You might picture it like this:

[Figure: broad and narrow restaurants]

Jimmy’s might be amazing, but Creepy Floating Guy can’t see that.

Similarly, if you want information about air purifiers, I immodestly claim that dynomight.net is good, while air-purifier-ratings.org is godawful. The Wirecutter isn’t great, but at least it’s not totally made up.

[Figure: broad and narrow websites]

5.

You’ve probably heard of “Gell-Mann amnesia”. This is the idea that when you read a newspaper article for something you’re an expert on, you always see that it’s riddled with errors. But then you go to the next article and assume it’s fine.

This is usually presented as a kind of cognitive bias: We’re presented with evidence that newspapers aren’t very good, but for some reason, we refuse to update on that evidence.

I agree we trust the prestige media too much. But in another sense, Gell-Mann amnesia isn’t wrong. For a random topic that you know nothing about, the Guardian may not be great, but they’re unlikely to straight-up lie to you. Some independent writers are probably better, but there are lots and lots that are worse, and it’s hard to tell them apart when you don’t know anything about the topic.

One thing people bring up is that independent writers usually have comments and that if they make a mistake, commenters will tear them apart. But… even if you don’t make any mistakes, commenters will often do that anyway. And how do you know that negative comments aren’t being removed? Unless you’re familiar with the community, a comments section doesn’t have much signal.

When can you trust independent writers? For me there are three cases:

I know enough about the topic to verify that the author knows what they’re talking about.
I don’t know the topic, but I’ve followed the author for a while and they’ve had many articles in category #1 so I trust them.
There are extensive comments from a community that I know and trust.

See the problem? (1) is the only one that works for one-off articles on a random topic, and it only works when I’m already well-informed! (2) and (3) require me to spend lots of time forming an impression of an author or community. I guess most people don’t follow any independent authors or communities closely enough for these to work. So they fall back to McJournalism. When it comes to writing, most people are “tourists”.

6.

So, finally here’s the model. Let’s assume four things:

There are four types of writing:
1. good independent writing
2. bad independent writing
3. SEO sludge
4. copypasta from the Guardian
It’s hard to distinguish good independent writing from bad independent writing or SEO sludge, particularly if you aren’t an expert on the subject or don’t spend all your time on the internet.
But everyone recognizes the Guardian.
The same goes for algorithms: They can’t tell good independent writing from bad, but they can recognize the Guardian.

What happens in this model?

Well, some people are very motivated and will curate their information sources and follow whoever provides the most value, probably including some independent writers.

But most people aren’t like that. They just want to get information quickly and go live their lives. So they get their information in three ways:

Social media. What trends on social media isn’t very reliable. (Most people aren’t experts on most topics.) Some smaller places combat this with manual moderation, others just let it run wild. Lots of other places try to rein it in with algorithms. But since their algorithms can’t ascertain truth, they mostly just promote the Guardian and un-promote everyone else.
Search engines. Again, algorithms can’t tell good independent writers from bad, so they mostly promote places like the Guardian. (This is why independent writers rarely get search traffic for health or finance topics.) Although, the SEO people are working day and night to stay ahead, so they manage to get a lot of their sludge through, too.
Directly going to well-known big media outlets. In this case, the Guardian.

Few people are willing to make the effort to follow independent writers without mediation from social signals or algorithms. But social signals don’t convey reliability and all the algorithms can do is promote the Guardian. So mostly people read the Guardian.

7.

I love independent writing. The idea of a population-wide cauldron of ideas fills me with hope. But I think this model shows why prestigious outlets are so hard to displace: They provide somewhat-reliable information on a wide range of topics.

This is hard: They do it for years and put their “real journalism” reputation on the line even for copypasta. It’s also valuable because trust matters as much as quality. For everyone to switch to independent writers would require a level of effort that, empirically, only a minority of people are willing to make.

On the positive side, that minority of people probably includes lots of “elites” who influence everyone else through second-order effects. So that’s something. But unless we have a different way to distribute independent writing, it will remain a niche interest.

In my view, following independent writers probably should not be a thing. (Though, uh, please don’t let that stop you from following me.) It’s a tragedy that, today, someone can’t write a one-off article and have any confidence that anyone will read it. There are lots of people who have something important to say about one topic, but don’t want to be “writers”. Where are the people who worked on the Second Avenue Subway explaining why it was so damn expensive? I’m sure there’s lots of great stuff that goes unappreciated, but I’m also worried about the “invisible graveyard” of articles that never got written at all.

What we need is a way to deliver independent writing that

suggests stuff people will like, and
requires minimal effort from readers, and
gives reliable signals of trustworthiness.

I don’t know how to do that, or if it’s even possible. But until it happens, I don’t see independent writing displacing the McMedia.
