A model for journalistic copypasta

Aug 2022

Here’s a thing that happens sometimes:

  1. Somebody thinks that A causes B.
  2. They gather data and find that A and B are correlated.
  3. They write a paper. In it, they’re careful to only say that A is “associated” with B, and that the data is imperfect, and that they only controlled for certain confounders, and further research is needed, etc.
  4. When the paper is published, the journal puts out a press release. It omits most of those caveats and has a quote from the researcher (now unencumbered by peer review) saying, “B is bad! So this shows the dangers of A!”
  5. The press release goes to a fancy media outlet where an underpaid and harried writer has no time or training to agonize over technicalities. They reassemble the press release into a simulacrum of journalism.
  6. The piece goes to a copy editor who chooses a headline of PEER REVIEWED RESEARCH SHOWS THAT A CAUSES B.
  7. The piece is published. Thousands of people read it. Eventually, the fact that A causes B appears on Wikipedia, using the piece as a citation.
  8. The researcher’s colleagues roll their eyes, but don’t want to make enemies in their tiny gossipy community. Independent writers complain, but they have small audiences, and aren’t considered a reliable source on Wikipedia.

Erik Hoel recently bemoaned that high-prestige outlets like The Guardian somehow get away with this kind of “copypasta”, as well as dark patterns like deliberately making it hard to find the original scientific paper. He also suggested that if independent authors tried any of this nonsense, they’d be crucified, so maybe everyone should follow more independent authors.

I mostly agree, but I’m pessimistic. We should think more about why this happens. Because, unfortunately, I’d bet against any huge changes coming soon.

Why? Trust. People trust places like The Guardian despite their mediocrity, and, I claim, this is not a mistake. The world is full of outright lies, and it’s hard to know who you can trust. The Guardian provides a consistent bare-minimum level of quality across a wide range of topics, which is not easy. So I suspect the market is reasonably efficient: The fancy places provide real value, even when serving up journalistic mush, and readers are making reasonable decisions when they pay attention to them.

Unless the distribution of writing changes somehow to upset that dynamic, I suspect the copypasta will continue.

1.

I’d even go a little further: for complicated subjects, prestigious places are rarely the best sources. To pick an example close to my heart, take air purifiers. There are lots on the market, but to know if they truly work, we need to test them.

The most well-known name in the air purifier game is the Wirecutter. Here is their testing regime:

  • Generate some smoke in a room.
  • Measure particles.
  • Run a purifier for 30 minutes.
  • Measure particles again.

Now, this isn’t terrible, but it’s not very good. You’re only taking two measurements, so if there’s noise in either, the final ratio will be off. This leads to them doing things like publishing two different tests of the same air purifier that imply clean air delivery rates that vary by a factor of 2.4.

Meanwhile, Internet People like Smart Air or Jeff Kaufman do tests that show the entire trace of particles over time. Here’s one from me:

[Figure: box fan test]

This averages many measurements and so is much more reliable.
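To see why the extra measurements matter, here’s a minimal simulation sketch. It is not the Wirecutter’s procedure or mine. It assumes particle counts decay exponentially in a well-mixed room and that each reading carries independent multiplicative noise. The decay rate, the noise level, and all function names are made up for illustration. It compares an estimate built from only the first and last readings against a line fit through every reading.

```python
import math
import random
import statistics

TRUE_DECAY = 0.1  # assumed decay rate per minute (hypothetical)
NOISE_SD = 0.2    # assumed sensor noise, as SD of log(count) (hypothetical)
MINUTES = 30      # test length, matching the 30-minute run described above


def simulate_trace(rng):
    """One noisy particle reading per minute, from minute 0 through MINUTES."""
    return [
        1000.0 * math.exp(-TRUE_DECAY * t + rng.gauss(0.0, NOISE_SD))
        for t in range(MINUTES + 1)
    ]


def two_point_estimate(trace):
    """Decay rate from only the first and last readings."""
    return math.log(trace[0] / trace[-1]) / (len(trace) - 1)


def full_trace_estimate(trace):
    """Decay rate from a least-squares line through log(count) vs. time."""
    times = list(range(len(trace)))
    logs = [math.log(count) for count in trace]
    t_mean = statistics.fmean(times)
    log_mean = statistics.fmean(logs)
    covariance = sum((t - t_mean) * (y - log_mean) for t, y in zip(times, logs))
    variance = sum((t - t_mean) ** 2 for t in times)
    return -covariance / variance


def spread(estimator, trials=2000, seed=0):
    """Standard deviation of an estimator across many simulated tests."""
    rng = random.Random(seed)
    estimates = [estimator(simulate_trace(rng)) for _ in range(trials)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    print("two-point spread: ", spread(two_point_estimate))
    print("full-trace spread:", spread(full_trace_estimate))
```

Under these assumptions, the full-trace estimate should scatter noticeably less than the two-point one, because noise in any single reading gets diluted by all the others. Real sensors have messier noise than this, so treat the size of the gap as illustrative only.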

Now, there’s no special brilliance needed to come up with this kind of test. Frankly, it’s pretty obvious and I’m sure the Wirecutter is aware they could do it. They just don’t.

2.

But why? Why are high-prestige places often so mediocre?

In one sense, it’s explained the same way as everything else: 💵 💵 💵

This seems obvious, right? They don’t link to journal articles because if they did you might leave their site and they couldn’t stick as many ads in front of your eyeballs. They use quotes from press releases in a way that looks like they spoke to the person because that allows them to churn out pieces more quickly. They don’t do more reliable tests of air purifiers because that would take more effort, and they don’t think anyone cares.

But hold on. Sure, cutting corners saves money, but it also makes their product worse. They could save even more by having their articles written by a team of pet monkeys. They could stick an unskippable video ad after each paragraph. So why does the market produce this particular level of quality and user hostility? If people can get better information elsewhere now, why isn’t everyone doing that?

One theory is that it’s because The Guardian does “real” reporting, where they travel to places and file Freedom of Information Act requests and probe their contacts in government for off-the-record information. This is hugely valuable, and perhaps some of that reputation spills over into their quik-n-ez journalism products.

Probably. I think most people don’t realize how much of what they read is regurgitated press kits. But I don’t think it’s the full explanation.

Because—I’m pretty cynical. I’m somewhere in the low tail of the population in terms of how much credibility I give the traditional media versus well-informed randos or original scientific papers. Yet, I still read traditional media, sometimes including obvious copypasta. I even sometimes—please don’t tell anyone—read the Wirecutter.

As far as I can tell, so does everyone else. Why?

3.

Say you’re driving down a lonely highway and you pull into a small town, hungry and exhausted. You see three restaurants:

[Figure: 3 restaurants]

Subway isn’t good. At least, I don’t know anyone who claims it is. (The Dynomight Biologist is aggrieved at having to inhabit a universe where it even exists.)

If you lived nearby, you’d probably figure out which of the local spots is best. But if you’re just passing through and want to eat something quickly with a minimum risk of food poisoning, you might decide Subway is your best bet.

Say one person anywhere gets takeout from Subway but when they open the bag, a bat flies out and attacks their cat. That person could post their story online and cost the company tens of millions of sales. A single bad outcome can do more damage to Subway than to Jimmy’s. They both know this, and we know that they know it.

So some independent restaurants will specialize in making great food that will bring the locals back. Others will specialize in cutting costs and fleecing tourists. But if you are a tourist, it’s hard to tell the difference.

4.

Say you’re standing in a field with many holes of different depths and breadths. You want to find the deepest.

You’ll soon notice that the depth of narrow holes is impossible to judge unless you’re right above them. If you have limited time, you’ll probably pick a broad hole, even though you don’t care about breadth.

With the restaurants, maybe Jimmy’s is good, Eat Pizza is terrible, and Subway is tolerable. You might picture it like this:

[Figure: broad and narrow restaurants]

Jimmy’s might be amazing, but the creepy floating person can’t see the difference from Eat Pizza.

Similarly, if you want information about air purifiers, I immodestly claim that dynomight.net is good, while air-purifier-ratings.org is godawful. The Wirecutter is not great but at least not totally made up.

[Figure: broad and narrow websites]

5.

You’ve probably heard of “Gell-Mann amnesia”. This is the idea that when you read a newspaper article on something you’re an expert in, it’s inevitably riddled with errors. But then you go read the next article and assume it’s fine.

This is usually presented as a kind of cognitive bias. We’re presented with evidence that newspapers aren’t very good, but for some reason, we refuse to update on that evidence.

I agree we trust the prestige media too much. But in another sense, Gell-Mann amnesia isn’t wrong. For a random topic that you know nothing about, The Guardian may not be great, but they’re unlikely to straight-up lie to you. Some independent writers are probably better, but there are lots and lots that are worse, and it’s hard to tell them apart.

One thing people bring up is that independent writers usually have comments and that if they make a mistake, commenters will tear them apart. But often commenters will do that anyway. And how do you know that negative comments aren’t being removed? Unless you’re familiar with the community, a comments section doesn’t have much signal.

When can you trust independent writers? For me there are three cases:

  1. I know enough about the topic to verify that the author knows what they’re talking about.
  2. I don’t know the topic, but I’ve followed the author for a while and they’ve had many articles in category #1 so I trust them anyway.
  3. There are extensive comments from a community that I know and trust.

See the problem? #1 is the only one that works for one-off articles on a random topic, and it only works when I’m already well-informed. #2 and #3 require me to spend lots of time forming an impression of an author or community. I guess most people don’t follow any independent authors or communities closely enough for these to work. So they fall back to McJournalism. When it comes to writing, most people are “tourists”.

6.

So here’s the model. Let’s assume four things:

  1. There are four types of writing: Good independent writing, bad independent writing, SEO sludge, and copypasta from The Guardian.
  2. It’s hard to distinguish good independent writing from bad independent writing or SEO sludge, particularly if you aren’t an expert on the subject or don’t spend all your time on the internet.
  3. But everyone recognizes The Guardian.
  4. The same goes for algorithms.

If all these are true, what happens?

Some people are highly motivated. They will curate their information sources and follow whoever provides the most value. That will likely include some independent writers.

But most people aren’t like that. They just want to get information quickly and go live their lives. So they get their information in three ways:

  1. Social media. What trends on social media isn’t very reliable. (Most people aren’t experts on most topics.) Some smaller places combat this with manual moderation. Others just let it run wild. Still others will try to rein it in with algorithms. But since their algorithms can’t ascertain truth, they mostly just promote The Guardian and un-promote everyone else.
  2. Search engines. Again, algorithms can’t tell good independent writers from bad, so they mostly promote places like The Guardian. (This is why independent writers rarely get search traffic for health or finance topics.) Although, the SEO people are working day and night to stay ahead, so they manage to get a lot of their sludge through, too.
  3. Big media outlets. In this case, The Guardian.

Overall, only a small number of people are willing to make the effort that’s needed to follow independent writers without mediation from social signals or algorithms. But social signals don’t convey reliability and all the algorithms can do is promote The Guardian.
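The model can be caricatured as an expected-value comparison. Here’s a toy sketch. Every number in it is invented for illustration (the quality scores, the shares of each type of writing), and the names are hypothetical. The only point is the structure: a reader who can tell sources apart picks the best one, while a reader who can’t is effectively drawing at random from everything unrecognized and compares that average against the one name they know.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    quality: float  # hypothetical usefulness score between 0 and 1
    share: float    # hypothetical share of the unrecognized pages a reader meets


# Every number here is invented for illustration, not measured.
UNRECOGNIZED = [
    Source("good independent writing", 0.9, 0.1),
    Source("bad independent writing", 0.3, 0.5),
    Source("SEO sludge", 0.1, 0.4),
]
GUARDIAN_NAME = "copypasta from The Guardian"
GUARDIAN_QUALITY = 0.5  # mediocre but recognizable (assumed)


def random_pick_quality(sources):
    """Average quality when a reader cannot tell the sources apart."""
    total_share = sum(s.share for s in sources)
    return sum(s.quality * s.share for s in sources) / total_share


def best_option(can_tell_sources_apart):
    """What a quality-maximizing reader chooses under this toy model."""
    if can_tell_sources_apart:
        best = max(UNRECOGNIZED, key=lambda s: s.quality)
        return best.name if best.quality > GUARDIAN_QUALITY else GUARDIAN_NAME
    if random_pick_quality(UNRECOGNIZED) > GUARDIAN_QUALITY:
        return "a random unrecognized source"
    return GUARDIAN_NAME


if __name__ == "__main__":
    print("motivated reader:", best_option(True))
    print("tourist:", best_option(False))
```

With these made-up numbers, the motivated reader ends up with good independent writing and the tourist ends up with The Guardian. Different numbers could flip the result, which is sort of the point: the outcome depends on how much sludge surrounds the good stuff, not only on how good the good stuff is.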

7.

Now, I believe in independent writing. The idea of a population-wide cauldron of ideas fills me with hope. But I think the above shows why prestigious outlets will be so hard to displace: They provide somewhat-reliable information on a wide range of topics. This is hard—they do it for years and put their “real journalism” reputation on the line even for copypasta. It’s also valuable because trust matters as much as quality. For everyone to switch to reading independent authors would require a level of investment that only a minority of people will make.

On the positive side, that minority of people probably includes lots of “elites” who influence everyone else through second-order effects. So that’s something. But unless we have a different way to distribute independent writing, it will remain a niche interest.

Frankly, following independent writers should not be a thing. (But, uh, please don’t let that stop you from following me.) It’s a tragedy that, today, someone can’t write a one-off article and have any confidence that anyone will read it. Where are the people who worked on the Second Avenue Subway explaining why it was so damn expensive? I’m sure there’s lots of great stuff that goes unappreciated, but I’m also worried about the “invisible graveyard” of articles that never got written at all.

What we need is a way to deliver independent writing that

  • suggests stuff people like, and
  • requires minimal effort from readers, and
  • gives reliable signals of trustworthiness.

I don’t know how to do that or if it’s even possible. But until that happens, I suspect that trying to make independent writing overtake the McMedia is tilting at windmills.
