Reasons and Persons: Watch theories eat themselves

Dec 2021

You live with a group of utterly rational and self-interested people on an island, gathering coconuts to survive.

Tired of working so hard, Alice builds a machine and implants it in her brain. This machine leaves her rational except when it comes to fulfilling threats, which she always does regardless of the damage to herself. She announces to the group, “I will gather no more coconuts. Either you do it for me, or I burn the coconut trees and we all starve.”

You regretfully conclude that your best choice is to capitulate. But every night as you gather coconuts for Alice, you wonder, where did you go wrong?

Eventually, you realize: Upon arrival on the island, your first task should have been to implant a machine in your brain and promise, “If anyone else installs any machines in their brains, then I will burn the coconut trees.”

I’ve long been fascinated by Derek Parfit’s Reasons and Persons, often called one of the most influential philosophy books of the 20th century. How can you not love something that tries to answer these questions:

We are particular people. I have my life to live, you have yours. What do these facts involve? What makes me the same person throughout my life, and a different person from you? And what is the importance of these facts?

How? I’ll tell you how! Turn to page 107.

Those who hold a Constructivist view may question my division of a moral theory. (R1) revises what I call our Ideal Act Theory. Constructivists may see no need for this part of a moral theory. But they cannot object to my proposal that we should ask what we should all do, simply on the assumptions that we shall all try, and all succeed. Answering this question is at worst unnecessary. If a Constructivist asks what we should all ideally do, his answer cannot be some version of Common-Sense Morality. If he accepts some version of this morality, he must move to the corresponding version of (R1), the revised version of his morality that would not be directly collectively self-defeating. And, since he should accept (R1), he should also accept (R2) and (R3). He should revise his Practical Act Theory, the part that used to be his whole theory.

There are two issues here.

For one, this is hard to read. Now, Parfit isn’t trying to be obscure, he’s just happy to double the difficulty for 10% more precision. I’m sure that’s right for academic philosophers, but I’d like a different tradeoff.

Second, I sometimes… Well, I sometimes don’t care about the questions Parfit is trying to answer. The above paragraph asks if people who think morality is created by society should accept a certain way of splitting up moral theories. To me, that feels like agonizing about definitions.

You might think all this leaves the book uninteresting, but that’s not true at all. Parfit’s arguments are anchored by a series of evocative thought experiments. These are accessible and mind-expanding independently of the hulking logical arguments on top of them.

So this is the summary of Reasons and Persons I would have liked to read: The goal is to provide a tour of the thought experiments and let you (mostly) decide for yourself what you think about them.

Warning: I often changed the scenarios quite a lot. (Parfit has no alien viruses.) I also changed the order of things, and greatly shortened or dropped most of the detailed arguments. I think this still gives a lot of the value, but it’s definitely not a full representation of Parfit’s ideas.

How self-interest gets into trouble

The first part of the book asks, do ethical theories eat themselves? We’ll start with the idea that it’s rational to do what’s best for you.

The desert hitchhiker

You’re self-interested but unable to lie. When driving through the desert, your car breaks down. A stranger stops and offers you a ride into town for $20. You’d be thrilled to pay $20, but you have no money on you. The stranger asks, “Well if I give you a ride, will you pay me $20 later?”

You think about it. Once you’re in town, the stranger can’t force you to pay and so, since you’re self-interested, you won’t. Since you can’t lie, you admit this to the stranger, and you’re left on the side of the road.

The firefighting pact

You are self-interested. One day, your neighbors have a meeting and point out that the weather is hot and dry, and fires could happen at any time. They propose that everyone should swear that if a fire breaks out at anyone’s house, everyone will help out. However, they have a lie detector, and under examination discover that you wouldn’t show up when needed. Thus, you’re excluded from the pact. When your house catches on fire, it burns to the ground.

Kate the writer

Kate is an altruistic writer. She believes her work is important for the world, so she works like mad until she collapses in exhaustion and depression. She doesn’t like being exhausted and depressed, but she thinks that the good she does for others outweighs her pain.

Fortunately, you have a science-fiction neuroscience gun. You zap Kate’s brain and make her completely selfish. She immediately quits writing, since she did that for the benefit of others. But now she finds her life is less meaningful and is less happy than when she was altruistic.

Takeaway: There are scenarios where being self-interested seems to make you worse off, not better.

Are you disturbed by these examples? Many people say, “So what? Just do what’s best for you overall.” OK, but how does that work? Once the stranger has given you a ride into town, it is in your interest to stiff them, no matter what you said in the past.

Adding an exception like, “Follow your self interest unless you made a promise” doesn’t work. If you realize the stranger would use your $20 to buy a knife and stab you, you should renege. So you can’t “just” do what’s better for you overall. That’s very hard, or maybe even impossible.

On rational irrationality

As further evidence that self-interest isn’t necessarily good for you, there are situations where you might rationally choose to make yourself crazy. One is the story of Alice’s brain modification at the beginning of this post. Here’s a similar one.

Schelling’s answer to armed robbery

A burglar breaks into your house. He threatens that you must open your safe or he’ll kill your children.

Fortunately, you are an expert on conflict strategy and have a potion that makes you temporarily indifferent to the lives of your kids. You quickly drink it and then laugh at the man’s threats. “You want to kill my daughter? Go ahead!”

The burglar realizes that there’s nothing to be gained by hurting anyone, so he leaves before the police show up.

On how deontology gets into trouble

Next up is deontology, the idea that morality can be defined by obeying a set of universal rules, such as “don’t lie”, “don’t steal”, etc.

The obscene film

A man breaks into your house. He says, “If you don’t allow me to film you doing [obscene act], I will kill your children. If you do allow me, I will later use that film to blackmail you into doing minor crimes.” If you make the film, your kids are safe forever. But you know that, given your personality, you would give in to the blackmail and do the crimes.

Should you allow him to make the film?

Murder and accidental death

Ben is about to die. But right before he does, he plans to murder Cathy. Meanwhile, a fire is closing in on Deb, who will die unless she is rescued. The lives of Cathy and Deb are equally valuable.

You have enough time to either convince Ben not to murder Cathy or to rescue Deb, but not both. You have a 50% chance of convincing Ben not to do the murder and a 51% chance of rescuing Deb.

What should you do?

The question here is whether only outcomes matter, or whether it also matters what actions people take. In the first scenario, you choose between (A) committing some minor crimes yourself, or (B) letting someone else commit a horrible murder. If morality is about avoiding doing bad things, then you should choose (B), since then you do nothing wrong. Similarly, in the second scenario, the question is whether you’re trying to prevent deaths in general, or murders in particular.
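To make the second scenario concrete, here is a rough sketch of the arithmetic using the 50% and 51% figures from the story. The variable names are made up for illustration, and Ben's own death is left out because it happens either way.

```python
# Probabilities from the scenario.
P_CONVINCE_BEN = 0.50
P_RESCUE_DEB = 0.51

# Option 1: talk to Ben. Cathy is murdered if you fail; Deb certainly dies in the fire.
# Option 2: rescue Deb. Cathy is certainly murdered; Deb dies if the rescue fails.
expected_deaths = {
    "persuade Ben": (1 - P_CONVINCE_BEN) + 1,
    "rescue Deb": 1 + (1 - P_RESCUE_DEB),
}
expected_murders = {
    "persuade Ben": 1 - P_CONVINCE_BEN,
    "rescue Deb": 1.0,
}

# Which option looks best depends on what you are counting.
fewest_deaths = min(expected_deaths, key=expected_deaths.get)
fewest_murders = min(expected_murders, key=expected_murders.get)
print("Fewest expected deaths:", fewest_deaths)
print("Fewest expected murders:", fewest_murders)
```

If you count deaths, rescuing Deb wins by a hair. If you count murders, talking to Ben wins by a lot. The scenario is built so that the two ways of counting disagree.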

How consequentialism gets into trouble

Finally, let’s look at consequentialism, the idea that only consequences matter.

Clare’s child

Clare loves her child. She can spend $50 to buy her kid a wonderful dinner, or to cure a stranger of a horrible disease. She chooses dinner.

You suggest to Clare that this was wrong. She says, “Given how much I love my kid, it was impossible for me not to spend the money on her. And it would be wrong for me to make myself love my kid less. So I can’t really be blamed here.”

If only consequences matter, is she right?

The alien virus

People get much of their joy in life from “selfish” desires like eating pizza or playing with their kids. One day, aliens happen by the Earth. Noticing that we are such primitives, they decide to help us out: They drop a virus that transforms everyone into pure consequentialist do-gooders.

All people now focus only on increasing the average good in the world. Desires like “play with my kids” are dropped since they are “agent-dependent” and so not consequentialist.

In some ways, the world immediately becomes some kind of zero-externality socialist paradise: Litter disappears! Fishermen stop fishing when stocks start to deplete! There’s no need for taxes. People work as hard as they can and donate everything to the needy. There’s no need for fences or police.

But people report that their lives feel meaningless. Evolution put the rewards for enjoying dinner and taking care of our kids deep within us. When we neglect these “agent-dependent” goals, we lose a lot. Average happiness falls.

Parfit takes these scenarios very seriously. He thinks consequentialism is seriously broken because humans couldn’t have rich lives without agent-relative (and thus non-consequentialist) desires.

How consequentialism might save itself

Esoteric theories

Let’s continue the alien virus scenario. When we stopped, people lived in a clean socialist paradise but they were depressed because their lives lost meaning without being able to focus on themselves and those closest to them.

Now, the top minds of humanity get together. They calculate that if they transform everyone back into non-consequentialists, the average happiness in the world will go up. Since they are (currently) consequentialists, they consider this to be a good thing. So they create a vaccine that neutralizes the virus and makes people as selfish as before. They vaccinate everyone and destroy any evidence that this era ever happened.

Time passes. By the year 2907, people live in small orbital space clusters. In this new situation, people would be happier if their circle of concern happened to be everyone in their cluster. But they’re stuck with their old outlook.

Fortunately, a small group had refused the vaccine and endured secret consequentialist misery over the centuries. They decide their time has come, and so they re-release the alien virus. After the population is converted back to consequentialism, they calculate the moral outlook suited for the current age and engineer a virus that gives people that morality. Since everyone is (currently, again) consequentialist, they agree that this is a good thing, so everyone takes the virus. They again destroy all the evidence, except leaving a secret cabal to keep the flame of consequentialism alive, waiting for the day to again inflict a brief flash of enlightenment on everyone else.

The argument here is that, sure, consequentialism might lead people to choose some other (non-consequentialist) morality. But even if that’s true, people should keep some trace of consequentialism around in case circumstances change and that secondary morality needs to be adjusted.


Satan rules the universe

Maybe Satan rules the universe. If so, he could make it so that anyone who is self-interested has a horrible life. Or he could make it so that if everyone is utilitarian, then average utility is low. Or he could make it so that people who believe it’s wrong to lie end up lying more—i.e. believing in deontology leads to worse deontological outcomes.

Now, maybe Satan doesn’t rule the universe. But what if he did? It’s worth worrying about since in our universe these things probably are at least a little true in some situations.

Coordination problems

In the real world, what happens doesn’t just depend on you, it depends on other people. How should we think about this?

The prisoner’s dilemma

You’re probably familiar with this, but just in case: You and I are in separate rooms. We each have a red button and a blue button. Depending on what we press, we’ll get different amounts of money. Here’s what you’ll get:

You \ me    Red    Blue
Red         $1     $3
Blue        $0     $2

And here’s what I get:

You \ me    Red    Blue
Red         $1     $0
Blue        $3     $2

No matter what I do, you always get $1 more by choosing red. The same is true for me. Since we’re both selfish brutes, we both choose red, and so we each get only $1, when we could have each gotten $2 by both choosing blue.
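If you want to check the logic mechanically, here is a small sketch that encodes the two payoff tables above and looks for each player's best reply. The dictionary layout and function names are my own, not anything from Parfit.

```python
# Payoffs in dollars, indexed as payoff[your_choice][my_choice].
YOUR_PAYOFF = {"red": {"red": 1, "blue": 3}, "blue": {"red": 0, "blue": 2}}
MY_PAYOFF = {"red": {"red": 1, "blue": 0}, "blue": {"red": 3, "blue": 2}}
CHOICES = ("red", "blue")


def your_best_reply(my_choice):
    """The button that pays you the most, holding my choice fixed."""
    return max(CHOICES, key=lambda yours: YOUR_PAYOFF[yours][my_choice])


def my_best_reply(your_choice):
    """The button that pays me the most, holding your choice fixed."""
    return max(CHOICES, key=lambda mine: MY_PAYOFF[your_choice][mine])


# If the best reply is the same whatever the other player does,
# that button is a dominant strategy.
your_replies = {mine: your_best_reply(mine) for mine in CHOICES}
my_replies = {yours: my_best_reply(yours) for yours in CHOICES}

selfish = (YOUR_PAYOFF["red"]["red"], MY_PAYOFF["red"]["red"])
cooperative = (YOUR_PAYOFF["blue"]["blue"], MY_PAYOFF["blue"]["blue"])
print(your_replies, my_replies)
print("both red:", selfish, "both blue:", cooperative)
```

Red is the best reply in every case for both of us, yet both-red pays each of us less than both-blue.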

Parfit argues that, in practice, 2-person prisoner’s dilemmas are rare. One reason is that most prisoner’s dilemmas are repeated. Even if I’m a sociopath, I probably don’t want to take all my roommate’s stuff and trade it for drugs because I don’t want my roommate to do that to me later.

More importantly, people conspire! You need an outside force to stop them. That’s why the cops in movies keep the prisoners in different rooms and lie to them about what the other one is doing.

What is common is multi-party prisoner’s dilemmas.


Littering

It’s convenient to litter, and it’s mostly other people that have to look at it and clean it up. Since I’m a low-quality person, I litter.


Fishing

You’re a fisherman in an area where the fish population is collapsing. You know everyone needs to slow down and let the fish recover, but you also know that if you don’t fish, someone else will and the fish will collapse anyway. So everyone fishes and pretty soon the fish are all gone.

Public transit

Our roads are clogged. If we all took the bus, the roads would be clear, and we’d all get to work faster. But, if everyone else took the bus, you could drive and get to work even faster. So everyone drives and the roads stay clogged.
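Here is a toy version of the public transit story. All the numbers (100 commuters, the base travel times, half a minute of delay per car on the road) are invented for illustration. The only point is the shape: driving always saves you time, yet everyone driving is much worse than everyone taking the bus.

```python
N = 100  # number of commuters (assumed)


def car_time(drivers):
    """Commute in minutes by car when `drivers` cars are on the road (assumed model)."""
    return 10 + 0.5 * drivers


def bus_time(drivers):
    """Commute in minutes by bus: same traffic, plus 10 extra minutes of stops."""
    return 20 + 0.5 * drivers


def gain_from_driving(other_drivers):
    """Minutes you save by driving instead of busing, given how many others drive."""
    return bus_time(other_drivers) - car_time(other_drivers + 1)


# Whatever everyone else does, driving is faster for you personally...
always_tempting = all(gain_from_driving(d) > 0 for d in range(N))

# ...but the all-drivers world is far slower than the all-bus world.
everyone_buses = bus_time(0)
everyone_drives = car_time(N)
print(always_tempting, everyone_buses, everyone_drives)
```

With these made-up numbers, each person saves 9.5 minutes by driving, and the result is a 60-minute commute for everyone instead of 20.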

Prisoner’s dilemmas with kids

We all want our kids to have relaxed childhoods. However, if I push my kid slightly harder than you do, then my kid will get into a fancier college. Slowly, everyone comes to the same realization and pushes their kids harder and harder until every childhood is an endless misery of studying.

Weird outcome matrices

You and I are put into separate rooms. We each have three buttons. Depending on what the two of us press, we’ll each get some BAD, OK, GOOD, or GREAT outcome. Here is the outcome matrix.

You \ me 1 2 3

If we both press button 1, have we done the right thing, according to consequentialism?

You can create many other weird possibilities with these outcome matrices. These all lead you to ask: Does the moral thing to do depend on what you expect others to do? If yes, then you get stuck in the OK outcome above. If not, then you might get terrible results if others misbehave. Is it right to assume other people will behave well, or should you try to guess, or what?

Parfit argues that commonsense morality evolved in small communities where what you did could only affect a few others. But modern life creates many more situations where we can benefit ourselves by creating a larger cost to others. Thus, he suggests we need a new morality to adjust to these new times.

Assigning blame

Sometimes it’s hard to say who did the right thing.

1st rescue mission

There are 100 people trapped in a cave, and four people are needed to rescue them. You could go on the mission, or go rescue 10 different people. If you don’t go on the mission, someone else will take your place. What should you do?

2nd rescue mission

There are 100 people trapped in a cave, and four people are needed to rescue them. Only four people are available, one of whom is you. Alternatively, you could go rescue 50 people on your own. Is that better (since 50 is more than 1/4 of 100)?

These scenarios suggest that marginal changes are what matter, that the good of your actions is what difference they make, holding everyone else’s actions constant. Cool, except…
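A quick sketch of the two rescue missions, counting total lives saved in the world depending on where you go. It assumes the other three rescuers always show up and that the cave mission succeeds exactly when a fourth person joins. The function and its parameters are hypothetical names for illustration.

```python
CAVE = 100  # people trapped in the cave


def total_saved(you_join_cave, replacement_available, solo_rescue):
    """Lives saved overall, given where you go.

    Assumes the cave mission runs only if a fourth rescuer (you or a
    replacement) joins the three who are already going.
    """
    cave_mission_runs = you_join_cave or replacement_available
    saved = CAVE if cave_mission_runs else 0
    if not you_join_cave:
        saved += solo_rescue
    return saved


# 1st rescue mission: someone else will take your place if you leave.
first = {
    "join": total_saved(True, replacement_available=True, solo_rescue=10),
    "solo": total_saved(False, replacement_available=True, solo_rescue=10),
}

# 2nd rescue mission: nobody can replace you.
second = {
    "join": total_saved(True, replacement_available=False, solo_rescue=50),
    "solo": total_saved(False, replacement_available=False, solo_rescue=50),
}

# The tempting but misleading accounting: your "share" of the cave rescue.
equal_share = CAVE / 4
print(first, second, equal_share)
```

In the first mission, leaving is better (110 saved versus 100). In the second, staying is better (100 versus 50), even though your "share" of the cave rescue is only 25.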

1st execution

Alfred and Bert hate Carlos so they get together and shoot him at the same time. Either bullet would have killed Carlos. Did either of them do anything wrong?

2nd execution

You hear that a million people who hate Carlos are going to show up and shoot him at midnight. The mob cannot be stopped. Since you hate Carlos too, you also shoot him at the same time.

3rd execution

Carlos needs $1000 to buy medicine or he’ll die. He asks a million people for a donation, and they all refuse. You also refuse.

4th execution

I could spend $1000 to save the life of a child somewhere, but I don’t.

1st poisoning

Alfred gave Carlos a poison that will painfully kill him in a few minutes. Before that happens, Bert shows up and shoots Carlos, killing him instantly and painlessly.

2nd poisoning

Alfred gave Carlos a poison that will painfully kill him in a few minutes. Meanwhile, you are about to be hit by a truck. Bert throws Carlos in front of the truck, instantly killing him and saving you.

Drops of water

A thousand thirsty people are in the desert. A truck is going out to them with a big barrel in it. You have a single liter of water that you could pour into the barrel to be distributed equally to everyone. No one can sense the difference of 1mL of water. Does that mean that you don’t need to add your water?

There are a ton of examples like this, with drops of water or torturers with buttons that cause an imperceptible increase in pain. These are meant to dispute the idea that imperceptible effects can’t be bad. I find these boring, probably because I’m already convinced that imperceptible effects can be bad—isn’t that just a comment on the quality of someone’s sensory system?
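The arithmetic behind the drops of water is simple enough to write out. The thousand donors in the second half are my assumption for illustration (the scenario above only mentions you); everything else comes from the story.

```python
PEOPLE = 1000          # thirsty people in the desert
LITER_IN_ML = 1000     # your single liter, in milliliters

# Your contribution, split equally: too small for anyone to notice.
per_person_from_you = LITER_IN_ML / PEOPLE

# Assumption: a thousand potential donors, each holding one liter,
# and each reasoning that their own share is imperceptible.
DONORS = 1000
per_person_if_all_give = DONORS * LITER_IN_ML / PEOPLE
per_person_if_none_give = 0

print(per_person_from_you, "mL each from you alone")
print(per_person_if_all_give, "mL each if every donor gives")
print(per_person_if_none_give, "mL each if every donor skips it")
```

Each donor adds 1 mL per person, which nobody can feel. Together they make the difference between a full liter per person and nothing.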

Morality decompositions

Two key distinctions

Moral theories vary in two key ways:

  1. They can be agent-neutral or agent-relative. In an agent-neutral theory, it should be possible to take any world state and say how good it is. In an agent-relative theory, you can only say how good it is from one person’s perspective.
  2. They might care about what happens, or also what we do. The question here is if actions themselves matter, or just the result.

This matrix illustrates the different attitudes you might take to kids being fed, depending on which kind of moral theory you are operating with.

                 What we do matters               What happens matters
agent-relative   I should feed my kids            My kids should be fed
agent-neutral    Parents should feed their kids   Kids should be fed

Traditional deontology would be in the upper-left corner. Traditional consequentialism would be in the lower-right corner.

Five parts of a moral theory

As well as having different moral theories, you can take a theory and break it up into pieces. For example, maybe you want to be pragmatic. You might say that in an ideal world, we would have moral theory A, but given that many people are jerks, we should instead have moral theory B.

Somewhat separate from morals are motives, the desires that animate us. Maybe the desire itself to take care of our kids makes us happy, independent of the effects of that desire. You might say that in an ideal world, we would have motives C, but that given that many people are jerks, we should have motives D instead.

Finally, you can also think about E, how we should react to bad acts. (Here no theory is needed in a perfect world!) You can picture these different parts of a moral theory like this:

                    Ideal                 Practical
Successful acts     Ideal Act Theory      Practical Act Theory
Motives             Ideal Motive Theory   Practical Motive Theory
Blame and remorse   n/a                   Reaction Theory

You can spell out these five theories in more detail.
  • Ideal Act Theory is what we should do if everything were predictable and everyone did the right thing.
  • Practical Act Theory is what we should do, given that the world is uncertain and some people are jerks.
  • Ideal Motive Theory is what motives we should have, given that motives have effects other than our acts.
  • Practical Motive Theory is what motives we should have, given that the world is uncertain and some people are jerks.
  • Reaction Theory is what acts we should blame people for, given that the world is uncertain, some people are jerks, and blame has complex effects on the future.

This is useful: If you’re having a moral debate with someone, make sure that you’re talking about the same cell in the above matrix!


I think Parfit makes three main arguments.

  1. Self-interest theory is collectively self-defeating due to prisoner’s dilemmas. (And gets complex at the individual level due to conflict strategy.)
  2. Consequentialism is indirectly self-defeating because it is agent-neutral—if we only look at outcomes without considering the agent, our lives would be empty, since we’ve evolved to care about our relationships.
  3. Commonsense morality is self-defeating because it is agent-relative. If different people are supposed to prioritize loved ones, then everyone will screw over the world for their ingroup.

Now, it’s bad that self-interest theory is self-defeating, but that doesn’t mean it is “wrong”—it never claimed to give globally optimal outcomes. It only promises to give you the best outcome given what everyone else is doing. It’s perhaps more concerning that being self-interested might leave you stranded in the desert or excluded from firefighting pacts, but maybe you can overcome this with clever self-interested meta-reasoning.

It’s more of a problem that consequentialism and commonsense morality are self-defeating. After all, these are moral theories—you’d expect them to lead to optimal outcomes.

Parfit suggests that a solution might come from some kind of merger of consequentialism and commonsense morality. He doesn’t say exactly how this would work (except that it’s complicated) but suggests a kind of agent-neutral version of commonsense morality as a starting point: Maybe we should all collectively work to make “kids are fed by their parents” happen, as opposed to robotically trying to feed all kids (consequentialism) or just trying to feed our kids (commonsense morality).


I had two major thoughts after reading this. First off, it’s striking that Parfit doesn’t engage with economic ideas. For example, he is concerned that self-interest theory might lead to poor self-interested outcomes, and that consequentialism might lead to poor consequentialist outcomes. That’s fine, but surely the major problem in our actual world is that self-interested behavior leads to poor consequentialist outcomes. The implicit view in economics is that it’s society’s job to align selfish interests with public interests. I’d have liked to know what Parfit thought about that.

Similarly, Parfit is concerned about the bad effects of multi-party prisoner’s dilemmas. I agree these are a serious problem, but let’s remember that these also have positive effects. After all, they are the entire basis of capitalism! Companies could make massive profits if they all agreed to keep their prices high. But usually, we hope, at least one greedy company will try to screw the others over by cutting prices, and so the cartel collapses.

A second thought is just how strong of an influence Schelling’s The Strategy of Conflict is. While Schelling is concerned with war, conflict, negotiation, etc., his book is remarkably similar to Parfit’s: It starts with simple principles, but as you start to consider reactions and counter-reactions, you get a spiral of ever-increasing complexity. (If you haven’t read Schelling’s book, it’s great.) Parfit is aware of this, but he never seems to push things quite as far as he could. Here’s an example:

How much should the rich give to the poor?

After making billions in Silicon Valley, you read The Most Good You Can Do and decide you’re a consequentialist. You ask yourself, how much of your money should you give away? Under mild assumptions, the answer seems to be almost everything.

But then you have Thomas Schelling over for dinner. He says you should be careful: When you donate, you aren’t just giving away your money, you are also helping to set a social norm for how much the rich should give away. Perhaps giving away almost everything will make people think that consequentialists are crazy, and so many people will give nothing. Maybe it would be better to try to establish a gentle norm, like giving away 10% of your wealth.

This is (roughly) where Parfit stops. But you can keep going. Maybe what you should do is secretly donate almost all your money, but pretend in public like you only gave away 10% so people don’t think consequentialism is a bummer. And maybe at the same time you should secretly find other like-minded billionaires and tell them that you did actually give most of your money away? Or maybe you should build an AI to predict each person’s appetite for donation, and coordinate a society-wide deception so that everyone thinks there is a different norm?

Or maybe all this is outweighed by the fact that your lies could be discovered or that dishonesty has other bad effects? And if you think lying is bad, does that mean that even the original gambit of trying to create a norm of 10% is bad?

I could keep going forever here, but you get the point—things spiral and it never ends.

Sadly, that’s my main conclusion from all of this. Nothing works. Whatever system you commit to, it’s always possible to jump up one level of abstraction and break things. There’s no answer for this, or at least Parfit doesn’t provide one. (Though probably in practice things don’t get broken too badly, and we can solve most of the problems while only climbing the abstraction ladder a level or two.)

So that’s part one of the book: Parfit breaks morality. In part two he will wield time for even more breakage, with gambits like “if time is an illusion, then…” Part three is the good stuff—breaking the idea of personal identity.

This review continues here: Reasons and Persons: The case against the self
