Conference Report: SC15

My week in Austin started out cold and rainy.

This last week, I had the privilege to attend the biggest annual supercomputing conference in North America, SC. I was one of about ten students studying high performance computing (and related fields) who were funded to go by a travel grant from the HPC topical group of the Association for Computing Machinery. It was a blast, and I learned a ton.

I haven’t had much time to write up any science results, so I figured I’d give a few brief highlights of the conference, if I could.

Vast Scale

SC15 was by far the biggest conference I’ve ever attended. There were more than 10,000 people registered… and the scale showed. The plenary talks, attended by most of the conference, were like rock concerts, complete with stage lighting and huge crowds.

The plenary sessions for SC15 were like rock concerts.

And in addition to the technical program, there was a massive exhibition, with booths manned by scientists, government organizations, and corporate vendors—anyone with an interest in supercomputing. I spent a long time at the NASA booth chatting with the scientists about their research.

The SC15 exhibition is quite impressive

A Focus on the Future of Supercomputing

The high-performance computing community is currently working hard to prepare for the next generation of supercomputers, the so-called exascale machines, which will turn on in five years or so. These machines will be orders of magnitude faster and more parallel than current systems. And although this brings opportunity, it also brings huge challenges.

How do you run a program on a supercomputer when it’s so large that a component fails every day? How do you write programs that can take advantage of all that computing power? To do so, you essentially need to write many many programs, each of which is running on a different piece of the supercomputer. (We do this already, but it will be much harder on exascale machines.)

About half of the talks and panels I attended were discussing these problems. Lots of people have different approaches. For example, I attended a tutorial on a programming library called HPX, which uses the concept of a future—a promise to return some data after calculating it—to express how to write parallel programs. I also attended a session on Charm++, which tries to treat each part of a parallel program as an independent creature which can talk to and interact with different parts of the program. Both of these ideas are designed to help people deal with ultra-parallel programs.
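To give a flavour of what a future looks like in practice, here is a minimal sketch. It uses Python's standard `concurrent.futures` module rather than HPX itself (HPX is a C++ library, and its API is different), and the function names `partial_sum` and `parallel_sum_of_squares` are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(start, stop):
    """Sum the squares of the integers in [start, stop)."""
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, pieces=4):
    """Split [0, n) into chunks and hand each chunk to the pool as its own task."""
    step = (n + pieces - 1) // pieces
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        # submit() returns right away with a future: a placeholder for a
        # result that has not necessarily been computed yet.
        futures = [
            pool.submit(partial_sum, lo, min(lo + step, n))
            for lo in range(0, n, step)
        ]
        # result() waits until that particular piece has finished.
        return sum(f.result() for f in futures)


total = parallel_sum_of_squares(1000)
print(total)
```

The point is that the program describes which results depend on which, and the runtime decides when and where each piece actually runs.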

Highlight: Alan Alda

The plenary speaker on opening night was Alan Alda, the actor. Alda is a major science advocate. In his talk, he not only argued strongly for the need for science communication, but he also argued for his vision of how that should be done. Alda felt that scientists need to be trained as communicators who can read their audience and bring the subject matter to them. To this end, Alda has started an organization that trains scientists to be better communicators: The Alan Alda Center for Communicating Science.

It was a very good talk. I didn’t know about the center, but now I want to take one of those classes!

I took this picture from the Alan Alda Center’s website. Presumably it is a bunch of scientists learning to communicate.

Highlight: Reduced Order Modelling

One of the most interesting talks I saw by far was the talk on “reduced order modelling.” The idea is this. Suppose you’re an engineer and you want to use computer simulations to help you design whatever it is you’re designing, like an airplane. Unfortunately, the simulation of air flow over the body of the craft takes a long time… hours or days on a supercomputer. So, change one thing and wait hours to see what happens. Not very useful for design. How do you handle that?

Well, a new class of techniques tries to answer this. Basically, the entire set of possibilities can be represented by splicing together the results of just a few simulations… enough to get a representative idea of what’s going on. The techniques that do this are called “reduced order modelling” and this is exactly how gravitational scientists are using numerical models of gravitational waves to make predictions about what gravitational wave detectors like LIGO will see.
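Here is a toy sketch of one common flavour of this idea, a projection-based reduced basis. I am not claiming this is the specific method from the talk. The "expensive simulation" is a stand-in I made up: a small 1D diffusion-like system with one design parameter `p`. All names (`full_solve`, `build_reduced_model`, and so on) are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


N = 20             # unknowns in the toy "full" model
LOAD = [1.0] * N   # fixed right-hand side


def full_operator(p):
    """Toy full model: 1D diffusion matrix plus p times the identity."""
    h2 = float((N + 1) ** 2)
    A = [[0.0] * N for _ in range(N)]
    for i in range(N):
        A[i][i] = 2.0 * h2 + p
        if i > 0:
            A[i][i - 1] = -h2
        if i < N - 1:
            A[i][i + 1] = -h2
    return A


def full_solve(p):
    """The 'expensive' simulation: solve the full N-by-N system."""
    return solve(full_operator(p), LOAD)


def orthonormal_basis(snapshots):
    """Gram-Schmidt: turn a few full solutions into an orthonormal basis."""
    basis = []
    for s in snapshots:
        v = s[:]
        for q in basis:
            c = dot(q, v)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = dot(v, v) ** 0.5
        basis.append([vi / norm for vi in v])
    return basis


def build_reduced_model(training_params):
    """Offline stage: run a few full simulations and project the operator."""
    basis = orthonormal_basis([full_solve(p) for p in training_params])
    K = full_operator(0.0)
    K_r = [[dot(qi, matvec(K, qj)) for qj in basis] for qi in basis]
    b_r = [dot(q, LOAD) for q in basis]
    return basis, K_r, b_r


def reduced_solve(p, model):
    """Online stage: solve a tiny system, then rebuild the full-size answer."""
    basis, K_r, b_r = model
    m = len(basis)
    A_r = [[K_r[i][j] + (p if i == j else 0.0) for j in range(m)]
           for i in range(m)]
    coeffs = solve(A_r, b_r)
    return [sum(c * q[k] for c, q in zip(coeffs, basis)) for k in range(N)]


def relative_error(approx, exact):
    diff = [a - e for a, e in zip(approx, exact)]
    return (dot(diff, diff) / dot(exact, exact)) ** 0.5


model = build_reduced_model([0.0, 10.0, 20.0])
print(relative_error(reduced_solve(5.0, model), full_solve(5.0)))
```

After three full runs, each new design only costs a 3-by-3 solve instead of a 20-by-20 one. In a real application the full model would have millions of unknowns, which is where the savings come from.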

Stanford professor Charbel Farhat gave a very nice overview talk of the methods and their industrial applications.

Reduced order modelling means that an engineer designing this plane could get near instant feedback about how it behaves. Credit: David Amsallem


By necessity, I am leaving many amazing talks, workshops, and panels out of this article. But hopefully it gave you a taste for what SC15 was like. I may have more to say in the future. But I think that’s all for now.

Posted in Science And Math

Bruno Maddox and the Magnet: A Story of Misconceptions

Insane Clown Posse certainly wonders how magnets work.

This week the ever-inquisitive Gary Matthews pointed me to a 2008 article for Discover Magazine by Bruno Maddox, claiming that physicists cannot explain how magnetism works, and that they are in denial about it. I encourage you to read the article. Maddox is wrong—dead wrong—but his argument displays a number of common misconceptions about science. And I’d like to address some of them. The most important misconceptions Maddox displays are that of first cause, of classical intuition, and of distrust of the abstract. Let’s get started.

(DISCLAIMER: The opinions in this article are my own. I will be describing very little real science here… just philosophy.)

The Misconception of First Cause

Early in his article, Maddox claims that nobody can explain how a magnet works and that nobody seems to be particularly bothered by this. Maddox writes:

For one thing, as far as I can tell, nobody knows how a magnet can move a piece of metal without touching it.

And for another—more astonishing still, perhaps—nobody seems to care.

I want to talk about the notion of “touch” later. But for now let’s focus on the other part of that quote—that nobody seems to care. What Maddox is getting at, I think, is that science can never answer why something happens… at a fundamental level, it can only offer descriptions and make predictions. It can only tell you how something happens.

A Hypothetical Conversation

Let’s imagine, for a moment, a hypothetical conversation between Maddox and a physicist. If he asks about magnets… the physicist will say something like “oh the electromagnetic force is caused by the magnetic field.”

“Okay, so what causes the magnetic field?” Maddox might ask. And to this a physicist might say “Well, the magnetic field is really a relativistic echo of this more fundamental thing, the electromagnetic field tensor. A magnetic field is created by moving charge… but that motion depends on your point of view. The field tensor is invariant.”

Maddox might push further. “What causes that?” And a physicist might tell him that it’s a low-energy limit of the electroweak force.

Maddox, getting really aggravated now, might push again. “But what causes that?” And the physicist, depending on her leanings on quantum gravity, would give him an unworried shrug. “We don’t know. It just is.”

What’s Wrong With Maddox’s Question

Do you see the problem? It’s the same problem as in theology. If you ascribe cause to something, then you must ask what causes the cause. One (very theological) answer is that God is infinite and can get around these petty problems like cause and effect.

But science has a better answer: we don’t know! And moreover, we cannot know! At a fundamental level, science is based on observations of the world around us. We are limited by what those observations can tell us. These observations can tell us a lot. They can tell us what happens: two bars of iron can be made to pull at each other. They can tell us how it happens: the bars attract if they are oriented in a particular way, otherwise they repel. And, with a bit of cleverness, they can give us the tools to make predictions: an electric current will attract an iron bar.

But observations, at some level, will fail to explain something. And that’s perfectly okay. In fact, it’s better than okay. It’s a good thing to know your limits! And this is a fundamental limit. The success of science is built on knowing that whatever Nature does must be the truth, no matter how counter-intuitive.

I believe Maddox knows this. He certainly lampshades it when he comments that

But as far as I can tell—and isn’t the point of science that all its bigger propositions come accompanied by this noble caveat?—[Steven Weinberg] really can’t [explain how magnets work].

But Maddox sees this as a reason to distrust science and it is not. It is science’s greatest strength.

(I don’t mean to imply that science has no explanatory power. It tells us that magnetism in a bar magnet is caused by either atomic spin or electron spin, for example… which is very powerful. But at some point, the chain of causes stops and you can go no further.)

The Misconception of Classical Intuition

Let’s reflect on that for a moment. Whatever Nature does must be truth, no matter how counter-intuitive. This is the second misconception Maddox displays. Maddox finds it unsettling that we cannot explain “how a magnet can move a piece of metal without touching it.”

But… what does it mean to touch? Let’s think about the subatomic realm, the world of quantum mechanics. In the world of atoms and electrons, “touch” is a fuzzy concept. For one thing, there is no such thing as a “particle.” Protons, electrons, neutrons, and even atoms and molecules, are not localized balls, like we’re used to in our world. They’re waves of probability, distributed throughout space. What this means to us in the world of trains and aeroplanes is not totally clear. But it is the nature of Nature. So particles, then, aren’t really particles.

For another thing, when we “touch” a table, there’s a lot of empty space between the atoms in our hands and the atoms in the table! What’s really happening is that the atoms in our hands are repelling the atoms in the table… for a variety of reasons, including the electromagnetic force and the Pauli exclusion principle. There’s none of the “touching” Maddox seeks at all! Maddox is disturbed by the idea that we appeal to “spooky action at a distance,” but a more interesting question is whether there are any forces that aren’t, fundamentally, this sort of spooky action at a distance.

(As a historical note, Einstein described quantum mechanics as “spooky action at a distance” because he was disturbed by the fact that quantum entanglement seemed to violate causality. We know now that it does not violate causality and Einstein was worried for nothing. But the electromagnetic force never bothered Einstein.)

Maddox is falling prey to the fallacy of classical intuition. He believes that because he experiences the world in a particular way, the world must be that particular way. But Nature is not so gentle! We evolved to perceive the world in a way that benefits us evolutionarily… not in the way it really is! Again, the great strength of science as a methodology is that it overcomes this classical intuition and allows us to glimpse the world as it really is. (Or at least, closer to how it really is.)

A Fallacious Distrust of the Abstract

Finally, Maddox says that

When you get right down to it, the mystery of magnets interacting with each other at a distance has been explained in terms of virtual photons, incredibly small and unapologetically imaginary particles interacting with each other at a distance. As far as I can tell, these virtual particles are composed entirely of math and exist solely to fill otherwise embarrassing gaps in physics, such as the attraction and repulsion between magnets.

Well, Maddox is right about one thing. Virtual particles are unapologetically imaginary. This is a complaint that I, and many other scientists, share with Maddox. But this isn’t a problem with the science. It’s a problem with lazy science communication.

As I described above, the notion of a particle is deeply misleading. A particle is a “human-scale” approximation of the true nature of reality, which is made up of fields and waves. Really, force isn’t carried by virtual particles. It’s carried by fields, which interact with each other via waves that travel at speeds no greater than the speed of light. And it just so happens that these waves look like particles to us if we squint. But this doesn’t work all the time. Sometimes the notion of a single particle simply doesn’t make sense.

But, even in the realm of subatomic physics, the idea of a particle is very powerful. It provides intuition and a surprisingly robust computational tool. This is why, historically, high-energy physics has been misleadingly called “particle physics.” (And for those in the know, how the terrible name “second quantization” came to be.) And the notion of a virtual particle, an imaginary particle associated with the excitation of a quantum field, is even more powerful.

So… if it makes good predictions… is a virtual particle really imaginary? Or is it a valid way of interpreting the fundamental nature of reality?

The answer is that, despite my distaste for virtual particles… they’re often exactly as good of a description as waves—better, because they’re easy to work with. It’s true that the description fails sometimes, but so what?

(For experts, I’m discussing the occupation-number formalism of quantum field theory, vs. other formalisms. In particular, the occupation number formalism fails when a vacuum cannot be uniquely defined… a la Unruh effect or curved spacetime.)

This is why Maddox is wrong to distrust virtual particles. Maddox’s distrust seems to stem from the fact that virtual particles are purely mathematical and that there is a more general way to describe quantum fields. But he should not distrust this mathematical abstraction. It is the tool we use to make predictions.

Moreover, it’s the only tool we have. Scientists are not explaining why phenomena occur. Really what scientists do is build Lego models of the universe, simulacra that behave like the universe and allow us to make predictions. Equations and mathematical abstraction are the Lego blocks of our models. And the particle picture of quantum field theory is a very good model indeed.

Other Rebuttals

Maddox’s post is quite old… seven years old by now. I am not the first scientist to refute him. In particular, I’d like to recommend this blog post by Sabine Hossenfelder, which is, as usual, excellent.

Posted in Physics, Quantum Mechanics, Science And Math

The CMB Axis of Evil and the Nature of Randomness

Figure 1. Some fluctuations in the cosmic microwave background align around an axis in the sky, called the “axis of evil” and shown in white here. Image due to the Planck collaboration.

This Halloween, Nature News released an article titled Zombie Physics: 6 Baffling Results that Just Won’t Die. It’s a fun article describing several mysteries in physics whose solution sits in a sort of limbo. For fun, I figured, I’d explain some of these mysteries, and give my opinion about possible solutions. And first, I’m going to discuss the CMB Axis of Evil, a strange pattern in the leftover radiation from the Big Bang.

A Much-Too-Short Summary of Cosmic Inflation and the CMB

About 13.8 billion years ago, the universe was extremely hot, so hot that atoms couldn’t form at all… it was just a chaotic soup of charged particles. Hot things (and accelerating charges) glow. And this hot soup was glowing incredibly brightly. As time passed, the universe expanded and cooled, but this glow remained, bathing all of time and space in light.

(The reason why the universe was so hot in the first place depends on whether cosmic inflation is true. Either it’s because the Big Bang just happened or it’s because, after cosmic inflation, a particle called the inflaton dumped all of its energy into creating hot matter.)

Even today, the glow remains, filling the universe. As the universe expanded, the glow dimmed and its light changed colors (due to cosmological redshift, the stretching of light by the expansion itself), until it became microwaves instead of visible or ultraviolet light. This ubiquitous glow is called the Cosmic Microwave Background, or CMB for short, and if you turn an old analogue TV to an unused channel, some of the static you hear is CMB radiation picked up by your TV antenna.
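To put rough numbers on how far the light has shifted, here is a small sketch. It relies on two textbook facts that are not from this post: the temperature of the glow scales as (1 + z) with redshift z, and Wien's law gives the wavelength where a glowing body is brightest. The redshift of about 1100 for when the glow was released is also a standard figure I am assuming, and the function names are my own.

```python
WIEN_B = 2.897771955e-3   # Wien displacement constant, in metre-kelvin
T_TODAY = 2.725           # CMB temperature today, in kelvin


def temperature_at(z):
    """Temperature of the glow at redshift z, assuming T scales as (1 + z)."""
    return T_TODAY * (1.0 + z)


def peak_wavelength(temperature):
    """Wavelength (in metres) where a blackbody at this temperature peaks."""
    return WIEN_B / temperature


for z in (0, 1100):
    t = temperature_at(z)
    print(z, round(t, 1), "K", peak_wavelength(t), "m")
```

Today the peak sits at about a millimetre, squarely in the microwave band. At a redshift of 1100 the same glow was about 3000 K, with its peak about a thousand times shorter, near one micrometre.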

Since its discovery, the CMB has been one of our most powerful probes of cosmology. It lets us accurately measure how fast the universe is expanding, the relative amounts of normal stuff vs dark energy and dark matter, how the density of matter fluctuated in the early universe, how the Earth is moving relative to the expansion of the universe, and much more.

Some parts of the early universe were more dense and some were less, and this translates to slight, random variation in the color of light in the CMB. And in turn, we can translate this into a temperature. The temperature of the CMB is incredibly consistent across the sky. It’s an almost perfect 2.725 Kelvin. However, there are tiny fluctuations relative to this mean, and these reflect the dynamics of the early universe. Figure 2 shows a map of these fluctuations and I describe how this map is obtained in my post on BICEP2.

Figure 2. The measured CMB mapped on a flat surface. (Image due to the Planck collaboration.)

The CMB Axis of Evil

It’s very hard to see in figure 2, but with a little massaging, we can see that many of the fluctuations in the CMB align along a single axis, called the axis of evil, as shown in figure 1. (Formally, the quadrupole and octupole moments of the fluctuations align.) At first glance, this is quite strange, because we believe that the fluctuations in the density of the early universe should be randomly distributed in a particular way… and this is exactly the way they are distributed on smaller scales. The mottled look of figure 2 is exactly due to this particular random behaviour of the fluctuations in the CMB.

So what’s going on? There are a couple of possibilities. I’ll go over them and add my opinion (and the scientific consensus or lack thereof).

Errors in Foreground and Modelling

Perhaps the most boring explanation is that we made a mistake when creating the CMB maps like figure 1 and figure 2. As the story of BICEP2 shows, making those maps is very hard. To create them, we have to account for all the other sources of microwave radiation in the universe and carefully remove them from our measurements.

Over time, we’ve gotten incredibly good at this… so good that we can extract all sorts of information about the early universe from the CMB. But that doesn’t mean we’re always right. There could be extra dust in the solar system. Or a confluence of the gravitational pull of distant galaxies on the light of the CMB (called the integrated Sachs-Wolfe effect) could magnify a normal random fluctuation so that it appears significant.

(I am really oversimplifying the integrated Sachs-Wolfe effect here. But that’s a story for another time.)

I think errors in foreground modelling could easily account for the axis of evil.

The Universe is a Doughnut or a Sphere

Imagine an ant living on the surface of a doughnut. The ant is so small that the doughnut appears flat to it. As the ant travels forward, it will eventually return to where it started, no matter what direction it travelled. From our perspective, of course, this is because a doughnut wraps around. But to the ant, this would be quite mysterious! Figure 3 shows the doughnut from both our perspective and the ant’s perspective. This is very similar to how if you travel East on the Earth, you eventually return to your starting place.

Figure 3. An ant travels on a doughnut. From our perspective (left), the ant returns to where it started because the doughnut wraps around on itself. But from the ant’s perspective (right) it seems to walk in a straight path and eventually return to where it started.

What if our universe was like the doughnut, but in three dimensions? So if you start going in a direction, say towards Andromeda, and keep going for as long as possible, billions of light years, you would eventually get back to where you started (ignoring of course that the universe is expanding and thus the distance you would have to travel would increase faster than you could travel it).

What if we see the same things on both sides of the axis of evil because they are literally the same things and the universe has wrapped around on itself? In the original paper discussing the axis of evil, the authors discuss this very possibility. It’s a nice idea, but it can actually be tested by trying to match images of stars and galaxies (and fluctuations in the cosmic microwave background) on opposite sides of the sky to see if they look the same. The results, however, are not favourable. So no one takes this idea very seriously… even though it’s very clever.

Cosmic Variance

This one takes a bit of explanation. So bear with me. First, let’s talk about something called a posteriori statistics.

A Posteriori Statistics

Imagine a teacher breaks her students into two groups. She tells one group to flip a coin ten times and record the result as a sequence of heads or tails. The group might record, for example,


which would correspond to a string of four tails, then a string of four heads, then one head, and one tail. She tells the other group of students to make up ten coin flips, but try to do so in a way that looks random. The two collections the students return are:




And, masterfully, the teacher immediately picks out the truly random sequence.  Which one is it? How does she do it? The second sequence, TTHHHHTHTH, which looks very structured, is the random one.

The human mind is very good at picking out patterns, and attributes a cause to every pattern it sees. But random numbers, very naturally, randomly in fact, appear to make patterns, even though the pattern doesn’t mean anything. It’s just random noise. The teacher takes advantage of this. She knows her students will avoid creating a sequence that looks too structured, because they don’t think random numbers look like that. But random numbers can easily look like that.

Of course, the probability that precisely the second sequence would emerge is less than one percent. But the emergence of some sequence that looks vaguely like the second sequence is vastly more likely.  You can think of this like finding a cool looking cloud, or Jesus in your morning toast. You see the cool looking cloud and you think “Wow! A cloud that looks like an airplane! What are the odds?” But you should be thinking “Wow! A cloud that looks like an airplane! The odds of me finding a cloud that looks like something interesting are quite high because there are a lot of clouds and a lot of things I think are interesting.”

This sort of thinking is called a posteriori statistics. And in general, it causes mistaken analysis.
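The coin-flip numbers are easy to check by brute force. The sketch below lists all 1024 possible sequences of ten flips and counts how many contain a run of at least four identical results in a row, like the teacher's random sequence does. The function names are my own.

```python
from itertools import product


def longest_run(seq):
    """Length of the longest streak of identical flips in a sequence."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def fraction_with_run(n_flips, min_run):
    """Fraction of all head/tail sequences containing a streak of min_run or more."""
    seqs = list(product("HT", repeat=n_flips))
    hits = sum(1 for s in seqs if longest_run(s) >= min_run)
    return hits / len(seqs)


p_exact = 0.5 ** 10                # chance of one specific sequence
p_run4 = fraction_with_run(10, 4)  # chance of *some* streak of 4 or more
print(p_exact, p_run4)
```

Any one specific sequence has a 1 in 1024 chance, but 476 of the 1024 sequences (about 46 percent) contain a streak of four or more. A "suspiciously structured" sequence is close to a coin toss.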

The CMB Axis of Evil

So what does this have to do with the CMB? Well, people who study the CMB are well aware of the danger of a posteriori statistics, so they try to avoid thinking in this way. One way to avoid this sort of thinking is to make many many measurements. If you have a huge number of sequences of coin flips, on average, the randomness (or lack thereof) will become manifest.

And this is indeed what we do for most of the cosmic microwave background. The fluctuations on small scales, which give figures 1 and 2 their mottled texture, are numerous and we can do many statistics on them by looking at different areas of the sky.

But the axis of evil is different. It covers almost the whole sky. And we only have one sky to make measurements of! So it’s not possible to do good statistics. The fact that we have only one universe to measure, which we believe emerged from random processes, and that we can’t do statistics on a whole ensemble of universes is called cosmic variance.

And cosmic variance interferes with our ability to avoid a posteriori statistics. It lets us fool ourselves into believing that the way our universe turned out is special, when there may in fact be a multitude of equally probable ways our universe could have been. And it is entirely possible that the axis of evil is one such “fluke.”
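Cosmic variance can be made quantitative. A pattern on the sky at multipole ℓ (ℓ = 2 is the quadrupole, ℓ = 3 the octupole) has only 2ℓ + 1 independent components, so the standard estimate of the fractional uncertainty in its strength is sqrt(2 / (2ℓ + 1)). The sketch below compares that formula with a toy simulation in which each "universe" draws its 2ℓ + 1 components from a unit Gaussian. Treating them all as independent real numbers is a simplification on my part, and the function names are made up.

```python
import math
import random


def cosmic_variance_fraction(ell):
    """Standard fractional uncertainty on the power at multipole ell."""
    return math.sqrt(2.0 / (2 * ell + 1))


def simulated_scatter(ell, universes=20000, seed=0):
    """Scatter of the measured power across many imaginary universes."""
    rng = random.Random(seed)
    modes = 2 * ell + 1
    estimates = []
    for _ in range(universes):
        # One universe: average the squared amplitude over its few modes.
        power = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(modes)) / modes
        estimates.append(power)
    mean = sum(estimates) / universes
    var = sum((e - mean) ** 2 for e in estimates) / universes
    return math.sqrt(var) / mean


for ell in (2, 3):
    print(ell, cosmic_variance_fraction(ell), simulated_scatter(ell))
print(1000, cosmic_variance_fraction(1000))
```

For the quadrupole and octupole the unavoidable uncertainty is more than 50 percent, while at ℓ = 1000 (the fine mottling) it is only a few percent. That is why the small-scale statistics are trustworthy and the largest-scale ones are not.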

It is possible, in principle, to reduce the effects of cosmic variance. If we could move to another position in the universe, we would be able to see a different portion of the CMB (because the light that could have reached us since the CMB was created would come from a different place in the universe). In 1997, Kamionkowski and Loeb suggested using the emissions of distant dust to extrapolate what the CMB looks like to that dust. In principle, it would be possible, but very very hard, to use this trick to test whether or not the axis of evil comes from cosmic variance.

As you may have guessed from the amount of time I devoted to the explanation, I find cosmic variance to be a very compelling cause of the axis of evil.

The Most Likely Story, In My Opinion

So… what do I think is the cause of the axis of evil? The following is my opinion and not rigorous science. But I think it went something like this. Due to random fluctuations in the way the universe could have been, something that looks like the axis of evil formed in the CMB, but much less significant. This would be the cosmic variance explanation. To this day, the “axis of evil” remains statistically insignificant. But, because our models of what microwave sources and filters look like in the universe and in our solar system are flawed, and because we don’t take the integrated Sachs-Wolfe effect into account, the axis of evil appears much bigger to us than it actually is.

So in my mind the axis is caused both by imperfect experiments and analysis and by the human need to find patterns in everything.


I owe a huge thanks to my friend and colleague, Ryan Westernacher-Schneider, who told me this story last spring and compiled a summary and list of references. Ryan basically wrote this blog post. I just paraphrased and summarized his words.

Further Reading

I’m not the first science writer to cover this material. Both Ethan Siegel and Brian Koberlein have great articles on it. Check them out:

  • This is Brian Koberlein’s article.
  • This is Ethan Siegel’s.

For those of you interested in reading about the axis of evil in more depth, here are a few resources.

  • This is the first paper to discuss the axis of evil. It also discusses the possibility that the universe is a doughnut.
  • This paper coined the term “axis of evil.”
  • This paper discusses the possibility of solar-system dust producing the axis of evil.
  • This paper discusses the integrated Sachs-Wolfe effect and how it enhances the axis of evil.
  • This paper proposes a way of reducing cosmic variance.
  • This is the collected published results by the Planck collaboration, which analyses all aspects of the CMB in great depth.

Related Reading

If you enjoyed this post, you might enjoy my other posts on cosmology. I wrote a two-part series on the BICEP2 experiment:

I have a three-part series on the early universe:

I have a fun article that describes the cosmic microwave background as the surface of an inside-out star:

Posted in cosmology, Physics, Science And Math

A Retraction: Backwards Heat is Not Chaotic

Figure 1. Fluid turbulence, such as vortices, hurricanes, and tornadoes, can be described as chaotic. Source: Wikimedia Commons

Yesterday I wrote a post that explored the flow of heat both forwards and backwards in time. I used this as a venue to introduce the notion of entropy and to describe one extreme example of the butterfly effect—where small changes in initial data can create big changes in the final result. That’s all fine and good and I stand by that.

But I said that the reverse heat equation, which runs the flow of heat backwards in time, was an example of chaos. And as this Reddit user points out, this is very wrong. I have now fixed the original post so that it doesn’t say anything wrong. But I owe you all an explanation here.

The Heat Equation is Not Chaotic

You can never, ever actually solve the reverse heat equation. It is an example of a so-called ill-posed problem. And understanding which problems are well-posed or ill-posed is a very important topic in both physics and mathematics. (This is actually the reason I’m interested in the reverse heat equation. It’s the archetypical ill-posed problem.)
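To see why running heat backwards is ill-posed, it helps to follow a single ripple in the temperature. What follows is the standard textbook argument, sketched in my own notation: α is the diffusivity of the material, k is the wavenumber of the ripple (bigger k means a finer ripple), and a_k is the ripple's amplitude.

```latex
\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2},
\qquad
u(x,t) = a_k(t)\, e^{ikx}
\;\Longrightarrow\;
\frac{d a_k}{dt} = -\alpha k^2 a_k,
\qquad
a_k(t) = a_k(0)\, e^{-\alpha k^2 t}.

% Running time backwards from t to 0 inverts the decay:
a_k(0) = a_k(t)\, e^{+\alpha k^2 t}.
```

So an error of size ε in ripple k at the final time becomes an error of ε·exp(αk²t) in the past. Because k can be as large as you like, there is no upper limit on this amplification: arbitrarily small errors in the data can produce arbitrarily large errors in the answer. That failure to depend continuously on the data is what "ill-posed" means.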

Truly chaotic systems, on the other hand, are well-posed. Although they depend strongly on their initial conditions, meaning that finding exact solutions is difficult, they can be solved. To illustrate the difference, let’s look again at the reverse heat equation, shown in figure 2.

Figure 2. The heat equation, run in reverse. Colour shows temperature. Dark blue is coldest and red is hottest.

Temperature differences just build on themselves exponentially until the whole thing becomes completely unmanageable. And this is the problem. Now let’s look at a genuinely chaotic system: the flow of water in a very shallow pond, as shown in figure 3. (You can find another good video here.)

Figure 3. Fluid turbulence. Brightness shows vorticity (roughly energy in the vortices). The small vortices merge into bigger ones. Image made by my friend and colleague, John Ryan Westernacher-Schneider, who works on fluid turbulence.

Notice the vortices that form? The precise initial configuration of the water dramatically changes the positions of the vortices. However, although the vortices merge, they don’t grow so much that we can’t make predictions any more. And this is the important difference. This behaviour is tied to a property called topological mixing, which chaotic systems have and the reverse heat equation lacks, and its absence is what keeps the reverse heat equation from being chaotic.

(There are other technical reasons that the reverse heat equation is not chaotic. But this is the big one, and it’s the thing that I really failed to emphasize in my last post. So I’m emphasizing it here.)

As an aside, notice how small vortices become bigger? This is a property of fluids that are tightly confined in one direction, like in a shallow pond or on the surface of the Earth. It’s actually why hurricanes form. Small vortices merge to become big vortices. In fluids without the confinement, the process goes the other way: big vortices break up into small ones.

My Apologies

As a physicist—and not a mathematician—I believed that I knew the definition of mathematical chaos when I did not. And instead of checking my facts, I just blithely went ahead and wrote about it.

Many physicists don’t know about mathematical chaos; I’m not ashamed of my ignorance. But I am ashamed of not doing my homework before writing about a topic with which I am unfamiliar. Many of you trust me as an authority on math and physics, and in yesterday’s post, I failed to live up to that trust.

I promise to be more careful in the future.

Posted in Mathematics, Physics, Science And Math

Heat, Chaos, and Predictability

Figure 1. The butterfly effect: a sinister insect plot?

The butterfly effect, shown comically in figure 1, is the idea that a very small change in one place on Earth can cause a very big change somewhere else. In this case, a butterfly flaps its wings and causes a tornado. This metaphor illustrates the mathematical concept of chaos, and the Earth’s atmosphere is one example of a chaotic system. While a single butterfly probably isn’t literally responsible for a tornado, mathematical chaos is very real and important. So this week, I’m going to try giving you some intuition for the butterfly effect using one extreme example from physics.
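If you want to see the butterfly effect in a few lines, the usual textbook toy is the logistic map, which has nothing to do with weather but shows the same sensitivity. The sketch below starts two runs that differ by one part in ten billion and tracks how far apart they drift. The map and the starting values are my choice of illustration, not something from the rest of this post.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x), returning every value along the way."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


a = logistic_orbit(0.2, 100)
b = logistic_orbit(0.2 + 1e-10, 100)   # the "butterfly": a tiny nudge
separation = [abs(x - y) for x, y in zip(a, b)]

for n in (0, 10, 20, 30, 40, 50):
    print(n, separation[n])
```

The gap roughly doubles with each step, so it stays invisible for a while and then, a few dozen steps in, the two runs have nothing to do with each other. Both runs stay between 0 and 1 the whole time, though, which is part of what separates chaos from simply blowing up.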


Suppose we take a flat, rectangular piece of metal and heat it up at four specific spots. Figure 2 shows what will happen to the metal: The four hot spots (shown in red at the start) will cool off as the heat spreads out, diffusing across the metal until the whole piece reaches the same temperature.

Heat diffusion
Figure 2. The heat from four hot spots on a piece of metal diffuses across the metal. Colour shows temperature. Red is hottest. Dark blue is coolest.

If we isolate the piece of metal beforehand, no heat can “escape” it, so it will never cool back down to its original temperature. The total amount of energy in the system will stay the same. The only thing that changes is how the heat is distributed over the metal’s surface. This “flow” of heat is described by the heat equation. Given any distribution of temperature across the metal, we can use the heat equation to know how hot each area of the metal will be at any point in the future.
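To make this concrete, here is a minimal sketch of the heat equation stepping forward in time on an insulated plate. It is not the code behind figure 2; the grid size, the hot-spot positions, the step count, and the constant `alpha` are all assumptions chosen for illustration.

```python
import numpy as np

def diffuse(temp, alpha=0.2, steps=500):
    """Explicit finite-difference heat equation on an insulated plate."""
    t = temp.astype(float).copy()
    for _ in range(steps):
        # Pad with edge values so that no heat flows through the boundary.
        p = np.pad(t, 1, mode="edge")
        laplacian = (p[:-2, 1:-1] + p[2:, 1:-1]
                     + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * t)
        t += alpha * laplacian  # stable for alpha <= 0.25
    return t

plate = np.zeros((40, 60))
for row, col in [(10, 15), (10, 45), (30, 15), (30, 45)]:
    plate[row, col] = 100.0  # four hot spots (hypothetical positions)

later = diffuse(plate)
print("total heat before and after:", plate.sum(), later.sum())
print("hottest point before and after:", plate.max(), later.max())
```

The total stays the same while the peaks flatten out, which is the behaviour described above: the energy is conserved and only its distribution changes.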

But what if, instead of making a prediction about the future, we want to make a postdiction? What if we want to know the temperature of the metal at some point in the past?

Heat Flow Backwards?

Of course, we know the temperature change originated at the four spots we heated up, but let’s pretend we don’t. Suppose that we only saw our metal piece after its whole surface had reached the same temperature. Furthermore, suppose that we’re just a little uncertain about the temperature of the metal now. Maybe there are a few spots that are slightly hotter or colder than average—say, from us touching it, or from sunlight. Probably the best way to figure out what the metal looked like in the past is to take our best guess as to the temperature now, feed that number into the heat equation, and run it in reverse, right?

I did exactly that and figure 3 shows the result.

reverse heat!
Figure 3. The heat equation, run in reverse. Colour shows temperature. Dark blue is coldest and red is hottest.

That doesn’t look anything like the four dots! What’s going on? The heat equation run in reverse, creatively called the reverse heat equation, suffers from the butterfly effect. Small uncertainties in the known temperature distribution cause huge variations in the “postdicted” temperature distribution. In the case of the reverse heat equation, this effect is so severe that we can’t make any useful statements.

Let’s try to understand what’s going on.

Understanding the Reverse Heat Equation

Why is the reverse heat equation so chaotic? What causes the butterfly effect here? Let’s think about how heat behaves. Heat spreads out, from hot regions into cooler regions. This makes hot regions cool down and cold regions warm up. Eventually everything becomes uniform.

If you reverse this behaviour, like rewinding a video, heat moves from cold regions to hot regions. Hot regions become even hotter and cold regions become even colder! This means that if you take a surface with a uniform temperature and randomly make some spots just a little hotter than others, those random warm spots will just keep getting warmer. Any difference from the average temperature, no matter how small, gets exaggerated exponentially. This means that if we want to work backwards from a near-uniform temperature distribution to find out how it originally looked, we need to be exactly certain of the temperature everywhere. And we can never be exactly certain. Measurement tools are flawed. And even if we did have perfect tools, quantum mechanics forbids infinitely precise measurements (at least, in finite time).

Worse, since heat diffuses, every original pattern—no matter how strange—leads to a uniform temperature across the metal. So even if the heat spread out perfectly, with every spot exactly the same temperature as every other spot, the reverse heat equation is still useless. Confronted with an infinite number of possible original patterns, it’s forced to just make an arbitrary decision. And while this process isn’t random, the solution that the equation picks will almost certainly be incorrect, since its odds are literally infinity to one.
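A rough way to see how fast errors grow is to look at individual ripples in the temperature. Running forward, a ripple with wavenumber k shrinks by a factor of exp(-D k² t); running backward, it grows by the inverse. The sketch below assumes a plate of width 1, a diffusivity `D` of 1, and a measurement error of one part in a million in every ripple. All of these numbers are made up for illustration.

```python
import numpy as np

D = 1.0                  # diffusivity in arbitrary units (an assumption)
t_back = 0.1             # how far back in time we try to go
n = np.arange(1, 11)     # hypothetical ripple numbers on a plate of width 1
k = np.pi * n            # wavenumbers of those ripples

# Forward in time a ripple shrinks by exp(-D k^2 t); backward it grows by the inverse.
growth = np.exp(D * k**2 * t_back)

noise = 1e-6             # tiny measurement error in every ripple
for mode, g in zip(n, growth):
    print(f"ripple {mode:2d}: error {noise:.0e} grows to {noise * g:.3e}")
```

The finer the ripple, the faster its error blows up, so the smallest-scale noise in the measurement ends up dominating the postdiction.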

What Makes Heat Special?

The inability to make postdictions about temperature is surprising. Most of the laws of physics work perfectly well in reverse. If I know the height of waves in a pond—like the one shown in figure 4, for example—at the present moment, then I can say what the pond will be doing at any moment in time, whether past or future. (At least in principle. In reality, friction will convert much of the wave motion into heat. The waves also need to be sufficiently low-energy; otherwise, water can become chaotic. I’ll get to that in a bit.)

Figure 4. The height of waves in a rectangular pond, neglecting energy loss. Colour represents height. Red is high, blue is low.

So why is heat special? Roughly speaking, the temperature of a metal is actually an average of the energy of the atoms that make it up. In principle, we could track the motion of every individual atom and make a prediction of their motion after heating the metal up with a laser. Then we could make a good postdiction by tracking the atomic motion back in time.

Of course, this is impossible in practice. There are way too many particles and way too much information to keep track of, so we’d need a practically infinite amount of computing power. So instead, we use the abstraction of temperature, which averages over the particles.

This abstraction has a price, however. We are intentionally hiding information from ourselves: the precise configuration of the metal. And so it should come as no surprise that we can’t use the heat equation in reverse. We lack the necessary information to do so! We can even quantify how much information we’ve hidden from ourselves. The quantity that tells us this is the entropy of the system. And one way to understand the Second Law of Thermodynamics (“entropy never decreases”) is that, as we step forward in time using the heat equation, we forget more and more about the initial configuration of our metal.

(I want to note that, although I’ve been talking about tracking particles, which are classical, quantum mechanics has analogous ideas. Instead of tracking particles, you track (or average over) a wavefunction whose amplitude represents the probability of measuring all of the positions of a huge number of particles.)

Manageable Chaos

The reverse heat equation is totally unusable. There is no saving it. But it is an extreme example of the butterfly effect. And it’s not actually chaotic. True chaos is more manageable because it is well-posed, meaning that predictions are, in principle, possible.

Manageable chaos emerges naturally in many areas of science. If the pressure is strong enough, or the temperature or speeds high enough, fluids like air and water are actually chaotic, but in a way that we can handle. Because it takes a lot of computing power to handle the chaos in the atmosphere, it’s very difficult to make concrete predictions about the weather…but it’s not impossible.

Large-scale phenomena, like planetary motion, can also be chaotic. Two objects gravitationally attracted to each other will behave pretty predictably, but adding even one more mass to the system can cause their motion to become chaotic. Satellites under the gravitational influence of both the Earth and the moon, or both the sun and Jupiter, are important examples of such three-body systems.

Understanding chaotic systems is very difficult, but it’s also essential if we are to understand much of the universe. And in many cases, we can manage the chaos.
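For a feel of what manageable chaos looks like, here is a small sketch using the Lorenz system, a classic three-variable toy model of atmospheric convection. It is a stand-in chosen for illustration, not a weather model and not something used elsewhere in this post. The starting point, the size of the nudge, and the step size are assumptions.

```python
import numpy as np

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations with the standard parameters."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step."""
    k1 = lorenz(state)
    k2 = lorenz(state + 0.5 * dt * k1)
    k3 = lorenz(state + 0.5 * dt * k2)
    k4 = lorenz(state + dt * k3)
    return state + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

def separation_history(delta=1e-9, dt=0.01, steps=2500):
    """Distance between two runs that start a tiny nudge apart."""
    a = np.array([1.0, 1.0, 1.0])
    b = a + np.array([delta, 0.0, 0.0])   # the "butterfly"
    gaps = []
    for _ in range(steps):
        a, b = rk4_step(a, dt), rk4_step(b, dt)
        gaps.append(np.linalg.norm(a - b))
    return np.array(gaps)

gaps = separation_history()
print("gap after the first step:", gaps[0])
print("gap at the end of the run:", gaps[-1])
```

The two runs drift apart by many orders of magnitude, but the gap stays bounded and grows at a finite rate. That is the sense in which predictions remain possible for a while if the starting point is measured well enough.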

Related Reading

If you enjoyed this post, you may enjoy some of my other posts on mathematics.

  • In this post, I describe the many sizes of infinity.
  • In this post, I describe the history of imaginary numbers.

Further Reading

  • If you’re curious how I produced those images, I put my code in the IPython notebooks in this bitbucket repository. Feel free to play around with them. I’m afraid there’s no documentation at the moment.
  • You can find a more technical discussion of the heat equation and reverse heat equation in this blog post by an engineering Ph.D. student.
  • And here’s an in-depth discussion of entropy as “lost information.”
  • And for a much more in-depth discussion of chaos, check out this awesome ebook.
Posted in Mathematics, Physics, Science And Math

In-Falling Geodesics in Our Local Spacetime

Figure 1. The path of a ball (rainbow) after I drop it from high above the surface of the Earth. The green surface is our local spacetime. The red line points towards the Earth, the blue line points forwards in time. The black line is the surface of the Earth.

My previous post was a description of the shape of spacetime around the Earth. I framed the discussion by asking what happens when I drop a ball from rest above the surface of the Earth. Spacetime is curved. And the ball takes the straightest possible path through spacetime. So what does that look like? Last time I generated a representation of the spacetime to illustrate.

However, I generated some confusion by claiming that it “should be obvious” that the straightest possible path is curved towards or away from the Earth. When a textbook author says “the proof is trivial,” usually what they mean is that they don’t want to go through the work of writing a proof. The same is true here: I didn’t want to generate a picture with the path of the ball in it. Since this was confusing, however, I apologize. And to make it up to you, I’ve plotted the path of the ball, shown in figure 1.

Note that it approaches a straight line. That’s because as it accelerates it’s approaching the speed of light (we are neglecting air resistance and exaggerating the distance from the surface of the Earth to make that happen). The path of the ball is curved—it curves with the surface, after all. But it’s as straight as it possibly can be. And that’s what makes it a geodesic.

Note also that the path of light is a straight line at an angle wider than 45 degrees. I told you last time that in Minkowski space light travels at 45 degree angles. However, to make the curvature of the spacetime visible, I stretched out lengths radially (the direction of the red arrow) a bit. So actually light cones in this plot are wider. I didn’t think this would be visible when I made the plot before, but it’s quite clear if you include the geodesics. So I apologize for that slight misrepresentation last time.

I’ve updated the previous post to include this plot. So this week’s post is only for those of you who read the last post.


Our Local Spacetime

Gravity Probe B circling Earth
Figure 1. People usually imagine the distortion of spacetime due to the earth as something like this: a dip in the fabric of space. As we’ll see, the actual distortion is quite different. (Source: Gravity Probe B)

General relativity tells us that mass (and energy) bend spacetime. And when people visualize the effect of a planet on spacetime, they usually imagine something like in figure 1, where the planet creates a “dip” in spacetime much like a “gravitational well.” But today I’m going to show you what spacetime actually looks like near a planet… and it doesn’t look anything like the common picture.

This is the fifth part in my many-part series on general relativity. Here are the first four parts:

Dropping the Ball

As we learned, general relativity tells us that gravity is really a distortion in how we measure distance and duration. In the presence of mass, spacetime distorts so that distances are longer or shorter and time flows more or less quickly. Then objects (under no forces) travel along the straightest possible path through this distorted spacetime. And this motion, which doesn’t look straight, is what we perceive as gravity.

But what does this curvature look like? It’s hard to visualize. And as a result, I often get the following question: how does all this work on Earth? If I stand at the top of a cliff and drop a bowling ball, as shown in figure 2, what causes it to accelerate towards the Earth? How does the structure of spacetime make that happen? Why doesn’t it, for example, simply fall at a constant speed? Or simply hold still in the air?

I dropped the ball.
Figure 2. Me dropping a ball (red) off of a cliff.

To understand this, we’re going to try and visualize our local spacetime.

Minkowski Space

Before we talk about curved spacetime, though, I want to remind you what spacetime looks like in the absence of gravity… i.e., when it’s flat. That’s the domain of special relativity. Flat spacetime is called Minkowski space.

In Minkowski space, we give each point (or event) a position in space and a position in time, as shown in figure 3.

Two events in Minkowski Space
Figure 3. Two events in Minkowski space. Event B happens after event A, but both happen at different places.

In Minkowski space, people and objects exist at all times (between birth and death at least), but move between places. The line representing someone or something’s path through space and time is called a worldline. If an object is stationary, the worldline is vertical. If an object is moving, the worldline is at an angle, and the slope of the line is based on the speed at which the object is moving, as shown in figure 4.

Worldlines in Minkowski Space
Figure 4. Worldlines in Minkowski space. Because Jack isn’t moving, his worldline is vertical. Jill, on the other hand, is moving away from Jack at a constant rate, so her worldline is angled at a constant slope.

When working in Minkowski space, it is customary to work in units where the speed of light is one. We do this so that we can convert between position and time, which we treat as two different types of distance. (For instance, a second is the amount of time it takes light to travel 3 × 10^8 meters.)

Using such units, the worldline of a photon is a line forty-five degrees off of each axis, i.e., a line whose slope is one. The worldlines of light traveling away from a point in every direction thus form the light cone for that point. The light cones traveling into the future are future-directed and the light cones traveling into the past are past-directed, as shown in figure 5.

Light cones in Minkowski Space
Figure 5. Future- and past-directed light cones emanating from event A.
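As a small worked example of the slope rule, the sketch below converts an ordinary speed into the tilt of its worldline away from the time axis, in units where the speed of light is one. The function name `worldline_angle_deg` is just a label made up for this illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def worldline_angle_deg(speed_m_per_s):
    """Angle between a worldline and the time axis, in units where c = 1."""
    beta = speed_m_per_s / C          # speed as a fraction of light speed
    if abs(beta) > 1:
        raise ValueError("nothing travels faster than light")
    return math.degrees(math.atan(beta))

print("stationary:", worldline_angle_deg(0.0))
print("half light speed:", worldline_angle_deg(0.5 * C))
print("a photon:", worldline_angle_deg(C))
```

Anything slower than light tilts by less than forty-five degrees, which is why the worldlines of massive objects always stay inside the light cone.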

Because nothing can travel faster than light, the light cones determine what events in the past can affect current events and what events in the future can be affected by the present. As shown in figure 6, if event B is in the past-directed light cone of event A, it would be possible for event B to affect event A. However, since event C is outside of the light cone, it can’t possibly affect event A.

Past-directed causality
Figure 6. Because event B is in the past-directed light cone of event A, it can affect event A. However, because event C is outside the light cone, it cannot affect event A.
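The light cone rule can be written as a short test. This is a minimal sketch with one space dimension, in units where the speed of light is one; the coordinates given to events A, B, and C are invented to mimic figure 6 and are not read off the figure.

```python
def can_affect(cause, effect):
    """Return True if `cause` lies in the past light cone of `effect`.

    Events are (t, x) pairs in units where the speed of light is one.
    """
    dt = effect[0] - cause[0]
    dx = effect[1] - cause[1]
    return dt >= 0 and abs(dx) <= dt

event_a = (5.0, 0.0)
event_b = (2.0, 1.0)   # earlier and close enough: inside A's past light cone
event_c = (4.0, 3.0)   # earlier, but too far away for light to arrive in time

print("B can affect A:", can_affect(event_b, event_a))
print("C can affect A:", can_affect(event_c, event_a))
```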

Visualizing Far From the Earth

Since we can’t visualize a four-dimensional spacetime, we’re going to make some simplifying assumptions. We’re going to imagine that spacetime only depends on how far we are from the Earth, and we’re going to ignore things like latitude and longitude. This brings us from a four-dimensional spacetime to a two-dimensional one, which we can visualize by putting it into a three-dimensional volume.

However, things are still tricky because we want distances one travels on our two-dimensional spacetime to match up with the distances one travels in the real four-dimensional spacetime. And this is going to distort the image slightly from what we would intuitively expect. Because our visualization preserves distances in this way, it’s called an isometric embedding.

Far from the Earth, we can get the shape of spacetime in our visualization by taking a piece of paper with the graph of Minkowski space in figure 3, putting one hand each on the top and bottom of the paper, and lifting it so that the centre sags, as shown in figure 7. Because paper isn’t stretchy and the graph paper didn’t rip, we know distances were preserved.

the arrow of time is blue
Figure 7. Spacetime in our visualization far from the Earth. The red arrow points towards the Earth and the blue arrow points towards the future. Lines parallel to the red arrow are lines of constant time while lines parallel to the blue arrow are lines of constant distance from the Earth.

But wait! I said that spacetime far from the Earth was flat! So in that case, shouldn’t it just look like figure 3 and not be bent like it is in figure 7 at all? It turns out that, in the sense that we care about, both figure 3 and figure 7 are flat. The kind of curvature we’re interested in is exactly equivalent to a distortion of how we measure distance. If the graph paper doesn’t rip, it’s flat. In this sense, any shape you can make from a sheet of paper is flat.

This type of curvature is called intrinsic curvature. A two-dimensional shape is intrinsically curved if one would need to stretch or distort or cut a piece of paper to make it. In other words, if distance changes on the surface of the shape. (There are higher-dimensional generalizations of this too.) There’s another type of curvature called extrinsic curvature, which describes how a surface looks when you put it in a volume. Figure 7 is extrinsically curved while figure 3 is not.

But why do we insist on figure 7 if both figures are flat? Well, flat spacetime certainly could look like figure 3, but if it did, we would run into trouble when we got closer to the Earth. Not all two-dimensional shapes fit in three dimensions and if we want the shape of spacetime near the Earth to fit, while at the same time preserving distances, then the bit of spacetime far from the Earth has to look like figure 7.

Our Local Spacetime

Now that we know what spacetime looks like far from the Earth, we’re ready to explore what it looks like near Earth. Our local spacetime is shown in figure 8.

Figure 8. The shape of spacetime near Earth. The red arrow points towards the Earth, the surface of which is a solid black tube. The blue arrow points into the future.

The lines parallel to the red arrow are lines of constant time, and the lines parallel to the blue arrow are lines of constant distance from the Earth. Notice that the surface of the Earth, the big solid black line, is not a point but a line. This is the worldline of the surface of the Earth. Notice also that the lines scrunch together as you approach the surface of the Earth. This is because lengths and durations are actually shrinking near the Earth. We age slightly slower at sea level than we do on an airplane. (This is related to the gravitational redshift I discussed in an older post.)

If it looks like that scrunching together would eventually lead to the lines of constant distance lying on top of each other, you’re right! If I made the surface of the Earth a smaller and smaller radius, then the lines would eventually lie on top of each other. And that would be the event horizon of a black hole. The spacetime wouldn’t stop at the event horizon, of course. It would happily continue. But that’s a story for another time.

I should note that to make the curvature more visible, I’ve stretched out the axis along the red arrow. This means light travels at about 30 degrees off of horizontal, not 45 degrees.
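To put rough numbers on this section, the sketch below uses the standard clock-rate formula for the spacetime outside a non-rotating spherical mass (the same black hole spacetime the visualization is based on). It compares a clock at sea level with one 10 km up and computes the radius the Earth would have to shrink to before its surface became an event horizon. It ignores the airplane’s speed, which has its own effect on clocks, and the 10 km altitude is an assumed round number.

```python
import math

G_M_EARTH = 3.986004418e14   # Earth's mass times Newton's constant, m^3/s^2
C = 299_792_458.0            # speed of light in m/s
R_EARTH = 6.371e6            # mean radius of the Earth in m

def clock_rate(r):
    """Tick rate of a stationary clock at radius r, relative to one far away."""
    return math.sqrt(1.0 - 2.0 * G_M_EARTH / (r * C**2))

sea_level = clock_rate(R_EARTH)
cruising = clock_rate(R_EARTH + 10_000.0)   # roughly airliner altitude
gain_per_day = (cruising - sea_level) * 86_400.0

# Radius at which the clock rate above would drop to zero (the event horizon).
schwarzschild_radius = 2.0 * G_M_EARTH / C**2

print(f"extra seconds per day at altitude: {gain_per_day:.2e}")
print(f"horizon radius for the Earth's mass: {schwarzschild_radius * 1000:.2f} mm")
```

The effect at everyday altitudes is tiny, which is why the plot exaggerates distances to make the curvature visible.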

Dropping the Ball Again

So what happens when I stand on a cliff and drop a ball from the top of the cliff? The ball wants to take the straightest possible path through spacetime. Since I don’t throw the ball, I just drop it, it starts in a path roughly like that of the blue arrow. This is a path of constant radius where the only motion is forward in time. It should be roughly visible in the picture that such a path is extremely bendy. The more the ball moves either towards or away from the Earth, the straighter the path.

Of course, the ball can’t travel faster than light, so a path like that of the red arrow, which is almost a straight line, isn’t valid. The ball has to be within my light cone. Therefore, the worldline of the ball will be some path that travels both forward in time and towards the Earth. And because of the way space and time curve, this will appear as an “accelerating” path.

Figure 9. The path of the ball after I drop it. The rainbow line is the path of the ball. Angle changed to make curvature of surface and path clearer.

I plot the geodesic for the ball in figure 9. Note that it approaches a straight line. That’s because as it accelerates it’s approaching the speed of light (we are neglecting air resistance and exaggerating the distance from the surface of the Earth to make that happen). Note also that the path of light is a straight line at an angle wider than 45 degrees. That’s because of the stretched axis. The path of the ball is curved; it curves with the surface, after all. But it’s as straight as it possibly can be. And that’s what makes it a geodesic.

It’s worth noting that a path away from the Earth would also be a valid worldline. And indeed, it would be just as straight as the path towards the Earth. If, instead of dropping my ball, I threw it upwards at escape velocity, this is indeed the worldline it would choose.

If we’d somehow included latitude and longitude in our visualization, we could have seen worldlines where the ball orbited the Earth too.

Cool, huh? I think that’s enough for now.

Spacetime Isn’t Curved Into Anything

Our visualization exercise today may have led you to believe that spacetime must be curved inside some higher-dimensional space. After all, to show you the curvature of spacetime near the Earth, I took a two-dimensional spacetime and put it in a three-dimensional volume. But I did this out of convenience, to help us understand what goes on near a planet. In truth, all you need for spacetime to be curved is for distances and durations to distort. And they can distort all by themselves, without depending on a higher-dimensional space.

Play With it Yourself

If you’re interested in exploring our local spacetime, good news! I wrote a Python script that generates the surface I showed you in figure 8. You can find it in the following github repository:

Your plots won’t look exactly like figure 8, because I generated that figure using Maple 16, which makes nicer 3d plots. But it should still be fun to explore.

Further Reading

I created my visualization using the excellent paper Spacetime Embedding Diagrams for Black Holes by Don Marolf. You can find a preprint of the paper here:

I used a black hole to describe the spacetime around the Earth because far from the event horizon, the spacetimes are the same.

Related Reading

This post relied on a fair amount of special relativity. If you want to learn more about that, you may want to check out some of my older posts on special relativity:

Thanks for reading, everyone! See you next time!


Distance Ripples: How Gravitational Waves Work

Look at those curves!
Figure 1. Artist’s conception of the gravitational waves emitted by a pair of in-spiralling compact objects (like black holes or neutron stars). Image due to the LIGO collaboration.

Gravitational waves are “ripples in space time” that propagate through it like waves on water. That’s the common story and, for the most part, it’s right. But what does that mean? This is part four in my many-part series on general relativity. The first three parts introduce general relativity from the ground up. You can find them here:

Okay. Without further ado, gravitational waves!

Spooky Action at a Distance

First, I want to help you get an intuition for why gravitational waves should exist. So before we dive into the relativity, let’s step back for a moment and imagine boring old Newtonian gravity. Suppose we have a bowling ball (blue) and a marble (red), as shown in figure 2. We take the bowling ball and we move it periodically towards and away from the marble. As we do, we measure the strength of the gravitational pull the bowling ball exerts on the marble. It gets stronger as the bowling ball gets near and weaker as it moves further away. This is plotted in the bottom of figure 2.

dat sine wave
Figure 2. A simple experiment with Newtonian gravity. We take a bowling ball (blue) and a marble (red) and move the bowling ball back and forth. The plot (bottom) shows the strength of the gravity that the bowling ball exerts on the marble over time.

Notice how wavy the gravitational strength looks? At this point, you might be tempted to call it a gravitational wave. But that temptation is leading you astray. See, an important property of waves is that they travel at finite speed. Information can’t travel instantly. But in Newtonian physics, the marble feels the change in the gravitational pull of the bowling ball instantly.

So what needs to change? Well, all we need to do is sprinkle a little bit of special relativity into the mix, since special relativity says that information can travel no faster than light. Then the wiggles plotted in figure 2 would be delayed. So the marble would only feel a gravitational force a bit after we move the bowling ball.

That would be a gravitational wave.

Since special relativity is basically true, and we feel gravitational forces, this should convince you that gravitational waves should exist. And it should also give you a sense of what a gravitational wave should be like. We should feel a temporary change in the “pull” from the gravity of a distant object, which is an echo of its motion.
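The bowling ball experiment can be sketched in a few lines. This is only the toy picture from this section (Newtonian gravity with a light-travel delay bolted on), not how general relativity actually works. The masses, the distances, the period, and the crude choice of delay are all assumptions.

```python
import math

C = 299_792_458.0     # speed of light in m/s
G = 6.674e-11         # Newton's constant in SI units

def ball_distance(t, mean=10.0, swing=1.0, period=2.0):
    """Distance from marble to bowling ball as the ball is moved back and forth."""
    return mean + swing * math.sin(2.0 * math.pi * t / period)

def pull(t, delayed, ball_mass=7.0, marble_mass=0.005, mean=10.0):
    """Newtonian pull on the marble, optionally using the ball's earlier position."""
    lag = mean / C if delayed else 0.0    # crude light-travel time (an assumption)
    r = ball_distance(t - lag, mean=mean)
    return G * ball_mass * marble_mass / r**2

t = 0.3
print("instant response:", pull(t, delayed=False))
print("delayed response:", pull(t, delayed=True))
```

The delayed curve has the same shape as the instant one, just shifted by the light-travel time, so the marble always responds to where the bowling ball used to be.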

Gravitational Waves in General Relativity

But of course, gravitational waves don’t actually work the way I just described. Gravity is not a force, it’s a distortion in the way we measure distance. So how do gravitational waves work in this context? Well, in some sense, I already told you. Gravity is a distortion of how we measure distance. So a gravitational wave is a distortion in how we measure distance that travels.

Of course, there are some caveats, most of which I won’t get into. The most important caveat is how the distance distorts. Distances don’t just grow and shrink evenly in every direction. They grow in one direction and shrink in another. For example, if you took a circular ring of particles (I’m a fan of marbles) floating in outer space, and a gravitational wave passed by, you’d observe them distort into one ellipse and then another, as shown in figure 3. And this happens because the distances between the particles are changing. (For experts: I’m showing the + polarization. If you rotate by 45 degrees, you get the × polarization.)

ring around the gravity...
Figure 3. A ring of particles distorted as a gravitational wave passes by. Distances stretch in one direction and then shrink in another. Source: wikimedia commons
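Here is a minimal sketch of that stretching and squeezing for the + polarization, using the usual small-wave approximation in which distances along one axis scale by (1 + h/2) while distances along the other scale by (1 - h/2). The strain `h` is exaggerated enormously so the effect shows up in the printed numbers, and the ring of eight particles is an arbitrary choice.

```python
import numpy as np

def ring_positions(phase, strain=0.2, n_particles=8):
    """Positions of a unit ring of particles under a + polarized wave.

    `strain` is hugely exaggerated; `phase` is the wave phase in radians.
    """
    angles = 2.0 * np.pi * np.arange(n_particles) / n_particles
    x, y = np.cos(angles), np.sin(angles)
    stretch = 0.5 * strain * np.cos(phase)
    # Distances grow along x while they shrink along y, and vice versa.
    return x * (1.0 + stretch), y * (1.0 - stretch)

for phase in (0.0, np.pi / 2, np.pi):
    x, y = ring_positions(phase)
    width, height = x.max() - x.min(), y.max() - y.min()
    print(f"phase {phase:.2f}: width {width:.2f}, height {height:.2f}")
```

Half a cycle later the roles of the two axes swap, which gives the alternating ellipses in figure 3.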

Detecting Gravitational Waves, Part 1

So how would you detect a gravitational wave? Should we arrange a bunch of marbles in space and wait for them to distort? Well, in principle we could do that. But spacetime is very stiff and the distortion in distance from a gravitational wave is quite small, which is why we haven’t detected any gravitational waves yet. To get a distortion large enough that we could see it, we’d need a very big ring of marbles.

Fortunately, we have one. An artist’s impression is shown in figure 4. Except our marbles are all neutron stars and our ring is thousands of light-years wide. Basically, each marble is a type of star called a millisecond pulsar, which is a neutron star that’s rotating very fast. For reasons I won’t get into, this makes it emit light (though usually not visible light) in a beam. And as it rotates, we see a pulse as the beam points towards us, like a lighthouse. To measure a distortion in spacetime due to a gravitational wave, we measure how long a pulse takes to reach us over many many pulses. If a pulse comes earlier or later than it should, that might be a gravitational wave! To see if it is, we need to check with all the pulsars in the “ring” to see if they distorted in the right way and do some fancy math.

yeah, pulsars!
Figure 4. Artist’s depiction of a pulsar timing array. Source: NANOGrav.

This whole scheme is called pulsar timing, which is done with pulsar timing arrays. A pulsar timing array is a collaboration of people who use telescopes, like the one at Arecibo shown in figure 5, that keep track of millisecond pulsars and do statistics to see if they’ve detected a gravitational wave.

Figure 5. The radio telescope at Arecibo. Image by Jerry Valentin.

Detecting Gravitational Waves, Part 2

Pulsar timing is great and all… but is there a more… direct way we can find gravitational waves? Maybe something we can build on Earth? I’m glad you asked! We don’t really need a ring of particles, right? All we actually need are two very very precise rulers… set up so that we can measure distance growing in one direction and shrinking in another.

Fortunately, light makes an incredibly good ruler. So we can make our rulers out of laser light and compare them to detect a gravitational wave. That’s how the two LIGO detectors and detectors like them work. One of the detectors is shown in figure 6.

Livingston, I presume?
Figure 6. The LIGO detector in Livingston Louisiana. Image by the LIGO collaboration.

Each LIGO detector has two 4 km long, vacuum-sealed, seismically isolated, supercooled laser arms that measure distance incredibly accurately. If you compare the distances measured in the two arms (which is actually all you can do because LIGO is a laser interferometer), the measurement of the difference is accurate to better than one part in 10^22. This means they can measure a change in distance of one one-thousandth of the width of a proton.
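The arithmetic behind that comparison is short enough to check directly. The sensitivity and arm length come from the paragraph above; the proton width is an assumed round value of about 1.7 × 10^-15 meters.

```python
strain_sensitivity = 1e-22    # "one part in 10^22", from the text
arm_length_m = 4_000.0        # 4 km arms, from the text
proton_width_m = 1.7e-15      # rough proton diameter (an assumed value)

smallest_change_m = strain_sensitivity * arm_length_m
fraction_of_proton = smallest_change_m / proton_width_m

print(f"smallest measurable change: {smallest_change_m:.1e} m")
print(f"as a fraction of a proton's width: {fraction_of_proton:.1e}")
```

With these inputs the smallest measurable change comes out at a few ten-thousandths of a proton’s width, comfortably below the one one-thousandth quoted above.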

The LIGO systems were recently upgraded and they’re coming online this year. So stay tuned in the following years for news of a gravitational wave detection!


I should mention that moving a mass in a straight line back and forth, as in figure 2, is not enough to excite a gravitational wave in general relativity. The motion of the mass needs to have a so-called quadrupole moment. Most motions in the real world, such as orbiting a star, do have a quadrupole moment. But I wanted to mention this so that you’re not under the impression that all motion produces gravitational waves. Just most motion.

Stay Tuned!

I have a lot more to say about gravitational waves. But I think this is enough for now. In future posts, look forward to learning about the astrophysical systems that produce gravitational waves and listening to the sound of two black holes colliding.

Further Reading

I didn’t pull my description from a single source this time. I used a bunch of textbooks, such as Spacetime and Geometry by Sean Carroll and Introduction to 3+1 Numerical Relativity by Miguel Alcubierre. But here are some more accessible resources:

Related Reading

If you liked this post (and my other general relativity posts) you may be interested in some of my posts on relativistic astrophysics:


General Relativity is the Curvature of Spacetime

Einstein rings are awesome!
Figure 1. The light from a distant blue galaxy is warped and distorted into a ring by the curvature of spacetime caused by a red galaxy. (Source:

Figure 1 shows light from a distant blue galaxy that is distorted into a so-called Einstein ring by the curvature of spacetime around a red galaxy. This is called gravitational lensing and today we’ll learn how it works.

This is part three of my many-part series on general relativity. Last time, I told you how general relativity is the dynamics of distance, which we know is a consequence of the fact that gravity is the same as acceleration. This time, I describe the consequences of the fact that gravity warps distance. And in the process, we’ll learn precisely why gravity looks like a force, even though it isn’t one.

(If you haven’t read parts one and two, I recommend you do so now. You can find them here and here.)

When Distance Warps, Space Curves

First, let’s try to understand what a warping of distance means. We’re going to find that it’s the same as curvature. To understand the connection, let’s go closer to home and imagine a curved space we’re all familiar with: the surface of the Earth.

Imagine that you’re driving from your home town of City to the capital, Metropolis, and that there’s a mountain in the way, as shown on the left in figure 2. Travelling over the mountain takes more time than travelling around, both because the mountain is tall and because the vertical climb is more difficult.

A three-dimensional picture of what’s going on would show that the ground is curved upward into the shape of a mountain, forcing you to go around. However, it’s possible to encode the same information in two dimensions. If we draw the two paths on a map, as shown on the right in figure 2, the path over the mountain looks straight and the path around it looks curved. However, we define the straight path to be longer than the curved one, even though our Euclidean eyes tell us otherwise.

Travel time from City to Metropolis is shorter if we go around the mountain, rather than over it.
Figure 2. In three dimensions (left), we see that, because a mountain is in the way, the red path is shorter than the green path. However, we can encode the same information in two dimensions (right) by defining the green path to be the longer path, despite what we perceive to be intuitively obvious.

This tells us that a curved surface (in this case, the region around City and Metropolis, which bulges out with a mountain) is the same as a surface where distance is distorted. And we can go the other way. A distortion in the way we measure distance implies curvature.
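To make the map picture concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than something read off figure 2: the mountain is a Gaussian bump with height `H` and width `SIGMA`, City and Metropolis sit at (-2, 0) and (2, 0), and the detour is a cosine-shaped arc. The sketch measures each path twice, once with the ordinary flat-map ruler and once with a ruler that is distorted by the slope of the mountain. It only accounts for extra distance, not for how tiring the climb is.

```python
import math

H, SIGMA = 3.0, 0.5  # hypothetical mountain height and width (arbitrary units)

def slope(x, y):
    """Gradient of the assumed mountain h = H * exp(-(x^2 + y^2) / (2 SIGMA^2))."""
    h = H * math.exp(-(x * x + y * y) / (2 * SIGMA ** 2))
    return -x / SIGMA ** 2 * h, -y / SIGMA ** 2 * h

def path_lengths(path, steps=20000):
    """Return (flat-map length, distorted length) of path(t), t in [0, 1]."""
    flat = warped = 0.0
    x0, y0 = path(0.0)
    for i in range(1, steps + 1):
        x1, y1 = path(i / steps)
        dx, dy = x1 - x0, y1 - y0
        hx, hy = slope(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        dh = hx * dx + hy * dy  # height gained over this small step
        flat += math.hypot(dx, dy)
        # Distorted ruler: ds^2 = dx^2 + dy^2 + (hx dx + hy dy)^2
        warped += math.sqrt(dx * dx + dy * dy + dh * dh)
        x0, y0 = x1, y1
    return flat, warped

def over(t):
    """Straight line on the map, passing over the summit."""
    return -2.0 + 4.0 * t, 0.0

def around(t):
    """Curved line on the map that swings around the mountain."""
    x = -2.0 + 4.0 * t
    return x, 1.5 * math.cos(math.pi * x / 4.0)

flat_over, warped_over = path_lengths(over)
flat_around, warped_around = path_lengths(around)
print(f"over the mountain:   map {flat_over:.2f}, distorted {warped_over:.2f}")
print(f"around the mountain: map {flat_around:.2f}, distorted {warped_around:.2f}")
```

On the flat map the straight path is the shorter of the two, but with the distorted ruler the detour wins. The program never needs a third dimension to work that out, only a rule for measuring distance at each point on the map.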

In the context of general relativity, this is what we mean when we say spacetime is curved. Distance has warped such that the straightest possible path is not what you expect.

In Curved Spacetime, Straight Paths Look Curved

Let’s get some better intuition for how curved spaces work. The curved surface we’re most familiar with is the Earth, so let’s see if we can’t get some feel for curvature by exploring how we move around on Earth.

Say you want to go from Narita, Japan to San Diego, U.S.A. What’s the shortest route? Naively, you’d look at a map and draw a straight line between the two cities. However, if you look at Japan Airlines’ route map, shown in figure 3, you’ll see something quite different.

Japan Airlines Route Map
Figure 3. The naive route between Narita and San Diego is a straight line. However, aeroplanes actually fly a great deal further north than that (source).

What’s going on (other than the effects of prevailing winds)? It’ll help if we look at the Earth as a sphere instead of as a plane, as shown in figure 4. The straight line between the two cities goes through the Earth, so that’s a no-go. The naive path is just a straight line on a flat map, which in this case keeps our latitude more or less constant; this is doable, but not the best we can do. The best path is a path that goes a bit north.

Paths between Narita and San Diego
Figure 4. Because the Earth is curved, we can’t travel in a straight line between Narita and San Diego (left). A straight line on a map, the naive path, isn’t the best we can do either (center). The best path heads north (right).

What’s so special about this last path? Every path on the Earth must curve, because the sphere curves. However, there’s a portion of the curvature of the path that comes from the curvature of the Earth and there’s a portion that comes from the curvature of the path itself. The latter is called the geodesic curvature. A path that’s as straight as possible—i.e., whose only curvature is the curvature of the surface it’s on—is called a geodesic. This straightest possible path, which has no curvature of its own, will always also be the shortest possible path between two points.

The geodesics for planet Earth are the great circles. These are the circles with the same radius as the Earth; in other words, take the circle formed by the equator and rotate it to make it pass through any two points, as shown in figure 5. A great circle will always cut the Earth into two hemispheres of equal size. However, they will no longer be the Northern and Southern Hemispheres we’re used to.

Great circles are great!
Figure 5. The great circle that connects Narita and San Diego. It has the same radius as the Earth itself.
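If you’d like to check the Narita to San Diego claim yourself, here is a rough Python sketch. It assumes a perfectly spherical Earth with a radius of 6371 km and uses approximate airport coordinates that I’ve filled in for illustration. It compares the great-circle distance (the haversine formula) with the length of a constant-heading path, which is what a straight line on a Mercator map corresponds to, and it reports the latitude halfway along the great circle.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; assumes a perfectly spherical Earth

# Approximate (latitude, longitude) in degrees; assumed values for illustration
NARITA = (35.77, 140.39)
SAN_DIEGO = (32.73, -117.19)

def great_circle_km(a, b):
    """Haversine distance along the great circle through a and b."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def map_line_km(a, b):
    """Length of the constant-heading path (a straight line on a Mercator map)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    dlat = lat2 - lat1
    # Take the short way around in longitude (across the Pacific here).
    dlon = (lon2 - lon1 + math.pi) % (2 * math.pi) - math.pi
    dpsi = math.log(math.tan(math.pi / 4 + lat2 / 2)
                    / math.tan(math.pi / 4 + lat1 / 2))
    q = dlat / dpsi if abs(dpsi) > 1e-12 else math.cos(lat1)
    return EARTH_RADIUS_KM * math.hypot(dlat, q * dlon)

def midpoint_latitude_deg(a, b):
    """Latitude of the point halfway along the great circle from a to b."""
    vectors = []
    for lat, lon in (a, b):
        lat, lon = math.radians(lat), math.radians(lon)
        vectors.append((math.cos(lat) * math.cos(lon),
                        math.cos(lat) * math.sin(lon),
                        math.sin(lat)))
    total = [u + v for u, v in zip(*vectors)]
    norm = math.sqrt(sum(c * c for c in total))
    return math.degrees(math.asin(total[2] / norm))

geodesic = great_circle_km(NARITA, SAN_DIEGO)
naive = map_line_km(NARITA, SAN_DIEGO)
mid_lat = midpoint_latitude_deg(NARITA, SAN_DIEGO)
print(f"great circle: {geodesic:.0f} km, map line: {naive:.0f} km")
print(f"latitude halfway along the great circle: {mid_lat:.1f} degrees north")
```

Under these assumptions the great circle comes out several hundred kilometers shorter than the map line, and its halfway point sits well north of both cities.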

The lesson that I want you to take away from this is that, in a curved space (or spacetime!), straight doesn’t mean what you think it means. In flat space, a geodesic is a straight line. In curved space, a geodesic is not a straight line, but it’s the closest thing to a straight line you can get. Indeed, it’s the appropriate definition of straightness.

In curved spacetime, straight lines look curved.

Gravity, Curvature, and Lensing

I told you gravity isn’t a force, but looks like one. We’re now almost ready to understand that. Let’s walk through the argument. The presence of mass, which we typically think of as gravity, distorts distance and time nearby. This, as we just learned, curves spacetime. And in a curved spacetime, straight lines don’t look straight.

Now here’s the clincher.

In the absence of an external force, objects travel along the straightest possible paths, geodesics, through spacetime.

In the absence of gravity, those paths look like the straight lines we’re all used to. But in the presence of mass, they can look very curved.

That, my friends, is the gravitational “force.” And let’s be clear. It’s not a force! Particles under the influence of gravity aren’t moving, at least not in the traditional sense. (They’re moving forward in time only.) It’s just that, to us, they appear to be moving because spacetime is curved. This is why, in Galileo’s famous experiment at the leaning tower of Pisa, the feather and the bowling ball fall at the same rate: they’re not falling at all.

Gravitational Lensing

We’re now ready to discuss the gravitational lensing shown in figure 1. The red galaxy distorts the spacetime around it, very much like the “mountain” in figure 2, so that the straightest possible path for light coming from the distant blue galaxy behind it is curved. The result is that light gets spread out and “lensed” to form the Einstein ring you see in the image.

Gravitational lensing is a powerful tool. We use it to search for dark matter, to measure the age and size of the universe, and even to look for planets outside the solar system. In short, it’s pretty awesome.

Spacetime Isn’t Curving Into Anything

Now, before I conclude, there’s a common misconception that I want to nip in the bud. People think that, because the universe is curved, it has to be curved into something. In other words, in the same way that the surface of the Earth is a curved two-dimensional sphere embedded in three-dimensional space, the curved four-dimensional universe must be embedded in some higher-dimensional space.

This is wrong.

The universe doesn’t need to be embedded in a bigger space to be curved. All it needs is for the way we measure distance in our own four dimensions to be distorted. We can understand and encode all the information we need about the curvature of spacetime in how distances shrink and durations stretch out. In other words, if you look at figure 2, the left picture isn’t important. Only the right picture matters.

Okay, that’s all for now, folks. Starting next time, I’m going to discuss various different cool properties of general relativity… black holes, gravitational waves, that sort of thing. Exciting!

Related Reading

If you liked this post, you may be interested in some of my older posts on gravity, curvature, and all that.

Further Reading

I took my treatment of general relativity from Sean Carroll’s excellent text, Spacetime and Geometry. However, there are some great, less technical resources online. Currently, my favorite is this five-part series by PBS Space Time on YouTube:

If you’d like some information on the history and context of general relativity and the measurements we’ve made that tell us it’s true, check out these great articles by Ethan Siegel and Brian Koberlein:


General Relativity is the Dynamics of Distance

kogler crazy art installation
Figure 1. This art installation by Peter Kogler at the Zagreb Museum of Contemporary Art gives a feeling of what the spacetime we live in might look like, at least in extreme cases. (Source:

This is part two in a many-part series on general relativity. Last time, I described how Galileo almost discovered general relativity. In particular, I told you that gravity isn’t a force. In fact, gravity is the same as acceleration. Now, this is a completely crazy idea. After all, we’re all sitting in the gravitational field of the Earth right now, but we don’t feel like we’re moving, let alone accelerating. But let’s take this crazy idea at face value and see where it leads us.

(Of course, the Earth is spinning, which is an acceleration. And it’s orbiting the sun, which is an acceleration. And the sun is moving in the galaxy. But let’s ignore all that. It’s not important for the argument I want to make.)

But first, we need to make a brief detour and discuss the Doppler effect.

(If you haven’t read my previous post on why gravity is acceleration, I recommend you do so now. It is here:

The Doppler Effect

The Doppler effect is a bit complicated (especially for light), so I won’t go into too much depth. Instead, I’ll describe it by analogy. (I’ve given the same analogy before, in my article on the expanding universe. So if you remember, you can skip all this.)

Imagine that Paul Dirac and Leopold Kronecker are playing catch, as in figure 2. Each second, Kronecker throws a ball to Dirac, who catches it. Thus, the frequency of balls that Dirac catches is 1 Hertz (Hz)—one per second, or one inverse second.

Kronecker and Dirac Playing Catch While not Moving
Figure 2. Leopold Kronecker (left) and Paul Dirac (right) playing catch. Every second, Kronecker throws a ball to Dirac, who catches it. Thus, the frequency of balls caught is 1 Hz. (Source for Kronecker can be found here. Source for Dirac can be found here.)

But now imagine that Dirac starts backing away from Kronecker, as shown in figure 3. Kronecker continues to throw at a rate of one ball per second. However, since Dirac is moving away from the balls, each one takes longer to get to him. Thus, he catches the balls at a rate slower than one per second…say, one every 1.5 seconds.

Dirac moves away from Kronecker.
Figure 3. Dirac starts moving away from Kronecker. Because it takes the balls longer to reach Dirac, he only catches one every 1.5 seconds, even though Kronecker still throws the balls at a rate of one per second. (Source for Kronecker can be found here. Source for Dirac can be found here.)
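The arithmetic behind the game of catch can be sketched in a few lines of Python. The speeds below are made up for illustration (the story above only gives the throw rate and the 1.5 second result), and the sketch assumes that every ball flies at the same constant speed and that Dirac backs away at a constant speed slower than the balls.

```python
def catch_interval(throw_interval, ball_speed, catcher_speed):
    """Time between catches when the catcher backs away at constant speed.

    A ball thrown at time t0 reaches the catcher when
        ball_speed * (t - t0) = start_distance + catcher_speed * t,
    so consecutive catches are spaced by
        throw_interval * ball_speed / (ball_speed - catcher_speed).
    """
    if catcher_speed >= ball_speed:
        raise ValueError("the balls never catch up with the catcher")
    return throw_interval * ball_speed / (ball_speed - catcher_speed)

# Hypothetical numbers: balls fly at 15 m/s, Dirac backs away at 5 m/s.
standing_still = catch_interval(1.0, 15.0, 0.0)
backing_away = catch_interval(1.0, 15.0, 5.0)
print(standing_still, backing_away)
```

With these assumed speeds, Dirac catches one ball per second when he stands still and one every 1.5 seconds when he backs away, even though Kronecker never changes his throwing rate.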

A similar thing happens with both light and sound. (In the case of sound, we call it the acoustic Doppler effect.) Light is a wave. It has peaks and troughs which wiggle up and down in time, as shown in figure 4. The number of peaks (or troughs) per meter is called the wave number. The rate at which it wiggles up and down in time is called the frequency. The two are related by the speed of the light wave, which is always constant, so they’re basically interchangeable.

Wave with labels
Figure 4. A light wave has peaks and troughs. The number of peaks that pass by Dirac in a given second is analogous to the frequency of the wave.

The frequency of a light wave is analogous to the frequency at which Kronecker throws balls at Dirac. Instead of counting the number of times Kronecker throws the ball, we count the peaks of the wave. The frequency of a light wave also determines its color; high frequencies are blue, while low frequencies are red.

This means that if Kronecker fires a green laser at Dirac, and Dirac moves away from him, the laser light will appear more reddish to Dirac than it does to Kronecker. This is called a redshift. If Dirac were moving away from Kronecker at an increasing rate, in other words if Dirac were accelerating, the redshift would be even more pronounced.
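For a sense of scale, here is a small Python sketch of the redshift for a receding observer. It uses the standard special-relativistic formula for motion directly along the line of sight, which goes beyond the ball analogy above. The 532 nm wavelength (a common green laser pointer) and the speed of 10% of the speed of light are assumptions chosen to make the effect visible.

```python
import math

def observed_wavelength_nm(source_wavelength_nm, beta):
    """Wavelength seen by an observer receding at speed beta * c.

    Uses the relativistic longitudinal Doppler formula
        lambda_observed = lambda_source * sqrt((1 + beta) / (1 - beta)).
    """
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between -1 and 1")
    return source_wavelength_nm * math.sqrt((1.0 + beta) / (1.0 - beta))

GREEN_LASER_NM = 532.0  # assumed wavelength of Kronecker's laser

at_rest = observed_wavelength_nm(GREEN_LASER_NM, 0.0)
receding = observed_wavelength_nm(GREEN_LASER_NM, 0.1)
print(f"at rest: {at_rest:.0f} nm, receding at 0.1 c: {receding:.0f} nm")
```

A longer wavelength means a lower frequency, so the light Dirac sees has moved from green toward the red end of the spectrum. Everyday speeds are a tiny fraction of the speed of light, which is why we never notice this shift in ordinary life.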

Gravitational Redshift

So what does all this have to do with gravity? Well, remember, gravity is acceleration. So we should be able to see a Doppler-like effect just by moving from a region with strong gravity into a region with weak gravity, or vice-versa. To see what I mean, imagine that Kronecker and Dirac are up to their old tricks. But this time, imagine that Kronecker is on Earth, and Dirac is in space, as shown in figure 5.

I'd watch a movie about Dirac in space...
Figure 5. Kronecker sends a beam of green laser light from Earth (where gravity is strong) to Dirac in space (where gravity is weaker). By the time it arrives, the light is redder.

Kronecker fires a green laser up at Dirac. Now, remember: gravity is acceleration. Both Kronecker and Dirac are in a gravitational field, so they’re both accelerating. But Kronecker is in a stronger field, so he’s accelerating more. This means that, from Dirac’s perspective, Kronecker is accelerating away from him. Therefore, by the time the light reaches Dirac, he sees it redshifted because of the Doppler effect.

In the context of general relativity, we call this gravitational redshift, and it’s a real effect. We need to take it into account when we read signals sent to us from GPS satellites, for example.
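To get a feel for the size of the effect for GPS, here is a back-of-the-envelope Python sketch. It uses the weak-field approximation for the fractional frequency shift between two heights, shift ≈ (GM/c²)(1/r_low − 1/r_high), with rounded values for the Earth’s radius and the GPS orbital radius that I’ve assumed for illustration. It deliberately ignores the satellite’s orbital speed, which produces a smaller shift in the opposite direction.

```python
GM_EARTH = 3.986004418e14      # m^3/s^2, Earth's gravitational parameter
SPEED_OF_LIGHT = 299_792_458.0 # m/s
EARTH_RADIUS = 6.371e6         # m, rounded mean radius (assumed)
GPS_ORBIT_RADIUS = 2.6571e7    # m, rounded GPS orbital radius (assumed)
SECONDS_PER_DAY = 86_400.0

def gravitational_shift(r_low, r_high):
    """Fractional rate difference between clocks at r_low and r_high.

    Weak-field approximation; positive means the higher clock ticks faster,
    so light sent upward arrives redshifted by the same fraction.
    """
    return GM_EARTH / SPEED_OF_LIGHT ** 2 * (1.0 / r_low - 1.0 / r_high)

shift = gravitational_shift(EARTH_RADIUS, GPS_ORBIT_RADIUS)
drift_per_day = shift * SECONDS_PER_DAY
print(f"fractional shift: {shift:.2e}")
print(f"clock drift: {drift_per_day * 1e6:.1f} microseconds per day")
```

A few tens of microseconds per day sounds tiny, but light travels about 300 meters in one microsecond, so an uncorrected drift of this size would ruin position fixes within a day.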

Redshift, Distance, Time

The peaks and troughs of light make it an extraordinarily good ruler. If you know the wave number of a wave of light, you can count the number of peaks in the wave between two places and calculate how far away those two places are from each other. In a very real sense, distance is defined by this procedure.
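As a minimal sketch of the light-ruler idea in Python: the wave number is taken to be the number of peaks per meter (one over the wavelength), the frequency follows from the speed of light, and a distance is a peak count divided by the wave number. The 532 nm wavelength and the count of one million peaks are assumptions for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def wave_number(wavelength_m):
    """Peaks per meter for light of the given wavelength."""
    return 1.0 / wavelength_m

def frequency_hz(wavelength_m):
    """Peaks per second: the wave number times the speed of light."""
    return SPEED_OF_LIGHT * wave_number(wavelength_m)

def distance_m(peak_count, wavelength_m):
    """Distance spanned by peak_count peaks of the wave."""
    return peak_count / wave_number(wavelength_m)

GREEN = 532e-9  # meters; assumed wavelength of a green laser

ruler_length = distance_m(1_000_000, GREEN)
print(f"wave number: {wave_number(GREEN):.3e} peaks per meter")
print(f"frequency:   {frequency_hz(GREEN):.3e} Hz")
print(f"a million peaks span {ruler_length:.3f} m")
```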

How, then, do we interpret the redshifted light that Dirac sees? If light on Earth is redshifted when it goes into space, that light stretches out. The distance between adjacent peaks in the light wave grows. Does this mean that distance itself grows?

Yes. It means exactly that.

In a strong gravitational field, distances are shorter than in a weak gravitational field. Indeed, because the wave number of a wave and the frequency of a wave are interchangeable, this also means that times are longer in duration in a strong gravitational field than in a weak gravitational field.

We started with the crazy (but true!) idea that gravity is the same as acceleration. But this has led us to an even crazier (but still true!) idea: gravity shrinks distance and stretches duration.

This is what people mean when they say that gravity is a warping of space and time (or suggestively, spacetime). The very way that we measure distance is distorted by a gravitational field.

And general relativity is the dynamics of distance.

Next time we’ll talk about how a warped spacetime creates the illusion of a gravitational force.

Further Reading

I took the gravitational redshift argument directly out of the excellent textbook Spacetime and Geometry by Sean Carroll. If you have a good background in math and you want to learn general relativity, I highly recommend it. Here are some other resources:

  • This is a nice video on the Doppler effect.
  • The PBS Spacetime Vlog has an excellent series of videos on general relativity. The first two videos cover what I’ve covered so far, but from a different perspective. You can find them here and here.