Unfortunately I will be taking a hiatus on blog posts until mid December. The reason is that graduate school is pretty hectic at the moment and I’m feeling a bit too overwhelmed. See you all in about three weeks!
I have one simple request.
And that is to have sharks
with frikkin laser beams
attached to their heads!
Always look on the bright side
…unless you’re holding a laser pointing device.
The laser is, without a doubt, one of the most ubiquitous, archetypal technologies of modern times. And it is one of the most direct applications of quantum mechanics. But how do lasers work?
It All Starts In The Atom
The story starts deep within the atom. I’ve previously discussed the fact that particles are waves and that this forces electrons to have only certain specific energies inside an atom. The energy and momentum of a particle control how many times the corresponding wave wiggles. And these must fit in a circle around the nucleus of the atom, as shown below.
If the atom is part of a molecule, especially a crystal, the discrete allowed energies become so numerous that they look like continuous bands. And this leads to band structure.
For clarity, physicists often imagine extremely simple atoms with only two or three allowed electron orbits, each of which is allowed only at a single specific energy and a single specific momentum. We then plot these energies as a function of their allowed momenta. The plot is called an “energy level diagram,” and it looks something like the figure below.
Between Light And Matter
Now let’s imagine an electron sits in the lowest energy level, as shown below.
When a photon, a particle of light, hits the atom (or alternatively passes right through it), it has the potential to affect the electron. Classically (i.e., without quantum mechanics), the light would accelerate the electron, since the electron is a charged particle and light is made up of electromagnetic fields, which affect charged particles. However, if the electron accelerates, it gains kinetic energy. This gain is only allowed if the electron ends up with one of the allowed energies.
If the electron is accelerated, it will absorb the photon, absorbing both the energy and momentum of the photon. So it is only allowed to absorb the photon if the electron’s new energy and momentum are allowed within the atom. Otherwise, surprisingly, the photon passes right through the atom unmolested, as shown below.
The same process works in reverse. Electrons are lazy and they want to be in the lowest possible energy state. So they’ll do whatever they can to drop from a high energy state to a lower one. And the easiest way for an electron to drop to a lower energy state is by emitting a photon. The emitted photon must, of course, have energy and momentum such that the electron’s new energy state is allowed, as shown below. This process is known as fluorescence.
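To make the matching condition concrete, here is a minimal sketch of the bookkeeping. It assumes the Bohr-model levels of hydrogen, E_n = -13.6 eV / n^2, and it only checks energy. Momentum matching and the detailed selection rules are ignored, and the function names and tolerance are mine, chosen for illustration.

```python
# Sketch: can a hydrogen atom absorb a photon of a given wavelength?
# Assumptions: Bohr-model levels E_n = -13.6 eV / n^2, energy matching only
# (momentum and detailed selection rules are ignored).

RYDBERG_EV = 13.6    # approximate binding energy of the hydrogen ground state, eV
HC_EV_NM = 1239.84   # Planck's constant times the speed of light, in eV*nm


def level_energy(n):
    """Energy of the n-th Bohr level in eV (negative means bound)."""
    return -RYDBERG_EV / n**2


def photon_energy(wavelength_nm):
    """Energy in eV of a photon with the given wavelength in nanometers."""
    return HC_EV_NM / wavelength_nm


def absorbing_level(wavelength_nm, n_start, n_max=10, tolerance_ev=0.02):
    """Return the level the electron can jump to, or None if no gap matches."""
    e_photon = photon_energy(wavelength_nm)
    for n_end in range(n_start + 1, n_max + 1):
        gap = level_energy(n_end) - level_energy(n_start)
        if abs(gap - e_photon) < tolerance_ev:
            return n_end
    return None


print(absorbing_level(121.57, n_start=1))  # ultraviolet photon matching the 1 -> 2 gap
print(absorbing_level(500.0, n_start=1))   # green photon: no matching gap
```

The ultraviolet photon carries almost exactly the energy difference between the first two levels, so the electron can take it. The green photon matches no gap, so in this simple picture it passes straight through.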
The rules determining how an electron may change energy and momentum are called “selection rules.”
Cheating Selection Rules
Of course, selection rules aren’t absolute. Quantum mechanics is inherently probabilistic, and the Heisenberg uncertainty principle forbids us from knowing all quantities perfectly well. This means that if we shine a beam of light on an atom such that most of the photons have the wrong energy and momentum for the electron to transition to a new energy level… every once in a while, by pure quantum chance, a photon will come along with the right energy and momentum and the electron will transition, as shown below.
Another way you can think about it is that, eventually, the electron itself moves a little bit out of the allowed energy levels and it can absorb one of the forbidden photons, as shown below.
Now, let’s imagine that an electron starts in a low-energy state. And it is excited into a high energy state by a photon with the appropriate energy and momentum. Then, while the electron is still in this high-energy state, another photon with the same energy and momentum hits the atom. What happens?
Intuitively, the photon should pass harmlessly through the atom, unabsorbed, because the electron has nowhere to go. However, this isn’t what happens at all. The electron will drop down to a low-energy state and emit an identical photon, traveling in the same direction and with the same energy and momentum as the incident photon, as shown below. This is called stimulated emission, and it is the magic that makes lasers work.
Unfortunately, I can’t really give a good explanation for how stimulated emission works. The mathematics that predicts it comes from time-dependent perturbation theory, a way to examine the quantum mechanics of complicated situations. I can say that absorption and stimulated emission are opposites. The math for each is the same. Indeed, the process that’s most different is the most intuitive: fluorescence, where the atom decays without any stimulus at all.
Population Inversion and Gain
If we could take advantage of stimulated emission, we could use it to amplify a beam of light and make it very intense. More importantly, every photon in the beam could be generated from a single seed photon. The beam could be made of clones, all traveling in the same direction, all with the same energy and momentum. This would let us control the properties of the beam very precisely. (This property is called coherence.)
Unfortunately, atoms like to fluoresce, which means that most electrons do not stay in a high-energy state for long enough for us to initiate stimulated emission. Is there a way around this?
There is a way around this problem! Some transitions between states take longer than others. (This has to do with the quantum mechanics of selection rules that I talked about earlier.) Furthermore, some transitions are more likely to occur naturally than others. In other words, if we select the right atom, we can control how electrons in it transition between states. We can find an atom where the electrons transition to a high energy state very quickly, but then decay into a middle state where they stay for a long time. If we do this fast enough, we can get all of our electrons into the middle state, as shown below. This is called a “population inversion.”
Once we have a population inversion, all it takes is one seed photon. We put a block of our inverted material (called a gain medium) in between two mirrors, as shown below. Then we make the material fluoresce once. It doesn’t really matter how. Eventually the material will fluoresce if it’s in population inversion.
Once one photon is between the two mirrors, it will bounce off of a mirror and pass through the gain medium, causing stimulated emission. Then two photons will bounce off of a mirror and pass through the gain medium, causing stimulated emission. Then four photons will bounce off of a mirror… Well you get the idea.
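The doubling is easy to sketch. This toy model assumes every photon triggers exactly one stimulated emission on each pass through the gain medium, that the mirrors are perfect, and that the population inversion never runs out. A real laser is messier on all three counts, so treat this as a cartoon.

```python
# Toy model of photon doubling in an idealized laser cavity.
# Assumptions: one stimulated emission per photon per pass,
# lossless mirrors, and a gain medium that never depletes.

def photons_after_passes(passes, seed_photons=1):
    """Photon count after the given number of passes through the gain medium."""
    count = seed_photons
    for _ in range(passes):
        count *= 2  # each photon leaves with an identical clone
    return count


for n in (1, 2, 3, 10, 30):
    print(n, photons_after_passes(n))
```

After only thirty passes, the single seed photon has become more than a billion identical photons.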
This is how laser light gets so intense.
But why is laser light only one color? This is actually much easier to explain. It’s a consequence of the fact that the gain medium is placed between two mirrors. Remember that photons are both particles and waves. And that the wavelength of the wave determines the color of the light. Moreover, light waves are made up of electric and magnetic fields. The electric field of the light must be zero at the mirror, because mirrors are conductors. The electrons in the mirror move to cancel whatever electric field might otherwise exist.
This means that, just as an electron orbiting a nucleus can only fit an integer number of wavelengths into the orbit, a light beam can only fit an integer number of wavelengths between the two mirrors, as shown below. Otherwise, the electric field would not be zero at the mirrors.
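Here is a small sketch of that fitting condition. If the field has to vanish at both mirrors, a whole number of half-wavelengths fits between them (equivalently, a whole number of wavelengths fits into one round trip), so the allowed wavelengths are 2L/n for a cavity of length L. The one-micrometer cavity is my own illustrative choice, far shorter than most real lasers, picked so that only a few modes land in the visible range.

```python
import math

# Sketch: wavelengths that fit between two mirrors a distance L apart.
# Assumption: the field vanishes at both mirrors, so wavelength = 2 * L / n.

def cavity_modes(length_nm, min_nm, max_nm):
    """Return (n, wavelength in nm) for every mode inside [min_nm, max_nm]."""
    n_low = max(1, math.ceil(2 * length_nm / max_nm))
    n_high = math.floor(2 * length_nm / min_nm)
    return [(n, 2 * length_nm / n) for n in range(n_low, n_high + 1)]


# Hypothetical cavity, 1000 nm long, restricted to roughly visible light.
for n, wavelength in cavity_modes(1000.0, 380.0, 750.0):
    print(n, round(wavelength, 1))
```

Only three colors survive in this tiny cavity. Everything in between fails to fit and is suppressed.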
Where don’t lasers have applications? We use them in medicine for laser eye surgery. We use them in our computers to read optical disks. We use them in our factories to cut metal. We use them to send light signals through fiber optic cables for communication. We use them to measure distance. We use them to measure time. We use them to generate fusion power, and we use them to help us calibrate our telescopes. I’ll talk about some of these ideas in future posts. If you’d like to hear about a specific application, let me know and I’ll see what I can do.
Where to even start? Here are some resources:
- PHET has a simulation of a laser suitable for classroom demonstrations. It just runs in a web applet.
- Minute Physics has a nice video. It uses Bosonic statistics to explain stimulated emission. I don’t really like this explanation, but it does give a good intuition.
- The National Ignition Facility, where they’re trying to use lasers to make fusion power, has a nice article.
- How Stuff Works has an article on lasers too.
- LFI International has a nice article too.
Questions? Comments? Insults?
As always, let me know if you have questions, comments, or hatemail. Or if you just want to speak your mind.
The patient accretion of knowledge,
the focusing of all one’s energies on some problem in history or science,
the dogged pursuit of excellence of whatever kind
these are right and proper ideals for life.
Nothing can escape from a black hole, not even light. This is why we call them “black.” One would imagine, then, that black holes are black invisible menaces, lurking out in the depths of space. Surprisingly, though, black holes glow. The cover image shows a radio photograph of the center of the Milky Way. The center glow, Sagittarius A, is partly due to a supermassive black hole, Sagittarius A*. (No, that doesn’t lead to a footnote…the name of the black hole actually is Sagittarius A*, pronounced “a star.”)
Black holes glow because they are very messy eaters. As a black hole sucks in surrounding matter, it pulls its food into a disk or a sphere around it, called an “accretion disk” or an “accretion shell,” as shown below. And it is partly this disk that generates the incredible glow. (There is another process, called a “jet,” which also produces a lot of light. I’ll briefly talk about it later.)
But why doesn’t stuff in the accretion disk just fall into the black hole? The answer, elegantly enough, is the same reason that the planets in our solar system don’t fall into the sun.
Imagine that you tie a ball to a string and spin it over your head. The ball will fly out to stretch the string as much as possible and, if you let the string go, the ball will fly away from you in a direction tangential to the circle. This effect is so prominent that it can be used to make a weapon called a “bola.”
As Sir Isaac Newton predicted, objects like to travel in straight lines–you have to push or pull them to make them deviate. This resistance to change is called momentum. Thus, to make an object travel in a circle, you have to constantly pull it towards the center of the circle, forcing it to turn. The faster an object moves (or the more massive it is), the harder it is to turn, and the more force you have to use to pull it towards the center of the circle. Although the object’s tendency to fly out of the circle emerges purely from its momentum, for convenience, we often pretend it’s a separate “centrifugal force.”
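For readers who like numbers, the pull needed to keep something on a circle is the standard centripetal force, F = m v^2 / r. The sketch below just evaluates it. The ball, string, and speeds are made-up values for illustration.

```python
# Sketch: force needed to keep an object moving on a circle, F = m * v^2 / r.

def centripetal_force(mass_kg, speed_m_s, radius_m):
    """Inward force in newtons required for circular motion."""
    return mass_kg * speed_m_s**2 / radius_m


# Hypothetical ball on a one-meter string.
slow = centripetal_force(mass_kg=0.5, speed_m_s=4.0, radius_m=1.0)
fast = centripetal_force(mass_kg=0.5, speed_m_s=8.0, radius_m=1.0)
heavy = centripetal_force(mass_kg=1.0, speed_m_s=4.0, radius_m=1.0)

print(slow, fast, heavy)
```

Doubling the mass doubles the required pull, while doubling the speed quadruples it.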
Matter in accretion disks is often spinning too fast to fall into the black hole. The gravitational pull of the black hole isn’t strong enough to counteract the centrifugal force of the matter–partly because the black hole is spinning too and drags the matter with it, partly because the matter was spinning to begin with. (On the cosmic scale, most things in the universe are spinning.)
Over time, the black hole does win. The matter does lose outward momentum and fall into the black hole. (Like energy, momentum can’t be created or destroyed, but it can be transferred. Most of it is vented through the “jet” light-creating process that I’ll briefly explain later.) However, as stuff falls into the black hole, the gravitational pull of the black hole accelerates it up to incredible speeds, which in turn heats it up to incredible temperatures. And hot matter glows.
(Temperature actually contributes to the glow in another, less direct way. The in-falling matter is often so hot that it ionizes, its electrons separating from their nuclei. These charged particles follow the spin of the disk they’re in, which causes them to accelerate. Since accelerating charges emit light–which, incidentally, is how radios work–the disk glows even brighter.)
The glow has another surprising effect, though. We often imagine accretion disks to be very thin, flattened out by the spinning of the disk and the black hole, the same way that a pizza chef flattens out dough by spinning it. But they’re actually a bit thick. The secret is light.
A Quantum of “Push”
In the time of Sir Isaac Newton, there were two competing ways of understanding light. Newton believed that light was made out of tiny particles called “corpuscles” that carried kinetic energy and momentum and bounced off of things like any normal particle. In contrast, Christiaan Huygens believed that light was like sound: a wave that propagated through a clear medium, like air or glass.
Of course, we know now that Newton and Huygens were both right (to a degree). Quantum mechanics has shown us that light is both a particle and a wave. It bends and refracts like a wave, but it carries energy and momentum like a particle. This means it can bounce off of things and exert force. (Although light is a wave, it doesn’t need a medium like sound does. It can propagate in empty space.)
Imagine that a beam of light bounces off of a mirror, as shown below. One way to describe this is by using the equations of optics and electromagnetism. However, another way is to imagine a bunch of physical particles–which we now call photons–hitting the mirror and bouncing off of it. But Newton tells us that “for every action there is an equal and opposite reaction.” When the mirror pushes the photons, the photons must push back.
This effect is called radiation pressure. We don’t usually notice it because each individual photon doesn’t carry much energy compared to a human being. We need a lot of them to exert an appreciable force. However, we can harness radiation pressure to do some pretty cool things. The solar sail proposal for space travel is based on this idea.
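To get a feel for how small the push is, here is a rough estimate. A photon with energy E carries momentum E / c, so light of intensity I bouncing straight back off a perfect mirror exerts a pressure of 2I / c. The sketch assumes a perfectly reflecting sail facing the Sun head-on at Earth's distance, where sunlight delivers about 1361 watts per square meter. The sail area is a made-up number.

```python
# Sketch: radiation pressure on a perfectly reflecting surface.
# Assumptions: perfect mirror, light arriving head-on, sunlight at Earth's orbit.

SPEED_OF_LIGHT = 299_792_458.0   # m/s
SOLAR_CONSTANT = 1361.0          # W/m^2, approximate sunlight intensity near Earth


def radiation_pressure(intensity_w_m2):
    """Pressure in pascals on a perfect mirror: twice the momentum flux I / c."""
    return 2.0 * intensity_w_m2 / SPEED_OF_LIGHT


def sail_force(area_m2, intensity_w_m2=SOLAR_CONSTANT):
    """Force in newtons on a flat, perfectly reflecting sail."""
    return radiation_pressure(intensity_w_m2) * area_m2


print(radiation_pressure(SOLAR_CONSTANT))  # pascals
print(sail_force(1000.0))                  # newtons on a hypothetical 1000 m^2 sail
```

Even a sail the size of a few tennis courts feels a force of only about a hundredth of a newton. The trick is that the push never stops.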
(Experts know that the conception of light as a wave also predicts that it carries energy and momentum. However, we need to treat light as an electromagnetic wave, governed by Maxwell’s equations. Particle-wave duality lets me explain radiation pressure a lot more easily.)
Why Accretion Disks are Thick
So what does radiation pressure have to do with accretion disks? As we now know, the matter in the accretion disk is producing quite a lot of light. When this light scatters, it exerts an outward force on the in-falling stuff, partly counteracting the pull of gravity and the flattening effect of the spin. If enough photons hit the in-falling gas, something amazing happens: the matter stops falling. The constant radiation pressure from within the disk completely counteracts the force of gravity.
The point when the glow of the accreting matter is bright enough to stop it from falling into the black hole is called the Eddington limit, after Sir Arthur Stanley Eddington. With rare exception, we never see accretion disks glowing brighter than this; if there’s enough glow to cause that, it means more matter is flying outwards than inwards, so the disk dissipates and the glow subsides. (The Eddington limit is usually lower than the brightness required to completely counteract gravity. The radiation pressure has some help from the centrifugal force, as discussed above.)
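The textbook version of this balance can be written down in a few lines. It assumes the in-falling gas is ionized hydrogen, that everything is spherically symmetric, and that light pushes on the gas by scattering off electrons. Under those assumptions, the brightness at which radiation pressure exactly cancels gravity is L = 4 pi G M m_p c / sigma_T. Spin and the disk geometry are ignored here, so this is a sketch of the idealized limit, not of a real disk. The four-million-solar-mass figure for Sagittarius A* is an approximate, commonly quoted value.

```python
import math

# Sketch: idealized Eddington luminosity for a mass M.
# Assumptions: spherical symmetry, ionized hydrogen, electron (Thomson) scattering.

G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light, m/s
PROTON_MASS = 1.6726e-27 # kg
SIGMA_THOMSON = 6.6524e-29  # Thomson scattering cross-section, m^2
SOLAR_MASS = 1.989e30    # kg


def eddington_luminosity(mass_kg):
    """Luminosity in watts at which radiation pressure balances gravity."""
    return 4.0 * math.pi * G * mass_kg * PROTON_MASS * C / SIGMA_THOMSON


print(eddington_luminosity(SOLAR_MASS))        # one solar mass
print(eddington_luminosity(4e6 * SOLAR_MASS))  # roughly Sagittarius A*
```

The limit grows in direct proportion to the mass, which is why supermassive black holes can power such spectacularly bright glows.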
This is also why accretion disks are thick. The force of gravity and the incredible spin of the black hole should flatten the disk out like a pizza crust, and to a good extent, it does. However, the light from the glow of the disk pushes the matter outward and puffs it up a little bit, so that it looks more like a slightly squished donut. (Accretion disks seem to fall into several categories of shape: some thicker, some thinner. The factors involved are an ongoing area of research, but radiation pressure is often important.)
In the case of rotating black holes, there’s another source of light, the so-called “jets.” The plasma physics of the disk accelerates the in-falling matter to enormous velocities, ultimately launching it into space around the poles of the black hole and along the axis of rotation. These incredibly powerful jets of matter, which glow for the same basic reasons of centrifugal force as accretion disks, are another reason black holes are easy to spot. They also allow matter in the accretion disk to bleed off its outward momentum enough to fall into the black hole.
What I’ve given you is a very simplistic introduction to a very rich and difficult topic. Accretion physics is still an active area of research. To truly understand what’s going on, we need to simulate what happens to the stuff in the accretion disk, taking fluid dynamics, electromagnetism, and general relativity into account. I’ve tried to find some non-technical resources.
Questions? Comments? Insults?
I am by no means an expert on accretion physics, so I could have gotten something wrong here. If I have, please bring it to my attention! And if you have any questions, please bring those to my attention, too–I’ll do my best to answer them!
But let there be spaces in your togetherness
and let the winds of the heavens dance between you.
Love one another but make not a bond of love:
let it rather be a moving sea
between the shores of your souls.
Two weeks ago now, I flew to Conway, Arkansas to attend the wedding of my very good friends Vincent and Mary. This and an academic conference got in the way of blogging for a little while, but I’m back. As such, I decided to write a post in their honor about bonding. Not human bonding, mind you, but chemical bonding. Specifically, covalent bonding! You probably know that atoms missing electrons like to form covalent bonds with each other where they share their electrons. But why does this happen? The secret lies in the quantum mechanical nature of electrons.
This post will rely heavily on the articles I’ve previously written about quantum mechanics. You might want to check out my previous posts. These ones will be particularly helpful:
- I originally did a three part series on quantum mechanics where I first explained particle wave duality, then I explained what a matter wave is using the Bohr model of the hydrogen atom. Finally, I offered an interpretation of matter waves as probability waves.
- I’ve also discussed the Pauli exclusion principle, which forbids two fermions from existing in the same quantum state.
- And, perhaps most importantly, I’ve talked about quantum tunneling, which describes how quantum particles can do things classical particles cannot. Quantum tunneling is very similar to what I’m going to describe here.
If you don’t want to go through all of these, I will try and link to the relevant ones as they come up in discussion.
In physics, we have this thing we call energy. We usually break energy into two categories: kinetic energy and potential energy. (There are more “types,” but in the end, they can be reduced to these two types.) Kinetic energy is a little easier to understand, so let’s talk about that first.
Roughly, kinetic energy measures how much something moves, and how difficult it is to make that thing move or to stop it once it’s moving. In classical mechanics, the kinetic energy is given by one half the square of the velocity times the mass, $E_k = \frac{1}{2} m v^2$.
(Astute readers may remember that I described momentum in a similar way. I said it measured how much something is moving and how difficult it is to change an object’s motion. Kinetic energy and momentum are very much related. The biggest difference is that momentum is a vector. It has a size and a direction… and it measures the direction of motion as well as the resistance to change. On the other hand, energy is a scalar. It’s just a number, which comes from squaring the velocity. Also, although energy can be transferred between objects and transformed between potential energy and kinetic energy, momentum only describes motion. Finally, as a rough intuition, momentum measures how difficult it is to make a small change in an object’s motion, while kinetic energy measures how many small changes are required to change the motion in a big way.)
Potential energy measures the ability to generate kinetic energy. If I’m very high up on a cliff and I jump off, I can accelerate very fast and acquire a lot of kinetic energy (which I stole from the Earth’s gravitational field). This means that my potential energy (at least from gravity) is proportional to my height. Other sources of potential energy include electric and magnetic fields, springs, and even massive particles themselves.
The total energy of a system or a particle is the sum of the kinetic energy and the potential energy. And the total amount of all energy in the (classical) universe is conserved; it can’t be created or destroyed, simply passed around between particles, objects, and people.
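Here is a quick sketch of that trade between potential and kinetic energy for the cliff jump. It assumes no air resistance and a cliff small enough that gravity is constant, so the potential energy is m g h and the kinetic energy is one half m v^2. The jumper's mass and the cliff height are made-up numbers.

```python
import math

# Sketch: energy conservation for a fall from rest.
# Assumptions: no air resistance, constant gravity near Earth's surface.

G_EARTH = 9.81  # gravitational acceleration, m/s^2


def potential_energy(mass_kg, height_m):
    """Gravitational potential energy in joules, measured from the ground."""
    return mass_kg * G_EARTH * height_m


def kinetic_energy(mass_kg, speed_m_s):
    """Classical kinetic energy in joules."""
    return 0.5 * mass_kg * speed_m_s**2


def impact_speed(height_m):
    """Speed at the ground after falling from rest, from m*g*h = m*v^2 / 2."""
    return math.sqrt(2.0 * G_EARTH * height_m)


mass, height = 70.0, 20.0  # hypothetical jumper and cliff
print(potential_energy(mass, height))              # energy available at the top
print(kinetic_energy(mass, impact_speed(height)))  # energy of motion at the bottom
```

The two numbers agree: all the potential energy at the top has turned into kinetic energy at the bottom, and the total never changed.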
(Experts know I’m glossing over a lot here. In truth, the distinction between kinetic and potential energies is pretty artificial. The physicist and mathematician Emmy Noether defined energy as whatever quantity a physical system possesses that doesn’t change in time. In other words, it is the conserved quantity that comes from the time-translational symmetry of the system. We’ve simply given names to the contributions to the energy like kinetic and potential energy. And indeed, energy may not be conserved for the universe as a whole. In Einstein’s general relativity, energy is not necessarily conserved.)
Of Energy and Wiggles
I’ve described what energy means in a “classical” system, where quantum effects are negligible. But in quantum mechanics, things get a bit weird (as they often do). If we understand kinetic energy, it’s simple enough to define potential energy as the ability to create kinetic energy. But what does kinetic energy mean in the quantum case? We defined it as the motion of a particle, but quantum particles don’t travel in the same way… they’re waves that exist everywhere at once. What does motion mean in this context?
We can take a hint from the wave nature of quantum particles. In quantum mechanics, the height of the wave at a given position tells us how likely it is we’ll observe a particle there. But all waves wiggle. The height rises and falls. And it just so happens that the number of times the wave wiggles over a given distance determines the energy! I’ve plotted three quantum probability functions below, each with a different energy. In this case, the quantum particle—say an electron—is confined to a box of length 1 by an infinitely strong electric field. This means that the probability of measuring the particle outside of the box is zero, and the height of the wavefunction must reflect that.
Now the thing is, the energy depends on the wiggles per unit volume. So if we made the box longer, all three particles would lose energy because they’d all still have the same number of wiggles, but there would be fewer wiggles per cubic meter. Let’s make a note of that. It’ll be important later.
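For the particle in a box, this can be made quantitative with the standard textbook result E_n = n^2 h^2 / (8 m L^2), where n counts the wiggles, m is the particle's mass, and L is the length of the box. The sketch below evaluates it for an electron. The one-nanometer box is my own illustrative choice.

```python
# Sketch: energy levels of a particle in a one-dimensional box,
# E_n = n^2 * h^2 / (8 * m * L^2) (standard infinite-well result).

PLANCK = 6.62607015e-34         # Planck's constant, J*s
ELECTRON_MASS = 9.1093837e-31   # kg
JOULES_PER_EV = 1.602176634e-19


def box_energy_ev(n, length_m, mass_kg=ELECTRON_MASS):
    """Energy in eV of the state with n half-wavelength wiggles in the box."""
    energy_joules = n**2 * PLANCK**2 / (8.0 * mass_kg * length_m**2)
    return energy_joules / JOULES_PER_EV


box = 1e-9  # hypothetical box, one nanometer long
for n in (1, 2, 3):
    print(n, box_energy_ev(n, box), box_energy_ev(n, 2 * box))
```

More wiggles in the same box means more energy, and doubling the length of the box cuts every energy to a quarter of its old value.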
The Electron and The Nucleus
In the past, I’ve given you Niels Bohr’s description of the atom, which demonstrates that electrons in atoms can only have specific kinetic energies. This is because only an integer number of wiggles fit around a circle if you want the wave of a particle to agree with itself after you go 360 degrees around the circle. The same holds true for most quantum systems.
For simplicity, let’s look at the hydrogen atom in a different light, though. Let’s imagine that a hydrogen nucleus (i.e., a proton) exerts an attractive force on the electron and ignore that the electron can orbit around the nucleus. In other words, let’s imagine one-dimensional atoms.
Because the proton is so much heavier than the electron, we can basically think of it as stationary. The attractive force between the electron and the proton gives the electron some potential energy depending on its position in space. To make the whole problem easier to visualize, let’s make an analogy with gravity. On Earth, when we’re high up, we have a lot of potential energy and when we’re down low we have very little. We can describe the potential energy of the electron by plotting as if it were a height above the ground (or a depth below the ground). In the case of the hydrogen atom, that potential energy looks something like this:
In our little approximation, the electron is more strongly attracted to the proton the closer it gets to the proton. If it overlapped with the proton, it’d be infinitely attracted to it. However, particles in quantum mechanics have a minimum energy they’re allowed to have. In this case, that minimum is the Bohr energy. If we plot the probability distribution for an electron in the lowest energy state of the atom, it looks something like this:
If the electron were classical, it couldn’t go farther away from the center of the atom than the classical turning points, which I’ve marked with big black dots. This is because the electron doesn’t have enough energy to “climb” out of the potential energy well and leave. But in quantum mechanics, the electron can exist in places it’s classically forbidden to be. This is very similar to quantum tunneling. This uniquely quantum behavior is critical to explaining how atomic bonds work.
One Electron, Two Nuclei
Imagine we take our hydrogen atom and move it next to a proton. Now there are two potential wells like the one above. If the wells are far enough apart, the electron only sees its own proton, its own hydrogen atom, as shown below.
However, if we move the new proton close enough to the hydrogen atom, the potential energy profile changes. It starts to look something like the figure below.
And now the true quantum nature of the electron comes into play. Remember when I said that quantum particles can exist where they classically should not? Well, when we bring the two “potential wells” together, the electron in the left well has some probability of existing between the two wells. And, indeed, even if it started in the left well, the electron will ooze into the right well so that it spends about half its time in each well. Then the picture of the wavefunction looks something like this:
But now something funny has happened. The electron used to have one wiggle in some amount of space. Now it has more wiggles, but it also has a lot more space. The result? The electron has lost a lot of energy!
This is why atoms bond. The electrons in an atom want to be in the lowest energy state they can, and adding another atom lets the electrons lose some energy. There’s an optimal distance between the two nuclei which gives the electron a minimal energy. And this is what controls the length of atomic bonds.
Usually each atom in a covalent bond has an electron, not just one of the atoms. Fortunately, so long as there are only two electrons in the shared orbit (the one where the bonding happens), the electrons don’t see each other at all. Each electron chooses a “spin,” which has to do with the magnetic field an electron produces. (Spin will be the subject of a later article, I promise.) There are two possible spins and, so long as each electron has a different spin, they don’t see each other. However, if more electrons appear, we get a problem because there are only two possible spins and the third electron must choose a spin that’s already been taken. Then the electrons repel and the bond breaks.
See for Yourself?
The physics education group at the University of Colorado at Boulder has developed a simulation of a quantum particle bound by two potential wells. Click on the image below to see it in action! For the atomic bonding case, change the toggle on the right from “square” to “1D coulomb.”
Questions? Comments? Insults?
This post is a bit technical so if you have any questions please do ask! And if you’re a physical chemist and you know better than me, pipe up!
A “quantum gravity expert” is presumably
someone well acquainted with the details
of our immense ignorance of the subject.
I suppose I count.
I long ago promised that I would discuss some of my own research. Here’s the first post that makes good on that promise. Today I’ll discuss a theory of quantum gravity.
Why Quantum Gravity?
Without a doubt, the two greatest advances in physics in the last 120 years were the advent of general relativity and quantum mechanics. These two amazing theories have totally changed the way we see the world. Quantum mechanics describes the physics of the very small, while general relativity describes the physics of the very massive.
Usually we don’t encounter very small, very massive things. However, they do exist, and we’d like to understand them. Black holes are the quintessential quantum gravitational mystery. A black hole is so incredibly massive that it pulls all matter within it down to a tiny, perhaps infinitesimally small, point. And this point is small enough that we need quantum mechanics to understand it.
We also need quantum gravity to describe the universe on large scales. Modern cosmology tells us that the universe is expanding, even accelerating. If we extrapolate back to right before the universe began with a bang, it was infinitesimally small. A point. And it was—in a sense—the most massive it is possible for anything to be. We need quantum mechanics to understand this. Indeed, the most successful story we have about the early universe, inflation, relies heavily on quantum mechanics.
The accelerating universe is also a mystery, and scientists hope that quantum gravity will be able to explain it.
Unfortunately, quantum mechanics and general relativity don’t agree. Not at all. Quantum mechanics assumes that quantum particles, described by their wavefunctions, evolve in a static, eternal universe. However, in general relativity, the background itself is a living thing. Space and time reshape themselves according to the stuff contained in the universe. The quantum particles are affected by the changes in shape of the universe and affect the universe in turn, forming a feedback loop. This makes combining quantum mechanics and general relativity extremely hard.
Later, people came up with ideas like string theory, loop quantum gravity, and causal sets, all of which attempt to solve the problem of quantum gravity. But so far, although each theory has its success stories, no one theory has proven itself to be correct… or even predicted anything we can test. The best we can say is that most of them can show they look like general relativity if you take out quantum mechanics.
Needless to say, this problem is hard.
One of the things I work on is a candidate theory of quantum gravity called Causal Dynamical Triangulations, or CDT for short. Here’s how it works.
Adding Up All Universes
I already described one way to handle quantum mechanics, called the Feynman path integral. Classically, a particle takes the path between two points that minimizes (technically extremizes) the energy cost for the particle. In quantum mechanics, the particle is a wave and it takes all possible paths between the two points. Then the probability of the particle traveling from the first point to the second point is given by the sum of a function of the energy costs of all possible paths.
We can take this idea and apply it to quantum gravity. Roughly, a classical universe starts with some three-dimensional shape and ends with some three-dimensional shape. It will evolve from the initial shape to the final shape in a way that minimizes (extremizes) the energy cost of that transition. Since the universe is a single shape of spacetime, we think of this sort of like a soap bubble connecting two wire rings. The wire rings force a shape at the beginning and end of the bubble, but the middle of the bubble can be whatever it wants.
So what’s the quantum analog? In quantum gravity, the probability of the universe evolving from some initial shape to some final shape is given by the sum of some function of the energies of all possible spacetimes that connect the initial and final shapes. This is called a sum over histories, since we’re summing over all possible histories of the universe.
Unfortunately, this sum over histories is incredibly hard to compute, or even define. Given an initial shape of the universe and a final shape of the universe, there are uncountably many spacetimes that connect the two. How do we sum over all those histories? How do we even find all of those histories? We need some clever tricks to do it.
Adding Up Some Universes
Right now, we don’t know how to find all the spacetimes that should contribute to the sum over histories. As a next best thing, we want to find the spacetimes that contribute the most to the sum. Imagine you take the number 1. You add it to . Then you add that to , and then , ad infinitum. Your sum looks something like this:
But pretty quickly the number stops changing when you add more terms to your sum.
If each successive term in the sum shrinks quickly enough, the sum itself stops growing very quickly at all. If we added new terms ad infinitum, we’d get
But with only five terms we’re almost there! Although there are infinity more terms to add before we get to the final answer, five terms gives us a darn good approximate answer.
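Here is a small sketch of the idea in Python. The particular numbers in the sum are my assumption for illustration: I take each term to be half the previous one (1, 1/2, 1/4, and so on), which adds up to 2 in the limit. The name `partial_sums` is just a label I made up.

```python
def partial_sums(n_terms):
    """Return the running totals of 1 + 1/2 + 1/4 + ... after each term."""
    total = 0.0
    sums = []
    for k in range(n_terms):
        total += 1.0 / 2**k   # assumed terms: each one is half the previous one
        sums.append(total)
    return sums


if __name__ == "__main__":
    for n, s in enumerate(partial_sums(10), start=1):
        print(f"{n:2d} terms: {s:.6f}   (still missing: {2 - s:.6f})")
```

After five terms the total is 1.9375, already within about 3% of the limiting value of 2, and every extra term halves what is still missing.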
We’d like to do the same with quantum spacetime. We can approximate the sum over all histories by taking the sum only over the histories that add the most to the Feynman path integral. So for now, our goal is to find those histories.
Right now, it is not at all obvious which histories contribute the most to the sum. To find them, we take advantage of a correspondence between quantum mechanics and statistical mechanics, called the Wick rotation. I’ve discussed before how in relativity, distances in the time direction square to negative numbers. This means we can think of the time direction as imaginary. (See my previous post on imaginary numbers for more info on what that means.)
But what happens if we make time real again? If the spacetime is sufficiently well-behaved (and it doesn’t have to be!), we can rotate the time axis through the complex plane to make it real. This transforms our spacetime into something we’re more used to, where all the distances square to positive numbers. Later, once we’ve evaluated our sum over histories, we can undo the rotation to find the right answer. This is called a Wick rotation.
Why do we Wick rotate? By changing the time axis to a real axis, our quantum system becomes a classical exercise in probabilities. If we had a humongous bag of Wick-rotated spacetimes (also called Euclidean spacetimes), and we stuck our hand in the bag and pulled a universe out, we could use statistics to figure out how likely it is that we’d pull out a given universe. And better yet, the most likely universes are the ones that contribute the most to the sum over histories if we Wick-rotate back.
So all we need to do is make a bag of Wick-rotated universes and pull universes out of the bag at random. In other words, we need to randomly generate Euclidean universes. And we can do that on a computer.
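For readers who like symbols, here is one common textbook way of writing the Wick rotation. Treat it as a sketch: sign conventions differ between authors, and the symbols are my labels rather than anything defined above. I use $\tau$ for the rotated (real) time, $S$ for the "energy cost" of a history (usually called the action), and $S_E$ for that same cost after the rotation.

```latex
% Assumed conventions: one time and one space direction, units with c = 1
t = -i\tau
\quad\Longrightarrow\quad
-dt^2 + dx^2 \;=\; d\tau^2 + dx^2,
\qquad
e^{iS/\hbar} \;\longrightarrow\; e^{-S_E/\hbar}
```

The first relation shows the distances becoming positive. The second shows why statistics takes over: the oscillating quantum weight turns into a real, positive weight, so histories with a small $S_E$ are simply the most probable ones.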
The Universe On Your Laptop
Spacetime as we know it is continuous. If you take a cube of empty space, and you zoom in on it with your microscope, you could zoom in forever. No matter how close you look, no matter how small the things you’re looking at are, you can always zoom in further and look at smaller stuff. In other words, things can be infinitely small. (We don’t know whether this is actually true. People have proposed a quantum of distance called the Planck length, which might be as small as things get. But we usually treat things as continuous.)
A computer has finite precision, though. It can’t encode things that are infinitely small. Instead we have to stop somewhere. And this means that we have to transform the smooth continuous shape of the universe into something made up of points and lines of fixed, nonzero size.
(To make things easier to understand and visualize, I’m going to drop from four dimensions to three. Now we have two spatial dimensions and one time dimension. This isn’t as crazy as it sounds. A lot of my research has been on three-dimensional quantum gravity. The reason is that the physics is easier, but we can still learn something about the four-dimensional case. Here’s a whole article in Scientific American about why quantum gravity in flatland is a good idea.)
In Causal Dynamical Triangulations, we make a specific choice about how to encode information on the computer. This choice is motivated by making the spacetime “nice” enough to Wick-rotate. For that to be possible, we enforce that there is a well-defined time direction. This sounds obvious, but it’s not always true in general relativity. You can have arbitrarily crazy spacetimes where time loops on itself, or where the time direction depends on where you are in the spacetime. Indeed, in string theory, the basic spacetime, a Calabi-Yau manifold, absolutely does not have a well-defined notion of time.
We also enforce that there are no wormholes or baby universes, which also add ambiguity to the notion of “time.”
We construct our computerized universe out of equilateral tetrahedra, each of which is a tiny piece of Minkowski space. Each tetrahedron spans two discrete times. The orientation of the tetrahedron determines the effect it has on spacetime. The three possible orientations are shown below. They’re labeled by the number of vertices they have on each time slice. So a (3,1)-tetrahedron has three vertices on the lower time slice and one vertex on the upper time slice. And so on.
We put the tetrahedra together so that face meets face and edge meets edge—there can’t be any gaps. When we put it all together, the spacetime looks something like the image below. The image doesn’t quite capture what’s going on because the tetrahedra are all the same size and you can’t really pack them together in a flat spacetime. So when the edges look like they’re different sizes, they’re not. This is actually curvature of the spacetime the tetrahedra are supposed to make up.
To figure out which spacetimes are most probable, we need to be able to measure how curved they are. This is an integral piece of Einstein’s theory of general relativity. So let’s step back and think about how we can measure curvature. I’ve discussed before how it’s possible to measure curvature by looking at angles. Basically, look at a triangle and measure the failure of the interior angles of the triangle to add up to 180 degrees. We can use a similar idea here. However, we look at the interior angles of all tetrahedra that meet at an edge and measure the failure of the sum of those angles to add up to 360 degrees.
Let’s look at an example in two dimensions. The surface of a single tetrahedron approximates a sphere. Now three triangles of the tetrahedron meet at a single vertex, as shown below. In flat space, if we rotated around that single vertex, we’d travel 360 degrees. However, the interior angle of each equilateral triangle at that vertex is only 60 degrees, so the three angles add up to just 180 degrees. The shortfall from 360 degrees tells us the curvature at that point. This way of measuring curvature is called Regge calculus.
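To make the angle counting concrete, here is a minimal Python sketch. It assumes ordinary (Euclidean) equilateral triangles and regular tetrahedra, which is a simplification of what a real CDT code has to do, and the function names are made up for illustration.

```python
import math


def deficit_angle_2d(n_triangles):
    """Degrees missing from 360 at a vertex shared by n equilateral triangles."""
    return 360.0 - 60.0 * n_triangles


def deficit_angle_3d(n_tetrahedra):
    """Degrees missing from 360 around an edge shared by n regular tetrahedra."""
    dihedral = math.degrees(math.acos(1.0 / 3.0))  # about 70.53 degrees
    return 360.0 - dihedral * n_tetrahedra


if __name__ == "__main__":
    for n in (3, 6, 7):
        print(f"{n} triangles at a vertex: deficit {deficit_angle_2d(n):+.1f} degrees")
    for n in (5, 6):
        print(f"{n} tetrahedra at an edge: deficit {deficit_angle_3d(n):+.2f} degrees")
```

A positive deficit means sphere-like curvature, zero means flat, and a negative deficit means saddle-like curvature. Six equilateral triangles tile the flat plane exactly, but no whole number of regular tetrahedra closes up around an edge, which is why they can’t be packed into flat space.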
A Universe Factory
Now we know how to put a universe on our computer. But we still haven’t got a likely universe. The way we get one of those is to take a universe, any universe at all, put it on our computer, and then make random changes to it. Each time we make a change, we measure whether the new, modified universe is more or less likely than the previous universe. If the new one is more likely, we keep the change. Otherwise, we reject it a fraction of the time and keep it a fraction of the time, based on how much less likely it is than the previous universe. This generates a single probable spacetime.
A typical simulation of a single Wick-rotated quantum universe is shown below. The long direction is the time axis and the other two directions show the size of the universe at a given discrete time. The movie is showing the universe evolve from an arbitrary initial configuration to a likely final configuration. About halfway through the simulation, we get to a probable configuration. After that, the changes are just quantum fluctuations around the mean.
This type of simulation is called a Monte Carlo simulation. The set of decisions the program uses to make the simulation go is called the Metropolis-Hastings algorithm.
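The accept-or-reject rule is easier to see on a toy problem than on a whole spacetime. In the sketch below, the "universe" is just a single whole number (think of it as a size), and I have invented a weight that makes sizes near 20 the most likely. None of this is the actual CDT code. Because my random changes are symmetric (a step up is proposed as often as a step down), this is the simplest version of the algorithm, usually just called Metropolis.

```python
import math
import random


def metropolis(log_weight, propose, state, n_steps, rng):
    """Build a chain of states whose long-run frequencies follow exp(log_weight)."""
    samples = []
    current_lw = log_weight(state)
    for _ in range(n_steps):
        candidate = propose(state, rng)
        candidate_lw = log_weight(candidate)
        # Always keep a more likely state. Keep a less likely one only a
        # fraction of the time, equal to the ratio of the two weights.
        if candidate_lw >= current_lw or rng.random() < math.exp(candidate_lw - current_lw):
            state, current_lw = candidate, candidate_lw
        samples.append(state)
    return samples


def toy_log_weight(n):
    """Hypothetical weight: sizes near 20 are likely, negative sizes are forbidden."""
    return -((n - 20) ** 2) / 8.0 if n >= 0 else -math.inf


def toy_propose(n, rng):
    """Random change: grow or shrink by one, with equal chance."""
    return n + rng.choice([-1, 1])


def run_toy(n_steps=20000, seed=0):
    rng = random.Random(seed)
    chain = metropolis(toy_log_weight, toy_propose, 100, n_steps, rng)
    tail = chain[n_steps // 2:]   # throw away the early, unlikely configurations
    return sum(tail) / len(tail)


if __name__ == "__main__":
    print("average size after settling down:", run_toy())
```

The chain starts at a very unlikely size of 100, drifts toward 20, and then just fluctuates around it, which is the same qualitative behavior as the movie described above.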
The Average Universe
Unfortunately, it’s not enough to generate a single likely universe. To perform a sum over histories, we have to average over lots of them. If we do this, we can generate the average Wick-rotated, quantum universe. (We don’t know how to Wick-rotate back, so for now we have to do everything with real time.)
The expected quantum universe in the ground (or lowest energy) state is shown below. (Lowest energy means that the universe is empty and that it evolves from a big bang to a big crunch: nothing to nothing.) I’ve plotted spatial area of the universe as a function of discrete Euclidean time. Don’t worry about the details. I just wanted to show you that the plot is smooth after you average it out, even though each individual universe is pretty bumpy. The error bars show the quantum fluctuations. If we Wick-rotated the universe we live in now, it would look a lot like this plot… which tells us that causal dynamical triangulations reduces to general relativity when we take away quantum mechanics.
So on large scales the universe of causal dynamical triangulations looks like Einstein’s universe. What about on small scales? Something very weird happens if you zoom in close enough… the universe begins to look like a spider web. In the past I’ve talked about the idea of fractional dimension and a way to measure it, called spectral dimension. We can measure the dimension of the universe of causal dynamical triangulations, and we see that it’s not what we expect.
The scale dependence of the dimension is plotted below. On large scales, the dimension is four, like we expect. But as we move to small scales, the dimension drops dramatically… all the way down to 2.8! We don’t really know what’s going on here, but it’s a hint of truly quantum behavior. We expect there to be a “quantum foam” and this might be what we’re seeing.
The State of the Art
Now you know the basics of Causal Dynamical Triangulations. Understanding the theory is an ongoing effort by fewer than fifty people around the world. So far, we can only simulate empty universes. Some people are working on putting matter into the model. I’m working on studying the probabilities of the universe evolving between different initial and final shapes. Others are working on testing how strict we have to be with the Wick rotation. It’s an ongoing story, so I hope you’ll keep your eyes peeled!
There isn’t much on causal dynamical triangulations. So here’s some further reading on that and on quantum gravity in general.
- The inventors of causal dynamical triangulations wrote an article for Scientific American. The article is here, but it’s behind a paywall. You can find it for free here.
- For a more technical introduction to causal dynamical triangulations, I recommend CDT founder Renate Loll’s article, “The Emergence of Spacetime, or Quantum Gravity on Your Desktop.”
- For a perspective on loop quantum gravity, check out the community’s website.
- The string theorist Sean Carroll often talks about quantum gravity and physics in general in a very accessible manner.
- Lee Smolin is a popular-science author on quantum gravity and physics in general. You might want to look at his website.
- I could hardly leave you without pointing you to my mentor in all things quantum gravity, Steve Carlip.
- And here’s a whole blog on quantum gravity!
Play With it Yourself
If you’re especially excited about quantum gravity, and especially brave, you might want to try and run a simulation yourself. We are planning on open sourcing the code in the near future. So, for reference, here’s a link to a (currently locked) github repository.
Here’s the code: https://github.com/ucdavis/CDT
And the documentation I’ve written is here:
If you use the code, please cite it as ours. The original author is Rajesh Kommu. However other authors include myself, Steve Carlip, Joshua Cooperman, Christian Anderson, David Kamensky, Kyle Lee, and Adam Getchell.
Alas, the majority of the code is in Lisp. You might like that, but most likely, you resent it. Sorry about that.
Questions? Comments? Insults?
I’m afraid that this post might have been less clear than previous posts. It’s certainly longer! So if you have any questions, please let me know so I can clear up the confusion!
Imaginary numbers
are a fine and wonderful
refuge of the divine spirit
almost an amphibian between
being and non-being
~Gottfried Wilhelm Leibniz
One of the first things we learn how to do is multiply numbers. . That sort of thing. But what if we multiply a number by itself? This is the familiar operation, which we call squaring a number. and . That sort of thing. You can take a number to a power by multiplying it by itself some number of times equal to the power. So if you square a number, you’ve taken it to the second power. You can take it to the third power too, and this is called cubing. . And so on.
Now we can ask another question. Given any number, call it $x$, is there some number (say “$y$”) such that $x$ is the square of $y$? In other words, can we find $y$ such that $y^2 = x$?
In the case of positive real numbers, the answer is always yes, although you may not be able to write down the number as a fraction. (These numbers are called irrational, and that is a story for another time.) We call $y$ the square root of $x$, and we denote it $\sqrt{x}$.
You’ll notice that I said positive real numbers. Why do I specify this? (For that matter, why do I specify the word real? You’ll see.) Well, and this goes for every negative number. The square is always positive. The same goes for positive numbers. It’s impossible for a real number to square to a negative number. In other words, negative numbers do not have square roots.
Nevertheless, mathematicians and physicists take the square roots of negative numbers all the time. We call them imaginary numbers. Where do these come from? Why do we use them? Read on to find out!
The story of imaginary numbers begins with the story of the roots of polynomials. Roughly, a polynomial is a function that takes some input, and constructs the output as some combination of powers of the input. For example, a quadratic polynomial takes the form
$y = ax^2 + bx + c$
where $x$ is the input, $y$ is the output, and $a$, $b$, and $c$ are real numbers. (Interesting side note: with the right choice of coefficients, a quadratic polynomial describes the trajectory of an object shot out of a cannon.) Specifically, we might have that
If you plot the output as a function of input, with the output on the vertical axis and the input on the horizontal axis, it looks something like this.
You’ll notice that when the input is or , the output is zero: the line crosses the horizontal axis. These are called the roots of the polynomial, and they’re important because finding the roots of a polynomial allows us to solve equations involving polynomials. For example, our graph tells us that the equation
is solved by and . We know this because we can rearrange the equation to become
In other words, we’re looking for the values of , our input, such that the output is zero.
Notice that not all polynomials have roots. For example, the polynomial
does not have any roots. This is because, if you make a plot like above, the curve never crosses the zero line:
But it was not in the study of quadratic polynomials that imaginary numbers were discovered. Rather, it was in the study of cubic polynomials. A cubic polynomial is a polynomial of the form
$y = ax^3 + bx^2 + cx + d$
where $a$, $b$, $c$, and $d$ are real numbers as before. For example, if we choose , , , and , we get the cubic polynomial
If we plot this polynomial as before, we get something that looks like this:
Notice how this cubic polynomial has more roots than the quadratic? This is a general feature of polynomial equations. The maximum number of roots a polynomial can have is equal to the highest power in the unknown of the polynomial. In this case, we have an $x^3$ term, so the highest power is three and we can have three roots.
Of course, we can have _fewer_ roots. It’s very easy to construct a polynomial with only one root. Take our cubic from before, but set . When we plot this, we get:
Can a cubic polynomial have zero roots? Well, technically yes. If we choose , , , and , we end up with
which we know is a quadratic polynomial with zero roots. But what if we force the coefficient of $x^3$ to be nonzero? Well, cubing (multiplying a number by itself three times) preserves sign. and . In general, if $x$ is a real number, $x^3$ has the same sign as $x$. Furthermore, any positive real number can be represented as the cube of some positive real number. Combined, these results mean that every real number can be represented as $x^3$ for some real number $x$. Mathematicians call this property surjectivity. What this means is that so long as the $x^3$ term in our cubic polynomial has a nonzero coefficient, the polynomial will have at least one root.
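You can watch this guarantee in action with a few lines of Python. The sketch below hunts for a real root of a cubic by bisection: a cubic is negative far out on one side and positive far out on the other, so somewhere in between it has to cross zero. The function name and the coefficient names `a`, `b`, `c`, `d` (for the $x^3$, $x^2$, $x$, and constant terms) are my own choices for illustration.

```python
def real_root_of_cubic(a, b, c, d, iterations=200):
    """Find one real root of a*x^3 + b*x^2 + c*x + d by bisection (needs a != 0)."""
    if a == 0:
        raise ValueError("the x^3 coefficient must be nonzero")
    # Divide through by a so the x^3 coefficient is 1; the roots do not move.
    b, c, d = b / a, c / a, d / a

    def f(x):
        return ((x + b) * x + c) * x + d

    # Far enough from zero the x^3 term wins, so f is negative at -bound and
    # positive at +bound. This particular bound is always far enough.
    bound = 1.0 + max(abs(b), abs(c), abs(d))
    lo, hi = -bound, bound
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            hi = mid      # the sign change is to the left of mid
        else:
            lo = mid      # the sign change is at or to the right of mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    print(real_root_of_cubic(1, 0, -15, -4))   # one root of x^3 - 15x - 4
    print(real_root_of_cubic(-2, 0, 0, 16))    # the root of -2x^3 + 16
```

Bisection only finds one of the roots, and which one depends on where the halving happens to land, but it never comes back empty-handed.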
The Impossible Case
In the mid 1500s, mathematician Gerolamo Cardano noticed that every cubic polynomial (with an $x^3$ term) has at least one root. Even better, he found an explicit formula to find at least one root of a polynomial if it had the form
where and are positive real numbers. This is the formula:
where $\sqrt[3]{x}$ means the cube root of $x$. It’s the number which, if you multiply it by itself three times, gives $x$.
The formula is horrible, I know. We now call this formula Cardano’s Formula. Disturbingly, however, Cardano’s formula sometimes yields square roots of negative numbers, which we know don’t exist. For example, if our polynomial is
then Cardano’s formula tells us that a root is
This doesn’t make any sense! But mathematician Rafael Bombelli noticed something. If we just pretend the square roots are okay but don’t evaluate them, things work out. Bombelli found that
Then, he just evaluated Cardano’s formula as if it was okay:
This disturbed a lot of people, of course. Cardano and Bombelli themselves were deeply uncomfortable with it. Nevertheless, Cardano used this trick regularly. In his magnum opus, Artis magnae sive de regulis algebraicis liber unus (Trans: The great Book of Art), Cardano wrote about a simpler example:
It is clear that this case is impossible. Nevertheless, we shall work thus: we divide 10 into two equal parts, making each 5. These we square, making 25. Subtract 40, if you will, from the 25 thus produced, as I showed you on the chapter on operations in the sixth book leaving a remainder of -15, the square root of which added to or subtracted from 5 gives parts of the product which is 40. These will be $5 + \sqrt{-15}$ and $5 - \sqrt{-15}$.
Putting aside the mental tortures involved, multiply $5 + \sqrt{-15}$ by $5 - \sqrt{-15}$, making 25-(-15) which is +15. Hence the product is 40.
(Source: A History of Algebra: From al-Kwarizmi to Emmy Noether, by B.L. van der Waerden. Emphasis mine.)
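If you’d like to skip the mental tortures, Python’s built-in complex numbers will happily do the arithmetic. The sketch below first checks Cardano’s two parts of 10. Then it tries the “pretend it’s okay” trick on the cubic $x^3 = 15x + 4$, which is the example usually credited to Bombelli. I’m assuming that is the kind of cubic meant above, and `cardano_root` is my own illustrative helper for cubics of the form $x^3 = px + q$ with positive $p$, not a transcription of Cardano’s text.

```python
import cmath

# Cardano's problem: split 10 into two parts whose product is 40.
root = cmath.sqrt(-15)        # the "impossible" square root
part_a = 5 + root
part_b = 5 - root


def cardano_root(p, q):
    """One root of x^3 = p*x + q (with p > 0), computed with complex numbers."""
    inner = cmath.sqrt((q / 2) ** 2 - (p / 3) ** 3)   # may be imaginary
    u = (q / 2 + inner) ** (1 / 3)                    # principal complex cube root
    v = (p / 3) / u                                   # partner cube root, chosen so u*v = p/3
    return u + v


if __name__ == "__main__":
    print("sum of the parts:    ", part_a + part_b)
    print("product of the parts:", part_a * part_b)
    print("root of x^3 = 15x + 4:", cardano_root(15, 4))
```

The imaginary pieces show up in the middle of the calculation and then cancel, leaving 10, 40, and a root of 4, up to rounding error.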
The Impossible Possible
(At this point, I’m going to leave the chronological narrative behind and discuss ideas in a way that is conceptually easy for me. So the people I mention first may have been born years after the people I mention last died.)
At first, people just used the square roots of negative numbers to produce real numbers. Rene Descartes (of “I think, therefore I am” fame) coined the term imaginary numbers, and the name stuck. Carl Friedrich Gauss, one of the finest mathematicians in history, noticed something else though. If we treat $\sqrt{-1}$ as a number, every quadratic polynomial has two roots. Remember our rootless quadratic from before? It was
But we can re-write this as
Then the polynomial is zero exactly when or . Indeed, if you allow for imaginary numbers, every cubic polynomial has exactly 3 roots, every polynomial whose highest power is $x^4$ has exactly 4 roots, and so on (as long as repeated roots are counted separately). This is the fundamental theorem of algebra, and it’s an incredibly powerful theoretical tool.
The Complex Plane
What these imaginary numbers mean, however, requires additional tools. Leonhard Euler, another of the giants of mathematics, took the complex numbers out of one dimension and into two. He also gave the fundamental complex number a name. He let
$i = \sqrt{-1}$
In Euler’s formalism, the numbers no longer lived on a number line, they lived in the complex plane. Multiples of the number 1 were on the horizontal axis, and multiples of $i$ were on the vertical axis. Sums of real and imaginary numbers could be anywhere on the plane.
With the advent of the complex plane, the world of numbers grew dramatically.
It’s worth noting that you can always convert a complex number into a real number by taking its norm squared. I’ll explain by example. Say we have the number
If we square this number, the result will still be partly imaginary:
But, if we flip the sign on the term with the $i$, we get a new number. If we multiply this new number by the original number, we get a real number:
And this holds true for any complex number
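Here is a quick check in Python, which writes the imaginary unit as `j`. The number `3 + 4j` is just an example I picked, and `norm_squared` is my own helper name.

```python
def norm_squared(z):
    """Multiply a complex number by its sign-flipped partner (its conjugate)."""
    return (z * z.conjugate()).real


if __name__ == "__main__":
    z = 3 + 4j                          # hypothetical example number
    print("z squared:       ", z * z)   # still partly imaginary
    print("z times conjugate:", z * z.conjugate())
    print("norm squared:    ", norm_squared(z))
```

Squaring gives `(-7+24j)`, which is still partly imaginary. Multiplying by the sign-flipped partner gives `(25+0j)`: the imaginary parts cancel and what is left is $3^2 + 4^2$.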
The Most Beautiful Equation in All of Mathematics
Now that we have two dimensions to play around in, we can make circles. In the $xy$-plane, a plane made of just real numbers in two directions, the $x$ and $y$ coordinates of a point on a circle can be described by sine functions and cosine functions. You might have seen a diagram like this one in school.
This description is called polar coordinates. Given a point, we’re relating the position of the point along the $x$ and $y$ axes of the plane to its position around a circle, which we call the angle $\theta$, and the radius $r$ of that circle.
This is all for planes made of real numbers. But is there an equivalent to polar coordinates in the complex plane? Euler found one. But now the $y$ axis is imaginary! Euler found that polar coordinates for the complex plane look like this.
But Euler didn’t stop there. He found a very strange relationship between exponentiation (taking things to powers) and angles in the complex plane:
$e^{i\theta} = \cos\theta + i\sin\theta$
At first glance, this formula looks really funny. What does it mean to multiply a number by itself an imaginary number of times? For that matter, what does it mean to multiply a number a fractional or irrational number of times? Taken literally, this is totally nonsensical. Fortunately, generalizing to fractional exponentiation is easy. We’re just abusing notation a bit. You see, we can write the $n$th root of a number $x$ as
$\sqrt[n]{x} = x^{1/n}$
And then we can take fractional exponents by multiplying the number the correct number of times and then taking the correct root. For example,
This makes manipulating exponents extremely easy because now the $n$th root of any number to the $n$th power is obviously the number itself:
$(x^n)^{1/n} = x^{n/n} = x$
But what does it mean to take an imaginary power? All I can say for now is that there is a straightforward way to generalize the operation I told you about—multiplying a number by itself some number of times—to this more abstract notion of exponentiation using a tool from calculus called a Taylor series, which I’ll describe another time.
Euler’s formula works, though… and if we choose $\theta = \pi$, the sines and cosines simplify and we get what many people call the most beautiful equation in mathematics:
$e^{i\pi} + 1 = 0$
$e$ is fundamental to calculus, $\pi$ is fundamental to geometry, $i$ is the fundamental unit of imaginary numbers, and 0 and 1 are the building blocks of the counting numbers. In this one formula, we have the five most important numbers in all of mathematics, related to each other in an incredibly simple way.
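You don’t have to take Euler’s word for it. Python’s standard `cmath` module knows how to take imaginary powers of $e$, so we can compare the two sides of his formula numerically. The helper name `euler_gap` is mine, and the angles are arbitrary test values.

```python
import cmath
import math


def euler_gap(theta):
    """Distance between e^(i*theta) and cos(theta) + i*sin(theta)."""
    left = cmath.exp(1j * theta)
    right = complex(math.cos(theta), math.sin(theta))
    return abs(left - right)


if __name__ == "__main__":
    for theta in (0.0, 0.5, 1.0, math.pi / 2, math.pi, 4.0):
        print(f"theta = {theta:.4f}   gap = {euler_gap(theta):.2e}")
    print("e^(i*pi) + 1 =", cmath.exp(1j * math.pi) + 1)
```

The gaps come out at the level of floating-point rounding, and $e^{i\pi} + 1$ is zero up to that same rounding error (the computer’s $\pi$ is only accurate to about sixteen digits).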
That’s about all I wanted to say about what imaginary numbers are. As pure mathematical constructs, imaginary numbers are beautiful and powerful tools for solving algebraic equations. But are they useful in the real world? The word imaginary implies not so much.
But actually, imaginary numbers are everywhere in math, science, and engineering. The mathematician J. S. Hadamard perhaps put it best:
The shortest path between two truths in the real domain passes through the complex domain.
I’ll give a few examples here.
Imaginary numbers show up in electromagnetism and electrical engineering, where scientists take advantage of Euler’s formula. Electromagnetic waves are made out of electric and magnetic fields feeding into each other, and the fields change in both space and time. Really, this means we should use two functions to describe the waves, one describing the space evolution and one describing the time evolution. However, we can get away with using only one function with real and imaginary parts. This is because the waves are described by sine functions and cosine functions. We hide the extra function by using Euler’s formula to transform the sines and cosines in space and time into a single term.
From the start, physicists used imaginary numbers to formulate quantum mechanics. The quantum wavefunctions that describe the positions of particles live in the complex plane and it is their norm squared that determines the probability of finding a particle in a given position. In its full complex glory, the Schrodinger wave equation is:
$i\hbar \frac{\partial \psi}{\partial t} = \hat{H}\psi$
Here $\psi$ is the wavefunction and $\hat{H}$ is the operator that measures the total energy. The $i$ is indeed the imaginary unit, $\sqrt{-1}$.
The Feynman path integral I described last time also uses imaginary numbers. Indeed, it uses Euler’s formula. In the path integral, you sum over the directions of many arrows pointing in different directions as you travel along all possible paths between two places. Those arrows can be represented, using Euler’s formula, as an imaginary number. And, in the language of mathematics, we write a Feynman integral as
where and are the initial and final points respectively and where is the energy cost for the particle to travel along a given path. The big tells us to sum over all paths connecting the initial and final points… and the big is equivalent to the quantum wavefunction in Schrodinger’s equation.
Special and General Relativity
In some of my previous articles, I’ve explained how the speed of light is constant and how this leads to special relativity. Later, I described an alternate formulation of special relativity called Minkowski space. In Minkowski space, space and time are unified into a single spacetime. Furthermore, vectors (just think arrows: a direction and a length) pointing in the time direction have negative square length. In other words, vectors pointing in the time direction are imaginary! So in Minkowski space, time is an imaginary number! Then we can think of Lorentz transformations (the mathematical transformations that generate length contraction and time dilation) as rotations by an imaginary angle.
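Here is a small numerical illustration of that last claim, in units where the speed of light is 1. The “angle” of the rotation is usually called the rapidity, and an ordinary rotation’s cosine and sine get replaced by their hyperbolic cousins, which is exactly what you get by feeding cosine and sine an imaginary angle. The function names are mine, and the event and speed are arbitrary example values.

```python
import cmath
import math


def boost(t, x, v):
    """Lorentz-transform the event (t, x) to a frame moving at speed v (with c = 1)."""
    rapidity = math.atanh(v)                      # the "angle" of the rotation
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return ch * t - sh * x, ch * x - sh * t


def interval_squared(t, x):
    """Squared spacetime length, with the time direction counted as negative."""
    return -t * t + x * x


if __name__ == "__main__":
    t, x = 2.0, 1.0
    t2, x2 = boost(t, x, 0.6)
    print("before:", (t, x), "length squared", interval_squared(t, x))
    print("after: ", (t2, x2), "length squared", interval_squared(t2, x2))
    # Cosine of an imaginary angle is the hyperbolic cosine of the real angle.
    print(cmath.cos(0.5j), math.cosh(0.5))
```

Just as an ordinary rotation changes the coordinates of an arrow without changing its length, the boost changes $t$ and $x$ but leaves the (negative) squared length alone.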
There’s a beautiful connection between quantum mechanics, which I’ve talked a lot about, and statistical mechanics, which I’ve only touched upon. We can connect these two disparate fields by a Wick rotation, which rotates a quantum system through the complex plane. It changes a Feynman path integral into a more classical sum over statistical states.
That’s about all I have to say for applications for now. I promise there will be more about complex numbers in future posts. For now, I just wanted to list a couple of resources that I liked.
- Orlando Merino offers a short history of complex numbers in cliff notes form.
- Raymond Smullyan has some nice remarks on complex numbers on his website.
Questions? Comments? Insults?
That’s all for now. If you have any questions, comments, feedback, or corrections, please don’t hesitate to let me know!
Will you understand what I’m going to tell you?
…No, you’re not going to be able to understand it.
… I don’t understand it. Nobody does.
~Richard Feynman on the Path Integral
The “paradox” is only a conflict between reality
and your feeling of what reality “ought to be.”
~Richard Feynman, in his lectures on physics
Quantum mechanics is a very strange beast. Things tunnel and ooze. You can’t know both position and momentum at the same time. These strange properties come from the amazing realization that particles are waves. Not only that, but the amplitude of the wave tells us the likelihood of measuring a particle at a given position! This staggering revelation helps us understand fundamental things, like the very structure of an atom.
In the past, I’ve written extensively about quantum mechanics. The standard equation is called the Schrodinger Equation, after the discoverer, Erwin Schrodinger. However, there are actually a number of ways to think about quantum mechanics. Richard Feynman, who won the 1965 Nobel prize in physics, constructed another way of thinking about quantum particles, called the path integral. Here I try to explain the path integral as I understand it.
Before reading, you might want to read my previous, more introductory articles on quantum mechanics. My trilogy on the fascinating history and motivation of quantum mechanics can be found here:
I also wrote some ancillary articles on the consequences of quantum mechanics.
- One on the Heisenberg Uncertainty Principle: http://www.thephysicsmill.com/2013/01/13/resolution-fourier-analysis-and-the-heisenberg-uncertainty-principle/
- One on the Pauli Exclusion Principle: http://www.thephysicsmill.com/2013/01/27/binary-unity-the-pauli-exclusion-principle/
- One on quantum tunneling: http://www.thephysicsmill.com/2013/02/24/the-fundamental-oneness-of-nature-quantum-tunneling/
I also wrote an article on band theory and used it in a later article to explain how transistors work. You can find them here:
The Principle of Least Action
Imagine you’re a lifeguard at a beach. As you coolly watch the vacationers, you spot an emergency. Pierre Louis Maupertuis is drowning!
But there’s a problem. How can you get to Maupertuis most quickly? It’s quite hard to run on the sand, after all. There’s a cement path that goes towards the left, and that’s faster… but it’s not quite in the correct direction! What do you do?
Because you’re clever, you realize that you should spend some time on the path, and some time on the sand. You run down the cement path for a little ways, and run across the sand the rest of the way. You choose the correct distance on the cement path such that, running at top speed, you minimize the amount of time it takes to go rescue Maupertuis.
When you do rescue Maupertuis, he tells you a staggering fact. He believes that when a light ray passes between air and water, or air and glass, it makes a similar decision. The light starts at a given position, call it point $A$, and wants to get somewhere else, call it point $B$. The light knows that it travels slower through the water than it does through the air, but that the air might not be the most direct route, so it chooses to spend part of its time in the air and part of its time in the water.
You scoff at Maupertuis because it means that the light somehow knows what the quickest path will be. Since this seems to attribute both intelligence and prescience to photons, it’s extremely unintuitive.
But he tells you that he can prove that the idea works… he can reproduce the law of refraction. Besides, because we are part of the universe, we can never really observe it and understand it from the outside. Our models and descriptions reflect our human nature and are limited by it. Why should nature feel obligated to behave in the way you expect?
What Maupertuis just described to you is called the calculus of variations, and it has a wide variety of applications in physics. Calculus of variations is the base technique of Lagrangian mechanics, which (along with its lesser-known cousin, Hamiltonian mechanics) offers an alternative to Newton’s method of solving physics problems. The idea is that every object moves to expend the least energy. (For experts: to make the action extremal.) This is called the principle of least action.
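Maupertuis’ claim about refraction is easy to test numerically. The sketch below puts a light source above the water and a target below it, tries different crossing points on the surface, and keeps the one with the shortest travel time. All the distances and speeds are made-up example values and the function names are mine. If the least-time idea is right, the sines of the two angles (measured from the vertical) should come out in the same ratio as the two speeds, which is the law of refraction.

```python
import math


def travel_time(x, height, depth, offset, v_air, v_water):
    """Time to go from `height` above the surface to `depth` below it,
    crossing the surface a horizontal distance x from the starting point."""
    return math.hypot(height, x) / v_air + math.hypot(depth, offset - x) / v_water


def fastest_crossing(height, depth, offset, v_air, v_water, iterations=200):
    """Narrow down the crossing point with the smallest travel time."""
    lo, hi = 0.0, offset
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        t1 = travel_time(m1, height, depth, offset, v_air, v_water)
        t2 = travel_time(m2, height, depth, offset, v_air, v_water)
        if t1 < t2:
            hi = m2     # the minimum cannot lie to the right of m2
        else:
            lo = m1     # the minimum cannot lie to the left of m1
    return (lo + hi) / 2


def sine_ratio(height, depth, offset, v_air, v_water):
    """sin(angle in air) / sin(angle in water) along the fastest path."""
    x = fastest_crossing(height, depth, offset, v_air, v_water)
    sin_air = x / math.hypot(height, x)
    sin_water = (offset - x) / math.hypot(depth, offset - x)
    return sin_air / sin_water


if __name__ == "__main__":
    print("ratio of sines: ", sine_ratio(1.0, 1.0, 1.0, 1.0, 0.75))
    print("ratio of speeds:", 1.0 / 0.75)
```

The two printed numbers agree to many decimal places, even though the code never mentions the law of refraction. It only ever asks which path is quickest.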
A Quantum Principle of Least Action
Don’t worry about what those symbols actually mean. What I want to emphasize is that, although the story it tells is very different, the Schrodinger equation is the quantum analog of the old Newton equation:
I wrote Newton’s law to look more similar to Schrodinger’s equation… but what you’re probably used to is F = ma.
(Technically, the Schrodinger equation is the quantum analog of Hamilton’s equations, which can be used to derive Newton’s force law… and are actually very related to the principle of least action. But that’s a story for another day.)
So if Schrodinger’s equation is the analog of Newton’s ideas, is there an analog to Maupertuis’ principle of least action? Richard Feynman thought about this question rather a lot… and he came up with a solution.
Imagine a particle is at some initial position in the xy-plane, and we want to know what path it will take to some final position. By the classical least action principle, the particle will take a path between the two positions that costs the least energy. But, if the particle is a quantum particle, it’s not really localized at a point. Instead, the particle is a wave… and it doesn’t take one path from the initial position to the final position, it takes all possible paths.
But what does this mean computationally? Well, quantum particles are represented by probability waves, and quantum mechanics is an inherently probabilistic theory. We need some way of connecting all these paths to some notion of “wavy-ness.”
An important property of waves that we want to preserve is the “superposition principle,” which is an essential piece of the quantum mechanics picture. I’ve previously discussed wave interference and the superposition principle, so if you remember my previous discussions, feel free to skip to the next section.
Imagine waves as wiggles on a very stretchy string. If I try and push up on the string (make a wiggle that goes up) and you try and push down on the string (make a wiggle that goes down) at the same time, neither of us ends up moving the string as much as we intended. This is called destructive interference. Similarly, if I push up on the string at the same time that you push up on the string, we’ll probably stretch it quite a lot. This is called constructive interference. The process of overlaying one wave over another is called superposition.
Waves and Circles
A hint about how to preserve “wavy-ness” comes from looking at how sine waves appear from a circle. If we take a dot and let it travel around the circle, the vertical motion of the dot traces out the shape of a sine wave. Notice that the wave repeats itself every time the dot traces a full circle. The number of times the dot traces the circle each second is called the “frequency” of the wave. If the wave is a sound wave, then this is the frequency of sound, or the pitch. If the wave is a light wave, this is the color of the light.
We can use this. Let’s pretend there’s a little arrow at the center of the circle, and that it spins to point at the place the dot is. We’ll try and use this arrow to represent the wave. Now we take the frequency of the wave as a definition, and use it to tell us how fast to spin the arrow. The higher the frequency, the faster we spin the arrow.
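The spinning arrow can be sketched in a few lines of Python, using a complex number to stand in for the arrow. This is only an illustration: the function name and the frequency of two turns per second are my own choices. The vertical coordinate of the arrow's tip matches the sine wave at every instant.

```python
import cmath
import math

def arrow(frequency, t):
    """Unit arrow after spinning at `frequency` turns per second for t seconds."""
    return cmath.rect(1.0, 2 * math.pi * frequency * t)

frequency = 2.0  # hypothetical: two full turns per second
for step in range(5):
    t = step / 16
    tip = arrow(frequency, t)
    sine = math.sin(2 * math.pi * frequency * t)
    # The vertical coordinate of the tip is the height of the sine wave.
    print(f"t={t:.4f}  height={tip.imag:+.3f}  sine={sine:+.3f}")
```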
But how do we encapsulate the superposition principle? The answer comes from vector addition, which I’ve talked about a little bit before. Imagine we have two arrows, a blue arrow and a green arrow, as shown below (on the left). We can make a third arrow, a red one, by taking the tail of the green arrow and putting it at the tip of the blue arrow. We then use the red arrow to connect the tail of the blue arrow to the tip of the green one.
This allows us to encapsulate the idea of waves adding up on top of each other. If the blue and green vectors represent two waves, then the red vector represents the wave created by interference.
(Astute readers may remember that I gave a similar picture for calculating acceleration vectors, only with the roles of the red and green arrows reversed. That’s no accident. Acceleration calculations use vector subtraction, which is the opposite of vector addition.)
Now, notice that we can totally cancel out two vectors, if they point in opposite directions, as shown below. The resulting sum vector has zero length, and is called the zero vector. In the language of waves, this is the extreme case of destructive interference.
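Here is a small sketch of arrow addition as interference, again using Python complex numbers as arrows. The helper name and the particular angles are illustrative assumptions. Adding two arrows tip to tail is just adding the two complex numbers.

```python
import cmath
import math

def arrow(length, angle):
    """An arrow (wave) with a given length and pointing angle in radians."""
    return cmath.rect(length, angle)

blue = arrow(1.0, 0.0)

in_step = blue + arrow(1.0, 0.0)           # both arrows point the same way
out_of_step = blue + arrow(1.0, math.pi)   # arrows point in opposite directions
sideways = blue + arrow(1.0, math.pi / 2)  # arrows at a right angle

print(abs(in_step))      # twice as long: constructive interference
print(abs(out_of_step))  # essentially zero: the zero vector, fully destructive
print(abs(sideways))     # somewhere in between
```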
The Feynman Path Integral
Now we have all the ingredients to take the principle of least action and make it quantum. Because quantum mechanics is a probabilistic theory, our principle of least action should make a probabilistic statement. Suppose we have a quantum particle with some wavelength or frequency (either will do—they’re inverses of each other) which we just measured at some initial point. We want to know the probability of finding it at some final point.
To do this, we take our little arrows and make them rotate at a speed based on how difficult it is for a particle to travel through a given point: the harder the travel, the slower the arrow spins. (For experts, they spin with frequency equal to the action.) In other words, if the particle starts in a thick fluid, the arrows would spin slower than if the particle started in the vacuum of space.
Now, we discussed before that the particle takes all paths between the initial point and the final point. So we make our arrows follow each path and rotate them as we go along, as shown below.
At the end of the story, when we’ve followed all the paths—and usually there are an uncountable number of them—we take the arrows at the final point and add them up by vector addition. The final vector corresponds to the height (amplitude) of the wavefunction that represents the particle. And this, my friends, gives us the probability that the particle transitions from the initial position to the final position! That’s a Feynman path integral!
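The recipe can be tried on a toy problem. The sketch below is a drastic simplification under my own assumptions, not a real path integral. A free particle leaves x = 0 and returns to x = 0, and each "path" is two straight segments meeting at a midpoint. Each arrow's angle is the action of its path divided by a constant (the standard textbook choice), with a made-up value for that constant. In the standard formulation, the probability comes from the squared length of the final arrow.

```python
import cmath

# Toy assumptions: a free particle of mass M goes from x = 0 back to x = 0
# in total time T. Each path is two straight segments meeting at x_mid.
# HBAR is an arbitrary illustrative value, not the physical constant.
M, T, HBAR = 1.0, 1.0, 0.05

def action(x_mid):
    """Classical action of the two-segment path through x_mid."""
    half = T / 2
    speed = x_mid / half                    # same speed out and back
    return 2 * (0.5 * M * speed**2 * half)  # kinetic energy * time, both halves

def add_arrows(x_lo, x_hi, dx=0.001):
    """Add one arrow per path whose midpoint lies in [x_lo, x_hi)."""
    total = 0j
    n = round((x_hi - x_lo) / dx)
    for i in range(n):
        x_mid = x_lo + i * dx
        total += cmath.exp(1j * action(x_mid) / HBAR) * dx
    return total

near_classical = add_arrows(-0.5, 0.5)  # paths close to the straight, classical one
far_away = add_arrows(2.0, 3.0)         # an equally wide family of wild detours

print(abs(near_classical), abs(far_away))
```

The arrows for paths near the classical one point in similar directions and add up, while the arrows for the wild detours spin rapidly and mostly cancel. That is how a "take every path" rule can still reproduce least-action behavior for everyday objects.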
Applications And Implications
Just like the principle of least action is completely equivalent to Newton’s force law, the Feynman path integral is completely equivalent to the Schrodinger picture of quantum mechanics. Although it’s much harder to compute with, the path integral is often more intuitive, and it lets physicists think about particle interactions very quickly and easily without actually having to compute the integral. Feynman diagrams, which I haven’t explained, come from this formalism.
The path integral is also important for quantum gravity, where it is unclear how to merge quantum mechanics and general relativity. The Feynman path integral, which accepts knowledge of the past and the future, seems to mesh well with general relativity where time and space are one. As I’ll explain in a later article, my own research used the path integral in an attempt to understand how quantum gravity works.
Richard Feynman actually wrote a book for the layperson on this very subject. It’s called “Q.E.D. The Strange Theory of Light And Matter.” It’s a wonderful book and you should all read it. Feynman has a gift for explaining things simply. I hope that this article has whetted your appetite for the words of the master.
The excellent blog LessWrong also has an article on the path integral. In fact, they have a whole series of articles on physics. You should check it out.
Finally, I really suggest you all watch Feynman’s lectures on physics. As I said, Feynman has a gift for exposition, and all his videos for non-physicists are available to watch online. If you’re interested in learning some elementary physics from the master, he also taught an introductory college physics class, and Microsoft found the videos of his lectures and put them online.
Hi everyone! This week, I was traveling to Park City, Utah, to participate in the 3-week Park City Mathematics Institute. It’s currently a blast! I have more time now, but in the meantime, I asked my good friend Mike Schmidt to write a guest article for me. He wrote on probability which, if you’ve been reading for a while you know, is deeply connected to modern physics.
Anyway, here’s the article. Thanks, Mike!
The laws of Probability
So true in general
So fallacious in particular.
Throughout history, humans have played games of chance. However, it wasn’t until the 1600s that Blaise Pascal and Pierre de Fermat started to investigate a mathematical description of chance. The story starts, as most good stories do, with gambling. The following passage comes from Tom Apostol’s excellent calculus textbook:
“A gambler’s dispute in 1654 led to the creation of a mathematical theory of probability by two famous French mathematicians, Blaise Pascal and Pierre de Fermat. Antoine Gombaud, Chevalier de Méré, a French nobleman with an interest in gaming and gambling questions, called Pascal’s attention to an apparent contradiction concerning a popular dice game. The game consisted in throwing a pair of dice 24 times; the problem was to decide whether or not to bet even money on the occurrence of at least one “double six” during the 24 throws. A seemingly well-established gambling rule led de Méré to believe that betting on a double six in 24 throws would be profitable, but his own calculations indicated just the opposite.
This problem and others posed by de Méré led to an exchange of letters between Pascal and Fermat in which the fundamental principles of probability theory were formulated for the first time. Although a few special problems on games of chance had been solved by some Italian mathematicians in the 15th and 16th centuries, no general theory was developed before this famous correspondence.
The Dutch scientist Christian Huygens, a teacher of Leibniz, learned of this correspondence and shortly thereafter (in 1657) published the first book on probability; entitled De Ratiociniis in Ludo Aleae, it was a treatise on problems associated with gambling. Because of the inherent appeal of games of chance, probability theory soon became popular, and the subject developed rapidly during the 18th century. The major contributors during this period were Jakob Bernoulli (1654-1705) and Abraham de Moivre (1667-1754).”
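De Méré's dice question can be checked with a few lines of Python. This is a sketch that assumes fair dice and independent throws, and the function name is mine. The idea is to compute the chance of missing a double six on every throw and subtract that from one.

```python
from fractions import Fraction

def at_least_one_double_six(throws):
    """Chance of at least one double six in `throws` throws of two fair dice."""
    miss_once = Fraction(35, 36)     # 35 of the 36 outcomes are not a double six
    return 1 - miss_once ** throws   # complement of missing every single time

p24 = at_least_one_double_six(24)
p25 = at_least_one_double_six(25)
print(float(p24))  # a little below 1/2
print(float(p25))  # a little above 1/2
```

So an even-money bet on a double six in 24 throws loses slightly on average, which matches what de Méré's own calculations indicated.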
What does it mean to be random? If you do something the same way each time, but the outcome is different, that’s random. More formally, an operation that produces different results given identical starting conditions is said to be random. A random system is the system where you perform the operation. For example, if I flip a coin, I’ve performed a random operation. But the coin is the random system.
Before the turn of the 20th century the predominant theory of the world was that of determinism. Determinism is the belief that if there is a set of initial conditions, there is only one result. Well, this would lead us to believe in a deterministic “clockwork” world; there would be no randomness.
At the time, our definition of randomness wasn’t really considered to be true. Instead, a system which exhibited severe sensitivity to initial conditions was considered to be random. Take for instance a coin flip; if you knew the exact force and direction imparted to the coin you could determine which face would fall upwards. Since there are always errors in measurement, there will always be some level of uncertainty. In the coin example, there is so much sensitivity to initial force and direction, you could never be certain of where it would fall.
At the beginning of the 20th century, determinism started to seem incorrect. As evidence of quantum mechanics accumulated, notions of true randomness began to emerge. For the first time in physics, there was now evidence which competed directly with determinism. Quantum mechanics suggests there is no way, no matter how careful you are, to completely determine the outcome of an experiment.
(In fact, there was a recent paper arguing that even a coin flip is a quantum-mechanical effect, and inherently random. The paper was co-authored by Andreas Albrecht, one of the inventors of inflationary theory, and his student. There’s a nice article on it in New Scientist, which is unfortunately behind a paywall. If you want, though, you can read the actual paper for free here.)
Upon hearing this, the situation seems pretty bleak, doesn’t it? Fortunately, the distinction between “I don’t know how this will pan out” and “I can’t know how this will pan out” doesn’t really change the analysis! So, how exactly do we characterize a random outcome? Well, we can only express the likelihoods of specific outcomes.
Let’s start looking at the most basic probabilistic system, the coin flip. If you flip a coin once, you’ll either get H(eads) or T(ails). Since each option is equally likely, the likelihood is 1/2 for heads or tails. Notice that 1/2 + 1/2 is 1. This is always a rule of probabilities: the sum of all likelihoods is one. In other words, this means that we require that something happen.
To show what this could look like, the following animation shows how the counts add up over the course of many flips. The left plot shows the percentage of flips that came out heads (left bar) and the percentage that came out tails (right bar) as we flip the coins. The right plot shows the percentage of heads as a function of the number of flips.
After many flips the chart will settle down to something that looks like this:
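If you'd like to reproduce the experiment yourself, here is a minimal simulation sketch in Python. The function name, the seed, and the number of flips are arbitrary choices of mine. It tracks the running fraction of heads, which wanders at first and then settles near one half.

```python
import random

def running_fraction_of_heads(num_flips, seed=0):
    """Flip a simulated fair coin; record the fraction of heads after each flip."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    heads = 0
    fractions = []
    for flip in range(1, num_flips + 1):
        if rng.random() < 0.5:
            heads += 1
        fractions.append(heads / flip)
    return fractions

history = running_fraction_of_heads(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(history[n - 1], 4))
```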
Now, let’s look at another coin flip, but this time let’s flip two coins! In this case there are four possible outcomes: HH HT TH TT. Again, since every option is equally likely to occur, we say that each has a 1/4 chance to occur. In this case we’ve required what is called ordering. In other words, Coin 1 is distinct from Coin 2. What if we relaxed this requirement? That is to say, what if we threw both coins in the air and didn’t know which was which? In this case we couldn’t tell the difference between the HT and TH outcomes, they would be the same! Now we have a new situation, the outcomes are now: HH (HT or TH) TT. We only have three outcomes but the middle outcome is indifferent to order. We’ll say the middle outcome has a weight of two.
This will change our arithmetic slightly; now the likelihood of an outcome is the weight/sum of weights. The sum of weights would be 1 + 2 + 1 = 4. Therefore, HH is 1/4, (HT or TH) is 1/2, and TT is 1/4.
Another way to think about it is to treat probability as a counting problem. The probability of getting a specific outcome is the number of ways that outcome can occur, divided by the total number of ways any outcome can occur. In the two coin flip example, there are two ways that you can get one head and one tail, TH or HT… and a total of 4 outcomes. Then you get a probability of 2/4=1/2.
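The counting argument is easy to hand to a computer. The sketch below (variable names are my own) lists every ordered outcome for two coins, then forgets the order by sorting the letters, and finally divides each weight by the sum of the weights.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

ordered = list(product("HT", repeat=2))  # HH, HT, TH, TT

# Forget which coin is which by sorting the letters of each outcome.
weights = Counter("".join(sorted(outcome)) for outcome in ordered)

total = sum(weights.values())
probabilities = {outcome: Fraction(w, total) for outcome, w in weights.items()}
print(probabilities)
```

The mixed outcome shows up under the single label "HT" with weight two, giving the 1/4, 1/2, 1/4 split from above.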
Let’s take another break and talk about what these likelihoods mean. A likelihood is, pedantically, the weight of a particular outcome compared to other outcomes. In the single coin flip example, heads and tails are equally weighted and are the only two options, so we can see the likelihood is 0.5 (as 0.5 + 0.5 = 1). Now, let’s consider what happens when there are unequal weights; suppose you have a special 6-sided die where two sides are the same, say 1. There are 6 possible outcomes: 1,1,2,3,4,5. We now say the likelihood of getting a 1 on a single roll is 2/6 or 1/3. We can compare this to the likelihood of rolling a 2, which is 1/6.
Now, we may compare the two likelihoods and say that rolling a 1 is twice as likely as rolling any other number. Additionally, if we look at a set of trials, say six rolls, we will “expect” to see two ones, and one of each other number.
Now what does it mean to expect a specific outcome? Say we run a very large number of six-roll trials; on average, two 1s will appear per trial, and each other number will appear one time. So when we speak about the likelihood of an outcome, we are only making claims about what would happen on average if we ran the same trial many times.
What if I flip one coin after another, and I want to know the probability of getting two heads in a row? We already know from counting that the answer will be 1/4. But is there another way to look at it? There sure is! If you do two random things, and they’re not related, then you can calculate the probability of one outcome happening after the other by multiplying the two probabilities. So, since the probability of getting H on a single coin flip is 1/2, the probability of getting H twice in a row is 1/2 × 1/2 = 1/4.
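The special die makes a nice small example of weights turning into likelihoods and expected counts. This is just a sketch with names I picked. It counts how many sides show each number, divides by six, and multiplies by six rolls to get the expected number of appearances.

```python
from collections import Counter
from fractions import Fraction

faces = [1, 1, 2, 3, 4, 5]  # the special die: two of its six sides show a 1
weights = Counter(faces)

likelihood = {face: Fraction(w, len(faces)) for face, w in weights.items()}
expected_in_six_rolls = {face: 6 * p for face, p in likelihood.items()}

print(likelihood[1], likelihood[2])
print(expected_in_six_rolls[1], expected_in_six_rolls[2])
```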
Since we now have a good understanding of basic probability, let’s lay out some things to think about.
First, one of the greatest unexpected answers from probability is the Birthday Problem.
The question asks: what is the probability that two people in a room share a birthday? To answer this question it’s actually easier to first consider the opposite probability, the chance that in a room of n people, no two people share a birthday. To start, we know that there are only 365 days in a year and every person in a room has a birthday on one of those days. We start to build a probability distribution by looking at each person in the room. For the first person, we say their birthday can be on any day. For the second, if they do not share a birthday with the first person, we only have 364 days to then choose from. For the third, only 363. We can therefore see the probability q(n) that no one in the room shares a birthday looks like:
q(n) = (365/365) × (364/365) × (363/365) × … × ((365 − n + 1)/365)
If n = 366, then the last term will be zero and q(n) will be zero! Now if we considered a room full of 366 people, we know it would be impossible for everyone to have a different birthday, as we have only 365 unique birthdays. So, the two ways of thinking fit with each other.
Now, what about the first question? The chance that no pair shares a birthday and the chance that at least two people do share a birthday must, when added, equal one. Therefore, we can say q(n) + p(n) = 1, or p(n) = 1 − q(n), where p(n) is the probability that out of n people, at least one pair shares a birthday. Amazingly, if you compute it, p(23) ≈ 0.507! This is quite astounding! On first look, 23 people sounds too low, but you must remember to consider the group as a whole where each pair must be considered.
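The product is tedious by hand but trivial in code. Here is a minimal sketch, with function names of my choosing and leap years ignored as in the text. It multiplies the chance that each new person avoids all the birthdays already taken, then subtracts from one.

```python
def prob_all_different(n, days=365):
    """Chance that n people all have different birthdays."""
    p = 1.0
    for k in range(n):
        p *= (days - k) / days  # person k+1 must avoid the k birthdays already taken
    return p

def prob_shared(n, days=365):
    """Chance that at least one pair among n people shares a birthday."""
    return 1.0 - prob_all_different(n, days)

for n in (10, 22, 23, 50, 366):
    print(n, round(prob_shared(n), 4))
```

The chance of a shared birthday first passes one half at 23 people, and it reaches exactly one once the room holds more people than there are days in the year.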
Second, the Gambler’s fallacy is the belief that if you see a deviation from the expected likelihood in a small number of samples, then at some point the likelihood must over-correct to make up for the previous deviation. To see why this is untrue, we must remember that two trials, say two rolls of a die, are independent. In other words, the outcome of the first roll doesn’t change the outcome of the second. This means there is no way for the die to “know” it needs to correct, and it will therefore continue to play out with the same likelihood every time.
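You can test the Gambler's fallacy directly with a simulation. The sketch below uses coin flips rather than dice to keep things short, and the function name, seed, and streak length are arbitrary choices of mine. It looks only at flips that come right after three tails in a row and asks how often they come up heads.

```python
import random

def heads_rate_after_tails_streak(num_flips=200_000, streak=3, seed=1):
    """Fraction of heads among flips that come right after `streak` tails in a row."""
    rng = random.Random(seed)
    flips = [rng.random() < 0.5 for _ in range(num_flips)]  # True means heads
    followers = [
        flips[i]
        for i in range(streak, num_flips)
        if not any(flips[i - streak:i])  # the previous `streak` flips were all tails
    ]
    return sum(followers) / len(followers)

rate = heads_rate_after_tails_streak()
print(round(rate, 3))
```

If the coin were "due" for heads after a run of tails, this rate would sit noticeably above one half. In the simulation it stays close to one half, as independence says it should.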
If you have any questions, as always, feel free to ask. If you have requests for additional topics, again, please post a comment asking for it.
Did you use a mouse recently? Did you type on a keyboard? Did you click on a link? Connect to a computer network? Odds are, if you’re reading this, you have. You can thank Douglas Engelbart (1925–2013) for all of these inventions and more.
In December 1950, Engelbart had it all. He was engaged to be married, had a good job as a radar technician, and was generally doing well for himself. At this point, he decided that this wasn’t good enough. He decided he wanted to improve the world and that, although they were in their infancy, computers were the best way to do this.
Even though computers at the time were enormous mainframes, totally unavailable to the public, Engelbart had a vision of
“intellectual workers sitting at display ‘working stations’, flying through information space, harnessing their collective intellectual capacity to solve important problems together in much more powerful ways. Harnessing collective intellect, facilitated by interactive computers, became his life’s mission at a time when computers were viewed as number crunching tools.” (quote from Wikipedia).
Engelbart went back to graduate school in electrical engineering and quickly became one of the world’s foremost experts in computer-human interaction in a time when most people had never even seen a computer.
Just four days ago, on July 2, Engelbart passed away. He never retired. R.I.P. Douglas Engelbart, you’ve done humanity a great service.
Most of Engelbart’s philosophy can be found in the book Boosting Our Collective IQ, by Douglas C. Engelbart.
There’s a nice memorial to Engelbart on xkcd:
Expansion means complexity
and complexity decay.
~Cyril Northcote Parkinson
This is part three of a series on the early universe. In the first article, I described the history of the Big Bang theory and why we believe the universe started in a colossal explosion. In the second article, I described some inconsistencies in the Big Bang theory that need correcting. Now I’ll explain how the theory of cosmic inflation addresses these inconsistencies and why we might believe in inflation. This explanation will use ideas from quantum mechanics and general relativity; you can find my articles on these subjects here and here.
The Tantalizing Almost-Problems
The horizon problem refers to the strange homogeneity in the cosmic microwave background (or CMB), the light that fills the universe left over from the Big Bang. Light from one end of the universe should not have had time to reach the other end of the universe since time began. However, the CMB looks the same no matter where we look–which is very unlikely unless photons on one end of the sky had time to mix with the photons on the other end of the sky.
The flatness problem has to do with general relativity’s prediction that space and time should be curved… even on a universal scale. In fact, at any given time, space forms a curved three-dimensional hypersurface that lives in four-dimensional spacetime. After the Big Bang, the spatial part of the universe should have become more and more curved. However, we can’t observe any curvature at all! This, too, is very unlikely.
The reason I call the horizon and flatness problems “almost-problems” is because they don’t indicate inconsistencies in the Big Bang theory with absolute certainty. Given the unmodified Big Bang theory, we could resolve the horizon and flatness problems simply by assuming that we live in a very special, very unlikely universe. But this resolution isn’t very intellectually satisfying. That’s why cosmologists, most notably Alan Guth, Andrei Linde, Andreas Albrecht, and Paul Steinhardt, developed the theory of inflation.
Resolving the Almost-Problems
To resolve the horizon problem, we need opposite sides of the sky to have been close enough to each other in the early universe for light to pass between them, so that they could mix and homogenize. (We call this state of closeness causal contact.) Furthermore, we need them to have stopped causal contact about 13.7 billion years ago, in order for us to be able to observe them slowly re-entering causal contact today.
One way to achieve this would be for the spatial universe to stay relatively static after the Big Bang. The small size of the universe and its lack of change would allow opposite sides of the sky to enter causal contact with each other. After mass and energy across the universe had sufficiently mixed, the universe would enter a period of extremely rapid expansion, which we call inflation. The inflationary period would pull objects in the universe apart from each other so quickly that light would no longer be able to pass between them. Thus, they would exit causal contact. (Afterwards, however, the rapid expansion would have needed to stop somehow. I will discuss this necessity shortly.)
This rapid expansion also fixes the flatness problem. For example, both balloons and the planet Earth are round–but when we stand on the Earth, it looks flat. This is because everything appears to be flat if you look close enough. (Side note: this is the principle on which the entire field of differential geometry is based!) And we’re very small compared to the Earth. In other words, we’re looking at it very closely–from the viewpoint of our tiny heights, which are measured on a scale of mere meters. Similarly, as space expands rapidly during the inflationary period, it appears larger and larger compared to us. Thus, it appears flat.
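The "bigger looks flatter" idea can be put into numbers with a simple circle, which is only an analogy and not a cosmological calculation. The sketch below (function name and the 1 km patch are my own choices) measures how far a circle curves away from a perfectly flat line over a fixed distance, first for an Earth-sized radius and then for much larger ones.

```python
import math

def drop_below_tangent(radius, distance):
    """How far a circle of the given radius curves away from a flat tangent
    line after travelling `distance` along that line (same units for both)."""
    # Algebraically equal to radius - sqrt(radius**2 - distance**2),
    # written this way to avoid round-off for very large radii.
    return distance**2 / (radius + math.sqrt(radius**2 - distance**2))

distance = 1_000.0  # look at a 1 km patch, measured in meters
for radius in (6.371e6, 6.371e9, 6.371e12):  # Earth, then 1,000x and 1,000,000x larger
    print(radius, drop_below_tangent(radius, distance))
```

For the Earth, the ground falls away by only several centimeters over a kilometer. Each time the radius grows by a factor of a thousand, the curvature we can notice over that same patch shrinks by the same factor.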
But how could the universe expand like that? And if it did, why isn’t it still expanding? We need a way in which the universe enters a period of rapid inflation and then leaves it. To understand how this might occur, cosmologists take a hint from a current mystery: dark energy.
Currently, the universe is expanding at an ever-increasing rate. It’s not expanding as quickly as in the inflationary period, but if trends continue, it will eventually accelerate to inflationary rates. We don’t know what’s causing this acceleration—it’s possible that the physics on the subject is simply wrong—but one thing that could cause it is a particle with negative energy. With some weird quantum exceptions, we don’t observe any negative energy now, but it’s possible that such a particle existed in the early universe. Indeed, it’s possible that the sign of the energy of the particle changed over time. We call this particle the inflaton.
In the hot, dense early universe, the inflaton filled the universe–and while the universe was static, inflatons from one end of the universe had a chance to interact with the other end, so that the universe became homogeneous. After enough time, the inflaton’s energy became negative, causing the universe to expand. (Indeed, the high density of inflatons would have caused extremely rapid expansion.) Eventually, the energy flipped sign again and inflation stopped. The inflatons then slowly disappeared through a number of processes (like particle collisions) and transferred all their energy into the curvature of spacetime, mass, and light. This process is called reheating, which appears to us as the Big Bang. The energy from the inflatons that went into light became the Cosmic Microwave Background.
But does this mean that we didn’t have a real Big Bang? That inflation only makes it appear as if we had one? Well, we don’t know for sure. Inflation erases all evidence of itself and of what happened before the inflationary period. The shape of the universe before inflation is spread out across the current universe beyond what we can see. In fact, because the universe is still expanding at an increasing rate, there are places in the universe we will never observe. Personally, I do believe there was a Big Bang before inflation. It is at least one very compelling story for the beginning of the universe.
(Some sticky technical details: the inflaton energy is based on the slope of the curve representing its potential energy as a function of the quantum field. Most theories of inflation, such as slow-roll inflation, posit some potential energy curve such that the inflaton only looks like dark energy. Eventually the slope changes and the inflaton becomes much less exotic.)
Quantum Fluctuations And The Seeds of Galaxies
Like electrons today, the inflaton was a quantum particle. And, like all quantum particles today, the inflaton sometimes appeared and disappeared at random throughout the universe. These quantum fluctuations have far-reaching consequences today. Say that in the small early universe, a few million new inflaton particles appeared in a single place. If the universe were static, this wouldn’t matter. The inflatons would spread out naturally and, over time, the inflatons would be equally dense everywhere.
However, in the period of rapid expansion, the new inflatons (which couldn’t travel faster than light) may have been unable to reach other parts of the universe. Indeed, if the inflation were rapid enough, the inflatons at one end of the pocket of high density may not have been able to reach the inflatons at the other end of the pocket. The space between them might have been simply increasing too fast.
Eventually, inflation stopped, and the inflatons disappeared and transferred their energy to other particles or to the fabric of spacetime itself. But this means that the low-density pockets of inflatons would have transferred significantly less energy than the high-density pockets. These early quantum fluctuations in the inflaton density would have later become quantum fluctuations in the cosmic microwave background and in the density of mass itself. Thus, these high-density inflaton pockets must have been the seeds from which galaxies were formed!
But what about the energy that became the CMB? Are there quantum fluctuations there, too? There sure are! Although the CMB is homogeneous almost everywhere, there are tiny little fluctuations in the temperature of the CMB. And these fluctuations line up exactly with the predictions of cosmic inflation! This is the real triumph of cosmic inflation as a theory, and why most of the scientific community now believes it.
Just recently, the Planck collaboration released the results of its four-year study of the CMB. This work refines the early work by the 11-year WMAP survey. The above image shows quantum fluctuations in the temperature of the CMB. The oval is the observable sky; orange represents higher temperature and blue represents lower temperature. (The scale is blown up so that contrast between high and low temperature is obvious, but the difference is actually only about one part in one hundred thousand.) This single image–or more accurately, the earlier WMAP image–is what convinced the majority of cosmologists that inflation might be true.
The Verdict and the Controversy
Through an incredible theoretical and experimental effort, cosmic inflation is slowly becoming a real theory with predictive power. Most astrophysicists, myself included, think that it’s pretty plausible. However, the verdict is still out as to whether or not it’s correct. There are a number of competing theories. Perhaps the most prominent one is the anthropic principle. The idea is that only a universe as flat and homogeneous as ours could support life…so the one we exist in must have these absurdly unlikely properties, or else we wouldn’t be here asking these questions.
There are also a number of competing theories of inflation, mostly having to do with how inflation stopped. Slow-roll inflation predicts that the sign of the inflaton deterministically switched. However, there are other possibilities. For instance, eternal inflation argues that inflation never stopped on the cosmic scale. Instead, because of quantum fluctuations in the density and energy of inflatons, certain bubbles of stability—called bubble universes—appear in the universe where inflation just happened to slow down or stop. (We would live in one of these bubble universes.) These bubble universes would remain relatively static compared to the inflating universe around them, thus obeying the expansion pattern predicted by the Big Bang theory. However, the non-stable parts of the universe surrounding these bubbles would continue to inflate, causing the bubbles to move away from each other. Eventually, the universe would begin to look like Swiss cheese, where the holes are bubble universes.
Questions? Comments? Insults?
Cosmic inflation is pretty complicated and I don’t have as strong a grasp on it as I would like. If I’ve made a mistake, please let me know! (If you want more explanation or have insults or kudos for me, let me know that, too!) Thanks as always for reading!