The trouble with Reichenbach

(Note: this blog post is vaguely related to a paper I wrote. You can find it on the arXiv here.)

Suppose you are walking along the beach, and you come across two holes in the rock, spaced apart by some distance; let us label them ‘A’ and ‘B’. You observe an interesting correlation between them. Every so often, at an unpredictable time, water will come spraying out of hole A, followed shortly after by a spray of water out of hole B. Given our day-to-day experience of such things, most of us would conclude that the holes are connected by a tunnel underneath the rock, which is in turn connected to the ocean, such that a surge of water in the underground tunnel causes the water to spray from the two holes at about the same time.

Image credit: some douchebag
Now, therein lies a mystery: how did our brains make this deduction so quickly and easily? The mere fact of a statistical correlation does not tell us much about the direction of cause and effect. Two questions arise. First, why do correlations require explanations in the first place? Why can we not simply accept that the two geysers spray water in synchronisation with each other, without searching for explanations in terms of underground tunnels and ocean surges? Secondly, how do we know in this instance that the explanation is that of a common cause, and not that (for example) the spouting of water from one geyser triggers some kind of chain reaction that results in the spouting of water from the other?

The first question is a deep one. We have in our minds a model of how the world works, which is the product partly of history, partly of personal experience, and partly of science. Historically, we humans have evolved to see the world in a particular way that emphasises objects and their spatial and temporal relations to one another. In our personal experience, we have seen that objects move and interact in ways that follow certain patterns: objects fall when dropped and signals propagate through chains of interactions, like a series of dominoes falling over. Science has deduced the precise mechanical rules that govern these motions.

According to our world-view, causes always occur before their effects in time, and one way that correlations can arise between two events is if one is the cause of the other. In the present example, we may reason as follows: since hole B always spouts after A, the causal chain of events, if it exists, must run from A to B. Next, suppose that I were to cover hole A with a large stone, thereby preventing it from emitting water. If the occasion of its emission were the cause of hole B’s emission, then hole B should also cease to produce water when hole A is covered. If we perform the experiment and we find that hole B’s rate of spouting is unaffected by the presence of a stone blocking hole A, we can conclude that the two events of spouting water are not connected by a direct causal chain.

The only other way in which correlations can arise is by the influence of a third event — such as the surging of water in an underground tunnel — whose occurrence triggers both of the water spouts, each independently of the other. We could promote this aspect of our world-view to a general principle, called the Principle of the Common Cause (PCC): whenever two events A and B are correlated, then either one is a cause of the other, or else they share a common cause (which must occur some time before both of these events).

The Principle of Common Cause tells us where to look for an explanation, but it does not tell us whether our explanation is complete. In our example, we used the PCC to deduce that there must be some event preceding the two water spouts which explains their correlation, and for this we proposed a surge of water in an underground tunnel. Now suppose that the presence of water in this tunnel is absolutely necessary in order for the holes to spout water, but that on some occasions the holes do not spout even though there is water in the tunnel. In that case, simply knowing that there is water in the tunnel does not completely eliminate the correlation between the two water spouts. That is, even though I know there is water in the tunnel, I am not certain whether hole B will emit water, unless I happen to know in addition that hole A has just spouted. So, the probability of B still depends on A, despite my knowledge of the ‘common cause’. I therefore conclude that I do not know everything that there is to know about this common cause, and there is still information to be had.


It could be, for instance, that the holes will only spout water if the water pressure is above a certain threshold in the underground tunnel. If I am able to detect both the presence of the water and its pressure in the tunnel, then I can predict with certainty whether the two holes will spout or not. In particular, I will know with certainty whether hole B is going to spout, independently of A. Thus, if I had stakes riding on the outcome of B, and you were to try and sell me the information “whether A has just spouted”, I would not buy it, because it does not provide any further information beyond what I can deduce from the water in the tunnel and its pressure level. It is a fact of general experience that, conditional on complete knowledge of the common causes of two events, the probabilities of those events are no longer correlated. This is called the principle of Factorisation of Probabilities (FP). The union of FP and PCC together is called Reichenbach’s Common Cause Principle (RCCP).
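Stated compactly (my own paraphrase, not a quotation from Reichenbach): two events A and B are correlated when P(A, B) ≠ P(A)P(B), and FP says that conditioning on a complete description C of the common causes removes the correlation:

P(A, B | C) = P(A | C) P(B | C).

RCCP is then the claim that any correlation between A and B is due either to a direct causal link or to a common cause C that screens them off in this way.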


In the above example, the complete knowledge of the common cause allowed me to perfectly determine whether the holes would spout or not. The conditional independence of these two events is therefore guaranteed. One might wonder why I did not talk about the principle of predetermination: conditional on complete knowledge of the common causes, the events are determined with certainty. The reason is that predetermination might be too strong; it may be that there exist phenomena that are irreducibly random, such that even a full knowledge of the common causes does not suffice to determine the resulting events with certainty.

As another example, consider two river beds on a mountain slope, one on the left and one on the right. Usually (96% of the time) it does not rain on the mountain and both rivers are dry. If it does rain on the mountain, then there are four possibilities with equal likelihood: (i) the river beds both remain dry, (ii) the left river flows but the right one is dry, (iii) the right river flows but the left is dry, or (iv) both rivers flow. Thus, without knowing anything else, the fact that one river is running makes it more likely that the other one is. However, given that it rained on the mountain, knowing that the left river is flowing (or dry) does not tell me anything about whether the right river is flowing or dry. So, it seems that after conditioning on the common cause (rain on the mountain) the probabilities factorise: knowing about one river tells me nothing about the other.
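Just to check the arithmetic, here is a quick sketch (my own toy version of the numbers above, nothing more) confirming that the rivers are correlated overall, but independent once we condition on the rain:

```python
from itertools import product

def joint():
    """Yield (rain, left_flows, right_flows, probability) for every outcome."""
    yield (False, False, False, 0.96)            # no rain: both rivers dry
    for left, right in product([False, True], repeat=2):
        yield (True, left, right, 0.04 * 0.25)   # rain: four patterns, equally likely

def prob(event):
    return sum(p for (rain, left, right, p) in joint() if event(rain, left, right))

p_left  = prob(lambda rain, l, r: l)          # 0.02
p_right = prob(lambda rain, l, r: r)          # 0.02
p_both  = prob(lambda rain, l, r: l and r)    # 0.01 > 0.02 * 0.02, so correlated

p_rain    = prob(lambda rain, l, r: rain)                         # 0.04
p_l_rain  = prob(lambda rain, l, r: rain and l) / p_rain          # 0.5
p_r_rain  = prob(lambda rain, l, r: rain and r) / p_rain          # 0.5
p_lr_rain = prob(lambda rain, l, r: rain and l and r) / p_rain    # 0.25 = 0.5 * 0.5

print(p_both, p_left * p_right)        # correlated without conditioning
print(p_lr_rain, p_l_rain * p_r_rain)  # factorised after conditioning on rain
```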


Now we have a situation in which the common cause does not completely determine the outcomes of the events, but where the probabilities nevertheless factorise. Should we then conclude that the correlations are explained? If we answer ‘yes’, we have fallen into a trap.

The trap is that there may be additional information which, if discovered, would make the rivers become correlated. Suppose I find a meeting point of the two rivers further upstream, in which sediment and debris tends to gather. If there is only a little debris, it will be pushed to one side (the side chosen effectively at random), diverting water to one of the rivers and blocking the other. Alternatively, if there is a large build-up of debris, it will either dam the rivers, leaving them both dry, or else be completely destroyed by the build-up of water, feeding both rivers at once. Now, if I know that it rained on the mountain and I know how much debris is present upstream, knowing whether one river is flowing will provide information about the other (eg. if there is a little debris upstream and the right river is flowing, I know the left must be dry).
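Extending the toy sketch from before (again, my own illustrative model of the story, with the debris equally likely to be small or large whenever it rains), conditioning on both the rain and the debris brings the correlation back:

```python
def joint_with_debris():
    """Yield (weather, debris, left_flows, right_flows, probability)."""
    yield ("no rain", None, False, False, 0.96)
    for l, r in [(True, False), (False, True)]:
        yield ("rain", "small", l, r, 0.04 * 0.5 * 0.5)  # small debris diverts water to one side at random
    for l, r in [(False, False), (True, True)]:
        yield ("rain", "large", l, r, 0.04 * 0.5 * 0.5)  # large debris either dams both rivers or feeds both

def prob(event):
    return sum(p for (w, d, l, r, p) in joint_with_debris() if event(w, d, l, r))

# Condition on rain AND small debris: the rivers become (anti-)correlated again.
p_cond  = prob(lambda w, d, l, r: w == "rain" and d == "small")
p_left  = prob(lambda w, d, l, r: w == "rain" and d == "small" and l) / p_cond        # 0.5
p_right = prob(lambda w, d, l, r: w == "rain" and d == "small" and r) / p_cond        # 0.5
p_both  = prob(lambda w, d, l, r: w == "rain" and d == "small" and l and r) / p_cond  # 0.0

print(p_both, p_left * p_right)  # 0.0 vs 0.25: one river now tells you about the other
```

Conditioning on rain and large debris gives the opposite extreme: the rivers are then perfectly correlated. Either way, the factorisation that held after conditioning on the rain alone is gone.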


Before I knew anything, the rivers seemed to be correlated. Conditional on whether it rained on the mountain-top, the correlation disappeared. But now, conditional on the rain and on the amount of debris upstream, the correlation is restored! If the only tools I had to explain correlations were the PCC and the FP, how could I ever be sure that the explanation is complete? Unless the information of the common cause is enough to predetermine the outcomes of the events with certainty, there is always the possibility that the correlations have not been explained, because new information about the common causes might come to light which renders the events correlated again.

Now, at last, we come to the main point. In our classical world-view, observations tend to be compatible with predetermination. No matter how unpredictable or chaotic a phenomenon seems, we find it natural to imagine that every observed fact could be predicted with certainty, in principle, if only we knew enough about its relevant causes. In that case, we are right to say that a correlation has not been fully explained unless Reichenbach’s principle is satisfied. But this last property is now just seen as a trivial consequence of predetermination, implicit in our world-view. In fact, Reichenbach’s principle is not sufficient to guarantee that we have found an explanation. We can only be sure that the explanation has been found when the observed facts are fully determined by their causes.

This poses an interesting problem to anyone (like me) who thinks the world is intrinsically random. If we give up predetermination, we have lost our sufficient condition for correlations to be explained. Normally, if we saw a correlation, after eliminating the possibility of a direct cause we would stop searching for an explanation only when we found one that could perfectly determine the observations. But if the world is random, then how do we know when we have found a good enough explanation?

In this case, it is tempting to argue that Reichenbach’s principle should be taken as a sufficient (not just necessary) condition for an explanation. Then, we know to stop looking for explanations as soon as we have found one that causes the probabilities to factorise. But as I just argued with the example of the two rivers, this doesn’t work. If we believed this, then we would have to accept that it is possible for an explained correlation to suddenly become unexplained upon the discovery of additional facts! Short of a physical law forbidding such additional facts, this makes for a very tenuous notion of explanation indeed.

The question of what should constitute a satisfactory explanation for a correlation is, I think, one of the deepest problems posed to us by quantum mechanics. The way I read Bell’s theorem is that (assuming that we accept the theorem’s basic assumptions) quantum mechanics is either non-local, or else it contains correlations that do not satisfy the factorisation part of Reichenbach’s principle. If we believe that factorisation is a necessary part of explanation, then we are forced to accept non-locality. But why should factorisation be a necessary requirement of explanation? It is only justified if we believe in predetermination.

A critic might try to argue that, without factorisation, we have lost all ability to explain correlations. But I’m saying that this is true even for those who would accept factorisation but reject predetermination. I say, without predetermination, there is no need to hold on to factorisation, because it doesn’t help you to explain correlations any better than the rest of us non-determinists! So what are we to do? Maybe it is time to shrug off factorisation and face up to the task of finding a proper explanation for quantum correlations.

Jacques Pienaar’s guide to making physics (Pt.1)

(Not to be confused with using Principals as tools, which is what happens if your school Principal is a tool because he never taught you the difference between a Principal and a principle. Also not to be confused with a Princey-pal, who is a friend that happens to be a Prince).

“These principles are the boldly generalized results of experiment; but they appear to derive from their very generality a high degree of certainty. In fact, the greater the generality, the more frequent are the opportunities for verifying them, and such verifications, as they multiply, as they take the most varied and most unexpected forms, leave in the end no room for doubt.” – Poincaré

One of the great things Einstein did, besides doing physics, was trying to explain to people how to do it as well as he did. Ultimately he failed, because so far nobody has managed to do better than him, but he left us with some really interesting insights into how to come up with new physical theories.

One of these ideas is the concept of using ‘principles’. A principle is a statement about how the world works (or should work), stated in ordinary language. They are not always called principles, but might be called laws, postulates or hypotheses. I am not going to argue about semantics here. Just consider these examples to get a flavour:

The Second Law of Thermodynamics: You can’t build an engine which does useful work and ends up back in its starting position without producing any heat.

Landauer’s principle: You can’t erase information without producing heat.

The Principle of Relativity: It is impossible to tell by local experiments whether or not your laboratory is in uniform motion.

And some not strictly physics ones:

Shirky’s law: Institutions will try to preserve the problem to which they are the solution.

Murphy’s law: If something can go wrong, it will go wrong.

Stigler’s law: No scientific discovery is named after its original discoverer (this law was actually discovered by R.K. Merton, not Stigler).

Parkinson’s law: Work always expands to fill up the time allocated to doing it.
(See Wikipedia’s list of eponymous laws for more).

You’ll notice that principles are characterised by two main things: they ring true, and they are vague. Both of these properties are very important for their use in building theories.

Now I can practically hear the lice falling out as you scratch your head in confusion. “But Jacques! How can vagueness be a useful thing to have in a Principle? Shouldn’t it be made as precise as possible?”

No, doofus. A Principle is like an apple. You know what an apple is right?


Well, you think you do. But if I were to ask you, what colour is an apple, how sweet is an apple, how many worms are in an apple, you would have to admit that you don’t know, because the word “apple” is too vague to answer those questions. It is like asking how long is a piece of string. Nevertheless, when you want to go shopping, it suffices to say “buy me an apple” instead of “buy me a Malus domestica, reflective in the 620-750 nanometer range, ten percent sugar, one percent Cydia pomonella”.

The only way to make a principle more precise is within the context of a precise theory. But then how would I build a new theory, if I am stuck using the language of the old theory? I can make the idea of an apple more precise using the various scientifically verified properties that apples are known to have, but all of that stuff had to come after we already had a basic vague understanding of what an “apple” was, e.g. a kind of round-ish thing on a tree that tastes nice when you eat it.

The vagueness of a principle means that it defines a whole family of possible theories, these being the ones that kind of fit with the principle if you take the right interpretation. On one hand, a principle that is too vague will not help you to make progress, because it will be too easy to make it fit with any future theory; on the other hand, a principle that is not vague enough will leave you stuck for choices and unable to progress.

The next aspect of a good principle is that it “rings true”. In other words, there is something about it that makes you want it to be true. We want our physical theories to be intuitive to our soft, human brains, and these brains of ours have evolved to think about the world in very specific terms. Why do you think physics seems to be all about the locations of objects in space, moving with time? There are infinitely many ways to describe physics, but we choose the ones we do because of the way our physical senses work, the way our bodies interact with the world, and the things we needed to do in order to survive up to this point. What is the principle of least action? It is a river flowing down a mountain. What is Newtonian mechanics? It is animals moving on the plains. We humans need to see the world in a special way in order to understand it, and good principles are what allow us to shoehorn abstract concepts like thermodynamics and gravitational physics into a picture that looks familiar to us, that we can work with.

That’s why a good principle has to ring true — it has to appeal to the limited imaginative abilities of us humans. Maybe if we were different animals, the laws of physics would be understood in very different terms. Like, the Newtonian mechanics of snakes would start with a simple model of objects moving along snake-paths in two dimensions (the ground), and then go from there to arbitrary motions and higher dimensions. So intelligent snakes might have discovered Fourier analysis way before humans would have, just because they would have been more used to thinking in wavy motions instead of linear motions.


So you see, coming up with good principles is really an art form that requires you to be deeply in touch with your own humanity. Indeed, principle-finding is part of the great art of generating hypotheses. It is a pity that many scientists don’t practice hypothesis generation enough to realise that it is an art (or maybe they don’t practice art enough?). It is also ironic that science tries so hard to eliminate the human element from the theories, when it is so apparent in the final forms of the theories themselves. It is just like an artist who trains so hard to hide her brush strokes, to make the signature of her hand invisible, even though the subject of the painting is her own face.

Ok, now that we know what principles are, how do we find them? One of the best ways is by the age-old method of Induction. How does induction work? It really deserves its own post, but here it is in a nutshell. Let’s say that you are a turkey, and you observe that whenever the farmer makes a whistle, there is some corn in your bowl. So, being a smart turkey, you might decide to elevate this empirical pattern to a general principle, called the Turkey Principle: whenever the farmer whistles, there is corn in your bowl. BOOM, induction!

Now, what is the use of this principle? It helps you to narrow down which theories are good and which are bad. Suppose one day the farmer whistles but you discover there is no corn in the bowl, but rather rice. With your limited turkey imagination, you are able to come up with three hypotheses to explain this. 1. There was corn in the bowl when the farmer whistled, but then somebody came along and replaced it with rice; 2. the Turkey Principle should be amended to the Weak Turkey Principle, which states that when the farmer whistles, food, but not necessarily corn, will be in the bowl; 3. the contents of the bowl are actually independent of the farmer’s whistling, and the apparent link between these phenomena is just a coincidence. Now, with the aid of the Principle, we can see that there is a clear preference for hypothesis 1 over 2, and for 2 over 3, according to the extent that each hypothesis fits with the Turkey Principle.

This example makes it clear that deciding which patterns to upgrade to general principles, and which to regard as anomalies, is again a question of aesthetics and artistry. A more perceptive turkey might observe that the farmer is not a simple mechanistic process, but a complex and mysterious system, and therefore may not be subject to such strong constraints with regards to his whistling and corn-giving behaviour as are implied by the Turkey Principle. Indeed, were the turkey perceptive enough to guess at the farmer’s true motives, he might start checking the tool shed to see if the axe is missing before running to the food bowl every time the farmer whistles. But this turkey would no doubt be working on hypotheses of his own, motivated by principles of his own, such as the Farmer-is-Not-to-be-Trusted Principle (in connection with the observed correlation of turkey disappearances and family dinner parties).

An example more relevant to physics is Einstein’s Equivalence Principle: that no local experiment can distinguish a laboratory accelerating uniformly in empty space from one at rest in a gravitational field. The principle is vague, as you can see by the number of variations, interpretations, and Weak and Strong versions that exist in the literature; but undoubtedly it rings true, since it appears to be obeyed by all but the most esoteric phenomena, and it gels nicely with the Principle of Relativity. While the Equivalence Principle was instrumental in leading to General Relativity, it is a matter of debate how it should be formulated within the theory, and whether or not it is even true. Much like hammers and saws are needed to make a table, but are not needed after the table is complete, we use principles to make theories and then we set them aside when the theory is complete. The final theory makes predictions perfectly well without needing to refer to the principles that built it, and the principles are too vague to make good predictions on their own. (Sure, with enough fiddling around, you can sit on a hammer and eat food off a saw, but it isn’t really comfortable or easy.)

For more intellectual reading on principle theories, see the SEP entry on Einstein’s Philosophy of Science, and Poincaré’s excellent notes.

Wigner has no friends in space

The title phrase of this post is taken from an article by Seth Lloyd that appeared on today’s arXiv, entitled “Analysis of a work of quantum art“. Lloyd was talking about an artwork in collaboration with artist Diemut Strebe, called `Wigner’s friends‘ in which a pair of telescopes are separated, one remaining on Earth and the other going to the International Space Station. According to Lloyd, Strebe motivates the work by appealing to the concepts of quantum superposition and entanglement, referring to physicist Eugene Wigner’s famous thought experiment in which one experimenter, Wigner’s friend, finds herself in a superposition prior to Wigner’s measurement. In Strebe’s scenario, both telescopes are aimed at interstellar space, and it is the viewers of the exhibition that are held responsible for collapsing the superposition of the orbiting telescope by observing the image on the ground-based telescope. The idea is that, since there is nobody looking at the orbiting telescope, the image on its CCD array initially exists in a quantum superposition of all possible artworks; hence Wigner has no friends in space. Before I discuss this intriguing work, let me first start a new art movement.

I was doing my PhD at the University of Queensland when my friend Aggie (also a PhD at that time) came to me with an intriguing problem. She needed to integrate a function over a certain region of three-dimensional space. This region could be obtained by slicing corners off a cube in a certain way, but Aggie was finding it impossible to visualize what the resulting shape would look like. Even after doing a 3D plot in Mathematica, she felt that there was something missing from the flattened projections that one had to click-and-drag to rotate. She wanted to know if I’d ever seen this shape before, and if I could maybe draw it for her or make one out of paper and glue (Weirdly, I have always had an undeserved reputation for drawing and origami). I did my best with paper and sticky-tape, but it didn’t quite come out right, so I gave up. In the end, she went and bought some plasticine and made a cube, then cut off the corners until she got the shape she wanted. Now that she could hold it in her hands, she finally felt that she understood just what she was dealing with. She went back to her computer to perform the integration.

At the time, it did not occur to me to ask “Is it art?” While its form was elegant, it was there to serve a practical purpose, namely to help Aggie (who probably did not once suspect that she was doing Art) in her calculation by condensing certain abstract ideas into a concrete form.

Soft Cube
© Malcolm Wright

Disclaimer: Before continuing, please note that I reject the idea that there can be a universal definition of Art. I further reject the (often claimed) corollary that therefore anything and everything can be Art. Instead, I posit that there are many different Arts, and just like living species, they are continually springing into existence, evolving into new forms, and going extinct. Just as a discussion about “what is a species” can lead to interminable arguments, I posit that it is much better and more constructive to discuss “what is a lion?” Here, I am going to talk about, and attempt to define, something that might be called Science-Art, Technologism, Scientism, or something like that. Let’s go with ‘Zappism’, because it reminds me of things that supposedly go ‘zap’, but really don’t, like lasers.

So what is Zappism? Let me give some examples of what it is and what it is not. Every now and then, there are Art in Science exhibitions where academic researchers submit images of pretty things that they encountered in the course of their research. I include in this category colourful images of fractals, decorated graphs of pretty mathematical functions, astrophysical images of planets and stars and things, and basically anything where a scientist was just mucking around and noticed something beautiful and then made it into a graphic. For this stuff I would suggest the name “Scientific Found Art”, but it is not Zappism.

© Jonathan McCabe. An example of scientific found art.

Aggie’s shape might seem at first to fit the bill of found art, but there is a crucial difference: were the shape not pretty, it still would have served its purpose, which was to explore, in material form, scientific ideas that would otherwise have been elusive and abstract. A computer simulation of a fractal does not serve this purpose unless one also comes to understand the fractal better as a consequence of the simulation, and I’m not convinced this is true any more than one can understand a sentence better by writing it out in binary and then colouring it in.

Zappism is the art of taking some kind of medium — be it painting, film, music, literature or something else — and using it to transform some ethereal and ungraspable Platonisms of science into things the human mind can more readily play with. Sometimes something is lost in translation, like adding unscientific ‘zap’ sounds to lasers, but this is acceptable as long as the core idea is translated — in the case of lasers, the idea that light can be focused into beams that can burn through things.

Many episodes of Star Trek exhibit Zappism. In the episode ‘Tuvix’, the transporter merges two crew members into a single person, an incident that is explicitly explained by appealing to the way the transporter recombines matter. Similarly, Cronenberg’s film The Fly is classic Zappism, as is Spielberg’s Jurassic Park. Indeed, almost any science fiction that uses science in an active way can’t help but be Zappist. Science fiction can still fail to be Zappist if it uses the science as a kind of gloss or sugar-coating, instead of engaging with the science as a main ingredient. Star Wars is not really Zappist because it is not concerned with the mechanisms of the technology invoked. Luke and Darth might as well be using swords and riding on flying horses for all the story cares, making it more like Science Fantasy. (Why do lightsabers simply stop at a convenient sword-length?)

A science fiction movie can always ignore inconvenient facts, like conservation of momentum, or how there is no sound in space. These annoying truths are often seen as getting in the way of good action and drama. The truth is the opposite: it takes a creative leap of genius to see how to use these facts to the advantage of dramatic effects. The recent film Coherence does a brilliant job of using the idea of Schrodinger’s Cat to create a tense and frightening scenario. When film, art and storytelling are able to incorporate physical law in a natural and graspable way, we are one step closer to connecting the public to cutting-edge science.

Actress Emily Baldoni grapples with Schrödinger’s equation in Coherence.

On the non-cinematic side, Koen Vanmechelen’s breeding program for cosmopolitan chickens, Maguire and collaborators’ epic project ‘Dr. Brainlove‘, and Theo Jansen’s Strandbeest could all be called examples of Zappism. But perhaps the most revealing examples are those that do not explicitly use physical technology for the scientific motive, but instead use abstract ideas. For these I cite Dali’s Persistence of Memory (and its Disintegration) with their roots in Relativity theory and Quantum Mechanics; the book Flatland by Edwin Abbott; Alice in Wonderland by Carroll; Gödel, Escher, Bach: An Eternal Golden Braid by Hofstadter, and similar books that bring abstract scientific or mathematical ideas into an imaginable form. A truly great work of Zappism was the invention of the Rubik’s Cube, by the Hungarian architect and designer Ernő Rubik. Rubik conceived the cube as a solution to a more abstract structural design problem of how to rotate the parts of a cube in all three dimensions while keeping the parts connected.

Returning now to Strebe’s artwork `Wigner’s friends’, it should be remarked that the artwork is not a scientific experiment and there is no actual demonstration of quantum coherence between the telescopes. However, Seth Lloyd for some reason seems intent on defending the idea that maybe, just maybe, there is some tiny smidgen of possibility that there is something quantum going on in the experiment. I understand his enthusiasm: I also think it is a very cool artwork, and somehow the whole point of the artwork is its reference to quantum mechanics. But in order to plausibly say that something quantum was really going on in Strebe’s artwork, Lloyd is forced to invoke the Many Worlds interpretation, which to me is tantamount to begging the question — under that assumption isn’t my cheese sandwich also in a quantum superposition?

I don’t see why all this is necessary: when Dali painted the Disintegration of the Persistence of Memory, nobody was scrambling to argue that his oil paint was in a quantum superposition on the canvas. It would be just as absurd as insisting that Da Vinci’s portrait of the Mona Lisa actually contained a real person. There is a sense in which the artistic representation of a person is bound to physics — it is constrained to some extent by the way physical masses compose in three dimensional space — but the art of correct representation is not to be confused with the real thing. Even Mondrian, whose works were famously highly abstract, insisted that he was bound to the true representation of Nature as he saw it [1]. To me, Strebe’s artwork is a representation of quantum mechanics, put into a physical and graspable form, and that is what makes it Zappism. But is it good Zappism? That depends on whether the audience feels any closer to understanding quantum mechanics after the experience.

[1] “The masses generally find my work rather vague. I construct lines and color combinations on a flat surface, in order to express general beauty with the utmost awareness. Nature (or that which I see) inspires me . . . but I want to come as close as possible to the truth…” Source:

Ten Rules for Research

I see a lot of articles out there giving advice in the form of a list of rules. People have a fascination with rule lists. You’ve got the rules of Fight Club, the writer who uses a personal formula, policemen who follow “The Book” to the letter, gangsters with a personal code of ethics, and so on. So here’s my list of rules for being a scientist.

1. Keep reading everything.

2. The value of public speaking skills cannot be overestimated.

3. Remember the big questions that got you here in the first place.

4. Take philosophy seriously, but only the parts you can understand.

5. Sometimes, you just have to shut up and calculate.

6. Don’t distract yourself from the things you don’t know by working on things you do know.

7. The best defense against politics is integrity and a smile.

8. The more certain you are of a result, the more you should double check it.

9. If you aren’t curious to know the result of a calculation, it isn’t worth doing it.

10.  Ask dumb questions. If you are truly an idiot, you’ll be found out eventually, so you might as well satisfy your curiosity in the meantime.

In the end, I think Rule 1 is most important.  So, you should go and read Michael Nielsen’s classic advice to researchers, which is far more eloquent than the garbage you read on my blog.

Calvin and Hobbes
© 2013 Bill Watterson

Time-travel, decoherence, and satellites.

I recently returned to my roots, contributing to a new paper with Tim Ralph (who was my PhD advisor) on the very same topic that formed a major part of my PhD. Out of laziness, let me dig up the relevant information from an earlier post:

“The idea for my PhD thesis comes from a paper that I stumbled across as an undergraduate at the University of Melbourne. That paper, by Tim Ralph, Gerard Milburn and Tony Downes of the University of Queensland, proposed that Earth’s own gravitational field might be strong enough to cause quantum gravity effects in experiments done on satellites. In particular, the difference between the strength of gravity at ground-level and at the height of the orbiting satellite might be just enough to make the quantum particles on the satellite behave in a very funny non-linear way, never before seen at ground level. Why might this happen? This is where the story gets bizarre: the authors got their idea after looking at a theory of time-travel, proposed in 1991 by David Deutsch. According to Deutsch’s theory, if space and time were bent enough by gravity to create a closed loop in time (aka a time machine), then any quantum particle that travelled backwards in time ought to have a very peculiar non-linear behaviour. Tim Ralph and co-authors said: what if there was only a little bit of space-time curvature? Wouldn’t you still expect just a little bit of non-linear behaviour? And we can look for that in the curvature produced by the Earth, without even needing to build a time-machine!”

Artistic view of matter in quantum superposition on curved space-time. Image courtesy of Jonas Schmöle, Vienna Quantum Group.

In our recent paper in New Journal of Physics, for the special Focus on Gravitational Quantum Mechanics, Tim and I re-examined the `event formalism’ (the fancy name for the nonlinear model in question) and we derived some more practical numerical predictions and ironed out a couple of theoretical wrinkles, making it more presentable as an experimental proposal. Now that there is growing interest in quantum gravity phenomenology — that is, testable toy models of quantum gravity effects — Tim’s little theory has an excitingly real chance of being tested and proven either right or wrong. Either way, I’d be curious to know how it turns out! On one hand, if quantum entanglement survives the test, the experiment would stand as one of the first real confirmations of quantum field theory in curved space-time. On the other hand, if the entanglement is destroyed by Earth’s gravitational field, it would signify a serious problem with the standard theory and might even confirm our alternative model. That would be great too, but also somewhat disturbing, since non-linear effects are known to have strange and confusing properties, such as violating the fabled uncertainty principle of quantum mechanics.

You can see my video debut here, in which I give an overview of the paper, complete with hand-drawn sketches!


(Actually there is a funny story attached to the video abstract. The day I filmed the video for this, I had received a letter informing me that my application for renewal of my residence permit in Austria was not yet complete — but the permit itself had expired the previous day! As a result, during the filming I was half panicking at the thought of being deported from the country. In the end it turned out not to be a problem, but if I seem a little tense in the video, well, now you know why.)

Why does matter curve space and time?

This is one of those questions that has always bugged me.

Suppose that, somewhere in the universe, there is a very large closed box made out of some kind of heavy, neutral matter. Inside this box a civilisation of intelligent creatures has evolved. They are made out of normal matter like you and me, except that for some reason they are very light — their bodies do not contain much matter at all. What’s more, there are no other heavy bodies or planets inside this large box aside from the population of aliens, whose total mass is too small to have any noticeable effect on the gravitational field. Thus, the only gravitational field that the aliens are aware of is the field created by the box itself (I’m assuming there are no other massive bodies near to the box).

Setting aside the obvious questions about how these aliens came to exist without an energy source like the sun, and where the heck the giant box came from, I want to examine the following question: in principle, is there any way that these aliens could figure out that matter is the source of gravitational fields?

Now, to make it interesting, let us assume the density of the box is not uniform, so there are some parts of its walls that have a stronger gravitational pull than others. Our aliens can walk around on these parts of the walls, and in some parts the aliens even become too heavy to support their own weight and get stuck until someone rescues them. Elsewhere, the walls of the box are low density and so the gravitational attraction to them is very weak. Here, the aliens can easily jump off and float away from the wall. Indeed, the aliens spend much of their time floating freely near the center of the box where the gravitational fields are weak. Apart from that, the composition of the box itself does not change with time and the box is not rotating, so the aliens are quickly able to map out the constant gravitational field that surrounds them inside the box, with its strong and weak points.

Like us, the aliens have developed technology to manipulate the electromagnetic field, and they know that it is the electromagnetic forces that keep their bodies intact and stop matter from passing through itself. More importantly, they can accelerate objects of different masses by pushing on them, or applying an electric force to charged test bodies, so they quickly discover that matter has inertia, measured by its mass. In this way, they are able to discover Newton’s laws of mechanics. In addition, their experiments with electromagnetism and light eventually lead them to upgrade their picture of space-time, and their Newtonian mechanics is replaced by special relativistic mechanics and Maxwell’s equations for the electromagnetic field.

So far, so good! Except that, because they do not observe any orbiting planets or moving gravitating bodies (their own bodies being too light to produce any noticeable attractive forces), they still have not reproduced Newtonian gravity. They know that there is a static field permeating space-time, called the gravitational field, that seems to be fixed to the frame of the box — but they have no reason to think that this gravitational force originates from matter. Indeed, there are two philosophical schools of thought on this. The first group holds that the gravitational field is to be thought of analogously to the electromagnetic field, and is therefore sourced by special “gravitational charges”. It was originally claimed that the material of the box itself carries gravitational charge, but scrapings of the material from the box revealed it to be the same kind of matter from which the aliens themselves were composed (let’s say Carbon) and the scrapings themselves seemed not to produce any gravitational fields, even when collected together in large amounts of several kilograms (a truly humungous weight to the minds of the aliens, whose entire population combined would only weigh ten kilograms). Some aliens pointed out that the gravitational charge of Carbon might be extremely weak, and since the mass of the entire box was likely to be many orders of magnitude larger than anything they had experienced before, it is possible that its cumulative charge would be enough to produce the field. However, these aliens were criticised for making ad-hoc modifications to their theory to avoid its obvious refutation by the kilograms-of-Carbon experiments. If gravity is analogous to the electromagnetic force — they were asked with a sneer — then why should it be so much weaker than electromagnetism? It seemed rather too convenient.

Some people suggested that the true gravitational charge was not Carbon, but some other material that coated the outside of the box. However, these people were derided even more severely than were the Carbon Gravitists (as they had become known). Instead, the popular scientific consensus shifted to a modern idea in which the gravitational force was considered to be a special kind of force field that simply had no source charges. It was a God-given field whose origin and patterns were not to be questioned but simply accepted, much like the very existence of the Great Box itself. This school of thought gained great support when someone made a great discovery: the gravitational force could be regarded as the very geometry of spacetime itself.

The motivation for this was the peculiar observation, long known but never explained, that massive bodies always had the same acceleration in the gravitational field regardless of their different masses. A single alien falling towards one of the gravitating walls of the box would keep speed perfectly with a group of a hundred aliens tied together, despite their clearly different masses. This dealt a crushing blow to the remnants of the Carbon Gravitists, for it implied that the gravitational charge of matter was exactly proportional to its inertial mass. This coincidence had no precedent in electromagnetism, where it was known that bodies of the same mass could have very different electric charges.

Under the new school of thought, the gravitational force was reinterpreted as the background geometry of space-time inside the box, which specified the inertial trajectories of all massive bodies. Hence, the gravitational force was not a force at all, so it was meaningless to ascribe a “gravitational charge” to matter. Tensor calculus was developed as a natural extension of special relativity, and the aliens derived the geodesic equation describing the motion of matter in a fixed curved space-time metric. The metric of the box was mapped out with high precision, and all questions about the universe seemed to have been settled.
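For the record, the equation the aliens would have arrived at is just the standard geodesic equation of the textbooks (nothing specific to their box),

\[ \frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu_{\alpha\beta}\,\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau} = 0, \]

where the connection coefficients \( \Gamma^\mu_{\alpha\beta} \) are fixed once and for all by the metric of the box, with no term referring to the matter that moves on it.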

Well, almost all. Some troublesome philosophers continued to insist that there should be some kind of connection between space-time geometry and matter. They wanted more than just the well-known description of how geometry caused matter to move: they tried to argue that matter should also tell space-time how to curve.

“Our entire population combined only weighs a fraction of the mass of the box. What would happen if there were more matter available to us? What if we did the Carbon-kilogram experiment again, but with 100 kilograms? Or a million? Surely the presence of such a large amount of matter would have an effect on space-time itself?”

But these philosophers were just laughed at. Why should any amount of matter affect the eternal and never-changing space-time geometry? Even if the Great Box itself were removed, the prevailing thought was that the gravitational field would remain, fixed as it was in space-time and not to any material source. So they all lived happily ever after, in blissful ignorance of the gravitational constant G, planetary orbits, and other such fantasies.


Did you find this fairytale disturbing? I did. It illustrates what I think is an under-appreciated and uncomfortable feature of our best theories of gravity: they all take the fact that matter generates gravity as a premise, without justification apart from empirical observation. There’s nothing strictly wrong with this — we do essentially the same thing in special relativity when we take the speed of light to be constant regardless of the motion of its source, historically an empirically determined fact (and one that was found quite surprising).

However, there is a slight difference: one can in principle argue on philosophical grounds that the speed of light should be reference-frame independent, without appealing to empirical observations. Roughly, the relativity principle states that the laws of physics should be the same in all frames of motion, and among those laws we can include Maxwell’s equations for the electromagnetic field, from which the speed of light can be derived in terms of the electric and magnetic constants of the vacuum. As far as I know, there is no similar philosophical grounding for the connection between matter and geometry as embodied by the gravitational constant, and hence no compelling reason for our hypothetical aliens to ever believe that matter is the source of space-time geometry.
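Concretely, Maxwell’s equations fix the speed of electromagnetic waves in vacuum in terms of two measurable constants,

\[ c = \frac{1}{\sqrt{\mu_0 \varepsilon_0}}, \]

so if those equations are to hold in every inertial frame, the same value of c must hold in every inertial frame too. Nothing comparable pins down Newton’s constant G as the coupling between matter and geometry.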

Could it be that there is an essential piece missing from our accounts of the connection between matter and space-time? Or are our aliens doomed by their unfortunately contrived situation, never to deduce the complete laws of the universe?

Skin Deep, by Xetobyte
Image Credit: Xetobyte


Stop whining and accept these axioms.

One of the stated goals of quantum foundations is to find a set of intuitive physical principles that can be stated in plain language, from which the essential structure of quantum mechanics can be derived.

So what exactly is wrong with the axioms proposed by Chiribella et al. in arXiv:1011.6451? Loosely speaking, the principles state that information should be localised in space and time, that systems should be able to encode information about each other, and that every process should in principle be reversible, so that information is conserved. The axioms can all be explained using ordinary language, as demonstrated in the sister paper arXiv:1209.5533. They all pertain directly to the elements of human experience, namely, what real experimenters ought to be able to do with the systems in their laboratories. And they all seem quite reasonable, so that it is easy to accept their truth. This is essential, because it means that the apparently counterintuitive behaviour of QM is directly derivable from intuitive principles, much as the counterintuitive aspects of special relativity follow as logical consequences of its two intuitive axioms, the constancy of the speed of light and the relativity principle. Given these features, maybe we can finally say that quantum mechanics makes sense: it is the only way that the laws of physics can lead to a sensible model of information storage and communication!

Let me run through the axioms briefly (note to the wise: I take the `causality’ axiom as implicit, and I’ve changed some of the names to make them sound nicer). I’ll assume the reader is familiar with the distinction between pure states and mixed states, but here is a brief summary. Roughly, a pure state describes a system about which you have maximum information, whereas a mixed state can be interpreted as uncertainty about which pure state the system is really in. Importantly, a pure state does not need to determine the outcomes to every measurement that could be performed on it: even though it contains maximal information about the state, it might only specify the probabilities of what will happen in any given experiment. This is what we mean when we say a theory is `probabilistic’.
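In standard density-matrix notation (textbook quantum mechanics, not anything specific to the paper), a mixed state is a probabilistic mixture of pure states,

\[ \rho = \sum_i p_i\, |\psi_i\rangle\langle\psi_i|, \qquad p_i \ge 0, \quad \sum_i p_i = 1, \]

with a pure state being the special case of a single term. Even for a pure state \( |\psi\rangle \), the Born rule \( P(k) = \langle\psi|\Pi_k|\psi\rangle \) generally assigns only probabilities to measurement outcomes, which is the sense in which the theory is probabilistic.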

First axiom (Distinguishability): if there is a mixed state, for which there is at least one pure state that it cannot possibly be with any probability, then the mixed state must be perfectly distinguishable from some other state (presumably, the aforementioned one). It is hard to imagine how this rule could fail: if I have a bag that contains either a spider or a fly with some probability, I should have no problem distinguishing it from a bag that contains a snake. On the other hand, I can’t so easily tell it apart from another bag that simply contains a fly (at least not in a single trial of the experiment).

Second axiom (Compression): If a system contains any redundant information or `extra space’, it should be possible to encode it in a smaller system such that the information can be perfectly retrieved. For example, suppose I have a badly edited book containing multiple copies of some pages, and a few blank pages at the end. I should be able to store all of the information written in the book in a much smaller book, without losing any information, just by removing the redundant copies and blank pages. Moreover, I should be able to recover the original book by copying pages and adding blank pages as needed. This seems like a pretty intuitive and essential feature of the way information is encoded in physical systems.

Third axiom (Locality of information): If I have a joint system (say, of two particles) that can be in one of two different states, then I should be able to distinguish the two different states over many trials, by performing only local measurements on each individual particle and using classical communication. For example, we allow the local measurements performed on one particle to depend on the outcomes of the local measurements on the other particle. On the other hand, we do not need to make use of any other shared resources (like a second set of correlated particles) in order to distinguish the states. I must admit, out of all the axioms, this one seems the hardest to justify intuitively. What indeed is so special about local operations and classical communication that it should be sufficient to tell different states apart? Why can’t we imagine a world in which the only way to distinguish two states of a joint system is to make use of some other joint system? But let us put this issue aside for the moment.

Fourth axiom (Locality of ignorance): If I have two particles in a joint state that is pure (i.e. I have maximal information about it) and if I measure one of them and find it in a pure state, the axiom states that the other particle must also be in a pure state. This makes sense: if I do a measurement on one subsystem of a pure state that results in still having maximal information about that subsystem, I should not lose any information about the other subsystems during the process. Learning new information about one part of a system should not make me more ignorant of the other parts.

So far, all of the axioms described above are satisfied by classical and quantum information theory. Therefore, at the very least, if any of these axioms do not seem intuitive, it is only because we have not sufficiently well developed our intuitions about classical physics, so it cannot really be taken as a fault of the axioms themselves (which is why I am not so concerned about the detailed justification for axiom 3). The interesting axiom is the last one, `purification’, which holds in quantum physics but not in probabilistic classical physics.

Fifth axiom (Conservation of information) [aka the purification postulate]: Every mixed state of a system can be obtained by starting with several systems in a joint pure state, and then discarding or ignoring all except for the system in question. Thus, the mixedness of any state can be interpreted as ignorance of some other correlated states. Furthermore, we require that the purification be essentially unique: all possible pure states of the total set of systems that do the job must be convertible into one another by reversible transformations.
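In quantum theory this is the familiar purification of a density matrix (again, standard textbook notation): for every mixed state \( \rho_A \) there is a pure state \( |\Psi\rangle_{AB} \) on a larger system such that

\[ \rho_A = \mathrm{Tr}_B\, |\Psi\rangle\langle\Psi|_{AB}, \]

and any two such purifications are related by a reversible transformation acting on the purifying system B alone.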

As stated above, it is not so clear why this property should hold in the world. However, it makes more sense if we consider one of its consequences: every irreversible, probabilistic process can be obtained from a reversible process involving additional systems, which are then ignored. In the same way that statistical mechanics allows us to imagine that we could un-scramble an egg, if only we had complete information about its individual atoms and the power to re-arrange them, the purification postulate says that everything that occurs in nature can be un-done in principle, if we have sufficient resources and information. Another way of stating this is that the loss of information that occurs in a probabilistic process is only apparent: in principle the information is conserved somewhere in the universe and is never lost, even though we might not have direct access to it. The `missing information’ in a mixed state is never lost forever, but can always be accessed by some observer, at least in principle.
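The quantum version of this consequence is the usual dilation picture (standard textbook material, not a claim from the paper): any irreversible evolution of a system can be written as a reversible interaction with an environment that is subsequently ignored,

\[ \mathcal{E}(\rho) = \mathrm{Tr}_E\!\left[\, U \left( \rho \otimes |0\rangle\langle 0|_E \right) U^\dagger \,\right], \]

with U a unitary acting jointly on the system and the environment E.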

It is curious that probabilistic classical physics does not obey this property. Surely it seems reasonable to expect that one could construct a probabilistic classical theory in which information is ultimately conserved! In fact, if one attempts this, one arrives at a theory of deterministic classical physics. In such a theory, having maximal knowledge of a state (i.e. the state is pure) further implies that one can perfectly predict the outcome of any measurement on the state, but this means the theory is no longer probabilistic. Indeed, for a classical theory to be probabilistic in the sense that we have defined the term, it necessarily allows processes in which information is irretrievably lost, violating the spirit of the purification postulate.

In conclusion, I’d say this is pretty close to the mystical “Zing” that we were looking for: quantum mechanics is the only reasonable theory in which processes can be inherently probabilistic while at the same time conserving information.