# Why a density matrix is not a probability distribution

I’m back! And I’m ready to hit you with some really heavy thoughts that have been weighing me down, because I need to get them off my chest.

Years ago, Rob Spekkens and Matt Leifer published an article in which they tried to define a “causally neutral theory of quantum inference”. Their starting point was an analogy between a density matrix (actually the Choi-Jamiołkowski matrix of a CPT map, but hey, let’s not split hairs) and a conditional probability distribution. They argued that in many respects, this “conditional density matrix” could be used to define equations of inference for density matrices, in complete analogy with the rules of Bayesian inference applied to probability distributions.

At the time, something about this idea just struck me as wrong, although I couldn’t quite put my finger on what it was. The paper is not technically wrong, but the idea just felt like the wrong approach to me. Matt Leifer even wrote a superb blog post explaining the analogy between quantum states and probabilities, and I was kind of half convinced by it. At least, my rational brain could find no flaw in the idea, but some deep, subterranean sense of aesthetics could not accept this analogy.

In my paper with Časlav Brukner on quantum causal models, we took a diametrically opposite approach. We refused to deal directly with quantum states, and instead tried to identify the quantum features of a system by looking only at the level of statistics, where the normal rules of inference would apply. The problem is, the price that you pay in working at the level of probabilities is that the structure of quantum states slips through your fingers and gets buried in the sand.

(I like to think of probability distributions as sand. You can push them around any which way, but the total amount of sand stays the same. Underneath the sand, there is some kind of ontological structure, like a dinosaur skeleton or an alien space-ship, whose ridges and contours sometimes show through in places where we brush the sand away. In quantum mechanics, it seems that we can never completely reveal what is buried, because when we clear away sand from one part, we end up dumping it on another part and obscuring it.)

One problem I had with this probability-level approach was that the quantum structure did not emerge the way I had hoped. In particular, I could not find anything like a quantum Reichenbach Principle to replace the old classical Reichenbach Principle, and so the theory was just too unconstrained to be interesting. Other approaches along the same lines tended to deal with this by putting in the quantum stuff by hand, without actually `revealing’ it in a natural way. So I gave up on this for a while.

And what became of Leifer and Spekkens’ approach, the one that I thought was wrong? Well, it also didn’t work out. Their analogy seemed to break down when they tried to add the state of the system at more than two times. To put the last nail in, last year Dominic Horsman et al. presented some work that showed that any approach along the lines of Leifer and Spekkens would run into trouble, because quantum states evolving in time just do not behave like probabilities do. Probabilities are causally neutral, which means that when we use information about one system to infer something about another system, it really doesn’t matter if the systems are connected in time or separated in space. With quantum systems, on the other hand, the difference between time and space is built into the formalism and (apparently) cannot easily be got rid of. Even in relativistic quantum theory, space and time are not quite on an equal footing (and Carlo Rovelli has had a lot to say about this in the past).

Still, after reading Horsman et al.’s paper, I felt that it was too mathematical and didn’t touch the root of the problem with the Leifer-Spekkens analogy. Again, it was a case of technical accuracy but without conveying the deeper, intuitive reasons why the approach would fail. What finally reeled me back in was a recent paper by John-Mark Allen et al. in which they achieve an elegant definition of a quantum causal model, complete with a quantum Reichenbach Principle (even for multiple systems), but at the expense of essentially giving up on the idea of defining a quantum conditional state over time, and hence forfeiting the analogy with classical Bayesian inference. To me, it seemed like a masterful evasion, or like one of those dramatic queen sacrifices you see in chess tournaments. They realized that what was standing in the way of quantum causal models was the desire to represent all of the structure of conditional probability distributions, but that this was not necessary for defining a causal model. So they were able to achieve a quantum causal model with a version of Reichenbach’s Principle, but at the price of retaining only a partial analog of classical Bayesian inference.

This left me pondering that old paper of Leifer and Spekkens. Why did they really fail? Is there really no way to salvage a causally neutral theory of quantum inference? I will do my best to answer only the first question here, leaving the second one open for the time being.

One strategy to use when you suspect something is wrong with an idea, is to imagine a world in which the idea is true, and then enter that world and look around to see what is wrong with it. So let us imagine that, indeed, all of the usual rules of Bayesian inference for probabilities have an exact counterpart in terms of density matrices (and similar objects). What would that mean?

Well, if it looks like a duck and quacks like a duck … we could apply Ockham’s razor and say that a density matrix must actually represent a duck probability distribution. This is great news! It means that we can argue that density matrices are actually epistemic objects — they represent our ignorance about some underlying reality. This could be the chance we’ve been waiting for to sweep away all the sand and reveal the quantum skeleton underneath!

The problem is that this is too good to be true. Why? Because it seems to imply that a density matrix uniquely corresponds to a probability distribution over the elements of reality (whatever they are). In jargon, this means that the resulting model would be preparation non-contextual. But this is impossible because — as Spekkens himself proved in a classic paper — any ontological model of quantum mechanics must be preparation contextual.

Let me try to simplify that. It turns out that there are many different ways to prepare a quantum system, such that it ends up being described by the same statistics (as determined by its density matrix). For example, if a machine prepares one of two possible pure states based on the outcome of a coin flip (whose outcome is unknown to us), this can result in the same density matrix for the system as a machine that deterministically entangles the quantum system with another system (which we don’t have access to). These details about the preparation that don’t affect the density matrix are called the preparation context.
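To make the idea of a preparation context concrete, here is a minimal sketch in plain Python. It assumes the simplest possible case, which is my choice and not something taken from Spekkens' paper: a single qubit, a fair coin choosing between |0> and |1>, and a partner qubit entangled in the state (|00> + |11>)/√2. The helper names (`mixture`, `trace_out_second_qubit`, `matrices_close`) are made up for illustration.

```python
import math

def mixture(weighted_states):
    """Density matrix sum_i p_i |psi_i><psi_i| of a classical mixture of pure states."""
    dim = len(weighted_states[0][1])
    rho = [[0j] * dim for _ in range(dim)]
    for p, psi in weighted_states:
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += p * psi[i] * psi[j].conjugate()
    return rho

def trace_out_second_qubit(psi):
    """Reduced density matrix of the first qubit of a two-qubit pure state.

    Amplitudes are indexed as psi[2*a + b] for the basis state |a>|b>.
    """
    return [[sum(psi[2 * a + b] * psi[2 * a2 + b].conjugate() for b in range(2))
             for a2 in range(2)]
            for a in range(2)]

def matrices_close(m1, m2, tol=1e-12):
    """Entry-by-entry comparison of two matrices given as nested lists."""
    return all(abs(x - y) < tol for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))

# Context 1: a hidden fair coin decides whether |0> or |1> is prepared.
ket0, ket1 = [1 + 0j, 0j], [0j, 1 + 0j]
rho_coin = mixture([(0.5, ket0), (0.5, ket1)])

# Context 2: the qubit is deterministically entangled with a partner
# we cannot access, in the state (|00> + |11>)/sqrt(2).
s = 1 / math.sqrt(2)
bell = [s + 0j, 0j, 0j, s + 0j]
rho_entangled = trace_out_second_qubit(bell)

same_density_matrix = matrices_close(rho_coin, rho_entangled)
print(same_density_matrix)
```

Both contexts give the maximally mixed state, so nothing you can compute from the density matrix alone will tell you which machine was used.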

The thing is, if a density matrix is really interpretable as a probability distribution, then which distribution it represents has to depend on the preparation context (that is what Spekkens proved). Since Leifer and Spekkens are only looking at density matrices (sans context), we should not expect them to behave just like probability distributions — the analogy has to break somewhere.

Now this is nowhere near a proof — that’s why it is appearing here on a dodgy blog instead of in a scientific publication. But I think it does capture the reason why I felt that the approach of Leifer and Spekkens was just `wrong’: they seemed to be placing density matrices into the role of probabilities, where they just don’t fit.

Now let me point out some holes in my own argument. Although a density matrix can’t be thought of as a single probability distribution, it can perhaps be thought of as representing an equivalence class of distributions, and maybe these equivalence classes could turn out to obey all the usual laws of classical inference, thereby rescuing the analogy. However, there is absolutely no reason to expect this to be true — on the contrary, one would expect it to be false. To me, this would be almost like if you tried to model a single atom as if it were a whole gas of particles, and found that it works. Or, to get even more Zen, it would be like an avalanche consisting of one grain of sand.

It is interesting that the analogy can be carried as far as it has been. Perhaps this can be accounted for by the fact that, even though they can’t serve as replacements for probability distributions, density matrices do have a tight relationship with probabilities through the Born rule (the rule that tells us how to predict probabilities for measurements on a quantum system). So maybe we should expect at least some of the properties of probabilities to somehow rub off on density matrices.

Although it seems that a causally neutral theory of Bayesian inference cannot succeed using just density matrices (or similar objects), perhaps there are other approaches that would be more fruitful. What if one takes an explicitly preparation-contextual ontological model (like the fascinating Beltrametti-Bugajski model) and uses it to supplement our density matrices with the context that they need in order to identify them with probability distributions? What sort of theory of inference would that give us? Or, what if we step outside of the ontological models framework and look for some other way to define quantum inference? The door remains tantalizingly open.

# A meditation on physical units: Part 2

[Preface: This is the second part of my discussion of this paper by Craig Holt. It has a few more equations than usual, so strap a seat-belt onto your brain and get ready!]

“Alright brain. You don’t like me, and I don’t like you, but let’s get through this thing and then I can continue killing you with beer.”    — Homer Simpson

Imagine a whale. We like to say that the whale is big. What does that mean? Well, if we measure the length of the whale, say by comparing it to a meter-stick, we will count up a very large number of meters. However, this only tells us that the whale is big in comparison to a meter-stick. It doesn’t seem to tell us anything about the intrinsic, absolute length of the whale. But what is the meaning of `intrinsic, absolute’ length?

Imagine the whale is floating in space in an empty universe. There are no planets, people, fish or meter-sticks to compare the whale to. Maybe we could say that the whale has the property of length, even though we have no way of actually measuring its length. That’s what `absolute’ length means. We can imagine that it has some actual number, independently of any standard for comparison like a meter-stick.

In Craig Holt’s paper, this distinction — between measured and absolute properties — is very important. All absolute quantities have primes (also called apostrophes), so the absolute length of a whale would be written as whale-length’ and the absolute length of a meter-stick is written meter’. The length of the whale that we measure, in meters, can be written as the ratio whale-length’ / meter’. This ratio is something we can directly measure, so it doesn’t need a prime; we can just call it whale-length: it is the number of meter-sticks that equal a whale-length. It is clear that if we were to change all of the absolute lengths in the universe by the same factor, then the absolute properties whale-length’ and meter’ would both change, but the measurable property of whale-length would not change.

Ok, so, you’re probably thinking that it is weird to talk about absolute quantities if we can’t directly measure them — but who says that you can’t directly measure absolute quantities? I only gave you one example where, as it turned out, we couldn’t measure the absolute length. But one example is not a general proof. When you go around saying things like “absolute quantities are meaningless and therefore changes in absolute quantities can’t be detected”, you are making a pretty big assumption. This assumption has a name: it is called Bridgman’s Principle (see the last blog post).

Bridgman’s Principle is the reason why at school they teach you to balance the units on both sides of an equation. For example, `speed’ is measured in units of length per time (no, not milligrams — this isn’t Breaking Bad). If we imagine that light has some intrinsic absolute speed c’, then to measure it we would need to have (for example) some reference length L’ and some reference time duration T’ and then see how many lengths of L’ the light travels in time T’. We would write this equation as:

$$c' = C\,\frac{L'}{T'} \tag{1}$$

where C is the speed that we actually measure. Bridgman’s Principle says that a measured quantity like C cannot tell us the absolute speed of light c’, it only tells us what the value of c’ is compared to the values of our measuring apparatus, L’ and T’ (for example, in meters per second). If there were some way that we could directly measure the absolute value of c’ without comparing it to a measuring rod and a clock, then we could just write c’ = C without needing to specify the units of C. So, without Bridgman’s Principle, all of Dimensional Analysis basically becomes pointless.

So why should Bridgman’s Principle be true in general? Scientists are usually lazy and just assume it is true because it works in so many cases (this is called “proof by induction”). After all, it is hard to find a way of measuring the absolute length of something, without referring to some other reference object like a meter-stick. But being a good scientist is all about being really tight-assed, so we want to know if Bridgman’s Principle can be proven to be watertight.

A neat example of a watertight principle is the Second Law of Thermodynamics. This Law was also originally an inductive principle (it seemed to be true in pretty much all thermodynamic experiments) but then Boltzmann came along with his famous H-Theorem and proved that it has to be true if matter is made up of atomic particles. This is called a constructive justification of the principle [1].

The H-Theorem makes it nice and easy to judge whether some crackpot’s idea for a perpetual motion machine will actually run forever. You can just ask them: “Is your machine made out of atoms?” And if the answer is `yes’ (which it probably is), then you can point out that the H-Theorem proves that machines made up of atoms must obey the Second Law, end of story.

Coming up with a constructive proof, like the H-Theorem, is pretty hard. In the case of Bridgman’s Principle, there are just too many different things to account for. Objects can have numerous properties, like mass, charge, density, and so on; also there are many ways to measure each property. It is hard to imagine how we could cover all of these different cases with just a single theorem about atoms. Without the H-Theorem, we would have to look over the design of every perpetual motion machine, to find out where the design is flawed. We could call this method “proof by elimination of counterexamples”. This is exactly the procedure that Craig uses to lend support to Bridgman’s Principle in his paper.

To get a flavor for how he does it, recall our measurement of the speed of light from equation (1). Notice that the measured speed C does not have to be the same as the absolute speed c’. In fact we can rewrite the equation as:

$$\frac{c'\,T'}{L'} = C \tag{2}$$

and this makes it clear that the number C that we measure is not itself an absolute quantity, but rather is a comparison between the absolute speed of light c’ and the absolute distance L’ per time T’. What would happen if we changed all of the absolute lengths in the universe? Would this change the value of the measured speed of light C? At first glance, you might think that it would, as long as the other absolute quantities on the left hand side of equation (2) are independent of length. But if that were true, then we would be able to measure changes in absolute length by observing changes in the measurable speed of light C, and this would contradict Bridgman’s Principle!

To get around this, Craig points out that the length L’ and time T’ are not fundamental properties of things, but are actually reducible to the atomic properties of physical rods and clocks that we use to make measurements. Therefore, we should express L’ and T’ in terms of the more fundamental properties of matter, such as the masses of elementary particles and the coupling constants of forces inside the rods and clocks. In particular, he argues that the absolute length of any physical rod is equal to some number times the “Bohr radius” of a typical atom inside the rod. This radius is in turn proportional to:

$$\frac{h'}{m'_e\,c'} \tag{3}$$

where h’, c’ are the absolute values of Planck’s constant and the speed of light, respectively, and $m'_e$ is the absolute electron mass. Similarly, the time duration measured by an atomic clock is proportional to:

$$\frac{h'}{m'_e\,c'^2} \tag{4}$$

As a result, both the absolute length L’ and time T’ actually depend on the absolute constants c’, h’ and the electron mass $m'_e$. Substituting these into the expression for the measured speed of light, we get:

$$\frac{c'\,T'}{L'} = c' \cdot \frac{X\,h'/(m'_e\,c'^2)}{Y\,h'/(m'_e\,c')} = C \tag{5}$$

where X,Y are some proportionality constants. So, the factors of c’ cancel and we are left with C=X/Y. The numbers X and Y depend on how we construct our rods and clocks — for instance, they depend on how many atoms are inside the rod, and what kind of atom we use inside our atomic clock. In fact, the definitions of the `meter’ and the `second’ are specially chosen so as to make this ratio exactly C=299,792,458 [2].

Now that we have included the fact that our measuring rods and clocks are made out of matter, we see that in fact the left hand side of equation (5) is independent of any absolute quantities. Therefore changing the absolute length, time, mass, speed etc. cannot have any effect on the measured speed of light C, and Bridgman’s principle is safe — at least in this example.
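If you don't trust the cancellation, you can check it by brute force. The sketch below assumes the forms used in this example, namely a rod of length L’ = Y h’/(m’ₑ c’) and a clock period T’ = X h’/(m’ₑ c’²); the function name and the values of X and Y are made up for illustration and are not taken from Craig's paper. Exact fractions are used so that no rounding error can hide a dependence on the absolute constants.

```python
from fractions import Fraction

def measured_speed_of_light(h, c, m_e, X=Fraction(3), Y=Fraction(7)):
    """Measured number C = c' T' / L' for a rod and a clock built from atoms.

    Assumed forms (illustrative):
      rod length   L' = Y * h' / (m_e' * c')
      clock period T' = X * h' / (m_e' * c'**2)
    X and Y are arbitrary construction constants.
    """
    rod_length = Y * h / (m_e * c)
    clock_period = X * h / (m_e * c**2)
    return c * clock_period / rod_length

# Baseline universe: every absolute constant set to 1.
baseline = measured_speed_of_light(Fraction(1), Fraction(1), Fraction(1))

# A universe in which the absolute constants are rescaled by unrelated factors.
rescaled = measured_speed_of_light(Fraction(5, 2), Fraction(1000), Fraction(1, 9))

print(baseline, rescaled)
```

Both calls return X/Y, whatever values are fed in for h’, c’ and m’ₑ. Only the construction constants X and Y survive.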

(Some readers might wonder why making a clock heavier should also make it run faster, as seems to be suggested by equation (4). It is important to remember that the usual kinds of clocks we use, like wristwatches, are quite complicated things containing trillions of atoms. To calculate how the behaviour of all these atoms would change the ticking of the overall clock mechanism would be, to put it lightly, a giant pain in the ass. That’s why Craig only considers very simple devices like atomic clocks, whose behaviour is well understood at the atomic level [3].)

Another simple model of a clock is the light clock: a beam of light bouncing between two mirrors separated by a fixed distance L’. Since light has no mass, you might think that the frequency of such a clock should not change if we were to increase all absolute masses in the universe. But we saw in equation (4) that the frequency of an atomic clock is proportional to the electron mass, and so it would increase. It then seems like we could measure this increase in atomic clock frequency by comparing it to a light clock, whose frequency does not change — and then we would know that the absolute masses had changed. Is this another threat to Bridgman’s Principle?

The catch is that, as Craig points out, the length L’ between the mirrors of the light clock is determined by a measuring rod, and the rod’s length is inversely proportional to the electron mass as we saw in equation (3). So if we magically increase all the absolute masses, we would also cause the absolute length L’ to get smaller, which means the light-clock frequency would increase. In fact, it would increase by exactly the same amount as the atomic clock frequency, so comparing them would not show us any difference! Bridgman’s Principle is saved again.
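Here is the same argument as a few lines of Python, in case the words went by too fast. It is only a sketch under simple assumptions of my own: the atomic clock ticks at a rate m’ₑ c’²/h’, the mirrors are separated by a rod of length proportional to h’/(m’ₑ c’), and the light clock ticks once per round trip. The name `clock_frequencies` and the parameter `rod_atoms` are hypothetical.

```python
from fractions import Fraction

def clock_frequencies(h, c, m_e, rod_atoms=Fraction(10)):
    """Absolute tick rates of an atomic clock and of a light clock.

    Assumptions (illustrative):
      atomic clock frequency = m_e' * c'**2 / h'
      mirror separation  L'  = rod_atoms * h' / (m_e' * c')
      light clock frequency  = c' / (2 * L')   (one tick per round trip)
    """
    atomic = m_e * c**2 / h
    mirror_gap = rod_atoms * h / (m_e * c)
    light = c / (2 * mirror_gap)
    return atomic, light

# Before: all absolute constants set to 1.
atomic_before, light_before = clock_frequencies(Fraction(1), Fraction(1), Fraction(1))

# After: every absolute mass multiplied by 1000.
atomic_after, light_after = clock_frequencies(Fraction(1), Fraction(1), Fraction(1000))

ratio_before = atomic_before / light_before
ratio_after = atomic_after / light_after
print(ratio_before, ratio_after)
```

Both clocks speed up by the same factor of 1000, so the number of atomic ticks per light-clock tick, which is all we can actually measure, stays put.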

Let’s do one more example, this time a little bit more extreme. According to Einstein’s theory of general relativity, every lump of mass has a Schwarzschild radius, which is the radius of a sphere such that if you crammed all of the mass into this sphere, it would turn into a black hole. Given some absolute amount of mass M’, its Schwarzschild radius is given by the equation:

$$R' = \frac{2\,G'\,M'}{c'^2} \tag{6}$$

where c’ is the absolute speed of light from before, and G’ is the absolute gravitational constant, which determines how strong the gravitational force is. Now, glancing at the equation, you might think that if we keep increasing all of the absolute masses in the universe, planets will start turning into black holes. For instance, the radius of Earth is about 6370 km. This is the Schwarzschild radius for a mass of roughly 700 million times Earth’s mass (Earth’s own Schwarzschild radius is only about 9 mm). So if we magically increased all absolute masses by a factor of a billion, shouldn’t Earth collapse into a black hole? Then, moments before we all die horribly, we would at least know that the absolute mass has changed, and Bridgman’s Principle was wrong.

Of course, that is only true if changing the absolute mass doesn’t affect the other absolute quantities in equation (6). But as we now know, increasing the absolute mass will cause our measuring rods to shrink, and our clocks to run faster. So the question is, if we scale the masses by some factor X, do all the X’s cancel out in equation (6)?

Well, since our absolute lengths have to shrink, the Schwarzschild radius should shrink, so if we multiply M’ by X, then we should divide the radius R’ by X. This doesn’t balance! Hold on though — we haven’t dealt with the constants c’ and G’ yet. What happens to them? In the case of c’, we have c’ = C L’ / T’. Since L’ and T’ both decrease by a factor of X (lengths and time intervals get shorter) there is no overall effect on the absolute speed of light c’.

How do we measure the quantity G’? Well, G’ tells us how much two masses (measured relative to a reference mass m’) will accelerate towards each other due to their gravitational attraction. Newton’s law of gravitation says:

$$a' = N\,\frac{G'\,m'}{L'^2} \tag{7}$$

where N is some number that we can measure, and it depends on how big the two masses are compared to the reference mass m’, how large the distance between them is compared to the reference length L’, and so forth. If we measure the acceleration a’ using the same reference length and time L’,T’, then we can write:

$$a' = A\,\frac{L'}{T'^2} \tag{8}$$

where A is just the measured acceleration in these units. Putting this all together, we can re-arrange equation (7) to get:

$$G' = \frac{A}{N}\,\frac{L'^3}{m'\,T'^2} \tag{9}$$

and we can define G = (A/N) as the actually measured gravitational constant in the chosen units. From equation (9), we see that increasing M’ by a factor of X, and hence dividing each instance of L’ and T’ by X, implies that the absolute constant G’ will actually change: it will be divided by a factor of X².

What is the physics behind all this math? It goes something like this: suppose we are measuring the attraction between two masses separated by some distance. If we increase the masses, then our measuring rods shrink and our clocks get faster. This means that when we measure the accelerations, the objects seem to accelerate faster than before. This is what we expect, because two masses should become more attractive (at the same distance) when they become more massive. However, the absolute distance between the masses also has to shrink. The net effect is that, after increasing all the absolute masses, we find that the masses are producing the exact same attractive force as before, only at a closer distance. This means the absolute attraction at the original distance is weaker — so G’ has become weaker after the absolute masses in the universe have been increased (notice, however, that the actually measured value G does not change).

Returning now to equation (6), and multiplying M’ by X, dividing R’ by X and dividing G’ by X², we find that all the extra factors cancel out. We conclude that increasing all the absolute masses in the universe, even by such an enormous factor, will not in fact cause Earth to turn into a black hole, because the effect is balanced out by the contingent changes in the absolute lengths and times of our measuring instruments. Whew!
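For the skeptics, here is the bookkeeping done by machine. This is a sketch, not Craig's calculation: it takes the Schwarzschild formula R’ = 2G’M’/c’², applies the rescaling rules just described (masses times X, lengths divided by X, G’ divided by X², c’ untouched), and compares a planet's radius to its Schwarzschild radius before and after. All numerical values and function names are arbitrary illustrations.

```python
from fractions import Fraction

def schwarzschild_radius(G, M, c):
    """R' = 2 G' M' / c'^2, with all arguments absolute quantities."""
    return 2 * G * M / c**2

def rescale_all_masses(G, M, c, radius, X):
    """Rescaling described in the text: masses times X, lengths divided by X,
    G' divided by X^2, and c' unchanged."""
    return G / X**2, M * X, c, radius / X

# Arbitrary illustrative absolute values (not physical constants).
G, M, c = Fraction(2), Fraction(3), Fraction(5)

# A planet 1000 times larger than its own Schwarzschild radius.
radius = 1000 * schwarzschild_radius(G, M, c)

X = Fraction(10**9)
G2, M2, c2, radius2 = rescale_all_masses(G, M, c, radius, X)

margin_before = radius / schwarzschild_radius(G, M, c)
margin_after = radius2 / schwarzschild_radius(G2, M2, c2)
print(margin_before, margin_after)
```

The planet is exactly as far from collapse after the rescaling as it was before, because its radius and its Schwarzschild radius shrink together.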

Craig’s paper is long and very thorough. He compares a whole zoo of physical clocks, including electric clocks, light-clocks, freely falling inertial clocks, different kinds of atomic clocks and even gravitational clocks made from two orbiting planets. Not only does he generalize his claim to Newtonian mechanics, he covers general relativity as well, and the Dirac equation of quantum theory, including a discussion of Compton scattering (a photon reflecting off an electron). Besides all of this, he takes pains to discuss the meaning of coupling constants, the Planck scale, and the related but distinct concept of scale invariance. All in all, Craig’s paper just might be the most comprehensive justification for Bridgman’s principle so far in existence!

Most scientists might shrug and say “who needs it?”. In the same way, not many scientists care to examine perpetual motion machines to find out where the flaw lies. In this respect, Craig is a craftsman of the first order — he cares deeply about the details. Unlike the Second Law of Thermodynamics, Bridgman’s Principle seems rarely to have been challenged. This only makes Craig’s defense of it all the more important. After all, it is especially those beliefs which we are disinclined to question that are most deserving of a critical examination.

Footnotes:

[1] Some physical principles, like the Relativity Principle, have never been given a constructive justification. For this reason, Einstein himself seems to have regarded the Relativity Principle with some suspicion. See this great discussion by Brown and Pooley.

[2] Why not just set it to C=1? Well, no reason why not! Then we would replace the meter by the `light-second’, and the second by the `light-meter’. And we would say things like “Today I walked 0.3 millionths of a light-second to buy an ice-cream, and it took me just 130 billion light-meters to eat it!” So, you know, that would be a bit weird. But theorists do it all the time.

[3] To be perfectly strict, we cannot assume that a wristwatch will behave in the same way as an atomic clock in response to changes in absolute properties; we would have to derive their behavior constructively from their atomic description. This is exactly why a general constructive proof of Bridgman’s Principle would be so hard, and why Craig is forced to stick with simple models of clocks and rulers.

# A meditation on physical units: Part 1

[Preface: A while back, Michael Raymer, a professor at the University of Oregon, drew my attention to a curious paper by Craig Holt, who tragically passed away in 2014 [1]. Michael wrote:
“Dear Jacques … I would be very interested in knowing your opinion of this paper,
since Craig was not a professional academic, and had little community in
which to promote the ideas. He was one of the most brilliant PhD students
in my graduate classes back in the 1970s, turned down an opportunity to
interview for a position with John Wheeler, worked in industry until age
50 when he retired in order to spend the rest of his time in self study.
In his paper he takes a Machian view, emphasizing the relational nature of
all physical quantities even in classical physics. I can’t vouch for the
technical correctness of all of his results, but I am sure they are
inspiring.”

The paper makes for an interesting read because Holt, unencumbered by contemporary fashions, freely questions some standard assumptions about the meaning of `mass’ in physics. Probably because it was a work in progress, Craig’s paper is missing some of the niceties of a more polished academic work, like good referencing and a thoroughly researched introduction that places the work in context (the most notable omission is the lack of background material on dimensional analysis, which I will talk about in this post). Despite its rough edges, Craig’s paper led me down quite an interesting rabbit-hole, of which I hope to give you a glimpse. This post covers some background concepts; I’ll mention Craig’s contribution in a follow-up post. ]

______________
Imagine you have just woken up after a very bad hangover. You retain your basic faculties, such as the ability to reason and speak, but you have forgotten everything about the world in which you live. Not just your name and address, but your whole life history, family and friends, and entire education are lost to the epic blackout. Using pure thought, you are nevertheless able to deduce some facts about the world, such as the fact that you were probably drinking Tequila last night.

The first thing you notice about the world around you is that it can be separated into objects distinct from yourself. These objects all possess properties: they have colour, weight, smell, texture. For instance, the leftover pizza is off-yellow, smells like sardines and sticks to your face (you run to the bathroom).

While bending over the toilet for an extended period of time, you notice that some properties can be easily measured, while others are more intangible. The toilet seems to be less white than the sink, and the sink less white than the curtains. But how much less? You cannot seem to put a number on it. On the other hand, you know from the ticking of the clock on the wall that you have spent 37 seconds thinking about it, which is exactly 14 seconds more than the time you spent thinking about calling a doctor.

You can measure exactly how much you weigh on the bathroom scale. You can also see how disheveled you look in the mirror. Unlike your weight, you have no idea how to quantify the amount of your disheveled-ness. You can say for sure that you are less disheveled than Johnny Depp after sleeping under a bridge, but beyond that, you can’t really put a number on it. Properties like time, weight and blood-alcohol content can be quantified, while other properties like squishiness, smelliness and disheveled-ness are not easily converted into numbers.

You have rediscovered one of the first basic truths about the world: all that we know comes from our experience, and the objects of our experience can only be compared to other objects of experience. Some of those comparisons can be numerical, allowing us to say how much more or less of something one object has than another. These cases are the beginning of scientific inquiry: if you can put a number on it, then you can do science with it.

Rulers, stopwatches, compasses, bathroom scales — these are used as reference objects for measuring the `muchness’ of certain properties, namely, length, duration, angle, and weight. Looking in your wallet, you discover that you have exactly 5 dollars of cash, a receipt from a taxi for 30 dollars, and you are exactly 24 years old since yesterday night.

You reflect on the meaning of time. A year means the time it takes the Earth to go around the Sun, or approximately 365 and a quarter days. A day is the time it takes for the Earth to spin once on its axis. You remember your school teacher saying that all units of time are defined in terms of seconds, and one second is defined as 9192631770 oscillations of the light emitted by a Caesium atom. Why exactly 9192631770, you wonder? What if we just said 2 oscillations? A quick calculation shows that this would make you about 110 billion years old according to your new measure of time. Or what about switching to dog years, which are 7 per human year? That would make you 168 dog years old. You wouldn’t feel any different — you would just be having a lot more birthday parties. Given the events of last night, that seems like a bad idea.

You are twice as old as your cousin, and that is true in dog years, cat years, or clown years [2]. Similarly, you could measure your height in inches, centimeters, or stacked shot-glasses — but even though you might be 800 rice-crackers tall, you still won’t be able to reach the aspirin in the top shelf of the cupboard. Similarly, counting all your money in cents instead of dollars will make it a bigger number, but won’t actually make you richer. These are all examples of passive transformations of units, where you imagine measuring something using one set of units instead of another. Passive transformations change nothing in reality: they are all in your head. Changing the labels on objects clearly cannot change the physical relationships between them.
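If your head is still too sore for mental arithmetic, the following sketch redoes the age calculation and checks that a passive change of units leaves the ratio of two ages alone. The cousin's age of 12 is simply what "twice as old" implies for a 24-year-old; the function names are made up for illustration.

```python
CAESIUM_CYCLES_PER_SECOND = 9_192_631_770

def age_with_redefined_second(age_in_years, cycles_per_new_second):
    """The same lifespan, counted in 'years' built from a shorter second."""
    return age_in_years * CAESIUM_CYCLES_PER_SECOND / cycles_per_new_second

def age_in_dog_years(age_in_years):
    """Seven dog years per human year, as in the text."""
    return age_in_years * 7

you, cousin = 24, 12  # the cousin's age is implied by "twice as old"

# Redefine the second as just 2 caesium oscillations.
you_short_seconds = age_with_redefined_second(you, 2)
cousin_short_seconds = age_with_redefined_second(cousin, 2)

print(you_short_seconds)          # your age in the new 'years'
print(age_in_dog_years(you))      # your age in dog years

# The ratio of the two ages in each system of units.
ratio_human = you / cousin
ratio_short_seconds = you_short_seconds / cousin_short_seconds
ratio_dog = age_in_dog_years(you) / age_in_dog_years(cousin)
print(ratio_human, ratio_short_seconds, ratio_dog)
```

The individual numbers change wildly, but the ratio comes out as 2 every time. That ratio is the only thing in this story that has anything to do with reality.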

Things get interesting when we consider active transformations. If a passive transformation is like saying the length of your coffee table is 100 times larger when measured in cm than when measured in meters, then an active transformation would be if someone actually replaced your coffee table with a table 100 times bigger. Now, obviously you would notice the difference because the table wouldn’t fit in your apartment anymore. But imagine that someone, in addition to replacing the coffee table, also replaced your entire apartment and everything in it with scaled-up models 100 times the size. And imagine that you also grew into a giant 100 times your original size while you were sleeping. Then when you woke up, as a giant inside a giant apartment with a giant coffee table, would you realise anything had changed? And if you made yourself a giant cup of coffee, would it make your giant hangover go away?

We now come to one of the deepest principles of physics, called Bridgman’s Principle of absolute significance of relative magnitude, named for our old friend Percy Bridgman. The Principle says that only relative quantities can enter into the laws of physics. This means that, whatever experiments I do and whatever measurements I perform, I can only obtain information about the relative sizes of quantities: the length of the coffee table relative to my ruler, or the mass of the table relative to the mass of my body, etc. According to this principle, actively changing the absolute values of some quantity by the same proportion for all objects should not affect the outcomes of any experiments we could perform.

To get a feeling for what the principle means, imagine you are a primitive scientist. You notice that fruit hanging from trees tends to bob up and down in the wind, but the heavier fruits seem to bounce more slowly than the lighter fruits (for those readers who are physics students, I’m talking about a mass on a spring here). You decide to discover the law that relates the frequency of bobbing motion to the mass of the fruit. You fill a sack with some pebbles (carefully chosen to all have the same weight) and hang it from a tree branch. You can measure the mass of the sack by counting the number of pebbles in it, but you still need a way to measure the frequency of the bobbing. Nearby you hear the sound of water dripping from a leaf into a pond. You decide to measure the frequency by how many times the sack bobs up and down in between drips of water. Now you are ready to do your experiment.

You measure the bobbing frequency of the sack for many different masses, and record the results by drawing in the dirt with a stick. After analysing your data, you discover that the frequency f (in oscillations per water drop) is related to the mass m (in pebbles) by a simple formula:

$$f = \frac{k}{\sqrt{m}} \qquad (1)$$

where k stands for a particular number, say 16.8. But what does this number really mean?

Unbeknownst to you, a clever monkey was watching you from the bushes while you did the experiment. After you retire to your cave to sleep, the monkey comes out to play a trick on you. He carefully replaces each one of your pebbles with a heavier pebble of the same size and appearance, and makes sure that all of the heavier pebbles are the same weight as each other. He takes away the original pebbles and hides them. The next day, you repeat the experiment in exactly the same way, but now you discover that the constant k has changed from yesterday’s value of 16.8 to the new value of 11.2. Does this mean that the law of nature that governs the bobbing of things hanging from the tree has changed overnight? Or should you decide that the law is the same, but that the units that you used to measure frequency and mass have changed?

You decide to apply Bridgman’s Principle. The principle says that if (say) all the masses in the experiment were changed by the same proportion, then the laws of physics would not allow us to see any difference, provided we used the same measuring units. Since you do see a difference, Bridgman’s Principle says that it must be the units (and not the law itself) that have changed. `These must be different pebbles’ you say to yourself, and you mark them by scratching an X onto them. You go out looking for some other pebbles and eventually you find a new set of pebbles which give you the right value of 16.8 when you perform the experiment. `These must be the same kind of pebbles that I used in the original experiment’ you say to yourself, and you scratch an O on them so that you won’t lose them again. Ha! You have outsmarted the monkey.

Notice that as long as you use the right value for k (which depends on whether you measure the mass using X or O pebbles), the abstract equation (1) remains true. In physics language, you are interpreting k as a dimensional constant, having the dimensions of frequency times √mass. This means that if you use different units for measuring frequency or mass, the numerical value of k has to change in order to preserve the law. Notice also that the dimensions of k are chosen so that equation (1) has the same dimensions on each side of the equals sign. This is called a dimensionally homogeneous equation. Bridgman’s Principle can be rephrased as saying that all physical laws must be described by dimensionally homogeneous equations.
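To see how the numerical value of k tracks the choice of pebbles, here is a small Python sketch. It assumes that equation (1) has the form f = k/√m, which is what the stated dimensions of k (frequency times √mass) suggest; the function names are invented for illustration, and the 9-pebble sack is a made-up example.

```python
import math

def k_in_new_units(k_old, new_pebble_in_old_pebbles):
    """Convert k when the mass unit changes.

    If one new pebble weighs r old pebbles, a sack of m_old old pebbles
    counts as m_old / r new pebbles, so
    f = k_old / sqrt(m_old) = (k_old / sqrt(r)) / sqrt(m_new).
    """
    return k_old / math.sqrt(new_pebble_in_old_pebbles)

def implied_pebble_ratio(k_old, k_new):
    """How many old pebbles one new pebble must weigh to explain the shift in k."""
    return (k_old / k_new) ** 2

# The monkey's swap: k went from 16.8 (O-pebbles) to 11.2 (X-pebbles).
x_pebble_in_o_pebbles = implied_pebble_ratio(16.8, 11.2)
k_x = k_in_new_units(16.8, x_pebble_in_o_pebbles)

# The same physical sack, counted two ways (hypothetical sack of 9 O-pebbles).
mass_o = 9.0
mass_x = mass_o / x_pebble_in_o_pebbles
frequency_o = 16.8 / math.sqrt(mass_o)
frequency_x = k_x / math.sqrt(mass_x)

print(x_pebble_in_o_pebbles, k_x, frequency_o, frequency_x)
```

Under these assumptions each X-pebble weighs 2.25 O-pebbles, and the predicted frequency of the sack is the same whichever pebbles you count with.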

Bridgman’s Principle is useful because it allows us to start with a law expressed in particular units, in this case `oscillations per water-drop’ and `O-pebbles’, and then infer that the law holds for any units. Even though the numerical value of k changes when we change units, it remains the same in any fixed choice of units, so it represents a physical constant of nature.

The alternative is to insist that our units are the same as before (the pebbles look identical after all). That means that the change in k implies a change in the law itself, for instance, it implies that the same mass hanging from the tree today will bob up and down more slowly than it did yesterday. In our example, it turns out that Bridgman’s Principle leads us to the correct conclusion: that some tricky monkey must have switched our pebbles. But can the principle ever fail? What if physical laws really do change?

Suppose that after you return to your cave, the tricky monkey decides to have another go at fooling you. He climbs up the tree and whispers into its leaves: `Do you know why that primitive scientist is always hanging things from your branch? She is testing how strong you are! Make your branches as stiff and strong as you can tomorrow, and she will reward you with water from the pond’.

The next day, you perform the experiment a third time — being sure to use your `O-pebbles’ this time — and you discover again that the value of k seems to have changed. It now takes many more pebbles to achieve a given frequency than it did on the first day. Using Bridgman’s Principle, you again decide that something must be wrong with your measuring units. Maybe this time it is the dripping water that is wrong and needs to be adjusted, or maybe you have confidence in the regularity of the water drip and conclude that the `O-pebbles’ have somehow become too light. Perhaps, you conjecture, they were replaced by the tricky monkey again? So you throw them out and go searching for some heavier pebbles. You find some that give you the right value of k=16.8, and conclude that these are the real `O-pebbles’.

The difference is that this time, you were tricked! In fact the pebbles you threw out were the real `O-pebbles’. The change in k came from the background conditions of the experiment, namely the stiffness in the tree branches, which you did not consider as a physical variable. Hence, in a sense, the law that relates bobbing frequency to mass (for this tree) has indeed changed [3].

You thought that the change in the constant k was caused by using the wrong measuring units, but in fact it was due to a change in the physical constant k itself. This is an example of a scenario where a physical constant turns out not to be constant after all. If we simply assume Bridgman’s Principle to be true without carefully checking whether it is justified, then it is harder to discover situations in which the physical constants themselves are changing. So, Bridgman’s Principle can be thought of as the assumption that the values of physical constants (expressed in some fixed units) don’t change over time. If we are sure that the laws of physics are constant, then we can use the Principle to detect changes or inaccuracies in our measuring devices that define the physical units — i.e. we can leverage the laws of physics to improve the accuracy of our measuring devices.
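The reason the trick works can be sketched numerically. The following assumes an ideal mass on a spring, for which f = √(stiffness/mass)/2π, so that the fitted constant is k = √(stiffness/pebble mass)/2π. The factor of 4 and the function name `fitted_k` are hypothetical choices for illustration only.

```python
import math

def fitted_k(stiffness, pebble_mass):
    """The k in f = k / sqrt(n) for a sack of n identical pebbles.

    Ideal spring assumption: f = sqrt(stiffness / (n * pebble_mass)) / (2*pi).
    """
    return math.sqrt(stiffness / pebble_mass) / (2 * math.pi)

# Pick the day-one stiffness so that one O-pebble (mass 1.0) gives k = 16.8.
stiffness_day_one = (2 * math.pi * 16.8) ** 2
k_day_one = fitted_k(stiffness_day_one, 1.0)

# Scenario A: the branch becomes 4 times stiffer, pebbles unchanged.
k_stiffer_branch = fitted_k(4 * stiffness_day_one, 1.0)

# Scenario B: the branch is unchanged, pebbles are 4 times lighter.
k_lighter_pebbles = fitted_k(stiffness_day_one, 0.25)

print(k_day_one, k_stiffer_branch, k_lighter_pebbles)
```

Both scenarios give the same new value of k, so the bobbing data alone cannot tell you whether the branch or the pebbles changed. Something extra, such as trust in your pebbles, has to break the tie.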

We can’t always trust our measuring units, but the monkey also showed us that we can’t always trust the laws of physics. After all, scientific progress depends on occasionally throwing out old laws and replacing them with more accurate ones. In our example, a new law that includes the tree-branch stiffness as a variable would be the obvious next step.

One of the more artistic aspects of the scientific method is knowing when to trust your measuring devices, and when to trust the laws of physics [4]. Progress is made by `bootstrapping’ from one to the other: first we trust our units and use them to discover a physical law, and then we trust in the physical law and use it to define better units, and so on. It sounds like a circular process, but actually it represents the gradual refinement of knowledge, through increasingly smaller adjustments from different angles. Imagine trying to balance a scale by placing handfuls of sand on each side. At first you just dump about a handful on each side and see which is heavier. Then you add a smaller amount to the lighter side until it becomes heavier. Then you add an even smaller amount to the other side until it becomes heavier, and so on, until the scale is almost perfectly balanced. In a similar way, switching back and forth between physical laws and measurement units actually results in both the laws and measuring instruments becoming more accurate over time.
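The scale-balancing picture can be turned into a toy simulation. This is only a sketch of the analogy: the starting weights, the size of the first handful, and the rule of halving the handful each round are all assumptions of mine (the text only says that each amount is smaller than the last).

```python
def balance_scale(left, right, handful, rounds):
    """Alternately top up the lighter pan, halving the handful each round.

    Returns the imbalance |left - right| recorded after every round.
    """
    imbalances = []
    for _ in range(rounds):
        handful /= 2  # each correction is smaller than the last
        if left < right:
            while left <= right:  # add sand until the left pan is heavier
                left += handful
        else:
            while right <= left:  # add sand until the right pan is heavier
                right += handful
        imbalances.append(abs(left - right))
    return imbalances

imbalances = balance_scale(left=1.0, right=1.3, handful=0.4, rounds=8)
for round_number, gap in enumerate(imbalances, start=1):
    print(f"round {round_number}: imbalance {gap:.4f}")
```

After each round the imbalance can be no larger than the handful used in that round, so the two sides close in on each other even though neither side is ever treated as the fixed standard.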

______________

[1] It is a shame that Craig’s work remains incomplete, because I think physicists could benefit from a re-examination of the principles of dimensional analysis. Simplified dimensional arguments are sometimes invoked in the literature on quantum gravity without due consideration for their meaning.

[2] Clowns have several birthdays a week, but they aren’t allowed to get drunk at them, which kind of defeats the purpose if you ask me.

[3] If you are uncomfortable with treating the branch stiffness as part of the physical law, imagine instead that the strength of gravity actually becomes weaker overnight.

[4] This is related to a deep result in the philosophy of science called the Duhem-Quine Thesis.
Quoth Duhem: `If the predicted phenomenon is not produced, not only is the questioned proposition put into doubt, but also the whole theoretical scaffolding used by the physicist’.

# Bootstrapping to quantum gravity

“If … there were no solid bodies in nature there would be no geometry.”
-Poincaré

A while ago, I discussed the mystery of why matter should be the source of gravity. To date, this remains simply an empirical fact. The deep insight of general relativity – that gravity is the geometry of space and time – only provides us with a modern twist: why should matter dictate the geometry of space-time?

There is a possible answer, but it requires us to understand space-time in a different way: as an abstraction that is derived from the properties of matter itself. Under this interpretation, it is perfectly natural that matter should affect space-time geometry, because space-time is not simply a stage against which matter dances, but is fundamentally dependent on matter for its existence. I will elaborate on this idea and explain how it leads to a new avenue of approach to quantum gravity.

First consider what we mean when we talk about space and time. We can judge how far away a train is by listening to the tracks, or gauge how deep a well is by dropping a stone in and waiting to hear the echo. We can tell a mountain is far away just by looking at it, and that the cat is nearby by tripping over it. In all these examples, an interaction is necessary between myself and the object, sometimes through an intermediary (the light reflected off the mountain into my eyes) and sometimes not (tripping over the cat). Things can also be far away in time. I obviously cannot interact with people who lived in the past (unless I have a time machine), or people who have yet to be born, even if they stood (or will stand) exactly where I am standing now. I cannot easily talk to my father when he was my age, but I can almost do it, just by talking to him now and asking him to remember his past self. When we say that something is far away in either space or time, what we really mean is that it is hard to interact with, and this difficulty of interaction has certain universal qualities that we give the names `distance’ and `time’.
It is worth mentioning here, as an aside, that in a certain sense, the properties of `time’ can be reduced to properties of `distance’ alone. Consider, for instance, that most of our interactions can be reduced to measurements of distances of things from us, at a given time. To know the time, I invariably look at the distance the minute hand has traversed along its cycle on the face of my watch. Our clocks are just systems with `internal’ distances, and it is the varying correspondence of these `clock distances’ with the distances of other things that we call the `time’. Indeed, Julian Barbour has developed this idea into a whole research program in which dynamics is fundamentally spatial, called Shape Dynamics.

So, if distance and time are just ways of describing certain properties of matter, what is the thing we call space-time?

We now arrive at a crucial point that has been stressed by philosopher Harvey Brown: the rigid rods and clocks with which we claim to measure space-time do not really measure it, in the traditional sense of the word `measure’. A measurement implies an interaction, and to measure space-time would be to grant space-time the same status as a physical body that can be interacted with. (To be sure, this is exactly how many people do wish to interpret space-time; see for instance space-time substantivalism and ontological structural realism).

Brown writes:
“One of Bell’s professed aims in his 1976 paper on `How to teach relativity’ was to fend off `premature philosophizing about space and time’. He hoped to achieve this by demonstrating with an appropriate model that a moving rod contracts, and a moving clock dilates, because of how it is made up and not because of the nature of its spatio-temporal environment. Bell was surely right. Indeed, if it is the structure of the background spacetime that accounts for the phenomenon, by what mechanism is the rod or clock informed as to what this structure is? How does this material object get to know which type of space-time — Galilean or Minkowskian, say — it is immersed in?” [1]

I claim that rods and clocks do not measure space-time, they embody space-time. Space-time is an idealized description of how material rods and clocks interact with other matter. This distinction is important because it has implications for quantum gravity. If we adopt the more popular view that space-time is an independently existing ontological construct, it stands to reason that, like other classical fields, we should attempt to directly quantise the space-time field. This is the approach adopted in Loop Quantum Gravity and extolled by Rovelli:

“Physical reality is now described as a complex interacting ensemble of entities (fields), the location of which is only meaningful with respect to one another. The relation among dynamical entities of being contiguous … is the foundation of the space-time structure. Among these various entities, there is one, the gravitational field, which interacts with every other one and thus determines the relative motion of the individual components of every object we want to use as rod or clock. Because of that, it admits a metrical interpretation.” [2]

One of the advantages of this point of view is that it dissolves some seemingly paradoxical features of general relativity, such as the fact that geometry can exist without (non-gravitational) matter, or the fact that geometry can carry energy and momentum. Since gravity is a field in its own right, it doesn’t depend on the other fields for its existence, nor is there any problem with it being able to carry energy. On the other hand, this point of view tempts us into framing quantum gravity as the mathematical problem of quantising the gravitational field. This, I think, is misguided.

I propose instead to return to a more Machian viewpoint, according to which space-time is contingent on (and not independent of) the existence of matter. Now the description of quantum space-time should follow, in principle, from an appropriate description of quantum matter, i.e. of quantum rods and clocks. From this perspective, the challenge of quantum gravity is to rebuild space-time from the ground up — to carry out Einstein’s revolution a second time over, but using quantum material as the building blocks.

My view about space-time can be seen as a kind of `pulling oneself up by one’s bootstraps’, or a Wittgenstein’s ladder (in which one climbs to the top of a ladder and then throws the ladder away). It works like this:
Step 1: define the properties of space-time according to the behaviour of rods and clocks.
Step 2: look for universal patterns or symmetries among these rods and clocks.
Step 3: take the ideal form of this symmetry and promote it to an independently existing object called `space-time’.
Step 4: Having liberated space-time from the material objects from which it was conceived, use it as the independent standard against which to compare rods and clocks.

Seen in this light, the idea of judging a rod or a clock by its ability to measure space or time is a convenient illusion: in fact we are testing real rods and clocks against what is essentially an embodiment of their own Platonic ideals, which are in turn conceived as the forms which give the laws of physics their most elegant expression. A pertinent example, much used by Julian Barbour, is Ephemeris time and the notion of a `good clock’. First, by using material bodies like pendulums and planets to serve as clocks, we find that the motions of material bodies approximately conform to Newton’s laws of mechanics and gravitation. We then make a metaphysical leap and declare the laws to be exactly true, and the inaccuracies to be due to imperfections in the clocks used to collect the data. This leads to the definition of the `Ephemeris time’, the time relative to which the planetary motions conform most closely to Newton’s laws, and a `good clock’ is then defined to be a clock whose time is closest to Ephemeris time.

The same thing happens in making the leap to special relativity. Einstein observed that, in light of Maxwell’s theory of electromagnetism, the empirical law of the relativity of motion seemed to have only a limited validity in nature. That is, assuming no changes to the behaviour of rods and clocks used to make measurements, it would not be possible to establish the law of the relativity of motion for electrodynamic bodies. Einstein made a metaphysical leap: he decided to upgrade this law to the universal Principle of Relativity, and to interpret its apparent inapplicability to electromagnetism as the failure of the rods and clocks used to test its validity. By constructing new rods and clocks that incorporated electromagnetism in the form of hypothetical light beams bouncing between mirrors, Einstein rebuilt space-time so as to give the laws of physics a more elegant form, in which the Relativity Principle is valid in the same regime as Maxwell’s equations.

By now, you can guess how I will interpret the step to general relativity. Empirical observations seem to suggest a (local) equivalence between a uniformly accelerated lab and a stationary lab in a gravitational field. However, as long as we consider `ideal’ clocks to conform to flat Minkowski space-time, we have to regard the time-dilated clocks of a gravitationally affected observer as being faulty. The empirical fact that observers stationary in a gravitational field cannot distinguish themselves (locally) from uniformly accelerated observers then seems accidental; there appears no reason why an observer could not locally detect the presence of gravity by comparing his normal clock to an `ideal clock’ that is somehow protected from gravity. On the other hand, if we raise this empirical indistinguishability to a matter of principle – the Einstein Equivalence Principle – we must conclude that time dilation should be incorporated into the very definition of an `ideal’ clock, and similarly with the gravitational effects on rods. Once the ideal rods and clocks are updated to include gravitational effects as part of their constitution (and not an interfering external force) they give rise to a geometry that is curved. Most magically of all, if we choose the simplest way to couple this geometry to matter (the Einstein Field Equations), we find that there is no need for a gravitational force at all: bodies follow the paths dictated by gravity simply because these are now the inertial paths followed by freely moving bodies in the curved space-time. Thus, gravity can be entirely replaced by geometry of space-time.

As we can see from the above examples, each revolution in our idea of space-time was achieved by reconsidering the nature of rods and clocks, so as to make the laws of physics take a more elegant form by incorporating some new physical principle (e.g. the Relativity and Equivalence principles). What is remarkable is that this method does not require us to go all the way back to the fundamental properties of matter, prior to space-time, and derive everything again from scratch (the constructive theory approach). Instead, we can start from a previously existing conception of space-time and then upgrade it by modifying its primary elements (rods and clocks) to incorporate some new principle as part of physical law (the principle theory approach). The question is, will quantum gravity let us get away with the same trick?

I’m betting that it will. The challenge is to identify the empirical principle (or principles) that embody quantum mechanics, and upgrade them to universal principles by incorporating them into the very conception of the rods and clocks out of which general relativistic space-time is made. The result will be, hopefully, a picture of quantum geometry that retains a clear operational interpretation. Perhaps even Percy Bridgman, who dismissed the Planck length as being of “no significance whatever” [3] due to its empirical inaccessibility, would approve.

[1] Brown, Physical Relativity, p8.
[2] Rovelli, `Halfway through the woods: contemporary research on space and time’, in The Cosmos of Science, p194.
[3] Bridgman, Dimensional Analysis, p101.

# Science, psychoanalyzed

“The problem for us is not, are our desires satisfied or not? The problem is, how do we know what we desire?”

-Slavoj Žižek

The most fundamental dramatic tension is the tension within the divided self. We have all on occasion experienced an internal dialogue like the following: `I ate the cookie despite myself. I knew it was wrong, but I couldn’t help myself. Afterwards, I hated myself’. On one hand, this dialogue makes sense to us and its meaning seems clear; on the other hand, it makes no sense without a division of the self. Who is the myself against whose wishes I eat the cookie? Who is the I that could not help myself? Who, afterwards, is hated, and who is the hater? To admit that the self can be both the subject and object of an action is equivalent to admitting that the self is divided.

Let us therefore deliver ourselves into the hands of Freud, who will lead us down a rabbit-hole of self-discovery. Who are these characters, the id, ego and superego? The id is the instinctive, reactive, animalistic part of the mind. It expresses emotion without reflection, it is wordless, mute, free of morals, shame or self-consciousness. The superego is the embodiment of laws and limitations. When the child learns that it is separate from the world, confined to a small, weak body and cannot have everything it wants – when it learns that it is at the mercy of beings far more powerful who dictate its life – it internalises these limitations and laws by creating the superego. The superego tells us what we are not allowed to do, where we cannot go, and what is forbidden by physical, moral or societal laws.

The fundamental tension between superego and id demands a mediator to decide whether to go with the desires of the id or follow the rules of the superego. This mediator, haplessly caught between the two, is our hero, ourselves: the ego. When the ego obeys the superego, the id is suppressed and frustrated, while the superego becomes more powerful and more strict in its demands. When the ego obeys the id instead, the satisfaction is short-lived, for the id knows only the present moment, and is hungry again no sooner than it is fed. Meanwhile, the superego brings its vengeance on the ego for the transgression, afflicting it with guilt and feelings of inferiority. The id expresses our desires and fears, the superego expresses our judgements, and the ego determines how we respond in our actions. Before reading the end of this paragraph, take a moment to re-read the dialogue about the cookie and try to name the actors and the victims. Did you do it? The id wanted to eat the cookie, the superego knew it was wrong, and the ego ate it. The superego was helpless to stop the ego, but afterwards, it hated the ego, and punished it with feelings of guilt. Now it makes sense.

Humans have a curious obsession with the number three. There are three wise men, the holy trinity, the `third eye’ of Hinduism. Dramatic tension between fictional characters also frequently relies on combinations of three. It is an entertaining exercise (but not always fruitful) to identify the roles of id, ego and superego in famous triplets from mythology and fiction. Here is a puzzle for you. In Brisbane, I used to frequent a coffee house called Three Monkeys. Inside, they had amassed a collection of depictions and statuettes of the `Three Wise Monkeys’, a mystical image originating from Japan in which the first monkey has covered its eyes, the second its ears, and the last one its mouth. The image is typically associated with the maxim: see no evil, hear no evil, speak no evil, thought to originate from a similar passage in the Chinese Analects of Confucius. The puzzle is this: if the monkeys were to represent the different aspects of the divided self, which monkey is the id, which is the ego and which is the superego? Or does the comparison simply fail? My own answer is given at the end of this essay.

Tension is by nature unsustainable. It must eventually resolve itself in one of three ways: destruction, reconciliation, or transformation into a new kind of tension (which just means the destruction of some things and the reconciliation of others). Destruction can occur when the division between the id and superego is too extreme, tearing apart the ego with opposing forces. Since the ego exists only to mediate the conflict between the other two, a reconciliation of the id with the superego automatically conciliates the ego as well. This represents a dissolution of the ego, meaning a loss of the distinction between the self and the external world: the attainment of Nirvana in the eastern philosophies. In reality, however, most of us experience only a very small and partial conciliation of this type, a sort of secret collaboration between the superego and the id. This secret collaboration is at the core of science, so let us examine it in more detail.

The easiest way to appreciate the perverse but necessary collaboration between superego and id is to look at stories and films. There, the characters are nicely separated into roles that often reflect the roles of our divided selves. Take Batman and the Joker as depicted in Christopher Nolan’s film, The Dark Knight. The Joker is obviously a candidate for the id.

Batman, although a vigilante, is a good fit for the superego: he is the true enforcer of law, both the judge and the executioner. In fact it is the police force, embodied by Commissioner Gordon, that best represents the ego in its unenviable position, caught between the two rogue elements. Given these roles, we finally understand this brilliant exchange:
Batman: Then why do you want to kill me?
Joker: I don’t want to kill you! What would I do without you? Go back to ripping off mob dealers? No, no, NO! No. You… you complete me.
You could not ask for a more perfect exposition of the mutual dependence of the superego and the id.

Sometimes the bond is more subtle. Consider one of fiction’s greatest characters: Sherlock Holmes. Not coincidentally, Holmes is a poster boy for scientists, with his strict adherence to a method based on evidence, reasoning and deduction. Quite obviously, he is a manifestation of the superego, leaving Watson to carry the banner of the ego. He wears it well enough, constantly being lectured and berated by Holmes, occasionally skeptical and rebellious but always respectful of Holmes’s superior judgement. Where, then, could the id be hiding? Therein lies a profound mystery, worthy of Holmes himself! One is tempted to point at Moriarty, the great enemy of Holmes – but the shoe does not fit. In Moriarty one finds exactly the kind of characteristics more typical of the superego: self-confidence verging on megalomania, mercilessness, a strict adherence to methodology. He is more like Holmes’s evil twin – the vindictive, cruel side of the superego – than the impulsive and chaotic id.

My own theory is that Holmes is a much more subtle character than he first appears. Who is the Holmes that we find, lost in a wordless reverie, playing the violin? Who is the Holmes that disguises himself to play a prank on poor Watson – the Holmes who, indeed, delights in upsetting Watson with eccentric and erratic behaviour? Who is the Holmes that goes missing for days, only to be found curled up in a den of iniquity, his eyes clouded with opium? I contend that Holmes has an instinctive, intuitive and sensitive side that embodies the id, working in harmony with his superego aspect. Indeed, the seedy side of Holmes – his indulgent, drug-taking, reckless aspect – is somehow essential to completing the portrait of his genius. We would not find him so credible, so impressive, so almost mystical in his virtuosity if it were not for this dark side.

The superego and id can indeed collaborate, but it is usually only in a secretive, almost illicit way as though neither can admit that it depends on the other. The superego turns a blind eye, allowing the id to run wild, and then acts surprised and disappointed when it discovers the transgression. Then ensues what is in essence a sadomasochistic mock-punishment, since the id secretly enjoys the flogging, and the superego knows it, but plays along. In short, the union between superego and id is possible through the hypocritical self-awareness of both parties that they depend on each other to exist. They throw themselves into their respective roles with even more gusto, maintaining as it were a secret conspiracy against the ego, keeping up the tension but with a knowing cynicism.

We now begin to see the first inklings of the mad scientist. The quintessential mad scientist is Dr. Jekyll and Mr. Hyde, whose two faces represent unmistakably a perverse union of superego and id; other examples in fiction abound. The mad scientist is in fact the manifestation in an individual character of the public’s view of scientific activity in general. Since (as Kuhn tells us) science is a human activity, its attributes can be traced to attributes of the human mind. In other words, science as an institution can be psychoanalyzed.

Science is defined on one hand by its rationality, its strict adherence to method, zero tolerance for transgression of its rules, and a claim to superiority in its judgements and conclusions about the world. On the other hand, science is a powerful vehicle for the realisation of our (human) fantasies: what technology is not born from the dream of a science-fiction nerd? Technology is transgressive in the same way that dreams are transgressive: there is no taboo in science, no political correctness, no boundaries. At their purest, science and technology are obscene, disturbing and visionary all at once. Medicine is born of the desire to be immortal; chemistry is born of our desire to have power over the substances and forces of the world, to make gold and riches from lead; physics is born of our desire to fly through the sky like a bird, to be invisible, telepathic, omnipotent. Biology promises us the power to make animals and other organisms serve our needs, and psychology offers us power over each other. Science, with all of its adherence to evidence, logic and deduction, remains silent on matters of its purpose, and has nothing to suggest about the ends to which it should be used. There lies hidden the id of science: an amoral, primitive, instinctive drive of humanity, just like the indignant infant trying to come to terms with the world. Without an effective intermediary in the form of public discussion and deliberation over scientific advances, science risks becoming a Sherlock without a Watson, that is, a Dr. Jekyll and Mr. Hyde.

Of course, just as it does in the individual’s psyche, the scientific id also plays a beneficial role: it supplies the creative drive and aesthetic sensibility without which science would be impossible. This is why we cannot divorce the id from the superego in science without destroying science altogether. Eliminate the id from Science, and you are left with a stagnant dogma; eliminate the superego, the methodology and tools of rational inquiry, and you are left with mysticism and superstition. The philosophy of science does an injustice to the true mechanism of scientific progress by focusing too much on the methodology – how to evaluate evidence and test hypotheses – and neglecting to address the aesthetic side of science.

How do we generate hypotheses? Where do ideas come from? Scientists themselves often don’t acknowledge the role that instinct and intuition play in proposing new theories – we tend to downplay it, or insist that science progresses without any creative input. If that were really true, computer programs could do science in the foreseeable future. But most of us consider the revolution of the machines to still be far away, for the simple reason that we don’t yet know how to teach computers to be creative and to select `good’ hypotheses from the vast pool of logically possible hypotheses. This is (so far) a uniquely human ability, which has everything to do with gut feelings, impulsive thoughts and secret desires. The philosophy of science would perhaps benefit greatly from a more careful examination of this hidden aspect of scientific progress.

My answer to the three monkeys’ question is this: The monkey who cannot speak is the id, because the id is voiceless. That leaves the blind monkey and the deaf monkey. It boils down to a matter of opinion here, but the argument that appeals to me most is this one: the superego has a closer relationship with the id than the ego does. Since the blind monkey can neither see nor hear the id (because the id can’t talk), but the deaf monkey can at least see the id, it stands to reason that the deaf monkey is the superego and the blind monkey is the ego.

# The trouble with Reichenbach

(Note: this blog post is vaguely related to a paper I wrote. You can find it on the arXiv here.)

Suppose you are walking along the beach, and you come across two holes in the rock, spaced apart by some distance; let us label them ‘A’ and ‘B’. You observe an interesting correlation between them. Every so often, at an unpredictable time, water will come spraying out of hole A, followed shortly after by a spray of water out of hole B. Given our day-to-day experience of such things, most of us would conclude that the holes are connected by a tunnel underneath the rock, which is in turn connected to the ocean, such that a surge of water in the underground tunnel causes the water to spray from the two holes at about the same time.

Now, therein lies a mystery: how did our brains make this deduction so quickly and easily? The mere fact of a statistical correlation does not tell us much about the direction of cause and effect. Two questions arise. First, why do correlations require explanations in the first place? Why can we not simply accept that the two geysers spray water in synchronisation with each other, without searching for explanations in terms of underground tunnels and ocean surges? Secondly, how do we know in this instance that the explanation is that of a common cause, and not that (for example) the spouting of water from one geyser triggers some kind of chain reaction that results in the spouting of water from the other?

The first question is a deep one. We have in our minds a model of how the world works, which is the product partly of history, partly of personal experience, and partly of science. Historically, we humans have evolved to see the world in a particular way that emphasises objects and their spatial and temporal relations to one another. In our personal experience, we have seen that objects move and interact in ways that follow certain patterns: objects fall when dropped and signals propagate through chains of interactions, like a series of dominoes falling over. Science has deduced the precise mechanical rules that govern these motions.

According to our world-view, causes always occur before their effects in time, and one way that correlations can arise between two events is if one is the cause of the other. In the present example, we may reason as follows: since hole B always spouts after A, the causal chain of events, if it exists, must run from A to B. Next, suppose that I were to cover hole A with a large stone, thereby preventing it from emitting water. If the occasion of its emission were the cause of hole B’s emission, then hole B should also cease to produce water when hole A is covered. If we perform the experiment and we find that hole B’s rate of spouting is unaffected by the presence of a stone blocking hole A, we can conclude that the two events of spouting water are not connected by a direct causal chain.

The only other way in which correlations can arise is by the influence of a third event — such as the surging of water in an underground tunnel — whose occurrence triggers both of the water spouts, each independently of the other. We could promote this aspect of our world-view to a general principle, called the Principle of the Common Cause (PCC): whenever two events A and B are correlated, then either one is a cause of the other, or else they share a common cause (which must occur some time before both of these events).

The Principle of Common Cause tells us where to look for an explanation, but it does not tell us whether our explanation is complete. In our example, we used the PCC to deduce that there must be some event preceding the two water spouts which explains their correlation, and for this we proposed a surge of water in an underground tunnel. Now suppose that the presence of water in this tunnel is absolutely necessary in order for the holes to spout water, but that on some occasions the holes do not spout even though there is water in the tunnel. In that case, simply knowing that there is water in the tunnel does not completely eliminate the correlation between the two water spouts. That is, even though I know there is water in the tunnel, I am not certain whether hole B will emit water, unless I happen to know in addition that hole A has just spouted. So, the probability of B still depends on A, despite my knowledge of the ‘common cause’. I therefore conclude that I do not know everything that there is to know about this common cause, and there is still information to be had.

It could be, for instance, that the holes will only spout water if the water pressure is above a certain threshold in the underground tunnel. If I am able to detect both the presence of the water and its pressure in the tunnel, then I can predict with certainty whether the two holes will spout or not. In particular, I will know with certainty whether hole B is going to spout, independently of A. Thus, if I had stakes riding on the outcome of B, and you were to try and sell me the information “whether A has just spouted”, I would not buy it, because it does not provide any further information beyond what I can deduce from the water in the tunnel and its pressure level. It is a fact of general experience that, conditional on complete knowledge of the common causes of two events, the probabilities of those events are no longer correlated. This is called the principle of Factorisation of Probabilities (FP). The union of FP and PCC together is called Reichenbach’s Common Cause Principle (RCCP).
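For readers who like to see the bookkeeping, here is a minimal sketch of the geyser example in Python. The probability that the pressure exceeds the threshold (`P_HIGH`) is a made-up number purely for illustration, as are all the names; the only thing taken from the text is that both holes spout exactly when the pressure is high enough.

```python
from fractions import Fraction

# Hypothetical number, not from the text: given that there is water in the
# tunnel, the pressure exceeds the spouting threshold one time in three.
P_HIGH = Fraction(1, 3)

# Joint distribution, given water in the tunnel, over
# (high_pressure, a_spouts, b_spouts). Both holes spout exactly when the
# pressure is high, so only two outcomes have non-zero probability.
given_water = {
    (True, True, True): P_HIGH,
    (False, False, False): 1 - P_HIGH,
}


def conditional(dist, event, given):
    """Return P(event | given) for predicates over (high, a, b)."""
    norm = sum(p for outcome, p in dist.items() if given(*outcome))
    hits = sum(p for outcome, p in dist.items()
               if given(*outcome) and event(*outcome))
    return hits / norm


def b_spouts(high, a, b):
    return b


# Knowing only that there is water: A still tells me something about B.
p_b = conditional(given_water, b_spouts, lambda high, a, b: True)
p_b_given_a = conditional(given_water, b_spouts, lambda high, a, b: a)

# Knowing the pressure as well: A adds nothing.
p_b_given_high = conditional(given_water, b_spouts, lambda high, a, b: high)
p_b_given_high_and_a = conditional(
    given_water, b_spouts, lambda high, a, b: high and a)

print(p_b, p_b_given_a)                      # 1/3 1
print(p_b_given_high, p_b_given_high_and_a)  # 1 1
```

With only the water known, learning that A spouted moves the probability of B from 1/3 to 1, so the information about A is worth buying. Once the pressure is known too, the two numbers agree and the information about A is worthless, which is the factorisation described above.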

In the above example, the complete knowledge of the common cause allowed me to perfectly determine whether the holes would spout or not. The conditional independence of these two events is therefore guaranteed. One might wonder why I did not talk about the principle of predetermination: conditional on complete knowledge of the common causes, the events are determined with certainty. The reason is that predetermination might be too strong; it may be that there exist phenomena that are irreducibly random, such that even a full knowledge of the common causes does not suffice to determine the resulting events with certainty.

As another example, consider two river beds on a mountain slope, one on the left and one on the right. Usually (96% of the time) it does not rain on the mountain and both rivers are dry. If it does rain on the mountain, then there are four possibilities with equal likelihood: (i) the river beds both remain dry, (ii) the left river flows but the right one is dry, (iii) the right river flows but the left is dry, or (iv) both rivers flow. Thus, without knowing anything else, the fact that one river is running makes it more likely that the other one is. However, conditional on it having rained on the mountain, if I know that the left river is flowing (or dry), this does not tell me anything about whether the right river is flowing or dry. So, it seems that after conditioning on the common cause (rain on the mountain) the probabilities factorise: knowing about one river tells me nothing about the other.
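The numbers in the two-rivers example are simple enough to check by brute force. The following is just a sketch using exact fractions; the 4% chance of rain and the four equally likely outcomes come from the example above, while the function and variable names are my own.

```python
from fractions import Fraction
from itertools import product

P_RAIN = Fraction(4, 100)  # it rains 4% of the time


def joint():
    """Return {(rain, left_flows, right_flows): probability}."""
    dist = {(False, False, False): 1 - P_RAIN}  # no rain: both rivers dry
    for left, right in product([False, True], repeat=2):
        # Given rain, the four outcomes are equally likely.
        dist[(True, left, right)] = P_RAIN * Fraction(1, 4)
    return dist


def prob(dist, pred):
    """Total probability of the outcomes satisfying the predicate."""
    return sum(p for outcome, p in dist.items() if pred(*outcome))


dist = joint()

# Without conditioning on anything, the rivers are correlated.
p_left = prob(dist, lambda rain, left, right: left)
p_right = prob(dist, lambda rain, left, right: right)
p_both = prob(dist, lambda rain, left, right: left and right)
print(p_both, p_left * p_right)  # 1/100 1/2500

# Conditional on rain, the joint probability factorises.
p_rain = prob(dist, lambda rain, left, right: rain)
p_left_rain = prob(dist, lambda rain, left, right: rain and left) / p_rain
p_right_rain = prob(dist, lambda rain, left, right: rain and right) / p_rain
p_both_rain = prob(
    dist, lambda rain, left, right: rain and left and right) / p_rain
print(p_both_rain, p_left_rain * p_right_rain)  # 1/4 1/4
```

Unconditionally, both rivers flow together 25 times more often than independence would predict (1/100 against 1/2500). Given rain, the joint probability equals the product of the marginals, so the correlation is gone.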

Now we have a situation in which the common cause does not completely determine the outcomes of the events, but where the probabilities nevertheless factorise. Should we then conclude that the correlations are explained? If we answer ‘yes’, we have fallen into a trap.

The trap is that there may be additional information which, if discovered, would make the rivers become correlated. Suppose I find a meeting point of the two rivers further upstream, in which sediment and debris tend to gather. If there is only a little debris, it will be pushed to one side (the side chosen effectively at random), diverting water to one of the rivers and blocking the other. Alternatively, if there is a large build-up of debris, it will either dam the rivers, leaving them both dry, or else be completely destroyed by the build-up of water, feeding both rivers at once. Now, if I know that it rained on the mountain and I know how much debris is present upstream, knowing whether one river is flowing will provide information about the other (e.g. if there is a little debris upstream and the right river is flowing, I know the left must be dry).

Before I knew anything, the rivers seemed to be correlated. Conditional on whether it rained on the mountain-top, the correlation disappeared. But now, conditional on rain on the mountain and on the amount of debris upstream, the correlation is restored! If the only tools I had to explain correlations were the PCC and the FP, then how could I ever be sure that the explanation is complete? Unless the information of the common cause is enough to predetermine the outcomes of the events with certainty, there is always the possibility that the correlations have not been explained, because new information about the common causes might come to light which renders the events correlated again.
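Here is a small sketch of how the correlation comes back once the debris is taken into account. I am assuming that little and large debris are equally likely, and that each debris level splits evenly between its two outcomes; the text does not give these numbers, but they are what is needed to recover the four equally likely outcomes of the original example. The names are again my own.

```python
from fractions import Fraction

QUARTER = Fraction(1, 4)

# Distribution given rain over (debris, left_flows, right_flows).
# Assumption: each debris level has probability 1/2 and splits evenly
# between its two possible outcomes.
given_rain = {
    ("little", True, False): QUARTER,   # debris pushed right, left flows
    ("little", False, True): QUARTER,   # debris pushed left, right flows
    ("large", False, False): QUARTER,   # debris dams both rivers
    ("large", True, True): QUARTER,     # debris destroyed, both flow
}


def prob(dist, pred):
    """Total probability of the outcomes satisfying the predicate."""
    return sum(p for outcome, p in dist.items() if pred(*outcome))


def factorises(dist, condition):
    """Check whether P(L, R | condition) == P(L | condition) P(R | condition)."""
    norm = prob(dist, condition)
    p_left = prob(dist, lambda d, l, r: condition(d, l, r) and l) / norm
    p_right = prob(dist, lambda d, l, r: condition(d, l, r) and r) / norm
    p_both = prob(dist, lambda d, l, r: condition(d, l, r) and l and r) / norm
    return p_both == p_left * p_right


rain_only = factorises(given_rain, lambda d, l, r: True)
rain_and_little = factorises(given_rain, lambda d, l, r: d == "little")
rain_and_large = factorises(given_rain, lambda d, l, r: d == "large")

print(rain_only, rain_and_little, rain_and_large)  # True False False
```

Ignoring the debris, the probabilities factorise just as before. Conditioning on either debris level destroys the factorisation: with little debris the rivers are perfectly anti-correlated, and with a large build-up they are perfectly correlated.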

Now, at last, we come to the main point. In our classical world-view, observations tend to be compatible with predetermination. No matter how unpredictable or chaotic a phenomenon seems, we find it natural to imagine that every observed fact could be predicted with certainty, in principle, if only we knew enough about its relevant causes. In that case, we are right to say that a correlation has not been fully explained unless Reichenbach’s principle is satisfied. But this last property is now just seen as a trivial consequence of predetermination, implicit in our world-view. In fact, Reichenbach’s principle is not sufficient to guarantee that we have found an explanation. We can only be sure that the explanation has been found when the observed facts are fully determined by their causes.

This poses an interesting problem to anyone (like me) who thinks the world is intrinsically random. If we give up predetermination, we have lost our sufficient condition for correlations to be explained. Normally, if we saw a correlation, after eliminating the possibility of a direct cause we would stop searching for an explanation only when we found one that could perfectly determine the observations. But if the world is random, then how do we know when we have found a good enough explanation?

In this case, it is tempting to argue that Reichenbach’s principle should be taken as a sufficient (not just necessary) condition for an explanation. Then, we know to stop looking for explanations as soon as we have found one that causes the probabilities to factorise. But as I just argued with the example of the two rivers, this doesn’t work. If we believed this, then we would have to accept that it is possible for an explained correlation to suddenly become unexplained upon the discovery of additional facts! Short of a physical law forbidding such additional facts, this makes for a very tenuous notion of explanation indeed.

The question of what should constitute a satisfactory explanation for a correlation is, I think, one of the deepest problems posed to us by quantum mechanics. The way I read Bell’s theorem is that (assuming that we accept the theorem’s basic assumptions) quantum mechanics is either non-local, or else it contains correlations that do not satisfy the factorisation part of Reichenbach’s principle. If we believe that factorisation is a necessary part of explanation, then we are forced to accept non-locality. But why should factorisation be a necessary requirement of explanation? It is only justified if we believe in predetermination.

A critic might try to argue that, without factorisation, we have lost all ability to explain correlations. But I’m saying that this is true even for those who would accept factorisation but reject predetermination. I say, without predetermination, there is no need to hold on to factorisation, because it doesn’t help you to explain correlations any better than the rest of us non-determinists! So what are we to do? Maybe it is time to shrug off factorisation and face up to the task of finding a proper explanation for quantum correlations.

# Jacques Pienaar’s guide to making physics (Pt.1)

PRINCIPLES AS TOOLS
(Not to be confused with using Principals as tools, which is what happens if your school Principal is a tool because he never taught you the difference between a Principal and a principle. Also not to be confused with a Princey-pal, who is a friend that happens to be a Prince).

`These principles are the boldly generalized results of experiment; but they appear to derive from their very generality a high degree of certainty. In fact, the greater the generality, the more frequent are the opportunities for verifying them, and such verifications, as they multiply, as they take the most varied and most unexpected forms, leave in the end no room for doubt.’ -Poincaré

One of the great things Einstein did, besides doing physics, was trying to explain to people how to do it as well as he did. Ultimately he failed, because so far nobody has managed to do better than him, but he left us with some really interesting insights into how to come up with new physical theories.

One of these ideas is the concept of using `principles’. A principle is a statement about how the world works (or should work), stated in ordinary language. Principles are not always called principles, but might be called laws, postulates or hypotheses. I am not going to argue about semantics here. Just consider these examples to get a flavour:

The Second Law of Thermodynamics: You can’t build an engine which does useful work and ends up back in its starting position without producing any heat.

Landauer’s principle: you can’t erase information without producing heat.

The Principle of Relativity: It is impossible to tell by local experiments whether or not your laboratory is moving.

And some not strictly physics ones:

Shirky’s law: Institutions will try to preserve the problem to which they are the solution.

Murphy’s law: If something can go wrong, it will go wrong.

Stigler’s law: No scientific discovery is named after its original discoverer (this law was actually discovered by R.K. Merton, not Stigler).

Parkinson’s law: Work always expands to fill up the time allocated to doing it.

(See Wikipedia’s list of eponymous laws for more.)

You’ll notice that principles are characterised by two main things: they ring true, and they are vague. Both of these properties are very important for their use in building theories.

Now I can practically hear the lice falling out as you scratch your head in confusion. “But Jacques! How can vagueness be a useful thing to have in a Principle? Shouldn’t it be made as precise as possible?”

No, doofus. A Principle is like an apple. You know what an apple is right?

Well, you think you do. But if I were to ask you, what colour is an apple, how sweet is an apple, how many worms are in an apple, you would have to admit that you don’t know, because the word “apple” is too vague to answer those questions. It is like asking how long is a piece of string. Nevertheless, when you want to go shopping, it suffices to say “buy me an apple” instead of “buy me a Malus domestica, reflective in the 620-750 nanometer range, ten percent sugar, one percent Cydia pomonella”.

The only way to make a principle more precise is within the context of a precise theory. But then how would I build a new theory, if I am stuck using the language of the old theory? I can make the idea of an apple more precise using the various scientifically verified properties that apples are known to have, but all of that stuff had to come after we already had a basic vague understanding of what an “apple” was, e.g. a kind of round-ish thing on a tree that tastes nice when you eat it.

The vagueness of a principle means that it defines a whole family of possible theories, these being the ones that kind of fit with the principle if you take the right interpretation. On one hand, a principle that is too vague will not help you to make progress, because it will be too easy to make it fit with any future theory; on the other hand, a principle that is not vague enough will leave you stuck for choices and unable to progress.

The next aspect of a good principle is that it “rings true”. In other words, there is something about it that makes you want it to be true. We want our physical theories to be intuitive to our soft, human brains, and these brains of ours have evolved to think about the world in very specific terms. Why do you think physics seems to be all about the locations of objects in space, moving with time? There are infinitely many ways to describe physics, but we choose the ones we do because of the way our physical senses work, the way our bodies interact with the world, and the things we needed to do in order to survive up to this point. What is the principle of least action? It is a river flowing down a mountain. What is Newtonian mechanics? It is animals moving on the plains. We humans need to see the world in a special way in order to understand it, and good principles are what allow us to shoehorn abstract concepts like thermodynamics and gravitational physics into a picture that looks familiar to us, that we can work with.

That’s why a good principle has to ring true — it has to appeal to the limited imaginative abilities of us humans. Maybe if we were different animals, the laws of physics would be understood in very different terms. Like, the Newtonian mechanics of snakes would start with a simple model of objects moving along snake-paths in two dimensions (the ground), and then go from there to arbitrary motions and higher dimensions. So intelligent snakes might have discovered Fourier analysis way before humans would have, just because they would have been more used to thinking in wavy motions instead of linear motions.

So you see, coming up with good principles is really an art form, that requires you to be deeply in touch with your own humanity. Indeed, principle-finding is part of the great art of generating hypotheses. It is a pity that many scientists don’t practice hypothesis generation enough to realise that it is an art (or maybe they don’t practice art enough?) It is also ironic that science tries so hard to eliminate the human element from the theories, when it is so apparent in the final forms of the theories themselves. It is just like an artist who trains so hard to hide her brush strokes, to make the signature of her hand invisible, even though the subject of the painting is her own face.

Ok, now that we know what principles are, how do we find them? One of the best ways is by the age-old method of Induction. How does induction work? It really deserves its own post, but here it is in a nutshell. Let’s say that you are a turkey, and you observe that whenever the farmer makes a whistle, there is some corn in your bowl. So, being a smart turkey, you might decide to elevate this empirical pattern to a general principle, called the Turkey Principle: whenever the farmer whistles, there is corn in your bowl. BOOM, induction!

Now, what is the use of this principle? It helps you to narrow down which theories are good and which are bad. Suppose one day the farmer whistles but you discover there is not corn in the bowl, but rather rice. With your limited turkey imagination, you are able to come up with three hypotheses to explain this. 1. There was corn in the bowl when the farmer whistled, but then somebody came along and replaced it with rice; 2. the Turkey Principle should be amended to the Weak Turkey Principle, which states that when the farmer whistles, food, but not necessarily corn, will be in the bowl; 3. the contents of the bowl are actually independent of the farmer’s whistling, and the apparent link between these phenomena is just a coincidence. Now, with the aid of the Principle, we can see that there is a clear preference for hypothesis 1 over 2, and for 2 over 3, according to the extent that each hypothesis fits with the Turkey Principle.

This example makes it clear that deciding which patterns to upgrade to general principles, and which to regard as anomalies, is again a question of aesthetics and artistry. A more perceptive turkey might observe that the farmer is not a simple mechanistic process, but a complex and mysterious system, and therefore may not be subject to such strong constraints with regard to his whistling and corn-giving behaviour as are implied by the Turkey Principle. Indeed, were the turkey perceptive enough to guess at the farmer’s true motives, he might start checking the tool shed to see if the axe is missing before running to the food bowl every time the farmer whistles. But this turkey would no doubt be working on hypotheses of his own, motivated by principles of his own, such as the Farmer-is-Not-to-be-Trusted Principle (in connection with the observed correlation of turkey disappearances and family dinner parties).

An example more relevant to physics is Einstein’s Equivalence Principle: that no local experiment can determine whether the laboratory is accelerating, or is stationary in a gravitational field. The principle is vague, as you can see by the number of variations, interpretations, and Weak and Strong versions that exist in the literature; but undoubtedly it rings true, since it appears to be widely obeyed by all but the most esoteric phenomena, and it gels nicely with the Principle of Relativity. While the Equivalence Principle was instrumental in leading to General Relativity, it is a matter of debate how it should be formulated within the theory, and whether or not it is even true. Much like hammers and saws are needed to make a table, but are not needed after the table is complete, we use principles to make theories and then we set them aside when the theory is complete. The final theory makes predictions perfectly well without needing to refer to the principles that built it, and the principles are too vague to make good predictions on their own. (Sure, with enough fiddling around, you can sit on a hammer and eat food off a saw, but it isn’t really comfortable or easy).

For more intellectual reading on principle theories, see the SEP entry on Einstein’s Philosophy of Science, and Poincaré’s excellent notes.

# Wigner has no friends in space

The title phrase of this post is taken from an article by Seth Lloyd that appeared on today’s arXiv, entitled “Analysis of a work of quantum art”. Lloyd was talking about an artwork in collaboration with artist Diemut Strebe, called `Wigner’s friends’, in which a pair of telescopes are separated, one remaining on Earth and the other going to the International Space Station. According to Lloyd, Strebe motivates the work by appealing to the concepts of quantum superposition and entanglement, referring to physicist Eugene Wigner’s famous thought experiment in which one experimenter, Wigner’s friend, finds herself in a superposition prior to Wigner’s measurement. In Strebe’s scenario, both telescopes are aimed at interstellar space, and it is the viewers of the exhibition that are held responsible for collapsing the superposition of the orbiting telescope by observing the image on the ground-based telescope. The idea is that, since there is nobody looking at the orbiting telescope, the image on its CCD array initially exists in a quantum superposition of all possible artworks; hence Wigner has no friends in space. Before I discuss this intriguing work, let me first start a new art movement.

I was doing my PhD at the University of Queensland when my friend Aggie (also a PhD student at that time) came to me with an intriguing problem. She needed to integrate a function over a certain region of three-dimensional space. This region could be obtained by slicing corners off a cube in a certain way, but Aggie was finding it impossible to visualize what the resulting shape would look like. Even after doing a 3D plot in Mathematica, she felt that there was something missing from the flattened projections that one had to click-and-drag to rotate. She wanted to know if I’d ever seen this shape before, and if I could maybe draw it for her or make one out of paper and glue. (Weirdly, I have always had an undeserved reputation for drawing and origami.) I did my best with paper and sticky-tape, but it didn’t quite come out right, so I gave up. In the end, she went and bought some plasticine and made a cube, then cut off the corners until she got the shape she wanted. Now that she could hold it in her hands, she finally felt that she understood just what she was dealing with. She went back to her computer to perform the integration.

At the time, it did not occur to me to ask “Is it art?” While its form was elegant, it was there to serve a practical purpose, namely to help Aggie (who probably did not once suspect that she was doing Art) in her calculation by condensing certain abstract ideas into a concrete form.

Disclaimer: Before continuing, please note that I reject the idea that there can be a universal definition of Art. I further reject the (often claimed) corollary that therefore anything and everything can be Art. Instead, I posit that there are many different Arts, and just like living species, they are continually springing into existence, evolving into new forms, and going extinct. Just as a discussion about “what is a species” can lead to interminable arguments, I posit that it is much better and more constructive to discuss “what is a lion”? Here, I am going to talk about, and attempt to define, something that might be called Science-Art, Technologism, Scientism, or something like that. Let’s go with `Zappism’, because it reminds me of things that supposedly go `zap’, but really don’t, like lasers.

So what is Zappism? Let me give some examples of what it is and what it is not. Every now and then, there are Art in Science exhibitions where academic researchers submit images of pretty things that they encountered in the course of their research. I include in this category colourful images of fractals, decorated graphs of pretty mathematical functions, astrophysical images of planets and stars and things, and basically anything where a scientist was just mucking around and noticed something beautiful and then made it into a graphic. For this stuff I would suggest the name “Scientific Found Art”, but it is not Zappism.

Aggie’s shape might seem at first to fit the bill of found art, but there is a crucial difference: were the shape not pretty, it still would have served its purpose, which was to explore, in material form, scientific ideas that would otherwise have been elusive and abstract. A computer simulation of a fractal does not serve this purpose unless one also comes to understand the fractal better as a consequence of the simulation, and I’m not convinced this is true any more than one can understand a sentence better by writing it out in binary and then colouring it in.

Zappism is the art of using some kind of medium — be it painting, film, music, literature or something else — and using it to transform some ethereal and ungraspable Platonisms of science into things the human mind can more readily play with. Sometimes something is lost in translation, like adding unscientific `zap’ sounds to lasers, but this is acceptable as long as the core idea is translated — in the case of lasers, the idea that light can be focused into beams that can burn through things.

Many episodes of Star Trek exhibit Zappism. In the episode `Tuvix’, the transporter merges two crew members into a single person, an incident that is explicitly explained by appealing to the way the transporter recombines matter. Similarly, Cronenberg’s film The Fly is classic Zappism, as is Spielberg’s Jurassic Park. Indeed, any science fiction that uses science in an active way almost can’t help but be Zappist. Science fiction can still fail to be Zappist if it uses the science as a kind of gloss or sugar-coating, instead of engaging with the science as a main ingredient. Star Wars is not really Zappist because it is not concerned with the mechanisms of the technology invoked. Luke and Darth might as well be using swords and riding on flying horses for all the story cares, making it more like Science Fantasy. (Why do lightsabers simply stop at a convenient sword-length?)

A science fiction movie can always ignore inconvenient facts, like conservation of momentum, or how there is no sound in space. These annoying truths are often seen as getting in the way of good action and drama. The truth is the opposite: it takes a creative leap of genius to see how to use these facts to dramatic advantage. The recent film Coherence does a brilliant job of using the idea of Schrödinger’s Cat to create a tense and frightening scenario. When film, art and storytelling are able to incorporate physical law in a natural and graspable way, we are one step closer to connecting the public to cutting-edge science.

On the non-cinematic side, Koen Vanmechelen’s breeding program for cosmopolitan chickens, Maguire and collaborators’ epic project ‘Dr. Brainlove’, and Theo Jansen’s Strandbeest could all be called examples of Zappism. But perhaps the most revealing examples are those that do not explicitly use physical technology for the scientific motive, but instead use abstract ideas. For these I cite Dalí’s Persistence of Memory (and its Disintegration) with their roots in Relativity theory and Quantum Mechanics; the book Flatland by Edwin Abbott; Alice in Wonderland by Carroll; Gödel, Escher, Bach: An Eternal Golden Braid by Hofstadter; and similar books that bring abstract scientific or mathematical ideas into an imaginable form. A truly great work of Zappism was the invention of the Rubik’s Cube, by the Hungarian sculptor and mathematician Ernő Rubik. Rubik conceived the cube as a solution to a more abstract structural design problem of how to rotate the parts of a cube in all three dimensions while keeping the parts connected.

Returning now to Strebe’s artwork ‘Wigner’s friends’, it should be remarked that the artwork is not a scientific experiment and there is no actual demonstration of quantum coherence between the telescopes. However, Seth Lloyd for some reason seems intent on defending the idea that maybe, just maybe, there is some tiny smidgen of possibility that there is something quantum going on in the experiment. I understand his enthusiasm: I also think it is a very cool artwork, and somehow the whole point of the artwork is its reference to quantum mechanics. But in order to plausibly say that something quantum was really going on in Strebe’s artwork, Lloyd is forced to invoke the Many Worlds interpretation, which to me is tantamount to begging the question: under that assumption, isn’t my cheese sandwich also in a quantum superposition?

I don’t see why all this is necessary: when Dali painted the Disintegration of the Persistence of Memory, nobody was scrambling to argue that his oil paint was in a quantum superposition on the canvas. It would be just as absurd as insisting that Da Vinci’s portrait of the Mona Lisa actually contained a real person. There is a sense in which the artistic representation of a person is bound to physics — it is constrained to some extent by the way physical masses compose in three dimensional space — but the art of correct representation is not to be confused with the real thing. Even Mondrian, whose works were famously highly abstract, insisted that he was bound to the true representation of Nature as he saw it [1]. To me, Strebe’s artwork is a representation of quantum mechanics, put into a physical and graspable form, and that is what makes it Zappism. But is it good Zappism? That depends on whether the audience feels any closer to understanding quantum mechanics after the experience.

[1] “The masses generally find my work rather vague. I construct lines and color combinations on a flat surface, in order to express general beauty with the utmost awareness. Nature (or that which I see) inspires me . . . but I want to come as close as possible to the truth…” Source: http://www.comesaunter.com/2012/02/piet-mondrian-on-his-art.html

# Ten Rules for Research

I see a lot of articles out there giving advice in the form of a list of rules. People have a fascination with rule lists. You’ve got the rules of Fight Club, the writer who uses a personal formula, policemen who follow “The Book” to the letter, gangsters with a personal code of ethics, and so on. So here’s my list of rules for being a scientist.

2. The value of public speaking skills cannot be overestimated.

3. Remember the big questions that got you here in the first place.

4. Take philosophy seriously, but only the parts you can understand.

5. Sometimes, you just have to shut up and calculate.

6. Don’t distract yourself from the things you don’t know by working on things you do know.

7. The best defense against politics is integrity and a smile.

8. The more certain you are of a result, the more you should double check it.

9. If you aren’t curious to know the result of a calculation, it isn’t worth doing it.

10. Ask dumb questions. If you are truly an idiot, you’ll be found out eventually, so you might as well satisfy your curiosity in the meantime.

In the end, I think Rule 1 is most important. So, you should go and read Michael Nielsen’s classic advice to researchers, which is far more eloquent than the garbage you read on my blog.

# Time-travel, decoherence, and satellites.

I recently returned to my roots, contributing to a new paper with Tim Ralph (who was my PhD advisor) on the very same topic that formed a major part of my PhD. Out of laziness, let me dig up the relevant information from an earlier post:

“The idea for my PhD thesis comes from a paper that I stumbled across as an undergraduate at the University of Melbourne. That paper, by Tim Ralph, Gerard Milburn and Tony Downes of the University of Queensland, proposed that Earth’s own gravitational field might be strong enough to cause quantum gravity effects in experiments done on satellites. In particular, the difference between the strength of gravity at ground-level and at the height of the orbiting satellite might be just enough to make the quantum particles on the satellite behave in a very funny non-linear way, never before seen at ground level. Why might this happen? This is where the story gets bizarre: the authors got their idea after looking at a theory of time-travel, proposed in 1991 by David Deutsch. According to Deutsch’s theory, if space and time were bent enough by gravity to create a closed loop in time (aka a time machine), then any quantum particle that travelled backwards in time ought to have a very peculiar non-linear behaviour. Tim Ralph and co-authors said: what if there was only a little bit of space-time curvature? Wouldn’t you still expect just a little bit of non-linear behaviour? And we can look for that in the curvature produced by the Earth, without even needing to build a time-machine!”

In our recent paper in New Journal of Physics, for the special Focus on Gravitational Quantum Mechanics, Tim and I re-examined the ‘event formalism’ (the fancy name for the nonlinear model in question), derived some more practical numerical predictions, and ironed out a couple of theoretical wrinkles, making it more presentable as an experimental proposal. Now that there is growing interest in quantum gravity phenomenology (that is, testable toy models of quantum gravity effects), Tim’s little theory has an excitingly real chance of being tested and proven either right or wrong. Either way, I’d be curious to know how it turns out! On one hand, if quantum entanglement survives the test, the experiment would stand as one of the first real confirmations of quantum field theory in curved space-time. On the other hand, if the entanglement is destroyed by Earth’s gravitational field, it would signify a serious problem with the standard theory and might even confirm our alternative model. That would be great too, but also somewhat disturbing, since non-linear effects are known to have strange and confusing properties, such as violating the fabled uncertainty principle of quantum mechanics.

You can see my video debut here, in which I give an overview of the paper, complete with hand-drawn sketches!

(Actually there is a funny story attached to the video abstract. The day I filmed the video for this, I had received a letter informing me that my application for renewal of my residence permit in Austria was not yet complete — but the permit itself had expired the previous day! As a result, during the filming I was half panicking at the thought of being deported from the country. In the end it turned out not to be a problem, but if I seem a little tense in the video, well, now you know why.)