One of the stated goals of quantum foundations is to find a set of intuitive physical principles that can be stated in plain language and from which the essential structure of quantum mechanics can be derived.
So what exactly is wrong with the axioms proposed by Chiribella et al. in arXiv:1011.6451? Loosely speaking, the principles state that information should be localised in space and time, that systems should be able to encode information about each other, and that every process should in principle be reversible, so that information is conserved. The axioms can all be explained using ordinary language, as demonstrated in the sister paper arXiv:1209.5533. They all pertain directly to the elements of human experience, namely, what real experimenters ought to be able to do with the systems in their laboratories. And they all seem quite reasonable, so that it is easy to accept their truth. This is essential, because it means that the apparently counterintuitive behaviour of QM is directly derivable from intuitive principles, much as the counterintuitive aspects of special relativity follow as logical consequences of its two intuitive axioms, the constancy of the speed of light and the relativity principle. Given these features, maybe we can finally say that quantum mechanics makes sense: it is the only way that the laws of physics can lead to a sensible model of information storage and communication!
Let me run through the axioms briefly (note to the wise: I take the ‘causality’ axiom as implicit, and I’ve changed some of the names to make them sound nicer). I’ll assume the reader is familiar with the distinction between pure states and mixed states, but here is a brief summary. Roughly, a pure state describes a system about which you have maximum information, whereas a mixed state can be interpreted as uncertainty about which pure state the system is really in. Importantly, a pure state does not need to determine the outcome of every measurement that could be performed on the system: even though it contains maximal information about the system, it might only specify the probabilities of what will happen in any given experiment. This is what we mean when we say a theory is ‘probabilistic’.
First axiom (Distinguishability): if there is at least one pure state that a given mixed state cannot possibly be, with any probability, then the mixed state must be perfectly distinguishable from some other state (presumably, the aforementioned one). It is hard to imagine how this rule could fail: if I have a bag that contains either a spider or a fly with some probability, I should have no problem distinguishing it from a bag that contains a snake. On the other hand, I can’t so easily tell it apart from another bag that simply contains a fly (at least not in a single trial of the experiment).
Second axiom (Compression): If a system contains any redundant information or ‘extra space’, it should be possible to encode it in a smaller system such that the information can be perfectly retrieved. For example, suppose I have a badly edited book containing multiple copies of some pages, and a few blank pages at the end. I should be able to store all of the information written in the book in a much smaller book, without losing any information, just by removing the redundant copies and blank pages. Moreover, I should be able to recover the original book by copying pages and adding blank pages as needed. This seems like a pretty intuitive and essential feature of the way information is encoded in physical systems.
Third axiom (Locality of information): If I have a joint system (say, of two particles) that can be in one of two different states, then I should be able to distinguish the two different states over many trials, by performing only local measurements on each individual particle and using classical communication. For example, we allow the local measurements performed on one particle to depend on the outcomes of the local measurements on the other particle. On the other hand, we do not need to make use of any other shared resources (like a second set of correlated particles) in order to distinguish the states. I must admit, out of all the axioms, this one seems the hardest to justify intuitively. What indeed is so special about local operations and classical communication that it should be sufficient to tell different states apart? Why can’t we imagine a world in which the only way to distinguish two states of a joint system is to make use of some other joint system? But let us put this issue aside for the moment.
Fourth axiom (Locality of ignorance): If I have two particles in a joint state that is pure (i.e. I have maximal information about it) and if I measure one of them and find it in a pure state, the axiom states that the other particle must also be in a pure state. This makes sense: if I do a measurement on one subsystem of a pure state that results in still having maximal information about that subsystem, I should not lose any information about the other subsystems during the process. Learning new information about one part of a system should not make me more ignorant of the other parts.
So far, all of the axioms described above are satisfied by classical and quantum information theory. Therefore, at the very least, if any of these axioms does not seem intuitive, it is only because we have not developed our intuitions about classical physics well enough, so this cannot really be taken as a fault of the axioms themselves (which is why I am not so concerned about the detailed justification for axiom 3). The interesting axiom is the last one, ‘purification’, which holds in quantum physics but not in probabilistic classical physics.
Fifth axiom (Conservation of information) [aka the purification postulate]: Every mixed state of a system can be obtained by starting with several systems in a joint pure state, and then discarding or ignoring all except for the system in question. Thus, the mixedness of any state can be interpreted as ignorance of some other correlated systems. Furthermore, we require that the purification be essentially unique: all possible pure states of the total set of systems that do the job must be convertible into one another by reversible transformations.
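To make the postulate a little more concrete, here is a minimal sketch in plain Python, under my own simplifying assumptions: the mixed state is diagonal with real probabilities, and the helper names `purify` and `reduced_state` are hypothetical, not taken from either paper. It builds the joint pure state whose amplitudes are sqrt(p_i) on |i>|i>, and then checks that discarding the second system gives back the original mixed state.

```python
import math

def purify(probs):
    """Amplitudes of the joint pure state sum_i sqrt(p_i) |i>|i>.

    Returned as a dict {(i, j): amplitude}; assumes a diagonal mixed state.
    """
    return {(i, i): math.sqrt(p) for i, p in enumerate(probs)}

def reduced_state(psi, dim):
    """Density matrix of the first system after discarding the second."""
    rho = [[0.0] * dim for _ in range(dim)]
    for (i, k), a in psi.items():
        for (j, l), b in psi.items():
            if k == l:  # partial trace: match the discarded system's index
                rho[i][j] += a * b  # amplitudes are real in this sketch
    return rho

probs = [0.25, 0.75]          # a mixed state of one two-level system
psi = purify(probs)           # a pure state of two such systems
rho = reduced_state(psi, len(probs))
norm = sum(a * a for a in psi.values())
```

Here `rho` comes out diagonal with entries matching `probs`, and `norm` is 1, so the joint state is a legitimate pure state. Applying any reversible transformation to the second system alone gives another purification of the same mixed state, which is the sense in which the purification is only ‘essentially’ unique.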
As stated above, it is not so clear why this property should hold in the world. However, it makes more sense if we consider one of its consequences: every irreversible, probabilistic process can be obtained from a reversible process involving additional systems, which are then ignored. In the same way that statistical mechanics allows us to imagine that we could unscramble an egg, if only we had complete information about its individual atoms and the power to rearrange them, the purification postulate says that everything that occurs in nature can be undone in principle, if we have sufficient resources and information. Another way of stating this is that the loss of information that occurs in a probabilistic process is only apparent: in principle the information is conserved somewhere in the universe and is never lost, even though we might not have direct access to it. The ‘missing information’ in a mixed state is never lost forever, but can always be accessed by some observer, at least in principle.
It is curious that probabilistic classical physics does not obey this property. Surely it seems reasonable to expect that one could construct a probabilistic classical theory in which information is ultimately conserved! In fact, if one attempts this, one arrives at a theory of deterministic classical physics. In such a theory, having maximal knowledge of a state (i.e. the state is pure) further implies that one can perfectly predict the outcome of any measurement on the state, but this means the theory is no longer probabilistic. Indeed, for a classical theory to be probabilistic in the sense that we have defined the term, it necessarily allows processes in which information is irretrievably lost, violating the spirit of the purification postulate.
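The classical side of this claim can be checked by brute force in a toy case. The sketch below (the two-bit setting and the function names are illustrative assumptions of mine) lists every pure joint state of two classical bits, that is, every state of maximal knowledge, and confirms that each one leaves the first bit with a definite value. A fair coin therefore never shows up as the marginal of a pure classical state.

```python
from itertools import product

def marginal(joint, n):
    """Distribution of the first bit, ignoring the second."""
    m = [0.0] * n
    for (a, b), p in joint.items():
        m[a] += p
    return m

def is_pure(dist):
    """A classical state is pure when one outcome has probability 1."""
    return max(dist) == 1.0

# Every pure joint state of two bits is a point distribution on one pair (a, b).
pure_joint_states = [{(a, b): 1.0} for a, b in product(range(2), repeat=2)]
marginals = [marginal(joint, 2) for joint in pure_joint_states]

all_marginals_pure = all(is_pure(m) for m in marginals)
fair_coin_purifiable = [0.5, 0.5] in marginals
```

In this toy setting `all_marginals_pure` is true and `fair_coin_purifiable` is false: the only way to get a mixed marginal is to start from a mixed joint state, so the ignorance can never be pushed entirely onto a discarded system.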
In conclusion, I’d say this is pretty close to the mystical “Zing” that we were looking for: quantum mechanics is the only reasonable theory in which processes can be inherently probabilistic while at the same time conserving information.