Warning: This article deals with boring computer statistics that only a computer nerd could love. So if you are a normal person, you better skip this one.
As some devil like Machiavelli (author of “Il Principe”) or Sun Tzu (author of “The Art of War”) would advise, we quantum computerists should study our competition if we want to increase our chances of beating it someday. The main competition for quantum computers is parallel supercomputers. It looks like QCs won’t outperform supercomputers at most tasks, but they may outperform them, and by a very wide margin, at some special tasks, such as MCMC (Markov chain Monte Carlo) and simulation of quantum systems.
One way to start learning about the weaknesses of our supercomputer brethren/competitors is to look at the excellent compendium of supercomputer stats published biannually by top500.org. Their latest list is for June 2010 and can be viewed here in html format. (A much more detailed list in Excel format can be obtained here. The Excel one is what I used to generate Figs.2,3,4).
The news flash for this semester is that China is nipping at the heels of the USA. Nebulae (Rmax=1.27e15 FLOPS), owned by China, is currently the second fastest (of those that submitted stats) supercomputer in the world, after Jaguar (Rmax=1.75e15 FLOPS), owned by the USA. Fig.1, taken from the top500.org website, shows how various countries are faring in the supercomputing sweepstakes. Sadly, India, Africa and South America are nowhere to be seen, and even mighty Japan has been falling behind for more than a decade.
Given such nice data (and already in Excel format to boot!), being the nerdy scientist that I am, I couldn’t resist making some plots (so easy to do with Excel or R once the data has been entered into a spreadsheet!). So, patient reader, if you are still with me, and you are a weirdo like me, you might enjoy the following 3 beautiful plots that I generated:
- Fig.2 A scatterplot of Speed = Rmax (FLOPS) versus Number of Cores
- Fig.3 A scatterplot of Power (Watts) versus Number of Cores
- Fig.4 A scatterplot of Power/Speed (Joules per Operation) versus Number of Cores
The moral of Fig.2 is that, as one would expect, there is much variation depending on technology, but for our best technology, Speed scales linearly with Number of Cores, at a rate of about 2e14 FLOPS/(1e5 cores) ~ 2e9 FLOPS/core.
In Fig.3, we again see variation due to technology, but for our best technology, Power Consumption scales linearly with Number of Cores, at a rate of about 1e6 Watts/(1e5 cores) ~ 10 Watts/core.
Since both Power and Speed scale linearly with Number of Cores, Power/Speed = Energy Per Flop is approximately constant as a function of the Number of Cores. Fig.4 confirms this. For our best technology, we get from Fig.4 about 3e-9 Watts/FLOPS = 3e-9 Joules/Floating Point Op (Watts are Joules per second and FLOPS are operations per second, so the seconds cancel).
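As a quick sanity check on the arithmetic, here is a minimal Python sketch that recomputes the per-core rates and the energy per flop from the two slopes quoted above. The input numbers are only my eyeball readings of Figs.2 and 3, so treat the output as an order-of-magnitude estimate, not a measurement.

```python
# Slopes read off Figs. 2 and 3, as quoted in the text (eyeball estimates).
cores = 1e5
speed_flops = 2e14   # Rmax at about 1e5 cores, in FLOPS
power_watts = 1e6    # power draw at about 1e5 cores, in Watts

flops_per_core = speed_flops / cores                # FLOPS per core
watts_per_core = power_watts / cores                # Watts per core
joules_per_flop = watts_per_core / flops_per_core   # (J/s) / (ops/s) = J/op

print(f"{flops_per_core:.1e} FLOPS/core")
print(f"{watts_per_core:.1f} Watts/core")
print(f"{joules_per_flop:.1e} Joules/flop")
```

Dividing the two slopes gives about 5e-9 Joules/flop, the same order of magnitude as the 3e-9 read directly off Fig.4. The gap is well within the slop of reading numbers off scatterplots.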
For comparison, Boltzmann’s constant is k = 1.4e-23 Joules/Kelvin. At room temperature, 25 °C (77 °F, 298 K), 1 kT is equivalent to 4.1e-21 Joules.
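To put the two numbers side by side, the following small Python sketch computes kT at 298 K and the ratio of the energy per flop to kT. The 3e-9 Joules/flop input is the rough Fig.4 value from above, so the ratio is only good to an order of magnitude.

```python
K_BOLTZMANN = 1.380649e-23   # Joules/Kelvin (exact SI value)
T_ROOM = 298.0               # Kelvin, i.e. about 25 degrees C
JOULES_PER_FLOP = 3e-9       # rough value read off Fig.4

kT = K_BOLTZMANN * T_ROOM          # thermal energy scale at room temperature
ratio = JOULES_PER_FLOP / kT       # energy per flop in units of kT

print(f"kT at {T_ROOM:.0f} K = {kT:.2e} Joules")
print(f"energy per flop = {ratio:.1e} kT")
```

So one floating point operation on today's best supercomputers costs something like 7e11 kT, almost 12 orders of magnitude above the thermal energy scale.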
And what would be the analogous metrics for QCs? Too early to tell, but let me speculate. For QCs, instead of number of cores, we can use number of qubits. And instead of number of floating point ops, we can count the number of CNOTs (see footnote). I expect that Power Consumption and CNOTs/sec will both scale linearly with number of qubits, and therefore alpha = Joules/CNOT will be independent of number of qubits. alpha will vary depending on which QC hardware type (ion traps, Josephson junctions, etc.) we use. Of course, alpha = Joules/Flop for classical computers and alpha = Joules/CNOT for QCs will differ. The great advantage of QCs over classical computers is that for certain fixed tasks like MCMC, the number of CNOTs required by a QC will go as the square root of the number of floating point operations required by a classical supercomputer.
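To see what the square-root scaling could mean for energy, here is a speculative Python sketch. It assumes the number of CNOTs is exactly sqrt(number of flops), with a prefactor of 1 and no overhead, which is an idealization. The function names and the task size of 1e18 flops are hypothetical, chosen only for illustration. Under those assumptions it computes the break-even Joules/CNOT, i.e. the largest alpha a QC could have and still use no more total energy than the classical machine.

```python
import math

def qc_cnot_count(n_flops: float) -> float:
    """CNOTs needed by the QC, ASSUMING exact square-root scaling (prefactor 1)."""
    return math.sqrt(n_flops)

def break_even_alpha_q(n_flops: float, alpha_c: float) -> float:
    """Joules/CNOT at which the QC's total energy equals the classical total.

    Classical energy = alpha_c * n_flops
    QC energy        = alpha_q * sqrt(n_flops)
    """
    return alpha_c * n_flops / qc_cnot_count(n_flops)

n_flops = 1e18    # hypothetical task size, in floating point ops
alpha_c = 3e-9    # Joules/flop, rough value read off Fig.4

alpha_q = break_even_alpha_q(n_flops, alpha_c)
print(f"CNOTs needed: {qc_cnot_count(n_flops):.1e}")
print(f"break-even alpha: {alpha_q:.1f} Joules/CNOT")
```

With these made-up inputs, the QC could afford to spend about 3 Joules per CNOT, roughly a billion times the classical energy per flop, and still break even. The bigger the task, the more forgiving the break-even point becomes, since it grows as sqrt(n_flops).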
Footnote: You can always express your computation as a SEO (Sequence of Elementary Operations), where the elementary operations consist of 1-body operations like qubit rotations and 2-body operations like CNOTs. 2-qubit operations take much longer to perform than 1-qubit operations, so we neglect the 1-qubit operations and count only CNOTs. Even if 1-qubit operations are not negligibly short, the numbers of 1-qubit and 2-qubit operations are roughly proportional to each other, so counting only 2-body operations is good enough as a metric of speed.