# Quantum Bayesian Networks

## May 31, 2011

### Vulcan Child or Your Children?

Filed under: Uncategorized — rrtucci @ 1:28 pm

## May 29, 2011

### To Build a Fire

Filed under: Uncategorized — rrtucci @ 7:25 am

It can be awfully hard to build a fire when you are all alone in the middle of wild, inhospitable territory. D-Wave is trying to do just that. To start a fire in commercial quantum computing. And they are currently the only company trying to do so. You have to be very foolhardy to do something like that, but that’s what it takes to achieve real progress.

D-Wave builds and sells adiabatic (or annealing) quantum computers based on Josephson Junction superconducting technology. (An adiabatic QC is different from a gate model QC. Other researchers are trying to build gate model QCs.) The May 12, 2011 issue of the prestigious Nature magazine featured a very nice paper by D-Wave. (I discussed the physics behind the paper in a previous post). Two weeks later, D-Wave announced that they had signed a multi-year contract with the Lockheed Martin company for an undisclosed amount.

For me, since I believe that someday soon quantum computing will become a very useful and widespread technology, this is great news, because it benefits not just D-Wave but the whole QC field. I’m hoping that this news will generate new interest, jobs, private investment and startups in the QC field.

I believe that D-Wave’s future success is totally guaranteed

I believe that the QC field is very bountiful and presently untapped, and that therefore D-Wave or any other company that makes an honest attempt to explore this field and stake a commercial claim in it, will almost certainly succeed.

Since it was founded in 1999, D-Wave has obtained more than $100 million in funding from sources such as: the venture capital firm Harris & Harris, the Canadian government, and Goldmann Sachs. This is quite an impressive sum. And now, with their newly announced contract with Lockheed Martin, they are sitting financially in the catbird seat, at least for a while. D-Wave’s recent paper in Nature has convinced even some skeptics that the D-Wave computer is doing interesting quantum physics. I also believe that D-Wave’s future success is not guaranteed in the least Unfortunately, the fact that D-Wave is doing interesting quantum physics does not guarantee that it will be commercially successful in the long-term. The example of the Thinking Machines Corporation comes to mind. According to Wikipedia, Thinking Machines was founded in 1982 and filed for bankruptcy in 1994. During its 12 years of existence, it produced 5 successive commercial versions of its Connection Machine, plus many innovations in hardware and software for parallel processing. Thus, as a computer science experiment, it was hugely successful. According to Wikipedia: It became profitable in 1989 thanks to its DARPA contracts,[citation needed] and in 1990 the company had$65 million (USD) in revenue, making it the market leader in parallel supercomputers. In 1991, DARPA reduced its purchases amid criticism it was unfairly subsidizing Thinking Machines at the expense of other vendors like Cray, IBM, and in particular, NCUBE and MasPar. By 1992 the company was losing money again, due to lack of business

The reasons usually given for why it went bankrupt are over-reliance on DARPA contracts and gross mis-management. The following riveting article, which I highly recommend, explains in detail the reasons for its downfall:

The Rise and Fall of Thinking Machines. A close up look at a doomed-yet-brilliant start-up computer company that never quite grasped the basics of business. By Gary A. Taubes (Inc. Mag, Sep 15, 1995)

The good news is that according to the Wikipedia article, once Thinking Machines went broke, large parts of it were purchased by Sun and Oracle corporations, so it doesn’t look like its employees went hungry.

I don’t think D-Wave is mis-managed, but note that Lockheed Martin is primarily a defense contractor.

D-Wave also has to worry about the looming competition. We hope adiabatic QCs pan out, but it’s possible that gate model QCs will prove superior. (In such a case, D-Wave could use their expertise in superconducting technology to develop their own gate model QC, so this wouldn’t necessarily put them out of the game.)

An exciting horse race is likely to occur between D-Wave and the research groups at UCSB and Yale Univ. that are trying to build a gate model QC using superconducting technology.

The UCSB and Yale Univ. groups have been making some very confident noises lately. UCSB recently unveiled its “RezQu” architecture:

And Yale Univ. is pushing its “transmon qubits”:

• “How coherent are Josephson junctions?” by Hanhee Paik, D. I. Schuster, Lev S. Bishop, G. Kirchmair, G. Catelani, A. P. Sears, B. R. Johnson, M. J. Reagor, L. Frunzio, L. Glazman, R. J. Schoelkopf, arXiv:1105.4652, abstract

## May 24, 2011

### D-Wave’s Tunneling Quantum Computer

Filed under: Uncategorized — rrtucci @ 3:58 pm

D-Wave company recently got a very nice paper about its quantum computer published in the prestigious Nature magazine. I’ve decided to write two blog posts about it. The present post will summarize the paper, with an emphasis on physics issues. The next post will discuss the paper’s “sociological” implications.

• “Quantum annealing with manufactured spins”(Nature 473, 194–198, 12 May 2011). This is a link to the paper’s abstract in Nature. As one would expect, Nature expects you to pay a king’s ransom to view the paper on line, so you may have to go to an old-fashioned paper library to read it. I’m sure that, in due time, D-Wave will make available for free, on the web, a pdf copy of the paper.

(The paper’s Supplementary Materials are available for free at the Nature web site. The paper and the Supplementary Materials overlap somewhat, but not much. The Supplementary Materials is a paper in itself. It contains stuff that was left out for brevity from the main paper, such as extra experimental data, extra details about the experimental methods used, and extra information about how the authors calculated the theoretical curves that they are comparing to the data.)

• A blog post by Geordie Rose and Suzanne Gildert, members of D-Wave.
• A blog post by the Scott Aaronson (MIT computer science professor and one of the world’s most respected authorities on classical and quantum complexity theory)
• A blog post (in Spanish) by the amazing and top-notch scientist and blogger, Dr. Francisco R. Villatoro, alias La Mula Francis (Profesor Titular de la Universidad de Málaga (UMA))

Fig.1

Fig.1 shows one of D-Wave’s superconducting flux qubits. It consists of two superconducting loops. Flux $\Phi_1$ plus a much smaller “bias flux” $\Phi_{1x}$ links loop 1. Likewise, $\Phi_2 + \Phi_{2x}$ links loop 2. The magnetic moment of loop 1 can point either up or down, and these two possibilities correspond to the two states $|0\rangle$ and $|1\rangle$ of the qubit.

Fig.2

Fig.2 shows a potential energy curve for one of these qubit. The axes of the graph are energy $U$ versus flux $\Phi_1$. A potential energy curve of this type is called a double-well potential. Let quantum state $|n\rangle$ occupy the discrete energy level $E_n$ for $n=0,1,2,...$. The qubit can be localized in the right well, in states $|0\rangle$ and $|2\rangle$, or in the left well in states $|1\rangle$ and $|3\rangle$. The qubit can also occupy a state which is above the energy barrier and spreads over both wells (this is illustrated by state $|13\rangle$ in Fig.2.) At low temperatures, the qubit occupies, with very high probability, the two lowest states $|0\rangle$ and $|1\rangle$ (or a superposition of these two states).

The shape of the double-well potential of Fig.2 is characterized by two important parameters:

• $\delta U$ is the height of the energy barrier. It can be changed by changing the flux bias $\Phi_{2x}$.
• $2h$ is the energy difference between the bottoms of the two wells. It can be changed by changing the flux bias $\Phi_{1x}$. The spin of a qubit is proportional to its magnetic moment, which is proportional to $\Phi_1 + \Phi_{1x}$. Furthermore, $\Phi_{1x}$ is proportional to $2h$. Thus $2h$ can also be thought of as a spin bias. Note that $2h$ quantifies the symmetry breaking of the double-well potential.

For novices, it might help to think of the Y axis ($U$) of the graph in Fig.2 as being the potential energy $mgz$ where $mg$ is the weight of an object and $z$ is how high the object is above the surface of the earth. You can think of the X axis ($\Phi_1$) of the graph as the horizontal displacement of the object on the surface of the earth. Now imagine two buildings separated by a wall. Call the height of the wall $\delta U$. The right building has two floors 0 and 2 and the left one has two floors 1 and 3. Both buildings also have a common floor 13. Floors that are higher than the wall, like floor 13, are connected and common to both buildings. Quantum mechanics allows a qubit to go from floor 0 to floor 1 or vice versa by walking through the wall (“tunneling”) instead of jumping over it. The bottoms of the buildings differ by $2h$ in their elevations. It’s possible to vary the height of the separating wall (this would change $\delta U$). It’s also possible to raise or lower one building with respect to the other (this would change $2h$).

The D-Wave paper reports on an experiment where $\delta U$ and $h$ both depend on time, varying approximately like step functions; that is, they rise from zero to a maximum value, doing this monotonically and fairly quickly. Let

• $t_{delay}$ , the “delay time”, be the time at which $h(t)$ reaches half of its maximum value.
• $t_{freeze}$ , the “freeze time”, be the time at which $\delta U(t)$ reaches half of its maximum value.

According to Suzanne Gildert, one could say that during the experiment, a snap-shot of the qubit was taken at each time $t_{delay}$. This allowed D-wave to obtain a snap-shot of the qubit in flagrante delicto, i.e., right at the time $t_{freeze}$ at which it was performing “the act” (probably praying or brushing its teeth).

For times less than $t_{freeze}$ , $\delta U\approx 0$, so the wall between the two buildings has approximately zero height. Thus, at such times the potential energy curve is parabolic-looking (as one gets for a harmonic oscillator), and the Hamiltonian $\cal H$ for all the qubits is approximately a trivial Hamiltonian, call it ${\cal H}_{trivial}$.

For times greater than $t_{freeze}$ , $\delta U\approx \delta U_{max}$, so the wall between the two buildings has maximum height. Thus, at such times the potential energy curve is a double-well, and the Hamiltonian $\cal H$ for all the qubits is a non-trivia Hamiltonian, call it ${\cal H}_{non-trivial}$.

Fig.3

Fig.3 shows a qualitative graph of $\delta U(t)$ and $h(t)$ versus time. Actually, it shows two instances of $h(t)$, one instance in which $t_{delay} < t_{freeze}$ so $t_{delay}$ is said to be “early”, and one instance in which $t_{delay} > t_{freeze}$ so $t_{delay}$ is said to be “late”. Note that $2h_{max}<, where $k_B$ is Boltzmann's constant and $T$ is the temperature of the qubit.

In the EARLY case, $2h$ (=the difference in the elevations of the bottoms of the two buildings) is changed BEFORE the wall between the buildings is erected. In this case, if the qubit initially occupies floors 0 and 1 with equal probability, it sees NO WALL blocking its way, so it ends up moving with high probability (which depends on temperature) to floor 0, which is the lowest floor of the lowest of the two buildings, and that's how it is found at freeze time.

In the LATE case, $2h$ (=the difference in the elevations of the bottoms of the two buildings) is changed AFTER the wall between the buildings is erected. In this case, if the qubit initially occupies floors 0 and 1 with equal probability, it sees A HIGH WALL blocking its way, so it ends up not changing much, continuing to occupy floors 0 and 1 with equal probability, and that's how it is found at freeze time.

Fig.4

Fig.4 shows how $t_{freeze}$ changes with temperature for:
(1) a theoretical model that is quantum mechanical and assumes just two states $|0\rangle$ and $|1\rangle$
(2) a theoretical model that is quantum mechanical and assumes just four states $|0\rangle$, $|1\rangle$, $|2\rangle$, $|3\rangle$
(3) a theoretical model that is classical,
(4) the data.

In Fig.4, the classical model (unlike the quantum models) gives a $t_{freeze}$ which depends linearly on temperature. This dependence can be explained as follows. $\delta U$ increases monotonically from zero to a value $\delta U_{max}>> k_B T$. Thermal fluctuations can cause the qubit to jump over the wall, with probability $exp(\frac{-\delta U}{k_B T})$, but this can only occur initially, while $\delta U$ is less than $k_B T$. This suggests that the freezing occurs when $\delta U(t_{freeze}) \approx k_B T$. Since $\delta U(t)$ varies linearly with time in the rising part of the curve, we expect $t_{freeze}$ to vary linearly with temperature.

Besides varying the $\delta U$ and $h$ of each qubit, the D-wave computer can also vary the coupling constant $J_{ab}$ between qubits $a$ and $b$. Besides considering individual qubits, the paper also considers a chain of eight qubits coupled in such a way that a domain wall must form somewhere within the chain, with all spins on one side of the domain wall pointing in one direction and all spins on the other side pointing in the opposite direction. I won't go into the details of the 8 qubit experiment here. The bottom line is that, just as for the single qubit case, D-wave investigators observe for the 8 qubit chain a signature for quantum tunneling which fits well their quantum model and is very different from what one would get with any reasonable classical model.

In conclusion, Fig.4 and many other data plots given in the paper show quite conclusively that the D-Wave computer is doing quantum tunneling.

Separability Concerns

(This section deals with concerns about the separability of the density matrix describing the collection of 128 qubits in the D-wave computer. If you don’t know what a separable density matrix is, check out the appendix at the end of this blog post).

Some experts, while agreeing that the D-Wave computer is doing quantum tunneling, have pointed out that the D-wave machine uses a non-entangled density matrix, and that the (few) known quantum algorithms that give exponential speedups use entangled density matrices. However, this is not a very strong objection at the present time, because the claim

IF a quantum algorithm uses a non-entangled density matrix, THEN there are classical algorithms that can calculate the same thing as the quantum algorithm and just as fast (i.e., with the same time complexity)

has not been proven yet, and may not even be true. That this claim hasn’t been proven yet, has been emphasized (in more technical language) by Scott Aaronson in his blog post about the paper.

Even though the density matrix for the D-wave computer is separable, it is not totally classical, because it is non-diagonal in the physical basis, and it also exhibits quantum tunneling, which is a non-classical property. I find it hard to believe that a machine with the ability to do quantum tunneling cannot do some calculations faster (i.e., with a more favorable time-complexity) than any classical machine. I say this because a machine used for minimizing functions that takes advantage of quantum tunneling might be able to escape local minima on which a classical computer would get stuck.

Appendix: Separable Density Matrices

A two-part density matrix $\rho_{AB}$ is said to be separable iff it can be written as $\rho_{AB} = \sum_{i=1}^n w_i \rho^i_A\otimes \rho^i_B$, where $n>0$, where $w_1, w_2, \ldots, w_n$ are positive numbers which add up to one, and where, for all $i$, $\ \rho^i_A$ and $\rho^i_B$ are density matrices. A non-separable density matrix is said to be entangled. For example, if $\rho_{AB}$ represents two qubits and acts on the vector space spanned by the states $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$, then

$\rho_{AB} = \left[ \begin{array}{cccc}\frac{1}{2}&0&0& \frac{1}{2} \\ 0&0&0&0\\ 0&0&0&0\\ \frac{1}{2} &0&0&\frac{1}{2}\end{array}\right]$

is non-separable (i.e. entangled), whereas

$\rho_{AB} = \left[ \begin{array}{cccc}\frac{1}{2}&0&0&0\\ 0&0&0&0\\ 0&0&0&0\\ 0&0&0&\frac{1}{2}\end{array}\right]$

is separable (i.e., non-entangled).

If the system AB is classical, then $\rho_{AB}$ is diagonal. A diagonal $\rho_{AB}$ with diagonal elements $P(a,b)$ is separable iff $P(a,b)$ can be written in the form $P(a,b)= \sum_i P(a|i)P(b|i)P(i)$. (Note that this condition is related to conditional independence. $P(a,b|i)$ is conditionally independent if $P(a,b|i)= P(a|i)P(b|i)$). All probability distributions $P(a,b)$ can be written in this form. Just write $P(a,b) = \sum_{a',b'} \delta_a^{a'} \delta_b^{b'}P(a',b')$ and set $i=(a',b')$. Hence, all classical multi-part density matrices are separable.

## May 14, 2011

### SIT-ting without the Deltas and Epsilons

Filed under: Uncategorized — rrtucci @ 7:59 pm

Lately, I’ve been very busy working on Shannon Information Theory (SIT). I previously wrote the following blog posts about quantum SIT

Quantum SIT has advanced by leaps and bounds in the last two decades, but I believe many aspects of it could still stand some improvements and enhancements. For example: for my taste, current treatments of quantum SIT use too many delta and epsilon arguments; they seem to me to be too epsilon and delta “trigger happy”. The situation reminds me of Calculus, where delta and epsilon arguments are useful if one wants to achieve a very high level of rigor, but can be avoided for the most part if one doesn’t mind being slightly unrigorous.

I’ve invented a new SIT technique that doesn’t use deltas and epsilons at all, or does so very sparingly. My technique is not as rigorous as the standard delta & epsilon techniques used in SIT, in the same way that an engineering Calculus book is not as rigorous as Rudin’s analysis book. However, my technique has the virtue that it makes proofs much more automatic and briefer. One could say that I am translating SIT from the language of pure mathematicians to a language that physicists might find more palatable. Using my new technique, I can easily prove the two main pillars of classical SIT: the noiseless coding theorem and the noisy channel coding theorem. Using a quantum generalization of my new technique, I am now trying to re-derive the most important already known results in quantum SIT for noiseless and noisy coding.

At the risk of pulling a Fermat’s Last Theorem trick on the readers of this blog, I promise to unveil my new results at a later time (assuming that I don’t eventually find a fatal flaw in my technique, and assuming that I eventually derive some nontrivial, previously unknown results in quantum SIT).

P.S. On my next blog post, I’ll say something about D-Wave’s exciting new results. These days, there is certainly no shortage of interesting quantum computing topics to write blog posts about.

## May 7, 2011

### Intel Waffles

Filed under: Uncategorized — rrtucci @ 5:29 am

The word “waffle” used as a verb means to waver, vacillate, avoid commitment, sit on the fence. Example:

Used as a noun, waffle means the tasty latticework breakfast molded with a waffle iron (alas, not so common in American households anymore). Example:

Intel is now producing transistors shaped like waffles.

This blog post is about the second usage. (The first usage was discussed in a previous blog post.)

Recently, the news media has been abuzz with news about Intel’s new 3D Trigate transistors. Check out, for example,

The YouTube video has some nice visuals. I highly recommend it. The AnandTech article is excellent too, and a little bit more technical.

Each transistor in a computer is analogous to a tiny valve that can shut ON or OFF the flow of water (= flow of electrons = current) in a pipe (=wire). Such a valve (=gate=barrier) can be switched ON and OFF at a certain switching rate. The faster the switching rate, the higher the power dissipation.

Until now, Intel and everyone else have been making pipes whose cross section looked like a dash; that is, like a skinny rectangle with one dimension much larger than the other. Starting with its 22nm node-length microprocessors due out before the end of this year, and thereafter, Intel will be making transistors in which the cross-section of the pipe looks like the letter U instead of a dash.

Even though a U and a dash are topologically the same, the U-cross-section pipe is preferable because it can carry more current. If the length of the dash is the same as the horizontal, bottom side of the U, then the U can carry current along 3 sides instead of just 1. Not only is a U-cross-section pipe more efficient at carrying current, but the valve to block the pipe is more efficient too, because it pinches the flow on 3 sides instead of 1. This is the reason why Intel calls it a 3D Trigate, because the gate has 3 sides, the 3 sides of the U. Other people call such transistors fin gates or FinFETs, because the U resembles a fin (a FET is a type of transistor). I like to refer to the older, non-fin gates as “flat gates”.

For any transistor, the faster the switching rate F, the higher the power dissipation P. For a fixed P, the fin gate allows higher F than the flat gate at fixed node-length. And for a fixed F, it allows a lower P. This is all explained more nicely and more quantitatively in the AnandTech article.

When the valve is shut OFF, an electron that leaks across the barrier has to tunnel across a barrier of a certain thickness and energy-height. The thickness of the barrier is a fraction (something like 1/2) of the node-length figure. For example, once Intel gets to 11nm node-length, which they plan to do by 2015, the barrier thickness will be about 5nm, which is about 10 Silicon atoms thick.

Moore’s Law
Moore’s Law says that the transistor density of widely available chips doubles every two years. Thus, Moore’s Law describes how the transistor density (which is proportional to the inverse of the squared node-length) varies with time. Intel plans to use fin gate transistors on all chips with node-lengths smaller or equal to 22nm. The fin gates are more efficient than the flat gates at a fixed node-length, but this is not expected to change the plans for how node-length will decrease with time. Therefore, this new type of gate will not affect much the inexorable progress of Moore’s Law. Intel and the rest of the microprocessor industry are still planning to reach 11nm node-lengths by 2015. The only difference is that this node-length will now be reached with fin gates instead of flat gates.

Of course, it might be possible to stack layers of these waffles on top of each other, but that approach is fraught with its own perils. Such stacks will store a lot of heat between the layers, and it might be very difficult to remove that heat quickly enough so as not to markedly degrade the performance of the chip or even melt it.

So, what does this all have to do with quantum computers?
Note that both a fin gate and a flat gate have the same barrier thickness (and the same(?) barrier energy-height) at a fixed node-length. That barrier thickness will be about 5nm for the 11nm node-length. Since quantum tunneling only depends on the barrier thickness and barrier energy-height, I think we can expect about the same amount of quantum tunneling with both fin and flat gates at 11nm node-length. So reaching node-lengths below 11nm will still require a paradigm shift other than fin gates (perhaps quantum computing?). To paraphrase the movie King Kong: Twasn’t the planes (geometry) that got him, boys, twas the beauty (of quantum tunneling and quantum mechanics) that killed the beast (of Moore’s Law).

Create a free website or blog at WordPress.com.