Quantum Bayesian Networks

July 4, 2018

Why doesn’t the BBVI (Black Box Variational Inference) algorithm use back propagation?

Filed under: Uncategorized — rrtucci @ 1:36 pm

Quantum Edward uses the BBVI training algorithm. Back propagation, popularized by Rumelhart, Hinton, and Williams, seems to be a fundamental part of most ANN (Artificial Neural Network) training algorithms, where it is used to find the gradients that determine the parameter increments, and hence the change in the cost function, at each iteration. Hence, I was very baffled, even skeptical, upon first encountering the BBVI algorithm, because it does not use back prop. The purpose of this blog post is to shed light on how BBVI can get away with this.

Before I start, let me explain what the terms “hidden (or latent) variable” and “hidden parameter” mean to AI researchers. Hidden variables are the opposite of “observed variables”. In Dustin Tran’s tutorials for Edward, he often represents observed variables by x and hidden variables by z. I will use \theta instead of z, so z=\theta below. The data consists of many samples of the observed variable x. The goal is to find a probability distribution for the hidden variables \theta. A hidden parameter is a special type of hidden variable. In the language of Bayesian networks, a hidden parameter corresponds to a root node (one without any parents) whose node probability distribution is a Kronecker delta function, so, in effect, the node only ever achieves one of its possible states.

Next, we compare algos that use back prop to the BBVI algo, assuming the simplest case of a single hidden parameter \theta (normally, there is more than one hidden parameter). We will assume \theta\in [0, 1]. In quantum neural nets, the hidden parameters are angles by which qubits are rotated. Such angles range over a closed interval, for example, [0, 2\pi]. After normalization of the angles, their ranges can be assumed, without loss of generality, to be [0, 1].

CASE1: Algorithms that use back prop.

Suppose \theta \in [0, 1],\;\;\eta > 0. Consider a cost function C and a model function M such that

C(\theta) = C(M(\theta)).

If we define the change d\theta in \theta by

d\theta = -\eta \frac{dC}{d\theta}= -\eta \frac{dC}{dM} \frac{dM}{d\theta},

then the corresponding change in the cost is

d C = d\theta \frac{dC}{d\theta} = -\eta \left( \frac{dC}{d\theta}\right)^2.

This change in the cost is negative, which is what one wants if one wants to minimize the cost.
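Here is a minimal numerical sketch of CASE1, with a toy model M, cost C, and learning rate invented purely for illustration (none of this is Quantum Edward code). The point to notice is that the update requires the model derivative dM/d\theta explicitly; that is exactly the derivative back prop computes layer by layer.

    import numpy as np

    # toy model and cost; back prop needs the model derivative dM/dtheta
    def M(theta): return np.sin(np.pi * theta)
    def dM_dtheta(theta): return np.pi * np.cos(np.pi * theta)
    def dC_dM(m): return 2.0 * (m - 0.5)       # cost C(m) = (m - 0.5)^2

    eta, theta = 0.1, 0.9
    for _ in range(100):
        grad = dC_dM(M(theta)) * dM_dtheta(theta)  # chain rule dC/dtheta
        theta += -eta * grad                       # d(theta) = -eta * dC/dtheta
    print(theta)   # settles near a minimum of C(M(theta))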

CASE2: BBVI algo

Suppose \theta \in [0, 1],\;\;\eta > 0,\;\; \lambda > 0. Consider a reward function R (for BBVI, R = ELBO), a model function M, and a distance function dist(x, y)\geq 0 such that

R(\lambda) = R\left[\sum_\theta dist[M(\theta), P(\theta|\lambda)]\right].

In the last expression, P(\theta|\lambda) is a conditional probability distribution. More specifically, let us assume that P(\theta|\lambda) is the Beta distribution. Check out its Wikipedia article

https://en.wikipedia.org/wiki/Beta_distribution

The Beta distribution depends on two positive parameters \alpha, \beta (that is why it is called the Beta distribution). \alpha, \beta are often called concentrations. Below, we will use the notation

c_1 = \alpha > 0,

c_2 = \beta  > 0,

\lambda = (c_1, c_2).

Using this notation,

P(\theta|\lambda) = {\rm Beta}(\theta; c_1, c_2).

According to the Wikipedia article for the Beta distribution, the mean value of \theta is given in terms of its 2 concentrations by the simple expression

\langle\theta\rangle = \frac{c_1}{c_1 + c_2}.

The variance of \theta is given by a fairly simple expression of c_1 and c_2 too. Look it up in the Wikipedia article for the Beta distribution, if interested.
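For convenience, here is that fairly simple expression, copied from the same Wikipedia article:

{\rm Var}(\theta) = \frac{c_1 c_2}{(c_1 + c_2)^2 (c_1 + c_2 + 1)}.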

If we define the change dc_j in the two concentrations by

dc_j = \eta \frac{\partial R}{\partial c_j}

for j=1,2, then the change in the reward function R will be

dR = \sum_{j=1,2} dc_j \frac{\partial R}{\partial c_j}= \eta \sum_{j=1,2} \left(\frac{\partial R}{\partial c_j}\right)^2.

This change in the reward is positive, which is what one wants if one wants to maximize the reward.
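Here is a minimal numerical sketch of CASE2. The names M, dist, and R are toy stand-ins invented for illustration; this is not the actual BBVI code in Quantum Edward. The key point: M is only ever evaluated, never differentiated, because all derivatives are taken with respect to the concentrations c_1, c_2.

    import numpy as np
    from scipy.stats import beta

    thetas = np.linspace(1e-3, 1 - 1e-3, 200)   # grid over theta in [0, 1]
    M = lambda th: np.sin(np.pi * th)           # black-box model

    def R(c1, c2):                              # toy reward (stand-in for ELBO)
        p = beta.pdf(thetas, c1, c2)
        p /= p.sum()                            # discretized P(theta|lambda)
        return -np.sum((M(thetas) - p) ** 2)    # -sum_theta dist[M, P]

    eta, eps, c1, c2 = 0.5, 1e-4, 1.0, 1.0
    for _ in range(200):
        # numerical partials of R w.r.t. c1, c2; M itself stays a black box
        g1 = (R(c1 + eps, c2) - R(c1 - eps, c2)) / (2 * eps)
        g2 = (R(c1, c2 + eps) - R(c1, c2 - eps)) / (2 * eps)
        c1 = max(c1 + eta * g1, 1e-3)           # dc_j = eta * dR/dc_j
        c2 = max(c2 + eta * g2, 1e-3)
    print(c1 / (c1 + c2))                       # estimate of <theta>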

Comparison of CASE1 and CASE2

In CASE1, we need to calculate the derivative of the model M with respect to the hidden parameter \theta:

\frac{d}{d\theta}M(\theta).

In CASE2, we do not need to calculate any derivatives at all of the model M. (That is why it’s called a Black Box algo). We do have to calculate the derivative of P(\theta|\lambda) with respect to c_1 and c_2, but that can be done ahead of time since P(\theta|\lambda) is known a priori to be the Beta distribution:

\frac{d}{dc_j}\sum_\theta dist[M(\theta), P(\theta|\lambda)]= \sum_\theta \frac{\partial\, dist}{\partial P(\theta|\lambda)} \frac{\partial P(\theta|\lambda)}{\partial c_j}.
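In fact, for the Beta distribution this a priori derivative even has a closed form; for instance, in terms of the digamma function \psi,

\frac{\partial}{\partial c_1} \ln {\rm Beta}(\theta; c_1, c_2) = \ln\theta - \psi(c_1) + \psi(c_1 + c_2),

and the c_2 derivative is the same with \ln(1-\theta) in place of \ln\theta and \psi(c_2) in place of \psi(c_1).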

So, in conclusion, in CASE1, we try to find the value of \theta directly. In CASE2, we try to find the parameters c_1 and c_2 which describe the distribution of \theta's. For an estimate of \theta, just use \langle \theta \rangle given above.


July 1, 2018

Is Quantum Computing Startup Xanadu, Backed by MIT Prof. Seth Lloyd, Pursuing an Impossible Dream?

Filed under: Uncategorized — rrtucci @ 5:44 pm

As all Python programmers soon learn, if you ever have a question about Python, it’s almost certain that someone has asked that question before at Stack Overflow, and that someone has provided a brilliant answer to it there. The same company that brings us Stack Overflow now also brings us Quantum Computing Stack Exchange (its beta started 3 months ago). I’ve answered a few questions there already. Here is the first question I asked:

https://quantumcomputing.stackexchange.com/questions/2414/is-probabilitistic-universal-fault-tolerant-quantum-computation-possible-with

Quantum Edward, Quantum Computing Software for Medical Diagnosis and GAN (Generative Adversarial Networks)

Filed under: Uncategorized — rrtucci @ 7:28 am

Quantum Edward at this point is just a small library of Python tools for doing classical supervised learning by Quantum Neural Networks (QNNs). The basic idea behind QEdward is pretty simple: in conventional ANNs (Artificial Neural Nets), one has layers of activation functions. What if we replace each of those layers by a quantum gate or a sequence of quantum gates and call the whole thing a quantum computer circuit? The replacement quantum gates are selected in a very natural way based on the chain rule of probabilities. We take that idea and run with it.
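To make that idea concrete, for three qubit variables the chain rule reads

P(q_2, q_1, q_0) = P(q_2|q_1, q_0)\, P(q_1|q_0)\, P(q_0),

and each factor on the right-hand side gets realized in the circuit as a rotation of one qubit controlled by the qubits behind the conditioning bar.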

As the initial author of Quantum Edward, I am often asked to justify its existence by giving some possible use cases. After all, I work for a startup company artiste-qb.net, so the effort spent on Quantum Edward will not be justified in the eyes of our investors if it is a pure academic exercise with no real-world uses. So let me propose two potential uses.

(1) Medical Diagnosis

It is interesting that the Bayesian variational inference method that Quantum Edward currently uses was first used in 1999 by Michael Jordan (the Berkeley prof with the same name as the famous basketball player) to do medical diagnosis using Bayesian networks. So the use of B nets for medical diagnosis has been in the plans of B net fans for at least 20 years.

https://arxiv.org/abs/1105.5462

More recently, my friends Johann Marquez (COO of Connexa) and Tao Yin (CTO of artiste-qb.net) have pointed out to me the following very exciting news article:

This AI Just Beat Human Doctors On A Clinical Exam (Forbes, June 28, 2018, by Parmy Olson)

It took 2 years to train the Babylon Health AI, but the investment has begun to pay off. Currently, their AI can diagnose a disease correctly 82% of the time (and that will improve as it continues to learn from each case it considers), while human doctors are correct only 72% of the time on average. Babylon provides an AI chatbot in combination with a remote workforce of 250 work-from-home human doctors.

Excerpts:

The startup’s charismatic founder, Ali Parsa, has called it a world first and a major step towards his ambitious goal of putting accessible healthcare in the hands of everyone on the planet.

Parsa’s most important customer till now has been Britain’s state-run NHS, which since last year has allowed 26,000 citizens in London to switch from its physical GP clinics to Babylon’s service instead. Another 20,000 are on a waiting list to join.

Parsa isn’t shy about his transatlantic ambitions: “I think the U.S. will be our biggest market shortly,” he adds.

Will quantum computers (using quantum AI like Quantum Edward) ever be able to do medical diagnosis more effectively than classical computers? It’s an open question, but I have high hopes that they will.

(2) Generative Adversarial Networks (GAN)

GANs (Wikipedia link) have been much in the news ever since they were invented just 4 years ago, for their ability to generate amazingly realistic data with very little human aid. For instance, they can generate pictures of human faces that humans have a hard time distinguishing from the real thing, and generate 360 degree views of rooms from only a few fixed-perspective photos of the room.

Dustin Tran’s Edward (on which Quantum Edward is based) implements inference algorithms of two types, variational and Monte Carlo. With Edward, one can build classical neural networks that do classification via the so-called Black Box Variational Inference (BBVI) algorithm. Can BBVI also be used to do GAN classically? Yes! Check out the following 4-month-old paper:

Graphical Generative Adversarial Networks, by Chongxuan Li, Max Welling, Jun Zhu, Bo Zhang https://arxiv.org/abs/1804.03429 (see footnote)

Can this be generalized to quantum mechanics, i.e. can one use BBVI to do classification and GAN on a quantum computer? Probably yes. Quantum Edward already does classification. It should be possible to extend the techniques already in use in Quantum Edward so as to do GAN too. After all, GAN is just 2 neural nets, either classical or quantum, competing against each other.

(footnote) It is interesting to note that 3 of the 4 authors of this exciting GAN paper work at Tsinghua Univ in Beijing. Their leader is Prof. Jun Zhu (PhD from Tsinghua Univ, post-doc for 4 yrs at Carnegie Mellon), a rising star in the AI and Bayesian Networks community. He is the main architect of the software ZhuSuan, which is available at GitHub under the MIT license. It is a nice alternative to Dustin Tran’s Edward. Like Edward, it implements Bayesian networks and hierarchical models on top of TensorFlow. The above GAN paper and the ZhuSuan software illustrate how advanced China is in AI.

June 24, 2018

Latest Investments in Quantum Computing Startups (May & June 2018)

Filed under: Uncategorized — rrtucci @ 3:38 am

It seems that quantum computing startups have hit a major money artery in the last 2 months. And that’s just the first half of the Summer. The Summer of 2018, with its blisteringly high temperatures of social activity in the US, is going to be one for the books. Movies will be made about it.

  1. QCWare, $6.5M

  2. QxBranch, $8.5M

  3. Yale qc, $16M in 4 yrs from ARO (Army Research Office, a political organization, like all US federal agencies)

  4. Strangeworks, $4M

  5. Zapata, $5.4M

  6. Xanadu, $9M

  7. gtn.ai, $3M

June 15, 2018

Quantum Edward, models with complex-valued layers

Filed under: Uncategorized — rrtucci @ 4:31 am

The Quantum Edward Python lib (its github repo here), in its first version, comes with 2 models called NbTrolsModel and NoNbTrolsModel. However, the lib is written with enough generality so that you can also run it with other models of your own devising. A model describes a quantum circuit for a QNN (Quantum Neural Network) which is split into layers.

Below is an excerpt from the docstring of the QEdward class NbTrolsModel. The excerpt gives the quantum circuit for the NbTrolsModel:


    ...
    Below we represent them in Qubiter ASCII
    picture notation in ZL convention, for nb=3 and na=4
    [--nb---]   [----na-----]
    NbTrols (nb Controls) model:
    |0> |0> |0> |0> |0> |0> |0>
    NOTA P(x) next
    |---|---|---|---|---|---Ry
    |---|---|---|---|---Ry--%
    |---|---|---|---Ry--%---%
    |---|---|---Ry--%---%---%
    NOTA P(y|x) next
    |---|---Ry--%---%---%---%
    |---Ry--%---%---%---%---%
    Ry--%---%---%---%---%---%
    M   M   M

    A gate |---|---Ry--%---%---%---% is called an MP_Y Multiplexor,
    or plexor for short. In Ref.1 (Qubiter repo at github), see Rosetta Stone
    pdf and Quantum CSD Compiler folder for more info about multiplexors.

If you look up the definition of a multiplexor in the Qubiter repo and the references therein, you will notice that a multiplexor is a real-valued gate. Hence this model, since it only uses multiplexor gates, does not parametrize the full family of complex-valued amplitudes that are allowed in quantum mechanics. The NbTrolsModel does parametrize the whole family of possible (real-valued) probability distributions P(y|x) and P(x), where x = (q3, q2, q1, q0), y = (q6, q5, q4), and qi \in \{0, 1\} for i = 0, 1, 2, \ldots, 6.

So how can we generalize the model NbTrolsModel so that it parametrizes all possible complex-valued amplitudes too? One possibility, call it the C_NbTrolsModel, is as follows:


    |0> |0> |0> |0> |0> |0> |0>
    NOTA A(x) next
    |---|---|---|---|---|---Ry
    |---|---|---|---|---|---%
    |---|---|---|---|---Ry--%
    |---|---|---|---|---%---%
    |---|---|---|---Ry--%---%
    |---|---|---|---%---%---%
    |---|---|---Ry--%---%---%
    |---|---|---%---%---%---%
    NOTA A(y|x) next
    |---|---Ry--%---%---%---%
    |---|---%---%---%---%---%
    |---Ry--%---%---%---%---%
    |---%---%---%---%---%---%
    Ry--%---%---%---%---%---%
    %---%---%---%---%---%---%
    M   M   M

This new model contains twice as many layers as the old one. Each multiplexor gate from the old model has been followed by a “diagonal unitary” gate consisting of only % or | symbols, for instance,

|---|---|---%---%---%---%

You can look up the definition of such a gate in the Qubiter repo, in the same places where you found the definition of a multiplexor. In this example, D =

 %---%---%---%

represents a 2^4=16 dimensional diagonal unitary matrix and I_8 =

 |---|---|

represents the 2^3=8 dimensional unit matrix. The whole gate is I_8 \otimes D, which is a 2^7=128 dimensional diagonal unitary matrix.
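Here is a small numpy check of that tensor-product structure (the random phases are just for illustration):

    import numpy as np

    phases = np.exp(1j * np.random.uniform(0, 2 * np.pi, 16))
    D = np.diag(phases)            # 2^4 = 16 dim diagonal unitary
    I8 = np.eye(8)                 # 2^3 = 8 dim unit matrix
    gate = np.kron(I8, D)          # I_8 otimes D, 2^7 = 128 dim diagonal
    assert np.allclose(gate.conj().T @ gate, np.eye(128))  # still unitary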

To motivate what is going on in this C_NbTrolsModel model, let me claim without proof that the first two lines of the circuit parametrize a complex amplitude A(q0), the next two lines A(q1|q0), the next two A(q2|q1, q0) and so forth.

If x = (q3, q2, q1, q0) and y = (q6, q5, q4), then

A(x) = A(q3|q2, q1, q0)A(q2|q1, q0)A(q1|q0)A(q0)

A(y|x) = A(q6|q5, q4)A(q5|q4)A(q4)

A(y, x) = A(y| x) A(x).

This is just a generalization of the chain rule for probabilities, which for 3 random variables is

P(c, b, a) = P(c|b, a) P(b | a) P(a)

To go from the chain rule for probabilities to the chain rule for amplitudes, we just take the square root of all the probabilities and add a bunch of relative phase factors, leading to

A(c, b, a) = A(c|b, a) A(b | a) A(a)
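Explicitly, each amplitude factor can be taken to be a square root times a phase, for example

A(b|a) = \sqrt{P(b|a)}\; e^{i\phi(b,a)},

so that |A(c, b, a)|^2 = P(c, b, a) no matter how the relative phases \phi are chosen.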

Warning: Note that the expansion of a multiplexor (and of a diagonal unitary) into elementary gates (CNOTs and single qubit rotations) contains a huge number of gates (exponential in the number of controls). However, such expansions can be shortened by approximating the multiplexor (or the diagonal unitary) using, for instance, the technique of Ref. 2: “Oracular Approximation of Quantum Multiplexors and Diagonal Unitary Matrices”, by Robert R. Tucci, https://arxiv.org/abs/0901.3851. Another possible source of simplification: just like

P(c, b, a) = P(c|b, a) P(b | a) P(a)

represents a fully connected graph which simplifies to

P(c, b, a) = P(c| a) P(b | a) P(a)

if c is independent of b, in the same way, the chain rule in these QEdward models might simplify due to certain conditional independences in the data.

Added July 11, 2018: Of course, this can all be generalized by making q0 a qudit with d0 states, q1 a qudit with d1 states, etc. Qudit q0 can be represented by n0 qubits, where n0 is the smallest int such that d0 ≤ 2^n0, same for qudit q1, q2, etc.
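In code, that smallest n0 is just a bit length (a trivial sketch, not part of the QEdward lib):

    def num_qubits(d):
        # smallest n such that d <= 2^n, for d >= 2
        return (d - 1).bit_length()

    assert [num_qubits(d) for d in (2, 3, 4, 5, 9)] == [1, 2, 2, 3, 4]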

In Qubiter, the Quantum CSD Compiler decomposes an arbitrary unitary matrix into a product of multiplexors and diagonal unitaries. Qubiter also allows you to decompose multiplexors and diagonal unitaries into elementary ops (CNOTs and single qubit rotations). For example, Qubiter's CSD compiler will expand an arbitrary 3 qubit unitary matrix into the following:


%---%---%   
%---%---Ry  
%---%---%   
%---Ry--%   
%---%---%   
%---%---Ry  
%---%---%   
Ry--%---%   
%---%---%   
%---%---Ry  
%---%---%   
%---Ry--%   
%---%---%   
%---%---Ry  
%---%---% 

Hence, a QNN is like a portion of the expansion of an arbitrary unitary matrix.

When one uses complex-valued layers, the definition of the ELBO must be in terms of density matrices, not classical probability distributions.

June 8, 2018

June 8, 2018, Toronto Quantum Computing Meetup, Next Event on: VIRTUAL ORCHESTRAS–AI AUGMENTED MUSIC

Filed under: Uncategorized — rrtucci @ 4:46 am

The Toronto Quantum Computing Meetup cordially invites you to a very special evening/meetup TODAY, Friday, June 8, 2018. The Event is entitled “Virtual Orchestras — AI Augmented Music”.

Ian Andtek (formerly known as Simón Bermúdez) is a talented pianist and all-around musician, plus a highly skilled AI/machine-learning and graphics programmer. Born and raised in Venezuela, he is now a proud Canadian citizen. He will be giving a concert with his one-man, AI-simulated orchestra at our usual venue/hangout, Zero Gravity Labs (free pizza? yes).

We are really pleased and honored that Ian is a member of our company artiste-qb.net. Our company is now more than just a company dedicated to quantum computing software. We aim to become an all around AI/ML company, with special emphasis in Bayesian Networks and TensorFlow, both quantum and classical. So combining music and AI falls well within our Jeff Bezos-Ian sized plans.

Below is Ian playing a musical piece composed by himself, called Ilusiones (Illusions).

June 4, 2018

June 22, 2018, Toronto Quantum Computing Meetup, Next Event On Quantum Machine Learning

Filed under: Uncategorized — rrtucci @ 7:14 am

The Toronto Quantum Computing Meetup cordially invites you to our next meeting on Friday, June 22, 2018. The Event will be on Quantum Machine Learning, especially the software program Quantum Edward. Here is a pdf of the talk (I wrote it, but it will be delivered by someone else on my behalf):

http://www.ar-tiste.com/quantum_ed_talk/QuantumEdwardTalk.pdf

Typical slide from deck. I chose a pastel background color scheme to remind the audience of ice cream, gelato, etc.

(Slide image: edward_talk_chain_rule)

May 24, 2018

Quantum Computing and a new book, “The Book of Why”, by Judea Pearl and Dana Mackenzie

Filed under: Uncategorized — rrtucci @ 5:23 am

“Leaping the Chasm” (1886) by Ashley Bennett, son of photographer Henry Hamilton Bennett, jumping to “Stand Rock”. See http://www.wisconsinhistory.org


Judea Pearl, UCLA professor and winner of the Turing Award in Computer Science, is a hero to all Bayesian network fans like me. Pearl has several books on B nets, as you can see at his Amazon page. This blog post is to alert my readers to his most recent book, written in collaboration with Dana Mackenzie and released about a week ago, in mid May 2018, entitled “The Book of Why: The New Science of Cause and Effect”.

To commemorate the release of the new book, I also wrote, besides this blog post, a small comment about it at the Edward Forum. Dustin Tran, main author of Edward, responded with a comment citing a very nice paper, less than 6 months old, by Dustin and Prof. Blei (Dustin’s thesis advisor at Columbia Univ), about the use of Judea Pearl’s causality ‘do-calculus’ within Edward.

I’ve been interested in the do-calculus for a long time, and have written two arxiv papers on the subject:

  1. Introduction to Judea Pearl’s Do-Calculus, by Robert R. Tucci (Submitted on 26 Apr 2013)
  2. An Information Theoretic Measure of Judea Pearl’s Identifiability and Causal Influence, by Robert R. Tucci (Submitted on 21 Jul 2013)
    This paper is for classical Bayesian Networks, but it can easily be generalized to quantum Bayesian Networks, by replacing probability distributions by density matrices in the information measure proposed there.

There exist more than a dozen packages written in R that implement, at least partially, the do-calculus. They are available at CRAN (the main R repository; the name stands for Comprehensive R Archive Network, not cranberries). This 2017 paper contains a nice table of the various R packages dealing with do-calculus.

It’s also interesting to note that BayesiaLab, a commercial software package that I love and recommend, already implements some of Pearl’s do-calculus. (full disclosure: the B net company that I work at, artiste-qb.net, has no business connections with BayesiaLab.)

By the way, artiste-qb.net provides a nice cloud service that allows you to run all these open-source do-calculus R packages on your browser, without any installation hassles. How? you ask, and if not, I’m going to tell you anyway.

***Beep, Beep, Commercial Alert***

artiste-qb.net is a multilingual (R, Python, Java, C++, English, German, Spanish, Chinese, Italian, French, you name it, we speak it) quantum open source software company.

We offer an image on AWS (the Amazon cloud service) called BayesForge.com.

BayesForge.com comes fully loaded with the Python distribution Anaconda, all of R, etc.

Bayesforge comes with most major artificial intelligence/Bayesian networks open-source packages installed, both classical ones (e.g., TensorFlow, Edward, PyMC, bnlearn) and quantum ones (e.g., IBM Qiskit, DWave stuff, Rigetti and Google stuff, our own Quantum Fog, Quantum Edward, Qubiter).

BayesForge allows you to run jupyter notebooks in Python, R, Octave (an open source matlab clone) and Bash. You can also combine Python and R within one notebook using Rmagic.
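For example, a notebook session mixing the two languages looks roughly like this (three separate cells; the rmagic comes from the rpy2 package, which I am assuming is what provides it on BayesForge):

    # cell 1: load the R magic
    %load_ext rpy2.ipython

    # cell 2: make some data in Python
    import numpy as np
    x = np.random.normal(size=100)

    # cell 3: hand x to R and summarize it there (%%R is the cell's first line)
    %%R -i x
    summary(x)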

We have succeeded in dockerizing the BayesForge image and will be offering it very soon on other cloud services besides AWS, including a non-AWS cloud service in China, where AWS is so slow it is unusable. One of our co-founders, Dr. Tao Yin, lives in ShenZhen, China, and is in charge of our China branch.

May 17, 2018

Help us program the NYC Matrix

Filed under: Uncategorized — rrtucci @ 1:59 am

Our company Artiste-qb.net does bleeding edge research into quantum computing. We are so advanced that our researchers think of Palantir and the NSA as a bunch of dotards and little Linux rocketmen.

One of our main projects is to program The NYC Matrix. We are almost done, thank you. In this picture, one of our elite programmers, Simón Bermúdez, is experimenting with quantum cloning of himself. Just like in The Matrix movies, but the real thing. Simón gets quite a lot of work done for us by slipping into this multiverse mode.

Google, IBM, Rigetti, DWave and Alibaba, you have been checkmated. artiste-qb.net is the only quantum computing firm that has mastered quantum cloning of employees.

Simón was born and raised in Venezuela, but now he lives in Toronto, Canada. He used to work at the illustrious Creative Destruction Lab (part of the U of Toronto) (hence the shirt in the photo). Now he works for artiste-qb.net. Thanks to Simón for letting me use this TOP SECRET photo. I lowered the resolution of the photo, the original one is even better.

May 9, 2018

BBVI in quantum computing, classical vs quantum supervised learning (classical vs quantum ELBO)

Filed under: Uncategorized — rrtucci @ 2:18 am

Version 1 of the computer program “Quantum Edward” that I released a few days ago uses BBVI (Black Box Variational Inference, see Ref. 1 below) to train a qc by maximizing, with respect to a parameter \lambda, a “classical ELBO” (an ELBO defined in terms of classical probability distributions). I call that “classical supervised learning” by a qc (quantum computer).

But one can easily come up with a BBVI that trains a qc by maximizing, with respect to a parameter \lambda, a “quantum ELBO” (one defined by replacing the classical probability distributions of the classical ELBO by density matrices, and sums by traces). I call this second strategy “quantum supervised learning” by a qc.
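For reference, the classical ELBO has the standard variational form

{\rm ELBO}(\lambda) = \sum_\theta q(\theta|\lambda)\left[\ln p(x, \theta) - \ln q(\theta|\lambda)\right],

so the quantum ELBO just described replaces the sum over \theta by a trace and the distributions p and q by density matrices.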

One more distinction. In Version 1 of Quantum Edward, we do C. Supervised Learning by a simulated (on a classical computer, analytical) qc. More generally, one could do (C. or Q.) Supervised Learning by a (real or simulated) qc.

C. or Q. Supervised Learning by a simulated qc is immune to the quantum noise that plagues current qc’s which have almost no quantum error correction. So we definitely should explore that type of learning today.

It will be interesting to compare classification performance for various models (for either layered or DAG models with varying amounts of entanglement) for

  1. C. supervised learning by a classical computer (e.g., for Classical Neural Net layered models or for Bayesian network DAG models)
  2. (C. or Q.) supervised learning by (simulated or real) qc (e.g., for Quantum Neural Network models or for Quantum Bayesian Network models)

Nerd Nirvana will only be achieved once we can do Q. Supervised Learning by an error corrected real qc. 🙂

References:
1. R. Ranganath, S. Gerrish, D. M. Blei, “Black Box Variational
Inference”, https://arxiv.org/abs/1401.0118

May 5, 2018

Quantum Edward, First Commit

Filed under: Uncategorized — rrtucci @ 5:43 am

Today, I uploaded to GitHub the first commit of my “Quantum Edward” software. This blog is among other things a scrapbook of my quantum computing adventures. In this blog post, I want to save a copy of the first README of Quantum Edward. The software is exploratory and therefore will change a lot in the future and its README will change to mirror the changes in the software. So this first README will have a sentimental and comic value for me in years to come. Here it goes:

# Quantum Edward

Quantum Edward at this point is just a small library of Python tools for
doing classical supervised learning on Quantum Neural Networks (QNNs).

An analytical model of the QNN is entered as input into QEdward and the training
is done on a classical computer, using training data already available (e.g.,
MNIST), and using the famous BBVI (Black Box Variational Inference) method
described in Reference 1 below.

The input analytical model of the QNN is given as a sequence of gate
operations for a gate model quantum computer. The hidden variables are
angles by which the qubits are rotated. The observed variables are the input
and output of the quantum circuit. Since it is already expressed in the qc’s
native language, once the QNN has been trained using QEdward, it can be
run immediately on a physical gate model qc such as the ones that IBM and
Google have already built. By running the QNN on a qc and doing
classification with it, we can compare the performance in classification
tasks of QNNs and classical artificial neural nets (ANNs).

Other workers have proposed training a QNN on an actual physical qc. But
current qc’s are still fairly quantum noisy. Training an analytical QNN on a
classical computer might yield better results than training it on a qc
because in the first strategy, the qc’s quantum noise does not degrade the
training.

The BBVI method is a mainstay of the “Edward” software library. Edward uses
Google’s TensorFlow lib to implement various inference methods (Monte Carlo
and Variational ones) for Classical Bayesian Networks and for Hierarchical
Models. H.M.s (pioneered by Andrew Gelman) are a subset of C.B. nets
(pioneered by Judea Pearl). Edward is now officially a part of TensorFlow,
and the original author of Edward, Dustin Tran, now works for Google. Before
Edward came along, TensorFlow could only do networks with deterministic
nodes. With the addition of Edward, TensorFlow now can do nets with both
deterministic and non-deterministic (probabilistic) nodes.

This first baby-step lib does not do distributed computing. The hope is that
it can be used as a kindergarten to learn about these techniques, and that
then the lessons learned can be used to write a library that does the same
thing, classical supervised learning on QNNs, but in a distributed fashion
using Edward/TensorFlow on the cloud.

The first version of Quantum Edward analyzes two QNN models called NbTrols
and NoNbTrols. These two models were chosen because they are interesting to
the author, but the author attempted to make the library general enough so
that it can accommodate other akin models in the future. The allowable
models are referred to as QNNs because they consist of ‘layers’,
as do classical ANNs (Artificial Neural Nets). TensorFlow can analyze
layered models (e.g., ANN) or more general DAG (directed acyclic graph)
models (e.g., Bayesian networks).

References
———-

1. R. Ranganath, S. Gerrish, D. M. Blei, “Black Box Variational
Inference”, https://arxiv.org/abs/1401.0118

2. https://en.wikipedia.org/wiki/Stochastic_approximation
discusses Robbins-Monro conditions

3. https://github.com/keyonvafa/logistic-reg-bbvi-blog/blob/master/log_reg_bbvi.py

4. http://edwardlib.org/

5. https://discourse.edwardlib.org/

April 13, 2018

Toronto Quantum Computing Meetup, Next Event Featuring 3 Local Stars

Filed under: Uncategorized — rrtucci @ 4:39 pm

The Toronto Quantum Computing Meetup is the largest meetup in the world dedicated to quantum computing, so we claim the title of Quantum Meetup Supremacy, at least for now. (Currently we have 1101 Supremos as members. The second biggest club is in London, with 951 Brexiter members. The Brexiters have been growing fast lately. We see what you are doing: trying to sneak up on us and steal our crown. You guys are pathetic! It will never happen! Never! Grow up, you bunch of Peter Paners!)

We cordially invite you to our next meeting on Friday, April 20, 2018. The Event will feature 3 local stars, “Tres Amigos”, speaking about 3 different quantum computing related topics:

  1. Colin Lupton (from Black Brane Systems Inc.)
  2. Turner Silverthorne (from Zero Gravity Labs, part of Loyalty One)
  3. Hassan Bhatti (from CDL- Creative Destruction Lab, part of U of Toronto’s Rotman School of Management)

There will be FREE PIZZA, courtesy of ZGL

We are eternally grateful to ZGL (Zero Gravity Lab) for providing the venue for the event. ZGL is the super cool research lab of Loyalty One.

Loyalty One is one of the largest loyalty marketers in Canada.

April 7, 2018

TensorFlow Versus TensorLayers (for both classical and quantum cases)

Filed under: Uncategorized — rrtucci @ 4:39 am


IBM announces partnership with 8 startups and Zapata announces $5.4M seed investment

Filed under: Uncategorized — rrtucci @ 3:22 am

Check out this IBM Press Release. It announces that the following 8 startups will be joining the “IBM Q Network”:

  1. Zapata (U of Toronto, Aspuru-Guzik)
  2. Strangeworks (Austin TX, Whurley)
  3. QXbranch (Australia, M. Brett)
  4. Quantum Benchmark (Waterloo Canada, Lazaridis)
  5. QCWare (Ames-NASA)
  6. Q-CTRL (Univ. of Sydney, Biercuk)
  7. Cambridge Quantum Computing (London, Ilyas Khan)
  8. 1QBit (Vancouver)

It seems that the main thing these startups are getting is free access, which is not granted to everyone, to the 50 qubit IBM quantum computer. Not exactly like winning the lottery though. I suspect that in a few weeks, Google will grant everyone free access to their 72 qubit quantum computer.

The first company to be mentioned is Zapata, which starts with the letter Z. Huh?? Inverse alphabetical order? You’ve got to be kidding me. Isn’t Zapata soon going to be one of IBM’s main competitors in the quantum chemistry arena? Isn’t IBM betting their qc farm on quantum chemistry? I hope, for IBM’s sake, that the mastermind at IBM who conceived this program knows what he is doing.

Zapata has been much in the news lately.

Aspuru-Guzik, prof at Harvard for almost a decade and famous for his work using quantum computers to do chemistry, is moving to the U of Toronto in July. (I suspect that Matthias Troyer, another quantum chemistry eminence, will be offered, and will accept, the position at Harvard being vacated by Aspuru. If that happens, this will totally, completely deplete Microsoft’s quantum chemistry brain trust. He he).

Aspuru started Zapata a few months ago. Yesterday, Zapata announced that it obtained $5.4M in seed funding!!

All this Zapata news is very good news for us, artiste-qb.net. Our business plan has nothing to do with quantum chemistry, so there is very little overlap between Zapata and us. Zapata will attract qc talent to Toronto, artiste-qb.net’s home town. Such a high valuation for Zapata makes Artiste a real bargain.

3 of the 8 companies in the above list got their original funding less than 5 years ago by promising investors that they would write software to run on DWave’s annealer quantum computer. But now they claim they were IBM’s best buddies and gate model experts all along. Gate model and annealer software look nothing alike. Judases!

All this IBM press coverage must have Microsoft stewing with envy. MS won’t have an anyon quantum computer for at least 5 years, if ever. But MS could easily compete with Google and IBM in the transmon quantum computer arena by buying Rigetti (or some similar, alternative startup, like Yale QC). The price of Rigetti is pocket change to MS. I’m sure such a sale is being actively discussed behind closed doors. After all, very few Silicon Valley startups ever reach IPO; instead, they either go bankrupt or are bought out by one of the giant Valley companies.

April 6, 2018

PyMC and Edward/TensorFlow Merging?

Filed under: Uncategorized — rrtucci @ 9:09 pm

News bulletin: Edward is now officially a part of TensorFlow and PyMC is probably going to merge with Edward.

The python software library Edward enhances TensorFlow so that it can harness both Artificial Neural Nets and Bayesian Networks. The main architect of Edward, Dustin Tran, wrote its initial versions as part of his PhD Thesis at Columbia Univ. (Columbia is the home of the illustrious Andrew Gelman, one of the fathers of hierarchical models, which are a special case of Bayesian networks). Dustin now works at Google as part of a team merging Edward with TensorFlow.

One of the near term goals of artiste-qb.net is to produce a quantum generalization of Edward. This would not run on a quantum computer but would simulate on a distributed classical computer possible experiments that could in the future be conducted on a qc.

I highly recommend the following two Discourses for Edward and PyMC:

It looks like the python software libs PyMC3 and Edward may soon merge:

https://discourse.pymc.io/t/tensorflow-backend-for-pymc4/409

This is very good news, in my opinion, because I am in love with both programs. It’s interesting to note that the current Numpy is also the result of the fortuitous marriage of two separate complementary software libs.

One can now call PyMC3 and Edward from Quantum Fog, although not very smoothly yet. See here.
