Quantum Bayesian Networks

February 15, 2019

Derivative of matrix exponential wrt each element of Matrix

Filed under: Uncategorized — rrtucci @ 1:26 am

In the quantum neural net field, in order to do backpropagation, one often wishes to take the derivative of a unitary matrix with respect to a parameter it depends on. The wonderful software PennyLane by Xanadu evaluates such derivatives using a simple formula that gives an exact answer, albeit only in special cases. Here I will discuss a simple formula that is fully general, albeit only an approximation, though reputedly a very good one, probably owing to its symmetric form and the smoothness of the exponential function. The method is a simple symmetric finite-difference approximation taken with a complex step.

In a StackExchange question with exactly the same title as this post, somebody called Doug suggested what he calls Higham’s “Complex Step Approximation”, to wit:

If A is a real matrix and E_{rs} is the matrix which is 1 at position (r,s) and zero elsewhere, then

\frac{d}{dA_{rs}}e^{A} \approx  \frac{ e^{A + ihE_{rs}} - e^{A-ihE_{rs}} }{2ih} = \frac{{\rm Im}(e^{A + ihE_{rs}})}{h}
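
Here is a minimal numpy/scipy sketch of this formula (the helper name dexpm_real and the step sizes are my choices, not from the StackExchange answer):

```python
import numpy as np
from scipy.linalg import expm

def dexpm_real(A, r, s, h=1e-20):
    """Complex-step approximation of d(e^A)/dA_{rs} for a real matrix A."""
    E = np.zeros(A.shape)
    E[r, s] = 1.0
    # No subtraction of nearly equal quantities occurs, so h can be tiny.
    return expm(A + 1j * h * E).imag / h

# Quick sanity check against an ordinary central difference:
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
r, s, eps = 1, 2, 1e-6
E = np.zeros((4, 4))
E[r, s] = 1.0
fd = (expm(A + eps * E) - expm(A - eps * E)) / (2 * eps)
print(np.max(np.abs(dexpm_real(A, r, s) - fd)))  # small (central-difference error)
```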

But what if A is a Hermitian matrix and we want the derivative of exp(iA)? Here is a simple adaptation of Higham’s formula to that case.

Let E^\pm_{rs} = E_{rs} \pm E_{sr}. Note that (E^\pm_{rs})^\dagger = \pm E^\pm_{rs}.

Define a matrix M(A) by

M(A) = \left[  \begin{array}{cc} 0 &-e^{-iA}\\ e^{iA} & 0 \end{array}\right]
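
In code, the same object is easy to build (a minimal sketch, assuming numpy and scipy):

```python
import numpy as np
from scipy.linalg import expm

def M(A):
    """The block matrix M(A) = [[0, -exp(-iA)], [exp(iA), 0]]."""
    Z = np.zeros(A.shape, dtype=complex)
    return np.block([[Z, -expm(-1j * A)], [expm(1j * A), Z]])
```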

Then, writing ih as shorthand for ih times the identity matrix,

M(A+ih) = \left[ \begin{array}{cc} 0 &-e^{-iA+h}\\ e^{iA-h} & 0 \end{array}\right]

so

M(A+ih)^\dagger = \left[ \begin{array}{cc} 0 &e^{-iA-h}\\ -e^{iA+h} & 0 \end{array}\right] =-M(A-ih)

From this, one learns the following simple recipe: the effect of the dagger on M(A) is to put a minus sign in front of the M and to take the Hermitian conjugate of the argument too, i.e., M(X)^\dagger = -M(X^\dagger) for any square matrix X.
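
This recipe is easy to confirm numerically (a quick sketch; the test matrix is arbitrary):

```python
import numpy as np
from scipy.linalg import expm

def M(A):
    Z = np.zeros(A.shape, dtype=complex)
    return np.block([[Z, -expm(-1j * A)], [expm(1j * A), Z]])

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
# M(X)^dagger should equal -M(X^dagger) for any square X:
print(np.max(np.abs(M(X).conj().T + M(X.conj().T))))  # ~1e-15
```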

Therefore,

\frac{d}{d{\rm Re\;}A_{rs}}M(A) \approx \frac{ M(A + ihE^+_{rs}) - M(A-ihE^+_{rs}) }{2ih}= \frac{{\rm Re\;}[M(A + ihE^+_{rs})]}{ih}

where {\rm Re\;}X := (X + X^\dagger)/2 denotes the Hermitian part of a matrix X (the matrix analogue of the real part of a complex number),

and

\frac{d}{d{\rm Im\;}A_{rs}}M(A) \approx i\;\frac{ M(A + hE^-_{rs}) - M(A-hE^-_{rs}) }{2h}= \frac{i\,{\rm Re\;}[M(A + hE^-_{rs})]}{h}.

(The extra factor of i arises because \partial A/\partial\,{\rm Im\;}A_{rs} = iE^-_{rs} and M is a holomorphic function of its argument.)

Since

\frac{d}{dx}M(A) = \left[ \begin{array}{cc} 0 &-\frac{d}{dx}e^{-iA}\\ \frac{d}{dx}e^{iA} & 0 \end{array}\right] ,

it follows that

\frac{d}{dx}e^{iA} = \left[\frac{d}{dx}M(A)\right]_{10}

for x = {\rm Re\;} A_{rs} or x = {\rm Im\;} A_{rs}, where the subscript 10 picks out the lower-left block.
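
Putting the pieces together, here is a minimal numpy/scipy sketch of the whole recipe for a Hermitian A, checked against central differences that perturb A_{rs} and A_{sr} directly (the function names are mine):

```python
import numpy as np
from scipy.linalg import expm

def M(A):
    """The block matrix M(A) = [[0, -exp(-iA)], [exp(iA), 0]]."""
    Z = np.zeros(A.shape, dtype=complex)
    return np.block([[Z, -expm(-1j * A)], [expm(1j * A), Z]])

def d_expiA(A, r, s, h=1e-6):
    """Approximate d(e^{iA})/d(Re A_rs) and d(e^{iA})/d(Im A_rs) for Hermitian A.

    Uses the symmetric differences above; e^{iA} is the lower-left block of M(A).
    """
    n = A.shape[0]
    E = np.zeros((n, n))
    E[r, s] = 1.0
    Ep = E + E.T   # E^+_{rs}, Hermitian
    Em = E - E.T   # E^-_{rs}, anti-Hermitian
    dM_re = (M(A + 1j * h * Ep) - M(A - 1j * h * Ep)) / (2j * h)
    dM_im = 1j * (M(A + h * Em) - M(A - h * Em)) / (2 * h)
    return dM_re[n:, :n], dM_im[n:, :n]

# Check on a random Hermitian matrix:
rng = np.random.default_rng(1)
Y = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = Y + Y.conj().T
r, s, eps = 0, 2, 1e-6

def perturbed(dz):
    """Shift A_rs by dz and A_sr by dz^*, keeping the perturbed matrix Hermitian."""
    B = A.copy()
    B[r, s] += dz
    B[s, r] += np.conj(dz)
    return B

dRe, dIm = d_expiA(A, r, s)
dRe_fd = (expm(1j * perturbed(eps)) - expm(1j * perturbed(-eps))) / (2 * eps)
dIm_fd = (expm(1j * perturbed(1j * eps)) - expm(1j * perturbed(-1j * eps))) / (2 * eps)
print(np.max(np.abs(dRe - dRe_fd)), np.max(np.abs(dIm - dIm_fd)))  # both small
```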

Note:
When A is Hermitian, A_{sr} = A^*_{rs}, so A_{rs} and A_{sr} are complex conjugates and hence not independent. The real and imaginary parts of A_{rs} are independent, however, so one can treat A_{rs} and A^*_{rs} as independent variables and then change variables from (A_{rs}, A^*_{rs}) to ({\rm Re}A_{rs}, {\rm Im}A_{rs}).
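
Explicitly, this change of variables is the standard Wirtinger one; the partial derivatives are related by

\frac{\partial}{\partial\,{\rm Re}A_{rs}} = \frac{\partial}{\partial A_{rs}} + \frac{\partial}{\partial A^*_{rs}}, \;\;\;\; \frac{\partial}{\partial\,{\rm Im}A_{rs}} = i\left(\frac{\partial}{\partial A_{rs}} - \frac{\partial}{\partial A^*_{rs}}\right)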

For discussion of this topic in the context of quantum neural networks, see the following thread on the PennyLane discourse website:
https://discuss.pennylane.ai/t/when-does-autograd-need-help/82
