Machine Learning Meets Quantum: Getting Started with TensorFlow Quantum

Ping Zhou, 2020-11-23

Introduction

Earlier this year, Google announced TensorFlow Quantum. It’s a framework for building Hybrid Quantum-Classical Machine Learning models. So now we are entering the era of Quantum Machine Learning - sounds exciting!

TensorFlow Quantum logo

You may wonder how these two very different worlds work together, and what these hybrid ML models look like. In this notebook, I’m going to walk you through the “Hello, many world” example to explain the basic ideas of building quantum-classical hybrid ML model and how it works on a real problem.

Concepts

Before we start with the code, I’m going to explain some basic concepts that are needed for understanding this example. Don’t be scared by the math notations in this section - they are not as hard as they might appear!

Please be noted that this notebook does not serve as a tutorial of quantum computing - if you need more comprehensive information, please refer to a textbook.

Qubit, State Vector and Bloch Sphere

Qubit: You might have already heard about the term “qubit”. Comparing to a classical bit that can be at either of the two states (e.g. 0 or 1) at any time, a qubit can be at a “superposition” of both states. The state of a qubit \(|\psi\rangle\) can be expressed as:

\begin{matrix} |\psi\rangle = \alpha|0\rangle + \beta|1\rangle \end{matrix}

So a qubit can be represented as a vector with two complex numbers:

\begin{pmatrix} \alpha \\ \beta \end{pmatrix}

Now we know how to describe a qubit. What about a quantum system? For a system with one or more qubits, we use state vector to describe the system’s current state. The state vector of the system is the tensor product of all qubits in it. For example, if we have a system with two qubits \(\begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix}\) and \(\begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix}\), its state vector is their tensor product:

\begin{matrix} \begin{pmatrix}\alpha_1 \\ \beta_1\end{pmatrix} \otimes \begin{pmatrix}\alpha_2 \\ \beta_2\end{pmatrix} = \begin{pmatrix} \alpha_1\alpha_2 \\ \alpha_1\beta_2 \\ \beta_1\alpha_2 \\ \beta_1\beta_2 \end{pmatrix} \end{matrix}

If you have used Cirq (Google’s Quantum framework), you’ll notice a final_state member in simulation results – That’s the state vector of the system. You may refer to this notebook for an example usage of state vectors.

Another useful way of representing qubit is using the Bloch Sphere. Every qubit can be represented as a vector on the surface of the Bloch Sphere.

A qubit \(|\psi\rangle\) on Bloch sphere can also be written as:

\begin{matrix} |\psi\rangle = \begin{pmatrix} \cos{\frac{\theta}{2}} \\ e^{i\phi}\sin\frac{\theta}{2} \end{pmatrix} \end{matrix}

Operators and Observables

Operators

What are “operators”? In short, an operator tranform qubit(s) from one state to another. Since qubits can be written as vectors, the operators are essentially matrices.

Observables

Suppose we have a quantum system. How do we know about its state? The only way is through measurements. Physical quantities that can be measured are called Observables. For example, position and momentum are observables.

In quantum world, every observable correspond to a Hermitian Operator - they are a set of operators with special properties (e.g. they equal to their conjugate transpose). In this example, we’ll use Z operator as the observable. The matrix of Z operator is like this:

\begin{matrix} Z= \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \end{matrix}

If you apply Z operator on a qubit \(\begin{pmatrix}\alpha \\ \beta\end{pmatrix}\), you’ll get this \(\begin{pmatrix}\alpha \\ -\beta\end{pmatrix}\).

Expectation Values

Being able to get state vectors of your system is nice - it means you know everything of your system. Unfortunately this is only available in simulator. In real world, it’s usually impossible to get the state vector of your system. Instead, what you’ll get is the expectation value of your system.

Expectation value is actually a concept from probability theory. Imagine you are throwing a dice which gives you a number between 1 and 6 each time. The expectation value of your dice is the sum of all possible outcomes weighted by their probabilities. Assuming your dice is perfect (i.e. each of the 6 outcomes has equal probability), the expectation value of your dice is:

\begin{matrix} \frac{1}{6}\times 1 + \frac{1}{6}\times 2 + ... + \frac{1}{6}\times 6 = 3.5 \end{matrix}

When we talk about an expectation value in quantum world, we need to specify both the state and the operator (observable).

For example, we won’t say

“Expectation value of this state vector is 0.5”

Instead, we’ll say

“Expectation value of this state vector on Z operator is 0.5”*.

Why? Because different observables will give different expectation values. As an analogy, think about projecting a vector onto an axis - choosing a different axis will give you different projection value.

So what exactly is the expectation value of a state vector on Z operator? To save your time, I’ll just write down the answer here:

Expectation value of a state vector on Z observable is \(\cos\theta\), which is its projection on Z axis.

If you are interested, you are welcome to work out the mathematics. It’s actually pretty straightforward. :-)

We can look at a few examples to validate this conclusion:

If current state is at “north pole” of the Bloch Sphere, we get 100% chance of getting +Z from measurements. Expectation value is \(\cos 0 = 1\).
If current state is at “south pole” of the Bloch Sphere, we get 100% chance of getting -Z from measurements. Expectation value is \(\cos\pi = -1\).
If current state is at point D on Bloch Sphere, we get 50% chance of getting +Z and 50% chance of -Z from measurements. Expectation value is \(\cos\frac{\pi}{2}=0\).

Things match up. Fantastic!

What Are We Trying to Solve?

To understand the hybrid model we are going to build, we need to first understand the problem we are trying to solve.

Imagine we have a quantum circuit to run. The first thing we need to do is to prepare the initial qubits for the circuit (typically \(|0\rangle\) or \(|1\rangle\)).

In simulations, we assume perfect qubits to be fed into our circuit. But in the real world, our system is noisy. Therefore the qubits we prepare are not perfect - they could deviate from the states we want. If you consider this on Bloch Sphere, it is as if a “perfect” qubit has been rotated in X/Y/Z directions with unknown angles, making it imperfect. This can be emulated as putting Rotation gates on the perfect qubit.

The angles \(\beta_1\), \(\beta_2\), \(\beta_3\) caused by noise are unknown to us.

In order for any quantum computing circuit to run properly, we need to calibrate our qubit to the target state we want. How do we do that? We simply rotate the qubit on X/Y/Z directions again to compensate for the noises. So basically we’ll apply Rotation gates to the noisy qubit:

Our problem is to determine the three parameters (\(\theta_1\), \(\theta_2\), \(\theta_3\)) for our calibration circuit. We’ll build a quantum-classical hybrid machine learning model to solve this problem.

Quantum-Classical Hybrid Machine Learning Model

The original notebook has lot of details and may not be very straightforward. I find it a better way to look from a high level first.

What we want to achieve is a model that calibrates noisy qubit:

Since we are building a Machine Learning model, we need to answer these important questions:

What are the inputs (datasets)?
What are the labels?
What are the outputs?

To help you understand the whole picture, I made an architectural diagram of our hybrid Quantum-Classical ML model. I’ll explain the components in the following sections.

Datasets

Our dataset includes two parts: commands and the noisy circuit.

Commands

“Commands” correpond to the target states we want our qubits to achieve. In this example, command 0 means we want the circuit to prepare a qubit at the “south pole” (\(|0\rangle\)), and command 1 means “north pole” (\(|1\rangle\)).

Noisy input

This is the circuit that emulates the noisy qubit that needs to be calibrated. TensorFlow Quantum (TFQ) provides a function tfq.convert_to_tensor to convert circuit objects from Cirq into tensors:

circuit_tensor = tfq.convert_to_tensor([circuit])

Labels

The labels are the expectation values of the target states we want our qubits to achieve. In this example, we want the qubit to be at \(|0\rangle\) (for input command 0) or \(|1\rangle\) (for input command 1). As we have seen earlier, expectation values at \(|0\rangle\) and \(|1\rangle\) are 1 and -1 respectively. Therefore our labels correspond to input commands are [1, -1].

Outputs

The outputs of our hybrid model are the expectation values of the calibrated qubits.

Model

Our model will include both classical and quantum parts.

The classical part is the “controller” in the diagram. It’s a normal DNN, nothing special. Its outputs are the parameters (\(\theta_1\), \(\theta_2\), \(\theta_3\)) for calibration circuit.
The quantum part is the ControllPQC layer (we’ll name it expection_layer). It takes the parameters (\(\theta_1\), \(\theta_2\), \(\theta_3\)) from controller, uses them to run the calibration circuit (model_circuit), and feeds back the outputs (expectation values) to the controller for training.

So in summary, this is what we tell our model to do:

Here is a nosiy qubit (circuit_input) and two commands (0, 1). We want you to spit out the qubit with expectation value of 1 on command 0 and expection value of -1 on command 1.

Does it make more sense now? :-)

Code Walkthrough

Now we have a decent understanding of the problem and our model. Let’s get to the code.

Setup

You’ll need to install TensorFlow (2.3.1+) and TensorFlow Quantum on your computer. The easiest way is to use pip:

pip install -q tensorflow==2.3.1
pip install -q tensorflow-quantum

Import modules

Import TFQ and the dependency modules:

import tensorflow as tf
import tensorflow_quantum as tfq

import cirq
import sympy
import numpy as np

# visualization tools
%matplotlib inline
import matplotlib.pyplot as plt
from cirq.contrib.svg import SVGCircuit

Prepare the Inputs

First we’ll create a qubit and prepare the dataset.

# This creates a "perfect" qubit at |0>.
# It will be fed into the noisy preparation circuit to emulate a noisy qubit.
qubit = cirq.GridQubit(0, 0)

# Dataset: commands_input (0 and 1)
commands = np.array([[0], [1]], dtype=np.float32)

Create a noisy circuit with randomly generated angles, and convert it into tensor so it can be used by TensorFlow.

Note that we’ll need two copies of the circuit because we have two entries in the dataset (command 0 and command 1).

# Dataset: circuit_input (noisy preparation)
# This emulates the noisy qubit by applying X/Y/Z rotations with random angles.
random_rotations = np.random.uniform(0, 2 * np.pi, 3)
noisy_preparation = cirq.Circuit(
  cirq.rx(random_rotations[0])(qubit),
  cirq.ry(random_rotations[1])(qubit),
  cirq.rz(random_rotations[2])(qubit)
)

# Convert the circuit object to tensor.
# Two copies of the circuit because we have two commands (0 and 1) to train.
datapoint_circuits = tfq.convert_to_tensor([
  noisy_preparation
] * 2)

Print out the noisy circuit. We can see that the “perfect” qubit has been rotated on X, Y and Z axis with some random angles.

# Print out the noisy qubit
print(random_rotations)
print(noisy_preparation)

[3.84902356 3.3647054  3.03737137]
(0, 0): ───Rx(1.23π)───Ry(1.07π)───Rz(0.967π)───

We’ll then create the Keras input from the datasets. These will be used when we call tf.keras.Model() later.

# Create keras inputs from the datasets
circuits_input = tf.keras.Input(shape=(),
                                # The circuit-tensor has dtype `tf.string` 
                                dtype=tf.string,
                                name='circuits_input')
commands_input = tf.keras.Input(shape=(1,),
                                dtype=tf.dtypes.float32,
                                name='commands_input')

Prepare the Labels

The labels are the expectation values we want for each command. In this example, we want 1 for command 0 and -1 for command 1.

# Labels: target expectation values (1 for command 0, -1 for command 1)
expected_outputs = np.array([[1], [-1]], dtype=np.float32)

Build our model

Classical part of the model is a simple DNN. Since the problem is fairly small, a simple DNN with two layers will be good enough.

# Classical part of our model - "controller".
# Since the problem is fairly small, a simple DNN is enough.
controller = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation='elu'),
    tf.keras.layers.Dense(3)
])

Then the quantum part of our label. We first use the Cirq API to build our calibration circuit (``model_circuit``) with three parameters (\(\theta_1\), \(\theta_2\), \(\theta_3\)).

# Quantum part of our model - "model_circuit"
# Parameters that the classical NN will feed values into.
control_params = sympy.symbols('theta_1 theta_2 theta_3')

# Create the parameterized circuit.
qubit = cirq.GridQubit(0, 0)
model_circuit = cirq.Circuit(
    cirq.rz(control_params[0])(qubit),
    cirq.ry(control_params[1])(qubit),
    cirq.rx(control_params[2])(qubit))

SVGCircuit(model_circuit)

How do we incorporate this calibration circuit (model_circuit) into our model? We need to wrap it into a Keras layer so it can be connected with other layers.

In this example we’ll wrap our circuit with the “Parameterized Quantum Circuit” (PQC) layer: tfq.layers.ControlledPQC. We name it expectation_layer because it outputs the expection values.

Also note the argument operators when we create the layer: Remember that we need to specify operator when we talk about expectation value. This is how we tell PQC that we’ll take expectation values on Z operator.

After expectation_layer is created, we then connect it with the noisy circuit (circuits_input) and calibration parameters (outputs from controller, which is dense_2).

dense_2 = controller(commands_input)

# TFQ layer for classically controlled circuits.
expectation_layer = tfq.layers.ControlledPQC(model_circuit,
                                             # Observe Z
                                             operators = cirq.Z(qubit))
expectation = expectation_layer([circuits_input, dense_2])

We then build the model in Keras just as usual.

model = tf.keras.Model(inputs=[circuits_input, commands_input],
                       outputs=expectation)

A little visualization of our model…

tf.keras.utils.plot_model(model, show_shapes=True, dpi=70)

Train our model

We then train our model just like a normal DNN, using Adam as optimizer and MSE as the loss function.

# Now we can train the model...
optimizer = tf.keras.optimizers.Adam(learning_rate=0.05)
loss = tf.keras.losses.MeanSquaredError()
model.compile(optimizer=optimizer, loss=loss)
history = model.fit(x=[datapoint_circuits, commands],
                    y=expected_outputs,
                    epochs=30,
                    verbose=0)

Training history shows that we are doing good…

# Plot the history - looks good!
plt.plot(history.history['loss'])
plt.title("Learning to Control a Qubit")
plt.xlabel("Iterations")
plt.ylabel("Error in Control")
plt.show()

Verify our model

We can verify our model by calling it with commands 0 and 1:

model([datapoint_circuits, commands])

<tf.Tensor: shape=(2, 1), dtype=float32, numpy=
    array([[ 0.9891974],
           [-0.9958328]], dtype=float32)>

As we can see, our model has calibrated the qubit into pretty well:

On command 0, it gave expectation value of 0.98114353
On command 1, it gave expectation value of -0.9409604

Both are very close to the target (1 and -1).

Congratulations! You have successfully trained your first Hybrid Quantum-Classical Machine Learning Model!

Next Steps

In the next step, I’ll expand this example into a more generic case where we can calibrate qubit with expectation values on other operators.