Introduction to Quantum Machine Learning
Overview and motivation
Before beginning, please complete this short pre-course survey, which is important to help improve our content offerings and user experience.
Welcome to quantum machine learning!
The video below gives a brief introduction, which is supplemented by the text that follows.
To briefly recap and augment the video:
- We have seen a problem be solved for the first time on a quantum computer, and then subsequently people find a way to do it on a classical supercomputer. This cycle of classical and quantum computing pushing each other to their limits will likely continue for a few years.
- There are specific problems where quantum computing can have a provable advantage over classical computing, given progress in areas such as reduction of errors and the number of qubits available. But this is still a time of exploration, searching for quantum-amenable datasets and useful quantum feature maps.
- Quantum machine learning (QML) is one of many exciting areas where quantum computing can augment or complement existing classical workflows.
Machine learning (ML) applies algorithms to data sets, so QML might plausibly involve quantum mechanics on the data side, the algorithmic side, or both. All of these possibilities are potentially interesting, but we will mostly restrict ourselves to discussions of quantum algorithms applied to classical data. One reason for this is that ML problems with classical data are already well studied and widely available, and there is broad interest in solving problems that start with classical data. Another reason is the lack of QRAM: without the ability to store large amounts of quantum data on a relatively long timescale, methods that begin with quantum data are still fairly far from applicability to industry. It is also unclear how to "quantumly access" classical data in an efficient manner.

Two types of ML are of particular interest: supervised learning, in which you train an algorithm using a labeled data set, and unsupervised learning, in which the algorithm attempts to learn about a distribution from unlabeled samples. An unsupervised algorithm might, for example, learn how to generate new samples from the same distribution, or how to cluster the samples into groups with similar characteristics.

The left image shows two categories of labeled data, as in supervised learning. In this case, the categories are linearly separable. The right image shows clusters of data. In an unsupervised learning task, these data would not initially be labeled, and the algorithm would study the distribution, perhaps looking for clusters. To visualize example clusters such an algorithm might identify, the data points in the image have been labeled. A key difference between the two settings is that supervised learning starts with the data already labeled, while the unsupervised process starts with unlabeled data, even if labels are assigned at the end.
Those with background in machine learning will already know that many solution methods involve mapping data into higher-dimensional spaces. This is especially well-explored in the context of kernels. As a brief reminder, sometimes data may be separable into categories by a line, plane, or hyperplane (we will often simply say "hyperplane" for compactness), in the same number of dimensions as the data are given. This is shown in the first image above. Other times, data may not be separable by a hyperplane in those dimensions, as shown in the second image. But there can still be structure to the data that can be exploited in a mapping to higher dimensions, which then leaves the data separable in that higher-dimensional space. This is illustrated in the mapping of the 2D data with circular symmetry into the 3D space in which the data points are arranged along a paraboloid surface.
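The paraboloid mapping described above can be sketched in a few lines of numpy. The data here are hypothetical: two concentric rings of points that no line in 2D can separate, lifted by the map (x, y) → (x, y, x² + y²) so that a horizontal plane in 3D separates them.

```python
import numpy as np

# Toy data with circular symmetry: an inner ring (class 0, radius 0.5)
# and an outer ring (class 1, radius 2.0). No 2D line separates them.
theta = np.linspace(0, 2 * np.pi, 16, endpoint=False)
inner = 0.5 * np.column_stack([np.cos(theta), np.sin(theta)])
outer = 2.0 * np.column_stack([np.cos(theta), np.sin(theta)])

def lift(points):
    """Map (x, y) -> (x, y, x**2 + y**2): points land on a paraboloid in 3D."""
    x, y = points[:, 0], points[:, 1]
    return np.column_stack([x, y, x**2 + y**2])

# In the lifted space the plane z = 1 separates the classes:
# inner points all have z = 0.25, outer points all have z = 4.0.
z_inner = lift(inner)[:, 2]
z_outer = lift(outer)[:, 2]
print(z_inner.max() < 1.0 < z_outer.min())  # True
```

The third coordinate encodes exactly the circular structure of the data, which is why this particular lift works; choosing a feature map that matches the structure of the data is the hard part in general.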

A common goal in QML is to find a mapping from the lower-dimensional set of features into a higher-dimensional space that effectively separates our data points, so we can use the mapping to classify new data points. But this is not an easy task, and any discussion of the potential usefulness of quantum computing in machine learning must be accompanied by the appropriate caveats. In particular, we must address the nuance in dataset selection and the challenges in reaching utility scale. We must also shift away from trying to outperform classical ML algorithms on data that are already handled efficiently and well by classical algorithms, and refocus the discussion on investigating new feature maps that could be useful.
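To make the idea of a quantum feature map concrete, here is a minimal, hedged sketch using plain numpy rather than a quantum SDK. It assumes the simplest possible encoding (one qubit, one feature, a rotation RY(x) applied to |0⟩) and computes the resulting fidelity kernel k(x, x') = |⟨φ(x)|φ(x')⟩|². This toy map is classically trivial to simulate; it only illustrates the mechanics, not a quantum advantage.

```python
import numpy as np

def ry_state(x):
    """Single-qubit angle encoding: |phi(x)> = RY(x)|0> = [cos(x/2), sin(x/2)]."""
    return np.array([np.cos(x / 2), np.sin(x / 2)])

def fidelity_kernel(x1, x2):
    """Kernel entry k(x1, x2) = |<phi(x1)|phi(x2)>|^2."""
    return abs(ry_state(x1) @ ry_state(x2)) ** 2

xs = np.array([0.0, 0.7, 2.1])
K = np.array([[fidelity_kernel(a, b) for b in xs] for a in xs])

# For this encoding the kernel has the closed form cos^2((x1 - x2) / 2),
# so we can check the statevector computation against it.
print(np.allclose(K, np.cos(np.subtract.outer(xs, xs) / 2) ** 2))  # True
print(np.allclose(np.diag(K), 1.0))  # True: each state has unit overlap with itself
```

Kernels of this fidelity form can be plugged into a classical kernel method (such as a support vector machine); the open research question is which circuit encodings produce kernels that are both hard to simulate classically and useful for real data.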
Managing expectations
Many data sets used in QML applications described in the literature are “feature engineered”, meaning a dataset is selected or generated specifically to show a narrow use case in which quantum computing is useful. If this seems like cheating, then we are misunderstanding the task at hand. It is not the case that some quantum feature maps enable us to solve all or many classification tasks more efficiently or scalably than classical machine learning algorithms. Rather, some quantum feature maps (not all) behave differently from classical feature maps. The task at hand is then to explore quantum circuits in the context of complex data structures. Some specific questions to address are:
- What quantum circuits are most likely to behave in novel ways, compared to classical alternatives?
- Are there real-world problems that involve data with properties best explored using such novel quantum circuits?
- Do these quantum circuits scale on near-term quantum computers?
Insufficient explanation
One often encounters a simplified explanation of how quantum computing can be powerful. It goes something like this:
Just as classical computers use bits of information, quantum computers use qubits. Given a number of bits, say 4, a classical computer can take on any one of 2⁴ = 16 possible states, whereas a quantum computer can exist in a superposition of all 16 states simultaneously, and operations can be performed on this entire superposition. In some cases, this naturally allows us to design potentially interesting learning algorithms based on mappings to higher-dimensional spaces.
This is a true statement, but it is inadequate, and a bit misleading as we will explain. One also sees the differences between complex and real coefficients emphasized, as in:
A probabilistic classical system, in which the system has certain probabilities of being in different states, can be described as a weighted mixture of those states: p₁ (state 1) + p₂ (state 2) + p₃ (state 3) + ⋯.
In such a system, the coefficients p₁, p₂, p₃, and so on are only meaningful if they are non-negative real numbers, and they must sum to 1. The states in quantum computers are instead described by probability amplitudes, which can be complex numbers.
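A minimal numpy sketch of this distinction: a classical probability vector has non-negative real entries summing to 1, while a quantum statevector has complex amplitudes whose squared magnitudes sum to 1, and those amplitudes can interfere destructively in a way probabilities cannot.

```python
import numpy as np

# Classical probabilistic bit over two states: non-negative reals summing to 1.
p = np.array([0.5, 0.5])
print(bool(np.all(p >= 0)) and np.isclose(p.sum(), 1.0))  # True

# Quantum state over two basis states: complex amplitudes; the measurement
# probabilities are |c|^2 and must sum to 1.
c = np.array([1 / np.sqrt(2), 1j / np.sqrt(2)])
print(np.isclose(np.sum(np.abs(c) ** 2), 1.0))  # True

# Amplitudes can cancel: applying a Hadamard gate twice to |0> returns |0>,
# because the two contributions to the |1> amplitude interfere destructively.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
state = H @ H @ np.array([1, 0])
print(np.allclose(state, [1, 0]))  # True
```

A purely probabilistic mixture of states could never "un-mix" itself this way, which is one concrete sense in which complex amplitudes differ from classical probabilities.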