Singularity Machine Learning - Classification: A Qiskit Function by Multiverse Computing
- Qiskit Functions are an experimental feature available only to IBM Quantum® Premium Plan, Flex Plan, and On-Prem (via IBM Quantum Platform API) Plan users. They are in preview release status and subject to change.
Overview
With the "Singularity Machine Learning - Classification" function, you can solve real-world machine learning problems on quantum hardware without requiring quantum expertise. This Application function, based on ensemble methods, is a hybrid classifier. It leverages classical methods like boosting, bagging, and stacking for initial ensemble training. Subsequently, quantum algorithms such as variational quantum eigensolver (VQE) and quantum approximate optimization algorithm (QAOA) are employed to enhance the trained ensemble's diversity, generalization capabilities, and overall complexity.
Unlike other quantum machine learning solutions, this function is capable of handling large-scale datasets with millions of examples and features without being limited by the number of qubits in the target QPU. The number of qubits only determines the size of the ensemble that can be trained. It is also highly flexible, and can be used to solve classification problems across a wide range of domains, including finance, healthcare, and cybersecurity.
It consistently achieves high accuracies on classically challenging problems involving high-dimensional, noisy, and imbalanced datasets.
It is built for:
- Engineers and data scientists at companies seeking to enhance their tech offerings by integrating quantum machine learning into their products and services,
- Researchers at quantum research labs exploring quantum machine learning applications and looking to leverage quantum computing for classification tasks, and
- Students and teachers at educational institutions, in courses such as machine learning, who want to demonstrate the advantages of quantum computing.
The following example showcases its various functionalities, including create, list, fit, and predict, and demonstrates its usage in a synthetic problem comprising two interleaving half circles, a notoriously challenging problem due to its nonlinear decision boundary.
Function description
This Qiskit Function allows users to solve binary classification problems using Singularity's quantum-enhanced ensemble classifier. Behind the scenes, it uses a hybrid approach to classically train an ensemble of classifiers on the labeled dataset, and then optimize it for maximum diversity and generalization using the Quantum Approximate Optimization Algorithm (QAOA) on IBM® QPUs. Through a user-friendly interface, users can configure a classifier according to their requirements, train it on the dataset of their choice, and use it to make predictions on a previously unseen dataset.
To solve a generic classification problem:
- Preprocess the dataset, and split it into training and testing sets. Optionally, you can further split the training set into training and validation sets. This can be achieved using scikit-learn.
- If the training set is imbalanced, you can resample it to balance the classes using imbalanced-learn.
- Upload the training, validation, and test sets separately to the function's storage using the catalog's file_upload method, passing it the relevant path each time.
- Initialize the quantum classifier by using the function's create action, which accepts hyperparameters such as the number and types of learners, the regularization (lambda value), and optimization options including the number of layers, the type of classical optimizer, the quantum backend, and so on.
- Train the quantum classifier on the training set using the function's fit action, passing it the labeled training set, and the validation set if applicable.
- Make predictions on the previously unseen test set using the function's predict action.
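As a sketch of the preprocessing steps above, the following uses only scikit-learn: it creates a synthetic imbalanced dataset, splits it, and rebalances the training split by upsampling the minority class with sklearn.utils.resample (a stand-in for the imbalanced-learn resamplers mentioned above). The dataset and sizes are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Build a small, deliberately imbalanced dataset (roughly 90% / 10%)
X, y = make_classification(n_samples=500, n_features=10,
                           weights=[0.9, 0.1], random_state=0)

# Hold out 20% for testing; stratify so both splits keep the class ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Upsample the minority class in the training set only;
# the test set keeps its natural class distribution
minority = y_train == 1
n_majority = int((~minority).sum())
X_min_up, y_min_up = resample(X_train[minority], y_train[minority],
                              replace=True, n_samples=n_majority,
                              random_state=0)
X_train_bal = np.vstack([X_train[~minority], X_min_up])
y_train_bal = np.concatenate([y_train[~minority], y_min_up])
```

The balanced arrays can then be saved and uploaded to the function's storage, or passed to the fit action directly.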
Action-based approach
The function uses an action-based approach. You can think of it as a virtual environment where you use actions to perform tasks or change its state. Currently, it offers the following actions: list, create, delete, fit, predict, fit_predict, and create_fit_predict. The following example demonstrates the create_fit_predict action.
# Import QiskitFunctionsCatalog to load the
# "Singularity Machine Learning - Classification" function by Multiverse Computing
from qiskit_ibm_catalog import QiskitFunctionsCatalog
# Import the make_moons and the train_test_split functions from scikit-learn
# to create a synthetic dataset and split it into training and test datasets
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
# authentication
# If you have not previously saved your credentials, follow instructions at
# /docs/guides/functions
# to authenticate with your API key.
catalog = QiskitFunctionsCatalog(channel="ibm_quantum_platform")
# load "Singularity Machine Learning - Classification" function by Multiverse Computing
singularity = catalog.load("multiverse/singularity")
# generate the synthetic dataset
X, y = make_moons(n_samples=1000)
# split the data into training and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
job = singularity.run(
action="create_fit_predict",
num_learners=10,
regularization=0.01,
optimizer_options={"simulator": True},
X_train=X_train,
y_train=y_train,
X_test=X_test,
options={"save": False},
)
# get job status and result
status = job.status()
result = job.result()
print("Job status: ", status)
print("Action result status: ", result["status"])
print("Action result message: ", result["message"])
print("Predictions (first five results): ", result["data"]["predictions"][:5])
print(
"Probabilities (first five results): ",
result["data"]["probabilities"][:5],
)
print("Usage metadata: ", result["metadata"]["resource_usage"])
Job status: QUEUED
Action result status: ok
Action result message: Classifier created, fitted, and predicted.
Predictions (first five results): [1, 0, 0, 1, 0]
Probabilities (first five results): [[0.16849563539001172, 0.8315043646099888], [0.8726393386620336, 0.12736066133796647], [0.795344837290717, 0.20465516270928288], [0.36822585748882725, 0.6317741425111725], [0.6656662698604361, 0.3343337301395641]]
Usage metadata: {'RUNNING: MAPPING': {'CPU_TIME': 7.945035696029663}, 'RUNNING: WAITING_QPU': {'CPU_TIME': 82.41029238700867}, 'RUNNING: POST_PROCESSING': {'CPU_TIME': 77.3459484577179}, 'RUNNING: EXECUTING_QPU': {'QPU_TIME': 71.27004957199097}}
1. List
The list action retrieves all stored classifiers in *.pkl.tar format from the shared data directory. You can also access the contents of this directory by using the catalog.files() method. In general, the list action searches for files with the *.pkl.tar extension in the shared data directory and returns them in a list format.
Inputs
| Name | Type | Description | Required |
|---|---|---|---|
action | str | The name of the action from among create, list, fit, predict, fit_predict, create_fit_predict and delete. | Yes |
Usage
job = singularity.run(action="list")
2. Create
The create action creates a classifier of the specified quantum_classifier type by using the provided parameters, and saves it in the shared data directory.
The function currently supports only the QuantumEnhancedEnsembleClassifier.
Inputs
| Name | Type | Description | Required | Default |
|---|---|---|---|---|
action | str | The name of the action from among create, list, fit, predict, fit_predict, create_fit_predict and delete. | Yes | - |
name | str | The name of the quantum classifier, e.g., spam_classifier. | Yes | - |
instance | str | IBM instance. | Yes | - |
backend_name | str | IBM compute resource. Default is None, which means the backend with the fewest pending jobs will be used. | No | None |
quantum_classifier | str | The type of the quantum classifier, i.e., QuantumEnhancedEnsembleClassifier. | No | QuantumEnhancedEnsembleClassifier |
num_learners | integer | The number of learners in the ensemble. | No | 10 |
learners_types | list | Types of learners. Among supported types are: DecisionTreeClassifier, GaussianNB, KNeighborsClassifier, MLPClassifier, and LogisticRegression. Further details related to each can be found in the scikit-learn documentation. | No | [DecisionTreeClassifier] |
learners_proportions | list | Proportions of each learner type in the ensemble. | No | [1.0] |
learners_options | list | Options for each learner type in the ensemble. For a complete list of options corresponding to the chosen learner type/s, consult scikit-learn documentation. | No | [{"max_depth": 3, "splitter": "random", "class_weight": None}] |
regularization_type | str or list | Type(s) of regularization to use: onsite or alpha. onsite controls the onsite term, where higher values lead to sparser ensembles. alpha controls the trade-off between the interaction and onsite terms, where lower values lead to sparser ensembles. If a list is provided, a model is trained for each type and the best-performing one is selected. | No | onsite
regularization | str or float or list | Regularization value. Bounded between 0 and +inf if regularization_type is onsite; bounded between 0 and 1 if regularization_type is alpha. If set to auto, auto-regularization is used: the optimal regularization parameter is found by binary search, given the desired ratio of selected classifiers to total classifiers (regularization_desired_ratio) and the upper bound for the regularization parameter (regularization_upper_bound). If a list is provided, a model is trained for each value and the best-performing one is selected. | No | 0.01
regularization_desired_ratio | float or list | Desired ratio/s of selected classifiers to total classifiers for auto-regularization. If a list is provided, models will be trained for each ratio and the best performing one will be selected. | No | 0.75 |
regularization_upper_bound | float or list | Upper bound/s for the regularization parameter when using auto-regularization. If a list is provided, models will be trained for each upper bound and the best performing one will be selected. | No | 200 |
weight_update_method | str | Method for update of sample weights from among logarithmic and quadratic. | No | logarithmic |
sample_scaling | boolean | Whether sample scaling should be applied. | No | False |
prediction_scaling | float | Scaling factor for predictions. | No | None |
optimizer_options | dictionary | QAOA optimizer options. A list of available options is presented later in this documentation. | No | ... |
voting | str | Use majority voting (hard) or average of probabilities (soft) for aggregating learners' predictions/probabilities. | No | hard |
prob_threshold | float | Optimal probability threshold. | No | 0.5 |
random_state | integer | Control randomness for repeatability. | No | None |
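To make the parameters above concrete, the following sketches the keyword arguments for a create call that configures a heterogeneous ensemble. The parameter names come from the table; the learner types, their scikit-learn options, and all values are illustrative choices, not recommendations.

```python
# Illustrative keyword arguments for the "create" action.
create_kwargs = {
    "action": "create",
    "name": "moons_classifier",
    "num_learners": 20,
    "learners_types": ["DecisionTreeClassifier", "LogisticRegression"],
    "learners_proportions": [0.7, 0.3],  # 70% trees, 30% logistic regressions
    "learners_options": [
        {"max_depth": 3, "splitter": "random"},  # DecisionTreeClassifier options
        {"C": 1.0},                              # LogisticRegression options
    ],
    "regularization_type": "onsite",
    "regularization": 0.01,
    "voting": "soft",       # aggregate by averaging probabilities
    "random_state": 42,
}

# Sanity checks before submitting: proportions cover the ensemble, and
# each learner type has a matching options dictionary
assert abs(sum(create_kwargs["learners_proportions"]) - 1.0) < 1e-9
assert len(create_kwargs["learners_types"]) == len(create_kwargs["learners_options"])
```

These arguments would be passed as singularity.run(**create_kwargs), as in the Usage section below.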
- Additionally, the available optimizer_options are listed as follows:
| Name | Type | Description | Required | Default |
|---|---|---|---|---|
num_solutions | integer | The number of solutions | No | 1024 |
reps | integer | The number of repetitions | No | 4 |
sparsify | float | The sparsification threshold | No | 0.001 |
theta | float | The initial value of theta, a variational parameter of QAOA | No | None |
simulator | boolean | Whether to use a simulator or a QPU | No | False |
classical_optimizer | str | Name of the classical optimizer for the QAOA. All solvers offered by SciPy's optimization module are usable. You will need to set classical_optimizer_options accordingly | No | COBYLA
classical_optimizer_options | dictionary | Classical optimizer options. For a complete list of available options, consult SciPy documentation | No | {"maxiter": 60} |
optimization_level | integer | The depth of the QAOA circuit | No | 3 |
num_transpiler_runs | integer | Number of transpiler runs | No | 30 |
pass_manager_options | dictionary | Options for generating preset pass manager | No | {"approximation_degree": 1.0} |
estimator_options | dictionary | Estimator options. For a complete list of available options, consult Qiskit Runtime Client documentation | No | None |
sampler_options | dictionary | Sampler options. For a complete list of available options, consult the Qiskit Runtime Client documentation | No | None |
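One way to work with these options on the client side is to write the documented defaults as a plain dictionary and override only the keys you need. This is a convenience sketch: the keys and defaults come from the table above, and "Powell" is just an illustrative SciPy optimizer choice.

```python
# Documented defaults for optimizer_options (from the table above)
default_optimizer_options = {
    "num_solutions": 1024,
    "reps": 4,
    "sparsify": 0.001,
    "theta": None,
    "simulator": False,
    "classical_optimizer": "COBYLA",
    "classical_optimizer_options": {"maxiter": 60},
    "optimization_level": 3,
    "num_transpiler_runs": 30,
    "pass_manager_options": {"approximation_degree": 1.0},
}

# Override only what you need; the rest keeps the documented defaults
optimizer_options = {
    **default_optimizer_options,
    "simulator": True,
    "classical_optimizer": "Powell",
    "classical_optimizer_options": {"maxiter": 100},
}
```

The resulting dictionary is what you would pass as optimizer_options to the create action.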
- Default estimator_options are:
| Name | Type | Value |
|---|---|---|
default_shots | integer | 1024 |
resilience_level | integer | 2 |
twirling | dictionary | {"enable_gates": True} |
dynamical_decoupling | dictionary | {"enable": True} |
resilience_options | dictionary | {"zne_mitigation": False, "zne": {"amplifier": "pea", "noise_factors": [1.0, 1.3, 1.6], "extrapolator": ["linear", "polynomial_degree_2", "exponential"],}} |
- Default sampler_options are:
| Name | Type | Value |
|---|---|---|
default_shots | integer | 1024 |
resilience_level | integer | 1 |
twirling | dictionary | {"enable_gates": True} |
dynamical_decoupling | dictionary | {"enable": True} |
Usage
job = singularity.run(
action="create",
name="classifier_name", # specify your custom name for the classifier here
num_learners=10,
regularization=0.01,
optimizer_options={"simulator": True},
)
Validations
name:
- The name must be unique, a string up to 64 characters long.
- It can only include alphanumeric characters and underscores.
- It must start with a letter and cannot end with an underscore.
- No classifier with the same name should already exist in the shared data directory.
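These rules can be checked on the client side before submitting a job. The function performs its own validation; the regex-based helper below is just an illustration of the rules above.

```python
import re

# Starts with a letter, then only alphanumerics and underscores
NAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")

def is_valid_classifier_name(name: str) -> bool:
    """Mirror the documented name rules: at most 64 characters,
    alphanumerics and underscores only, starts with a letter,
    and no trailing underscore."""
    return (len(name) <= 64
            and not name.endswith("_")
            and NAME_RE.fullmatch(name) is not None)

print(is_valid_classifier_name("spam_classifier"))  # True
print(is_valid_classifier_name("1_bad_name"))       # False: starts with a digit
print(is_valid_classifier_name("bad_name_"))        # False: trailing underscore
```

Uniqueness cannot be checked locally; use the list action to see which names are already taken in the shared data directory.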
3. Delete
The delete action removes a classifier from the shared data directory.
Inputs
| Name | Type | Description | Required |
|---|---|---|---|
action | str | The name of the action. Must be delete. | Yes |
name | str | The name of the classifier to delete. | Yes |
Usage
job = singularity.run(
action="delete",
name="classifier_name", # specify the name of the classifier to delete here
)
Validations
name:
- The name must be unique, a string up to 64 characters long.
- It can only include alphanumeric characters and underscores.
- It must start with a letter and cannot end with an underscore.
- A classifier with the same name should already exist in the shared data directory.
4. Fit
The fit action trains a classifier using the provided training data.
Inputs
| Name | Type | Description | Required |
|---|---|---|---|
action | str | The name of the action. Must be fit. | Yes |
name | str | The name of the classifier to train. | Yes |
X | array or list or str | The training data. This can be a NumPy array, a list, or a string referencing a filename in the shared data directory. | Yes |
y | array or list or str | The training target values. This can be a NumPy array, a list, or a string referencing a filename in the shared data directory. | Yes |
fit_params | dictionary | Additional parameters to pass to the fit method of the classifier. | No |
fit_params
| Name | Type | Description | Required | Default |
|---|---|---|---|---|
validation_data | tuple | The validation data and labels. | No | None |
pos_label | integer or str | The class label to be mapped to 1. | No | None |
optimization_data | str | Dataset to optimize the ensemble on. Can be one of: train, validation, both. | No | train |
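An illustrative fit_params dictionary using the fields above might look as follows. The small validation arrays stand in for a validation split you prepared earlier; all values are example choices.

```python
# Hypothetical validation split (replace with your own data)
X_val = [[0.1, 0.2], [0.3, 0.4]]
y_val = [0, 1]

fit_params = {
    "validation_data": (X_val, y_val),  # tuple of (data, labels)
    "pos_label": 1,                     # class label mapped to 1
    "optimization_data": "both",        # optimize on train + validation
}
```

This dictionary is passed via the fit_params argument of the fit action, as shown in the Usage section.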
Usage
job = singularity.run(
action="fit",
name="classifier_name", # specify the name of the classifier to train here
X=X_train, # or "X_train.npy" if you uploaded it in the shared data directory
y=y_train, # or "y_train.npy" if you uploaded it in the shared data directory
fit_params={}, # define the fit parameters here
)
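When X and y are passed as filename strings (as the comments above suggest), the files must first exist in the shared data directory. The following sketches how to persist a split to .npy files with NumPy; the small arrays are stand-ins for your real training data, and the upload call is shown commented out because file_upload is the catalog method named in the steps earlier in this guide (check its exact signature in your qiskit-ibm-catalog version).

```python
import numpy as np

# Stand-in arrays; in practice these are your real training split
X_train = np.array([[0.0, 1.0], [1.0, 0.0]])
y_train = np.array([0, 1])

# Persist the split so the fit action can reference it by filename
np.save("X_train.npy", X_train)
np.save("y_train.npy", y_train)

# Upload each file to the function's shared data directory before calling fit:
# catalog.file_upload("X_train.npy")
# catalog.file_upload("y_train.npy")
```

After uploading, pass X="X_train.npy" and y="y_train.npy" to the fit action instead of the inline arrays.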
Validations
name:
- The name must be unique, a string up to 64 characters long.
- It can only include alphanumeric characters and underscores.
- It must start with a letter and cannot end with an underscore.
- A classifier with the same name should already exist in the shared data directory.