Elija Perrier1,2, Akram Youssry3,4, Chris Ferrie3.
Abstract
The availability of large-scale datasets on which to train, benchmark and test algorithms has been central to the rapid development of machine learning as a discipline. Despite considerable advancements, the field of quantum machine learning has thus far lacked a set of comprehensive large-scale datasets upon which to benchmark the development of algorithms for use in applied and theoretical quantum settings. In this paper, we introduce such a dataset, the QDataSet, a quantum dataset designed specifically to facilitate the training and development of quantum machine learning algorithms. The QDataSet comprises 52 high-quality publicly available datasets derived from simulations of one- and two-qubit systems evolving in the presence and/or absence of noise. The datasets are structured to provide a wealth of information to enable machine learning practitioners to use the QDataSet to solve problems in applied quantum computation, such as quantum control, quantum spectroscopy and tomography. Accompanying the datasets on the associated GitHub repository is a set of workbooks demonstrating the use of the QDataSet in a range of optimisation contexts.
Year: 2022 PMID: 36151086 PMCID: PMC9508231 DOI: 10.1038/s41597-022-01639-1
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
QDataSet File Description (Gaussian). The left column identifies each dataset in the respective QDataSet examples, while the description column describes the profile of the Gaussian pulse datasets in terms of (i) number of qubits, (ii) axis of control and pulse waveform, (iii) axis and type of noise and (iv) whether distortion is present or absent.
| Dataset | Description |
|---|---|
| G_1q_X | (i) Qubits: one; (ii) Control: |
| G_1q_X_D | (i) Qubits: one; (ii) Control: |
| G_1q_XY | (i) Qubits: one; (ii) Control: |
| G_1q_XY_D | (i) Qubits: one; (ii) Control: |
| G_1q_XY_XZ_N1N5 | (i) Qubits: one; (ii) Control: |
| G_1q_XY_XZ_N1N5_D | (i) Qubits: one; (ii) Control: |
| G_1q_XY_XZ_N1N6 | (i) Qubits: one; (ii) Control: |
| G_1q_XY_XZ_N1N6_D | (i) Qubits: one; (ii) Control: |
| G_1q_XY_XZ_N3N6 | (i) Qubits: one; (ii) Control: |
| G_1q_XY_XZ_N3N6_D | (i) Qubits: one; (ii) Control: |
| G_1q_X_Z_N1 | (i) Qubits: one; (ii) Control: |
| G_1q_X_Z_N1_D | (i) Qubits: one; (ii) Control: |
| G_1q_X_Z_N2 | (i) Qubits: one; (ii) Control: |
| G_1q_X_Z_N2_D | (i) Qubits: one; (ii) Control: |
| G_1q_X_Z_N3 | (i) Qubits: one; (ii) Control: |
| G_1q_X_Z_N3_D | (i) Qubits: one; (ii) Control: |
| G_1q_X_Z_N4 | (i) Qubits: one; (ii) Control: |
| G_1q_X_Z_N4_D | (i) Qubits: one; (ii) Control: |
| G_2q_IX-XI_IZ-ZI_N1-N6 | (i) Qubits: two; (ii) Control: |
| G_2q_IX-XI_IZ-ZI_N1-N6_D | (i) Qubits: two; (ii) Control: |
| G_2q_IX-XI-XX | (i) Qubits: two; (ii) Control: single |
| G_2q_IX-XI-XX_D | (i) Qubits: two; (ii) Control: single |
| G_2q_IX-XI-XX_IZ-ZI_N1-N5 | (i) Qubits: two; (ii) Control: single |
| G_2q_IX-XI-XX_IZ-ZI_N1-N5_D | (i) Qubits: two; (ii) Control: single |
QDataSet File Description (Square).
| Dataset | Description |
|---|---|
| S_1q_X | (i) Qubits: one; (ii) Control: |
| S_1q_X_D | (i) Qubits: one; (ii) Control: |
| S_1q_XY | (i) Qubits: one; (ii) Control: |
| S_1q_XY_D | (i) Qubits: one; (ii) Control: |
| S_1q_XY_XZ_N1N5 | (i) Qubits: one; (ii) Control: |
| S_1q_XY_XZ_N1N5_D | (i) Qubits: one; (ii) Control: |
| S_1q_XY_XZ_N1N6 | (i) Qubits: one; (ii) Control: |
| S_1q_XY_XZ_N1N6_D | (i) Qubits: one; (ii) Control: |
| S_1q_XY_XZ_N3N6 | (i) Qubits: one; (ii) Control: |
| S_1q_XY_XZ_N3N6_D | (i) Qubits: one; (ii) Control: |
| S_1q_X_Z_N1 | (i) Qubits: one; (ii) Control: |
| S_1q_X_Z_N1_D | (i) Qubits: one; (ii) Control: |
| S_1q_X_Z_N2 | (i) Qubits: one; (ii) Control: |
| S_1q_X_Z_N2_D | (i) Qubits: one; (ii) Control: |
| S_1q_X_Z_N3 | (i) Qubits: one; (ii) Control: |
| S_1q_X_Z_N3_D | (i) Qubits: one; (ii) Control: |
| S_1q_X_Z_N4 | (i) Qubits: one; (ii) Control: |
| S_1q_X_Z_N4_D | (i) Qubits: one; (ii) Control: |
| S_2q_IX-XI_IZ-ZI_N1-N6 | (i) Qubits: two; (ii) Control: |
| S_2q_IX-XI_IZ-ZI_N1-N6_D | (i) Qubits: two; (ii) Control: |
| S_2q_IX-XI-XX | (i) Qubits: two; (ii) Control: single |
| S_2q_IX-XI-XX_D | (i) Qubits: two; (ii) Control: single |
| S_2q_IX-XI-XX_IZ-ZI_N1-N5 | (i) Qubits: two; (ii) Control: |
| S_2q_IX-XI-XX_IZ-ZI_N1-N5_D | (i) Qubits: two; (ii) Control: |
| S_2q_IX-XI-XX_IZ-ZI_N1-N6 | (i) Qubits: two; (ii) Control: |
| S_2q_IX-XI-XX_IZ-ZI_N1-N6_D | (i) Qubits: two; (ii) Control: |
The left column identifies each dataset in the respective QDataSet examples, while the description column describes the profile of the square pulse datasets in terms of (i) number of qubits, (ii) axis of control and pulse waveform, (iii) axis and type of noise and (iv) whether distortion is present or absent.
The general categorization of the provided datasets.
| Category | Qubits | Drift | Control | Noise |
|---|---|---|---|---|
| 1 | 1 | ( | ( | ( |
| 2 | 1 | ( | ( | ( |
| 3 | 2 | ( | ( | ( |
| 4 | 2 | ( | ( | ( |
The QDataSet examples were generated from simulations of either one- or two-qubit systems. In the single-qubit case, the drift component of the Hamiltonian was along the z-axis; in the two-qubit case, it was along the z-axis of the first qubit but not the second, or vice versa. Controls were applied along different axes, such as the x- or y-axes. Noise was similarly added along different axes: the z-axis (and in some cases the x-axis) in the single-qubit case, and the z-axis of the first or second qubit in the two-qubit case.
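The drift, control and noise structure described here can be illustrated with a minimal NumPy sketch. It is not the authors' simulator: the factor of 1/2 in front of each term, the function names `single_qubit_hamiltonian`, `embed` and `two_qubit_drift`, and the choice of x-axis control with z-axis noise are assumptions made for illustration.

```python
import numpy as np

# Pauli matrices and the single-qubit identity.
SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)
IDENTITY = np.eye(2, dtype=complex)


def single_qubit_hamiltonian(omega, f_x, beta_z):
    """Hamiltonian at one time step: z drift, x control, z noise.

    Assumed form: H = 0.5 * (omega + beta_z) * sigma_z + 0.5 * f_x * sigma_x,
    where omega is the energy gap, f_x the control amplitude and beta_z
    the noise value at that time step.
    """
    return 0.5 * (omega + beta_z) * SIGMA_Z + 0.5 * f_x * SIGMA_X


def embed(op, qubit):
    """Place a single-qubit operator on qubit 0 or qubit 1 of a two-qubit system."""
    return np.kron(op, IDENTITY) if qubit == 0 else np.kron(IDENTITY, op)


def two_qubit_drift(omega, qubit=0):
    """Drift along the z-axis of one qubit only (hypothetical helper)."""
    return 0.5 * omega * embed(SIGMA_Z, qubit)


H = single_qubit_hamiltonian(omega=12.0, f_x=3.0, beta_z=0.1)
```

With the control and noise switched off, the two eigenvalues of `H` are separated by exactly the energy gap `omega`.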
Dataset Parameters: T: total time, set to unity for standardisation; M: the number of time-steps (discretisations); K: the number of noise realisations; Ω: the energy gap for the single-qubit case (subscripts 1 and 2 denote the energy gap of each qubit in the two-qubit case); n: number of control pulses; Amax, Amin: maximum and minimum amplitude; σ: standard deviation of pulse spacing (for Gaussian pulses).
| Parameter | Value |
|---|---|
| T | 1 |
| M | 1024 |
| K | 2000 |
| Ω | 12 |
| Ω1 | 12 |
| Ω2 | 10 |
| n | 5 |
| Amin | −100 |
| Amax | 100 |
| σ | T/(12 M) |
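To make the roles of T, M, n and the amplitude bounds concrete, the following sketch draws one random square pulse sequence on the discretised time grid. The placement of the pulses (each occupying the middle half of its own window of width T/n), the uniform amplitude distribution and the name `square_pulse_train` are assumptions for illustration and are not taken from the QDataSet generation code.

```python
import numpy as np

T = 1.0                        # total time
M = 1024                       # number of time steps
N_PULSES = 5                   # n, number of control pulses
A_MIN, A_MAX = -100.0, 100.0   # amplitude bounds


def square_pulse_train(rng, total_time=T, steps=M, n=N_PULSES,
                       a_min=A_MIN, a_max=A_MAX):
    """Return (time grid, waveform) for n square pulses with random amplitudes.

    Assumption: pulse k is non-zero on the middle half of the window
    [k * T/n, (k + 1) * T/n) and takes a single amplitude drawn uniformly
    from [a_min, a_max].
    """
    t = (np.arange(steps) + 0.5) * (total_time / steps)  # step midpoints
    amplitudes = rng.uniform(a_min, a_max, size=n)
    window = total_time / n
    k = np.minimum((t // window).astype(int), n - 1)     # window index per step
    offset = t - k * window
    inside = (offset >= 0.25 * window) & (offset < 0.75 * window)
    return t, np.where(inside, amplitudes[k], 0.0)


time_grid, waveform = square_pulse_train(np.random.default_rng(0))
```

The waveform has one value per time step, so it can be fed directly into a piecewise-constant simulation of the control Hamiltonian.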
Fig. 4 An example of a quantum state rotation on the Bloch sphere. The |0⟩, |1⟩ labels indicate the σz-axis, and X and Y the σx and σy axes respectively. In (a), the vector is residing in a +1 σx eigenstate. By rotating about the σz axis by π/4, the vector is rotated to the right, to the +1 σy eigenstate. A rotation about the σz axis by angle θ is equivalent to the application of a unitary generated by σz.
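A rotation of this kind can be checked numerically. The sketch below assumes the common convention R_z(θ) = exp(−iθσz/2), which may differ from the convention used in the figure; under this convention a quarter turn on the Bloch sphere corresponds to θ = π/2, whereas conventions that absorb the factor 1/2 into the angle quote half that value. The function name `rotation_z` is illustrative.

```python
import numpy as np

SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rotation_z(theta):
    """R_z(theta) = exp(-i * theta * sigma_z / 2), written in closed form."""
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * SIGMA_Z


plus_x = np.array([1, 1], dtype=complex) / np.sqrt(2)    # +1 eigenstate of sigma_x
plus_y = np.array([1, 1j], dtype=complex) / np.sqrt(2)   # +1 eigenstate of sigma_y

# Rotate the +x state a quarter turn about the z-axis.
rotated = rotation_z(np.pi / 2) @ plus_x

# Overlap with the +y state; a global phase does not affect this quantity.
overlap = abs(np.vdot(plus_y, rotated)) ** 2
```

Here `overlap` equals 1 up to floating-point error, confirming that the rotated state coincides with the +1 σy eigenstate up to a global phase.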
QDataSet characteristics.
| Item | Description |
|---|---|
| | Ω: the spectral energy gap; |
| | The control pulse sequence parameters for the example: Square pulses: Gaussian pulses: |
| | A sequence of time intervals |
| pulses | Time-domain waveform of the control pulse sequence. |
| distorted_pulses | Time-domain waveform of the distorted control pulse sequence (if there are no distortions, the waveform will be identical to the undistorted pulses). |
| | The Pauli expectation values 18 or 52 depending on whether one or two qubits (see above). For each state, the order of measurement is: |
| | The |
| | Time domain realisations of the relevant noise. |
| | The system Hamiltonian |
| | The noise Hamiltonian |
| | The system evolution matrix |
| | The interaction unitary |
| | Set of 3 × 2000 expectation values (measurements) of the three Pauli observables for all possible states for each noise realization. For each state, the order of measurement is: |
| | The expectation values (measurements) of the three Pauli observables for all possible states averaged over all noise realizations. For each state, the order of measurement is: |
The left column identifies each item in the respective QDataSet examples (expressed as keys in the relevant Python dictionary) while the description column describes each item.
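As an illustration of how a vector of 18 single-qubit expectation values can arise, the sketch below evaluates the three Pauli observables on six initial states after applying a given unitary. The choice of the six Pauli eigenstates as initial states, the ordering (states +x, −x, +y, −y, +z, −z; observables X, Y, Z for each state) and the function names are assumptions for illustration; the ordering used in the QDataSet itself should be taken from the dataset documentation.

```python
import numpy as np

SIGMA = {
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}


def pauli_eigenstates():
    """Six eigenvectors of X, Y and Z, the +1 eigenvector first for each axis."""
    states = []
    for op in SIGMA.values():
        _, vecs = np.linalg.eigh(op)        # eigenvalues sorted as [-1, +1]
        states.extend([vecs[:, 1], vecs[:, 0]])
    return states


def pauli_expectations(unitary):
    """Expectation of each Pauli observable in each evolved initial state."""
    values = []
    for psi in pauli_eigenstates():
        final = unitary @ psi
        for op in SIGMA.values():
            values.append(np.real(np.vdot(final, op @ final)))
    return np.array(values)


# With no evolution, each state has expectation +1 or -1 along its own axis.
expectations = pauli_expectations(np.eye(2, dtype=complex))
```

The returned array has 6 × 3 = 18 entries for one qubit, and for a pure state the three values belonging to one initial state form a unit-length Bloch vector.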
Fig. 1 Plot of an undistorted (orange) pulse sequence against a related distorted (blue) pulse sequence for the single-qubit Gaussian pulse dataset with x-axis control (‘G_1q_X’) over the course of the experimental runtime. Here f(t) is the functional (Gaussian) form of the pulse sequence for time-steps t. These plots were used in the first step of the verification process for QDataSet. The shift in pulse sequence is consistent with expected effects of distortion filters. The pulse sequences for each dataset can be found in simulation_parameters → dynamic_operators → pulses (undistorted) or distorted_pulses for the distorted case (see Table 1 for a description of the dataset characteristics).
Fig. 2 The frequency response (left) and the phase response (right) of the filter that is used to simulate distortions of the control pulses. The frequency is in units of Hz, and the phase response is in units of rad.
Fig. 3 Plot of the average observable (measurement) value for all observables (the index indicates each observable in order of Pauli measurements) over all measurement outcomes for samples drawn from dataset G_1q_X (using TensorFlow ‘tf’, orange line) against the same mean for equivalent simulations in Qutip (blue line, not visible because the two curves overlap). Each dataset was sampled and compared against Qutip, with equivalent results. The error between means was of order 10⁻⁶, i.e. they were effectively identical.
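A comparison of this kind reduces to averaging each observable over the sampled examples and taking the largest gap between the two simulators. The sketch below uses synthetic stand-in arrays rather than QDataSet or Qutip output; the array shape (examples × observables), the perturbation size and the name `max_mean_gap` are assumptions for illustration.

```python
import numpy as np


def max_mean_gap(observables_a, observables_b):
    """Largest absolute gap between per-observable means of two simulators.

    Both inputs have shape (num_examples, num_observables).
    """
    mean_a = np.mean(observables_a, axis=0)
    mean_b = np.mean(observables_b, axis=0)
    return float(np.max(np.abs(mean_a - mean_b)))


# Toy stand-in data: 100 examples of 18 expectation values in [-1, 1],
# with the second "simulator" perturbed at the 1e-7 level.
rng = np.random.default_rng(0)
reference = rng.uniform(-1.0, 1.0, size=(100, 18))
perturbed = reference + 1e-7 * rng.standard_normal(reference.shape)

gap = max_mean_gap(reference, perturbed)
```

A gap several orders of magnitude below the range of the observables indicates that the two sets of means are effectively identical.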
QDataSet features for quantum state tomography.
| Item | Description |
|---|---|
| Objective | Algorithm to learn characterisation of state |
| Inputs | Set of Pauli measurements { |
| Label | Final state |
| Intermediate inputs | Hamiltonians, Unitary operators, Initial states |
| Output | Estimate of final state |
| Metric | State fidelity |
The left column lists typical categories in a machine learning architecture. The right column describes the corresponding feature(s) of the QDataSet that would fall into such categories for the use of the QDataSet in training quantum tomography algorithms.
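The state fidelity metric named in the table can be sketched as follows, using the standard Uhlmann definition F(ρ, σ) = (Tr √(√ρ σ √ρ))² for density matrices. Whether the QDataSet workbooks use this squared form or its square root is not stated here, so the convention and the function names are assumptions.

```python
import numpy as np


def _sqrt_psd(matrix):
    """Square root of a Hermitian positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(matrix)
    vals = np.clip(vals, 0.0, None)          # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T


def state_fidelity(rho, sigma):
    """Uhlmann fidelity (squared convention) between two density matrices."""
    sqrt_rho = _sqrt_psd(rho)
    inner = sqrt_rho @ sigma @ sqrt_rho
    vals = np.clip(np.linalg.eigvalsh(inner), 0.0, None)
    return float(np.sum(np.sqrt(vals)) ** 2)


zero = np.array([[1, 0], [0, 0]], dtype=complex)    # |0><0|
one = np.array([[0, 0], [0, 1]], dtype=complex)     # |1><1|
mixed = np.eye(2, dtype=complex) / 2                # maximally mixed state

f_same = state_fidelity(zero, zero)
f_orthogonal = state_fidelity(zero, one)
f_mixed = state_fidelity(zero, mixed)
```

In a tomography setting, `rho` would be the labelled final state and `sigma` the model's estimate, with a fidelity close to 1 indicating a good reconstruction.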
QDataSet features for quantum noise spectroscopy.
| Item | Description |
|---|---|
| Objective | Algorithm to estimate noise operators |
| Inputs | Pulse sequence, reconstructed from the |
| Label | Set of measurements |
| Intermediate inputs | Hamiltonians, Unitary operators, Initial states |
| Output | Estimate of measurements |
| Metric | MSE (between estimates and label data) |
The left column lists typical categories in a machine learning architecture. The right column describes the corresponding feature(s) of the QDataSet that would fall into such categories for the use of the QDataSet in training quantum noise spectroscopy algorithms.
QDataSet features for quantum control.
| Item | Description |
|---|---|
| Objective | Algorithm to learn optimal sequence of controls to reach final state |
| Inputs | Hamiltonians containing Pauli generators |
| Label | Final state |
| Intermediate fixed inputs | Sequence of unitary operators |
| Intermediate weights | Sequence of pulses |
| Output | Estimate of final state |
| Metric | Average operator fidelity |
The left column lists typical categories in a machine learning architecture. The right column describes the corresponding feature(s) of the QDataSet that would fall into such categories for the use of the QDataSet in training quantum control algorithms. The specifications are just one of a set of possible ways of framing quantum control problems using machine learning.
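One way to realise the metric in the table is to average a gate fidelity over a batch of target and estimated unitaries. The sketch below uses F(U, V) = |Tr(U†V)|² / d², a common phase-insensitive choice; the exact definition of average operator fidelity used with the QDataSet may differ, and the function names are assumptions.

```python
import numpy as np


def operator_fidelity(target, estimate):
    """Gate fidelity |Tr(target^dagger @ estimate)|^2 / d^2 for d x d unitaries."""
    d = target.shape[0]
    return abs(np.trace(target.conj().T @ estimate)) ** 2 / d ** 2


def average_operator_fidelity(targets, estimates):
    """Mean gate fidelity over paired sequences of target and estimated unitaries."""
    return float(np.mean([operator_fidelity(t, e)
                          for t, e in zip(targets, estimates)]))


SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)

# One perfectly matched pair (up to a global phase) and one orthogonal pair.
targets = [SIGMA_X, SIGMA_X]
estimates = [np.exp(0.3j) * SIGMA_X, SIGMA_Z]
score = average_operator_fidelity(targets, estimates)
```

The first pair contributes 1 because a global phase does not change the fidelity, and the second contributes 0 because Tr(σx σz) vanishes, so `score` is 0.5.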
An example of the types of quantum data features which may be included in a dedicated large-scale dataset for QML.
| Item | Description |
|---|---|
| Quantum states | Description of states in computational basis, usually represented as vector or matrix (for |
| Measurement operators | Measurement operators used to generate measurements, description of POVM. |
| Measurement distribution | Distribution of measurement outcome of measurement operators, either the individual measurement outcomes or some average (the QDataSet is an average over noise realisations). |
| Hamiltonians | Description of Hamiltonians, which may include system, drift and environment Hamiltonians, among others. Hamiltonians should also include relevant control functions (if applicable). |
| Gates and operators | Descriptions of gate sequences (circuits) in terms of unitaries (or other operators). The representation of circuits will vary depending on the datasets and use case, but ideally quantum circuits should be represented in a way easily translatable across common quantum programming languages and integrable into common machine learning platforms (e.g. TensorFlow, PyTorch). |
| Noise | Description of noise, whether via measurement statistics, known features of the noise or device specifications. |
| Controls | Specification and description of the controls available to act on the quantum system. |
The choice of such features will depend on the particular objectives in question. We include a range of quantum data in the QDataSet, including information about quantum states, measurement operators and measurement statistics, Hamiltonians and their corresponding gates, details of environmental noise and controls.
| Measurement(s) | Simulations of one- and two-qubit quantum systems evolving in the presence and absence of noise and distortion |
| Technology Type(s) | Simulated measurement using Python packages |
| Sample Characteristic - Organism | Simulated quantum systems |
| Sample Characteristic - Environment | Quantum systems in noisy and noiseless environments |