S Moradi, C Brandner, C Spielvogel, D Krajnc, S Hillmich, R Wille, W Drexler, L Papp.
Abstract
Quantum machine learning (QML) has made significant progress in both software and hardware development in recent years and has emerged as an application area for near-term quantum computers. In this work, we investigate the feasibility of utilizing QML on real clinical datasets. We propose two QML algorithms for data classification on IBM quantum hardware: a quantum distance classifier (qDC) and a simplified quantum-kernel support vector machine (sqKSVM). We apply both methods using the linear-time quantum data encoding technique ([Formula: see text]) to embed classical data into quantum states and to estimate the inner product on the 15-qubit IBMQ Melbourne quantum computer. We compare the predictive performance of our QML approaches with prior QML methods and with their classical counterpart algorithms on three open-access clinical datasets. Our results imply that the qDC outperforms kernel-based methods on datasets with small sample and feature counts. In contrast, quantum kernel approaches outperform the qDC on datasets with high sample and feature counts. We demonstrate that the [Formula: see text] encoding increases predictive performance by up to +2% area under the receiver operating characteristic curve across all quantum machine learning approaches, making it ideal for machine learning tasks executed on Noisy Intermediate-Scale Quantum computers.
Year: 2022 PMID: 35115630 PMCID: PMC8814029 DOI: 10.1038/s41598-022-05971-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Clinical datasets utilized for the study, with their sample counts, imbalance ratios, selected feature counts, and quantum advantage scores.
| Dataset | #Samples | Imbalance ratio | #Features | Quantum advantage score | Reference |
|---|---|---|---|---|---|
| Pediatric Bone Marrow Transplant 2-year survival | 134 | 0.33 | 8 | 0.40 | [ |
| | | | 16 | 0.60 | |
| Wisconsin Breast Cancer Malign-vs-benign | 569 | 0.37 | 8 | 1.30 | [ |
| | | | 16 | 3.50 | |
| Heart Failure Mortality | 300 | 0.5 | 8 | 0.42 | [ |
Given a two-class dataset, the imbalance ratio is the number of minority-class samples divided by the total number of samples. Furthermore, the quantum advantage score measures the similarity between the quantum kernel and the linear classical kernel functions of the same dataset.
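Reading the imbalance ratio as the minority-class count divided by the total sample count, it can be computed as in the minimal sketch below. The function name `imbalance_ratio` and the label list are hypothetical and are not taken from the study.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Fraction of samples that belong to the minority class (two-class data)."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    return min(counts.values()) / len(labels)


# Hypothetical labels: 3 minority-class samples out of 9 in total.
labels = [1, 0, 0, 1, 0, 0, 1, 0, 0]
print(round(imbalance_ratio(labels), 2))  # 0.33
```

A perfectly balanced dataset gives 0.5 under this definition, and smaller values indicate a rarer minority class.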
Comparison of the cross-validation AUC performance for different data encodings.
| Dataset | qDC | qKSVM | sqKSVM | sqKSVM* | qubits |
|---|---|---|---|---|---|
| Pediatric Bone Marrow Transplant 2YS | 0.62 | 0.63 | 0.62 | 0.61 | |
| | 0.61 | 0.63 | 0.61 | 0.59 | |
| Wisconsin Breast Cancer Malign-vs-benign | 0.92 | 0.92 | 0.88 | 0.87 | |
| | 0.90 | 0.91 | 0.87 | 0.85 | |
| Heart Failure Mortality | 0.62 | 0.51 | 0.51 | 0.50 | |
| | 0.60 | 0.51 | 0.51 | 0.50 | |
The qDC, qKSVM, and sqKSVM were run on the PennyLane simulator. In one encoding, features are encoded into qubits with sequences of Pauli-Y rotations ($R_y$) and CNOT gates. In another strategy, features are encoded into qubits with sequences of Hadamard gates and Pauli-Z rotations ($R_z$), followed by nearest-neighbor CNOT gates.
*The sqKSVM was also executed on the IBMQ Melbourne machine for reference comparison.
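To illustrate what encoding features through Pauli-Y rotations and a two-qubit entangling gate does to a state vector, here is a minimal two-qubit sketch in plain Python. It is not the circuit used in the study: the layout (one rotation per qubit followed by a single CNOT), the mapping of features to angles, and the function names are all assumptions made for illustration.

```python
import math


def ry_on_zero(theta):
    """Amplitudes of Ry(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>."""
    return [math.cos(theta / 2), math.sin(theta / 2)]


def encode_two_features(theta0, theta1):
    """Amplitudes after Ry on each qubit followed by CNOT(q0 -> q1).

    Basis order is |q0 q1>: index 0 = |00>, 1 = |01>, 2 = |10>, 3 = |11>.
    """
    q0 = ry_on_zero(theta0)
    q1 = ry_on_zero(theta1)
    # Tensor product of the two single-qubit states.
    state = [a * b for a in q0 for b in q1]
    # CNOT with q0 as control flips q1 when q0 = 1: swap |10> and |11>.
    state[2], state[3] = state[3], state[2]
    return state


# Hypothetical features, assumed to be already scaled to angles in radians.
state = encode_two_features(math.pi / 2, math.pi / 3)
print([round(a, 3) for a in state])
```

The resulting amplitudes stay normalized because rotations and CNOT are unitary, which is what allows inner products between encoded samples to be read as overlaps.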
Comparison of the cross-validation AUC performance with QML and ML algorithms.
| Dataset | #Features | sqKSVM | qKSVM | qDC | cSVM | ckNN |
|---|---|---|---|---|---|---|
| Pediatric Bone Marrow Transplant 2YS | 8 | 0.61 | 0.63 | 0.60 | 0.64 | 0.61 |
| | 16 | 0.66 | 0.69 | 0.64 | 0.71 | 0.64 |
| Wisconsin Breast Cancer Malign-vs-benign | 8 | 0.87 | 0.92 | 0.91 | 0.89 | 0.90 |
| | 16 | 0.88 | 0.93 | 0.90 | 0.89 | 0.93 |
| Heart Failure Mortality* | 8 | 0.50 | 0.51 | 0.60 | 0.53 | 0.58 |
For all QML algorithms, features are encoded into qubits with sequences of Pauli-Y rotations ($R_y$) and CNOT gates. All QML algorithms were executed on the IBMQ Melbourne machine.
*Heart failure has no 16-feature variant, since the dataset has at most 13 features.
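The tables report performance as the area under the ROC curve (AUC). As a reminder of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are hypothetical, and how the study aggregates AUC across cross-validation folds is not stated here, so no fold averaging is shown.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for four samples.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

An AUC of 0.5 corresponds to random ranking, which is useful context for the Heart Failure values near 0.50 to 0.51.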
Figure 1. Scatter diagrams of simulator inner products vs. experimental inner products for both the train and test state vectors. The data correspond to the Wisconsin Breast Cancer dataset with 8 (left) and 16 (right) features. The red lines represent best-fit lines from least-squares regression.
Figure 2. Quantum circuit that computes the real part of the inner product between the training and test state vectors. The Hadamard gate puts the ancilla qubit into a uniform superposition. A single-controlled unitary gate entangles the excited state of the ancilla qubit with the training data state vector. An X gate flips the ancilla qubit. Another single-controlled unitary gate entangles the state vector of the test data with the excited state of the ancilla qubit. A second X gate flips the ancilla qubit. The Hadamard gate on the ancilla qubit then interferes the train and test data state vectors. The ancilla qubit is measured using a Pauli gate, and the real part of the inner product is estimated from Eq. (9).
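The interference step in this circuit can be checked numerically. The sketch below is a classical simulation in plain Python, not the hardware procedure: it assumes the state just before the final Hadamard is (|0>|test> + |1>|train>)/sqrt(2) and that the estimate is the ancilla's Z expectation value P(0) - P(1). Eq. (9) is not reproduced here, so that relation and all function names are assumptions.

```python
import math


def normalize(vec):
    """Scale a vector of (possibly complex) amplitudes to unit norm."""
    norm = math.sqrt(sum(abs(v) ** 2 for v in vec))
    return [v / norm for v in vec]


def real_inner_product_via_ancilla(train, test):
    """Simulate the final Hadamard and return P(0) - P(1) on the ancilla."""
    a = normalize(train)
    b = normalize(test)
    # Assumed state before the last Hadamard: (|0>|test> + |1>|train>) / sqrt(2).
    branch0 = [v / math.sqrt(2) for v in b]
    branch1 = [v / math.sqrt(2) for v in a]
    # Hadamard on the ancilla mixes the two branches.
    out0 = [(u + v) / math.sqrt(2) for u, v in zip(branch0, branch1)]
    out1 = [(u - v) / math.sqrt(2) for u, v in zip(branch0, branch1)]
    p0 = sum(abs(v) ** 2 for v in out0)
    p1 = sum(abs(v) ** 2 for v in out1)
    return p0 - p1


# Hypothetical data vectors; the direct value is Re<train|test> = 1/sqrt(2).
estimate = real_inner_product_via_ancilla([1, 0], [1, 1])
print(round(estimate, 4))  # 0.7071
```

On hardware, P(0) and P(1) would be estimated from repeated measurements, so the result carries sampling and gate noise that this exact simulation omits.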
Figure 3. Quantum circuit to compute the overlap between the train and test states. The model circuits encode the train and test data into quantum states. The Hadamard gate on the ancilla qubit generates a superposition of the quantum state that includes the train and test data. Applying the single-controlled swap gates with the ancilla qubit as the control yields the entangled state of Eq. (10). Another Hadamard gate on the ancilla qubit interferes the train and test states. The ancilla qubit is then measured in the Z basis, and the overlap can be obtained from Eq. (12).
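A controlled-swap circuit of this kind can also be simulated classically. The sketch below assumes the textbook swap-test relation P(0) = (1 + |<a|b>|^2) / 2 for the ancilla outcome. Eqs. (10) and (12) are not reproduced here, so whether they take exactly this form is an assumption, as are the function names.

```python
import math


def normalize(vec):
    """Scale a vector of (possibly complex) amplitudes to unit norm."""
    norm = math.sqrt(sum(abs(v) ** 2 for v in vec))
    return [v / norm for v in vec]


def swap_test_p0(a, b):
    """Probability of measuring the ancilla in |0> after H, controlled-swap, H."""
    a = normalize(a)
    b = normalize(b)
    if len(a) != len(b):
        raise ValueError("both states must have the same dimension")
    p0 = 0.0
    for i in range(len(a)):
        for j in range(len(b)):
            # Ancilla-0 amplitude on basis state |i>|j>: (|a>|b> + |b>|a>) / 2.
            amp = (a[i] * b[j] + b[i] * a[j]) / 2
            p0 += abs(amp) ** 2
    return p0


def squared_overlap(a, b):
    """Invert P(0) = (1 + |<a|b>|^2) / 2 to recover the squared overlap."""
    return 2 * swap_test_p0(a, b) - 1


# Hypothetical data vectors with squared overlap 1/2.
print(round(squared_overlap([1, 0], [1, 1]), 4))  # 0.5
```

Orthogonal states give P(0) = 0.5 and identical states give P(0) = 1, so the measurable range of P(0) is only half of the unit interval, which makes the estimate sensitive to readout noise.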
Figure 4. Schematic of the sqKSVM algorithm for data classification. First, the training and test data vectors are prepared on a classical computer. Next, the original training and test data are encoded into quantum states, and the kernel matrix of all training-test pairs is computed with a NISQ computer. Given a solution of the support vector problem, the binary classifier can be constructed based on Eq. (5).
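The final classification step can be pictured with a generic kernel decision rule of the support-vector form, sign(sum_i alpha_i * y_i * K(x_i, x) + b). The sketch below is not the sqKSVM itself: Eq. (5) is not reproduced here, the kernel is a classical squared-overlap stand-in for entries that would be estimated on a NISQ device, and the coefficients `alphas`, the `bias`, and the training points are hypothetical values supplied by hand.

```python
import math


def normalize(vec):
    """Scale a real-valued vector to unit norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def overlap_kernel(x, z):
    """Classical stand-in for a kernel entry: squared overlap of unit vectors."""
    x = normalize(x)
    z = normalize(z)
    return sum(a * b for a, b in zip(x, z)) ** 2


def predict(train_x, train_y, alphas, bias, test_point):
    """Return +1 or -1 from a kernel-weighted vote over the training samples."""
    score = bias
    for x, y, alpha in zip(train_x, train_y, alphas):
        score += alpha * y * overlap_kernel(x, test_point)
    return 1 if score >= 0 else -1


# Hypothetical training set with labels in {+1, -1} and hand-picked coefficients.
train_x = [[1.0, 0.0], [0.0, 1.0]]
train_y = [1, -1]
alphas = [1.0, 1.0]
print(predict(train_x, train_y, alphas, 0.0, [0.9, 0.1]))  # 1
print(predict(train_x, train_y, alphas, 0.0, [0.2, 0.8]))  # -1
```

Only the kernel entries need to come from quantum hardware in such a scheme; the weighted sum and the sign are cheap classical post-processing.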
Figure 5. Topology and coupling map of the IBMQ Melbourne (https://quantum-computing.ibm.com/services). The single-qubit error rate is the error induced by applying single-qubit gates. The CNOT error is the error of the device's only two-qubit gate. Each circle represents a physical superconducting qubit, and each connection shows the coupling between neighboring qubits.