Pei-Hua Wang, Jen-Hao Chen, Yufeng Jane Tseng.
Abstract
Pharmaceutical patent analysis is the key to product protection for pharmaceutical companies. In patent claims, a Markush structure is a standard chemical structure drawing with variable substituents. Overlaps between apparently dissimilar Markush structures are nearly unrecognizable when the structures span a broad chemical space. We propose a quantum search-based method that performs an exact comparison between two non-enumerated Markush structures using a constraint satisfaction oracle. The quantum circuit is verified with a quantum simulator, and the real effect of noise is estimated using a five-qubit superconducting IBM quantum computer. The probability of measuring the correct states can be increased by improving the connectivity of the most computation-intensive qubits. Depolarizing error is the most influential error. The quantum method for exactly comparing two patents is hard to simulate classically and thus creates a quantum advantage in patent analysis.
Year: 2022 PMID: 34997034 PMCID: PMC8742058 DOI: 10.1038/s41598-021-04031-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. (a) The Markush structure of the TAK-831 patent claim. The R group in the structure claim can be defined in a nested manner. In the TAK-831 case, the compounds defined by the Markush structure are infeasible to enumerate. (b) The code table used to convert SMILES symbols into binary quantum states. Two bits encode each symbol in SMILES notation. An additional code for an empty symbol is introduced to encode compounds with short SMILES notation. (c) The quantum circuit in the 2-data qubit experiment. In the defined case, the Grover iteration needs to be executed only once. (d) The two sets of R groups from the two patents to be compared. R1 and R2 denote the variable R groups of the Markush structure in the two patents. R1 has two variable compounds for the R group, and R2 has three. (e) The R groups from the two patents to be compared using two data qubits. In this simple case, we assume that the core structures of the two patents are the same, so only the variable substructures in the R groups need to be compared. (f) The unitary gates used in the 2-data qubit experiment. One of the patent oracles defines two targets, 11 and 00, with two MCX gates. The other patent oracle defines only one target, the quantum state 00. (g) The layout of ibmq_vigo. The circles indicate qubits along with their ID numbers. The arrows show the connectivity of the quantum computer. Qubit 1, in this case, has the largest connectivity among the qubits.
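The 2-data qubit case in panels (c) and (f) can be reproduced with a small state-vector calculation. The sketch below is not the authors' circuit: it leaves out the ancilla qubits and MCX gates, and it assumes that the combined constraint satisfaction oracle marks exactly the states that both patent oracles accept. The variable names are illustrative.

```python
import numpy as np

# Target sets taken from panel (f): one patent oracle marks 11 and 00,
# the other marks only 00.
patent_a = {"11", "00"}
patent_b = {"00"}

# Assumption: the combined oracle marks the states present in both sets.
marked = patent_a & patent_b

n_qubits = 2
dim = 2 ** n_qubits
labels = [format(i, f"0{n_qubits}b") for i in range(dim)]

# Uniform superposition, as produced by a Hadamard gate on every data qubit.
state = np.full(dim, 1 / np.sqrt(dim))

# Oracle: flip the sign of every marked basis state.
oracle = np.diag([-1.0 if s in marked else 1.0 for s in labels])

# Diffusion operator: reflection about the uniform superposition.
uniform = np.full((dim, 1), 1 / np.sqrt(dim))
diffusion = 2 * (uniform @ uniform.T) - np.eye(dim)

# A single Grover iteration.
state = diffusion @ (oracle @ state)

probabilities = dict(zip(labels, np.abs(state) ** 2))
print(probabilities)
```

With one marked state out of four, a single iteration concentrates all of the probability on the overlap state 00 in this noiseless model, which is consistent with the caption's remark that the Grover iteration is executed only once.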
The output vector states in the simulation with 8 data qubits. Quantum states and are the states of the targets.
| Quantum state | Final amplitude | Probability of being measured |
|---|---|---|
| | 0.706 | 0.498 |
| | 0.004 | < 0.001 |
| | 0 | 0 |
The sum of the measured probabilities of these two states is close to 1, showing that the proposed quantum circuit can obtain the correct results.
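The tabulated amplitudes can be checked against an idealized Grover search over 8 data qubits with two targets. The sketch below rests on assumptions that the text above does not state: the target indices are arbitrary placeholders, the iteration count is taken as the usual floor(π/4 · sqrt(N/M)) rule, and ancilla qubits are not modelled.

```python
import numpy as np

n_data_qubits = 8
dim = 2 ** n_data_qubits                 # 256 basis states
targets = [5, 200]                       # hypothetical target indices

# Assumption: standard iteration count for M targets among N states.
n_iterations = int(np.floor(np.pi / 4 * np.sqrt(dim / len(targets))))

# Start from the uniform superposition.
state = np.full(dim, 1 / np.sqrt(dim))

for _ in range(n_iterations):
    state[targets] *= -1                 # oracle: phase-flip the targets
    state = 2 * state.mean() - state     # diffusion: inversion about the mean

target_amplitude = abs(state[targets[0]])
target_probability = target_amplitude ** 2
other_amplitude = abs(state[0])          # index 0 is not a target here
total_target_probability = float(np.sum(state[targets] ** 2))

print(n_iterations, target_amplitude, target_probability, other_amplitude)
```

Under these assumptions the loop runs 8 times, each target ends with an amplitude near 0.706 and a probability near 0.498, and every non-target state keeps an amplitude near 0.004, in line with the table.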
The average correct counts and percentages on the real quantum computer ibmq_vigo with two different qubit mapping strategies.
| Qubit mapping | Correct count | Correct percentage |
|---|---|---|
| T-shaped mapping | 348/1024 ± 28 | 34.0% ± 2.7% |
| Linear mapping | 303/1024 ± 23 | 29.6% ± 2.2% |
The mapping strategy leveraging the advantage of the T-shaped connectivity has better performance than the linear mapping strategy.
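The percentages in the table follow directly from the counts out of 1024 shots. A minimal check, treating the ± value simply as the reported spread (the text does not say which statistic it is):

```python
shots = 1024

# (average correct count, reported spread) per mapping, from the table.
results = {
    "T-shaped mapping": (348, 28),
    "Linear mapping": (303, 23),
}

# Convert counts to percentages of the total number of shots.
percentages = {
    name: (round(100 * count / shots, 1), round(100 * spread / shots, 1))
    for name, (count, spread) in results.items()
}

for name, (pct, pct_spread) in percentages.items():
    print(f"{name}: {pct}% ± {pct_spread}%")
```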
Comparison of the correctness when one type of error parameter is doubled or halved in the simulation.
| Error parameter | Thermal error | Depolarizing error | Measurement error |
|---|---|---|---|
| 200% | 45.6% ± 1.0% | 46.9% ± 1.1% | 97.3% ± 0.6% |
| 100% | 74.7% ± 0.7% | 65.1% ± 1.2% | 98.7% ± 0.3% |
| 50% | 78.7% ± 1.5% | 80.5% ± 0.8% | 99.5% ± 0.2% |
Reducing the depolarizing error improves the correctness the most.
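The comparison can be made explicit by computing, for each error type, the gain in correctness when its parameter is halved relative to the 100% row. The snippet only restates the mean values from the table; the dictionary layout and names are illustrative.

```python
# Mean correctness (%) from the table, keyed by error-parameter scale (%).
correctness = {
    "thermal": {200: 45.6, 100: 74.7, 50: 78.7},
    "depolarizing": {200: 46.9, 100: 65.1, 50: 80.5},
    "measurement": {200: 97.3, 100: 98.7, 50: 99.5},
}

# Gain in percentage points when each error parameter is halved.
gain_when_halved = {
    name: round(values[50] - values[100], 1)
    for name, values in correctness.items()
}

most_beneficial = max(gain_when_halved, key=gain_when_halved.get)
print(gain_when_halved, most_beneficial)
```

Halving the depolarizing error gives a gain of 15.4 percentage points, compared with 4.0 for thermal error and 0.8 for measurement error.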