
Machine learning diagnosis by immunoglobulin N-glycan signatures for precision diagnosis of urological diseases.

Hiromichi Iwamura¹, Kei Mizuno², Shusuke Akamatsu², Shingo Hatakeyama¹,³, Yuki Tobisawa¹, Shintaro Narita⁴, Takuma Narita¹, Shinichi Yamashita⁵, Sadafumi Kawamura⁶, Toshihiko Sakurai⁷, Naoki Fujita¹, Hirotake Kodama¹, Daisuke Noro¹, Ikuko Kakizaki⁸, Shigeyuki Nakaji⁹, Ken Itoh⁸,¹⁰, Norihiko Tsuchiya⁷, Akihiro Ito⁵, Tomonori Habuchi⁴, Chikara Ohyama¹,³, Tohru Yoneyama⁸.

Abstract

Early diagnosis of urological diseases is often difficult due to the lack of specific biomarkers. More powerful and less invasive biomarkers that can be used simultaneously to identify urological diseases could improve patient outcomes. The aim of this study was to evaluate a urological disease-specific scoring system established with a machine learning (ML) approach using immunoglobulin (Ig) N-glycan signatures. Immunoglobulin N-glycan signatures were analyzed by capillary electrophoresis in 1312 serum samples from patients with hormone-sensitive prostate cancer (n = 234), castration-resistant prostate cancer (n = 94), renal cell carcinoma (n = 100), upper urinary tract urothelial cancer (n = 105), bladder cancer (n = 176), germ cell tumors (n = 73), benign prostatic hyperplasia (n = 95), urosepsis (n = 145), and urinary tract infection (n = 21) as well as healthy volunteers (n = 269). Immunoglobulin N-glycan signature data were used in a supervised-ML model to establish a scoring system that gave the probability of the presence of a urological disease. Diagnostic performance was evaluated using the area under the receiver operating characteristic curve (AUC). The supervised-ML urological disease-specific scores clearly discriminated the urological diseases (AUC 0.78-1.00), and a distinct N-glycan pattern was found that contributed to the detection of each disease. Limitations included the retrospective design and the limited pathological information regarding urological diseases. The supervised-ML urological disease-specific scoring system based on Ig N-glycan signatures showed excellent diagnostic ability for nine urological diseases using a one-time serum collection and could be a promising approach for the diagnosis of urological diseases.


Keywords: biomarker; glycosylation; immunoglobulin; machine learning; urologic disease


Year: 2022; PMID: 35524940; PMCID: PMC9277255; DOI: 10.1111/cas.15395

Source DB: PubMed; Journal: Cancer Sci; ISSN: 1347-9032; Impact factor: 6.518

Abbreviations: ADT, androgen deprivation therapy; AUC, area under the receiver operating characteristic curve; BCa, bladder cancer; BPH, benign prostatic hyperplasia; CRPC, castration‐resistant prostate cancer; GCT, germ cell tumor; HSPC, hormone‐sensitive prostate cancer; HV, healthy volunteer; miRNA, microRNA; NSGCT, nonseminoma GCT; PC, prostate cancer; PSA, prostate‐specific antigen; RCC, renal cell carcinoma; SGCT, seminoma GCT; US, urosepsis; UTI, urinary tract infection; UTUC, upper urinary tract urothelial cancer

INTRODUCTION

Early detection of urological diseases is challenging due to the lack of highly specific biomarkers. Screening of HSPC often leads to overdiagnosis and overtreatment due to the low specificity of PSA. Although early detection of RCC, BCa, and UTUC improves the prognosis, there are no specific biomarkers for discrimination of these diseases. Human chorionic gonadotropin, α‐fetoprotein, and lactate dehydrogenase are useful for detecting and monitoring GCT; however, not all GCT cases are marker positive. Urosepsis is the most common severe disease resulting from UTI and it requires accurate and timely diagnosis to evaluate severity. Therefore, more powerful and less invasive biomarkers that can be used simultaneously are needed to identify urological diseases and improve patient outcomes. Several techniques that use miRNAs and exosomes for early diagnosis of urological diseases have been reported. N‐glycosylation is also a promising target for detection. Previously, we focused on aberrant N‐glycosylation of Ig, one of the major serum proteins, found an aberrant N‐glycan signature of Ig using capillary electrophoresis‐based N‐glycomics, and suggested that it might be useful for diagnosing BCa and UTUC. Statistical analyses to extract disease‐specific N‐glycan signatures from vast amounts of N‐glycomics data on complex N‐glycan structures and their synthetic pathways are limited. Therefore, ML approaches could be an important tool for these analyses. We aimed to simultaneously detect nine urological diseases including five cancers (RCC, BCa, UTUC, PC, and GCT) and three benign diseases (BPH, US, and UTI) using a diagnostic modeling ML approach with Ig N‐glycan signature data.

MATERIALS AND METHODS

Participants

Serum samples were obtained from patients with HSPC (n = 234), castration‐resistant PC (CRPC, n = 94), RCC (n = 100), BCa (n = 176), UTUC (n = 105), GCT (n = 73), UTI (n = 21), UTI with US (n = 145), or BPH (n = 95). These patients were treated at Kyoto University Hospital, Akita University Hospital, Tohoku University Hospital, Yamagata University Hospital, Miyagi Cancer Center Hospital, and Hirosaki University Hospital between June 2007 and July 2022. Thirty‐seven patients were excluded because the presence or absence of disease could not be determined from medical records. Urinary tract infection included cystitis or pyelonephritis without sepsis. Urinary tract infection with US was defined as the presence of UTI and systemic inflammatory response syndrome. All BPH and HSPC cases were confirmed by prostate biopsy. For supervised‐ML model training purposes, each serum collection was treated separately, even if the patient had multiple serum collections. All serum samples were collected prior to treatment, except for those from HSPC patients who underwent ADT and from CRPC patients, which were collected during treatment. All samples were stored at −80°C until use. Subjects from community‐dwelling populations involved in the Iwaki Health Promotion Project were also recruited as HVs (n = 269).

N‐glycomics of Ig

N‐glycomics of Ig was undertaken as described previously. A flowchart is presented in Figure 1. Briefly, 100 μl serum was desalted with a Zeba Spin desalting resin plate (Thermo Fisher Scientific) and then 100 μl desalted serum was applied to a Melon Gel Spin resin plate (Thermo Fisher Scientific). After 5 min of incubation, the flow‐through was collected as the purified Ig fraction. Peptide N‐glycanase treatment, InstantQ fluorescent dye labeling of Ig N‐glycans, and a cleanup process were undertaken with an Agilent AdvanceBio Gly‐X and InstantQ kit (Agilent Technologies). The InstantQ‐labeled Ig N‐glycans were then separated with the capillary electrophoresis light emitting diode‐induced fluorescence N‐glycan analysis system (Gly‐Q; Agilent Technologies). The electropherogram for each sample was automatically analyzed with Gly‐Q Manager (hIgG processing method) to define the structures of the N‐glycans (Figure S1).

FIGURE 1

Schematic flow of N‐glycomics of Ig and relative peak area heatmap of 26 different Ig N‐glycans in each disease. (A) A total of 1312 serum samples were subjected to N‐glycomics of Ig. (B) N‐glycan signatures of Ig data. Relative peak area heatmap of 26 different Ig N‐glycans in each disease. Ig N‐glycan concentrations were clustered according to the distinct N‐glycan synthetic pathways and disease groups. BCa, bladder cancer; BPH, benign prostatic hyperplasia; CRPC, castration‐resistant prostate cancer; GCT, germ cell tumor; HSPC, hormone‐sensitive prostate cancer; HV, healthy volunteer; RCC, renal cell carcinoma; US, urosepsis; UTI, urinary tract infection; UTUC, upper urinary tract urothelial cancer

Supervised‐ML urological disease‐specific diagnostic modeling and statistical analysis

Model building was undertaken using DataRobot version 7.2 (DataRobot, Inc.). To create the urological disease‐specific diagnostic model, the target outcome of the supervised‐ML was set as disease classification data (HSPC, CRPC, RCC, BCa, UTUC, GCT, BPH, US, UTI, and HV). Prior to training, 20% of the Ig N‐glycan signature dataset (Figures 1 and 2) was randomly selected as a holdout dataset. The remaining 80% of the dataset was randomly divided into five mutually exclusive partitions, four of which were used for training and the last for validation (Figure 3A). Each algorithm was evaluated four additional times by selecting a different partition as the validation data. The AUC was used to evaluate the cross‐validation data (the average of each of the five possible validation partitions) and the TensorFlow Deep Learning Classifier algorithm with the highest AUC (0.9697) was selected as the diagnostic model (Figure 3A). The prediction results were output as probability scores for the presence of the nine urological diseases. The diagnostic performance of the urological disease‐specific scoring system, such as true and false positive/negative frequencies and AUC, was validated with the holdout dataset and the whole dataset (Figure 3B) using GraphPad Prism version 9.3.1 (GraphPad Software). The Kruskal–Wallis test was used to analyze differences among multiple groups.
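The partitioning scheme described above can be illustrated with a minimal Python sketch. It is not the DataRobot implementation: it assumes a plain (unstratified) random shuffle, and the function names `split_holdout_and_folds` and `cross_validation_rounds` are hypothetical. The partition sizes follow from simple rounding and need not match the counts reported in Figure 3.

```python
import random


def split_holdout_and_folds(n_samples, holdout_frac=0.2, n_folds=5, seed=0):
    """Shuffle sample indices, set aside a holdout set, and split the rest into folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_holdout = round(n_samples * holdout_frac)
    holdout = indices[:n_holdout]
    remaining = indices[n_holdout:]
    # Deal the remaining indices round-robin into mutually exclusive partitions.
    folds = [remaining[i::n_folds] for i in range(n_folds)]
    return holdout, folds


def cross_validation_rounds(folds):
    """Yield (training, validation) index lists, using each fold once for validation."""
    for k, validation in enumerate(folds):
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield training, validation


# 1312 samples as in the study; the split itself is illustrative only.
holdout, folds = split_holdout_and_folds(1312)
rounds = list(cross_validation_rounds(folds))
```

Averaging a metric such as the AUC over the five `rounds` gives the cross‐validation score, while `holdout` is never touched until the final evaluation.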

FIGURE 2

N‐glycan signature of Ig. (A) Twenty‐six different Ig N‐glycans were aligned according to the N‐glycan synthetic pathway. N‐glycan structures are indicated by monosaccharide symbols: yellow circles, galactose (Gal); green circles, mannose (Man); blue squares, N‐acetylglucosamine (GlcNAc); red triangles, fucose (Fuc); and magenta diamonds, N‐acetylneuraminic acid (Neu5Ac). BCa, bladder cancer; BPH, benign prostatic hyperplasia; CRPC, castration‐resistant prostate cancer; GCT, germ cell tumor; HSPC, hormone‐sensitive prostate cancer; HV, healthy volunteer; RCC, renal cell carcinoma; US, urosepsis; UTI, urinary tract infection; UTUC, upper urinary tract urothelial cancer

FIGURE 3

Supervised machine learning (ML) diagnostic modeling and evaluation of urological disease‐specific score. (A) ML‐supervised diagnostic modeling by DataRobot. Eighty percent of the dataset (n = 1049) was divided into five mutually exclusive partitions, four of which were used for training and the last for validation, for modeling of urological disease‐specific scores with the TensorFlow Deep Learning Classifier algorithm. (B) Validation of urological disease‐specific scores by true negative/positive frequencies and receiver operating characteristic curve (ROC) analysis using holdout dataset (20% of whole data, n = 262) and ROC analysis of urological disease‐specific scores using the whole dataset (n = 1312). AUC, area under the ROC curve

RESULTS

Immunoglobulin N‐glycan signature of each disease

The characteristics of the participants are summarized in Table 1. Figures 1 and 2 show the concentrations of 26 different Ig N‐glycans aligned according to the N‐glycan synthesis pathway for each disease group (Ig N‐glycan signature) and this dataset was used in DataRobot to create the urological disease‐specific diagnostic scoring system (Figure 3).

TABLE 1

Characteristics of patients for analysis of Ig N‐glycan signatures

Total	HSPC	CRPC	RCC	BCa	UTUC	GCT	BPH	US	UTI	HV	p Value
n = 1312	234	94	100	176	105	73	95	145	21	269	–
Age, years (IQR)	74 (67,78)	74 (64,78)	67 (59,77)	70 (62,75)	72 (63,76)	38 (25,45)	67 (61,71)	79 (69,87)	76 (63,90)	29 (23,65)	*
Gender n, m/f	234/0	94/0	64/36	147/29	69/36	73/0	95/0	60/85	12/9	173/96	*
Urine cytology class <IV/≥IV/NA	–	–	–	100/64/12	46/44/12	–	–	–	–	–	–
tPSA, ng/ml (median, IQR)	1.00 (0.04–6.21)	–	–	–	–	–	5.8 (4.74–7.18)	–	–	–	*
wADT/woADT, n	107/127	–	–	–	–	–	–	–	–	–	–
SGCT/NSGCT, n	–	–	–	–	–	36/37	–	–	–	–	–
Pathological T stage, n (%)
Ta, Tis	–	–	0 (0)	0 (0)	6 (6)	–	–	–	–	–	–
T1	–	–	66 (66)	108 (61)	21 (20)	–	–	–	–	–	–
T2	–	–	10 (10)	27 (15)	14 (13)	–	–	–	–	–	–
T3	–	–	17 (17)	30 (17)	44 (42)	–	–	–	–	–	–
T4	–	–	3 (3)	11 (6)	3 (3)	–	–	–	–	–	–
NA	–	–	4 (4)	0 (0)	16 (15)	–	–	–	–	–	–

Abbreviations: BCa, bladder cancer; BPH, benign prostatic hyperplasia; CRPC, castration resistant prostate cancer; f, female; GCT, germ cell tumor; HSPC, hormone sensitive prostate cancer; HV, healthy volunteer; IQR, interquartile range; m, male; NA, not available; NSGCT, nonseminoma GCT; RCC, renal cell carcinoma; SGCT, seminoma GCT; tPSA, total prostate‐specific antigen; US, urinary tract infection with sepsis; UTI, urinary tract infection; UTUC, upper urinary tract urothelial cancer; wADT, HSPC with androgen deprivation therapy; woADT, HSPC without androgen deprivation therapy.

*p < 0.0001.


True and false positive/negative frequencies of scores validated in the holdout dataset

True and false positive/negative frequencies of supervised‐ML disease‐specific scores validated in the holdout dataset are shown as a confusion matrix in Figure 4. The scores for RCC detection, BCa detection, and US with UTI detection had significantly higher true positive/negative frequencies (95.0%, 95.5%, and 100%, respectively) in the holdout dataset. Figure 5 shows the impact of specific N‐glycans for the detection of each disease‐specific score. The ML approach suggested that A2F(2,3) mainly contributed to the specific detection of RCC. G2FB mainly contributed to the specific detection of US with UTI. A combination of G4S2(2,3) and G0FB mainly contributed to the specific detection of BCa.

FIGURE 4

True and false positive/negative frequency in confusion matrix of supervised machine learning urological disease‐specific score evaluated in holdout dataset. The left column shows each disease‐specific scoring system and the upper row shows the predicted results. True positive/negative and false positive/negative rates for cases determined to have each disease using each disease‐specific scoring system are shown. The size of the green circle represents the true positive/negative frequency. The size of the magenta circle represents the false positive/negative frequency. BCa, bladder cancer; BPH, benign prostatic hyperplasia; CRPC, castration‐resistant prostate cancer; GCT, germ cell tumor; HSPC, hormone‐sensitive prostate cancer; HV, healthy volunteer; RCC, renal cell carcinoma; US, urosepsis; UTI, urinary tract infection; UTUC, upper urinary tract urothelial cancer

FIGURE 5

Impact of specific N‐glycans for detection of each disease by urological disease‐specific score. The upper graphs represent the impact of N‐glycan structures for the detection of each disease. Relative impact >0.5 is represented as a red bar. A dotted square in the lower Ig N‐glycan synthetic pathway shows the N‐glycan structure with relative impact >0.5 for each disease. N‐glycan structures are indicated by monosaccharide symbols: yellow circles, galactose (Gal); green circles, mannose (Man); blue squares, N‐acetylglucosamine (GlcNAc); red triangles, fucose (Fuc); and magenta diamonds, N‐acetylneuraminic acid (Neu5Ac). BCa, bladder cancer; BPH, benign prostatic hyperplasia; CRPC, castration‐resistant prostate cancer; GCT, germ cell tumor; HSPC, hormone‐sensitive prostate cancer; HV, healthy volunteer; RCC, renal cell carcinoma; US, urosepsis; UTI, urinary tract infection; UTUC, upper urinary tract urothelial cancer

Healthy volunteer scores had a higher true positive/negative frequency (75.4%), and 11.5% and 9.8% of the HV cases were predicted as UTUC and GCT, respectively. G1[6] and G0FB had a high impact on HV detection. Although the disease‐specific scores for HSPC detection, CRPC detection, and BPH detection had higher true positive/negative frequencies (77.1%, 66.7%, and 61.1%, respectively) and could be used to discriminate between PC and non‐PC diseases, 6.2% and 10.4% of HSPC cases were predicted as CRPC and BPH, respectively, 27.8% of CRPC cases were predicted as HSPC, and 22.2% of BPH cases were predicted as HSPC. The combination of G4S2(2,3), G1[6], and G0FB was important for HSPC detection, and a combination of A1FB, G1FB, G1[6], and G0FB was important for BPH detection. The N‐glycans A3(2,6), A2(2,6), and A1(2,6) contributed to CRPC detection. Disease‐specific scores for UTUC detection, GCT detection, and UTI detection had lower true positive/negative frequencies (38.5%, 57.1%, and 40.0%, respectively). A total of 46.2% of UTUC cases were predicted as HV. A total of 14.3% and 21.4% of GCT cases were predicted as HSPC and HV, respectively. Sixty percent of UTI cases were predicted as BCa. The number of high‐impact N‐glycans required for the detection of UTUC, GCT, and UTI was 10, 5, and 6 types, respectively. Among the N‐glycans that were required for the detection of UTUC and GCT, G1[6] and G0FB were also important N‐glycans for the detection of HV and HSPC. Among the N‐glycans required for detection of UTI, G4S2(2,3) was also an important N‐glycan for the detection of BCa.
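As a rough illustration of how such per‐disease frequencies can be read off a confusion matrix, the sketch below normalizes each row by the number of cases with that true diagnosis. This normalization is an assumption (it is consistent with the UTI figures above, where 40.0% correct plus 60% predicted as BCa sums to 100%), the function name `confusion_frequencies` is hypothetical, and the labels are toy values rather than study data.

```python
from collections import Counter


def confusion_frequencies(true_labels, predicted_labels):
    """Return {true_label: {predicted_label: percent of cases with that true label}}."""
    pair_counts = Counter(zip(true_labels, predicted_labels))
    class_totals = Counter(true_labels)
    table = {}
    for (true, predicted), count in pair_counts.items():
        table.setdefault(true, {})[predicted] = 100.0 * count / class_totals[true]
    return table


# Toy labels for illustration only (not taken from the holdout dataset).
y_true = ["RCC", "RCC", "RCC", "RCC", "UTI", "UTI", "UTI", "UTI", "UTI"]
y_pred = ["RCC", "RCC", "RCC", "HV", "UTI", "UTI", "BCa", "BCa", "BCa"]
freq = confusion_frequencies(y_true, y_pred)
```

In this toy example `freq["UTI"]["UTI"]` is the true positive frequency of the UTI score and `freq["UTI"]["BCa"]` is the share of UTI cases misclassified as BCa.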

Diagnostic accuracy of scores for the detection of each disease in the whole dataset

The diagnostic accuracy of scores for the detection of each disease in the whole dataset is shown in Figure 6 and Table S1. The AUC and specificity at 90% sensitivity of the RCC score versus each disease had high values (0.99 and 99%, respectively), and the RCC score could also detect RCC at any pathological stage (pT) (Figure 7). The AUC and specificity at 90% sensitivity of the BCa score versus each disease, except for UTI, had high values (0.99 and 98%, respectively), with slightly lower values versus UTI (0.88 and 64.8%, respectively). The BCa score could also indicate BCa at any pT of BCa or at any urine cytology status (Figure 7). The AUC and specificity at 90% sensitivity of UTUC scores versus each disease, except for HV, were greater than 0.93 and 77.1%, respectively, with slightly lower values versus HV (0.88 and 58.1%, respectively). The UTUC score also indicated UTUC at any pT or at any urine cytology status (Figure 7).

FIGURE 6

Diagnostic accuracy of supervised machine learning urological disease‐specific score for detection of each disease in whole data. (A) Violin plot of urological disease‐specific scores for detecting each disease in the whole dataset. The red line in the violin plots indicates the interquartile range (IQR) and median value. *p < 0.05, **p < 0.005, ***p < 0.001, ****p < 0.0001. ns, not significant. (B) Receiver operating characteristic (ROC) analysis of urological disease‐specific scores for detecting each disease. AUC, area under the ROC curve; BCa, bladder cancer; BPH, benign prostatic hyperplasia; CRPC, castration‐resistant prostate cancer; GCT, germ cell tumor; HSPC, hormone‐sensitive prostate cancer; HV, healthy volunteer; RCC, renal cell carcinoma; US, urosepsis; UTI, urinary tract infection; UTUC, upper urinary tract urothelial cancer

FIGURE 7

Each urological disease‐specific score classified as clinical or pathological parameter in the whole dataset. (A) Violin plot and receiver operating characteristic (ROC) analysis of renal cell carcinoma (RCC) score classified as a pathological stage in whole dataset. (B, C) Violin plots and ROC analyses of bladder cancer (BCa) score and upper urinary tract urothelial cancer (UTUC) score classified as a pathological stage or urine cytology class

The AUC and specificity at 90% sensitivity of the HSPC score versus each disease, except for prostate diseases, were greater than 0.93 and 83.3%, respectively, with slightly lower values versus BPH (0.85 and 55.1%, respectively) and versus CRPC (0.78 and 39.3%, respectively). The AUC and specificity at 90% sensitivity of the CRPC score versus each disease, except for HSPC, were greater than 0.97 and 92.6%, respectively, with slightly lower values versus HSPC (0.88 and 61.7%, respectively). The AUC of the HSPC score (0.85) was also superior to that of total PSA (0.73), and there was no strong correlation between total PSA and HSPC score (Figure 7). The AUC and specificity at 90% sensitivity of the GCT score versus each disease, except for HV, were greater than 0.93 and 78.1%, respectively, with slightly lower values versus HV (0.87 and 57.5%, respectively). The GCT score could also detect both seminoma (SGCT) and nonseminoma (NSGCT) (Figure 7). The AUC and specificity at 90% sensitivity of the BPH score versus each disease, except for HSPC, were greater than 0.95 and 84.2%, respectively, with slightly lower values versus HSPC (0.91 and 68.4%, respectively). The AUC and specificity at 90% sensitivity of the US score versus each disease were very high (1.00 and 100%, respectively). The AUC and specificity at 90% sensitivity of the UTI score versus each disease, except for BCa, were greater than 0.98 and 95.2%, respectively, with slightly lower values versus BCa (0.95 and 81.0%, respectively). The AUC and specificity at 90% sensitivity of the HV score versus each disease, except for GCT and UTUC, were greater than 0.96 and 90.0%, respectively, with slightly lower values versus GCT (0.82 and 48.7%, respectively) and versus UTUC (0.84 and 52.4%, respectively).
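The two metrics reported throughout this section can be sketched in a few lines of Python. This is an illustrative empirical calculation, not a reproduction of the GraphPad Prism procedure: the AUC is computed as the probability that a diseased case scores higher than a comparison case (ties counted as one half), and the cutoff is taken as the highest observed score that still reaches the target sensitivity. The function names and the toy scores are hypothetical.

```python
import math


def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def specificity_at_sensitivity(pos_scores, neg_scores, target_sensitivity=0.90):
    """Specificity at the highest cutoff whose sensitivity reaches the target.

    A case is called positive when its score is >= the cutoff.
    """
    ranked = sorted(pos_scores, reverse=True)
    # Number of positives that must be detected; the small epsilon guards
    # against floating-point noise in the product.
    needed = math.ceil(target_sensitivity * len(ranked) - 1e-9)
    cutoff = ranked[needed - 1]
    true_negatives = sum(1 for n in neg_scores if n < cutoff)
    return true_negatives / len(neg_scores)


# Toy disease-specific scores (illustrative only).
disease = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.2]
comparison = [0.1, 0.2, 0.3, 0.4, 0.6]
auc = roc_auc(disease, comparison)
spec90 = specificity_at_sensitivity(disease, comparison)
```

Each "score versus disease" comparison in the text corresponds to one such pairing of a target group with a single comparison group.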

Diagnostic accuracy of scores for the detection of each disease in the holdout dataset

The diagnostic accuracy of scores for the detection of each disease in the holdout dataset is shown in Figure S2 and Table S2. The AUC and specificity at 90% sensitivity of the RCC score versus each disease had high values (1.00 and 100%, respectively). The AUC and specificity at 90% sensitivity of the BCa score versus each disease, except for UTI, had high values (0.99 and 98%, respectively), with slightly lower values versus UTI (0.90 and 77.0%, respectively). The AUC and specificity at 90% sensitivity of UTUC scores versus each disease, except for HV, were greater than 0.90 and 61.5%, respectively, with slightly lower values versus HV (0.86 and 38.5%, respectively). The AUC and specificity at 90% sensitivity of the HSPC score versus each disease, except for prostate diseases, were greater than 0.92 and 72.9%, respectively, with slightly lower values versus BPH (0.87 and 68.7%, respectively) and versus CRPC (0.84 and 50.0%, respectively). The AUC and specificity at 90% sensitivity of the CRPC score versus each disease, except for HSPC, were greater than 0.96 and 72.2%, respectively, with slightly lower values versus HSPC (0.93 and 72.2%, respectively). The AUC and specificity at 90% sensitivity of the GCT score versus each disease, except for HV, were greater than 0.94 and 85.7%, respectively, with slightly lower values versus HV (0.84 and 51.0%, respectively). The AUC and specificity at 90% sensitivity of the BPH score versus each disease, except for HSPC, were greater than 0.98 and 94%, respectively, with slightly lower values versus HSPC (0.92 and 61.1%, respectively). The AUC and specificity at 90% sensitivity of the US score versus each disease were very high (1.00 and 100%, respectively). The AUC and specificity at 90% sensitivity of the UTI score versus each disease were greater than 0.97 and 100%, respectively. The AUC and specificity at 90% sensitivity of the HV score versus each disease, except for GCT and UTUC, were greater than 0.96 and 86.7%, respectively, with slightly lower values versus GCT (0.84 and 54.1%, respectively) and versus UTUC (0.77 and 42.6%, respectively).

DISCUSSION

Early detection of urological diseases is challenging due to the scarcity of highly specific biomarkers. Biomarkers that can precisely detect multiple urological diseases simultaneously in a single measurement would be of great benefit. Although several promising biomarkers based on miRNAs and exosomes have been reported for early detection of urological diseases, there is only one report on diagnostic Ig N‐glycan signatures of urological diseases. Glycomics is a new subspecialty in omics science and holds great promise as a source of next‐generation biomarkers for precision medicine. Although several researchers have reported aberrantly sialylated, agalactosylated, and fucosylated N‐glycans on Ig due to disease‐associated immunoreactions, there have been no studies that have examined changes in the entire N‐glycan synthesis pathway for Ig. Previously, we showed that discriminant analysis‐based diagnostic scoring systems using Ig N‐glycan signatures for detection of BCa and UTUC were superior to urine cytology. This suggests that a comprehensive analysis of the N‐glycan synthesis pathway of Ig might be promising and disease‐specific. Several N‐glycan signatures, such as sialylation, fucosylation, bisecting GlcNAcylation, and branching, are regulated by various glycosyltransferase activities, and their synthetic pathways could influence each other. Although discrimination of three or more diseases by discriminant analysis using N‐glycan signatures has been limited, an ML approach combined with omics data has been used for early detection of diseases, including cancer, and seems to be suitable for extraction of disease‐specific N‐glycan features and precise discrimination between benign and malignant conditions. 
Here, we showed excellent diagnostic performance of the supervised‐ML disease‐specific scoring system (Figures 4, 6, 7, and S2) in both holdout and whole datasets, and distinct N‐glycan patterns were found that contributed to detection of each disease (Figure 5). Although imaging is widely used for RCC detection, RCC is difficult to detect until the tumor grows to a detectable size, and 30% of cases are metastatic RCC at diagnosis. We found that an α2,3 sialyl biantennary core fucosyl N‐glycan [A2F(2,3)] on Ig contributes significantly to the specific detection of RCC, and the RCC score could even identify a small RCC, such as pT1a (Figure 7), as well as discriminate between RCC and UTUC. Thus, the RCC score will be a highly promising biomarker for early diagnosis of RCC and for differentiation between invasive renal pelvis cancer and RCC in the future. A combination of α2,3 sialyl tetraantennary N‐glycan [G4S2(2,3)] and agalactosyl bisecting GlcNAc core fucosyl N‐glycan (G0FB) allowed specific detection of BCa. G4S2(2,3) also had a significant impact on UTI detection, leading to false positive/negative results for BCa detection, suggesting the need to combine urine culture test results and other factors to discriminate between BCa and UTI. In addition, two N‐glycans, that is, monogalactosyl biantennary N‐glycan (G1[6]) and agalactosyl bisecting GlcNAc core fucosyl N‐glycan (G0FB), had a high impact on the detection of UTUC. They also had a high impact on the detection of HV and GCT, leading to false positive/negative results for UTUC detection. Imaging and urine cytology are useful for detecting BCa and UTUC, but patients often have invasive disease at diagnosis due to a lack of specific biomarkers for early detection. 
The BCa and UTUC scores showed distinct N‐glycan patterns that contributed to detection of BCa and UTUC, and both scores showed excellent diagnostic accuracy at any pathological stage or at any urine cytology status of both urothelial cancers (Figure 7). Thus, BCa and UTUC scores will be promising biomarkers for early detection and for discrimination between BCa and UTUC, suggesting a benefit for the selection of disease‐specific treatment. The same two N‐glycans that were useful for HV, UTUC, and GCT detection, that is, G1[6] and G0FB, were also useful for the detection of HSPC or BPH. However, a combination of G4S2(2,3), G1[6], and G0FB was important for HSPC detection, and a combination of A1FB, G1FB, G1[6], and G0FB was important for BPH detection. These results suggested that G1[6] and G0FB were highly important for the detection of several groups (HV, UTUC, GCT, HSPC, and BPH), and that more combinations of N‐glycans in addition to G1[6] and G0FB are needed to differentiate these groups. Meanwhile, sialyl triantennary and biantennary N‐glycan [A3(2,6), A2(2,6), and A1(2,6)] pathways strongly contributed to the detection of CRPC, suggesting that this sialyl branching N‐glycan pathway might be specific for the detection of castration resistance. Further basic study on the relationship between sialyl branching N‐glycans on Ig and the acquisition of castration resistance should clarify the mechanism. Although PSA is a well‐known gold standard biomarker in PC diagnosis and monitoring of disease progression, it often leads to overdiagnosis and overtreatment. Although further follow‐up studies are needed, the disease‐specific score developed in this study was shown to identify not only HSPC, BPH, and HV, but also CRPC with high accuracy; in particular, the HSPC score was markedly superior to the total PSA test, suggesting its potential as a biomarker to reduce overdiagnosis in PC in the future. 
For discrimination between mild UTI and severe urosepsis with UTI, the impact of the N‐glycan pattern was completely different, suggesting that severe sepsis caused by UTI can be clearly distinguished from mild UTI. In US with UTI‐specific detection, galactosyl bisecting GlcNAc core fucosyl N‐glycan (G2FB) on Ig was found to be the main contributor. Although procalcitonin and the platelet count are useful for evaluating the severity of US, more precise diagnostic biomarkers are required for evaluation of the severity of disease. The US score could be a promising biomarker for detection of severe US. Furthermore, although the GCT score showed some false positive results versus HV, the GCT score showed excellent diagnostic accuracy versus non‐GCT diseases and would be a promising biomarker for early detection of GCT. The GCT score could also detect both SGCT and NSGCT, suggesting that the GCT score will be a promising biomarker for marker‐negative NSGCT (Figure 7). These results suggested that the N‐glycan signature reflects the systemic immune status, and that urological diseases associated with inflammation, such as US associated with UTI, RCC, and BCa, are easily discriminated because of the significant changes in the N‐glycan signature, while urological diseases with low or mild inflammation are difficult to discriminate according to the N‐glycan signature. Menni et al. reported N‐glycan profiling of IgG involved in the humoral immune response to identify the risk of cardiovascular disease. Distinct N‐glycosylation profiles have been linked to diverse effector functions of IgG. Although we investigated the mixture of Ig (including IgG, IgM, and IgA), the overall results of this study suggest that Ig N‐glycosylation traits could identify disease risk by reflecting varying states of systemic inflammation and immune activation. Further basic studies on whether the N‐glycan signature is altered by disease onset should clarify the mechanism. 
The limitations of this study were its retrospective nature, the limited pathological information, and the fact that changes over time during the treatment course were not considered, which could lead to selection bias. The findings presented herein could enable the detection of nine urological diseases using a one‐time serum collection. Further external validation trials are needed to validate the urological disease‐specific scoring system in routine clinical practice.
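One way an external validation could be framed is to freeze a score cutoff on the development cohort and then report sensitivity and specificity on an independent cohort without re-tuning. The sketch below assumes this design; the cutoff of 0.5, the function name `sensitivity_specificity`, and the external cohort values are all hypothetical and are not taken from this study.

```python
def sensitivity_specificity(labels, scores, cutoff):
    """Sensitivity and specificity when a score >= cutoff is called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in pairs if y == 1 and s < cutoff)
    tn = sum(1 for y, s in pairs if y == 0 and s < cutoff)
    fp = sum(1 for y, s in pairs if y == 0 and s >= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cutoff, assumed to have been fixed on the development cohort.
CUTOFF = 0.5

# Hypothetical external cohort, invented for illustration (1 = disease, 0 = control).
external_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
external_scores = [0.92, 0.81, 0.47, 0.66, 0.58, 0.12, 0.55, 0.30, 0.08, 0.41]

sens, spec = sensitivity_specificity(external_labels, external_scores, CUTOFF)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Keeping the cutoff fixed matters because re-selecting it on the external cohort would make the reported performance optimistic.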

AUTHOR CONTRIBUTIONS

Study concept and design: H. Iwamura, T. Yoneyama, C. Ohyama. Acquisition of data: H. Iwamura, K. Mizuno, S. Akamatsu, S. Hatakeyama, S. Narita, T. Narita, S. Yamashita, S. Kawamura, T. Sakurai, N. Fujita, H. Kodama, D. Noro, I. Kakizaki, S. Nakaji, K. Itoh, N. Tsuchiya, A. Ito, T. Habuchi, C. Ohyama, T. Yoneyama. Analysis and interpretation of data: H. Iwamura, T. Yoneyama. Drafting of the manuscript: H. Iwamura, K. Mizuno, S. Akamatsu, S. Hatakeyama, S. Narita, T. Narita, S. Yamashita, S. Kawamura, T. Sakurai, N. Fujita, H. Kodama, D. Noro, I. Kakizaki, S. Nakaji, K. Itoh, N. Tsuchiya, A. Ito, T. Habuchi, C. Ohyama, T. Yoneyama. Critical revision of the manuscript for important intellectual content: T. Yoneyama, H. Iwamura, C. Ohyama. Statistical analysis: T. Yoneyama, H. Iwamura. Obtaining funding: H. Iwamura, K. Mizuno, S. Nakaji, C. Ohyama. Administrative, technical, or material support: K. Mizuno, S. Akamatsu, S. Hatakeyama, S. Narita, T. Narita, S. Yamashita, S. Kawamura, T. Sakurai, N. Fujita, H. Kodama, D. Noro, S. Nakaji, N. Tsuchiya, A. Ito, T. Habuchi, T. Yoneyama. Supervision: C. Ohyama, T. Yoneyama.

DISCLOSURE

The authors declare no conflict of interest.

ETHICAL STATEMENT

This study was approved by the ethics committee of each institute and the ethics committee of Hirosaki University Graduate School of Medicine (“Study of carbohydrate structure change in urological disease”; approval number: 2019–099, approval date: March 13, 2020; https://www.med.hirosaki‐u.ac.jp/hospital/outline/resarch/resarch.html). Written informed consent was obtained from all patients.

Supporting information: Figure S1, Figure S2, Table S1, Table S2.

34 in total

Review 1. Machine Learning Approaches for Predicting Radiation Therapy Outcomes: A Clinician's Perspective.

Authors: John Kang; Russell Schwartz; John Flickinger; Sushil Beriwal
Journal: Int J Radiat Oncol Biol Phys Date: 2015-11-11 Impact factor: 7.038

2. Serum N-glycan alteration associated with renal cell carcinoma detected by high throughput glycan analysis.

Authors: Shingo Hatakeyama; Maho Amano; Yuki Tobisawa; Tohru Yoneyama; Norihiko Tsuchiya; Tomonori Habuchi; Shin-Ichiro Nishimura; Chikara Ohyama
Journal: J Urol Date: 2013-10-16 Impact factor: 7.450

3. Sialylation of antibodies in kidney recipients with de novo donor specific antibody, with or without antibody mediated rejection.

Authors: Stéphanie Malard-Castagnet; Emilie Dugast; Nicolas Degauque; Annaïck Pallier; Jean Paul Soulillou; Anne Cesbron; Magali Giral; Jean Harb; Sophie Brouard
Journal: Hum Immunol Date: 2015-11-10 Impact factor: 2.850

Review 4. Diagnosis and activity assessment of immunoglobulin A nephropathy: current perspectives on noninvasive testing with aberrantly glycosylated immunoglobulin A-related biomarkers.

Authors: Yusuke Suzuki; Hitoshi Suzuki; Yuko Makita; Akiko Takahata; Keiko Takahashi; Masahiro Muto; Yohei Sasaki; Atikemu Kelimu; Keiichi Matsuzaki; Hiroyuki Yanagawa; Keiko Okazaki; Yasuhiko Tomino
Journal: Int J Nephrol Renovasc Dis Date: 2014-10-30

5. Glycosylation Profile of Immunoglobulin G Is Cross-Sectionally Associated With Cardiovascular Disease Risk Score and Subclinical Atherosclerosis in Two Independent Cohorts.

Authors: Cristina Menni; Ivan Gudelj; Erin Macdonald-Dunlop; Massimo Mangino; Jonas Zierer; Erim Bešić; Peter K Joshi; Irena Trbojević-Akmačić; Phil J Chowienczyk; Tim D Spector; James F Wilson; Gordan Lauc; Ana M Valdes
Journal: Circ Res Date: 2018-03-13 Impact factor: 17.367

Review 6. Extracellular vesicles as a source of prostate cancer biomarkers in liquid biopsies: a decade of research.

Authors: Manuel Ramirez-Garrastacho; Cristina Bajo-Santos; Jesus Martinez de la Fuente; Maria Moros; Carolina Soekmadji; Kristin Austlid Tasken; Aija Line; Elena S Martens-Uzunova; Alicia Llorente
Journal: Br J Cancer Date: 2021-11-22 Impact factor: 7.640

7. Machine learning diagnosis by immunoglobulin N-glycan signatures for precision diagnosis of urological diseases.

Authors: Hiromichi Iwamura; Kei Mizuno; Shusuke Akamatsu; Shingo Hatakeyama; Yuki Tobisawa; Shintaro Narita; Takuma Narita; Shinichi Yamashita; Sadafumi Kawamura; Toshihiko Sakurai; Naoki Fujita; Hirotake Kodama; Daisuke Noro; Ikuko Kakizaki; Shigeyuki Nakaji; Ken Itoh; Norihiko Tsuchiya; Akihiro Ito; Tomonori Habuchi; Chikara Ohyama; Tohru Yoneyama
Journal: Cancer Sci Date: 2022-05-25 Impact factor: 6.518