Chuankai An, A James O'Malley, Daniel N Rockmore.
Abstract
In this paper, we analyze the millions of referral paths of patients' interactions with the healthcare system for each year in the 2006-2011 time period and relate them to U.S. cardiovascular treatment records. A patient's "referral path" records the chronological sequence of physicians the patient encounters (subject to certain constraints on the times between encounters). It provides a basic unit of analysis in a broader referral network that encodes the flow of patients and information between physicians in a healthcare system. We consider referral networks defined over a range of interactions as well as the characteristics of referral paths, producing a characterization of the various networks as well as the physicians they comprise. We further relate these metrics and findings to outcomes in the specific area of cardiovascular care. In particular, we match a referral path to occurrences of Acute Myocardial Infarction (AMI) and use the summary measures of the referral path to predict the treatment a patient receives and the medical outcomes following treatment. Some referral path features contribute more than others to the performance of a tree-based predictive model and correlate more strongly with numerical treatment outcome variables. The patterns of referral paths and the derived informative features illustrate the potential for using network science to optimize patient referrals in healthcare systems for improved treatment outcomes and more efficient utilization of medical resources.
Keywords: Big data; Health record analysis; Network science; Predictive modeling; Social network analysis
Year: 2018 PMID: 30839747 PMCID: PMC6214314 DOI: 10.1007/s41109-018-0081-4
Source DB: PubMed Journal: Appl Netw Sci ISSN: 2364-8228
Fig. 1 Bipartite graph between patients {α,β} and physicians {A,B,C,D}. (L) An edge between a patient and a physician means the patient visited the physician. (R) A referral path of Patient α in chronological order
Example pipeline of data processing from raw patient-physician encounter records to referral paths and edges of referral network
| (a) Raw visiting records | ||
| Patient | Physician | date;HRR;HRRcity;state;zipcode;workRVU;specialty;PHN;teaching type;etc. |
| | A | 2011-01-01;1010;Hanover;NH;03755;1.0;family practice;First hospital;0;etc. |
| | B | 2011-01-10;1020;Boston;MA;02101;3.0;internal medicine;Second hospital;1;etc. |
| | C | 2011-02-01;1050;New York;NY;10021;4.0;cardiology;Third hospital;1;etc. |
| | B | 2011-03-01;1012;Lebanon;NH;03784;2.0;family practice;Fourth hospital;0;etc. |
| | C | 2011-03-20;1022;Newton;MA;02461;5.0;vascular surgery;Fifth hospital;1;etc. |
| (b) Referral path | ||
| Patient | Node(date;#visiting records;RVU), divided by "->" | |
| | A(2011-01-01;1.0;1.0)->B(2011-01-10;1.0;3.0)->C(2011-02-01;1.0;4.0) | |
| | B(2011-03-01;1.0;2.0)->C(2011-03-20;1.0;5.0) | |
| (c) Edges in the national referral network with the weights over all referral paths | ||
| Directed edge | Weights of an edge | |
| A->B | 3; 4; 4.82; 12.14; 23.42 | |
| B->C | 5; 5; 5.12; 12.32; 18.22 | |
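The pipeline in the table above, from raw encounter records to referral paths to directed edges, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the patient identifiers `p1` and `p2`, the assignment of the five example rows to two patients, and the `max_gap_days` threshold are all assumptions, and the sketch only counts referrals rather than reproducing the five edge weights listed in part (c).

```python
from collections import Counter, defaultdict
from datetime import date


def build_paths(records, max_gap_days=30):
    """Group visits by patient, order them by date, and start a new path
    whenever the gap between visits exceeds max_gap_days (assumed value)."""
    by_patient = defaultdict(list)
    for patient, physician, day in records:
        by_patient[patient].append((day, physician))
    paths = []
    for patient, visits in by_patient.items():
        visits.sort()
        path, last_day = [], None
        for day, physician in visits:
            if last_day is not None and (day - last_day).days > max_gap_days:
                paths.append((patient, path))
                path = []
            # Consecutive visits to the same physician stay on one node.
            if not path or path[-1] != physician:
                path.append(physician)
            last_day = day
        paths.append((patient, path))
    return paths


def edge_counts(paths):
    """Count how often each directed physician pair appears on any path."""
    counts = Counter()
    for _, path in paths:
        for src, dst in zip(path, path[1:]):
            counts[(src, dst)] += 1
    return counts


# Placeholder patients p1/p2; physicians and dates follow the example table.
records = [
    ("p1", "A", date(2011, 1, 1)),
    ("p1", "B", date(2011, 1, 10)),
    ("p1", "C", date(2011, 2, 1)),
    ("p2", "B", date(2011, 3, 1)),
    ("p2", "C", date(2011, 3, 20)),
]
paths = build_paths(records)
edges = edge_counts(paths)
```

Under these assumptions the two paths are A->B->C and B->C, so the edge B->C is counted twice and A->B once.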
Fig. 2 An example referral path with three physicians A,B,C. The patient visits them five times. Physicians A and C are from the same HRR/hospital in blue, while physician B is from another HRR/hospital in red
Some national referral network measures in 2006-2011
| Year | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 |
|---|---|---|---|---|---|---|
| # nodes | 272353 | 296008 | 313051 | 323042 | 334452 | 347586 |
| # edges | 5708791 | 5948185 | 6313136 | 6544847 | 6785594 | 7047586 |
| Exponent of indegree power law | 3.08 | 2.80 | 1.55 | 2.76 | 1.54 | 2.74 |
| | 0.97 | 0.89 | 0.21 | 0.85 | 0.22 | 0.82 |
| Exponent of outdegree power law | 3.01 | 2.69 | 2.71 | 2.66 | 2.56 | 2.68 |
| | 0.9 | 0.94 | 0.93 | 0.96 | 0.91 | 0.93 |
| Size of the largest connected component | 271898 | 295405 | 312412 | 322452 | 333727 | 346711 |
| (in, in) degree assortativity | -0.094 | -0.088 | -0.084 | -0.085 | -0.083 | -0.084 |
| Self in/out degree correlation | 0.983 | 0.982 | 0.983 | 0.983 | 0.983 | 0.984 |
| Reciprocity of #referral | 0.878 | 0.890 | 0.896 | 0.901 | 0.902 | 0.896 |
Overall statistics of all referral paths in 2006-2011
| Year | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 |
|---|---|---|---|---|---|---|
| | 4.44M | 4.45M | 4.54M | 4.59M | 4.63M | 4.66M |
| Avg length | 3.850 | 3.907 | 3.983 | 4.023 | 4.061 | 4.115 |
| Avg gap for a referral | 8.509 | 8.506 | 8.369 | 8.352 | 8.230 | 8.060 |
| Avg time range | 24.247 | 24.727 | 24.969 | 25.245 | 25.192 | 25.109 |
| Percent of paths with recurrent nodes | 33.418 | 32.879 | 32.836 | 32.784 | 32.573 | 32.301 |
| Avg #nodes before recurrence | 4.087 | 4.130 | 4.179 | 4.196 | 4.223 | 4.271 |
| Avg physician entropy | 1.400 | 1.410 | 1.423 | 1.427 | 1.436 | 1.448 |
| Avg hospital entropy | 0.475 | 0.473 | 0.476 | 0.459 | 0.480 | 0.481 |
| Avg HRR entropy | 0.107 | 0.109 | 0.108 | 0.105 | 0.112 | 0.116 |
| Avg bidirectional pairs in a path | 0.450 | 0.455 | 0.465 | 0.474 | 0.476 | 0.479 |
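Several of the per-path statistics in the table above can be illustrated on a single path. The sketch below is an assumption-laden reading rather than the paper's definition: it uses the natural logarithm for the entropy, treats "#nodes before recurrence" as the number of nodes seen before the first repeated physician, and uses a made-up five-visit path over physicians A, B and C.

```python
import math
from collections import Counter


def physician_entropy(path):
    """Shannon entropy (natural log, an assumption) of physician frequencies."""
    counts = Counter(path)
    n = len(path)
    return -sum(c / n * math.log(c / n) for c in counts.values())


def nodes_before_recurrence(path):
    """Number of nodes visited before a physician first reappears,
    or None if no physician recurs (assumed definition)."""
    seen = set()
    for index, node in enumerate(path):
        if node in seen:
            return index
        seen.add(node)
    return None


# Hypothetical path: three physicians, five visits.
example_path = ["A", "B", "A", "C", "B"]
entropy = physician_entropy(example_path)
before = nodes_before_recurrence(example_path)
has_recurrence = before is not None
```

The same entropy function could be applied to the hospital (PHN) or HRR label of each node instead of the physician label.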
Fig. 3 Observed local clustering coefficient of the nodes on a referral path, and the three components parsed into their time series decomposition. The seasonal component fluctuates along the time axis
Percentage of change points in terms of increasing/decreasing trend in node position sequence of a referral path
| Year | 2007 | 2008 | 2009 | 2010 | 2011 |
|---|---|---|---|---|---|
| Clustering coefficient | 75.0 | 74.9 | 74.9 | 74.8 | 74.7 |
| Betweenness centrality | 74.9 | 74.7 | 74.8 | 74.7 | 74.5 |
| Eigenvector centrality | 74.3 | 74.2 | 74.2 | 74.1 | 74.0 |
| PageRank centrality | 74.8 | 74.6 | 74.7 | 74.6 | 74.5 |
| h-index | 70.7 | 70.6 | 70.8 | 70.8 | 70.8 |
Referrals to physician specialties over 2006-2011
| (a) Top 5 specialties. |
| Cardiovascular disease |
| Internal medicine |
| Family practice |
| Interventional cardiology |
| Pulmonary disease |
| (b) Top 5 cross-specialty referrals. |
| Internal medicine → cardiovascular disease |
| Cardiovascular disease → internal medicine |
| Family practice → cardiovascular disease |
| Cardiovascular disease → family practice |
| Internal medicine → family practice |
Fig. 4 The 10 most common specialties of the key physician on referral paths in 2007-2011. Each group accounts for more than 1% of key physicians
Several pairs of strong correlations between node position and node feature on a referral path
| Node position measure | Node feature about referral path | Correlation coefficient |
|---|---|---|
| Betweenness centrality | | 0.607 |
| PageRank centrality | | 0.852 |
| PageRank centrality | | 0.740 |
| h-index | | 0.783 |
| h-index | | 0.640 |
Comparison of average common connected nodes between neighbors on a referral path and the expectation in a random network with the same size
| Year | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 |
|---|---|---|---|---|---|---|
| Random network | 3.60 | 3.10 | 3.00 | 2.90 | 2.80 | 2.80 |
| Referral network | 25.13 | 24.64 | 24.95 | 24.97 | 24.95 | 24.96 |
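The quantity compared in the table above can be sketched as the average number of network neighbors shared by consecutive physicians on a path. The sketch assumes that edge direction is ignored when collecting neighbors and that the two physicians themselves are excluded from the count; the toy edge list is hypothetical.

```python
def avg_common_neighbors(path, edges):
    """Average number of shared neighbors over consecutive pairs on a path.
    Edge direction is ignored (an assumption)."""
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)
    pairs = [(u, v) for u, v in zip(path, path[1:]) if u != v]
    if not pairs:
        return 0.0
    total = 0
    for u, v in pairs:
        shared = neighbors.get(u, set()) & neighbors.get(v, set())
        total += len(shared - {u, v})
    return total / len(pairs)


# Hypothetical toy network with four physicians.
toy_edges = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "B"), ("C", "D")]
avg_shared = avg_common_neighbors(["A", "B", "C"], toy_edges)
```

In this toy network both consecutive pairs, (A, B) and (B, C), share only physician D, so the average is 1.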
The feature list of a hospital (PHN level referral network) for teaching hospital classification
| Feature Group | Features |
|---|---|
| PHN level network measures | #nodes, #edges, gini coefficient of indegree distribution, gini coefficient of outdegree distribution, alpha of indegree power law test, alpha of outdegree power law test, diameter, global clustering coefficient, local clustering coefficient, (in, in) assortativity, self degree correlation, reciprocity of # referral, reciprocity of RVUs |
| Difference (in - out) of edge weights on PHN traffic map | Degree, #different referred patients, #referral, geometric mean of #visit, geometric mean of RVUs, ranking index based weight |
| PHN position on PHN traffic map | Local clustering coefficient, PageRank, h-index |
| Average features of referral paths in the PHN | Length, avg-time-gap, avg-time-range, recurrent node, # nodes before recurrence, phy-entropy, PHN-entropy, HRR-entropy, common connected nodes between neighbors, bidirectional pairs |
| Average node position of the PHN in the national referral network | Local clustering coefficient, PageRank, h-index |
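The Gini coefficient of a degree distribution, listed among the PHN-level network measures above, can be computed with the standard sorted-values formula. The source does not say which variant was used, so this is one common choice, shown on hypothetical degree lists.

```python
def gini(values):
    """Gini coefficient via the sorted-values formula:
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical indegree lists for two four-physician networks.
equal_degrees = gini([1, 1, 1, 1])         # every physician receives equally
concentrated_degrees = gini([0, 0, 0, 4])  # one physician receives everything
```

A value near 0 means referrals are spread evenly across physicians, while values closer to 1 mean a few physicians receive or send most of them.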
COTH classification results of Logistic Regression (LR) and Support Vector Machine (SVM)
| LR | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | average F-score |
|---|---|---|---|---|---|---|---|
| Recall | 0.844 | 0.902 | 0.805 | 0.882 | 0.830 | 0.792 | |
| Precision | 0.704 | 0.712 | 0.733 | 0.667 | 0.780 | 0.792 | |
| F-score | 0.768 | 0.796 | 0.767 | 0.759 | 0.804 | 0.792 | 0.781 |
| SVM | |||||||
| Recall | 0.791 | 0.762 | 0.717 | 0.774 | 0.825 | 0.914 | |
| Precision | 0.756 | 0.780 | 0.805 | 0.750 | 0.750 | 0.762 | |
| F-score | 0.773 | 0.771 | 0.759 | 0.762 | 0.786 | 0.831 | 0.780 |
LR and SVM are the two best models in terms of average F-score over 2006-2011
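The F-scores in the table are consistent with the usual harmonic mean of precision and recall. As a quick check, the sketch below recomputes two cells and the LR average from the reported values; the function name is illustrative.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)


# Values copied from the table above.
lr_2006 = f_score(precision=0.704, recall=0.844)
svm_2011 = f_score(precision=0.762, recall=0.914)

lr_f_scores = [0.768, 0.796, 0.767, 0.759, 0.804, 0.792]
lr_average = sum(lr_f_scores) / len(lr_f_scores)
```

Rounded to three decimals, these reproduce the reported 0.768, 0.831 and 0.781.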
Significant predictors in Logistic Regression for COTH classification
| Feature Name | Estimated Coefficient | 95% Confidence Interval | p-value |
|---|---|---|---|
| Gini coefficient of degree distribution in PHN network | 2.823 | (0.844, 4.802) | 5.18E-03 |
| Global clustering coefficient of PHN network | -10.693 | (-13.218, -8.167) | < 2E-16 |
| (in, in) degree assortativity | 4.813 | (2.981, 6.646) | 2.63E-07 |
| Difference (in-out) of # referrals on PHN traffic map | 3.678 | (1.630, 5.726) | 4.32E-04 |
| h-index of a hospital on the PHN traffic map | 7.862 | (5.877, 9.847) | 8.37E-15 |
| Avg time range of a referral path | 5.138 | (2.157, 8.119) | 7.29E-04 |
| Ratio of referral paths with recurrent nodes | -12.950 | (-16.614, -9.286) | 4.29E-12 |
| Avg #nodes before recurrent nodes | 6.139 | (3.844, 8.434) | 1.58E-07 |
| Avg #bidirected pairs on referral paths in the PHN | 6.459 | (2.407, 10.512) | 1.78E-03 |
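The three numeric columns of the table above are linked: under a Wald (normal) approximation, which is assumed here and not stated in the source, the standard error can be recovered from the width of the 95% interval and the value in the final column follows from the resulting z statistic. The sketch checks this for the Gini coefficient row.

```python
import math


def wald_p_value(estimate, ci_low, ci_high, z_crit=1.96):
    """Two-sided p-value implied by an estimate and its 95% CI,
    assuming a symmetric normal (Wald) interval."""
    std_error = (ci_high - ci_low) / (2 * z_crit)
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > z) equals erfc(z / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Gini coefficient row: estimate 2.823, interval 0.844 to 4.802.
p_gini = wald_p_value(2.823, 0.844, 4.802)
```

The result is about 0.0052, in line with the 5.18E-03 reported for that row.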
Feature list of a referral path for treatment outcome classification/regression
| Group of Features | Features and ID |
|---|---|
| Network measures in the dominant HRR | 1:#nodes, 2:#edges, 3:indegree gini coefficient, 4:outdegree gini coefficient, 5:indegree power law test alpha, 6:outdegree power law test alpha, 7: diameter, 8:global clustering coefficient, 9:local clustering coefficient, 10: (in, in) assortativity, 11:self in/out degree coefficient, 12:referral reciprocity, 13:RVU reciprocity |
| Referral path sequence | 14:#nodes, 15:average time gap, 16: time range, 17:indicator of recurrence, 18: #nodes before recurrence, 19:physician distribution entropy, 20: PHN distribution entropy, 21:HRR distribution entropy, 22:average #common connected nodes between neighbors, 23:#pairs of nodes with reciprocal referrals, 37:#change points, 38:#previous referral path in the same year, 39:distance between the first visited hospital and the end one, 40:total RVU, 41:month of the first visit, 42:#visited teaching hospitals, 43:specialty of the key physician, 44:specialty of the last physician, 45:#visited PHN with negative (in-out) degree on PHN traffic map, 46:#visited PHN with positive (in-out) degree on PHN traffic map, 47:sum of (in-out) degree for all PHN on the referral path, 60:indicator of admitted by emergency department for the first node |
| Average node positions on the referral path | 24:local clustering coefficient, 25:PageRank, 26:h-index, 27:#paths which contains the node, 28:#paths where the node is the starting one, 29:#paths where the node is the end one, 30:index of the first-time occurrence, 31:#paths where the node occurs multiple times, 32:#cross-HRR referrals proposed by the node, 33:#cross-PHN referrals proposed by the node |
| Average weights of edges in the national referral network covered by the referral path | 34:#referrals, 35:RVU, 36:ranking based weight |
| Last physician on the referral path | 48:RVU, 49:month of visit, 50:local clustering coefficient, 51:PageRank, 52:h-index, 53:#paths which contains the node, 54:#paths where the node is the starting one, 55:#paths where the node is the end one, 56:average index of the first-time occurrence, 57:#paths where the node occurs multiple times, 58:#cross-HRR referrals proposed by the node, 59:#cross-PHN referrals proposed by the node |
| Patient history information | 61:age, 62:indicator of HIV, 63:indicator of asthmatic lung disease, 64:indicator of cancer, 65:indicator of dementia, 66:indicator of diabetes, 67:indicator of liver disease, 68:indicator of chronic non-asthmatic lung disease, 69:indicator of chronic renal disease |
Classification results of GBDT for death1yr and PCI in 2007-2011
| PCI | 2007 | 2008 | 2009 | 2010 | 2011 | Average F-score |
|---|---|---|---|---|---|---|
| Recall | 0.703 | 0.700 | 0.702 | 0.695 | 0.694 | |
| Precision | 0.572 | 0.574 | 0.585 | 0.597 | 0.607 | |
| F-score | 0.631 | 0.630 | 0.638 | 0.642 | 0.647 | 0.638 |
| death1yr | ||||||
| Recall | 0.702 | 0.698 | 0.710 | 0.704 | 0.682 | |
| Precision | 0.640 | 0.632 | 0.639 | 0.650 | 0.633 | |
| F-score | 0.669 | 0.663 | 0.672 | 0.675 | 0.657 | 0.667 |
Average F-score in 2007-2011 of GBDT on groups divided by age
| Death1yr | PCI | |
|---|---|---|
| Age<=75 | 0.592 | 0.695 |
| Age>75 | 0.687 | 0.565 |
Top 10 important features for death1yr and PCI generated by Random Forest feature selection method (Genuer et al. 2010)
| Rank | Death1yr | PCI |
|---|---|---|
| 1 | Total RVU of the referral path | Average time gap on the referral path |
| 2 | Total RVU of the previous referral path | Indicator of patient’s age in 66-70 |
| 3 | Average time gap on the referral path | Average PageRank values of all physicians on the referral path |
| 4 | Time range of the referral path | Indicator of the key physician’s specialty on the referral path as “interventional cardiology” |
| 5 | Average index of the first-time occurrence on a referral path for the last physician | Indicator of patient’s age in 76+ |
| 6 | Local clustering coefficient of the last physician on the referral path | The number of referral paths that include the last physician |
| 7 | Times of being the end node on a referral path of the last physician on the referral path | Indicator of the key physician’s specialty on the referral path as “interventional cardiology” |
| 8 | Times of being the first node on a referral path for the last physician | Average #involved paths among physicians on the referral path |
| 9 | Indicator of patient’s age in 76+ | Average times of being the first node on a referral path for all physicians on the referral path |
| 10 | Average times of being the end node on a referral path for all physicians on the referral path | Times of being the first node on a referral path for the last physician |
Significant predictors for LR on two binary treatment variables, with estimated coefficients and 95% confidence intervals (CI)
| (a) death1yr | |||
| Feature | Estimate | 95% CI | p-value |
| #nodes in domain HRR | −0.243 | (−0.389, −0.098) | 1.03 |
| Physician distribution entropy | −0.313 | (−0.625, −0.0013) | 0.049 |
| PHN distribution entropy | −0.528 | (−0.692, −0.365) | 2.34 |
| #pairs of nodes with reciprocal referrals | −2.496 | (−3.666, −1.325) | 2.93 |
| Avg. PageRank values on a referral path | −2.290 | (−2.803, −1.778) | <2 |
| Avg. index of first occurrence | −0.569 | (−0.974, −0.164) | 0.0059 |
| Avg. proposed #cross-PHN referrals | 1.628 | (0.961, 2.295) | 1.73 |
| Avg. #referrals on the corresponding edges | 8.696 | (4.771, 12.620) | 1.41 |
| Avg. ranking-based weight on the corresponding edges | −3.973 | (−6.426, −1.519) | 0.0015 |
| #previous paths | 2.204 | (1.908, 2.500) | <2 |
| Total RVU | 11.414 | (10.461, 12.367) | <2 |
| Times of being the end node of the last physician | −2.985 | (−4.075, −1.896) | 7.89 |
| Avg. first occurrence index of the last physician | 4.869 | (4.176, 5.562) | <2 |
| Times of occurring multiple times of the last physician | 1.778 | (1.041, 2.514) | 2.23 |
| (b) PCI | |||
| Feature | Estimate | 95% CI | p-value |
| Physician distribution entropy | −0.368 | (−0.678, −0.058) | 0.019 |
| PHN distribution entropy | 0.547 | (0.359, 0.734) | 1.08 |
| Avg. #common connected nodes between neighbors | 0.487 | (0.097, 0.877) | 0.014 |
| Avg. PageRank values on a referral path | 3.874 | (3.337, 4.411) | <2 |
| Avg. proposed #cross-PHN referrals | −1.738 | (−2.278, −1.197) | 2.89 |
| Avg. #referrals on the corresponding edges | −2.222 | (−3.822, −0.622) | 0.0065 |
| #previous paths | −1.845 | (−2.155, −1.533) | <2 |
| Total RVU | −2.113 | (−2.909, −1.315) | 2.02 |
| Local clustering coefficient of the last physician | −1.352 | (−1.969, −0.735) | 1.76 |
| Avg. first occurrence index of the last physician | −3.024 | (−4.034, −2.013) | 4.48 |
Significant predictors in multiple linear regression models for log(Total 1yr payments), with estimated coefficients and 95% confidence intervals (CI)
| Feature | Estimate | 95% CI | p-value |
|---|---|---|---|
| #nodes in domain HRR | 0.121 | (0.099, 0.142) | < 2 |
| Referral reciprocity in domain HRR | 0.209 | (0.167, 0.251) | < 2 |
| #nodes ∗ | −2.588 | (−2.992, −2.183) | < 2 |
| Physician distribution entropy | 1.365 | (1.321, 1.410) | < 2 |
| PHN distribution entropy ∗ | 0.413 | (0.347, 0.480) | < 2 |
| Avg. #common connected nodes between neighbors | −0.357 | (−0.432, −0.282) | < 2 |
| #pairs of nodes with reciprocal referrals | 2.618 | (2.374, 2.863) | < 2 |
| Avg. local clustering coefficient on the referral path | −1.222 | (−1.326, −1.117) | < 2 |
| Avg. PageRank values on the referral path | 0.983 | (0.888, 1.077) | < 2 |
| Avg. index of first occurrence on the referral path | 0.341 | (0.235, 0.447) | 3.05 |
| Avg. proposed #cross-PHN referrals | −0.592 | (−0.685, −0.498) | < 2 |
| Avg. #referrals on the corresponding edges | −0.567 | (−0.902, −0.232) | 9.25 |
| Avg. ranking-based weight on the corresponding edges ∗ | 0.775 | (0.485, 1.064) | 1.59 |
| #previous paths ∗ | 0.304 | (0.212, 0.396) | 9.28 |
| Total RVU ∗ | 5.028 | (4.604, 5.451) | < 2 |
| Month of the first visit | categorical | varies by group | < 2 |
| Specialty of the key physician | categorical | varies by group | < 2 |
| Month of the last visit | categorical | varies by group | < 2 |
| Avg. first occurrence index of the last physician ∗ | −0.433 | (−0.686, −0.179) | 7.99 |
An asterisk (∗) indicates that the predictor has a significant interaction with time
Fig. 5 Visualization of a hospital (PHN) referral network with 30 physicians and 101 directed edges in 2011. Red, yellow and light blue nodes represent physicians with positive, zero and negative net patient flow (NPF), respectively. Targets of referrals are marked with shadow on directed edges. The edge weights are in Table 17
A subset of the edge weights in the PHN network of Fig. 5
| (s, t, w) | (s, t, w) | (s, t, w) | (s, t, w) | (s, t, w) |
| (20, 7, 2) | (1, 6, 2) | (22, 6, 2) | (6, 7, 11) | (10, 7, 2) |
| (7, 6, 10) | (6, 10, 3) | (7, 19, 2) | (3, 12, 2) | (12, 2, 2) |
| (17, 6, 2) | (17, 11, 3) | (6, 11, 4) | (7, 17, 3) | (14, 3, 2) |
| (6, 17, 5) | (8, 10, 2) | (14, 10, 2) | (19, 6, 3) | (4, 2, 2) |
| (2, 4, 4) | (10, 11, 3) | (11, 24, 2) | (3, 14, 2) | (11, 6, 2) |
The weights (i.e. number of referrals) of the remaining edges are one. A triple means (source, target, weight)
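The (source, target, weight) triples above are enough to sketch how a net flow per physician could be tallied. The sketch assumes that net patient flow means incoming minus outgoing referral weight, which the text here does not define, and it uses only the 25 listed triples. The remaining weight-one edges are not enumerated, so these totals are partial and need not match the node colors in Fig. 5.

```python
from collections import defaultdict

# (source, target, weight) triples copied from the table above.
triples = [
    (20, 7, 2), (1, 6, 2), (22, 6, 2), (6, 7, 11), (10, 7, 2),
    (7, 6, 10), (6, 10, 3), (7, 19, 2), (3, 12, 2), (12, 2, 2),
    (17, 6, 2), (17, 11, 3), (6, 11, 4), (7, 17, 3), (14, 3, 2),
    (6, 17, 5), (8, 10, 2), (14, 10, 2), (19, 6, 3), (4, 2, 2),
    (2, 4, 4), (10, 11, 3), (11, 24, 2), (3, 14, 2), (11, 6, 2),
]


def net_flow(edge_triples):
    """Incoming minus outgoing referral weight per node (assumed NPF definition)."""
    flow = defaultdict(int)
    for source, target, weight in edge_triples:
        flow[target] += weight
        flow[source] -= weight
    return dict(flow)


partial_flow = net_flow(triples)
```

On the listed edges alone, physician 6 receives 21 referrals and sends 23, while physician 7 receives and sends 15 each.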