Elena Daskalaki, Peter Diem, Stavroula G Mougiakakou.
Abstract
Although reinforcement learning (RL) is suitable for highly uncertain systems, the applicability of this class of algorithms to medical treatment may be limited by the patient variability which dictates individualised tuning for their usually multiple algorithmic parameters. This study explores the feasibility of RL in the framework of artificial pancreas development for type 1 diabetes (T1D). In this approach, an Actor-Critic (AC) learning algorithm is designed and developed for the optimisation of insulin infusion for personalised glucose regulation. AC optimises the daily basal insulin rate and insulin:carbohydrate ratio for each patient, on the basis of his/her measured glucose profile. Automatic, personalised tuning of AC is based on the estimation of information transfer (IT) from insulin to glucose signals. Insulin-to-glucose IT is linked to patient-specific characteristics related to total daily insulin needs and insulin sensitivity (SI). The AC algorithm is evaluated using an FDA-accepted T1D simulator on a large patient database under a complex meal protocol, meal uncertainty and diurnal SI variation. The results showed that 95.66% of time was spent in normoglycaemia in the presence of meal uncertainty and 93.02% when meal uncertainty and SI variation were simultaneously considered. The time spent in hypoglycaemia was 0.27% in both cases. The novel tuning method reduced the risk of severe hypoglycaemia, especially in patients with low SI.
Year: 2016 PMID: 27441367 PMCID: PMC4956312 DOI: 10.1371/journal.pone.0158722
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Schema of AC.
Parameters of the AC algorithm.
| Description | Value | Class |
|---|---|---|
| Local cost hyperglycemia weight | 1 | R |
| Local cost hypoglycemia weight | 1 | R |
| Discount factor long-term cost | 0.9 | R |
| TD learning constant | 0.5 | R |
| Critic learning rate | 0.5 | R |
| Actor learning rate | 1 | R |
| Critic initial parameter vector | Random in [–1, 1] | R |
| Critic initial eligibility vector | Zero | R |
| Standard deviation exploration action | 0.05 | R |
| Actor initial BR/IC ratio | patient-specific | S |
| Actor initial parameter vector | patient-specific | S |
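To illustrate where the tabulated values would enter a learning loop, here is a minimal sketch of a generic actor-critic step with a linear critic and eligibility traces. It is not the authors' implementation. The class `LinearActorCritic`, the helper `dot`, the feature vector `x`, the form of the actor step, and the reading of the "TD learning constant" as a trace-decay rate are all assumptions made for illustration. The local cost is supplied by the caller because its exact form is not given in this record.

```python
import random

# Values copied from the parameter table above.
GAMMA = 0.9          # discount factor of the long-term cost
LAMBDA = 0.5         # "TD learning constant", read here as a trace-decay rate (assumption)
ALPHA_CRITIC = 0.5   # critic learning rate
ALPHA_ACTOR = 1.0    # actor learning rate
SIGMA = 0.05         # standard deviation of the exploration action


def dot(a, b):
    """Inner product of two equally long lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


class LinearActorCritic:
    """Hypothetical linear actor-critic that minimises a discounted cost."""

    def __init__(self, actor_init, seed=0):
        self.rng = random.Random(seed)
        n = len(actor_init)
        # Critic parameters start random in [-1, 1], eligibility starts at zero.
        self.w = [self.rng.uniform(-1.0, 1.0) for _ in range(n)]
        self.z = [0.0] * n
        # Actor parameters are patient-specific in the paper; here they are passed in.
        self.theta = list(actor_init)

    def act(self, x):
        """Return the deterministic action and an explored (noisy) action."""
        mean = dot(self.theta, x)
        return mean, mean + self.rng.gauss(0.0, SIGMA)

    def update(self, x, x_next, cost, mean, action):
        """One TD(lambda) critic step and one actor step; returns the TD error."""
        delta = cost + GAMMA * dot(self.w, x_next) - dot(self.w, x)
        self.z = [GAMMA * LAMBDA * zi + xi for zi, xi in zip(self.z, x)]
        self.w = [wi + ALPHA_CRITIC * delta * zi for wi, zi in zip(self.w, self.z)]
        # A positive TD error means the explored action cost more than expected,
        # so the actor moves away from the direction of the exploration noise.
        step = ALPHA_ACTOR * delta * (action - mean)
        self.theta = [ti - step * xi for ti, xi in zip(self.theta, x)]
        return delta
```

In a daily loop, one would call `act` on the current feature vector, apply the explored action, observe the next features and the local cost, and then call `update`.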
Fig 2. Correlation of TE values for successive pairs of data lengths.
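TE here presumably denotes transfer entropy, the quantity used to estimate insulin-to-glucose information transfer. This record does not say how it was estimated, so the following is only a sketch of one common choice: a plug-in (histogram) estimator with a history length of one sample. The function names `discretise` and `transfer_entropy`, the equal-width binning and the default of two bins are assumptions.

```python
import math
from collections import Counter


def discretise(series, n_bins):
    """Map each value to an equal-width bin index in 0..n_bins-1."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in series]


def transfer_entropy(source, target, n_bins=2):
    """Plug-in estimate of TE(source -> target) in bits, history length 1."""
    if len(source) != len(target) or len(source) < 2:
        raise ValueError("series must have equal length of at least 2")
    s = discretise(source, n_bins)
    t = discretise(target, n_bins)
    # Joint and marginal counts over (t_next, t_now, s_now).
    triples = Counter(zip(t[1:], t[:-1], s[:-1]))
    pairs_ts = Counter(zip(t[:-1], s[:-1]))
    pairs_tt = Counter(zip(t[1:], t[:-1]))
    singles = Counter(t[:-1])
    n = len(t) - 1
    te = 0.0
    for (t_next, t_now, s_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_ts[(t_now, s_now)]
        p_given_self = pairs_tt[(t_next, t_now)] / singles[t_now]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te
```

With hypothetical `insulin` and `glucose` lists sampled on the same time grid, `transfer_entropy(insulin, glucose)` would estimate the information flowing from insulin to glucose. Repeating the call on prefixes of increasing length gives the kind of data-length comparison the caption refers to.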
Percentage of time spent in the target range, mild hypoglycaemia, severe hypoglycaemia, mild hyperglycaemia and severe hyperglycaemia for each age group and the six experiments.
| Glucose Levels | E1 | E2 | E3 | E4 | E5 | E6 |
|---|---|---|---|---|---|---|
| 70–180 mg/dl | 97.18 | 94.43 | 96.92 | 96.30 | 96.28 | 94.96 |
| 50–70 mg/dl | 1.47 | 2.18 | 0.31 | 0.20 | 0.16 | 0.09 |
| < 50 mg/dl | 0.31 | 1.04 | 0.00 | 0.00 | 0.00 | 0.00 |
| 180–300 mg/dl | 1.03 | 2.35 | 2.76 | 3.50 | 3.56 | 4.96 |
| > 300 mg/dl | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 70–180 mg/dl | 86.44 | 82.73 | 81.72 | 79.59 | 81.64 | 77.81 |
| 50–70 mg/dl | 2.39 | 3.23 | 0.75 | 0.98 | 0.77 | 1.38 |
| < 50 mg/dl | 0.01 | 1.64 | 0.00 | 0.01 | 0.00 | 0.05 |
| 180–300 mg/dl | 11.07 | 12.30 | 17.08 | 19.13 | 17.12 | 20.55 |
| > 300 mg/dl | 0.10 | 0.10 | 0.45 | 0.29 | 0.47 | 0.21 |
| 70–180 mg/dl | 74.77 | 75.82 | 79.30 | 80.52 | 79.24 | 77.36 |
| 50–70 mg/dl | 14.63 | 12.21 | 2.19 | 2.72 | 1.27 | 1.35 |
| < 50 mg/dl | 6.15 | 7.33 | 0.20 | 0.33 | 0.06 | 0.05 |
| 180–300 mg/dl | 4.37 | 4.56 | 16.74 | 16.17 | 18.81 | 20.86 |
| > 300 mg/dl | 0.08 | 0.08 | 1.58 | 0.27 | 0.61 | 0.38 |
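The percentages in the table above are shares of time spent in five glucose bands. A minimal sketch of that bookkeeping follows, assuming equally spaced glucose samples in mg/dl. The function name `time_in_bands` and the assignment of the boundary values 70, 180 and 300 to a particular band are assumptions, since the record does not say which side is inclusive.

```python
def time_in_bands(glucose_mg_dl):
    """Return the percentage of samples falling in each glucose band (mg/dl)."""
    if not glucose_mg_dl:
        raise ValueError("glucose trace is empty")
    counts = {"<50": 0, "50-70": 0, "70-180": 0, "180-300": 0, ">300": 0}
    for g in glucose_mg_dl:
        if g < 50:
            counts["<50"] += 1          # severe hypoglycaemia
        elif g < 70:
            counts["50-70"] += 1        # mild hypoglycaemia
        elif g <= 180:
            counts["70-180"] += 1       # target range
        elif g <= 300:
            counts["180-300"] += 1      # mild hyperglycaemia
        else:
            counts[">300"] += 1         # severe hyperglycaemia
    n = len(glucose_mg_dl)
    return {band: 100.0 * c / n for band, c in counts.items()}
```

Because the samples are assumed to be equally spaced, the share of samples in a band equals the share of time spent in it.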
Fig 3. Evolution of the AC adaptive parameters for one in silico child under an extended E6 for 30 days.
Fig 4. LBGI progress of one in silico child during experiments E4 and E6.
Percentage of time spent in target range, hypoglycaemia and hyperglycaemia for AC evaluated in the 100-adult cohort under E5 and E6.
| Glucose Levels | E5 | E6 |
|---|---|---|
| 70–180 mg/dl | 95.66 | 93.02 |
| < 70 mg/dl | 0.27 | 0.27 |
| > 180 mg/dl | 4.07 | 6.71 |
Fig 5. Daily LBGI of the total experiment duration for the 100 adult patients and experiments E5-E6.
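The LBGI reported in Figs 4 and 5 is the low blood glucose index. The sketch below follows the widely used definition by Kovatchev and colleagues for glucose in mg/dl. Whether the paper applies exactly this form, and over which daily window, is an assumption. The function names `bg_risk_transform` and `lbgi` are illustrative.

```python
import math


def bg_risk_transform(bg_mg_dl):
    """Symmetrising transform of a blood glucose value given in mg/dl."""
    return 1.509 * (math.log(bg_mg_dl) ** 1.084 - 5.381)


def lbgi(glucose_mg_dl):
    """Low blood glucose index: mean risk over the low side of the scale."""
    if not glucose_mg_dl:
        raise ValueError("glucose trace is empty")
    total = 0.0
    for bg in glucose_mg_dl:
        f = bg_risk_transform(bg)
        if f < 0:
            # Only readings below the centre of the scale contribute.
            total += 10.0 * f * f
    return total / len(glucose_mg_dl)
```

Applying `lbgi` to each day's glucose samples separately would give a daily series like the one the captions describe.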