Processed HIV prognostic dataset for control experiments.

Moses E Ekpenyong, Philip I Etebong, Tenderwealth C Jackson, Edidiong J Udofa.

Abstract

This paper provides a control dataset of processed prognostic indicators for analysing drug resistance in patients on antiretroviral therapy (ART). The dataset was sourced locally from health facilities in Akwa Ibom State of Nigeria, West Africa, and contains 14 attributes with 1506 unique records filtered from 3168 individual treatment change episodes (TCEs). The attributes are sex, before and follow-up CD4 counts (BCD4, FCD4), before and follow-up viral load (BRNA, FRNA), drug type/combination (DTYPE), before and follow-up body weight (Bwt, Fwt), patient response to ART (PR), and classification targets (C1-C5). Five output membership grades of a fuzzy inference system, ranging from very high interaction to no interaction, were constructed to model the influence of adverse drug reaction (ADR) and to derive the PR attribute (a non-fuzzy variable). The PR membership clusters, derived from a universe of discourse table, were then used to label the classification targets as follows: C1 = no interaction, C2 = very low interaction, C3 = low interaction, C4 = high interaction, and C5 = very high interaction. The classification targets are useful for building classification models and for detecting patients with ADR. The data can be used to develop expert systems that support treatment failure classification [1] and effective drug regimen prescription.
© 2021 The Author(s). Published by Elsevier Inc.

Keywords:  Adverse drug reaction; Antiretroviral therapy; HIV control data; Treatment change episode; Treatment failure classification

Year:  2021        PMID: 34041323      PMCID: PMC8142042          DOI: 10.1016/j.dib.2021.107147

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Specifications Table

Value of the Data

This paper presents datasets that can stimulate research on HIV/AIDS in the Sub-Saharan African region. Computer scientists can use the data to develop classification models and expert systems for drug pattern analysis, adverse drug reaction and failed treatment. Clinicians/physicians and pharmacists can use such expert systems to support decisions on drug prescription, recommendation, and administration. Access to clinical (control) HIV data can accelerate research progress towards individualised medicine, where on-treatment variables influencing a set of study outcomes are analysed to predict patient drug response with precision. The developed models and algorithms could be made available as open-source tools with adaptive and replicable features for diverse domains/environments.

Data Description

We provide a control dataset (SupplFile.xlsx) containing average prognostic indicators of HIV (sex; before CD4 count, BCD4; follow-up CD4 count, FCD4; before viral load, BRNA; follow-up viral load, FRNA; before body weight in kg, BWt; and follow-up body weight in kg, FWt), treatment type/drug combination (DrugNo/DrugComb) and patient response to treatment (PR). The dataset is divided into two sets: the individual treatment change episodes (TCEs) and the unique records.

The first set, the TCEs (or raw data), repeats the other variables on every row and lists each individual drug on a separate row for each patient ID (PID). Each drug is identified by DrugNo, a numeric value assigned to each drug taken by the patient. Table 1 gives the corresponding drug code (DrugCode) and drug name (DrugName) for each DrugNo administered to patients on ART. The prognostic indicators are results of laboratory analysis of blood samples, while sex and body weight were determined by physical appearance and by measurement on a weighing scale, respectively. A total of 3168 TCEs are documented.

The TCEs were further processed to obtain unique records for 1506 patients. The unique records are condensed instances of the TCEs, with each DrugNo converted into its DrugCode equivalent and the codes concatenated to form a single, unique record per patient. The PR is a non-fuzzy output value obtained from a fuzzy inference system evaluation of the prognostic indicators, with 5 output membership grades indicating the level of drug interaction: very high interaction, high interaction, low interaction, very low interaction, and no interaction. The classification targets (C1-C5) are binary digits (0/1) used to label the occurrence of a particular membership grade.
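The condensation of per-drug TCE rows into unique records can be illustrated with a minimal Python sketch. The source reports performing this step in Microsoft Excel, so the code below is only an analogue under stated assumptions: the patient IDs (`P001`, `P002`), the `(pid, drug_no)` row layout, and the function name `condense_tces` are hypothetical, and only a subset of the Table 1 mapping is included.

```python
# Subset of the DrugNo -> DrugCode mapping listed in Table 1.
DRUG_CODES = {4: "3TC", 9: "AZT", 10: "ABC", 11: "TDF", 12: "EFV"}


def condense_tces(rows):
    """Collapse per-drug TCE rows into one DrugComb string per patient.

    rows: iterable of (pid, drug_no) tuples, one tuple per drug row.
    Returns {pid: "CODE+CODE+..."} with codes in first-seen row order.
    (Row layout and ordering rule are assumptions for illustration.)
    """
    combos = {}
    for pid, drug_no in rows:
        code = DRUG_CODES[drug_no]
        codes = combos.setdefault(pid, [])
        if code not in codes:  # ignore a drug repeated for the same patient
            codes.append(code)
    return {pid: "+".join(codes) for pid, codes in combos.items()}


# Hypothetical TCE rows: three drug rows per patient.
tce_rows = [
    ("P001", 4), ("P001", 10), ("P001", 9),
    ("P002", 11), ("P002", 4), ("P002", 12),
]
unique_records = condense_tces(tce_rows)
print(unique_records)
```

Each patient's three drug rows collapse into a single DrugComb value, for example `3TC+ABC+AZT` for the first hypothetical patient.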
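Table 1

Drugs administered to patients on ART (https://hivdb.stanford.edu) [2].

| DrugNo | DrugCode | DrugName | DrugNo | DrugCode | DrugName |
|---|---|---|---|---|---|
| 1 | RTV | Ritonavir | 13 | DDI | Didanosine |
| 2 | IDV | Indinavir | 14 | LPV | Lopinavir |
| 3 | D4T | Stavudine | 15 | APV | Amprenavir |
| 4 | 3TC | Lamivudine | 16 | NVP | Nevirapine |
| 5 | SQV | Saquinavir | 17 | DRV | Darunavir |
| 6 | T20 | Enfuvirtide | 18 | FTC | Emtricitabine |
| 7 | FPV | Fosamprenavir | 19 | ATV | Atazanavir |
| 8 | NFV | Nelfinavir | 20 | TPV | Tipranavir |
| 9 | AZT | Zidovudine | 21 | RAL | Raltegravir |
| 10 | ABC | Abacavir | 22 | ETR | Etravirine |
| 11 | TDF | Tenofovir | 23 | MVC | Maraviroc |
| 12 | EFV | Efavirenz | 24 | DLV | Delavirdine |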
Important statistics that give more insight into the control dataset, compared with the Stanford dataset, are presented in Table 2.
Table 2

Analysis of control datasets.

| Analysis | Stanford | Akwa Ibom |
|---|---|---|
| Male | | 704 |
| Female | | 352 |
| Total number of drugs administered | 24 | 5 |
| Minimum drug combination | 1 | 3 |
| Maximum drug combination | 7 | 3 |
| Number of patients with most frequent drug combinations (actual drug combination) | 37 (D4T+DDI+EFV) | 698 (TDF+3TC+EFV) |
| Number of patients with less frequent drug combinations (actual drug combination) | 1 (3TC) | 27 (AZT+3TC+EFV) |
| Patients with at most 2 TCEs | 310 | |
| Patients with at least 3 TCEs (Total TCEs) | 1490 (5780) | 1506 (3168) |
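Summary statistics of the kind reported in Table 2 could be recomputed from the DrugComb column of the unique records. The sketch below is illustrative only: the function name `combination_stats` and the sample list are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def combination_stats(drug_combs):
    """Summarise a non-empty list of DrugComb strings such as 'TDF+3TC+EFV'."""
    counts = Counter(drug_combs)
    # Number of drugs in each distinct combination.
    sizes = [len(comb.split("+")) for comb in counts]
    # Distinct drug codes appearing in any combination.
    drugs = {code for comb in counts for code in comb.split("+")}
    most_common, most_n = counts.most_common(1)[0]
    least_common, least_n = min(counts.items(), key=lambda item: item[1])
    return {
        "total_drugs": len(drugs),
        "min_combination": min(sizes),
        "max_combination": max(sizes),
        "most_frequent": (most_common, most_n),
        "least_frequent": (least_common, least_n),
    }


# Hypothetical sample of four unique records.
sample = ["TDF+3TC+EFV"] * 3 + ["AZT+3TC+EFV"]
stats = combination_stats(sample)
print(stats)
```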

Experimental Design, Materials and Methods

Locally sourced data were collected directly from the case files of patients receiving treatment at various health centres in Akwa Ibom State of Nigeria, including a Community Anti-Retroviral Therapy Programme that is carried out periodically to reach rural dwellers. A total of 13 health facilities were used as data collection points, covering patients with both resistant and non-resistant cases who registered for treatment at these facilities from 2015 to 2018. The investigated facilities were found to accommodate up to 10,000 patients receiving treatment in the southeast region. Due to limited resources and the high cost of treatment, only 5 drug combinations in 3 consistent treatment regimens were administered to patients free of charge, through a Family Health International (FHI) HIV/AIDS intervention programme. The number of rows rendered for a patient depends on the ART regimen administered over the treatment period: if a patient was administered a combination of 3 drugs, three rows are rendered (see the data on individual TCEs).

Collection of the control data did not involve direct contact with the patients. Instead, access to patients’ medical histories and treatment was granted by the responsible authorities after the ethical consent procedure required for filtering the relevant data had been satisfied. At the University level, ethical approval was granted by the University of Uyo Institutional Health Research Ethics Committee (UNIUYO–IHREC). At the hospital level, informed consent through written permission was obtained from the responsible health authority before the data collection began. To protect patient records, details that could expose the patients’ identities (e.g., name, address, occupation) were not documented. Each patient's data were further validated for consistency before recording, and questionable, inconsistent, or improperly documented records were dropped.

The control dataset holds only first-line treatment episodes (initial 6 months) extracted from existing patients’ records/files under the supervision of a medical superintendent. From the control dataset, universe of discourse (UoD) membership ranges were created to align with established ranges from domain experts/physicians. Table 3 shows the input and output fuzzy sets derived from the control dataset.
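Table 3

Input and output fuzzy sets from domain knowledge.

| S/N | Membership grade (MG) | l1 | P1 | r1 | l2 | P2 | r2 |
|---|---|---|---|---|---|---|---|
| | BCD4/FCD4 (Input) | | | | | | |
| 1 | Low {L} | 0 | 225 | 450 | 50 | 275 | 500 |
| 2 | Medium {M} | 300 | 575 | 850 | 350 | 625 | 900 |
| 3 | High {H} | 700 | 1075 | 1450 | 750 | 1125 | 1500 |
| | BRNA/FRNA (Input) | | | | | | |
| 1 | Undetected {U} | 0 | 0.60 | 1.20 | 0.30 | 0.90 | 1.50 |
| 2 | Suppressed {S} | 1.00 | 2.15 | 3.30 | 1.20 | 2.35 | 3.50 |
| 3 | Not Suppressed {NS} | 2.50 | 4.00 | 5.50 | 3.00 | 4.50 | 6.00 |
| | PR (Output) | | | | | | |
| 1 | No Interaction {NI} | 0 | 27.50 | 55 | 5 | 32.50 | 60 |
| 2 | Very Low Interaction {VLI} | 30 | 47.50 | 65 | 35 | 52.50 | 70 |
| 3 | Low Interaction {LI} | 62 | 68.50 | 75 | 67 | 73.50 | 80 |
| 4 | High Interaction {HI} | 72 | 78.50 | 85 | 77 | 83.50 | 90 |
| 5 | Very High Interaction {VHI} | 82 | 88.50 | 95 | 87 | 93.50 | 100 |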
The column labels indicate the internal structure of the interval type-2 fuzzy set (IT2FS) [3], [4], where l is the left end point, bounded by both the upper membership function, UMF (l1), and the lower membership function, LMF (l2), and r is the right end point, also bounded by both the UMF (r1) and the LMF (r2). The triangular peak location or mean, P, of each end point is likewise bounded by P1 and P2, representing the triangular peak locations of end points l1 and r1, and l2 and r2, respectively. Expressions deriving the IT2FL LMF and UMF can be found in [5].

To enable precise knowledge representation of PR and minimise the influence of confusing input/output boundaries, an Interval Type-2 Fuzzy Logic (IT2FL) system was developed using the JuzzyOnline Fuzzy Toolkit (http://juzzy.wagnerweb.net/) [6], [7], an open-source toolkit for the design, implementation, evaluation and sharing of Type-1 and Type-2 fuzzy logic systems.

Using Microsoft Excel functions and commands, the individual TCEs were condensed to produce a second set of data called unique records (a single row per patient), with each DrugNo replaced by its DrugCode and the codes concatenated with the ‘+’ symbol to form the drug combination (DrugComb). Hence, if the drugs 3TC, ABC and AZT were administered to a patient over the study period, the DrugComb cell is rendered as 3TC+ABC+AZT. Microsoft Excel commands were also used to label the classification targets of the unique records, based on the non-fuzzy PR values. Guided by the derived IT2FL expressions in [5], the correct target class is determined, with 1 placed in the correct target class and 0s placed in the other target classes.
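The labelling of C1-C5 from a non-fuzzy PR value can be sketched in Python using the PR output sets of Table 3. The actual expressions are those derived in [5] and are not reproduced here, so this is an assumption-laden illustration rather than the authors' procedure: it evaluates the two triangles (l1, P1, r1) and (l2, P2, r2) for each grade, treats the smaller and larger of the two memberships as an interval, and assigns the class whose interval midpoint is largest. The function names and this selection rule are hypothetical.

```python
# PR output sets from Table 3: (l1, P1, r1, l2, P2, r2) per target class.
PR_SETS = {
    "C1": (0, 27.5, 55, 5, 32.5, 60),     # no interaction
    "C2": (30, 47.5, 65, 35, 52.5, 70),   # very low interaction
    "C3": (62, 68.5, 75, 67, 73.5, 80),   # low interaction
    "C4": (72, 78.5, 85, 77, 83.5, 90),   # high interaction
    "C5": (82, 88.5, 95, 87, 93.5, 100),  # very high interaction
}


def triangle(x, left, peak, right):
    """Triangular membership of x for the triangle (left, peak, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def membership_interval(x, params):
    """Return (lower, upper) membership of x from the two triangles.

    Assumption: the interval is the min and max of the two triangle values.
    """
    l1, p1, r1, l2, p2, r2 = params
    a = triangle(x, l1, p1, r1)
    b = triangle(x, l2, p2, r2)
    return min(a, b), max(a, b)


def label_targets(pr):
    """One-hot C1-C5 labels for a PR value (assumed argmax-of-midpoint rule)."""
    scores = {c: sum(membership_interval(pr, p)) / 2 for c, p in PR_SETS.items()}
    best = max(scores, key=scores.get)  # ties resolve to the first class listed
    return {c: int(c == best) for c in PR_SETS}


print(label_targets(70))
```

Under these assumptions, a PR value of 70 falls only within the low-interaction triangles, so the sketch places 1 in C3 and 0 in the other classes.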

Ethics Statement

The University of Uyo Institutional Health Research Ethics Committee determined that this study did not qualify as human subjects research because no protected health information was collected, accessed, or distributed (UU/CHS/IHREC/014).

CRediT Author Statement

Moses Ekpenyong: Conceptualization, Methodology, Writing – Original draft, Funding acquisition, Supervision; Philip Etebong: Data curation, Investigation, Writing – Original draft; Tenderwealth Jackson: Investigation, Validation, Supervision; Edidiong Udofa: Data curation, Writing – Reviewing & Editing.

Declaration of Competing Interest

Moses Ekpenyong was funded by a research grant on drug-drug combination in treatment-enable patients on antiretroviral therapy by the Tertiary Education Trust Fund (TETFund), Nigeria. The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.

Specifications Table

Subject: Health and Medical Sciences
Specific subject area: Adverse Drug Reaction
Type of data: Table, Figure
How data were acquired: Excavation and pre-processing. Instruments: hardware, software, program. Make and model of the instruments used: hardware (Intel HP Core i5 8th Gen), software (Microsoft Excel, JuzzyOnline Fuzzy Toolkit)
Data format: Raw, Analysed, Filtered
Parameters for data collection: Prognostic indicators of HIV were excavated and analysed.
Description of data collection: Data of HIV patients were obtained directly from HIV patients’ records distributed across different health facilities.
Data source location: Institution: University of Uyo; City/Town/Region: Uyo/Akwa Ibom; Country: Nigeria
Data accessibility: With the article
Related research article: [1] M.E. Ekpenyong, M.E. Edoho, I.J. Udo, P.I. Etebong, N.P. Uto, T.C. Jackson, N.M. Obiakor, A transfer learning approach to drug resistance classification in mixed HIV dataset, Informatics in Medicine Unlocked, 100568. https://doi.org/10.1016/j.imu.2021.100568
Review 1.  The HIVdb system for HIV-1 genotypic resistance interpretation.

Authors:  Michele W Tang; Tommy F Liu; Robert W Shafer
Journal:  Intervirology       Date:  2012-01-24       Impact factor: 1.763

2.  Fuzzy-multidimensional deep learning for efficient prediction of patient response to antiretroviral therapy.

Authors:  Moses E Ekpenyong; Philip I Etebong; Tenderwealth C Jackson
Journal:  Heliyon       Date:  2019-07-20