Abstract
Historical clinical trial registry data can only be retrieved by manually accessing individual clinical trials through registry websites. This limits the feasibility, accuracy and reproducibility of certain kinds of research on clinical trial activity and presents challenges to the transparency of the enterprise of human research. This paper presents cthist, a novel, free and open source R package that enables automated scraping of clinical trial registry entry histories and returns structured data for analysis. Documentation of the implementation of the package cthist is provided, as well as 3 brief case studies with example code.
Year: 2022 PMID: 35776915 PMCID: PMC9249399 DOI: 10.1371/journal.pone.0270909
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Cascading Style Sheets (CSS) selectors and regular expressions used by clinicaltrials_gov_version() and drks_de_version() to identify HTML elements on ClinicalTrials.gov and DRKS.de, respectively, and the text to be extracted from them.
| Data point | ClinicalTrials.gov: CSS | ClinicalTrials.gov: Regular expression | DRKS.de: CSS | DRKS.de: Regular expression |
|---|---|---|---|---|
| Overall status | #StudyStatusBody | Overall Status: ([A-Za-z, ]+) | - | - |
| Recruitment status | - | - | li.state | Recruitment Status: ([A-Za-z, -]+) |
| Enrolment | #StudyDesignBody | Enrollment: ([A-Za-z0-9 \\[\\]]+) | li.targetSize | [0-9]+ |
| Enrolment type | - | - | li.targetSize | Planned/Actual: ([A-Za-z]+) |
| Start date | #StudyStatusBody | Study Start: ([A-Za-z0-9, ]+) | li.schedule | [0-9]{4}/[0-9]{2}/[0-9]{2} |
| Primary completion date | #StudyStatusBody | Primary Completion: ([A-Za-z0-9, \\[\\]]+) | - | - |
| Primary completion date type | #StudyStatusBody | (\\[[A-Za-z]+\\]) | - | - |
| Closing date | - | - | li.deadline | [0-9]{4}/[0-9]{2}/[0-9]{2} |
| Minimum age | #EligibilityBody | Minimum Age: ([0-9]+) Years | li.minAge | Minimum Age: ([A-Za-z0-9 ]+) |
| Maximum age | #EligibilityBody | Maximum Age: ([0-9]+) Years | li.maxAge | Maximum Age: ([A-Za-z0-9 ]+) |
| Sex | #EligibilityBody | Sex: ([A-Za-z]+) | - | - |
| Gender | - | - | li.gender | Gender: ([A-Za-z ]+) |
| Gender based | #EligibilityBody | Gender Based: ([A-Za-z]+) | - | - |
| Accepts healthy volunteers | #EligibilityBody | Accepts Healthy Volunteers: ([A-Za-z]+) | - | - |
| Inclusion criteria | #EligibilityBody | ** | - | - |
| Additional inclusion criteria | - | - | .inclusionAdd | ** |
| Exclusion criteria | - | - | .exclusion | ** |
| Primary outcomes | - | - | p.primaryEndpoint | ** |
| Secondary outcomes | - | - | p.secondaryEndpoints | ** |
| Outcome measures | #ProtocolOutcomeMeasuresBody | ** | - | - |
| Contacts | #ContactsLocationsBody | ** | ul.addresses li.address | ** |
| Sponsors | #SponsorCollaboratorsBody | ** | - | - |
Asterisks (**) indicate table data that are parsed and encoded as JSON rather than extracted with a simple regular expression. A single hyphen (-) indicates that the data point is not available for download from that clinical trial registry.
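
The two-step pattern that the table describes (select an HTML element, then apply a regular expression to its text) can be illustrated with a minimal sketch. This is not the cthist implementation, which is written in R. It is a Python illustration using only the standard library, and the sample HTML, the class `TextById`, the helpers `text_of` and `extract_fields`, and the `FIELDS` mapping are all hypothetical names introduced here. The element ids and patterns are taken from the ClinicalTrials.gov columns of the table, written with ASCII hyphens in the character classes and with single backslashes, as Python raw strings require.

```python
import re
from html.parser import HTMLParser


class TextById(HTMLParser):
    """Collect the text inside elements carrying a given id attribute.

    Simplification: void tags such as <br> written without a closing
    slash are not handled, so the sample HTML below avoids them.
    """

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # greater than 0 while inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


# Hypothetical page fragment; the real registry markup may differ.
SAMPLE_HTML = """
<div id="StudyStatusBody">
  <p>Primary Completion: June 30, 2021 [Anticipated]</p>
</div>
<div id="StudyDesignBody">
  <p>Enrollment: 120 [Actual]</p>
</div>
<div id="EligibilityBody">
  <p>Minimum Age: 18 Years</p>
  <p>Sex: All</p>
</div>
"""

# Field name -> (element id from the CSS column, pattern from the regex column).
FIELDS = {
    "primary_completion_date": ("StudyStatusBody", r"Primary Completion: ([A-Za-z0-9, \[\]]+)"),
    "primary_completion_date_type": ("StudyStatusBody", r"(\[[A-Za-z]+\])"),
    "enrolment": ("StudyDesignBody", r"Enrollment: ([A-Za-z0-9 \[\]]+)"),
    "min_age": ("EligibilityBody", r"Minimum Age: ([0-9]+) Years"),
    "sex": ("EligibilityBody", r"Sex: ([A-Za-z]+)"),
}


def text_of(html, element_id):
    """Return the text of the element with the given id, line breaks preserved."""
    parser = TextById(element_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


def extract_fields(html):
    """Apply each pattern to its element's text; None marks a missing data point."""
    record = {}
    for name, (element_id, pattern) in FIELDS.items():
        match = re.search(pattern, text_of(html, element_id))
        record[name] = match.group(1).strip() if match else None
    return record


record = extract_fields(SAMPLE_HTML)
print(record)
```

Line breaks in the element text are kept on purpose: none of the character classes include a newline, so each capture stops at the end of its own line instead of running into the next label.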