Helena Ellis, Mary-Beth Joshi, Aenoch J Lynn, Anita Walden.
Abstract
Biobanking at Duke University has existed for decades and has grown over time in silos and based on specialized needs, as is true with most biomedical research centers. These silos developed informatics systems to support their own individual requirements, with no regard for semantic or syntactic interoperability. Duke undertook an initiative to implement an enterprise-wide biobanking information system to serve its many diverse biobanking entities. A significant part of this initiative was the development of a common terminology for use in the commercial software platform. Common terminology provides the foundation for interoperability across biobanks for data and information sharing. We engaged experts in research, informatics, and biobanking through a consensus-driven process to agree on 361 terms and their definitions that encompass the lifecycle of a biospecimen. Existing standards, common terms, and data elements from published articles provided a foundation on which to build the biobanking terminology; a broader set of stakeholders then provided additional input and feedback in a secondary vetting process. The resulting standardized biobanking terminology is now available for sharing with the biobanking community to serve as a foundation for other institutions who are considering a similar initiative.
Keywords: biobank; biorepository; data elements; interoperability; standards; terminology
Year: 2017 PMID: 28338350 PMCID: PMC5397220 DOI: 10.1089/bio.2016.0092
Source DB: PubMed Journal: Biopreserv Biobank ISSN: 1947-5543 Impact factor: 2.300
Three Important Reasons to Use a Common Terminology
| 1. | Searching for appropriate samples across the legacy biobanks was difficult at best and impossible at worst, without the use of a common terminology. |
| 2. | Reporting was identified as a critical requirement by principal investigators, biobank managers, and sponsors. The disparate and nonstandard terminology in the legacy systems had already been proven to be an impediment to querying and reporting across existing biospecimen databases. |
| 3. | The system would be centrally supported by a team that provides training, data migration, and ongoing support; thus, the data captured by the legacy biobanks needed to be standardized. |

Diagram of project organization and leadership.

Lifecycle of the biospecimen as defined by the Biospecimen Research Network of the National Cancer Institute (reprinted with permission).
Scope of Work for Each of the Five Working Groups
| 1. Sample collection and storage: data elements related to the collection and storage of biological material | a. Collection event information (dates, times, temperatures, study site, physical position, etc.) |
| | b. Collected material information (collection procedure, sample type, body site, quantities, etc.) |
| | c. Material acquisition information (container, identifiers, participant demographics, shipping information, etc.) |
| | d. Accessioning information (biobank identification, sign in, storage units, temperatures, freezer locations, etc.) |
| | e. Material handling information (handling instructions, quantities, dates, temperatures, times, quantities, etc.) |
| 2. Tracking and nonchemical sample processing: data elements related to nonchemical handling of collected biological material (e.g., separation into smaller units), tracking material in and out of the biobank, and storage device monitoring | a. Biobank location, building, room, and personnel for the biobank |
| | b. Handling and nonchemical processing information (identifiers, aliases, barcodes, methods, and procedures, etc.) |
| | c. Storage unit information (temperature/time logs, make, model, repair history, monitoring, and asset number, etc.) |
| | d. Material storage information (storage unit type, storage unit position, and storage conditions and temperature, etc.) |
| | e. Study/protocol descriptors and information (IRB number and status, consent status, title, principal investigator, material use restrictions, data collection parameters, and data sharing restrictions, etc.) |
| 3. Chemical handling and derivatives: data elements associated with chemical handling, manipulation, and production of derivatives and products | a. Stabilization information |
| | b. Derivative types (RNA, DNA, protein, and IHC, etc.) |
| | c. Bench-top protocols utilized (methods for extraction and detection) |
| | d. Chemical handling information (kit types, lot numbers, and method names, etc.) |
| | e. Concentration and quality metrics (units, methods) |
| 4. Complex data: data elements associated with complex data, such as “omics” type analyses and resulting data | a. Data types available (SNP, gene array, sequencing, and raw data vs. normalized data, etc.) |
| | b. Methodology/platform information (chip type, etc.) |
| | c. Analyses information (type of analyses and dates performed, etc.) |
| | d. Location/link to file and size of file |
| | e. Data describing primary and secondary data |
| | f. Analysis techniques and processes and how results are stored |
| 5. Clinical data: data elements related to clinical outcomes and demographics | a. Standard of care data |
| | b. Clinical laboratory data |
| | c. Diagnoses |
| | d. Disease stage |
| | e. Clinical follow-up/survival information (date of death, last contact, or disease recurrence, etc.) |
| | f. Detailed demographics (smoking history and marital status, etc.) |
Authoritative Sources Used for Data Elements
| The NCI Thesaurus | A collection of curated terms, definitions, and synonyms of primarily cancer-related biomedical concepts that are used by NCI projects, researchers, and collaborators to promote semantic interoperability | While many terms and definitions were adopted from the NCI Thesaurus, it was deemed too cancer centric for wholesale adoption when considering the needs of Duke's noncancer researchers. |
| The NCI's caDSR | An ISO 11179 metadata repository for common data elements used in clinical research. Researchers can query the caDSR for common data elements to help build case report forms that would be consistent and comparable with previous research | Most of the terms and definitions overlapped with the NCI Thesaurus, so this resource was not used heavily. It also was cancer centric, but it was useful for defining permissible values for a limited number of terms. |
| The NCI's CBM | A data model to help facilitate sharing of biospecimen resources. The CBM focuses on metadata about biospecimen resources related to the samples and participants, and contains yes/no indicators about sample annotation and sample availability | This data model was already in use at Duke in a software tool designed to “advertise” biospecimen resources; hence, it was critical that these data elements and definitions were incorporated into the terminology. |
| Commercial BIMS Software | The out-of-the-box terms that came with the inherent functionality of the BIMS | The Biobanking Data Element Standardization Project was well under way when a commercial BIMS was identified and purchased, after which the product's out-of-the-box terms were incorporated. |
| Legacy inventory systems | Data elements in use in institutional legacy inventory systems. Legacy systems included in-house developed databases and other commercial inventory systems | Each biobank that planned to use the BIMS also participated in the terminology effort and provided a list of data elements from their existing systems. Definitions were established together since they were not necessarily readily available. |
| ISBER's Best Practices for Repositories | A glossary provided by ISBER related to their published biobanking best practices | Provided some basic terms and definitions related to the foundations of biobanking. |
| The NCI Best Practices for Biospecimen Resources | A glossary provided by both the NCI and ISBER related to their published biobanking best practices | Provided some basic terms and definitions related to the foundations of biobanking. |
| IRB website | The IRB website serves as a resource for Duke researchers regarding policies and procedures | Provided terms and concepts specifically related to research approvals, policies, and informed consent requirements. |
| BRISQ | A list of data elements that represent factors believed to influence biospecimen quality and should be considered for reporting | Very relevant and specific data elements related to biobanking science and sample quality |
| Important preanalytical variables defined by CAP | Variables that may affect the quality and/or value of a biospecimen from the time of consenting until the biospecimen is banked or used for testing | Very relevant variables related to biobanking science and sample quality |
| MIABIS | Set of 52 attributes defined as the minimum data set for biobanks and studies using human biospecimens that describe a biobank's content | Relevant data elements related to metadata and information needed for sharing samples |
BIMS, biobanking information management system; BRISQ, Biospecimen Reporting for Improved Study Quality; caDSR, Cancer Data Standards Registry and Repository; CAP, College of American Pathologists; CBM, Common Biorepository Model; IRB, Institutional Review Board; MIABIS, Minimum Information About BIobank data Sharing; NCI, National Cancer Institute.

Data element development process.
Categories of Data Elements
| 1. | Clinical annotation | Clinical data and information related to the participant that is important for selection of a sample for downstream use | 56 |
| 2. | Informed consent | Data elements related to the process of informed consent | 13 |
| 3. | Study administration | Data elements related to management of a biobanking study | 60 |
| 4. | Package | Data elements related to shipping and distribution of samples | 24 |
| 5. | Participant | Data elements related to a consented individual who is participating in a research study | 55 |
| 6. | Samples | Data elements related to biological material | 113 |
| 7. | Storage | Data elements related to storage of samples in a biobank | 40 |
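As a quick consistency check, the per-category counts in the table above can be tallied against the 361 terms reported in the abstract. The following is only a minimal sketch: the name `CATEGORY_COUNTS` is ours for illustration and is not part of the Duke terminology or the BIMS.

```python
# Number of data elements per category, copied from the table above.
# The dictionary name is hypothetical; it is not an artifact of the project.
CATEGORY_COUNTS = {
    "Clinical annotation": 56,
    "Informed consent": 13,
    "Study administration": 60,
    "Package": 24,
    "Participant": 55,
    "Samples": 113,
    "Storage": 40,
}

# Sum the seven categories to compare with the total stated in the abstract.
total_terms = sum(CATEGORY_COUNTS.values())
print(total_terms)  # 361
```

The seven categories account for all 361 terms, so the table appears to partition the terminology with no term counted twice.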
Example of Multiple Definitions for Single-Term Sample
| Sample | A single unit of biological material (noun) | Sample |
| Sample | Several units of biological material collected at the same time from one participant (noun) | Sample set |
| Sample | A set of different units of biological material that reflect parent/child relationships (noun) | Sample family |
| Sample | The participant from whom the biological material was collected (noun) | Participant |
| Sample | To collect biological material from a participant (verb) | Collect |
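The disambiguation in the table above can be pictured as a lookup from one overloaded legacy term to several candidate terms. The sketch below is illustrative only and makes two assumptions: that the third column of the table holds the standardized term chosen for each sense (the header row did not survive), and that a simple in-memory mapping is an adequate stand-in. The names `Sense`, `LEGACY_SENSES`, `candidate_terms`, and `is_ambiguous` are hypothetical and do not come from the BIMS.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Sense:
    """One meaning of a legacy term and the term assumed to replace it."""
    definition: str
    part_of_speech: str
    standard_term: str  # assumption: third column of the table above


# Rows copied from the table above, keyed by the lowercased legacy term.
LEGACY_SENSES: Dict[str, List[Sense]] = {
    "sample": [
        Sense("A single unit of biological material", "noun", "Sample"),
        Sense("Several units of biological material collected at the same "
              "time from one participant", "noun", "Sample set"),
        Sense("A set of different units of biological material that reflect "
              "parent/child relationships", "noun", "Sample family"),
        Sense("The participant from whom the biological material was "
              "collected", "noun", "Participant"),
        Sense("To collect biological material from a participant", "verb",
              "Collect"),
    ],
}


def candidate_terms(legacy_term: str) -> List[str]:
    """Return the candidate standardized terms for a legacy term, in table order."""
    senses = LEGACY_SENSES.get(legacy_term.strip().lower(), [])
    return [sense.standard_term for sense in senses]


def is_ambiguous(legacy_term: str) -> bool:
    """A legacy term is ambiguous if it maps to more than one distinct term."""
    return len(set(candidate_terms(legacy_term))) > 1


print(candidate_terms("Sample"))
print(is_ambiguous("Sample"))
```

A lookup like this only lists the candidates; choosing among them for a given legacy record still depends on context, which is why the definitions were agreed on by consensus rather than derived mechanically.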