Featured Image
Why is it important?
The Adam format allows fast, efficient and cheap processing of a large number of genetic data. We extend it to include whatever the doctor wants for a complete precision medicine study. This is important for small hospital research labs that can use open source solutions and run the study at low costs.
Perspectives
Read the Original
This page is a summary of: ADAM Genomics Schema - Extension for Precision Medicine Research*, April 2018, ACM (Association for Computing Machinery),
DOI: 10.1145/3194658.3194669.
You can read the full text:
Resources
A Large-scale and Extensible Platform for Precision Medicine Research
The massive adoption of high-throughput genomics, deep sequencing technologies and big data technologies have made possible the era of precision medicine. However, the volume of data and its complexity remain important challenges for precision medicine research, hindering development in this field. The literature on precision medicine research describes a few platforms to support specific types of studies, but none of these offer researchers the level of customization required to meet their specific needs [1].
The Future of Large-Scale Precision Medicine Research Platforms: Preparing the Data for Analysis
Abstract: The extensive adoption of high-throughput genomics, microarray, and deep sequencing technologies has accelerated the possibility of more complex precision medicine research using very large amounts of heterogeneous data [1]. The availability of this data allows data scientists and clinicians to develop tailored individual strategies. Therapeutic and preventive treatments can be proposed, with greater accuracy, targeting subgroups of patients for specific illnesses using large amounts of genomic, clinical, lifestyle, and environment data [2]. Next generation sequencing (NGS) technology is key in supporting precision medicine research; however, the data‘s volume and complexity poses challenges for its clinical application [3]. While Big Data‘s analytics could uncover hidden patterns, new correlations, and other insights through the examination of large-scale data sets, it is still difficult to master [4]. In this paper, we present what is required of future large-scale precision medicine platforms in terms of data extensibility and the scalability of processing on demand. It presents a proposed platform architecture as well as open-source Big Data technologies that would allow to easily enrich a flexible data schema, provide the power needed to load large amounts of data and make this centralized database available for specific precision medicine research.
ADAM Genomics Schema - Extension for Precision Medicine Research
ABSTRACT High-throughput sequencing technologies have made research on precision medicine possible. Precision medicine treatments will be effective for individual patients based on their genomic, environmental, and lifestyle factors. This requires integrating this data to find one, or a combination of, single nucleotide polymorphisms (SNPs) linked to a disease or treatment [1]. In 2013, the University of California Berkeley’s AmpLab created the ADAM genomic format that allows the transformation, analysis and querying of large amounts of genomics data by using a columnar file format. However, while ADAM addresses the issue of processing large genomics data; it lacks the ability to link the patients’ clinical and demographical data, which is crucial in precision medicine research. This paper presents an ADAM genomic schema extension to support clinical and demographical data by automating the addition of data items to the currently available ADAM schema. This extension allows for clinical, demographical and epidemiological analysis at large scale as initially intended by the AmpLab.
Contributors
The following have contributed to this page