Browsing by Author "Crowgey, Erin L."

Item: Applied genomics: development of bioinformatics pipelines for analyzing clinical pediatric genomic data (University of Delaware, 2016)
Crowgey, Erin L.; University of Delaware, Program in Bioinformatics and Systems Biology
The onset and prognosis of several human diseases, such as cancer, are characterized by specific genomic alterations. The sequencing and assembly of the human genome is enabling advancements in personalized medicine, but the process of associating genetic mutations with a specific human disease and treatment is still complex. Recent advancements in DNA sequencing technologies, known as next generation sequencing (NGS), are enabling the detection of many genomic alterations at once. However, a primary limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis. A novel approach was created for analyzing whole exome sequencing (WES) datasets (sequenced on the Illumina platform) from clinical patients diagnosed with a rare Mendelian disease. One of the datasets used to help establish the methodologies was paired-end WES from six patients, plus their family members, with a rare disorder characterized by facio-skeletal abnormalities. Robust bioinformatics pipelines were implemented for trimming, genome alignment, single nucleotide polymorphism (SNP) detection and annotation, and copy number variation detection. Quality control metrics were analyzed at each step of the pipeline to ensure data integrity for clinical applications. The variants were annotated with three custom modules that enable flexible filtering of the variants based upon criteria such as variant quality, inheritance pattern (e.g. dominant, recessive, X-linked), and minor allele frequency. Bioinformatics methodologies were also developed for analyzing NGS data generated from 19 pediatric acute myeloid leukemia patients.
The bioinformatics pipelines developed were focused on single nucleotide variant detection and annotation, combined with genomic insertion/deletion detection and annotation. A list of verified single nucleotide variants was provided with the clinical NGS dataset, and the pipeline was capable of detecting ~94% of the verified single nucleotide variants using a combination of Mutect and Shimmer. The bioinformatics pipeline developed reported high-quality single nucleotide variants that were previously not reported to the Children’s Oncology Group. Furthermore, detection and analysis of an internal tandem duplication (ITD) in FLT3, a known clinically relevant mutation in pediatric AML associated with poor prognosis, was conducted using Pindel. The ITD, which had previously been detected only by PCR and electrophoresis, was detected in the NGS data of 5 of the 6 patients. Collectively, this dissertation project provides a unique method for prioritizing and visualizing genomic variants using functional annotations, including gene ontologies and pathway enrichment strategies.

Item: DNA Methylation Analysis Reveals Distinct Patterns in Satellite Cell–Derived Myogenic Progenitor Cells of Subjects with Spastic Cerebral Palsy (Journal of Personalized Medicine, 2022-11-30)
Robinson, Karyn G.; Marsh, Adam G.; Lee, Stephanie K.; Hicks, Jonathan; Romero, Brigette; Batish, Mona; Crowgey, Erin L.; Shrader, M. Wade; Akins, Robert E.
Spastic type cerebral palsy (CP) is a complex neuromuscular disorder that involves altered skeletal muscle microanatomy and growth, but little is known about the mechanisms contributing to muscle pathophysiology and dysfunction. Traditional genomic approaches have provided limited insight regarding disease onset and severity, but recent epigenomic studies indicate that DNA methylation patterns can be altered in CP.
Here, we examined whether a diagnosis of spastic CP is associated with intrinsic DNA methylation differences in myoblasts and myotubes derived from muscle resident stem cell populations (satellite cells; SCs). Twelve subjects were enrolled (6 CP; 6 control) with informed consent/assent. Skeletal muscle biopsies were obtained during orthopedic surgeries, and SCs were isolated and cultured to establish patient–specific myoblast cell lines capable of proliferation and differentiation in culture. DNA methylation analyses indicated significant differences at 525 individual CpG sites in proliferating SC–derived myoblasts (MB) and 1774 CpG sites in differentiating SC–derived myotubes (MT). Of these, 79 CpG sites were common to both culture types. The distribution of differentially methylated 1 Mbp chromosomal segments indicated distinct regional hypo– and hyper–methylation patterns, and significant enrichment of differentially methylated sites on chromosomes 12, 13, 14, 15, 18, and 20. Average methylation load across 2000 bp regions flanking transcriptional start sites was significantly different in 3 genes in MBs and 10 genes in MTs. SC–derived MBs isolated from study participants with spastic CP exhibited fundamental differences in DNA methylation compared to controls at multiple levels of organization that may reveal new targets for studies of mechanisms contributing to muscle dysregulation in spastic CP.

Item: Machine learning classifier approaches for predicting response to RTK-type-III inhibitors demonstrate high accuracy using transcriptomic signatures and ex vivo data (Bioinformatics Advances, 2023-03-24)
Ferrato, Mauricio H.; Marsh, Adam G.; Franke, Karl R.; Huang, Benjamin J.; Kolb, E. Anders; DeRyckere, Deborah; Grahm, Douglas K.; Chandrasekaran, Sunita; Crowgey, Erin L.
Motivation: The application of machine learning (ML) techniques in the medical field has demonstrated both successes and challenges in the precision medicine era.
The ability to accurately classify a subject as a potential responder versus a nonresponder to a given therapy is still an active area of research, pushing the field to create new approaches for applying machine-learning techniques. In this study, we leveraged publicly available data through the BeatAML initiative. Specifically, we used gene count data, generated via RNA-seq, from 451 individuals matched with ex vivo data generated from treatment with RTK-type-III inhibitors. Three feature selection techniques were tested: principal component analysis, the Shapley Additive Explanation (SHAP) technique, and differential gene expression analysis. Each was combined with three different classifiers: XGBoost, LightGBM, and random forest (RF). Sensitivity versus specificity was analyzed using the area under the receiver operating characteristic (ROC) curve (AUC) for every model developed.
Results: Our work demonstrated that the feature selection technique, rather than the classifier, had the greatest impact on model performance. The SHAP technique outperformed the other feature selection techniques and was able to predict outcome response with high accuracy; the highest-performing model, for foretinib, achieved an AUC of 89% using the SHAP technique and the RF classifier. Our ML pipelines demonstrate that at the time of diagnosis, a transcriptomics signature exists that can potentially predict response to treatment, demonstrating the potential of using ML applications in precision medicine efforts.
Availability and implementation: https://github.com/UD-CRPL/RCDML
Supplementary information: Supplementary data are available at Bioinformatics Advances online at: https://doi.org/10.1093/bioadv/vbad034
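
The last abstract above compares models by the area under the ROC curve. As a minimal sketch of what that metric measures (this is not the authors' code; the function name `roc_auc`, the 1/0 label encoding, and the toy data are assumptions made for illustration), the AUC can be computed as the fraction of responder/nonresponder pairs in which the responder receives the higher score, with ties counted as half:

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for a responder, 0 for a nonresponder (hypothetical encoding).
    scores: classifier scores, where higher means "more likely a responder".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # responder ranked above nonresponder
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The nested loop is quadratic in the number of samples, which is fine for a small illustration; for a cohort of several hundred subjects a rank-based implementation or an established library routine would be the more practical choice.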