Browsing by Author "Qiu, Jing"
Now showing 1 - 3 of 3
Item: A full Bayesian partition model for identifying hypo- and hyper-methylated loci from single nucleotide resolution sequencing data
(BioMed Central Ltd, 2016-01-11) Wang, Henan; He, Chong; Kushwaha, Garima; Xu, Dong; Qiu, Jing
BACKGROUND: DNA methylation is an epigenetic modification that plays important roles in gene regulation. Whole-genome bisulfite sequencing and reduced representation bisulfite sequencing make DNA methylation data available at single-CpG resolution. The main interest in studying DNA methylation data is to test for methylation differences between two conditions of biological samples. However, the high cost and complexity of these sequencing experiments limit the number of biological replicates, which poses challenges for the development of statistical methods.
RESULTS: Bayesian modeling is well known to borrow strength across the genome, and hence is a powerful tool for high-dimensional, low-sample-size data. To provide accurate identification of methylation loci, especially for low-coverage data, we propose a full Bayesian partition model to detect differentially methylated loci under two conditions of scientific study. Since hypo-methylation and hyper-methylation have distinct biological implications, it is desirable to differentiate these two types of differential methylation. The advantage of our Bayesian model is that it classifies each locus as equal-, hypo-, or hyper-methylated in one step, without further post-hoc analysis. An R package named MethyBayes implementing the proposed full Bayesian partition model will be submitted to the Bioconductor website upon publication of the manuscript.
CONCLUSIONS: The proposed full Bayesian partition model outperforms existing methods in terms of power while maintaining a low false discovery rate, based on simulation studies and real data analysis including bioinformatics analysis.

Item: Hypomethylation coordinates antagonistically with hypermethylation in cancer development: a case study of leukemia
(BioMed Central Ltd, 2016-07-25) Kushwaha, Garima; Dozmorov, Mikhail; Wren, Jonathan D.; Qiu, Jing; Shi, Huidong; Xu, Dong
Background: Methylation changes are frequent in cancers, but their biological significance cannot be understood without knowing how hyper- and hypomethylated region changes coordinate, associate with genomic features, and affect gene expression. The functional significance of hypermethylation is well studied, but understanding of hypomethylation remains limited. Here, with paired expression and methylation samples gathered from a patient/control cohort, we attempt to better characterize the gene expression and methylation changes that take place in cancer, using B-cell chronic lymphocytic leukemia (B-CLL) samples.
Results: Across the dataset, we found that consistent differentially hypomethylated regions (C-DMRs) across samples were relatively few compared to the many poorly consistent hypo- and highly conserved hyper-DMRs. However, genes in the hypo-C-DMRs tended to be associated with functions antagonistic to those in the hyper-C-DMRs, such as differentiation, cell-cycle regulation, and proliferation, suggesting coordinated regulation of methylation changes. Hypo-C-DMRs in B-CLL were enriched in key signaling pathways such as the B cell receptor and p53 pathways, and in genes/motifs essential for B lymphopoiesis. Hypo-C-DMRs tended to be proximal to genes with elevated expression, in contrast to the transcription-silencing mechanism imposed by hypermethylation.
Hypo-C-DMRs tended to be enriched in regions of activating H4K4me1/2/3, H3K79me2, and H3K27ac histone modifications. In comparison, the polycomb repressive complex 2 (PRC2) signature, marked by EZH2, SUZ12, CTCF binding sites, repressive H3K27me3 marks, and "repressed/poised promoter" states, was associated with hyper-C-DMRs. Most hypo-C-DMRs were found in introns (36%), 3' untranslated regions (29%), and intergenic regions (24%). Many of these genic regions also overlapped with enhancers. The methylation of CpGs from 3'UTR exons was found to have a weak but positive correlation with gene expression. In contrast, methylation in the 5'UTR was negatively correlated with expression. To better characterize the overlap between methylation and expression changes, we identified correlation modules that associate with "apoptosis" and "leukocyte activation".
Conclusions: Despite clinical heterogeneity in disease presentation, a number of methylation changes, both hypo and hyper, appear to be common in B-CLL. Hypomethylation appears to play an active, targeted, and complementary role in cancer progression, and it interplays with hypermethylation in a coordinated fashion in the cancer process.

Item: Titanic Machine Learning Study from Disaster
(Department of Applied Economics and Statistics, University of Delaware, Newark, DE., 2020-05) Cao, Emma Yiqin; Xie, Weitao; Dong, Chunzhi; Qiu, Jing
Machine learning plays an important role in the data science field nowadays, and its methods can be used for classification problems. In this project, we are interested in understanding what kinds of people were more likely to survive the sinking of the Titanic, using different machine learning methods.
Different predictors of passenger information were provided, and the survival chance of each passenger was predicted from their covariates using five machine learning methods: conventional logistic regression, random forest, k-nearest neighbor, support vector machine (SVM), and gradient boosting. Grid-search cross-validation was used to calibrate the prediction accuracy of the different methods. The SVM model performs best for our data with nine predictors, with a prediction accuracy of about 83%. The random forest model performs best for our data with six predictors, also with a prediction accuracy of about 83%. We used Python for the whole analysis, including data cleaning, visualization, validation, and modeling.
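
To illustrate the grid-search cross-validation step mentioned in the abstract above, the following is a minimal sketch only, not the authors' code. It assumes synthetic two-feature data in place of the Titanic passenger data, a hand-rolled k-nearest-neighbor classifier in place of whatever library the authors used, and a hypothetical grid over the number of neighbors `k`. All function names (`make_toy_data`, `knn_predict`, `cv_accuracy`, `grid_search`) are invented for this example.

```python
import random
from collections import Counter
from statistics import mean


def make_toy_data(n=120, seed=0):
    """Synthetic two-feature, two-class data (an assumption; NOT the Titanic data)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        center = 2.0 * label  # class 0 centered at (0, 0), class 1 at (2, 2)
        rows.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    rng.shuffle(rows)
    return rows


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    def sq_dist(row):
        (a, b), _ = row
        return (a - x[0]) ** 2 + (b - x[1]) ** 2

    nearest = sorted(train, key=sq_dist)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cv_accuracy(rows, k, n_folds=5):
    """Mean held-out accuracy of k-NN across n_folds folds."""
    fold_scores = []
    for fold in range(n_folds):
        # Every n_folds-th row (offset by `fold`) is held out for testing.
        test = rows[fold::n_folds]
        train = [r for i, r in enumerate(rows) if i % n_folds != fold]
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        fold_scores.append(hits / len(test))
    return mean(fold_scores)


def grid_search(rows, k_grid, n_folds=5):
    """Score every candidate k by cross-validation and return the best one."""
    scores = {k: cv_accuracy(rows, k, n_folds) for k in k_grid}
    best_k = max(scores, key=scores.get)
    return best_k, scores


rows = make_toy_data()
best_k, scores = grid_search(rows, k_grid=[1, 3, 5, 7, 9])
print("best k:", best_k, "cross-validated accuracy:", round(scores[best_k], 3))
```

The same pattern extends to the other classifiers named in the abstract: each gets its own hyperparameter grid, every grid point is scored by cross-validation, and the best-scoring setting per method is then compared across methods. Odd values of `k` are used here so that a two-class vote cannot tie.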