Department of Computer and Information Sciences
For more information, see the Department of Computer and Information Sciences website.
Browsing Department of Computer and Information Sciences by Title
Item A comprehensive analysis of the integration of team research between sport psychology and management (Psychology of Sport and Exercise, 2020-06-13) Emich, Kyle J.; Norder, Kurt; Lu, Li; Sawhney, Aman
Both sports and organizations rely on teams. As such, the sport psychology and management literatures have contributed greatly to our understanding of team functioning. Despite this, previous reviews based on subsets of articles in these literatures indicate a lack of communication between them. In this article, we assess the state of integration between the entirety of the sport psychology and management literatures on teams by considering the full set of interconnected team articles in the SCOPUS database (6974 articles over 69 years). We use these data to conduct a combination of citation network analysis and content analysis via topic modeling to evaluate conceptual integration. The data show that interdisciplinary discussion between these two fields is lacking, particularly regarding the integration of sport psychology into management research. Whereas 7% of references to team articles in sport psychology come from management journals, only 0.6% of team references in management journals come from sport psychology. Despite this, longitudinal analysis indicates that in the last 10 years the rate of integration between these fields is increasing. We identify specific topics that have accounted for this integration and suggest topics ripe for future integration.

Item A Game-Theoretic Approach to Energy-Efficient Elevator Scheduling in Smart Buildings (IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2023-02-22) Maleki, Erfan Farhangi; Bhatta, Dixit; Mashayekhy, Lena
Buildings, which produce a larger carbon footprint than the transportation sector, account for a significant portion of the United States’ total energy consumption.
By designing modern automation techniques, smart buildings can significantly reduce energy consumption, protect the environment, and consequently improve quality of life. This article focuses on the automation of elevator scheduling, which is an NP-Hard problem, to reduce energy usage in smart buildings and improve users’ quality of experience. We propose an optimal mathematical model for the elevator scheduling problem using integer programming. We then propose a novel game-theoretic approach that captures interactions within the elevator system to reduce energy consumption and enhance user experience. We propose a request coalition formation game, where nonoverlapping coalitions of user requests are served by elevators to minimize their movements and energy consumption while reducing service time and stops for users. We analyze the performance of our proposed approach using the optimal solution as a benchmark and Nearest Car and Fixed Sectoring algorithms as rivals. The experiments show that our approach is highly efficient in terms of energy consumption and service time, making it suitable for smart buildings.

Item A long-term high-fat diet induces differential gene expression changes in spatially distinct adipose tissue of male mice (Physiological Genomics, 2024-11-11) Alradi, Malak; Askari, Hassan; Shaw, Mark; Bhavsar, Jaysheel D.; Kingham, Brewster F.; Polson, Shawn W.; Fancher, Ibra S.
The accumulation of visceral adipose tissue (VAT) is strongly associated with cardiovascular disease and diabetes. In contrast, individuals with increased subcutaneous adipose tissue (SAT) without corresponding increases in VAT are associated with a metabolically healthy obese phenotype. These observations implicate dysfunctional VAT as a driver of disease processes, warranting investigation into obesity-induced alterations of distinct adipose depots.
To determine the effects of obesity on adipose gene expression, male mice (n = 4) were fed a high-fat diet to induce obesity or a normal laboratory diet (lean controls) for 12–14 mo. Mesenteric VAT and inguinal SAT were isolated for bulk RNA sequencing. AT from lean controls served as a reference for obesity-induced changes. The long-term high-fat diet induced the expression of 169 and 814 unique genes in SAT and VAT, respectively. SAT from obese mice exhibited 308 differentially expressed genes (164 upregulated and 144 downregulated). VAT from obese mice exhibited 690 differentially expressed genes (262 upregulated and 428 downregulated). KEGG pathway and GO analyses revealed that metabolic pathways were upregulated in SAT versus downregulated in VAT while inflammatory signaling was upregulated in VAT. We next determined common genes that were differentially regulated between SAT and VAT in response to obesity and identified four genes that exhibited this profile: elovl6 and kcnj15 were upregulated in SAT/downregulated in VAT while trdn and hspb7 were downregulated in SAT/upregulated in VAT. We propose that these genes in particular should be further pursued to determine their roles in SAT versus VAT with respect to obesity.
NEW & NOTEWORTHY: A long-term high-fat diet induced the expression of more than 980 unique genes across subcutaneous adipose tissue (SAT) and visceral adipose tissue (VAT). The high-fat diet also induced the differential expression of nearly 1,000 AT genes.
We identified four genes that were oppositely expressed in SAT versus VAT in response to the high-fat diet and propose that these genes in particular may serve as promising targets aimed at resolving VAT dysfunction in obesity.

Item A scoping review of the use of lab streaming layer framework in virtual and augmented reality research (Virtual Reality, 2023-05-02) Wang, Qile; Zhang, Qinqi; Sun, Weitong; Boulay, Chadwick; Kim, Kangsoo; Barmaki, Roghayeh Leila
The use of multimodal data offers excellent opportunities for human–computer interaction research and novel techniques regarding virtual and augmented reality (VR/AR) experiences. Collecting, coordinating, and synchronizing a large amount of data from multiple VR/AR hardware while maintaining a high framerate can be a daunting task, despite the compelling nature of multimodal data. The Lab Streaming Layer (LSL) is an open-source framework that enables the synchronous collection of various types of multimodal data, unlike existing expensive alternatives. However, despite its potential, this framework has not been fully adopted by the VR/AR research community. In this paper, we present a guideline of the LSL framework’s use in VR/AR research as well as report current trends by performing a comprehensive literature review on the subject. We extract 549 publications using LSL from January 2015 to March 2022. We analyze types of data, displays, and targeted application areas.
We provide in-depth reviews of 38 selected papers and describe the use of LSL in the VR/AR research community while highlighting benefits, challenges, and future opportunities.

Item A Simple Mobile Plausibly Deniable System Using Image Steganography and Secure Hardware (Proceedings of the 2024 ACM Workshop on Secure and Trustworthy Cyber-Physical Systems, 2024-06-19) Xia, Lichen; Liao, Jinghui; Chen, Niusen; Chen, Bo; Shi, Weisong
Traditional encryption methods cannot defend against coercive attacks in which the adversary captures both the user and the possessed computing device, and forces the user to disclose the decryption keys. Plausibly deniable encryption (PDE) has been designed to defend against this strong coercive attacker. At its core, PDE allows the victim to plausibly deny the very existence of hidden sensitive data and the corresponding decryption keys upon being coerced. Designing an efficient PDE system for a mobile platform, however, is challenging due to various design constraints bound to the mobile systems. Leveraging image steganography and the built-in hardware security feature of mobile devices, namely TrustZone, we have designed a Simple Mobile Plausibly Deniable Encryption (SMPDE) system which can combat coercive adversaries and, meanwhile, is able to overcome unique design constraints. In our design, the encoding/decoding process of image steganography is bound together with Arm TrustZone. In this manner, the coercive adversary will be given a decoy key, which can only activate a DUMMY trusted application that will instead sanitize the sensitive information stored hidden in the stego-image upon decoding. In contrast, the actual user can be given the true key, which can activate the PDE trusted application that can really extract the sensitive information from the stego-image upon decoding.
Security analysis and experimental evaluation justify both the security and the efficiency of our design.

Item An Efficient Approach to Predict Eye Diseases from Symptoms Using Machine Learning and Ranker-Based Feature Selection Methods (Bioengineering, 2022-12-24) Marouf, Ahmed Al; Mottalib, Md Mozaharul; Alhajj, Reda; Rokne, Jon; Jafarullah, Omar
The eye is generally considered to be the most important sensory organ of humans. Diseases and other degenerative conditions of the eye are therefore of great concern as they affect the function of this vital organ. With proper early diagnosis by experts and with optimal use of medicines and surgical techniques, these diseases or conditions can in many cases be either cured or greatly mitigated. Experts that perform the diagnosis are in high demand and their services are expensive, hence the appropriate identification of the cause of vision problems is either postponed or not done at all such that corrective measures are either not done or done too late. An efficient model to predict eye diseases using machine learning (ML) and ranker-based feature selection (r-FS) methods is therefore proposed which will aid in obtaining a correct diagnosis. The aim of this model is to automatically predict one or more of five common eye diseases, namely Cataracts (CT), Acute Angle-Closure Glaucoma (AACG), Primary Congenital Glaucoma (PCG), Exophthalmos or Bulging Eyes (BE) and Ocular Hypertension (OH). We have used efficient data collection methods, data annotations by professional ophthalmologists, applied five different feature selection methods, two types of data splitting techniques (train-test and stratified k-fold cross validation), and applied nine ML methods for the overall prediction approach.
While applying ML methods, we have chosen suitable classic ML methods, such as Decision Tree (DT), Random Forest (RF), Naive Bayes (NB), AdaBoost (AB), Logistic Regression (LR), k-Nearest Neighbour (k-NN), Bagging (Bg), Boosting (BS) and Support Vector Machine (SVM). We have performed a symptomatic analysis of the prominent symptoms of each of the five eye diseases. The results of the analysis and comparison between methods are shown separately. While comparing the methods, we have adopted traditional performance indices, such as accuracy, precision, sensitivity, F1-Score, etc. Finally, SVM outperformed other models obtaining the highest accuracy of 99.11% for 10-fold cross-validation and LR obtained 98.58% for the split ratio of 80:20.

Item Automated Identification of Uniqueness in JUnit Tests (ACM Transactions on Software Engineering and Methodology, 2022-05-24) Wu, Jianwei; Clause, James
In the context of testing, descriptive test names are desirable because they document the purpose of tests and facilitate comprehension tasks during maintenance. Unfortunately, prior work has shown that tests often do not have descriptive names. To address this limitation, techniques have been developed to automatically generate descriptive names. However, they often generate names that are invalid or do not meet with developer approval. To help address these limitations, we present a novel approach to extract the attributes of a given test that make it unique among its siblings. Because such attributes often serve as the basis for descriptive names, identifying them is an important first step towards improving test name generation approaches. To evaluate the approach, we created a prototype implementation for JUnit tests and compared its output with human judgment.
The results of the evaluation demonstrate that the attributes identified by the approach are consistent with human judgment and are likely to be useful for future name generation techniques.

Item A Bifactor Approximation Algorithm for Cloudlet Placement in Edge Computing (IEEE Transactions on Parallel and Distributed Systems, 2021-11-15) Bhatta, Dixit; Mashayekhy, Lena
Emerging applications with low-latency requirements such as real-time analytics, immersive media applications, and intelligent virtual assistants have rendered Edge Computing as a critical computing infrastructure. Existing studies have explored the cloudlet placement problem in a homogeneous scenario with different goals such as latency minimization, load balancing, energy efficiency, and placement cost minimization. However, placing cloudlets in a highly heterogeneous deployment scenario considering the next-generation 5G networks and IoT applications is still an open challenge. The novel requirements of these applications indicate that there is still a gap in ensuring low-latency service guarantees when deploying cloudlets. Furthermore, deploying cloudlets in a cost-effective manner and ensuring full coverage for all users in edge computing are other critical conflicting issues. In this article, we address these issues by designing a bifactor approximation algorithm to solve the heterogeneous cloudlet placement problem to guarantee a bounded latency and placement cost, while fully mapping user applications to appropriate cloudlets. We first formulate the problem as a multi-objective integer programming model and show that it is NP-hard. We then propose a bifactor approximation algorithm, ACP, to tackle its intractability. We investigate the effectiveness of ACP by performing extensive theoretical analysis and experiments on multiple deployment scenarios based on New York City OpenData.
We prove that ACP provides a (2,4)-approximation ratio for the latency and the placement cost. The experimental results show that ACP obtains near-optimal results in a polynomial running time, making it suitable for both short-term and long-term cloudlet placement in heterogeneous deployment scenarios.

Item BioC-compatible full-text passage detection for protein-protein interactions using extended dependency graph (Oxford University Press, 4/12/16) Peng, Yifan; Arighi, Cecilia; Wu, Cathy H.; Vijay-Shanker, K.
There has been a large growth in the number of biomedical publications that report experimental results. Many of these results concern detection of protein-protein interactions (PPI). In BioCreative V, we participated in the BioC task and developed a PPI system to detect text passages with PPIs in the full-text articles. By adopting the BioC format, the output of the system can be seamlessly added to the biocuration pipeline with little effort required for the system integration. A distinctive feature of our PPI system is that it utilizes the extended dependency graph, an intermediate level of representation that attempts to abstract away syntactic variations in text. As a result, we are able to use only a limited set of rules to extract PPI pairs in the sentences, and additional rules to detect additional passages for PPI pairs. For evaluation, we used the 95 articles that were provided for the BioC annotation task. We retrieved the unique PPIs from the BioGRID database for these articles and show that our system achieves a recall of 83.5%. In order to evaluate the detection of passages with PPIs, we further annotated Abstract and Results sections of 20 documents from the dataset and show that an f-value of 80.5% was obtained. To evaluate the generalizability of the system, we also conducted experiments on AIMed, a well-known PPI corpus.
We achieved an f-value of 76.1% for sentence detection and an f-value of 64.7% for unique PPI detection.

Item BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID (Oxford University Press, 8/2/16) Kim, Sun; Dogan, Rezarta Islamaj; Chatr-Aryamontri, Andrew; Chang, Christie S.; Oughtred, Rose; Rust, Jennifer; Batista-Navarro, Riza; Carter, Jacob; Ananiadou, Sophia; Matos, Sergio; Santos, Andre; Campos, David; Oliveira, Jose Luis; Singh, Onkar; Jonnagaddala, Jitendra; Dai, Hong-Jie; Su, Emily Chia-Yu; Chang, Yung-Chun; Su, Yu-Chen; Chu, Chun-Han; Chen, Chien Chin; Hsu, Wen-Lian; Peng, Yifan; Arighi, Cecilia; Wu, Cathy H.; Vijay-Shanker, K.; Aydin, Ferhat; Husunbeyi, Zehra Melce; Ozgur, Arzucan; Shin, Soo-Yong; Kwon, Dongseop; Dolinski, Kara; Tyers, Mike; Wilbur, W. John; Comeau, Donald C.
BioC is a simple XML format for text, annotations and relations, and was developed to achieve interoperability for biomedical text processing. Following the success of BioC in BioCreative IV, the BioCreative V BioC track addressed a collaborative task to build an assistant system for BioGRID curation. In this paper, we describe the framework of the collaborative BioC task and discuss our findings based on the user survey.
This track consisted of eight subtasks including gene/protein/organism named entity recognition, protein-protein/genetic interaction passage identification and annotation visualization. Using BioC as their data-sharing and communication medium, nine teams worldwide participated and contributed either new methods or improvements of existing tools to address different subtasks of the BioC track. Results from different teams were shared in BioC and made available to other teams as they addressed different subtasks of the track. In the end, all submitted runs were merged using a machine learning classifier to produce an optimized output. The biocurator assistant system was evaluated by four BioGRID curators in terms of practical usability. The curators' feedback was overall positive and highlighted the user-friendly design and the convenient gene/protein curation tool based on text mining.

Item Communication-Constrained Routing and Traffic Control: A Framework for Infrastructure-Assisted Autonomous Vehicles (IEEE Transactions on Intelligent Transportation Systems, 2022-09-07) Liu, Guangyi; Salehi, Seyedmohammad; Bala, Erdem; Shen, Chien-Chung; Cimini, Leonard J.
With the increasing demand for advanced autonomous driving, the available communication resources may become constrained over different geographic areas. In addition, due to dynamic channel variations and imperfect cell deployments, guaranteeing the required communication resources for data-hungry and delay-sensitive applications in autonomous vehicles (AVs), along their entire trips, becomes challenging. To address these issues, the paper investigates the feasibility of a hybrid system-optimum and user-equilibrium AV traffic framework subject to communication constraints, as well as its performance gain.
Within such a framework, the paper introduces the problems of communication-constrained routing (CCR) and traffic control (CCTC) in the context of infrastructure-assisted autonomous driving and presents respective solutions. For CCR, an efficient two-layered routing scheme is proposed which can provide optimal trip duration. Simulation results show that the routing scheme achieves a good balance between longer duration of communication coverage and acceptable source-to-destination travel time. For CCTC, it is shown that there exists an optimal AV speed on each road segment, as well as an optimal inter-AV distance and an optimal number of AVs in each cell, to maximize the road-network AV throughput within a single cell. Moreover, spectrum allocation is used to achieve Pareto-optimal road-network throughput across cells, and a new key performance index (KPI) is defined to evaluate the traffic control capability of cellular systems. Simulation results validate the improvement of AV throughput via the proposed CCTC solution.

Item COVID-19 Knowledge Graph from semantic integration of biomedical literature and databases (Bioinformatics, 2021-10-06) Chen, Chuming; Ross, Karen E.; Gavali, Sachin; Cowart, Julie E.; Wu, Cathy H.
The global response to the COVID-19 pandemic has led to a rapid increase of scientific literature on this deadly disease. Extracting knowledge from biomedical literature and integrating it with relevant information from curated biological databases is essential to gain insight into COVID-19 etiology, diagnosis and treatment. We used Semantic Web technology RDF to integrate COVID-19 knowledge mined from literature by iTextMine, PubTator and SemRep with relevant biological databases and formalized the knowledge in a standardized and computable COVID-19 Knowledge Graph (KG). We published the COVID-19 KG via a SPARQL endpoint to support federated queries on the Semantic Web and developed a knowledge portal with browsing and searching interfaces.
We also developed a RESTful API to support programmatic access and provided RDF dumps for download.

Item A crowdsourcing open platform for literature curation in UniProt (PLOS Biology, 2021-12-06) Wang, Yuqi; Wang, Qinghua; Huang, Hongzhan; Huang, Wei; Chen, Yongxing; McGarvey, Peter B.; Wu, Cathy H.; Arighi, Cecilia N.
The UniProt knowledgebase is a public database for protein sequence and function, covering the tree of life and over 220 million protein entries. Now, the whole community can use a new crowdsourcing annotation system to help scale up UniProt curation and receive proper attribution for their biocuration work.

Item DiMeX: A Text Mining System for Mutation-Disease Association Extraction (Public Library of Science, 2016-04-13) Mahmood, A. S. M. Ashique; Wu, Tsung-Jung; Mazumder, Raja; Vijay-Shanker, K.
The number of published articles describing associations between mutations and diseases is increasing at a fast pace. There is a pressing need to gather such mutation-disease associations into public knowledge bases, but manual curation slows down the growth of such databases. We have addressed this problem by developing a text-mining system (DiMeX) to extract mutation to disease associations from publication abstracts. DiMeX consists of a series of natural language processing modules that preprocess input text and apply syntactic and semantic patterns to extract mutation-disease associations. DiMeX achieves high precision and recall with F-scores of 0.88, 0.91 and 0.89 when evaluated on three different datasets for mutation-disease associations. DiMeX includes a separate component that extracts mutation mentions in text and associates them with genes. This component has also been evaluated on different datasets and shown to achieve state-of-the-art performance.
The results indicate that our system outperforms the existing mutation-disease association tools, addressing the low precision problems suffered by most approaches. DiMeX was applied on a large set of abstracts from Medline to extract mutation-disease associations, as well as other relevant information including patient/cohort size and population data. The results are stored in a database that can be queried and downloaded at http://biotm.cis.udel.edu/dimex/. We conclude that this high-throughput text-mining approach has the potential to significantly assist researchers and curators to enrich mutation databases.

Item E3-UAV: An Edge-Based Energy-Efficient Object Detection System for Unmanned Aerial Vehicles (IEEE Internet of Things Journal, 2023-08-03) Suo, Jiashun; Zhang, Xingzhou; Shi, Weisong; Zhou, Wei
Motivated by the advances in deep learning techniques, the application of Unmanned Aerial Vehicle (UAV)-based object detection has proliferated across a range of fields, including vehicle counting, fire detection, and city monitoring. While most existing research studies only a subset of the challenges inherent to UAV-based object detection, there are few studies that balance various aspects to design a practical system for energy consumption reduction. In response, we present the E3-UAV, an edge-based energy-efficient object detection system for UAVs. The system is designed to dynamically support various UAV devices, edge devices, and detection algorithms, with the aim of minimizing energy consumption by deciding the most energy-efficient flight parameters (including flight altitude, flight speed, detection algorithm, and sampling rate) required to fulfill the detection requirements of the task. We first present an effective evaluation metric for actual tasks and construct a transparent energy consumption model based on hundreds of actual flight data to formalize the relationship between energy consumption and flight parameters.
Then we present a lightweight energy-efficient priority decision algorithm based on a large quantity of actual flight data to assist the system in deciding flight parameters. Finally, we evaluate the performance of the system, and our experimental results demonstrate that it can significantly decrease energy consumption in real-world scenarios. Additionally, we provide four insights that can assist researchers and engineers in their efforts to study UAV-based object detection further.

Item Effective biomedical document classification for identifying publications relevant to the mouse Gene Expression Database (GXD) (Oxford University Press, 2017-03-24) Jiang, Xiangying; Ringwald, Martin; Blake, Judith; Shatkay, Hagit
The Gene Expression Database (GXD) is a comprehensive online database within the Mouse Genome Informatics resource, aiming to provide available information about endogenous gene expression during mouse development. The information stems primarily from many thousands of biomedical publications that database curators must go through and read. Given the very large number of biomedical papers published each year, automatic document classification plays an important role in biomedical research. Specifically, an effective and efficient document classifier is needed for supporting the GXD annotation workflow. We present here an effective yet relatively simple classification scheme, which uses readily available tools while employing feature selection, aiming to assist curators in identifying publications relevant to GXD. We examine the performance of our method over a large manually curated dataset, consisting of more than 25 000 PubMed abstracts, of which about half are curated as relevant to GXD while the other half as irrelevant to GXD.
In addition to text from title-and-abstract, we also consider image captions, an important information source that we integrate into our method. We apply a captions-based classifier to a subset of about 3300 documents, for which the full text of the curated articles is available. The results demonstrate that our proposed approach is robust and effectively addresses the GXD document classification. Moreover, using information obtained from image captions clearly improves performance, compared to title and abstract alone, affirming the utility of image captions as a substantial evidence source for automatically determining the relevance of biomedical publications to a specific subject area.

Item emiRIT: a text-mining-based resource for microRNA information (Database, 2021-05-28) Roychowdhury, Debarati; Gupta, Samir; Qin, Xihan; Arighi, Cecilia N.; Vijay-Shanker, K.
microRNAs (miRNAs) are essential gene regulators, and their dysregulation often leads to diseases. Easy access to miRNA information is crucial for interpreting generated experimental data, connecting facts across publications and developing new hypotheses built on previous knowledge. Here, we present extracting miRNA Information from Text (emiRIT), a text-mining-based resource, which presents miRNA information mined from the literature through a user-friendly interface. We collected 149,233 miRNA–PubMed ID pairs from Medline between January 1997 and May 2020. emiRIT currently contains ‘miRNA–gene regulation’ (69,152 relations), ‘miRNA–disease (cancer)’ (12,300 relations), ‘miRNA–biological process and pathways’ (23,390 relations) and circulatory ‘miRNAs in extracellular locations’ (3782 relations). Biological entities and their relation to miRNAs were extracted from Medline abstracts using publicly available and in-house developed text-mining tools, and the entities were normalized to facilitate querying and integration.
We built a database and an interface to store and access the integrated data, respectively. We provide an up-to-date and user-friendly resource to facilitate access to comprehensive miRNA information from the literature on a large scale, enabling users to navigate through different roles of miRNA and examine them in a context specific to their information needs. To assess our resource’s information coverage, we have conducted two case studies focusing on the target and differential expression information of miRNAs in the context of cancer and a third case study to assess the usage of emiRIT in the curation of miRNA information.

Item Enhancing severe hypoglycemia prediction in type 2 diabetes mellitus through multi-view co-training machine learning model for imbalanced dataset (Scientific Reports, 2024-09-30) Agraz, Melih; Deng, Yixiang; Karniadakis, George Em; Mantzoros, Christos Socrates
Severe hypoglycemia (SH) in patients with type 2 diabetes mellitus (T2DM) poses a considerable risk of long-term death, especially among the elderly, demanding urgent medical attention. Accurate prediction of SH remains challenging due to its multifaceted nature, with contributions from factors such as medications, lifestyle choices, and metabolic measurements. In this study, we propose a systematic approach to improve the robustness and accuracy of SH predictions using machine learning models, guided by clinical feature selection. Our focus is on developing long-term SH prediction models using both semi-supervised learning and supervised learning algorithms. Using the Action to Control Cardiovascular Risk in Diabetes trial, which includes electronic health records for over 10,000 individuals, we focus on studying adults with T2DM. Our results indicate that the application of a multi-view co-training method, incorporating the random forest algorithm, improves the specificity of SH prediction, while the same setup with Naive Bayes replacing random forest demonstrates better sensitivity.
Our framework also provides interpretability of machine learning models by identifying key predictors for hypoglycemia, including fasting plasma glucose, hemoglobin A1c, general diabetes education, and NPH or L insulins. The integration of data routinely available in electronic health records significantly enhances our model’s capability to predict SH events, showcasing its potential to transform clinical practice by facilitating early interventions and optimizing patient management. By enhancing prediction accuracy and identifying crucial predictive features, our study contributes to advancing the understanding and management of hypoglycemia in this population.

Item Formation of bijels stabilized by magnetic ellipsoidal particles in external magnetic fields (Soft Matter, 2024-10-08) Karthikeyan, Nikhil; Schiller, Ulf D.
Bicontinuous interfacially-jammed emulsion gels (bijels) are increasingly used as emulsion templates for the fabrication of functional porous materials including membranes, electrodes, and biomaterials. Control over the domain size and structure is highly desirable in these applications. For bijels stabilized by spherical particles, particle size and volume fraction are the main parameters that determine the emulsion structure. Here, we investigate the use of ellipsoidal magnetic particles and study the effect of external magnetic fields on the formation of bijels. Using hybrid Lattice Boltzmann-molecular dynamics simulations, we analyze the effect of the magnetic field on emulsion dynamics and the structural properties of the resulting bijel. We find that the formation of bijels remains robust in the presence of magnetic fields, and that the domain size and tortuosity become anisotropic when ellipsoidal particles are used. We show that the magnetic fields lead to orientational ordering of the particles which in turn leads to alignment of the interfaces.
The orientational order facilitates enhanced packing of particles in the interface which leads to different jamming times in the directions parallel and perpendicular to the field. Our results highlight the potential of magnetic particles for fabrication and processing of emulsion systems with tunable properties.
Item Generalization of Runoff Risk Prediction at Field Scales to a Continental-Scale Region Using Cluster Analysis and Hybrid Modeling (Geophysical Research Letters, 2022-08-26) Ford, Chanse M.; Hu, Yao; Ghosh, Chirantan; Fry, Lauren M.; Malakpour-Estalaki, Siamak; Mason, Lacey; Fitzpatrick, Lindsay; Mazrooei, Amir; Goering, Dustin C.
As surface water resources in the U.S. continue to be pressured by excess nutrients carried by agricultural runoff, the need to assess runoff risk at the field scale continues to grow in importance. Most landscape hydrologic models developed at regional scales have limited applicability at finer spatial scales. Hybrid models can be used to address the scale mismatch between model simulation and applicability, but could be limited by their ability to generalize over a large domain with heterogeneous hydrologic characteristics. To assist the generalization, we develop a regionalization approach based on principal component analysis and K-means clustering to identify clusters with similar runoff potential over the Great Lakes region. For each cluster, hybrid models are developed by combining the National Oceanic and Atmospheric Administration's National Water Model and a data-driven model, eXtreme gradient boosting, with field-scale measurements, enabling prediction of daily runoff risk level at the field scale over the entire region.
Key Points:
- Identify five clusters in the Great Lakes region with similar runoff potential
- Generalize hybrid models developed at field scales to a continental-scale region
- Predict daily runoff risk on 1 km-by-1 km grid over the entire Great Lakes region
Plain Language Summary: Nutrient loading is an important factor determining water quality in the Great Lakes. Transport of nutrients to surface water is often correlated with runoff, causing detrimental effects to aquatic ecosystems, such as harmful algal blooms. Runoff risk forecasts constituting an early warning system can be used to improve timing of nutrient application, leading to dual benefits of reducing nutrient transport to surface water and leaving more nutrients in the field for crop growth. However, measurements of the edge-of-field runoff are conducted at the field scale and are sparse over the Great Lakes region, posing a great challenge to developing such a warning system over the continental scale. To address the challenge, we developed a generalization approach that allows predictive models developed using the runoff measurements at the field scale to be generalized to large regions with similar hydrogeologic characteristics. We can then predict the daily runoff risk level over the entire Great Lakes domain at 1 km-by-1 km resolution, which shows promise to be the backbone of the early warning system on the forecast of daily risk level for the Contiguous U.S.