General Bayesian Machine Learning Models and Generative AI: I am currently working on Bayesian kernel-based machine learning models and generative Bayesian AI models for prediction problems involving complex, high-dimensional data sets. This work includes developing models, fast algorithms for model fitting, and optimized code for efficient implementation, as well as addressing other scalability issues raised by current "Big Data" challenges.
Bayesian Feature or Variable Selection Problems: I am developing Bayesian variable selection priors for ultra-high-dimensional Bayesian models, Bayesian grouped Lasso models with overlapping groups of variables, Bayesian structured variable selection with graph Laplacian priors, and Bayesian Lassos for high-dimensional stationary vector autoregressions.
Model-Based Data Integration from Multiple and Diverse Data Platforms: The rough idea, which I am still fine-tuning, is to develop a unified model-based approach for settings where data on the same individual are collected from different data-generating platforms. For example, the medical data collected on one individual may include pathology data, image data, molecular data from different genetic platforms, epigenetic data, and so on.
Scalable Machine Learning for Healthcare Big Data: In response to the growing volume of healthcare data—such as Medicaid claims exceeding 100 million records—we are developing a Bayesian Support Vector Machine (SVM) framework powered by Quasi-Monte Carlo Markov Chain (QMCMC) approximations. This scalable model overcomes the limitations of traditional SVMs and is designed to efficiently handle high-dimensional and massive datasets. Beyond Medicaid, this methodology has broad applicability across healthcare domains, where computational tractability and accurate prediction are both critical.
Bayesian Kernel-Based Spatio-Temporal Models for Environmental Health: Understanding the health impact of complex environmental exposures remains a priority in epidemiology and public health. We are developing Bayesian kernel machine models to jointly model multiple health outcomes as functions of exposure mixtures (e.g., air pollutants, toxicants). Our approach flexibly captures non-linear interactions and aims to identify key drivers of adverse health outcomes. Applications will include both environmental epidemiology and toxicology, and we plan to submit an NIH R01 proposal based on this work.
Modeling Animal Disease Spread Using Censored Spatio-Temporal Count Data: In collaboration with the Taylor Geospatial Institute, we are modeling the spread of diseases among wildlife—such as chronic wasting disease in deer and Lyme-related tick-borne illnesses—using Bayesian spatio-temporal models for censored count data. This project supports wildlife disease monitoring and management strategies for conservation agencies across multiple states.
Smart Agroforestry and Digital Agriculture through Machine Learning: As Co-Principal Investigator on a NASA-funded project, I am applying machine learning techniques to predict and assess the impact of extreme climate events—including wildfires, deforestation, and flooding—on forest ecosystems and social disparities. This initiative contributes to the growing field of climate-resilient agriculture and digital forestry, leveraging Earth observation and socio-environmental data.
Bayesian Hierarchical Modeling for Geographic Health Disparities: In an ongoing project on structural inequality and firearm violence, we are examining how historical redlining and modern gentrification trends relate to community-level firearm injuries across more than 200 U.S. cities. Using a Bayesian hierarchical model with gradient boosting extensions, we aim to disentangle spatial and temporal effects, informing targeted interventions and health equity policies.