We advance cutting-edge data analytics by developing and applying novel methods for analyzing complex datasets. Our research enables more robust and rigorous empirical research, helping researchers and practitioners solve real-world problems, and generating insights that drive impactful decision-making.

2025: Ongoing research

BTW: A Non-Parametric Variance Stabilization Framework for Multimodal Model Integration

We introduced Beyond Two-Modality Weighting (BTW), a parameter-free framework that adaptively weights modalities in multimodal Mixture-of-Experts models. BTW uses instance-level KL divergence and modality-level mutual information to dynamically adjust each modality’s contribution, scales to any number of modalities, and improves both regression and classification performance without adding complexity.
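To illustrate the idea of instance-level divergence-based weighting, here is a minimal sketch, not the actual BTW implementation: modalities whose predictive distribution diverges less (in KL divergence) from the fused model's prediction receive more weight for that instance. The function names, the inverse-divergence weighting rule, and the toy distributions are all illustrative assumptions.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL divergence between two discrete probability vectors."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def divergence_based_weights(unimodal_preds, fused_pred):
    """Illustrative instance-level weighting (hypothetical, not BTW itself):
    a modality that agrees more with the fused prediction gets more weight."""
    divs = np.array([kl_divergence(p, fused_pred) for p in unimodal_preds])
    inv = 1.0 / (divs + 1e-8)          # inverse-divergence scores
    return inv / inv.sum()             # normalize to sum to 1

# Three modalities predicting over 4 classes for a single instance.
text  = np.array([0.70, 0.10, 0.10, 0.10])
image = np.array([0.40, 0.30, 0.20, 0.10])
audio = np.array([0.25, 0.25, 0.25, 0.25])   # uninformative modality
fused = np.array([0.60, 0.20, 0.10, 0.10])

w = divergence_based_weights([text, image, audio], fused)
```

Under this toy rule, the uniform (uninformative) audio modality receives the smallest weight, mirroring the paper's goal of down-weighting unreliable modalities per instance.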

Multimodal models often suffer when additional modalities add noise rather than signal, and existing methods cannot scale or adapt at the instance level. BTW solves these issues by providing a scalable, statistically grounded way to down-weight unreliable modalities and emphasize informative ones in real time, overcoming core limitations of current multimodal weighting approaches.

BTW strengthens prediction and decision-making in domains relying on heterogeneous data—such as consumer analytics, credit risk, forecasting, and health economics—by making multimodal models more accurate, robust, and scalable. Its ability to manage noisy inputs and integrate many data sources enhances reliability and supports practical, large-scale economic and business applications.

The Promise of Time-Series Foundation Models for Agricultural Forecasting: Evidence from Marketing Year Average Prices

We conducted the first comprehensive evaluation of Time-Series Foundation Models for forecasting USDA Marketing Year Average (MYA) prices, comparing 16 models across traditional time series, machine learning, deep learning, and foundation-model approaches using data on major crops from 1997–2025.
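Comparisons like this typically rest on rolling-origin (expanding-window) out-of-sample evaluation. The sketch below is a generic illustration of that protocol on synthetic data, not our study's code: the forecaster functions, the synthetic trend series, and the one-step-ahead setup are assumptions for demonstration.

```python
import numpy as np

def rolling_origin_rmse(series, forecaster, min_train=10):
    """One-step-ahead rolling-origin evaluation: at each origin t, fit on
    series[:t] and forecast the value at t; return the RMSE of all errors."""
    errors = [series[t] - forecaster(series[:t])
              for t in range(min_train, len(series))]
    return float(np.sqrt(np.mean(np.square(errors))))

naive = lambda history: history[-1]            # last observed value
historical_mean = lambda history: float(np.mean(history))

rng = np.random.default_rng(0)
# Synthetic annual price series with a mild upward trend (a stand-in
# for a short, trending series such as MYA prices).
prices = 4.0 + 0.05 * np.arange(28) + rng.normal(0.0, 0.2, 28)

rmse_naive = rolling_origin_rmse(prices, naive)
rmse_mean = rolling_origin_rmse(prices, historical_mean)
```

On a trending series the naive forecast beats the historical mean, which is exactly why evaluations like ours benchmark sophisticated models against simple baselines under the same rolling protocol.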

Forecasting agricultural prices is difficult because datasets are small and market dynamics shift frequently, making many modern models unreliable. With the rapid rise of foundation models, it was unclear whether they could meaningfully improve forecasting in data-scarce environments. By providing the first systematic test of these models for MYA forecasting, we aimed to determine whether they outperform classical tools, generalize better with limited data, and represent a genuine paradigm shift in how agricultural economists approach forecasting.

Our results show that time-series foundation models—especially Time-MoE—offer a step-change in forecasting performance, consistently outperforming established methods and reducing errors in the price forecasts used for farm planning, hedging, risk management, and policy design. The finding that foundation models excel even without domain-specific retraining highlights a novel and scalable forecasting paradigm, lowering barriers to adoption and signaling broader potential for foundation-model-driven analytics across economics and agribusiness.


An L-infinity Norm Counterfactual and Synthetic Control Approach

We introduce an L∞-regularized Synthetic Control (SC) method that blends the dense weighting philosophy of Difference-in-Differences with the flexibility of traditional SC. Our approach produces more stable, evenly distributed weights, supported by an interior-point algorithm and asymptotic theory. Simulations and two empirical applications—California’s tobacco policy and a volatile U.S. stock-market regulation—show strong performance and clear advantages over sparse and Lasso-based SC estimators.
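The core idea can be sketched as a simplex-constrained least-squares problem with a penalty on the largest weight: minimizing ||y₁ − Y₀w||² + λ·max(w) over nonnegative weights summing to one discourages any single donor unit from dominating, yielding denser weights. The sketch below is an illustrative epigraph reformulation solved with a generic SLSQP solver, not our interior-point implementation; the function name, λ value, and synthetic data are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def linf_synthetic_control(y_treated, Y_donors, lam=1.0):
    """Illustrative L-infinity-penalized SC fit (not the paper's algorithm):
    minimize ||y_treated - Y_donors @ w||^2 + lam * max(w)
    over the simplex, via the epigraph variable t >= w_i for all i."""
    T, J = Y_donors.shape
    # Decision vector z = [w_1, ..., w_J, t].
    def objective(z):
        w, t = z[:J], z[J]
        resid = y_treated - Y_donors @ w
        return resid @ resid + lam * t
    cons = [{"type": "eq", "fun": lambda z: z[:J].sum() - 1.0}]
    cons += [{"type": "ineq", "fun": lambda z, i=i: z[J] - z[i]}  # t >= w_i
             for i in range(J)]
    bounds = [(0.0, 1.0)] * (J + 1)
    z0 = np.full(J + 1, 1.0 / J)       # start from uniform weights
    res = minimize(objective, z0, bounds=bounds, constraints=cons,
                   method="SLSQP")
    return res.x[:J]

rng = np.random.default_rng(1)
Y0 = rng.normal(0.0, 1.0, (30, 5))               # 5 donors, 30 pre-periods
y1 = Y0 @ np.array([0.4, 0.3, 0.2, 0.1, 0.0])    # treated unit: donor mix
w = linf_synthetic_control(y1, Y0, lam=5.0)
```

Raising λ pushes the solution toward more uniform weights, which is the dense-weighting behavior that distinguishes this estimator from sparse and Lasso-based SC.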

Sparse SC weights are simple but often fragile, especially in environments where relying on a few control units amplifies bias. Existing work treats sparsity as inherently desirable, but we argue it is a modeling choice, not a virtue. We designed the L∞-SC method to offer a denser and more robust alternative, reframing the SC–DID debate as a choice between sparse and dense weighting philosophies. This represents a paradigm shift in how counterfactuals are constructed in comparative case studies.

Denser L∞-based weights produce more reliable treatment effect estimates in volatile markets and policy environments, avoiding the underestimation observed with sparse and Lasso-based SC. In stable settings, performance matches existing methods, but in more turbulent contexts—like stock-market regulation—L∞-SC provides notably more accurate counterfactuals. This delivers a more robust framework for business analytics and policy evaluation where causal counterfactuals guide real decisions.

2024: Ongoing research

Beyond common support: an iterative approach to nonseparable models with endogeneity

We are developing a novel econometric approach to address endogeneity in nonseparable models by introducing a new concept called "partially connected support." This more general support condition allows us to relax the common support requirement of the traditional control-variable method. Our approach is tested through simulations, demonstrating superior performance in scenarios where the common support assumption is violated but partially connected support holds.

Endogeneity in nonseparable models poses significant challenges for identifying true causal effects in data analytics and econometrics. The existing control-variable method is often constrained by the common support assumption, which can be difficult to satisfy in practice.

This approach broadens the applicability of nonseparable models, allowing for more reliable analysis and decision-making in a wide range of contexts such as education and agricultural production.

Rehabilitating the Once-Abandoned Endogenous IV

We are developing a novel econometric approach to address one of the key challenges in the instrumental variable (IV) method: the difficulty of identifying valid instruments. Our approach provides researchers with a tool to assess the validity of IVs and to identify causal effects even when IVs are imperfect. This approach is especially effective in large datasets where traditional IV methods may fail.

Causal inference is challenging, particularly in fields like health, education, and agriculture where complex relationships and large datasets are common. The IV approach is one of the most popular routes to causal inference, but it is often unreliable because valid IVs are difficult to find in practice.

This approach has broad implications for empirical research and policy-making. By offering a more robust framework for causal inference, it allows researchers to draw more reliable conclusions in complex real-world scenarios. The potential applications are vast, ranging from assessing the effects of policies in health and education to evaluating agricultural interventions.