Research

Research Vision

I study learning dynamics in multimodal biomedical AI systems, with a focus on survival prediction and risk modeling under heterogeneous and missing-modality settings. Using data such as histopathology images, structured clinical variables, and molecular signals, I investigate how architectural design and optimization strategies transform partially observed, noisy inputs into stable and learnable decision representations.

My research spans both molecular and imaging scales. At the molecular level, I fine-tune protein foundation models using more than two million sequences and inject post-translational modification (PTM) information to enhance structural and functional representation learning. At the imaging level, I analyze gradient interactions across multi-task pathology heads, characterizing structural subspace conflicts, stochastic batch-induced noise conflicts, and dominance imbalance among tasks. By studying how different gradient strategies reshape shared backbones, I seek optimization mechanisms that preserve task-relevant geometry while mitigating destructive interference.

Rather than viewing multimodal learning as simple fusion, I see machine learning more broadly as a mathematical process that organizes complex, noisy real-world information into structured representations capable of supporting reliable decisions. My goal is to understand how model structure and optimization dynamics shape this transformation—how they filter noise, integrate heterogeneous signals, and amplify task-relevant information within high-dimensional spaces.

Biomedical systems, with their multi-scale interactions, partial observability, and inherent uncertainty, make these structural challenges explicit. By studying how useful signals emerge and stabilize under such conditions, I aim to design AI systems that more deliberately and effectively extract what truly matters from data.

Current Research Themes

Multimodal Survival & Missing Modality Learning

Real-world biomedical data often has naturally missing modalities (e.g., a patient may have imaging but no genomics). Standard fusion assumes every modality is present and breaks down under missingness. We need alignment and robustness mechanisms that continue to work when modalities are incomplete.

We build heterogeneous aligned fusion frameworks so that representations stay comparable across modalities and missing patterns. Our MIDL work focuses on survival prediction: we align modality-specific encoders and fuse with mechanisms that remain valid under missing modalities, improving both robustness and cross-modality consistency.
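The core idea of staying valid under missing modalities can be illustrated with a minimal numpy sketch: weight each modality's embedding by an availability mask and renormalize, so the fused representation has a consistent scale regardless of which modalities are observed. This is an illustrative toy, not our MIDL architecture, and the function and modality names are hypothetical.

```python
import numpy as np

def masked_mean_fusion(embeddings, mask):
    """Fuse per-modality embeddings, averaging only over observed modalities.

    embeddings: (n_modalities, d) array of modality-specific representations
    mask:       (n_modalities,) binary availability indicator
    """
    weights = mask.astype(float)
    # Renormalize so the fused vector's scale does not depend on
    # how many modalities happen to be observed for this patient.
    weights = weights / weights.sum()
    return weights @ embeddings

# Toy example: 3 modalities (histology, clinical, genomics), d = 4
emb = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0]])
fused = masked_mean_fusion(emb, np.array([1, 1, 0]))  # genomics missing
print(fused)  # → [0.5 0.5 0.  0. ]
```

In practice the fusion mechanism is learned rather than a fixed mean, but the same principle applies: the operation must be well-defined for any pattern of observed modalities, not just the complete case.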

Optimization Dynamics in Multi-Task Learning

Multi-task and multi-objective training often suffers from gradient conflict: different objectives pull shared parameters in different directions. How shared versus task-specific parameter subspaces evolve during training, and how that evolution affects representation stability, is still poorly understood.

We study optimization dynamics in multi-objective settings—how gradient conflict manifests, how parameter subspaces (shared vs. private) evolve during training, and how this affects representation quality in biomedical models. The goal is to design training and architecture choices that improve alignment and stability.
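Gradient conflict has a simple operational definition: two task gradients conflict when their cosine similarity is negative. One well-known remedy, gradient surgery in the style of PCGrad, removes the conflicting component by projection. The numpy sketch below shows both the diagnostic and the projection on toy two-dimensional gradients; the task labels are hypothetical and this is not a description of our specific method.

```python
import numpy as np

def cosine(g1, g2):
    """Cosine similarity between two task gradients; negative means conflict."""
    return g1 @ g2 / (np.linalg.norm(g1) * np.linalg.norm(g2))

def project_conflict(g1, g2):
    """PCGrad-style surgery: if g1 conflicts with g2, subtract from g1
    its component along g2, leaving only the non-conflicting part."""
    if g1 @ g2 < 0:
        g1 = g1 - (g1 @ g2) / (g2 @ g2) * g2
    return g1

g_a = np.array([1.0, 0.0])   # hypothetical gradient of task A
g_b = np.array([-1.0, 1.0])  # hypothetical gradient of task B
print(cosine(g_a, g_b))             # negative → the tasks conflict
print(project_conflict(g_a, g_b))   # → [0.5 0.5]
```

Diagnostics like this, applied per layer or per parameter subspace rather than to the full flattened gradient, are one way to make the structural versus stochastic conflict distinction concrete.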

Information Injection in Foundation Models

Foundation models (e.g., protein language models, LLMs) encode enormous amounts of structure. Injecting new information into them, or aligning their latent spaces in a principled way, especially without full supervision, remains an open challenge.

We work on how to inject and align information in foundation-scale systems: probing and influencing protein model representations, and understanding LLM latent space structure for unsupervised or light-touch alignment. The aim is more principled methods for information injection and representation learning at scale.
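As a concrete illustration of information injection, consider adding per-residue annotations (such as PTM types) to a protein model's token representations via a learned embedding table, one common additive-injection pattern. The numpy sketch below is a schematic under assumed shapes and random placeholder weights; the PTM vocabulary and dimensions are hypothetical, not those of our fine-tuned models.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hypothetical embedding dimension

# Stand-in for residue embeddings from a protein language model (seq_len, d)
residue_emb = rng.normal(size=(5, d))

# Learned embedding table for PTM types: 0 = none, 1 = phosphorylation, 2 = methylation
ptm_table = rng.normal(scale=0.1, size=(3, d))

# Per-residue PTM annotations for this sequence
ptm_ids = np.array([0, 1, 0, 2, 0])

# Additive injection: shift each residue's representation by its PTM embedding
injected = residue_emb + ptm_table[ptm_ids]
print(injected.shape)  # → (5, 8)
```

The design choice here is that injection is a small, learnable perturbation of an existing representation space rather than a retraining from scratch, which is what makes it tractable at foundation-model scale.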