PhD Applicant, Fall 2026

University of California, San Diego

Machine Learning and Data Science (ECE)

San Diego, CA

Yilin Bao

Reinforcement Learning for Reasoning · LLMs · Structured AI Systems

I am a researcher focusing on reinforcement learning, reasoning systems, and structured representations for large language models. I completed my M.S. in Machine Learning and Data Science at UC San Diego, where I worked with Prof. Zhiting Hu on offline RL for multi-step reasoning and credit assignment.

My work examines how intermediate reasoning states can be modeled, evaluated, and optimized. I have developed methods integrating soft Bellman consistency, MCTS-guided trajectory sampling, and operator-level value modeling to improve stability and generalization in long-horizon reasoning. I also study verifiable reasoning mechanisms for aligning intermediate semantic states through coherence, consistency, completeness, and grounding criteria.

My research interests center on building reliable and interpretable reasoning systems through reinforcement learning, structured representations, and controlled architectural evolution. I am applying to PhD programs for Fall 2026 to further pursue these directions.

Research Interests

Reinforcement Learning · LLM Reasoning · Offline RL · Operator-level Credit Assignment · Structured Intermediate States · Verifiable Reasoning · Graph-based Representations · Agentic LLMs · Self-Evolving AI Systems

Biography

I am a machine learning researcher interested in reinforcement learning, multi-step reasoning, and structured representations for large language models. I received my M.S. in Machine Learning and Data Science from UC San Diego, where I worked in Prof. Zhiting Hu’s MixLab on offline RL, soft Bellman consistency, and MCTS-guided reasoning policy learning.

My recent work focuses on modeling and evaluating intermediate reasoning states, including coherence, consistency, completeness, and grounding. I also design system-level methods for reliable LLM deployment, including retrieval-integrated workflows, value modeling for operator sequences, and scalable serving pipelines. In parallel, I collaborate with ShelteredAI to build applied LLM systems for real-world social-service environments.

My broader research goal is to develop structured, interpretable, and self-improving reasoning frameworks that unify reinforcement learning, verifiable intermediate states, and controlled structural evolution in agentic LLMs. I am currently applying to PhD programs for Fall 2026 to further pursue these directions.

News

  • Dec 2025: Submitted OutlineForge (structured RL for scientific writing) to ARR.
  • Nov 2025: OREO project presented at the ICLR 2025 Workshop on LLM Reasoning.
  • Apr 2024: Joined Prof. Zhiting Hu’s lab (MixLab) to work on RL for multi-step reasoning.
  • Jan 2024: Resumed work with ShelteredAI on LLM systems for social-service delivery.
  • Jul 2022: Completed research at NEAT Labs on multimodal cognitive-neural data analysis.

Research Experience

Research Assistant

2024 — 2025

MixLab (Machine Learning & Reasoning Group), University of California, San Diego

Advisor: Prof. Zhiting Hu

Studied reinforcement learning for multi-step reasoning in large language models. Developed offline RL methods with soft Bellman consistency, MCTS-guided trajectory sampling, and operator-level value modeling. Achieved substantial improvements on MATH, GSM8K, and ALFWorld generalization benchmarks.

LLM Reasoning · Reinforcement Learning · Offline RL · MCTS · OpenRLHF

Machine Learning Engineer

2023 — Present

ShelteredAI (Non-profit), Charlotte & New York

Advisor: Lead ML Team

Built AI systems for social-service delivery, focusing on domain adaptation, few-shot learning, and structured reasoning alignment. Designed evaluation metrics for intermediate reasoning states including coherence, consistency, completeness, and grounding. Developed Dockerized LLM microservices with vLLM, FAISS retrieval, and Twilio-based phone workflows.

NLP · LLM Alignment · Reasoning Evaluation · Few-shot Learning · Social Impact

Research Assistant

2022

NEAT Labs (Neural Engineering & Translation Labs), University of California, San Diego

Advisor: Prof. Mariam S. Thomas (Lab PI)

Conducted multimodal neural data analysis involving EEG, ERP, and behavioral signals. Implemented statistical evaluation pipelines to study cognitive and neural modulation, and automated psychometric scoring workflows with REDCap integration.

Neuroscience · Multimodal Data · Statistical Analysis

Undergraduate Researcher

2020 — 2021

Oceanic AI & Remote Sensing Group, Nanjing University of Information Science and Technology

Advisor: Prof. Wenjin Sun

Investigated significant wave height forecasting using LSTM networks and classical numerical modeling. Processed long-term buoy datasets, handled extensive gaps from missing values, and evaluated climate-dependent model behavior across Caribbean and Atlantic regions.

Time-Series · LSTM · Remote Sensing · Geophysical AI

Research Intern

2019

AI Oceanography Collaboration, NUIST & UCLA

Advisor: Prof. Wenjin Sun and UCLA collaborators

Implemented PSPNet for multi-layer satellite feature extraction. Benchmarked AI-based oceanic eddy detection against geometric baseline methods and analyzed regional morphology differences.

Computer Vision · PSPNet · Remote Sensing · Segmentation

Publications

* denotes equal contribution

2025

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Huaijie Wang, Shibo Hao, Hanze Dong, Shenao Zhang, Yilin Bao, Ziran Yang, Yi Wu

ACL (Presented)

2025

OutlineForge: Hierarchical Reinforcement Learning with Explicit States for Scientific Writing

Yilin Bao, Ziyao He, Zayden Yang, Haohan Wang* (Corresponding)

arXiv Preprint · Submitted to ARR (Under Review)

2022

Assessing Long Short-Term Memory Network Significant Wave Height Forecast Efficacy in the Caribbean Sea and Atlantic Ocean

Brandon J. Bethel, Changming Dong, Shuyi Zhou, Wenjin Sun, Yilin Bao

SSRN Working Paper (Published)

Contact

I am currently seeking PhD opportunities for Fall 2026 in Computer Science and Electrical & Computer Engineering departments. If you are interested in my work or would like to discuss potential research collaborations, feel free to get in touch.

yibao@ucsd.edu
University of California, San Diego · Electrical & Computer Engineering · La Jolla, CA