Youngsoo Jang

Research Scientist at LG AI Research, Advanced ML Lab

Research Interests: Reinforcement Learning (RL), Large Language Models (LLMs), Reinforcement Learning from Human Feedback (RLHF)

Contact: jys5609 at gmail.com

Education

Work Experience

Publications

Prospector: Improving LLM Agents with Self-Asking and Trajectory Ranking

Semantic Skill Grounding for Embodied Instruction-Following in Cross-Domain Environments

Degeneration-free Policy Optimization: RL Fine-Tuning for Language Models without Degeneration

Show, Think, and Tell: Thought-Augmented Fine-Tuning of Large Language Models for Video Captioning

SafeDICE: Offline Safe Imitation Learning with Non-Preferred Demonstrations

Information-Theoretic State Space Model for Multi-View Reinforcement Learning

LobsDICE: Offline Imitation Learning from Observation via Stationary Distribution Correction Estimation

GPT-Critic: Offline Reinforcement Learning for End-to-End Task-Oriented Dialogue Systems

Monte-Carlo Planning and Learning with Language Action Value Estimates

Variational Inference for Sequential Data with Future Likelihood Estimates

End-to-End Neural Pipeline for Goal-Oriented Dialogue System using GPT-2

Bayes-Adaptive Monte-Carlo Planning and Learning for Goal-Oriented Dialogues

Trust Region Sequential Variational Inference

PyOpenDial: A Python-based Domain-Independent Toolkit for Developing Spoken Dialogue Systems with Probabilistic Rules

Cross-language Neural Dialog State Tracker for Large Ontologies using Hierarchical Attention

Constrained Bayesian Reinforcement Learning via Approximate Linear Programming

Neural Dialog State Tracker for Large Ontologies by Attention Mechanism

Awards and Honors

Academic Talks

Degeneration-free Policy Optimization: RL Fine-Tuning for Language Models without Degeneration

SafeDICE: Offline Safe Imitation Learning with Non-Preferred Demonstrations

GPT-Critic: Offline Reinforcement Learning for End-to-End Task-Oriented Dialogue Systems

Monte-Carlo Planning and Learning with Language Action Value Estimates

Bayes-Adaptive Monte-Carlo Planning and Learning for Goal-Oriented Dialogues

Teaching Experience

Academic Services