AAII Research Seminar Series | Seminar 11
WHEN
10 June 2025
Tuesday
12.00pm - 1.00pm Australia/Sydney
WHERE
City campus
Meeting Room CB02.12.225
(Level 12, UTS Building 02)
COST
Free admission
CONTACT
Associate Professor Yi Zhang
Reinforcement Learning for the Development of Large Vision Language Models
Speaker: Associate Professor Thanh Thi Nguyen
Abstract
Large Vision-Language Models (LVLMs) represent a significant advancement in AI, enabling systems to understand and generate content across both visual and textual modalities. While large-scale pretraining has driven substantial progress, fine-tuning these models to align with human values or perform specific tasks remains a critical challenge. Reinforcement Learning (RL) methods offer promising frameworks for this fine-tuning process. RL enables models to optimise actions based on reward signals instead of relying solely on supervised preference data.
This talk presents an overview of paradigms for fine-tuning LVLMs, highlighting how RL techniques can be used to align models with human values, improve task performance, and enable adaptive multimodal interaction. I categorise key approaches, examine sources of preference data and reward signals, and discuss open challenges.
The goal is to provide a clear understanding of how RL contributes to the evolution of fine-tuned, robust, and human-aligned LVLMs.
About A/Prof Thanh Thi Nguyen
Associate Professor Thanh Thi Nguyen has been ranked among the world’s top 2 per cent of AI scientists by Elsevier and Stanford University. He was a visiting scholar with the Computer Science Department at Stanford University in 2015 and the John A. Paulson School of Engineering and Applied Sciences at Harvard University in 2019. Dr Nguyen received a European-Pacific Partnership for ICT Expert Exchange Program Award from the European Commission in 2018 and an Australia–India Strategic Research Fund Early- and Mid-Career Fellowship from the Australian Academy of Science in 2020. He is currently an associate professor (research) at the Faculty of Information Technology, Monash University, Melbourne, Australia.