publications

Below is a list of my journal and conference publications and preprints in reverse chronological order. You can also check out my Google Scholar profile.

2025

arXiv’25

VINCIE: Unlocking In-context Image Editing from Video

Leigang Qu, Feng Cheng, Ziyan Yang, Qi Zhao, Shanchuan Lin, Yichun Shi, Yicong Li, Wenjie Wang, Tat-Seng Chua, and Lu Jiang

In arXiv preprint, 2025

PDF Code Website

CVPR’25

SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation

Leigang Qu, Haochuan Li, Wenjie Wang, Xiang Liu, Juncheng Li, Liqiang Nie, and Tat-Seng Chua

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

PDF Code Website

ICLR’25

TIGeR: Unifying Text-to-Image Generation and Retrieval with Large Multimodal Models

Leigang Qu, Haochuan Li, Tan Wang, Wenjie Wang, Yongqi Li, Liqiang Nie, and Tat-Seng Chua

In The Thirteenth International Conference on Learning Representations, 2025

PDF Code Website

2024

CVPR’24

Discriminative Probing and Tuning for Text-to-Image Generation

Leigang Qu, Wenjie Wang, Yongqi Li, Hanwang Zhang, Liqiang Nie, and Tat-Seng Chua

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

PDF Code Website

ICML’24

NExT-GPT: Any-to-Any Multimodal LLM

Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, and Tat-Seng Chua

In The International Conference on Machine Learning, 2024

(Oral)

PDF Code Website

ICLR’24

Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization

Yiyang Chen, Zhedong Zheng, Wei Ji, Leigang Qu, and Tat-Seng Chua

In The International Conference on Learning Representations, 2024

PDF Code

ACL’24

Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond

Yongqi Li, Wenjie Wang, Leigang Qu, Liqiang Nie, Wenjie Li, and Tat-Seng Chua

In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024

ACL (findings)’24

Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives

Thong Nguyen, Yi Bin, Junbin Xiao, Leigang Qu, Yicong Li, Jay Zhangjie Wu, Cong-Duy Nguyen, See-Kiong Ng, and Luu Anh Tuan

In Findings of the Association for Computational Linguistics: ACL 2024, 2024

2023

ACM MM’23

LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation

Leigang Qu*, Shengqiong Wu*, Hao Fei, Liqiang Nie, and Tat-Seng Chua

In Proceedings of the 31st ACM International Conference on Multimedia, 2023

(Oral)

PDF Code Website

SIGIR’23

Learnable Pillar-based Re-ranking for Image-Text Retrieval

Leigang Qu, Meng Liu, Wenjie Wang, Zhedong Zheng, Liqiang Nie, and Tat-Seng Chua

In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

PDF Code

CIKM’23

Popularity-aware Distributionally Robust Optimization for Recommendation System

Jujia Zhao, Wenjie Wang, Xinyu Lin, Leigang Qu, Jizhi Zhang, and Tat-Seng Chua

In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

2022

ACM MM’22

Search-oriented Micro-video Captioning

Liqiang Nie, Leigang Qu, Dai Meng, Min Zhang, Qi Tian, and Alberto Del Bimbo

In Proceedings of the 30th ACM International Conference on Multimedia, 2022

(Best Paper Award)

PDF Code

TMM’22

Self-Supervised Correlation Learning for Cross-modal Retrieval

Yaxin Liu, Jianlong Wu, Leigang Qu, Tian Gan, Jianhua Yin, and Liqiang Nie

IEEE Transactions on Multimedia, 2022

2021

SIGIR’21

Dynamic Modality Interaction Modeling for Image-Text Retrieval

Leigang Qu, Meng Liu, Jianlong Wu, Zan Gao, and Liqiang Nie

In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

(Best Student Paper Award)

PDF Code

2020

ACM MM’20

Context-Aware Multi-View Summarization Network for Image-Text Matching

Leigang Qu, Meng Liu, Da Cao, Liqiang Nie, and Qi Tian

In Proceedings of the 28th ACM International Conference on Multimedia, 2020

(Oral)

PDF Code

TIP’20

Iterative Local-Global Collaboration Learning Towards One-Shot Video Person Re-Identification

Meng Liu, Leigang Qu, Liqiang Nie, Maofu Liu, Lingyu Duan, and Baoquan Chen

IEEE Transactions on Image Processing, 2020

PDF Code