synthetic data generation

5 papers across 3 sessions

Poster Session 2

Wednesday, December 3, 2025 · 4:30 PM → 7:30 PM

Automatic Synthetic Data and Fine-grained Adaptive Feature Alignment for Composed Person Retrieval

#4810 · Delong Liu, Haiwen Li, Zhaohui Hou, Zhicheng Zhao, Fei Su, Yuan Dong

We introduce a large-scale synthetic dataset and a fine-grained alignment framework for composed person retrieval, and provide a manually annotated benchmark test set for objective evaluation.

PANGEA: Projection-Based Augmentation with Non-Relevant General Data for Enhanced Domain Adaptation in LLMs

#2307 · Seungyoo Lee, Giung Nam, Moonseok Choi, Hyungi Lee, Juho Lee

This paper introduces PANGEA, a method that leverages general-purpose data to generate diverse and high-quality synthetic data, improving LLM performance on domain-specific tasks.

Poster Session 3

1 paper

Thursday, December 4, 2025 · 11:00 AM → 2:00 PM

Exhibit Hall C,D,E

Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs

#206 · Yifan Wei, Xiaoyan Yu, Tengfei Pan, Angsheng Li, Li Du

A novel Structural Entropy-guided Knowledge Navigator (SENATOR) framework that addresses the intrinsic knowledge deficiencies of LLMs.

Poster Session 4

2 papers

Thursday, December 4, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

Forging Time Series with Language: A Large Language Model Approach to Synthetic Data Generation

#2413 · Cécile Rousseau, Tobia Boschi, Giandomenico Cornacchia, Dhaval Salwala, Alessandra Pascale, Juan Moreno

Time series generation using large language models and compact embeddings.

APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay

#109 · Akshara Prabhakar, Zuxin Liu, Ming Zhu, Jianguo Zhang, Tulika Manoj Awalgaonkar, Shiyu Wang, Zhiwei Liu, Haolin Chen, Thai Hoang, Juan Carlos Niebles, Shelby Heinecke, Weiran Yao, Huan Wang, Silvio Savarese, Caiming Xiong

An agentic pipeline for multi-turn synthetic data generation that produces high-quality training data for AI agents.