Image Generation Diversity Issues and How to Tame Them

Dombrowski M, Zhang W, Cechnicka S, Reynaud H, Kainz B (2025)


Publication Language: English

Publication Status: Submitted

Publication Type: Conference contribution, Original article

Future Publication Type: Journal article

Publication year: 2025

Publisher: The Computer Vision Foundation

Pages Range: 3029-3039

Conference Proceedings Title: Proceedings of the Computer Vision and Pattern Recognition Conference 2025

Event location: Nashville, TN, USA

URI: https://openaccess.thecvf.com/content/CVPR2025/html/Dombrowski_Image_Generation_Diversity_Issues_and_How_to_Tame_Them_CVPR_2025_paper.html

DOI: 10.48550/arXiv.2411.16171

Open Access Link: https://openaccess.thecvf.com/content/CVPR2025/html/Dombrowski_Image_Generation_Diversity_Issues_and_How_to_Tame_Them_CVPR_2025_paper.html

Abstract

Generative methods now produce outputs nearly indistinguishable from real data but often fail to fully capture the data distribution. Unlike quality issues, diversity limitations in generative models are hard to detect visually, requiring specific metrics for assessment. In this paper, we draw attention to the current lack of diversity in generative models and the inability of common metrics to measure this. We achieve this by framing diversity as an image retrieval problem, where we measure how many real images can be retrieved using synthetic data as queries. This yields the Image Retrieval Score (IRS), an interpretable, hyperparameter-free metric that quantifies the diversity of a generative model's output. IRS requires only a subset of synthetic samples and provides a statistical measure of confidence. Our experiments indicate that current feature extractors commonly used in generative model assessment are inadequate for evaluating diversity effectively. Consequently, we perform an extensive search for the best feature extractors to assess diversity. Evaluation reveals that current diffusion models converge to limited subsets of the real distribution, with no current state-of-the-art model surpassing 77% of the diversity of the training data. To address this limitation, we introduce Diversity-Aware Diffusion Models (DiADM), a novel approach that improves the diversity of unconditional diffusion models without loss of image quality. We do this by disentangling diversity from image quality with a diversity-aware module that takes pseudo-unconditional features as input. We provide a Python package offering unified feature extraction and metric computation to further facilitate the evaluation of generative models: this https URL.
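The retrieval framing above can be illustrated with a minimal sketch: each synthetic sample queries its nearest real neighbour in feature space, and a diverse generator covers many distinct real images. This is not the paper's exact IRS formula (which additionally provides a statistical confidence measure); the cosine-similarity matching and coverage ratio here are illustrative assumptions.

```python
# Hedged sketch of a retrieval-based diversity score in the spirit of IRS.
# The feature extractor, cosine matching, and coverage ratio are assumptions
# for illustration, not the paper's exact metric.
import numpy as np

def retrieval_diversity(real_feats: np.ndarray, synth_feats: np.ndarray) -> float:
    """Fraction of real images retrieved when each synthetic sample
    queries its nearest real neighbour under cosine similarity.

    real_feats:  (N_real, D) feature matrix of real images
    synth_feats: (N_synth, D) feature matrix of synthetic images
    """
    # L2-normalise rows so the dot product equals cosine similarity
    real = real_feats / np.linalg.norm(real_feats, axis=1, keepdims=True)
    synth = synth_feats / np.linalg.norm(synth_feats, axis=1, keepdims=True)
    # index of the nearest real image for each synthetic query
    nearest = (synth @ real.T).argmax(axis=1)
    # a mode-collapsed generator retrieves few distinct real images
    return len(np.unique(nearest)) / real.shape[0]
```

A collapsed generator whose samples all sit near one real image scores close to 1/N_real, while samples spread over the whole real set score close to 1.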


How to cite

APA:

Dombrowski, M., Zhang, W., Cechnicka, S., Reynaud, H., & Kainz, B. (2025). Image Generation Diversity Issues and How to Tame Them. In Proceedings of the Computer Vision and Pattern Recognition Conference 2025 (pp. 3029-3039). Nashville, TN, USA: The Computer Vision Foundation.

MLA:

Dombrowski, Mischa, et al. "Image Generation Diversity Issues and How to Tame Them." Proceedings of the Computer Vision and Pattern Recognition Conference 2025, Nashville, TN, USA, The Computer Vision Foundation, 2025, pp. 3029-3039.