Voigt L, Freiling F, Hargreaves CJ (2025)
Publication Type: Journal article
Publication year: 2025
Volume: 52
Page Range: 301874
Article Number: 301874
DOI: 10.1016/j.fsidi.2025.301874
Open Access Link: https://www.sciencedirect.com/science/article/pii/S2666281725000137
There is currently no systematic method for evaluating digital forensic datasets. This makes it difficult to judge their suitability for specific use cases in digital forensic education and training. Additionally, there is limited comparability in the quality of synthetic datasets or the strengths and weaknesses of different data synthesis approaches. In this paper, we propose the concept of a quantitative, metrics-based assessment of forensic datasets as a first step toward a systematic evaluation approach. As a concrete implementation of this approach, we introduce Mass Disk Processor, a tool that automates the collection of metrics from large sets of disk images. It enables a privacy-preserving retrieval of high-level disk image characteristics, facilitating the assessment of not only synthetic but also real-world disk images. We demonstrate two applications of our tool. First, we create a comprehensive datasheet for publicly available, scenario-based synthetic disk images. Second, we propose a formal definition of synthetic data realism that compares properties of synthetic data to properties of real-world data and present results from an examination of the realism of current scenario-based disk images.
APA:
Voigt, L., Freiling, F., & Hargreaves, C. J. (2025). A metrics-based look at disk images: Insights and applications. Forensic Science International: Digital Investigation, 52, 301874. https://doi.org/10.1016/j.fsidi.2025.301874
MLA:
Voigt, Lena, Felix Freiling, and Christopher J. Hargreaves. "A metrics-based look at disk images: Insights and applications." Forensic Science International: Digital Investigation 52 (2025): 301874.