Synthetic Data

Innovation, Limitations and Practical Applications


Talha Iqbal

Insight SFI Research Center for Data Analytics, University of Galway 

Muhammad Ali Farooq

School of Electrical Engineering, University of Galway

Peter Corcoran

School of Electrical Engineering, University of Galway 

Ihsan Ullah

Insight SFI Research Center for Data Analytics, University of Galway 


2 hours


In recent years, the rise of data-driven technologies has shown how important high-quality data is for machine learning and artificial intelligence (AI) applications. However, acquiring sufficient real-world data for training and testing these models often presents significant challenges, ranging from data scarcity and privacy concerns to issues of bias and representativeness. In response, synthetic data has emerged as a promising solution: data generated artificially by algorithms and statistical models that replicate the patterns, characteristics, and relationships found in real-world data. Synthetic data can serve as a valuable supplement to, or replacement for, real data, especially where gathering sufficient real data is difficult or where privacy concerns limit data access. Additionally, synthetic data generation techniques can be tuned to address bias present in a dataset by generating data that is more representative and balanced. While synthetic data offers many advantages, including the ability to augment available data, enhance model performance, and mitigate privacy concerns, it cannot fully replace real-world data in all domains. Real data often contains multi-layered details and complexities that are difficult to replicate accurately. It is therefore crucial to consider the limitations and biases inherent in synthetic data and to interpret results accordingly.

The proposed workshop aims to provide a platform for researchers, practitioners, and industry experts to exchange insights, share best practices, and explore cutting-edge developments in synthetic data generation. By fostering collaboration and interdisciplinary interaction, it seeks to advance our understanding of synthetic data’s capabilities, limitations, and ethical implications, such as fairness, in the context of machine learning and AI. The workshop will be 2 hours long and will cover a wide range of topics, including but not limited to:

Submission Instructions

Submissions to the workshop follow the same procedure as main conference papers. This workshop supports both ordinary and short paper submissions. Please select the name of the Special Session you are interested in under the "additional questions" part of the submission. Please note that the submission dates are the same as those in the main conference schedule.