In recent years there has been some confusion about the circumstances under which datasets generated with the synthetic data approach should be considered fully synthetic, and about which estimator to use to obtain valid variance estimates from the synthetic data. This paper aims to provide guidance for resolving this confusion. It reviews the different approaches for generating synthetic datasets and discusses their similarities and differences. It also presents the different variance estimators that have been proposed for analyzing synthetic data. Based on two simulation studies, the advantages and limitations of the different estimators are discussed. The paper concludes with some general recommendations on how to judge which synthesis strategy and which variance estimator are most suitable in which situation.
CITATION STYLE
Drechsler, J. (2018). Some clarifications regarding fully synthetic data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11126 LNCS, pp. 109–121). Springer Verlag. https://doi.org/10.1007/978-3-319-99771-1_8