Synthetic Datasets for Statistical Disclosure Control

Author : Jörg Drechsler
Publisher : Springer Science & Business Media
Total Pages : 148
Release : 2011-06-24
ISBN-10 : 146140326X
ISBN-13 : 9781461403265

Book Synopsis: Synthetic Datasets for Statistical Disclosure Control by Jörg Drechsler

Synthetic Datasets for Statistical Disclosure Control was written by Jörg Drechsler and published by Springer Science & Business Media. It was released on 2011-06-24 and has 148 pages.

Book excerpt: The aim of this book is to give the reader a detailed introduction to the different approaches to generating multiply imputed synthetic datasets. It describes all approaches that have been developed so far, provides a brief history of synthetic datasets, and gives useful hints on how to deal with real data problems like nonresponse, skip patterns, or logical constraints. Each chapter is dedicated to one approach, first describing the general concept and then presenting a detailed application to a real dataset, with useful guidelines on how to implement the theory in practice. The discussed multiple imputation approaches include imputation for nonresponse, generating fully synthetic datasets, generating partially synthetic datasets, generating synthetic datasets when the original data is subject to nonresponse, and a two-stage imputation approach that helps to better address the omnipresent trade-off between analytical validity and the risk of disclosure. The book concludes with a glimpse into the future of synthetic datasets, discussing the potential benefits and possible obstacles of the approach and ways to address the concerns of data users and their understandable discomfort with using data that doesn’t consist only of the originally collected values. The book is intended for researchers and practitioners alike. It helps the researcher to find the state of the art in synthetic data summarized in one book with full reference to all relevant papers on the topic. But it is also useful for the practitioner at the statistical agency who is considering the synthetic data approach for data dissemination in the future and wants to get familiar with the topic.


Books Related to Synthetic Datasets for Statistical Disclosure Control

Synthetic Datasets for Statistical Disclosure Control
Language: en
Pages: 148
Authors: Jörg Drechsler
Categories: Social Science
Type: BOOK - Published: 2011-06-24 - Publisher: Springer Science & Business Media

The aim of this book is to give the reader a detailed introduction to the different approaches to generating multiply imputed synthetic datasets. It describes a
Synthetic Data for Confidentiality
Language: en
Pages: 19
Authors: Harold Mantel
Categories: Confidential communications
Type: BOOK - Published: 2009 - Publisher:

This paper reviews methodology for creating and analyzing synthetic data files, as implemented for various US Census Bureau survey programs -- particularly a SI
Handbook of Sharing Confidential Data
Language: en
Pages: 338
Authors: Jörg Drechsler
Categories: Business & Economics
Type: BOOK - Published: 2024-10-09 - Publisher: CRC Press

Statistical agencies, research organizations, companies, and other data stewards that seek to share data with the public face a challenging dilemma. They need t
Privacy in Statistical Databases
Language: en
Pages: 376
Authors: Josep Domingo-Ferrer
Categories: Computers
Type: BOOK - Published: 2004-06-30 - Publisher: Springer

Privacy in statistical databases is about finding tradeoffs to the tension between the increasing societal and economical demand for accurate information and the
Practical Synthetic Data Generation
Language: en
Pages: 166
Authors: Khaled El Emam
Categories: Computers
Type: BOOK - Published: 2020-05-19 - Publisher: "O'Reilly Media, Inc."

Building and testing machine learning models requires access to large and diverse data. But where can you find usable datasets without running into privacy issu