Automated Algorithm for Multi-variate Data Synthesis with Cholesky Decomposition

Angel Marchev, Jr; Vasil Marchev

doi:10.1145/3631908.3631909

What is it about?

This article explores methods for generating synthetic data, that is, data created algorithmically for testing and training machine learning models. It distinguishes probabilistic approaches, such as Monte Carlo simulation and Generative Adversarial Networks (GANs), which rely on random sampling, from non-probabilistic methods, such as Inverse Copula Sampling and Cholesky Decomposition, which preserve dependencies and covariance structures. The proposed algorithm generates synthetic data that preserves both the marginal distributions and the correlations of the original data. The generated data is validated against the original distributions using the Kolmogorov-Smirnov (K-S) test, and an empirical example demonstrates the methodology from generation through validation. The work is intended to help researchers and practitioners in data analysis and model development.
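To make the idea concrete, here is a minimal sketch of one way to combine a Cholesky factor with marginal preservation and a K-S check. It is not the authors' algorithm. It assumes a Gaussian-copula-style construction: the correlation matrix is estimated on rank-based normal scores, independent normals are correlated through the Cholesky factor, and each variable is mapped back through the empirical quantiles of the original data. All function names (`normal_scores`, `cholesky`, `synthesize`, `ks_statistic`) and the toy data are hypothetical, and only the two-sample K-S statistic is computed, not its p-value.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def normal_scores(col):
    """Replace each value by the standard-normal quantile of its rank."""
    n = len(col)
    order = sorted(range(n), key=col.__getitem__)
    scores = [0.0] * n
    for rank, idx in enumerate(order):
        scores[idx] = STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def cholesky(a):
    """Lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def synthesize(columns, n_samples, rng):
    """Draw n_samples synthetic rows; returns one list per variable."""
    d = len(columns)
    scores = [normal_scores(c) for c in columns]
    corr = [[pearson(scores[i], scores[j]) for j in range(d)] for i in range(d)]
    low = cholesky(corr)
    quantiles = [sorted(c) for c in columns]  # empirical inverse CDFs
    out = [[] for _ in range(d)]
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]  # independent normals
        for i in range(d):
            x = sum(low[i][k] * z[k] for k in range(i + 1))  # correlated normal
            u = STD_NORMAL.cdf(x)  # uniform on (0, 1)
            idx = min(int(u * len(quantiles[i])), len(quantiles[i]) - 1)
            out[i].append(quantiles[i][idx])  # back to the original marginal
    return out


def ks_statistic(a, b):
    """Two-sample K-S statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


# Toy "original" data (an assumption for illustration): x is normal,
# y is skewed (log-normal) and depends on x.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
y = [math.exp(0.8 * a + 0.6 * rng.gauss(0.0, 1.0)) for a in x]
synthetic = synthesize([x, y], 2000, rng)

print("original correlation: ", round(pearson(x, y), 3))
print("synthetic correlation:", round(pearson(synthetic[0], synthetic[1]), 3))
print("K-S statistics:", [round(ks_statistic(o, s), 3) for o, s in zip([x, y], synthetic)])
```

Estimating the correlation on normal scores rather than on the raw values is a design choice of this sketch: the quantile mapping is nonlinear, so a correlation imposed on the latent normals does not carry over unchanged to skewed marginals. A small K-S statistic for each variable indicates that the synthetic marginal stays close to the original one.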

Why is it important?

Synthetic data is generated algorithmically rather than obtained by direct measurement or collection from the real world. It simulates real-world data and is often used for testing, validating, and training machine learning models or other computational systems.

This page is a summary of: Automated Algorithm for Multi-variate Data Synthesis with Cholesky Decomposition, October 2023, ACM (Association for Computing Machinery),
DOI: 10.1145/3631908.3631909.

Contributors

The following have contributed to this page

PhD Vasil Marchev
University of National and World Economy
