Open source data anonymization and synthetic data orchestration for developers. Create high fidelity synthetic data and sync it across your environments.
Updated Nov 16, 2024 - Go
Synthetic data generation for tabular data
Conditional GAN for generating synthetic tabular data.
A library to model multivariate data using copulas.
Genalog is an open source, cross-platform Python package for generating synthetic document images with custom degradations and text alignment capabilities.
Unity's privacy-preserving human-centric synthetic data generator
[IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
Synthetic Data Generation for mixed-type, multivariate time series.
(SIGCOMM '22) Practical GAN-based Synthetic IP Header Trace Generation using NetShare
This tool helps automate the generation of grammatically valid synthetic code-mixed data by applying linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.
[TMLR] "GraphMaker: Can Diffusion Models Generate Large Attributed Graphs?"
A toolset for testing data classification engines. It generates mock data in various file formats, sizes, and data profiles.
[ACL 2024 Findings] This is the code for our paper "Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models".
Unity's Privacy-Preserving Novel Human Body Model Trained Solely on Synthetic Data and Corresponding Dense Anthropometric Measurements
Codebase for "Generating multivariate time series with COmmon Source CoordInated GAN (COSCI-GAN)"
[ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling
Scripts for data generation using Blender and 3D datasets like Matterport3D.
This UI is a synthetic ASR dataset generator powered by and built for OpenAI Whisper. It lets users capture audio, transcribe it on the fly, and manage the generated dataset 🤗. Use it to fine-tune Whisper on enhanced and custom datasets.
TensorFlow 2 implementation of Wasserstein Conditional GAN with Gradient Penalty (WCGAN-GP) for synthetic data generation
A testbed for agents and environments that can automatically improve models through data generation.