DataMimic.io - Realistic Synthetic Data Generation & No-Code EDA Platform
DataMimic.io is a web-based platform empowering data scientists, developers, and QA engineers to generate realistic synthetic datasets and perform no-code Exploratory Data Analysis (EDA). This project addresses critical challenges in data privacy and accessibility by providing a powerful, intuitive interface to create, analyze, and clean tabular data on demand.
Live Demo: Here
- Pre-defined Schemas: Generate data for common domains like Medical, Finance, Retail, Education, and Automotive.
- Locality-Based Data: Create realistic data for different regions (US, UK, India, Canada, Australia).
- Data Quality Controls: Fine-tune the dataset with adjustable missing value ratios and data variance.
- AI-Powered Custom Columns: Generate entire columns of data from natural language prompts using the Google Gemini API.
- Flexible Export: Download generated data in CSV, JSON, or Excel formats.
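
To make the "adjustable missing value ratio" control concrete, here is a minimal sketch of how a ratio could be applied to generated rows. It is an illustration only, not the project's actual implementation: the function name `inject_missing`, the sample columns, and the `seed` parameter are all hypothetical, and it uses the standard library rather than Pandas or Faker so that it runs anywhere.

```python
import random

def inject_missing(rows, ratio, seed=None):
    """Return a copy of `rows` where each cell is replaced by None
    with probability `ratio` (hypothetical helper, for illustration)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    rng = random.Random(seed)  # seeded generator gives reproducible output
    return [
        {col: (None if rng.random() < ratio else val) for col, val in row.items()}
        for row in rows
    ]

# Made-up sample rows standing in for a generated dataset.
rows = [{"id": i, "city": "Pune", "amount": 10.0 * i} for i in range(1000)]
sparse = inject_missing(rows, ratio=0.2, seed=42)

# Measure how many cells actually ended up missing.
missing = sum(v is None for row in sparse for v in row.values())
print(f"{missing / (len(sparse) * 3):.1%} of cells are missing")
```

Because each cell is dropped independently, the observed share of missing cells is close to the requested ratio but not exactly equal to it.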


- Easy Data Upload: Upload your CSV or XLSX files and get an instant, comprehensive data overview.
- Detailed Summary: View total rows/columns, file size, missing value percentages, and detailed column-wise statistics (mean, median, std dev, etc.).
- Powerful Pre-processing Suite: Clean and transform your data with a few clicks:
  - Missing Value Handling: Remove rows/columns or impute with mean, median, or mode.
  - Duplicate Removal: Eliminate duplicate rows.
  - Column Management: Remove specific columns or change data types.
  - Data Scaling: Apply Min-Max Scaling or Standardization (Z-score).
  - Text Cleaning: Standardize text with uppercase, lowercase, or title case.
- Download Processed Data: Export your cleaned dataset, ready for analysis or model training.
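
The pre-processing operations listed above follow standard formulas. The sketch below shows mean imputation, Min-Max scaling, and Z-score standardization on a single numeric column in plain Python. The function names and the sample `ages` column are hypothetical and chosen for illustration; the application itself relies on Pandas and Scikit-learn, whose internals may differ in detail.

```python
from statistics import mean, pstdev

def impute_mean(values):
    """Replace None entries with the mean of the present values."""
    present = [v for v in values if v is not None]
    fill = mean(present)
    return [fill if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Z-score: subtract the mean, divide by the population std dev."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

ages = [20, None, 40, 60]      # made-up column with one missing value
filled = impute_mean(ages)     # missing entry becomes the mean, 40
scaled = min_max_scale(filled)
z_scores = standardize(filled)
print(filled, scaled, z_scores)
```

The sketch uses the population standard deviation (`pstdev`) for the Z-score, which matches the convention of Scikit-learn's `StandardScaler`.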

- Backend:
  - Framework: Flask
  - Data Manipulation: Pandas, NumPy
  - Data Preprocessing: Scikit-learn
  - Synthetic Data: Faker
  - AI Integration: Google Gemini API (via the requests library)
  - Email: Flask-Mail
- Frontend:
  - HTML5, CSS3, JavaScript (Vanilla JS)
  - Jinja2 Templating
- Deployment:
  - WSGI Server: Gunicorn
  - Hosting: Render.com (Web Service)
To run DataMimic.io on your local machine, follow these steps:
- Python 3.9 or higher
- pip and venv
git clone https://github.com/harsh-kakadiya1/datamimic.io.git
cd datamimic.io
# For Windows
python -m venv venv
venv\Scripts\activate
# For macOS/Linux
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Create a file named .env in the root of the project directory. This file stores your secret keys and credentials.
SECRET_KEY='a_very_strong_and_random_secret_key'
EMAIL_USER='your_email@gmail.com'
EMAIL_PASS='your_gmail_app_password'
GEMINI_API_KEY='your_google_gemini_api_key'
- SECRET_KEY: A long, random string for Flask session security.
- EMAIL_USER / EMAIL_PASS: Your Gmail credentials for the contact form. Use a Google App Password if you have 2-Factor Authentication enabled.
- GEMINI_API_KEY: Your API key from Google AI Studio.
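
As a rough illustration of how an application might consume these values, the sketch below checks that all four keys are present before startup and fails early if one is missing. It assumes the values from .env have already been loaded into the process environment (for example by python-dotenv, which `flask run` picks up when it is installed). The names `REQUIRED_KEYS` and `load_settings` are hypothetical and not taken from the project's code.

```python
import os

# The four keys described above; the tuple name is an assumption.
REQUIRED_KEYS = ("SECRET_KEY", "EMAIL_USER", "EMAIL_PASS", "GEMINI_API_KEY")

def load_settings(env=os.environ):
    """Return the required settings as a dict, or raise if any is
    missing or empty (hypothetical helper, for illustration)."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}

# Usage with a stand-in mapping instead of the real environment.
example_env = {
    "SECRET_KEY": "a_very_strong_and_random_secret_key",
    "EMAIL_USER": "your_email@gmail.com",
    "EMAIL_PASS": "your_gmail_app_password",
    "GEMINI_API_KEY": "your_google_gemini_api_key",
}
settings = load_settings(example_env)
print(sorted(settings))
```

Failing at startup with the list of missing keys tends to be easier to debug than an error that surfaces later, when the contact form or the Gemini integration is first used.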
flask run
The application will be available at http://127.0.0.1:5000.
Harsh Kakadiya - GitHub | LinkedIn
Krish Kunjadiya - GitHub | LinkedIn
This project is licensed under the MIT License - see the LICENSE file for details.