☕ Grab a coffee while we automate your EDA


EDA AUTOMATED
Privacy First 🔒
1. Your file is processed entirely in memory and never stored.
2. All data is cleared automatically after analysis.
3. We use dynamic rendering: nothing is saved to disk or to a database.

📁 Upload Your Dataset

Supported formats: .csv, .xlsx, .xls, .json, .feather

Max file size: 500.0MB
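
One way to read any of the supported formats without touching disk is to wrap the uploaded bytes in a buffer and pick a reader from the file extension. This is a minimal sketch assuming pandas, not the app's actual code: `load_upload`, `READERS`, and `MAX_BYTES` are hypothetical names, and the Excel and Feather readers need optional engines (for example openpyxl, xlrd, or pyarrow) to be installed.

```python
import io
from pathlib import Path

import pandas as pd

# Assumed interpretation of the 500.0MB limit (hypothetical constant).
MAX_BYTES = 500 * 1024 * 1024

# Map each supported extension to the pandas reader that handles it.
READERS = {
    ".csv": pd.read_csv,
    ".xlsx": pd.read_excel,
    ".xls": pd.read_excel,
    ".json": pd.read_json,
    ".feather": pd.read_feather,
}


def load_upload(filename: str, payload: bytes) -> pd.DataFrame:
    """Parse uploaded bytes into a DataFrame entirely in memory."""
    if len(payload) > MAX_BYTES:
        raise ValueError("file exceeds the maximum upload size")
    suffix = Path(filename).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"unsupported format: {suffix or filename}")
    # BytesIO gives the reader a file-like object, so nothing is written to disk.
    return READERS[suffix](io.BytesIO(payload))
```

For example, `load_upload("sales.csv", raw_bytes)` would return a DataFrame, while an unsupported extension raises `ValueError`.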

🧠 How It Works

This app runs a full data profiling and quality audit using statistical methods, all locally in your browser session. No code, no setup, just results.

  • 📋 Comprehensive Column Classification
    Each column is automatically categorised as numeric, boolean, categorical, datetime, or timeseries. Timeseries checks include monotonicity and time span detection.
  • 📊 Descriptive Statistics
    For every numeric column, we compute mean, standard deviation, min, max, variance, skewness, and kurtosis. Outliers are detected using Z-scores with thresholds at 3σ, 4σ, and 5σ.
  • 📈 Distribution Diagnostics
    Skewed columns are flagged and classified (moderate or severe). You'll get transformation suggestions like log, Box-Cox, and Yeo-Johnson, with histograms for visual inspection.
  • 🚨 Outlier Detection & Visualisation
    Outliers are split by severity level and shown with plots. This helps identify influential points or data errors before they affect your models.
  • 🧪 Data Quality Audit
    The app automatically checks for:
    • ✔️ Missing data, graded by severity (0–29% or 30%+ missing)
    • ✔️ Fully null columns
    • ✔️ Duplicate rows
    • ✔️ Constant and low-variance columns (including categorical and boolean)
    • ✔️ High- and medium-cardinality features
    • ✔️ Imbalanced boolean features (over 70% one class)
  • 🔗 Correlation Analysis
    Computes Pearson correlation for numeric features and Cramér's V for categoricals. Visualises numeric correlations as a heatmap and flags highly collinear pairs.
  • 🧮 Multicollinearity Detection (VIF)
    Applies preprocessing (null filtering, constant drop, imputation), then calculates Variance Inflation Factors. Warns about numeric features whose VIF exceeds 5 or 10.
  • 💡 Actionable Insights
    Each issue comes with a recommendation, complete with example code, severity badges, and justifications, so you can clean data efficiently.
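
The Z-score outlier tiers described above can be sketched roughly as follows. This is an illustration only, assuming NumPy and a column with missing values already removed; `zscore_outlier_tiers` is a hypothetical name, and assigning each point to the highest threshold it exceeds is one possible reading of "split by severity level".

```python
import numpy as np


def zscore_outlier_tiers(values, thresholds=(3.0, 4.0, 5.0)):
    """Group outlier indices by the highest sigma threshold each one exceeds."""
    x = np.asarray(values, dtype=float)
    tiers = {t: [] for t in thresholds}
    if x.size == 0:
        return tiers
    std = x.std(ddof=0)
    if std == 0:
        return tiers  # a constant column has no Z-score outliers
    z = np.abs((x - x.mean()) / std)
    for i, score in enumerate(z):
        exceeded = [t for t in thresholds if score > t]
        if exceeded:
            tiers[max(exceeded)].append(i)
    return tiers
```

Note that Z-scores in very small samples can never reach 3, so a check like this only becomes informative once a column has a reasonable number of rows.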
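
For the skewness check, a rough sketch might look like the code below. The cut-offs of 0.5 and 1.0 are a common rule of thumb and are an assumption here, since the page does not state the app's thresholds; `skew_report` is a hypothetical name. The suggestions only reflect the sign constraints of each transform: log and Box-Cox need strictly positive values, while Yeo-Johnson accepts any real values.

```python
import numpy as np

# Assumed cut-offs (rule of thumb, not taken from the app).
MODERATE, SEVERE = 0.5, 1.0


def skew_report(values):
    """Classify skewness and list transforms that are valid for the data."""
    x = np.asarray(values, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    if m2 == 0:
        return {"skewness": 0.0, "level": "none", "suggestions": []}
    # Population (Fisher-Pearson) skewness: third moment over variance^1.5.
    skewness = float(np.mean(centered ** 3) / m2 ** 1.5)
    if abs(skewness) >= SEVERE:
        level = "severe"
    elif abs(skewness) >= MODERATE:
        level = "moderate"
    else:
        level = "none"
    suggestions = []
    if level != "none":
        if x.min() > 0:
            suggestions += ["log", "box-cox"]
        suggestions.append("yeo-johnson")
    return {"skewness": skewness, "level": level, "suggestions": suggestions}
```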
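
Several of the data quality checks listed above map onto a few lines of pandas. The sketch below covers only a subset (missing data severity, fully null columns, duplicate rows, constant columns, and imbalanced booleans); `quality_audit` and the flag names are hypothetical, and boolean columns that contain missing values are not detected because pandas stores them with object dtype.

```python
import pandas as pd


def quality_audit(df: pd.DataFrame) -> dict:
    """Return duplicate-row count and a list of quality flags per column."""
    report = {"duplicate_rows": int(df.duplicated().sum()), "columns": {}}
    for name in df.columns:
        col = df[name]
        flags = []
        missing = col.isna().mean()  # fraction of missing values
        if missing == 1.0:
            flags.append("fully_null")
        elif missing >= 0.30:
            flags.append("missing_high")  # 30%+ missing
        elif missing > 0:
            flags.append("missing_low")   # under 30% missing
        present = col.dropna()
        if len(present) > 0 and present.nunique() == 1:
            flags.append("constant")
        elif len(present) > 0 and pd.api.types.is_bool_dtype(present):
            # Share of the most frequent class; over 70% counts as imbalanced.
            top_share = present.value_counts(normalize=True).iloc[0]
            if top_share > 0.70:
                flags.append("imbalanced_boolean")
        report["columns"][name] = flags
    return report
```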
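
Cramér's V, mentioned under Correlation Analysis, can be computed from a chi-squared statistic over the contingency table of two categorical columns. The sketch below uses only the standard library and the plain (uncorrected) formula; whether the app applies a bias correction is not stated, and `cramers_v` is a hypothetical name.

```python
import math
from collections import Counter


def cramers_v(a, b):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    n = len(a)
    a_counts, b_counts = Counter(a), Counter(b)
    pair_counts = Counter(zip(a, b))
    dof = min(len(a_counts), len(b_counts)) - 1
    if n == 0 or dof <= 0:
        return 0.0  # association is undefined when one variable is constant
    chi2 = 0.0
    for x, nx in a_counts.items():
        for y, ny in b_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * dof))
```

A value near 0 suggests little association and a value near 1 suggests that one column almost determines the other.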
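
The VIF for a feature is 1 / (1 - R²), where R² comes from regressing that feature on all the others. A minimal NumPy sketch is shown below, assuming the preprocessing described above has already happened (no missing values, no constant columns); `vif_scores` is a hypothetical name and not the app's implementation.

```python
import numpy as np


def vif_scores(X):
    """Return one VIF per column of a 2-D numeric array without NaNs."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    scores = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # add an intercept
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residual = y - design @ coef
        ss_res = float(residual @ residual)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        # VIF = 1 / (1 - R^2) = ss_tot / ss_res; a perfect fit gives infinity.
        scores.append(ss_tot / ss_res if ss_res > 0 else float("inf"))
    return scores
```

Columns whose score exceeds 5 or 10 would then be the ones to warn about.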

🔐 All data stays in memory: nothing is stored or shared. This is a fully stateless, secure analysis workflow.
