DataLab is a versatile toolkit designed to simplify data exploration, analysis, and visualization for data scientists.

surajwate/DataLab

suraj_datalab

suraj_datalab is a Python package that streamlines the analysis and visualization of both categorical and numerical data. It also includes utilities for cleaning data and preparing datasets for machine learning, such as creating K-Folds for cross-validation.

Features

  • Categorical Analysis: Effortlessly analyze and visualize categorical data in relation to target variables.
  • Numerical Analysis: Detect, analyze, and visualize outliers in numerical data.
  • Data Cleaning: Automatically handle rare categories in your datasets.
  • Cross-Validation Preparation: Create K-Folds for both classification and regression tasks, including stratified K-Folds.
  • Visualization: Built-in support for generating insightful plots with minimal code.
  • Extensible: Designed with flexibility in mind, allowing easy extension and integration with other data processing workflows.
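To make the data cleaning and cross-validation features above more concrete, here is a minimal sketch in plain pandas of what such steps can look like. It is not suraj_datalab's implementation or API: the function names `group_rare_categories` and `add_stratified_folds`, their parameters, and the `kfold` column name are all hypothetical. The fold assignment is a simple round-robin within each target class, a stand-in for a library routine such as scikit-learn's `StratifiedKFold`.

```python
import pandas as pd


def group_rare_categories(series: pd.Series, min_count: int = 2,
                          other: str = "Other") -> pd.Series:
    """Replace categories seen fewer than `min_count` times with `other`.

    Hypothetical helper for illustration, not part of suraj_datalab.
    """
    counts = series.value_counts()
    rare = counts[counts < min_count].index
    # Keep values that are not rare; substitute the placeholder elsewhere.
    return series.where(~series.isin(rare), other)


def add_stratified_folds(df: pd.DataFrame, target: str,
                         n_splits: int = 5, seed: int = 42) -> pd.DataFrame:
    """Add a `kfold` column that spreads each target class across the folds.

    Hypothetical helper for illustration. Assumes `df` has a unique index.
    """
    # Shuffle rows, then number the rows inside each class: 0, 1, 2, ...
    shuffled = df.sample(frac=1, random_state=seed)
    folds = shuffled.groupby(target).cumcount() % n_splits
    out = df.copy()
    out["kfold"] = folds  # assignment aligns on the index
    return out


cleaned = group_rare_categories(pd.Series(["A", "A", "B", "C", "A"]))
toy = pd.DataFrame({"x": range(6), "label": [0, 0, 0, 1, 1, 1]})
folded = add_stratified_folds(toy, "label", n_splits=3)
```

Storing the fold number as a column keeps the split reproducible: a training loop can select `folded[folded["kfold"] != k]` for training and `folded[folded["kfold"] == k]` for validation.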

Installation

Requirements

Install via pip

pip install suraj_datalab

Quickstart

Here’s how you can quickly get started with suraj_datalab:

import pandas as pd
from suraj_datalab.analysis import analyze_categorical_feature, analyze_numerical_feature

# Sample DataFrame
data = {'Feature': ['A', 'B', 'A', 'B'], 'Transported': [True, False, True, False]}
df = pd.DataFrame(data)

# Analyze categorical feature
result = analyze_categorical_feature(df, 'Feature', 'Transported')
print(result)
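The return value of `analyze_categorical_feature` is not documented here, so the following is only a rough sketch of the kind of summary a categorical-versus-target analysis typically produces, written with plain pandas. The column names `count` and `target_rate` are assumptions chosen for illustration and may not match the package's actual output.

```python
import pandas as pd

# Same sample data as in the Quickstart
df = pd.DataFrame({
    "Feature": ["A", "B", "A", "B"],
    "Transported": [True, False, True, False],
})

# For each category: how many rows it has and the share with a True target.
# Column names here are illustrative, not the package's output format.
summary = (
    df.groupby("Feature")["Transported"]
    .agg(count="size", target_rate="mean")
    .reset_index()
)
print(summary)
```

On this sample every "A" row has `Transported` set to True and every "B" row has it set to False, so the two categories sit at opposite ends of the target rate.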

Usage

For detailed usage instructions, please refer to the Usage Guide.

Examples

Check out the Examples section for practical examples of how to use the functions and classes provided by suraj_datalab.

API Reference

For a detailed reference of all available functions and classes, see the API Reference.

Contributing

Contributions are welcome! Please read the Contributing Guidelines for more details.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Thanks to all contributors who have helped with this project.

Contact

For any questions or suggestions, please contact Suraj Wate.
