CalebJKim/DataPilot

🚀 Data Engineer Agent for Observability Tasks

Overview

Welcome to the Data Engineer Agent Project! This project aims to automate data ingestion pipelines and streamline Business Intelligence (BI) reporting using StateFlow and Large Language Models (LLMs). We are also exploring automation for DevOps teams by generating real-time incident response playbooks from observability alerts and codebase knowledge. The goal is an agent that reduces manual intervention and improves system efficiency.

Key Features

  • SQL Query Generation: Translate business-level requests into SQL queries using state-of-the-art LLMs.
  • Business Intelligence Insights: Effortlessly generate BI reports using real-time data.
  • Incident Response Automation: DevOps teams can request data in natural language without needing to know how the underlying data is structured.
  • Enterprise-Grade Robustness: Capacity to scale for multi-database environments while ensuring trustworthiness and accuracy in complex data ecosystems.
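As a rough illustration of the SQL Query Generation feature, the sketch below shows one way a business-level request could be combined with schema context into a single LLM prompt. The function name build_sql_prompt, the prompt wording, and the alerts table are hypothetical and not taken from this repository; the call to the LLM itself is left out.

```python
def build_sql_prompt(schema_ddl: str, request: str) -> str:
    """Combine schema context and a business request into one LLM prompt.

    Hypothetical helper: the prompt wording is an assumption, not the
    project's actual prompt.
    """
    return (
        "You are a data engineer. Using only the tables and columns below, "
        "write a single read-only SQL query that answers the request.\n\n"
        f"Schema:\n{schema_ddl.strip()}\n\n"
        f"Request: {request.strip()}\n"
        "SQL:"
    )


# Hypothetical schema and request, for illustration only.
example_schema = (
    "CREATE TABLE alerts (id INTEGER, service TEXT, severity TEXT, fired_at TEXT);"
)
prompt = build_sql_prompt(
    example_schema, "How many critical alerts fired per service?"
)
print(prompt)
```

Restricting the model to the listed tables and columns is a common way to reduce invented column names, though it does not eliminate them.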

Why This Matters

Managing data pipelines and handling BI requests can be time-consuming. With our Data Engineer Agent, we aim to achieve:

  • 📊 Instant Data Reports: LLMs do the heavy lifting to provide quick, actionable insights.
  • 🤖 Automated Playbooks: Real-time insights and recommendation playbooks for Data Analysis and DevOps, generated on the fly.
  • 🔍 Less Manual Work: Automated workflows mean you focus on strategy, not maintenance.

Research Focus

  1. Automating Data Pipelines: Ensuring seamless data ingestion and management.
  2. Streamlining BI Reporting: Transforming natural language requests into SQL queries.
  3. Incident Response Automation: Generating playbooks from real-time observability data.
  4. Tackling LLM Limitations: Mitigating hallucinations and ensuring data trustworthiness.
  5. Enterprise Integration: Scaling across multi-cloud and complex data environments.
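One possible guard against hallucinated tables or columns (item 4) is to compile a generated query against the real schema before running it. The sketch below assumes a SQLite database and uses Python's standard sqlite3 module; validate_sql and the alerts table are hypothetical names, and the checks are deliberately conservative rather than complete.

```python
import sqlite3
from typing import Tuple


def validate_sql(conn: sqlite3.Connection, sql: str) -> Tuple[bool, str]:
    """Return (ok, reason) for a candidate query without fetching any data."""
    statement = sql.strip().rstrip(";").strip()
    if not statement.lower().startswith("select"):
        return False, "only SELECT statements are allowed"
    # Conservative: this also rejects a ';' inside a string literal.
    if ";" in statement:
        return False, "multiple statements are not allowed"
    try:
        # EXPLAIN compiles the statement; unknown tables or columns raise here.
        conn.execute(f"EXPLAIN {statement}")
    except sqlite3.Error as exc:
        return False, str(exc)
    return True, "ok"


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alerts (id INTEGER, service TEXT, severity TEXT)")

good = validate_sql(conn, "SELECT service, COUNT(*) FROM alerts GROUP BY service;")
bad = validate_sql(conn, "SELECT hostname FROM alerts")  # column does not exist
print(good, bad)
```

A failed check could be fed back to the LLM as an error message for a retry. This catches schema mismatches only; a query can pass and still answer the wrong question.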

🧠 Built On Top of Prior Work

We build on prior work in data query languages and visualization frameworks for natural language-to-SQL translation, and aim to address its limitations in handling complex queries and dynamic schemas.

Run it yourself

  1. In the frontend directory, run npm start to launch the Electron app interface.
  2. In the main directory, run python app.py to launch the Flask server.
  3. Query the database through natural language in the frontend interface.
  4. Swap out the dataset for your use case and update the database schema context to match.
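For step 4, the schema context can often be regenerated from the database itself instead of being written by hand. The sketch below assumes the swapped-in dataset is a SQLite database, which this README does not confirm; schema_context and the incidents table are hypothetical names.

```python
import sqlite3


def schema_context(conn: sqlite3.Connection) -> str:
    """Collect the CREATE TABLE statements of all user tables as one string."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' AND sql IS NOT NULL "
        "ORDER BY name"
    ).fetchall()
    return "\n".join(f"{ddl};" for (ddl,) in rows)


# Hypothetical dataset, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (id INTEGER PRIMARY KEY, service TEXT, opened_at TEXT)"
)
context = schema_context(conn)
print(context)
```

The resulting string can be pasted or loaded wherever the app keeps its schema context, so the prompt stays in sync with the dataset.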
