Document Management and Query Application

Overview

This project is a full-stack, microservices-based application for secure document upload, parsing, indexing, and querying. Users can query documents of various types (PDF, PPT, CSV, etc.) in natural language; retrieval-augmented generation (RAG) agents ground each answer in the relevant document content. The application is designed for scalability, efficient processing, and secure access, with enterprise-level use in mind.

Demo Video

https://www.loom.com/share/352ed166336042c0a1dd752654fb17ad

Features

User Authentication: Secure login and registration using JSON Web Tokens (JWT) to protect user data.
Document Upload and Management: Supports multiple file types with storage in AWS S3, allowing metadata tracking and categorization.
Advanced Document Parsing: Utilizes unstructured.io to parse documents and extract meaningful metadata, making content easily retrievable.
NLP Querying with RAG Agents: Implements RAG agents to generate accurate, context-sensitive answers to user queries based on document content.
Search and Indexing: Uses Elasticsearch for indexing parsed content, enabling fast, efficient search capabilities.
Caching and Status Management: Redis is used for caching document statuses and managing service health.
Logging and Monitoring: Integrates with the ELK Stack (Elasticsearch, Logstash, Kibana) for centralized logging and supports Prometheus and Grafana for real-time monitoring.
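
To make the token-based authentication above concrete, here is a minimal sketch of how an HS256-signed JWT can be issued and verified. It uses only the Python standard library for illustration. The project's login service is built with NestJS and most likely relies on a JWT library, so the function names (`sign_token`, `verify_token`) and the claim layout are assumptions, not the project's code.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without trailing "=" padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(payload: dict, secret: bytes) -> str:
    """Return a compact JWT (header.payload.signature) signed with HS256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return signing_input + "." + _b64url(_signature(signing_input, secret))


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the payload if the signature and expiry are valid, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = _signature(header_b64 + "." + payload_b64, secret)
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    # "exp" is seconds since the epoch; the token is valid only before it.
    if payload.get("exp", float("inf")) <= current:
        raise ValueError("token expired")
    return payload
```

A service that receives a request would call `verify_token` with the shared secret and reject the request on `ValueError`. In a real deployment the secret would come from configuration, never from source code.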

Technology Stack

Frontend: Next.js
Backend Services:
- NestJS for Login and Document Management (DMS) services
- Flask for Indexing and Query Answering (QA) services
Storage: AWS S3 for file storage
Databases:
- PostgreSQL for metadata storage
- Redis for caching
Document Parsing: unstructured.io for advanced parsing
NLP Processing: LangChain/LlamaIndex and RAG agents
Search Engine: Elasticsearch
Containerization and Orchestration: Docker and Kubernetes
Logging and Monitoring: ELK Stack for logging, with optional Prometheus and Grafana for monitoring

Architecture

The application follows a microservices architecture, with each service handling a distinct function, ensuring modularity, scalability, and fault tolerance. Services communicate over gRPC, REST APIs, and WebSockets as required. Each component is containerized using Docker and orchestrated with Kubernetes for deployment.

Key Components

Frontend: Built with Next.js, providing a user-friendly interface for login, file upload, and query submission.
Login Service: Manages user authentication and authorization using NestJS and JWT.
DMS (Document Management) Service: Handles file uploads to AWS S3 and manages file metadata in PostgreSQL.
Indexing Service: Retrieves files from S3, parses content using unstructured.io, and indexes it in Elasticsearch.
QA (Query Answering) Service: Uses RAG agents to process user queries, retrieving relevant indexed content and generating responses from it.
Caching and Status Management: Redis is used to cache document processing status and manage inter-service communication.
Logging and Monitoring: ELK Stack is used for logging, with optional Prometheus and Grafana for application performance monitoring.
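
The document status caching described above can be sketched as a small key-value cache with expiry. This is an in-memory stand-in written for illustration: the status names, the `StatusCache` class, and the TTL are all assumptions. With Redis, the same behaviour would map onto `SET` with an expiry (the `EX` option) and `GET`.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Hypothetical lifecycle states for a document; the project's real states may differ.
STATUSES = ("uploaded", "parsing", "indexed", "failed")


class StatusCache:
    """In-memory stand-in for a Redis-backed document status cache."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[str, float]] = {}

    def set_status(self, document_id: str, status: str) -> None:
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status!r}")
        # Store the status together with the time at which it expires.
        self._entries[document_id] = (status, self._clock() + self._ttl)

    def get_status(self, document_id: str) -> Optional[str]:
        entry = self._entries.get(document_id)
        if entry is None:
            return None
        status, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._entries[document_id]
            return None
        return status
```

A cache miss (`None`) would signal the caller to fall back to the authoritative record, which in this architecture is the metadata stored in PostgreSQL.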

Deployment

All services are containerized with Docker and managed with Kubernetes. The deployment supports scaling, load balancing, and high availability. Kubernetes manifests or Helm charts are provided to ease the deployment process on any Kubernetes platform (e.g., Minikube for local development, AWS EKS for cloud deployments).

Documentation

https://island-wool-188.notion.site/AI-Planet-138e6e41cfdf803b98c8d10b6592a83d

Key Deployment Features

Docker: Each service has its own Dockerfile for containerization.
Kubernetes: Manages the deployment, scaling, and orchestration of services.
Logging Sidecar: A logging sidecar runs alongside each service container, in the same pod, and ships its logs to the ELK Stack for central aggregation.
Optional Monitoring: Prometheus and Grafana are configured to collect and visualize metrics, helping to monitor application health and performance.
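
Centralized logging is easier when every service emits structured records. The sketch below shows one way a Python service could write one JSON object per line, a format that log shippers such as Logstash can parse without custom patterns. The field names and the `build_logger` helper are assumptions for illustration; how the sidecar actually collects the output is not specified here.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def __init__(self, service: str):
        super().__init__()
        self._service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "service": self._service,  # lets one index hold logs from many services
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(service: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream (stdout by default)."""
    logger = logging.getLogger(service)
    logger.handlers.clear()  # avoid duplicate output if called more than once
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter(service))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger's handlers
    return logger
```

Calling `build_logger("indexing-service").info("indexed %s", document_id)` would then produce a record that can be filtered by the `service` and `level` fields in Kibana.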

Getting Started

Prerequisites

To set up and run this application, ensure you have the following installed or available:

Docker
Kubernetes (Minikube, or a managed service such as AWS EKS or GKE)
Redis
PostgreSQL
AWS S3 or an equivalent file storage solution
Elasticsearch
Node.js (for the frontend)

Installation and Setup

Clone the repository: Download or clone this project from GitHub.
Configure Environment Variables: Update environment variables for AWS, PostgreSQL, Redis, and Elasticsearch.
Build Docker Images: Use the provided Dockerfiles to build images for each service.
Deploy with Kubernetes: Use Kubernetes manifests or Helm charts to deploy the application in your Kubernetes cluster.
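
For the environment variable step above, a service can validate its configuration at startup and fail fast when something is missing. The sketch below is illustrative only: the variable names (`AWS_S3_BUCKET`, `POSTGRES_URL`, `REDIS_URL`, `ELASTICSEARCH_URL`) are hypothetical, so check each service's own configuration for the names it actually reads.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Hypothetical variable names; the project's services may use different ones.
REQUIRED_VARS = ("AWS_S3_BUCKET", "POSTGRES_URL", "REDIS_URL", "ELASTICSEARCH_URL")


@dataclass(frozen=True)
class Settings:
    s3_bucket: str
    postgres_url: str
    redis_url: str
    elasticsearch_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read required settings from the environment, reporting all missing names at once."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        s3_bucket=env["AWS_S3_BUCKET"],
        postgres_url=env["POSTGRES_URL"],
        redis_url=env["REDIS_URL"],
        elasticsearch_url=env["ELASTICSEARCH_URL"],
    )
```

Reporting every missing variable in a single error saves a round of redeploys compared with failing on the first one. In Kubernetes these values would typically be supplied through ConfigMaps and Secrets.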

Usage

Authentication: Use the login service to create an account and sign in.
Upload Documents: Upload files in the supported formats (PDF, PPT, CSV, etc.) via the frontend.
Query Documents: Enter queries in natural language, and the application will fetch context-aware answers based on the document content.
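
The query step above follows the retrieve-then-generate pattern. The sketch below illustrates the idea under simplifying assumptions: it swaps Elasticsearch for bag-of-words cosine similarity over an in-memory list of chunks, and it stops at prompt assembly rather than calling a language model. The function names are hypothetical, and the real QA service (built on LangChain/LlamaIndex) will differ.

```python
import re
from collections import Counter
from math import sqrt
from typing import List


def tokenize(text: str) -> List[str]:
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], top_k: int = 2) -> List[str]:
    """Return up to top_k chunks most similar to the query, best first."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(chunk))), chunk) for chunk in chunks]
    # Drop chunks that share no terms with the query.
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in relevant[:top_k]]


def build_prompt(query: str, context: List[str]) -> str:
    """Assemble the prompt that would be sent to the language model."""
    numbered = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(context, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

Restricting the prompt to retrieved chunks is what makes the answer context-aware: the model sees only passages related to the question instead of the whole document set.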

Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements, bug fixes, or feature suggestions.

License

This project is licensed under the MIT License.

Repository: Lakshya0257/RAGAgent-DocumentAI
