AI Document Analysis by LLM (Multimodal)

Introduction

Business / Use Case

We often download many documents from the internet and then spend a lot of time reading them to understand their content.

This project uses an LLM to help users analyze these documents, understand their content, and reach the relevant information faster.

This project is divided into three parts:

- Document data extraction algorithms
- Data analysis algorithms for documents
- Data retrieval algorithms for documents (RAG + LLM)

Technology used in this project

Document Data Extraction (Unstructured Document Preprocessing)

Because input documents are complex and can include tables and images (charts), I will use several AI models, such as OCR, computer vision models, vision transformers, layout transformers, and embedding models, to extract and analyze content from documents such as bank statements.
Complex layout and content format analysis by ML model
Use an advanced rule-based model or a machine learning model to:
- Group and reorganize the data into a user-friendly format. (No experience yet building rules to group data.)
- Identify common denominators and create headers for each group. (No experience yet.)
- Display only the differences between similar items (e.g., window sizes, owners) as line items below each header.
- Automate the process using AI, enabling the system to self-learn and understand the data structure.
- Extract relevant data from PDFs with different layouts and formats.
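
The grouping idea in the list above (shared values become a header, differences become line items) can be illustrated with a small rule-based sketch. This is not the project's implementation: the function name `group_records`, the `group_keys` parameter, and the sample fields are all hypothetical, and a learned model could replace the hand-picked grouping keys.

```python
from collections import defaultdict


def group_records(records, group_keys):
    """Group records by the given keys and split each group into a shared
    header and per-record line items that hold only the differences."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec.get(k) for k in group_keys)
        groups[key].append(rec)

    result = []
    for members in groups.values():
        first = members[0]
        # A field goes into the header when every member has the same value.
        header = {
            k: v
            for k, v in first.items()
            if all(k in m and m[k] == v for m in members)
        }
        # Each line item keeps only the fields that are not in the header.
        items = [
            {k: v for k, v in m.items() if k not in header}
            for m in members
        ]
        result.append({"header": header, "items": items})
    return result


if __name__ == "__main__":
    # Hypothetical sample data, loosely based on the "window sizes, owners" example.
    sample = [
        {"item": "window", "material": "oak", "size": "60x90", "owner": "A"},
        {"item": "window", "material": "oak", "size": "80x120", "owner": "B"},
        {"item": "door", "material": "pine", "size": "90x200", "owner": "A"},
    ]
    for group in group_records(sample, ["item"]):
        print(group["header"])
        for line in group["items"]:
            print("   ", line)
```

Note that a group with a single member ends up with every field in the header and one empty line item, which may or may not be the desired display.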

Document Analysis

Document type classification
Document content summarization
Intelligent data extraction with structured data output
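
As a rough illustration of what a structured analysis result could look like, the sketch below pairs a document type with a summary and extracted fields and serializes them to JSON. Everything here is an assumption for illustration: the `DocumentAnalysis` class, the keyword lists, and the first-sentence "summary" are placeholders standing in for a trained classifier or an LLM call.

```python
import json
from dataclasses import asdict, dataclass, field

# Hypothetical keyword lists; a real system would use an ML model or an LLM.
KEYWORDS = {
    "bank_statement": ["account", "balance", "transaction"],
    "invoice": ["invoice", "amount due", "bill to"],
}


@dataclass
class DocumentAnalysis:
    doc_type: str
    summary: str
    fields: dict = field(default_factory=dict)


def classify(text):
    """Pick the label whose keywords occur most often, or "unknown"."""
    lowered = text.lower()
    scores = {
        label: sum(lowered.count(word) for word in words)
        for label, words in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def analyze(text):
    """Build a structured result; the summary is just the first sentence."""
    first_sentence = text.strip().split(".")[0].strip()
    return DocumentAnalysis(doc_type=classify(text), summary=first_sentence)


if __name__ == "__main__":
    result = analyze("Invoice 42 for ACME. Amount due: 10 USD")
    print(json.dumps(asdict(result), indent=2))
```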

Vector database: use a vector database to store document content converted into embedding vectors; the vector database can then find similar documents.
Retrieval-augmented generation (RAG) with multimodal support: used to query local documents; LangChain is used for question answering over local documents.
LLM: the first version will use the Google Gemini API; later versions will try different open LLMs (e.g. Llama 3, mi).
Supported documents: the first version supports only PDF files; later versions will add Word and Excel files, and possibly image-based documents.
Frontend UI: the first version will use Streamlit; later versions will be full stack with a backend RESTful API.
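
To make the retrieval step more concrete, here is a dependency-free sketch of the idea behind a vector database and RAG: embed text, rank stored chunks by cosine similarity to the query, and place the best matches in the prompt. It deliberately does not use LangChain, Gemini, or a real vector database; the bag-of-words `embed` function, the `InMemoryVectorStore` class, and the prompt wording are hypothetical stand-ins.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Keeps (text, vector) pairs in a list and searches them linearly."""

    def __init__(self):
        self._entries = []

    def add(self, text):
        self._entries.append((text, embed(text)))

    def search(self, query, k=2):
        q = embed(query)
        ranked = sorted(
            self._entries,
            key=lambda entry: cosine_similarity(q, entry[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


def build_prompt(question, store, k=2):
    """Assemble the text that would be sent to the LLM."""
    context = "\n".join(store.search(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    store = InMemoryVectorStore()
    store.add("The account balance is 500 dollars")
    store.add("The cat sat on the mat")
    store.add("Interest rate for the savings account")
    print(build_prompt("what is the account balance", store, k=1))
```

A real embedding model captures meaning rather than exact word overlap, so the ranking here is only a rough analogy for what the vector database would return.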

Installation and Setup

Use requirements.txt to install the package dependencies.
You can set up a virtual environment with venv.
Add your Google API key to the .env file as an environment variable.
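
For readers unsure how a .env file reaches the application, the sketch below shows one way to load `KEY=VALUE` lines into the process environment using only the standard library. It is an illustration, not the project's loader: the helper `load_env_file` is hypothetical, and the variable name `GOOGLE_API_KEY` is an assumption, since this README does not state which name the code expects.

```python
import os


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env file and export them.

    Variables that already exist in the environment are left unchanged.
    Returns the pairs that were parsed from the file.
    """
    values = {}
    if not os.path.exists(path):
        return values
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Skip blanks, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


if __name__ == "__main__":
    load_env_file()
    # GOOGLE_API_KEY is an assumed name; check the source code for the real one.
    if os.environ.get("GOOGLE_API_KEY"):
        print("API key loaded")
    else:
        print("API key not found in .env or environment")
```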

Run Application

To run the development version, go to the "dev" folder.
To run the demo version, go to the "demo" folder.

Run the following command to start the application:

streamlit run apps.py

Upload the documents you want to query.
Type a chat prompt to query across the uploaded documents.

johnsonhk88/AI-Document-Analysic-By-LLM
