# document-processing

Here are 37 public repositories matching this topic...

awslabs / project-lakechain

⚡ Cloud-native, AI-powered, document processing pipelines on AWS.

aws machine-learning natural-language-processing computer-vision serverless hacktoberfest document-processing aws-cdk generative-ai retrieval-augmented-generation

Updated Jun 12, 2024
TypeScript

formkiq / formkiq-core

A full-featured Document Layer for your application, providing the functionality of a flexible document management system, including storage, discovery, processing, and retrieval. Deploys directly into your Amazon Web Services Cloud. 🌟 Star to support our work!

aws ocr serverless headless cloud-storage document-database amazon-web-services dms document-management optical-character-recognition document-processing document-management-system document-api document-apis intelligent-document-processing document-layer

Updated Jun 14, 2024
Java

ArtemZarubin / XmlDocumentProcessor

XmlDocumentProcessor: A .NET component for XML document processing. It analyzes XML content, performs keyword-based queries, and transforms data into HTML. Emphasizes design patterns like Strategy pattern, with a focus on class diagramming. Implements penalty for non-compliance.

c-sharp dotnet xml document-processing xml-processing

Updated Jun 10, 2024
C#

awslabs / rhubarb

A Python framework for multi-modal document understanding with Amazon Bedrock

multi-modal document-processing generative-ai intelligent-document-processing amazon-bedrock

Updated Jun 6, 2024
Python

aws-solutions / enhanced-document-understanding-on-aws

Enhanced Document Understanding on AWS delivers an easy-to-use web application that ingests and analyzes documents, extracts content, identifies and redacts sensitive customer information, and creates search indexes from the analyzed data.

document-analysis document-processing

Updated May 30, 2024
JavaScript

parsee-ai / parsee-core

Retrieval of fully structured data made easy. Use LLMs or custom models. Specializes in PDFs and HTML files. Extensive support for tabular data extraction and multimodal queries.

structured-data document-processing multimodal llm

Updated May 21, 2024
Python

dayang4321 / MSc-Team-Project-CMPU9010-2023-24-Group-3

TU Dublin Computer Science MSc. Final Project Group 3 - Accessibilator

accessibility social-good document-processing

Updated May 6, 2024
Jupyter Notebook

rina-reimer / uwb-hacks-ai-local

AI-powered chatbot designed to simplify the job search process

resume ai job-search document-processing

Updated Apr 28, 2024
TypeScript

jmanhype / DSPy-Multi-Document-Agents

An advanced distributed knowledge fabric for intelligent document processing, featuring multi-document agents, optimized query handling, and semantic understanding.

nlp distributed-systems ai query-optimization knowledge-management document-processing vector-search

Updated Apr 23, 2024
Python

johnsirmon / clearcouncil

ClearCouncil: Automated tools for collecting, organizing, and embedding publicly available local state county council documents (minutes, agendas) into LLMs. Python, JS, and wget scripts included for easy data retrieval and integration.

local-government wget open-data openai civic-tech gpt data-retrieval document-processing transparency-enhancing-technologies langchain langchain-python retrieval-augmented-generation

Updated Apr 16, 2024
Python

CentralFloridaAttorney / zmongo_retriever

Use data from MongoDB in LangChain, Llama and OpenAI

python mongo machine-learning database mongodb openai data-retrieval document-processing langchain llamacpp data-chunking

Updated Mar 31, 2024
Python

abdur75648 / urdu-text-detection

Text line detection for Urdu OCR (UTRNet)

ocr text-detection document-processing urdu-text-detection urdu-ocr utrnet contournet

Updated Jan 31, 2024
Python

eklem / stopword-trainer

A module for creating stopword lists for any language, based on a set of documents.

nlp information-retrieval stopwords document-processing stopwords-removal

Updated Nov 13, 2023
JavaScript

m4nd0mb3 / document-templater

Document Templater is a powerful tool for automated document generation. Streamline the process of creating standard documents, such as contracts, reports, and forms, using predefined templates. This repository contains the source code for Document Templater, allowing you to easily integrate this functionality into your projects and automate docs.

api automation integration backend forms templates swagger expressjs reports contracts pdf-generation word-documents swagger-api expressjs-api document-generation expressjs-server document-processing swagger3

Updated Sep 19, 2023
JavaScript

cemonal / Pdf2xNet

Pdf2xNet is a .NET library for seamless integration with Xpdf tools, enabling easy conversion of PDF documents to text, images, and HTML formats within your .NET applications.

html pdf library png text images conversion pdf-converter document-processing xpdf conversion-tool xpdf-utils

Updated Sep 4, 2023
C#

Oneirocom / generative-intent-detection

Generative intent detection with Magick

machine-learning document-processing investment-analysis

Updated Aug 14, 2023
TypeScript

fonckchain / pdf-text-converter

Python tool for converting PDF files to text. Simplify your document processing tasks.

python automation pdf-converter text-extraction document-processing

Updated Jul 14, 2023
Python

x1ao4 / doc-merger

Merge two relatively incomplete documents into one complete document via a Python script.

merge data-analysis documents filtering document-analysis document-processing document-comparison filtering-data data-merging merge-documents

Updated Jul 11, 2023
Python

jackvaughan09 / phil

Minimize the time required for audit report analysis with a containerized file conversion and scraping system

docker automation data-engineering document-processing

Updated Apr 3, 2023
Jupyter Notebook

cburschka / lyx

Unofficial mirror of git://git.lyx.org/lyx.git (updates daily; not affiliated with lyx.org)

latex mirror lyx document-processing

Updated Mar 21, 2023
C++
