
BerriAI/Awesome-LLMOps

 
 


Awesome LLMOps


An awesome & curated list of the best LLMOps tools for developers.

Contribute

Contributions are most welcome. Please adhere to the contribution guidelines.

Table of Contents

Model

Large Language Model

  • Alpaca - Code and documentation to train Stanford's Alpaca models, and generate the data.
  • BELLE - A 7B large language model fine-tuned on a 34B Chinese character corpus, based on LLaMA and Alpaca.
  • Bloom - BigScience Large Open-science Open-access Multilingual Language Model
  • dolly - Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
  • Falcon 40B - Falcon-40B-Instruct is a 40B parameters causal decoder-only model built by TII based on Falcon-40B and finetuned on a mixture of Baize. It is made available under the Apache 2.0 license.
  • FastChat (Vicuna) - An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.
  • GLM-6B (ChatGLM) - An open bilingual pre-trained model, a quantization of ChatGLM-130B that can run on consumer-level GPUs.
  • GLM-130B (ChatGLM) - An Open Bilingual Pre-Trained Model (ICLR 2023)
  • GPT-NeoX - An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
  • Luotuo - A Chinese LLM based on LLaMA and fine-tuned with Stanford Alpaca, Alpaca LoRA, and Japanese-Alpaca-LoRA.
  • StableLM - StableLM: Stability AI Language Models

⬆ back to ToC

CV Foundation Model

  • disco-diffusion - A frankensteinian amalgamation of notebooks, models and techniques for the generation of AI Art and Animations.
  • midjourney - Midjourney is an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species.
  • segment-anything (SAM) - produces high quality object masks from input prompts such as points or boxes, and it can be used to generate masks for all objects in an image.
  • stable-diffusion - A latent text-to-image diffusion model
  • stable-diffusion v2 - High-Resolution Image Synthesis with Latent Diffusion Models

⬆ back to ToC

Audio Foundation Model

  • bark - Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects.
  • whisper - Robust Speech Recognition via Large-Scale Weak Supervision

Serving

Large Model Serving

  • Alpaca-LoRA-Serve - Alpaca-LoRA as Chatbot service
  • DeepSpeed-MII - MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
  • FlexGen - Running large language models on a single GPU for throughput-oriented scenarios.
  • Flowise - Drag & drop UI to build your customized LLM flow using LangchainJS.
  • llama.cpp - Port of Facebook's LLaMA model in C/C++
  • Modelz-LLM - OpenAI compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM and many others)
  • whisper.cpp - Port of OpenAI's Whisper model in C/C++
  • x-stable-diffusion - Real-time inference for Stable Diffusion - 0.88s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention.

⬆ back to ToC

Frameworks/Servers for Serving

  • BentoML - The Unified Model Serving Framework
  • Mosec - A machine learning model serving framework with dynamic batching and pipelined stages, provides an easy-to-use Python interface.
  • TFServing - A flexible, high-performance serving system for machine learning models.
  • Torchserve - Serve, optimize and scale PyTorch models in production
  • Triton Server (TRTIS) - The Triton Inference Server provides an optimized cloud and edge inferencing solution.
  • langchain-serve - Serverless LLM apps on Production with Jina AI Cloud

⬆ back to ToC

Observability

  • Deepchecks - Tests for Continuous Validation of ML Models & Data. Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort.
  • Evidently - Evaluate and monitor ML models from validation to production.
  • Great Expectations - Always know what to expect from your data.
  • whylogs - The open standard for data logging

⬆ back to ToC

LLMOps

  • Arize-Phoenix - ML observability for LLMs, vision, language, and tabular models.
  • deeplake - Stream large multimodal datasets to achieve near 100% GPU utilization. Query, visualize, & version control data. Access data w/o the need to recompute the embeddings for the model finetuning.
  • GPTCache - Creating semantic cache to store responses from LLM queries.
  • Haystack - Quickly compose applications with LLM Agents, semantic search, question-answering and more.
  • langchain - Building applications with LLMs through composability
  • LangFlow - An effortless way to experiment and prototype LangChain flows with drag-and-drop components and a chat interface.
  • LlamaIndex - Provides a central interface to connect your LLMs with external data.
  • promptfoo - Open-source tool for testing & evaluating prompt quality. Create test cases, automatically check output quality and catch regressions, and reduce evaluation cost.
  • Weights & Biases (Prompts) - A suite of LLMOps tools within the developer-first W&B MLOps platform. Utilize W&B Prompts for visualizing and inspecting LLM execution flow, tracking inputs and outputs, viewing intermediate results, and securely managing prompts and LLM chain configurations.
  • xTuring - Build and control your personal LLMs with fast and efficient fine-tuning.
  • ZenML - Open-source framework for orchestrating, experimenting and deploying production-grade ML solutions, with built-in langchain & llama_index integrations.
  • Dify - An open-source framework that aims to enable developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.

⬆ back to ToC

Search

Vector search

  • AquilaDB - An easy to use Neural Search Engine. Index latent vectors along with JSON metadata and do efficient k-NN search.
  • Chroma - the open source embedding database
  • Jina - Build multimodal AI services via cloud native technologies · Neural Search · Generative AI · Cloud Native
  • Marqo - Tensor search for humans.
  • Milvus - Vector database for scalable similarity search and AI applications.
  • Pinecone - The Pinecone vector database makes it easy to build high-performance vector search applications. Developer-friendly, fully managed, and easily scalable without infrastructure hassles.
  • pgvector - Open-source vector similarity search for Postgres.
  • pgvecto.rs - Vector database plugin for Postgres, written in Rust, specifically designed for LLM.
  • Qdrant - Vector Search Engine and Database for the next generation of AI applications. Also available in the cloud
  • txtai - Build AI-powered semantic search applications
  • Vald - A Highly Scalable Distributed Vector Search Engine
  • Vearch - A distributed system for embedding-based vector retrieval
  • Weaviate - Weaviate is an open source vector search engine that stores both objects and vectors, allowing for combining vector search with structured filtering with the fault-tolerance and scalability of a cloud-native database, all accessible through GraphQL, REST, and various language clients.

⬆ back to ToC

Code AI

  • CodeGen - CodeGen is an open-source model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
  • CodeT5 - Open Code LLMs for Code Understanding and Generation.
  • fauxpilot - An open-source alternative to GitHub Copilot server
  • tabby - Self-hosted AI coding assistant. An open-source, on-prem alternative to GitHub Copilot.

Training

IDEs and Workspaces

  • code server - Run VS Code on any machine anywhere and access it in the browser.
  • conda - OS-agnostic, system-level binary package manager and ecosystem.
  • Docker - Moby is an open-source project created by Docker to enable and accelerate software containerization.
  • envd - 🏕️ Reproducible development environment for AI/ML.
  • Jupyter Notebooks - The Jupyter notebook is a web-based notebook environment for interactive computing.
  • Kurtosis - A build, packaging, and run system for ephemeral multi-container environments.

⬆ back to ToC

Foundation Model Fine Tuning

  • alpaca-lora - Instruct-tune LLaMA on consumer hardware
  • LMFlow - An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
  • Lora - Using Low-rank adaptation to quickly fine-tune diffusion models.
  • peft - State-of-the-art Parameter-Efficient Fine-Tuning.
  • p-tuning-v2 - An optimized prompt tuning strategy achieving comparable performance to fine-tuning on small/medium-sized models and sequence tagging challenges. (ACL 2022)
  • QLoRA - Efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance.

⬆ back to ToC

Frameworks for Training

  • Accelerate - 🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision.
  • Apache MXNet - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler.
  • Caffe - A fast open framework for deep learning.
  • ColossalAI - An integrated large-scale model training system with efficient parallelization techniques.
  • DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
  • Horovod - Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
  • Jax - Autograd and XLA for high-performance machine learning research.
  • Kedro - Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code.
  • Keras - Keras is a deep learning API written in Python, running on top of the machine learning platform TensorFlow.
  • LightGBM - A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
  • MegEngine - MegEngine is a fast, scalable and easy-to-use deep learning framework, with auto-differentiation.
  • metric-learn - Metric Learning Algorithms in Python.
  • MindSpore - MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
  • Oneflow - OneFlow is a performance-centered and open-source deep learning framework.
  • PaddlePaddle - Machine Learning Framework from Industrial Practice.
  • PyTorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration.
  • PyTorchLightning - The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
  • XGBoost - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library.
  • scikit-learn - Machine Learning in Python.
  • TensorFlow - An Open Source Machine Learning Framework for Everyone.
  • VectorFlow - A minimalist neural network library optimized for sparse data and single machine environments.

⬆ back to ToC

Experiment Tracking

  • Aim - an easy-to-use and performant open-source experiment tracker.
  • ClearML - Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management
  • Guild AI - Experiment tracking, ML developer tools.
  • MLRun - Machine Learning automation and tracking.
  • Kedro-Viz - Kedro-Viz is an interactive development tool for building data science pipelines with Kedro. Kedro-Viz also allows users to view and compare different runs in the Kedro project.
  • LabNotebook - LabNotebook is a tool that allows you to flexibly monitor, record, save, and query all your machine learning experiments.
  • Sacred - Sacred is a tool to help you configure, organize, log and reproduce experiments.
  • Weights & Biases - A developer first, lightweight, user-friendly experiment tracking and visualization tool for machine learning projects, streamlining collaboration and simplifying MLOps. W&B excels at tracking LLM-powered applications, featuring W&B Prompts for LLM execution flow visualization, input and output monitoring, and secure management of prompts and LLM chain configurations.

⬆ back to ToC

Visualization

  • Manifold - A model-agnostic visual debugging tool for machine learning.
  • netron - Visualizer for neural network, deep learning, and machine learning models.
  • OpenOps - Bring multiple data streams into one dashboard.
  • TensorBoard - TensorFlow's Visualization Toolkit.
  • TensorSpace - Neural network 3D visualization framework for building interactive and intuitive models in browsers, with support for pre-trained deep learning models from TensorFlow, Keras, and TensorFlow.js.
  • dtreeviz - A python library for decision tree visualization and model interpretation.
  • Zetane Viewer - ML models and internal tensors 3D visualizer.
  • Zeno - AI evaluation platform for interactively exploring data and model outputs.

⬆ back to ToC

Data

Data Management

  • ArtiVC - A version control system to manage large files. Lake is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size.
  • Dolt - Git for Data.
  • DVC - Data Version Control | Git for Data & Models | ML Experiments Management.
  • Delta-Lake - Storage layer that brings scalable, ACID transactions to Apache Spark and other engines.
  • Pachyderm - Pachyderm is a version control system for data.
  • Quilt - A self-organizing data hub for S3.

⬆ back to ToC

Data Storage

  • JuiceFS - A distributed POSIX file system built on top of Redis and S3.
  • LakeFS - Git-like capabilities for your object storage.
  • Lance - Modern columnar data format for ML implemented in Rust.

⬆ back to ToC

Data Tracking

  • Piperider - A CLI tool that allows you to build data profiles and write assertion tests for easily evaluating and tracking your data's reliability over time.
  • LUX - A Python library that facilitates fast and easy data exploration by automating the visualization and data analysis process.

⬆ back to ToC

Feature Engineering

  • Featureform - The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
  • FeatureTools - An open source python framework for automated feature engineering

⬆ back to ToC

Data/Feature enrichment

  • Upgini - Free automated data & feature enrichment library for machine learning: automatically searches through thousands of ready-to-use features from public and community-shared data sources and enriches your training dataset with only the accuracy-improving features.
  • Feast - An open source feature store for machine learning.

⬆ back to ToC

Large Scale Deployment

ML Platforms

  • ClearML - Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management.
  • MLflow - Open source platform for the machine learning lifecycle.
  • MLRun - An open MLOps platform for quickly building and managing continuous ML applications across their lifecycle.
  • ModelFox - ModelFox is a platform for managing and deploying machine learning models.
  • Kserve - Standardized Serverless ML Inference Platform on Kubernetes
  • Kubeflow - Machine Learning Toolkit for Kubernetes.
  • PAI - Resource scheduling and cluster management for AI.
  • Polyaxon - Machine Learning Management & Orchestration Platform.
  • Primehub - An effortless infrastructure for machine learning built on the top of Kubernetes.
  • Seldon-core - An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
  • Weights & Biases - A lightweight and flexible platform for machine learning experiment tracking, dataset versioning, and model management, enhancing collaboration and streamlining MLOps workflows. W&B excels at tracking LLM-powered applications, featuring W&B Prompts for LLM execution flow visualization, input and output monitoring, and secure management of prompts and LLM chain configurations.

⬆ back to ToC

Workflow

  • Airflow - A platform to programmatically author, schedule and monitor workflows.
  • aqueduct - An Open-Source Platform for Production Data Science
  • Argo Workflows - Workflow engine for Kubernetes.
  • Flyte - Kubernetes-native workflow automation platform for complex, mission-critical data and ML processes at scale.
  • Kubeflow Pipelines - Machine Learning Pipelines for Kubeflow.
  • LangFlow - An effortless way to experiment and prototype LangChain flows with drag-and-drop components and a chat interface.
  • Metaflow - Build and manage real-life data science projects with ease!
  • Ploomber - The fastest way to build data pipelines. Develop iteratively, deploy anywhere.
  • Prefect - The easiest way to automate your data.
  • VDP - An open-source unstructured data ETL tool to streamline the end-to-end unstructured data processing pipeline.
  • ZenML - MLOps framework to create reproducible pipelines.

⬆ back to ToC

Scheduling

  • Kueue - Kubernetes-native Job Queueing.
  • PAI - Resource scheduling and cluster management for AI (Open-sourced by Microsoft).
  • Slurm - A Highly Scalable Workload Manager.
  • Volcano - A Cloud Native Batch System (Project under CNCF).
  • Yunikorn - Light-weight, universal resource scheduler for container orchestrator systems.

⬆ back to ToC

Model Management

  • dvc - Data Version Control | Git for Data & Models | ML Experiments Management
  • ModelDB - Open Source ML Model Versioning, Metadata, and Experiment Management
  • MLEM - A tool to package, serve, and deploy any ML model on any platform.
  • ormb - Docker for Your ML/DL Models Based on OCI Artifacts

⬆ back to ToC

Performance

ML Compiler

  • ONNX-MLIR - Compiler technology to transform a valid Open Neural Network Exchange (ONNX) graph into code that implements the graph with minimum runtime support.
  • TVM - Open deep learning compiler stack for cpu, gpu and specialized accelerators

⬆ back to ToC

Profiling

  • octoml-profile - octoml-profile is a python library and cloud service designed to provide the simplest experience for assessing and optimizing the performance of PyTorch models on cloud hardware with state-of-the-art ML acceleration technology.
  • scalene - a high-performance, high-precision CPU, GPU, and memory profiler for Python

⬆ back to ToC

AutoML

  • Archai - a platform for Neural Architecture Search (NAS) that allows you to generate efficient deep networks for your applications.
  • autoai - A framework to find the best performing AI/ML model for any AI problem.
  • AutoGL - An autoML framework & toolkit for machine learning on graphs
  • AutoGluon - AutoML for Image, Text, and Tabular Data.
  • automl-gs - Provide an input CSV and a target field to predict, generate a model + code to run it.
  • autokeras - AutoML library for deep learning.
  • Auto-PyTorch - Automatic architecture search and hyperparameter optimization for PyTorch.
  • auto-sklearn - an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
  • Dragonfly - An open source python library for scalable Bayesian optimisation.
  • Determined - scalable deep learning training platform with integrated hyperparameter tuning support; includes Hyperband, PBT, and other search methods.
  • DEvol (DeepEvolution) - a basic proof of concept for genetic architecture search in Keras.
  • EvalML - An open source python library for AutoML.
  • FEDOT - AutoML framework for the design of composite pipelines.
  • FLAML - Fast and lightweight AutoML (paper).
  • Goptuna - A hyperparameter optimization framework, inspired by Optuna.
  • HpBandSter - a framework for distributed hyperparameter optimization.
  • HPOlib2 - a library for hyperparameter optimization and black box optimization benchmarks.
  • Hyperband - open source code for tuning hyperparams with Hyperband.
  • Hypernets - A General Automated Machine Learning Framework.
  • Hyperopt - Distributed Asynchronous Hyperparameter Optimization in Python.
  • hyperunity - A toolset for black-box hyperparameter optimisation.
  • Katib - Katib is a Kubernetes-native project for automated machine learning (AutoML).
  • Keras Tuner - Hyperparameter tuning for humans.
  • learn2learn - PyTorch Meta-learning Framework for Researchers.
  • Ludwig - a toolbox built on top of TensorFlow that allows you to train and test deep learning models without the need to write code.
  • MOE - a global, black box optimization engine for real world metric optimization by Yelp.
  • Model Search - a framework that implements AutoML algorithms for model architecture search at scale.
  • NASGym - a proof-of-concept OpenAI Gym environment for Neural Architecture Search (NAS).
  • NNI - An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
  • Optuna - A hyperparameter optimization framework.
  • Pycaret - An open-source, low-code machine learning library in Python that automates machine learning workflows.
  • Ray Tune - Scalable Hyperparameter Tuning.
  • REMBO - Bayesian optimization in high-dimensions via random embedding.
  • RoBO - a Robust Bayesian Optimization framework.
  • scikit-optimize(skopt) - Sequential model-based optimization with a scipy.optimize interface.
  • Spearmint - a software package to perform Bayesian optimization.
  • TPOT - one of the very first AutoML methods and open-source software packages.
  • Torchmeta - A Meta-Learning library for PyTorch.
  • Vega - an AutoML algorithm tool chain by Huawei Noah's Ark Lab.

⬆ back to ToC

Optimizations

  • FeatherCNN - FeatherCNN is a high performance inference engine for convolutional neural networks.
  • Forward - A library for high performance deep learning inference on NVIDIA GPUs.
  • NCNN - ncnn is a high-performance neural network inference framework optimized for the mobile platform.
  • PocketFlow - use AutoML to do model compression.
  • TensorFlow Model Optimization - A suite of tools that users, both novice and advanced, can use to optimize machine learning models for deployment and execution.
  • TNN - A uniform deep learning inference framework for mobile, desktop and server.

⬆ back to ToC

Federated ML

  • EasyFL - An Easy-to-use Federated Learning Platform
  • FATE - An Industrial Grade Federated Learning Framework
  • FedML - The federated learning and analytics library enabling secure and collaborative machine learning on decentralized data anywhere at any scale. Supporting large-scale cross-silo federated learning, cross-device federated learning on smartphones/IoTs, and research simulation.
  • Flower - A Friendly Federated Learning Framework
  • Harmonia - Harmonia is an open-source project that aims to develop systems, infrastructure, and libraries to ease the adoption of federated learning (FL) for research and production use.
  • TensorFlow Federated - A framework for implementing federated learning

⬆ back to ToC

Awesome Lists

  • Awesome Argo - A curated list of awesome projects and resources related to Argo
  • Awesome AutoDL - Automated Deep Learning: Neural Architecture Search Is Not the End (a curated list of AutoDL resources and an in-depth analysis)
  • Awesome AutoML - Curating a list of AutoML-related research, tools, projects and other resources
  • Awesome AutoML Papers - A curated list of automated machine learning papers, articles, tutorials, slides and projects
  • Awesome Federated Learning Systems - A curated list of Federated Learning Systems related academic papers, articles, tutorials, slides and projects.
  • Awesome Federated Learning - A curated list of federated learning publications, re-organized from Arxiv (mostly)
  • awesome-federated-learning - All materials you need for Federated Learning: blogs, videos, papers, software, etc.
  • Awesome Open MLOps - This is the Fuzzy Labs guide to the universe of free and open source MLOps tools.
  • Awesome Production Machine Learning - A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
  • Awesome Tensor Compilers - A list of awesome compiler projects and papers for tensor computation and deep learning.
  • kelvins/awesome-mlops - A curated list of awesome MLOps tools.
  • visenger/awesome-mlops - An awesome list of references for MLOps - Machine Learning Operations
  • currentslab/awesome-vector-search - A curated list of awesome vector search frameworks/engines, libraries, cloud services, and research papers on vector similarity search.

⬆ back to ToC
