All notable changes to this project will be documented in this file. The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- improved FastAPI server
- support for libra confidence router
- improved the hidden states generation method
- project structure refactoring
- langchain integration
- local_rag and graph_rag examples
- generate method to support hidden states output
- model management, FastAPI server
- unit tests
- synchronized with mlx-lm
- simplified README
- updated mlx_fastchat_worker to support mlx >= 0.14
- updated conda config
- LoRA support for GBA low-bit models
- support for Phi-3
- Conversion: gba2mlx.py converts models from GBA format to a format compatible with the MLX framework.
- Generation: scripts for generating text with GBA quantized models within the MLX environment.
- Full support for GreenBitAI's MLX Model Collection
- Initial commit