From f4e6eac3b698fa27b048c407b18c0dfe81306eba Mon Sep 17 00:00:00 2001
From: Leonid Ganeline
Date: Wed, 13 Sep 2023 14:43:04 -0700
Subject: [PATCH] docs: `self-query` consistency (#10502)

The [`self-querying` navbar](https://python.langchain.com/docs/modules/data_connection/retrievers/self_query/) has `self-querying` repeated in each menu item. I've simplified it to be more readable:
- removed `self-querying` from the title of each page;
- added a description to the vector stores;
- added a description and a link to the Integration Card (`integrations/providers`) of the vector stores where they were missing.
---
 docs/extras/integrations/providers/milvus.mdx | 19 +-
 .../integrations/providers/pinecone.mdx | 8 +-
 docs/extras/integrations/providers/qdrant.mdx | 19 +-
 docs/extras/integrations/providers/redis.mdx | 16 +-
 .../integrations/providers/vectara/index.mdx | 12 +-
 .../integrations/providers/weaviate.mdx | 21 +-
 .../activeloop_deeplake_self_query.ipynb | 9 +-
 .../self_query/chroma_self_query.ipynb | 6 +-
 .../retrievers/self_query/dashvector.ipynb | 330 +++++++++++-------
 .../self_query/elasticsearch_self_query.ipynb | 15 +-
 .../self_query/milvus_self_query.ipynb | 15 +-
 .../self_query/myscale_self_query.ipynb | 17 +-
 .../retrievers/self_query/pinecone.ipynb | 8 +-
 .../self_query/qdrant_self_query.ipynb | 6 +-
 .../self_query/redis_self_query.ipynb | 10 +-
 .../self_query/supabase_self_query.ipynb | 13 +-
 .../self_query/vectara_self_query.ipynb | 9 +-
 .../self_query/weaviate_self_query.ipynb | 9 +-
 18 files changed, 332 insertions(+), 210 deletions(-)

diff --git a/docs/extras/integrations/providers/milvus.mdx b/docs/extras/integrations/providers/milvus.mdx index d1e7229f47429..509cd5294baeb 100644 --- a/docs/extras/integrations/providers/milvus.mdx +++ b/docs/extras/integrations/providers/milvus.mdx @@ -1,15 +1,20 @@ # Milvus -This page covers how to use the Milvus ecosystem within LangChain. 
-It is broken into two parts: installation and setup, and then references to specific Milvus wrappers. +>[Milvus](https://milvus.io/docs/overview.md) is a database that stores, indexes, and manages +> massive embedding vectors generated by deep neural networks and other machine learning (ML) models. + ## Installation and Setup -- Install the Python SDK with `pip install pymilvus` -## Wrappers -### VectorStore +Install the Python SDK: + +```bash +pip install pymilvus +``` + +## Vector Store -There exists a wrapper around Milvus indexes, allowing you to use it as a vectorstore, +There exists a wrapper around `Milvus` indexes, allowing you to use it as a vectorstore, whether for semantic search or example selection. To import this vectorstore: @@ -17,4 +22,4 @@ To import this vectorstore: from langchain.vectorstores import Milvus ``` -For a more detailed walkthrough of the Miluvs wrapper, see [this notebook](/docs/integrations/vectorstores/milvus.html) +For a more detailed walkthrough of the `Milvus` wrapper, see [this notebook](/docs/integrations/vectorstores/milvus.html) diff --git a/docs/extras/integrations/providers/pinecone.mdx b/docs/extras/integrations/providers/pinecone.mdx index c0248b8f75935..3dd1e55e69d02 100644 --- a/docs/extras/integrations/providers/pinecone.mdx +++ b/docs/extras/integrations/providers/pinecone.mdx @@ -1,16 +1,18 @@ # Pinecone -This page covers how to use the Pinecone ecosystem within LangChain. -It is broken into two parts: installation and setup, and then references to specific Pinecone wrappers. +>[Pinecone](https://docs.pinecone.io/docs/overview) is a vector database with broad functionality. + ## Installation and Setup + Install the Python SDK: + ```bash pip install pinecone-client ``` -## Vectorstore +## Vector store There exists a wrapper around Pinecone indexes, allowing you to use it as a vectorstore, whether for semantic search or example selection. 
diff --git a/docs/extras/integrations/providers/qdrant.mdx b/docs/extras/integrations/providers/qdrant.mdx index 048c2fe19828c..33dfcb266cb67 100644 --- a/docs/extras/integrations/providers/qdrant.mdx +++ b/docs/extras/integrations/providers/qdrant.mdx @@ -1,15 +1,22 @@ # Qdrant -This page covers how to use the Qdrant ecosystem within LangChain. -It is broken into two parts: installation and setup, and then references to specific Qdrant wrappers. +>[Qdrant](https://qdrant.tech/documentation/) (read: quadrant) is a vector similarity search engine. +> It provides a production-ready service with a convenient API to store, search, and manage +> points - vectors with an additional payload. `Qdrant` is tailored to extended filtering support. + ## Installation and Setup -- Install the Python SDK with `pip install qdrant-client` -## Wrappers -### VectorStore +Install the Python SDK: + +```bash +pip install qdrant-client +``` + + +## Vector Store -There exists a wrapper around Qdrant indexes, allowing you to use it as a vectorstore, +There exists a wrapper around `Qdrant` indexes, allowing you to use it as a vectorstore, whether for semantic search or example selection. To import this vectorstore: diff --git a/docs/extras/integrations/providers/redis.mdx b/docs/extras/integrations/providers/redis.mdx index d1316e4d5bd93..e5fcc239587f0 100644 --- a/docs/extras/integrations/providers/redis.mdx +++ b/docs/extras/integrations/providers/redis.mdx @@ -1,18 +1,26 @@ # Redis +>[Redis](https://redis.com) is an open-source key-value store that can be used as a cache, +> message broker, database, vector database and more. + This page covers how to use the [Redis](https://redis.com) ecosystem within LangChain. It is broken into two parts: installation and setup, and then references to specific Redis wrappers. 
## Installation and Setup -- Install the Redis Python SDK with `pip install redis` + +Install the Python SDK: + +```bash +pip install redis +``` ## Wrappers -All wrappers needing a redis url connection string to connect to the database support either a stand alone Redis server +All wrappers that need a Redis URL connection string to connect to the database support either a standalone Redis server or a High-Availability setup with Replication and Redis Sentinels. ### Redis Standalone connection url -For standalone Redis server the official redis connection url formats can be used as describe in the python redis modules +For a standalone `Redis` server, the official Redis connection URL formats can be used, as described in the Python redis module's "from_url()" method [Redis.from_url](https://redis-py.readthedocs.io/en/stable/connections.html#redis.Redis.from_url) Example: `redis_url = "redis://:secret-pass@localhost:6379/0"` @@ -20,7 +28,7 @@ Example: `redis_url = "redis://:secret-pass@localhost:6379/0"` ### Redis Sentinel connection url For [Redis sentinel setups](https://redis.io/docs/management/sentinel/) the connection scheme is "redis+sentinel". -This is an un-offical extensions to the official IANA registered protocol schemes as long as there is no connection url +This is an unofficial extension to the official IANA-registered protocol schemes, used as long as there is no connection URL for Sentinels available. Example: `redis_url = "redis+sentinel://:secret-pass@sentinel-host:26379/mymaster/0"` diff --git a/docs/extras/integrations/providers/vectara/index.mdx b/docs/extras/integrations/providers/vectara/index.mdx index abd82837359a3..ebda156cd1731 100644 --- a/docs/extras/integrations/providers/vectara/index.mdx +++ b/docs/extras/integrations/providers/vectara/index.mdx @@ -1,17 +1,18 @@ # Vectara - -What is Vectara? +>[Vectara](https://docs.vectara.com/docs/) is a GenAI platform for developers. 
It provides a simple API to build Grounded Generation +>(aka Retrieval-augmented-generation or RAG) applications. **Vectara Overview:** -- Vectara is developer-first API platform for building GenAI applications +- `Vectara` is a developer-first API platform for building GenAI applications - To use Vectara - first [sign up](https://console.vectara.com/signup) and create an account. Then create a corpus and an API key for indexing and searching. - You can use Vectara's [indexing API](https://docs.vectara.com/docs/indexing-apis/indexing) to add documents into Vectara's index - You can use Vectara's [Search API](https://docs.vectara.com/docs/search-apis/search) to query Vectara's index (which also supports Hybrid search implicitly). - You can use Vectara's integration with LangChain as a Vector store or using the Retriever abstraction. ## Installation and Setup -To use Vectara with LangChain no special installation steps are required. + +To use `Vectara` with LangChain, no special installation steps are required. To get started, follow our [quickstart](https://docs.vectara.com/docs/quickstart) guide to create an account, a corpus and an API key. Once you have these, you can provide them as arguments to the Vectara vectorstore, or you can set them as environment variables. @@ -19,9 +20,8 @@ Once you have these, you can provide them as arguments to the Vectara vectorstor - export `VECTARA_CORPUS_ID`="your_corpus_id" - export `VECTARA_API_KEY`="your-vectara-api-key" -## Usage -### VectorStore +## Vector Store There exists a wrapper around the Vectara platform, allowing you to use it as a vectorstore, whether for semantic search or example selection. 
diff --git a/docs/extras/integrations/providers/weaviate.mdx b/docs/extras/integrations/providers/weaviate.mdx index 1c570948ab535..e68105bf6f0b5 100644 --- a/docs/extras/integrations/providers/weaviate.mdx +++ b/docs/extras/integrations/providers/weaviate.mdx @@ -1,10 +1,10 @@ # Weaviate -This page covers how to use the Weaviate ecosystem within LangChain. +>[Weaviate](https://weaviate.io/) is an open-source vector database. It allows you to store data objects and vector embeddings from +>your favorite ML models, and scale seamlessly into billions of data objects. -What is Weaviate? -**Weaviate in a nutshell:** +What is `Weaviate`? - Weaviate is an open-source ​database of the type ​vector search engine. - Weaviate allows you to store JSON documents in a class property-like fashion while attaching machine learning vectors to these documents to represent them in vector space. - Weaviate can be used stand-alone (aka bring your vectors) or with a variety of modules that can do the vectorization for you and extend the core capabilities. @@ -14,15 +14,20 @@ What is Weaviate? **Weaviate in detail:** -Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer Extraction, Classification, Customizable Models (PyTorch/TensorFlow/Keras), etc. Built from scratch in Go, Weaviate stores both objects and vectors, allowing for combining vector search with structured filtering and the fault tolerance of a cloud-native database. It is all accessible through GraphQL, REST, and various client-side programming languages. +`Weaviate` is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer Extraction, Classification, Customizable Models (PyTorch/TensorFlow/Keras), etc. 
Built from scratch in Go, Weaviate stores both objects and vectors, allowing for combining vector search with structured filtering and the fault tolerance of a cloud-native database. It is all accessible through GraphQL, REST, and various client-side programming languages. ## Installation and Setup -- Install the Python SDK with `pip install weaviate-client` -## Wrappers -### VectorStore +Install the Python SDK: -There exists a wrapper around Weaviate indexes, allowing you to use it as a vectorstore, +```bash +pip install weaviate-client +``` + + +## Vector Store + +There exists a wrapper around `Weaviate` indexes, allowing you to use it as a vectorstore, whether for semantic search or example selection. To import this vectorstore: diff --git a/docs/extras/modules/data_connection/retrievers/self_query/activeloop_deeplake_self_query.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/activeloop_deeplake_self_query.ipynb index 4f821019c446e..6ec8e29dcf030 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/activeloop_deeplake_self_query.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/activeloop_deeplake_self_query.ipynb @@ -6,11 +6,14 @@ "id": "13afcae7", "metadata": {}, "source": [ - "# Deep Lake self-querying \n", + "# Deep Lake\n", "\n", - ">[Deep Lake](https://www.activeloop.ai) is a multimodal database for building AI applications.\n", + ">[Deep Lake](https://www.activeloop.ai) is a multimodal database for building AI applications\n", + ">[Deep Lake](https://github.com/activeloopai/deeplake) is a database for AI.\n", + ">Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version,\n", + "> & visualize any AI data. Stream data in real time to PyTorch/TensorFlow.\n", "\n", - "In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Deep Lake vector store. " + "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Deep Lake` vector store. 
" ] }, { diff --git a/docs/extras/modules/data_connection/retrievers/self_query/chroma_self_query.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/chroma_self_query.ipynb index ac6954e82db36..a1eeddd16d8ee 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/chroma_self_query.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/chroma_self_query.ipynb @@ -5,11 +5,11 @@ "id": "13afcae7", "metadata": {}, "source": [ - "# Chroma self-querying \n", + "# Chroma\n", "\n", ">[Chroma](https://docs.trychroma.com/getting-started) is a database for building AI applications with embeddings.\n", "\n", - "In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Chroma vector store. " + "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Chroma` vector store. " ] }, { @@ -447,7 +447,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.6" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/extras/modules/data_connection/retrievers/self_query/dashvector.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/dashvector.ipynb index d1048ee5fa76a..16884df33d19c 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/dashvector.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/dashvector.ipynb @@ -2,20 +2,36 @@ "cells": [ { "cell_type": "markdown", + "id": "59895c73d1a0f3ca", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "source": [ - "# DashVector self-querying\n", + "# DashVector\n", "\n", - "> [DashVector](https://help.aliyun.com/document_detail/2510225.html) is a fully-managed vectorDB service that supports high-dimension dense and sparse vectors, real-time insertion and filtered search. 
It is built to scale automatically and can adapt to different application requirements.\n", + "> [DashVector](https://help.aliyun.com/document_detail/2510225.html) is a fully managed vector DB service that supports high-dimension dense and sparse vectors, real-time insertion and filtered search. It is built to scale automatically and can adapt to different application requirements.\n", + "> The vector retrieval service `DashVector` is based on the `Proxima` core of the efficient vector engine independently developed by `DAMO Academy`,\n", + "> and provides a cloud-native, fully managed vector retrieval service with horizontal expansion capabilities.\n", + "> `DashVector` exposes its powerful vector management, vector query and other diversified capabilities through a simple and\n", + "> easy-to-use SDK/API interface, which can be quickly integrated by upper-layer AI applications, thereby providing\n", + "> the efficient vector retrieval capabilities required by a variety of application scenarios, including the large-model ecosystem,\n", + "> multi-modal AI search, and molecular structure analysis.\n", "\n", - "In this notebook we'll demo the `SelfQueryRetriever` with a `DashVector` vector store." - ], - "metadata": { - "collapsed": false - }, - "id": "59895c73d1a0f3ca" + "In this notebook, we'll demo the `SelfQueryRetriever` with a `DashVector` vector store." + ] }, { "cell_type": "markdown", + "id": "539ae9367e45a178", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "source": [ "## Create DashVector vectorstore\n", "\n", @@ -24,46 +40,55 @@ "To use DashVector, you have to have `dashvector` package installed, and you must have an API key and an Environment. Here are the [installation instructions](https://help.aliyun.com/document_detail/2510223.html).\n", "\n", "NOTE: The self-query retriever requires you to have `lark` package installed." 
- ], - "metadata": { - "collapsed": false - }, - "id": "539ae9367e45a178" + ] }, { "cell_type": "code", "execution_count": 1, + "id": "67df7e1f8dc8cdd0", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "outputs": [], "source": [ "# !pip install lark dashvector" - ], - "metadata": { - "collapsed": false - }, - "id": "67df7e1f8dc8cdd0" + ] }, { "cell_type": "code", "execution_count": 1, - "outputs": [], - "source": [ - "import os\n", - "import dashvector\n", - "\n", - "client = dashvector.Client(api_key=os.environ[\"DASHVECTOR_API_KEY\"])" - ], + "id": "ff61eaf13973b5fe", "metadata": { - "collapsed": false, "ExecuteTime": { "end_time": "2023-08-24T02:58:46.905337Z", "start_time": "2023-08-24T02:58:46.252566Z" + }, + "collapsed": false, + "jupyter": { + "outputs_hidden": false } }, - "id": "ff61eaf13973b5fe" + "outputs": [], + "source": [ + "import os\n", + "import dashvector\n", + "\n", + "client = dashvector.Client(api_key=os.environ[\"DASHVECTOR_API_KEY\"])" + ] }, { "cell_type": "code", "execution_count": null, + "id": "de5c77957ee42d14", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "outputs": [], "source": [ "from langchain.schema import Document\n", @@ -74,15 +99,22 @@ "\n", "# create DashVector collection\n", "client.create(\"langchain-self-retriever-demo\", dimension=1536)" - ], - "metadata": { - "collapsed": false - }, - "id": "de5c77957ee42d14" + ] }, { "cell_type": "code", "execution_count": 3, + "id": "8f40605548a4550", + "metadata": { + "ExecuteTime": { + "end_time": "2023-08-24T02:59:08.090031Z", + "start_time": "2023-08-24T02:59:05.660295Z" + }, + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "outputs": [], "source": [ "docs = [\n", @@ -119,31 +151,37 @@ "vectorstore = DashVector.from_documents(\n", " docs, embeddings, collection_name=\"langchain-self-retriever-demo\"\n", ")" - ], + ] + }, + { + "cell_type": "markdown", + "id": "eb1340adafac8993", 
"metadata": { "collapsed": false, - "ExecuteTime": { - "end_time": "2023-08-24T02:59:08.090031Z", - "start_time": "2023-08-24T02:59:05.660295Z" + "jupyter": { + "outputs_hidden": false } }, - "id": "8f40605548a4550" - }, - { - "cell_type": "markdown", "source": [ "## Create your self-querying retriever\n", "\n", "Now we can instantiate our retriever. To do this we'll need to provide some information upfront about the metadata fields that our documents support and a short description of the document contents." - ], - "metadata": { - "collapsed": false - }, - "id": "eb1340adafac8993" + ] }, { "cell_type": "code", "execution_count": 4, + "id": "d65233dc044f95a7", + "metadata": { + "ExecuteTime": { + "end_time": "2023-08-24T02:59:11.003940Z", + "start_time": "2023-08-24T02:59:10.476722Z" + }, + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "outputs": [], "source": [ "from langchain.llms import Tongyi\n", @@ -175,31 +213,37 @@ "retriever = SelfQueryRetriever.from_llm(\n", " llm, vectorstore, document_content_description, metadata_field_info, verbose=True\n", ")" - ], + ] + }, + { + "cell_type": "markdown", + "id": "a54af0d67b473db6", "metadata": { "collapsed": false, - "ExecuteTime": { - "end_time": "2023-08-24T02:59:11.003940Z", - "start_time": "2023-08-24T02:59:10.476722Z" + "jupyter": { + "outputs_hidden": false } }, - "id": "d65233dc044f95a7" - }, - { - "cell_type": "markdown", "source": [ "## Testing it out\n", "\n", "And now we can try actually using our retriever!" 
- ], - "metadata": { - "collapsed": false - }, - "id": "a54af0d67b473db6" + ] }, { "cell_type": "code", "execution_count": 6, + "id": "dad9da670a267fe7", + "metadata": { + "ExecuteTime": { + "end_time": "2023-08-24T02:59:28.577901Z", + "start_time": "2023-08-24T02:59:26.780184Z" + }, + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "outputs": [ { "name": "stdout", @@ -210,7 +254,12 @@ }, { "data": { - "text/plain": "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.699999809265137, 'genre': 'action'}),\n Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'}),\n Document(page_content='Leo DiCaprio gets lost in a dream within a dream within a dream within a ...', metadata={'year': 2010, 'director': 'Christopher Nolan', 'rating': 8.199999809265137}),\n Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'director': 'Satoshi Kon', 'rating': 8.600000381469727})]" + "text/plain": [ + "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.699999809265137, 'genre': 'action'}),\n", + " Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'}),\n", + " Document(page_content='Leo DiCaprio gets lost in a dream within a dream within a dream within a ...', metadata={'year': 2010, 'director': 'Christopher Nolan', 'rating': 8.199999809265137}),\n", + " Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'director': 'Satoshi Kon', 'rating': 8.600000381469727})]" + ] }, "execution_count": 6, "metadata": {}, @@ -220,19 +269,22 @@ "source": [ "# This example only specifies a 
relevant query\n", "retriever.get_relevant_documents(\"What are some movies about dinosaurs\")" - ], - "metadata": { - "collapsed": false, - "ExecuteTime": { - "end_time": "2023-08-24T02:59:28.577901Z", - "start_time": "2023-08-24T02:59:26.780184Z" - } - }, - "id": "dad9da670a267fe7" + ] }, { "cell_type": "code", "execution_count": 7, + "id": "d486a64316153d52", + "metadata": { + "ExecuteTime": { + "end_time": "2023-08-24T02:59:32.370774Z", + "start_time": "2023-08-24T02:59:30.614252Z" + }, + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "outputs": [ { "name": "stdout", @@ -243,7 +295,10 @@ }, { "data": { - "text/plain": "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'director': 'Andrei Tarkovsky', 'rating': 9.899999618530273, 'genre': 'science fiction'}),\n Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'director': 'Satoshi Kon', 'rating': 8.600000381469727})]" + "text/plain": [ + "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'director': 'Andrei Tarkovsky', 'rating': 9.899999618530273, 'genre': 'science fiction'}),\n", + " Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'director': 'Satoshi Kon', 'rating': 8.600000381469727})]" + ] }, "execution_count": 7, "metadata": {}, @@ -253,19 +308,22 @@ "source": [ "# This example only specifies a filter\n", "retriever.get_relevant_documents(\"I want to watch a movie rated higher than 8.5\")" - ], - "metadata": { - "collapsed": false, - "ExecuteTime": { - "end_time": "2023-08-24T02:59:32.370774Z", - "start_time": "2023-08-24T02:59:30.614252Z" - } - }, - "id": "d486a64316153d52" + ] }, { "cell_type": "code", "execution_count": 8, + "id": 
"e05919cdead7bd4a", + "metadata": { + "ExecuteTime": { + "end_time": "2023-08-24T02:59:35.353439Z", + "start_time": "2023-08-24T02:59:33.278255Z" + }, + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "outputs": [ { "name": "stdout", @@ -276,7 +334,9 @@ }, { "data": { - "text/plain": "[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'year': 2019, 'director': 'Greta Gerwig', 'rating': 8.300000190734863})]" + "text/plain": [ + "[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'year': 2019, 'director': 'Greta Gerwig', 'rating': 8.300000190734863})]" + ] }, "execution_count": 8, "metadata": {}, @@ -286,19 +346,22 @@ "source": [ "# This example specifies a query and a filter\n", "retriever.get_relevant_documents(\"Has Greta Gerwig directed any movies about women\")" - ], - "metadata": { - "collapsed": false, - "ExecuteTime": { - "end_time": "2023-08-24T02:59:35.353439Z", - "start_time": "2023-08-24T02:59:33.278255Z" - } - }, - "id": "e05919cdead7bd4a" + ] }, { "cell_type": "code", "execution_count": 9, + "id": "ac2c7012379e918e", + "metadata": { + "ExecuteTime": { + "end_time": "2023-08-24T02:59:38.913707Z", + "start_time": "2023-08-24T02:59:36.659271Z" + }, + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "outputs": [ { "name": "stdout", @@ -309,7 +372,9 @@ }, { "data": { - "text/plain": "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'director': 'Andrei Tarkovsky', 'rating': 9.899999618530273, 'genre': 'science fiction'})]" + "text/plain": [ + "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'director': 'Andrei Tarkovsky', 'rating': 9.899999618530273, 'genre': 'science fiction'})]" + ] }, "execution_count": 9, "metadata": {}, @@ -319,33 +384,39 @@ "source": [ "# 
This example specifies a composite filter\n", "retriever.get_relevant_documents(\"What's a highly rated (above 8.5) science fiction film?\")" - ], + ] + }, + { + "cell_type": "markdown", + "id": "af6aa93ae44af414", "metadata": { "collapsed": false, - "ExecuteTime": { - "end_time": "2023-08-24T02:59:38.913707Z", - "start_time": "2023-08-24T02:59:36.659271Z" + "jupyter": { + "outputs_hidden": false } }, - "id": "ac2c7012379e918e" - }, - { - "cell_type": "markdown", "source": [ "## Filter k\n", "\n", "We can also use the self query retriever to specify `k`: the number of documents to fetch.\n", "\n", "We can do this by passing `enable_limit=True` to the constructor." - ], - "metadata": { - "collapsed": false - }, - "id": "af6aa93ae44af414" + ] }, { "cell_type": "code", "execution_count": 10, + "id": "a8c8f09bf5702767", + "metadata": { + "ExecuteTime": { + "end_time": "2023-08-24T02:59:41.594073Z", + "start_time": "2023-08-24T02:59:41.563323Z" + }, + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "outputs": [], "source": [ "retriever = SelfQueryRetriever.from_llm(\n", @@ -356,19 +427,22 @@ " enable_limit=True,\n", " verbose=True,\n", ")" - ], - "metadata": { - "collapsed": false, - "ExecuteTime": { - "end_time": "2023-08-24T02:59:41.594073Z", - "start_time": "2023-08-24T02:59:41.563323Z" - } - }, - "id": "a8c8f09bf5702767" + ] }, { "cell_type": "code", "execution_count": 11, + "id": "b1089a6043980b84", + "metadata": { + "ExecuteTime": { + "end_time": "2023-08-24T02:59:48.450506Z", + "start_time": "2023-08-24T02:59:46.252944Z" + }, + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, "outputs": [ { "name": "stdout", @@ -379,7 +453,10 @@ }, { "data": { - "text/plain": "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.699999809265137, 'genre': 'action'}),\n Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 
'genre': 'animated'})]" + "text/plain": [ + "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.699999809265137, 'genre': 'action'}),\n", + " Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'})]" + ] }, "execution_count": 11, "metadata": {}, @@ -389,44 +466,39 @@ "source": [ "# This example only specifies a relevant query\n", "retriever.get_relevant_documents(\"what are two movies about dinosaurs\")" - ], - "metadata": { - "collapsed": false, - "ExecuteTime": { - "end_time": "2023-08-24T02:59:48.450506Z", - "start_time": "2023-08-24T02:59:46.252944Z" - } - }, - "id": "b1089a6043980b84" + ] }, { "cell_type": "code", "execution_count": null, - "outputs": [], - "source": [], + "id": "6d2d64e2ebb17d30", "metadata": { - "collapsed": false + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } }, - "id": "6d2d64e2ebb17d30" + "outputs": [], + "source": [] } ], "metadata": { "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", - "version": 2 + "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", - "pygments_lexer": "ipython2", - "version": "2.7.6" + "pygments_lexer": "ipython3", + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/extras/modules/data_connection/retrievers/self_query/elasticsearch_self_query.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/elasticsearch_self_query.ipynb index dbfc6986678a4..ebe7fe34709ee 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/elasticsearch_self_query.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/elasticsearch_self_query.ipynb @@ -5,7 +5,13 @@ "id": "13afcae7", "metadata": {}, "source": [ - "# Elasticsearch 
self-querying " + "# Elasticsearch\n", + "\n", + "> [Elasticsearch](https://www.elastic.co/elasticsearch/) is a distributed, RESTful search and analytics engine.\n", + "> It provides a distributed, multi-tenant-capable full-text search engine with an HTTP web interface and schema-free\n", + "> JSON documents.\n", + "\n", + "In this notebook, we'll demo the `SelfQueryRetriever` with an `Elasticsearch` vector store." ] }, { @@ -13,8 +19,9 @@ "id": "68e75fb9", "metadata": {}, "source": [ - "## Creating a Elasticsearch vector store\n", - "First we'll want to create a Elasticsearch vector store and seed it with some data. We've created a small demo set of documents that contain summaries of movies.\n", + "## Creating an Elasticsearch vector store\n", + "\n", + "First, we'll want to create an `Elasticsearch` vector store and seed it with some data. We've created a small demo set of documents that contain summaries of movies.\n", "\n", "**Note:** The self-query retriever requires you to have `lark` installed (`pip install lark`). We also need the `elasticsearch` package." ] @@ -354,7 +361,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.3" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/extras/modules/data_connection/retrievers/self_query/milvus_self_query.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/milvus_self_query.ipynb index 068495eefaae3..eb7cc2e9d30e0 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/milvus_self_query.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/milvus_self_query.ipynb @@ -4,9 +4,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Self-querying with Milvus\n", + "# Milvus\n", "\n", - "In the walkthrough we'll demo the `SelfQueryRetriever` with a `Milvus` vector store." 
+ ">[Milvus](https://milvus.io/docs/overview.md) is a database that stores, indexes, and manages massive embedding vectors generated by deep neural networks and other machine learning (ML) models.\n", + "\n", + "In the walkthrough, we'll demo the `SelfQueryRetriever` with a `Milvus` vector store." ] }, { @@ -352,7 +354,7 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, @@ -366,10 +368,9 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.4" - }, - "orig_nbformat": 4 + "version": "3.10.12" + } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 } diff --git a/docs/extras/modules/data_connection/retrievers/self_query/myscale_self_query.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/myscale_self_query.ipynb index 5288a7dd62c49..d437d95f53d6d 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/myscale_self_query.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/myscale_self_query.ipynb @@ -5,12 +5,15 @@ "id": "13afcae7", "metadata": {}, "source": [ - "# Self-querying with MyScale\n", + "# MyScale\n", "\n", - ">[MyScale](https://docs.myscale.com/en/) is an integrated vector database. You can access your database in SQL and also from here, LangChain. MyScale can make a use of [various data types and functions for filters](https://blog.myscale.com/2023/06/06/why-integrated-database-solution-can-boost-your-llm-apps/#filter-on-anything-without-constraints). It will boost up your LLM app no matter if you are scaling up your data or expand your system to broader application.\n", + ">[MyScale](https://docs.myscale.com/en/) is an integrated vector database. 
You can access your database in SQL and also from LangChain.\n", + ">`MyScale` can make use of [various data types and functions for filters](https://blog.myscale.com/2023/06/06/why-integrated-database-solution-can-boost-your-llm-apps/#filter-on-anything-without-constraints). It can boost your LLM app whether you are scaling up your data or expanding your system to broader applications.\n", + "\n", - "In the notebook we'll demo the `SelfQueryRetriever` wrapped around a MyScale vector store with some extra pieces we contributed to LangChain. In short, it can be condensed into 4 points:\n", - "1. Add `contain` comparator to match list of any if there is more than one element matched\n", + "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `MyScale` vector store with some extra pieces we contributed to LangChain. \n", + "\n", + "In short, it can be condensed into 4 points:\n", + "1. Add `contain` comparator to match a list when any of its elements matches\n", "2. Add `timestamp` data type for datetime match (ISO-format, or YYYY-MM-DD)\n", "3. Add `like` comparator for string pattern search\n", "4. 
Add arbitrary function capability" @@ -221,9 +224,7 @@ "cell_type": "code", "execution_count": null, "id": "fc3f1e6e", - "metadata": { - "scrolled": false - }, + "metadata": {}, "outputs": [], "source": [ "# This example only specifies a filter\n", @@ -384,7 +385,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.3" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/extras/modules/data_connection/retrievers/self_query/pinecone.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/pinecone.ipynb index 78c29641ccd20..e52085e42e3c8 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/pinecone.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/pinecone.ipynb @@ -5,9 +5,11 @@ "id": "13afcae7", "metadata": {}, "source": [ - "# Self-querying with Pinecone\n", + "# Pinecone\n", "\n", - "In the walkthrough we'll demo the `SelfQueryRetriever` with a `Pinecone` vector store." + ">[Pinecone](https://docs.pinecone.io/docs/overview) is a vector database with broad functionality.\n", + "\n", + "In the walkthrough, we'll demo the `SelfQueryRetriever` with a `Pinecone` vector store." ] }, { @@ -395,7 +397,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.3" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/extras/modules/data_connection/retrievers/self_query/qdrant_self_query.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/qdrant_self_query.ipynb index a8769e443b051..8a91504cedeee 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/qdrant_self_query.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/qdrant_self_query.ipynb @@ -6,11 +6,11 @@ "id": "13afcae7", "metadata": {}, "source": [ - "# Qdrant self-querying \n", + "# Qdrant\n", "\n", ">[Qdrant](https://qdrant.tech/documentation/) (read: quadrant) is a vector similarity search engine. 
It provides a production-ready service with a convenient API to store, search, and manage points - vectors with an additional payload. `Qdrant` is tailored to extended filtering support.\n", "\n", - "In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Qdrant vector store. " + "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Qdrant` vector store. " ] }, { @@ -419,7 +419,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.6" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/extras/modules/data_connection/retrievers/self_query/redis_self_query.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/redis_self_query.ipynb index d74ea2dd6839b..95d9d39a6e367 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/redis_self_query.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/redis_self_query.ipynb @@ -5,11 +5,11 @@ "id": "13afcae7", "metadata": {}, "source": [ - "# Redis self-querying \n", + "# Redis\n", "\n", ">[Redis](https://redis.com) is an open-source key-value store that can be used as a cache, message broker, database, vector database and more.\n", "\n", - "In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Redis vector store. " + "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Redis` vector store. 
" ] }, { @@ -450,9 +450,9 @@ ], "metadata": { "kernelspec": { - "display_name": "poetry-venv", + "display_name": "Python 3 (ipykernel)", "language": "python", - "name": "poetry-venv" + "name": "python3" }, "language_info": { "codemirror_mode": { @@ -464,7 +464,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.1" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/extras/modules/data_connection/retrievers/self_query/supabase_self_query.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/supabase_self_query.ipynb index 564a3a21d9ed2..165e1a3dc1219 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/supabase_self_query.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/supabase_self_query.ipynb @@ -5,19 +5,22 @@ "id": "13afcae7", "metadata": {}, "source": [ - "# Supabase Vector self-querying \n", + "# Supabase\n", "\n", - ">[Supabase](https://supabase.com/docs) is an open source `Firebase` alternative. \n", + ">[Supabase](https://supabase.com/docs) is an open-source `Firebase` alternative. \n", "> `Supabase` is built on top of `PostgreSQL`, which offers strong `SQL` \n", "> querying capabilities and enables a simple interface with already-existing tools and frameworks.\n", "\n", ">[PostgreSQL](https://en.wikipedia.org/wiki/PostgreSQL) also known as `Postgres`,\n", "> is a free and open-source relational database management system (RDBMS) \n", "> emphasizing extensibility and `SQL` compliance.\n", + ">\n", + ">[Supabase](https://supabase.com/docs/guides/ai) provides an open-source toolkit for developing AI applications\n", + ">using Postgres and pgvector. 
Use the Supabase client libraries to store, index, and query your vector embeddings at scale.\n", "\n", - "In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Supabase vector store.\n", + "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Supabase` vector store.\n", "\n", - "Specifically we will:\n", + "Specifically, we will:\n", "1. Create a Supabase database\n", "2. Enable the `pgvector` extension\n", "3. Create a `documents` table and `match_documents` function that will be used by `SupabaseVectorStore`\n", @@ -569,7 +572,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.1" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/extras/modules/data_connection/retrievers/self_query/vectara_self_query.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/vectara_self_query.ipynb index 1e9128dc6fb7e..72eb71478f370 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/vectara_self_query.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/vectara_self_query.ipynb @@ -5,11 +5,12 @@ "id": "13afcae7", "metadata": {}, "source": [ - "# Vectara self-querying \n", + "# Vectara\n", "\n", - ">[Vectara](https://docs.vectara.com/docs/) is a GenAI platform for developers. It provides a simple API to build Grounded Generation (aka Retrieval-augmented-generation) applications.\n", + ">[Vectara](https://docs.vectara.com/docs/) is a GenAI platform for developers. It provides a simple API to build Grounded Generation\n", + ">(aka Retrieval-augmented-generation or RAG) applications.\n", "\n", - "In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Vectara vector store. " + "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a Vectara vector store. 
" ] }, { @@ -432,7 +433,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.9" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/extras/modules/data_connection/retrievers/self_query/weaviate_self_query.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/weaviate_self_query.ipynb index 382b5225d1f3c..df11279c404d6 100644 --- a/docs/extras/modules/data_connection/retrievers/self_query/weaviate_self_query.ipynb +++ b/docs/extras/modules/data_connection/retrievers/self_query/weaviate_self_query.ipynb @@ -5,7 +5,12 @@ "id": "13afcae7", "metadata": {}, "source": [ - "# Weaviate self-querying " + "# Weaviate\n", + "\n", + ">[Weaviate](https://weaviate.io/) is an open-source vector database. It allows you to store data objects and vector embeddings from\n", + ">your favorite ML models, and scale seamlessly into billions of data objects.\n", + "\n", + "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Weaviate` vector store. " ] }, { @@ -293,7 +298,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.6" + "version": "3.10.12" } }, "nbformat": 4,