.Net: Add vectorless search to record collection interface. #9278

Open
wants to merge 2 commits into
base: main

westey-m
Contributor

Motivation and Context

Many of the databases that implement the vector store abstractions support search without vectors.
It is possible to search these databases with a zero-valued vector anyway, but performing vector
comparisons when none are needed is inefficient.

Description

  • Add a vectorless search method to the collection interface, with implementations.

Contribution Checklist

@westey-m requested a review from a team as a code owner October 15, 2024 11:00
@markwallace-microsoft added the .NET (Issue or Pull requests regarding .NET code), kernel (Issues or pull requests impacting the core kernel), and memory labels Oct 15, 2024
@github-actions bot changed the title from "Add vectorless search to record collection interface." to ".Net: Add vectorless search to record collection interface." Oct 15, 2024
/// <param name="options">The options that control the behavior of the search.</param>
/// <param name="cancellationToken">The <see cref="CancellationToken"/> to monitor for cancellation requests. The default is <see cref="CancellationToken.None"/>.</param>
/// <returns>The records found by the vector search, including their result scores.</returns>
Task<VectorlessSearchResults<TRecord>> VectorlessSearchAsync(
Member

I'm not sure this method should be part of IVectorStoreRecordCollection. To me, it looks like the vectorless search capability should be part of either:

  1. A separate interface, for those vector databases that can support it. Based on the XML doc, not all vector stores support searching without a vector, so we may end up with workaround implementations or with throwing NotSupported/NotImplemented exceptions, which would be a bad pattern. So I would avoid forcing developers to implement it.
  2. This interface, but inside the GetAsync method, which we can rethink. Currently it accepts a key parameter, which is required to get a record, but we could redesign it to accept a filter instead; the filter in turn could contain a key or any other property used to get a record. Each connector would then process the filter only in the way it natively supports (e.g. if getting a record by a property other than the key is not supported, throw a NotSupported exception). To keep getting records by key simple, we could add an extension method that accepts a key parameter, creates a filter by key, and calls the GetAsync interface method. This would also avoid a breaking change.
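
To make the two options concrete, here is a rough sketch of what they might look like. Every name in it (RecordFilter, IVectorlessSearch, IRecordCollection, KeyPropertyName, KeyOnlyCollection) is a hypothetical stand-in invented for illustration, not the actual abstractions or the code in this PR, and the real GetAsync signature is simplified.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical filter type: a single property/value equality check.
public sealed class RecordFilter
{
    public RecordFilter(string propertyName, object? value)
    {
        PropertyName = propertyName;
        Value = value;
    }

    public string PropertyName { get; }
    public object? Value { get; }
}

// Option 1 (hypothetical): an opt-in capability interface that only
// connectors with native vectorless search would implement.
public interface IVectorlessSearch<TRecord> where TRecord : class
{
    Task<IReadOnlyList<TRecord>> VectorlessSearchAsync(
        RecordFilter filter, CancellationToken cancellationToken = default);
}

// Option 2 (hypothetical, simplified stand-in for the collection interface):
// GetAsync takes a filter instead of a key.
public interface IRecordCollection<TKey, TRecord> where TRecord : class
{
    string KeyPropertyName { get; }

    Task<TRecord?> GetAsync(
        RecordFilter filter, CancellationToken cancellationToken = default);
}

public static class RecordCollectionExtensions
{
    // Keeps the existing "get by key" call shape: builds a key filter
    // and forwards to the filter-based interface method.
    public static Task<TRecord?> GetAsync<TKey, TRecord>(
        this IRecordCollection<TKey, TRecord> collection,
        TKey key,
        CancellationToken cancellationToken = default)
        where TRecord : class
        => collection.GetAsync(
            new RecordFilter(collection.KeyPropertyName, key), cancellationToken);
}

// Toy in-memory connector that natively supports only key lookups.
public sealed class KeyOnlyCollection<TRecord> : IRecordCollection<string, TRecord>
    where TRecord : class
{
    private readonly Dictionary<string, TRecord> _items = new Dictionary<string, TRecord>();

    public string KeyPropertyName => "Key";

    public void Upsert(string key, TRecord value) => _items[key] = value;

    public Task<TRecord?> GetAsync(
        RecordFilter filter, CancellationToken cancellationToken = default)
    {
        // Filtering on anything other than the key is not supported here.
        if (filter.PropertyName != KeyPropertyName || !(filter.Value is string key))
        {
            throw new NotSupportedException(
                $"Filtering on '{filter.PropertyName}' is not supported.");
        }

        _items.TryGetValue(key, out TRecord? found);
        return Task.FromResult(found);
    }
}
```

With this shape, existing callers could keep writing `collection.GetAsync("some-key")`, which resolves to the extension method, while connectors that support richer filters would handle them in the interface method.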
