Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: add support for initializing vecs client with custom schema #63

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

majamil16
Copy link

What kind of change does this PR introduce?

Feature: add support for connecting to a user-specified database schema (instead of hard-coding vecs).

This was accomplished by adding the schema argument to the Client class and replacing all hard-coded references to schema vecs.

Also added schema parameter to create_client method. For backwards compatibility, it still defaults to vecs.

What is the current behavior?

Currently, the client defaults to schema vecs. Users may want to place vecs collections under a schema according to their own naming conventions.

Addresses #62

Please link any relevant issues here.

What is the new behavior?

Users can now instantiate a Client instance with a custom schema. For example:

DB_CONNECTION = "postgresql://<user>:<password>@<host>:<port>/<db_name>"

# create vector store client
vx = vecs.create_client(DB_CONNECTION, schema="my_schema")

Additional context

Add any other context or screenshots.

Test coverage attached
Screen Shot 2023-12-06 at 9 32 26 PM

@majamil16 majamil16 changed the title add support for initializing vecs client with custom schema Feat: add support for initializing vecs client with custom schema Dec 7, 2023
@olirice olirice self-requested a review December 11, 2023 15:46
Copy link
Collaborator

@olirice olirice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice job on the PR

did you see this comment here:
#62 (comment)

if we add a top level parameter for schema in the Client it'll be much more difficult to support defining a schema on the Collection when that inevitably gets requested.

Would you be up for refactoring to make this a per Collection keyword only argument?

Note that it'd be a bit more involved since the Metadata instance can't be bound to a single schema in that case

@majamil16
Copy link
Author

@olirice thanks for the feedback! sure, I'm definitely up to implement that change.

Maja Milinkovic and others added 2 commits January 15, 2024 11:28
update - move support for schema onto collection instead of client
@majamil16
Copy link
Author

hey @olirice, made some updates. let me know your thoughts, thanks!

@olirice olirice self-requested a review January 16, 2024 20:35
src/vecs/client.py Outdated Show resolved Hide resolved
src/vecs/collection.py Show resolved Hide resolved
.pre-commit-config.yaml Outdated Show resolved Hide resolved
@olirice
Copy link
Collaborator

olirice commented Jan 16, 2024

looks great. if we can address those last few concerns this will merge

@yusofy
Copy link

yusofy commented Feb 4, 2024

I'm looking forward to this PR being merged!

@majamil16
Copy link
Author

hey, thanks @olirice for the review! will finish this up this week

@majamil16 majamil16 requested a review from olirice February 13, 2024 00:37
Comment on lines +28 to +29
Creates a client from a Postgres connection string and optional schema.
Defaults to `vecs` schema.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

 and optional schema. Defaults to `vecs` schema.

can be removed

Comment on lines 28 to +29
- id: black
language_version: python3.9
language_version: python3.8
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what was the rationale for this change?

@@ -13,6 +13,7 @@
import vecs

PYTEST_DB = "postgresql://postgres:password@localhost:5611/vecs_db"
PYTEST_SCHEMA = "test_schema"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the best way to test that escaping is done correctly in all places is to use a crazy schema name.

I tested basic operations with the schema name "esCape Me!" and the test suite fails.

To reproduce that, try:

foo: Collection = client.get_or_create_collection(name="foo", schema="esCape Me!", dimension=5)

and you'll get

sqlalchemy.exc.ProgrammingError: (psycopg2.errors.InvalidSchemaName) schema ""esCape Me!"" does not exist

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants