
why ivf index so big? #601

Open
huaimin016 opened this issue Sep 30, 2024 · 0 comments
huaimin016 commented Sep 30, 2024

I ran select * from pg_vector_index_stat; to check the index size. Here is the result:

19176892	19177456	pgvecto_rs_90_ivf	pgvecto_rs_90_ivf_vector_idx	NORMAL	false	2781540	{2781540}	{}	0	1580896555	{"vector":{"dimensions":512,"vector":"Veci8","distance":"Cos"},"segment":{"max_growing_segment_size":20000,"max_sealed_segment_size":30000000},"indexing":{"ivf":{"least_iterations":16,"iterations":500,"nlist":1000,"nsample":65536,"quantization":{"trivial":{}}}}}

On disk:

postgres@dddddd:~/16/main/pg_vectors/indexes/0000000000000000000000000000000066d137b44b7ae9220000000501249ff0$ du . -h
21M	./segments/51f8e269-845f-4154-8558-629302f1fa7f/indexing/quantization
1.5G	./segments/51f8e269-845f-4154-8558-629302f1fa7f/indexing/raw
1.5G	./segments/51f8e269-845f-4154-8558-629302f1fa7f/indexing
1.5G	./segments/51f8e269-845f-4154-8558-629302f1fa7f
1.5G	./segments
8.0K	./startup
1.5G	.

Why does the index need the raw data? It increases memory usage a lot. Is it necessary to store the raw index data? Will this data be loaded into memory?
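For reference, here is a rough back-of-the-envelope check of the numbers above, as a minimal sketch. It assumes that 2781540 in the stat row is the number of indexed tuples, that 1580896555 is the index size in bytes, and that a Veci8 vector takes one byte per dimension with any per-vector metadata ignored. None of these assumptions is confirmed by the output itself.

```python
# Figures copied from the pg_vector_index_stat row above.
# Assumption: 2781540 is the indexed tuple count and
# 1580896555 is the reported index size in bytes.
num_vectors = 2_781_540
dimensions = 512
reported_index_bytes = 1_580_896_555

# Assumption: a Veci8 vector stores one 8-bit integer per dimension;
# any per-vector metadata is ignored in this estimate.
bytes_per_component = 1

# Estimated size of the uncompressed vectors alone.
raw_bytes = num_vectors * dimensions * bytes_per_component

# Whatever is left over would be the IVF structures, quantization data, and overhead.
remainder_bytes = reported_index_bytes - raw_bytes

# Fraction of the reported index size explained by the raw vectors.
raw_share = raw_bytes / reported_index_bytes

print(f"estimated raw vector data: {raw_bytes} bytes ({raw_bytes / 2**30:.2f} GiB)")
print(f"remainder (everything else): {remainder_bytes} bytes")
print(f"raw share of reported index size: {raw_share:.1%}")
```

Under these assumptions, the raw directory is roughly what storing every vector uncompressed would cost, which seems consistent with the 1.5G that du reports for indexing/raw.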
