Large embeddings fail with Add WebLoader / Lance / LMDB - Dimension Sizing Problem #41

Closed
converseKarl opened this issue Apr 28, 2024 · 3 comments
Labels
bug Something isn't working

Comments

converseKarl commented Apr 28, 2024

The large embedding model constructor seems to have been fixed, and it is now included in my project for build 0.71.

However, after clearing (deleting) the vector and LMDB caches and restarting the server, switching to the large GPT embedding model caused problems. The code below works with the small text embedding model, but once it is switched to the large model, adding a single URL or multiple URLs fails immediately with dimension sizing errors.

Code:

ragApplication = await ragApplicationBuilder
    .setTemperature(0.2)
    .setEmbeddingModel(new OpenAi3LargeEmbeddings())
    .setVectorDb(new LanceDb({ path: './db' }))
    .setCache(new LmdbCache({ path: './llmcache' }))
    .build();

1|platform | Error adding URL: Error: Invalid argument error: Values length 13824 is less than the length (3072) multiplied by the value size (3072) for FixedSizeList(Field { name: "item", data_type: Float32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, 3072)
1|platform | at LocalTable.add (/opt/bitnami/projects/platform/node_modules/vectordb/dist/index.js:209:14)
1|platform | at async LanceDb.insertChunks (file:///opt/bitnami/projects/platform/htdocs/node_modules/@llm-tools/embedjs/dist/vectorDb/lance-db.js:61:9)
1|platform | at async RAGApplication.batchLoadChunks (file:///opt/bitnami/projects/platform/htdocs/node_modules/@llm-tools/embedjs/dist/core/rag-application.js:130:23)
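
The numbers in the error message can be checked by hand. A rough sketch, assuming the flat value buffer holds whole embedding vectors back to back (the helper name `vectorsInBuffer` is made up for illustration): 13824 values split evenly into 9 vectors of 1536 floats, the size of text-embedding-3-small, but not into vectors of 3072 floats, the size of text-embedding-3-large. That hints that data shaped for the small model was still involved somewhere, though the trace alone does not say where.

```typescript
// Output dimensions published by OpenAI for the two text-embedding-3 models.
const SMALL_DIMENSIONS = 1536; // text-embedding-3-small
const LARGE_DIMENSIONS = 3072; // text-embedding-3-large

// Hypothetical helper: how many vectors of `dimensions` floats
// would a flat buffer of `valuesLength` floats contain?
function vectorsInBuffer(valuesLength: number, dimensions: number): number {
  return valuesLength / dimensions;
}

const valuesLength = 13824; // "Values length" reported in the error above

const asSmall = vectorsInBuffer(valuesLength, SMALL_DIMENSIONS);
const asLarge = vectorsInBuffer(valuesLength, LARGE_DIMENSIONS);

console.log(asSmall); // 9
console.log(asLarge); // 4.5
console.log(Number.isInteger(asSmall), Number.isInteger(asLarge)); // true false
```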

converseKarl changed the title from "Large embeddings fail with Lance / LMDB -" to "Large embeddings fail with Add WebLoader / Lance / LMDB - Dimension Sizing Problem" on Apr 28, 2024
adhityan added the bug label on May 2, 2024
adhityan (Collaborator) commented May 2, 2024

Sorry, I am on vacation, which limits how much time I have to look into the issues. I will be back next week and will take a look at several of the open ones.

adhityan (Collaborator) commented May 9, 2024

I tested the new OpenAi3LargeEmbeddings with LanceDb and could not reproduce the error. Did you delete the whole database folder at the resolved path ./db and then restart when you changed the model?

converseKarl (Author) commented

I can confirm it's working. You're right: clearing out the indexes, removing the db Lance vector folder, and rebuilding resolved the issue. You can mark this issue as closed.
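
The fix confirmed in this thread (delete the database folder whenever the embedding model changes) could be automated. A minimal sketch, assuming Node.js and a marker file stored next to the database folder: `resetDbIfModelChanged`, the marker file name, and the model name strings are all hypothetical and not part of embedjs.

```typescript
import { existsSync, mkdirSync, readFileSync, rmSync, writeFileSync } from 'node:fs';

// Hypothetical helper, not part of embedjs. It records the embedding model
// name in a marker file beside the database folder and wipes the folder when
// the recorded name differs from the model about to be used.
// Returns true if the folder was cleared.
// Assumes `dbPath` has no trailing slash (e.g. './db').
function resetDbIfModelChanged(dbPath: string, modelName: string): boolean {
  const markerPath = `${dbPath}.embedding-model`; // marker file name is an assumption
  const previous = existsSync(markerPath)
    ? readFileSync(markerPath, 'utf8').trim()
    : null;

  // An existing folder with no marker is treated as "changed" and cleared.
  const changed = existsSync(dbPath) && previous !== modelName;
  if (changed) {
    rmSync(dbPath, { recursive: true, force: true });
  }

  // Make sure an (empty or untouched) folder exists, then record the model.
  mkdirSync(dbPath, { recursive: true });
  writeFileSync(markerPath, modelName);
  return changed;
}
```

Calling `resetDbIfModelChanged('./db', 'text-embedding-3-large')` before building the application would clear `./db` only when the recorded model name differs from the one passed in. Stale vectors written with a different dimension are then never mixed with new ones.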
