
Embedding Operations Very Slow for 2MB CSV File #54

Open
dkindlund opened this issue Jul 3, 2024 · 5 comments

Comments

@dkindlund

Hi @danny-avila, after configuring rag_api in a Docker container, it appears that whenever a file is submitted from LibreChat, rag_api splits the file into chunks and then sends each chunk through the embedding API sequentially, one at a time. For a 2MB CSV file, that process is very, very slow.

I'm wondering if you've considered processing the chunks in batches instead? For example, could we specify a concurrency limit like: process 10 chunks at a time? (A rough sketch of what I mean is below.)

Let me know your thoughts here.

As it stands, only small files can be handled by rag_api before the LibreChat file upload times out, because of these delays.
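
For illustration, a minimal sketch of the batching I have in mind -- assuming a hypothetical async `embed_chunk()` coroutine that wraps the embedding provider's API; neither `embed_chunk` nor `BATCH_SIZE` is an actual rag_api name:

```python
import asyncio

# Minimal sketch only: embed_chunk() is a hypothetical async wrapper
# around the embedding provider's API, and BATCH_SIZE is illustrative.
BATCH_SIZE = 10

async def embed_in_batches(chunks, embed_chunk):
    vectors = []
    for i in range(0, len(chunks), BATCH_SIZE):
        batch = chunks[i : i + BATCH_SIZE]
        # Fire off up to BATCH_SIZE embedding calls concurrently and
        # wait for the whole batch before starting the next one.
        vectors.extend(await asyncio.gather(*(embed_chunk(c) for c in batch)))
    return vectors
```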

@dkindlund
Author

Oh, as a temporary measure, @danny-avila -- maybe introduce a max file size limit field? That way, rag_api could proactively reject files larger than X via LibreChat. (This would be a precaution until larger file sizes are supported; a rough sketch follows.)
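
Something along these lines would do it -- a sketch only, where `RAG_MAX_FILE_SIZE_MB` is a hypothetical environment variable, not an existing rag_api setting:

```python
import os

# Hypothetical setting for illustration; rag_api does not currently
# read RAG_MAX_FILE_SIZE_MB.
MAX_BYTES = int(os.getenv("RAG_MAX_FILE_SIZE_MB", "2")) * 1024 * 1024

def reject_oversized(path: str) -> None:
    size = os.path.getsize(path)
    if size > MAX_BYTES:
        # Fail fast so LibreChat gets an error instead of a timeout.
        raise ValueError(
            f"File is {size} bytes, over the {MAX_BYTES}-byte limit."
        )
```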

@nbhadauria

I am also facing the same issue with an Excel file.

@danny-avila
Owner

danny-avila commented Jul 31, 2024

Thanks. After doing some tests, it's not the actual performance of the server/parsing process (though that will obviously be slower with more chunks to process); the real bottleneck is generating the embeddings, at least on my end. It's an API call for each chunk, so it scales with the quantity of text. Perhaps I can set a limit for max chunk size?

One approach is to make the embedding calls asynchronous, though it may be tricky to navigate rate limits by making them all async; still, it's a better approach. At least with OpenAI, this can be somewhat circumvented with the Batch API, but not all providers have that feature. Some additional settings for max async chunks would be better; in general, the async approach would definitely be faster than the current one. Thanks for the feedback.

Usually this general RAG method doesn't serve structured data that well, either, which is why CSV-to-SQL is a popular strategy.
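
One way a "max async chunks" setting could work is a semaphore that caps in-flight requests. A minimal sketch, assuming a hypothetical `RAG_MAX_CONCURRENT_EMBEDDINGS` variable and the same hypothetical async `embed_chunk()` coroutine as above:

```python
import asyncio
import os

# Sketch of a "max async chunks" setting; RAG_MAX_CONCURRENT_EMBEDDINGS
# and embed_chunk() are hypothetical names, not current rag_api code.
MAX_CONCURRENT = int(os.getenv("RAG_MAX_CONCURRENT_EMBEDDINGS", "10"))

async def embed_with_limit(chunks, embed_chunk):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)

    async def bounded(chunk):
        # The semaphore caps concurrent embedding requests, keeping
        # throughput high while staying under provider rate limits.
        async with semaphore:
            return await embed_chunk(chunk)

    return await asyncio.gather(*(bounded(c) for c in chunks))
```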

@ggomp2885

You can navigate the rate limits by putting a small delay (e.g., 0.05 seconds, via `asyncio.sleep`, the non-blocking counterpart of `time.sleep`) between launching each asynchronous API call. The right delay will likely differ for each API provider, so you could create an environment variable holding a default of 0.2 seconds, which should be more than enough; if users want to speed things up, they can easily reduce this value in the .env file. (A sketch is below.)

I have seen 1000 sequential API calls to an LLM take 8.5 hours; when the calls are made async, all 1000 complete in 10 minutes or less.
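
A sketch of that staggering approach -- `EMBED_CALL_DELAY` is a hypothetical variable name, and `asyncio.sleep()` is used because `time.sleep()` would block the event loop and defeat the concurrency:

```python
import asyncio
import os

# Hypothetical delay setting; EMBED_CALL_DELAY is illustrative, not an
# existing rag_api option.
DELAY = float(os.getenv("EMBED_CALL_DELAY", "0.2"))

async def embed_staggered(chunks, embed_chunk):
    tasks = []
    for chunk in chunks:
        # Launch each embedding request, then pause briefly before the
        # next launch so requests stay under the provider's rate limit.
        # asyncio.sleep() yields to the event loop instead of blocking.
        tasks.append(asyncio.create_task(embed_chunk(chunk)))
        await asyncio.sleep(DELAY)
    return await asyncio.gather(*tasks)
```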

@ggomp2885

Solved - danny-avila/LibreChat#4081
