Adding Excel (maybe other attachment types) creates duplicate Loader IDs #63

Closed
converseKarl opened this issue May 22, 2024 · 5 comments

@converseKarl

converseKarl commented May 22, 2024

  1. Add Excel file to loader
  2. Query loaders
  3. Add same Excel file to loader (repeat process)
  4. Query loaders

In v0.77, this produces two entries with duplicate IDs.

I would expect it to, at the very least, create a new, different Loader ID. Alternatively, if it is able to determine that the spreadsheet is the same as before (filename matched), it should not add it and should return a "could not add, as named resource already exists" message.
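
For illustration, the second expectation (reject a resource whose file name was already added) could be sketched as below. This is a minimal sketch, not the library's actual API: `LoaderList`, `LoaderEntry`, `AddResult` and the `file:` ID scheme are all hypothetical names made up for this example.

```typescript
// Hypothetical types for illustration only; not taken from the library.
type LoaderEntry = { uniqueId: string; fileName: string };
type AddResult =
    | { ok: true; entry: LoaderEntry }
    | { ok: false; reason: string };

class LoaderList {
    // Keyed by unique ID so a repeated file name is detected in O(1).
    private entries = new Map<string, LoaderEntry>();

    add(fileName: string): AddResult {
        // Assumption: the ID is derived from the file name alone.
        const uniqueId = `file:${fileName}`;
        if (this.entries.has(uniqueId)) {
            return { ok: false, reason: "could not add, as named resource already exists" };
        }
        const entry: LoaderEntry = { uniqueId, fileName };
        this.entries.set(uniqueId, entry);
        return { ok: true, entry };
    }

    list(): LoaderEntry[] {
        return Array.from(this.entries.values());
    }
}
```

With this shape, adding the same file name twice leaves a single entry in `list()` and the second call reports the conflict instead of creating a duplicate.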

@adhityan
Collaborator

When you say query loader, what do you mean?

@converseKarl
Author

When you query the loaders this way, ragApplication.loaders, you get a JSON list of the types loaded. I use this to build my front-end list, which is how I noticed that uploading the same Excel file twice created two entries when reading the latest list from the ragApplication object. Hope that helps.

@adhityan
Collaborator

There was never an intent (originally) to maintain a single instance of a unique loader as the unique identifier was only used to delete the vector values from the database if the loader was previously seen.

I reflected on this issue and decided not to maintain duplicates (by removing older versions of loaded loaders or newer versions of loaders being loaded in parallel). I have published a new version, 0.0.79, with these changes.
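
As a rough illustration of the "remove older versions" idea described above, the sketch below keeps at most one loader per unique ID by dropping any earlier instance before appending the new one. It is not the code shipped in 0.0.79; `LoadedLoader` and `addWithoutDuplicates` are hypothetical names, and the handling of loaders added in parallel is not covered here.

```typescript
// Hypothetical shape for illustration only; not the library's loader type.
type LoadedLoader = { uniqueId: string; loadedAt: number };

// Returns a new list in which `incoming` replaces any earlier loader
// that shares its unique ID, so each ID appears at most once.
function addWithoutDuplicates(
    loaders: LoadedLoader[],
    incoming: LoadedLoader,
): LoadedLoader[] {
    const withoutOlder = loaders.filter((l) => l.uniqueId !== incoming.uniqueId);
    return [...withoutOlder, incoming];
}
```

Applying this on every add means a list built from repeated uploads of the same data never grows beyond one entry for that ID.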

@adhityan
Collaborator

Now ragApplication.loaders should always have one instance of a loader of the same data.

@converseKarl
Author

converseKarl commented May 24, 2024

I can confirm that when I add a named Excel file, it is indexed and appears in the list returned by loaders. When I add it again, only one entry appears in that list.

I also observed the same behavior with the web loaders, so this appears to be fixed.

Job well done!
