My concern about large-scale data imports has always been that we should be careful not to make our data quality issues worse for the sake of playing the "numbers game" to bulk up.
Perhaps I just got unlucky, but the very first wishlist bot addition that I looked at (linked from the OpenLibrary blog post) had three duplicated works and two duplicated authors, both with badly formatted names.
https://openlibrary.org/works/OL17890901W/Eagle's_Trees_and_shrubs_of_New_Zealand.
https://openlibrary.org/works/OL17900501W/Eagle's_Trees_and_shrubs_of_New_Zealand.
https://openlibrary.org/works/OL17900497W/Eagle's_Trees_and_shrubs_of_New_Zealand.
Eagle, Audrey Lily - https://openlibrary.org/authors/OL7416671A/Eagle_Audrey_Lily
Audrey LilyEagle - https://openlibrary.org/authors/OL7417982A/Audrey_LilyEagle
Although adding "1000 books" sounds like a relatively small sample, if this single book is representative, we now have many thousands of records to clean up.
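
As a rough illustration of how the two author records above could be flagged before an import creates them, here is a minimal sketch. It assumes authors are available as `(id, name)` pairs; `name_key` and `find_duplicate_authors` are hypothetical helpers written for this example, not part of the Open Library codebase or the wishlist bot.

```python
import re
from collections import defaultdict


def name_key(name):
    """Build an order-insensitive key from an author name (hypothetical helper)."""
    # Split fused words such as "LilyEagle" at a lowercase-to-uppercase boundary.
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
    # Keep only alphabetic tokens, lowercased, so commas and word order are ignored.
    tokens = re.findall(r"[^\W\d_]+", spaced.lower())
    return tuple(sorted(tokens))


def find_duplicate_authors(authors):
    """Group author ids whose names reduce to the same key.

    `authors` is an iterable of (id, name) pairs; this input shape is an
    assumption made for the example.
    """
    groups = defaultdict(list)
    for author_id, name in authors:
        groups[name_key(name)].append(author_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# The two author records cited in this issue.
authors = [
    ("OL7416671A", "Eagle, Audrey Lily"),
    ("OL7417982A", "Audrey LilyEagle"),
]
duplicates = find_duplicate_authors(authors)
```

Both names reduce to the same key, so the two records end up in one group. A check this crude would also merge distinct people who share the same name tokens, and the case-boundary split mishandles names like "McDonald", so it is only suitable for flagging candidates for human review, not for merging automatically.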