Allow librarians to import MARC data from other libraries #8360
Comments
Really, this should have been addressed long ago. Once a unique external ID such as an ISBN or OCLC number has been furnished, the ImportBot ought not to settle for just one repository's record. It should either select the most complete record available from a reliable library or, even better, fuse several records together to fill in any blank fields. It is certainly not a good plan to be stuck indefinitely with whatever little bit AMZ or BWB furnished.
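To make the "fuse them together to fill in any blank fields" idea concrete, here is a minimal sketch. It assumes records have already been reduced to plain dictionaries; the function name `fill_blank_fields` and the field names are hypothetical and are not taken from the ImportBot or the Open Library codebase.

```python
def _is_blank(value):
    """Treat None, empty strings and empty containers as 'blank'."""
    return value is None or value == "" or value == [] or value == {}


def fill_blank_fields(primary, *others):
    """Return a copy of `primary` with blank fields filled from `others`.

    Existing non-blank values in `primary` are never overwritten; sources
    listed earlier in `others` win over sources listed later.
    """
    merged = dict(primary)
    for record in others:
        for field, value in record.items():
            if _is_blank(value):
                continue
            if _is_blank(merged.get(field)):
                merged[field] = value
    return merged


# Made-up sample data, for illustration only.
sparse = {"title": "Example Title", "publisher": "", "publish_date": None}
fuller = {
    "title": "Example title (other source)",
    "publisher": "Example Verlag",
    "publish_date": "1990",
    "number_of_pages": 120,
}
merged = fill_blank_fields(sparse, fuller)
```

This only fills gaps. Choosing between two conflicting non-blank values (the "most complete record from a reliable library" case) would need a separate ranking rule.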
Hi @hornc
It seems like the ask is: We have a pipeline for importing MARCs to Open Library, backed by Archive.org items, which is described here: Also, there is a MARC option in the openlibrary.org/api/import path... This doesn't seem like a fantastic match for a first project by a community contributor. If someone did want to work through this, the solution would likely be... To create a librarian-only UI where a contributor with openlibrary/openlibrary/plugins/importapi/code.py Lines 117 to 133 in c792a2f
I agree with @mekarpeles that this is probably a bit tricky for a first-time contributor. I had been meaning to respond with a summary of the two options mentioned above where we already have MARC imports. The bulk import process could be used to import a single record, but that's a bit fiddly and involves creating a new archive.org item. Depending on the source, though, if MARC records are publicly available, there might be a way to import an entire collection rather than a few books one by one. Is that a possibility here? The API should work to import a single record in one go, but I have not looked at this in a while. I don't think the single import API will store the MARC record anywhere, which is less useful than it could be. Open Library does not store MARC records; they are all on archive.org, either as single records stored on a scanned item or as part of a larger bulk-data MARC collection. Single MARC records without corresponding scans are not handled well, or at all (if I remember correctly). The workaround has been to import only bulk collections, which gives many new books and records the source. Three options:
The free MARC records I found were all limited to a single edition of a single work. With the tools and knowledge I have, I can only download and process one edition at a time. If it is possible to import the whole catalogue at once, that would definitely be better. At least the Deutsche Nationalbibliothek has a Bezugswege und Exportformate entry on their homepage, and they seem to offer their whole catalogue in several different file formats:
They also offer a long list of formats and APIs, but I lack the technical expertise to comment on them.
@onnotasler There's an issue for DNB data here: internetarchive/openlibrary-bots#29 I have prepared the data and made a start on importing. I stopped because of the various discussions about import data quality, and have not yet resumed importing. This is something I can turn back on again if there is demand.
I do not insist on a MARC importer if I can instead get the books imported in bulk, but in that case we should implement a way to suggest sources for bulk data instead.
When entering new books or editing existing books, I often have to manually copy from libraries that offer a MARC record for download. It would be great if I could directly import this data instead of having to type it.
As an example, take Das Postwesen im Postamtbezirk Buxtehude.
This book exists as a really low quality import on Open Library at OL26425107W
The Deutsche Nationalbibliothek offers most of the lacking information on their website. It offers downloads as MARC21-XML and RDF (Turtle).
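As a rough illustration of what reading such a MARC21-XML download involves, here is a sketch using only the Python standard library. The namespace `http://www.loc.gov/MARC21/slim` and the tags used (245 `$a` for the title, 264 or 260 `$b`/`$c` for publisher and date) are standard MARC 21; the sample record, the `summarize` function and the output field names are made up for this example and are not Open Library's import schema.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Made-up sample record, for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 c 4500</leader>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Example Title</subfield>
    <subfield code="c">Jane Example</subfield>
  </datafield>
  <datafield tag="264" ind1=" " ind2="1">
    <subfield code="b">Example Verlag</subfield>
    <subfield code="c">1990</subfield>
  </datafield>
</record>"""


def first_subfield(record, tag, code):
    """Return the first non-empty subfield `code` of datafield `tag`, or None."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    for element in record.findall(path, MARC_NS):
        if element.text and element.text.strip():
            return element.text.strip()
    return None


def summarize(xml_text):
    """Extract a few common fields from one MARC21-XML record."""
    root = ET.fromstring(xml_text)
    # Downloads may wrap the record in a <collection> element.
    if root.tag.endswith("}record"):
        record = root
    else:
        record = root.find(".//marc:record", MARC_NS)
    if record is None:
        raise ValueError("no MARC21-XML <record> element found")
    return {
        "title": first_subfield(record, "245", "a"),
        # 264 is used by newer records, 260 by older ones.
        "publisher": first_subfield(record, "264", "b")
        or first_subfield(record, "260", "b"),
        "publish_date": first_subfield(record, "264", "c")
        or first_subfield(record, "260", "c"),
    }


summary = summarize(SAMPLE)
```

A real importer would need many more fields (authors, identifiers, languages) and cleanup of trailing MARC punctuation, which this sketch leaves out.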
The DNB is not the only national library offering this, even though the formats differ between libraries. The Bibliothèque nationale de France offers Intermarc and Unimarc instead, for instance. LIBRIS (National Library of Sweden) offers MARC21.
It would save me time and prevent spelling errors if I could import those datasets.
Describe the problem that you'd like solved
A way to import MARC records from National Libraries, to at least improve existing records, but ideally also to create new books.
Proposal & Constraints
As far as I understood, Open Library already imports MARC records from some libraries. At least I often read "imported by MARC record from library of ..." at the bottom of editions.
The import should not be more annoying than typing the data in manually. Also, there seem to be a lot of technical differences between different MARC versions. I probably won't be able to get up to speed on all of them, so this would have to be handled automatically.
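One small piece of "handled automatically" could be guessing the container format of an uploaded file before choosing a parser. The sketch below is an assumption about how that might look, not existing Open Library behaviour; `sniff_marc_container` is a hypothetical name. It relies on two facts about the formats: XML starts with `<`, and a binary ISO 2709 record starts with a five-digit record length and ends with the record terminator byte `0x1D`.

```python
def sniff_marc_container(data: bytes) -> str:
    """Guess whether `data` is XML or a binary ISO 2709 record.

    Returns "xml", "iso2709" or "unknown".
    """
    # Drop a UTF-8 byte order mark and leading whitespace, if present.
    body = data.removeprefix(b"\xef\xbb\xbf").lstrip()
    if body.startswith(b"<"):
        return "xml"
    head = body[:5]
    ends_with_terminator = body.rstrip(b"\r\n").endswith(b"\x1d")
    if len(head) == 5 and head.isdigit() and ends_with_terminator:
        return "iso2709"
    return "unknown"
```

This only identifies the container. Telling MARC21 apart from other dialects that share the same container, such as Unimarc, would still need per-library configuration or a look at the fields themselves.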
Additional context
Stakeholders
@hornc