Data from Deutsche Nationalbibliothek #29

Open
GLBW opened this issue Feb 27, 2019 · 4 comments

GLBW commented Feb 27, 2019

The Deutsche Nationalbibliothek (DNB) offers its catalogue data under CC0. See: Datendienst "Bibliografische Dienstleistungen" (in German).

  1. Books that are missing from OL could be imported.
  2. Books in OL that are also in the DNB could get a DNB identifier where OL lacks one, and likewise a DDC number.
  3. Missing covers could also be imported. In most cases they are of better quality than the covers from Amazon (see issue 28).

hornc commented Jan 13, 2020

hornc commented Sep 21, 2020

I have created an archive.org item to hold the MARC21 records for import by ia-bulkmarc-bot.
A test import is here: https://openlibrary.org/books/OL30393926M/%C3%84thanol_und_h%C3%B6here_Alkohole_im_Serum_von_Diabetikern

It would be better to extract the DNB id and include it in identifiers, and possibly leverage any DNB author authority control ids, if we can extract them.

The MARC record for the above example is: https://openlibrary.org/show-records/marc_dnb_202006/dnb_all_dnbmarc_20200615-2.mrc:0:953
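As a rough sketch of how such a record reference could be taken apart: the snippet below assumes the two trailing numbers are a byte offset and a record length within the MARC file, which is an interpretation and not something stated in this thread. `MarcLocator`, `parse_locator` and `range_header` are hypothetical helper names, not part of any Open Library code.

```python
from typing import NamedTuple


class MarcLocator(NamedTuple):
    item: str      # archive.org item identifier, e.g. "marc_dnb_202006"
    filename: str  # MARC file inside the item
    offset: int    # assumed: byte offset of the record within the file
    length: int    # assumed: record length in bytes


def parse_locator(locator: str) -> MarcLocator:
    """Split 'item/file.mrc:offset:length' into its parts."""
    path, offset, length = locator.rsplit(":", 2)
    item, filename = path.split("/", 1)
    return MarcLocator(item, filename, int(offset), int(length))


def range_header(loc: MarcLocator) -> str:
    """HTTP Range header value covering exactly this record (end is inclusive)."""
    return f"bytes={loc.offset}-{loc.offset + loc.length - 1}"


loc = parse_locator("marc_dnb_202006/dnb_all_dnbmarc_20200615-2.mrc:0:953")
print(loc.filename, range_header(loc))
```

Under that assumption, a single record could be fetched from a large file with a ranged request instead of downloading the whole file.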

hornc commented Oct 12, 2020

The first non-serial (i.e. 'book' / monograph item) in the first DNB MARC file is:
https://openlibrary.org/show-records/marc_dnb_202006/dnb_all_dnbmarc_20200615-1.mrc:512190605:1084

Test imported as https://openlibrary.org/books/OL30608448M
Currently the DNB id number is not imported or added automatically. In the MARC records from this source it is located in fields 001 and 035 (in 035 with a (DE-599)DNB prefix).
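A minimal sketch of what that extraction might look like, working on field values that have already been read out of a record (no MARC library is assumed here). The check for non-serials relies on MARC 21 leader position 07, where `m` marks a monograph and `s` a serial. `is_monograph` and `extract_dnb_id` are hypothetical names, and the sample values are made up for illustration.

```python
from typing import Iterable, Optional

DNB_PREFIX = "(DE-599)DNB"  # prefix on the 035 value, as described above


def is_monograph(leader: str) -> bool:
    """MARC 21 leader position 07 is the bibliographic level: 'm' = monograph."""
    return len(leader) > 7 and leader[7] == "m"


def extract_dnb_id(
    field_001: Optional[str], field_035_values: Iterable[str]
) -> Optional[str]:
    """Return the DNB id from a prefixed 035 value, else fall back to 001."""
    for value in field_035_values:
        value = value.strip()
        if value.startswith(DNB_PREFIX):
            candidate = value[len(DNB_PREFIX):].strip()
            if candidate:
                return candidate
    # Assumption: 001 holds the bare id without any prefix.
    if field_001 and field_001.strip():
        return field_001.strip()
    return None


# Made-up values for illustration only, not taken from a real DNB record.
print(extract_dnb_id("123456789", ["(OCoLC)000000", "(DE-599)DNB123456789"]))
print(is_monograph("01084nam a2200301 c 4500"))
```

Preferring the prefixed 035 value over 001 is a design choice of this sketch: the prefix states explicitly which agency issued the number, whereas 001 only does so implicitly through the record's source.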

@onnotasler

@onnotasler There's an issue for DNB data here: internetarchive/openlibrary-bots#29. I have prepared the data and made a start on importing. I stopped because of the various discussions about import data quality and have not yet resumed importing. This is something I can turn back on again if there is demand.

What problems were there with data quality? If necessary, @GLBW and I can probably help to find heuristics for excluding material that is part of the DNB's collection but outside Open Library's scope.

We should definitely import data from the DNB, as it usually offers high-quality data about books.
