Get study article for this week #27

Closed
GeiserX opened this issue Dec 26, 2024 · 4 comments


GeiserX commented Dec 26, 2024

Hey. I'm asking for a bit of help from you, as you seem to be the most knowledgeable about the JW API.
I'm trying to get this week's Watchtower, but it seems to be mission impossible (at least with catalog.db). I can retrieve the Publication, but the goal is to get the correct DocumentID. There is a PublicationDocument table, but it has no relationship to the week, and a given publication has many DocumentIds. I can retrieve the index of the Watchtower for the month, but that's it.
I've checked, and this could be feasible with docClass (as I'm seeing on the web), but it's nowhere to be seen in any db.
What's your approach for this? I highly appreciate your help.

@orangethewell

There is a DatedDocument table for every dated study. Each row is a sample for the exact day (or week) taken from the entire publication document. I don't remember whether the DatedDocument rows for Watchtower articles contain the study itself or a page with a jwpub:// link to the study, but in any case, all studies with dated content are set in this table. For example, the daily text for the exact day is set there, while the Document table holds data for the entire month.

The date string is formatted as "yearMonthDay", if I'm not mistaken. To check whether a row belongs to the current week, just check that the date falls between startDate and endDate, which for study articles cover one week instead of one day.
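The range check described above can be sketched roughly as follows. This is only an illustration under assumptions: it takes the dates to be stored as yearMonthDay values such as 20241226, and it uses the column names startDate and endDate as written in this comment. The real schema may use different table and column names, so all of them are parameters, and `date_key` and `rows_covering` are hypothetical helper names.

```python
import sqlite3
from datetime import date


def date_key(day: date) -> int:
    """Encode a date as a yearMonthDay integer, e.g. 2024-12-26 -> 20241226."""
    return day.year * 10000 + day.month * 100 + day.day


def rows_covering(conn, day, table, start_col="startDate", end_col="endDate"):
    """Return all rows of `table` whose [start, end] range contains `day`.

    The table and column names are assumptions and must be trusted
    identifiers, since they are interpolated into the SQL text.
    """
    key = date_key(day)
    query = f"SELECT * FROM {table} WHERE {start_col} <= ? AND {end_col} >= ?"
    return conn.execute(query, (key, key)).fetchall()


if __name__ == "__main__":
    # Made-up demo data in an in-memory database, not real publication rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Demo (id INTEGER, startDate INTEGER, endDate INTEGER)")
    conn.executemany(
        "INSERT INTO Demo VALUES (?, ?, ?)",
        [(1, 20241223, 20241229), (2, 20241230, 20250105)],
    )
    print(rows_covering(conn, date(2024, 12, 26), "Demo"))
```

Because a yearMonthDay value sorts the same way as the calendar date, a plain `<=` / `>=` comparison is enough and no date parsing is needed.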


GeiserX commented Dec 26, 2024

Thanks @orangethewell for your time.

There is a DatedText table, which in fact I was trying to use, but it just contains the PublicationIds for the entire month, and I can't find the DatedDocument or Document tables you mentioned. Perhaps I need another file that I didn't fetch? There are a few "weekly" rows in DatedText, but only for a few rare languages, and I'd prefer not to use them.

This function is how I currently extract it, as I found it impossible to do using only the SQLite db. I extract the index and visit the URL, then I extract the "nth" link that belongs to a given class. A bit hacky, but I haven't found a better workaround yet.
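For readers who cannot see the linked function, the "nth link with a given class" step could look roughly like the sketch below. It is not the actual function from the project: it assumes the class sits on the `<a>` tag itself, uses only the standard library parser, and `nth_link_with_class` is a hypothetical name. Fetching the index page is left out.

```python
from html.parser import HTMLParser


class _ClassLinkCollector(HTMLParser):
    """Collect the href of every <a> tag that carries a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # The class attribute may hold several space-separated class names.
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if self.css_class in classes and href:
            self.hrefs.append(href)


def nth_link_with_class(html, css_class, n):
    """Return the n-th (0-based) matching href, or None if there are too few."""
    collector = _ClassLinkCollector(css_class)
    collector.feed(html)
    collector.close()
    if 0 <= n < len(collector.hrefs):
        return collector.hrefs[n]
    return None
```

The weakness mentioned above is visible here: the result depends on the page layout and on knowing which position in the month the current week occupies.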

@orangethewell

Sorry, I misspelled the name: it is DatedText. You just need to decrypt its content using the same method used to decrypt the content column of Document. The content itself doesn't exist in catalog.db, which seems to be the db you are looking at; there is only the publicationId there, which you can use to look up the publication and get it. The DatedText content is in the publication's own db file (.jwpub/content/.db).
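Getting at that per-publication database might look like the sketch below. It rests on an assumption not confirmed in this thread: that a .jwpub file is a ZIP archive whose .db file sits either at the top level or inside a nested ZIP member. The helper names are hypothetical, and the sketch does not attempt to decrypt any content column; it only extracts the database and lists its tables so the presence of DatedText can be checked.

```python
import io
import sqlite3
import zipfile
from pathlib import Path


def _first_db_member(archive):
    """Return (name, bytes) of the first *.db member in a ZipFile, or None."""
    for name in archive.namelist():
        if name.lower().endswith(".db"):
            return name, archive.read(name)
    return None


def extract_publication_db(jwpub_path, out_dir):
    """Copy the publication's .db file into out_dir and return its path.

    Looks at the top level first, then inside any member that is itself
    a ZIP archive (assumed layout). Returns None if no .db is found.
    """
    with zipfile.ZipFile(jwpub_path) as outer:
        found = _first_db_member(outer)
        if found is None:
            for name in outer.namelist():
                payload = io.BytesIO(outer.read(name))
                if not zipfile.is_zipfile(payload):
                    continue
                with zipfile.ZipFile(payload) as inner:
                    found = _first_db_member(inner)
                if found is not None:
                    break
    if found is None:
        return None
    name, data = found
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / Path(name).name
    target.write_bytes(data)
    return target


def list_tables(db_path):
    """List the table names of a SQLite database file."""
    conn = sqlite3.connect(str(db_path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

A call such as `list_tables(extract_publication_db("some_publication.jwpub", "extracted"))` (the file name is a placeholder) would then show whether a DatedText table is present before any decryption is attempted.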


GeiserX commented Dec 26, 2024

Okay, thanks a lot for your help. I will try to do it differently then, as I was interfacing directly with the web through catalog.db. Instead, I will just download the .jwpub file and work with its contents; that's a better approach. I already have experience getting data from .jwpub files, so this will come in handy. Hopefully I can get the data about the week.
Thank you!

@GeiserX GeiserX closed this as completed Dec 26, 2024