Need help with understanding JWPUB format #1

Open
MrCyjaneK opened this issue Apr 24, 2021 · 128 comments

Comments

@MrCyjaneK
Owner

I have no idea how to get the words out of Content in the .db file located in the jwpub archive. I have written down what I know so far, so any help is welcome.

@orangethewell

Hi! I had this idea (scraping jwpub files) some days ago and was searching for anything about these JW Library files. Apparently these files are linked directly to jw.org in some way, but I still haven't found anything about how this linking works. By the way, I also thought the bytecode would be an ID from the Words table, but I don't think it is a direct ID; maybe JW made some instructions for it.

Are you one of Jehovah's Witnesses?

@MrCyjaneK
Owner Author

I even sent a couple of emails requesting documentation, but the response said they are unable to answer my question from that email address. My next idea was to call the number from https://www.jw.org/en/jehovahs-witnesses/contact/united-states/, but I haven't had much time recently, so I haven't done that.

And yes, I am.

@MrCyjaneK
Owner Author

I've put a lot of time into understanding this format, but still no results worth showing.

It's sad that most of the new publications are PDF/JWPUB only. PDF just doesn't scale well, and JWPUB is, ugh.

I still have one idea, scraping wol.jw.org, but I'm against sending hundreds or thousands of requests (every image, article, quote, source) to get one publication.

@orangethewell

orangethewell commented Jun 15, 2021

Haha, I don't think they would simply give us their code, sadly. Anyway, scraping wol.jw.org would actually work, but it's the worst option considering that we would have to add a lot more code and change almost everything. (It would also make the project a lot heavier for low-end systems.)

In the end, all we can do is trial and error. We have at least one hint: the files work like an EPUB, with XML files inside, except that they are heavily modified and for some reason contain binary code that doesn't match the list of words.

I will try doing something with my Python knowledge. I don't know if I will be of any help, but at least I will try for fun. I really like being able to use JW Library on a PC, and it's sad that Watchtower hasn't ported it to any Linux distro. I don't think that will last forever; maybe some day they will release a version for a popular Linux distro.

A final question... I saw that your project uses the JW app API, but is this allowed? Isn't it a violation of some of the app's terms of use?

@orangethewell

orangethewell commented Jun 16, 2021

Hello! So, I ran some experiments with the JWPUB file to see how it works, and I think I got some hot things working! First of all, Content is directly related to the page and doesn't accept anything new (maybe because Content has a fixed size in bytes and I inserted more than that? I don't know). Furthermore, the Words table doesn't work the way we thought: I changed a word in this table and all that changed was how I find it in the book. Now I need to search for "subjecters" instead of "subject", while the word in the documents stays the same.

So, in the end, I got a "How to Remain in God's Love" book whose subjects section is titled "Edited Subjects" and whose "Letter from the Governing Body" is blank.

EDIT: I read the documentation that you gave. Maybe begin and end are the first and last byte to be converted, but the question remains: converted into what, if it's not an index into the Words table?

@MrCyjaneK
Owner Author

MrCyjaneK commented Jun 16, 2021

Wow! That's great! I lost so much time with the Words table... So you are saying that Content is directly related to the page content and doesn't just reference the Words table?

Have you seen the part below the sentence "Huh It's quite short." in the docs? https://raw.githubusercontent.com/MrCyjaneK/jwapi/master/docs/jwpub/index.md

You also need to have there:

  • Some heading/subheadings/fonts etc...
  • Images (probably by ID)
  • Links to other publications
  • And the content itself

What is the news from God? translates to:

Decimal 1246 616 1131 758 474 499
Hex 4de 268 46b 2f6 1da 1f3

Which is quite short, so my guess was that it uses the Words table. Maybe it stores rendered publications somewhere in a cache, and that's why changing the table didn't change the content?

Or, another scenario: Content is compressed in some way.
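As a quick sanity check on the two rows above, here is a small Ruby sketch. It only shows that the hex row is the decimal row written in base 16, and that there is one value per word of the sentence, which is what made the Words table guess plausible. Whether the values really are word IDs remains a guess.

```ruby
# Values copied from the "Decimal" row above.
decimal = [1246, 616, 1131, 758, 474, 499]

# Convert each value to lowercase hexadecimal.
hex = decimal.map { |n| n.to_s(16) }
puts hex.join(" ") # 4de 268 46b 2f6 1da 1f3

# The sentence has as many words as there are values.
words = "What is the news from God?".split
puts words.length == decimal.length # true
```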

@orangethewell

Okayyy, I think I ran into a problem with the customized JWPUB and I don't know what exactly was being loaded.

I had seen what was in the jwpub conversion doc before and yeah, that could be it, but... there's something strange about it and I don't know what exactly happened.

I changed a lot of things in the original DB because I thought my code was packing it into a new jwpub file, but it wasn't. By the time I fixed that, I had changed so much in the DB that I think I ended up with a corrupted publication (or one modified so much that it doesn't load anything). Keep in mind that I changed just one Content column. This is really strange; I didn't see this yesterday.

In the end, there's a lot going on behind the jwpub specification: there's even a schema specification for the publication view and, besides the Words table, there are some strange tables that look like a precompiled search index. I keep thinking about what a forum for some reading program answered to a request to support the JW files: they said these files make requests to the JW API. I don't trust everything I read, but this really stuck in my mind. Even so, it doesn't make sense: why would a file of 100 MB or more need anything from JW? And if it does, how are the pioneers' books distributed?

@MrCyjaneK
Owner Author

I'll check the network thing tonight. I'll download a publication and just watch the traffic in Burp Suite; that should clarify whether the requests are sent or not.

@MrCyjaneK
Owner Author

So first of all, I had some problems with Android Studio, then it got late and I forgot to reply.
After downloading publications there were no requests (except for a few images that were unrelated to the publication).

@MrCyjaneK
Owner Author

> Haha, I don't think they would simply give us their code, sadly. Anyway, scraping wol.jw.org would actually work, but it's the worst option considering that we would have to add a lot more code and change almost everything. (It would also make the project a lot heavier for low-end systems.)

Yea.. but if we fail that's the only option.

> In the end, all we can do is trial and error. We have at least one hint: the files work like an EPUB, with XML files inside, except that they are heavily modified and for some reason contain binary code that doesn't match the list of words.

Not really - it can be converted on the go, and then just kept in some html format.

> I will try doing something with my Python knowledge. I don't know if I will be of any help, but at least I will try for fun. I really like being able to use JW Library on a PC, and it's sad that Watchtower hasn't ported it to any Linux distro. I don't think that will last forever; maybe some day they will release a version for a popular Linux distro.

That's sad :( I wish that there would be a decent watchtower library app made with gtk ;p

> A final question... I saw that your project uses the JW app API, but is this allowed? Isn't it a violation of some of the app's terms of use?

Since I don't reupload the content, it is legal, but I'm not a lawyer.

https://www.jw.org/en/terms-of-use/

And even if it's against the terms.. sigh. I'm not switching back to Android, so I'll continue to develop this app.

(sorry for late reply.. I missed this comment)

@orangethewell

No no! It's okay, brother! I don't have much time for digging into this these days either; after all, I'm only 15 years old and have some school homework to do. ^^

But I will keep following the project, and if I find something new, I will post a new reply on this issue.

And if you can't get anything new out of the JWPUB conversion, you still have easier tasks to do, like the video player :) (I really like how the PC JW Library app can be easily "hacked" to add a new video, lol)

@MrCyjaneK MrCyjaneK mentioned this issue Jun 24, 2021
8 tasks
@MrCyjaneK
Owner Author

After spending hours on this thing, I won't continue reverse engineering the JWPUB format until somebody does that... for me.

For now I'll try to move to flooding wol.jw.org apis and getting the publications page by page (thanks for abandoning epub btw).

image

@MrCyjaneK MrCyjaneK linked a pull request Aug 13, 2021 that will close this issue
@mjacobus

mjacobus commented Jan 3, 2023

@MrCyjaneK did you figure out how to read Document.Content?

@MrCyjaneK
Owner Author

MrCyjaneK commented Jan 3, 2023

I'm not working on this app anymore. I don't want to spend time on an open source alternative to something that is clearly using DRM when it shouldn't (can somebody give me one single reason why it is worth encrypting such content when it is freely available?).
Also, I don't feel like playing some sort of cat and mouse game when somebody can just change the way the API sends publications and cut support for earlier versions.

And the elephant in the room: WHY isn't the app open source in the first place?

Until somebody gives me answers to those questions I'm not going to work on this project. wol.jw.org is enough for me.

</project>

@darioragusa

Security? If anyone could get a publication and easily edit it the risk of spreading misleading information would be very high.

@MrCyjaneK
Owner Author

@darioragusa As they can with .epub, .mobi, and .pdf.

Also, there is a tool for that, widely used on the internet: you can sign things with PGP. That would allow 3rd party apps to be developed and would carry less risk (currently we can edit the publications; DRM is defectivebydesign.org).

@darioragusa

@MrCyjaneK I know you can edit the other formats without problems, but most of us use the JW Library app. I download a jwpub knowing that it comes from jw.org or the app, and I trust the content. It's not a random txt file sent by a random guy and opened with Word or Adobe Reader, which may or may not contain the correct information. An example: if I send my grandma an EPUB she may not be able to open it, but if I send a jwpub she taps the file, a trusted app she always uses shows up, and for her it's all OK: a normal article with the reliable content that is supposed to be there. A jwpub can still be edited, but it's not something that anyone with basic knowledge of Word can do: fewer editors -> fewer edited files. Perhaps I'm totally wrong, but those are my two cents.

@MrCyjaneK
Owner Author

> less editors -> less edited files. Perhaps I'm totally wrong but those are my two cents.

The thing is, the current method allows editing, while signing would make undetected edits impossible and still let modders like us easily read the content.

@darioragusa

I don't know much about signing files, but I guess the app would have a key, and using this key with (something, idk) it would get a value. Is it like checking a hash, where the value is different if a single bit changes?

@MrCyjaneK
Owner Author

It's a way of checking whether the content was modified: the content is signed to prove that it was created by somebody, and after modding it the signature will no longer match. It's like encrypting, except that you can see the content; you just can't modify it without invalidating the signature.

@darioragusa

OK, but this way wouldn't they have to store a signature for every version of every article in every publication in every language?

@MrCyjaneK
Owner Author

PGP signatures do not add a lot of extra size to a publication, so I don't consider this a problem (you could also sign a sha512sum of the publication and get a similar result). Plus, you can sign them as they are served for download.
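To make the hash comparison concrete, here is a minimal Ruby sketch, assuming the publisher would sign a SHA-512 digest of the publication bytes. The helper names are hypothetical, and the signing step itself (for example with GnuPG) is not shown. The sketch only demonstrates that flipping a single bit changes the digest, so a signature made over the old digest would no longer verify.

```ruby
require "digest"

# Hypothetical helper: hex SHA-512 digest of a publication's raw bytes.
def publication_digest(bytes)
  Digest::SHA512.hexdigest(bytes)
end

# True when the bytes still match the digest recorded at publishing time.
def unmodified?(bytes, expected_digest)
  publication_digest(bytes) == expected_digest
end

original = "What is the news from God?".b # stand-in for a real file's bytes
published_digest = publication_digest(original)

tampered = original.dup
tampered.setbyte(0, tampered.getbyte(0) ^ 0x01) # flip a single bit

puts unmodified?(original, published_digest) # true
puts unmodified?(tampered, published_digest) # false
```

Only the digest (plus the signature over it) has to be distributed with the file, which is why the size overhead stays small.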

@darioragusa

If the signature is stored with the publication, what stops me from changing it?

@MrCyjaneK
Owner Author

You can change it - you can even sign it with your own key, but then it will not verify against the publisher's key.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

This is a message I have decided to sign, try to mess with it and it will no longer be signed.
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCAAdFiEE0gTaRRUXZfyrr8PQPD6SA9PleeEFAmO36L8ACgkQPD6SA9Pl
eeERiAv+MXm2VjIZMvOgXwKT5bDmwMpfK8liOdT/IhoFvNsTwMiWUQRHzp12OJtz
U+V26gq6lmBJsKsyij6AAvefy048mAzGnAMRR5c9uqkYs2R66jqUIRNERCE2XKdu
uiJAhmpMqNughA0/h19/As1xCrepZpo+W1SEE8yEPZp13eZ0gylmS0pBqXR5QcHB
JNAIMV84xOAntQNe2dzs6lBhhWdF3EvE5L50so2EiXGulr5mIdPwIkaCUSQIZYRd
2aWLwcA4j8ZN/UfY6YbCyhSyH5Fm4WXZ17tsPSuOqBE7QhW100gPiQjPDGc5ZUwN
SjvRIxrvCZ9rPg/PQnAOIgxALilBW3y6Jaq73XTBFaOArkOxmWh8rFhL7OkMdyW7
ewpAVjU90ChYEJ17BZpM+cSYIYcRwsYdtNQcQVl1fViBFlBFY1PEm4mvbbHK4GLQ
aRBnSsTbabNQQLij3hk/Wc9RLEe49pk/tmeDlqrtF5ELbFWRBtM0R63H3qfXkEL7
vFqGsJB4
=gBZ4
-----END PGP SIGNATURE-----

@geimist

geimist commented Apr 1, 2024

Hello @MrCyjaneK,
sorry for my question, which is a bit off topic here, but because you wrote that:

> 5 days of interrupted work and I've imported most of JWPUBs that were easy to obtain (curl + jq in a for loop). I have managed to download 414.89 GiB of them, and discovered few interesting things on my way

I want to keep a local JWPUB archive. Do you have any information on how you were able to load the entire catalog? (I only want it in one language, which is only ~4 GB.)

I am using the API URL https://b.jw-cdn.org/apis/pub-media/GETPUBMEDIALINKS?issue=... so far. However, the URL has to be structured differently for different publications (e.g. for periodicals), which I currently adjust manually. In the future I would also like to handle updates and new publications automatically.

How have you implemented this?
(I only do this in the shell)

Thanks 🤗

@MrCyjaneK
Owner Author

@geimist you need catalog.db

  1. https://app.jw-cdn.org/catalogs/publications/v4/manifest.json
  2. https://app.jw-cdn.org/catalogs/publications/v4/{current from manifest.json above}/catalog.db.gz

gzip -d it and enjoy sqlite catalog :)
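The two steps above can be sketched in Ruby as follows. This is a minimal sketch, assuming the manifest exposes the catalog ID under a "current" key (as the URL template above suggests). The helper names are hypothetical, and the network download is left out so the snippet runs offline.

```ruby
require "json"
require "zlib"

CATALOG_BASE = "https://app.jw-cdn.org/catalogs/publications/v4"

# Build the catalog.db.gz URL from the parsed manifest.json.
# Assumption: the current catalog ID is stored under the "current" key.
def catalog_url(manifest)
  current = manifest.fetch("current")
  "#{CATALOG_BASE}/#{current}/catalog.db.gz"
end

# Equivalent of `gzip -d`: returns the raw SQLite file bytes.
def gunzip(bytes)
  Zlib.gunzip(bytes)
end

manifest = JSON.parse('{"current": "example-id"}') # placeholder, not a real ID
puts catalog_url(manifest)

# Round trip with stand-in data instead of a real download.
sample = Zlib.gzip("SQLite format 3\0")
puts gunzip(sample).start_with?("SQLite format 3")
```

In a real script the manifest and the .gz file would be fetched first (for example with Net::HTTP or curl) and the decompressed bytes written to catalog.db.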

@geimist

geimist commented Apr 1, 2024

Yes, I have that.
But how do you find/create the download links?
Or is there perhaps a way to initialize the download directly via the value Signature (ID of the publication) of the PublicationAsset table?

@GuyMicciche

GuyMicciche commented Apr 2, 2024

I have a working python version.

@MrCyjaneK
Owner Author

@geimist - provide all parameters that you find to the url, if something is undefined provide an empty string.
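A minimal Ruby sketch of that advice: every parameter is always sent, and anything undefined becomes an empty string. Only `issue` appears in this thread; the other parameter names (`pub`, `langwritten`, `fileformat`) and the example values are assumptions for illustration and should be checked against the catalog and real requests.

```ruby
require "uri"

PUB_MEDIA_ENDPOINT = "https://b.jw-cdn.org/apis/pub-media/GETPUBMEDIALINKS"

# Assumed parameter names; only `issue` is confirmed by this thread.
PARAMS = %w[pub issue langwritten fileformat].freeze

# Build the request URL from a catalog row (a Hash with string keys).
# Missing values are sent as empty strings instead of being dropped.
def pub_media_url(row)
  query = PARAMS.map { |name| [name, row[name].to_s] } # nil becomes ""
  "#{PUB_MEDIA_ENDPOINT}?#{URI.encode_www_form(query)}"
end

# A periodical with an issue number (hypothetical values).
puts pub_media_url({ "pub" => "w", "issue" => 202401, "langwritten" => "E", "fileformat" => "JWPUB" })

# A book without an issue number: `issue=` is still present, just empty.
puts pub_media_url({ "pub" => "lff", "langwritten" => "E", "fileformat" => "JWPUB" })
```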

@MaxBas70

> I have no idea how to get words out of Content in .db file located in jwpub archive. what I know. So any help is needed.

I would like to know if there is a way to convert .doc (Word) or PDF files into .jwpub format or, if no ready-made online system exists, to get an idea of how to do it.

@arthurwweber

> > I have no idea how to get words out of Content in .db file located in jwpub archive. what I know. So any help is needed.
>
> I would like to know if there is a way to convert .doc (win word) or pdf files into .jwpub format, or if there isn't a system already done online, an idea on how to do it

Yes, MaxBas70. There is a way. You have to create the SQLite database that defines the publication (with all its content, metadata, words index etc.), add the artwork as separate files, pack everything together, attach some more required metadata, pack it once more and there you have the JWPUB file. Since the JWPUB file is not a publicly documented format (most of us here have looked into JWPUB files to see how they are structured without any support from brothers serving in MEPS Programming), there isn't any publicly available piece of software that takes a random Word document and converts it into JWPUB.

@MaxBas70

Thanks Arthur. At this point I'd like to ask whether it would be possible to create an online converter that runs the conversion procedure. I know many people who would be interested and willing to pay a monthly or per-conversion fee to have, for example, speech outlines directly in JWL. I can provide the technical infrastructure to do it and the marketing to sell it, if you can give me a hand and if it's possible.

@arthurwweber

Aren't some of the public talk outlines already available in JWPUB on JW Hub?

@MrCyjaneK
Owner Author

It is already explained in some depth in this very issue; I suggest reading all of the comments here.

@livrasand

I'm not sure it is legal to charge money for a "JWPUB online converter". You won't find a direct Word or PDF to JWPUB converter; that specific conversion can only be done through MEPS. MEPS provides an interface similar to Word's and allows the creation of digital publications. After you finish a publication and send it to print, you can export it in formats such as PDF, WPUB, EPUB, and JWPUB. Have you already tried Reviw and its wiki? It can help you create a JWPUB easily and for free.

https://github.com/livrasand/Reviw

@MaxBas70

> Aren't some of the public talk outlines already available in JWPUB on JW Hub?

Not all; some are in PDF or Word. The problem is that if I download an outline, I am forced to use it as is and make my notes alongside it. It would be more useful to upload my own customized Word or PDF outline and have it in the main window. It would also be useful to be able to position the free text box wherever you like, as in the W study. Can anyone explain to me how to use Reviw in practice?

@Fuseteam

Hey guys, have you seen this section of jw.org/en/terms-use?
image
"the Application" referring to JW Library

Repository owner deleted a comment from yuniermv Aug 26, 2024
@gokusander

gokusander commented Sep 9, 2024

Good evening. I'm creating a custom .jwpub, but I'd like some help with the .db tables. There are some that I don't know how to fill: sqlite_stat1, word, searchindex, DocumentParagraph, and many others. I already have the HTML, manifest and blob, but the file doesn't open. I believe something is missing. I was told it would be the paragraph table, but I don't know how to fill it in. If you can help, please do. Thank you.

I can share the .zip if that helps.

(I use Google Translate) 😂

@goodniceweb

Hello there.

I'm working on a service that helps with personal study, and I also need help / guidance on how to make my own JWPub file.

The idea of my service is the following: you save your writings (notes) while listening at a congregation meeting / assembly or while watching a video in the service. It splits your writings into pieces with AI so you can have your own index: when you read the Bible and tap on the gem icon, you can see part of your writing, tap on it and see the whole writing for that talk or video.

My progress so far:

  • Encoding / decoding of a binary format for the Content column according to this guide - Done ✅
  • Filling Publication table with metadata of my custom JWPub file - Done ✅
  • Filling PublicationYear table with current year - Done ✅
  • Filling PublicationCategory table with index - Done ✅
  • Filling Document table with user's writings - Done ✅
  • Filling Extract table with pieces of user writing split by AI - Done ✅
  • Creating a zip file of an SQLite db file - Done ✅
  • Creating a manifest file of JWPub file - Done ✅
  • Creating a zip file which is actually JWPub file - Done ✅
  • Fill RefDocument table - In Progress 🟡
  • Fill VerseCommentary table - In Progress 🟡
  • Fill VerseCommentaryMap table - In Progress 🟡
  • Being able to install this file in JWLibrary - Not done 🙁 And I'm not sure what else I can do about it. Below you can find more details about what was done.

Details

  • Got the SQLite file of the latest publication index from our official website. Unzipped it and removed all data from all tables, leaving just empty tables. Ran VACUUM on it so it releases space on my SSD. So I think the structure of the db should be good.

  • Fill Publication table with the only record about my own index.

    See code and comments
    publication_data = {
      PublicationId: 1,
      VersionNumber: 1,
      Type: 9, # Just didn't change what it was
      Title: name, # Name here means what user assign it to on the service
      TitleRich: nil,
      RootSymbol: SYMBOL,
      RootYear: Date.today.year,
      RootMepsLanguageIndex: 1,
      ShortTitle: name,
      ShortTitleRich: nil,
      DisplayTitle: name,
      DisplayTitleRich: nil,
      ReferenceTitle: name,
      ReferenceTitleRich: nil,
      UndatedReferenceTitle: name,
      UndatedReferenceTitleRich: name,
      Symbol: SYMBOL, # this is my own made-up symbol: mrsrch (stands for my-research)
      UndatedSymbol: SYMBOL,
      UniqueSymbol: SYMBOL,
      EnglishSymbol: SYMBOL,
      UniqueEnglishSymbol: SYMBOL,
      IssueTagNumber: 0,
      IssueNumber: 0,
      Variation: "",
      Year: Date.today.year,
      VolumeNumber: 0,
      MepsLanguageIndex: MEPS_LANGUAGE_INDEX, # I'm native Russian/Ukraine person so this is 207 for me
      PublicationType: PUBLICATION_TYPE, # "Index" string here
      PublicationCategorySymbol: CATEGORY_SYMBOL, # "dx"
      BibleVersionForCitations: "NWTR",
      HasPublicationChapterNumbers: 0,
      HasPublicationSectionNumbers: 0,
      FirstDatedTextDateOffset: 0,
      LastDatedTextDateOffset: 0,
      MepsBuildNumber: MEPS_BUILD_NUMBER # 13_073
    }
  • PublicationCategory and PublicationYear tables are self-explanatory but still gonna list them below

    See code
    publication_category_data = {
      PublicationCategoryId: 1,
      PublicationId: 1,
      Category: "dx"
    }
    publication_year_data = {
      PublicationYearId: 1,
      PublicationId: 1,
      Year: Date.today.year
    }
  • Then I loop through writings user created in my system and create Document records in the database.

    See code and comments
      document_data = {
        DocumentId: document_array_index,
        PublicationId: publication_data[:PublicationId],
        MepsDocumentId: document_array_index,
        MepsLanguageIndex: MEPS_LANGUAGE_INDEX,
        Class: 3, # means regular I guess
        Type: 1, # playing guessing game
        SectionNumber: nil, # do not use sections for now
        ChapterNumber: nil,
        Title: writing.title,
        TitleRich: nil,
        TocTitle: research.title,
        TocTitleRich: nil,
        ContextTitle: nil,
        ContextTitleRich: nil,
        FeatureTitle: nil,
        FeatureTitleRich: nil,
        Subtitle: nil,
        SubtitleRich: nil,
        FeatureSubtitle: nil,
        FeatureSubtitleRich: nil,
        # Using this algo https://github.com/MrCyjaneK/jwapi/issues/1#issuecomment-1714309559
        Content: encryptor.encrypt(writing.content.body.to_html),
        FirstFootnoteId: nil,
        LastFootnoteId: nil,
        FirstBibleCitationId: nil,
        LastBibleCitationId: nil,
        ParagraphCount: writing.paragraph_count,
        HasMediaLinks: 0,
        HasLinks: 0,
        FirstPageNumber: nil,
        LastPageNumber: nil,
        ContentLength: writing.content.body.to_html.size,
        PreferredPresentation: nil,
        ContentReworkedDate: nil,
        HasPronunciationGuide: 0
      }
  • Then I create records in the Extract table so users can see previews of their writings

    See code and comments
       extract_data = {
          ExtractId: extract_index,
          Link: nil, # ToDo: find a way to set link. I think it's related with content of VerseCommentary->Content
          Caption: research.title,
          CaptionRich: nil,
          Content: encryptor.encrypt(extract.content),
          RefPublicationId: publication_data[:PublicationId],
          RefMepsDocumentId: document_data[:DocumentId],
          RefMepsDocumentClass: document_data[:Class],
          RefBeginParagraphOrdinal: nil,
          RefEndParagraphOrdinal: nil
        }
  • Then I create a zip out of the db file. Inside the zip, I put the file in a directory with the same name as the file, minus the ".db" extension. As I don't have images for this, I don't add any.

  • I create manifest file.

    See code and comments
    manifest_data = {
      name: "mrsrch.jwpub",
      hash: sha256of_zipped_db_file,
      timestamp: Time.now.utc.iso8601,
      version: 1,
      expandedSize: File.size(path_to_unzipped_db_file),
      contentFormat: "z-a",
      htmlValidated: true,
      mepsPlatformVersion: 2.100000,
      mepsBuildNumber: MEPS_BUILD_NUMBER,
      publication: {
        fileName: "mrsrch.db",
        type: 9,
        title: name,
        shortTitle: name,
        displayTitle: name,
        referenceTitle: name,
        undatedReferenceTitle: name,
        titleRich: name,
        displayTitleRich: name,
        referenceTitleRich: name,
        undatedReferenceTitleRich: name,
        symbol: SYMBOL,
        uniqueEnglishSymbol: SYMBOL,
        uniqueSymbol: SYMBOL,
        undatedSymbol: SYMBOL,
        englishSymbol: SYMBOL,
        language: MEPS_LANGUAGE_INDEX,
        hash: sha1_of_unzipped_db_file,
        timestamp: Time.now.utc.iso8601,
        minPlatformVersion: 1,
        schemaVersion: 8,
        year: Date.today.year,
        issueId: 0,
        issueNumber: 0,
        variation: "",
        publicationType: PUBLICATION_TYPE,
        rootSymbol: SYMBOL,
        rootYear: Date.today.year,
        rootLanguage: 0,
        images: [],
        categories: [
            CATEGORY_SYMBOL
        ],
        attributes: [],
        issueAttributes: [],
        issueProperties: {
          title: "",
          undatedTitle: "",
          coverTitle: "",
          titleRich: "",
          undatedTitleRich: "",
          coverTitleRich: "",
          symbol: "",
          undatedSymbol: ""
        }
      }
    }

Then I zip zipped db + the manifest file into one "mrsrch.jwpub" file. And it can't be installed on JW Library at this point.
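One thing that may be worth double-checking is that the integrity fields in the manifest are recomputed from the exact bytes that end up in the archive. Below is a minimal Ruby sketch of that step. The field names follow the manifest shown above, the helper name and stand-in byte strings are hypothetical, and whether JW Library validates these fields in exactly this way is an assumption.

```ruby
require "digest"
require "json"

# Recompute the integrity-related manifest fields from the two byte strings.
# db_bytes:        contents of the unzipped .db file
# zipped_db_bytes: contents of the zip that wraps the .db file
def integrity_fields(db_bytes, zipped_db_bytes)
  {
    hash: Digest::SHA256.hexdigest(zipped_db_bytes),
    expandedSize: db_bytes.bytesize,
    publication: { hash: Digest::SHA1.hexdigest(db_bytes) }
  }
end

db_bytes = "SQLite format 3\0".b        # stand-in for the real .db file
zipped_db_bytes = "PK stand-in bytes".b # stand-in for the zipped db

puts JSON.pretty_generate(integrity_fields(db_bytes, zipped_db_bytes))

# Character count and byte count differ for Cyrillic text in UTF-8.
puts "привет".size     # 6
puts "привет".bytesize # 12
```

The last two lines matter if any length field is meant to be a byte count: `String#size` counts characters, so for Russian or Ukrainian text it reports roughly half the UTF-8 byte length. Whether `ContentLength` expects bytes or characters is not established in this thread.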

I feel a bit lost on what I can do next to make it work. Do you have any ideas? I'd appreciate any help 🙏

@livrasand

I recommend the Reviw wiki (it's free): https://www.github.com/livrasand/Reviw/wiki

@goodniceweb

Thanks I'll give it a try. Though, my Spanish to English translator sometimes gives me something crazy.

Maybe I'm missing something, but at least from what I understood, Reviw wants me to send them files so they create JWPub file for me, via GH issues. That's not how I want to approach the issue. But maybe I didn't understand their wiki.

@gokusander

gokusander commented Sep 21, 2024

> Thanks I'll give it a try. Though, my Spanish to English translator sometimes gives me something crazy.
>
> Maybe I'm missing something, but at least from what I understood, Reviw wants me to send them files so they create JWPub file for me, via GH issues. That's not how I want to approach the issue. But maybe I didn't understand their wiki.

It is something like that, but I only use Reviw to create the .db file; after that I build my own .jwpub (I use CyberChef to create my blobs). He helps with things I don't know, and I believe he can help you with your file and exchange ideas.

This is my .jwpub. I have been working on it for 2 weeks and it works well. He helped me with some things, so I believe asking there is a good way to get your questions answered.

I created a RefDocument table; it is complicated but cool.

I will add other talks; for now I have only made 1.
Rename the file to .jwpub

am_T (1.0).zip

@livrasand

> Thanks I'll give it a try. Though, my Spanish to English translator sometimes gives me something crazy.
>
> Maybe I'm missing something, but at least from what I understood, Reviw wants me to send them files so they create JWPub file for me, via GH issues. That's not how I want to approach the issue. But maybe I didn't understand their wiki.

If you need more specific help, I can help you. Me and the Reviw community have created all these JWPUBs: livrasand.github.io

Send me an email and we can talk about how to help you

@in-Load

in-Load commented Oct 6, 2024

> > Thanks I'll give it a try. Though, my Spanish to English translator sometimes gives me something crazy.
> > Maybe I'm missing something, but at least from what I understood, Reviw wants me to send them files so they create JWPub file for me, via GH issues. That's not how I want to approach the issue. But maybe I didn't understand their wiki.
>
> Is something like that, but Reviw i use to create a .db file, after that i create my own .jwpub (CyberChef i use for create my blobs). He helping with somethings i dont know, i believe he help you with your file and changing ideas.
>
> That is my .jwpub, i'm working does 2 weeks and work well. He help me with something. So i believe is a good thing use there for asnwer your questions.
>
> RefDocument table i create one, is complicated but cool.
>
> I will add other talks, for now I have only made 1 Rename the file to .jwpub
>
> am_T (1.0).zip

Hi everyone, what's up?
Apologies, I'm French, I don't speak English, and the translator is not good enough to understand or learn with, you know.
Thank you so much to all of you for your contributions and repositories, and thanks @gokusander for your test file. Could you tell me how you created the design interface (UI) of your file, please?

@gokusander

> design interface (Ui)

Sorry for the late reply, I was on vacation. What is "design interface (Ui)"? I'm not a dev, just a normal guy.

@orangethewell

> > design interface (Ui)
>
> Sorry for late, I was in vacation. What is "design interface (Ui)"? I'm not dev, just a normal guy

It's User Interface. They were probably referring to the Graphical User Interface: the window in which the application shows the text and images.

@orangethewell

orangethewell commented Nov 12, 2024

Good news, guys! I discovered how to extract all the styles from publications. I wasted some time trying to figure this out in the past, but now I understand how they set up the styles.

Basically, I did this:

  1. Create the root element for the publication chapter content with these classes: "jwac docClass-13 docId-1102023301 ms-ROMAN ml-T dir-ltr pub-lmd layout-reading layout-sidebar". NOTE: The most important class is jwac, which represents a jw-article; every other class rule checks that it is inside that class.
  2. Copy the colector.css from the JW website, which contains every class for (I think) every publication, even the printable publication data from the website.
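The two steps above can be sketched in Ruby like this. It is a minimal sketch, assuming the chapter HTML has already been decoded from the Content column and that colector.css was saved next to the generated page. The element type (`article`) and the helper name are my own choices for illustration; the class list is the one quoted in step 1.

```ruby
require "cgi"

# Class list quoted in step 1; docClass and docId differ per document.
ROOT_CLASSES = "jwac docClass-13 docId-1102023301 ms-ROMAN ml-T dir-ltr " \
               "pub-lmd layout-reading layout-sidebar"

# Wrap already-decoded chapter HTML in a root element carrying the classes
# that the stylesheet rules are scoped to, and link the copied stylesheet.
def wrap_article(inner_html, classes: ROOT_CLASSES, stylesheet: "colector.css")
  <<~HTML
    <!DOCTYPE html>
    <html>
      <head>
        <meta charset="utf-8">
        <link rel="stylesheet" href="#{CGI.escapeHTML(stylesheet)}">
      </head>
      <body>
        <article class="#{CGI.escapeHTML(classes)}">
          #{inner_html}
        </article>
      </body>
    </html>
  HTML
end

puts wrap_article("<p>Example paragraph</p>")
```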

image

Another thing I discovered (someone probably already knows this) is that backup files hold some metadata for publication markups, like color index, paragraph index, and token start and end indexes. If I'm not mistaken, there are 3 tables that hold markup data: one with markup start, end and paragraph, one with location/publication, and another with colorIndex.

image

I still have to mess with path matching, since I'm back on Windows and compiling the code for Linux would not work at all. But I will figure out how to make it more readable. For now, I just ported the code from Rust to JavaScript with React and bloated it in the process, so it still looks like a mess. When done, I will push a commit to my repo.

@orangethewell

orangethewell commented Nov 21, 2024

There is a Languages table in mepsunit.db, that's where you find the respective language mnemonic for each MepsLanguageId.

I couldn't find it. I found it once but not the second time, on neither Windows nor Android, only in the .apk data, and there as .jwdat.

EDIT: Nah, never mind, I just found it in the msixbundle from the Windows Store version.

@GeiserX

GeiserX commented Nov 25, 2024

Hey @orangethewell
Would you please share that mepsunit.db somewhere, perhaps over a GitHub repository? I'd need the relationship between langcode and MepsLanguageId in my python scripts. I'm no desktop developer so I'd highly appreciate it, as it would take long for me to learn how to unbundle that from the Windows app. Thank you!

@orangethewell

> Hey @orangethewell
> Would you please share that mepsunit.db somewhere, perhaps over a GitHub repository? I'd need the relationship between langcode and MepsLanguageId in my python scripts. I'm no desktop developer so I'd highly appreciate it, as it would take long for me to learn how to unbundle that from the Windows app. Thank you!

Sadly I can't, since that could be copyright infringement, but it isn't that hard to get: download the Windows edition, unzip the file, rename the msixbundle to a .zip extension and unzip it, then do the same with one of the packages inside the msixbundle (preferably the one suffixed with x64). After unzipping that, the MEPS unit is available in the Data folder.

@geimist

geimist commented Nov 25, 2024

Perfect - thank you very much. I had searched in vain for a long time for the relationship between LanguageID and symbol in the installation directory (library) on the Mac and couldn't find anything. Now I see that I should have looked directly in the installation package.
On the Mac you can find the DB here: /Applications/JW Library.app/Wrapper/JWLibrary.app/JWLResources/mepsunit.db. The values are in the table language.
