Presence of HTML tags inside extracted annotations (Zotero 6) #185

Open
Klemet opened this issue May 10, 2022 · 12 comments · May be fixed by #212
Labels
enhancement New feature or request formatting zotero 6 FR or bugs for the upcoming Zotero beta

Comments

@Klemet

Klemet commented May 10, 2022

Describe the bug

When extracting annotations from a PDF with Zotero 6's Add note from annotations feature and exporting the resulting note with the Export to Markdown function of MDNotes, the resulting markdown file contains HTML code, which can make it difficult to read and edit.

To Reproduce
Steps to reproduce the behavior:

  1. Make annotations in PDF file in Zotero
  2. Export those annotations to a note with Add note from annotations in Zotero
  3. Use Export to Markdown option from MDNotes
  4. Open the resulting file

Expected behavior
The resulting markdown file should not contain HTML tags, and should remain in markdown format.

Screenshots
image

Desktop (please complete the following information):

  • OS: Windows 10
  • Zotero version: v6.0.7
  • Mdnotes version: v0.2.3
  • Zotfile version: v5.1.1
  • BetterBibtex version: v.6.7.1
@Klemet Klemet added the bug Something isn't working label May 10, 2022
@argenos
Owner

argenos commented May 10, 2022

That is the way Zotero 6 exports the annotations from their DB. The exports from mdnotes explicitly allow for spans, since they can be used to format exported annotations if you use an external PDF viewer and the Zotfile workflow.

Could you try to modify your note template in Zotero? I'm not sure this will remove the span, but it's worth a try.

@argenos argenos added enhancement New feature or request formatting and removed bug Something isn't working labels May 10, 2022
@Klemet
Author

Klemet commented May 10, 2022

Hello, @argenos !

Thanks a lot for your answer, and for the wonder that is MDnotes in general 😄 !

The reason makes sense, and tweaking the note templates of Zotero looks like a great idea. However, I'm not able to find anywhere where it would be possible to affect the presence of these <span> tags. The only values that can be edited don't mention them:

image

If somebody finds a way to change that, I'll be all in !

@kcudding

kcudding commented May 30, 2022

I'm trying to understand exactly what happened; I haven't used mdnotes for a while. After updating Zotero to 6.0.8, I could not figure out how to replicate my previous workflow, which produced .md notes like the one in the attached screenshot.

I've updated Better BibTeX, Zotfile and mdnotes, but the behaviour is exactly the same as before the update: I can only extract annotations if I use the Zotero PDF viewer, and those annotations produce the same HTML mess described by the commenter above.

So... does that mean the note extraction functionality is entirely gone now, unless someone finds a way to modify the templates in a useful way?
Screen Shot 2022-05-30 at 5 47 07 PM

Mac OS
Zotero 6.0.8
Mdnotes 0.2.3
Zotfile 5.1.1
Better BibTeX 6.7.1

baroneUnmetNeedsAnalyzing2017 - Extracted Annotations (2021-05-02, 92555 a.m.)The biologists in this study see training as the most important factor .md

@kcudding

Okay, after a hard reboot, I CAN use an external PDF viewer to produce an .md file of annotations from mdnotes. However, the file does not contain Zotero links (and, annoyingly, has two Annotations headers). It was generated by using mdnotes on the annotations folder. This file is correctly named.
Screen Shot 2022-05-30 at 6 08 47 PM

Or I can use the native Zotero export to .md to produce a note file from the annotations folder that contains links, but which is incorrectly named.

Is this still the expected behaviour, with a possible resolution through a template modification?

Screen Shot 2022-05-30 at 6 12 24 PM

So I guess this is why people are talking about the citations add-on? But can it be resolved just by using mdnotes?

@argenos
Owner

argenos commented May 31, 2022

@kcudding I recommend switching to Zotero Integrator if this is an issue. As I mentioned above, it's unlikely I will address the issue any time soon, since it comes from the way Zotero exports their internal annotations to HTML. If you wish, you can play around with the templates used by Zotero 6 to do that export, or open a PR to fix this.

@argenos argenos added the zotero 6 FR or bugs for the upcoming Zotero beta label May 31, 2022
@cjpoor

cjpoor commented Jun 9, 2022

It is possible to strip the HTML tags from the text. Doing so also removes the link to the page of the PDF in Zotero, but a link to the PDF can be added back in using Zutilo's "copy select item links."
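
As a rough illustration of the tag stripping mentioned above, here is a minimal sketch. stripHtmlTags is a hypothetical helper name, not something provided by mdnotes, Zotero, or Zutilo, and a regular expression like this is only adequate for the simple span markup found in extracted-annotation notes, not for arbitrary HTML.

```javascript
// Hypothetical helper (not part of mdnotes or Zotero): remove HTML tags
// from a line of an exported note and decode a few common entities.
function stripHtmlTags(text) {
  return text
    .replace(/<[^>]*>/g, '')   // drop anything that looks like a tag
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, '&');   // decode &amp; last so "&amp;lt;" becomes "&lt;"
}

// Example input shaped like the spans discussed in this issue (made-up text)
const line =
  '<span class="highlight" data-annotation="%7B%7D">"quoted text"</span> ' +
  '<span class="citation">(Doe, 2020, p. 3)</span>';
console.log(stripHtmlTags(line));
```

As noted above, the link back to the PDF page is lost with this approach, because it lives in the attributes of the removed tags.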

However it is quicker to export the note using both Zotero export and mdnotes, then copy the tags, related, and anything else you want from the mdnotes exported file into the Zotero exported file.
You can also copy tags to the clipboard using Zutilo.

If you use the unique ID method to make links between notes, e.g.:
Note ID: 20220601115332
Related: [[20220601112158]]
then the link in square brackets is recognised by Zettelkasten apps like Zettlr.
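
For anyone wanting to generate such IDs automatically, here is a small sketch. It assumes the ID is simply a local timestamp in YYYYMMDDHHMMSS form, which is a guess based on the example above; makeNoteId and wikiLink are hypothetical names, not functions of mdnotes, Zutilo, or Zettlr.

```javascript
// Assumption: the note ID is a local timestamp formatted as YYYYMMDDHHMMSS.
function makeNoteId(date) {
  const pad = (n) => String(n).padStart(2, '0');
  return (
    String(date.getFullYear()) +
    pad(date.getMonth() + 1) +   // getMonth() is zero-based
    pad(date.getDate()) +
    pad(date.getHours()) +
    pad(date.getMinutes()) +
    pad(date.getSeconds())
  );
}

// Wrap an ID in double square brackets, as in the "Related:" line above.
function wikiLink(id) {
  return `[[${id}]]`;
}

const id = makeNoteId(new Date());
console.log(`Note ID: ${id}`);
console.log(`Related: ${wikiLink('20220601112158')}`);
```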

If someone with coding knowledge can think of a way of automating any or all of this I will buy them a 🍺

@argenos argenos pinned this issue Jul 28, 2022
@huyz

huyz commented Aug 4, 2022

I'm a first-time user of Zotero and mdnotes, trying to come up with a workflow so that my annotations import well into Obsidian. I'm naturally using Zotero 6 since I'm new.

Right now, it looks like using Zotero 6's Add Note from Annotations, then selecting the note, and doing Export Note with Include Zotero Links ends up with a pretty good result, except you lose all the colors.

So what is OP trying to do that built-in Zotero 6 functionality doesn't give you? What's missing?

@Klemet
Author

Klemet commented Aug 4, 2022

> I'm a first-time user of Zotero and mdnotes, trying to come up with a workflow so that my annotations import well into Obsidian. I'm naturally using Zotero 6 since I'm new.
>
> Right now, it looks like using Zotero 6's Add Note from Annotations, then selecting the note, and doing Export Note with Include Zotero Links ends up with a pretty good result, except you lose all the colors.
>
> So what is OP trying to do that built-in Zotero 6 functionality doesn't give you? What's missing?

Nice catch, @huyz ! Indeed, it seems that the problem is not present when using the built-in Export Note function of Zotero.

To answer your question, it's just a matter of practicality. Zotero's Export Note works well, but mdnotes has a lot of customization options that make everything quicker (like properly naming the file you're exporting, giving you the right folder to export to by default, etc.). I agree that it's a matter of convenience, not a necessity.

@kcudding

kcudding commented Aug 4, 2022

The deal killer for me is that, while I can use the native Zotero export to .md to produce a note file from the annotations folder that contains links, that file is incorrectly named. You end up with a file labelled Annotations, which overwrites the previous file unless you have renamed it, and which is not identifiable. That's not just a convenience issue unless you only use the feature occasionally.

I and others have raised the file name issue with Zotero, but there has been no movement yet.

@huyz

huyz commented Aug 4, 2022

> You end up with a file labelled Annotations, which overwrites the previous file unless you have renamed it

One workaround is to replace the Annotations text at the top of the note with a more descriptive name (e.g., copied from the item title). If you then do an Export note..., you get a unique and more descriptive file name.

@kcudding

kcudding commented Aug 4, 2022

mdnotes automatically renames the file to the Better BibTeX reference key, which I found invaluable for organization. Doing it manually, you have to get the key (you can set up command c - shift to do this) and then paste it in. So instead of processing annotations to a .md file with one click as before, it's now 4 actions to get the same job done.

Unless someone has new ideas about how to automate this? As far as I know, there is still no option to automatically set the title field in the Zotero-generated annotation note.

@gjimenezUCM

> @kcudding I recommend switching to Zotero Integrator if this is an issue. As I mentioned above, it's unlikely I will address the issue any time soon, since it comes from the way Zotero exports their internal annotations to HTML. If you wish, you can play around with the templates used by Zotero 6 to do that export, or open a PR to fix this.

I think I found a solution for this issue. The span.highlight element (a span element whose class is highlight) represents an annotation in Zotero, and its data-annotation attribute holds information about the attachment, the annotation, and its position in the attachment. For example, if we decode the URI component contained in data-annotation, we can build the URI by transforming:

{
    "attachmentURI":"http://zotero.org/users/8528213/items/L3YWES9Q",
    "annotationKey":"AT3Q9HXT",
    "color":"#ffff00",
    "pageLabel":"3",
    "position":{
        "pageIndex":21,
        "rects":[[51.744,124.472,391.283,136.526],[51.744,111.522,391.283,123.576],[51.744,98.572,391.283,110.626],[51.744,85.622,115.667,97.676]]
    },
    "citationItem":{
        "uris":["http://zotero.org/users/8528213/items/5P58CWR2"],
        "locator":"3"
    }
}

into

zotero://open-pdf/library/items/L3YWES9Q?page=22&annotation=AT3Q9HXT
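
The transformation above can be sketched as a standalone function. This is only an illustration under stated assumptions: annotationToOpenPdfUri is a hypothetical name, the input is assumed to be the URI-encoded JSON string found in data-annotation, and the link format assumes a personal library, as in the example URL.

```javascript
// Hypothetical helper: build a zotero://open-pdf link from the
// URI-encoded JSON stored in a span.highlight data-annotation attribute.
function annotationToOpenPdfUri(encoded) {
  const data = JSON.parse(decodeURIComponent(encoded));
  // The attachment key is the last path segment of attachmentURI
  const itemKey = data.attachmentURI.split('/').pop();
  // pageIndex 21 corresponds to page=22 in the example above
  const page = data.position.pageIndex + 1;
  return `zotero://open-pdf/library/items/${itemKey}?page=${page}&annotation=${data.annotationKey}`;
}

// Re-encode a trimmed copy of the example object to simulate the attribute value
const sample = encodeURIComponent(JSON.stringify({
  attachmentURI: 'http://zotero.org/users/8528213/items/L3YWES9Q',
  annotationKey: 'AT3Q9HXT',
  position: { pageIndex: 21 },
}));

console.log(annotationToOpenPdfUri(sample));
// prints zotero://open-pdf/library/items/L3YWES9Q?page=22&annotation=AT3Q9HXT
```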

This URL can be used to create a link from the content of the span.citation element. To do that, you can add a new rule in the getConverter function in markdown-utils.js, something like this:

converter.addRule('annotation-link', {
    filter: function (node, options) {
        // Only applies to span.citation elements
        return (
            node.nodeName === 'SPAN' &&
            node.getAttribute('class') === 'citation'
        );
    },
    replacement: function (content, node) {
        // The span.highlight element is expected to be the previous sibling of span.citation
        let sibling = node.previousElementSibling;
        let encoded = sibling && sibling.getAttribute('class') === 'highlight'
            ? sibling.getAttribute('data-annotation')
            : null;
        // Without annotation data, leave the citation text unchanged
        if (!encoded) {
            return content;
        }
        // Decode data-annotation into an object
        let data = JSON.parse(decodeURIComponent(encoded));
        // The attachment (item) key is the last segment of the attachment URI.
        // pop() is used instead of at(-1), which older JavaScript engines do not support.
        let itemKey = data.attachmentURI.split('/').pop();
        // pageIndex appears to be zero-based, so add 1 (example: pageIndex 21 -> page=22)
        let page = data.position.pageIndex + 1;
        let url = `zotero://open-pdf/library/items/${itemKey}?page=${page}&annotation=${data.annotationKey}`;
        return `[${content}](${url})`;
    }
});

I have tested this on its own, but I have not tried it inside zotero-mdnotes yet.

@adzcai adzcai linked a pull request Mar 13, 2023 that will close this issue