dotnet target files sometimes have the wrong part of the string linked #1062

xt0rted · 2020-10-06T07:33:59Z

Describe the bug

If the text to be linked appears more than once on the same line with the same casing, the wrong occurrence can be linked. The misplaced links also include the character before and after the matched text.

The plugin that's running on this file is plugin-dotnet.

To Reproduce

Go to https://gist.github.com/xt0rted/f6bf5f0d9c4bbb42f829cb8f0163957c#file-directory-build-targets
Add octolinker-debug to the body
View lines 4, 5, 16, and 17

Expected behavior

For Include="NUnit" to be linked, not Version="$(NUnitVersion)".

Additional context


edavidaja · 2020-10-12T21:59:53Z

Variation on the theme: if an R function is called through the namespace operator and has the same name as its package, the function call and some of the following punctuation are linked instead of the package name, which precedes the two colons.

fregante · 2021-03-23T17:41:55Z

Variations of this issue have been reported before: #618

It seems to me that the matching happens twice in 2 separate ways: once to detect the dependencies and once to linkify the content via helper-insert-link.

If the first matching happens correctly, why not just reuse the same matched DOM element instead of using findAndReplaceDOMText?

stefanbuck · 2021-03-24T22:57:46Z

Yes, matching happens twice, as you describe. The first match operates on a string representation of the code and returns an object like this:

{
  "endPos": 5,
  "endPosInBlob": 11,
  "lineNumber": 2,
  "startPos": 0,
  "startPosInBlob": 6,
  "values": [
    "hello-world",
  ],
},

Then findAndReplaceDOMText is used to wrap the match in an anchor tag. This part seems to be buggy. I started looking into it last weekend but wasn't able to finish.
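To make the ambiguity concrete, here is a minimal sketch, not OctoLinker's actual code. It assumes a hypothetical line loosely modelled on the report above, in which the matched value occurs twice. A search by text alone has two candidates, while a start offset recorded by the first pass identifies exactly one. The names `findAll` and `pickByOffset` are made up for illustration, and this is not a claim about the root cause of the bug.

```javascript
// Hypothetical line, loosely modelled on the report; not copied from the gist.
const line = '<PackageReference Update="NUnit" Version="$(NUnitVersion)" />';
const value = 'NUnit';

// Collect every offset at which `value` occurs in `text`.
function findAll(text, value) {
  const offsets = [];
  let from = 0;
  while (true) {
    const index = text.indexOf(value, from);
    if (index === -1) return offsets;
    offsets.push(index);
    from = index + 1;
  }
}

// Keep only the occurrence that starts where the first pass said it starts.
// Returns undefined if no occurrence starts at that offset.
function pickByOffset(text, value, startPos) {
  return findAll(text, value).find((offset) => offset === startPos);
}

// Stand-in for the start offset the first matching pass would record.
const recordedStart = line.indexOf('"NUnit"') + 1;

console.log(findAll(line, value).length); // more than one candidate by text alone
console.log(pickByOffset(line, value, recordedStart) === recordedStart);
```

The point is only that the text value by itself is not a unique key within a line; any second pass that locates the link target needs the position as well.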

We use findAndReplaceDOMText because a match may span multiple child nodes, as in this example:

<td id="LC19" class="blob-code blob-code-inner js-file-line">
    <span class="pl-k">use</span> <span class="pl-v">Illuminate</span>\<span class="pl-v">Contracts</span>\<span class="pl-v">Container</span>\<span class="pl-v">BindingResolutionException</span>;
</td>
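As a rough illustration of why the match cannot be treated as a single element, the sketch below maps a character range in the row's concatenated text onto the text nodes it touches. It works on plain strings instead of real DOM nodes, omits the whitespace inside the `<td>`, and uses a hypothetical helper name (`rangeToNodeSlices`); it is not how findAndReplaceDOMText is implemented.

```javascript
// Text of each text node in the example row, in document order
// (leading and trailing whitespace of the <td> omitted for brevity).
const nodeTexts = [
  'use', ' ', 'Illuminate', '\\', 'Contracts', '\\',
  'Container', '\\', 'BindingResolutionException', ';',
];

// Map a [start, end) range in the concatenated text to per-node slices.
// Each slice says which node is touched and which part of its text is covered.
function rangeToNodeSlices(texts, start, end) {
  const slices = [];
  let offset = 0;
  texts.forEach((text, nodeIndex) => {
    const nodeStart = offset;
    const nodeEnd = offset + text.length;
    offset = nodeEnd;
    const from = Math.max(start, nodeStart);
    const to = Math.min(end, nodeEnd);
    if (from < to) {
      slices.push({ nodeIndex, from: from - nodeStart, to: to - nodeStart });
    }
  });
  return slices;
}

const fullText = nodeTexts.join('');
const target = 'Illuminate\\Contracts\\Container\\BindingResolutionException';
const start = fullText.indexOf(target);
const slices = rangeToNodeSlices(nodeTexts, start, start + target.length);

// The single logical match is spread over several text nodes.
console.log(slices.map((slice) => slice.nodeIndex));
```

Wrapping such a match in one anchor means either wrapping each slice separately or restructuring the surrounding elements, which is the part that has to stay independent of GitHub's markup.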

Also, in a diff view, one or more additional elements may surround the match.

(Random diff, not OctoLinker related)

However, as mentioned before, I have only started looking into this and need more time to investigate further. This part is the most complicated bit in OctoLinker, but it is necessary to decouple the code from the underlying DOM as much as possible. In the early days OctoLinker broke a few times because GitHub updated class names I used to find and wrap matches.

@fregante I'm curious to know how refined-github is dealing with this problem.

fregante · 2021-03-25T00:16:56Z

I think I suggested it before: https://github.com/fregante/zip-text-nodes + a custom script to linkify text, for example https://github.com/sindresorhus/linkify-issues

The latter receives plain text and returns a DocumentFragment with the same text, but one or more Anchor elements mixed in. This would be your code detecting and linkifying the correct text from a simple string.

Then this is passed to zip-text-nodes so that this new Anchor element is merged into the original DOM, by basically diffing it.

This works well on titles or single lines but I don’t know how performant it is on whole files. Probably you could linkify the whole file but then only pass the single line to be zipped together.

This works across diffs and all kinds of DOM elements, because it simply ignores them and reads the textContent of each.
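The two steps described above can be sketched without a DOM. This is only an illustration of the idea under stated assumptions: it does not use the real zip-text-nodes or linkify-issues APIs, plain objects stand in for Anchor elements and text nodes, and `linkifyAt`, `zip` and the `example.com` URL are hypothetical.

```javascript
// Step 1: "linkify" a plain string. Returns segments whose texts, joined,
// equal the input; a segment with an `href` stands in for an Anchor element.
function linkifyAt(text, start, end, href) {
  return [
    { text: text.slice(0, start) },
    { text: text.slice(start, end), href },
    { text: text.slice(end) },
  ].filter((segment) => segment.text.length > 0);
}

// Step 2: "zip" the segments back onto the original text nodes by walking
// both sequences in parallel and cutting at every node or segment boundary.
function zip(nodeTexts, segments) {
  const joinedSegments = segments.map((segment) => segment.text).join('');
  if (nodeTexts.join('') !== joinedSegments) {
    throw new Error('segments and nodes must carry the same text');
  }
  const result = nodeTexts.map(() => []);
  let segIndex = 0;
  let segOffset = 0;
  nodeTexts.forEach((nodeText, nodeIndex) => {
    let nodeOffset = 0;
    while (nodeOffset < nodeText.length) {
      const segment = segments[segIndex];
      const take = Math.min(nodeText.length - nodeOffset, segment.text.length - segOffset);
      const piece = { text: nodeText.slice(nodeOffset, nodeOffset + take) };
      if (segment.href) piece.href = segment.href;
      result[nodeIndex].push(piece);
      nodeOffset += take;
      segOffset += take;
      if (segOffset === segment.text.length) {
        segIndex += 1;
        segOffset = 0;
      }
    }
  });
  return result;
}

// Hypothetical tokenization of part of a line into three text nodes.
const nodes = ['Update', '=', '"NUnit"'];
const text = nodes.join('');
const start = text.indexOf('NUnit');
const segments = linkifyAt(text, start, start + 'NUnit'.length, 'https://example.com/NUnit');
console.log(JSON.stringify(zip(nodes, segments)));
```

Because the zip step only compares text, the element structure around the nodes never enters the picture, which is the property that makes this approach work across diffs.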

xt0rted added the bug label Oct 6, 2020

stefanbuck self-assigned this Mar 14, 2021
