Automatic hyperlinking treats punctuation differently to webui reference

redjard@lemmy.dbzer0.com · 3 days ago

lemmy-ui seems to be using markdown-it, so the spec might be re.tpl_link_fuzzy from linkify-it: https://github.com/markdown-it/linkify-it/blob/master/lib/re.mjs

That looks like it includes the bracket logic.

idunnololz@lemmy.world · edit-2 3 days ago

Urg. I hate dealing with hyperlinks. The ) isn’t included because I didn’t want to match the closing paren when the URL is enclosed in parens, e.g. (https://google.com/). But clearly there are edge cases where it needs to be included.
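One way to handle both cases is to keep a trailing ) only when it has a matching ( inside the URL itself. This is a minimal sketch of that idea, not the app's actual code; the class name, method name, and example.com inputs are made up for illustration.

```java
final class TrailingParenTrim {
    /**
     * Drops trailing ')' characters that have no matching '(' inside the
     * candidate URL. Hypothetical helper, meant to run on whatever a
     * greedy URL matcher returned.
     */
    static String trimUnbalancedClosingParens(String candidate) {
        int open = 0;
        int close = 0;
        for (int i = 0; i < candidate.length(); i++) {
            char c = candidate.charAt(i);
            if (c == '(') {
                open++;
            } else if (c == ')') {
                close++;
            }
        }
        int end = candidate.length();
        // Only strip while there are more closing than opening parens.
        while (end > 0 && candidate.charAt(end - 1) == ')' && close > open) {
            end--;
            close--;
        }
        return candidate.substring(0, end);
    }

    public static void main(String[] args) {
        // URL wrapped in parens: the trailing ')' belongs to the sentence.
        System.out.println(trimUnbalancedClosingParens("https://google.com/)"));
        // Balanced parens inside the path: the ')' is part of the URL.
        System.out.println(trimUnbalancedClosingParens("https://example.com/wiki/Foo_(bar)"));
    }
}
```

Counting parens over the whole candidate is crude (it ignores where they sit in the URL), but it covers the enclosed-in-parens case without giving up links whose path ends in a balanced ).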

redjard@lemmy.dbzer0.com · 3 days ago

I got linkify-it to run with nodejs with some minor modifications and this is the output of console.log(re.tpl_link_fuzzy);: https://files.catbox.moe/8y1bfx.regex (tpl_link_fuzzy.regex, 18.47kiB)

Just paste 19kB of raw regex into your code, no one has ever regretted pasting 19kB of regex into their code.

idunnololz@lemmy.world · 3 days ago

This doesn’t convert cleanly to Java/Kotlin. At least one of the groups is messed up and I am not going to go through 19,000 characters to find each one. I found a library that looks promising and I’ll try that instead.

redjard@lemmy.dbzer0.com · 3 days ago

Understandable. I have an even worse idea then: https://files.catbox.moe/71dzf7.base64 (tpl_link_fuzzy.regex.base64, 24.63kiB)

Take this base64, and decode it in kotlin into a string variable. And then maybe make kotlin give it to you in a form you can paste back into the code idk
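Roughly what that would look like, shown in Java (the same JDK classes are callable from Kotlin). This is only a sketch: the class and method names are made up, and the short base64 string is a stand-in that decodes to a tiny pattern, not the real 24 kiB file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Pattern;

final class Base64RegexLoader {
    /** Decodes a base64 string back into the regex source text. */
    static String decode(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Decodes and compiles; throws PatternSyntaxException if Java rejects the syntax. */
    static Pattern compile(String base64) {
        return Pattern.compile(decode(base64));
    }

    public static void main(String[] args) {
        // Stand-in payload: base64 of the pattern https?://\S+
        String sample = "aHR0cHM/Oi8vXFMr";
        System.out.println(decode(sample));
        System.out.println(compile(sample).matcher("https://google.com/").matches());
    }
}
```

If the base64 file contains line breaks, `Base64.getMimeDecoder()` tolerates them where the basic decoder throws. Decoding only sidesteps string-escaping problems when pasting; it does nothing about syntax differences between JavaScript regexes and `java.util.regex`, so `Pattern.compile` may still reject the real pattern.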

idunnololz@lemmy.world · 2 days ago

I will consider this if all else fails, but I also don’t have high hopes this regex would even work.

redjard@lemmy.dbzer0.com · 3 days ago

What markdown-it does is match parentheses across the path only. It makes sense to parse urls component by component, for example protocol and domain can’t contain those characters anyway.
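A rough sketch of that component-by-component idea: match scheme and host with a small regex, then walk the rest by hand while tracking paren depth. This is not linkify-it's actual logic, just an illustration; the class name, method name, and the simplified host pattern are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ComponentLinkScanner {
    // Scheme and host only. Deliberately simplified: no ports, userinfo, or IDN.
    private static final Pattern SCHEME_AND_HOST =
            Pattern.compile("https?://[A-Za-z0-9.-]+");

    /** Returns the first link found in text, or null if there is none. */
    static String findFirstLink(String text) {
        Matcher m = SCHEME_AND_HOST.matcher(text);
        if (!m.find()) {
            return null;
        }
        int end = m.end();
        int depth = 0; // parens opened inside the path so far
        while (end < text.length()) {
            char c = text.charAt(end);
            if (Character.isWhitespace(c)) {
                break;
            }
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                if (depth == 0) {
                    break; // closes a paren that was opened before the URL
                }
                depth--;
            }
            end++;
        }
        return text.substring(m.start(), end);
    }

    public static void main(String[] args) {
        System.out.println(findFirstLink("(https://google.com/)"));
        System.out.println(findFirstLink("see https://example.com/wiki/Foo_(bar) now"));
    }
}
```

Because the host pattern can't contain parens at all, the balancing only has to happen in the part after it. Trailing sentence punctuation such as a final period is not handled here.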

idunnololz@lemmy.world · 3 days ago

Interesting, there is some backend logic that prevents links within parens from ending in an alphanumeric character. For instance if I send the comment (https://google.com), Lemmy auto changes it to (https://google.com/). I wonder if this is done to make it easier to parse links.

test: (https://google.com/)