Fixed link building by patrick-sil · Pull Request #133 · useblocks/sphinx-simplepdf

patrick-sil · 2026-02-17T11:18:12Z

This PR is supposed to resolve #123

However, let me add to the description of the issue above that it also applies to internal references inside the PDF, hence, currently :ref: or :numref: are not working for sub-headlines in any case.

The suggested changes here, should fix the issue for both the HTML and the PDF.

What does my solution do:

Introduces a new function which
Does check on whether the link actually contains a #
If so, it does only take the part of the link of the right-hand side of the last, i.e., right most #
Return the now fixed link or the original link (also works for cases where only one # is present in the link, as it basically doen't modify the link in such cases)

If anyone has a better solution, which I assume are possible, as I'm not all that familiar with this project, please suggest them.

It would be nice, if this could be resolved soon, thank you :)

patdhlk

@patrick-sil I left a few comments in the files.

Concept is right, needs another pass. Thanks for the contribution.

And one more thing that can be done in a follow up: Taking the last # fragment is a blunt instrument. The bug is that Sphinx singlehtml emits #document-path#real-anchor. You know the shape of the bad input. It'd be more precise to strip a #document-* prefix rather than blindly taking the rightmost #. Right now if someone has a legitimate # in a fragment identifier (unlikely but not impossible), this silently eats it.

patdhlk · 2026-04-01T16:15:53Z

+    """
+    Fixes a link that refers to a sub-headline.
+
+    For instance the broken link:
+        #headline#sub-headline#sub-sub-headline
+    Will be fixed to:
+        #sub-sub-headline
+
+    If the link is not a sub-headline, the link isn't modified
+
+    Args:
+        link_to_fix: The link to fix
+
+    Returns: If no modification is needed, link_to_fix otherwise the fixed link
+    """


The docstring is floating above the function — it's a bare string expression, not a docstring. Move it inside. But honestly, the function is three lines; the 14-line comment block is doing more harm than good. If you need a comment, one line is fine.

patdhlk · 2026-04-01T16:16:28Z

+        pos = link_to_fix.rfind("#")
+        if pos != -1:
+            # In case we only find one #, this will still work as it won't affect the link
+            link_to_fix = link_to_fix [pos::]


Style nit: link_to_fix [pos::] — space before bracket, double colon. Should be link_to_fix[pos:]

But more importantly: this is doing too much work to express a simple idea. The whole function is:

def _fix_href(self, href): """Strip everything before the last '#'.""" i = href.rfind("#") return href[i:] if i != -1 else href

Or even just inline it — href.rsplit("#", 1)[-1] prefixed with # gets you there in one expression.

The real problem though: you only call this on toc sidebar links. The issue report says body cross-references are also broken (:py:obj:, :ref:). Those aren't in .sphinxsidebarwrapper. So this fixes the table of contents but leaves the actual document links broken if I understand it correctly.

If you're going to fix this, fix it everywhere. Find all <a class="reference internal"> in the whole soup, not just the sidebar.

Also — the name fix_link tells me nothing. Every function in this class "fixes" something. What does this one actually do? _strip_document_prefix or _normalize_fragment would tell the next reader what's happening without reading the body.

And it should have a leading underscore. It's internal.

patrick-sil · 2026-04-02T14:24:45Z

Concept is right, needs another pass. Thanks for the contribution.

Happy to help :)

And one more thing that can be done in a follow up: Taking the last # fragment is a blunt instrument. The bug is that Sphinx singlehtml emits #document-path#real-anchor. You know the shape of the bad input. It'd be more precise to strip a #document-* prefix rather than blindly taking the rightmost #. Right now if someone has a legitimate # in a fragment identifier (unlikely but not impossible), this silently eats it.

Hmm yes, correct. However, how would one even safely figure out where to split the string?

Refer to the example here based on what the document we are building:
#document-Requirements/Software/Req_15#Req_15
But now assume someone uses # in paths/file-names:
#document-Requirements/Software#1/Req_#15#Req_#15
A simple solution for this problem would be something like

new_link = link.rsplit("/", 1)[-1]
new_link = new_link[int(len(new_link)/2):]

Which only works, if the anchor is the same as the document name.

With e.g.
#document-Chapters/Installation_and_Configuration/Installation_Guide#workstation-clients
That wouldn't work and assuming the worst case e.g.
#document-Chapters/Installation_and_Configuration/Installation_Guide#1#workstation-clients#2
Now I don't even see any trivial solution at all, as we neither can assume the last # is the one we need or that after the last / that there are no other #.

Fixed link building

50a3649

patrick-sil mentioned this pull request Feb 17, 2026

Link fixer fails to process cross-references in document body, causing malformed double-hash hrefs #123

Open

patdhlk self-requested a review April 1, 2026 16:11

patdhlk requested changes Apr 1, 2026

View reviewed changes

patrick-sil and others added 2 commits April 2, 2026 13:27

Merge branch 'useblocks:main' into main

3801410

Fixes after review

08019e0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixed link building#133

Fixed link building#133
patrick-sil wants to merge 3 commits into
useblocks:mainfrom
patrick-sil:main

patrick-sil commented Feb 17, 2026

Uh oh!

patdhlk left a comment

Uh oh!

patdhlk Apr 1, 2026

Uh oh!

patdhlk Apr 1, 2026

Uh oh!

patrick-sil commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

patrick-sil commented Feb 17, 2026

Uh oh!

patdhlk left a comment

Choose a reason for hiding this comment

Uh oh!

patdhlk Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

patdhlk Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

patrick-sil commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants