Hello, guys!

I’m in the process of moving my notes from Joplin, which is also a great tool, to Emacs 30.1. I use denote to manage my notes.

I found strange behavior when using org-publish: almost every note I create and export with org-publish can’t be served by a web server. It happens when the file name contains Cyrillic letters. I’ve tried nginx, Apache, Python’s http.server, and web-static-server. When I run a server and open an HTML file with a Latin name, it’s OK, but when there are Cyrillic letters in the file name, the web server tells me it can’t find a file with a name like “%u…”. However, when I open the HTML files locally in Firefox, everything works just fine.
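To make the symptom concrete, here is a small Python sketch comparing two ways a Cyrillic file name can be escaped in a URL. The file name `заметка.html` is a made-up example, and the guess that the “%u…” in the error is the old non-standard `%uXXXX` escaping (the style produced by JavaScript's legacy `escape()`) is an assumption, not something confirmed above. `legacy_u_escape` is a hypothetical helper written only for illustration.

```python
from urllib.parse import quote, unquote


def utf8_percent_encode(name: str) -> str:
    """Percent-encode a file name as UTF-8 bytes (the form web servers expect)."""
    return quote(name, safe="")


def legacy_u_escape(name: str) -> str:
    """Imitate the non-standard %uXXXX escaping (simplified: BMP characters only)."""
    out = []
    for ch in name:
        if ch.isascii() and (ch.isalnum() or ch in "@*_+-./"):
            out.append(ch)                    # left unescaped
        elif ord(ch) < 256:
            out.append(f"%{ord(ch):02X}")     # single-byte escape
        else:
            out.append(f"%u{ord(ch):04X}")    # UTF-16 code unit escape
    return "".join(out)


if __name__ == "__main__":
    name = "заметка.html"  # hypothetical note name
    print(utf8_percent_encode(name))
    print(legacy_u_escape(name))
    # A standard decoder does not understand %uXXXX and leaves it untouched,
    # so a server would end up looking for a file literally named "%u0437...".
    print(unquote(legacy_u_escape(name)))
```

If the links in the published pages or in the address bar look like the second form, that would explain why every server reports a missing file while the files themselves are fine.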

So after a couple of days of research I found that one reason for this behavior could be a wrong file name encoding. I’m not an expert, so maybe somebody can explain how to make Emacs publish notes with org-publish using file names in an encoding that any web server can read?
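One way to test the file name encoding hypothesis is to look at the raw bytes of the names on disk. The sketch below lists a directory in bytes mode and reports any name that is not valid UTF-8. The helper names `is_valid_utf8` and `non_utf8_names` are made up for this example.

```python
import os


def is_valid_utf8(raw_name: bytes) -> bool:
    """Return True if the raw file name bytes decode cleanly as UTF-8."""
    try:
        raw_name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def non_utf8_names(directory: str) -> list[bytes]:
    """List raw file names in `directory` that are not valid UTF-8."""
    # Passing a bytes path makes os.listdir return undecoded bytes names.
    raw_names = os.listdir(os.fsencode(directory))
    return sorted(n for n in raw_names if not is_valid_utf8(n))
```

Calling `non_utf8_names(os.path.expanduser("~/public_notes"))` on the publishing directory and getting an empty list would suggest the names on disk are already UTF-8, which would point at the links or requests rather than at the files.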

My Emacs config contains:

;; One project, "notes": export the denote directory to HTML.
(setq org-publish-project-alist
      '(("notes"
         :base-directory "~/org/denotes/"
         :recursive nil
         :publishing-directory "~/public_notes"
         :section-numbers nil
         :with-toc nil
         :with-author nil
         :with-creator nil
         :with-date nil
         :html-preamble "<nav><a href='index.html'>Notes</a></nav>"
         :html-postamble nil
         ;; Generate index.org as a sitemap, newest notes first.
         :auto-sitemap t
         :sitemap-filename "index.org"
         :sitemap-title "Notes"
         :sitemap-sort-files anti-chronologically)))

Host is Debian 13. UTF-8 is the only encoding enabled in locales. Servers I’ve tried so far also run on Debian 13 with UTF-8.

  • midribbon_action@lemmy.blahaj.zone
    16 days ago

    One more note on this: some searching did turn up web servers that can decode URIs into UTF-8 before handling them, but I believe this is very unsafe for a public server and, in the worst case, could allow public access to your entire drive. The vulnerabilities arise because different systems, and even different services on a single system, can treat specific Unicode characters differently. My advice above, to URL-encode the file names before serving or while building them, would avoid the need to decode requests as they come in.
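As a rough illustration of URL-encoding at build time, here is a Python sketch that writes a link whose `href` is already percent-encoded as UTF-8 while the visible text keeps the readable name. This is only one possible reading of the advice; `link_for` is a hypothetical helper, and nothing here is part of org-publish itself.

```python
from html import escape
from urllib.parse import quote


def link_for(filename: str) -> str:
    """Build an HTML link with a UTF-8 percent-encoded href and readable text."""
    href = quote(filename)  # "/" is kept by default, everything non-ASCII is encoded
    return f'<a href="{escape(href, quote=True)}">{escape(filename)}</a>'


if __name__ == "__main__":
    # Hypothetical note name, used only for illustration.
    print(link_for("заметка.html"))
```

With links written this way, the request path contains only ASCII characters.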