How I blog using Emacs and Org-mode

2021-08-29

This article documents the settings I use to build this website.

Why

There exist many different frameworks for blogging and/or hosting websites. Why make the website from Org-mode alone?

This is mainly a matter of preference. I like to work within Emacs alone, instead of relying on external software to produce the website. So this approach is not for everyone.

Requirements

To produce a website from Emacs and Org-mode alone, be sure to install GNU Emacs. There are no other dependencies, as we will build the tools we need along the way.

Results

The things we generate for this website are listed as follows.

  • One HTML file for each article
  • One CSS file for the entire website
  • One sitemap file for each category of articles
  • One index file as the entry point of the website
  • One Atom feed for each category of articles

Architecture

The folder structure of my blog folder is as follows.

  • folder blog
    • folder math
      • file math-sitemap.org
      • articles
    • folder code
      • file code-sitemap.org
      • articles
    • folder life
      • file life-sitemap.org
      • articles
    • file index.org
    • file how-i-make-website.org

Disclaimer: The following is heavily inspired by the website of Protesilaos. Specifically, the sidebar and the "latest updates" section are directly inspired by the website.

org-publish

Basically, I use org-publish, a built-in feature of Org-mode, to generate the HTML files. I then call a custom Emacs Lisp function to post-process the generated files and add further features.

Exclude Javascript snippet

(setq org-html-head-include-scripts nil)

By default, org-publish includes a small JavaScript snippet in every page, which is not used anywhere on this website. Moreover, the comment in that snippet contains a magnet URL, which causes the page to be reported as invalid when I run it through an HTML validator.

Basic project settings

The project settings for org-publish are stored in the variable org-publish-project-alist. The basic settings in my setup are reproduced below.

(setq org-publish-project-alist
      (list
       (list
        "website"
        :components (list "math" "code" "life" "main"))
       (list
        "code"
        :base-directory (directory-file-name
                         (expand-file-name
                          "code" (expand-file-name
                                  "blog" org-directory)))
        :base-extension "org"
        :publishing-directory (directory-file-name
                               (expand-file-name
                                "public" org-directory))
        :publishing-function #'org-html-publish-to-html
        :section-numbers nil
        :with-toc nil
        :with-email t
        :with-creator t
        :auto-sitemap t
        :recursive t)
       (list
        "main"
        :base-directory (directory-file-name
                         (expand-file-name
                          "blog" org-directory))
        :base-extension "org"
        :publishing-directory (directory-file-name
                               (expand-file-name
                                "public" org-directory))
        :publishing-function #'org-html-publish-to-html
        :section-numbers nil
        :with-toc nil
        :with-email t
        :with-creator t
        :auto-sitemap nil
        :recursive nil)))

The settings for the components math and life are not shown, as they are similar to the code component.

Some notes about the setting:

  • Find every file in the directory ORG-DIRECTORY/blog/code with extension "org", call the function org-html-publish-to-html to convert it into an HTML file, and put the result in the directory ORG-DIRECTORY/public.
  • Don't use section numbers and table-of-contents in the resulting files.
  • Show the Email of the author.
  • Show that the files are created by Emacs and Org-mode. (See the bottom of each page.)
  • Create sitemap files automatically.
  • Do this recursively for the sub-directories found.
  • For files in the directory ORG-DIRECTORY/blog with extension "org", do the same as the above, but don't create sitemap files.
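
Since the math and life components differ from the code component only in their names, the three entries could be produced by a small helper. The following is a minimal sketch of that idea, not the configuration actually used here: the name my/blog-component is hypothetical, and org-directory is given a fallback value only so that the snippet evaluates on its own.

```elisp
;; Hypothetical helper; not part of the original configuration.
(defvar org-directory "~/org")          ; normally defined by Org itself

(defun my/blog-component (name)
  "Return an `org-publish-project-alist' entry for the category NAME."
  (let ((blog-root (expand-file-name "blog" org-directory)))
    (list name
          :base-directory (expand-file-name name blog-root)
          :base-extension "org"
          :publishing-directory (expand-file-name "public" org-directory)
          :publishing-function #'org-html-publish-to-html
          :section-numbers nil
          :with-toc nil
          :with-email t
          :with-creator t
          :auto-sitemap t
          :recursive t)))

;; One entry per category, all built from the same template.
(mapcar #'my/blog-component '("math" "code" "life"))
```

The resulting list could then be combined with the "website" and "main" entries when setting org-publish-project-alist.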

Custom CSS and favicon

The website generated by org-publish with default settings is too plain, and lacks some features that would make it more readable. So I use a custom CSS file to style the website. The settings are as follows.

(defvar durand-org-publish-css-file nil
  "A custom css-file for publishing.")

(setq durand-org-publish-css-file
      "<link rel=\"stylesheet\" type=\"text/css\" \
href=\"website.css\"/>")

(defvar durand-org-publish-favicon nil
  "A custom favicon for publishing.")

(setq durand-org-publish-favicon
      "<link rel='shortcut icon' \
href='https://jsdurand.xyz/favicon.ico'/>")

The custom CSS is in the file website.css, and the custom icon is in the file favicon.ico. Then I add the following property-value pair to the project settings:

:html-head (concat durand-org-publish-css-file
                   "\n"
                   durand-org-publish-favicon)

As for the (ugly) icon file, I first wrote a little C program to generate the picture in the raw PBM format. Then I converted it to an ICO file with ImageMagick (the convert program, to be precise).

Sidebar

The website contains a "sidebar", which is usually at the left of the page. It contains navigation links to the different sitemap pages (which are covered in the next sub-section).

When the page is less than 1200px wide, the sidebar is displayed at the top of the page instead, so that it does not block the contents of the page.

This is created as follows.

(defvar durand-org-publish-sidebar nil
  "The sidebar that provides the navigation of the website.")

(setq durand-org-publish-sidebar
      "<div class=\"sidebar\">\n\
<a href=\"math-sitemap.html\"> Math </a>\n\
<a href=\"code-sitemap.html\"> Code </a>\n\
<a  href=\"life-sitemap.html\"> Life </a>\n\
<a href=\"index.html\"> Home </a></div>")

;; REVIEW: This might be unnecessary.
(defun durand-org-publish-insert-sidebar (_arg)
  "Return the sidebar."
  durand-org-publish-sidebar)

Then add the following to the project settings:

:html-link-home ""
:html-link-up ""
:html-home/up-format ""
:html-preamble #'durand-org-publish-insert-sidebar

Note that I use a function to return the string so that when I change the contents of the string, I don't have to add that string to the project settings again. But maybe there are better ways to achieve the same effect?
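
The effect of this indirection can be seen in a toy version. This is only a sketch with hypothetical names (my/sidebar, my/insert-sidebar, my/project); it shows that a function stored in the settings reads the variable at call time, so a later change to the variable is picked up without touching the settings.

```elisp
;; Hypothetical names; a toy version of the sidebar indirection.
(defvar my/sidebar "<div class=\"sidebar\">old</div>")

(defun my/insert-sidebar (_info)
  "Return the current value of `my/sidebar'."
  my/sidebar)

;; The project entry stores the function, not the string.
(defvar my/project (list "main" :html-preamble #'my/insert-sidebar))

;; Changing the variable later requires no change to the entry:
;; the call below returns the new string.
(setq my/sidebar "<div class=\"sidebar\">new</div>")
(funcall (plist-get (cdr my/project) :html-preamble) nil)
```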

Sitemap files

The format of the sitemap files is customized as follows.

Add the following to the project settings:

:sitemap-function #'durand-org-publish-sitemap
:sitemap-format-entry #'durand-org-publish-sitemap-format
:sitemap-date-format "Published: %F %a %R"
:sitemap-filename "code-sitemap.org"
:sitemap-title "About coding"
:sitemap-sort-files 'anti-chronologically

One note:

  • The option :sitemap-date-format has no effect currently.

The functions are reproduced below.

;;; A dirty hack to insert some custom strings

;; The settings about Atom feeds are covered in the next section.
(defvar durand-sitemap-custom-string-alist nil
  "An association list that relates the title of the sitemap and the string to insert.")

(setq durand-sitemap-custom-string-alist
      (list
       (list "About coding"
             "This is my coding blog.  It contains my coding experiments, or \
one might think of them as development diaries."
             "code-atom.xml")
       (list "My life"
             "This is my casual blog.  It contains articles about my plain \
life."
             "life-atom.xml")
       (list "Mathematics"
             "My Mathematics-related articles are put here."
             "math-atom.xml")))

;;; Custom sitemap function

(defun durand-org-publish-sitemap (title rep)
  "Return the sitemap as a string.
TITLE is the title of the sitemap.

REP is a representation of the files and directories in the
project.  Use such functions as `org-list-to-org' or
`org-list-to-subtree' to transform it."
  (format "#+TITLE: %s\n#+AUTHOR: JSDurand\n%s#+DATE: <%s>\n\n%s\n\n\
#+ATTR_HTML: :border nil :rules nil :frame nil\n\
%s\n\n\
[[https://jsdurand.xyz/%s][Web feed]]"
          title
          "#+HTML_LINK_UP: index.html"
          (format-time-string "%F %a %R")
          (cadr (assoc title durand-sitemap-custom-string-alist #'string=))
          ;; generate a table
          (org-list-to-generic
           rep
           '(:ustart "|---|"
             :uend "|---|"
             :isep "|---|"))
          (caddr (assoc title durand-sitemap-custom-string-alist #'string=))))

;;; Custom sitemap format

;; NOTE: I use a table to style the entries.

;; NOTE: This stores the date in an ugly long format.  But worry not:
;; it will be replaced by a clean form in the post-processing phase.
;; The long form is inserted here so that we can sort the entries
;; precisely.

(defun durand-org-publish-sitemap-format (entry _style project)
  "Format the entry for the sitemap as a table with a date.
ENTRY is the entry file name to format.

STYLE is either 'list or 'tree, which is ignored by us.

PROJECT is the current project."
  (format "| %s | [[file:%s][%s]] |"
          (format-time-string
           "%FT%T%z"
           (org-publish-find-date entry project)
           (current-time-zone))
          entry
          (org-publish-find-title entry project)))
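
To see what such a table row looks like, here is a small sketch with made-up data. The function my/sitemap-row, the file name and the title are hypothetical, and the time is formatted in UTC rather than in the local time zone, so that the result does not depend on the machine.

```elisp
;; Hypothetical stand-in for the format function above, taking the
;; time and the title directly instead of looking them up.
(defun my/sitemap-row (time entry title)
  "Return an Org table row for ENTRY with TITLE, dated TIME (in UTC)."
  (format "| %s | [[file:%s][%s]] |"
          (format-time-string "%FT%T%z" time t)
          entry
          title))

;; 2021-08-29 10:32:00 UTC, with a made-up file name and title.
(my/sitemap-row (encode-time 0 32 10 29 8 2021 t)
                "example.org"
                "An example")
;; => "| 2021-08-29T10:32:00+0000 | [[file:example.org][An example]] |"
```

The long timestamp keeps the time of day, which is what lets entries published on the same day be ordered precisely.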

Post-processing

The above can be considered to be built-in features of org-publish, with some simple customizations. But there are some additional features that are not easy to customize directly:

  • Latest published articles
  • Atom feeds

Latest published articles

When entering the website, I would like to provide the reader with a list of the latest published articles, so that the reader can quickly see whether anything new has been published anywhere on the website.

Before I start, I want to mention that I am using a development version of Emacs 28, which is unstable, so sometimes things do not work as expected. For example, the function plist-get does not work for me. So I use my own function instead:

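(defun durand-org-publish-plist-get (prop plist)
  "Return the value of PROP in PLIST, or nil if PROP is absent.
Properties are compared with `eq'.  Note that the argument order is
the reverse of that of `plist-get'."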
  (let (res)
    (while (consp plist)
      (cond
       ((eq (car plist) prop)
        (setq res (cadr plist))
        (setq plist nil))
       ((setq plist (cddr plist)))))
    res))

The list of latest articles itself is implemented as follows.

(defun durand-org-publish-convert-time (spec)
  "Convert SPEC to a valid time value.
SPEC should be the result of `parse-time-string'.

It is assumed that the year, the month, and the day components
are present."
  (let ((sec (car spec))
        (minute (cadr spec))
        (hour (caddr spec)))
    (encode-time
     (append
      (list
       (or sec 0)
       (or minute 0)
       (or hour 0))
      (cdddr spec)))))

(defun durand-take (n ls)
  "Return the first N items of LS.
If the length of LS is less than N, then return the whole LS."
  (cond
   ((< (length ls) n) ls)
   ((let ((i 0) result)
      (while (< i n)
        (cond
         (ls
          (setq result (cons (car ls) result))
          (setq ls (cdr ls))
          (setq i (1+ i)))
         ((setq i n))))
      (reverse result)))))

(defvar durand-org-index-entries-max-num 10
  "The maximal number of entries to show on the index page.")

(defun durand-org-post-process (project)
  "Generate a proper index page and Atom feeds.
Also shorten the date strings in the sitemap files, and store the
completion information in an attribute.

The feeds are generated by the function `durand-org-generate-atom-feed'."
  (let* ((project-plist (cdr (assoc project org-publish-project-alist #'string=)))
         (components (durand-org-publish-plist-get
                      :components project-plist))
         (sitemap-file (durand-org-publish-plist-get
                        :sitemap-filename project-plist))
         ;; I cheat here
         (publishing-dir (expand-file-name "~/org/public/"))
         (publish-sitemap-file (cond
                                (sitemap-file
                                 (replace-regexp-in-string
                                  "org$" "html"
                                  (expand-file-name sitemap-file publishing-dir)))))
         (index-file (expand-file-name "index.html" publishing-dir))
         contents)
    (cond
     ;; depth-first recursion
     (components
      (setq contents (apply #'append
                            (delete nil (mapcar #'durand-org-post-process components))))
      ;; We only want some items
      (setq contents
            (durand-take
             durand-org-index-entries-max-num
             (sort contents
                   (lambda (x y)
                     (time-less-p
                      (car y) (car x))))))
      ;; If contents is non-nil, we need to process the info in the
      ;; index page.
      (cond
       (contents
        (with-temp-buffer
          (insert-file-contents index-file)
          (goto-char (point-min))
          (search-forward "Latest updates")
          (goto-char (line-end-position))
          (while (search-forward "<table" nil t)
            ;; delete existing tables
            (delete-region
             (match-beginning 0)
             (progn (search-forward "</table>")
                    (match-end 0))))
          (insert "\n")
          (insert "<table cellspacing=\"0\" cellpadding=\"6\">


<colgroup>
<col  class=\"org-right\" />

<col  class=\"org-left\" />
</colgroup>\n\n<tbody>\n")
          (mapc
           (lambda (entry)
             (insert "<tr>")
             (insert "<td class=\"org-right\">")
             (insert (format-time-string "%F" (car entry)))
             (insert "</td>\n")
             (insert "<td class=\"org-left\">")
             (insert (cdr entry))
             (insert "</td>\n</tr>\n"))
           contents)
          (insert "</tbody>\n</table>")
          (write-region nil nil index-file)))))
     ((and publish-sitemap-file
           (stringp publish-sitemap-file)
           (file-exists-p publish-sitemap-file))
      (with-temp-buffer
        (insert-file-contents publish-sitemap-file)
        (goto-char (point-min))
        (search-forward "</colgroup>" nil t)
        (let ((regexp (rx-to-string
                       '(seq
                         "<t" (any "hd")
                         " "
                         (one-or-more (not (any ">")))
                         ">")
                       t))
              pos pos-2 temp temp-time)
          (while (re-search-forward regexp nil t)
            (setq pos (point))
            ;; replace the first org-left by org-right
            (save-excursion
              (goto-char (match-beginning 0))
              (setq pos-2 (match-end 0))
              (save-match-data
                (cond
                 ((re-search-forward "left" pos-2 t)
                  (replace-match "right")
                  ;; fix pos
                  (setq pos (1+ pos))))))
            (search-forward "</t")
            (setq pos-2 (match-beginning 0))
            (forward-line 1)
            (cond
             ((re-search-forward 
               (rx-to-string
                '(seq
                  (= 4 (any digit)) "-"
                  (= 2 (any digit)) "-"
                  (= 2 (any digit)) "T")
                t)
               (line-end-position) t)
              (setq temp-time
                    (durand-org-publish-convert-time
                     (parse-time-string
                      (buffer-substring-no-properties
                       (match-beginning 0)
                       (line-end-position))))))
             (t
              (setq
               temp-time
               (durand-org-publish-convert-time
                (parse-time-string
                 (buffer-substring-no-properties
                  pos pos-2))))
              (save-excursion
                (goto-char pos)
                (delete-region pos pos-2)
                (insert (format-time-string "%F" temp-time)))
              (insert
               "<!--"
               (format-time-string
                "%FT%T%z\n" temp-time (current-time-zone))
               "-->\n")
              (write-region nil nil publish-sitemap-file)))
            (setq
             temp
             (cons
              (cons
               temp-time
               (progn
                 (re-search-forward regexp)
                 (setq pos (point))
                 (search-forward "</t")
                 (buffer-substring-no-properties
                  pos (match-beginning 0))))
              temp)))
          (durand-org-generate-atom-feed
           project
           (expand-file-name (concat project "-atom.xml")
                             publishing-dir)
           (mapcar
            (lambda (cell)
              (let* ((orig-string (cdr cell))
                     (temp 0)
                     (title
                      (progn
                        (string-match ">" orig-string)
                        (setq temp (match-end 0))
                        (substring
                         orig-string
                         temp
                         (progn
                           (string-match "</a>" orig-string temp)
                           (match-beginning 0)))))
                     (file-name
                      (progn
                        (string-match "href=\"" orig-string)
                        (setq temp (match-end 0))
                        (substring
                         orig-string
                         temp
                         (progn
                           (string-match "\">" orig-string temp)
                           (match-beginning 0)))))
                     (content
                      (with-temp-buffer
                        (insert-file-contents
                         (expand-file-name file-name publishing-dir))
                        (goto-char (point-min))
                        (search-forward "<div id=\"content\">" nil t)
                        (search-forward "<p>" nil t)
                        (buffer-substring-no-properties
                         (1+ (point))
                         (progn
                           (search-forward "</p>" nil t)
                           (match-beginning 0)))))
                     ;; escape html
                     (content
                      (replace-regexp-in-string
                       ">" "&gt;"
                       (replace-regexp-in-string
                        "<" "&lt;"
                        (replace-regexp-in-string
                         "&" "&amp;" content)))))
                (list
                 title
                 (durand-org-atom-format-time
                  (file-attribute-modification-time
                   (file-attributes
                    (expand-file-name file-name publishing-dir))))
                 (concat "https://jsdurand.xyz/" file-name)
                 (concat "https://jsdurand.xyz/" file-name)
                 (durand-org-atom-format-time (car cell))
                 content)))
            (reverse temp)))
          (reverse temp)))))

It is quite a long function. Basically, it loops through the components of the project and, for each component, finds the corresponding sitemap file and collects the entries from it. It then sorts all the entries by date and puts the newest ones (10 by default) in the index file.

In addition, it generates the Atom feeds for each component as well, which is implemented in the next section.
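
The collect, sort and truncate step can be sketched in isolation. This is an illustration under assumptions, not the code used above: my/latest-entries is a hypothetical name, the entries are made up, and the built-in seq-take stands in for durand-take.

```elisp
(require 'seq)

(defun my/latest-entries (entries n)
  "Return the N newest items of ENTRIES, newest first.
Each item is a cons cell (TIME . HTML), as in the function above."
  (seq-take
   ;; `sort' is destructive, so work on a copy of the list.
   (sort (copy-sequence entries)
         (lambda (x y) (time-less-p (car y) (car x))))
   n))

;; Made-up data: three entries, of which the two newest are kept.
(my/latest-entries
 (list (cons (encode-time 0 0 0 1 1 2021 t) "old")
       (cons (encode-time 0 0 0 1 3 2021 t) "new")
       (cons (encode-time 0 0 0 1 2 2021 t) "middle"))
 2)
```

If fewer than N entries exist, seq-take simply returns all of them, which matches the documented behaviour of durand-take.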

Note that this does not parse the HTML files properly, so it only works with my overall settings, and might fail for arbitrary HTML files.

I have thought about parsing the HTML files through the built-in function libxml-parse-html-region. But I think that is unnecessary generality, so I chose the simpler and dirtier way above.
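
For comparison, here is roughly what the parsing route could look like. This is a sketch only: it assumes an Emacs built with libxml2 support, the function my/sitemap-rows is hypothetical, and the input is a hand-written fragment that merely resembles a row of an exported sitemap table.

```elisp
(require 'dom)
(require 'subr-x)                       ; for `string-trim' on older Emacs

(defun my/sitemap-rows (html)
  "Return a list of (DATE HREF TITLE), one per linked table row in HTML.
Requires an Emacs built with libxml2 support."
  (with-temp-buffer
    (insert html)
    (let ((dom (libxml-parse-html-region (point-min) (point-max))))
      (delq nil
            (mapcar
             (lambda (row)
               ;; The first cell holds the date, the link holds the rest.
               (let ((cell (car (dom-by-tag row 'td)))
                     (link (car (dom-by-tag row 'a))))
                 (when (and cell link)
                   (list (string-trim (dom-texts cell))
                         (dom-attr link 'href)
                         (string-trim (dom-texts link))))))
             (dom-by-tag dom 'tr))))))

;; Made-up input resembling one row of a sitemap table.
(my/sitemap-rows
 "<table><tr><td class=\"org-right\">2021-08-29</td>
<td class=\"org-left\"><a href=\"example.html\">An example</a></td></tr></table>")
```

Rows without a link, such as header rows, are skipped by the when form.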

Atom feeds

Generating feeds

For the website to be considered "fully functioning" (by me), I require it to provide convenient web feeds, so that readers can track the publications without consulting the website every 5 minutes like crazy.

So I downloaded RFC 4287, the specification of the Atom format, and wrote some functions to generate feeds for my website.

(defvar durand-org-atom-titles-alist nil
  "An association list of Atom feed titles with the project name.")

(setq durand-org-atom-titles-alist
      (list
       (list "code" "JSDurand's codes" "https://jsdurand.xyz/code-atom.xml")
       (list "math" "Math articles of JSDurand"
             "https://jsdurand.xyz/math-atom.xml")
       (list "life" "JSDurand's life"
             "https://jsdurand.xyz/life-atom.xml")))

(defvar durand-org-atom-preamble nil
  "The preamble of an Atom feed.")

(setq durand-org-atom-preamble
      "<?xml version=\"1.0\" encoding=\"utf-8\" ?>
<feed xmlns=\"http://www.w3.org/2005/Atom\">
<id>https://jsdurand.xyz/atom.xml</id>
<title>%s</title>
<author>
<name>JSDurand</name>
<uri> https://jsdurand.xyz </uri>
<email>durand@jsdurand.xyz</email>
</author>
<link rel=\"self\" type=\"application/atom+xml\" href=\"%s\"/>
<rights>Copyright (c) 2021, JSDurand</rights>
<updated>%s</updated>
<generator uri=\"https://www.gnu.org/software/emacs/\" version=\"28.0.50\">\
Emacs</generator>\n")

(defvar durand-org-atom-entry-template nil
  "The template for an Atom entry.
It HAS to be formatted with 6 arguments in the following order:

TITLE: the title.  Note this does not have a subtitle.

UPDATED-TIME: the newest updated time.

ID: I think I will use the URL as the ID directly.

LINK: Link to this entry.

PUBLISHED-TIME: the time this is published.

CONTENT: a short content.")

(setq durand-org-atom-entry-template
      "<entry>
<title type=\"text\">%s</title>
<updated>%s</updated>
<id>%s</id>
<link rel=\"self\" type=\"text/html\" href=\"%s\"/>
<published>%s</published>
<content type=\"html\">
%s
</content>
</entry>")

(defvar durand-org-atom-postamble nil
  "The post-amble of an Atom feed.")

(setq durand-org-atom-postamble
      "</feed>")

(defun durand-org-atom-format-time (time)
  "Format TIME as an RFC 3339 timestamp, as required by Atom."
  (let* ((offset (car (current-time-zone time)))
         (abs-offset (abs offset)))
    (concat
     (format-time-string "%FT%T" time)
     ;; The UTC offset is written as a sign, hours and minutes.
     (format "%s%02d:%02d"
             (if (< offset 0) "-" "+")
             (/ abs-offset 3600)
             (/ (% abs-offset 3600) 60)))))

(defun durand-org-generate-atom-feed (project file-name entries)
  "Generate an Atom feed for ENTRIES and save in FILE-NAME.
PROJECT is the name of the subproject.

ENTRIES is a list of entries.

An entry is a list of the form (TITLE UPTIME ID LINK PUBTIME CONTENT).
See the documentation string for `durand-org-atom-entry-template' for more."
  (with-temp-buffer
    (insert (apply #'format
                   durand-org-atom-preamble
                   (append
                    (cdr
                     (assoc project durand-org-atom-titles-alist
                            #'string=))
                    (list (durand-org-atom-format-time nil))))
            "\n")
    (mapc
     (lambda (entry)
       (insert (apply #'format durand-org-atom-entry-template
                      entry)
               "\n"))
     entries)
    (insert durand-org-atom-postamble)
    (write-region nil nil file-name)))

Basically, this just uses a template and fills the template with suitable data, collected from the sitemap files in the post-processing stage.
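
The template-filling step can be shown on a much smaller scale. The following sketch uses a hypothetical, shortened template and made-up data; my/escape-html and my/entry-template are not part of the code above. It also shows why the ampersand has to be escaped before the angle brackets.

```elisp
(defun my/escape-html (text)
  "Escape &, < and > in TEXT.
The ampersand must be handled first, otherwise the entities
produced for < and > would be escaped a second time."
  (replace-regexp-in-string
   ">" "&gt;"
   (replace-regexp-in-string
    "<" "&lt;"
    (replace-regexp-in-string "&" "&amp;" text))))

;; A hypothetical template, much shorter than the real one.
(defvar my/entry-template
  "<entry><title>%s</title><updated>%s</updated>\
<content type=\"html\">%s</content></entry>")

;; An entry is a list whose items match the placeholders in order.
(apply #'format my/entry-template
       (list (my/escape-html "A & B")
             "2021-08-29T10:32:00+08:00"
             (my/escape-html "<em>x</em> & y")))
```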

Validating feeds

Since the above-mentioned RFC 4287 provides a machine-readable specification of the formal grammar of a valid Atom feed, we can use it to verify our feeds with programs. See this website for details. But basically we convert Appendix B of RFC 4287 from the RNC format to the RNG format, and then use xmllint to validate the feeds.
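
The validation step could also be run from within Emacs. This is only a sketch under assumptions: the function my/validate-atom-feed and the file names are hypothetical, xmllint must be installed, and the RNG file is assumed to have been produced beforehand (for instance with a converter such as trang, which is my assumption and not necessarily the tool used here).

```elisp
(defun my/validate-atom-feed (rng-file feed-file)
  "Return non-nil if FEED-FILE is valid according to RNG-FILE.
Runs xmllint, which must be installed; its output is discarded."
  (eq 0 (call-process "xmllint" nil nil nil
                      "--noout" "--relaxng"
                      (expand-file-name rng-file)
                      (expand-file-name feed-file))))

;; Hypothetical file names; uncomment to try.
;; (my/validate-atom-feed "~/org/atom.rng" "~/org/public/code-atom.xml")
```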

Adding date/time info to each page

By default, org-html-publish-to-html puts a little block of date/time information at the bottom of each page, along with some meta-information. But it is inconvenient to require the reader to scroll to the end of the page to find the publication date of the article, so I added this information directly below the title.

At first I tried to alter the internal mechanism of org-html-publish-to-html to add this information. But it turned out that this mechanism is too entangled to be changed easily. So I advised the function to achieve this effect.

(advice-add #'org-html-publish-to-html :filter-return #'durand-org-publish-html-advice)

This :filter-return advice receives the return value of the advised function as its input. For org-html-publish-to-html, that value is the name of the published HTML file.
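
The mechanism is easier to see on a toy function. In this sketch all names (my/publish, my/record-output, my/published) are hypothetical. The point is that the advice receives the return value and that whatever the advice returns becomes the new return value, so it should hand the value back.

```elisp
;; Toy functions with hypothetical names, to show the mechanism only.
(defun my/publish (name)
  "Pretend to publish NAME and return the output file name."
  (concat name ".html"))

(defvar my/published nil
  "Output file names seen by the advice, newest first.")

(defun my/record-output (file)
  "Record FILE, then return it so that callers still see it."
  (push file my/published)
  file)

(advice-add 'my/publish :filter-return #'my/record-output)

;; The advice receives "index.html" and passes it through.
(my/publish "index")
```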

(defun durand-org-publish-html-advice (html-file-name)
  "Add date/time info below the title in HTML-FILE-NAME.
Used as :filter-return advice on `org-html-publish-to-html'."
  (with-temp-buffer
    (insert-file-contents html-file-name)
    (goto-char (point-min))
    (cond
     ((search-forward "h1 class=\"title\"" nil t)
      ;; if it does not have a title, we do not add a time info.
      (search-forward "</h1>" nil)
      ;; if it already has a time info, don't generate again
      (cond ((save-excursion
               (forward-char 1)
               (looking-at-p "<p class=\"subtitle\"")))
            (t
             (let* ((date (let (temp)
                            (save-excursion
                              (goto-char (point-max))
                              (cond
                               ((search-backward "<p class=\"date\">Date: " nil t)
                                (setq temp (match-end 0))
                                (buffer-substring-no-properties
                                 temp (progn
                                        (search-forward "</p>")
                                        (match-beginning 0))))))))
                    (date-time (and date
                                    (encode-time
                                     (append
                                      (durand-take 8 (parse-time-string date))
                                      (list 28800))))))
               (cond
                (date
                 (insert
                  (format
                   "\n<p class=\"subtitle\">%s</p>\n"
                   (format-time-string "%F" date-time)))
                 (write-region nil nil html-file-name)))))))))
  ;; Return the file name so the advised function's value is unchanged.
  html-file-name)

All original content is licensed under the free copyleft license CC BY-SA.

Author: JSDurand

Email: durand@jsdurand.xyz

Date: 2021-08-29 Dim 10:32:00 CST

GNU Emacs 28.2.50 of 2022-12-05 (Org mode 9.5.5)
