How I blog using Emacs and Org-mode
2021-08-29
This article documents the settings I use to build this website.
Why
There exist many different frameworks for blogging and/or hosting websites. Why make the website from Org-mode alone?
This is mainly a matter of preference. I like to work in Emacs alone, instead of relying on external software to produce the website. So this approach is not for everyone.
Requirements
To produce a website from Emacs and Org-mode alone, be sure to install GNU Emacs. There are no other dependencies, as we will build the tools we need along the way.
Results
The things we generate for this website are listed as follows.
- One HTML file for each article
- One CSS file for the entire website
- One sitemap file for each category of articles
- One index file as the entry point of the website
- One Atom feed for each category of articles
Architecture
The folder structure of my blog folder is as follows.
- folder blog
  - folder math
    - file math-sitemap.org
    - articles
  - folder code
    - file code-sitemap.org
    - articles
  - folder life
    - file life-sitemap.org
    - articles
  - file index.org
  - file how-i-make-website.org
- folder math
Disclaimer: The following is heavily inspired by the website of Protesilaos. Specifically, the sidebar and the "latest updates" section are directly inspired by the website.
org-publish
Basically, I use a built-in feature of Org-mode called org-publish to generate the HTML files. I then call a custom Emacs Lisp function to post-process the generated HTML files and add further features.
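The two steps described above could be wrapped in a single command. The following is only a minimal sketch, not the code used for this site: the command name my/publish-website is hypothetical, and it assumes that a project named "website" exists in org-publish-project-alist and that the post-processing function durand-org-post-process (shown later in this article) has been loaded.

```elisp
(require 'ox-publish)

;; Hypothetical wrapper; the name is not part of the original setup.
(defun my/publish-website (&optional force)
  "Publish the \"website\" project, then post-process the result.
With a prefix argument FORCE, republish every file."
  (interactive "P")
  ;; Step 1: let org-publish generate the HTML files.
  (org-publish "website" force)
  ;; Step 2: post-process the generated files (index page, feeds).
  ;; Skipped silently if the post-processing function is not loaded.
  (when (fboundp 'durand-org-post-process)
    (durand-org-post-process "website")))
```

Calling M-x my/publish-website would then run both phases in order.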
Exclude Javascript snippet
(setq org-html-head-include-scripts nil)
By default, org-publish includes a small JavaScript snippet in each page, which is not used anywhere. The comment in that snippet contains a magnet URL, which an HTML validator reports as invalid. So I exclude the snippet.
Basic project settings
The project settings for org-publish are stored in the variable org-publish-project-alist. The basic settings in my setup are reproduced below.
(setq org-publish-project-alist
      (list
       (list "website"
             :components (list "math" "code" "life" "main"))
       (list "code"
             :base-directory (directory-file-name
                              (expand-file-name
                               "code"
                               (expand-file-name "blog" org-directory)))
             :base-extension "org"
             :publishing-directory (directory-file-name
                                    (expand-file-name "public" org-directory))
             :publishing-function #'org-html-publish-to-html
             :section-numbers nil
             :with-toc nil
             :with-email t
             :with-creator t
             :auto-sitemap t
             :recursive t)
       (list "main"
             :base-directory (directory-file-name
                              (expand-file-name "blog" org-directory))
             :base-extension "org"
             :publishing-directory (directory-file-name
                                    (expand-file-name "public" org-directory))
             :publishing-function #'org-html-publish-to-html
             :section-numbers nil
             :with-toc nil
             :with-email t
             :with-creator t
             :auto-sitemap nil
             :recursive nil)))
The settings for the components math and life are not shown, as they are similar to the code component.
Some notes about the setting:
- Find every file with the extension "org" in the directory ORG-DIRECTORY/blog/code, call the function org-html-publish-to-html to convert it into an HTML file, and put the result in the directory ORG-DIRECTORY/public.
- Don't use section numbers or a table of contents in the resulting files.
- Show the email of the author.
- Show that the files are created by Emacs and Org-mode. (See the bottom of each page.)
- Create sitemap files automatically.
- Do this recursively for the sub-directories found.
- For files with the extension "org" in the directory ORG-DIRECTORY/blog, do the same as above, but don't create sitemap files and don't descend into sub-directories.
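Since the three category components differ only in their names, their settings could in principle be generated by one helper. The following is a sketch under assumptions, not the actual configuration: the helper name my/blog-component is hypothetical, and the two directories are passed in explicitly so that the example does not depend on org-directory. The sitemap file name follows the NAME-sitemap.org pattern from the folder structure above.

```elisp
;; Hypothetical helper: build one category entry for
;; `org-publish-project-alist'.
(defun my/blog-component (name blog-root public-dir)
  "Return a publishing project for the category NAME.
BLOG-ROOT is the directory containing the category folders, and
PUBLIC-DIR is the directory receiving the HTML files."
  (list name
        :base-directory (directory-file-name
                         (expand-file-name name blog-root))
        :base-extension "org"
        :publishing-directory (directory-file-name
                               (expand-file-name public-dir))
        :publishing-function 'org-html-publish-to-html
        :section-numbers nil
        :with-toc nil
        :with-email t
        :with-creator t
        :auto-sitemap t
        :sitemap-filename (concat name "-sitemap.org")
        :recursive t))

;; Usage: one entry per category.
(mapcar (lambda (name)
          (my/blog-component name "~/org/blog" "~/org/public"))
        '("math" "code" "life"))
```

The resulting list could be appended to the "website" and "main" entries when setting org-publish-project-alist.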
Custom CSS and favicon
The website generated by org-publish alone is too plain and not very readable. So I use a custom CSS file to style the website. The settings are as follows.
(defvar durand-org-publish-css-file nil
  "A custom css-file for publishing.")

(setq durand-org-publish-css-file
      "<link rel=\"stylesheet\" type=\"text/css\" \
href=\"website.css\"/>")

(defvar durand-org-publish-favicon nil
  "A custom favicon for publishing.")

(setq durand-org-publish-favicon
      "<link rel='shortcut icon' \
href='https://jsdurand.xyz/favicon2.ico'/>")
The custom CSS is in the file website.css, and the custom icon is in the .ico file linked above. Then I add the following property-value pair to the project settings:
:html-head (concat durand-org-publish-css-file "\n" durand-org-publish-favicon)
As to the (ugly) icon file, I first wrote a little C program to generate the picture in the raw PBM format. Then I converted it to an ICO file with ImageMagick (the convert program, to be precise).
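To give an idea of how little is needed to produce a PBM picture, here is a sketch in Emacs Lisp rather than C. It is not the program used for this site: the function name my/pbm-checkerboard is hypothetical, the checkerboard is just a placeholder picture, and it writes the plain-text PBM variant (magic number P1) instead of the raw one.

```elisp
;; Hypothetical helper: render a square checkerboard as plain PBM text.
(defun my/pbm-checkerboard (size cell)
  "Return a plain PBM string for a SIZE x SIZE checkerboard.
CELL is the side length of each square, in pixels."
  (let ((rows nil))
    (dotimes (y size)
      (let ((bits nil))
        (dotimes (x size)
          ;; In PBM, 1 is a black pixel and 0 is a white pixel.
          (push (if (= 0 (% (+ (/ x cell) (/ y cell)) 2)) "1" "0")
                bits))
        (push (mapconcat #'identity (nreverse bits) " ") rows)))
    ;; Header: magic number, then width and height.
    (concat (format "P1\n%d %d\n" size size)
            (mapconcat #'identity (nreverse rows) "\n")
            "\n")))

;; Usage: write a 16x16 picture into the temporary directory.
(with-temp-file (expand-file-name "icon.pbm" temporary-file-directory)
  (insert (my/pbm-checkerboard 16 4)))
```

The resulting file can then be converted with ImageMagick, for example by running convert icon.pbm favicon.ico in a shell.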
Sidebar
The website contains a "sidebar", that is usually at the left of the page. It contains navigation links to different sitemap pages (which will be covered in the next sub-section).
When the page is narrower than 1200px, the sidebar is displayed at the top of the page instead, so that it does not cover the contents of the page.
This is created as follows.
(defvar durand-org-publish-sidebar nil
  "The sidebar that provides the navigation of the website.")

(setq durand-org-publish-sidebar
      "<div class=\"sidebar\">\n\
<a href=\"math-sitemap.html\"> Math </a>\n\
<a href=\"code-sitemap.html\"> Code </a>\n\
<a href=\"life-sitemap.html\"> Life </a>\n\
<a href=\"index.html\"> Home </a></div>")

;; REVIEW: This might be unnecessary.
(defun durand-org-publish-insert-sidebar (_arg)
  "Return the sidebar."
  durand-org-publish-sidebar)
Then add the following to the project settings:
:html-link-home ""
:html-link-up ""
:html-home/up-format ""
:html-preamble #'durand-org-publish-insert-sidebar
Note that I use a function to return the string so that when I change the contents of the string, I don't have to add that string to the project settings again. But maybe there are better ways to achieve the same effect?
Sitemap files
The format of the sitemap files is customized as follows.
Add the following to the project settings:
:sitemap-function #'durand-org-publish-sitemap
:sitemap-format-entry #'durand-org-publish-sitemap-format
:sitemap-date-format "Published: %F %a %R"
:sitemap-filename "code-sitemap.org"
:sitemap-title "About coding"
:sitemap-sort-files 'anti-chronologically
One note: the option :sitemap-date-format currently has no effect.
The functions are reproduced below.
;;; A dirty hack to insert some custom strings

;; The settings about Atom feeds are covered in the next section.

(defvar durand-sitemap-custom-string-alist nil
  "An association list that relates the title of the sitemap and
the string to insert.")

(setq durand-sitemap-custom-string-alist
      (list
       (list "About coding"
             "This is my coding blog. It contains my coding experiments, or \
one might think of them as development diaries."
             "code-atom.xml")
       (list "My life"
             "This is my casual blog. It contains articles about my plain \
life."
             "life-atom.xml")
       (list "Mathematics"
             "My Mathematics-related articles are put here."
             "math-atom.xml")))

;;; Custom sitemap function

(defun durand-org-publish-sitemap (title rep)
  "Return the sitemap as a string.
TITLE is the title of the sitemap.
REP is a representation of the files and directories in the
project.  Use such functions as `org-list-to-org' or
`org-list-to-subtree' to transform it."
  (format "#+TITLE: %s\n#+AUTHOR: JSDurand\n%s#+DATE: <%s>\n\n%s\n\n\
#+ATTR_HTML: :border nil :rules nil :frame nil\n\
%s\n\n\
[[https://jsdurand.xyz/%s][Web feed]]"
          title
          "#+HTML_LINK_UP: index.html"
          (format-time-string "%F %a %R")
          (cadr (assoc title durand-sitemap-custom-string-alist #'string=))
          ;; generate a table
          (org-list-to-generic
           rep '(:ustart "|---|" :uend "|---|" :isep "|---|"))
          (caddr (assoc title durand-sitemap-custom-string-alist #'string=))))

;;; Custom sitemap format

;; NOTE: I use a table to style the entries.

;; NOTE: This stores the date in an ugly long format.  But worry not:
;; it will be replaced by a clean form in the post-processing phase.
;; The long form is inserted here so that we can sort the entries
;; precisely.

(defun durand-org-publish-sitemap-format (entry _style project)
  "Format the entry for the sitemap as a table with a date.
ENTRY is the entry file name to format.
STYLE is either 'list or 'tree, which is ignored by us.
PROJECT is the current project."
  (format "| %s | [[file:%s][%s]] |"
          (format-time-string
           "%FT%T%z"
           (org-publish-find-date entry project)
           (current-time-zone))
          entry
          (org-publish-find-title entry project)))
Post-processing
The above can be considered built-in features of org-publish, with some simple customizations. But there are some additional features that are not easy to customize directly:
- Latest published articles
- Atom feeds
Latest published articles
When entering the website, the reader should see a list of the latest published articles, so that updates anywhere on the website can be spotted quickly.
Before I start, I want to mention that since I am using Emacs 28, which is an unstable version, some things occasionally do not work as expected. For example, the function plist-get does not work for me. So I use my own function for that:
(defun durand-org-publish-plist-get (prop plist)
  (let (res)
    (while (consp plist)
      (cond
       ((eq (car plist) prop)
        (setq res (cadr plist))
        (setq plist nil))
       ((setq plist (cddr plist)))))
    res))
The list of latest articles is implemented as follows.
(defun durand-org-publish-convert-time (spec)
  "Convert SPEC to a valid time value.
SPEC should be the result of `parse-time-string'.  It is assumed
that the year, the month, and the day components are present."
  (let ((sec (car spec))
        (minute (cadr spec))
        (hour (caddr spec)))
    (encode-time
     (append
      (list (or sec 0) (or minute 0) (or hour 0))
      (cdddr spec)))))

(defun durand-take (n ls)
  "Return the first N items of LS.
If the length of LS is less than N, then return the whole LS."
  (cond
   ((< (length ls) n) ls)
   ((let ((i 0) result)
      (while (< i n)
        (cond
         (ls
          (setq result (cons (car ls) result))
          (setq ls (cdr ls))
          (setq i (1+ i)))
         ((setq i n))))
      (reverse result)))))

(defvar durand-org-index-entries-max-num 10
  "The maximal number of entries to show on the index page.")

(defun durand-org-post-process (project)
  "Generate a proper index page and Atom feeds.
Also shorten the date strings in the sitemap files, and store the
completion information in an attribute.

The feeds are generated by the function
`durand-org-generate-atom-feed'."
  (let* ((project-plist
          (cdr (assoc project org-publish-project-alist #'string=)))
         (components
          (durand-org-publish-plist-get :components project-plist))
         (sitemap-file
          (durand-org-publish-plist-get :sitemap-filename project-plist))
         ;; I cheat here
         (publishing-dir (expand-file-name "~/org/public/"))
         (publish-sitemap-file
          (cond
           (sitemap-file
            (replace-regexp-in-string
             "org$" "html"
             (expand-file-name sitemap-file publishing-dir)))))
         (index-file (expand-file-name "index.html" publishing-dir))
         contents)
    (cond
     ;; depth-first recursion
     (components
      (setq contents
            (apply #'append
                   (delete
                    nil
                    (mapcar #'durand-org-post-process components))))
      ;; We only want some items
      (setq contents
            (durand-take
             durand-org-index-entries-max-num
             (sort contents
                   (lambda (x y) (time-less-p (car y) (car x))))))
      ;; If contents is non-nil, we need to process the info in the
      ;; index page.
      (cond
       (contents
        (with-temp-buffer
          (insert-file-contents index-file)
          (goto-char (point-min))
          (search-forward "Latest updates")
          (goto-char (line-end-position))
          (while (search-forward "<table" nil t)
            ;; delete existing tables
            (delete-region
             (match-beginning 0)
             (progn (search-forward "</table>") (match-end 0))))
          (insert "\n")
          (insert "<table cellspacing=\"0\" cellpadding=\"6\">
<colgroup>
<col class=\"org-right\" />
<col class=\"org-left\" />
</colgroup>\n\n<tbody>\n")
          (mapc
           (lambda (entry)
             (insert "<tr>")
             (insert "<td class=\"org-right\">")
             (insert (format-time-string "%F" (car entry)))
             (insert "</td>\n")
             (insert "<td class=\"org-left\">")
             (insert (cdr entry))
             (insert "</td>\n</tr>\n"))
           contents)
          (insert "</tbody>\n</table>")
          (write-region nil nil index-file)))))
     ((and publish-sitemap-file
           (stringp publish-sitemap-file)
           (file-exists-p publish-sitemap-file))
      (with-temp-buffer
        (insert-file-contents publish-sitemap-file)
        (goto-char (point-min))
        (search-forward "</colgroup>" nil t)
        (let ((regexp (rx-to-string
                       '(seq "<t" (any "hd") " "
                             (one-or-more (not (or ">")))
                             ">")
                       t))
              pos pos-2 temp temp-time)
          (while (re-search-forward regexp nil t)
            (setq pos (point))
            ;; replace the first org-left by org-right
            (save-excursion
              (goto-char (match-beginning 0))
              (setq pos-2 (match-end 0))
              (save-match-data
                (cond
                 ((re-search-forward "left" pos-2 t)
                  (replace-match "right")
                  ;; fix pos
                  (setq pos (1+ pos))))))
            (search-forward "</t")
            (setq pos-2 (match-beginning 0))
            (forward-line 1)
            (cond
             ;; the long date is already stored in a comment
             ((re-search-forward
               (rx-to-string
                '(seq (= 4 (any digit)) "-"
                      (= 2 (any digit)) "-"
                      (= 2 (any digit)) "T")
                t)
               (line-end-position) t)
              (setq temp-time
                    (durand-org-publish-convert-time
                     (parse-time-string
                      (buffer-substring-no-properties
                       (match-beginning 0) (line-end-position))))))
             ;; otherwise shorten the date and store the long form
             (t
              (setq temp-time
                    (durand-org-publish-convert-time
                     (parse-time-string
                      (buffer-substring-no-properties pos pos-2))))
              (save-excursion
                (goto-char pos)
                (delete-region pos pos-2)
                (insert (format-time-string "%F" temp-time)))
              (insert "<!--"
                      (format-time-string
                       "%FT%T%z\n" temp-time (current-time-zone))
                      "-->\n")
              (write-region nil nil publish-sitemap-file)))
            (setq temp
                  (cons
                   (cons temp-time
                         (progn
                           (re-search-forward regexp)
                           (setq pos (point))
                           (search-forward "</t")
                           (buffer-substring-no-properties
                            pos (match-beginning 0))))
                   temp)))
          (durand-org-generate-atom-feed
           project
           (expand-file-name (concat project "-atom.xml") publishing-dir)
           (mapcar
            (lambda (cell)
              (let* ((orig-string (cdr cell))
                     (temp 0)
                     (title
                      (progn
                        (string-match ">" orig-string)
                        (setq temp (match-end 0))
                        (substring
                         orig-string temp
                         (progn
                           (string-match "</a>" orig-string temp)
                           (match-beginning 0)))))
                     (file-name
                      (progn
                        (string-match "href=\"" orig-string)
                        (setq temp (match-end 0))
                        (substring
                         orig-string temp
                         (progn
                           (string-match "\">" orig-string temp)
                           (match-beginning 0)))))
                     (content
                      (with-temp-buffer
                        (insert-file-contents
                         (expand-file-name file-name publishing-dir))
                        (goto-char (point-min))
                        (search-forward "<div id=\"content\">" nil t)
                        (search-forward "<p>" nil t)
                        (buffer-substring-no-properties
                         (1+ (point))
                         (progn
                           (search-forward "</p>" nil t)
                           (match-beginning 0)))))
                     ;; escape html
                     (content
                      (replace-regexp-in-string
                       ">" "&gt;"
                       (replace-regexp-in-string
                        "<" "&lt;"
                        (replace-regexp-in-string
                         "&" "&amp;"
                         content)))))
                (list
                 title
                 (durand-org-atom-format-time
                  (file-attribute-modification-time
                   (file-attributes
                    (expand-file-name file-name publishing-dir))))
                 (concat "https://jsdurand.xyz/" file-name)
                 (concat "https://jsdurand.xyz/" file-name)
                 (durand-org-atom-format-time (car cell))
                 content)))
            (reverse temp)))
          (reverse temp)))))))
It is quite a long function. Basically, it loops through the components of the project and, for each component, finds the corresponding sitemap file and collects the entries from it. It then sorts all the entries by date and puts the 10 most recent ones in the index file.
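The core of the "sort and keep the newest" step can be shown in isolation. This is a simplified sketch rather than the code above: the helper name my/latest-entries is hypothetical, it uses seq-take from the built-in seq library, and the entries are (TIME . STRING) pairs like those collected from the sitemap files.

```elisp
(require 'seq)

;; Hypothetical helper; entries are (TIME . HTML-STRING) pairs.
(defun my/latest-entries (entries n)
  "Return the N most recent ENTRIES, newest first."
  (seq-take
   ;; `sort' is destructive, so work on a copy of the list.
   (sort (copy-sequence entries)
         (lambda (x y) (time-less-p (car y) (car x))))
   n))

;; Usage with three entries dated in August 2021 (UTC).
(my/latest-entries
 (list (cons (encode-time '(0 0 0 1 8 2021 nil nil t)) "older")
       (cons (encode-time '(0 0 0 29 8 2021 nil nil t)) "newest")
       (cons (encode-time '(0 0 0 15 8 2021 nil nil t)) "middle"))
 2)
```

The call at the end keeps the entries labelled "newest" and "middle", in that order.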
In addition, it generates the Atom feeds for each component as well, which is implemented in the next section.
Note that this does not parse the HTML files properly, so it only works with my overall settings, and might fail for arbitrary HTML files.
I have thought about parsing the HTML files through the built-in function libxml-parse-html-region. But I think that is unnecessary generality, so I chose the simpler and dirtier way above.
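For comparison, here is roughly what the parsing approach could look like. This is a sketch of the alternative that was not chosen, under the assumption that Emacs was built with libxml2 support; the helper name my/html-links is hypothetical, and the dom functions come from the built-in dom library.

```elisp
(require 'dom)

;; Hypothetical helper, not part of the setup described in this article.
(defun my/html-links (html)
  "Return a list of (HREF . TEXT) pairs for the links in the string HTML."
  (with-temp-buffer
    (insert html)
    (let ((dom (libxml-parse-html-region (point-min) (point-max))))
      (mapcar (lambda (a) (cons (dom-attr a 'href) (dom-text a)))
              (dom-by-tag dom 'a)))))

;; Usage, guarded because libxml2 support is optional in Emacs builds.
(when (and (fboundp 'libxml-available-p) (libxml-available-p))
  (my/html-links
   "<table><tr><td><a href=\"a.html\">A post</a></td></tr></table>"))
```

Extracting the link targets and titles this way would not depend on the exact layout of the generated table rows, at the price of requiring libxml2.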
Atom feeds
Generating feeds
For the website to be considered "fully functioning" (by me), I require it to provide convenient web feeds, so that readers can track the publications without consulting the website every 5 minutes like crazy.
So I downloaded RFC 4287, the specification of the Atom format, and wrote some functions to generate feeds for my website.
(defvar durand-org-atom-titles-alist nil
  "An association list of Atom feed titles with the project name.")

(setq durand-org-atom-titles-alist
      (list
       (list "code"
             "JSDurand's codes"
             "https://jsdurand.xyz/code-atom.xml")
       (list "math"
             "Math articles of JSDurand"
             "https://jsdurand.xyz/math-atom.xml")
       (list "life"
             "JSDurand's life"
             "https://jsdurand.xyz/life-atom.xml")))

(defvar durand-org-atom-preamble nil
  "The preamble of an Atom feed.")

(setq durand-org-atom-preamble
      "<?xml version=\"1.0\" encoding=\"utf-8\" ?>
<feed xmlns=\"http://www.w3.org/2005/Atom\">
<id>https://jsdurand.xyz/atom.xml</id>
<title>%s</title>
<author>
<name>JSDurand</name>
<uri> https://jsdurand.xyz </uri>
<email>durand@jsdurand.xyz</email>
</author>
<link rel=\"self\" type=\"application/atom+xml\" href=\"%s\"/>
<rights>Copyright (c) 2021, JSDurand</rights>
<updated>%s</updated>
<generator uri=\"https://www.gnu.org/software/emacs/\" version=\"28.0.50\">\
Emacs</generator>\n")

(defvar durand-org-atom-entry-template nil
  "The template for an Atom entry.
It HAS to be formatted with 6 arguments in the following order:
TITLE: the title.  Note this does not have a subtitle.
UPDATED-TIME: the newest updated time.
ID: I think I will use the URL as the ID directly.
LINK: Link to this entry.
PUBLISHED-TIME: the time this is published.
CONTENT: a short content.")

(setq durand-org-atom-entry-template
      "<entry>
<title type=\"text\">%s</title>
<updated>%s</updated>
<id>%s</id>
<link rel=\"self\" type=\"text/html\" href=\"%s\"/>
<published>%s</published>
<content type=\"html\">
%s
</content>
</entry>")

(defvar durand-org-atom-postamble nil
  "The post-amble of an Atom feed.")

(setq durand-org-atom-postamble "</feed>")

(defun durand-org-atom-format-time (time)
  "Format TIME in an acceptable way."
  (let ((offset (car (current-time-zone))))
    (concat
     (format-time-string "%FT%T" time)
     ;; The offset is in seconds: print the sign separately, then the
     ;; hours and the remaining minutes of its absolute value.
     (format "%s%02d:%02d"
             (cond ((>= offset 0) "+") ("-"))
             (/ (abs offset) 3600)
             (/ (% (abs offset) 3600) 60)))))

(defun durand-org-generate-atom-feed (project file-name entries)
  "Generate an Atom feed for ENTRIES and save in FILE-NAME.
PROJECT is the name of the subproject.
ENTRIES is a list of entries.  An entry is a list of the form
\(TITLE UPTIME ID LINK PUBTIME CONTENT).
See the documentation string for `durand-org-atom-entry-template'
for more."
  (with-temp-buffer
    (insert
     (apply #'format durand-org-atom-preamble
            (append
             (cdr (assoc project durand-org-atom-titles-alist #'string=))
             (list (durand-org-atom-format-time nil))))
     "\n")
    (mapc
     (lambda (entry)
       (insert (apply #'format durand-org-atom-entry-template entry) "\n"))
     entries)
    (insert durand-org-atom-postamble)
    (write-region nil nil file-name)))
Basically, this just uses a template and fills the template with suitable data, collected from the sitemap files in the post-processing stage.
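The template-filling idea can be illustrated on a much smaller scale. The following sketch is not the code above: all three names starting with my/ are hypothetical, the template has only two slots (a real Atom entry needs more elements, such as an id), and the time stamp relies on the %:z directive of format-time-string, which prints the UTC offset with a colon.

```elisp
;; Hypothetical helpers illustrating the template-filling idea.
(defun my/atom-time (time &optional zone)
  "Format TIME as a date-time string of the kind Atom expects.
ZONE is passed to `format-time-string'; nil means local time."
  ;; "%:z" prints the UTC offset with a colon, e.g. +08:00.
  (format-time-string "%FT%T%:z" time zone))

(defun my/xml-escape (text)
  "Escape the characters &, < and > in TEXT."
  (replace-regexp-in-string
   ">" "&gt;"
   (replace-regexp-in-string
    "<" "&lt;"
    ;; Escape & first, so that the entities added afterwards are not
    ;; escaped a second time.
    (replace-regexp-in-string "&" "&amp;" text))))

(defconst my/atom-entry-template
  "<entry><title>%s</title><updated>%s</updated></entry>"
  "A deliberately tiny entry template with two slots.")

;; Usage: fill the template with an escaped title and the epoch in UTC.
(format my/atom-entry-template
        (my/xml-escape "Emacs & Org <3")
        (my/atom-time '(0 0) t))
```

Escaping matters because titles and contents are inserted verbatim into the XML, so a stray & or < would make the feed invalid.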
Validating feeds
Since the above-mentioned RFC 4287 provides a machine-readable specification of the formal grammar of a valid Atom feed, we can use it to verify our feeds programmatically. See this website for details. But basically we convert Appendix B of RFC 4287 from the RNC format to the RNG format, and then use xmllint to validate the feeds.
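The xmllint step could also be driven from Emacs. The sketch below is an assumption about how one might do it, not part of the actual setup: the function name my/validate-atom-feed is hypothetical, and the RNG file is expected to exist already (its conversion from the RNC grammar is not shown). It relies on the --noout and --relaxng options of xmllint.

```elisp
;; Hypothetical helper.  RNG-FILE is the RELAX NG schema obtained from
;; Appendix B of RFC 4287; producing it is not shown here.
(defun my/validate-atom-feed (feed-file rng-file)
  "Validate FEED-FILE against RNG-FILE with xmllint.
Return non-nil if xmllint exits successfully.  Signal an error if
xmllint is not installed."
  (unless (executable-find "xmllint")
    (error "xmllint not found"))
  (with-temp-buffer
    (let ((status (call-process "xmllint" nil t nil
                                "--noout" "--relaxng"
                                (expand-file-name rng-file)
                                (expand-file-name feed-file))))
      ;; On failure, show what xmllint reported.
      (unless (eq status 0)
        (message "%s" (buffer-string)))
      (eq status 0))))
```

A call such as (my/validate-atom-feed "~/org/public/code-atom.xml" "~/atom.rng") would then report whether the feed validates; both paths here are only examples.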
Adding date/time info to each page
By default, org-html-publish-to-html puts a little block of date/time information at the bottom of each page, including some meta-information. But it is inconvenient to require the reader to scroll to the end of the page to find the publication date of the article, so I added this information directly below the title.
At first I tried to alter the internal mechanism of org-html-publish-to-html to add this information. But it turned out that this mechanism is too entangled to be changed easily. So I advised the function to achieve this effect.
(advice-add #'org-html-publish-to-html :filter-return #'durand-org-publish-html-advice)
This :filter-return advice receives as input the return value of the advised function, which for org-html-publish-to-html is the name of the published HTML file.
(defun durand-org-publish-html-advice (html-file-name)
  "Advise `org-html-publish-to-html' to add a date/time info below the \
title."
  (with-temp-buffer
    (insert-file-contents html-file-name)
    (goto-char (point-min))
    (cond
     ((search-forward "h1 class=\"title\"" nil t)
      ;; if it does not have a title, we do not add a time info.
      (search-forward "</h1>" nil)
      ;; if it already has a time info, don't generate again
      (cond
       ((save-excursion
          (forward-char 1)
          (looking-at-p "<p class=\"subtitle\"")))
       (t
        (let* ((date
                (let (temp)
                  (save-excursion
                    (goto-char (point-max))
                    (cond
                     ((search-backward
                       "<p class=\"date\">Date: " nil t)
                      (setq temp (match-end 0))
                      (buffer-substring-no-properties
                       temp
                       (progn
                         (search-forward "</p>")
                         (match-beginning 0))))))))
               (date-time
                (and date
                     (encode-time
                      (append
                       (durand-take 8 (parse-time-string date))
                       (list 28800))))))
          (cond
           (date
            (insert
             (format "\n<p class=\"subtitle\">%s</p>\n"
                     (format-time-string "%F" date-time)))
            (write-region nil nil html-file-name))))))))
  ;; Return the file name, so that callers of the advised function
  ;; still receive its original return value.
  html-file-name)
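The :filter-return mechanism used above can be seen in isolation with a toy function. Everything in this sketch is hypothetical and only meant to show the data flow: my/toy-publish stands in for org-html-publish-to-html, and the advice merely records the file names it receives.

```elisp
;; A toy function standing in for `org-html-publish-to-html'.
(defun my/toy-publish (name)
  "Pretend to publish NAME and return the output file name."
  (concat name ".html"))

(defvar my/toy-published nil
  "Output file names seen by the advice, newest first.")

(defun my/toy-publish-advice (file-name)
  "Record FILE-NAME, then return it unchanged."
  (push file-name my/toy-published)
  ;; Whatever a :filter-return advice returns becomes the return
  ;; value of the advised function.
  file-name)

(advice-add 'my/toy-publish :filter-return #'my/toy-publish-advice)

;; Usage: the advice runs after the function and sees "index.html".
(my/toy-publish "index")
```

The advice can be removed again with (advice-remove 'my/toy-publish #'my/toy-publish-advice).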