Switching to Nikola for blogging

Why?

For me, the key to blogging is keeping it simple. At first that meant a static site that rendered nicely, and Wintersmith filled that role.

The Problems

The problem with Wintersmith was that it was a pure Markdown solution. It turned out that most of what I wanted to talk about had some kind of interactive or graphical component, or I just wanted to be able to add photos and images easily.

None of this is easy with pure Markdown.

The Solution

The new solution for me is Nikola, which seems to hit a sweet spot between codeability and low friction. Specifically, it supports rendering posts from JupyterLab notebooks, a tool that has only grown more useful to me.

Perceived Benefits

Making the transition from "Jupyter notebook" to "blog post" as frictionless as possible feels important. When you want to share neat stuff about code, it helps to do it right next to the neat code you just wrote.

When you want to share data analysis - well, that's a Jupyter specialty. And when you want to show off something you're doing in your workshop, then at the very least pulling in some images with Jupyter is a bit more practical, and probably a lot more likely to be low enough effort that I'll do it - this last part is the bit that's definitely up in the air here.

Workarounds

Keeping media and posts together

The first problem I ran into with Nikola is that it and I disagree on how to structure posts.

Namely: out of the box, Nikola has the notion of separate /images, /files, and /posts (or /pages) directories for determining content.

I don't like this, and it has a practical downside. When I'm working on a Jupyter notebook for a post, what if it needs data, or I'm using a tool that drags and drops images into place for me? What I'd like is to open that notebook, and just that notebook, in Jupyter or VSCode and work on just that file, including how I reference data.

Although this issue suggests a workaround, it has a number of drawbacks. More importantly, one of the reasons we use Nikola is that it's written in Python, and better still, its main configuration file conf.py is itself runnable Python.

This gives us a much better solution: we can just generate our post-finding logic at config time.

To do this we need to find this stanza:

POSTS = (
    ("posts/*.ipynb", "posts", "post.tmpl"),
    ("posts/*.rst", "posts", "post.tmpl"),
    ("posts/*.md", "posts", "post.tmpl"),
    ("posts/*.txt", "posts", "post.tmpl"),
    ("posts/*.html", "posts", "post.tmpl"),
)

This is the standard stanza that configures how posts are located. Importantly, the wildcard here isn't a regular glob: all these paths act as recursive searches, with the directory names winding up in the output paths (i.e. posts/some path with spaces/mypost.ipynb winds up as https://yoursite/posts/some path with spaces/mypost-name-from-metadata).

So what do we want to have happen?

Ideally we want something like this to work:

my-post/
|- my-post.ipynb
|- images/some-image.jpg
|- files/data_for_the_folder.tsv

and then on the output it should end up in a sensible location.

We can do this by calculating all these paths in the config file when it is loaded, to work around the default behavior.

So for our POSTS element we use this dynamic bit of Python code:

import os

# Calculate POSTS so they must follow the convention <post-name>/<post-name>.<supported extension>
_post_exts = (
    "ipynb",
    "rst",
    "md",
    "txt",
    "html",
)
_posts = []
# posts/ lives next to this conf.py
_root_dir = os.path.join(os.path.dirname(__file__), "posts")
for e in os.listdir(_root_dir):
    fpath = os.path.join(_root_dir, e)
    # Only directories count as posts; stray files directly under posts/ are ignored
    if not os.path.isdir(fpath):
        continue
    # One matcher per supported extension, e.g. posts/my-post/my-post.ipynb
    _postmatchers = [".".join((os.path.join("posts", e, e), ext)) for ext in _post_exts]
    _posts.extend([(p, "posts", "post.tmpl") for p in _postmatchers])

POSTS = tuple(_posts)

PAGES = (
    ("pages/*.ipynb", "pages", "page.tmpl"),
    ("pages/*.rst", "pages", "page.tmpl"),
    ("pages/*.md", "pages", "page.tmpl"),
    ("pages/*.txt", "pages", "page.tmpl"),
    ("pages/*.html", "pages", "page.tmpl"),
)

Testing this - it works. It means we can keep our actual post bodies, and any supporting files, nicely organized.
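One way to sanity-check the generated entries without running a full build is to wrap the same logic in a function and point it at a throwaway directory. This is only a sketch: build_post_entries is a hypothetical helper name, not part of Nikola, and it sorts the directory listing purely to make the output predictable.

```python
import os
import tempfile

POST_EXTS = ("ipynb", "rst", "md", "txt", "html")


def build_post_entries(site_dir, exts=POST_EXTS):
    """Return POSTS-style tuples for every posts/<name>/<name>.<ext>.

    Hypothetical helper: mirrors the loop in conf.py, but takes the site
    directory as an argument so it can be exercised outside of Nikola.
    """
    root = os.path.join(site_dir, "posts")
    entries = []
    for name in sorted(os.listdir(root)):
        # Only directories are treated as posts; loose files are skipped.
        if not os.path.isdir(os.path.join(root, name)):
            continue
        for ext in exts:
            pattern = os.path.join("posts", name, name + "." + ext)
            entries.append((pattern, "posts", "post.tmpl"))
    return tuple(entries)


# Demo against a temporary site: one post folder plus one stray file.
with tempfile.TemporaryDirectory() as site:
    os.makedirs(os.path.join(site, "posts", "my-post"))
    open(os.path.join(site, "posts", "stray.md"), "w").close()
    demo_entries = build_post_entries(site)
    for entry in demo_entries:
        print(entry)
```

The stray file directly under posts/ produces no entry, which is the convention the config enforces: a post only exists if it lives in a folder of the same name.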

Now there are two additional problems: images and files. We'd like to handle images specially because Nikola will do automatic thumbnailing and resizing for us in our posts, so they're handled lossily, whereas files are not touched at all in the final output.

The solution I settled on is just to move these paths to images and files directories adjacent to each post, respectively. This means that if the Jupyter notebook I'm using references data, it's reasonably well behaved.

For files we use this config stanza:

FILES_FOLDERS = {'files': 'files'}

for e in os.listdir(_root_dir):
    fpath = os.path.join(_root_dir,e)
    if not os.path.isdir(fpath):
        continue
    FILES_FOLDERS[os.path.join(fpath,"files")] = os.path.join("posts",e,"files")

and for images we use this:

IMAGE_FOLDERS = {
    "images": "images",
}

for e in os.listdir(_root_dir):
    fpath = os.path.join(_root_dir,e)
    if not os.path.isdir(fpath):
        continue
    IMAGE_FOLDERS[os.path.join(fpath,"images")] = os.path.join("posts",e,"images")
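Since the two loops above differ only in the subdirectory name, they could be folded into one function. The sketch below is an assumption-laden variant rather than what the config actually does: per_post_folders is a hypothetical helper, and unlike the loops above it only registers folders that exist on disk, which is a conservative choice and not something Nikola is known to require.

```python
import os
import tempfile


def per_post_folders(site_dir, subdir):
    """Map posts/<name>/<subdir> source folders to matching output paths.

    Hypothetical helper: returns a dict in the same shape as FILES_FOLDERS
    and IMAGE_FOLDERS, skipping posts that have no such subdirectory.
    """
    root = os.path.join(site_dir, "posts")
    mapping = {}
    for name in sorted(os.listdir(root)):
        source = os.path.join(root, name, subdir)
        if os.path.isdir(source):
            mapping[source] = os.path.join("posts", name, subdir)
    return mapping


# Demo: one post with an images/ folder, one without.
with tempfile.TemporaryDirectory() as site:
    os.makedirs(os.path.join(site, "posts", "with-images", "images"))
    os.makedirs(os.path.join(site, "posts", "text-only"))
    demo_images = {"images": "images"}
    demo_images.update(per_post_folders(site, "images"))
    demo_values = sorted(demo_images.values())
    print(demo_values)
```

In conf.py the same call would be made once with "files" and once with "images", passing the directory that contains conf.py as site_dir.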

Setting up the publish workflow

Nikola comes out of the box with a publishing workflow for GitHub Pages, which is where I host this blog.

Since I now run decentralized, with my git repos stored in Syncthing, I wanted to push only the content of this blog and keep the regular repo on my local systems, which makes for an easier drafting experience.

I configure the GitHub publish workflow in conf.py like so:

GITHUB_SOURCE_BRANCH = "src"
GITHUB_DEPLOY_BRANCH = "master"

# The name of the remote where you wish to push to, using github_deploy.
GITHUB_REMOTE_NAME = "publish"

# Whether or not github_deploy should commit to the source branch automatically
# before deploying.
GITHUB_COMMIT_SOURCE = False

and then add my GitHub repo as a remote named publish:

git remote add publish https://github.com/wrouesnel/wrouesnel.github.io.git

and then synchronize my old blog so Nikola can take it over:

git fetch publish
git checkout master
# Remove every tracked file and stage the deletions; .git itself is left alone
git rm -r -q .
git commit -m "Transition to Nikola"
git checkout main

and then finally just do the deploy:

nikola github_deploy
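Because the deploy depends on a remote with exactly the name set in GITHUB_REMOTE_NAME, a small pre-flight check can catch a missing remote before anything is pushed. This is a sketch, not part of Nikola: parse_remotes and has_remote are hypothetical helpers, and the only external behavior relied on is that a bare git remote prints one remote name per line.

```python
import subprocess


def parse_remotes(git_remote_output):
    """Split the output of `git remote` into a list of remote names."""
    return [line.strip() for line in git_remote_output.splitlines() if line.strip()]


def has_remote(name, repo_dir="."):
    """Return True if the git repo at repo_dir has a remote called name.

    Hypothetical helper: requires git on PATH and repo_dir to be a git repo.
    """
    result = subprocess.run(
        ["git", "remote"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return name in parse_remotes(result.stdout)


# Demo of the parsing step only, so it runs without a repository.
demo_remotes = parse_remotes("origin\npublish\n")
print("publish" in demo_remotes)
```

Inside the blog repo, has_remote("publish") would be the actual check to run before nikola github_deploy.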

Next Steps

This isn't perfect, but it's a static site and it looks okay and that's good enough.

I've got a few things I want to fix:

  • presentation of Jupyter notebooks - I'd like them to look as seamless as posts written in Markdown
  • a tag line under the blog title - the old site had it, the new one should have it.
  • using nikola new_post with this system probably doesn't put the new file anywhere sensible - it would be nice if it did
  • figure out how I want to use galleries
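For the new_post item, one possible stopgap is a small script that lays out the per-post folder by hand. This is only a sketch under the folder convention described earlier: scaffold_post is a hypothetical helper, not a Nikola command, and it neither creates the post file nor writes any post metadata.

```python
import os
import tempfile


def scaffold_post(site_dir, slug, ext="ipynb"):
    """Create posts/<slug>/ with images/ and files/ subfolders.

    Hypothetical helper: returns the path where the post itself should be
    created so it matches the <post-name>/<post-name>.<ext> convention.
    """
    post_dir = os.path.join(site_dir, "posts", slug)
    for sub in ("images", "files"):
        os.makedirs(os.path.join(post_dir, sub), exist_ok=True)
    return os.path.join(post_dir, slug + "." + ext)


# Demo against a temporary site directory.
with tempfile.TemporaryDirectory() as site:
    demo_path = scaffold_post(site, "my-post")
    demo_name = os.path.basename(demo_path)
    demo_has_images = os.path.isdir(os.path.join(site, "posts", "my-post", "images"))
    demo_has_files = os.path.isdir(os.path.join(site, "posts", "my-post", "files"))
    print(demo_name)
```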