The How And Why Of Gemini

If you read my re-introduction post (or looked at the main page), then you’ll have noticed that I mentioned making this site available via Gemini. This post is to explain why I did this, and also how I made it work with Hugo.

What Is Gemini

First, I guess I should describe what Gemini is. (Of course, if you’re viewing this via Gemini, you already know this, so feel free to skip ahead.)

Gemini is a new internet protocol/ecosystem that aims to occupy the space between Gopher and the Web.

Internet Protocols

Before I get to answering this question, let me briefly touch on the concept of protocols on the Internet.

Computers use a variety of methods to talk to each other on the Internet. They are (largely) designed for different purposes, and have their own strengths and weaknesses. Almost nothing happens without many protocols being involved. Ultimately, though, all of these protocols are ways to move bits of information across various distances, like digital postal services.

You could easily be forgiven for believing that “Internet” is synonymous with “The Web”, but the Web is only a subset of the Internet as a whole, powered by the Hypertext Transfer Protocol (HTTP). HTTP is how web browsers like Chrome, Firefox, and Safari retrieve web pages from other computers around the world. (Once upon a time, it was even called the World Wide Web, hence “www” in some addresses.)

Even functions that were traditionally accomplished without it, like e-mail and forums, have become subsumed by the One True Application Protocol. However, even today, HTTP is hardly the only one in the game.

Sending email is still largely handled by the Simple Mail Transfer Protocol, even if many people now only do so via Gmail’s web interface. Post Office Protocol and Internet Message Access Protocol are still in relatively wide use as means of retrieving email that have nothing at all to do with the Web. And these are just a few of the application protocols now in use.

There are and have been a great many protocols for moving information from one computer to another. HTTP is certainly the biggest now, but it wasn’t always. Which brings me to…

Gopher

Gopher is an internet protocol that arose in the early 90s, around the same time as HTTP.

The link above goes into the history of Gopher and why it ultimately lost to the Web, but I suspect the biggest reason was that the Web had significantly greater capabilities, even in its relatively early stages. (Although I’m sure the royalty-free nature of the Web helped its adoption.)

Gopher was designed for a text-based world, but as we got further into the 90s more people had access to computers with graphical interfaces. Another big difference is that HTML allowed for freeform links from any page to any other page (thus “the Web”), where Gopher could only link via dedicated menu pages.

I personally only barely remember Gopher. I was only 7 or 8 when it came out, so I likely encountered it as it was on the downswing a few years later. I remember the search engines Archie and Veronica, which had dedicated buttons on whatever internet service we had at the time. However, I don’t recall finding much of interest there. Also, internet access was about $3/hour, so you didn’t use it unless you really had reason to. (The only unmetered internet access I had was email/usenet via a local BBS.)

There are certainly many people to this day who continue to use and enjoy Gopher, and it holds a significant place in the “small internet”, a sort of movement that yearns for a simpler time when the internet felt much more personal.

Gemini

Gemini (and its accompanying file format, Gemtext) aims to be more capable than Gopher, but purposefully less capable than the Web with HTML.

Gemtext as a format is sort of an even more limited Markdown with Gopher-ish sensibilities, where the beginning of a line indicates its purpose, and links end up similar to links in gophermaps.

Compared to HTML it is startlingly simple. In comparison to Gopher, it’s like a combination of gophermap and full content. Notably, it does mean that inline links don’t exist. All links must be on their own line.
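
To make the line-oriented design concrete, here is a minimal Python sketch of how a client might label each line of a Gemtext document by looking only at how the line starts. The function names (classify_gemtext, parse_link) are made up for illustration, and this is a simplification rather than a complete parser.

```python
FENCE = "`" * 3  # a line starting with three backticks toggles preformatted mode


def classify_gemtext(text):
    """Label each line of a Gemtext document by its line type."""
    labelled = []
    preformatted = False
    for line in text.splitlines():
        if line.startswith(FENCE):
            preformatted = not preformatted
            kind = "toggle"
        elif preformatted:
            # Inside a preformatted block nothing else is interpreted.
            kind = "preformatted"
        elif line.startswith("=>"):
            kind = "link"
        elif line.startswith("#"):
            kind = "heading"
        elif line.startswith("* "):
            kind = "list"
        elif line.startswith(">"):
            kind = "quote"
        else:
            kind = "text"
        labelled.append((kind, line))
    return labelled


def parse_link(line):
    """Split a link line into (url, label); the label falls back to the URL."""
    parts = line[2:].strip().split(maxsplit=1)
    url = parts[0] if parts else ""
    label = parts[1] if len(parts) > 1 else url
    return url, label
```

Because the type of a line is decided entirely by its first few characters, there is no way to express a link in the middle of a sentence, which is exactly the limitation described above.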

Similarly, comparing the Gemini protocol itself to HTTP shows remarkable simplification. There are far fewer possible status codes (2 digits instead of 3), and the closest thing to a header in either direction is that successful responses can indicate the MIME type of the content.
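
As an illustration of how little there is to the response side, here is a small Python sketch that parses the single header line a Gemini server sends back: a two-digit status, a space, and a “meta” string (the MIME type on success). The function name and the wording of the category labels are my own shorthand, not taken from any particular client.

```python
# The first digit of the status selects the broad category of response.
CATEGORIES = {
    "1": "input",
    "2": "success",
    "3": "redirect",
    "4": "temporary failure",
    "5": "permanent failure",
    "6": "client certificate required",
}


def parse_response_header(header):
    """Split a header line like '20 text/gemini' into (status, category, meta)."""
    line = header.rstrip("\r\n")
    status, _, meta = line.partition(" ")
    valid = len(status) == 2 and status.isascii() and status.isdigit()
    if not valid or status[0] not in CATEGORIES:
        raise ValueError(f"malformed Gemini status: {status!r}")
    return int(status), CATEGORIES[status[0]], meta
```

For example, parse_response_header("20 text/gemini\r\n") gives (20, "success", "text/gemini"); for a success response, everything after that line is the document itself.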

Notably, there’s nothing like a cookie, and very little that a server could use to track a user, beyond their IP and what they requested. A server can request that a client identify itself via a TLS certificate which can sort of be used as an identity, which I suppose is about as close as you can get to real trackability.

Why Use Gemini

Often, once someone understands what Gemini is, their next question is “… why?”

As I mentioned above, Gemini is part of the small internet movement. While there are plenty of other avenues to explore, I feel that this one tries to strike a balance between the esoteric and the usable.

Gopher still exists and is used today. I personally find that it’s just a little bit too deep in the past, if only because the protocol has to be somewhat misused to fit modern data. On the other hand, it’s certainly possible to create websites that attempt to rein in some of what could be described as modern excess. Indeed, if you’re reading this on the web, you’re reading it on a site where I’ve attempted to do just that.

What’s Wrong With The Web?

I’ll preface this by saying that obviously there does exist a great deal of utility in the modern internet. It’s become deeply intertwined with every aspect of our lives. Honestly, most of the fundamentals are decent enough. It’s amazing that the whole thing functions as well as it does on a technical level.

Most of what I think is wrong tends to be at the interface of society and the internet. Not the technology per se but rather the ways in which our society has developed alongside it, in a direction that de-centers the user in favor of whatever will make the most Return On Investment in the short term.

The modern internet is extremely commercialized and has become significantly less personal-feeling. Back in the late 90s I had my own website, with several pages on different interests of mine. Because it was mine, I could organize and style it in any fashion I wanted. Others did the same, and communities would form around common interests.

There is a part of me that misses the smaller, slower online world as it was. Engaging in discussion with small communities, without the Engagement Engine driving everything to its most simplistic, attention-grabbing form.

Perhaps I’ve become the old man yelling at the cloud. Do I think there’s any real chance of getting everyone to hop into Geminispace and go back to the Internet of Yore? No. But that’s fine; it doesn’t need everyone on board for me to care. It’s enough that it exists and that some people use it.

How This Site Works With Gemini

To tell the truth, I probably wouldn’t have bothered doing this at all if it weren’t for the fact that my hosting provider, Sourcehut Pages, provides an easy way to host Gemini Capsules.

On the other hand, my static site generator, Hugo, definitely does not make it easy. I did briefly consider switching to another such generator but didn’t really see any better options.

I did do some searching and found a few other posts on getting Hugo to output Gemtext, the two most useful of which were Gemini and Hugo (Sylvain Durand) and Using Hugo To Launch A Gemini Capsule. Sadly, neither of these approaches worked quite as I wanted them to.

Configuring Hugo

As with Sylvain’s post, I had to tell Hugo about Gemini/gemtext as an output format in config.toml:

[mediaTypes.'text/gemini']
suffixes = ['gmi']

[outputFormats.GEMINI]
name = "gemini"
isPlainText = true
isHTML = false
mediaType = "text/gemini"
protocol = "gemini://"
permalinkable = true

[outputs]
home = ["HTML", "RSS", "GEMINI"]
taxonomy = ["HTML", "RSS", "GEMINI"]
term = ["HTML", "RSS", "GEMINI"]
page = ["HTML", "GEMINI"]
section = ["HTML", "GEMINI"]

Then I had to make copies of most of my HTML templates, but for Gemtext. Here’s an example (_default/single.gmi):

# {{ .Site.Title }}: {{ .Title }}

Published on {{ .Lastmod.Format (.Site.Params.dateFormat | default "2 January 2006") }}
{{- if .Params.tags }}

Tagged:
{{ range .Params.tags }}
=> {{ print (relURL "tags/") (urlize .) }} {{ . }}
{{- end -}}
{{- end }}

---

REPLACE {{ .File.Path }} {{ if .Params.md2gemini_args }}{{ .Params.md2gemini_args }}{{ end }}

{{- $related := .Site.RegularPages.Related . | first 5 -}}
{{- if $related }}

---

## Related articles
{{ range $related }}
=> {{ .RelPermalink }} {{ .Title }}{{ if .Params.Subtitle }}: {{ .Params.Subtitle }}{{ end }}
{{- end -}}
{{ end }}

---

=> / Back to the Index
=> {{ (.OutputFormats.Get "html").Permalink }} View this article on the web

The underlying Hugo-templating stuff is very similar, although you do need to know how to get links to other formats if you want to link back to the web. The other important item to note is that where web browsers will largely ignore extra whitespace in HTML, Gemini browsers and Gemtext do not. Care must be taken to use the whitespace-trimming dashes ({{- and -}}) if you want to avoid having too many blank lines appearing in your output.

The most important difference is, of course, that the usual variables for inserting the actual content (.Content or its analogs) are never used. This is because Hugo doesn’t know how to generate Gemtext from any of its supported source languages.

Generating Gemtext

Instead of directly including content in my Gemtext templates, I have a line starting with REPLACE, followed by the path of the source file, and optionally arguments to pass to md2gemini.

As part of my build pipeline, after running Hugo, I call a script on every .gmi file in the built site.

#!/usr/bin/env bash
set -euo pipefail

# Default to empty so that `set -u` does not abort before the check below
gem_file=${1:-}

if test -z "$gem_file"; then
    echo "Filename of gem file required"
    exit 1
fi

# Extract the name of the source file, if it exists; otherwise exit
src_file="$(sed -E '/REPLACE/!d;s/REPLACE (\S+).*/\1/' "$gem_file")"

if test -z "$src_file"; then
    echo "No REPLACE found in $gem_file"
    exit 0
fi

# Extract the optional argument list after the source file
IFS=$' ' read -r -a md2gemini_args < <(sed -E '/REPLACE/!d;s/REPLACE \S+ (.*)$/\1/' "$gem_file")

# Use default arguments if none were given in the file
if (( ${#md2gemini_args[@]} == 0 )); then
    md2gemini_args=("-sf" "--links=paragraph")
fi

# Concatenate together:
# - The lines before `REPLACE`
# - The result of running md2gemini on the source file
# - The lines following `REPLACE`
tmp_file=$(mktemp)
cat \
    <(sed '/REPLACE/,$d' < "$gem_file") \
    <(md2gemini "${md2gemini_args[@]}" "content/$src_file" | tr -d '\r') \
    <(sed '1,/REPLACE/d' < "$gem_file") \
    > "$tmp_file"
mv "$tmp_file" "$gem_file"

echo "Replaced in $gem_file with $src_file and args" "${md2gemini_args[@]}"

This script looks for that REPLACE line, and then uses it to insert the result of running md2gemini on the original source document, with default arguments that can be overridden by setting them after the source file path.
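
For readers who find sed hard to follow, the same splice can be sketched in Python. This is an illustrative alternative, not the script I actually run: splice and render are made-up names, render stands in for the call to md2gemini, and unlike the sed version it only recognises REPLACE at the very start of a line.

```python
import shlex

# Same fallback arguments as the Bash script above.
DEFAULT_ARGS = ["-sf", "--links=paragraph"]


def splice(template_text, render):
    """Swap the first REPLACE line for render(src_file, args).

    `render` is a caller-supplied function (hypothetical here) that takes the
    source path and an argument list and returns the converted Gemtext.
    """
    lines = template_text.split("\n")
    for index, line in enumerate(lines):
        if line.startswith("REPLACE "):
            # The first word is the source path; anything after it is
            # treated as converter arguments.
            src_file, *args = shlex.split(line[len("REPLACE "):])
            body = render(src_file, args or DEFAULT_ARGS)
            lines[index] = body.rstrip("\n")
            break
    return "\n".join(lines)
```

Passing the converter in as a function keeps the splicing logic separate from the external tool, which also makes it easy to try out on a sample template without md2gemini installed.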

Fixing URLs

One last little adjustment I make is to change URLs to use gemini versions where possible. After all, what’s the point of a whole new protocol if all the links just go back to the old one?

#!/usr/bin/env python3
import sys

if len(sys.argv) < 2:
    print("Filename of gem file required")
    sys.exit(1)

subs = {
    "gemini.circumlunar.space": "gemini.circumlunar.space",
    "srht.site": "srht.site",
    "en.wikipedia.org/wiki": "vault.transjovian.org/full/en",
}

with open(sys.argv[1], "r+") as gem_file:
    content = gem_file.read()

    for (http, gemini) in subs.items():
        content = content.replace(f"https://{http}", f"gemini://{gemini}")

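    # Rewrite the file in place, dropping leftover bytes if the new text is shorter
    gem_file.seek(0)
    gem_file.write(content)
    gem_file.truncate()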

This just runs a quick search and replace from a dict where the keys are the HTTP form and the values are the Gemini form. As time goes on, I can add more entries to the dict. I might also need to change things to handle more complicated cases, but for now this should work fine.
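
One of those more complicated cases might be leaving URLs in body text alone and only rewriting actual link lines. Here is a hedged sketch of that variation; rewrite_links is a hypothetical name, the table is trimmed to one entry for brevity, and it does not account for link-like lines inside preformatted blocks.

```python
# Keys are the HTTPS form, values are the Gemini form, as in the script above.
SUBS = {
    "en.wikipedia.org/wiki": "vault.transjovian.org/full/en",
}


def rewrite_links(text, subs=SUBS):
    """Apply the substitutions only to Gemtext link lines (those starting with =>)."""
    out = []
    for line in text.split("\n"):
        if line.startswith("=>"):
            for http, gemini in subs.items():
                line = line.replace(f"https://{http}", f"gemini://{gemini}")
        out.append(line)
    return "\n".join(out)
```

With this version, a URL quoted in a paragraph stays exactly as written, while the link a reader would actually follow points into Geminispace.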

Uploading

Once the conversion is complete, results are tarred up into web and gemini variants:

fd --base-directory public -tf -E '*.gmi' | tar -C public -czvf web.tar.gz --files-from=/dev/stdin
fd --base-directory public -tf -e gmi | tar -C public -czvf gmi.tar.gz --files-from=/dev/stdin

I can then use the hut utility to upload these to Sourcehut Pages, supplying the appropriate --protocol option for each, and thus the site is deployed for both Web and Gemini!
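
For anyone without fd installed, roughly the same split can be sketched with Python’s standard library. This is an illustrative alternative rather than what my pipeline runs: build_archives is a made-up name, and unlike fd it does not skip hidden or ignored files.

```python
import tarfile
from pathlib import Path


def build_archives(public_dir, out_dir):
    """Write web.tar.gz and gmi.tar.gz into out_dir, splitting on the .gmi suffix."""
    public = Path(public_dir)
    out = Path(out_dir)
    # Collect the file list up front so the archives never include themselves.
    files = sorted(p for p in public.rglob("*") if p.is_file())
    with tarfile.open(out / "web.tar.gz", "w:gz") as web, \
         tarfile.open(out / "gmi.tar.gz", "w:gz") as gmi:
        for path in files:
            target = gmi if path.suffix == ".gmi" else web
            # Store paths relative to the site root, like `tar -C public`.
            target.add(path, arcname=path.relative_to(public).as_posix())
```

Calling build_archives("public", ".") would produce the two archives in the current directory, ready to be uploaded separately.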

Summary

In the end, it wasn’t very hard to make this work. It just took an hour or two of tinkering.

It did help that a few other folks have documented their attempts, and I hope that this post will also be useful to anyone who goes looking for instructions in the future.

If you’d like to check it out for yourself, there are a number of Gemini clients available. If you’d rather just take a quick look, you can also just view this post via a Gemini proxy.