Introduction
When I set out to build this site, I had a pretty simple goal: write some posts, host them cheaply, and not spend the rest of my life fighting with a CMS. I looked at the usual suspects — WordPress, Ghost, Hugo, Jekyll — and they all felt like more than I needed. I wanted something I could actually understand end-to-end, so I did what any reasonable software engineer does and just built it myself.
The result is a lightweight Python static site generator. It reads HTML files with YAML front matter, runs them
through Jinja2 templates, and spits out a complete static site in a dist/ folder. From there,
publish.py syncs everything to S3 and optionally invalidates a CloudFront distribution.
The whole thing is about 450 lines of Python across two files.
I recently decided to open source the generator itself, which meant figuring out how to share the framework without sharing all of my posts and images. This post covers both how the generator works and how I structured the repos to keep the two things separate.
How the Generator Works
Post Format
Each post is a .html file in a posts/ directory. The file starts with a YAML front
matter block between --- delimiters, followed by the post body as plain HTML:
---
title: "My Post Title"
date: 2026-01-15
categories: [Python, How To]
type: Tutorial
description: A short summary shown in post listings.
author: Ryan Dockstader
draft: false
---
<h2>Introduction</h2>
<p>Post content goes here.</p>
Setting draft: true makes the build skip the post, which is handy for work-in-progress pieces.
The slug defaults to the filename but can be overridden with a slug field.
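To make the format concrete, here is a minimal sketch of how a post file like the one above could be parsed. It is not the actual build.py code: the function names split_front_matter and load_post are hypothetical, and the only library call it relies on is PyYAML's yaml.safe_load.

```python
from __future__ import annotations

from pathlib import Path


def split_front_matter(text: str) -> tuple[str, str]:
    """Split a post into (raw YAML front matter, HTML body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("post must start with a --- front matter delimiter")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    raise ValueError("front matter block is never closed")


def load_post(path: Path) -> dict | None:
    """Parse one post file; return None for drafts (hypothetical helper)."""
    import yaml  # PyYAML, imported lazily so split_front_matter needs no dependencies

    raw_meta, body = split_front_matter(path.read_text(encoding="utf-8"))
    meta = yaml.safe_load(raw_meta) or {}
    if meta.get("draft", False):
        return None  # drafts are skipped at build time
    meta.setdefault("slug", path.stem)  # slug defaults to the filename
    meta["body"] = body
    return meta
```

With safe_load, an unquoted date such as 2026-01-15 comes back as a datetime.date and the bracketed categories come back as a list of strings, so the metadata is ready for sorting and grouping without extra conversion.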
The Build Script
build.py does the heavy lifting. It scans posts/, parses the front matter with
PyYAML, and builds a few data structures: a flat list of posts sorted newest-first, a dict of posts grouped
by category, and a dict grouped by type. Those get passed to Jinja2 templates to render the index page,
individual post pages, category index pages, and type index pages.
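The three data structures described above can be sketched in a few lines. This is an illustration under assumptions, not the real build.py: index_posts is a hypothetical name, and each post is assumed to be a dict carrying the front matter fields shown earlier.

```python
from collections import defaultdict
from datetime import date


def index_posts(posts):
    """Return (posts newest-first, posts by category, posts by type)."""
    ordered = sorted(posts, key=lambda p: p["date"], reverse=True)
    by_category = defaultdict(list)
    by_type = defaultdict(list)
    for post in ordered:
        # A post can appear under several categories but has a single type.
        for category in post.get("categories", []):
            by_category[category].append(post)
        if post.get("type"):
            by_type[post["type"]].append(post)
    return ordered, dict(by_category), dict(by_type)


posts = [
    {"title": "Older", "date": date(2025, 12, 1), "categories": ["Python"], "type": "Tutorial"},
    {"title": "Newer", "date": date(2026, 1, 15), "categories": ["Python", "How To"], "type": "Tutorial"},
]
ordered, by_category, by_type = index_posts(posts)
print([p["title"] for p in ordered])  # ['Newer', 'Older']
print(sorted(by_category))            # ['How To', 'Python']
```

Because the grouping loop walks the already-sorted list, every category and type page inherits the newest-first order for free.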
Images live in images/post-slug/ and get copied straight to dist/, so you reference
them with absolute paths like /images/my-post/screenshot.png. Static assets (CSS, JS) work the
same way. Every build wipes dist/ and regenerates it from scratch, which keeps things
simple: there is no incremental build state to reason about.
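A wipe-and-rebuild step like this needs very little code. The sketch below assumes the asset directories are named images and static and sit next to dist/. The helper name reset_dist is hypothetical, and the real script may organize this differently.

```python
import shutil
from pathlib import Path


def reset_dist(root: Path, asset_dirs=("images", "static")) -> Path:
    """Delete dist/ and copy the asset directories into a fresh one."""
    dist = root / "dist"
    if dist.exists():
        shutil.rmtree(dist)  # throw away everything from the previous build
    dist.mkdir(parents=True)
    for name in asset_dirs:
        src = root / name
        if src.is_dir():
            # images/my-post/x.png ends up at dist/images/my-post/x.png,
            # which is what makes /images/my-post/x.png resolve on the site.
            shutil.copytree(src, dist / name)
    return dist
```

Starting from an empty directory also means a deleted post or image can never linger in the output.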
Publishing
publish.py syncs dist/ to an S3 bucket. It compares the MD5 hash of each local file
against the S3 ETag so unchanged files don't get re-uploaded. It also deletes from the bucket any files that have been
removed from dist/. If
you set CLOUDFRONT_DISTRIBUTION_ID in your .env, it triggers a /* cache
invalidation automatically after any changes. A typical publish run for this site takes about 5 seconds.
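The comparison logic can be sketched without touching AWS at all. In the example below, md5_hex and plan_sync are hypothetical names, and the remote dict stands in for whatever a bucket listing returns. One caveat worth knowing: an S3 ETag only equals the file's MD5 for objects uploaded in a single part without KMS encryption, which is typically the case for small static site files.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def md5_hex(path: Path) -> str:
    """MD5 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_sync(local: dict[str, str], remote: dict[str, str]):
    """Compare {key: md5} for dist/ against {key: ETag} from the bucket.

    Returns (keys to upload, keys to delete).
    """
    # S3 returns ETags wrapped in double quotes, so strip them first.
    remote_clean = {key: etag.strip('"') for key, etag in remote.items()}
    uploads = sorted(k for k, md5 in local.items() if remote_clean.get(k) != md5)
    deletes = sorted(k for k in remote_clean if k not in local)
    return uploads, deletes
```

If both lists come back empty, nothing changed and there is no reason to pay for a CloudFront invalidation.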
Link Shortener
There's a small link shortener baked in. You define named links in data/links.json:
[
  {
    "name": "MyBook",
    "type": "product",
    "url": "https://example.com/really-long-affiliate-url",
    "description": "A book I liked."
  }
]
The build generates a /links/go/ page that reads a ?link=MyBook query param and
redirects. It's nothing fancy, but it keeps affiliate links out of post content and makes them easy to update
in one place.
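One way a static page can do this is to embed a name-to-URL map and let a few lines of browser JavaScript handle the redirect. The sketch below is an assumption about how such a page could be generated, not the generator's actual template: render_go_page and PAGE_TEMPLATE are hypothetical names, and the real build presumably renders this through Jinja2.

```python
from __future__ import annotations

import json

# Hypothetical page: the embedded script looks up ?link=<name> and redirects.
PAGE_TEMPLATE = """<!doctype html>
<meta charset="utf-8">
<title>Redirecting...</title>
<script>
  const links = {links_json};
  const name = new URLSearchParams(window.location.search).get("link");
  if (Object.prototype.hasOwnProperty.call(links, name)) {{
    window.location.replace(links[name]);
  }} else {{
    document.title = "Unknown link";
  }}
</script>
"""


def render_go_page(links: list[dict]) -> str:
    """Build the HTML for /links/go/ from the entries in links.json."""
    lookup = {entry["name"]: entry["url"] for entry in links}
    # Escape "</" so a URL can never close the <script> tag early.
    links_json = json.dumps(lookup).replace("</", "<\\/")
    return PAGE_TEMPLATE.format(links_json=links_json)
```

Only name and url are needed for the redirect itself. The type and description fields stay available for anything else that wants to list the links.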
Keeping the Framework Public and Content Private
Once I decided to open source the generator, I had a straightforward problem: the same repo contains both the framework I want to share and the posts and images I'd rather not hand out. The solution I landed on was two separate repositories.
Two Repos
The public repo contains only the framework — build.py, publish.py,
serve.sh, the templates/ directory, static/, requirements.txt,
and a sample post that demonstrates the format. The site name is set to a generic placeholder so it's immediately
forkable. The .gitignore in that repo excludes posts/, images/, and
data/ by default, with a comment explaining how to opt in if you want to track your own content there.
The private repo is this one — it has all the actual posts, images, and site-specific config. Nothing changes about how I work day-to-day.
Syncing Framework Changes
The main downside of two repos is that framework improvements have to be manually pushed to the public repo. In practice this is fine since the framework changes infrequently. When it does, I just copy the relevant files over and push:
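# Copy the changed framework file into the public checkout
cp build.py ../blog-generator-public/
cd ../blog-generator-public
# Stage only that file, then commit and push
git add build.py
git commit -m "update build script"
git push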
I considered using a git submodule or subtree to automate this, but the added complexity wasn't worth it for something I touch a few times a year.
Try It Yourself
If you want to use it as a starting point for your own blog, the public repo is on GitHub at
rdockstader/python-blog-generator.
Clone it, drop some HTML posts in posts/, run python3 build.py, and you have a site.
The README walks through the full setup including the S3 publishing step.
The whole stack is about as minimal as it gets: Python, Jinja2, PyYAML, boto3, and an S3 bucket. No Node.js,
no build tools, no config files beyond a single .env. If you're comfortable with Python and HTML,
there's nothing here you can't understand in an afternoon.