Old Blog, New Blog

2022-10-10

In a triumph of hope over experience, I’ve decided to try blogging again, this time connecting my blog directly with my public identity here on this domain. It was well beyond time to refresh its contents with something, anything other than work projects from the first years of my career. I’ve resurrected a few posts from my old blog that aren’t too embarrassing.

Instead of using one of the many fine blog systems that exist, I wanted to build my own, in Python. Well, almost. The blank page or the empty project can be a terrifying thing, and I wanted a boost to get going. I settled on makesite, because in addition to having the bare minimum of features, it had all the regexes I knew would be required and was even well-tested. I immediately discarded the project’s noble goals of requiring only make, POSIX compliance, and compatibility with both Python 2 and 3. For someone who likes modern Python features, those are challenging boundaries to work within.

Here’s my code. The readme summarizes my changes, but I’ve written up the current details below. (As is typical for a software project, this site will never be “done”, and I will twiddle with it endlessly.)

In modifying and thus breaking the code, I found the tests had a tendency to delete parts of the generated site, since they used helpers that would move the “production” generated site to a temp path and then move it back. Also, they all used the built-in Python unittest module, which helps avoid dependencies outside Python itself but is verbose and limited compared to pytest. I solved both problems by switching to pytest, using its tmp_path fixture, and optionally passing a working path to the main function. This meant introducing pathlib, which I find more intuitive.
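A minimal sketch of that testing pattern, not the actual code from this site: the main function here is a hypothetical stand-in that copies files instead of rendering them, and the content and _site directory names are assumptions made for illustration. The only pytest feature it relies on is the built-in tmp_path fixture, which hands each test a fresh temporary directory as a pathlib.Path.

```python
from pathlib import Path


def main(work_dir: Path = Path(".")) -> Path:
    """Hypothetical stand-in for a site generator's entry point.

    Reads from work_dir / "content" and writes to work_dir / "_site".
    Both directory names are assumptions for this sketch.
    """
    content_dir = work_dir / "content"
    site_dir = work_dir / "_site"
    site_dir.mkdir(parents=True, exist_ok=True)
    for source in sorted(content_dir.glob("*.html")):
        # A real generator would render templates here; this just copies.
        text = source.read_text(encoding="utf-8")
        (site_dir / source.name).write_text(text, encoding="utf-8")
    return site_dir


def test_main_writes_only_inside_tmp_path(tmp_path: Path) -> None:
    # pytest injects tmp_path: a per-test temporary directory, so the
    # test never touches the real generated site.
    content_dir = tmp_path / "content"
    content_dir.mkdir()
    (content_dir / "hello.html").write_text("<p>Hello</p>", encoding="utf-8")

    site_dir = main(tmp_path)

    assert site_dir == tmp_path / "_site"
    assert (site_dir / "hello.html").read_text(encoding="utf-8") == "<p>Hello</p>"
```

Because the working path is an optional argument, a normal build can still call main() with no arguments and run in the current directory, while every test gets its own isolated tree.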

Up to this point, I’d been managing dependencies using plain old pip and requirements.txt because I didn’t want to drag in something heavy like poetry or pipenv. Sure enough, despite having few dependencies, a transitive dependency update mysteriously broke a unit test. After trial and error, I think pip-tools makes a good minimum viable Python dependency manager. In the process, I ran into a hiccup when using pyproject.toml with setuptools.

I once knew CSS inside and out, back when CSS 2.1 was the latest version, but it’s been years since I did anything significant with it. CSS has changed for the better, but that means my positioning knowledge is obsolete, with float hacks replaced by Grid and Flexbox. In my last site, I did everything by hand, but this time I started with what makesite provided plus Simple.css. This was enough to get going. I spent too much time trying to get CSS Grid to make the header span 100% of the screen, yet align the content with the second grid column. There isn’t a way to do it with a single grid. CSS subgrid would solve it. Firefox has supported subgrid since late 2019, and Safari 16 now does, but Chrome does not. A team from Microsoft is implementing it in Chromium: feature tracker. It took me way too long to realize I could use the old margin: 0 auto trick on all the elements containing content to achieve this, which is what the makesite CSS already does.

I don’t even want to go into detail about how much time I spent tweaking images and favicons. I used Pixelmator Pro, but I probably would’ve been more productive activating my Photoshop CC license and using that. In one of my first jobs, I spent many hours resizing images for the web in Photoshop, eventually building macros to do it. I’ve retained the muscle memory for Photoshop keyboard shortcuts, not Pixelmator’s.

For post summaries on the post index page, makesite takes the first 50 words of plaintext from a post, with no formatting or markup. I wanted to get the content of the first <p> element from the post content and display it as a summary on the post index page. There’s html.parser built into Python, but it doesn’t do much by itself. I settled on lxml, but it may not have been the best choice. Despite the tutorial and umpteen how-to-style examples, the documentation was written from the point of view of the project developer, not a user. The lxml API reference is a great example of that. I added type hints with types-lxml, but even that is of limited help, because lxml is written in Cython (lxml’s developer is a core Cython developer), which has its own notion of C-based types. I finally got it working using an XPath query, but making something useful from HTMLParser might’ve taken the same amount of time or less.
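For comparison, here is a rough sketch of the standard-library route using html.parser.HTMLParser. This is not the lxml and XPath code the site uses; the FirstParagraphParser class and first_paragraph function are hypothetical names invented for this example. It assumes reasonably well-formed post HTML and keeps inline markup and character references inside the first <p> as they were written.

```python
from html.parser import HTMLParser


class FirstParagraphParser(HTMLParser):
    """Collects the inner markup of the first <p> element (hypothetical helper)."""

    def __init__(self) -> None:
        # convert_charrefs=False keeps entities such as &amp; intact,
        # so the collected summary is still valid HTML.
        super().__init__(convert_charrefs=False)
        self.parts: list[str] = []
        self.in_p = False   # True while inside the first <p>
        self.done = False   # True once the first <p> has ended

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "p":
            if self.in_p:
                # A new <p> implicitly closes the one we are collecting.
                self.in_p = False
                self.done = True
            else:
                self.in_p = True
        elif self.in_p:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through as written.
        if self.in_p:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self.in_p:
            return
        if tag == "p":
            self.in_p = False
            self.done = True
        else:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.in_p:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.in_p:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.in_p:
            self.parts.append(f"&#{name};")


def first_paragraph(html: str) -> str:
    """Return the inner HTML of the first <p>, or "" if there is none."""
    parser = FirstParagraphParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


post = "<h1>Title</h1><p>Hello <em>world</em> &amp; more.</p><p>Second</p>"
summary = first_paragraph(post)
```

The trade-off is roughly what the paragraph above suggests: no third-party dependency and no Cython-flavored types, in exchange for writing the small state machine by hand.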

Working on this blog has reminded me of an important distinction. Even a “simple” project that has specific requirements soon generates a stream of issues like the above, a crucial difference between work that has to interface with reality, and exams, exercises, or tutorials. These little problems are not difficult to solve by themselves, but together add up to plenty of work. A focused course of study is best for gaining deeper knowledge of a particular domain, but the day-to-day experience of working in that domain requires pulling in bits of code, information, and knowledge from many areas where your expertise is more limited.