Migrating my blog to Astro - RSS/Atom syndication
Astro has an RSS package, but my blog was previously using the Atom syndication format. I wanted to keep using Atom, so I needed to adapt the RSS package to generate Atom instead.
This is part of a series of posts on how I migrated my blog to Astro.
One common sign that someone has changed or rebuilt their blog engine is a flurry of notifications in Feedly (my feed reader of choice) about a whole bunch of “new” posts from them. When I look closer, I realise they’re old posts that I remember reading before, but Feedly thinks they’re new because the underlying RSS feed has changed.
Astro has an RSS package that you can choose to add to your site. But I realised the old Jekyll site (and I think Blogger before it) was actually generating an Atom format feed.
Atom and RSS are two different XML syndication standards. Both are still widely used, and I figured if I wanted to keep my feed stable, I’d need to stick with Atom if possible.
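To make the difference concrete, here is a minimal sketch of the same post serialised as an Atom entry and as an RSS 2.0 item. The `Post` type and the helper names are made up for illustration, and a real feed needs more than this (an Atom entry normally has an author, for instance, either on the entry or on the feed).

```typescript
// Hypothetical post shape, for illustration only.
interface Post {
  url: string;
  title: string;
  date: Date;
}

// Escape the characters that are special in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Atom: <entry> with <id>, <updated> (RFC 3339 timestamp) and <link href="...">.
function postToAtomEntry(post: Post): string {
  return [
    "<entry>",
    `  <id>${escapeXml(post.url)}</id>`,
    `  <title>${escapeXml(post.title)}</title>`,
    `  <updated>${post.date.toISOString()}</updated>`,
    `  <link rel="alternate" href="${escapeXml(post.url)}"/>`,
    "</entry>",
  ].join("\n");
}

// RSS 2.0: <item> with <guid>, <pubDate> (RFC 822 style date) and a text <link>.
function postToRssItem(post: Post): string {
  return [
    "<item>",
    `  <guid>${escapeXml(post.url)}</guid>`,
    `  <title>${escapeXml(post.title)}</title>`,
    `  <pubDate>${post.date.toUTCString()}</pubDate>`,
    `  <link>${escapeXml(post.url)}</link>`,
    "</item>",
  ].join("\n");
}

const example: Post = {
  url: "https://example.com/2025/05/hello",
  title: "Hello & welcome",
  date: new Date("2025-05-01T00:00:00Z"),
};
console.log(postToAtomEntry(example));
console.log(postToRssItem(example));
```

The element names and date formats differ, so the two formats are not drop-in replacements for each other even when they describe the same posts.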
A quick search didn’t find any other implementation of Atom for Astro, so I figured I’d see if I could translate the @astrojs/rss package into something that could generate Atom.
I forked the @astrojs/rss source and created https://github.com/flcdrg/astrojs-atom. Well actually it turns out the rss package source lives at https://github.com/withastro/astro/tree/main/packages/astro-rss so I selectively extracted just the history of that folder from that repo.
I then had a bit of fun kicking the tyres of GitHub Copilot Chat’s agent mode. It seemed a good subject to try it out on, given I could provide an example of what the generated output should look like, and Copilot tends to be better at working with JavaScript projects.
Within a day I had a working Atom package that integrated nicely with Astro. I decided it was worth publishing it to npmjs.org, and as such it’s my first package that I’ve published there. I should let the Astro folks know about it in case they’re interested or there are others who want to use it.
You can find the package at https://www.npmjs.com/package/astrojs-atom
Integrating with Astro
We use the package in much the same way as the @astrojs/rss one. My intention was to make it interchangeable at the package level (though obviously there are some different requirements for the data that you pass in).
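As a rough sketch of what those different data requirements look like: an @astrojs/rss item is a fairly flat object, while the entries passed to the Atom package in the endpoint in this post are more structured. The two interfaces here are local stand-ins trimmed to a few fields. They are not the types exported by either package, and `rssItemToAtomEntry` is a hypothetical helper.

```typescript
// Local stand-in for a few fields of an @astrojs/rss item (not the real exported type).
interface RssLikeItem {
  title: string;
  link: string;
  pubDate: Date;
  description?: string;
  content?: string;
}

// Local stand-in for the entry shape used by the feed endpoint in this post.
interface AtomLikeEntry {
  id: string;
  title: string;
  updated: Date;
  published: Date;
  link: { rel: string; href: string; type: string }[];
  summary?: { type: "html" | "text"; value: string };
  content?: { type: "html" | "text"; value: string };
}

// Map the flat RSS-style item onto the more structured Atom-style entry.
function rssItemToAtomEntry(item: RssLikeItem, site: string): AtomLikeEntry {
  // Resolve the (possibly relative) link against the site URL.
  const href = new URL(item.link, site).toString();
  return {
    id: href, // the permalink doubles as the entry id
    title: item.title,
    updated: item.pubDate,
    published: item.pubDate,
    link: [{ rel: "alternate", href, type: "text/html" }],
    summary: item.description
      ? { type: "html", value: item.description }
      : undefined,
    content: item.content ? { type: "html", value: item.content } : undefined,
  };
}

const entry = rssItemToAtomEntry(
  { title: "Hello", link: "/2025/05/hello", pubDate: new Date("2025-05-01T00:00:00Z") },
  "https://example.com"
);
console.log(entry.id, entry.link[0].rel);
```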
The feed is implemented by an Astro Static File Endpoint at /src/pages/feed.xml.ts.
import atom from "astrojs-atom";
import type { AtomEntry } from "astrojs-atom";
import { getCollection } from "astro:content";
import sanitizeHtml from "sanitize-html";
import MarkdownIt from "markdown-it";
import type { APIContext } from "astro";
import getExcerpt from "../scripts/getExcerpt";
import onlyCurrent from "../scripts/filters";
import { parse as htmlParser } from "node-html-parser";
import { getImage } from "astro:assets";

// From https://billyle.dev/posts/adding-rss-feed-content-and-fixing-markdown-image-paths-in-astro
// get dynamic import of images as a map collection
const imagesGlob = import.meta.glob<{ default: ImageMetadata }>(
  "/src/assets/**/*.{jpeg,jpg,png,gif}" // add more image formats if needed
);

const parser = new MarkdownIt();

export async function GET(context: APIContext) {
  if (!context.site) {
    throw Error("site not set");
  }

  const posts = (await getCollection("blog")).filter(onlyCurrent);
  const sortedPosts = posts.sort(
    (a, b) => new Date(b.data.date).getTime() - new Date(a.data.date).getTime()
  );
  const postsToInclude = sortedPosts.filter((post) => post.body).slice(0, 10); // Get the latest 10 posts

  const siteUrl = context.site?.toString();
  if (!siteUrl) {
    throw new Error("Site URL is not defined");
  }

  const feed: AtomEntry[] = [];

  for (const post of postsToInclude) {
    // convert markdown to html string
    const body = parser.render(post.body!);
    // convert html string to DOM-like structure
    const html = htmlParser.parse(body);
    // hold all img tags in variable images
    const images = html.querySelectorAll("img");

    for (const img of images) {
      const src = img.getAttribute("src")!;

      // Relative paths that are optimized by Astro build
      if (src.startsWith("../../assets/")) {
        // remove the `../../assets/` prefix
        const prefixRemoved = src.replace("../../assets/", "");
        // create prefix absolute path from root dir
        const imagePathPrefix = `/src/assets/${prefixRemoved}`;

        // call the dynamic import and return the module
        const imagePath = await imagesGlob[imagePathPrefix]?.()?.then(
          (res) => res.default
        );

        if (imagePath) {
          const optimizedImg = await getImage({ src: imagePath });
          // drop the leading slash; context.site already ends with one
          const newSrc = context.site + optimizedImg.src.replace("/", "");
          // set the correct path to the optimized image
          img.setAttribute("src", newSrc);
        }
      } else if (src.startsWith("/images")) {
        // paths starting with `/images` are served from the public dir
        img.setAttribute("src", context.site + src.replace("/", ""));
      } else {
        throw Error(`src unknown: ${src}`);
      }
    }

    const htmlContent = sanitizeHtml(html.toString(), {
      allowedTags: sanitizeHtml.defaults.allowedTags.concat(["img"]),
    });

    feed.push({
      id: `${new URL(post.id, context.site).toString()}`,
      updated: post.data.date,
      published: post.data.date,
      title: post.data.title,
      content: {
        type: "html",
        value: htmlContent,
      },
      summary: {
        type: "html",
        value: post.data.description || getExcerpt(htmlContent, 500),
      },
      category: post.data.tags.map((tag) => ({
        term: tag,
      })),
      link: [
        {
          rel: "alternate",
          href: new URL(post.id, context.site).toString(),
          type: "text/html",
          title: post.data.title,
        },
      ],
      thumbnail: post.data.image
        ? {
            url: `${new URL(post.data.image.src, context.site).toString()}`,
          }
        : undefined,
    });
  }

  const atomFeedUrl = `${siteUrl}feed.xml`;

  return atom({
    id: atomFeedUrl,
    title: {
      value: "David Gardiner",
      type: "html",
    },
    author: [
      {
        name: "David Gardiner",
      },
    ],
    updated: new Date().toISOString(),
    subtitle:
      "A blog of software development, .NET and other interesting things",
    link: [
      {
        rel: "self",
        href: atomFeedUrl,
        type: "application/atom+xml",
      },
      {
        rel: "alternate",
        href: siteUrl,
        type: "text/html",
        hreflang: "en-AU",
      },
    ],
    lang: "en-AU",
    entry: feed,
  });
}
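The endpoint imports a `getExcerpt` helper that isn't shown in this post. The following is only a guess at the general idea, assuming the helper strips tags from the HTML and truncates what is left to a maximum length. The real implementation may well differ, and the name `excerptFromHtml` is made up so as not to imply this is the actual helper.

```typescript
// Hypothetical stand-in for an excerpt helper: strip tags, collapse
// whitespace, then truncate on a word boundary.
function excerptFromHtml(html: string, maxLength: number): string {
  const text = html
    .replace(/<[^>]*>/g, " ") // naive tag removal; assumes already-sanitised HTML
    .replace(/\s+/g, " ")
    .trim();

  if (text.length <= maxLength) {
    return text;
  }

  // Cut at the last space before the limit so a word isn't split in half.
  const cut = text.slice(0, maxLength);
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + "…";
}

console.log(excerptFromHtml("<p>Hello <b>world</b></p>", 500));
```

Note that this sketch leaves HTML entities such as `&amp;` untouched, which suits a summary declared with type "html".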
Complexities
There’s one thing that makes it slightly more complex than I’d like: including full blog post content in the feed.
Unfortunately Astro doesn’t give you access to the rendered content for each page when you’re looking at the page data from a call to getCollection in a static file endpoint. Because of this we have to re-render each page ourselves.
This also means we need to deal with fixing links to images, and there’s always the risk that what we end up rendering isn’t quite the same as how the actual web page is rendered.
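The image fix-up in the endpoint builds absolute URLs by appending the path (minus its leading slash) to `context.site`, which works when the site URL ends in `/`. One alternative, sketched here, is to let the standard `URL` constructor do the joining. `toAbsoluteUrl` is a hypothetical helper name, and the sketch assumes the site is served from the root of its domain, since a root-relative path such as `/images/x.png` replaces any path on the base URL.

```typescript
// Resolve an image path against the site URL using the WHATWG URL constructor.
// Root-relative paths are joined to the site's origin; URLs that are already
// absolute are returned unchanged (apart from normalisation).
function toAbsoluteUrl(src: string, site: URL | string): string {
  return new URL(src, site).toString();
}

// Example values are placeholders, not paths from the actual blog.
console.log(toAbsoluteUrl("/images/2025/photo.png", "https://example.com"));
console.log(toAbsoluteUrl("/_astro/photo.abc123.webp", new URL("https://example.com/")));
```

Because the constructor handles the slash between the origin and the path, there is no need to strip the leading `/` first.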
It’s not ideal, and there’s an existing discussion tracking possible solutions. Ironically there used to be a way in earlier versions of Astro, but that approach is no longer possible.