Migrating from Medium to Gatsby

March 17, 2019 · 9 mins read
Engineering
Photo by Marvin Meyer on Unsplash

I recently moved my blog from Medium to a self-managed blog built with Gatsby in the open, then deployed on Netlify. After a few weeks of fiddling around, I feel like I’ve landed on something I’m mostly happy with.

Despite my many new & old frustrations with Medium, I really enjoyed the writing experience. Whenever I’ve built my own blogs in the past, I inevitably spent more time designing and building the damn thing than writing content. On Medium, I could just write.

In this post I’ll describe how I did the transition, and what tools I found along the way that have made the experience of managing my own blog with Gatsby a somewhat painless one. Let’s get started!

Exporting your content off of Medium

The first step is to download all your information from Medium. It’ll be sent to you as a .zip file containing all your data, including published & draft posts you’ve written. Once you’ve uncompressed your files, you’ll see a bunch of folders:

Medium posts are exported as html files

Medium posts are exported as html files in the posts folder, so we’ll want something to convert them to markdown. Alternatively, you could use something like gatsby-source-medium or gatsby-source-rss to add this data to Gatsby’s GraphQL endpoint. I preferred having actual files in my repo so I could make edits as necessary.

After some googling, I found medium-2-md by Gautam Dhameja which seemed promising. You point this script at a local folder containing your Medium posts, and it generates a folder full of markdown files together with basic frontmatter. Having frontmatter included is useful, because it allows us to add interesting metadata to our posts that we can then use when building our blog in Gatsby. Frontmatter looks like this (everything between the ---):

---
title: "My blog post"
description: >-
  This is a description of my awesome blog post
date: "2019-03-17T00:52:08.562Z"
categories:
  - stuff n things
keywords:
  - my blog post
this:
  - is an array
  - of strings
arbitraryData: is fine to add
---

My content that isn't frontmatter
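To make the "everything between the ---" rule concrete, here is a minimal sketch of how a script could separate frontmatter from the post body. The `splitFrontmatter` helper is hypothetical and only illustrates the idea; it is not how medium-2-md or Gatsby do it, and it returns the frontmatter as raw text rather than parsing the YAML.

```javascript
// Hypothetical helper: split a markdown file into raw frontmatter and body.
// It only looks for a leading block delimited by lines containing "---".
function splitFrontmatter(source) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) {
    // No frontmatter block: treat the whole file as body.
    return { frontmatter: null, body: source };
  }
  return { frontmatter: match[1], body: match[2].trim() };
}

const post = [
  '---',
  'title: "My blog post"',
  'date: "2019-03-17T00:52:08.562Z"',
  '---',
  '',
  "My content that isn't frontmatter",
].join('\n');

const { frontmatter, body } = splitFrontmatter(post);
console.log(frontmatter); // the two key/value lines, still unparsed YAML
console.log(body);
```

In a real Gatsby blog you would not write this yourself: the markdown transformer plugin reads the frontmatter for you and exposes it as queryable fields.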

To begin, run the following script, pointing it to the folder which houses your Medium html posts:

npx medium-2-md convertLocal path/to/medium-export/posts -f

If you want to convert your drafts as well, add the -d flag to the command. When the script completes, it’ll place a folder within posts called md_<series of numbers>, which contains all your published posts in markdown format.

Converted posts in markdown format

Getting started with Gatsby

Now that we have our posts, we can get our blog going! I started by throwing something together in Sketch, as I found it helpful for me to think about different ideas I wanted to try in the design of my blog. Here’s what I originally came up with (quite different from what I actually built!):

My original design for my blog

Setting up

Next, let’s get our repo created using the gatsby-starter-blog starter. There are lots of different starters when you’re creating something with Gatsby, but I found this one good enough to be a base. Follow the instructions in the gatsby-starter-blog repo after you’ve also installed gatsby-cli. If you did it right, you should have a new folder with the following (or similar) structure:

What your folder structure should look like

At this point, I moved my markdown posts into the content/blog folder, and had to do a bunch of cleanup to prettify the markdown. One thing that made the conversion process difficult was that in my Medium posts, I added code snippets by embedding GitHub gists. This meant that the code snippets in my posts didn’t get converted, so I had to do some manual work here with copypasta.

How does Gatsby even?

Once you’ve cleaned up your posts, you should have a bare bones, basic blog built with Gatsby. I’m not going to comprehensively cover what Gatsby does or how it works in this post, but the docs and tutorial are excellent ways to get started. What you should know about Gatsby is that it’s really a static progressive web app (PWA) generator. It’s like Jekyll and create-react-app combined. In their own words:

Gatsby.js is a static PWA (Progressive Web App) generator. You get code and data splitting out-of-the-box. Gatsby loads only the critical HTML, CSS, data, and JavaScript so your site loads as fast as possible. Once loaded, Gatsby prefetches resources for other pages so clicking around the site feels incredibly fast.

Which parts of my Gatsby app are static? It wasn’t initially obvious to me. It eventually dawned on me that Gatsby tries to do as many things as possible at build time in order to generate static assets. For example, if you can have Gatsby query your data at build time (such as from your filesystem, an external API, a database, etc), then static files can be generated from that data. The resulting artifact is then completely static.

During run time, your Gatsby app can respond immediately with static assets while hydrating itself with additional data / interactivity as needed after initial render. It’s not just a static site generator! It also uses GraphQL in a fairly novel way as a layer for your React client to work with. For example, the markdown blog posts you’ve added, the metadata for your site, the image assets and more are all exposed to your React client via GraphQL.
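To give a feel for that GraphQL layer, here is a sketch of the sort of query a blog index page might run, plus a small function that reshapes the result for rendering. The query uses the Gatsby v2 sort syntax and assumes a `fields.slug` value added in gatsby-node.js, as gatsby-starter-blog does. The `toPostList` helper and the sample data are made up for illustration.

```javascript
// In a real page component this string would be wrapped in the `graphql`
// tag exported by the `gatsby` package; here it is kept as a plain string.
const blogIndexQuery = `
  {
    allMarkdownRemark(sort: { fields: [frontmatter___date], order: DESC }) {
      edges {
        node {
          fields { slug }
          frontmatter { title date }
        }
      }
    }
  }
`;

// Hypothetical helper: flatten the query result into a simple list of posts.
function toPostList(data) {
  return data.allMarkdownRemark.edges.map(({ node }) => ({
    slug: node.fields.slug,
    title: node.frontmatter.title,
    date: node.frontmatter.date,
  }));
}

// Made-up data in the shape Gatsby would pass to the page as `props.data`.
const sampleData = {
  allMarkdownRemark: {
    edges: [
      {
        node: {
          fields: { slug: '/my-blog-post/' },
          frontmatter: { title: 'My blog post', date: '2019-03-17T00:52:08.562Z' },
        },
      },
    ],
  },
};

console.log(toPostList(sampleData));
```

Notice that `title` and `date` come straight from the frontmatter of each markdown file.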

The cool thing about Gatsby’s architecture is that in addition to the runtime ecosystem for application concerns via React components, it also opens up the possibility for reusable plugins that work on your build itself. For example, if you wanted to create an RSS feed for your blog, you could quickly download a plugin (gatsby-plugin-feed) that can do that for you at build time. That’s amazing!
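Adding a build-time plugin is usually a one-line change to gatsby-config.js once the package is installed. The sketch below assumes gatsby-plugin-feed can run with its default settings, which expect `title`, `description` and `siteUrl` in `siteMetadata`. Depending on the plugin version you may need to pass explicit feed options instead. The metadata values are placeholders.

```javascript
// gatsby-config.js (sketch)
const config = {
  siteMetadata: {
    title: 'My blog', // placeholder
    description: 'Posts about stuff n things', // placeholder
    siteUrl: 'https://www.no.lol', // example domain used later in this post
  },
  plugins: [
    // Generates an RSS feed file as part of the build.
    'gatsby-plugin-feed',
  ],
};

module.exports = config;
```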

Automate all the things

Throughout the process of creating my blog, I found the following tools that made my life easier.

Deploying your blog with Netlify

Getting your blog deployed with Netlify is a breeze. Once you’ve created a new site in Netlify, you can quickly turn on automatic deployments as well as deploy previews. Deploy previews are built for every PR and let you quickly take a look at your changes before merging into master. You’ll also want to use Netlify’s DNS if you can, because that allows them to provision a wildcard SSL certificate for you, so both your “naked” domain (https://no.lol) and your “www” domain (https://www.no.lol) will have SSL.

Add the lighthousebot for continuous performance testing

After some headscratching, I finally figured out how to get the lighthousebot to automatically run in CI for every pull request. If you’re not familiar with Lighthouse, it’s a developer tool released by the Chrome team that helps you audit your site for performance, accessibility, progressive web apps, and more. You can run a lighthouse audit in your Chrome DevTools without installing anything:

Run a lighthouse audit on your site in the Chrome DevTools

To get started with lighthousebot, follow the instructions in the lighthousebot repo. You’ll need to:

  1. Add @lighthousebot as a collaborator to your repo
  2. Request a lighthousebot API key
  3. Add the API key as an environment variable to TravisCI
  4. Then run it against your Netlify deploy preview so you can look at score changes before merging your PR

lighthousebot will leave a comment in your PR

Because you need to wait for the Netlify deploy preview to finish before you can run lighthousebot, you’ll need a little npm package called wait-for-netlify-preview by Alexander Lichter to let TravisCI wait before running the lighthouse audit. To get this to work, install wait-for-netlify-preview as a dev dependency, create a GitHub access token with the repo permission, and add it as an environment variable in TravisCI: GITHUB_API_TOKEN = <your access token>
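If you’re curious what “waiting for the preview” involves, the idea can be sketched as polling the commit’s statuses until Netlify reports a successful deploy, then reading the preview URL from that status. This is only an illustration of the concept, not the package’s actual code. The function names are hypothetical, the status objects follow the shape of GitHub’s commit status API (`state`, `context`, `target_url`), and the assumption that Netlify’s status context contains "netlify" is mine.

```javascript
// Hypothetical helper: pick the preview URL out of a list of commit statuses.
// Assumption: the deploy preview status has a context that mentions "netlify".
function findPreviewUrl(statuses) {
  const done = statuses.find(
    (status) => status.state === 'success' && status.context.includes('netlify')
  );
  return done ? done.target_url : null;
}

// Poll a caller-supplied fetchStatuses() until a preview URL shows up.
async function waitForPreview(fetchStatuses, { intervalMs = 5000, maxTries = 60 } = {}) {
  for (let attempt = 0; attempt < maxTries; attempt += 1) {
    const url = findPreviewUrl(await fetchStatuses());
    if (url) return url;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Timed out waiting for the deploy preview');
}

// Made-up statuses for illustration.
const sampleStatuses = [
  { state: 'pending', context: 'continuous-integration/travis-ci/pr', target_url: 'https://example.com/travis' },
  { state: 'success', context: 'deploy/netlify', target_url: 'https://example.com/deploy-preview' },
];

console.log(findPreviewUrl(sampleStatuses));
```

Reading commit statuses through the GitHub API is why the access token is needed.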

If you did it right, you should have two environment variables set in TravisCI:

What your TravisCI environment variables should look like

Here’s what I added to my TravisCI config and package.json:

.travis.yml
jobs:
  include:
    - stage: Test
      install: yarn install --frozen-lockfile
      script: yarn test
    - stage: Lighthouse
      if: type = pull_request
      install: yarn install --frozen-lockfile
      script: yarn run lh --perf 90 --pwa 90 --a11y 90 --bp 90 --seo 90 "$(wait-for-netlify-preview)"

package.json
{
  "scripts": {
    "lh": "lighthousebot"
  }
}

Now, when you open a PR, you’ll see the following stages in TravisCI:

Your TravisCI stages should look like this

This is a pretty nice setup! Gatsby gives you incredible performance out of the box, so this addition to your CI process ensures that the code you add won’t slow your site down too much ;)
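The numbers passed to the lh script act as minimum scores. Assuming each flag is a simple lower bound per category, the check boils down to something like this sketch, where `failingCategories` and the sample scores are made up for illustration:

```javascript
// Minimum scores, mirroring the flags in the CI script above.
const thresholds = { perf: 90, pwa: 90, a11y: 90, bp: 90, seo: 90 };

// Hypothetical helper: list the categories whose score is below its minimum.
function failingCategories(scores, minimums) {
  return Object.keys(minimums).filter((key) => scores[key] < minimums[key]);
}

// Made-up scores for a single audit run.
const sampleScores = { perf: 97, pwa: 85, a11y: 100, bp: 93, seo: 90 };

console.log(failingCategories(sampleScores, thresholds)); // [ 'pwa' ]
```

A score exactly at the threshold passes, so in this example only the PWA category would fail the build.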

Bring on the bots

Some other bots I also found useful:

  • delete-merged-branch: automatically deletes merged branches
  • renovate: keeps your dependencies up to date, similar to greenkeeper. I like that you can specify a schedule so the PRs don’t get too noisy
  • bors: if you have multiple people contributing to your blog, bors is incredibly helpful! I use this in a bunch of my open source libraries. It helps you prevent “semantic conflicts” as you merge multiple PRs. It’s also pretty cool being able to merge PRs simply by leaving a comment (“bors r+”)

If you have any questions or want more details, tweet at me. I hope you found this post useful, happy blogging!


Written by Lauren Tan who lives and works in the Bay Area building useful things. You should follow her on Twitter