Migrating content from Ghost to Jekyll

The first step to migrate my blog from Ghost to Jekyll was to convert the Ghost content to Jekyll. Here is how I did it and the challenges I had along the way.

Articles in this series:

  1. Migrated from Ghost to Jekyll
  2. Migrating content from Ghost to Jekyll (this article)
  3. How I set up this blog on Jekyll
  4. How I improved my Jekyll setup
  5. How I improved my Jekyll SEO

Getting content out of Ghost

As I previously mentioned, I recently migrated my Ghost blog to Jekyll. While the process wasn't overly complicated, there were a few things I had to adjust along the way to get a proper result. I started with exporting all of my content from Ghost. Ghost offers a convenient option for that, which gives you all content in a JSON format that you can process further.

Exporting content will only give you the content but not the assets. Luckily, the awesome team at Ghost provided me with a backup of all my assets.

With the content and images in place, I was ready to take the next step.

Convert the content to Jekyll

Jekyll recommends two ways of migrating Ghost blogs: using a database or a backup. If you're migrating from a hosted blog like I did, you will have access to a backup file, which you can process using the Jekyll ghost importer Ruby gem. Using the gem is pretty straightforward: copy the Ghost backup file to where your Jekyll site is and run the gem, which then creates a Markdown file for each post. But there were a few things that I wanted to adjust in the process.

Change permalink to slug

By default, when converting your pages, the Jekyll ghost importer gem sets the URL of each post using the permalink property. This hardcodes the URL of each post, ignoring the site configuration. If you wanted to change the URL format in the future, you would need to adjust the permalink of every post separately.

To avoid SEO issues and broken links, I wanted to keep the same URLs I had on my Ghost blog: the slug (the title without noise words, which I control) plus a trailing slash, e.g. /my-post/. The conversion produced the existing slug without the trailing slash. While I could've used Netlify's redirects, I decided to fix it properly while I was at it.

After converting my content, I decided to change the permalink property to slug. With this approach, I could centrally configure my URLs in _config.yml to be formatted as /:slug/, with a trailing slash and without the date. Jekyll automatically generates a slug for you, but if you want to, you can specify your own in the post's front matter, which is what I did, since I already had slugs from Ghost.

To have the script output slug instead of permalink, I adjusted its front_matter method from:

def front_matter
  front_matter_hash = {
    'layout' => "post",
    'title'  => title
  }
  # ...
end

to:

def front_matter
  front_matter_hash = {
    'layout' => "post",
    'title'  => title,
    'slug'   => slug
  }
  # ...
end
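To make the effect of that change concrete, here is a minimal sketch of the front matter such a hash would produce. It assumes the hash is serialized with Ruby's standard YAML library; the post values are made up for illustration, and the importer's real serialization code may differ.

```ruby
require 'yaml'

# Hypothetical post data; the real importer reads these values from the Ghost export.
front_matter_hash = {
  'layout' => 'post',
  'title'  => 'My post',
  'slug'   => 'my-post'
}

# Hash#to_yaml emits the opening "---" line; the closing one is added by hand.
front_matter = "#{front_matter_hash.to_yaml}---\n"
puts front_matter
```

Because the front matter carries only a slug and no permalink, the URL format stays under the control of the permalink setting in _config.yml.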

Update image links

Ghost and Jekyll use different paths for images. So for my images to show up after the migration, I needed to change their URLs. Images are referenced in two places: the front matter, which points to the post image, and the body which contains all images used in the content.

Once again, I adjusted the front_matter method in the importer gem, changing:

front_matter_hash['image'] = image if image != nil

to:

front_matter_hash['image'] = feature_image.sub('/content/', '/assets/') if feature_image != nil
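The change above covers the post image in the front matter. Images in the post body need the same treatment, which could look roughly like the sketch below. The rewrite_image_paths helper is hypothetical and not part of the importer gem, and it assumes that all Ghost images live under /content/images/ and were copied to /assets/images/ in the Jekyll site.

```ruby
# Rewrites Ghost image paths in a post body to point at the Jekyll assets folder.
# Assumption: Ghost serves images from /content/images/ and the same files
# were copied to /assets/images/ in the Jekyll site.
def rewrite_image_paths(body)
  # gsub rather than sub, because a body can contain more than one image.
  body.gsub('/content/images/', '/assets/images/')
end

body = '![Diagram](/content/images/2019/01/diagram.png)'
puts rewrite_image_paths(body)
```

Matching on the full /content/images/ prefix rather than just /content/ lowers the chance of touching unrelated text in the body.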

Run the script locally

Having adjusted the importer gem, I needed to run it. I had zero experience working with Ruby, which turned such a simple task into a puzzle. But looking closer at the code, I noticed that it had a shebang at the top (#!/usr/bin/env ruby), which meant it could be executed as a script from the command line:

$ /Users/me/jekyll_ghost_importer/bin/jekyll_ghost_importer ./ghost.json
...

Fixing code snippets

After converting the content, I tried to build my site, only to get an error. Apparently, some of my snippets collided with the Liquid templating engine in Jekyll. This was particularly the case with Angular code snippets, which use the same double curly braces as Liquid:

{{::vm.title}}<br />
<br />
<a href="#/admin" ng-show="vm.isAdmin">admin</a>

To prevent the code snippets from breaking the build, I had to wrap them in the {% raw %}...{% endraw %} tags:

{% raw %}
```html
{{::vm.title}}<br />
<br />
<a href="#/admin" ng-show="vm.isAdmin">admin</a>
```
{% endraw %}
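With many posts, wrapping snippets by hand gets tedious. The following is a minimal sketch of how it could be automated. It assumes that snippets are fenced with three backticks and that none of them is wrapped yet; wrap_liquid_snippets is a hypothetical helper, not something the importer gem provides.

```ruby
FENCE = '`' * 3 # built at runtime so this example contains no literal fence

# Matches an opening fence line, the snippet body, and the closing fence line.
FENCED_BLOCK = /^#{FENCE}[^\n]*\n.*?^#{FENCE}[ \t]*$/m

def wrap_liquid_snippets(markdown)
  markdown.gsub(FENCED_BLOCK) do |block|
    # Only snippets that contain Liquid delimiters need protecting.
    if block.include?('{{') || block.include?('{%')
      "{% raw %}\n#{block}\n{% endraw %}"
    else
      block
    end
  end
end
```

The sketch is not idempotent: running it twice on the same file wraps the affected snippets twice, so it is best run once, right after the conversion.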

Change paging URLs

On my old Ghost blog, I had two places with a paged listing of articles: archive and tags.

Paging the archive

On my old Ghost blog, the home page was a paged archive of all my posts, with page URLs in the form /page/n/. My new blog has a similar archive, but by default Jekyll uses /page:num, which gives URLs like /page2/. To avoid broken URLs, I adjusted the paging URL in _config.yml:

paginate_path: /page/:num/

This gave me the exact URLs for the paged archive that I had in Ghost.
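To illustrate the difference between the two templates, the sketch below expands each of them for page 2. It only shows the substitution of the :num placeholder and is not Jekyll's actual implementation; paged_url is a hypothetical helper.

```ruby
# Expands a paginate_path template for a given page number.
# Illustration only, not Jekyll's actual implementation.
def paged_url(template, page_number)
  template.sub(':num', page_number.to_s)
end

puts paged_url('/page:num', 2)   # Jekyll's default template
puts paged_url('/page/:num/', 2) # the template configured above
```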

Paging the tags

Using the jekyll-tagsgenerator plugin, Jekyll can build a page for each tag with the specified number of articles per page. Unfortunately, the plugin doesn't follow the paginate path defined in the config file and uses a fixed URL scheme instead. There's not much you can do about it other than building your own version of the plugin and changing the URL pattern. Since tag pages are not that heavily used on my blog, I decided to cut my losses and leave it as-is.

Redirect RSS

My Ghost blog published an RSS feed of my last 10 blog posts at /rss. Jekyll, on the other hand, publishes its RSS feed at /feed.xml. To avoid breaking links, I decided to use Netlify's ability to define redirects. In the root of my project, I added a file named netlify.toml and defined the redirect in it:

[[redirects]]
  from = "/rss"
  to = "/feed.xml"
  status = 301

This sets up a 301 (Moved Permanently) redirect from the /rss URL I had in Ghost to the /feed.xml URL that I have in Jekyll.

Reading through Jekyll's docs later, I found out that you can change the default path where the RSS feed is generated. Since my redirect was already working, I decided to leave it be. But if you're hosting your static site on a server that doesn't support redirects, changing the feed path is a good way to avoid breaking URLs.
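Since keeping old URLs working was the goal of most of these adjustments, it can help to verify the result after a build. Here is a minimal sketch of such a check. It assumes the site is built into a local folder such as _site and that you have a list of your old Ghost URLs; both helpers are hypothetical.

```ruby
# Maps a site URL such as /my-post/ or /feed.xml to the file Jekyll would write for it.
def built_file_for(url, site_dir)
  path = url.end_with?('/') ? "#{url}index.html" : url
  File.join(site_dir, path)
end

# Returns the old URLs that have no matching file in the built site.
def missing_urls(old_urls, site_dir)
  old_urls.reject { |url| File.file?(built_file_for(url, site_dir)) }
end

# Example with made-up URLs; prints the URLs that would break.
puts missing_urls(['/my-post/', '/page/2/', '/rss'], '_site')
```

A URL like /rss is expected to show up as missing here, because it is served by the Netlify redirect rather than by a file in the built site.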

With these steps done, I had the content of my blog migrated, ready for further optimizations.

What's next

In the following articles, I will tell you more about how I configured Jekyll to improve SEO and decrease my site's build time. Stay tuned!