Movable Type to WordPress without losing your permalinks

So you’d like to upgrade to WordPress from Movable Type? It’s not very hard… if you’ve always had the same permalink structure. But if you’ve gone through the Movable Type versions from 1.5 to 2.x to 3.x and finally to 4.1 (like me) then you’ve probably been through a couple of URL scheme revisions.

In my case, I started out with Movable Type’s default (and, at the time, only) permalink structure… post ID (a six-digit number) followed by .html, in the archives directory to avoid cluttering up the top one. It looked like this: “http://tbotcotw.com/archives/000334.html”. Of course, from there I switched to .shtml, then quickly to .php, and, when MT finally supported it, to permalinks based on the post name (all the words with underscores replacing the whitespace). At around the same time I read Mark Pilgrim’s excellent post on cruft-free and future-proof URLs in Movable Type. I’d finally found a URL structure that (I thought; it turns out that the underscore was a bad idea) would stand the test of time: post name with underscores for whitespace and no extension.

This was exasperating, though, since switching to a new URL scheme in midstream meant that my permalinks were now not very perma. I was getting lots of 404s because people had linked to a URL that didn’t exist anymore. And that was hurting my Google PageRank. I considered fixing the problem by publishing a page with a PHP Location header that would redirect from the old URL to the new seamlessly (using a 301 status code). But actually building those pages statically slowed down the already miserably slow interface to MT (my server is a piece of crap)… comments would take over a minute to be published when two or three individual entries and two category archives had to be built. So I had to wait until MT caught up and allowed dynamic publication; then I created a new archive mapping for the old URL structure, and MT would automatically serve a page with the Location header whenever a browser requested an old page. Pretty hacky, but it worked very well.

Fast forward to today and I still get hits on old posts from links that someone wrote in 2002. I wanted to duplicate my redirect scheme after I switched blogging systems but neither the MT export nor the WP import pays any attention to post ID, so I hacked both of them as I mentioned here. I had to use an old (2.1?) version of the import script in WordPress, but it worked just fine. Then I built a series of mod_rewrite rules to redirect from every one of the many URL structures I’ve ever used:

# Convert old archives
RewriteRule ^/archives/([0-9]*)\.php$ /$1/ [R=301,L]
RewriteRule ^/archives/([0-9]*)\.html$ /$1/ [R=301,L]
RewriteRule ^/archives/([0-9]*)\.shtml$ /$1/ [R=301,L]
RewriteRule ^/archives/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.php$ /$1/ [R=301,L]
RewriteRule ^/archives/old/([0-9]*)\.php$ /$1/ [R=301,L]
RewriteRule ^/archives/cat_(.+)\.php$ /category/$1/ [R=301,L]
RewriteRule ^/archives/cat_(.+)\.html$ /category/$1/ [R=301,L]
RewriteRule ^/archives/cat_(.+)\.shtml$ /category/$1/ [R=301,L]
RewriteRule ^/archives/?$ /archive/ [R=301,L]
RewriteCond %{REQUEST_URI} !^/archives/[0-9][0-9][0-9][0-9]/[0-9][0-9]/?.*$ [NC]
RewriteCond %{REQUEST_URI} !^/archives/[0-9][0-9][0-9][0-9][0-9][0-9]\.?.*$ [NC]
RewriteCond %{REQUEST_URI} !^/archives/tag/.*$ [NC]
RewriteCond %{REQUEST_URI} !^/archives/author/.*$ [NC]
RewriteRule ^/archives/(.*)$ /category/$1 [R=301,L]
RewriteRule ^/archives/(.*)$ /$1 [R=301,L]

This converts the old “/archives/000334.html” format to “/000334/,” changes “/archives/cat_blog_business.html” (another legacy MT convention) to “/category/blog_business/,” and removes the “/archives/” from every other URL. At this point the numeric requests are handled by a plugin called Permalink Redirect. Just put “%post_id%/” in the old permalink structure field and blamo… no more 404s, everybody who clicks an ancient link is sent to the correct post.
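If you want to sanity-check rules like these before touching the server, here’s a rough Python sketch that runs a subset of the archive rules against sample paths, first match wins. It’s only an approximation: Python’s re module is not identical to Apache’s regex engine, the RewriteCond lines are left out, and the names ARCHIVE_RULES and redirect_target are made up for this example.

```python
import re

# A subset of the "Convert old archives" rules as (pattern, replacement) pairs.
# Apache's $1 is written \1 in Python's re template syntax.
ARCHIVE_RULES = [
    (r"^/archives/([0-9]*)\.php$", r"/\1/"),
    (r"^/archives/([0-9]*)\.html$", r"/\1/"),
    (r"^/archives/([0-9]*)\.shtml$", r"/\1/"),
    (r"^/archives/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.php$", r"/\1/"),
    (r"^/archives/cat_(.+)\.php$", r"/category/\1/"),
    (r"^/archives/cat_(.+)\.html$", r"/category/\1/"),
    (r"^/archives/cat_(.+)\.shtml$", r"/category/\1/"),
    (r"^/archives/?$", r"/archive/"),
]


def redirect_target(path):
    """Return the path the first matching rule would redirect to, or None."""
    for pattern, replacement in ARCHIVE_RULES:
        match = re.match(pattern, path)
        if match:
            # expand() substitutes the captured groups into the template.
            return match.expand(replacement)
    return None


if __name__ == "__main__":
    for old in ("/archives/000334.html", "/archives/cat_blog_business.html"):
        print(old, "->", redirect_target(old))
```

Feeding it a handful of URLs pulled from old access logs is a cheap way to spot a rule that never fires or fires too eagerly.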

The next problem was the MT underscores to WP hyphens issue. It’s actually a good idea to use hyphens… Google doesn’t recognize underscores as word separators (yet), so you’re destroying a source of relevant keywords if your URL is a big_string_like_this. On the other hand, big-string-like-this gets your page in the index with those four keywords, which should help your search ranking. I quickly found that, in most circumstances, WordPress automatically converts underscores to hyphens. Only when the URL has something after the title, like in “/2008/04/big_string_like_this/feed/,” or if a category is involved, does WordPress fail to convert. Perhaps I should take the time to fix that in the code… until then, I’ll use these mod_rewrites:

# Change underscores to hyphens
RewriteRule ^/category/([^_]*)_([^_]*)_([^_]*)_(.*)$ /category/$1-$2-$3-$4/ [R=301,L]
RewriteRule ^/category/([^_]*)_([^_]*)_(.*)$ /category/$1-$2-$3/ [R=301,L]
RewriteRule ^/category/([^_]*)_(.*)$ /category/$1-$2/ [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9])/([^_/]*)_([^_/]*)_([^_/]*)_(.*)/(.+)$ /$1/$2-$3-$4-$5/$6 [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9])/([^_/]*)_([^_/]*)_(.*)/(.+)$ /$1/$2-$3-$4/$5 [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9])/([^_/]*)_(.*)/(.+)$ /$1/$2-$3/$4 [R=301,L]

This will ignore the underscores (and let WordPress handle them) unless it’s a category or there are characters after that last slash. Finally, I had to move all my syndication feeds to the new structure:

# Change to wordpress feeds
RewriteRule ^/atom$ /wp-atom.php [R=301,L]
RewriteRule ^/atom\.xml$ /wp-atom.php [R=301,L]
RewriteRule ^/index\.atom$ /wp-atom.php [R=301,L]
RewriteRule ^/index\.rdf$ /wp-rdf.php [R=301,L]
RewriteRule ^/index\.xml$ /wp-rss2.php [R=301,L]
RewriteRule ^/rss\.xml$ /wp-rss.php [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.atom$ /$1/feed/ [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.atom/$ /$1/feed/ [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.xml$ /$1/feed/ [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.xml/$ /$1/feed/ [R=301,L]

Nothing complicated there.
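Going back to the underscore rules for a moment: as I read them, each rule swaps only a few underscores per redirect, so a long name may need more than one hop. Here’s a rough Python sketch of what they’re meant to do, done in a single pass. It is not a drop-in replacement for the mod_rewrite rules, it only touches the category name or the post-name segment, and the names DATED, CATEGORY and hyphenate are invented for illustration.

```python
import re

# /YYYY/MM/post_name/something-after-the-title
DATED = re.compile(r"^/(\d{4}/\d{2})/([^/]+)/(.+)$")
# /category/anything
CATEGORY = re.compile(r"^/category/(.+)$")


def hyphenate(path):
    """Return the hyphenated path, or None if WordPress can be left to cope."""
    match = CATEGORY.match(path)
    if match and "_" in match.group(1):
        return "/category/" + match.group(1).replace("_", "-")

    match = DATED.match(path)
    if match and "_" in match.group(2):
        date, slug, rest = match.groups()
        # Only the post name is changed; whatever follows it is kept as-is.
        return f"/{date}/{slug.replace('_', '-')}/{rest}"

    return None


if __name__ == "__main__":
    print(hyphenate("/2008/04/big_string_like_this/feed/"))
    print(hyphenate("/category/blog_business/"))
    print(hyphenate("/2008/04/big_string_like_this/"))
```

A plain post URL with nothing after the title comes back as None, which matches the idea of leaving those to WordPress.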

Then I found one more problem. Movable Type makes sure that permalink slugs are globally unique, and posts with the same titles get tagged with “-1,” “-2,” etc. WordPress doesn’t care if they’re globally unique, as long as they’re not in the same month. So I had to watch my logs for 404s, then go back and rename the posts. At this point the Redirection plugin was invaluable. It automatically builds a 301 redirect from old name to new when you retitle a post… so I could change all those post names back to the Movable Type default without worrying about blocking new links that pointed to the WordPress format.
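Instead of only waiting for 404s to show up in the logs, you could look for the likely troublemakers up front. This is a minimal sketch, assuming you can get a plain list of the old Movable Type post names out of the export; it flags names that look like “base-N” or “base_N” where “base” is also a post name. The function name and the sample data are hypothetical, and a title that genuinely ends in a number will show up as a false positive if its base happens to exist too.

```python
import re

# A name followed by a hyphen or underscore and a trailing number.
SUFFIX = re.compile(r"^(.+)[-_](\d+)$")


def deduplicated_slugs(slugs):
    """Return slugs that look like a numbered duplicate of another slug."""
    known = set(slugs)
    flagged = []
    for slug in slugs:
        match = SUFFIX.match(slug)
        # Only flag it when the un-numbered base is itself an existing slug.
        if match and match.group(1) in known:
            flagged.append(slug)
    return flagged


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real export.
    sample = ["open_thread", "open_thread_1", "open_thread_2", "top_10", "hello"]
    print(deduplicated_slugs(sample))
```

The flagged names are the ones worth checking by hand against what WordPress generated on import.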

I was going to go into a discussion of the relative benefits of dynamic vs. static publishing, and what a wonderful plugin WP Super Cache is, but this post has run on way too long, so that will have to wait for another day. Until then, here’s an interesting discussion on the subject.


Ch-Ch-Ch-Changes!

Obviously something has happened to TBOTCOTW. I’ve finally switched from Movable Type to WordPress. I had a lot of time invested in Movable Type, but an upgrade to MT4.1 made most of that moot… many of my hacks didn’t work anymore because none of the more arcane plugins (simple comments, regex, getXML, etc.) were ever updated to work with MT4.

So I realized that the golden days for Movable Type were over. There were just too many internal changes from version 2.5 to 3.0, and then again from 3.4 to 4.0, for plugin developers to keep up. Even some of the most dedicated, like Arvind Satyanarayan, had given up on immensely popular plugins like MT BlogRoll. In the comments he got request after request for a new version, and he promised it in a week or two a couple of times, but it’s never happened.

This isn’t Arvind’s fault, of course. He’s a busy young man who just started college. This also isn’t really the fault of MovableType’s developers. The upgrades they did that made plugins break were probably necessary, and I think their vision was (unavoidably) a bit blinkered because they’re developers, not plugin writers (much less plugin users). So they have different priorities than me (and other end-users like me who like to make everything work just so).

WordPress, on the other hand, is purely open source, and always has been. So there seems to be a lot more of the community’s requirements taken into account in new versions. Plus the plugin community feels like MovableType’s did three or four years ago… it’s vibrant and alive, with new plugins released daily and old plugins updated frequently.

The migration was not especially difficult, but I did have to follow these directions to make MT print the EntryIDs in the export file, and to make WP read and use them when importing. This was only because I still get hits with the old entryid format of /archives/001178.html and I have a redirection scheme that sends those hits to the correct entry (basically an extra archive template that’s just a redirect header and a series of archive mappings to each old format). That redirection scheme was very difficult to come up with in MT, but there were a couple of plugins (Advanced Permalinks and Redirection) that made it very, very easy in WP. WP also, out of the box, automatically converts from the underscore format of MT to dashes, which is nice, since I just learned that dashes are preferred by search engines.

Since the upgrade I’ve noticed several things are better. WordPress is noticeably faster than Movable Type since it does everything dynamically rather than publishing static HTML files (it’s even faster than an MT installation with every template set to dynamic publishing). And, like I said before, the plugin selection is pretty incredible. There are several plugins that just do stats reports on visitors, pageviews, and the like. Plus the Akismet plugin (which did have an MT version) is native to WP, and it works very well at stopping comment spam. And WP has several features built in that need plugins or new templates in MT, like a comments feed per post and a blogroll management system.

All in all, I’m extremely happy with WordPress, and not just because I like having new toys to play with.
