Movable Type to WordPress without losing your permalinks
So you’d like to upgrade to WordPress from Movable Type? It’s not very hard… if you’ve always had the same permalink structure. But if you’ve gone through the Movable Type versions from 1.5 to 2.x to 3.x and finally to 4.1 (like me) then you’ve probably been through a couple of URL scheme revisions.
In my case, I started out with Movable Type’s default (and, at the time, only) permalink structure… post ID (a six digit number) followed by .html, in the archives directory to avoid cluttering up the top one. It looked like this: “http://tbotcotw.com/archives/000334.html.” Of course, from there I switched to .shtml, then quickly to .php, and when MT finally supported it, permalinks based on the post name (all the words with underscores replacing the whitespace). At around the same time I read Mark Pilgrim’s excellent post on cruft-free and future-proof URLs in Movable Type. I’d finally found a URL structure that I thought would stand the test of time (it turns out the underscore was a bad idea): post name with underscores for whitespace and no extension.
This was exasperating, though, since switching to a new URL scheme in midstream meant that my permalinks were now not very perma. I was getting lots of 404s because people had linked to a URL that didn’t exist anymore. And that was hurting my Google PageRank. I considered fixing the problem by publishing a page with a PHP Location header that would redirect from the old URL to the new seamlessly (using a 301 status code). But actually building those pages statically slowed down the already miserably slow interface to MT (my server is a piece of crap)… comments would take over a minute to be published when two or three individual entries and two category archives had to be built. So I had to wait until MT caught up and allowed dynamic publication, then I created a new archive mapping for the old URL structure and MT would automatically serve a page with the Location header whenever a browser requested an old page. Pretty hacky, but it worked very well.
Fast forward to today and I still get hits on old posts from links that someone wrote in 2002. I wanted to duplicate my redirect scheme after I switched blogging systems but neither the MT export nor the WP import pays any attention to post ID, so I hacked both of them as I mentioned here. I had to use an old (2.1?) version of the import script in WordPress, but it worked just fine. Then I built a series of mod_rewrite rules to redirect from every one of the many URL structures I’ve ever used:
# Convert old archives
RewriteRule ^/archives/([0-9]*)\.php$ /$1/ [R=301,L]
RewriteRule ^/archives/([0-9]*)\.html$ /$1/ [R=301,L]
RewriteRule ^/archives/([0-9]*)\.shtml$ /$1/ [R=301,L]
RewriteRule ^/archives/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.php$ /$1/ [R=301,L]
RewriteRule ^/archives/old/([0-9]*)\.php$ /$1/ [R=301,L]
RewriteRule ^/archives/cat_(.+)\.php$ /category/$1/ [R=301,L]
RewriteRule ^/archives/cat_(.+)\.html$ /category/$1/ [R=301,L]
RewriteRule ^/archives/cat_(.+)\.shtml$ /category/$1/ [R=301,L]
RewriteRule ^/archives/?$ /archive/ [R=301,L]
RewriteCond %{REQUEST_URI} !^/archives/[0-9][0-9][0-9][0-9]/[0-9][0-9]/?.*$ [NC]
RewriteCond %{REQUEST_URI} !^/archives/[0-9][0-9][0-9][0-9][0-9][0-9]\.?.*$ [NC]
RewriteCond %{REQUEST_URI} !^/archives/tag/.*$ [NC]
RewriteCond %{REQUEST_URI} !^/archives/author/.*$ [NC]
RewriteRule ^/archives/(.*)$ /category/$1 [R=301,L]
RewriteRule ^/archives/(.*)$ /$1 [R=301,L]
This converts the old “/archives/000334.html” format to “/000334/,” changes “/archives/cat_blog_business.html” (another legacy MT convention) to “/category/blog_business/,” and removes the “/archives/” from every other URL. At this point the numeric requests are handled by a plugin called Permalink Redirect. Just put “%post_id%/” in the old permalink structure field and blamo… no more 404s, everybody who clicks an ancient link is sent to the correct post.
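If you want to sanity-check rules like these before pointing real traffic at them, something like the following might help. It is only a rough sketch: it transcribes the patterns above into Python’s re module (which should treat these simple expressions the same way Apache does) and reports where the first matching rule would send a request. The function name first_redirect and the sample paths are made up for illustration, and it assumes the rules live in the server or virtual host config, since that is where patterns starting with “^/” match as written.

```python
import re

# (pattern, substitution) pairs transcribed from the rules above; $1 becomes \1.
SIMPLE_RULES = [
    (r"^/archives/([0-9]*)\.php$", r"/\1/"),
    (r"^/archives/([0-9]*)\.html$", r"/\1/"),
    (r"^/archives/([0-9]*)\.shtml$", r"/\1/"),
    (r"^/archives/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.php$", r"/\1/"),
    (r"^/archives/old/([0-9]*)\.php$", r"/\1/"),
    (r"^/archives/cat_(.+)\.php$", r"/category/\1/"),
    (r"^/archives/cat_(.+)\.html$", r"/category/\1/"),
    (r"^/archives/cat_(.+)\.shtml$", r"/category/\1/"),
    (r"^/archives/?$", r"/archive/"),
]

# The four RewriteCond lines guard only the rule that directly follows them.
# [NC] is modelled with re.IGNORECASE.
CATEGORY_EXCLUSIONS = [
    r"^/archives/[0-9][0-9][0-9][0-9]/[0-9][0-9]/?.*$",
    r"^/archives/[0-9][0-9][0-9][0-9][0-9][0-9]\.?.*$",
    r"^/archives/tag/.*$",
    r"^/archives/author/.*$",
]


def first_redirect(path):
    """Return the target of the first rule that matches, or None if none do."""
    for pattern, target in SIMPLE_RULES:
        match = re.match(pattern, path)
        if match:
            return match.expand(target)
    match = re.match(r"^/archives/(.*)$", path)
    if match is None:
        return None
    excluded = any(
        re.match(p, path, re.IGNORECASE) for p in CATEGORY_EXCLUSIONS
    )
    if not excluded:
        # Anything left under /archives/ that is not dated, numeric,
        # a tag or an author is treated as a category.
        return match.expand(r"/category/\1")
    # Otherwise just drop the /archives/ prefix.
    return match.expand(r"/\1")


if __name__ == "__main__":
    for sample in (
        "/archives/000334.html",
        "/archives/cat_blog_business.html",
        "/archives/2008/04/some_post",
        "/about/",
    ):
        print(sample, "->", first_redirect(sample))
```

This only models the first hop. A real browser follows each 301 and hits the rule set again, so a redirected URL can be redirected a second time by a later block.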
The next problem was the MT underscores to WP hyphens issue. It’s actually a good idea to use hyphens… Google doesn’t recognize underscores as whitespace (yet), so you’re destroying a source of relevant keywords if your URL is a big_string_like_this. On the other hand, big-string-like-this gets your page into the index under all four of those keywords, which should help it rank for them. I quickly found that, in most circumstances, WordPress automatically converts underscores to hyphens. Only when the URL has something after the title, like in “/2008/04/big_string_like_this/feed/,” or if a category is involved, does WordPress fail to convert. Perhaps I should take the time to fix that in the code… until then, I’ll use these mod_rewrite rules:
# Change underscores to hyphens
RewriteRule ^/category/([^_]*)_([^_]*)_([^_]*)_(.*)$ /category/$1-$2-$3-$4/ [R=301,L]
RewriteRule ^/category/([^_]*)_([^_]*)_(.*)$ /category/$1-$2-$3/ [R=301,L]
RewriteRule ^/category/([^_]*)_(.*)$ /category/$1-$2/ [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9])/([^_/]*)_([^_/]*)_([^_/]*)_(.*)/(.+)$ /$1/$2-$3-$4-$5/$6 [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9])/([^_/]*)_([^_/]*)_(.*)/(.+)$ /$1/$2-$3-$4/$5 [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9])/([^_/]*)_(.*)/(.+)$ /$1/$2-$3/$4 [R=301,L]
This will ignore the underscores (and let WordPress handle them) unless it’s a category or there are characters after that last slash. Finally, I had to move all my syndication feeds to the new structure:
# Change to WordPress feeds
RewriteRule ^/atom$ /wp-atom.php [R=301,L]
RewriteRule ^/atom\.xml$ /wp-atom.php [R=301,L]
RewriteRule ^/index\.atom$ /wp-atom.php [R=301,L]
RewriteRule ^/index\.rdf$ /wp-rdf.php [R=301,L]
RewriteRule ^/index\.xml$ /wp-rss2.php [R=301,L]
RewriteRule ^/rss\.xml$ /wp-rss.php [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.atom$ /$1/feed/ [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.atom/$ /$1/feed/ [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.xml$ /$1/feed/ [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.xml/$ /$1/feed/ [R=301,L]
Nothing complicated there.
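The underscore rules further up are the ones worth a second look, because each rule only fixes up to three underscores per request and counts on the browser following a chain of 301s until nothing matches. Here is a minimal sketch of that chain, again transcribing the patterns into Python’s re module. The helper name follow_redirects and the max_hops safety limit are my own inventions for the example, not anything Apache provides.

```python
import re

DATE = r"([0-9][0-9][0-9][0-9]/[0-9][0-9])"

# Transcribed from the "Change underscores to hyphens" rules; $1 becomes \1.
RULES = [
    (r"^/category/([^_]*)_([^_]*)_([^_]*)_(.*)$", r"/category/\1-\2-\3-\4/"),
    (r"^/category/([^_]*)_([^_]*)_(.*)$", r"/category/\1-\2-\3/"),
    (r"^/category/([^_]*)_(.*)$", r"/category/\1-\2/"),
    (r"^/" + DATE + r"/([^_/]*)_([^_/]*)_([^_/]*)_(.*)/(.+)$",
     r"/\1/\2-\3-\4-\5/\6"),
    (r"^/" + DATE + r"/([^_/]*)_([^_/]*)_(.*)/(.+)$", r"/\1/\2-\3-\4/\5"),
    (r"^/" + DATE + r"/([^_/]*)_(.*)/(.+)$", r"/\1/\2-\3/\4"),
]


def follow_redirects(path, max_hops=10):
    """Apply the first matching rule repeatedly, like a browser following 301s.

    Returns the final path and the number of redirects it took to get there.
    """
    hops = 0
    while hops < max_hops:
        for pattern, target in RULES:
            match = re.match(pattern, path)
            if match:
                path = match.expand(target)
                hops += 1
                break
        else:
            # No rule matched, so the server would hand this to WordPress.
            return path, hops
    return path, hops


if __name__ == "__main__":
    for sample in (
        "/2008/04/big_string_like_this/feed/",
        "/2008/04/a_b_c_d_e_f/feed/",
        "/2008/04/big_string_like_this/",
        "/category/blog_business",
    ):
        print(sample, "->", follow_redirects(sample))
```

A slug with more than three underscores takes more than one hop, and a plain post URL with nothing after the slug is left alone, as intended. One thing the sketch makes visible: the category rules always append a slash, so a category request that already ends in one comes out with a doubled slash. That may be harmless on your setup, but it is worth checking.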
Then I found one more problem. Movable Type makes sure that permalink slugs are globally unique, and posts with the same titles get tagged with “-1,” “-2,” etc. WordPress doesn’t care if they’re globally unique, as long as they’re not in the same month. So I had to watch my logs for 404s, then go back and rename the posts. At this point the Redirection plugin was invaluable. It automatically builds a 301 redirect from old name to new when you retitle a post… so I could change all those post names back to the Movable Type default without worrying about blocking new links that pointed to the WordPress format.
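Watching the logs by eye gets old fast, so a small script can do the counting. This is a sketch only, assuming your access log is in Apache’s Common or Combined Log Format; the function name count_404s and the sample lines are invented for the example.

```python
import re
from collections import Counter

# Picks the request path and status code out of a Common/Combined Log
# Format line, e.g. ... "GET /some/path HTTP/1.1" 404 512 ...
LOG_LINE = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*" (\d{3}) ')


def count_404s(lines):
    """Count how often each requested path came back as a 404."""
    misses = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group(2) == "404":
            misses[match.group(1)] += 1
    return misses


# Made-up sample lines standing in for a real access log.
SAMPLE_LOG = [
    '1.2.3.4 - - [10/Apr/2008:12:00:00 -0700] "GET /2008/04/my-post-1/ HTTP/1.1" 404 512 "-" "Mozilla"',
    '1.2.3.5 - - [10/Apr/2008:12:00:05 -0700] "GET /2008/04/my-post-1/ HTTP/1.1" 404 512 "-" "Mozilla"',
    '1.2.3.6 - - [10/Apr/2008:12:00:09 -0700] "GET /2008/04/other-post/ HTTP/1.1" 200 9001 "-" "Mozilla"',
]

if __name__ == "__main__":
    for path, hits in count_404s(SAMPLE_LOG).most_common(20):
        print(hits, path)
```

To run it against a real log, pass an open file instead of the sample list. The paths that show up most often are the posts to rename first.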
I was going to go into a discussion of the relative benefits of dynamic vs. static publishing, and what a wonderful plugin WP Super Cache is, but this post has run on way too long, so that will have to wait for another day. Until then, here’s an interesting discussion on the subject.
Author: Matt Moore
[...] other accomplishment is forwarding all my old posts. After reading this post I found from another valuable blog, I was determined to somehow forward all my old, outdated posts [...]
Excellent post. I’m sure this will save me some time in the future.
Come back soon, I’m going to update it with some things I learned helping asuh.com implement this.
Hello, I’ve been trying to migrate an old MT blog with numerical permalinks to Wordpress, and am aware about using the old version of the import script. My problem is that this old version won’t import MT tags or keywords.
Did you have this problem, and if so, how did you get round it? I’m a bit at my wits’ end - the old version of the import script lets you keep your MT postIDs, the new version (WP 2.5) converts your MT keywords to tags, but I don’t know of any version that lets you do both!
That sucks. Sorry, I didn’t use keywords or tags in MT (just categories). Someone is going to have to figure out a way to make the newer MT import script respect post IDs, and that someone will not be me. I tried and the code is just too convoluted.
Thanks anyway, I just had to ask.
Moving from Movable Type to Wordpress and redesigning there has been way more difficult than setting up my Movable Type blog ever was! (I decided to try WP after MT 4.1 screwed up some stuff, but so far I’m not understanding the WP hype, sadly.)
I’ve totally bought into the hype (as you can tell). But I have a high tolerance for difficulty in setup… figuring these things out is actually one of the things I really enjoy.
WordPress, to me, seems like a great product for the extremes. You can use it out of the box and it’ll work great (if you’re not picky), and it’s also very configurable if you like to make things perfect. But if you’re in the middle, and want things just so, but also don’t want to hunt down a bunch of hacks and plugins, it could be a nightmare.
Yup, I’m sort of that person in the middle - I’m able to follow detailed instructions and fairly willing to hunt stuff down, and until recently this was enough for me to do what I wanted with Movable Type. Somehow I’m finding Wordpress much harder and more time-consuming to figure out.
Anyway, I look forward to your forthcoming update of this post with things you learnt on preserving permalinks - with 8 years’ worth of blog posts, getting this right is a huge priority for me!