So you’d like to upgrade to WordPress from Movable Type? It’s not very hard… if you’ve always had the same permalink structure. But if you’ve gone through the Movable Type versions from 1.5 to 2.x to 3.x and finally to 4.1 (as I have), then you’ve probably been through a couple of URL scheme revisions.
In my case, I started out with Movable Type’s default (and, at the time, only) permalink structure… the post ID (a six-digit number) followed by .html, in the archives directory to avoid cluttering up the top one. It looked like this: “http://tbotcotw.com/archives/000334.html”. Of course, from there I switched to .shtml, then quickly to .php, and, when MT finally supported it, to permalinks based on the post name (all the words, with underscores replacing the whitespace). At around the same time I read Mark Pilgrim’s excellent post on cruft-free and future-proof URLs in Movable Type. I’d finally found a URL structure that (I thought; it turns out the underscore was a bad idea) would stand the test of time: post name with underscores for whitespace and no extension.
This was exasperating, though, since switching to a new URL scheme in midstream meant that my permalinks were now not very perma. I was getting lots of 404s because people had linked to URLs that didn’t exist anymore. And that was hurting my Google page rank. I considered fixing the problem by publishing a page with a PHP Location header that would redirect seamlessly from the old URL to the new one (using a 301 status code, which means “moved permanently” and isn’t an error at all). But actually building those pages statically slowed down the already miserably slow interface to MT (my server is a piece of crap)… comments would take over a minute to be published when two or three individual entries and two category archives had to be built. So I had to wait until MT caught up and allowed dynamic publication. Then I created a new archive mapping for the old URL structure, and MT would automatically serve a page with the Location header whenever a browser requested an old page. Pretty hacky, but it worked very well.
Fast forward to today and I still get hits on old posts from links that someone wrote in 2002. I wanted to duplicate my redirect scheme after I switched blogging systems, but neither the MT export nor the WP import pays any attention to post ID, so I hacked both of them as I mentioned here. I had to use an old (2.1?) version of the import script in WordPress, but it worked just fine. Then I built a series of mod_rewrite rules to redirect from every one of the many URL structures I’ve ever used. The patterns start with a slash, which is what mod_rewrite matches against in the server or virtual host config (in an .htaccess file the leading slash is stripped before matching, so you’d drop it from each pattern):
# Convert old archives
RewriteRule ^/archives/([0-9]*)\.php$ /$1/ [R=301,L]
RewriteRule ^/archives/([0-9]*)\.html$ /$1/ [R=301,L]
RewriteRule ^/archives/([0-9]*)\.shtml$ /$1/ [R=301,L]
RewriteRule ^/archives/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.php$ /$1/ [R=301,L]
RewriteRule ^/archives/old/([0-9]*)\.php$ /$1/ [R=301,L]
RewriteRule ^/archives/cat_(.+)\.php$ /category/$1/ [R=301,L]
RewriteRule ^/archives/cat_(.+)\.html$ /category/$1/ [R=301,L]
RewriteRule ^/archives/cat_(.+)\.shtml$ /category/$1/ [R=301,L]
RewriteRule ^/archives/?$ /archive/ [R=301,L]
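# Anything else under /archives/ that is not a dated post, a six-digit post ID, a tag, or an author page is treated as a category
RewriteCond %{REQUEST_URI} !^/archives/[0-9][0-9][0-9][0-9]/[0-9][0-9]/?.*$ [NC]
RewriteCond %{REQUEST_URI} !^/archives/[0-9][0-9][0-9][0-9][0-9][0-9]\.?.*$ [NC]
RewriteCond %{REQUEST_URI} !^/archives/tag/.*$ [NC]
RewriteCond %{REQUEST_URI} !^/archives/author/.*$ [NC]
RewriteRule ^/archives/(.*)$ /category/$1 [R=301,L]
# Whatever is left just loses the /archives/ prefix
RewriteRule ^/archives/(.*)$ /$1 [R=301,L]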
This converts the old “/archives/000334.html” format to “/000334/”, changes “/archives/cat_blog_business.html” (another legacy MT convention) to “/category/blog_business/”, and removes the “/archives/” from every other URL. At this point the numeric requests are handled by a plugin called Permalink Redirect. Just put “%post_id%/” in the old permalink structure field and blamo… no more 404s, everybody who clicks an ancient link is sent to the correct post.
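If you’d like to sanity-check rules like these before touching the server, here is a minimal Python sketch that replays the patterns above against sample paths. The names `ARCHIVE_RULES`, `CATEGORY_EXCLUSIONS` and `redirect_target` are made up for this illustration, and it assumes Python’s `re` module treats these simple patterns the same way Apache’s regex engine does. It only computes the target path; it doesn’t send a 301.

```python
import re

# Patterns transcribed from the "Convert old archives" rules.
# Apache's $1 becomes \1 in Python. These names are hypothetical.
ARCHIVE_RULES = [
    (r"^/archives/([0-9]*)\.php$", r"/\1/"),
    (r"^/archives/([0-9]*)\.html$", r"/\1/"),
    (r"^/archives/([0-9]*)\.shtml$", r"/\1/"),
    (r"^/archives/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.php$", r"/\1/"),
    (r"^/archives/old/([0-9]*)\.php$", r"/\1/"),
    (r"^/archives/cat_(.+)\.php$", r"/category/\1/"),
    (r"^/archives/cat_(.+)\.html$", r"/category/\1/"),
    (r"^/archives/cat_(.+)\.shtml$", r"/category/\1/"),
    (r"^/archives/?$", r"/archive/"),
]

# The negated RewriteCond patterns: paths matching any of these are NOT categories.
CATEGORY_EXCLUSIONS = [
    r"^/archives/[0-9][0-9][0-9][0-9]/[0-9][0-9]/?.*$",
    r"^/archives/[0-9][0-9][0-9][0-9][0-9][0-9]\.?.*$",
    r"^/archives/tag/.*$",
    r"^/archives/author/.*$",
]


def redirect_target(path):
    """Return the path the first matching rule redirects to, or None."""
    for pattern, replacement in ARCHIVE_RULES:
        match = re.match(pattern, path)
        if match:
            return match.expand(replacement)
    # The two catch-all rules at the end of the list.
    rest = re.match(r"^/archives/(.*)$", path)
    if rest is None:
        return None
    excluded = any(re.match(p, path, re.IGNORECASE) for p in CATEGORY_EXCLUSIONS)
    prefix = "" if excluded else "/category"
    return prefix + "/" + rest.group(1)


print(redirect_target("/archives/000334.html"))                 # -> /000334/
print(redirect_target("/archives/cat_blog_business.html"))      # -> /category/blog_business/
print(redirect_target("/archives/tag/wordpress/"))              # -> /tag/wordpress/
```

Because the first matching rule wins, the order of the list matters here just as it does in the config.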
The next problem was the MT underscores to WP hyphens issue. It’s actually a good idea to use hyphens… Google doesn’t treat underscores as word separators (yet), so you’re throwing away a source of relevant keywords if your URL is a big_string_like_this. On the other hand, big-string-like-this gets your page into the index with those four keywords, which should help it rank for them. I quickly found that, in most circumstances, WordPress automatically converts underscores to hyphens. Only when the URL has something after the title, as in “/2008/04/big_string_like_this/feed/”, or when a category is involved, does WordPress fail to convert. Perhaps I should take the time to fix that in the code… until then, I’ll use these mod_rewrites:
# Change underscores to hyphens
# No slash is appended here: the last group already carries any trailing slash, so adding one would produce "//"
RewriteRule ^/category/([^_]*)_([^_]*)_([^_]*)_(.*)$ /category/$1-$2-$3-$4 [R=301,L]
RewriteRule ^/category/([^_]*)_([^_]*)_(.*)$ /category/$1-$2-$3 [R=301,L]
RewriteRule ^/category/([^_]*)_(.*)$ /category/$1-$2 [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9])/([^_/]*)_([^_/]*)_([^_/]*)_(.*)/(.+)$ /$1/$2-$3-$4-$5/$6 [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9])/([^_/]*)_([^_/]*)_(.*)/(.+)$ /$1/$2-$3-$4/$5 [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9])/([^_/]*)_(.*)/(.+)$ /$1/$2-$3/$4 [R=301,L]
This will ignore the underscores (and let WordPress handle them) unless it’s a category or there are characters after that last slash. Finally, I had to move all my syndication feeds to the new structure:
# Change to WordPress feeds
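One thing worth seeing in action is that each dated-permalink rule converts at most three underscores per redirect, so a longer slug takes more than one hop. Here is a rough Python sketch of that chain, using the three dated rules above. `UNDERSCORE_RULES` and `follow_redirects` are hypothetical names, and it again assumes Python’s `re` matches these patterns the way Apache does.

```python
import re

DATE = r"([0-9][0-9][0-9][0-9]/[0-9][0-9])"

# The three dated-permalink rules, most underscores first, as in the config.
UNDERSCORE_RULES = [
    (rf"^/{DATE}/([^_/]*)_([^_/]*)_([^_/]*)_(.*)/(.+)$", r"/\1/\2-\3-\4-\5/\6"),
    (rf"^/{DATE}/([^_/]*)_([^_/]*)_(.*)/(.+)$", r"/\1/\2-\3-\4/\5"),
    (rf"^/{DATE}/([^_/]*)_(.*)/(.+)$", r"/\1/\2-\3/\4"),
]


def follow_redirects(path, limit=10):
    """Apply the first matching rule repeatedly, like a browser following 301s."""
    hops = [path]
    for _ in range(limit):
        for pattern, replacement in UNDERSCORE_RULES:
            match = re.match(pattern, path)
            if match:
                path = match.expand(replacement)
                hops.append(path)
                break
        else:
            return hops  # no rule matched, so the chain ends here
    return hops


# Three underscores: one hop.
print(follow_redirects("/2008/04/big_string_like_this/feed/"))
# Nothing after the slug: no rule matches, WordPress is left to handle it.
print(follow_redirects("/2008/04/big_string_like_this/"))
# Five underscores: two hops.
print(follow_redirects("/2008/04/a_b_c_d_e_f/feed/"))
```

The extra hops are harmless for visitors, though each one is another round trip, so slugs with lots of underscores pay a small price.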
RewriteRule ^/atom$ /wp-atom.php [R=301,L]
RewriteRule ^/atom\.xml$ /wp-atom.php [R=301,L]
RewriteRule ^/index\.atom$ /wp-atom.php [R=301,L]
RewriteRule ^/index\.rdf$ /wp-rdf.php [R=301,L]
RewriteRule ^/index\.xml$ /wp-rss2.php [R=301,L]
RewriteRule ^/rss\.xml$ /wp-rss.php [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.atom$ /$1/feed/ [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.atom/$ /$1/feed/ [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.xml$ /$1/feed/ [R=301,L]
RewriteRule ^/([0-9][0-9][0-9][0-9]/[0-9][0-9]/.*)\.xml/$ /$1/feed/ [R=301,L]
Nothing complicated there.
Then I found one more problem. Movable Type makes sure that permalink slugs are globally unique, and posts with the same titles get tagged with “-1,” “-2,” etc. WordPress doesn’t care if they’re globally unique, as long as they’re not in the same month. So I had to watch my logs for 404s, then go back and rename the posts. At this point the Redirection plugin was invaluable. It automatically builds a 301 redirect from old name to new when you retitle a post… so I could change all those post names back to the Movable Type default without worrying about blocking new links that pointed to the WordPress format.
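Watching the logs by hand gets old fast. A small script can tally the 404s for you. This is only a sketch: it assumes Apache’s common or combined log format, the function name `missing_paths` is invented for the example, and the sample lines are made up.

```python
import re
from collections import Counter

# Picks the request path and status code out of a common/combined log line.
LOG_LINE = re.compile(r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def missing_paths(lines):
    """Count how often each requested path came back as a 404."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            counts[match.group("path")] += 1
    return counts


# Hypothetical sample lines, not taken from a real log.
sample = [
    '203.0.113.5 - - [12/Apr/2008:10:01:22 -0700] "GET /2008/04/hello-world-1/ HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [12/Apr/2008:10:02:40 -0700] "GET /2008/04/hello-world-1/ HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [12/Apr/2008:10:03:05 -0700] "GET /2008/04/hello-world/ HTTP/1.1" 200 8841 "-" "Mozilla/5.0"',
]

for path, hits in missing_paths(sample).most_common():
    print(hits, path)
```

To run it against a real log, pass an open file object instead of the sample list. The paths with the most hits are the posts worth renaming first.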
I was going to go into a discussion of the relative benefits of dynamic vs. static publishing, and what a wonderful plugin WP Super Cache is, but this post has run on way too long, so that will have to wait for another day. Until then, here’s an interesting discussion on the subject.