Converting Your WordPress Blog to an eBook (Part 2)

Now that you have all the tools you need, we can start building your blog into an eBook. Today we’ll cover exporting your WordPress posts and converting them to an HTML webpage. On Thursday we’ll add a table of contents to that page and convert it into our final format.

Exporting Your WordPress Blog:

WordPress offers both a free and paid export service for transferring your WordPress blog. For our purposes, the XML file created by the free service is all we’ll need.

An XML (eXtensible Markup Language) file is the backbone data file of much of the web. Simply put, XML allows data to be arranged in a flexible and organized way. XML is also designed to be used with XSL stylesheets, which convert that raw data into something we can use, such as a spreadsheet, PDF file, or webpage. An XSL stylesheet is a special kind of XML file that provides instructions for organizing and rewriting the XML data into a new format. For the purposes of today’s post, I have written a stylesheet specifically for the WordPress XML export, so you won’t have to do any of the coding heavy lifting.
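To make the idea concrete, here is a minimal Python sketch of the kind of rewriting a stylesheet performs. It is not the stylesheet used in this post and it is not XSL; the tiny `<posts>` document and the `posts_to_html` helper are made up for illustration, and a real WordPress export is a good deal more elaborate.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample data, far simpler than a real WordPress export.
SAMPLE_XML = """<posts>
  <post><title>First Post</title><body>Hello, world.</body></post>
  <post><title>Second Post</title><body>Cats &amp; dogs.</body></post>
</posts>"""


def posts_to_html(xml_text):
    """Rewrite each <post> as an HTML heading followed by a paragraph."""
    root = ET.fromstring(xml_text)
    parts = ["<html><body>"]
    for post in root.findall("post"):
        title = post.findtext("title", default="")
        body = post.findtext("body", default="")
        # escape() keeps characters like & and < valid in the HTML output.
        parts.append("<h1>" + escape(title) + "</h1>")
        parts.append("<p>" + escape(body) + "</p>")
    parts.append("</body></html>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(posts_to_html(SAMPLE_XML))
```

The same data could just as easily be rewritten as a spreadsheet row or a PDF paragraph; only the instructions change, which is the job the stylesheet does for us.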

Step One: From the WordPress Dashboard, select “Tools–>Export”.

Step Two: You’ll be presented with a screen that looks like the one below. Click on the “Export” option (the free one).

[Screenshot: shot-20130105T130048]

Step Three: From here you can choose to export your blog’s full content (pages, comments, posts, etc.) or just posts within a certain range. The XSL stylesheet I wrote for this project will write out posts only, though you can modify it to include all content if you wish. For this example I’m exporting posts from January 2012 – December 2012. When you’ve selected the range and content you want, click “Download Export File”. The file will have a name something like [name of blog].wordpress.[date of export].xml

[Screenshot: shot-20130105T130140]
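If you’d like to double-check what actually ended up in the export before converting it, a short script can count the entries by type. This is only a sketch under some assumptions: that the export is an RSS-style file whose entries are `<item>` elements, each carrying a `post_type` child in a `wp` namespace. The namespace URL in the sample is my guess and may differ between WordPress versions, so the code matches on the tag name alone. `count_post_types` and `local_name` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def count_post_types(xml_text):
    """Count <item> elements in an export by their post_type child."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for item in root.iter("item"):
        kind = "unknown"
        for child in item:
            if local_name(child.tag) == "post_type":
                kind = (child.text or "").strip() or "unknown"
        counts[kind] += 1
    return counts


# Hypothetical, heavily trimmed stand-in for an export file.
# The wp namespace URL is an assumption and may vary by version.
SAMPLE_EXPORT = """<rss xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item><title>Hello</title><wp:post_type>post</wp:post_type></item>
    <item><title>About</title><wp:post_type>page</wp:post_type></item>
    <item><title>Again</title><wp:post_type>post</wp:post_type></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for kind, number in count_post_types(SAMPLE_EXPORT).items():
        print(kind, number)
```

For a real export you would read the downloaded file’s text and pass it in place of `SAMPLE_EXPORT`.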

Converting Your Blog to HTML:

#include <std-disclaimer.h>

This is where you’ll use the XML Tools plugin you installed in Notepad++.

Step One: Open your [export].xml file in Notepad++.

Step Two: Download the XSL stylesheet from here. (Due to file type restrictions on WordPress, I had to change the extension on this file from .xsl to .xls. The XSL Transformation tool will recognize the file with the incorrect extension, though you may just want to change it back to the correct one, .xsl. The final file name should be wordpress_xml_parse.xsl.)

If you open this file in Notepad++ you’ll see a section toward the top that looks like this:

<head>
<title>[BTW] : 2012</title>
</head>

This is the webpage title (which will become your eBook title). You can change this to whatever you like, and then save the stylesheet as an XSL file.

Note: This XSL stylesheet borrows code for neat paragraphs from http://www.gslsrc.net/l003_html_paras_with_xslt.html. For the techies out there, I might post a more detailed explanation of how the XSL file works at a later date. If you’re interested, comment below.

Step Three: Select “Plugins–>XML Tools–>XSL Transformation”.

Step Four: You’ll see a dialog box called “XSL Transformation Settings”. Click the browse button to select the stylesheet you just downloaded. Once it’s selected, click “Transform”. The transformation may take several minutes.

[Screenshot: shot-20130105T130344]

Step Five: Your finished file will be created in a new tab in Notepad++ (plain old notepad has no tabs 😦 ). Click “File–>Save As” to save this output. Choose a file name like [Your_Blog_Name.html]. It’s important to use the (.html) or (.htm) extension on this file.

[Screenshot: shot-20130105T130419]
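If the transformation stops with a parsing error instead of producing a new tab, one thing worth ruling out is that the export file itself isn’t well-formed XML (for example, a tag prefix with no matching namespace declaration). Here is a rough Python sketch that reports where a parser first trips. I’m assuming the plugin’s complaint is a well-formedness problem, which may not be the case, and `find_parse_error` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def find_parse_error(xml_text):
    """Return None if the text is well-formed XML, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


if __name__ == "__main__":
    # Hypothetical snippets: the second uses an 'atom:' prefix
    # that is never declared with an xmlns:atom attribute.
    good = "<rss><channel><item/></channel></rss>"
    bad = "<rss><atom:link/></rss>"
    print(find_parse_error(good))
    print(find_parse_error(bad))
```

This only checks that the file can be parsed at all. A file can pass this check and still not match what the stylesheet expects.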

Saving Your Blog Pictures Locally:

The HTML file you just created will have the title, date published, and full text of each post in your exported file. It will also contain links to any images you uploaded to those posts. In order to create an eBook, you’ll need local copies of those images, rather than the links currently in the file. Fortunately, doing this is pretty easy.

Step One: Open up the HTML file you created in your favorite browser (I’m using Google Chrome).

Step Two: Once the webpage is open, select the “Save Page As” option (in Chrome you can get to this menu by clicking the three lines at the top-right).

[Screenshot: shot-20130105T130625]

Step Three: You can save the webpage just like any other file. Select “Web Page, Complete” (or its equivalent in other browsers). When the save finishes, you should see your HTML file, plus a folder with a name like [name of html file]_files. This folder will have all your image files. Saving will take a while if you’ve uploaded a lot of pictures, so make sure it’s done before closing the browser.

[Screenshot: shot-20130105T132207]
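To confirm the saved page really points at local copies, you could list any image links that still start with a web address. This is a minimal sketch using Python’s built-in HTML parser; `ImageCollector` and `remote_images` are hypothetical names, and the sample HTML is made up. It only looks at `<img>` tags with an `http` or `https` source.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def remote_images(html_text):
    """Return image sources that still point at a web address."""
    collector = ImageCollector()
    collector.feed(html_text)
    return [src for src in collector.sources
            if urlparse(src).scheme in ("http", "https")]


# Hypothetical sample: one remote image, one already saved locally.
SAMPLE_HTML = (
    '<p><img src="http://example.com/a.jpg">'
    '<img src="My_Blog_files/b.jpg" /></p>'
)

if __name__ == "__main__":
    for src in remote_images(SAMPLE_HTML):
        print(src)
```

An empty result for your saved file suggests every picture now lives in the _files folder.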

———————————

That’s it for today. In the next post I’ll show you how to add a table of contents, convert your images to eBook size, and construct your eBook. We covered a lot today, so any questions or comments?


Filed under Books + Publishing, Internal Debate 42, Trube On Tech, Writing

17 responses to “Converting Your WordPress Blog to an eBook (Part 2)”

  1. Pingback: Converting Your WordPress Blog to an eBook (Part 3) | [BTW] : Ben Trube, Writer

  2. Did anyone else have a problem with invalid code? I added
    xmlns:atom=”http://www.w3.org/1999/xhtml” after the other namespace declarations,
    for both my exported blog and the stylesheet. I ran it through the validator. It worked after that. This is a known WordPress problem. Thanks for the info! I am trying to get video too! Any ideas? It is a standard YouTube embed that works well in WordPress:

    Thanks for the great work
    Tom

  3. I don’t know if your comments allow code, but here is an attempt to show the raw html with spaces added: “”

  4. code with no greater than signs: iframe width=”560″ height=”315″ src = “http:// http://www.youtube.com / embed / tJbT1VkXo5k” frameborder=”0″ allowfullscreen /iframe

    • I’ll have to try the YouTube code. Most of my posts don’t have video content (and since it’s an eBook, most eReaders don’t really handle it, though the Fire and iPad might). I didn’t have a problem with invalid code, though admittedly my standard is: does the page render, and does the eBook render (and work properly on an eReader)? I know HTML can often still work with “errors”, one of the frustrating/interesting features of that language. I’m probably going to add your namespace line to the uploaded code sometime later today (cause after all it doesn’t really hurt anything). What are you using for validation checking? Thanks for actually trying the procedure. It’s helpful to have others try it, so I can refine it for any errors like this one.

  5. I’m getting the following error after running the XSL Transformation plugin:

    —————————
    XML Tools plugin
    —————————
    Unable to apply transformation on current source.
    Error occured in source parsing.
    —————————
    OK
    —————————

    The WordPress blog is self-hosted, is up-to-date as far as the WordPress software version, and is set up as a multi-site blog network. I’m merely exporting all of a single blog’s “Published” posts.

    Any idea what might be wrong?

    • I wrote the xsl about a year ago, so it’s possible WordPress’ export format has changed in that time (or you are using an older version for your self-hosted version). The easiest way for me to debug would be to take a look at your xml export file and see if there are any red flags. If you don’t want to send the whole file, you can maybe export a month and we can go from there. I’ll test this at some point with my blog and see if the code still works or needs updating. I’m using the free site so that may also be part of the difference too. You can send any files to bentrubewriter@gmail.com. Thanks!

  6. You just saved me about a month of editing work! \o/ Thank you!

  7. This worked like a CHARM. Thank you so much for the instructions!

  8. I’ve been looking for an easy way to export my blog to ebook format… thanks for these instructions, Ben!

  9. Hi Ben, your step-by-step tutorial works great. Thanks indeed. In my Blog export I would like to grab the comments also. Do you think there is a chance to extend the XSL transformation to evaluate the comments? Kind regards, Chris

  10. Pingback: Trube on Tech: WordPress eBooks and Star Trek Games | [BTW] : Ben Trube, Writer
