Now that you have all the tools you need, we can start turning your blog into an eBook. Today we’ll cover exporting your WordPress posts and converting them to an HTML webpage. On Thursday we’ll add a table of contents to that page and convert it into our final format.
Exporting Your WordPress Blog:
WordPress offers both a free and paid export service for transferring your WordPress blog. For our purposes, the XML file created by the free service is all we’ll need.
An XML (eXtensible Markup Language) file is the backbone data file of much of the web. Simply put, XML allows data to be arranged in a flexible and organized way. XML is also designed to be used with XSL stylesheets, which convert that raw data into something we can use, such as a spreadsheet, PDF file, or webpage. An XSL stylesheet is a special kind of XML file that provides instructions for organizing and rewriting the XML data into a new format. For today’s post, I have written a stylesheet specifically for the WordPress XML export, so you won’t have to do any of the heavy lifting on the coding side.
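If you’re curious what “rewriting XML data into a new format” looks like in practice, here is a minimal sketch. It is written in Python rather than XSL, and the blog, post, title, and body tags are made up for illustration. They are not the real WordPress export format, and this is not the stylesheet used in this post.

```python
import xml.etree.ElementTree as ET
from html import escape

# A tiny made-up XML document standing in for a blog export.
# These tag names are hypothetical, not the real WordPress format.
SAMPLE_XML = """<blog>
  <post><title>First Post</title><body>Hello, world.</body></post>
  <post><title>Second Post</title><body>Tea &amp; biscuits.</body></post>
</blog>"""


def posts_to_html(xml_text):
    """Rewrite each <post> as an HTML heading followed by a paragraph."""
    root = ET.fromstring(xml_text)
    parts = []
    for post in root.findall("post"):
        title = post.findtext("title", default="")
        body = post.findtext("body", default="")
        # escape() keeps characters like & and < safe inside HTML
        parts.append("<h2>" + escape(title) + "</h2>")
        parts.append("<p>" + escape(body) + "</p>")
    return "\n".join(parts)


html_output = posts_to_html(SAMPLE_XML)
print(html_output)
```

The same data goes in as XML and comes out as HTML. An XSL stylesheet describes this kind of mapping too, but as a set of templates rather than a loop.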
Step One: From the WordPress Dashboard, select “Tools–>Export”.
Step Two: You’ll be presented with a screen that looks like the one below. Click on the “Export” option (the free one).
Step Three: From here you can choose to export your blog’s full content (pages, comments, posts, etc.) or just posts within a certain range. The XSL stylesheet I wrote for this project will write out posts only, though you can modify it to include all content if you wish. For this example I’m exporting posts from January 2012 to December 2012. When you’ve selected the range and content you want, click “Download Export File”. The file will have a name something like [name of blog].wordpress.[date of export].xml
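Before moving on, you may want to check that the export contains the posts you expected. The sketch below is one way to do that in Python. It assumes each post is stored as an item element whose wp:post_type is "post". The namespace address is also an assumption that can differ between WordPress versions, so compare it with the xmlns:wp value near the top of your own file. The sample data is made up.

```python
import xml.etree.ElementTree as ET

# Assumed namespace; check the xmlns:wp attribute in your own export file.
NS = {"wp": "http://wordpress.org/export/1.2/"}

# Made-up sample shaped like a very small export file.
SAMPLE_EXPORT = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item><title>Hello</title><wp:post_type>post</wp:post_type></item>
    <item><title>About</title><wp:post_type>page</wp:post_type></item>
    <item><title>Again</title><wp:post_type>post</wp:post_type></item>
  </channel>
</rss>"""


def post_titles(xml_text):
    """Return the titles of every item whose post type is 'post'."""
    root = ET.fromstring(xml_text)
    titles = []
    for item in root.iter("item"):
        post_type = item.findtext("wp:post_type", default="", namespaces=NS)
        if post_type == "post":
            titles.append(item.findtext("title", default=""))
    return titles


titles = post_titles(SAMPLE_EXPORT)
print(len(titles), "posts found:", titles)
```

To try it on your own export, you could replace ET.fromstring(xml_text) with ET.parse("your-file.xml").getroot(), where the file name is whatever WordPress gave you.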
Converting your Blog to HTML:
This is where you’ll use the XSL Tools you installed in Notepad++.
Step One: Open your [export].xml file in Notepad++.
Step Two: Download the XSL stylesheet from here. Due to file type restrictions on WordPress, I had to change the extension on this file from (.xsl) to (.xls). The XSL Transformation tool will recognize the file even with the incorrect extension, though you may want to change it back to the correct one (.xsl). The final file name should be wordpress_xml_parse.xsl.
If you open this file in Notepad++ you’ll see a section toward the top that looks like this:
<title>[BTW] : 2012</title>
This is the webpage title (which will become your eBook title). You can change this to whatever you like, and then save the stylesheet as an XSL file.
Note: This XSL stylesheet borrows code for neat paragraphs from http://www.gslsrc.net/l003_html_paras_with_xslt.html. For the techies out there, I might post a more detailed explanation of how the XSL file works at a later date. If you’re interested, comment below.
Step Three: Select “Plugins–>XML Tools–>XSL Transformation”.
Step Four: You’ll see a dialog box called “XSL Transformation Settings”. Click the “…” button to browse to the stylesheet you just downloaded. Once it’s selected, click “Transform”. The transformation may take several minutes.
Step Five: Your finished file will be created in a new tab in Notepad++ (plain old Notepad has no tabs 😦 ). Click “File–>Save As” to save this output. Choose a file name like [Your_Blog_Name].html. It’s important to use the (.html) or (.htm) extension on this file.
Saving your Blog Pictures Locally:
The HTML file you just created will have the title, date published, and full text of each post in your exported file. It will also contain links to any images you uploaded to those posts. To create an eBook, you’ll need local copies of those images rather than the linked versions currently in the file. Fortunately, doing this is pretty easy.
Step One: Open up the HTML file you created in your favorite browser (I’m using Google Chrome).
Step Two: Once the webpage is open, select the “Save Page As” option (in Chrome you can get to this menu by clicking the three lines on the top-right).
Step Three: You can save the webpage just like any other file. Select “Web Page, Complete” (or its equivalent in other browsers). When the save finishes, you should see your HTML file plus a folder with a name like [name of html file]_files. This folder will hold all your image files. Saving will take a while if you’ve uploaded a lot of pictures, so make sure it’s done before closing the browser.
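If you want to double-check that the images really were saved locally, one option is to list the image links in the saved page and see which ones still point at the web. The Python sketch below does that with a made-up HTML sample. The class and function names are my own for illustration, and the assumption is that anything starting with http:// or https:// is still a remote link.

```python
from html.parser import HTMLParser


class ImageLinkCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def image_sources(html_text):
    """Return every image link found in the HTML text, in order."""
    collector = ImageLinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.sources


def remote_images(sources):
    """Keep only the links that still point at the web."""
    return [s for s in sources if s.startswith(("http://", "https://"))]


# Made-up sample: one remote image and one already saved locally.
SAMPLE_HTML = (
    '<h2>Post</h2><p>Text <img src="http://example.com/a.jpg" alt="a"></p>'
    '<img src="b_files/b.png"/>'
)

sources = image_sources(SAMPLE_HTML)
print("All images:", sources)
print("Still remote:", remote_images(sources))
```

On a page saved with “Web Page, Complete”, I would expect most image links to point into the _files folder, so a long “still remote” list is a hint that something did not finish saving.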
That’s it for today. In the next post I’ll show you how to add a table of contents, convert your images to eBook size, and construct your eBook. We covered a lot today. Any questions or comments?