How to back up your wordpress.com images and attachments

This post is about catching all the files that are not included in the export that wordpress provides for basic wordpress.com accounts (see https://en.support.wordpress.com/com-vs-org for the differences between wordpress.com and wordpress.org, and in particular https://en.support.wordpress.com/export for the export options at wordpress.com).

Essentially, hitting export under wp admin -> tools -> export creates an xml file in the so-called “WordPress eXtended RSS” (WXR) format. That file is meant to contain all of your content and, in turn, also the links to all of your files. Since (well-formed) xml is truly machine readable, we can extract those (https) links in order to back up all the files.

There are a couple of ways to extract the links and grab the files. I just use linux shell utilities, which keeps everything in one small and simple call. Grabbing the files is of course a dedicated wget job, while the link extraction can be done with either grep or xmllint, whichever you prefer in terms of availability and effort. The basic difference is that grep is line oriented and will therefore only succeed as long as a link, including its opening and closing wp:attachment_url tags, does not span more than one line.

xmllint, on the other hand, will always catch the text node successfully, no matter how many newlines surround the tags. Anyway, since the export currently keeps each link on a single line, this is not an issue: grep may be safely used and will be much faster for large image and attachment collections.
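To illustrate the line-oriented limitation, here is a minimal sketch on made-up sample data. The helper names sample_wxr and extract_with_grep and the example.files.wordpress.com URLs are assumptions for this illustration only, not taken from a real export; the second link is deliberately wrapped across lines. It needs a grep with -P (PCRE) support, such as GNU grep.

```shell
# Hypothetical sample: two attachment entries, the second wrapped across lines.
sample_wxr() {
    printf '%s\n' \
        '<wp:attachment_url>https://example.files.wordpress.com/2015/01/a.jpg</wp:attachment_url>' \
        '<wp:attachment_url>' \
        'https://example.files.wordpress.com/2015/01/b.jpg' \
        '</wp:attachment_url>'
}

# Line-oriented extraction: matches only when the opening tag and the URL
# sit on the same line.
extract_with_grep() {
    grep -oP '(?<=wp:attachment_url>)[^<]+'
}

# Prints only the a.jpg link; the wrapped b.jpg link is silently missed.
sample_wxr | extract_with_grep
```

If your export ever starts wrapping the links like the second entry, that is the point where switching to xmllint pays off.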

Having the WXR file at hand, we may proceed using a script as follows:

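  • grep style file grabbing
    # assumption: the WXR export is passed as the first argument;
    # the path is made absolute because we cd into the backup dir below
    BKP_FILE=$(realpath "$1")
    BKP_DIR=/home/.../Wordpress/$(date +"%Y-%m-%d")
    if [ -f "$BKP_FILE" ]; then
        mkdir -p "$BKP_DIR"
        # keep a copy of the export next to the downloaded files
        cp "$BKP_FILE" "$BKP_DIR"
        cd "$BKP_DIR" || exit 1
        # extract the links and let wget rebuild the directory structure (-x)
        grep -oP '(?<=wp:attachment_url>)[^<]+' "$BKP_FILE" | wget -xi -
    else
        echo "File not found"
    fi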
  • xmllint style file grabbing
    # just exchange the grep line like so, a one-liner, may be wrapped here
    # takes one hack to read the namespaced tag and another to have lined output
    # (the <( ... ) process substitution needs bash, the \n in sed needs GNU sed)
    xmllint --xpath "//*[local-name()='attachment_url']/text()" <(sed 's/<\/wp:attachment_url>/\n<\/wp:attachment_url>/g' "$BKP_FILE") | wget -xi -
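Before starting a large download it can help to look at the link list first. The following is only a sketch of that idea, not part of the script above: the function name list_attachment_urls and the file name urls.txt are made up for this example, and it reuses the grep expression from the first variant.

```shell
# List the unique attachment URLs of the WXR file given as $1, one per line.
list_attachment_urls() {
    grep -oP '(?<=wp:attachment_url>)[^<]+' "$1" | sort -u
}

# Possible usage (commented out so that nothing is downloaded by accident):
# list_attachment_urls "$BKP_FILE" > urls.txt   # review this file first
# wc -l < urls.txt                              # number of files to fetch
# wget -xi urls.txt                             # then grab them
```

Writing the list to a file also leaves you with a record of what each backup was supposed to contain.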

… I did not know that I already had so many files over at wordpress.com.

Enjoy, Peter

Repositioning the wordpress start and posts pages

I was recently asked to provide a general, description-oriented start page on another wordpress blog. Gary Barrett pointed me in the right direction with this article on the wordpress support portal. The trick is to set up another page in the Pages view and configure it to have the lowest order index. Afterwards you proceed to the Settings / Reading view and change the selection under Front page displays / Front page to the item you just created. So far so good: on reloading the blog you’ll now find your new start page on initial display.

However, while changing the settings under Settings / Reading / Front page displays I already wondered what to enter in the Front page displays / Posts page select list. I expected to find an entry such as posts or default-posts or whatever, but to no avail. In fact, leaving this selection empty removes the posts page from the blog completely. Stuck! Driven by a flash of intuition (ha ha) I tried the following: I created another (empty) page in the Pages view, not forgetting to set the order index accordingly. Thereafter, I proceeded to the Settings / Reading view again and set Front page displays / Posts page to this new (empty) item. And it worked! That is, wordpress obviously just needs another (empty) container page to put the posts page into.

Another feature I used for the new start page is the page template, which you’ll find along with the order index in the Pages view. The template I picked removes all the sidebar widget stuff from the start page, which I find much more comfortable. Please note, however, that the availability and the layout of page templates depend on the theme of the blog.

Have fun!