Preserving a WordPress site using the WP REST API – the WordPress database and parsing taxonomies

To make the most out of my notes, you should have your own sites (source and destination) set up for testing and analyzing the WordPress database content.

If you’re intend to try out something from my notes, you should have your own site to try things out against, or at least have access to a site you can dig into without any legal issues. You will also need to set up a destination site somewhere, and I recommend that you do it on a virtual machine with shell access somewhere (Oracle Cloud Free Tier is a forever-free alternative – and no, I’m not getting paid for recommending them, I just use and like their services – for stability AVOID their old AMD machines and only create machines on the Ampere A1 platform).
For just trying out some GET requests, I have set up lab.webit.nu and populated it with random content using wp-cli. This site might break at any time, so do not rely on it being available.

Some commands I give as examples has to be run on a Linux/Unix machine. This might also be possible in windows using microsoft ubuntu.

The WordPress database tables

It’s now a good time to examine the WordPress database tables used to store the terms. The descriptions below are my own findings in my own words. I later found a resource on the WP Staging plugin home page:
https://wp-staging.com/docs/the-wordpress-database-structure/

wp_terms
Terms such as categories, tags and menu names

+------------+-----------------+------+-----+---------+----------------+
| Field      | Type            | Null | Key | Default | Extra          |
+------------+-----------------+------+-----+---------+----------------+
| term_id    | bigint unsigned | NO   | PRI | NULL    | auto_increment |
| name       | varchar(200)    | NO   | MUL |         |                |
| slug       | varchar(200)    | NO   | MUL |         |                |
| term_group | bigint          | NO   |     | 0       |                |
+------------+-----------------+------+-----+---------+----------------+

Note: in the database from a larger site with about 3500 terms I manage, I have not seen any other value than 0 (zero) in the ‘term_group’ field.

wp_term_taxonomy
Connects the terms (tags, categories with their names) to the taxonomy they belong to. This table also holds the ‘description’ and ‘parent’ fields for the term.
“category” and “post_tag” are the most used ones.
Every term should have the corresponding entry in this table.

+------------------+-----------------+------+-----+---------+----------------+
| Field            | Type            | Null | Key | Default | Extra          |
+------------------+-----------------+------+-----+---------+----------------+
| term_taxonomy_id | bigint unsigned | NO   | PRI | NULL    | auto_increment |
| term_id          | bigint unsigned | NO   | MUL | 0       |                |
| taxonomy         | varchar(32)     | NO   | MUL |         |                |
| description      | longtext        | NO   |     | NULL    |                |
| parent           | bigint unsigned | NO   |     | 0       |                |
| count            | bigint          | NO   |     | 0       |                |
+------------------+-----------------+------+-----+---------+----------------+

wp_term_relationships
Connect page or post with term
object_id: post, page etc id (any object that supports tags or categories)
term_taxonomy_id: id in wp_term_taxonomy

+------------------+-----------------+------+-----+---------+-------+
| Field            | Type            | Null | Key | Default | Extra |
+------------------+-----------------+------+-----+---------+-------+
| object_id        | bigint unsigned | NO   | PRI | 0       |       |
| term_taxonomy_id | bigint unsigned | NO   | PRI | 0       |       |
| term_order       | int             | NO   |     | 0       |       |
+------------------+-----------------+------+-----+---------+-------+

Note: in the database from a larger site with about 3500 terms I manage, I have not seen any other value than 0 (zero) in the ‘term_order’ field.

wp_termmeta
Additional data for term items. This table is used by plugins.

+------------+-----------------+------+-----+---------+----------------+
| Field      | Type            | Null | Key | Default | Extra          |
+------------+-----------------+------+-----+---------+----------------+
| meta_id    | bigint unsigned | NO   | PRI | NULL    | auto_increment |
| term_id    | bigint unsigned | NO   | MUL | 0       |                |
| meta_key   | varchar(255)    | YES  | MUL | NULL    |                |
| meta_value | longtext        | YES  |     | NULL    |                |
+------------+-----------------+------+-----+---------+----------------+

Adding the terms to the WordPress database

At this time, only the wp_terms and wp_term_taxonomy tables are to be populated. As I later will parse objects from the media library, I will convert the json response to an associative array for easier manipulation of the meta value for images (more on that later).
PHP has the function json_decode() that has the option to return an array instead of an object.
Below is incomplete code of my fully working code, as you read this, I assume you are able to put things together from my hints.

$db = mysqli_connect(your db connection details here);
$file = "your-file-with-10-categories-or-tags.json";
$jsondata = json_decode(file_get_contents($file),true);
print count($jsondata) . " items\n";
foreach($jsondata as $post)
{
  if (!empty($post['taxonomy']))
  {
    $parent = !empty($post['parent']) ? $post['parent'] : 0;
    $name = mysqli_real_escape_string($db,$post['name']);
    $desc = mysqli_real_escape_string($db,$post['description']);
    $sql1 = <<<EOM
INSERT IGNORE INTO wp_terms(term_id,name,slug)
 VALUES({$post['id']},"{$name}","{$post['slug']}");
EOM;
  print "$sql1\n";

    $sql2 = <<<EOM
INSERT IGNORE INTO wp_term_taxonomy(term_taxonomy_id,term_id,taxonomy,description,parent)
 VALUES ({$post['id']},{$post['id']},"{$post['taxonomy']}","{$desc}",{$parent});
EOM;
    print "$sql2\n";
  }
}

After this step, you will be able to see the categories in the wp-admin backend of the destination site.

Leave a Reply

Your email address will not be published. Required fields are marked *