Creating usable sample data for WordPress

This is a side-step from my series of notes on how to preserve a WordPress site using the WP REST API.

To have actual (but faked) content that I can share in examples, and that anyone can run the API queries against for testing and examining the results, I set up a sample site at lab.webit.nu.

What I wasn’t prepared for was that it’s hard to find useful test content to fill the site with (the official “Theme Unit Test” was more or less useless for my purpose). I finally found two candidates and decided on the second one, which I describe below.

wp-cli-fixtures seems really flexible, but refused to run because of conflicts in the code. I managed to “fix” these conflicts, but I still couldn’t get it to connect images to the posts. I also tested Faker, which ‘wp-cli-fixtures’ is based on, but it hasn’t been updated for many years and failed because the Flickr API usage has changed.

test-content-generator acts both as a plugin and as an extension to wp-cli. It has options to generate taxonomies (categories and tags together, with no way to specify only one of them), users (with a specified role, or randomized), images (from picsum.photos, in a specified size), posts with comments on them, and (from the WP backend only) pages.

Creating fake content for the test site

As mentioned (and true for all three alternatives for creating fake content), wp-cli is required, since the data creators are extensions to it.
It is simple enough to install according to the instructions on the wp-cli homepage:

## download and test
curl -O https://raw.githubusercontent.com/wp-cli/builds/gh-pages/phar/wp-cli.phar
php wp-cli.phar --info
## then if it works
chmod +x wp-cli.phar
sudo mv wp-cli.phar /usr/local/bin/wp

Install the ‘test-content-generator’ plugin as described:

wp plugin install test-content-generator --activate

To have the data itself created in the right order (and to not have to type it over again if I wipe the database), I created a script to do it:

#!/bin/sh
wp test users --amount=5 --role_keys=editor,author
wp test users --amount=15 --role_keys=subscriber
wp test users --amount=3 --role_keys=contributor
wp test users --amount=10
wp test taxonomies --amount=50
wp test images --amount=20 --image_width=1500 --image_height=410
wp test images --amount=20 --image_width=1500 --image_height=410
wp test images --amount=20 --image_width=1500 --image_height=410
wp test posts --amount=60
wp test comments --amount=100
wp test comments --amount=33

I wanted more comments than the maximum of 100 per command run, which is why the comments command appears twice. The limit for images is 20, so I ran that command three times to create enough different images for the posts. The four ‘wp test users’ commands at the beginning create a set amount of users with specific roles, and then add 10 more users with randomized roles.

I also uploaded 10 images from stocksnap.io the normal way through wp-admin, to have more images to examine the database content for. Five of these will be attached to each of the test pages I create using ‘test-content-generator’ from within wp-admin.
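If you prefer to stay on the command line for this as well, wp-cli can import local image files into the media library. A minimal sketch (the path is just a placeholder):

## import a folder of downloaded stock images into the media library
wp media import /tmp/stocksnap/*.jpg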

My next post will be the continuation of the series on how to preserve that WordPress site…

Preserving a WordPress site using the WP REST API

This is not a tutorial for someone who likes to copy/paste stuff. It’s my notes on how to recreate a WordPress site where the admin login has been lost but the site is still running.

Do not use the methods described here on any site you do not own or have permission to dig through. Doing intense wp-json queries might get you banned or cause the site problems (bandwidth or technical).

I have blocked wp-json access to this site (tech.webit.nu) because of my posts about how to collect content. I have however set up another site, lab.webit.nu, which you are allowed to try out some fetching commands on.

The only requirement is that (at least part of) the WP REST API (wp-json) is available on the site. This will let you access most of the content visible to those who visit the site using a web browser.

I came across a site that needs to be recovered/preserved which had all its users deleted (probably including all admins as well), and access to the post comments was not possible using the API. The comments will later be parsed out from the saved rendered posts of the site.

The focus is on preserving, not cloning. There are plugins available for cloning sites to a new location or domain, but those require admin access at both locations.

The WordPress REST API

Read someone else’s tutorial on this; there are a couple out there. I will only go into detail on which parts of the json output belong to which table in the WordPress database and how to get the content back where it belongs.
A few pages I stumbled on doing my research for this post:
This is a very short introduction to the API:
https://jalalnasser.com/wordpress-rest-api-endpoints/

Also, I found an article about the WordPress API on SitePoint:
https://www.sitepoint.com/wordpress-json-rest-api/

Another cloning/backup plugin (WP Migrate) claims to have the Ultimate Developer’s Guide to the WordPress Database

The WordPress REST API LinkedIn Course was probably the best resource I found to get started:
https://www.linkedin.com/learning/wordpress-rest-api-2
What I found confusing is that Morten used the term “Endpoint” for the METHOD and “Route” (which is correct) for the part of the URL following “wp-json”. With my limited knowledge of the subject, I will call GET/POST/DELETE the “method” (and I will only use GET), and I will use the term “Endpoint” or “Route” for the part of the URL after “wp-json”.

Begin digging

The most useful endpoints are, besides “posts” and “media”, “taxonomies” and “types”, which will give you all the taxonomies and post types to retrieve and parse for the parts that will be put back into a new database.
For a WordPress site without any custom post types or taxonomies, “taxonomies” will only be “categories” and “tags”, and “types” of interest will be “pages”, “posts” and “media” (“attachment”). If the site has a WooCommerce shop there are specific endpoints for product categories and tags.
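A minimal sketch of that first look, run against the lab site mentioned earlier (the file names are just examples):

## list everything the API exposes, then decide what to fetch
curl -s "https://lab.webit.nu/wp-json/wp/v2/taxonomies" > taxonomies.json
curl -s "https://lab.webit.nu/wp-json/wp/v2/types" > types.json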

Step 1: Post index

Luckily enough, the site I was going to preserve had a (more or less) complete index of the public posts (probably auto-generated by the theme template), so I was able to download the rendered HTML of each post as well as the json for each of them. I didn’t really need to save the json for each individual post, but the code I used for parsing the HTML pages will be used later when I go on to recreate the comments.
At this point I had HTML and json for each post (but none of the files or other content related to them).
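My download script was tailored to that specific index page, but the general idea can be sketched roughly like this (posts.txt with one post URL per line, the directory names and the slug-based lookup are assumptions for the example):

## posts.txt holds one post URL per line, taken from the index page
mkdir -p html json
while read -r url; do
  slug=$(basename "$url")
  curl -s "$url" > "html/${slug}.html"
  curl -s "https://lab.webit.nu/wp-json/wp/v2/posts?slug=${slug}" > "json/${slug}.json"
done < posts.txt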

Step 2: Get taxonomies (terms)

Taxonomies are, as described earlier, the tags and categories. These can be fetched all at once and saved to one file per type.
These can be easily inserted into the WordPress database.
There are two tables of interest in this step:
‘wp_terms’ (the words) and ‘wp_term_taxonomy’ (connecting each term to a taxonomy, and contains the description and setting for ‘parent’ for categories). A third table connecting the terms with the posts (‘wp_term_relationships’) will come in use when the posts are imported. Lastly, the table ‘wp_termmeta’ optionally contains more information for the terms (meta_key and meta_value added by plugins)
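Roughly how the fetching can be done (file names are my own examples; sites with more than 100 terms per taxonomy need the paging shown in step 3):

## one file per taxonomy (the WP REST API caps per_page at 100)
curl -s "https://lab.webit.nu/wp-json/wp/v2/categories?per_page=100" > categories.json
curl -s "https://lab.webit.nu/wp-json/wp/v2/tags?per_page=100" > tags.json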

Step 3: Get json for the posts

Although I already had these as separate json files, I now reworked my script to fetch the posts in batches, so I got them in batches of 10 and 100. The files with 100 posts per fetch form the complete set, and the files with 10 posts each will be used for testing further routines.
The API endpoint /posts is just for the post type of ‘post’.
As the ‘wp_posts’ table also contains the pages and media file information (post type “attachment”), these will have to be fetched in the next step.
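A sketch of how the batch fetching can be done (the X-WP-TotalPages response header is part of the API; the file naming is just an example):

## ask the API how many pages of 100 posts there are, then fetch them all
pages=$(curl -s -o /dev/null -D - "https://lab.webit.nu/wp-json/wp/v2/posts?per_page=100" \
        | tr -d '\r' | awk 'tolower($1)=="x-wp-totalpages:" {print $2}')
for page in $(seq 1 "$pages"); do
  curl -s "https://lab.webit.nu/wp-json/wp/v2/posts?per_page=100&page=${page}" > "posts-100-${page}.json"
done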

Step 4: Get json for pages

As in the previous steps, but now I get the pages. As most sites only have a small number of pages, I decided to save these as one item per file, to reduce the risk of parsing errors.
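A sketch of the one-file-per-page fetching (requires jq, and assumes fewer than 100 pages):

## list the page IDs, then save one file per page
for id in $(curl -s "https://lab.webit.nu/wp-json/wp/v2/pages?per_page=100" | jq '.[].id'); do
  curl -s "https://lab.webit.nu/wp-json/wp/v2/pages/${id}" > "page-${id}.json"
done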

Step 5: Get json for entries in the media library

This works like the other steps for getting posts, since the media items are also a post type (‘attachment’), just with some special fields (source URLs for the files). Media items were grabbed in batches of 100, as they are most likely to be problem free given the limited content of each entry.
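A sketch of fetching the first batch and pulling out the file URLs (the ‘source_url’ field points at the original upload; file names are my own examples):

## first batch of media entries, plus a list of the original file URLs
curl -s "https://lab.webit.nu/wp-json/wp/v2/media?per_page=100&page=1" > media-100-1.json
jq -r '.[].source_url' media-100-1.json > media-urls.txt
## wget -i media-urls.txt   # only if you also want to download the files themselves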

Parsing time

Now things get more complicated when we start to parse the data we got. This will be described in part 2 of this series of notes.
Part 2: The WordPress database and parsing taxonomies

Preserving a WordPress site using the WP REST API – the WordPress database and parsing taxonomies

To make the most out of my notes, you should have your own sites (source and destination) set up for testing and analyzing the WordPress database content.

If you intend to try out something from my notes, you should have your own site to try things out against, or at least have access to a site you can dig into without any legal issues. You will also need to set up a destination site somewhere, and I recommend that you do it on a virtual machine with shell access (Oracle Cloud Free Tier is a forever-free alternative – and no, I’m not getting paid for recommending them, I just use and like their services – for stability AVOID their old AMD machines and only create machines on the Ampere A1 platform).
For just trying out some GET requests, I have set up lab.webit.nu and populated it with random content using wp-cli. This site might break at any time, so do not rely on it being available.

Some commands I give as examples have to be run on a Linux/Unix machine. This should also be possible in Windows using Microsoft’s Ubuntu (WSL).

The WordPress database tables

Now is a good time to examine the WordPress database tables used to store the terms. The descriptions below are my own findings in my own words. I later found a resource on the WP Staging plugin home page:
https://wp-staging.com/docs/the-wordpress-database-structure/

wp_terms
Terms such as categories, tags and menu names

+------------+-----------------+------+-----+---------+----------------+
| Field      | Type            | Null | Key | Default | Extra          |
+------------+-----------------+------+-----+---------+----------------+
| term_id    | bigint unsigned | NO   | PRI | NULL    | auto_increment |
| name       | varchar(200)    | NO   | MUL |         |                |
| slug       | varchar(200)    | NO   | MUL |         |                |
| term_group | bigint          | NO   |     | 0       |                |
+------------+-----------------+------+-----+---------+----------------+

Note: in the database from a larger site with about 3500 terms I manage, I have not seen any other value than 0 (zero) in the ‘term_group’ field.

wp_term_taxonomy
Connects the terms (tags, categories with their names) to the taxonomy they belong to. This table also holds the ‘description’ and ‘parent’ fields for the term.
“category” and “post_tag” are the most used ones.
Every term should have the corresponding entry in this table.

+------------------+-----------------+------+-----+---------+----------------+
| Field            | Type            | Null | Key | Default | Extra          |
+------------------+-----------------+------+-----+---------+----------------+
| term_taxonomy_id | bigint unsigned | NO   | PRI | NULL    | auto_increment |
| term_id          | bigint unsigned | NO   | MUL | 0       |                |
| taxonomy         | varchar(32)     | NO   | MUL |         |                |
| description      | longtext        | NO   |     | NULL    |                |
| parent           | bigint unsigned | NO   |     | 0       |                |
| count            | bigint          | NO   |     | 0       |                |
+------------------+-----------------+------+-----+---------+----------------+

wp_term_relationships
Connects a page or post with a term
object_id: post, page etc id (any object that supports tags or categories)
term_taxonomy_id: id in wp_term_taxonomy

+------------------+-----------------+------+-----+---------+-------+
| Field            | Type            | Null | Key | Default | Extra |
+------------------+-----------------+------+-----+---------+-------+
| object_id        | bigint unsigned | NO   | PRI | 0       |       |
| term_taxonomy_id | bigint unsigned | NO   | PRI | 0       |       |
| term_order       | int             | NO   |     | 0       |       |
+------------------+-----------------+------+-----+---------+-------+

Note: in the database from a larger site with about 3500 terms I manage, I have not seen any other value than 0 (zero) in the ‘term_order’ field.

wp_termmeta
Additional data for term items. This table is used by plugins.

+------------+-----------------+------+-----+---------+----------------+
| Field      | Type            | Null | Key | Default | Extra          |
+------------+-----------------+------+-----+---------+----------------+
| meta_id    | bigint unsigned | NO   | PRI | NULL    | auto_increment |
| term_id    | bigint unsigned | NO   | MUL | 0       |                |
| meta_key   | varchar(255)    | YES  | MUL | NULL    |                |
| meta_value | longtext        | YES  |     | NULL    |                |
+------------+-----------------+------+-----+---------+----------------+

Adding the terms to the WordPress database

At this time, only the wp_terms and wp_term_taxonomy tables are to be populated. As I will later parse objects from the media library, I convert the json response to an associative array for easier manipulation of the meta values for images (more on that later).
PHP has the function json_decode() that has the option to return an array instead of an object.
Below is an incomplete version of my working code; I assume you are able to put things together from my hints.

// decode to an associative array (second argument true) for easier manipulation later
$db = mysqli_connect(your db connection details here);
$file = "your-file-with-10-categories-or-tags.json";
$jsondata = json_decode(file_get_contents($file), true);
print count($jsondata) . " items\n";
foreach ($jsondata as $post)
{
  if (!empty($post['taxonomy']))
  {
    $parent = !empty($post['parent']) ? $post['parent'] : 0;
    // escape everything that goes into the SQL strings
    $name = mysqli_real_escape_string($db, $post['name']);
    $slug = mysqli_real_escape_string($db, $post['slug']);
    $desc = mysqli_real_escape_string($db, $post['description']);
    $sql1 = <<<EOM
INSERT IGNORE INTO wp_terms(term_id,name,slug)
 VALUES({$post['id']},"{$name}","{$slug}");
EOM;
    print "$sql1\n";

    $sql2 = <<<EOM
INSERT IGNORE INTO wp_term_taxonomy(term_taxonomy_id,term_id,taxonomy,description,parent)
 VALUES ({$post['id']},{$post['id']},"{$post['taxonomy']}","{$desc}",{$parent});
EOM;
    print "$sql2\n";
    // the statements are only printed here; run them with mysqli_query($db, ...) once they look right
  }
}

After this step, you will be able to see the categories in the wp-admin backend of the destination site.

Moving Apache and MySQL to new server

This is not a guide/tutorial, just notes I made while moving the data disk from my old server to a new one, following my previous installation guide.

New server:
Ubuntu Server 24.04.1 LTS

Old server:
Ubuntu Server 22.04.4 LTS

Checks before trying to start anything

Did you follow any other of my guides? You need to redo the setup using those instructions to ensure the needed packages and Apache modules are installed.

  1. Do the sites you are hosting need different PHP versions (this will require php-fpm and associated modules), or do you run the sites as different users (this will require both php-fpm and mpm-itk)?
    Running sites on different versions of PHP on the same server
    Apache HTTPd and PHP security

  2. Any sites using HTTPS (every site should use it; plain HTTP only for redirecting to the HTTPS site)?
    Install certbot as explained below. When creating the first certificate (for the ‘default’ site), the ssl module will be activated in Apache. This will however require port 80 of the newly installed server to be accessible from the outside, and only you (should) know how this is done for your specific network setup.

    apt install certbot python3-certbot-apache
    certbot --apache
    

    If you want to do offline-testing before making the new server available online, just enable the ssl module in Apache and use the existing certificates (use the ‘hosts’ file to point vhost names to the local ip address of the new server).

Troubleshooting

MySQL won’t start
The UID and GID of the MySQL user were changed from 114:120 to 110:110. This gives “Error: 13 (Permission denied)” when trying to start, unless the ownership of /var/lib/mysql and its content is corrected.
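The fix is simply to hand everything back to the mysql user by name, so it maps to whatever UID/GID the new installation uses. A sketch (run as root):

## reset ownership so the files match the mysql user on the new system
chown -R mysql:mysql /var/lib/mysql
systemctl start mysql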

Apache won’t start
The problems starting Apache on the new server are caused by modules that are activated but not correctly installed. Rename mods-available and mods-enabled to something else (for reference) and copy back the versions that were working right after installing the new server.
The remaining startup problems are caused by virtual hosts using modules that are not installed or activated, so disable all sites to start debugging (rename sites-enabled, create a new empty directory, then put back one site at a time, starting with 001-default).
If you use PHP-FPM and different users on each site, you have to redo that setup on the new server. The php-fpm configurations are included on my data drive (/etc/php/8.3/fpm/pool.d/), but for these to work they need their respective PHP-FPM version installed.
Also, proxy_fcgi is needed to be able to redirect php file access to the fastcgi php handler. All of this is mentioned in my earlier guide.
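Roughly what is needed per PHP version (8.3 used as an example here; see the earlier guide for the full setup):

## install the FPM handler and wire it into Apache
apt install php8.3-fpm
a2enmod proxy_fcgi setenvif
a2enconf php8.3-fpm
systemctl restart apache2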

Apache cannot access vhost site files
Did you enable the extra security with file protection and separate users per site according to my guide mentioned above?
You will then also need to install and activate the mpm_itk module again.

HTTPS sites get connection refused
Do you have the Apache SSL module activated? Is the firewall open for HTTPS (port 443)?

Disk management in Linux

Solution to issues with moving disks between Linux systems

Recently, I had to attach the disk from another Linux system because my user had lost its ‘sudo’ permission. When trying to mount the root partition, I got the not-too-helpful error message:

mount: unknown filesystem type 'LVM2_member'

The reason for this is that a standard Ubuntu installation uses LVM for the partitions and, surprisingly stupidly, gives every installation the same volume group name, “ubuntu-vg”, so it collides with the running system’s VG of the same name.

The procedure
  1. Shut down the computer you are going to transfer the disk from, then remove the disk (for a virtual one, you just have to keep it shut down during this operation).
  2. Shut down (not needed for a virtual machine) the computer which will have the disk connected, and connect the disk.
  3. Start up or reboot the computer that now has both disks (and two VGs with the same name). A virtual server might not need to be rebooted at all; check with ‘dmesg’ whether the other disk was found.

This is usually the first thing one would try when getting access to a disk from another computer. Listing the partitions works fine; it is the mount of the root partition further down that fails (with the error message this post is all about):

root@ubu-04:~# fdisk -l /dev/xvdb
...
Device       Start      End  Sectors Size Type
/dev/xvdb1    2048     4095     2048   1M BIOS boot
/dev/xvdb2    4096  2101247  2097152   1G Linux filesystem
/dev/xvdb3 2101248 33552383 31451136  15G Linux filesystem
..

The partition that will be mounted on /boot is directly mountable and accessible now:

root@ubu-04:~# mkdir disk
root@ubu-04:~# mount /dev/xvdb2 disk
root@ubu-04:~# ls disk/
config-5.4.0-176-generic      initrd.img-5.4.0-182-generic  vmlinuz
config-5.4.0-182-generic      initrd.img.old                vmlinuz-5.4.0-176-generic
grub                          lost+found                    vmlinuz-5.4.0-182-generic
initrd.img                    System.map-5.4.0-176-generic  vmlinuz.old
initrd.img-5.4.0-176-generic  System.map-5.4.0-182-generic

The partition with the rest of the content will give the not-so-useful error message:

root@ubu-04:~# mount /dev/xvdb3 disk
mount: /root/disk: unknown filesystem type 'LVM2_member'.
root@ubu-04:~#

lvscan identifies there is a problem:

root@ubu-04:~# lvscan
  inactive          '/dev/ubuntu-vg/ubuntu-lv' [<15.00 GiB] inherit
  ACTIVE            '/dev/ubuntu-vg/ubuntu-lv' [10.00 GiB] inherit
root@ubu-04:~#

Fix by renaming the VG
The solution I used to access the content on the attached disk was to give its VG a non-conflicting name. The name can be whatever you choose; I simply appended the hostname of the machine the disk belongs to.
Be sure to rename the correct one!
Getting the VG UUID of the one to rename can be done in a couple of ways. If you do this before removing the disk you want to access on the other computer, just use the command 'vgdisplay' to show the ID:

root@test-1:~# vgdisplay
  --- Volume group ---
  VG Name               ubuntu-vg
  System ID
  Format                lvm2
...
  VG UUID               90blAq-ggmA-rmsf-mBqU-3mRH-oxoS-lys4ih

Or, if you found this post after stumbling on the same problem I did, you can find the ID by running 'lvscan -v' on the computer with the two identical VG names:

root@ubu-04:~# lvscan -v
  Cache: Duplicate VG name ubuntu-vg: Prefer existing fSauMy-cW75-PFje-cx8s-rpUR-zYgd-PL9Bef vs new 90blAq-ggmA-rmsf-mBqU-3mRH-oxoS-lys4ih
  inactive          '/dev/ubuntu-vg/ubuntu-lv' [<15.00 GiB] inherit
  ACTIVE            '/dev/ubuntu-vg/ubuntu-lv' [10.00 GiB] inherit
root@ubu-04:~#

Rename VG and rescan

root@ubu-04:~# vgrename 90blAq-ggmA-rmsf-mBqU-3mRH-oxoS-lys4ih ubuntu-vg-test-1
  Processing VG ubuntu-vg-test-2 because of matching UUID 90blAq-ggmA-rmsf-mBqU-3mRH-oxoS-lys4ih
  Volume group "90blAq-ggmA-rmsf-mBqU-3mRH-oxoS-lys4ih" successfully renamed to "ubuntu-vg-test-1"
root@ubu-04:~# modprobe dm-mod
root@ubu-04:~# vgchange -ay
  1 logical volume(s) in volume group "ubuntu-vg-test-1" now active
  1 logical volume(s) in volume group "ubuntu-vg" now active
root@ubu-04:~# lvscan
  ACTIVE            '/dev/ubuntu-vg-test-1/ubuntu-lv' [<15.00 GiB] inherit
  ACTIVE            '/dev/ubuntu-vg/ubuntu-lv' [10.00 GiB] inherit
root@ubu-04:~#

Now the partition should be mountable:

root@ubu-04:~# mount /dev/ubuntu-vg-test-1/ubuntu-lv disk/
root@ubu-04:~# ls disk/
bin    dev   lib    libx32      mnt   root  snap      sys  var
boot   etc   lib32  lost+found  opt   run   srv       tmp
cdrom  home  lib64  media       proc  sbin  swap.img  usr
root@ubu-04:~#

Do whatever you need to do with that partition mounted (in my case, repairing sudo access for my user by adding it to the 'sudo' entry in the /etc/group file), then shut down the computer with the two disks, detach the disk and reattach it to the computer it came from (or simply start the virtual machine the disk came from).

Making the system bootable when the disk is put back where it belongs
Now that the VG was renamed, the system on that disk will no longer boot because it cannot mount the root partition. If you try, you will get dumped into the very limited busybox shell.

In the busybox shell, do this to make the system boot:

cd /dev/mapper
mv ubuntu--vg--test--1-ubuntu--lv  ubuntu--vg-ubuntu--lv
exit

The system will now boot up. To make the new VG name permanent (so this 'rename' step in busybox will not be needed on every reboot), change the old VG name to the new one in '/boot/grub/grub.cfg':

sed -i s/ubuntu--vg/ubuntu--vg--test--1/g /boot/grub/grub.cfg

The easiest method of creating a (per-machine) unique VG name

If the system you are taking the disk from is still in working condition, and you are able to make yourself root using 'sudo' (which I lost on one machine for some unexplained reason, probably caused by a normal update – see a Google search for 'lost sudo after update'), change the VG name and adjust grub.cfg while everything still works:

root@test-2:~# vgdisplay |grep UUID
  VG UUID               8jzFk6-QlL8-xXL9-LJth-Qo1r-ACve-2KUmxP
root@test-2:~# vgrename 8jzFk6-QlL8-xXL9-LJth-Qo1r-ACve-2KUmxP ubuntu-vg-$(hostname)
root@test-2:~# sed -i s/ubuntu--vg/ubuntu--vg--$(hostname|sed s/-/--/g)/g /boot/grub/grub.cfg

Only micro$oft makes it possible…

Cleaning up a micro$oft mess

Part of this post is split out from my Command line post (and maybe others), about the newly invented Bash (a complete Micro$oft Ubuntu) in Windows.

This post is more about the mess Windows leaves behind on a system drive that has to be cleaned up when it is replaced and reused as a data drive only (to get storage back, or when it’s time to reinstall Windows or take an old computer out of service and replace it with a more recent one).

If you did a full clone of the old drive, you can safely just wipe the old one once you have verified that everything is available on the new drive. This is more for those who do a clean installation on a new drive and want to keep user files on the old drive.

Only Windows makes this mess (OneDrive)

When taking out a hard drive containing OneDrive folders and mounting it as a secondary drive, some files in the OneDrive folders might be “online only” or “download on demand”. On that disk, which is now disconnected from the OneDrive account, those files will still be listed but will not be accessible.
The inaccessible files can be detected by the “dir” command, giving the file sizes in parentheses.
One option to clean out the mess made by disconnecting the OneDrive disk is to just delete the whole old OneDrive folder, if you are sure there’s nothing there you want to keep (your files should be stored online).
If you want to play it safe and keep all the files that are available offline in the folder, you need to get rid of the “placeholder” files in there.
This is how I did it; it might or might not work for you (always be careful when deleting files).
First, I removed all files in my Pictures folder (automatic sync from the camera) that have zero allocated blocks:
## zero-block (online-only) files get a plain 'rm' line, all others stay commented out
stat -c "#rm%brm %n" * | sed 's/#rm0rm/rm/g' >rmscript
## review rmscript first, then run it (filenames containing spaces would need extra quoting)
sh rmscript

I decided it was too much work to use that method for doing all directories recursively inside the OneDrive folder, so for the rest of the online-only files I used another method.
For this, I used the “file” command, which gives the error “(Invalid argument)” for the inaccessible files:
find . -type f -exec file {} >>filinfo \;
grep "(Invalid argument)" filinfo >offlinefiles
sed "s;\./;rm './;g" offlinefiles >offlinefiles2
sed "s;: ERROR: ;' #;g" offlinefiles2 >offlinefiles3

The first command creates a list of all the files starting from the current directory. The next command filters out the lines with the error “(Invalid argument)”. As the names in the list begin with “./”, this is then replaced with “rm './”, creating commands to remove all the files that were filtered out in the previous step.
I then replace “: ERROR: ” with a closing quote and a hash sign to tell the shell to ignore the rest of the line (“#” starts a comment, if you didn’t know).
Double check the “offlinefiles3” before executing it.

Cleaning up in AppData

Most of the time, AppData contains only crap that can be deleted without risking losing anything important, but if you use certain applications, that folder may contain very important data.

%AppData% (your user’s AppData/Roaming) contains settings and data for some applications:
If you have been using Thunderbird as a mail client, you will have all the local databases of email and account settings in %AppData%/Thunderbird.
Firefox (Mozilla/Firefox/Profiles) might also be valuable to save for transferring to another system.
FileZilla (ftp/sftp client) has the settings stored here.
JottaCloud (or branded Jotta clones like Elgiganten Cloud) stores its databases in “Jotta” (no matter if it is Jotta or a branded clone); for more information about the content of the databases, see my post JottaCloud secrets.

WSL or Micro$oft Ubuntu

If you have been using WSL (Micro$oft Ubuntu), the default user home location is nested inside AppData/Local (not %AppData%, but you can navigate there, go one step up and enter “Local” from there).
The full location is

AppData/Local/Packages/CanonicalGroupLimited.UbuntuonWindows_79rhkp1fndgsc/LocalState/rootfs

User-created files (if you have some) are usually in ‘root’ and in ‘home’, but if you think you have modified files elsewhere (/etc, or MySQL databases), you can search for them using the oldest known non-system file as a reference timestamp.
To find out which files you have modified since installing WSL (other than those inside ‘home’), run ‘ls -lta’ inside your home folder, take the last (oldest) file listed, and run a second command in the ‘rootfs’ folder:

...1fndgsc/LocalState/rootfs# find . -type f -newer home/youruser/.bash_logout

This will list all files that have been modified after the .bash_logout file (which was the oldest file in my home folder).

The registry, make things even more complicated

The registry has been the main location for application and system settings since Windows 95, and will probably be so for a long time into the future.
When you have taken your old system disk offline and discover on the new, clean installation that all application settings have been lost, there are some ways to recover content from your old drive.

In your user’s home directory, usually c:\users\youroldusername, there is a file called NTUSER.DAT (probably hidden, but you should know how to google your way to seeing and accessing hidden files). This file is the container for your old user’s settings, a ‘hive’ in HKEY_USERS. It can be imported using regedit, and at import time you will be asked to give it a new name (so it won’t overwrite any existing registry content).

One application I use/have used in the past, and assume at least some of you who find my posts is using is PuTTY.
Once you have imported the old registry, you will find the PuTTY sessions in whatever you named your import, followed by:

Software\SimonTatham\PuTTY\Sessions

Click the “Sessions” node, then right-click and export.

To import the settings to your user on any other computer, you have to search and replace content in the exported “.reg” file:

HKEY_USERS\whatevernameyouusedatimport\

with

HKEY_CURRENT_USER\

This will overwrite existing values with the content from the import, so it’s up to you to preserve the values you wish to keep (export them first, to be able to import and overwrite again). If you have not yet started the application (in this case PuTTY) you are importing settings for, you will have to do that first, otherwise the import will fail (regedit will not create the “key” folder or the full path down to that key if it does not exist, but it will still report that the import was successful).

Windows update failure (KB5046740) error 0x800f081f

Recently, Windows Update started to fail on one of my newer computers running Win 11 Pro 24H2. Only the specific update named in the title of this part failed; updates following it went through fine. I did all the usual ‘remedies’ for this problem (stopping wuauserv etc. and removing the cache, multiple restarts, trying manual installation of the update, and more), and it just continued to fail.
As a last attempt before reinstalling, I followed “step 5” (since the other things had already been tried, and tried again) in the video (youtube: v=8aYa_B7UgFE). Most of this had already been done, except the ‘dism’ commands:

SC config trustedinstaller start=auto
net stop bits
net stop wuauserv
net stop msiserver
net stop cryptsvc
net stop appidsvc
Ren %Systemroot%\SoftwareDistribution SoftwareDistribution.old
Ren %Systemroot%\System32\catroot2 catroot2.old
regsvr32.exe /s atl.dll
regsvr32.exe /s urlmon.dll
regsvr32.exe /s mshtml.dll
netsh winsock reset
netsh winsock reset proxy
rundll32.exe pnpclean.dll,RunDLL_PnpClean /DRIVERS /MAXCLEAN
dism /Online /Cleanup-image /ScanHealth
dism /Online /Cleanup-image /CheckHealth
dism /Online /Cleanup-image /RestoreHealth
dism /Online /Cleanup-image /StartComponentCleanup
Sfc /ScanNow
net start bits
net start wuauserv
net start msiserver
net start cryptsvc
net start appidsvc

I’m obviously not alone in having this update fail:
Google search for 'kb5046740 unable to install'

OpenWrt configuration notes

This post is a continuation of OpenWrt on Raspberry Pi 4 (and CM4)

It’s more about OpenWrt itself than about Raspberry Pi hardware like the Pi 4 and CM4.

Securing the WAN interface

In the first part, I gave instructions on how to open for SSH and LuCI web access on the WAN interface “the easy way”. This is in no way recommended when the router is moved over to an external WAN on the eth0 interface.
These ports (22 and 80) will be targets for network scanners on the outside, which will find your external IP on the router in a very short time.
Also, if you want to redirect traffic from the outside to these ports on the router, I suggest you use port numbers other than well-known service ports such as 22, 80, 8080, 8000, 443, 8443, 20-21 and 25. There are many more ports that are recommended to avoid, as explained in Huawei’s tech note
High-Risk Ports: What Are the Common High-Risk Ports and How to Block Them

Make it safer
If you decide you want to be able to access and configure the router from the outside, your first step should be to set a secure password for the ‘root’ user. The next step is to use other ports than the default well-known ones for SSH (22) and HTTP (80).
This can be done both from LuCI and by editing the firewall configuration file (/etc/config/firewall) directly. I will show and explain how to do it directly in the file (by accessing the router through SSH), as it’s easier to explain, and finding out how to do it in LuCI can be done later (just check the tabs in the firewall configuration there to see how it was done).
Selected ports for my example:
SSH to OpenWRT router from outside, port 16322
HTTP to OpenWRT management page from outside, port 16380

Pretty self-explanatory in the configuration file:

config redirect
        option name 'OpenWRT LuCI'
        option target 'DNAT'
        list proto 'tcp'
        option family 'ipv4'
        option src 'wan'
        option src_dport '16380'
        option dest_port '80'

config redirect
        option name 'OpenWRT SSH'
        option target 'DNAT'
        list proto 'tcp'
        option family 'ipv4'
        option src 'wan'
        option src_dport '16322'
        option dest_port '22'

Reload the firewall configuration by using the command ‘/etc/init.d/firewall reload’
In LuCI, the above will look like this (I modified my config to match what was entered into the file, hence the notice about unsaved changes).

This is much better than using the well-known ports, but still the ports could be scanned for on the WAN address.
To secure it a bit more, it’s possible to limit to allow access only from known IP-addresses or networks.
To limit access to a single IP address, insert the following into the relevant redirect section:

        option src_ip '8.8.8.8'

This will only allow access to the port from the specified IP (which in this example is one of Google’s DNS servers, so actual access is very unlikely to happen).
In LuCI, the changes above will be visible under “Advanced settings” of the port forward rule.

Using IP sets for the rules
IP sets can be used to allow or deny traffic from multiple IP addresses without having to have a separate rule for each block/allow.
In case for a simple IP-address list, the configuration section would look like this in the /etc/config/firewall file:

config ipset
        option name 'Trusted'
        option comment 'Trusted networks'
        option family 'ipv4'
        list match 'src_ip'
        list entry '8.8.8.8'
        list entry '4.0.0.0/9'

Replace the ‘option src_ip’ in the redirect rule with

        option ipset 'Trusted'

This is the way it will be done if configured using LuCI. A more efficient way is to use include files with hashed IP sets. This is half-explained in a post in the OpenWRT forum: Port forward using ipset
The solution there recreates the IP set every time it is used, but it should be possible to store the IP set and include it in the configuration.

For more information about IP sets, see the excellent OpenWRT documentation: IP set examples

I will have to investigate that in more depth when blocking known badly behaving IP hosts or networks.

Redirecting traffic from WAN to a device on the LAN
This is done more or less in the same way as above, but with ‘option dest’ and ‘option dest_ip’ added to the redirect section:

config redirect
        option name 'SSH to linux server'
        option target 'DNAT'
        list proto 'tcp'
        option src 'wan'
        option src_dport '16222'
        option dest 'lan'
        option dest_ip '172.16.2.11'
        option dest_port '22'
        option ipset 'Trusted'

You can use whatever port you want on the outside (16222 in the example) and any IP on the inside which the router can reach (the example needs a larger netmask: my examples assume that the router lives on 172.16.3.254, so for it to reach hosts at 172.16.2.x, the netmask has to be ‘255.255.254.0’ (/23) or wider).

If you want to redirect any of those well-known ports to something inside the LAN, such as the ‘Linux server’ above (the ‘ipset’ limitation is removed here, but if you want to keep the server private, just leave it in):

config redirect
        option name 'HTTP to linux server'
        option target 'DNAT'
        list proto 'tcp'
        option src 'wan'
        option src_dport '80'
        option dest 'lan'
        option dest_ip '172.16.2.11'
        option dest_port '80'

To be continued (in this post or in another one). This is set to be published at the end of 3 Dec 2023 instead of being kept secret until I decide to release it.

Notes on DHCP

DHCP scope on LAN
As I mentioned in the previous part, the start address of the LAN DHCP scope is configured as an offset from the router’s network address. So unless you have the LAN interface on a single C-subnet, you might have to adjust the start of the scope accordingly. I’m not going into exactly what to set the start value to; either you know it, or you can test and read the results in the logs.

Just a simple example: if you want the DHCP scope one C-net up from the start of the LAN segment, add 256 to the offset. With router = 172.16.3.254, netmask = 255.255.254.0 and network = 172.16.2.0, the offset counts from 172.16.2.0, so to set the DHCP range to 172.16.3.100–250 you have to set the ‘start’ to 356 (256 + 100) and the ‘limit’ to 150.
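In /etc/config/dhcp that example would end up looking something like this (the leasetime is just a typical value):

config dhcp 'lan'
        option interface 'lan'
        option start '356'
        option limit '150'
        option leasetime '12h'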

Save and apply changes
If the changes are made from LuCI, ‘save & apply’ will do whatever is needed for the changes to take effect. If you make changes by editing the configuration file (/etc/config/dhcp) directly, you will need to reload the configuration for dnsmasq:

/etc/init.d/dnsmasq reload

Static leases
Static leases are configured in /etc/config/dhcp (which is the file changed by LuCI). A section like this (depending on what you configure through LuCI) is added to that file:

config host
        option name 'Chromecast'
        option mac '90:ca:fa:77:88:99'
        option ip '172.16.3.72'
        option leasetime 'infinite'

Simple enough to require no explanation, except that the statically assigned IP address is allowed to be outside the configured DHCP range (the address here is a bit below the scope). Use static leases for devices where you want to know which IP address they get, without having to make a static configuration on the device itself.

Dropping an active DHCP lease
Active leases are stored in /tmp/dhcp.leases
Remove the line corresponding to the lease you want to drop, then restart udhcpc:

PID=`pidof udhcpc` && kill -SIGUSR1 $PID

Changing display name of known devices
Edit the file /etc/ethers to change display names of detected devices on DHCP. Restart dnsmasq afterwards to get the new names displayed in LuCI.

Online resources

IP subnet calculator
https://www.calculator.net/ip-subnet-calculator.html

MAC address vendor lookup
https://mac.lc/
https://hwaddress.com/