Overview of the News From Nan website
Since 1998 we have maintained a family-oriented WordPress website, www.newsfromnan.com. The site consists of over 200 posts and 20 pages.
The site content is labeled with category and tag metadata. Example categories are history, news, and travel. Example tags are hiking and biking (two of our favorite activities), California (where we live), and Minnesota (where some of our family lives).
Overview of the data analysis project
We recently completed a project that analyzed the popularity of content in various categories and with various tags. Popularity was measured by total hits over a one-month time interval.
Two sets of data were collected from the WordPress site to perform the analysis:
- An export of the site content in XML format
- Content hit counts gathered by the WP Statistics plugin, in CSV format
Once the data was collected, a Python data-wrangling script merged the two sets into a single CSV file with the following columns:
- type of post (page or post)
- categories attached
- tags attached
- total number of hits
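The merge step described above can be sketched roughly as follows. This is a minimal illustration, not the script we actually ran: the sample data is made up, the hit-count column names (post_id, hits) are assumptions about the WP Statistics export, and joining on the post ID and separating multiple categories or tags with ";" are also assumptions.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace of the "wp:" elements in a WordPress export file.
# The version suffix is an assumption and may differ between exports.
NS = {"wp": "http://wordpress.org/export/1.2/"}

# Tiny stand-in for the XML export (hypothetical content).
SAMPLE_XML = """<rss xmlns:wp="http://wordpress.org/export/1.2/">
<channel>
  <item>
    <title>Example post</title>
    <wp:post_id>101</wp:post_id>
    <wp:post_type>post</wp:post_type>
    <category domain="category" nicename="travel">travel</category>
    <category domain="post_tag" nicename="hiking">hiking</category>
    <category domain="post_tag" nicename="california">California</category>
  </item>
  <item>
    <title>Example page</title>
    <wp:post_id>102</wp:post_id>
    <wp:post_type>page</wp:post_type>
    <category domain="category" nicename="history">history</category>
  </item>
</channel>
</rss>"""

# Hypothetical hits file; the real WP Statistics columns may differ.
SAMPLE_HITS_CSV = "post_id,hits\n101,40\n102,7\n101,2\n"


def parse_export(xml_text):
    """Return {post_id: {"type", "categories", "tags"}} from the export."""
    items = {}
    for item in ET.fromstring(xml_text).iter("item"):
        post_id = item.findtext("wp:post_id", namespaces=NS)
        labels = {"category": [], "post_tag": []}
        for cat in item.findall("category"):
            domain = cat.get("domain")
            if domain in labels:
                labels[domain].append(cat.text or "")
        items[post_id] = {
            "type": item.findtext("wp:post_type", namespaces=NS),
            "categories": labels["category"],
            "tags": labels["post_tag"],
        }
    return items


def total_hits(csv_text):
    """Sum the hit counts per post ID."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["post_id"]] = totals.get(row["post_id"], 0) + int(row["hits"])
    return totals


def merge(xml_text, csv_text):
    """Join export metadata with hit totals; items without hits get 0."""
    hits = total_hits(csv_text)
    return [
        {
            "type": meta["type"],
            "categories": ";".join(meta["categories"]),
            "tags": ";".join(meta["tags"]),
            "hits": hits.get(post_id, 0),
        }
        for post_id, meta in parse_export(xml_text).items()
    ]


def to_csv(rows):
    """Serialize the merged rows using the four columns listed above."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["type", "categories", "tags", "hits"])
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


merged_rows = merge(SAMPLE_XML, SAMPLE_HITS_CSV)
print(to_csv(merged_rows))
```

Joining the category and tag lists into one delimited field keeps one row per page or post, which matches the column list above; the analysis step then has to split those fields again.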
The merged CSV file was then used as input to an R script executed by RStudio. The entire implementation process is illustrated below.
Histogram of hit counts
The first step in the analysis was to plot a histogram of the hit counts to see how hits are distributed over all pages and posts.
The above chart shows that a small number of pages and posts receive most of the hits.
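For readers who want to try this kind of check, here is a small sketch in Python (our actual plot was made in R). The hit counts are invented for illustration, and the helper names histogram and top_share are hypothetical. It bins the counts and also computes what share of all hits goes to the most-visited items, which is one way to put a number on the skew.

```python
from collections import Counter

# Hypothetical hit counts, one per page or post (not the site's real data).
hits = [0, 1, 1, 2, 3, 3, 4, 5, 8, 12, 40, 95]


def histogram(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def top_share(values, fraction):
    """Share of all hits received by the top `fraction` of items."""
    ranked = sorted(values, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


bins = histogram(hits, bin_width=10)
for start, count in bins.items():
    # One "#" per page or post in the bin.
    print(f"{start:>3}-{start + 9:<3} {'#' * count}")
print(f"Top 25% of items receive {top_share(hits, 0.25):.0%} of hits")
```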
Looking at which categories are most popular
Next we looked at which categories are the most popular. Below is a plot of the total number of hits by category.
The plot shows that the most popular category is “family history.” The category consists of a series of posts we created to describe the history and ancestry of various branches of our family. If we look at a list of the top few pages, we see confirmation of this.
At the top of the list are the Dienst and Nuss families. We have been contacted many times by distant relatives who have found a connection to us by looking at these pages.
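The per-category totals behind a plot like this can be computed along the following lines. This is a sketch in Python rather than the R script we used, the rows are made-up sample data, and it assumes that multiple categories are stored in one field separated by ";".

```python
import csv
import io
from collections import defaultdict

# Hypothetical merged file; categories are assumed to be ";"-separated.
MERGED_CSV = """type,categories,tags,hits
post,family history,Minnesota,120
post,travel;news,hiking;California,30
page,family history;history,,45
post,news,biking,10
"""


def hits_by_category(csv_text, sep=";"):
    """Return (category, total hits) pairs, most popular first."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Skip empty entries so uncategorized rows add nothing.
        for category in filter(None, row["categories"].split(sep)):
            totals[category.strip()] += int(row["hits"])
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


ranking = hits_by_category(MERGED_CSV)
for category, total in ranking:
    print(f"{category:<16}{total:>6}")
```

Note that a post filed under several categories contributes its full hit count to each of them, so the category totals can add up to more than the site's total hits.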
Planning future projects
In a future study, we plan to look at how tags affect post popularity (the tags describe the posts in greater detail than the categories do).