We started a new project in 2021 that capitalizes on our passion for data analytics, Dick’s programming skill, and the knowledge we gained in a data science course we took a couple of years ago. Dick uses Python code to scrape and wrangle data from and about our two websites (newsfromnan.com and vrcommunications.us), and R code and packages to display the results. We’re calling the resulting visualizations “Viz of the Day.”
For this visualization we looked at the use of tags on our personal website, newsfromnan.com. The site has about 400 posts that include more than 1000 metadata tags.
Our viz provides an overview of the site’s contents by displaying the relative usage of the various tags.
How We Did It
Our analysis makes use of an XML export file of the WordPress site.
A Python data-wrangling program scanned the XML file and created an output file in CSV format that lists each tag and how many posts use it. An R script then uses the ggwordcloud package to create a word cloud from that file.
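For readers curious about the wrangling step, here is a minimal sketch of how tag counting like this could look. It is not Dick’s actual program. It assumes the export marks each post as an item element and each tag as a category element with domain="post_tag", which is how WordPress export files are commonly laid out. The function names, the CSV column names, and the two-post sample are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(xml_text):
    """Return a Counter mapping each tag name to the number of items using it."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for item in root.iter("item"):
        # Collect tags into a set so a tag repeated within one post counts once.
        # Assumption: tags are <category domain="post_tag"> elements.
        tags = {
            cat.text.strip()
            for cat in item.findall("category")
            if cat.get("domain") == "post_tag" and cat.text
        }
        counts.update(tags)
    return counts


def write_csv(counts, out):
    """Write tag,posts rows to a file-like object, most-used tags first."""
    writer = csv.writer(out)
    writer.writerow(["tag", "posts"])  # hypothetical column names
    for tag, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([tag, n])


# Made-up sample standing in for a real export file.
SAMPLE = """<rss version="2.0"><channel>
<item><title>A</title>
  <category domain="post_tag" nicename="travel"><![CDATA[travel]]></category>
  <category domain="category" nicename="news"><![CDATA[News]]></category>
</item>
<item><title>B</title>
  <category domain="post_tag" nicename="travel"><![CDATA[travel]]></category>
  <category domain="post_tag" nicename="family"><![CDATA[family]]></category>
</item>
</channel></rss>"""

counts = count_tags(SAMPLE)
buf = io.StringIO()
write_csv(counts, buf)
print(buf.getvalue())
```

With a real export you would read the file’s text instead of SAMPLE and write to a file opened with newline="" rather than to an in-memory buffer. A real export may also contain pages and attachments as items, so a fuller version would probably filter on post type.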
The relative size of each tag in the cloud is proportional to the logarithm of how many times it is used.
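To show why a logarithm helps, here is a small sketch of the scaling idea, written in Python to match the wrangling step (the real sizing was done in the R script). The counts are invented, and the offset of 1 is our assumption: since the logarithm of 1 is 0, a tag used only once would otherwise get a size of zero.

```python
import math


def log_sizes(counts, offset=1):
    """Map each tag to log(count + offset).

    offset=1 is an assumption so that a tag used once keeps a nonzero size.
    """
    return {tag: math.log(n + offset) for tag, n in counts.items()}


# Invented counts, not taken from the site.
example = {"travel": 100, "family": 10, "garden": 1}

sizes = log_sizes(example)
largest = max(sizes.values())

# Print each tag's size as a fraction of the largest tag's size.
for tag, size in sizes.items():
    print(f"{tag}: {size / largest:.2f}")
```

The point of the logarithm is compression: a tag used 100 times ends up only a few times larger than a tag used once, instead of 100 times larger, so the rarely used tags stay readable in the cloud.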