Data Manipulation

The census

The census data from the United Kingdom was so well prepared in Excel that it could have been presented in its original format. The service the UK uses takes advantage of Excel features that make the data very easy to manipulate and visualize. The main measurement I looked at was the main language of people in the United Kingdom. Since I cared about the Celtic languages, I quickly eliminated the dozens of rows dedicated to a dizzying array of international languages. The census even marked the type of area in which respondents lived; varying levels of urban and rural development made for a thought-provoking spread of the languages, but that was too far into the weeds. I eliminated every area type except the ‘total’, ‘rural total’, and ‘urban total’. This was just for England.
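I did this filtering by hand in Excel, but the same two cuts (keep only the Celtic languages, keep only the three total rows) can be sketched in a few lines of Python. The column names, the list of languages, and every number below are placeholders I made up for illustration; they are not taken from the census workbook.

```python
# Assumed labels: the real workbook may spell these differently.
CELTIC_LANGUAGES = {"Welsh", "Cornish", "Irish", "Scottish Gaelic", "Manx Gaelic"}
AREA_TOTALS = {"Total", "Urban total", "Rural total"}

# Made-up sample rows standing in for the England main-language table.
rows = [
    {"language": "Welsh", "area": "Total", "speakers": 30},
    {"language": "Welsh", "area": "Urban total", "speakers": 20},
    {"language": "Welsh", "area": "Rural total", "speakers": 10},
    {"language": "Welsh", "area": "Rural village", "speakers": 4},
    {"language": "Polish", "area": "Total", "speakers": 50},
    {"language": "Cornish", "area": "Total", "speakers": 2},
]


def keep_celtic_totals(rows):
    """Keep rows for Celtic languages, and only the three total area types."""
    return [
        row
        for row in rows
        if row["language"] in CELTIC_LANGUAGES and row["area"] in AREA_TOTALS
    ]


filtered = keep_celtic_totals(rows)
for row in filtered:
    print(row["language"], row["area"], row["speakers"])
```

Keeping the two sets at the top makes it easy to widen or narrow the cut later, for example to bring one of the finer-grained area types back in.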

For Wales, I cared more about the population's proficiency in Welsh. Again, the data was beautiful and accessible. I added a couple of rows to mark the sums of important groupings of proficiency more clearly, and I removed the urban and rural development markers.

The Republic of Ireland used a different service. Their service delivered the data in a very simple form, and I had to reorganize it into a compilation on one or two sheets. Their digitized census data also reached further back in time; I was able to obtain counts of Irish speakers from 1861 to 1926. I used census data from 2006 and 2011 to look at more recent trends.


The Michael Krauss Collection

I created an Excel spreadsheet and began keying in publication information. After some patterns emerged, I was able to create drop-down boxes and force a format on fields like year of publication. After collecting data on the books, which spanned all the Celtic languages and were in part written in English, I used OpenRefine to standardize the spelling of author names and reorganize the records. I used the program to find everything Cornish in the collection, and I exported that subset to a .csv file.
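For readers without OpenRefine, here is a rough sketch of the same clean-up. The key function is loosely modeled on the "fingerprint" idea behind OpenRefine's key-collision clustering (lowercase, strip punctuation, sort the words), but it is my own simplification and not OpenRefine's implementation. The field names and the records are invented placeholders, not entries from the collection.

```python
import csv
import io
import string
from collections import Counter, defaultdict

FIELDS = ["title", "author", "language", "year"]  # assumed column names

# Invented placeholder records with inconsistent author spellings.
books = [
    {"title": "Title A", "author": "Smith, Jane", "language": "Cornish", "year": "1904"},
    {"title": "Title B", "author": "jane smith", "language": "Welsh", "year": "1911"},
    {"title": "Title C", "author": "Smith Jane.", "language": "Cornish", "year": "1925"},
    {"title": "Title D", "author": "Smith, Jane", "language": "Irish", "year": "1930"},
]


def fingerprint(name):
    """Simplified key: lowercase, drop punctuation, sort the unique words."""
    stripped = name.strip().lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(stripped.split())))


def standardize_authors(records):
    """Replace each author with the most common spelling sharing its key."""
    spellings = defaultdict(list)
    for record in records:
        spellings[fingerprint(record["author"])].append(record["author"])
    canonical = {
        key: Counter(names).most_common(1)[0][0] for key, names in spellings.items()
    }
    return [
        {**record, "author": canonical[fingerprint(record["author"])]}
        for record in records
    ]


def to_csv(records):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


cleaned = standardize_authors(books)
cornish = [record for record in cleaned if record["language"] == "Cornish"]
cornish_csv = to_csv(cornish)
print(cornish_csv)
```

A key like this only catches spellings that differ in case, punctuation, or word order. Real misspellings still need a human eye, which is part of why an interactive tool like OpenRefine suited the job.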


What I found using Tableau

Since all my data was in formats I could understand but which Tableau could not necessarily interpret, I converted some data sets into simpler, reformatted versions specifically for Tableau. With those versions, I could fairly easily create visualizations that demonstrated what I had hoped to demonstrate and answered some of my questions.
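One common kind of reformatting for Tableau is turning a wide table (one column per census year) into a long one (one row per region and year). I am offering that here as an illustration of the sort of conversion involved, not as a record of every change I made. The region names and counts are placeholders, not census figures.

```python
# Placeholder wide table: one column per census year.
wide_rows = [
    {"region": "Region A", "1861": 100, "1926": 60},
    {"region": "Region B", "1861": 80, "1926": 40},
]


def wide_to_long(rows, id_field, value_name):
    """Emit one row per (id, year) pair instead of one column per year."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the id column is carried onto every output row
            long_rows.append(
                {id_field: row[id_field], "year": int(column), value_name: value}
            )
    return long_rows


long_rows = wide_to_long(wide_rows, id_field="region", value_name="speakers")
for row in long_rows:
    print(row)
```

In the long layout, year becomes an ordinary field that can be dragged onto an axis, rather than a set of separate columns.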