Linear Frequency - Version.1
MAT 259, 2013
Saeed Mahani

Introduction
Question:
How has interest in items relating to coffee changed in comparison with tea between 2006 and 2011?

My goal was to retrieve every record relating to coffee or tea, so I thought the ‘title’ and ‘subj’ fields were approriate to search for the keywords tea and coffee. Since the SPL system became operational in 2005, I limited the date range to 2006 and onward to not have any incomplete years. I wanted to compare the popularity of coffee versus tea over time, so I summed up the number of records relating to each seperately, and displayed the result in two seperate columns, ordered by year.

Background
and Sketches

Query
SELECT year(cout) as 'Year', sum(case when (title like '%coffee%' or subj like '%coffee%') then 1 else 0 end) as 'Coffee Count', sum(case when (title like '% tea %' or title like '% tea' or title like 'tea %' or subj like '% tea %' or subj like '% tea' or subj like 'tea %') then 1 else 0 end) as 'Tea Count' FROM inraw WHERE (title like '%coffee%' or subj like '%coffee%' or title like '% tea %' or title like '% tea' or title like 'tea %' or subj like '% tea %' or subj like '% tea' or subj like 'tea %') and year(cout) >= 2006 GROUP BY year(cout) ORDER BY year(cout) asc;
Process
Processing time: 258.598 seconds duration, 0.0 seconds fetch time

Analysis
The results seen in the table below show coffee and tea at a similar level of popularity in 2006. In the following 3 years, both surged in popularity, but tea’s surge was greater, beating coffee by a significant margin every year (about 20%-50%). In 2011 the two returned back to a similar lower level.


Visualization Concept
I want to highlight the change in popularity between tea and coffee, so rather than show exact numbers I will show a normalized set of bars showing the proportion of coffee items vs tea items. Just to clarify: in 2006, coffee and tea iteams were equally popular in the library, so the meeting point for the 2006 bar is roughly in the middle (vertically). However, the following four years tea was more popular, and is shown below by more of the bar area given to the tea side. Each side of the bars will be appropriately colored; green for tea, brown for coffee. Time goes from left to right be general convention. I may have more granularity in the time axis if I find certain peaks and dips meaningful and worth showing (perhaps due to some world event relating to coffee/tea).


Code
I used Processing.

Run in Browser

Source Code






Linear Frequency - Version 2.
MAT 259, 2013
Saeed Mahani

Introduction
I was curious to see how the SPL reflected the shifts in programming language preferences in the past few years. I knew mobile programming was on the rise, as well as web, but didn't quite know how core languages like Java and C++ were doing. I picked the two most popular languages from each category to observe.

Query
select year(activity.o) as 'Year', month(activity.o) as 'Month', sum(case when (keyword.keyword = 'c') then 1 else 0 end) as 'C', sum(case when (keyword.keyword = 'java') then 1 else 0 end) as 'Java', sum(case when (keyword.keyword = 'javascript') then 1 else 0 end) as 'JavaScript', sum(case when (keyword.keyword = 'html') then 1 else 0 end) as 'HTML', sum(case when (keyword.keyword = 'android') then 1 else 0 end) as 'Android', sum(case when (keyword.keyword = 'ios') then 1 else 0 end) as 'iOS' from keyword, title, collection, item, subject, activity where keyword.bib = title.bib and item.bib = title.bib and item.item = collection.item and item.bib = subject.bib and activity.bib = item.bib and (subject.subject like '%programming%' or subject.subject like '%computer%' or subject.subject like '%software%' or subject.subject like '%development%') and (keyword.keyword = 'java' or keyword.keyword = 'javascript' or keyword.keyword = 'python' or keyword.keyword = 'android' or keyword.keyword = 'ios' or keyword.keyword = 'c' or keyword.keyword = 'html') and year(activity.o) >= 2006 and year(activity.o) <= 2012 group by year(activity.o), month(activity.o);
Query Explanation
The query searches for checkout events that contain programming language names in the subject or title and are categorized as a programming/software item. I only look in the years for which we have complete SPL data, and group the results by month. Though this query uses expensive SQL operators ('like', 'sum'), it completes in just a few seconds.

Results Analysis
I wanted to visualize the individual rise/fall in popularity of these popular programming languages while still showing the general trend in programming related checkouts at the library. A segmented bar graph was ideal for this. The number of languages I was examining was small enough to assign a unique color to each. It was then simply a matter of adjusting the saturation and brightness (and keeping it consistent for each color) to find colors that delimitated the languages enough but were also easy on the eyes. The results make clear a few interesting points in the data. For one, the mobile platform begins to appear in checkouts in early 2010 with the first Android books. Second, core languages seem to remain at a fairly constant level despite the increase in the other categories. Lastly, though web languages have been around for a couple decades, they are recently experiencing an upward surge, possibly due to the radical changes in website architecture visible in complex websites like Facebook.


Code
I used Processing.

Run in Browser

Source Code