Project Description

Visualizing Seattle Public Library's Book Adoption and Popularity

Thomas Hervey

Unique and real-time datasets often bring new perspectives on both external and internal phenomena. Our course's access to Seattle Public Library checkout data since 2006 is an opportunity to explore both library patrons' activity and the internal structure and happenings at one of the world's most progressive libraries. This project attempts to examine the intersection of these 'frontend' and 'backend' phenomena by exploring the relationships between a book's adoption, popularity and time.

Many would suggest that the growth of the World Wide Web has dramatically driven the user adoption of e-media and possibly even slowed the adoption of physical media, such as library book checkouts. But is this true? Have checkouts declined as the public's online use has grown? Is this the same for all genres? To evaluate this, I pose the following questions:

Does a title's age affect its checkout frequency (e.g., the same book in 2006 vs. 2015)?
Do checkout frequencies increase with new media (e.g., a 2006 release vs. a 2015 release)?
Has there been an increase in young adult interest (e.g., 'youth fantasy hype')?

If a library could see that a particular book or type of book was being adopted more slowly (adoption being the time between publication and first checkout), it could investigate ways to keep popularity high, such as more advertising or quicker internal purchasing. Ultimately, an internal look at its data could give the Seattle Public Library tools to keep patrons coming if e-media is indeed a problem.
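As a minimal sketch of the adoption measure described above (the time between publication and first checkout), the snippet below computes the lag in days and averages it per group. The function names, the record layout and the sample rows are all hypothetical and made up for illustration; they are not the project's actual data or code.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def adoption_lag_days(published, first_checkout):
    """Days between publication and first checkout (the 'adoption' lag)."""
    return (first_checkout - published).days


def mean_lag_by_group(records):
    """Average adoption lag per group.

    records: iterable of (group, published, first_checkout) tuples,
    where group could be a genre or item type (hypothetical layout).
    """
    lags = defaultdict(list)
    for group, published, first_checkout in records:
        lags[group].append(adoption_lag_days(published, first_checkout))
    return {group: mean(values) for group, values in lags.items()}


# Made-up sample rows, for illustration only.
sample = [
    ("fiction", date(2010, 1, 1), date(2010, 1, 31)),
    ("fiction", date(2012, 6, 1), date(2012, 6, 11)),
    ("nonfiction", date(2011, 3, 1), date(2011, 5, 30)),
]
print(mean_lag_by_group(sample))
```

A group with a noticeably longer average lag would be a candidate for the earlier purchasing or extra advertising mentioned above.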

After a series of reevaluations and revised questions, I was able to write my SQL queries. To gather pertinent publication information, I queried a table generated from a web scrape and then built an Amazon API call from the ISBNs. The following links are to the SQL queries and PHP program that I used to generate my final dataset before visualization.
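The kind of aggregate such a query might produce can be sketched as follows. This is only an illustration under stated assumptions: the `checkouts` table, its two columns and the sample rows are hypothetical, and an in-memory SQLite database stands in for the course database, whose real schema is not shown here.

```python
import sqlite3


def yearly_checkouts(conn):
    """Count checkouts per title per year (hypothetical schema)."""
    query = """
        SELECT title,
               strftime('%Y', checkout_date) AS year,
               COUNT(*) AS n
        FROM checkouts
        GROUP BY title, year
        ORDER BY title, year
    """
    return conn.execute(query).fetchall()


# In-memory stand-in database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkouts (title TEXT, checkout_date TEXT)")
conn.executemany(
    "INSERT INTO checkouts VALUES (?, ?)",
    [
        ("Example Title", "2006-03-14"),
        ("Example Title", "2006-11-02"),
        ("Example Title", "2015-07-21"),
    ],
)
rows = yearly_checkouts(conn)
print(rows)
```

Comparing the per-year counts for one title is one way to approach the first question (the same book in 2006 versus 2015).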

To investigate my data, I began with several sketches exploring the trends that I expected and how to pair those trends with a strong visualization. The visualization originated as an interactive scatter plot in which a user can explore the data on their own while making minute changes to the axis normalizations and color scheme.
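One plausible reading of "axis normalization" is rescaling each axis to the range 0 to 1 before mapping values to pixels, either linearly or on a log scale so that a few heavily checked-out titles do not flatten the rest. The sketch below shows both; the function names and sample values are assumptions for illustration, not the normalizations actually used in the project.

```python
import math


def normalize_linear(values):
    """Min-max rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: place every point mid-axis.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalize_log(values):
    """Rescale log(1 + v) to [0, 1]; compresses large counts."""
    return normalize_linear([math.log1p(v) for v in values])


# Made-up checkout counts, for illustration only.
counts = [0, 9, 99]
print(normalize_linear(counts))
print(normalize_log(counts))
```

Under the log version the middle value lands near the center of the axis, whereas under the linear version it sits close to the low end.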

Initially, my data didn't look like what I had expected. Having thought that the publication date would have a larger influence on total checkout count, I determined that this needed to be investigated further, along with the relationship to adoption. This work in progress had limited interaction but allowed hovering, which reveals further information about a data point (such as title and important dates). The result is 1,607 records for which I was able to obtain full details, including the much-needed publication date. All of these records were books published in 2006 or later that had at least some checkouts over the 10-year span.

The final result uses stronger colors and more meaningful interactions. In addition to more display features for highlighted elements, a user can toggle between axis representations and watch each data point transition to its new position. This result led me to once again adjust my questions and find further ways to explore the internal structure of the Seattle Public Library's data. To see the next steps, including a 3D visualization that leverages statistics, please see my second project and third project.
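A transition between two axis representations can be sketched with linear interpolation, the same idea as Processing's built-in lerp(). How the actual sketch animates its points is not shown here, so the helper names and coordinates below are hypothetical.

```python
def lerp(start, stop, t):
    """Linear interpolation: t=0 gives start, t=1 gives stop."""
    return start + (stop - start) * t


def transition_frames(start_xy, end_xy, steps):
    """Positions of one point as it moves between two axis layouts."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    frames = []
    for i in range(steps + 1):
        t = i / steps
        frames.append(
            (lerp(start_xy[0], end_xy[0], t), lerp(start_xy[1], end_xy[1], t))
        )
    return frames


# A point moving from its position under one axis mapping to another.
frames = transition_frames((0.0, 0.0), (10.0, 20.0), 2)
print(frames)
```

Drawing one frame per animation step lets the viewer follow an individual title from one representation to the other instead of seeing the plot jump.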

All work was developed in Processing (in addition to the external API pull).
Source Code + Data