Does Title Length Differ?
MAT 259, 2016
Jingyi Xiao

Concept
Introduction

Generally speaking, a song has a shorter title than a book, and a book has a shorter title than a research paper.

Question

Does title length differ across Dewey classes?

Idea

I queried the average title length of items in the Seattle Public Library data and grouped the results by Dewey class, division and section. I wanted to see whether anything interesting shows up, for example whether some Dewey classes tend to have longer titles than others.


Query
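Query for different Dewey classes (hundreds: 100, 200, ..., 900).
SELECT
    a.dewey,
    -- Rows whose itemtype is not a book count as 0 in this average.
    AVG(CASE
        WHEN a.itemtype LIKE '%bk%' THEN a.length
        ELSE 0
    END) AS book
FROM
    (SELECT DISTINCT
        bibNumber,
        itemtype,
        FLOOR(deweyClass / 100) * 100 AS dewey,  -- round down to the class
        LENGTH(title) AS length
    FROM
        spl_2016.inraw
    WHERE
        YEAR(cout) > 2005 AND YEAR(cout) < 2017
        -- skip rows without a Dewey number
        AND FLOOR(deweyClass / 100) * 100 != '') AS a
GROUP BY a.dewey;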

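Query for different Dewey divisions (tens: 010, 020, ..., 990).
SELECT
    a.dewey,
    -- Rows whose itemtype is not a book count as 0 in this average.
    AVG(CASE
        WHEN a.itemtype LIKE '%bk%' THEN a.length
        ELSE 0
    END) AS book
FROM
    (SELECT DISTINCT
        bibNumber,
        itemtype,
        FLOOR(deweyClass / 10) * 10 AS dewey,  -- round down to the division
        LENGTH(title) AS length
    FROM
        spl_2016.inraw
    WHERE
        YEAR(cout) > 2005 AND YEAR(cout) < 2017
        -- skip rows without a Dewey number
        AND FLOOR(deweyClass / 10) * 10 != '') AS a
GROUP BY a.dewey;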

Query for different Dewey sections (units: 001, 002, ..., 999).
SELECT
    a.dewey,
    -- Rows whose itemtype is not a book count as 0 in this average.
    AVG(CASE
        WHEN a.itemtype LIKE '%bk%' THEN a.length
        ELSE 0
    END) AS book
FROM
    (SELECT DISTINCT
        bibNumber,
        itemtype,
        FLOOR(deweyClass) AS dewey,  -- drop the decimal part to get the section
        LENGTH(title) AS length
    FROM
        spl_2016.inraw
    WHERE
        YEAR(cout) > 2005 AND YEAR(cout) < 2017
        -- skip rows without a Dewey number
        AND FLOOR(deweyClass) != '') AS a
GROUP BY a.dewey;

Preliminary sketches

The visualization is inspired by the sunburst chart, a hierarchical chart that shows the proportion of each value at every level of a hierarchy. That makes it a good fit for showing the hierarchy of Dewey classes.




Process
Data Structure

Because a sunburst chart is hierarchical, the data in the program should be hierarchical as well. I used a tree data structure to organize the data in Processing (more details in the code).
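
As a rough illustration of the tree idea, here is a minimal sketch in plain Java (which Processing is based on). It is not the project's actual code: the class name `DeweyNode`, its fields and its methods are all hypothetical, chosen only to show how classes, divisions and sections could nest.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type: one node per Dewey class, division or section.
class DeweyNode {
    final int code;               // e.g. 700 (class), 780 (division), 782 (section)
    final double avgTitleLength;  // value returned by the query for this code
    final List<DeweyNode> children = new ArrayList<>();

    DeweyNode(int code, double avgTitleLength) {
        this.code = code;
        this.avgTitleLength = avgTitleLength;
    }

    // Attach a child and return it, so deeper levels can be chained.
    DeweyNode addChild(DeweyNode child) {
        children.add(child);
        return child;
    }

    // Number of levels from this node down to its deepest leaf (one ring per level).
    int height() {
        int deepest = 0;
        for (DeweyNode child : children) {
            deepest = Math.max(deepest, child.height());
        }
        return deepest + 1;
    }

    // Depth-first search for a Dewey code; returns null when it is absent.
    DeweyNode find(int target) {
        if (code == target) return this;
        for (DeweyNode child : children) {
            DeweyNode hit = child.find(target);
            if (hit != null) return hit;
        }
        return null;
    }
}
```

With a structure like this, a synthetic root holds the ten classes, each class holds its divisions, and each division holds its sections, so drawing the chart becomes a recursive walk with one ring per tree level.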

Interaction

Details are shown when the mouse is over the corresponding sector. This part is a little complicated because the distance and angle of the mouse position from the center of the circle need to be calculated and compared against each sector to find which one is selected.
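
The distance-and-angle check described above could look roughly like the following sketch. The class `SectorHitTest` and its parameter names are assumptions made for illustration, not the project's code. It assumes each sector is described by an inner radius, an outer radius and a start/end angle in radians with 0 <= start < end <= 2π, measured the way Processing's screen coordinates work (y grows downward, so angles increase clockwise on screen).

```java
class SectorHitTest {
    // Angle of the vector (dx, dy), normalized to the range [0, 2*PI).
    static double angleOf(double dx, double dy) {
        double a = Math.atan2(dy, dx);  // atan2 returns a value in (-PI, PI]
        return a < 0 ? a + 2 * Math.PI : a;
    }

    // True when the point (mx, my) lies inside the ring sector centered at (cx, cy).
    static boolean contains(double mx, double my, double cx, double cy,
                            double rInner, double rOuter,
                            double startAngle, double endAngle) {
        double dx = mx - cx;
        double dy = my - cy;
        double dist = Math.sqrt(dx * dx + dy * dy);
        // First compare the distance with the ring's inner and outer radius.
        if (dist < rInner || dist >= rOuter) return false;
        // Then compare the angle with the sector's angular range.
        double a = angleOf(dx, dy);
        return a >= startAngle && a < endAngle;
    }
}
```

In a Processing sketch, a test like this would be called with `mouseX` and `mouseY` for every drawn sector, and the first sector that returns true would be the one whose details are displayed.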

Visualization and layout

At first, I used the same length for every sector on the same level and used only the angle of the sector to show the data. Then I found that people are not very sensitive to angles (40 degrees versus 50 degrees does not look like a big difference) but are more sensitive to length. So I later changed the length of each sector as well to show the data.
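
One plausible way to encode the same value in both the angle and the length of a sector is sketched below. The class `SectorGeometry`, the linear scaling and all parameter names are assumptions for illustration; the actual sketch may scale the values differently.

```java
class SectorGeometry {
    // Split the parent's angular range among siblings in proportion to their values.
    // Returns the start angle of each child followed by the final end angle,
    // so the result has values.length + 1 entries.
    static double[] angleBoundaries(double[] values, double parentStart, double parentEnd) {
        double total = 0;
        for (double v : values) total += v;
        double[] bounds = new double[values.length + 1];
        bounds[0] = parentStart;
        for (int i = 0; i < values.length; i++) {
            // Fall back to equal shares when every value is zero.
            double share = total == 0 ? 1.0 / values.length : values[i] / total;
            bounds[i + 1] = bounds[i] + share * (parentEnd - parentStart);
        }
        return bounds;
    }

    // Outer radius of a sector: the ring's inner radius plus a thickness that
    // grows linearly with the value, reaching maxThickness at maxValue.
    static double outerRadius(double value, double maxValue,
                              double innerRadius, double maxThickness) {
        return innerRadius + maxThickness * (value / maxValue);
    }
}
```

Under these assumptions a sector with half the maximum value gets half the maximum thickness, which is easier to compare by eye than a difference of a few degrees in angle.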

I also changed the layout at the end (compare with the final result). I made the legend smaller to avoid distracting the audience, and I made the chart bigger and more attractive. After trying several fonts, I finally decided to go back to the default font.





(I didn't save screenshots of the process, so I used photos instead.)


Final result
Overview


Dewey class 800


Dewey class 600 (with mouse over interaction)



Evaluation/Analysis

Titles in Dewey classes 400 (Language), 500 (Science), 700 (Art & Recreation) and 800 (Literature) are relatively shorter than those in the other classes. Looking deeper into each Dewey class, the divisions within some classes have similar title lengths, while in other classes they differ. For example, titles in 780 (Music) are much shorter than the others, while titles in 800 (Literature, rhetoric) and 910 (Geography & Travel) are relatively longer.

Overall, books related to language, art and literature tend to have shorter titles. I think this may be because these subjects are naturally more concise or need to be concise.


Code
Built with Processing 3.0.1
Source Code + Data