Dewey Structure Exploration
MAT 259, 2022
Siming, Su

I am interested in the usage time of each Dewey class in the library. By analyzing the estimated time people spend on each Dewey class, measured as the number of days between checkout and check-in, we can get a sense of which categories are easier to understand, which take a long time to read, and so on. Based on this information, the library could set different return deadlines for each Dewey class. For example, Arts and Recreation items are usually borrowed for less time than others, so the library could set a shorter return deadline for this category and give other patrons more chances to borrow it. To dig deeper, I would like to visualize how estimated usage time (in days) is distributed across the Dewey classes.

SELECT FLOOR(deweyClass/100) AS class,
AVG(DATEDIFF(cin, cout)) AS usageTime
FROM spl_2016.inraw
WHERE deweyClass != ""
GROUP BY FLOOR(deweyClass/100)

SELECT YEAR(cout) as eachYear,
AVG(TIMESTAMPDIFF(day, cout, cin)) as usageTime,
FLOOR(deweyClass/100) as class
FROM spl_2016.inraw
WHERE YEAR(cout) >= 2010 AND YEAR(cout) <= 2020 AND deweyClass != ""
GROUP BY YEAR(cout), FLOOR(deweyClass/100)
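
Both queries reduce deweyClass to its hundreds digit, so the class column holds an integer from 0 to 9. For labeling a chart, it can help to map that integer back to the name of the Dewey main class. The following is a minimal Python sketch of such a lookup; the names DEWEY_MAIN_CLASSES and dewey_label are hypothetical helpers introduced here for illustration and are not part of the original project code.

```python
# Standard top-level Dewey Decimal classes, keyed by FLOOR(deweyClass / 100).
DEWEY_MAIN_CLASSES = {
    0: "Computer science, information & general works",
    1: "Philosophy & psychology",
    2: "Religion",
    3: "Social sciences",
    4: "Language",
    5: "Science",
    6: "Technology",
    7: "Arts & recreation",
    8: "Literature",
    9: "History & geography",
}


def dewey_label(dewey_class):
    """Return the main-class name for a raw Dewey number such as 741.5.

    Hypothetical helper: mirrors FLOOR(deweyClass / 100) from the queries.
    """
    main = int(float(dewey_class) // 100)
    if main not in DEWEY_MAIN_CLASSES:
        raise ValueError(f"not a valid Dewey number: {dewey_class!r}")
    return DEWEY_MAIN_CLASSES[main]
```

For instance, dewey_label(741.5) falls in the 700s and returns "Arts & recreation", the category discussed in the introduction.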

Data Visualization

Query Result

Data Analysis and Conclusion
Notice that in 2020, the year Covid began, the gap between checking out and checking in is much larger. This is probably because electronic copies have much longer return deadlines than hard copies. Furthermore, the structure of estimated time spent in each Dewey class is almost the same across years. That is, even though the total time spent differs from year to year, the ratio of time spent on each Dewey class remains stable.
From the visualization above, we know that the ratio of time spent in each Dewey class is stable, so it is probably a good idea to shorten the borrowing deadline for arts and recreation books in order to increase the flow of borrowing and returning in the library.
It remains an open question why this ratio structure appears. My theory is that language books take longer to learn from, whereas people tend to read art books faster to get the central idea. However, more investigation is needed.
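
One way to check the stable-ratio observation numerically, rather than by eye, is to convert each year's average usage times into shares that sum to 1 and then compare those shares across years. The sketch below assumes the rows arrive as (eachYear, usageTime, class) tuples, in the column order of the second query. The function names class_shares and max_share_drift are hypothetical, and the sample rows are placeholder values chosen only to exercise the code, not results from the Seattle Public Library data.

```python
from collections import defaultdict


def class_shares(rows):
    """Turn (year, usage_time, dewey_class) rows into {year: {class: share}}.

    Within each year the shares sum to 1, so years with very different
    totals (such as 2020) can be compared on the same scale.
    """
    by_year = defaultdict(dict)
    for year, usage_time, dewey_class in rows:
        by_year[int(year)][int(dewey_class)] = float(usage_time)

    shares = {}
    for year, per_class in by_year.items():
        total = sum(per_class.values())
        if total > 0:  # skip years with no usable durations
            shares[year] = {c: t / total for c, t in per_class.items()}
    return shares


def max_share_drift(shares):
    """Largest gap between any class's highest and lowest yearly share."""
    classes = {c for per_class in shares.values() for c in per_class}
    drift = 0.0
    for c in classes:
        values = [per_class[c] for per_class in shares.values() if c in per_class]
        drift = max(drift, max(values) - min(values))
    return drift


# Placeholder rows (NOT real query output): totals double, ratios stay equal.
sample_rows = [
    (2019, 10.0, 4), (2019, 5.0, 7),
    (2020, 20.0, 4), (2020, 10.0, 7),
]
sample_shares = class_shares(sample_rows)
sample_drift = max_share_drift(sample_shares)
```

A drift close to 0 would support the claim that the ratio structure is stable across years, while a large drift for a particular class would point to where the structure changes. Note that this compares shares of average durations per class, which is only one possible reading of "ratio" here.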

All work was developed using Seattle Public Library data, Python, and SQL.