Flocking Simulation Based on Checkout Co-occurrence
MAT 259, 2023
Lu Yang

Concept
I want to create a dynamic, self-organized flocking simulation based on books being checked out at the same time. I expect it would be interesting to see that:
1. Books whose titles contain a specific keyword may span different Dewey classes and different subjects
2. Books checked out together with these books may span a broader range of Dewey classes and subjects
3. These books may aggregate at a different Dewey class than their designated ones

Query
With the above assumptions, I queried book titles containing the keyword “architecture”, chosen because it carries different meanings in different disciplines. For each such title, I also queried the books that were checked out and checked in at the same times as it, to approximate related books that do not have the keyword “architecture” in their titles.

SELECT
    t1.bibNumber AS A_bib,
    t1.title AS A_title,
    FLOOR(t1.deweyClass) AS A_dewey,
    t3.subject AS A_subject,

    t2.bibNumber AS B_bib,
    t2.title AS B_title,
    FLOOR(t2.deweyClass) AS B_dewey,
    t4.subject AS B_subject
FROM
    -- t3: subjects of book A, concatenated per bibNumber
    (SELECT
        bibNumber,
        GROUP_CONCAT(subject SEPARATOR ';') AS subject
    FROM
        spl_2016.subject
    WHERE
        -- Filter out non-English subjects.
        -- The anchors ^ and $ ensure that the entire string is matched, not just part of it.
        -- The character class [0-9a-zA-Z .] matches a single digit, upper/lower case letter, space, or period.
        -- The quantifier + allows one or more repetitions of that class,
        -- so a subject passes only if it consists entirely of those characters.
        subject REGEXP '^[0-9a-zA-Z .]+$'
    GROUP BY bibNumber) AS t3,

    -- t4: subjects of book B, built with the same filter as t3
    (SELECT
        bibNumber,
        GROUP_CONCAT(subject SEPARATOR ';') AS subject
    FROM
        spl_2016.subject
    WHERE
        subject REGEXP '^[0-9a-zA-Z .]+$'
    GROUP BY bibNumber) AS t4,

    spl_2016.inraw t1
    INNER JOIN spl_2016.inraw t2
        ON t1.cout = t2.cout AND t1.cin = t2.cin
        AND t1.title LIKE '%architecture%'
        AND t1.bibNumber != t2.bibNumber
        AND t1.deweyClass != ''
        AND t2.deweyClass != ''
        -- AND YEAR(t1.cout) > 2017
WHERE
    t1.bibNumber = t3.bibNumber
    AND t2.bibNumber = t4.bibNumber
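
The subject filter used in both subqueries can be tried outside the database. The following is a minimal Java sketch, not part of the original project: the class name SubjectFilter, the method keep, and the sample strings are hypothetical. It applies the same pattern to a few made-up subjects. Note that the pattern also rejects English subjects that contain punctuation such as commas or hyphens, so the filter is stricter than "English only".

```java
import java.util.regex.Pattern;

public class SubjectFilter {
    // Same pattern as the REGEXP in the query: digits, letters, spaces, periods only.
    static final Pattern ALLOWED = Pattern.compile("^[0-9a-zA-Z .]+$");

    // Returns true when the whole subject string consists of allowed characters.
    static boolean keep(String subject) {
        return ALLOWED.matcher(subject).matches();
    }

    public static void main(String[] args) {
        // Hypothetical sample subjects, not taken from the dataset.
        String[] samples = {
            "Architecture",
            "Architecture Domestic",
            "Architecture, Modern",
            "Arquitectura española"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + keep(s));
        }
    }
}
```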



From the queried data, I ran another analysis that records, for each subject that appears in the dataset:
1. Its own Dewey classes
2. Subjects that co-occur with it within the same book title
3. Frequencies of those co-occurrences
4. Dewey classes of the books checked out together with it
5. Subjects of those co-checked-out books
6. Frequencies of those co-occurrences
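
The six records listed above can be sketched in plain Java as follows. This is an illustration under assumptions, not the project's actual analysis code: the class names SubjectCooccurrence, Row, and Stats, the method analyze, and the sample rows are all hypothetical. The sketch only walks the A side of each query row, and it counts same-title pairs once per row, so a book that appears in several rows contributes several times.

```java
import java.util.*;

public class SubjectCooccurrence {

    /** One row of the query result, reduced to the columns used here. */
    static class Row {
        final int aDewey; final String aSubjects;
        final int bDewey; final String bSubjects;
        Row(int aDewey, String aSubjects, int bDewey, String bSubjects) {
            this.aDewey = aDewey; this.aSubjects = aSubjects;
            this.bDewey = bDewey; this.bSubjects = bSubjects;
        }
    }

    /** Everything recorded for a single subject (items 1 to 6 of the list). */
    static class Stats {
        final Set<Integer> ownDewey = new TreeSet<>();          // item 1
        final Map<String, Integer> sameTitle = new TreeMap<>(); // items 2 and 3
        final Set<Integer> coDewey = new TreeSet<>();           // item 4
        final Map<String, Integer> coBook = new TreeMap<>();    // items 5 and 6
    }

    static Map<String, Stats> analyze(List<Row> rows) {
        Map<String, Stats> out = new TreeMap<>();
        for (Row r : rows) {
            // Subjects were joined with ';' by GROUP_CONCAT in the query.
            String[] a = r.aSubjects.split(";");
            String[] b = r.bSubjects.split(";");
            for (String s : a) {
                Stats st = out.computeIfAbsent(s, k -> new Stats());
                st.ownDewey.add(r.aDewey);
                for (String other : a) {
                    if (!other.equals(s)) st.sameTitle.merge(other, 1, Integer::sum);
                }
                st.coDewey.add(r.bDewey);
                for (String other : b) st.coBook.merge(other, 1, Integer::sum);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical rows, not taken from the dataset.
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(720, "Architecture;Design", 690, "Building;Design"));
        rows.add(new Row(4, "Computer architecture", 5, "Computer programming"));
        for (Map.Entry<String, Stats> e : analyze(rows).entrySet()) {
            Stats st = e.getValue();
            System.out.println(e.getKey() + " own=" + st.ownDewey
                + " sameTitle=" + st.sameTitle
                + " coDewey=" + st.coDewey + " coBook=" + st.coBook);
        }
    }
}
```

Sorted maps and sets are used only so that the printed output is stable between runs.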




Final result
The flocking simulation is not intended to deliver a result at the outset. Instead, results emerge as the viewer adjusts the provided parameters and observes the self-organized forms that develop. Different parameter setups can lead to different results.
The flocking system contains two components:
Matrix environment: a set of static points, each representing a Dewey class.
    The first three digits of a Dewey class (abc.def) give the point's position: a -> point.x, b -> point.y, c -> point.z.
    When one of these points falls within a swarm agent's search distance, it applies a strong attraction force to that agent if the two are related through co-occurrence.
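
The mapping and the attraction rule described above can be sketched as follows. This is a minimal plain-Java illustration under assumptions, not the project's Processing code: the class name DeweyMatrix, the methods deweyToPoint and attraction, and the strength parameter are hypothetical, and the related flag stands in for whatever co-occurrence test the simulation uses. The force here is a unit vector toward the point scaled by a constant strength, which is one simple choice among many. In a Processing sketch the float arrays would more naturally be PVector objects.

```java
public class DeweyMatrix {

    /** Maps the integer part of a Dewey class (0 to 999) to grid coordinates {a, b, c}. */
    static int[] deweyToPoint(int dewey) {
        if (dewey < 0 || dewey > 999) {
            throw new IllegalArgumentException("Dewey class out of range: " + dewey);
        }
        // abc -> x = a (hundreds), y = b (tens), z = c (ones)
        return new int[] { dewey / 100, (dewey / 10) % 10, dewey % 10 };
    }

    /**
     * Force applied to an agent by a static point.
     * Returns a zero vector when the point is unrelated, outside the
     * search distance, or exactly at the agent's position.
     */
    static float[] attraction(float[] agent, float[] point,
                              float searchDistance, float strength, boolean related) {
        float dx = point[0] - agent[0];
        float dy = point[1] - agent[1];
        float dz = point[2] - agent[2];
        float d = (float) Math.sqrt(dx * dx + dy * dy + dz * dz);
        if (!related || d > searchDistance || d == 0f) {
            return new float[] { 0f, 0f, 0f };
        }
        float s = strength / d; // normalize the direction, then scale by strength
        return new float[] { dx * s, dy * s, dz * s };
    }

    public static void main(String[] args) {
        int[] p = deweyToPoint(720);
        System.out.println(p[0] + ", " + p[1] + ", " + p[2]);

        float[] agent = { 7f, 0f, 0f };
        float[] point = { p[0], p[1], p[2] };
        float[] f = attraction(agent, point, 5f, 1f, true);
        System.out.println(f[0] + ", " + f[1] + ", " + f[2]);
    }
}
```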




Code
All work was developed in Processing.
Source Code + Data