Network science · Data · 2021
Co-Authorship Network
Mining DBLP to ask whether collaboration patterns explain a department's rising reputation.
The question
For a network-science course at NTU, I built a co-authorship graph from DBLP (nodes are authors; an edge means two authors have co-authored at least one paper) to ask:
Can network science help explain research collaboration among a department's faculty over time, and even account for its reputation growth?
What it involved
The interesting work was upstream of the algorithms: turning messy, real bibliographic data into a clean graph — name disambiguation, time-slicing the collaborations, deciding what counts as an edge. With the graph built, I used centrality measures to find the connectors holding the network together and community detection to surface research clusters, then tracked how both evolved across years.
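To make the time-slicing and centrality steps concrete, here is a minimal sketch in plain Python. It is not the repo's code: the `PAPERS` records, the author names, and the helpers `build_graph`, `degree_centrality`, and `top_connector` are all hypothetical. It assumes names are already disambiguated, treats an edge as "at least one shared paper up to a given year", and uses degree centrality as the simplest stand-in for the centrality measures; community detection is left out.

```python
from itertools import combinations

# Hypothetical, already-disambiguated records: (publication year, author list).
PAPERS = [
    (2015, ["Ana", "Ben"]),
    (2016, ["Ben", "Chen"]),
    (2016, ["Ana", "Ben", "Chen"]),
    (2018, ["Chen", "Dev"]),
    (2018, ["Eve"]),  # solo paper: a node with no edges
]


def build_graph(papers, up_to_year):
    """Cumulative co-authorship graph as adjacency sets.

    Assumption: an edge exists if two authors share at least one paper
    published in or before `up_to_year`.
    """
    adj = {}
    for year, authors in papers:
        if year > up_to_year:
            continue
        unique = sorted(set(authors))
        for author in unique:
            adj.setdefault(author, set())  # keep solo authors as isolated nodes
        for a, b in combinations(unique, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each author is directly connected to."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}


def top_connector(adj):
    """Author with the highest degree centrality; ties broken alphabetically."""
    scores = degree_centrality(adj)
    return max(sorted(scores), key=scores.get)


if __name__ == "__main__":
    # One graph per year slice, so the ranking can be compared over time.
    for year in (2015, 2016, 2018):
        graph = build_graph(PAPERS, year)
        print(year, len(graph), "authors, top connector:", top_connector(graph))
```

Building one cumulative graph per year is only one possible slicing choice; sliding windows (for example, papers from the last three years) would let old collaborations fade out instead of accumulating.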
Why it's my most-starred repo
It's quietly my most-starred project (22★), partly because it's useful as a worked example of applied network analysis on real, messy data rather than a toy graph. It's also a reminder that the work people find valuable isn't always the flashiest model; often it's a clearly explained, reproducible analysis of a real dataset.
What I took away
- The hard, valuable part of graph analysis is constructing an honest graph from messy real data.
- Centrality + community detection over time turned a static graph into a story about a department.
- My most-starred repo is an analysis, not a model — clarity and reproducibility get noticed.