Category Archives: Publication

New NERDS paper on highway barriers to social ties

This week we are on fire and have yet another big paper out, long time in the making, led by Luca Maria Aiello with multi-NERDS involvement, just published in PNAS: Urban highways are barriers to social ties, by L.M. Aiello, A. Vybornova, S. Juhász, M. Szell, and E. Bokányi.

Highways are physical barriers that cut opportunities for social connections, but the magnitude of this effect has not been quantified. Such quantitative evidence would enable policy-makers to prioritize interventions that reconnect urban communities—an urgent need in many US cities. We relate urban highways in the 50 largest US cities with massive, geolocated online social network data to quantify the decrease in social connectivity associated with highways. We find that this barrier effect is strong in all 50 cities, and particularly prominent over shorter distances. We also confirm this effect for highways that are historically associated with racial segregation. Our research demonstrates with high granularity the long-lasting impact of decades-old infrastructure on society and provides tools for evidence-based remedies.

New NERDS paper on academic mentorship

We have a big paper out today, long time in the making, led by Yanmeng Xing, our long-term PhD student visitor in 2021-2023, co-authored by Roberta Sinatra, just published in Nature Human Behaviour: Academic mentees thrive in big groups, but survive in small groups, by Y. Xing, Y. Ma, Y. Fan, R. Sinatra, and A. Zeng.

The main message of the paper is intriguing: If you “grow up” in a big research group, and if you survive, you will have high success. At the same time, in a big group it is also harder to survive, especially if your mentor is very productive. So what then is good mentorship, and what is a good group to be in?

Interestingly, at NERDS we are a fairly big research group, but with several mentors who are by themselves not too busy, so we combine the best of both worlds 😁 In fact, this paper itself is another one in a long series of success stories where a visitor accomplished something great while staying at our ✨🦄 ~enchanted NERDS grounds~ 🧚‍♂️✨ in Copenhagen. (“NERDS is one of the best places I have ever stayed.”)

Mentoring is a key component of scientific achievements, contributing to overall measures of career success for mentees and mentors. Within the scientific community, possessing a large research group is often perceived as an indicator of exceptional mentorship and high-quality research. However, such large, competitive groups may also escalate dropout rates, particularly among early-career researchers. Overly high dropout rates of young researchers may lead to severe postdoc shortage and loss of top-tier academics in contemporary academia. In this context, we collect longitudinal genealogical data on mentor-mentee relations and their publication, and analyze the influence of a mentor’s group size on the future academic longevity and performance of their mentees. Our findings indicate that mentees trained in larger groups tend to exhibit superior academic performance compared to those from smaller groups, provided they remain in academia post-graduation. However, we also observe two surprising patterns: Academic survival rate is significantly lower for (1) mentees from larger groups, and for (2) mentees with more productive mentors. The trend is verified in institutions of different prestige. These findings highlight a negative correlation between a mentor’s success and the academic survival rate of their mentees, prompting a rethinking of effective mentorship and offering actionable insights for career advancement.

The Atlas for the Aspiring Network Scientist v2

NERDS member Michele Coscia has updated his textbook for the Network Analysis and Advanced Network Science classes he teaches at ITU. The “Atlas for the Aspiring Network Scientist” has now reached version 2.0 and 916 pages, and is available for anyone to read for free: https://arxiv.org/abs/2101.00863

Website: https://www.networkatlas.eu/

The new edition has much improved coverage of graph neural networks, network data uncertainty, and background knowledge in statistics, machine learning, probability theory, and linear algebra.

Even version 2.0 leaves plenty of room for improvement. Please contact Michele with any comments.

Find a more detailed explanation of The Atlas for the Aspiring Network Scientist on Michele’s page: https://www.michelecoscia.com/?p=2393.

Two new NERDS papers published: GameStop, Copenhagen bike lanes

We have two new publications out, one on the GameStop short squeeze by Reddit users, and one on bicycle network design with Copenhagen as a use case:

The dynamics of the Reddit collective action leading to the GameStop short squeeze, by A. Desiderio, L.M. Aiello, G. Cimini & L. Alessandretti, published in npj Complexity

In early 2021, the stock prices of GameStop, AMC, Nokia, and BlackBerry experienced dramatic increases, triggered by short-squeeze operations that have been largely attributed to Reddit’s retail investors. Here we shed light on the extent and timing of Reddit users’ influence on the GameStop short squeeze. Using statistical analysis tools with high temporal resolution, we find that increasing Reddit discussions anticipated high trading volumes. This effect emerged abruptly a few weeks before the event but waned once the community gained widespread visibility through Twitter. Meanwhile, the collective investment of the community, quantified through posts of individual positions, closely mirrored the market capitalization of the stock. This evidence suggests a coordinated action of users in developing a shared financial strategy through social media, targeting GameStop first and other stocks afterward. Overall, our results provide novel insights into the role of Reddit users in the dynamics of the GameStop short squeeze.

Cohesive urban bicycle infrastructure design through optimal transport routing in multilayer networks, by A. Lonardi, M. Szell and C. De Bacco, published in Journal of the Royal Society Interface

Bicycle infrastructure networks must meet the needs of cyclists to position cycling as a viable transportation choice in cities. In particular, protected infrastructure should be planned cohesively for the whole city and spacious enough to accommodate all cyclists safely and prevent cyclist congestion—a common problem in cycling cities like Copenhagen. Here, we devise an adaptive method for optimal bicycle network design and for evaluating congestion criticalities on bicycle paths. The method goes beyond static network measures, using computationally efficient adaptation rules inspired by optimal transport on the dynamically updating multilayer network of roads and protected bicycle lanes. Street capacities and cyclist flows reciprocally control each other to optimally accommodate cyclists on streets with one control parameter that dictates the preference of bicycle infrastructure over roads. Applying our method to Copenhagen confirms that the city’s bicycle network is generally well-developed. However, we are able to identify the network’s bottlenecks, and we find, at a finer scale, disparities in network accessibility and criticalities between different neighbourhoods. Our model and results are generalizable beyond this particular case study to serve as a scalable and versatile tool for aiding urban planners in designing cycling-friendly cities.

Five new NERDS winter papers published!

We have been very productive over the winter! Five new NERDS publications were released in December and this January, on topics as diverse as archaeological networks, dynamic networks, spatial data science, climate change debates, and LLM-generated data:

“A Network of Mutualities of Being”: Socio-material Archaeological Networks and Biological Ties at Çatalhöyük, by C. Mazzucato, M. Coscia, A. Küçükakdağ Doğu, S. Haddow, M. Sıddık Kılıç, E. Yüncü & M. Somel, published in Journal of Archaeological Method and Theory

In this paper, we propose a Network Science framework to integrate archaeogenomic data and material culture at an intra-site scale to study biological relatedness and social organization at the Neolithic site of Çatalhöyük. Methodologically, we propose the use of network variance to investigate the association between biological relatedness and material culture within networks of houses. This approach allows us to observe how material culture similarity between buildings is associated with biological relationships between individuals and how biogenetic ties concentrate at specific localities on site.

Graph Evolution Rules Meet Communities: Assessing Global and Local Patterns in the Evolution of Dynamic Networks, by A. Galdeman, M. Zignani & S. Gaito, published in Big Data Mining and Analytics

In this paper, we comprehensively explore Graph Evolution Rules (GERs) in dynamic networks from diverse systems with a focus on the rules characterizing the formation and evolution of their modular structures, using EvoMine for GER extraction and the Leiden algorithm for community detection. We characterize network and module evolution through GER profiles, enabling cross-system comparisons. By combining GERs and network communities, we decompose network evolution into regions to uncover insights into global and mesoscopic network evolution patterns. From a mesoscopic standpoint, the evolution patterns characterizing communities emphasize a non-homogeneous nature, with each community, or groups of them, displaying specific evolution patterns, while other networks’ communities follow more uniform evolution patterns. Additionally, closely interconnected sets of communities tend to evolve similarly. Our findings offer valuable insights into the intricate mechanisms governing the growth and development of dynamic networks and their communities, shedding light on the interplay between modular structures and evolving network dynamics.

Teaching spatial data science, by A.R. Vierø & M. Szell, published in Geoforum Perspektiv

Spatial data science is an emerging field building on geographic information science, geography, and data science. Here we first discuss the definition and history of the field, arguing that it indeed warrants a new label. Then, we present the design of our course Geospatial Data Science at IT University of Copenhagen and discuss the importance of teaching not just spatial data science tools but also spatial and critical thinking. We conclude with a perspective on the potential future for spatial data science, arguing that qualitative theory and methods will continue to play an important role despite new GeoAI-related advances.

Do You See What I See? Emotional Reaction to Visual Content in the Online Debate About Climate Change, by L. Rossi, A. Segerberg, L. Arminio & M. Magnani, published in Environmental Communication

This paper explores the visual echo chamber effect in online climate change communication. We analyze communication by progressive actors and counteractors involved in the public debate about climate change on Facebook, to address the possibility that visual content can bridge ideologically diverse communities. Specifically, we investigate whether visual content depicting protest serves this purpose. The findings reveal a small amount of shared visual content. Interestingly, the emotional reactions to this content for the most part diverge significantly, suggesting that pre-existing attitudes, such as climate ideological position, influence interpretation. Contrary to our expectations, however, we do not observe visual content representing protest activity bridging the two groups. This work posits the possibility of a two-fold (de)polarization around visual content that both connects and divides, which contributes to a more nuanced understanding of the social dynamics that create and sustain the echo chamber effect observed in online climate change debates.

The Problems of LLM-generated Data in Social Science Research, by L. Rossi, K. Harrison & I. Shklovski, published in Sociologica

The paper explores LLMs when used for generating synthetic data for social science and design research. Researchers have used LLM-generated data for data augmentation and prototyping, as well as for direct analysis where LLMs acted as proxies for real human subjects. LLM-based synthetic data build on fundamentally different epistemological assumptions than previous synthetically generated data and are justified by a different set of considerations. In this essay, we explore the various ways in which LLMs have been used to generate research data and consider the underlying epistemological (and accompanying methodological) assumptions. We challenge some of the assumptions made about LLM-generated data, and we highlight the main challenges that social sciences and humanities need to address if they want to adopt LLMs as synthetic data generators.

NERDS clarify AI’s Physics Nobel

Two weeks ago the Nobel prize in physics was awarded to Hopfield and Hinton for their research on artificial neural networks. This caused quite some uproar, especially among many of our computer science and physics colleagues. As original-physicists-turned-data-scientists-dabbling-in-AI, who have done data-driven Science of Science research exactly on the crucial role of Hopfield and Hinton’s papers in physics, we penned a comment pointing to our clarifying research, which has now been published as a correspondence in Nature:

Was the Nobel prize for physics? Yes — not that it matters, by M. Szell, Y. Ma, and R. Sinatra

Here is the entire correspondence:

The award of the 2024 Nobel Prize in Physics to John Hopfield and Geoffrey Hinton for their groundbreaking research on artificial neural networks (Nature 634, 523–524; 2024) has caused consternation in some quarters. Surely this is computer science, not physics?

Existing data can help to inform this debate. Almost a decade ago, two of us (M.S. and R.S.) co-authored an analysis of referencing and citation patterns that explicitly placed Hopfield’s seminal 1982 paper on neural networks among 3.2 million interdisciplinary papers in non-physics journals that were “indistinguishable from papers published in physics journals”. Six other physics Nobel-winning papers were also in this set (R. Sinatra et al. Nature Phys. 11, 791–796; 2015).

The physics Nobel prize has until recently rewarded conventional ‘core’ physics research, even though Hopfield’s and Hinton’s papers were ripe for recognition (M. Szell et al. Nature Phys. 14, 1075–1078; 2018). We hope that this year’s prize will expedite the breakdown of silos that obstruct thinking across disciplines. Clinging to the idea of research fields as fixed territories is at best small-minded, and at worst harmful, when it comes to solving global challenges such as climate change.

Our original version – before editorial changes – provides a slightly different angle and an instructive figure (that was cut for publication):

New NERDS paper on COVID genome sequencing

Our newest faculty hire Jonas L. Juul is already making a splash. He published a big multi-author paper in Nature Communications: High-resolution epidemiological landscape from ~290,000 SARS-CoV-2 genomes from Denmark, by M.P. Khurana et al.

We are happy that with Jonas, who was part of the Statens Serum Institut’s expert group on mathematical modeling of COVID-19 during the reopening of Denmark in the spring and summer of 2020, we have gained a solid footing in medical applications of data/network science.

We examined the drivers of molecular evolution and spread of 291,791 SARS-CoV-2 genomes from Denmark in 2021. With a sequencing rate consistently exceeding 60%, and up to 80% of PCR-positive samples between March and November, the viral genome set is broadly whole-epidemic representative. We identify a consistent rise in viral diversity over time, with notable spikes upon the importation of novel variants (e.g., Delta and Omicron). By linking genomic data with rich individual-level demographic data from national registers, we find that individuals aged < 15 and > 75 years had a lower contribution to molecular change (i.e., branch lengths) compared to other age groups, but similar molecular evolutionary rates, suggesting a lower likelihood of introducing novel variants. Similarly, we find greater molecular change among vaccinated individuals, suggestive of immune evasion. We also observe evidence of transmission in rural areas to follow predictable diffusion processes. Conversely, urban areas are expectedly more complex due to their high mobility, emphasising the role of population structure in driving virus spread. Our analyses highlight the added value of integrating genomic data with detailed demographic and spatial information, particularly in the absence of structured infection surveys.

New NERDS paper on network analysis of Italian music

A new NERDS-authored paper is out in Applied Network Science: Node attribute analysis for cultural data analytics: a case study on Italian XX–XXI century music, by M. Coscia

We use the Italian music record industry from 1902 to 2024 as a case study. In this scenario, a possible research objective could be to discuss the relationships between different music genres as they are performed by different bands. Estimating genre similarity by counting the number of records each band published performing a given genre is not enough, because it assumes bands operate independently from each other. In reality, bands share members and have complex relationships. These relationships cannot be automatically learned, both because we miss the data behind their creation, but also because they are established in a serendipitous way between artists, without following consistent patterns. However, we can map them in a complex network. We can then use the counts of band records with a given genre as a node attribute in a band network. In this paper we show how recently developed techniques for node attribute analysis are a natural choice to analyze such attributes. Alternative network analysis techniques focus on analyzing nodes, rather than node attributes, ending up either being inapplicable in this scenario, or requiring the creation of more complex n-partite high order structures that can result less intuitive. By using node attribute analysis techniques, we show that we are able to describe which music genres concentrate or spread out in this network, which time periods show a balance of exploration-versus-exploitation, which Italian regions correlate more with which music genres, and a new approach to classify clusters of coherent music genres or eras of activity by the distance on this network between genres or years.
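To make the data structure in the abstract a bit more concrete, here is a minimal sketch of a band network with genre counts as node attributes. Everything in it is an assumption for illustration: the band names, members, and records are made up, and the helper functions `build_band_network` and `genre_attribute` are hypothetical. It shows only how the network and attribute could be set up, not the node attribute analysis techniques used in the paper.

```python
from itertools import combinations


def build_band_network(band_members):
    """Connect two bands whenever they share at least one member.

    band_members: dict mapping band name -> set of member names (toy data).
    Returns a set of undirected edges as sorted (band_a, band_b) tuples.
    """
    edges = set()
    for a, b in combinations(sorted(band_members), 2):
        if band_members[a] & band_members[b]:  # non-empty intersection
            edges.add((a, b))
    return edges


def genre_attribute(band_members, records, genre):
    """Count, for every band, how many of its records carry the given genre.

    records: list of (band, genre) pairs, one pair per published record.
    Returns a dict band -> count, i.e. one node attribute on the band network.
    """
    counts = {band: 0 for band in band_members}
    for band, record_genre in records:
        if record_genre == genre:
            counts[band] += 1
    return counts


# Made-up example data, not taken from the paper.
band_members = {
    "Band A": {"Anna", "Bruno"},
    "Band B": {"Bruno", "Carla"},
    "Band C": {"Dario"},
}
records = [
    ("Band A", "rock"),
    ("Band A", "rock"),
    ("Band B", "rock"),
    ("Band B", "pop"),
    ("Band C", "pop"),
]

edges = build_band_network(band_members)
rock = genre_attribute(band_members, records, "rock")
print(edges)
print(rock)
```

Each genre gives one such attribute over the same set of nodes, which is the kind of input that node attribute analysis then compares across the network.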

Three new NERDS papers with our master's students: Failing our youngest, superblockify, women on Wikipedia

We have 3 new papers that came out over the summer so far, on diverse, very interesting topics. The first authors of all 3 of these papers were our master's students, showing how impactful good master's projects can be:

Failing Our Youngest: On the Biases, Pitfalls, and Risks in a Decision Support Algorithm Used for Child Protection, by T.M. Hansen, R. Sinatra, and V. Sekara, published at FAccT’24

Through a freedom of information request, we accessed a new algorithm of Danish child protection services to aid caseworkers in identifying children at heightened risk of maltreatment, named Decision Support, and conduct an audit. We find that the algorithm has significant methodological flaws, suffers from information leakage, relies on inappropriate proxy values for maltreatment assessment, generates inconsistent risk scores, and exhibits age-based discrimination. Given these serious issues, we strongly advise against the use of this kind of algorithms in local government, municipal, and child protection settings, and we call for rigorous evaluation of such tools before implementation and for continual monitoring post-deployment by listing a series of specific recommendations.

See also our accompanying policy paper published earlier.

superblockify: A Python Package for Automated Generation, Visualization, and Analysis of Potential Superblocks in Cities, by C.M. Büth, A. Vybornova, and M. Szell, published in The Journal of Open Source Software (JOSS)

superblockify is a Python package designed to assist in planning future Superblock implementations by partitioning an urban street network into Superblock-like neighborhoods and providing tools for visualizing and analyzing these partition results. A Superblock is a set of adjacent urban blocks where vehicular through traffic is prevented or pacified, giving priority to people walking and cycling. The potential Superblock blueprints and descriptive statistics generated by superblockify can be used by urban planners as a first step in a data-driven planning pipeline for future urban transformations, or by urban data scientists as an efficient computational method to evaluate potential Superblock partitions.

The software is available at: superblockify.city

Traces of Unequal Entry Requirement for Illustrious People on Wikipedia Based on their Gender, by L. Krivaa and M. Coscia, published in Advances in Complex Systems

In this paper, we study issues of fair gender representations for people in history noted by multiple language editions of Wikipedia: are women underrepresented on Wikipedia? We do so via a combination of natural language processing and network science. Our results indicate that there is indeed a higher bar for women to have their own biographical page on Wikipedia: women are only included when they have more significant connections than men to the rest of the network. There are visible effects of the initiatives Wikipedia is taking to fix this issue, showing that the gap is narrowing, which validates our interpretation of the data.

New NERDS paper on urban morphology & street network simplification

A new NERDS co-authored paper is out open-access in the Journal of Spatial Information Science (JOSIS): A shape-based heuristic for the detection of urban block artifacts in street networks, by Martin Fleischmann & Anastassia Vybornova.

a) Bridge, Amsterdam; b) Roundabout, Abidjan; c) Intersection, Kabul; d) Motorway, Vienna. Polygons classified as face artifacts are shown in red, and the OSM street network (without service roads) is shown in black. Face artifacts are polygons enclosed by street network geometries (in the case of OSM, lane centerlines) that do not represent morphological urban blocks, but instead are a result of detailed transportation-focused mapping of the streetscape. Map data (c) OpenStreetMap contributors (c) CARTO

We propose a cheap computational heuristic for the identification of ‘face artifacts’, i.e., geometries that are enclosed by transportation edges but do not represent urban blocks. Sounds cryptic? Just check out the picture – the artifacts (in red) might be painfully familiar to anyone who has worked with street network data. Our proposed heuristic, implemented open-source in momepy, is the first step towards a fully automated street network simplification workflow. Next steps coming up – stay tuned!
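For readers who want a feel for what a shape-based check on street network polygons can look like, here is a toy sketch. It is not the heuristic from the paper and not the momepy implementation: it assumes the isoperimetric quotient (4πA/P²) as the shape measure, projected coordinates in metres, and two made-up thresholds (`max_area`, `max_compactness`). All function names are hypothetical.

```python
import math


def polygon_area_perimeter(coords):
    """Return (area, perimeter) of a simple polygon given as [(x, y), ...].

    The ring is treated as closed, so the first vertex need not be repeated.
    Area is computed with the shoelace formula.
    """
    twice_area = 0.0
    perimeter = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter


def isoperimetric_quotient(coords):
    """Compactness in (0, 1]: 1 for a circle, close to 0 for thin slivers."""
    area, perimeter = polygon_area_perimeter(coords)
    return 4.0 * math.pi * area / perimeter ** 2


def flag_artifact_candidates(polygons, max_area=500.0, max_compactness=0.3):
    """Flag polygons that are very small or very elongated.

    Both thresholds are illustrative assumptions, not values from the paper.
    polygons: dict mapping a name -> list of (x, y) vertices.
    """
    flagged = []
    for name, coords in polygons.items():
        area, _ = polygon_area_perimeter(coords)
        if area <= max_area or isoperimetric_quotient(coords) <= max_compactness:
            flagged.append(name)
    return flagged


# Toy polygons: a regular block, a sliver between two carriageways,
# and a small island such as the inside of a roundabout.
polygons = {
    "block": [(0, 0), (100, 0), (100, 100), (0, 100)],
    "sliver": [(0, 0), (200, 0), (200, 2), (0, 2)],
    "island": [(0, 0), (10, 0), (10, 10), (0, 10)],
}
print(flag_artifact_candidates(polygons))
```

A fixed threshold like this would need tuning per city; the sketch is only meant to show why area and shape together separate the red polygons in the picture from ordinary urban blocks.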

Networks, Data, and Society (NERDS)

Research group at IT University of Copenhagen
