Category Archives: Publication

New NERDS paper out on bicycle network quality in Denmark

We published a new all-NERDS paper, applying our BikeDNA tool to the whole country of Denmark as part of our Cykelpulje project!

How Good Is Open Bicycle Network Data? A Countrywide Case Study of Denmark, by A. Rahbek Vierø, A. Vybornova, and M. Szell, published in Geographical Analysis


We compare the two largest open data sets on dedicated bicycle infrastructure in Denmark, OpenStreetMap (OSM) and GeoDanmark, in a countrywide data quality assessment, asking whether the data are good enough for network-based analysis of cycling conditions. We find that neither of the data sets is of sufficient quality, and that data conflation is necessary to obtain a more complete data set. Our analysis of the spatial variation of data quality suggests that rural areas are more prone to incomplete data. We demonstrate that the prevalent method of using infrastructure density as a proxy for data completeness is not suitable for bicycle infrastructure data, and that matching of corresponding features is thus necessary to assess data completeness. Based on our data quality assessment, we recommend strategic mapping efforts toward data completeness, consistent standards to support comparability between different data sources, and increased focus on data topology to ensure high-quality bicycle network data.

Explore also the interactive map: https://anerv.github.io/bikedna_webmap/
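
To illustrate why infrastructure density can mislead as a proxy for completeness, here is a minimal toy sketch in Python. It is not BikeDNA's matching procedure: the segment coordinates, the 10 m tolerance, and the function names `density_ratio` and `matched_share` are made up for illustration, and real feature matching works on full geometries rather than on endpoints alone.

```python
import math

def length(seg):
    """Length of a straight segment given as ((x1, y1), (x2, y2)), in metres."""
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)

def is_matched(seg, candidates, tol):
    """Toy rule: a segment is matched if some candidate has both endpoints
    within `tol` metres of its own endpoints (in either direction)."""
    def close(p, q):
        return math.dist(p, q) <= tol
    return any(
        (close(seg[0], c[0]) and close(seg[1], c[1]))
        or (close(seg[0], c[1]) and close(seg[1], c[0]))
        for c in candidates
    )

def density_ratio(a, b):
    """Total length of data set a divided by total length of data set b."""
    return sum(map(length, a)) / sum(map(length, b))

def matched_share(reference, other, tol=10.0):
    """Share of reference length that has a matching segment in `other`."""
    total = sum(length(s) for s in reference)
    matched = sum(length(s) for s in reference if is_matched(s, other, tol))
    return matched / total

# Hypothetical grid cell: both data sets contain 2 km of infrastructure,
# but only one of the two segments describes the same street.
osm = [((0, 3), (1000, 3)), ((0, 500), (1000, 500))]
ref = [((0, 0), (1000, 0)), ((0, 900), (1000, 900))]

print(density_ratio(osm, ref))   # 1.0 -> looks "complete" by density
print(matched_share(ref, osm))   # 0.5 -> only half of the reference is matched
```

In this toy cell the two data sets have identical density, yet half of the reference infrastructure has no counterpart, which only the matching step reveals.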

Five new NERDS publications out!

We have been very productive this year already! Five new NERDS publications are released this week:

  1. Which sport is becoming more predictable? A cross-discipline analysis of predictability in team sports, by M. Coscia, published in EPJ Data Science

    We analyze more than 300,000 professional sports matches in the 1996-2023 period from nine disciplines, to identify which disciplines are getting more/less predictable over time. We investigate the home advantage effect, since it can affect outcome predictability and it has been impacted by the COVID-19 pandemic. Going beyond previous work, we estimate which sport management model – between the egalitarian one popular in North America and the rich-get-richer used in Europe – leads to more uncertain outcomes. Our results show that there is no generalized trend in predictability across sport disciplines, that home advantage has been decreasing independently from the pandemic, and that sports managed with the egalitarian North American approach tend to be less predictable. We base our results on a predictive model that ranks teams by analyzing the directed network of who-beats-whom, where the most central teams in the network are expected to be the best performing ones.

  2. Algorithmic Fairness: Learnings From a Case That Used AI For Decision Support, by V. Sekara, T.S. Skadegard Thorsen, and R. Sinatra, published by the Crown Princess Mary Center

    This policy brief provides a brief introduction to algorithmic fairness and an example of a fairness audit of an algorithm that was designed to identify and assess children at risk of abuse.

  3. The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks, by A.G. Møller, J.A. Dalsgaard, A. Pera, L.M. Aiello (accepted at EACL’24).

    How good are Large Language Models at generating synthetic examples for training classifiers? To find out, we used GPT4 and Llama2 to augment existing training sets for typical Computational Social Science tasks. Our experiments show that the time to replace human-generated training data with LLMs has yet to come: human-generated text and labels provide more valuable information during training for most tasks. However, artificial data augmentation can add value when encountering extremely rare classes in multi-class scenarios, as finding new examples in real-world data can be challenging.

  4. Shifting Climates: Climate Change Communication from YouTube to TikTok, by A. Pera, L.M. Aiello (accepted at WebSci’24).

    How do video content creators tailor their communication strategies in the era of short-form content? We conducted a comparative study of the YouTube and TikTok video productions of 21 prominent climate communicators active on both platforms. We found that on TikTok, creators use more emotionally resonant, self-referential, and action-oriented language than on YouTube. Also, the public's response aligns more closely with the tone of the videos on TikTok.

  5. The role of interface design on prompt-mediated creativity in Generative AI, by M. Torricelli, M. Martino, A. Baronchelli, L.M. Aiello (accepted at WebSci’24).

    We analyze 145k+ user prompts from two Generative AI platforms for image generation to see how people explore new concepts over time, and how their exploration might be influenced by different design choices in human-computer interfaces to Generative AI. We find that creativity in prompts declines when the interface provides generation shortcuts that divert the user's attention from prompting.
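
The who-beats-whom ranking idea from the first paper in this list can be sketched in a few lines of Python. This is only an illustration: the abstract says that the most central teams are expected to perform best but does not say which centrality is used, so PageRank is an assumption here, as are the function name `rank_teams` and the three made-up results.

```python
def rank_teams(results, damping=0.85, iterations=100):
    """PageRank on a who-beats-whom network.

    `results` is a list of (loser, winner) pairs; each pair is a directed
    edge from the loser to the winner, so score flows towards teams that win.
    """
    teams = sorted({t for pair in results for t in pair})
    n = len(teams)
    beaten_by = {t: [] for t in teams}
    for loser, winner in results:
        beaten_by[loser].append(winner)

    score = {t: 1.0 / n for t in teams}
    for _ in range(iterations):
        new = {t: (1.0 - damping) / n for t in teams}
        for t in teams:
            winners = beaten_by[t]
            if winners:
                # A team passes its score on to the teams that beat it.
                share = damping * score[t] / len(winners)
                for w in winners:
                    new[w] += share
            else:
                # Unbeaten team: spread its score uniformly (dangling node).
                for other in teams:
                    new[other] += damping * score[t] / n
        score = new
    return score

# Hypothetical mini-league: A beats B and C, B beats C.
scores = rank_teams([("B", "A"), ("C", "A"), ("C", "B")])
print(sorted(scores, key=scores.get, reverse=True))  # ['A', 'B', 'C']
```

A prediction for a future match would then simply favour the team with the higher score.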

New NERDS papers: Network reorganization, Mastodon migration, News sharing on Facebook

We have three brand new papers out, this time in PNAS, Scientific Reports, and the Journal of Quantitative Description:

  1. Socioeconomic reorganization of communication and mobility networks in response to external shocks, by L. Napoli, V. Sekara, M. García-Herranz, and M. Karsai, published in PNAS

    We analyze mobile phone communication data to investigate the dynamics of network segregation patterns of the same set of people both in terms of mobility and of social communication during the initial wave of COVID-19 in Sierra Leone. Interestingly, we find opposite trends in the network segregation dynamics, characterized overall by simultaneous increase in mobility segregation and reduction in social network segregation. Our results underscore the significance of data-driven studies going beyond single-axis approaches to assess the impact of emergency policies.
  2. Drivers of social influence in the Twitter migration to Mastodon, by L. La Cava, L.M. Aiello, and A. Tagarelli, published in Scientific Reports

    We analyzed the social network and the public conversations of about 75,000 users who migrated from Twitter to Mastodon, as we NERDS did too a year ago, and observed that the temporal trace of their migrations is compatible with a phenomenon of social influence, as described by a compartmental epidemic model of information diffusion. Drawing from prior research on behavioral change, we delved into the factors that account for variations of the effectiveness of the influence process across different Twitter communities.
    Read more in our blog post:
    https://communities.springernature.com/posts/get-out-of-the-nest-drivers-of-social-influence-in-the-twitter-migration-to-mastodon
  3. Cracking Open the European Newsfeed, by L. Rossi, F. Giglietto, and G. Marino, published in Journal of Quantitative Description: Digital Media

    This paper contributes to the ongoing effort to describe and quantify the quality of information that is shared on large social media platforms. We do this by complementing existing research that provided a first quantitative assessment of the quality of the information circulating on Facebook among US users. Leveraging an updated version of the same data source — Meta’s URL Shares Dataset — and replicating much of the methodology, we quantify the trustworthy and untrustworthy links to external websites that have been shared on Facebook in the period between 2019 and 2022 in three major European countries (Germany, France, and Italy). We observe a clear decline in the number of URLs present in the dataset and an increase in the URLs from untrustworthy domains as a percentage of the total URLs shared in a year. This increase seems to be higher in electoral years (in Germany and in Italy) but it does not translate into an increase of Views received from untrustworthy sources.

New NERDS review paper on Sidewalk Networks

Sidewalk networks: Review and outlook, by D. Rhoads, C. Rames, A. Solé-Ribalta, M.C. González, M. Szell & J. Borge-Holthoefer, published in Computers, Environment and Urban Systems

From a transport perspective, increasing active travel –and walking in particular– is crucial for the future of sustainable cities, as reflected in global decarbonisation policies and agendas. Further, walking is much more than a mere mode of transport: it provides a fundamental social function, fostering vibrant cohesive communities. Arguably, walking and its associated infrastructure –sidewalks– should rank among the highest priorities for planning authorities. However, efficiency- and speed-driven urbanisation has gradually reallocated street space to private cars, leading to automobiles being the prioritised mode of transport today. Empirical research has generally followed suit, and a systemic understanding of walking as a phenomenon is largely missing, i.e., questions like how connected, resilient, accessible, or socially equitable is the pedestrian infrastructure of whole neighbourhoods and cities. Such relative neglect of sidewalk network research is, first and foremost, the consequence of a generalised lack of publicly available data on sidewalk infrastructure worldwide. A second reason might be its apparent lack of interest from a systemic standpoint: pedestrian mobility does not produce coordination challenges on the scale that cars do. In this work, we confront this perception by showing that there is ample research potential in the study of system-wide sidewalk networks, with both structural and dynamical challenges which might be critical to pursue the latest aspirations towards sustainable mobility in cities.

OECD recommendations for mobility policies based on NERDS research

The OECD/ITF (International Transport Forum) released the document “Towards the Light: Effective Light Mobility Policies in Cities” with policy recommendations towards more sustainable cities through light mobility such as bicycles, scooters, or micro vehicles.

In this report, a whole section called “Go faster! Develop high-quality wheeled light mobility infrastructure that fits the context” is based almost entirely on several of our NERDS papers on bicycle/micromobility network analysis. The section discusses how “a strong effort should be made to ensure that the newly created network is connected to the greatest extent possible and allows access to important and popular points of interest”, and how data-driven approaches that we developed are “important tools” that can complement traditional manual approaches.

Further, the report cites a previous study of ours on the perceived distribution of road space:

[Cars] have become so entrenched in the urban landscape that the general public often systematically overestimates the amount of mobility space allocated to non-motorised modes – while underestimating the space allocated to the car (Szell, 2018). Additionally, much of the violence they impose on all other road users is normalised and remains unaddressed in public and policy discourses.

and concludes:

Policy makers and planners need to remove their car blinders and cure their car blindness so that they can finally see the light.

We wholeheartedly agree and are happy that our research is useful for sustainable policy-making in an international context. (The International Transport Forum is an inter-governmental organisation within the OECD system, and is the only global body with a mandate for all modes of transport. It acts as a think tank for transport policy issues and organises the annual global summit of transport ministers.)

New NERDS paper: Mobility science

Future directions in human mobility science, by L. Pappalardo, E. Manley, V. Sekara, L. Alessandretti, published in Nature Computational Science

We provide a brief review of human mobility science and present three key areas where we expect to see substantial advancements. We start from the mind and discuss the need to better understand how spatial cognition shapes mobility patterns. We then move to societies and argue the importance of better understanding new forms of transportation. We conclude by discussing how algorithms shape mobility behavior and provide useful tools for modelers. Finally, we discuss how progress on these research directions may help us address some of the challenges our society faces today.

New NERDS summer papers: BikeDNA, Climate change ads, Social sleep

We welcome the summer with three diverse new papers!

  1. BikeDNA: A tool for bicycle infrastructure data and network assessment, by A. Rahbek Vierø, A. Vybornova & M. Szell, published in Environment and Planning B


    See also: https://github.com/anerv/BikeDNA
    Building high-quality bicycle networks requires knowledge of existing bicycle infrastructure. However, bicycle network data from governmental agencies or crowdsourced projects like OpenStreetMap often suffer from unknown, heterogeneous, or low quality, which hampers the green transition of human mobility. In particular, bicycle-specific data have peculiarities that require a tailor-made, reproducible quality assessment pipeline: For example, bicycle networks are much more fragmented than road networks, or are mapped with inconsistent data models. To fill this gap, we introduce BikeDNA, an open-source tool for reproducible quality assessment tailored to bicycle infrastructure data with a focus on network structure and connectivity. BikeDNA performs either a standalone analysis of one data set or a comparative analysis between OpenStreetMap and a reference data set, including feature matching. Data quality metrics are considered both globally for the entire study area and locally on grid cell level, thus exposing spatial variation in data quality. Interactive maps and HTML/PDF reports are generated to facilitate the visual exploration and communication of results. BikeDNA supports quality assessments of bicycle infrastructure data for a wide range of applications—from urban planning to OpenStreetMap data improvement or network research for sustainable mobility.
  2. How Do US Congress Members Advertise Climate Change: An Analysis of Ads Run on Meta’s Platforms, by L. Aisenpreis, G. Gyrst & V. Sekara, published in Proceedings of the International AAAI Conference on Web and Social Media

    Ensuring transparency and integrity in political communication on climate change has arguably never been more important than today. Yet we know little about how politicians focus on, talk about, and portray climate change on social media. Here we study it from the perspective of political advertisement. We use Meta’s Ad Library to collect 602,546 ads that have been issued by US Congress members since mid-2018. Out of those only 19,176 (3.2%) are climate-related. Analyzing this data, we find that Democrats focus substantially more on climate change than Republicans, with 99.7% of all climate-related ads stemming from Democratic politicians. In particular, we find this is driven by a small core of Democratic politicians, where 72% of all impressions can be attributed to 10 politicians. Interestingly, we find a significant difference in the average amount of impressions generated per dollar spent between the two parties. Republicans generate on average 188% more impressions with their climate ads for the same money spent as Democrats. We build models to explain the differences and find that demographic factors only partially explain the variance. Our results demonstrate differences of climate-related advertisements of US congress members and reveal differences in advertising characteristics between the two political parties. We anticipate our work to be a starting point for further studies about climate-related ads on Meta’s platforms.
  3. Social dimensions impact individual sleep quantity and quality, by S. Park, A. Zhunis, M. Constantinides, L.M. Aiello, D. Quercia & M. Cha, published in Scientific Reports

    While sleep positively impacts well-being, health, and productivity, the effects of societal factors on sleep remain underexplored. Here we analyze the sleep of 30,082 individuals across 11 countries using 52 million activity records from wearable devices. Our data are consistent with past studies of gender and age-associated sleep characteristics. However, our analysis of wearable device data uncovers differences in recorded vs. self-reported bedtime and sleep duration. The dataset allowed us to study how country-specific metrics such as GDP and cultural indices relate to sleep in groups and individuals. Our analysis indicates that diverse sleep metrics can be represented by two dimensions: sleep quantity and quality. We find that 55% of the variation in sleep quality, and 63% in sleep quantity, are explained by societal factors. Within a societal boundary, individual sleep experience was modified by factors like exercise. Increased exercise or daily steps were associated with better sleep quality (for example, faster sleep onset and less time awake in bed), especially in countries like the U.S. and Finland. Understanding how social norms relate to sleep will help create strategies and policies that enhance the positive impacts of sleep on health, such as productivity and well-being.

New NERDS paper: Gender inequality in cycling

Revealing the determinants of gender inequality in urban cycling with large-scale data, by A. Battiston, L. Napoli, P. Bajardi, A. Panisson, A. Perotti, M. Szell & R. Schifanella, published in EPJ Data Science

The uptake of cycling in today’s cities is especially low for women: there is a largely unexplained, persistent gender gap in cycling. To understand the determinants of this gender gap in cycling at scale, here we use massive, automatically-collected data from the tracking application Strava on outdoor cycling for 61 cities across the United States, the United Kingdom, Italy and the Benelux area. While Strava data is particularly well-suited to describe the behavior of regular cyclists and its generalizability to occasional cyclists requires further investigation, the size of these data and their characteristics represent an unprecedented opportunity for the literature on cycling. Leveraging the associated gender and usage information, we first quantify the emerging gender gap in recreational cycling at city-level. A comparison of cycling rates of women across cities within similar geographical areas—where the penetration of Strava is assumed to be comparable—unveils a broad range of gender gaps. On a macroscopic level, we link this heterogeneity to a variety of urban indicators and provide evidence for traditional hypotheses on the determinants of the gender-cycling-gap. We find a positive association between female cycling rate and urban road safety. On a microscopic level, we identify female preferences for street-specific features in the city of New York. Assuming that the determinants of the gender-cycling-gap are similar across regular and occasional cyclists, our study suggests that enhancing the quality of the dedicated cycling infrastructure may be a way to make urban environments more accessible for women, thereby making urban transport more sustainable for everyone.

New NERDS paper: Quantifying Ideological Polarization on a Network

Today we published a paper on ideological polarization. Special congrats to Marilena for this being her first paper!

Quantifying Ideological Polarization on a Network Using Generalized Euclidean Distance, by M. Hohmann, K. Devriendt, M. Coscia, published in Science Advances


An intensely debated topic is whether political polarization on social media is on the rise. We can investigate this question only if we can quantify polarization, by taking into account how extreme the opinions of the people are, how much they organize into echo chambers, and how these echo chambers organize in the network. Current polarization estimates are insensitive to at least one of these factors: they cannot conclusively clarify the opening question. Here, we propose a measure of ideological polarization which can capture the factors we listed. The measure is based on the Generalized Euclidean (GE) distance, which estimates the distance between two vectors on a network, e.g., representing people’s opinion. This measure can fill the methodological gap left by the state of the art, and leads to useful insights when applied to real-world debates happening on social media and to data from the US Congress.
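
For readers who want to see what a distance between two vectors on a network can look like in practice, here is a minimal sketch in plain Python. It assumes the commonly stated form of the GE distance, sqrt((a − b)ᵀ L⁺ (a − b)), where L⁺ is the pseudoinverse of the graph Laplacian; this post does not spell out the formula, so treat it as an assumption. The function names and the toy path graph are made up, and the sketch only handles connected, unweighted graphs.

```python
import math

def laplacian(n, edges):
    """Dense Laplacian L = D - A of an undirected, unweighted graph."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1.0
        L[v][v] += 1.0
        L[u][v] -= 1.0
        L[v][u] -= 1.0
    return L

def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    A = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular system: is the graph connected?")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - s) / A[i][i]
    return x

def ge_distance(n, edges, a, b):
    """sqrt((a-b)^T L^+ (a-b)) for a connected graph.

    Uses the identity L^+ = (L + J/n)^-1 - J/n (J = all-ones matrix),
    which avoids computing the pseudoinverse explicitly.
    """
    d = [ai - bi for ai, bi in zip(a, b)]
    L = laplacian(n, edges)
    M = [[L[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    x = solve(M, d)
    quad = sum(di * xi for di, xi in zip(d, x)) - sum(d) ** 2 / n
    return math.sqrt(max(quad, 0.0))

# Toy path graph 0 - 1 - 2 with all "opinion mass" on a single node each.
path = [(0, 1), (1, 2)]
near = ge_distance(3, path, [1, 0, 0], [0, 1, 0])  # adjacent nodes
far = ge_distance(3, path, [1, 0, 0], [0, 0, 1])   # opposite ends
print(near < far)  # True
```

The same two vectors are farther apart when they sit at opposite ends of the network than when they sit on neighbouring nodes, which is the property that makes such a distance sensitive to network structure.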

New NERDS paper: Multidimensional tie strength and economic development

Multidimensional tie strength and economic development, by L.M. Aiello, S. Joglekar, and D. Quercia, published in Scientific Reports

For decades, Granovetter’s tie strength has been quantified using the frequency of interaction. Yet, frequency does not reflect Granovetter’s initial conception of strength, which is a mix of social dimensions including exchange of knowledge and provision of support. We used Natural Language Processing to quantify whether text messages convey expressions of knowledge or support, and applied it to a large conversation network of Reddit users resident in the United States. Borrowing a classic experimental setup, we tested whether the diversity of social connections of Reddit users resident in a specific US state would correlate with the economic opportunities in that state (estimated with GDP per capita). We found that the combination of diversity calculated on the knowledge and support networks correlates much more strongly with GDP than diversity calculated on a network weighted with interaction frequency (R²=0.62 vs. R²=0.30). We also found that the two types of ties differ in their geographical span: knowledge ties are long-distance (i.e., connecting people living in far-away states), whereas support ties are mostly created among people living close by. Read more in this blog post.