Category Archives: Publication

Two NERDS papers out: Road User Safety and Growing Bicycle Networks

We just published two more papers! Both are on the topic of sustainable mobility:

    1. Growing urban bicycle networks, by M. Szell, S. Mimar, T. Perlman, G. Ghoshal, and R. Sinatra, published in Scientific Reports

      Here we explore systematically the topological limitations of urban bicycle network development. For 62 cities we study different variations of growing a synthetic bicycle network between an arbitrary set of points routed on the urban street network. We find initially decreasing returns on investment until a critical threshold, posing fundamental consequences to sustainable urban planning: Cities must invest into bicycle networks with the right growth strategy, and persistently, to surpass a critical mass. Growing networks from scratch makes our approach a generally applicable starting point for sustainable urban bicycle network planning with minimal data requirements.
      The paper comes with an accompanying data visualization: https://growbike.net

    2. Identifying urban features for vulnerable road user safety in Europe, by M. Klanjcic, L. Gauvin, M. Tizzoni, and M. Szell, published in EPJ Data Science

      We identify urban features that are determinants of vulnerable road user safety through the analysis of inter-mode collision data across 24 European cities. We observe that cities with the highest modal shares of walking and cycling are the safest for the most vulnerable users. Our results suggest that policies aimed at increasing the modal share of walking and cycling are key to improving road safety for all road users.
      We explain and motivate our project in this accompanying blogpost (to appear on https://blogs.biomedcentral.com): 

      Identifying urban features for vulnerable road user safety in Europe

      Which traffic participants and which urban features are most associated with road deaths? Recently published work in EPJ Data Science explores this question with data from 24 European cities.

      Every year, road crashes cause 1.3 million deaths and 2.3 trillion USD of economic damage. Because of this pressing societal issue, the UN declared in 2015 the global sustainability goal of halving the number of road deaths by 2020. This goal failed: Road deaths keep rising worldwide.

      Many cities are wondering how to solve this issue. However, they might not have the full picture because road crash statistics tend to be reported in a victim-centered way: There are detailed statistics on the distributions of victim demographics such as age or gender, but this neglects necessary information for answering two important questions towards better crash prevention: 1) Who causes the crashes? 2) Why do crashes happen?

      The first question can be explored via the so-called casualty matrix. It shows the casualties between all combinations of different traffic participants, for example the threat of cars on cars, of cars on pedestrians, or the threat of trucks on cyclists.
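
      As a minimal sketch of how such a matrix can be tallied, the snippet below sums casualties for every pair of striking party and victim. The records, the mode names, and the `casualty_matrix` helper are hypothetical and only illustrate the idea; they are not data or code from the paper.

```python
from collections import Counter

# Hypothetical collision records: (striking party, victim, casualties).
# The numbers are made up for illustration, not taken from the paper.
records = [
    ("car", "pedestrian", 3),
    ("car", "cyclist", 2),
    ("truck", "cyclist", 1),
    ("car", "car", 4),
    ("cyclist", "pedestrian", 1),
    ("car", "pedestrian", 2),
]


def casualty_matrix(records):
    """Sum casualties for every (striking party, victim) combination."""
    counts = Counter()
    for striker, victim, casualties in records:
        counts[(striker, victim)] += casualties
    # Use every mode that appears on either side, in alphabetical order.
    modes = sorted({mode for s, v, _ in records for mode in (s, v)})
    # Rows are striking parties, columns are victims; missing pairs count as 0.
    matrix = [[counts[(s, v)] for v in modes] for s in modes]
    return modes, matrix


modes, matrix = casualty_matrix(records)
print("striker \\ victim:", *modes)
for mode, row in zip(modes, matrix):
    print(mode, *row)
```

      Reading along a row shows how much harm one type of traffic participant causes to each of the others, which is the information that victim-centered statistics leave out.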

      As previously shown in an impressive data visualization by a Dutch team of journalists and researchers, by far the biggest threat to human life on urban streets in the Netherlands is motorized vehicles – cars and trucks – while cyclists and pedestrians are overwhelmingly their victims and pose practically no danger to others. This sounds plausible, but a systematic, quantitative study over multiple countries has been missing.

      The second question – Why do crashes happen? – is much harder to answer. Generally crashes happen in an interplay between the individual behaviors of crash participants and their environments.

      Environmental features like the extent of pedestrian areas, cycling tracks, or speed limits are easier to collect than behavioral data, so their relation to road user risk can be explored in a straightforward way. And because the environment can be changed or regulated by decision makers, these decision makers can be held responsible to act.

      To back such actions towards safer cities with evidence, the OECD recently called for developing a modern approach to road safety: 1) collect and analyze crash data from a larger set of cities, 2) investigate the relationships between urban shape, density, speeds, and road user risk, and 3) analyze casualty matrices.

      Inspired by these OECD recommendations and the Dutch data visualization, in our work we collected crash data from 24 cities in 5 European countries with high enough resolution to build and explore casualty matrices, to quantify road safety in a systemic approach, and to identify those urban features that are most relevant, especially for vulnerable road users like pedestrians who are known to be disproportionately impacted.

      Exploring the casualty matrices first, we found the same overall picture as our Dutch colleagues, see Fig. 1: Cars are the most substantial hazard. However, we also found considerable local variations. For example, cars are a considerable threat to pedestrians and cyclists in inner London, but this is much less the case in Barcelona.

      Figure 1: Casualty matrices for 2018 show road deaths and serious injuries after a traffic participant on the left collided with one on the bottom. Cars are responsible for the majority of road deaths/injuries, while columns for pedestrians and cyclists do not appear because they pose practically no risk to others.

      When normalizing the number of deaths and serious injuries by population, we found British cities to be most dangerous, while Oslo is by far the least dangerous. This is not surprising given how much Oslo has recently invested into following a Vision Zero strategy, i.e. to aim for zero road fatalities, which they achieved in 2019.

      Finally, we set out to answer: What are the urban features most associated with crashes? Here we considered several features acquired from different sources, like OpenStreetMap: population density, the amount of bicycle tracks versus car lanes, the fraction of low-speed-limit streets, the distribution of how people move (walking, cycling, public transport, or motor vehicles), temperature, precipitation, and GDP.

      Using an information theory measure to identify the most fitting pairings of these features with road crashes, we found the best significant predictor, see Fig. 2: Cities with more people walking have fewer road deaths.

      Figure 2: The share of people walking in a city is a significant predictor for fewer casualties, for any traffic participant killed or seriously injured by a car. Numbers are regression coefficients, black borders denote statistical significance at p < 0.05.

      Interestingly, this result extends to all modes of transport: More walking is associated not just with higher pedestrian safety, but also with higher cyclist and motorist safety.

      Apart from the share of walking, a similarly strong association with road safety is having more streets limited to at most 30 km/h.

      We need to be clear that our results can be only as good as the underlying data, and these can have large reporting biases. For example, crashes with cyclists are often not reported, and different EU countries have different reporting procedures. So, more research and more standardized policies on crash reporting are needed. Also, we only calculated statistical correlations, so we cannot say anything about cause and effect.

      Nevertheless, our data-driven conclusions support a modern, evidence-based paradigm of road safety, suggesting this advice to urban decision makers: Make your cities more walkable and remove the hazard of cars. Besides the massive public health benefits, this will make your city more livable and its transport system more sustainable.

Two NERDS papers out: Committed Minorities from Reddit to Wall Street and Multilayer Network Distances

We are on a streak and just published two more papers!

    1. From Reddit to Wall Street: the role of committed minorities in financial collective action, by L. Lucchini, L.M. Aiello, L. Alessandretti, G. De Francisci Morales, M. Starnini and A. Baronchelli, published in Royal Society Open Science

      We analyzed the coordinated activity on Reddit that targeted short-selling activity by hedge funds on GameStop shares, causing a surge in the share price and triggering significant losses for the funds involved. We found that a small fraction of individuals can trigger large behavioural cascades, and we show the role of commitment, network centrality, and social identity in this coordination process. Our findings shed light on financial collective action, which several observers anticipate will grow in importance.
    2. Generalized Euclidean Measure to Estimate Distances on Multilayer Networks, by M. Coscia, published in ACM Transactions on Knowledge Discovery from Data

      In this paper we propose an algorithm solving node vector distance for multilayer networks, which is a problem that arises for all kinds of network spreading processes like epidemics, economic growth, or human behavior. We do so by adapting the Mahalanobis distance, incorporating the graph’s topology via the pseudoinverse of its Laplacian. Since this is a proper generalization of the Euclidean distance in a complex space defined by the topology of the graph, and since it works on multilayer networks, we call our measure the Multi Layer Generalized Euclidean (MLGE).
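
      To make the idea concrete, here is a minimal single-layer sketch of a Laplacian-based distance between two node vectors, computed as sqrt((a - b)^T L^+ (a - b)), where L^+ is the pseudoinverse of the graph Laplacian. This is one reading of the description above, not the paper's implementation: the function name `generalized_euclidean` and the toy path graph are assumptions, and the way MLGE couples several layers is not reproduced here.

```python
import numpy as np


def generalized_euclidean(adjacency, a, b):
    """Distance between node vectors a and b on a single-layer graph.

    Illustrative assumption: the distance is sqrt((a-b)^T L^+ (a-b)),
    with L^+ the Moore-Penrose pseudoinverse of the graph Laplacian.
    """
    A = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(A.sum(axis=1)) - A
    l_pinv = np.linalg.pinv(laplacian)
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    # Clamp tiny negative values caused by floating-point round-off.
    return float(np.sqrt(max(diff @ l_pinv @ diff, 0.0)))


# Toy path graph 0 - 1 - 2 - 3 (hypothetical example data).
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
on_node_0 = [1, 0, 0, 0]
on_node_1 = [0, 1, 0, 0]
on_node_3 = [0, 0, 0, 1]

d_near = generalized_euclidean(path, on_node_0, on_node_1)
d_far = generalized_euclidean(path, on_node_0, on_node_3)
print(d_near, d_far)
```

      In this toy case a plain Euclidean distance would rate both pairs of vectors as equally far apart, whereas the Laplacian-based distance is larger for the pair whose mass sits on nodes that are further apart in the graph.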

Two NERDS papers out: Measuring Violence via Twitter and Missing Links in Bike Networks

We start the spring with two new papers:

    1. Measuring Violence: A Computational Analysis of Violence and Propagation of Image Tweets From Political Protest, by L. Rossi, C. Neumayer, J. Henrichsen, and L.K. Beck, published in Social Science Computer Review

      We investigated the impact of violence on the propagation of images in social media in the context of political protest. Using a computational approach, we measure the relative violence of a large set of images shared on Twitter during the protests against the G20 summit in Hamburg in 2017. This allows us to investigate if more violent content is shared more times and faster than less violent content on Twitter, and if different online communities can be characterized by the level of violence of the visual content they share. We find that the level of violence in an image tweet does not correlate with the number of retweets and mentions it receives, that the time to retweet is marginally lower for image tweets containing a high level of violence, and that the level of violence in image tweets differs between communities.
    2. Automated Detection of Missing Links in Bicycle Networks, by A. Vybornova, T. Cunha, A. Gühnemann, and M. Szell, published in Geographical Analysis

      Here, we develop the IPDC procedure (Identify, Prioritize, Decluster, Classify) for finding the most important missing links in urban bicycle networks, using data from OpenStreetMap. In this procedure we first identify all possible gaps following a multiplex network approach, prioritize them according to a flow-based metric, decluster emerging gap clusters, and manually classify the types of gaps. We apply the IPDC procedure to Copenhagen and report the 105 top priority gaps. For evaluation, we compare these gaps with the city’s most recent Cycle Path Prioritization Plan and find considerable overlaps. Our results show how network analysis with minimal data requirements can serve as a cost-efficient support tool for bicycle network planning.
      We also developed an interactive visualization of our results at: fixbike.net

First NERDS papers of 2022 published: Epidemic Dreams and Conflicts versus Polarization

We start 2022 with two new papers!

    1. Epidemic dreams: dreaming about health during the COVID-19 pandemic, by S. Šćepanović, L.M. Aiello, D. Barrett and D. Quercia, published in Royal Society Open Science

      Luca and collaborators ask: Why were our dreams during the pandemic weird? Their computer analysis unearthed buried psychological reactions to the COVID-19 pandemic: expressions in waking life reflected a linear and logical thought process and, as such, described realistic symptoms or related disorders (e.g. nasal pain, SARS, H1N1); those in dreaming life reflected a thought process closer to the visual and emotional spheres and, as such, described either conditions unrelated to the virus (e.g. maggots, deformities, snake bites), or conditions of surreal nature (e.g. teeth falling out, body crumbling into sand).
    2. How minimizing conflicts could lead to polarization on social media: An agent-based model investigation, by M. Coscia and L. Rossi, published in PLOS ONE
       

      The paper explores an agent-based model of how policing content and backlash on social media (i.e. conflict) can lead to an increase in polarization for both users and news sources. We find that the tendency of users and sources to avoid policing, backlash and conflict in general can increase polarization online. Specifically, polarization comes from the ease of sharing political posts, intolerance for opposing points of view causing backlash and policing, and volatility in changing one’s opinion when faced with new information. On the other hand, it seems that the integrity of a news source in trying to resist the backlash and policing has little effect.
      Learn more on Michele’s blogpost.

Two new NERDS papers: NFT market and Pearson correlations on networks

Two new papers from the NERDS crew!

    1. Mapping the NFT revolution: market trends, trade networks, and visual features, by M. Nadini, L. Alessandretti, F. Di Giacinto, M. Martino, L.M. Aiello, A. Baronchelli, published in Scientific Reports

      Luca and collaborators performed the first large-scale analysis of the market of Non-Fungible Tokens (NFTs) since its birth. They looked at 6.1 million trades of 4.7 million NFTs to learn about the market, traders, visual features and price prediction. The dataset they collected is available. Learn more on this blogpost from the Alan Turing Institute.
      Also, watch the accompanying video: https://www.youtube.com/watch?v=KyIITtPKJbY
    2. Pearson correlations on complex networks, by M. Coscia, published in Journal of Complex Networks
       

       
      Estimating the correlation between two processes happening on the same network is an important problem with a number of applications. However, at present there is no way to do so: current methods either correlate a network with itself, correlate a single process with the network structure, or calculate a network distance between two processes. To fill this gap, Michele created a new method to extend the Pearson correlation coefficient to work on complex networks, and showed its usefulness in tasks related to social network analysis and economics. Learn more on this blogpost.

Two NERDS summer papers: Streetonomics and COVID Twitter psychology

Prolific NERDS researcher Luca Maria Aiello published 2 more papers over the summer. They already received wide media coverage:

  1. Streetonomics: Quantifying culture using street names, by M. Bancilhon, M. Constantinides, E.P. Bogucka, L.M. Aiello, D. Quercia, published in PLOS ONE

    This paper studies the names of 4,932 honorific streets in the cities of Paris, Vienna, London and New York, finding that street names greatly reflect the extent to which a society is gender biased, which professions are considered elite ones, and the extent to which a city is influenced by the rest of the world, quantifying a society’s value system.

    The paper was covered in media here:
    https://www.bbc.com/future/article/20210712-streetonomics-what-our-addresses-say-about-us
    https://www.fastcompany.com/90652762/how-streets-in-new-york-london-paris-and-vienna-got-their-names-according-to-streetonomics
    https://www.thetimes.co.uk/article/street-names-show-why-great-cities-are-worlds-apart-x06lbdwgj
    https://elpais.com/ciencia/2021-06-30/el-machismo-esta-en-las-calles.html
    https://www.lefigaro.fr/sciences/l-ame-d-une-ville-peut-elle-se-lire-dans-les-noms-de-ses-rues-20210701

  2. How epidemic psychology works on Twitter: evolution of responses to the COVID-19 pandemic in the U.S., by L.M. Aiello, D. Quercia, K. Zhou, M. Constantinides, S. Šćepanović, S. Joglekar, published in Humanities and Social Sciences Communications

    The paper studies the use of language in 122M tweets related to the COVID-19 pandemic posted in the U.S. during the whole year of 2020. On Twitter, we identified three distinct phases. Each of them is characterized by different regimes of the three psycho-social epidemics.
    See also: https://www.fastcompany.com/90659372/pandemic-emotions-research-twitter

NERDS at ICWSM-2021

NERDS are currently active at this year’s International AAAI Conference on Web and Social Media (ICWSM): https://www.icwsm.org/2021/index.html

The International AAAI Conference on Web and Social Media (ICWSM) is a forum for researchers from multiple disciplines to come together to share knowledge, discuss ideas, exchange information, and learn about cutting-edge research in diverse fields with the common theme of online social media. This overall theme includes research in new perspectives in social theories, as well as computational algorithms for analyzing social media. ICWSM is a singularly fitting venue for research that blends social science and computational approaches to answer important and challenging questions about human social behavior through social media while advancing computational tools for vast and unstructured data.

As usual, Luca Maria Aiello fulfils his role as Senior PC member at ICWSM. Further, we published two new papers at the event:

  1. The Healthy States of America: Creating a Health Taxonomy with Social Media, by S. Šćepanović, L.M. Aiello, K. Zhou, S. Joglekar, and D. Quercia

    Since the uptake of social media, researchers have mined online discussions to track the outbreak and evolution of specific diseases or chronic conditions such as influenza or depression. Here we developed a Deep Learning tool for Natural Language Processing that extracts mentions of virtually any medical condition or disease from unstructured social media text. We applied it to Reddit and Twitter posts, analyzed the clusters of the two resulting co-occurrence networks of conditions, and discovered that they correspond to well-defined categories of medical conditions. This resulted in the creation of the first comprehensive taxonomy of medical conditions automatically derived from online discussions, which strikingly resembles the official International Statistical Classification of Diseases and Related Health Problems (ICD-11).
  2. Multilayer Graph Association Rules for Link Prediction, by M. Coscia and M. Szell

    Here we investigate the multilayer link prediction problem with graph association rules: Will two nodes connect, and of which type?

Four new NERDS papers in May 2021

This month we have published four new papers, in Nature, Sustainability, IEEE Computer Graphics and Applications, and ACM Computing Surveys:

  1. The universal visitation law of human mobility, by M. Schläpfer, L. Dong, K. O’Keeffe, P. Santi, M. Szell, H. Salat, S. Anklesaria, M. Vazifeh, C. Ratti, G.B. West, published in Nature
    More info at the accompanying interactive website: https://senseable.mit.edu/wanderlust/

    This paper reveals a simple and robust scaling law that captures the temporal and spatial spectrum of population movement on the basis of large-scale mobility data from diverse cities around the globe.

  2. Implementing Gehl’s Theory to Study Urban Space. The Case of Monotowns, by D. Cerrone, J. López Baeza, P. Lehtovuori, D. Quercia, R. Schifanella, and L.M. Aiello, published in Sustainability

    The paper presents a method to operationalize Jan Gehl’s questions for public space into metrics to map Russian monotowns’ urban life in 2017. With the use of social media data, it becomes possible to scale Gehl’s approach from the survey of small urban areas to the analysis of entire cities while maintaining the human scale’s resolution.
  3. The Dreamcatcher: Interactive Storytelling of Dreams, by E.P. Bogucka, B.A. Aseniero, L.M. Aiello, D. Quercia, published in IEEE Computer Graphics and Applications

    Here we designed “The Dreamcatcher,” an interactive visual tool that explores the link between dreams and waking life through a collection of dream reports. We conducted a user study with 154 participants and found a 25% increase in the number of people believing that dream analysis can improve our daily lives after interacting with our tool. The visualization informed people about the potential of the continuity hypothesis to a surprising extent, to the point that it increased their concerns about sharing their own dream reports, thus opening new questions on how to design privacy-aware tools for dream collection.
  4. Community Detection in Multiplex Networks, by M. Magnani, O. Hanteer, R. Interdonato, L. Rossi, A. Tagarelli, published in ACM Computing Surveys

    Here we provide a taxonomy of community detection algorithms in multiplex networks. We characterize the different algorithms based on various properties and we discuss the type of communities detected by each method. We then provide an extensive experimental evaluation of the reviewed methods to answer three main questions: to what extent the evaluated methods are able to detect ground-truth communities, to what extent different methods produce similar community structures, and to what extent the evaluated methods are scalable. Besides offering a much needed overview of the methods and assumptions of community detection in multiplex networks, the paper attempts to provide a few guiding principles for the choice of a community detection approach to multiplex data.

New Paper on Sampling Social Media + Call for Abstract @ Networks21 Satellite

NERDS member Michele Coscia is having a busy March!

He published a new paper in the TKDD journal titled “Noise Corrected Sampling of Online Social Networks”. The paper focuses on a new way to perform topological network sampling, i.e. to explore a network by following its edges such that the explored (sub)network is as similar as possible to the whole structure. In this paper, the method uses a Bayesian framework to estimate the amount of novel information a new connection brings about into the currently explored sample.

He is also organizing a satellite for the Networks21 conference. The satellite is titled “Complex Networks in Economics and Innovation”. The organizers are looking for contributed abstracts on network applications in research on economic development and innovation. Read more on the official website, or submit your abstract to the submission site.

NERDS paper out: How unique is your app fingerprint?

We have a new exciting paper out: Temporal and cultural limits of privacy in smartphone app usage by Vedran Sekara et al. published in Nature Scientific Reports, asking:

How unique is your app fingerprint?

The paper looks into which apps people use and creates app-fingerprints for 3.5 million individuals. Similar to forensic science, where you need 12 points to distinguish between fingerprints, we ask: How many apps do we need to distinguish between two users? We find that people’s smartphone app behavior is highly unique: 3 apps are enough to identify more than 90% of all individuals. But app-fingerprints change over time and are different between countries. We find that people have more unique app-fingerprints during summer because we use more unique apps, and Americans have the most unique fingerprints (need the fewest apps to identify them) while Finns are the least unique (need more apps to identify their fingerprint). Why is this important? Because the work highlights problems with current policies intended to protect user privacy and emphasizes that policies cannot directly be ported between countries.
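
The question "how many apps identify a user?" can be illustrated with a small sketch: draw k apps at random from a user's app set and check whether that user is the only one who has all of them. The toy users, the app names, and the `fraction_identified` helper below are hypothetical, and the paper's actual sampling and matching procedure may differ.

```python
import random

# Hypothetical toy data: each user is described by the set of apps they use.
users = {
    "u1": {"mail", "maps", "chess", "bank"},
    "u2": {"mail", "maps", "radio", "bank"},
    "u3": {"mail", "chess", "radio", "news"},
    "u4": {"maps", "news", "bank", "radio"},
}


def fraction_identified(users, k, trials=200, seed=0):
    """Estimate how often k randomly drawn apps single out their owner."""
    rng = random.Random(seed)
    hits = 0
    total = 0
    for name, apps in users.items():
        if len(apps) < k:
            continue  # this user does not have enough apps to draw k of them
        for _ in range(trials):
            sample = set(rng.sample(sorted(apps), k))
            # Everyone whose app set contains all sampled apps is a match.
            matches = [u for u, other in users.items() if sample <= other]
            hits += matches == [name]
            total += 1
    return hits / total if total else 0.0


for k in range(1, 5):
    print(k, fraction_identified(users, k))
```

In this toy data every single app is shared by at least two users, so one app never identifies anyone, while a full set of four apps always does; real app-fingerprints sit somewhere in between.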