Methods: Machine Learning, Software
Fields: Complex Networks, Ethnography

Collaborators: Michael Fire and Carlos Guestrin



Complex networks have non-trivial characteristics and appear in many real-world systems. Because of their importance across a large number of research fields, various studies have offered explanations of how complex networks evolve, but their underlying dynamics are still not fully understood. The emergence of new data sources can remove many of the barriers to better understanding how these networks evolve.
This study uses the recently published Reddit dataset, containing over 1.65 billion comments, to construct the largest publicly available social network corpus, which contains detailed information on the evolution of 11,965 social networks. We used this dataset to study how the patterns in which new users join a network (referred to as user arrival curves, or UACs) affect the network topology. Our results present evidence that UACs are a central factor in molding a network’s topology; that is, different arrival patterns create different topological properties. Additionally, we show that it is possible to uncover the type of user arrival pattern by analyzing a social network’s topology. These results imply that existing complex network evolution models need to be revisited and modified to take user arrival patterns as input, so that they more accurately reflect real-world complex networks.
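
To make the notion of a user arrival curve concrete, the following is a minimal sketch of one way such a curve could be computed from timestamped activity records. It is an illustration only, not the procedure used in the study: the function name `user_arrival_curve`, the `(user_id, unix_timestamp)` input format, and the one-day bucket size are all assumptions introduced here.

```python
from collections import Counter
from itertools import accumulate


def user_arrival_curve(events, bucket_seconds=86400):
    """Return the cumulative number of distinct users seen per time bucket.

    events: iterable of (user_id, unix_timestamp) pairs (assumed format).
    bucket_seconds: width of each time bucket; one day by default (assumption).
    """
    # A user "arrives" at the time of their earliest recorded activity.
    first_seen = {}
    for user, ts in events:
        if user not in first_seen or ts < first_seen[user]:
            first_seen[user] = ts

    if not first_seen:
        return []

    # Count arrivals per bucket, measured from the first arrival in the network.
    start = min(first_seen.values())
    arrivals = Counter((ts - start) // bucket_seconds for ts in first_seen.values())

    # Fill in empty buckets with zero, then accumulate into a curve.
    per_bucket = [arrivals.get(b, 0) for b in range(max(arrivals) + 1)]
    return list(accumulate(per_bucket))


# Hypothetical toy data: four users active over roughly three days.
toy_events = [("a", 0), ("b", 10), ("a", 100000), ("c", 90000), ("d", 200000)]
toy_curve = user_arrival_curve(toy_events)
```

The shape of the resulting list (for example, rapid early growth versus a slow steady climb) is the kind of arrival pattern whose relationship to network topology the study examines.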

Project Website

Draft Publication

Social Network Evolution Data Download