Social & Information Networks

Link prediction is the task of predicting previously unobserved relationships between entities. There are many exciting applications of this area of network science. Most research on link prediction has been restricted to scoring candidate links with a single measure of network topology. Our work is developing a powerful new measure and placing existing measures in the context of a machine learning task. We are also casting the problem as one of high class imbalance.
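As a rough illustration of what it means to treat topological measures as machine learning features, the sketch below computes two standard scores (common neighbors and Adamic-Adar) for every unlinked pair in a toy network and pairs them with labels. The graph, the "future" link, and all function names are hypothetical, and these two scores are only stand-ins; they are not the new measure mentioned above.

```python
import math
from itertools import combinations

# Hypothetical toy data: an observed undirected network and one link that
# appears later. Neither comes from the project.
observed_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
future_edges = {("a", "d")}


def build_adjacency(edge_list):
    """Map each node to the set of its neighbors."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def adamic_adar(adj, u, v):
    """Shared neighbors weighted by inverse log degree.

    A shared neighbor is adjacent to both u and v, so its degree is at
    least 2 and the logarithm is strictly positive.
    """
    return sum(1.0 / math.log(len(adj[w])) for w in adj[u] & adj[v])


def candidate_pairs(adj):
    """All unordered node pairs that are not linked in the observed network."""
    return [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]


adj = build_adjacency(observed_edges)
candidates = candidate_pairs(adj)

# One feature vector per candidate pair, so a classifier could combine
# several measures instead of ranking by a single score.
features = {
    pair: [common_neighbors(adj, *pair), adamic_adar(adj, *pair)]
    for pair in candidates
}
# Label 1 if the pair becomes linked later, 0 otherwise.
labels = {pair: int(pair in future_edges) for pair in candidates}

positives = sum(labels.values())
print(f"{positives} positive out of {len(candidates)} candidate pairs")
for pair in candidates:
    print(pair, features[pair], labels[pair])
```

Even in this five-node example the pairs that become links are a minority of the candidates; in a large sparse network the gap is far wider, which is why the task is treated as one of high class imbalance.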


Through the NetHealth project we are collecting continuous longitudinal data on both people’s social networks and health behaviors in order to (1) test theories about the mechanisms linking social networks and human behavior, and (2) assess the extent to which social influence processes lead to changes in health-related behaviors. Assessing the extent to which social influence operates in social networks is critical for devising future interventions that could harness the power of social networks to reduce incidences of unhealthy behaviors and increase the prevalence of healthier ones. However, empirically determining how important social influence is has turned out to be very difficult. There are other mechanisms besides social contagion through which the observed clustering (i.e., ties among similar people) can occur: self-selection (forming ties with similar others), joint exposure to concurrent exogenous factors, selective avoidance, and high decay rates for ties that do not exhibit trait matching.

Empirically adjudicating between these competing processes requires high-validity, fine-grained, longitudinal data on changes over time in people’s social networks and their health-related behaviors. This project is collecting such data using smartphones to capture information on communicative interactions and social ties, and Fitbit devices to capture information on physical activity (PA) and sleep habits (SH). A cohort of over six hundred college students installed applications on their smartphones and have been wearing a Fitbit device for two years. Data about who communicates with whom, together with PA and SH, allow us to map out the co-evolution of social networks and PA and SH behaviors. With two additional years of data, we will have data for a student’s full collegiate tenure, allowing us to examine not only the co-evolution of networks and behaviors, but also when changes in networks, PA and SH are more or less likely to occur.

With these data we are answering fundamental questions about networks and behaviors. Does who one knows (a person’s position in a social network) influence what a person does (for instance, how physically active a person is)? Or does what a person does affect who they know? What happens when two people with different PA and/or SH form a social tie? Are they likely to become similar and, if so, is it because the less active person (or poorer sleeper) becomes more active (or a healthier sleeper), or vice versa? Alternatively, when there are health behavior differences, is the tie likely to die quickly and never get a chance to strengthen enough for influence processes to come into play? Which of these processes is more pronounced at various stages in the career of a college student?


Influence Drives the Emergence and Growth of Social Networks

Social influence has been a widely accepted phenomenon in social networks for decades. Existing research on it includes influence maximization, influence selection and quantification, and influence validation. Unlike that work, our research focuses on the effects of social influence on the evolution of social networks, aiming to answer whether social influence is a strong force shaping network dynamics. The problem is explored from both microscopic and macroscopic perspectives. At the microscopic level, we ask whether a model derived from the mechanism of social influence propagation can yield high precision in the link prediction problem. At the macroscopic level, we ask whether a model hypothesized from the spreading of social influence is able to explain popular scaling laws in social networks. Our objective is to unveil the significant factors with a greater degree of precision than has heretofore been possible, and to shed new light on network evolution.


Longitudinal Analysis and Modeling of Large-scale Social Networks

The growth in information technology systems is generating new sources of data on human behavior that are only now beginning to be analyzed. Digital communications systems log communication events and therefore contain valuable information on usage patterns that can be used to map social networks and analyze human behaviors within them. The availability of such data on millions of individuals has the potential to induce transformative changes in the way we analyze and understand human behavior. The data generated by digital communication technologies have five key traits that could transform the way researchers study social networks: 1) quality of statistics (the data come from millions of users), 2) purely observational (non-obtrusive measurement), 3) complete network data (not just information on the ego networks of a sample of people), 4) longitudinal (spanning several years), and 5) spatial information (e.g., cell phones can be geographically located). Data of such extent and longitudinal character bring with them novel challenges that can only be tackled by a well-orchestrated multidisciplinary approach involving network social science, physics methods developed for large-scale interacting particle systems, mathematical statistics and data analysis, and computer science methods of data mining, community detection and agent-based modeling.

Open Sourcing the Design of Civil Infrastructure

This project involves creating a virtual organization (VO) that allows stakeholders – engineers, public officials, researchers, students, and even the public at large – to engage as Citizen Engineers in four dimensions of collaboration: harnessing human effort, tapping collective knowledge, pooling communal software, and leveraging distributed computational hardware, all to rehabilitate our nation’s deteriorating civil infrastructure. With an archival Design Gallery, Social Network and Tool Repository integrated through a cyber-infrastructure that promotes the accelerated transfer of research to practice, we are researching how to tap the "wisdom of the crowd" and how to use crowdsourcing networks to facilitate the development and assessment of innovative engineering practices that can address the challenges facing civil infrastructure.


Using Smart Devices to Capture the Emotionality of Offline Communication

The increasing prevalence of online interactions may be inhibiting the development of strong, reciprocal, and emotionally significant offline social ties. To address this issue, we are developing an innovative system using smart devices that detects speech traits indicative of various emotional states and provides the data on offline emotionality needed to understand changing social networks. This seed project is funded by the National Academies Keck Futures Initiative and is an outgrowth of our attendance at a special conference on the Informed Brain in a Digital World.

A video of Dr. Hachen discussing this project is available.


Pathfinding in Typed Information Network Analysis

Meta-path-based similarity was introduced to find paths consisting of a sequence of relations defined between different object types (i.e., structural paths at the type or meta level). However, meta-path-based similarity requires handcrafted type queries, only operates under strict conditions, and fails to scale to even moderately sized data sets. We are developing a framework that mines frequent typed paths, or meta-paths, from large-scale typed information networks.
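To make the notion of a meta-path concrete, the sketch below enumerates short simple paths in a tiny typed network and counts how often each sequence of node types occurs. The bibliographic toy data and every name in it are hypothetical, and the brute-force enumeration is only meant to show what a "frequent meta-path" is; it is not the framework under development and would not scale to large networks.

```python
from collections import Counter

# Hypothetical typed network: every node carries a type label.
node_type = {
    "alice": "Author",
    "bob": "Author",
    "p1": "Paper",
    "p2": "Paper",
    "kdd": "Venue",
}
edges = [("alice", "p1"), ("bob", "p1"), ("bob", "p2"), ("p1", "kdd"), ("p2", "kdd")]


def build_adjacency(edge_list):
    """Map each node to the set of its neighbors (undirected)."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_meta_paths(adj, types, max_hops):
    """Count simple paths of 1..max_hops edges, grouped by type sequence.

    Each path is counted once per direction of traversal, so the
    Author-Paper-Venue count equals the Venue-Paper-Author count.
    """
    counts = Counter()

    def extend(path):
        if len(path) > 1:
            counts[tuple(types[n] for n in path)] += 1
        if len(path) - 1 == max_hops:
            return
        for nxt in adj[path[-1]]:
            if nxt not in path:  # keep paths simple (no repeated nodes)
                extend(path + [nxt])

    for start in adj:
        extend([start])
    return counts


adj = build_adjacency(edges)
meta_path_counts = count_meta_paths(adj, node_type, max_hops=2)

# List meta-paths from most to least frequent.
for type_sequence, n in meta_path_counts.most_common():
    print("-".join(type_sequence), n)
```

Here the type sequence Author-Paper-Author plays the role of a co-authorship meta-path, found by counting rather than by a handcrafted type query.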