Published December 17, 2014 | Version v1
Publication

Sampling Online Social Networks: An Experimental Study of Twitter

Description

Online social networks (OSNs) are an important source of information for scientists in fields such as computer science, sociology, and economics. However, OSNs are hard to study because they are very large. For instance, Facebook had 1.28 billion active users in March 2014, and Twitter claimed 255 million active users in April 2014. In addition, companies take measures to prevent crawls of their OSNs and refrain from sharing their data with the research community. For these reasons, we argue that sampling will be the best way to study OSNs in the future. In this work, we take an experimental approach to study the characteristics of well-known sampling techniques on a full social graph of Twitter crawled in 2012 [2]. Our contribution is to evaluate the behavior of these techniques on a real directed graph under two sampling scenarios, (a) obtaining the most popular users and (b) obtaining an unbiased sample of users, and to find the most suitable sampling techniques for each scenario.

Abstract

International audience

Additional details

Identifiers

URL
https://inria.hal.science/hal-01096980
URN
urn:oai:HAL:hal-01096980v1

Origin repository

UNICA