Introduction to social network analysis

Paola Tubaro
University of Greenwich, London
26 March 2012
Introduction
Introducing SNA
Rise of online social networking services ⇒ social networks come to the fore.
New interest in social network analysis (SNA).
Yet networks have always existed!
Likewise, SNA now has a long history.
Today
Understand what SNA is.
Understand how you could use it.
Learn basic principles and measures.
Outline
1 Introduction
2 What is SNA
3 Data
4 Network metrics
5 Further readings
Motivation
What can SNA be used for?
Improvements in organisational performance;
Policy interventions for behaviour change.
The organisational chain of a company
Formal chart vs. network
With whom do you discuss issues important to your work?
Senior people relatively peripheral (Barry): removed from day-to-day activities of the group.
The very central role of Nick (what if he moves to another job?)
Product 1 division relatively separate from overall network.
Interventions
Using network data to improve flows of communication and coordination in the organisation.
Networks for behaviour change: smoking prevention
Network of friendships among sixth grade pupils.
Squares = girls, circles = boys; blue = smokers, red = non-smokers. Valente et al. 2003.
Use popular pupils (“opinion leaders”) to reduce smoking in adolescents
Identify most popular pupils in class;
Recruit and train them;
Use them to spread the message.
Valente et al. 2003: network method effective in reducing adolescents’ smoking.
What is SNA
Defining SNA
An approach to human behaviours and social interactions.
A set of specific analytical and statistical methods.
A special type of data (and techniques of data collection).
A set of visualisation tools.
What is a network
What is a network: a formal definition
= A set of units (nodes) connected by one or more relations (ties).
What is a node?
⇒ Depends on setting: person, group/organisation, object.
What is a tie?
⇒ A relation or a shared trait: friendship, advice, exchange, co-work.
Graphs and networks
Circles (A, B) represent nodes.
Lines (e.g. between A and B) represent ties/edges.
A graph visualises the whole structure of ties of a defined group.
Graphical conventions (colours, size of nodes and/or ties) can be added to show attributes.
For example: if this is a network of friendship, blue = boys, red = girls.
Isolates, dyads and triads
[Figure: an isolate, a dyad and a triad.]
The network perspective
A new perspective
SNA requires a change of mindset with respect to other social science approaches.
Emphasis is on relationships, not attributes.
Not just dyadic relationships (just A and B), but dyadic relationships as embedded in a whole set of relationships.
Embedded relationships
Figure: Suppose the relationship represented here is friendship. How might friendship between A and B vary in these three different contexts?
Triads
[Figure: three triads among nodes a, b and c, labelled Intransitive, Transitive and 3-cycles.]
Intransitive: Only bilateral ties.
Transitive: A friend of my friend is my friend.
Three-cycles: a form of generalized exchange.
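The three triad types can be told apart mechanically once ties are stored as directed pairs. The sketch below is only an illustration: the function name `classify_triad` and the encoding of a tie as a `(sender, receiver)` tuple are assumptions made for this example, not something the slides specify.

```python
from itertools import permutations


def classify_triad(ties):
    """Classify the directed ties among exactly three nodes.

    `ties` is a set of (sender, receiver) pairs (hypothetical encoding).
    Returns "3-cycle", "transitive", "intransitive" or "other".
    """
    nodes = {node for tie in ties for node in tie}
    if len(nodes) != 3:
        raise ValueError("expected ties among exactly three nodes")
    orderings = list(permutations(nodes))
    # 3-cycle: a -> b -> c -> a (generalized exchange).
    if any({(a, b), (b, c), (c, a)} <= ties for a, b, c in orderings):
        return "3-cycle"
    # Transitive: a -> b and b -> c are closed by the shortcut a -> c.
    if any({(a, b), (b, c), (a, c)} <= ties for a, b, c in orderings):
        return "transitive"
    # Intransitive: a two-step path a -> b -> c that is left open.
    if any({(a, b), (b, c)} <= ties for a, b, c in orderings):
        return "intransitive"
    return "other"


print(classify_triad({("a", "b"), ("b", "c"), ("a", "c")}))  # transitive
print(classify_triad({("a", "b"), ("b", "c"), ("c", "a")}))  # 3-cycle
print(classify_triad({("a", "b"), ("b", "c")}))              # intransitive
```

When a triad contains both a cycle and a transitive closure, this sketch reports the cycle first; that ordering is an arbitrary choice.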
Network effects, more globally
For example, those who attract many choices will attract even more in future (reputation effect, “Matthew” effect).
Does a high (and growing) number of friends have advantages or disadvantages?
Summary
Now you know:
What a network is;
Correspondence between a network and a graph;
Difference between triadic and dyadic structures;
Global effects of network structure.
Data
Network data
Data format:
What network data look like
How they differ from other social science data
From data to graph
Data collection:
Name generators/interpreters
Archives
Web crawlers
Data format
Data type I: Ego networks
The whole set of contacts (alters) of one person or entity (ego).
Usually includes attributes of alters and ties between them.
Usually collected for a sample of egos (e.g. in a survey).
Typically, graphically represented with ego at its centre (star-shaped).
Example: Ego networks to discover “hidden” populations
Enrolling HIV+ persons to participate in vaccine preparedness study through their networks. Valente, 2010.
Data type II: Whole networks
Mapping the whole set of ties of a particular group, setting or population.
Not focused on one particular person or entity.
Network boundaries must be well-defined.
Examples: network of friends in a classroom; network of knowledge-sharing between employees of an organisation.
Data storage: traditional social science
Social science data are usually represented in the form of a rectangular table, where each row is an observation and each column is a variable. For example:

name   age   gender   married
Jane   25    0        0
Mary   31    0        0
Bob    29    1        1
Sue    28    0        1
Alan   32    1        0
Tom    29    1        1
Network data storage I: matrix
Network data can be stored as an n-by-n square matrix with all nodes listed in both columns and rows.
The value of cell (i, j) in the matrix indicates whether node i and node j are connected (1) or not (0).
The diagonal is meaningless.
For example, for a friendship network:

        Jane   Mary   Bob   Sue   Alan   Tom
Jane    -      1      1     0     0      0
Mary    1      -      0     1     0      0
Bob     1      0      -     0     1      0
Sue     0      1      0     -     1      0
Alan    0      0      1     1     -      1
Tom     0      0      0     0     1      -
Data storage II: Edge list
The edge list stores each pair of connected nodes in a single row of a table.
For example, for the same friendship network:

ego    alter
Jane   Mary
Jane   Bob
Mary   Sue
Bob    Alan
Alan   Tom
Alan   Sue
Which format to choose
Most network analysis packages support both formats.
Some provide conversion facilities (e.g. UCINET: edge list to matrix).
It is usually possible to combine network data (in matrix or edge list format) and attributes.
A rectangular table is usually needed for attribute data, as in traditional social science.
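As an illustration of the edge-list-to-matrix conversion mentioned above, here is a minimal Python sketch for the friendship example. It is not how UCINET or any particular package does it; the function name `edge_list_to_matrix` is an assumption for this example, the name Sue is used throughout for the fourth person, and the meaningless diagonal is simply filled with 0.

```python
# Edge list for the friendship example (undirected ties).
nodes = ["Jane", "Mary", "Bob", "Sue", "Alan", "Tom"]
edges = [
    ("Jane", "Mary"), ("Jane", "Bob"), ("Mary", "Sue"),
    ("Bob", "Alan"), ("Alan", "Tom"), ("Alan", "Sue"),
]


def edge_list_to_matrix(nodes, edges, directed=False):
    """Return an n-by-n list of lists with 1 where a tie exists, else 0."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for ego, alter in edges:
        i, j = index[ego], index[alter]
        matrix[i][j] = 1
        if not directed:
            matrix[j][i] = 1  # an undirected tie is recorded in both cells
    return matrix


matrix = edge_list_to_matrix(nodes, edges)
for name, row in zip(nodes, matrix):
    print(f"{name:<5}", *row)
```

Passing `directed=True` keeps only the ego-to-alter cell, which is the variant needed for directed ties.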
Some general rules
A matrix is visually appealing when the node set is small, but difficult to handle when it is large (because all possible pairs must be explicitly included).
With large node sets, an edge list is more convenient (because only existing ties need to be listed).
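A quick count makes the difference concrete. The figures below (1,000 nodes, 5,000 ties) are made up for illustration, and `storage_entries` is a hypothetical helper name.

```python
def storage_entries(n_nodes, n_ties):
    """Number of entries each format needs to store the same network."""
    matrix_cells = n_nodes * n_nodes  # one cell per ordered pair of nodes
    edge_rows = n_ties                # one row per existing tie
    return matrix_cells, edge_rows


# Hypothetical network: 1,000 nodes and 5,000 ties.
cells, rows = storage_entries(1000, 5000)
print(cells, rows)  # 1000000 5000
```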
Tie data I
Directed ties:
Tie goes from one node to another, but not necessarily back.
E.g. Advice-giving, money-lending.
Usual graphical representation: arrow.
When directed ties do go in both directions, they are reciprocal ties.
Usual graphical representation: double arrow.
Tie data II
Undirected ties:
Ties are mutual by definition.
E.g. Siblings, co-workers.
Usual graphical representation: line.
Undirected ties: matrix is symmetric
        Jane   Mary   Bob   Sue   Alan   Tom
Jane    -      1      1     0     0      0
Mary    1      -      0     1     0      0
Bob     1      0      -     0     1      0
Sue     0      1      0     -     1      0
Alan    0      0      1     1     -      1
Tom     0      0      0     0     1      -
Directed ties: matrix is NOT symmetric
        Jane   Mary   Bob   Sue   Alan   Tom
Jane    -      0      0     0     0      0
Mary    1      -      0     0     0      0
Bob     1      0      -     0     0      0
Sue     0      1      0     -     1      0
Alan    0      0      1     0     -      0
Tom     0      0      0     0     1      -
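Whether a matrix describes directed or undirected ties can be checked by comparing each cell (i, j) with its mirror cell (j, i). The sketch below uses the directed matrix from this slide, with the diagonal filled with 0; the helper names `is_symmetric` and `symmetrise`, and the reading of rows as senders, are assumptions made for the example.

```python
# Rows and columns in the order: Jane, Mary, Bob, Sue, Alan, Tom.
directed = [
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
]


def is_symmetric(matrix):
    """True if every cell (i, j) equals its mirror cell (j, i)."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i))


def symmetrise(matrix):
    """Treat a tie in either direction as an undirected tie."""
    n = len(matrix)
    return [[max(matrix[i][j], matrix[j][i]) for j in range(n)] for i in range(n)]


undirected = symmetrise(directed)
print(is_symmetric(directed))    # False
print(is_symmetric(undirected))  # True
```

Symmetrising discards the direction of each tie, so it is only appropriate when the relation can sensibly be read as mutual.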
Binary and valued ties
Binary ties indicate presence or absence of a tie.
Valued ties can be stronger or weaker, under some definition of strength:
Emotional closeness;
Frequency of contact;
Duration of the relationship.
Graphically: line (arrow) thickness often represents strength of tie.
Storing valued ties in an edge list
The edge list can include a third column with attributes of each tie.
In our friendship example, we can include duration of friendship:

ego    alter   duration (years)
Jane   Mary    5
Jane   Bob     2
Mary   Sue     3
Bob    Alan    1
Alan   Tom     2
Alan   Sue     2
Storing valued ties in a matrix
Instead of 0-1 values, the matrix has different values depending on duration of the relationship:

        Jane   Mary   Bob   Sue   Alan   Tom
Jane    -      0      0     0     0      0
Mary    5      -      0     0     0      0
Bob     2      0      -     0     0      0
Sue     0      3      0     -     2      0
Alan    0      0      1     0     -      0
Tom     0      0      0     0     2      -
Graphs
Basic principles of graph representation are simple (nodes and edges).
But graph visualisation is a complex problem in computer science.
Which representation is most suitable for detecting network structure and properties?
Example layouts: circle, Fruchterman-Reingold, Kamada-Kawai, spring, MDS (multidimensional scaling).
Now you know:
Format for network data: square matrix, rectangular matrix, edge list.
Difference between ego and whole networks.
Directed and undirected ties.
Binary and valued ties.
Graphical conventions to represent these different data.
Data collection
Collecting network data
Networks are built from nodes and the ties between them.
Who are the nodes?
What are the ties?
How to elicit information?
How to identify nodes
Ego-network data collection is often included in larger surveys.
Whole network data collection requires defining network boundaries, for example:
Members of an organisation;
Students of one school;
Attendees of one particular event.
N.B. collection of whole network data needs to be exhaustive, so it is sensitive to response rate.
Collecting network data through surveys: name generators and interpreters
Name generators are questions to elicit respondents’ alters, for example:
From time to time, most people discuss important matters with other people. Looking back over the last six months, who are the people with whom you discussed matters important to you? Just tell me their names or initials.
(General Social Survey, 1985)
These can be accompanied by name interpreters to report alter characteristics and identify ties between alters.
Figure: A name generator with a graphical interface in a web-based survey; research project ANAMIA.
Collecting network data through surveys: rosters
Provide respondents with a list of potential network members and ask them to choose from the list those to whom they are tied, for example:
Here is the list of all the members of your Firm. Would you go through this list, and check the names of those you socialize with outside work. You know their family, they know yours, for instance. I do not mean all the people you are simply on a friendly level with, or people you happen to meet at Firm functions.
(Lazega, 2001)
Collecting network data through surveys: rosters (cont.)
Used for whole network studies.
Also useful as a memory-aid.
Requires the researcher to have a complete list of nodes from the start.
Only feasible for relatively small networks (e.g. schools, companies).
Collecting network data from archives
For example: contract data from companies’ financial statements; citation data from publishers’ portals.
Depends on the quality of the archive and the actual availability of network information.
Need to ensure definition of ties is consistent and data are reported uniformly across all nodes.
Need to ensure completeness (for whole networks).
Figure: A citation network. From a study of the literature on pro-anorexic websites over ten years, with a corpus of 60 scientific articles. Casilli, Tubaro and Araya (2012), ANAMIA.
Webcrawling
Using dedicated software to retrieve websites and the links between them.
Increasingly popular with the rise of web-based networks, online social networking services, and the study of the Internet as a network.
Defining network boundaries may be difficult.
Frequent need for manual verification of data quality.
Privacy protection issues.
Figure: A map of the pro-anorexic web sphere in France. F. Pailler, D. Pereira, ANAMIA.
Now you know:
Different ways of collecting network data: surveys, archives, webcrawling.
All have advantages and disadvantages.
Choice depends on research questions, context, and expected outcomes.
Network metrics
Measuring properties of networks
Focus is on properties of patterns of relationships, independently of node attributes.
Based on the mathematics of graph theory, refined with social science concepts.
A variety of algorithms, measures and software applications are available.
Size
Network size = number of nodes (= number of contacts in a personal network);
The “Dunbar number”: cognitive limitations restrict the size of personal networks to about 150 contacts;
An open question: have social media increased human capacity to maintain relationships?
Median network size on Facebook = 99, average about 150-200 (though with large variation).
Density
The ratio of ties that actually exist to ties that could exist in principle:
$\text{Density} = \frac{L}{n(n-1)/2}$ for undirected ties;
$\text{Density} = \frac{L}{n(n-1)}$ for directed ties;
where L = number of edges, n = number of nodes.
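The two density formulas translate directly into code. The sketch below applies them to the six-person friendship example from the data section (n = 6 nodes, L = 6 ties); the function name `density` is an assumption for this example.

```python
def density(n_nodes, n_ties, directed=False):
    """Ratio of existing ties to ties that could exist in principle."""
    if n_nodes < 2:
        raise ValueError("density needs at least two nodes")
    possible = n_nodes * (n_nodes - 1)  # ordered pairs: directed ties
    if not directed:
        possible /= 2                   # unordered pairs: undirected ties
    return n_ties / possible


print(density(6, 6))                 # 0.4  (6 of 15 possible undirected ties)
print(density(6, 6, directed=True))  # 0.2  (6 of 30 possible directed ties)
```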
Application: Dense networks and behaviours
Denser online networks spread behaviours faster: Centola 2010.
Why is this so?
When adoption of a new behavior requires social reinforcement (threshold effect), a denser network favours change.
Centrality
Degree centrality
Who are the most “important” nodes?
Diane has the highest number of direct
connections (degree);
A connector, or hub.
Krackhardt’s kite network.
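Degree centrality is just a count of each node's ties. The sketch below assumes the kite network's ties as they are commonly reproduced (18 undirected ties among ten people); the slides only show the figure, so the tie list and the names other than Diane, Fernando, Garth, Heather, Ike and Jane are assumptions.

```python
from collections import Counter

# Assumed tie list for Krackhardt's kite (undirected).
KITE_TIES = [
    ("Andre", "Beverly"), ("Andre", "Carol"), ("Andre", "Diane"),
    ("Andre", "Fernando"), ("Beverly", "Diane"), ("Beverly", "Ed"),
    ("Beverly", "Garth"), ("Carol", "Diane"), ("Carol", "Fernando"),
    ("Diane", "Ed"), ("Diane", "Fernando"), ("Diane", "Garth"),
    ("Ed", "Garth"), ("Fernando", "Garth"), ("Fernando", "Heather"),
    ("Garth", "Heather"), ("Heather", "Ike"), ("Ike", "Jane"),
]

# Each undirected tie adds one to the degree of both endpoints.
degree = Counter()
for a, b in KITE_TIES:
    degree[a] += 1
    degree[b] += 1

for name in sorted(degree, key=degree.get, reverse=True):
    print(f"{name:<9}{degree[name]}")
```

Under these assumptions Diane comes out on top with six direct connections.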
Betweenness centrality
Heather has fewer connections than Diane;
Yet she occupies a strategic position, between different parts of the network;
She controls what flows in the network.
Krackhardt’s kite network.
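Betweenness counts how often a node lies on the shortest paths between other pairs of nodes. The sketch below computes it with a breadth-first accumulation in the style of Brandes's algorithm; this is one standard way to do it, not necessarily what any given package uses. The kite's tie list and the first names not mentioned on the slides are assumptions.

```python
from collections import deque

# Assumed tie list for Krackhardt's kite (undirected).
KITE_TIES = [
    ("Andre", "Beverly"), ("Andre", "Carol"), ("Andre", "Diane"),
    ("Andre", "Fernando"), ("Beverly", "Diane"), ("Beverly", "Ed"),
    ("Beverly", "Garth"), ("Carol", "Diane"), ("Carol", "Fernando"),
    ("Diane", "Ed"), ("Diane", "Fernando"), ("Diane", "Garth"),
    ("Ed", "Garth"), ("Fernando", "Garth"), ("Fernando", "Heather"),
    ("Garth", "Heather"), ("Heather", "Ike"), ("Ike", "Jane"),
]


def build_adjacency(ties):
    adj = {}
    for a, b in ties:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def betweenness(adj):
    """Unnormalised betweenness for an unweighted, undirected graph."""
    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        # Breadth-first search, counting shortest paths to every node.
        order, dist = [], {source: 0}
        n_paths = dict.fromkeys(adj, 0)
        n_paths[source] = 1
        preds = {v: [] for v in adj}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, sharing out path credit.
        credit = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                credit[v] += n_paths[v] / n_paths[w] * (1 + credit[w])
            if w != source:
                score[w] += credit[w]
    # Every undirected pair was visited from both ends.
    return {v: s / 2 for v, s in score.items()}


scores = betweenness(build_adjacency(KITE_TIES))
print(max(scores, key=scores.get))  # Heather
```

Heather scores highest here because every shortest path to Ike or Jane has to pass through her.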
Closeness centrality
Fernando and Garth have fewer connections than Diane;
But they are at a shorter distance from all other network members;
They can monitor the information flow in the network.
Krackhardt’s kite network.
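Closeness can be sketched as the inverse of a node's average distance to everyone else. The version below uses (n - 1) divided by the sum of shortest-path distances, which is one common definition and assumes a connected network; the kite's tie list and the first names not mentioned on the slides are assumptions.

```python
from collections import deque

# Assumed tie list for Krackhardt's kite (undirected).
KITE_TIES = [
    ("Andre", "Beverly"), ("Andre", "Carol"), ("Andre", "Diane"),
    ("Andre", "Fernando"), ("Beverly", "Diane"), ("Beverly", "Ed"),
    ("Beverly", "Garth"), ("Carol", "Diane"), ("Carol", "Fernando"),
    ("Diane", "Ed"), ("Diane", "Fernando"), ("Diane", "Garth"),
    ("Ed", "Garth"), ("Fernando", "Garth"), ("Fernando", "Heather"),
    ("Garth", "Heather"), ("Heather", "Ike"), ("Ike", "Jane"),
]


def build_adjacency(ties):
    adj = {}
    for a, b in ties:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def distances_from(adj, source):
    """Shortest-path distance (in steps) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(adj):
    """(n - 1) / total distance to all other nodes; assumes a connected graph."""
    n = len(adj)
    return {v: (n - 1) / sum(distances_from(adj, v).values()) for v in adj}


scores = closeness(build_adjacency(KITE_TIES))
for name in ("Fernando", "Garth", "Diane"):
    print(name, round(scores[name], 3))
```

With these ties, Fernando and Garth each need 14 steps in total to reach everyone, against 15 for Diane, so their closeness is higher.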
Core-periphery structures
Ike and Jane have low centrality scores;
e.g. they may be external contractors for a company;
They may be sources of fresh information!
Krackhardt’s kite network.
Network centralisation
The extent to which a network is dominated by one (or a few) nodes:
[Figure: example network diagrams.]
Measures the extent to which a network is dominated by a single central node.
Compares the centrality of the most central node to the centrality of other nodes.
Normalized by dividing by the maximum centralization possible for a network of the given size.
Ranges from 0 to 1 (star network).
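The slides do not say which centrality the index is built on. As one concrete instance, the sketch below uses Freeman's degree-based centralisation for an undirected network: the sum of differences between the highest degree and every node's degree, divided by (n - 1)(n - 2), the value that sum takes in a star. The function name and the two example degree lists are assumptions.

```python
def degree_centralisation(degrees):
    """Freeman's degree centralisation for an undirected network."""
    n = len(degrees)
    if n < 3:
        raise ValueError("centralisation needs at least three nodes")
    top = max(degrees)
    # Denominator: the same sum computed for a star network of size n.
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


star = [4, 1, 1, 1, 1]  # one hub tied to four otherwise unconnected nodes
ring = [2, 2, 2, 2, 2]  # every node tied to exactly two neighbours

print(degree_centralisation(star))  # 1.0
print(degree_centralisation(ring))  # 0.0
```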
Centralisation may vary over time
Figure: The advice network of judges in a Parisian court. Correlation between degrees, first to second observation (left panel) and second to third (right panel).
Distance
Distance: number of steps from one member to another;
Shorter paths in a network are the most important;
The shorter the path from one network member to another, the quicker and more efficient the flow of information, advice, knowledge.
Left: Longer paths; Right: Shorter paths.
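Distance in this sense is the length of the shortest path, which a breadth-first search finds directly. The sketch below reuses the six-person friendship network from the data section as undirected ties; the function name `distance` is an assumption for this example.

```python
from collections import deque

FRIENDSHIPS = [
    ("Jane", "Mary"), ("Jane", "Bob"), ("Mary", "Sue"),
    ("Bob", "Alan"), ("Alan", "Tom"), ("Alan", "Sue"),
]


def distance(ties, start, goal):
    """Number of steps on the shortest path, or None if there is no path."""
    adj = {}
    for a, b in ties:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == goal:
            return dist[v]
        for w in adj.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None


print(distance(FRIENDSHIPS, "Jane", "Tom"))  # 3 (Jane, Bob, Alan, Tom)
print(distance(FRIENDSHIPS, "Mary", "Sue"))  # 1
```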
Cliques
[Figure: a 3-member clique, a 4-member clique and a 5-member clique.]
A clique is a sub-set of nodes where all possible pairs of nodes are directly connected.
Scott (2000).
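The definition can be checked pair by pair. The sketch below tests whether a given set of nodes forms a clique in an undirected network; the node labels and ties are invented for illustration and `is_clique` is a hypothetical helper name.

```python
from itertools import combinations

# Hypothetical undirected ties among five nodes.
TIES = [
    ("a", "b"), ("a", "c"), ("a", "d"),
    ("b", "c"), ("b", "d"), ("c", "d"),
    ("d", "e"),
]


def is_clique(members, ties):
    """True if every possible pair of members is directly connected."""
    tie_set = {frozenset(tie) for tie in ties}
    return all(frozenset(pair) in tie_set for pair in combinations(members, 2))


print(is_clique({"a", "b", "c", "d"}, TIES))  # True: all 6 pairs are tied
print(is_clique({"c", "d", "e"}, TIES))       # False: c and e are not tied
```

This only verifies a candidate group; finding all cliques in a large network is a much harder search problem.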
Real-world cliques
[Figure: examples of a 1-clique, a 2-clique and a 3-clique.]
Completely connected groups are uncommon.
n-clique: a set of points connected to one another by paths of at most n links.
n-cliques with n greater than 2 are empirically infrequent.
Scott (2000).
Application: Small Worlds
A “small world” network is sparse, but has dense neighbourhoods and short paths: there are few steps from one member to any other.
Now you know:
Key metrics to measure properties of networks:
Size;
Density;
Centrality / Centralisation;
Distance;
Cliques.
Further readings
Books on social network analysis: general
Thomas W. Valente. Social Networks and Health: Models, Methods, and Applications, Oxford UP, 2010.
Christina Prell. Social Network Analysis: History, Theory and Methodology, Sage, 2011 (October).
John P. Scott. Social Network Analysis: A Handbook, Sage, 2000.
Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications, Cambridge UP, 1994.
Peter J. Carrington, John Scott, Stanley Wasserman (Eds.). Models and Methods in Social Network Analysis, Cambridge UP, 2005.
David Knoke. Social Network Analysis, Sage, 2008.
Books on social network analysis: Theory
Ronald S. Burt. Brokerage and Closure: An Introduction to Social Capital, Oxford UP, 2005.
Ronald S. Burt. Neighbor Networks: Competitive Advantage Local and Personal, Oxford UP, 2010.
Nan Lin. Social Capital: A Theory of Social Structure and Action, Cambridge UP, 2002.
Books on social network analysis: Economics
Matthew O. Jackson. Social and Economic Networks, Princeton UP, 2010.
Sanjeev Goyal. Connections: An Introduction to the Economics of Networks, Princeton UP, 2009.
Fernando Vega-Redondo. Complex Social Networks, Cambridge UP, 2007.
Journals
Social Networks, Elsevier
Connections
Journal of Social Structure
Redes (Spanish)
Associations and conferences
INSNA: Sunbelt XXXIII conference, May 2013, Hamburg (www.insna.org);
AFS - RT26, Ecole d’été, September 2012;
UKSNA: 8th annual conference, Bristol, June 2012;
ASNA: 9th annual conference, Zurich, September 2012.
Thank you!
Paola Tubaro, [email protected]