Introduction to social network analysis
Paola Tubaro
University of Greenwich, London
26 March 2012

1 Introduction

Introducing SNA
Rise of online social networking services ⇒ social networks to the fore.
New interest in social network analysis (SNA).
Yet networks have always existed!
Likewise, SNA now has a long history.

Today
Understand what SNA is.
Understand how you could use it.
Learn basic principles and measures.

Outline
1 Introduction
2 What is SNA
3 Data
4 Network metrics
5 Further readings

Motivation

What can SNA be used for?
Improvements in organisational performance.
Policy interventions for behaviour change.

[Figure: the organisational chain of a company.]

Formal chart vs. network
With whom do you discuss issues important to your work?
Senior people (Barry) are relatively peripheral: removed from the day-to-day activities of the group.
Nick has a very central role (what if he moves to another job?).
The Product 1 division is relatively separate from the overall network.

Interventions
Using network data to improve flows of communication and coordination in the organisation.
Networks for behaviour change: smoking prevention
Figure: Network of friendships among sixth grade pupils. Squares = girls, circles = boys; blue = smokers, red = non-smokers. Valente et al. 2003.

Use popular pupils (“opinion leaders”) to reduce smoking in adolescents
Identify the most popular pupils in class;
Recruit and train them;
Use them to spread the message.
Valente et al. 2003: network method effective in reducing adolescents’ smoking.

2 What is SNA

Defining SNA
An approach to human behaviours and social interactions.
A set of specific analytical and statistical methods.
A special type of data (and techniques of data collection).
A set of visualisation tools.

What is a network

What is a network: a formal definition
A network = a set of units (nodes) connected by one or more relations (ties).
What is a node? ⇒ Depends on setting: person, group/organisation, object.
What is a tie? ⇒ A relation or a shared trait: friendship, advice, exchange, co-work.

Graphs and networks
Circles (A, B) represent nodes.
Lines (e.g. between A and B) represent ties/edges.
A graph visualises the whole structure of ties of a defined group.
Graphical conventions (colours, size of nodes and/or ties) can be added to show attributes.
For example: if this is a network of friendship, blue = boys, red = girls.
Isolates, dyads and triads
[Figure: nodes a to f arranged as an isolate, a dyad and a triad.]

The network perspective

A new perspective
SNA requires a change of mindset with respect to other social science approaches.
Emphasis is on relationships, not attributes.
Not just dyadic relationships (just A and B), but dyadic relationships as embedded in a whole set of relationships.

Embedded relationships
Figure: Suppose the relationship represented here is friendship. How may friendship between A and B vary in these three different contexts?

Triads
[Figure: three triads on nodes a, b and c, labelled Intransitive, Transitive and 3-cycles.]
Intransitive: Only bilateral ties.
Transitive: A friend of my friend is my friend.
Three-cycles: a form of generalized exchange.
Network effects, more globally
[Figure: a small network of nodes a, b, c, d and e.]
For example, those who attract many choices will attract even more in future (reputation effect, “Matthew” effect).
Does a high (and growing) number of friends have advantages / disadvantages?

Summary

Now you know:
What a network is;
Correspondence between a network and a graph;
Difference between triadic and dyadic structures;
Global effects of network structure.

3 Data

Network data
Data format:
  What network data look like
  How they differ from other social science data
  From data to graph
Data collection:
  Name generators/interpreters
  Archives
  Web crawlers

Data format

Data type I: Ego networks
The whole set of contacts (alters) of one person or entity (ego).
Usually includes attributes of alters and ties between them.
Usually collected for a sample of egos (e.g. in a survey).
Typically represented graphically with ego at its centre (star-shaped).

Example: Ego networks to discover “hidden” populations
Figure: Enrolling HIV+ persons to participate in a vaccine preparedness study through their networks. Valente, 2010.

Data type II: Whole networks
Mapping the whole set of ties of a particular group, setting or population.
Not focused on one particular person or entity.
Network boundaries must be well-defined.
Examples: network of friends in a classroom; network of knowledge-sharing between employees of an organisation.

Data storage: traditional social science
Social science data are usually represented as a rectangular table, where each row is an observation and each column is a variable. For example:

name   age   gender   married
Jane   25    0        0
Mary   31    0        0
Bob    29    1        1
Sue    28    0        1
Alan   32    1        0
Tom    29    1        1

Network data storage I: matrix
Network data can be stored as an n-by-n square matrix with all nodes listed in both columns and rows.
The value of cell (i, j) in the matrix indicates whether node i and node j are connected (1) or not (0).
The diagonal is meaningless.
For example, for a friendship network:

        Jane   Mary   Bob   Sue   Alan   Tom
Jane    -      1      1     0     0      0
Mary    1      -      0     1     0      0
Bob     1      0      -     0     1      0
Sue     0      1      0     -     1      0
Alan    0      0      1     1     -      1
Tom     0      0      0     0     1      -

Data storage II: Edge list
The edge list stores each pair of connected nodes in a single row of a table.
For example, for the same friendship network:

ego    alter
Jane   Mary
Jane   Bob
Mary   Sue
Bob    Alan
Alan   Tom
Alan   Sue

Which format to choose
Most network analysis packages support both formats.
Some provide conversion facilities (e.g. UCINET: edge list to matrix).
It is usually possible to combine network data (in matrix or edge list format) and attributes.
A rectangular table is usually needed for attribute data, as in traditional social science.

Some general rules
A matrix is visually appealing when the node set is small, but difficult to handle when it is large (because all possible pairs must be explicitly included).
With large node sets, an edge list is more convenient (because only existing ties need to be listed).

Tie data I
Directed ties:
[Figure: nodes a and b joined by an arrow.]
The tie goes from one node to another, but not necessarily back.
E.g. advice-giving, money-lending.
Usual graphical representation: arrow.
When directed ties do go in both directions, they are reciprocal ties.
Usual graphical representation: double arrow.

Tie data II
Undirected ties:
Ties are mutual by definition.
E.g. siblings, co-workers.
Usual graphical representation: line.
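The conversion from edge list to matrix mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, "Sue" and "Susan" are taken to be the same person, and rows are taken to be egos (the transposed convention is equally possible). It uses the friendship edge list from the slides and shows that the undirected reading gives a symmetric matrix while the directed reading does not.

```python
# Friendship edge list from the slides (assumption: "Susan" is "Sue").
EDGES = [
    ("Jane", "Mary"),
    ("Jane", "Bob"),
    ("Mary", "Sue"),
    ("Bob", "Alan"),
    ("Alan", "Tom"),
    ("Alan", "Sue"),
]
NODES = ["Jane", "Mary", "Bob", "Sue", "Alan", "Tom"]


def edge_list_to_matrix(edges, nodes, directed=False):
    """Build an n-by-n 0/1 adjacency matrix (list of lists) from an edge list."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for ego, alter in edges:
        i, j = index[ego], index[alter]
        matrix[i][j] = 1      # assumed convention: row = ego, column = alter
        if not directed:
            matrix[j][i] = 1  # undirected ties are mutual by definition
    return matrix


def is_symmetric(matrix):
    """True if cell (i, j) equals cell (j, i) for every pair of nodes."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


undirected = edge_list_to_matrix(EDGES, NODES)
directed = edge_list_to_matrix(EDGES, NODES, directed=True)

for name, row in zip(NODES, undirected):
    print(f"{name:<5}", *row)
print("undirected symmetric:", is_symmetric(undirected))
print("directed symmetric:  ", is_symmetric(directed))
```

The edge list has six rows whatever the number of nodes, whereas the matrix always has n × n cells. This is the reason an edge list tends to be more convenient for large node sets.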
Undirected ties: matrix is symmetric

        Jane   Mary   Bob   Sue   Alan   Tom
Jane    -      1      1     0     0      0
Mary    1      -      0     1     0      0
Bob     1      0      -     0     1      0
Sue     0      1      0     -     1      0
Alan    0      0      1     1     -      1
Tom     0      0      0     0     1      -

Directed ties: matrix is NOT symmetric

        Jane   Mary   Bob   Sue   Alan   Tom
Jane    -      0      0     0     0      0
Mary    1      -      0     0     0      0
Bob     1      0      -     0     0      0
Sue     0      1      0     -     1      0
Alan    0      0      1     0     -      0
Tom     0      0      0     0     1      -

Binary and valued ties
Binary ties indicate presence or absence of a tie.
Valued ties can be stronger or weaker, under some definition of strength:
  Emotional closeness;
  Frequency of contact;
  Duration of relationship.
Graphically: line (arrow) thickness often represents strength of tie.

Storing valued ties in an edge list
The edge list can include a third column with attributes of each tie.
In our friendship example, we can include duration of friendship:

ego    alter   duration (years)
Jane   Mary    5
Jane   Bob     2
Mary   Sue     3
Bob    Alan    1
Alan   Tom     2
Alan   Sue     2

Storing valued ties in a matrix
Instead of 0-1 values, the matrix has different values depending on the duration of the relationship:

        Jane   Mary   Bob   Sue   Alan   Tom
Jane    -      0      0     0     0      0
Mary    5      -      0     0     0      0
Bob     2      0      -     0     0      0
Sue     0      3      0     -     2      0
Alan    0      0      1     0     -      0
Tom     0      0      0     0     2      -

Graphs
Basic principles of graph representation are simple (nodes and edges).
But graph visualisation is a complex problem in computer science.
Which representation is most suitable for detecting network structure and properties?
[Figure: a network drawn with a Circle layout.]
[Figures: further layouts of the network: Fruchterman-Reingold, Kamada-Kawai, Spring, MDS.]

Now you know:
Formats for network data: square matrix, rectangular matrix, edge list.
Difference between ego and whole networks.
Directed and undirected ties.
Binary and valued ties.
Graphical conventions to represent these different data.

Data collection

Collecting network data
Networks are built from nodes and the ties between them.
Who are the nodes?
What are the ties?
How to elicit information?

How to identify nodes
Ego-network data collections are often included in larger surveys.
Whole network data collection requires defining network boundaries, for example:
  Members of an organisation;
  Students of one school;
  Attendees of one particular event.
N.B. Collection of whole network data needs to be exhaustive, so it is sensitive to response rate.
Collecting network data through surveys: name generators and interpreters
Name generators are questions to elicit respondents’ alters, for example:
  From time to time, most people discuss important matters with other people. Looking back over the last six months, who are the people with whom you discussed matters important to you? Just tell me their names or initials. (General Social Survey, 1985)
Can be accompanied by name interpreters to report alter characteristics and identify ties between alters.
Figure: A name generator with a graphical interface in a web-based survey; research project ANAMIA.

Collecting network data through surveys: rosters
Provide respondents with a list of potential network members and ask them to choose from the list those to whom they are tied, for example:
  Here is the list of all the members of your Firm. Would you go through this list, and check the names of those you socialize with outside work. You know their family, they know yours, for instance. I do not mean all the people you are simply on a friendly level with, or people you happen to meet at Firm functions. (Lazega, 2001)

Collecting network data through surveys: rosters (cont.)
Used for whole network studies.
Also useful as a memory aid.
Requires the researcher to have a complete list of nodes from the start.
Only feasible for relatively small networks (e.g. schools, companies).

Collecting network data from archives
For example: contract data from companies’ financial statements; citation data from publishers’ portals.
Depends on the quality of the archive and the actual availability of network information.
Need to ensure the definition of ties is consistent and data are reported uniformly across all nodes.
Need to ensure completeness (for whole networks).
Figure: A citations network. From a study of the literature on pro-anorexic websites over ten years, with a corpus of 60 scientific articles. Casilli, Tubaro and Araya (2012), ANAMIA.

Webcrawling
Using dedicated software to retrieve websites and the links between them.
Increasingly popular with the rise of web-based networks, online social networking services, and the study of the Internet as a network.
Defining network boundaries may be difficult.
Frequent need for manual verification of data quality.
Privacy protection issues.
Figure: A map of the pro-anorexic web sphere in France. F. Pailler, D. Pereira, ANAMIA.

Now you know:
Different ways of collecting network data: surveys, archives, webcrawling.
All have advantages and disadvantages.
Choice depends on research questions, context, and expected outcomes.

4 Network metrics

Measuring properties of networks
Focus is on properties of patterns of relationships, independently of node attributes.
Based on the mathematics of graph theory, refined with social science concepts.
A variety of algorithms, measures and software applications are available.

Size
Network size = number of nodes (= number of contacts in a personal network);
The “Dunbar number”: cognitive limitations restrict the size of personal networks to about 150 contacts;
An open question: have social media increased human capacity to maintain relationships?
Median network size on Facebook = 99, average about 150 - 200 (though large variation).

Density
The ratio of the ties that actually exist to the ties that could exist in principle:
Density = L / (n(n−1)/2) for undirected ties;
Density = L / (n(n−1)) for directed ties,
where L = number of edges, n = number of nodes.

Application: Dense networks and behaviours
Denser online networks spread behaviours faster: Centola 2010.

Why is this so?
When adoption of a new behaviour requires social reinforcement (threshold effect), a denser network favours change.

Centrality

Degree centrality
Who are the most “important” nodes?
Diane has the highest number of direct connections (degree);
A connector, or hub.
Figure: Krackhardt’s kite network.

Betweenness centrality
Heather has fewer connections than Diane;
Yet she occupies a strategic position, between different parts of the network;
She controls what flows in the network.
Figure: Krackhardt’s kite network.

Closeness centrality
Fernando and Garth have fewer connections than Diane;
But they are at a shorter distance from all other network members;
They can monitor the information flow in the network.
Figure: Krackhardt’s kite network.

Core-periphery structures
Ike and Jane have low centrality scores;
e.g. they may be external contractors for a company;
may be sources of fresh information!
Figure: Krackhardt’s kite network.
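The density formula and the degree and closeness rankings described above can be checked with a short Python sketch. Two assumptions are made: the edge list below is the commonly reproduced version of Krackhardt's kite (the slides show only the figure), and closeness is computed as (n − 1) divided by the sum of shortest-path distances, which is one usual definition among several. All function names are illustrative. Betweenness is left out to keep the example short.

```python
from collections import deque

# Assumed edge list: the commonly reproduced version of Krackhardt's kite
# (10 nodes, 18 undirected ties). It is not given in the slides.
KITE_EDGES = [
    ("Andre", "Beverly"), ("Andre", "Carol"), ("Andre", "Diane"),
    ("Andre", "Fernando"), ("Beverly", "Diane"), ("Beverly", "Ed"),
    ("Beverly", "Garth"), ("Carol", "Diane"), ("Carol", "Fernando"),
    ("Diane", "Ed"), ("Diane", "Fernando"), ("Diane", "Garth"),
    ("Ed", "Garth"), ("Fernando", "Garth"), ("Fernando", "Heather"),
    ("Garth", "Heather"), ("Heather", "Ike"), ("Ike", "Jane"),
]


def build_adjacency(edges):
    """Map each node to the set of its neighbours (undirected ties)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def density(n_nodes, n_edges, directed=False):
    """L / (n(n-1)/2) for undirected ties, L / (n(n-1)) for directed ties."""
    possible = n_nodes * (n_nodes - 1)
    if not directed:
        possible /= 2
    return n_edges / possible


def degree(adj):
    """Number of direct connections of each node."""
    return {node: len(neigh) for node, neigh in adj.items()}


def distances_from(adj, source):
    """Breadth-first search: steps from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neigh in adj[node]:
            if neigh not in dist:
                dist[neigh] = dist[node] + 1
                queue.append(neigh)
    return dist


def closeness(adj):
    """(n - 1) / sum of distances; assumes the network is connected."""
    n = len(adj)
    return {v: (n - 1) / sum(distances_from(adj, v).values()) for v in adj}


adj = build_adjacency(KITE_EDGES)
deg = degree(adj)
clo = closeness(adj)

print("density:", density(len(adj), len(KITE_EDGES)))
for node in sorted(adj, key=lambda v: -deg[v]):
    print(f"{node:<9} degree={deg[node]}  closeness={clo[node]:.3f}")
```

Under these assumptions Diane has the highest degree, while Fernando and Garth have the highest closeness, in line with the slides.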
Network centralisation
The extent to which a network is dominated by one (or a few) nodes.
[Figure: example network diagrams.]
Measures the extent to which a network is dominated by a single central node.
Compares the centrality of the most central node to the centrality of the other nodes.
Normalised by dividing by the maximum centralisation possible for a network of the given size.
Ranges from 0 to 1 (star network).

Centralisation may vary over time
Figure: The advice network of judges in a Parisian court. Correlation between degrees, first to second observation (left panel) and second to third (right panel).

Distance
Distance: number of steps from one member to another;
Shorter paths in a network are the most important;
The shorter the path from one network member to another, the quicker and more efficient the flow of information, advice, knowledge.
Figure: left, longer paths; right, shorter paths.

Cliques
[Figure: a 3-member clique, a 4-member clique and a 5-member clique.]
A clique is a sub-set of nodes where all possible pairs of nodes are directly connected. Scott (2000).

Real-world cliques
[Figure: examples of a 1-clique, a 2-clique and a 3-clique.]
Completely connected groups are uncommon.
n-clique: a set of points in which every pair is connected by a path of length n or less.
n-cliques with n greater than 2 are empirically infrequent.
Scott (2000).
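The centralisation measure described earlier in this section can be illustrated with a minimal Python sketch. It assumes the degree-based version usually attributed to Freeman, for undirected networks; the slides do not say which centrality the measure is built on. The helper networks star() and ring() are hypothetical examples, not data from the talk.

```python
def degree_centralisation(adj):
    """Degree centralisation of an undirected network (assumed Freeman-style).

    Sums the gap between the highest degree and each node's degree, then
    divides by (n - 1)(n - 2), the value that sum takes in a star network.
    """
    n = len(adj)
    if n < 3:
        raise ValueError("centralisation needs at least three nodes")
    degrees = [len(neigh) for neigh in adj.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


def star(n):
    """Node 0 is tied to every other node; there are no other ties."""
    adj = {i: set() for i in range(n)}
    for i in range(1, n):
        adj[0].add(i)
        adj[i].add(0)
    return adj


def ring(n):
    """Each node is tied to its two neighbours on a circle."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


print("star:", degree_centralisation(star(6)))
print("ring:", degree_centralisation(ring(6)))
```

The star reaches the upper bound of 1 because one node holds every tie. The ring scores 0 because all nodes have the same degree, so no node dominates.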
Application: Small Worlds
A “small world” network is sparse, but with dense neighbourhoods and short paths; there are few steps from one member to any other.

Now you know:
Key metrics to measure properties of networks:
  Size;
  Density;
  Centrality / Centralisation;
  Distance;
  Cliques.

5 Further readings

Books on social network analysis: general
Thomas W. Valente. Social Networks and Health: Models, Methods, and Applications, Oxford UP, 2010.
Christina Prell. Social Network Analysis: History, Theory and Methodology, Sage, 2011 (October).
John P. Scott. Social Network Analysis: A Handbook, Sage, 2000.
Stanley Wasserman and Katherine Faust. Social Network Analysis: Methods and Applications, Cambridge UP, 1994.
Peter J. Carrington, John Scott, Stanley Wasserman (Eds.). Models and Methods in Social Network Analysis, Cambridge UP, 2005.
David Knoke. Social Network Analysis, Sage, 2008.

Books on social network analysis: Theory
Ronald S. Burt. Brokerage and Closure: An Introduction to Social Capital, Oxford UP, 2005.
Ronald S. Burt. Neighbor Networks: Competitive Advantage Local and Personal, Oxford UP, 2010.
Nan Lin. Social Capital: A Theory of Social Structure and Action, Cambridge UP, 2002.

Books on social network analysis: Economics
Matthew O. Jackson. Social and Economic Networks, Princeton UP, 2010.
Sanjeev Goyal. Connections: An Introduction to the Economics of Networks, Princeton UP, 2009.
Fernando Vega-Redondo. Complex Social Networks, Cambridge UP, 2007.
Journals
Social Networks (Elsevier)
Connections
Journal of Social Structure
Redes (Spanish)

Associations and conferences
INSNA: Sunbelt XXXIII conference, May 2013, Hamburg (www.insna.org);
AFS - RT26, Ecole d’été (summer school), September 2012;
UKSNA: 8th annual conference, Bristol, June 2012;
ASNA: 9th annual conference, Zurich, September 2012.

Thank you!
Paola Tubaro, [email protected]