Text messaging (SMS) Data collection via text

7/2/2014
Effort and sensitivity effects in mobile text messaging interviews
Michael F. Schober*, Frederick G. Conrad†, Huiying (Yanna) Yan†, Matthieu G. Sauvage‐Mar‡
AAPOR 69th Annual Conference, Anaheim, CA May 2014
Text messaging (SMS)
• Widespread adoption and increasing use for daily communication worldwide
• Increasingly used to collect survey data for market research, political polling, social measurement
• Scale is vast: Already hundreds of millions of potential participants are included in text sample frames worldwide
* New School for Social Research, New York; †University of Michigan, Ann Arbor; ‡ GeoPoll, Washington, DC
Data collection via text
• Can occur
– at respondents’ convenience
– in places without landline infrastructure
– even when network signal is weak or volatile
– at notably low cost if interviewing is automated
• Can lead to higher quality data than voice interviews on smartphones (Schober et al., 2012; Conrad et al., 2013)
– Less satisficing
– More disclosure
How does design of text interview affect data quality?
• There is little systematic evidence about best practices for text interviewing
• Text interviewing can be implemented in different ways
Single‐character vs. Multi‐character (Smartphone)
[Figure: two panels labeled "Single‐character" and "Multi‐character"]
Numeric keypad
• Unlike alphabetic keypads on smartphones:
– Physical keys, not onscreen
– Entering a single character (e.g. “a”) requires pressing a numeric key 1‐3 times
– Still widely used in developing world
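To make the keypad effort concrete, here is a minimal sketch that counts multi‐tap key presses for a typed answer. It assumes the standard phone keypad letter layout (on which "s" and "z" take four presses, slightly more than the 1‐3 stated above), counts any non‐letter or accented character as one press, and ignores pauses between letters that share a key. The function name `multitap_presses` is hypothetical and not part of the study.

```python
# Letter groups of the standard phone keypad layout (assumed here).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Map each letter to the number of presses needed to reach it on its key.
PRESSES = {
    letter: position
    for letters in KEYPAD.values()
    for position, letter in enumerate(letters, start=1)
}


def multitap_presses(text: str) -> int:
    """Count key presses needed to type `text` with multi-tap entry.

    Assumption: characters not on a letter key (spaces, digits,
    accented letters) are counted as a single press.
    """
    return sum(PRESSES.get(ch, 1) for ch in text.lower())


if __name__ == "__main__":
    for answer in ("oui", "non", "jamais", "rarement"):
        print(answer, multitap_presses(answer))
```

Under these assumptions, `multitap_presses("oui")` is 8 and `multitap_presses("non")` is 7, so equal character counts do not always mean equal keystroke counts on a numeric keypad.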
Further options for text interviewing
• Automated vs. human interviewer
– e.g. Schober et al. 2012, Conrad et al. 2013
• Different possible delays in sending next question after R sends answer
• Different possible periods for closing interview
– Because Rs can delay hours or days between responses
Current study
• Methodological experiment designed to test two hypotheses
– Hypothesis 1: Effort
– Hypothesis 2: Sensitivity
Hypothesis 1: Effort
• Text survey respondents may be more likely to choose response options that require fewer keystrokes
– Each keystroke takes additional effort
• This is probably why in daily communication, texters frequently abbreviate (c u l8r, omg, lol, imho…)
– In interacting with interfaces (generally), users tend to minimize effort
Effort: Predictions
• If the effort hypothesis is right, then
– Rs should be drawn to response options with fewer characters
– Response distributions should differ between single‐character and multiple‐character text response interviews
– Tendency may be greater for Rs using numeric keypads (which need more keystrokes) than full alphabetic keypads.
Hypothesis 2: Sensitivity
• Text survey Rs may be more likely to select the more sensitive response options when they are single‐character than when they must articulate (type) the full answer.
– Single‐character labels are arbitrary rather than meaningful, and so may be less value‐laden than multiple‐character responses
– Single‐character responding may allow respondents to “hide behind” the character
Experiment
• 12 closed‐form survey questions
• texted in French by an automated system to Rs in Tunisia
• 6 questions more likely to be sensitive for Tunisian respondents and 6 less likely to be sensitive
• Response options varied in length
Experiment (cont’d)
• Rs randomly assigned to either single‐ or multiple‐character interviews
– multi‐character respondents required to key in full responses
– single‐character respondents required to answer with a single character (i.e. “1,” “2,” “3”)
• Counterbalanced question order presentation
– Counterbalanced presentation of sensitive and non‐sensitive questions across respondents
– Two alternate orders of presentation of response options for the 6 questions
– Rules out sensitivity order and response order effects as explanation for findings
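The assignment scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how randomization or counterbalancing was implemented, so the fully crossed 2 x 2 x 2 design, the round‐robin balancing, and all labels and names (`assign`, `BLOCK_ORDERS`, `OPTION_ORDERS`) are hypothetical.

```python
import random
from collections import Counter

CONDITIONS = ("single-character", "multi-character")
# Hypothetical labels for the counterbalanced factors.
BLOCK_ORDERS = ("sensitive-first", "non-sensitive-first")
OPTION_ORDERS = ("forward", "reversed")


def assign(respondent_ids, seed=0):
    """Randomly assign respondents to condition x block order x option order.

    Respondents are shuffled, then dealt round-robin into the 8 cells,
    so cell sizes differ by at most one (an assumed balancing scheme).
    """
    rng = random.Random(seed)
    cells = [
        (cond, block, option)
        for cond in CONDITIONS
        for block in BLOCK_ORDERS
        for option in OPTION_ORDERS
    ]
    ids = list(respondent_ids)
    rng.shuffle(ids)
    return {rid: cells[i % len(cells)] for i, rid in enumerate(ids)}


if __name__ == "__main__":
    assignment = assign(range(16), seed=1)
    # Each of the 8 cells receives the same number of respondents.
    print(Counter(assignment.values()))
```

Crossing the order factors with the interview condition is what lets order effects be separated from the single‐ vs. multi‐character comparison.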
GeoPoll
• GeoPoll is a global mobile surveying platform for collecting data about the developing world
• GeoPoll partners with Mobile Network Operators (MNOs) to invite subscribers to join national panels
– over 110M people have joined worldwide
• Surveys delivered via automated text messaging and touchtone IVR interviews
• Regional data collection is possible
• No cost to respondents
• Respondents receive airtime incentive for participating
Data collection
• SRS (Simple Random Sampling) sample drawn from Tunisia GeoPoll mobile sampling frame
– 2,072,040 mobile subscribers
– Covers all provinces
– Includes urban and rural areas
• Rs invited on their own phone
– 0.5 dinars (roughly $0.30 USD) as an incentive
• Survey fielded on 12 days spread over the period from Jan. 2013 to Mar. 2013
Respondents
• 2472 French‐speaking Rs completed the survey
– Response rate is approximately 3%
– Of those who started, the completion rate is approximately 67%
• Rs’ demographics
– Gender: 49% Female, 51% Male
– Age: 76% 10‐29 yrs, 21% 30‐49 yrs, 3% 50 yrs+
– Rs’ attributes did not differ between conditions
• Rs asked afterward whether they had answered with an alphabetic (smartphone) or numeric keyboard
Potentially sensitive question; response options of varying length
• Vous assistez à des services religieux:
(How often do you attend religious services:)
Multi‐character | Single‐character | Characters | (In English)
au moins une fois par semaine | 1) au moins … | 29 | at least once a week
presque chaque semaine | 2) presque… | 22 | almost every week
environ une fois par mois | 3) environ … | 25 | about once a month
Rarement | 4) rarement | 8 | seldom
jamais | 5) jamais | 6 | never
Likely non‐sensitive question; response options of varying length
• Laquelle de ces activités récréatives préférez vous le plus:
• (Which of these recreational activities do you most prefer:)
Multi‐character | Single‐character | Characters | (In English)
regarder la télévision | 1) regarder.. | 22 | watching television
aller au cinéma | 2) aller ... | 15 | going to the movies
lire | 3) lire | 4 | reading
discuter avec des amis | 4) discuter… | 22 | talking with friends
faire du sport | 5) faire… | 14 | playing sports
Potentially sensitive question; response options of same length
• Votre famille a‐t‐elle donné durant les 6 derniers mois un pot de vin à l'université, un agent public, la police routière, agent du fisc
• (Has your household paid a bribe to any of the following in the past 6 months: university, gov official, traffic police, tax official?)
Multi‐character | Single‐character | Characters | (In English)
Oui | 1) Oui | 3 | Yes
Non | 2) Non | 3 | No
Likely non‐sensitive question; response options of same length
• Avez‐vous mangé dans un restaurant la semaine dernière?
• (During the past week, have you eaten in a restaurant?)
Multi‐character | Single‐character | Characters | (In English)
Oui | 1) Oui | 3 | Yes
Non | 2) Non | 3 | No
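The "Characters" counts in the example questions above can be reproduced by counting every character of the full French response, spaces included. The sketch below does that, and also shows one plausible way to put lengths on a common scale (a z‐score within each question, using the sample standard deviation). The slides do not say how lengths were standardized, so `standardized_lengths` is an assumption for illustration, as are all names used here.

```python
import statistics
import unicodedata

# Full-text response options as shown in the example questions.
OPTIONS = {
    "religious services": [
        "au moins une fois par semaine",
        "presque chaque semaine",
        "environ une fois par mois",
        "rarement",
        "jamais",
    ],
    "recreational activities": [
        "regarder la télévision",
        "aller au cinéma",
        "lire",
        "discuter avec des amis",
        "faire du sport",
    ],
}


def option_length(text: str) -> int:
    """Character count including spaces; NFC keeps 'é' as one character."""
    return len(unicodedata.normalize("NFC", text))


def standardized_lengths(options):
    """Z-score each option's length within its question (assumed method)."""
    lengths = [option_length(o) for o in options]
    mean = statistics.mean(lengths)
    sd = statistics.stdev(lengths)  # sample SD; an assumption
    return [(length - mean) / sd for length in lengths]


if __name__ == "__main__":
    for question, options in OPTIONS.items():
        z_scores = standardized_lengths(options)
        for option, z in zip(options, z_scores):
            print(f"{question}: {option!r} len={option_length(option)} z={z:+.2f}")
```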
Results: Effort Findings
• Rs in multi‐character condition more likely to select shortest response option and less likely to select longest response option than Rs in single‐character condition
[Figure: two bar charts comparing single‐character and multi‐character conditions: "Average number of longest response option selected" and "Average number of shortest response option selected". Reported tests: t(1753.3)=14.17, p<.0001; t(1584.9)=10.86, p<.0001]
Effort cont’d: Effect of Character length
• Rs drawn to shorter response options in both multi‐ and single‐character conditions
• But especially in the multi‐character condition
[Figure: Predicted Probability of Selection (0 to 0.8) by Standardized Response Option Length (‐2 = Mean‐2SD to 2 = Mean+2SD), plotted separately for the single‐character and multi‐character conditions]
Effort cont’d: Individual Qs
Potentially sensitive Q, response options of varying length
• How often do you attend religious services?
[Figure: percentage of Rs selecting each response option, single‐character vs. multi‐character condition. Options: au moins une fois par semaine (29 chars), presque chaque semaine (22 chars), environ une fois par mois (25 chars), rarement (8 chars), jamais (6 chars). Χ2(4)= 45.01, p<.0001, n=2345]
Effort cont’d: Individual Qs
Likely non‐sensitive Q, response options of varying length
• Which of these recreational activities do you most prefer:
[Figure: percentage of Rs selecting each response option, single‐character vs. multi‐character condition. Options: regarder la television (22 chars), aller au cinema (15 chars), lire (4 chars), discuter avec des amis (22 chars), faire du sport (14 chars). Χ2(4)=626.80, p<.0001, n= 2292]
Effort cont’d: Moderating effect of keypads
• Rs who use numeric keypads in multi‐character condition especially drawn to shortest response options
[Figure: Average number of shortest response option selected, for numeric keypads vs. full alphabetic keypads in the single‐character and multi‐character conditions. Interaction keypads*multi‐character: F(1,1740) = 7.50, p < 0.01]
Results: Sensitivity findings
• Overall, we did not consistently observe more socially undesirable responding in the single‐character condition.
[Figure: Average number of most socially undesirable answers selected, single‐character vs. multi‐character condition. t(2157) =‐1.38, n.s.]
Sensitivity cont’d: Bribe item
• BUT Rs more likely to report having paid a bribe with a single‐character response than by typing out “oui” (“yes”), which is no shorter than “non.”
• Has your household paid a bribe to any of the following in the past 6 months: university, gov official, traffic police, tax official?
Response | Single‐character condition | Multi‐character condition
Yes (Oui) | 19.8% | 15.4%
No (Non) | 80.2% | 84.6%
Χ2(1)= 8.12, p<0.01, n=2437
Summary
• Strong evidence for effort hypothesis
– Rs drawn to response options with fewer characters
• Especially in multiple‐character text response interviews
– Tendency greater for Rs using numeric keypads (which need more keystrokes) than full alphabetic keypads
• Suggestive evidence for sensitivity hypothesis
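For readers who want to see how a comparison like the bribe‐item test works, here is a minimal sketch of a Pearson chi‐square test on a 2 x 2 table. The per‐condition counts are not given in the slides, so the counts below are HYPOTHETICAL: they were chosen only to be consistent with the reported rounded percentages and n=2437, assuming roughly equal group sizes. Whether the authors applied a continuity correction is also unknown; none is applied here. The resulting statistic is therefore close to, but not necessarily equal to, the reported value.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and its p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability of chi2
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# HYPOTHETICAL counts (not from the slides), consistent with
# 19.8% / 80.2% and 15.4% / 84.6% and a total n of 2437.
single_yes, single_no = 241, 978
multi_yes, multi_no = 188, 1030

chi2, p_value = chi_square_2x2(single_yes, single_no, multi_yes, multi_no)

if __name__ == "__main__":
    print(f"chi2(1) = {chi2:.2f}, p = {p_value:.4f}")
```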
Caveat
• Of course, we don’t know the true values
– Evidence from other modes is that satisficing leads to less accurate answers
– Evidence from other domains is that greater disclosure of sensitive information is more likely to be true (e.g., Kreuter, Presser & Tourangeau, 2008)
• This suggests to us that minimizing response entry effort in texting is likely to lead to improved data quality
Interviewing by texting
• Already prevalent and likely to increase
• These results begin to clarify principles for designing text surveys
• Single‐character design is likely to be advantageous
– certainly for reducing burden on Rs
– also potentially for reducing sensitivity
Many more questions
• Any evidence of differential response rates (higher response or completion rate) in single‐ vs. multi‐character text interviews?
• Across different domains, how do texting results compare with voice results?
• How does population literacy affect findings?
Acknowledgments
• GeoPoll: Max Richman, King Beach, Jon Bernt, James Eberhard
• New School for Social Research