7.5 Information Propagation on WhatsApp
7.5.5 Real Time Dynamics on WhatsApp Network
Finally, we look at how actual content disseminates through the WhatsApp network and apply these real-time dynamics in our infection model to estimate the dissemination time of misinformation in IMPs. By observing all occurrences of a single piece of information, it is possible to analyze some dissemination characteristics of this kind of content (e.g., we can identify where it originated, how long it takes to be reshared, how long it lasts on the network, and how many users it reached).
Figure 7.12: Users infected by time in simulations of the SEIR model using recovery rate (r) to add immunity to users. (α) = 0.1. Forward limit (φ) = 5. Panels: (a) Wpp. India, (b) Wpp. Indonesia, (c) Wpp. Brazil, (d) Telegram Brazil. X-axis: Time (iterations); y-axis: Users (0.0 to 1.0); curves E, I, and R for r = 0.001, r = 0.01, and r = 0.1. Source: The Author.
Multimedia messages are usually shared unaltered across the network and are therefore easier to track than text messages. Thus, we select the images posted on WhatsApp to analyze real-time characteristics of content propagation across the network. Since we group sets of similar images and all their sharing information using perceptual hashes, we are able to determine time intervals for the spreading of those messages. For this analysis, we count image popularity in the datasets used during the experiments and simulations with the SEI model and calculate how each image spreads across the network. In total, for the WhatsApp datasets from Brazil, India, and Indonesia, more than 1 million images were identified, of which 784k were unique image objects. To evaluate spreading metrics regarding time and coverage, we consider only the images that were shared at least twice, since we cannot observe the spreading of images posted only once. This set consists of 2,384 images in Indonesia, 103,031 images in Brazil, and 44,731 images in India, which represents approximately 20% of all images.
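The grouping and filtering step described above can be sketched as follows. This is only an illustration under stated assumptions: each post is taken to be a hypothetical `(image_hash, group_id, timestamp)` tuple, and similar images are assumed to have already been mapped to the same perceptual-hash key (the actual similarity matching used for the datasets is not shown here).

```python
from collections import defaultdict

def summarize_images(posts, min_shares=2):
    """Count shares and distinct groups per image hash.

    posts: iterable of (image_hash, group_id, timestamp_minutes) tuples
           (a hypothetical record layout, not the thesis data format).
    Returns only the images shared at least `min_shares` times.
    """
    shares = defaultdict(int)
    groups = defaultdict(set)
    for image_hash, group_id, _timestamp in posts:
        shares[image_hash] += 1
        groups[image_hash].add(group_id)
    return {
        h: {"shares": n, "groups": len(groups[h])}
        for h, n in shares.items()
        if n >= min_shares
    }

# Toy data: image "a1" is posted three times in two groups, "b7" only once.
posts = [
    ("a1", "g1", 0), ("a1", "g2", 5), ("a1", "g2", 9),
    ("b7", "g1", 3),
]
summary = summarize_images(posts)
```

In this toy input, the image posted a single time is dropped, mirroring the "shared at least twice" filter.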
With this data, we calculate the total number of shares of each image and the number of groups in which it appeared. Figures 7.13(a) and 7.13(b) show the Cumulative Distribution Function (CDF) of the total number of shares and of the number of distinct groups each image appeared in. Some very popular images were broadly shared, more than 500 times in Brazil and a thousand times in India; moreover, they reached more than 100 groups in both countries. Even though a large share of the images were shared just a few times, the more popular ones demonstrate that WhatsApp can be used not only for private conversations but also as a mass communication medium whose content can go viral.
Figure 7.13: CDF of sharing coverage and time dynamics metrics of images shared at least twice on WhatsApp. Panels: (a) Total shares, (b) Total groups, (c) Lifetime, (d) Inter-event. X-axes (log scale): Total Shares, #Groups Posted, LifeTime (min), Inter-event (min); y-axis: Prob. (X < x); curves for Brazil, India, and Indonesia. Source: The Author.
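As an illustration of how a CDF such as those in Figure 7.13 can be obtained from per-image counts, the sketch below builds an empirical CDF with the standard library. The sample values are made up, and the function uses the common "less than or equal" convention, which may differ slightly from the Prob. (X < x) axis label of the figure.

```python
from bisect import bisect_right

def ecdf(values):
    """Return F where F(x) is the fraction of values <= x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("ecdf needs at least one value")

    def F(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return F

# Made-up share counts for six images (not taken from the datasets).
total_shares = [2, 2, 3, 5, 40, 600]
F = ecdf(total_shares)
```

With such a function, a statement like "most images were shared just a few times" corresponds to `F` already being close to 1 at small values of `x`.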
We also analyze their “lifetimes” in Figure 7.13(c). The lifetime is the time difference, in minutes, between the first and last occurrence of a single image in our dataset. While most of the images (80%) last no more than 2 days, there are images in both Brazil and India that continue to be reshared even 2 months (10^5 minutes) after their first appearance. Further analysis, in Figure 7.13(d), shows the distribution of the “inter-event time” between posts of the same image. This represents how many minutes it takes for the image to be re-posted after an appearance in the network. We observe that inter-event times in India are much shorter than in Brazil and Indonesia: more than 50% of posts in India are made within 10 minutes or less of the previous one, while just 20% of shares in Brazil and Indonesia fall within this interval. A manual investigation of the reasons behind the short period between posts suggested that the data from India contains more automated, spam-like behavior than the data from Brazil and Indonesia.
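The two time metrics can be made concrete with a small sketch. It assumes that the posting times of one image are available as minutes since some reference point, and it interprets the inter-event time as the gap between consecutive posts of that image anywhere in the dataset; both are assumptions for illustration.

```python
def time_dynamics(timestamps):
    """Return (lifetime, inter_event_times) for one image.

    timestamps: posting times in minutes, in any order (at least one value).
    """
    ordered = sorted(timestamps)
    # Lifetime: last occurrence minus first occurrence.
    lifetime = ordered[-1] - ordered[0]
    # Inter-event times: gaps between consecutive occurrences.
    inter_event = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return lifetime, inter_event

# An image posted at minutes 0, 15, 120, and 3000 (toy values).
lifetime, gaps = time_dynamics([120, 0, 15, 3000])
```

An image with n occurrences contributes one lifetime value and n - 1 inter-event values, which is why the two distributions in Figure 7.13 are built from different numbers of samples.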
In conclusion, these results suggest that WhatsApp is a very dynamic network and most of its image content is ephemeral, i.e., the images usually appear and vanish quickly. The linear structure of chats makes it difficult for old content to be revisited, yet some images linger on the network longer, continuing to spread over weeks or even months.
Figure 7.14: Real-time SEI model using an “incubation time” before spreading the infection; each iteration equals 1 minute (log scale). (α) = (β) = 0.1. Forward limit (φ) = 5. Panels: (a) Wpp. Brazil, (b) Wpp. India, (c) Wpp. Indonesia. X-axis: Time (minutes), 10^0 to 10^5; y-axis: % Users Infected (0.0 to 1.0); curves Exp. and Inf. for the GroupTime, Random, and Inter-event strategies. Source: The Author.
Finally, we use these propagation dynamics, calculated from our WhatsApp data, to run one more simulation that measures the speed of viralization in real time.
In the SEI model, the spread of information was measured in number of iterations. In this experiment, we adapt the SEI model to use these time dynamics and to measure the spread in minutes. For this, we add an “incubation time” based on the time that real data takes to spread over the network. In this version of the model, each iteration represents 1 minute, but when an infected node intends to spread, it has to wait a specific amount of time before doing so. This time is sampled from a distribution of “waiting times”, which can be:
(i) Random: a uniform distribution over 1 to 1440 minutes (1 day);
(ii) Inter-event Time: the empirical distribution of inter-event times computed in Figure 7.13(d);
(iii) Group Time: this strategy is based on the idea that it usually takes longer for a message to reach 100 groups than to reach 2 groups. To implement it, we make the incubation time in the initial steps smaller than in the subsequent steps, also based on the time information from the dataset gathered from WhatsApp. During the simulation, we track the number of groups the infection has already reached and, at each step, we use a different time distribution according to how long it took the actual images on WhatsApp to reach that number of groups in our data.
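The three waiting-time strategies could be sampled along the following lines. This is a minimal sketch, not the simulation code used for the experiments: the function names, the integer-minute uniform draw, and the `gaps_by_groups_reached` lookup table (number of groups reached mapped to waiting times observed at that stage) are all hypothetical choices made for illustration.

```python
import random

def sample_random(rng):
    # Strategy (i): uniform over 1..1440 minutes (1 day), in whole minutes.
    return rng.randint(1, 1440)

def sample_inter_event(rng, observed_gaps):
    # Strategy (ii): resample from the observed inter-event times,
    # i.e., draw from their empirical distribution.
    return rng.choice(observed_gaps)

def sample_group_time(rng, gaps_by_groups_reached, groups_reached):
    # Strategy (iii): pick the waiting-time sample recorded for the largest
    # stage not exceeding the number of groups already reached
    # (falls back to the earliest stage if none qualifies).
    eligible = [k for k in gaps_by_groups_reached if k <= groups_reached]
    stage = max(eligible) if eligible else min(gaps_by_groups_reached)
    return rng.choice(gaps_by_groups_reached[stage])

rng = random.Random(0)  # fixed seed so the sketch is reproducible
wait_random = sample_random(rng)
wait_inter_event = sample_inter_event(rng, [3, 7, 45])
# Toy table: short waits while few groups are reached, long waits later.
toy_table = {1: [5], 10: [500]}
wait_early = sample_group_time(rng, toy_table, groups_reached=4)
wait_late = sample_group_time(rng, toy_table, groups_reached=12)
```

In a simulation where each iteration is 1 minute, a node that becomes infected at minute t would then attempt to spread at minute t plus the sampled waiting time.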
Figure 7.14 shows experiments considering the three strategies to compute the time to spread. In India, where inter-event times are bursty, the inter-event time strategy exposes 60% of users to the content in the first 200 minutes of infection. In Brazil, group time is faster than inter-event time and infects around half of the users in the first 2 days (3000 minutes). Finally, in Indonesia all three