Censorship of domestic social media platforms in China is operated through a system of intermediary liability or “self-discipline” in which companies are held liable for the content on their platforms.Footnote 1 Companies are expected to invest in technology and personnel to carry out content censorship according to government regulations. Self-discipline works as a means for the government to shift responsibility for information control to the private sector. Social media censorship in China is dynamic and often reactive to sensitive events. Previous work suggests that the Chinese government strategically restricts online content during the course of an event to mould public opinion.Footnote 2
While it is clear that events act as catalysts for censorship, do these controls change once an event is over? Does the censored content remain blocked or is it eventually permitted? What can be inferred from these censorship patterns about the role the government and private companies play in information control in China?
In this paper, we present the first study of the relationship between political events and censorship on WeChat (weixin 微信), the most popular chat platform in China, through a longitudinal analysis of keyword censorship related to the 19th National Communist Party Congress (NCPC19).
The NCPC19 was held from 18 October to 24 October 2017, marking the halfway point of President Xi Jinping's 习近平 ten-year term, and served as a bellwether of his power over the Party. The National Communist Party Congress is the most important political event for the Chinese Communist Party (CCP), and information around it is carefully managed.
To measure the effect of the NCPC19 on censorship on WeChat, we collected news articles reporting on the NCPC19 and tested the sending of the article text in a WeChat group chat, documenting instances of message blocking in chat features. For news articles whose texts were blocked, we dynamically determined the exact combinations of keywords used to trigger the message blocking using a testing methodology which involves sending many modified copies of the original blocked message.
Overall, we found 531 NCPC19-related keyword combinations were blocked. The rate of keyword blocking increased days before the NCPC19 began; during the Congress, WeChat applied the broadest level of censorship. Approximately one year after the Congress, the majority of the previously blocked keyword combinations (75.7 per cent) no longer triggered censorship. A broad range of content was censored, including criticism and general speculation about the Congress, leaders and power struggles. Censored keyword combinations also included generic references to government policies and CCP ideological concepts, which could restrict benign discussions and potentially even pro-government messages. Keyword combinations related to President Xi Jinping were blocked at higher rates and remained blocked for the longest duration relative to other types of content.
Our results provide insights into how political events affect censorship on social media platforms in China. Periods of power transition such as the Congress can trigger uncertainty and insecurity in authoritarian regimes, making controlling messaging around such events critically important. The heightened censorship on the eve of the NCPC19 suggests that WeChat faced direct or indirect government pressure to control content related to the event. However, rather than the selective and strategic information controls observed in other studies,Footnote 3 this pressure appears to have motivated broad and expansive censorship of content related to the NCPC19. The eventual unblocking of NCPC19-related keyword combinations shows that sensitivity around events is episodic. As government pressure around the event relaxed, incentives to block related content may have also decreased.
In this section, we review theories on how authoritarian regimes use social media, outline the system of information control in China, and describe the importance of the National Party Congress (NPC) to the CCP.
One of the central puzzles in understanding authoritarian regimes in the digital age is how the state allows or suppresses online expression to maintain its rule. China, with its intricate censorship apparatus, provides useful insights into this puzzle. Numerous studies have argued that the CCP is adaptive at handling social tension and strategically controlling information and technologies to serve its political goals.Footnote 4 How the CCP decides which types of content to permit or censor is a topic of debate in the literature. While Gary King, Jennifer Pan and Margaret Roberts's theory, which argues that the Chinese authorities target collective action but tolerate government criticism (a topic others found frequently targeted with repressionFootnote 5), has been influential, there are counterpoints to it.Footnote 6 Studies have found that private companies have a degree of flexibility in implementing censorship guidelines and that there is no unified keyword list provided to companies.Footnote 7 Blake Miller found that content related to collective action, political humour and government criticism is censored at a similar rate on Sina Weibo (xinlang weibo 新浪微博).Footnote 8 Juha Antero Vuori and Lauri Paltemaa found that keywords related to the CCP are more likely to be filtered than those related to opposition or protests, leading to the conclusion that the goal of the CCP's censorship practice is to protect its one-party rule.Footnote 9
Dynamic changes in censorship patterns have been correlated with events. Anne-Marie Brady argues that the CCP is capable of “engineering consent” by instigating particular emotions and selectively reporting information during important events,Footnote 10 although recent work suggests that sudden censorship during an important event may backfire.Footnote 11 Analysing censorship on Sina Weibo related to the 18th National Communist Party Congress in 2012, Jason Ng and Pierre Landry found that searches on the platform for the names of higher ranked and incumbent Congress delegates were censored before and during the event, with a decrease in blocking following the Congress.Footnote 12
Scholars have proposed various theories to explain China's dynamic censorship.Footnote 13 Jonathan Hassid argues that whether the authorities would tolerate online space depends on whether they can predict the direction of online discussion.Footnote 14 Peter Lorentzen develops a model where an authoritarian regime can choose “precisely” how much reporting to permit at a given time to balance its desire to minimize local corruption without risking an uprising.Footnote 15 Others contend that the state can manipulate messaging on social media to shape favourable public opinion.Footnote 16
Underlying much of this literature is an assumption that the Chinese government is a powerful entity with unified goals and that it is able to mobilize privately owned platforms to implement its censorship decisions. While it is clear that government actors in China have considerable influence over the content management practices of private companies, the system of control has been shown to be fragmented and decentralized. Social media platforms in China operate in a system of intermediary liability in which they are held liable for content on their platforms and are expected to invest in staff and technologies to moderate content and stay in compliance with government regulations.Footnote 17 Failure to comply can lead to fines or revocation of operating licences. However, the guidelines on prohibited topics provided by the government are vaguely and broadly defined (for example, “disrupting social order and stability”), which can lead to self-censorship.Footnote 18 Further complicating this system are multiple government bureaucracies that rely on many private companies to enforce censorship directives.Footnote 19 Within this structure, private companies may not always be dependable agents to the state. Companies have been found to defy government directives in order to attract more users from competitorsFootnote 20 and to censor names of competitors,Footnote 21 actions which appear to be motivated by business interests rather than government pressures.
The NPC offers an opportunity to observe how the system of information controls in China reacts to sensitive events. The functions of the Congress include confirming personnel changes at the central level, revising ideological principles set out in the Party constitution, and adjusting the national development strategy. Since power transition is norm-based rather than institutionalized, the NPC is often a period of political uncertainty. Moreover, in single-party regimes, these events are not only a period of power transition but also an opportunity to signal the Party's strength, highlight achievements of the rulers and implicitly discourage opponents from challenging the status quo.Footnote 22 Authoritarian regimes also need to present an image of unity during major power transitions.Footnote 23 Combined, these factors make the control of messaging around events like the NPC critically important for authoritarian regimes.
The NCPC19 was particularly important for Xi Jinping. Following months of intense speculation over how Xi might further his influence at the NCPC19, the Congress concluded with Xi's power reaching new heights. Xi became the third leader, after Mao Zedong 毛泽东 and Deng Xiaoping 邓小平, to have his name enshrined in the Party constitution. The Congress also broke with leadership transition traditions and the norm of a ten-year term limit for the Chinese presidency by not naming a clear successor to Xi. In the lead-up to the NCPC19, reports circulated of new regulations over the internet in China, which were attributed to “stability maintenance” procedures put in place in anticipation of the event.Footnote 24 In line with Ng and Landry's experiment of using censorship of Sina Weibo during the Party Congress as a proxy to gauge general trends of freedom of expression in China,Footnote 25 how messages are controlled around the NCPC19 will shed light on the overall political environment under the current administration.
In this section, we describe how keyword-based censorship on WeChat works, how this blocking can be empirically measured, our novel method for discovering censored keyword combinations related to events, and our testing regimen that applied this method to the case study.
Measuring keyword censorship on WeChat
WeChat hosts user-generated content through three main features: chat functions (including one-to-one chat and group chat), WeChat Moments (pengyou quan 朋友圈) (similar to the timeline feature of Facebook), and the Public Accounts platform (gongzhong pingtai 公众平台) (social media blogging). Previous research has documented censorship on all of these features.Footnote 26 In this study, we focus on keyword-based censorship seen in chat features.
WeChat's keyword-based censorship mechanism for chat
Keyword-based censorship on WeChat is only enabled for users with accounts registered to mainland China phone numbers.Footnote 27 Censorship for these accounts persists even if these users later link their account to a number outside of mainland China. Censorship on WeChat is not transparent: the message containing filtered content does not appear on the receiver's end and no notice is given to the sender that their message is blocked or why it was blocked.
WeChat performs censorship server-side, which means that measuring keyword blocking on the platform requires devising a message possibly containing censored content, sending that message through the app and recording whether it is censored. WeChat censors a message based on whether it contains a blacklisted keyword combination. A keyword combination consists of one or more keyword components and a message is filtered if it contains every component in a blacklisted keyword combination somewhere in the message, even if they are not adjacent.Footnote 28 For example, if a keyword combination contains three components (for example, “习近平 [ + ] 强人政治 [ + ] 中共十九大” Xi Jinping [ + ] qiangren zhengzhi [ + ] zhonggong shijiu da, Xi Jinping [ + ] strongman politics [ + ] NCPC19), a message is filtered if all three components appear somewhere in the message, in any order (see Figure 1). Combinations with a larger number of components are able to more precisely target content, whereas combinations with a small number of components or even a single component generally target a broader range of content.
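The matching rule described above can be illustrated with a minimal sketch. It is only a model of the observed behaviour, not WeChat's actual server-side code, which is not visible to us; the function name `is_filtered` and the one-entry blacklist are hypothetical and chosen for illustration.

```python
from typing import Iterable, Sequence


def is_filtered(message: str, blacklist: Iterable[Sequence[str]]) -> bool:
    """Return True if `message` contains every component of at least one
    blacklisted keyword combination, in any order and at any position."""
    return any(
        all(component in message for component in combination)
        for combination in blacklist
    )


# Hypothetical one-entry blacklist built from the example combination above.
example_blacklist = [("习近平", "强人政治", "中共十九大")]

# All three components present, non-adjacent and out of order: filtered.
blocked = is_filtered("中共十九大闭幕后，有评论谈到强人政治与习近平", example_blacklist)

# Only two of the three components present: not filtered.
allowed = is_filtered("习近平出席中共十九大", example_blacklist)
```

Under this model, adding components to a combination can only narrow the set of messages it matches, which is why combinations with few components have broader coverage.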
Conducting tests to determine if content related to a specific event is blocked has the challenge of developing a sample that is relevant to the event. Previous research has found that using news articles as a sample set for testing is an effective means of tracking censorship of event-related content over a defined time period.Footnote 29 We used this approach to develop our testing sample for documenting keyword censorship related to the NCPC19.
To automate testing and add rigour to our process, we developed a tool that scrapes news articles from RSS feeds of selected news sites. We automatically trialled the sending of title and body text from each article in a WeChat group chat between three test accounts: two registered to Canadian phone numbers and one registered to a mainland Chinese phone number.Footnote 30 One of the Canadian accounts was used to send messages, and the second Canadian account performed no actions, acting only as a passive user to facilitate the creation of a group chat. Throughout this process, our test accounts were limited to interacting with each other in the group chat and never interfaced with real users of the platform. The Chinese account was used to passively monitor whether messages sent in the group chat had been filtered.Footnote 31
Determining keyword combinations from censored text
After sending some text as a message in the WeChat group chat, if the Chinese account did not receive it, we flagged the message text as containing one or more keyword combinations that trigger censorship. We then ran further tests to reduce the text to the minimum number of characters required for censorship to occur. We bisected the message to identify regions containing characters not necessary to trigger censorship (i.e. by separating the text into two halves and deleting each half in turn to see whether the remaining text was still censored), and recursively reapplied this procedure to any halves required for censorship. We then split the reduced string into its components by finding all positions in this string of characters where separating the text (i.e. by adding other characters in between) still resulted in a censored message. We call the resulting components of this process the blocked keyword combination.
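The reduction and splitting steps can be sketched as follows. This is one plausible reading of the procedure, not the tool used in the study. The real test sends a message through WeChat and checks whether the Chinese account receives it; here that check is replaced by a simulated oracle, `is_censored`, backed by a hypothetical blacklist. The names `minimize` and `split_components` and the filler character are likewise assumptions.

```python
from typing import Callable, List

# Hypothetical blacklist standing in for the unseen server-side list.
SIMULATED_BLACKLIST = [("习近平", "强人政治", "十九大")]


def is_censored(text: str) -> bool:
    """Simulated oracle; in real testing this is a send-and-observe trial."""
    return any(all(c in text for c in combo) for combo in SIMULATED_BLACKLIST)


def minimize(text: str, oracle: Callable[[str], bool]) -> str:
    """Recursively bisect `text`, dropping every region that is not needed
    for the message to remain censored."""

    def reduce(prefix: str, segment: str, suffix: str) -> str:
        # Invariant: prefix + segment + suffix is censored.
        if oracle(prefix + suffix):
            return ""  # the whole segment is unnecessary
        if len(segment) == 1:
            return segment  # a single required character
        mid = len(segment) // 2
        left, right = segment[:mid], segment[mid:]
        right_min = reduce(prefix + left, right, suffix)
        left_min = reduce(prefix, left, right_min + suffix)
        return left_min + right_min

    return reduce("", text, "") if oracle(text) else ""


def split_components(minimal: str, oracle: Callable[[str], bool],
                     filler: str = "|") -> List[str]:
    """Insert `filler` at each position; where the message is still censored,
    the position is a boundary between two components."""
    if not minimal:
        return []
    components, current = [], minimal[0]
    for i in range(1, len(minimal)):
        if oracle(minimal[:i] + filler + minimal[i:]):
            components.append(current)
            current = minimal[i]
        else:
            current += minimal[i]
    components.append(current)
    return components


article = "昨天习近平在十九大上谈到强人政治的问题"
minimal = minimize(article, is_censored)
combination = split_components(minimal, is_censored)
```

The sketch assumes the filler character does not itself appear in any blacklisted component and that the text triggers a single combination; text containing several overlapping combinations would need repeated passes.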
Data collection and retest periods
We conducted observations of keyword-based blocking related to the NCPC19 during three event phases: before the event (22 September–17 October 2017), during the event (18–25 October 2017) and after the event (26 October–25 November 2017). Between 22 September and 25 November 2017, we retested on a daily basis any keyword combinations that we found blocked previously. Even when a keyword combination was found to be unblocked, we still continued retesting it and monitoring its censorship status to observe possible fluctuations. From 24 January to 10 September 2018, we again retested previously discovered NCPC19-related keyword combinations to monitor for the unblocking of previously sensitive keyword combinations.Footnote 32 Retesting of blocked content was motivated in part by previous work that suggests that the duration of time for which a keyword combination is censored can signal how sensitive the content is to the CCP.Footnote 33
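For readers reproducing the analysis, the observation windows above can be encoded as a small lookup. This is a convenience sketch only; the phase labels and the function name `phase_of` are our own shorthand, and the date boundaries are taken directly from the text.

```python
from datetime import date
from typing import Optional

# (label, first day, last day), inclusive, as stated in the text.
PHASES = [
    ("pre-Congress", date(2017, 9, 22), date(2017, 10, 17)),
    ("Congress", date(2017, 10, 18), date(2017, 10, 25)),
    ("post-Congress", date(2017, 10, 26), date(2017, 11, 25)),
    ("extended retest", date(2018, 1, 24), date(2018, 9, 10)),
]


def phase_of(day: date) -> Optional[str]:
    """Return the phase containing `day`, or None if no testing took place."""
    for label, start, end in PHASES:
        if start <= day <= end:
            return label
    return None
```

Dates between 26 November 2017 and 23 January 2018 fall in no phase, reflecting the gap between the daily retests and the extended retest period.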
Our testing sample consisted of articles with RSS tags related to the Congress collected from a variety of news sources including China's official state media, international media, Hong Kong-based media and censored content aggregation websites (see Appendix A for a full list of media sources). The vast majority of news articles we collected were written in simplified Chinese. In addition to testing text from news sources, we included the names of the 2,287 delegates to the NCPC19 in our sample to build on the work of Ng and Landry, who tested delegate names on Sina Weibo during the 2012 Congress.Footnote 34
We performed a content analysis of the keyword combinations our tests identified as blocked to understand the underlying context behind their implementation. One researcher (a fluent Chinese speaker) grouped the keyword combinations into content categories based on a code book we developed for the study (see Appendix B). Another researcher then performed inter-rater reliability checks on a randomized sample of the coded keyword combinations to ensure consistency.
Our work was limited to only discovering censored content contained in those text samples that we tested. To minimize our sample biases, we sampled news articles from both international and regional media that may have included content critical of the government of China, and Chinese state media, which are government-approved sources of information.Footnote 35
Our case study analyses a unique political event and explores one social media platform, which could pose limitations to the generalizability of our findings across other events and platforms in China. The NPC is arguably the most important political event for the CCP. Owing to its highly sensitive nature, it is expected that the government would control information around the event more strictly than around non-sensitive national events and, accordingly, companies would face greater pressure to censor more content than usual.Footnote 36 In interpreting our findings, we are mindful that our data may be a result of inflated government pressure rather than how censorship is implemented on an everyday basis. Despite these limitations our findings and methodology provide a baseline for further comparative research.
Our findings are based on testing real-time censorship of group chats on WeChat. Such censorship is a form of pre-emptive censorship implemented via automatic keyword filtering as opposed to post-hoc censorship (i.e. decisions made by human reviewers after content has been posted). Past research shows that Chinese social media platforms use a mix of pre-emptive and post-hoc censorship to keep information and online discussion in line.Footnote 37 While our findings may not be directly generalized to censorship by human reviewers, they serve as a comparison to post-hoc censorship.
We report our findings by providing an overview of keyword combination blocking and unblocking observed during the keyword collection period (22 September–25 November 2017) as well as unblocking results during our extended retest period (24 January–10 September 2018). We use the term “unblocking” to refer to when a previously censored keyword combination no longer triggers censorship in subsequent retests. In our analysis, we break down our findings by event phase and content category and examine the sensitivity of each content category by computing a “sensitivity score.”
Overview of NCPC19-related Keyword Censorship
In this section, we report on the rate of keyword censorship roughly a month before, during, and a month after the NCPC19. Overall, we found that keyword censorship on WeChat increased as the Congress approached; as the event receded, less related content was censored. We did not observe any spikes of new censorship during the week of the Congress; instead, most of the new blocked keyword combinations were discovered during the week leading up to the Congress.
Between 22 September and 25 November 2017, we identified a total of 531 blocked keyword combinations related to the NCPC19. Figure 2 shows the distribution of the blocked keyword combinations we identified during our observation period, by date of discovery. There are two spikes in the daily number of blocked keyword combinations discovered, on 25 September and 14 October. The first spike was likely owing to sampling issues. When we began testing, our tool extracted text from articles that were published before 22 September 2017. Therefore, it is likely that this spike was owing to a larger number of articles being accumulated over time containing many unique blocked keyword combinations that had not been found previously. As testing continued, the number of new keyword combinations discovered by date dropped until the second spike on 14 October, four days before the opening of the NCPC19, which suggests that censorship was heightened as the Congress approached.
In this section, we report on the rate of keyword combination unblocking in a two-month period surrounding the Congress. Overall, we found that WeChat maintained strict control over Congress-related content in our two-month data collection period; WeChat did not lift its censorship on blocked content until at least a month after the Congress had come to an end.
We report the results of our data collection period across three event phases: pre-Congress (22 September–17 October 2017); Congress (18–25 October 2017); and post-Congress (26 October–25 November 2017). Figure 3 shows the daily number of new keyword combinations discovered and the daily number of unblocked keyword combinations during our keyword collection period (22 September–25 November 2017).
To minimize the interference of different sample sizes in each phase (owing to new keyword combinations being found blocked, which we added to our set of known combinations), we also report the number of keyword combinations that were unblocked in each phase as a percentage of the total number of blocked keyword combinations in that phase. Figure 4 shows the ratio of blocked and unblocked keyword combinations by each phase.
Unblocking rates were relatively low and consistent between 22 September and 25 November 2017. Within this period, there was a slightly higher unblocking rate in the pre-Congress phase. Out of 346 NCPC19-related keyword combinations discovered during the pre-Congress phase, 69 (19.9 per cent) were unblocked before 18 October, the day when the Congress officially opened. In our two-month data collection period, we saw the lowest proportion of unblocking relative to the total number of blocked keyword combinations during the Congress (18–25 October). As of the end of the Congress, out of the 342 known keyword combinations that were actively blocked at some point during the course of the Congress, 300 remained blocked and 42 were unblocked. Of the keyword combinations that were unblocked during the course of the Congress, over one-third (35.7 per cent) were unblocked on 25 October, the day the Congress ended. As of 25 November 2017, 201 out of the 531 keyword combinations discovered (37.8 per cent) had been unblocked; in other words, one month after the conclusion of the NCPC19, the majority of the Congress-related keyword combinations we identified during our testing remained blocked on WeChat.
Extended monitoring of keyword unblocking
To determine whether discussion of the Congress would eventually become uncensored, we performed retests of all keyword combinations that were still blocked as of 25 November 2017 between 24 January and 10 September 2018. We found that the unblocking rate in this period (60.9 per cent) was significantly higher than in the two months surrounding the Congress (19.9 per cent, 12.3 per cent and 21.4 per cent in our three data collection phases, respectively). Our results show that as of 10 September 2018, a total of 402 of 531 NCPC19-related keyword combinations (75.7 per cent) were unblocked on WeChat, which suggests that WeChat eventually relaxed its censorship as time after the Congress elapsed.
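The unblocking rates reported above can be checked from the counts given in the text. The short sketch below does so; the helper name `pct` is ours. The count of 201 combinations unblocked during the extended retest is derived as 402 minus 201 and assumes that no combination unblocked by 25 November 2017 was later re-blocked.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


# Counts taken from the text.
pre_congress = pct(69, 346)        # unblocked before 18 October 2017
during_congress = pct(42, 342)     # unblocked during 18-25 October 2017

# Still blocked on 25 November 2017: 531 discovered minus 201 unblocked.
still_blocked = 531 - 201

# Derived (assumption: no re-blocking): unblocked during the extended retest.
newly_unblocked = 402 - 201
extended_retest = pct(newly_unblocked, still_blocked)

cumulative = pct(402, 531)         # unblocked as of 10 September 2018
```

With these inputs the rates come to 19.9, 12.3, 60.9 and 75.7 per cent, respectively.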
To better assess the possible motivations driving WeChat's keyword blocking and unblocking decisions, we analysed the context of each keyword combination by grouping them into ten content categories based on interpretation of the underlying context (for a copy of our code book, see Appendix B). We found a diversity of blocked content including keyword combinations with pro, neutral and anti-government sentiments. References to Xi Jinping accounted for the highest number of blocked keyword combinations and were unblocked at the lowest rate.
Figure 5 shows the distribution of all 531 keyword combinations found blocked between 22 September and 25 November 2017. The largest category of censored keyword combinations made references to Xi Jinping (31.6 per cent), followed by keyword combinations referencing Power Transition (14.1 per cent), and Party Policies and Ideologies (13.0 per cent).
While the results of our content analysis indicate that a high number of blocked keyword combinations are Xi Jinping-related, the absolute number of keyword combinations discovered does not necessarily reflect the severity of censorship under a given content category, as a single keyword combination with broad coverage (for example, “党主席制” dang zhuxi zhi, Party chairman system) could have greater impact in censoring messages than many highly specific keyword combinations (for example, “习家军 [ + ] 大权独揽 [ + ] 十九大” Xi jiajun [ + ] daquan dulan [ + ] shijiuda, Xi's army [ + ] hold power to himself [ + ] NCPC19).
To account for the fact that keyword combinations used to perform message filtering can have varying degrees of coverage, we devised a sensitivity score to estimate the impact of each keyword combination. This score is designed to reflect approximately how many articles are censored owing to a given keyword combination, as a combination that has fewer and more commonly used components and that is blocked for a longer time can be expected to cause more articles to be censored.
Since we do not have enough data to exactly calculate the probability of any keyword combination triggering censorship of an article, we perform the following approximation to work out the sensitivity score. We first take the observed frequencies of each individual keyword component in each article and approximate the probability of a keyword combination triggering censorship of an article as the product of the frequencies of each constituent component. To estimate the coverage of a keyword combination, we then multiply the frequency product by the total duration of each keyword combination on the blacklist as measured in days, yielding the final score.
In addition to calculating the sensitivity score of keyword combinations, we also calculate scores for content categories. We calculate a category's sensitivity score as the average of the score of each keyword combination under that category. Table 1 shows the sensitivity score of each content category in descending order.
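The score can be sketched in a few lines. This is a minimal illustration of the approximation as described, under two assumptions that the text does not spell out: that a component's "frequency" is the share of sampled articles containing it, and that components are treated as occurring independently of one another. The function names and the toy data are hypothetical.

```python
from math import prod
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def component_frequency(component: str, articles: Sequence[str]) -> float:
    """Share of articles containing the component (assumed definition)."""
    return sum(component in article for article in articles) / len(articles)


def sensitivity_score(combination: Sequence[str], days_blocked: int,
                      articles: Sequence[str]) -> float:
    """Product of component frequencies multiplied by days on the blacklist."""
    frequency_product = prod(
        component_frequency(c, articles) for c in combination
    )
    return frequency_product * days_blocked


def category_scores(
    records: Sequence[Tuple[str, Sequence[str], int]],
    articles: Sequence[str],
) -> Dict[str, float]:
    """Average score per category, sorted in descending order.
    Each record is (category, combination, days_blocked)."""
    grouped: Dict[str, List[float]] = {}
    for category, combination, days in records:
        grouped.setdefault(category, []).append(
            sensitivity_score(combination, days, articles)
        )
    averages = {cat: mean(scores) for cat, scores in grouped.items()}
    return dict(sorted(averages.items(), key=lambda kv: kv[1], reverse=True))


# Toy corpus: single letters stand in for keyword components.
toy_articles = ["a b c", "a b", "a", "d"]
toy_score = sensitivity_score(("a", "b"), 10, toy_articles)  # (3/4)*(2/4)*10
```

Because the frequencies are multiplied, each additional component lowers the score, so a single common component blocked for a long time scores far higher than a long, specific combination.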
Figure 6 shows a distribution of the 531 keyword combinations we identified, ordered by their respective sensitivity scores. The filtering coverage of any given keyword combination varies highly; the top 66 keyword combinations (12.4 per cent) accounted for about 95 per cent of the cumulative sensitivity score. This variation shows how, in our testing, a small handful of broad keyword combinations triggered the majority of filtering, while most keyword combinations are narrowly scoped to target highly specific text.
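The concentration described above (a small number of combinations accounting for most of the cumulative score) can be computed with a short routine such as the following. It is an illustrative sketch; the function name `top_k_for_share` and the toy scores are hypothetical and are not the study's data.

```python
from typing import Sequence


def top_k_for_share(scores: Sequence[float], share: float = 0.95) -> int:
    """Smallest number of highest-scoring combinations whose scores sum to
    at least `share` of the total score."""
    ordered = sorted(scores, reverse=True)
    threshold = share * sum(ordered)
    running = 0.0
    for k, score in enumerate(ordered, start=1):
        running += score
        if running >= threshold:
            return k
    return len(ordered)


# Toy example: three of five combinations cover at least 95 per cent.
toy_scores = [60.0, 25.0, 12.0, 2.0, 1.0]
k = top_k_for_share(toy_scores)
```

Applied to the 531 scores in the study, the same calculation would be expected to return the 66 combinations reported in the text.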
The sensitivity scores help to reveal what is considered to be more sensitive and therefore warrants censorship for a longer period of time or with fewer keyword components, resulting in broader filtering of content. Of our content categories, the most sensitive were Sensitive Date (0.020) and Social Activism (0.016), the categories with the fewest keyword combinations identified (6 and 12, respectively). The Sensitive Date category includes references that are critical of the CCP (for example, “六四事件 [ + ] 民主 [ + ] 独裁” liusishijian [ + ] minzhu [ + ] ducai, June 4 Incident [ + ] democracy [ + ] dictatorship), and is an example of how, for content which is considered highly sensitive, an alternative approach to implementing many filtering rules is to apply broad censorship using less specific keyword combinations. Xi-related keyword combinations featured the third-highest average sensitivity score (0.012). This content category also contained the highest number of keyword combinations (31.6 per cent), which demonstrates the overall sensitivity of topics related to Xi and shows the extent of WeChat's efforts to carefully manage discussion about him on the platform.
Content analysis of unblocked keyword combinations across phases
To further explore the context and significance of blocking and unblocking patterns, we provide detailed content analysis results from each testing phase in the sections below. Figure 7 tracks the number of blocked keyword combinations (the total number of known keyword combinations minus the number of unblocked keyword combinations) by date for each given date within our data collection period.
Overall, in the days leading up to the beginning of the Congress, WeChat prioritized controlling the circulation of content related to Xi Jinping and Power Transition themes such as speculated personnel changes and intra-Party factionalism. During the Congress, the unblocking rate was the lowest across all phases. The control over Xi Jinping-related content remained strict during the Congress. Interestingly, WeChat marginally relaxed the filtering of content related to Power Transition, a topic that is considered highly sensitive in authoritarian regimes. In the month following the close of Congress, we observed little unblocking of Xi- and Power Transition-related keyword combinations while WeChat unblocked more keyword combinations related to government criticism, discussions of domestic politics (Hong Kong, Taiwan and Ethnic Groups), Party policies, and more.
Phase 1: Pre-Congress
Between 22 September and 17 October 2017, we found 346 blocked keyword combinations. Among these, 69 had been unblocked by the time the Congress began. The content categories with the highest rates of unblocking during Phase 1 were Congress Delegates (100.0 per cent), Hong Kong, Taiwan and Ethnic Groups (38.56 per cent) and Party Policies and Ideologies (33.3 per cent). In contrast, content pertaining to Sensitive Dates, Power Transition and Xi Jinping was unblocked at the lowest rates (0 per cent, 1.5 per cent and 4.6 per cent, respectively). Figure 8 shows the number of keyword combinations found unblocked and still blocked by content category in this phase.
All 23 keyword combinations in the Congress Delegate category that were found to be blocked during the Pre-Congress phase were also unblocked in this period. In subsequent testing, almost all keyword combinations in this category were unblocked two days after they were first found to be blocked. Considering that keyword combinations overall were blocked for a much longer duration on average (96 days), the filtering of delegate names may have been owing to an error on WeChat's part.
We found 42 blocked keyword combinations that made neutral references to CCP ideologies and central policy including major policy programmes such as the Belt and Road Initiatives (“一带一路 [ + ] 丝绸之路经济带 [ + ] 建设 [ + ] 构想” yidai yilu [ + ] sichou zhi lu jingji dai [ + ] jianshe [ + ] gouxiang, Belt and Road Initiatives [ + ] Silk Road Economic Belt [ + ] construct [ + ] vision) and key CCP ideological concepts such as “socialism with Chinese characteristics” (“中国特色社会主义 [ + ] 全面依法治国 [ + ] 宪法 [ + ] 法律” Zhongguo tese shehuizhuyi [ + ] quanmian yifa zhiguo [ + ] xianfa [ + ] falü, socialism with Chinese characteristics [ + ] comprehensive rule of law [ + ] constitution [ + ] law). These keyword combinations were extracted from news articles published by the state media outlet Xinhua News Agency, a source that official government WeChat public accounts and government authorities instructed media units to use as a reference in their coverage of the NCPC19.Footnote 38 Fourteen out of these 42 keyword combinations were unblocked before the opening of the Congress. These keyword combinations were blocked for an average of one week.
Phase 2: During Congress
Between 18 and 25 October 2017, we found the lowest level of unblocking (42 keyword combinations accounting for 12.2 per cent of the total active keyword combinations in this date range). Figure 9 shows the number of keyword combinations found unblocked and still blocked by content category in this phase. While the low rate of unblocking is unsurprising considering that China's censorship is known to tighten around important events, it is interesting that we did not observe any spikes in new keyword combinations in any content category during this phase (see Figure 7).
Only one Congress Delegate keyword combination is represented in this sample. Setting this category aside, content that made neutral references to Party Policies and Ideologies was unblocked at the highest rate (23.8 per cent) compared to other content categories during the Congress. Examples include: “五中全会 [ + ] 五年规划 [ + ] 制定 [ + ] 目标” Wu zhong quanhui [ + ] wu nian guihua [ + ] zhiding [ + ] mubiao (Fifth Plenum [ + ] Five-year Plan [ + ] set [ + ] goal); “反腐败 [ + ] 斗争 [ + ] 自信 [ + ] 足够” fanfubai [ + ] douzheng [ + ] zixin [ + ] zugou (anti-corruption [ + ] struggle [ + ] confidence [ + ] enough).
Keyword combinations pertaining to Government Criticism and Power Transition were unblocked at a similar rate (17.5 per cent and 17.1 per cent, respectively). The majority of these keyword combinations were general references to personnel changes expected to happen at the NCPC19 (for example, “十九大人事” shijiuda renshi, 19th Party Congress human resource management; “十九大常委” shijiuda changwei, 19th Party Congress Standing Committee member), or specific personnel changes that were confirmed during the Congress (for example, “十九大 [ + ] 政治局委员 [ + ] 蔡奇” shijiuda [ + ] zhengzhiju weiyuan [ + ] Cai Qi, 19th Party Congress [ + ] Politburo Committee member [ + ] Cai Qi; “政治局常委 [ + ] 栗战书” zhengzhiju changwei [ + ] Li Zhanshu, Politburo Standing Committee member [ + ] Li Zhanshu).
Phase 3: Post-Congress
In the one-month period following the end of the Congress (26 October–25 November 2017), we continued to find new keyword combinations that are related to the event. As of 25 November 2017, 330 keyword combinations were still blocked on WeChat. Figure 10 shows the number of keyword combinations found unblocked and still blocked by content category in this phase.
During this phase, we found 90 unblocked keyword combinations. The three content categories with markedly higher unblocking rates were Congress Delegate (100 per cent), Government Criticism (48.9 per cent) and Social Activism (37.5 per cent). The rest of the categories were unblocked at similar rates, between 20 and 30 per cent, with the exception of content related to Power Transition (12.9 per cent) and Xi Jinping (8.0 per cent).
Data Retest Period
Between 24 January and 10 September 2018, we found 201 unblocked keyword combinations, representing one-half of the total unblocked keyword combinations identified in this study. In this retest period, the unblocking rate of active keyword combinations was 60.9 per cent (201 of the 330 keyword combinations still blocked at the end of the Post-Congress phase). All content categories were unblocked at a similarly high rate, including the ones that were unblocked at a significantly lower rate in previous phases. These include content related to Xi Jinping (79.5 per cent), Social Activism (57.1 per cent), Power Transition (55.3 per cent), Leadership (50 per cent), and Sensitive Date (50 per cent).
As of 10 September 2018, 402 of the 531 NCPC19-related keyword combinations (75.71 per cent) were unblocked, and 129 remained censored. Figure 11 shows the distribution of when keyword combinations were unblocked, by content category, across all phases. Overall, this figure shows that WeChat prioritized controlling the circulation of references to Xi Jinping by keeping Xi-related keyword combinations blocked for the longest period of time.
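The aggregate figures reported across the phases can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not the analysis code used in this study: the variable and function names are hypothetical, and the retest-period denominator rests on the assumption that the "active" keyword combinations in a period are those unblocked in it plus those still blocked at its end.

```python
# Unblocked keyword combinations per phase, as reported in the text.
unblocked_by_phase = {
    "pre_congress": 69,
    "during_congress": 42,
    "post_congress": 90,
    "retest": 201,
}

TOTAL_COMBINATIONS = 531   # all NCPC19-related keyword combinations
REMAINING_BLOCKED = 129    # still censored as of 10 September 2018
PRE_CONGRESS_ACTIVE = 346  # blocked combinations found before the Congress


def unblocking_rate(unblocked: int, active: int) -> float:
    """Return the percentage of active keyword combinations that were unblocked."""
    if active <= 0:
        raise ValueError("active must be a positive count")
    return 100.0 * unblocked / active


# The per-phase counts should add up to the reported total of unblocked items.
total_unblocked = sum(unblocked_by_phase.values())
overall_rate = unblocking_rate(total_unblocked, TOTAL_COMBINATIONS)

pre_congress_rate = unblocking_rate(
    unblocked_by_phase["pre_congress"], PRE_CONGRESS_ACTIVE
)

# Assumption (not stated by the authors): active = unblocked + still blocked.
retest_active = unblocked_by_phase["retest"] + REMAINING_BLOCKED
retest_rate = unblocking_rate(unblocked_by_phase["retest"], retest_active)

print(f"total unblocked: {total_unblocked}")
print(f"overall unblocking rate: {overall_rate:.2f} per cent")
print(f"pre-Congress unblocking rate: {pre_congress_rate:.2f} per cent")
print(f"retest unblocking rate: {retest_rate:.2f} per cent")
```

Under these assumptions, the four per-phase counts sum to the 402 unblocked keyword combinations reported above, and the unblocked and remaining counts together account for all 531.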
Content categories of keyword combinations that remained blocked included references to Xi Jinping (34.1 per cent), Power Transition (23.3 per cent), Leadership (17.1 per cent), Party Policies and Ideology (12.4 per cent), Social Activism (3.9 per cent), Sensitive Date (3.1 per cent), Information Control (2.3 per cent), Government Criticism (2.3 per cent), and Hong Kong, Taiwan and Ethnic Groups (1.6 per cent). We highlight examples from the top four blocked categories.
Unlike those keyword combinations that were blocked only during the Congress, which include both critical and general references to leaders and Party policies, keyword combinations that remained blocked were predominantly critical in nature. Almost all Xi Jinping-related content that remained blocked during this period references his involvement in intra-party power struggles (for example, Xi jia jun), his desire to stay in power (for example, “习近平集权” Xi Jinping jiquan, Xi Jinping consolidates power), critiques of his leadership style (for example, “习禁评” Xi jin ping, a homonym of Xi Jinping, which means Xi Jinping bans commentaries), or Xi's family (for example, “习近平 [ + ] 女儿” Xi Jinping [ + ] nüer, Xi Jinping [ + ] daughter).
The majority of keyword combinations under the Power Transition category pertain to factionalism in the highest echelons of the CCP (for example, “十九大 [ + ] 團派 [ + ] 江派” shijiuda [ + ] tuanpai [ + ] Jiang pai, 19th Party Congress [ + ] CCP Youth League Clique [ + ] Jiang [Zemin] Clique; “周永康 [ + ] 篡党夺权” Zhou Yongkang [ + ] cuan dang duoquan, Zhou Yongkang [ + ] usurp Party leadership and seize state power). Seven out of 27 keyword combinations under this category make reference to Wang Qishan and his change of position. Intense speculation around Wang's career circulated in the media before the Congress, centring on whether Wang, a close ally of Xi, would step down at the age of 68 according to an unwritten CCP norm. During the NCPC19, it was confirmed that Wang would retire from the Politburo Standing Committee. However, keyword combinations related to such speculation remained blocked. The majority of these have critical connotations such as references to rumours related to top leaders (for example, “情人 [ + ] 王岐山” qingren [ + ] Wang Qishan, lover [ + ] Wang Qishan; “郭文贵 [ + ] 领导人” Guo Wengui [ + ] lingdaoren, Guo Wengui (a businessman-in-exile who tells sensational stories about Chinese leaders) [ + ] leaders).
There were 16 neutral references to Party Policies and Ideology that remained blocked. Most were generic references to the Party's intra-party campaigns (for example, “中央纪委 [ + ] 主体责任 [ + ] 党风廉政建设 [ + ] 调研” zhongyang jiwei [ + ] zhuti zeren [ + ] dang feng lianzheng jianshe [ + ] diaoyan, Central Commission for Discipline Inspection [ + ] liability [ + ] efforts to curb corruption and raise ethical standards of the Party [ + ] survey; “人心向背 [ + ] 我们党 [ + ] 改进作风 [ + ] 群众” renxin xiangbei [ + ] women dang [ + ] gaijin zuofeng [ + ] qunzhong, whether the people are for or against [ + ] our Party [ + ] improve work ethic [ + ] the masses).
The critical sensitivity of the NCPC19 for the CCP is reflected in how content related to the event was censored on WeChat. Our findings present a mix of censorship patterns – some of which can be interpreted as part of the CCP's efforts to manage and shape public opinion around sensitive events and others that less clearly contribute to a coherent strategy. Explaining these inconsistencies requires attention to the motivation and agenda of the government and WeChat. Given the importance of the NCPC19 and the increased government pressure to control messaging around the event, it is evident that WeChat came under direct or indirect government pressure to censor content. However, the ways in which the pressure was translated into controls on the platform show how the system of intermediary liability in China can create expansive and blunt reactions to political events, which may hamper the government's propaganda strategies.
Overall, our case study shows that the system of online control is effective in compelling companies to implement information controls. The spike in censored keyword combinations on the eve of the NCPC19 and the low levels of unblocking during the Congress suggest that WeChat faced pressure to control content around the event, and are consistent with the reactive pattern of censorship that follows sensitive events found in other studies.Footnote 39
While WeChat applied broad restrictions to content related to the Congress, the different types of content it prioritized control over before, during and after the event partly reflect what the government considered most sensitive and destabilizing at a given moment. Blocking criticism of government and leaders could be a means to prevent the spread of messaging that is potentially destabilizing to the Party during a critical moment.Footnote 40 Censoring speculation and rumours concerning leaders and power struggles within the Party may be motivated by an effort to project images of power and unity and help leaders to save face or avoid embarrassment.Footnote 41 Prohibiting discussions of these symbols of criticism and resistance – even those that are not immediately connected to the Congress – may be a continuation of the CCP's guideline on manufacturing consent by nudging the public to “think positive” rather than ask hard questions or call into question the Party's legitimacy during sensitive events.Footnote 42 The overall focus on content related to Xi Jinping and other Party leaders is consistent with previous studies that find that the higher an official's rank, the more likely content related to them will be blocked.Footnote 43
However, the dynamic changes in censorship we observed throughout the event reflect the complexity of China's information control system, which cannot be explained by considering the strategies of the government alone and requires consideration of the role and motivations of the companies involved.
On the one hand, the data support previous studies that find that the tightening or relaxing of online space may depend on whether the authorities can predict the direction of online discussion, highlighting the role of government in online control and propaganda.Footnote 44 The heavy censoring of speculation, rumours or even mere mentions of power transitions before the Congress, and the lifting of that censorship as the Congress proceeded, correspond to the timing with which mainstream media unveiled the official arrangement of personnel changes. Censoring news that is broadcast nationwide or broadening the scope of censorship during the peak of an event may backfire.Footnote 45
On the other hand, our data point to nuances that have not been captured or fully explained in existing theories, which are largely based on the assumption that China's censorship is the precise outcome of government strategies. Instead, our study reveals that censorship decisions made regarding NCPC19-related content show a relationship between private companies and the state that goes beyond the companies passively implementing orders. In addition to blocking criticism of the CCP and government leaders, which is commonly assumed to be the red line for public expression in China,Footnote 46 WeChat also filtered neutral references to CCP ideology and government policies throughout our observation period. The motivation behind blocking these keyword combinations is less clear as it could restrict general and potentially even pro-government conversations about the NCPC19. Over half of these neutral keyword combinations were unblocked after the Congress. A number of keyword combinations in this category were extracted from Xinhua News Agency, a news source that the CCP had instructed media to use as the standard for NCPC19 coverage.Footnote 47
It is unclear whether the decision to block neutral content came from WeChat, the state or a combination of both. Vuori and Paltemaa argue that censors follow “the logic of no talk is better than any talk when it comes to the Party and leaders.”Footnote 48 Following this reasoning, blocking references to Party ideology and policies may be part of a government censorship strategy. Unlike traditional media or the WeChat public accounts platform where articles are vetted before publication, discussions on chat threads are more difficult to predict and manage. Authorities may have regarded the possible negative impact of public discussion getting out of control as greater than the collateral damage of blocking potentially pro-government messages.
The blocking of this neutral content may also be a careful decision by WeChat to avoid official reprimands, considering the high stakes if it fails to properly control the spread of information on its platform.Footnote 49 Since censorship is operated through a system of intermediary liability, the broad and blunt censorship on WeChat may reflect the company taking a “better safe than sorry” approach around an event of utmost importance for Xi Jinping and the CCP so as to avoid potential penalties. If, as existing theories argue, the government can precisely implement its strategic censorship,Footnote 50 we should expect to see official news reports, pro-state messages and discussions that toe the Party line being distributed on the platform as they are part of the government's propaganda strategy. The NCPC19 is one of the most important events for the CCP. Compared to other unpredictable events, the CCP can largely foresee and dictate the direction of the NCPC19 as well as plan out a media strategy surrounding the event. The expansive and blunt censorship we observed on WeChat thus calls into question whether the government can precisely suppress or allow information around important events even when it has enough time to prepare.
In addition, whereas Ng and Landry found that names of 207 NCPC18 delegates were actively censored during the Congress,Footnote 51 we found that only 23 of the 2,287 Congress delegates' names were blocked, and these became accessible within days of first being found censored. This finding suggests that the blocking was not significant. The blocking, potentially done in error on WeChat's end, illustrates the decentralization of China's information control system. Even at a highly politically sensitive time like the Party Congress, private companies are still likely to have significant autonomy over content filtering decisions.
Despite the heavy censorship of Congress-related discussion, most keyword combinations were eventually unblocked, a pattern consistent with previous research.Footnote 52 However, the time it took WeChat to lift the block on most NCPC19-related keyword combinations was significantly longer than that which Ng and Landry observed in their work on censorship over the NCPC18 on Sina Weibo.Footnote 53 Ng and Landry speculated that the unblocking of search queries could be attributed to either the CCP's desire to use social media as a means to keep a check on officials or to the new wave of leaders assuming office.Footnote 54 They posed an open question as to whether the decrease in the blocking of searches they observed was evidence of the Xi Jinping administration's support of liberal reforms and relaxation on information restrictions. Our findings, however, suggest that the CCP has only strengthened its information controls. Content related to Xi Jinping, negative or neutral, was overall the most sensitive topic throughout the Congress in terms of the censorship scope, duration and significance. Companies continue to face heightened pressure around sensitive events, leading them to exert an extreme level of scrutiny and a long period of blocking over not only inflammatory and critical content but also neutral content related to those events.
Granted, the blunt censorship we observed in this study may be a result of the sensitivity surrounding the NCPC19. However, the inconsistencies in censorship patterns around this exceptionally sensitive event point to some dilemmas of China's information control system. There is a tension between the state's increasing desire to incorporate social media into its propaganda machine and the unpredictability of online discussion. Meanwhile, whereas the state wishes to precisely manipulate information at different times of an event, it also relies on private companies to carry out directives – and companies may defy government orders out of their own commercial interestsFootnote 55 or censor even pro-state content to stay safe.
Major political events in China are routinely met with increased censorship, heightened security and propaganda, including reactive censorship on social media platforms. The NCPC19 was one of the most politically sensitive events of 2017 and the broad censorship that we observed around it can be interpreted as part of the CCP's general strategy for public opinion management. However, our analysis of how WeChat censored content related to this event points to the need for a more nuanced assessment of how effective this strategy is, owing to how social media censorship in China is implemented.
Censorship on Chinese social media should not be framed as a top-down monolithic system of control in which companies passively comply with government orders, but rather as the product of interaction between the government and private companies. The result is not necessarily an air-tight system that is precisely controlled by the state and always reflects government policy strategies.Footnote 56 The regime of self-discipline pushes responsibility for censorship down and on to private companies.Footnote 57 The government can signal sensitive events that should be managed through its directives and reprimands of companies, but the actual implementation of censorship is done at the company level, which can lead to over-blocking and the intentional or unintentional failure to comply with directives even around the most politically sensitive events. Our study shows that the underlying decision making behind social media censorship in China cannot be explained from only the perspective and agendas of the government or private companies; rather, it has to be seen as an intermingling of the two. Acknowledging this nuance and complexity in China's information control system is an important step towards a more accurate analysis of the effectiveness and weaknesses of authoritarian regimes’ information control strategies in the digital age.
The National Party Congress is arguably the CCP's most important political event, which poses questions of the generalizability of our research findings. In future work, we will apply this methodology to analysis of censorship around other events, political and non-political, to see whether similar patterns emerge. A comparative study of the implementation of censorship by different private companies in reaction to the same events could also shed light on the relationship between the state and private companies in the ecosystem of censorship in China.
This study was made possible by funding from Open Society Foundations. The authors thank the anonymous referees for their valuable comments.
Conflicts of interest
Lotus RUAN is a researcher at the Citizen Lab, Munk School of Global Affairs and Public Policy, University of Toronto. Her research examines internet control, propaganda and social governance, with an area focus on China.
Masashi CRETE-NISHIHATA is associate director at the Citizen Lab, Munk School of Global Affairs and Public Policy, University of Toronto. His research focuses on the human rights impact of information controls.
Jeffrey KNOCKEL is research associate at the Citizen Lab, Munk School of Global Affairs and Public Policy, University of Toronto. He fights to bring transparency to internet censorship and surveillance using reverse engineering techniques.
Ruohan XIONG was a researcher at the Citizen Lab, Munk School of Global Affairs and Public Policy, University of Toronto.
Jakub DALEK is a researcher at the Citizen Lab, Munk School of Global Affairs and Public Policy, University of Toronto. His research focus is on technical implementation of national and ISP level network filtering infrastructure and censorship.