Elite Polarization in South Korea: Evidence from a Natural Language Processing Model

Abstract
This study analyzes political polarization among the South Korean elite by examining 17 years’ worth of subcommittee meeting minutes from the South Korean National Assembly's standing committees. The analysis applies natural language processing techniques and the bidirectional encoder representations from transformers (BERT) model to measure polarization in the language used during these meetings. The findings indicate that the degree of political polarization rose and fell at various times over the study period but has risen sharply since the second half of 2016 and remained high through 2020. This result suggests that partisan political gaps between members of the South Korean National Assembly have increased substantially.


Introduction
South Korea has experienced economic development and democratization in a relatively short period of time. However, Koreans evaluate Korean politics negatively. They are particularly concerned about what they perceive as a growing polarization between liberal and conservative political parties. This perception is supported by previous studies, which have shown that South Korea's two major political parties are now characterized by high levels of internal ideological homogeneity and have increasingly diverged from one another over time (Ka 2014; Park et al. 2016). The South Korean situation thus meets the definition of political polarization, which can be defined as the loss of the capacity for inter-party dialog and compromise and the recurrence of hostile confrontations and deadlock (McCarty, Poole, and Rosenthal 2006). In this study, political polarization refers specifically to the polarization of the political elites participating in party politics (Baldassarri and Gelman 2008). This is because political parties do not resolve but rather amplify social conflicts in Korean society (Cho and Lee 2021; Kwon 2020; Lim et al. 2019). As a result, the polarization of elites can drive polarization on a societal scale (Druckman, Peterson, and Slothuus 2013; Robison and Mullinix 2016; Banda and Cluverius 2018).
Despite its important role in representative democracy, the National Assembly enjoys low public confidence compared to other public institutions. A 2019 survey found that public confidence in the National Assembly (19.7 percent) is low compared to institutions such as South Korea's central government (38.4 percent), court system (36.8 percent), police (36.5 percent), and prosecution (32.2 percent) (Statistics Korea 2020). Some scholars have suggested that this low level of public confidence in the National Assembly is a direct result of the disappearance of compromise and coexistence within National Assembly politics, which is reflected in the confrontation and deadlock between political parties (Yoo 2009; Seo 2016). As these hostile, polarizing confrontations between political parties continue and repeat, major legislation and policy issues become delayed, and law-making becomes more difficult and less effective (McCarty, Poole, and Rosenthal 2006; Gilmour 1995; Groseclose and McCarty 2001). Although it is easier for internally consistent majority ruling parties to garner the support necessary to pass legislative proposals, enacting such bills into law is more difficult when politics are polarized in this way (Krehbiel 1998; Brady and Volden 1998). This is a key issue, because as the productivity of politics decreases, distrust in politics increases, and the meaning and purpose of representative democracy fade as a result (Hibbing and Theiss-Morse 1995, 2002; Hetherington 2005; Theriault 2008). In short, although the expression and fierce contestation of political views is important to democracy, ideological differences between parties can widen to the point that confrontation and conflict intensify and productive debate or legislative work becomes impossible. Thus, it is very important for observers and policy-makers alike to determine the roots, extent, and consequences of political polarization, and to work to remedy it.
Since roll-call data were released from the second half of the 16th National Assembly, scholars have studied political polarization within the National Assembly using the Nominal Three-Step Estimation (NOMINATE) method proposed by Poole and Rosenthal (1985) (e.g., Jeon 2006; Lee and Lee 2008). However, this approach only measures the outcome of votes on bills, not how polarization arises and affects the legislative process in detail. The current study aims to present empirical evidence of the polarization of South Korean political elites by analyzing subcommittee meeting minutes that reflect the legislative process itself, rather than the result of votes on the floor. It thus fills a gap in the literature. The subcommittee meeting minutes show how politicians from across the political spectrum use language to gain advantages in debates over bills, and they thus reflect the active, competitive use of language in the legislative process (Edelman 1985, 16). This study focuses primarily on the second subcommittee of the Legislation and Judiciary Committee because it examines the wording of all bills that have been reviewed by other committees. This study uses a natural language processing (NLP) model that learned political language from 20 years of subcommittee meeting minutes to examine the minutes of the second subcommittee of the Legislation and Judiciary Committee from the 17th National Assembly through the 20th National Assembly in their entirety and to quantify changes in political polarization over time.
The NLP classification model learns sentences and words belonging to two classes, and the trained model then classifies the target text, at which point its accuracy is measured. This classification accuracy serves as the measure of the polarization of political language. The findings indicate that the degree of political polarization rose and fell at various times over the study period but has risen sharply since the second half of 2016 and remained high through 2020. This suggests that partisan political gaps between members of the South Korean National Assembly have increased substantially.
The use of neural network NLP techniques can complement previous studies and present different perspectives on analyzing political polarization. The current study also contributes to the literature by providing empirical evidence from South Korea, as most recent attempts to analyze ideology using neural network NLP techniques have been limited to Anglophone and European countries.
The remainder of the article is organized as follows. Section 2 reviews the literature on measuring political polarization and presents some theoretical discussion of language's relation to politics. Section 3 discusses this study's methodology. Section 4 discusses this study's data (that is, the subcommittee meeting minutes) in more detail. Section 5 provides the results. Section 6 discusses the findings in some detail. Section 7 discusses the implications of the findings.

Literature review
Political polarization is not exclusively a South Korean problem; it is a growing global phenomenon (McCarty, Poole, and Rosenthal 2006; Singer 2016; Pew Research Center 2017; Banda and Cluverius 2018; Vachudova 2019). Political polarization has been studied mainly in the context of national-level politics in the United States (Abramowitz and Saunders 2008; Theriault 2008; Shor and McCarty 2011; Banda and Cluverius 2018). According to Binder (1999) and Fleisher and Bond (2004), the political polarization of national-level politics in the United States has intensified since the Democratic and Republican parties took on more sharply divisive and ideological identities in the 1980s and 1990s. They also found that cross-party voting has declined, as has the number of opposition members supporting the president's agenda. Furthermore, Layman and Carsey (2000, 2002) found that congressional candidates with increasingly ideological roll-call voting records are more likely to be elected or re-elected.
Although South Korea, unlike the United States, has a parliamentary system with multiple political parties, two major parties have occupied most seats in the South Korean National Assembly since the 1990s (after democratization), much like the United States, which has only two major political parties. While South Korea's liberal-conservative dimension does not exactly match the liberal-conservative dimension in the United States party system, it has become closely analogous to the US party system in terms of the political parties' positions on economic and redistribution policy (Hix and Jun 2009; Kang 2018; Kwak 1998). In addition, Korean political parties have developed toward a loosely centralized organization, rather than a catch-all organization (Han 2021). The cohesion of the two political parties has remained fairly strong, even though it has weakened compared to when they were led by two charismatic politicians, the former presidents Kim Young-sam and Kim Dae-jung (Han 2021; Horiuchi and Lee 2008; Jeon 2014; Kim 2008). Both parties have a strong support base in that they developed on the basis of regionalism in the Honam and Youngnam regions (Cho 1998; Kang 2016; Kwon 2004; Lee and Repkine 2020; Moon 2005).
Scholarly literature on political polarization in South Korea has appeared only recently. One study gave empirical grounds for examining political polarization in the Korean context by analyzing changes in lawmakers' and Korean citizens' ideological orientation. It found that polarization increased between politicians but was minimal between citizens, suggesting that polarization in Korean politics is driven by political elites. Similarly, Ka (2014) surveyed members of the 16th, 17th, 18th, and 19th National Assemblies and found that the ideological gaps between them widened over this period. Another analysis of roll-call data from the 16th, 17th, and 18th National Assemblies found that the state of polarization in Korean party politics is severe and has become remarkably worse since the 17th National Assembly. Kang (2012) examined roll-call data and found that political polarization was particularly prominent in foreign policy discussions during the 19th National Assembly. Other work concurred, suggesting that political polarization in South Korean politics has accelerated since the 2000s not just as a result of debates over domestic issues such as expanding universal welfare but also as a result of foreign policy and trade issues such as Korea's involvement in the Iraq War, the prospects of a Korea-US free trade agreement, and the import of American beef.
Such polarization likely peaked in 2016-17 (Jung 2018). In December 2016, the National Assembly impeached President Park Geun-hye, a member of the conservative party. She was later removed from office by a unanimous decision of the Constitutional Court in March 2017 (Shin and Moon 2017). The polarized post-impeachment atmosphere is reflected in the 20th National Assembly's bill passage rate of approximately 36 percent, the lowest in the National Assembly's history. Furthermore, this period saw an increase in conservative politicians' extra-parliamentary political activity, insofar as the liberal, public-led candlelight rallies were met by conservative, civil society-led Taegeukgi rallies involving conservative politicians (Cho and Lee 2021; Hwang and Willis 2020; Min and Yun 2018; Oh 2019; Reijven et al. 2020). These developments may partially explain the phenomenon of political polarization, but the bill passage rate and the Taegeukgi rallies do not empirically prove that political polarization in South Korea has intensified since President Park's impeachment.
Studies aiming to demonstrate political polarization are based on either roll-call data or survey data. Roll-call data are relatively easy to access, can be easily modified and applied, and can function as data in themselves. This makes them a convenient way to measure political polarization. Hence, many studies have used roll-call data to study polarization and ideology (e.g., Poole and Rosenthal 1997; Ansolabehere, Snyder, and Stewart 2001; Jeon 2006; Poole 2007; Garand 2010; Abramowitz 2010; Shor and McCarty 2011). However, some studies have argued that this approach does not travel well to the parliamentary context (Schwarz, Traber, and Benoit 2017) because the votes captured in roll-call data may reflect selection bias (Carrubba et al. 2006; Carrubba, Gabel, and Hug 2008; Hug 2010). Politicians' votes may not directly reflect their ideological positions; they may instead reflect the influence of other factors, such as politicians' personal interests and their relations with the ruling government (Sinclair 1982, 2000; Benedetto and Hix 2007; Kam 2009). This approach also overlooks the context of the legislative process, which is key to understanding ideology (Jessee and Theriault 2014). For example, by focusing on votes, these studies cannot account for deliberation that arrives at consensus or for the fact that some politicians abstain from voting or are not present at the vote. Such studies may therefore mischaracterize the relationship between voting results and ideology, perhaps especially in highly polarized contexts such as the Korean National Assembly.
Furthermore, decision-making processes at the party level can affect individual lawmakers' votes regardless of their personal ideological dispositions. For example, as a party becomes more conservative, its members are likely to vote more conservatively regardless of their own stances on a given issue. This is especially true in South Korea, where party leadership has a strong influence over individual lawmakers (Jeon 2014). In this case, the lawmaker is given a more conservative NOMINATE score than his or her ideological orientation warrants. In other words, increases in the ideological distance between lawmakers and their political party's leadership might predict their overall ideological polarization. In NOMINATE-based analysis, it is necessary to decide which legislation and lawmakers to include in the analysis, but the literature has not yet established a statistical or theoretical criterion for this choice. Other studies (e.g., Kang 2012; Ka 2014, 2016; Park et al. 2016; Jung 2018) have used National Assembly survey data to study polarization in South Korean politics. Like the roll-call data, these data are also widely available: members of the National Assembly have filled out questionnaires on their ideological orientation since 2002. However, this method also has limitations. First, lawmakers sometimes refuse to respond to these surveys. For example, only 56.91 percent of members from the Democratic Party, 55.7 percent from the Saenuri Party, 42.11 percent from the People's Party, and 16.67 percent from the Justice Party responded to a survey of the 20th National Assembly (Park et al. 2016, 127). Second, if the questionnaire items do not remain consistent over time, it might be more difficult to reliably measure lawmakers' ideology over time. Finally, and above all, this method relies on lawmakers' self-reported responses, so it is difficult to regard them as objective data, not least because these responses do not reflect politicians' actual actions and may be strategically adjusted for the context of a study.
Other studies have undertaken qualitative analyses of National Assembly subcommittee meeting minutes. Kwon and Lee (2012) classified lawmakers' remarks in the minutes of the standing committee of the 17th National Assembly into those that reflected partisanship, representativeness, professionalism, and compromise. Their analysis showed that partisan criticisms and personal attacks were less prominent in these minutes than the other three types; instead, they found that the minutes reflected a search for opinions based on expert knowledge and the identification of causal relationships. Ka et al. (2008) analyzed the same meeting minutes in order to study National Assembly members' participation in the meetings and the subcommittee's decision-making processes. By showing that proposals for the chairman's resolutions were more common than votes, they found that the standing committee's decision-making method was closer to a consensus system than a majority system. However, these qualitative analyses have two problems. First, the researchers' subjective selection of data makes it difficult for other researchers to reproduce their results. Second, it is difficult to extract sufficiently meaningful information from this kind of data, especially given the large (and increasing) volume of meeting minute data. The methodology section below explains why the current study chose an NLP model to overcome the limitations of the previous studies described above and describes how the model is used.

Methodology
NLP techniques
Unlike the roll-call-based approach, an NLP approach to meeting minutes analyzes the discussion process rather than voting results. Unlike the survey-based approach, an NLP approach measures polarization from politicians' actual actions (in this case, their behavior in subcommittee meetings) rather than from self-reports. This approach can analyze vast amounts of meeting minutes using algorithms that other researchers can reuse, making the results highly reproducible (see Appendix A for more details about NLP and Appendix B for the NLP code).
Other studies in political science and the social sciences have used NLP techniques to measure political polarization. These studies value these techniques because they allow the use of large amounts of text data using supervised or unsupervised learning approaches (Haselmayer and [...]). This study trains a model to learn political language from meeting minutes (see below) and then classifies text data to quantify political polarization.
If political parties maintain disparate positions as a result of their ideological characteristics and these parties' members use language accordingly, an NLP model should be able to classify text by learning the respective languages of liberal and conservative parties and then identifying classification elements. Figure 1 summarizes the data analysis process: the current study built an NLP model using 20 years' worth of subcommittee text data, excluding that of the Legislation and Judiciary Committee, and then classified the text from the second subcommittee of the Legislation and Judiciary Committee as the target text.
The logic of text classification is as follows. Text classification takes a sentence (or word) as input and assigns it to one of a set of predefined classes. That is, the sentences or words to be learned are labeled in advance as one of two classes (conservative vs. liberal), the NLP model learns from them during the training process, and the target text is then input and classified. In terms of Figure 1, the training process is the stage in which the model learns which class the sentences (or words) related to a specific issue belong to, through "Training, tuning, and evaluation" (corresponding to "NLP model learning code" in Appendix B). By then classifying the "Target Data" with the trained model, we can estimate the polarization (corresponding to "Test code" in Appendix B).
The logic of quantifying polarization is as follows. If the classification model can accurately distinguish between the language of the given political parties, this is taken to indicate significant ideological polarization between them. If the trained model cannot do so, the degree of polarization between these parties' ideological positions is taken to be small or insignificant. The classification model, having learned the language of each party from 20 years of minutes, classifies each set of meeting minutes of the second subcommittee of the Legislation and Judiciary Committee. The accuracy for each meeting (the degree of polarization) can then be converted into time-series data.
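The conversion from classification accuracy to a polarization time series can be illustrated with a minimal sketch. It is not the code in Appendix B: the function name `polarization_series`, the record layout, and the toy records are assumptions made for illustration, with labels coded 0 (liberal) and 1 (conservative).

```python
from collections import defaultdict


def polarization_series(records):
    """Return per-meeting classification accuracy, ordered by meeting date.

    records: iterable of (meeting_date, true_label, predicted_label).
    The record layout is a hypothetical one chosen for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for meeting_date, true_label, predicted_label in records:
        total[meeting_date] += 1
        if true_label == predicted_label:
            correct[meeting_date] += 1
    # Accuracy per meeting is read as the degree of polarization.
    return [(d, correct[d] / total[d]) for d in sorted(total)]


# Toy records (invented for illustration, not taken from the study's data).
records = [
    ("2015-11", 0, 1), ("2015-11", 1, 1), ("2015-11", 0, 1), ("2015-11", 1, 0),
    ("2016-12", 0, 0), ("2016-12", 1, 1), ("2016-12", 1, 1), ("2016-12", 0, 1),
]
series = polarization_series(records)
print(series)
```

In this toy example the classifier separates the two parties' remarks more accurately in the later meeting, which under the logic above would be read as higher polarization in that meeting.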
This study uses bidirectional encoder representations from transformers (BERT), a neural network model that dynamically generates word vectors by learning from context in both directions (see Appendix A for more details). BERT is a semi-supervised learning model that builds a general-purpose language understanding model using unsupervised learning on a large corpus. It then fine-tunes this model using supervised learning and applies it to other tasks (Devlin et al. 2019). Simply put, BERT uses pre-training on a text corpus to create a model that understands the basic patterns of language, and transfer learning is then performed by applying this model to new tasks. In this way, BERT combines the advantages of unsupervised and supervised learning approaches. Unsupervised methods come with significant post hoc validation costs, as the researcher "must combine experimental, substantive, and statistical evidence to demonstrate that the measures are as conceptually valid as measures from an equivalent supervised model" (Grimmer and Stewart 2013, 271). However, as described above, BERT performs the classification task through supervised learning.
This study used Google Colab, Python 3, and PyTorch as analytic tools, and applied BERT through the Hugging Face transformers library. Hugging Face provides a useful PyTorch interface for working with BERT. The library includes pre-built modifications of the model that enable specific tasks, such as classification. When fine-tuning the BERT model, this study used a batch size of 32, a learning rate (Adam) of 2e-5, and 3 epochs. In dividing the data into training, validation, and test data during model building, this study first set aside 30 percent of the total data as test data and then 10 percent of the training data as validation data (see Appendix B for the process of building the BERT NLP model and, additionally, the GPT-2 NLP model).
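The data split and fine-tuning settings described above can be sketched as follows. This is a hedged illustration rather than the Appendix B code: it uses only the Python standard library, the names `FINE_TUNING_CONFIG` and `split_dataset` and the random seed are hypothetical, and it assumes that the "10 percent of the training data" is taken from the 70 percent remaining after the test split.

```python
import random

# Hyperparameters reported in the text; the dictionary keys are illustrative names.
FINE_TUNING_CONFIG = {"batch_size": 32, "learning_rate": 2e-5, "epochs": 3}


def split_dataset(examples, test_fraction=0.30, validation_fraction=0.10, seed=42):
    """Split examples into train, validation, and test sets.

    The test set is taken first from the full data; the validation set is
    then taken from what remains (an assumption about the intended order).
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary choice
    n_test = round(len(shuffled) * test_fraction)
    test, remaining = shuffled[:n_test], shuffled[n_test:]
    n_validation = round(len(remaining) * validation_fraction)
    validation, train = remaining[:n_validation], remaining[n_validation:]
    return train, validation, test


# Placeholder (sentence, label) pairs standing in for labeled utterances.
examples = [(f"sentence {i}", i % 2) for i in range(1000)]
train, validation, test = split_dataset(examples)
print(len(train), len(validation), len(test))
```

With 1,000 placeholder examples this yields 630 training, 70 validation, and 300 test items. The resulting sets and the settings in `FINE_TUNING_CONFIG` would then be passed to whatever fine-tuning loop is used.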

Data
Data that best reflect the functions of politicians' political language are necessary to measure political polarization. This study proposes an approach in which the model learns the entire minutes of the standing committee subcommittees except for those of the Legislation and Judiciary Committee and then classifies the minutes of the second subcommittee of the Legislation and Judiciary Committee. Figure 2 shows the South Korean legislative process. Subcommittees are responsible for practical legislation and government budget review within National Assembly committees. Standing committees in different fields, including education, diplomacy, national defense, labor, and the environment, each include three to six subcommittees with narrower roles. If a bill's contents are simple or uncontentious, it is not referred to a subcommittee. Otherwise, the relevant committee refers the bill to a subcommittee for review after a general discussion. The final resolutions on bills are made in plenary sessions of the National Assembly after the bills have been reviewed by each standing committee. It is common for bills to be passed during plenary sessions without discussion. Not all bills and budgets are reviewed by the full standing committee; in such cases, subcommittees conduct the practical review of the given bills.
All bills passed by each standing committee are examined by the second subcommittee of the Legislation and Judiciary Committee (see Figure 2). This subcommittee performs the final review before a bill is transferred to the plenary session; it thus effectively serves a role similar to that of the United States Senate. The second subcommittee reviews whether the bill conflicts with existing bills or violates the constitution and refines the wording of bills. It is during this process that each party's positions on the bill become clear. Thus, political language is actively used in the standing committee subcommittees and in the second subcommittee of the Legislation and Judiciary Committee in particular.
Article 50, Clause 1 of the South Korean Constitution states that the National Assembly must disclose the bill review process to the public, with the exception of bills that concern national security (see also Article 75 of the National Assembly Act) and meetings of the intelligence committee (Article 54-2 of the National Assembly Act). Thus, the sample data are quite comprehensive and are not limited to certain subcommittees. They contain meeting minutes from all standing committee subcommittee meetings between July 2000 and May 2020. When pre-processing the minute data for training, this study assigned the Democratic Party of Korea (the leading liberal party) a value of zero and the People's Power Party (the leading conservative party) a value of one. These two leading parties have won most of the seats in the South Korean parliament since the 1990s; this study refers to them by their current names (as of 2021) for convenience. This study also classified the remarks of lawmakers from minor parties within the National Assembly in a binary manner, according to their political inclination to form coalitions with either of the two major parties. This study acknowledges that the classification of minor parties is unavoidably subjective (see Appendix C for more details).
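The binary label coding described above can be sketched as follows. This is an assumed illustration, not the study's pre-processing code: the names `PARTY_LABELS` and `label_utterances` and the sample remarks are hypothetical, only the two major parties named in the text are mapped, and any minor-party assignment (Appendix C) would have to be supplied by the researcher.

```python
LIBERAL, CONSERVATIVE = 0, 1

# Coding stated in the text: liberal party = 0, conservative party = 1.
PARTY_LABELS = {
    "Democratic Party of Korea": LIBERAL,
    "People's Power Party": CONSERVATIVE,
}


def label_utterances(utterances, party_labels=PARTY_LABELS):
    """Attach a 0/1 label to each (party, text) pair.

    Speakers whose party is not in the mapping are skipped in this sketch.
    """
    labeled = []
    for party, text in utterances:
        if party in party_labels:
            labeled.append((text, party_labels[party]))
    return labeled


# Invented placeholder remarks, not quotations from the minutes.
utterances = [
    ("Democratic Party of Korea", "remark A"),
    ("People's Power Party", "remark B"),
    ("Unmapped speaker", "remark C"),
]
labeled = label_utterances(utterances)
print(labeled)
```

Passing an extended mapping as `party_labels` is one way to make the subjective minor-party assignments explicit and easy to vary in robustness checks.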
When pre-processing the subcommittee data, this study noted that documents from the 16th and 17th National Assemblies presented the names of assembly members in Chinese characters instead of Korean letters. Thus, this study had to convert their names into Korean script. This study excluded unnecessary or irrelevant information, such as descriptions of the documents and remarks by other persons, from the analysis. Such pre-processing is important because it allows the analysis to focus on the dialog and exchanges among the relevant parties.

Results
Figure 3 shows the results of applying the BERT model to 17 years' worth of subcommittee meeting data. For comparison, the red line marking a value of 0.7 is the benchmark for a high degree of polarization, and the blue dotted line marking a value of 0.5 is the average accuracy obtained when evaluating the test data in the process of building the BERT classification model. The green dotted line represents the degree of polarization, and the orange line represents the trend. The X-axis marks the years 2004-2020 (from the 17th to the 20th National Assembly) and the administration in each period. Figure 3 shows that there is a trend toward increasing political polarization. Although polarization usually decreased after earlier escalations, it has remained very high since 2016-17. Short-term rises and falls may simply reflect noise in the data. A sustained high level, however, calls for a different interpretation. This finding implies that polarization is becoming more intense and remaining so over time rather than returning to normal or exhibiting the wax-and-wane pattern of previous National Assemblies. Given that this model captures wider political differences beyond particular legislation, this study interprets pre-2016 cycles in polarization as reflecting regular politics and debate and the high level of post-2016 polarization as capturing a growing and more enduring set of ideological differences between ruling and opposition parties.
Figure 5 compares the results of the BERT and GPT-2 models. They exhibit similar trends in polarization, especially after 2016, thus validating the findings.

Discussion
The preceding section showed that political language has become increasingly polarized, which indicates a serious, more enduring, and deepening polarization between political elites in South Korea. This deepening tension leads each of the parties to use different political language and to stoke division on particular issues. This study also found that periods of more polarizing language in the meeting minutes coincide with more polarized politics (e.g., the growth in partisan language after 2016). These conflicts have obvious consequences for the legislative process.
There may be several reasons behind the increase in polarization post-2016, but the impeachment of President Park looms largest among them. After Moon Jae-in was elected following the impeachment, his administration promoted investigations which framed the previous two conservative administrations (the Park Geun-hye and Lee Myung-bak administrations) as corrupt and untrustworthy (B. Kim 2019; Kirk 2020; Lee 2018). These included the creation of a special committee to confiscate the illegitimate proceeds earned by former president Park Geun-hye and her longtime friend Choi Soon-sil. The Moon Jae-in administration's central and public focus on correcting the mistakes and injustices of previous governments formed by the opposition party has had a lingering and polarizing effect on Korean political discourse. Although these policies nominally aimed to restore and improve South Korean democracy, they have instead made Korean politics so polarized that party politics is nearly impossible. There have been no joint efforts between the parties. Unfortunately, the politicians who made up a large part of the conservative party opposed or denied the impeachment, and ironically disparaged and rejected the democratically constituted Moon Jae-in administration (Kim 2020). On the other hand, the faction leading the impeachment regarded the opposition faction only as an object of reform and did not accept it as a partner for cooperation, driving parliamentary politics into a hostile confrontation. Following the impeachment, Park's party, the People's Power Party, conducted political activities outside of parliament, such as the Taegeukgi street rallies (Cho and Lee 2021; T.-H. Kim 2019). These rallies turned violent and, in turn, damaged the legislative process and the prospect for effective parliamentary politics (Cho and Lee 2021; Kwon 2020).
Conflicts between the two factions persisted throughout the 20th National Assembly, and the findings of this study can be interpreted as reflecting the language of conflict in the legislative process. The passage rate of bills proposed by lawmakers was particularly low in the 20th National Assembly (see Appendix D for details).

Conclusions
This study's findings indicate that political polarization has waxed and waned in South Korea's National Assembly since 2004, but increased sharply in late 2016 and has remained at a high level since. This indicates that South Korea, like many other countries, is affected by the widening and deepening phenomenon of political polarization. This study concludes by discussing some implications of the findings.
The findings have important implications for the National Assembly moving forward. The analysis indicates that persistent use of polarizing political language stokes polarization in general. This is likely because it removes opportunities for politicians to find (or seek) common ground from which to build a compromise. These findings imply that bills whose contents are closely related to people's everyday lives but give the parties few opportunities to make ideological gains may not be properly reviewed, and that those which have a lot of ideological content often face deadlock and/or last-resort negotiations. These findings also imply that polarization can undermine other important functions of the National Assembly, such as conducting confirmation hearings. Furthermore, an increasingly polarized environment pushes political parties to seek ideological gains rather than governance: they support or maintain policies which only appeal to their supporters, cast all political issues as dichotomous, and perpetuate a picture of their political opponents as enemies rather than parliamentary colleagues. Members of the 21st National Assembly, which opened in June 2020, should seek to avoid increasingly polarizing language and aim to resolve polarization and create cross-party dialog.
The current study has some shortcomings, but it contributes to the development of NLP-based analysis of political polarization, analyzes the polarization of South Korean politics, complements previous studies with a new approach, and gathers data that can be of use for further studies. Above all, this study opens up new areas of analysis of political phenomena using neural network algorithms.
The limitations of this study and suggestions for follow-up studies are as follows. First, this study's proposed causal link between language and political polarization is tentative and requires further exploration. This study's empirical findings are only suggestive; although they help us understand the trends and timing of the escalation of polarization in South Korean politics since 2004, they do not suggest causes for this phenomenon or specify which social issues exacerbated polarization. Furthermore, it is difficult to determine clearly whether the polarization of political language is influenced by a particular political phase, represents a position on a particular bill, or is caused by an interaction between the two.
When measuring polarization in elite politics, no data directly represent polarization in the real world. Therefore, this study used the language spoken in the National Assembly, which is considered to reflect political polarization, as its measurement tool. Because language serves as a proxy variable, a certain level of measurement error is possible. A follow-up study should compare these results with an analysis of the roll-call data of the 20th National Assembly in order to examine political polarization in more depth.
Second, polarization has multiple meanings: it can refer to gaps between parties' ideological tendencies or to their polarizing behavior (e.g., basing political positions on factional logic and the need to score ideological wins rather than focusing on governance). While both beget confrontation and conflict, the former presupposes a polarization of opinion and the latter does not. This study's approach makes it difficult to clearly determine whether the conflict between liberals and conservatives in South Korea is attributable to differences in their ideological orientation or to politicians voting according to party lines or factional logic.
Third, as mentioned above, previous studies have found that polarization is more prominent in certain fields, such as foreign policy (Kang 2012; Park et al. 2016). However, this study cannot contribute to this argument because it analyzed meeting minutes in general without specifying a field of focus. Although polarization is a broad phenomenon with many disparate effects, it is important for future researchers to determine which dimensions of politics are most polarized or polarizing so that we can better understand and overcome polarization.
Fourth and finally, the polarization of South Korean parliamentary politics reported in this study does not necessarily reflect the overall polarization of South Korean society or political discourse. Political elites may be polarized without the masses being similarly polarized. Future studies should analyze the polarization of the masses in more detail.
A neural network is a series of algorithms that endeavors to recognize underlying relationships in a set of data through a process that mimics the way the human brain operates. In this sense, neural networks refer to systems of neurons, either organic or artificial in nature. Among NLP models using neural networks, this study applies BERT, a transformer-based transfer learning model that uses contextual information embedded at the sentence level and combines supervised and unsupervised learning with bidirectional characteristics that account for both forward and reverse directional contexts (Devlin et al. 2019). This model was published by Google in 2018.
This study uses BERT as its text classification model because of its embedding and tokenization methods. The first is dynamic embedding (Peters et al. 2018; Devlin et al. 2019). Ideal representations of words must contain the syntactic and semantic characteristics of the words and must capture meanings that can vary by context. Context-free models such as Word2Vec and GloVe, which embed at the word level, are limited in representing words whose meaning shifts by context, because in these models all words have static vectors (Mikolov et al. 2013; Pennington, Socher, and Manning 2014), and a given word always carries the same representation regardless of the situation.
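The contrast between static and context-dependent vectors can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the two-dimensional vectors are invented, the Korean homonym 배 (which can mean "pear" or "boat") is chosen only as an example, and the averaging step is a simple stand-in for context sensitivity, not BERT's actual attention mechanism.

```python
# Hypothetical 2-dimensional vectors; the values are invented, not trained.
STATIC = {
    "배": [1.0, 1.0],    # homonym: "pear" or "boat"
    "먹다": [0.0, 2.0],  # "to eat"
    "타다": [2.0, 0.0],  # "to ride"
}


def static_embed(tokens, index):
    """Context-free lookup: the vector depends only on the word itself."""
    return STATIC[tokens[index]]


def contextual_embed(tokens, index):
    """Toy context-dependent vector: mix the word with its sentence mean.

    This is a stand-in for illustration only, not BERT's self-attention.
    """
    target = STATIC[tokens[index]]
    dims = len(target)
    sentence_mean = [
        sum(STATIC[t][d] for t in tokens) / len(tokens) for d in range(dims)
    ]
    return [(target[d] + sentence_mean[d]) / 2 for d in range(dims)]


eat = ["배", "먹다"]   # "eat a pear"
ride = ["배", "타다"]  # "ride a boat"

print(static_embed(eat, 0), static_embed(ride, 0))          # identical vectors
print(contextual_embed(eat, 0), contextual_embed(ride, 0))  # different vectors
```

In the static lookup the homonym receives the same vector in both sentences, whereas the context-mixing function yields a different vector for each, which is the property the text attributes to dynamic embedding.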
The feature of BERT that most prominently distinguishes it from the conventional Word2Vec and GloVe is that one word may receive different embeddings depending on the shape and position of the characters, thereby removing ambiguity. Because BERT dynamically generates word vectors using contexts, it can create different word representations depending on context. Another characteristic of BERT that this study notes is the tokenization method. As Korean is an agglutinative language, the role of morphemes is more prominent than in English, for example, and that of words less so. Hence, conventional Korean NLP has generally tokenized Korean language data by cutting it into morphemes (Lim 2019). External morpheme analyzers can provide favorable performance, but this approach is not only affected by the variable performance of the analyzers but is also prone to frequent out-of-vocabulary (OOV) occurrences in the Korean language, in which word forms vary extensively. Accordingly, this study examines alternative tokenization methods independent of morpheme analyzers.
BERT uses a tokenization algorithm called WordPiece that tokenizes without relying on external morpheme analyzers. WordPiece creates tokens by collecting meaningful units that frequently appear in words and expresses a word as a combination of subwords, enabling detailed expressions of its meaning (Wu et al. 2016; Devlin et al. 2019). This approach is useful for processing words not found in the dictionary, such as neologisms. The application of BERT's WordPiece may be useful because the minutes of the subcommittees of the standing committees of the National Assembly, which this study analyzes, reflect various issues in diverse areas of real society.
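The subword splitting described above can be sketched as a greedy longest-match-first lookup, which is how WordPiece is commonly applied to a single word once a vocabulary exists. The sketch is an illustration under stated assumptions: the toy vocabulary is hypothetical and not taken from the model used in this study, the function name is invented, and the step that learns the vocabulary from a corpus is not shown.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one word into subwords by greedy longest-match-first lookup.

    Non-initial pieces carry the "##" continuation prefix, as in BERT
    vocabularies. If any part of the word cannot be matched, the whole
    word is mapped to the unknown token.
    """
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk_token]
        tokens.append(match)
        start = end
    return tokens


# Hypothetical toy vocabulary (assumption for illustration only).
TOY_VOCAB = {"국회", "##의원", "위원", "##회", "##는"}

print(wordpiece_tokenize("국회의원", TOY_VOCAB))  # "National Assembly member"
print(wordpiece_tokenize("위원회는", TOY_VOCAB))  # "the committee" + topic marker
print(wordpiece_tokenize("법안", TOY_VOCAB))      # "bill"; no piece in toy vocab
```

With this toy vocabulary, a word that is absent as a whole can still be represented through its pieces, and only a word with no matching pieces falls back to the unknown token. No morpheme analyzer is involved at any step.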

Appendix C
The Millennium Democratic Party (59); The Grand National Party (139); The Yeollin Uri Party (49); The United Liberal Democrats (10)
The United Democratic Party (136); The Grand National Party (112); The Democratic Labor Party (6); The Advancement Unification Party (9); The Creative Korea Party (1); The Pro-Park Alliance (3)
The United Democratic Party (81); The Saenuri Party (165); The Unified Progressive Party (7); The Advancement Unification Party (14); The Creative Korea Party (2); The Korea Vision Party (1)
The Democratic Party of Korea (103); The Saenuri Party (145); The People's Party (20); The Justice Party (5); The Democratic Party (1)
The Democratic Party of Korea (128); The United Future Party (112); The Minsaeng Party (20); The Justice Party (6); The Minjung Party (1); The People's Party (1); The Our Republican Party (1); The Open Democratic Party (1); The Pro-Park New Party (1)
The Democratic Party of Korea (174); The People Power Party (103); The Justice Party (6); The Open Democratic Party (3); The Basic Income Party (1); The Transition Korea (1); The People Party (3)
* Party classification is based on the second half of each National Assembly, excluding independent politicians. The boldface refers to the two major parties and the numbers in brackets refer to the number of seats.

Figure D1. Bill passage rate, 15th National Assembly to 20th National Assembly