
On Finetuning Large Language Models

Published online by Cambridge University Press:  28 November 2023

Yu Wang*
Affiliation:
Fudan Institute for Advanced Study in Social Sciences, Fudan University, Shanghai, China

Abstract

A recent paper by Häffner et al. (2023, Political Analysis 31, 481–499) introduces an interpretable deep learning approach for domain-specific dictionary creation and claims that the dictionary-based approach outperforms finetuned language models in predictive accuracy while retaining interpretability. We show that the dictionary-based approach's reported superiority over large language models, BERT specifically, arises because most of the parameters in the language models were excluded from finetuning. In this letter, we first describe the architecture of BERT models, then explain the limitations of finetuning only the top classification layer, and finally report results in which, once all parameters are allowed to update during finetuning, the finetuned language models outperform the newly proposed dictionary-based approach by 27% in terms of $R^2$ and 46% in terms of mean squared error. Researchers interested in large language models, text classification, and text regression should find our results useful. Our code and data are publicly available.
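The distinction at issue can be illustrated with a minimal sketch (not the letter's replication code; the model name, label count, and learning rate are illustrative assumptions). Head-only finetuning freezes the BERT encoder so that only the top classification layer learns, whereas full finetuning lets every parameter update.

    # Minimal sketch contrasting head-only finetuning (frozen encoder)
    # with full finetuning (all parameters trainable). Model name,
    # num_labels, and learning rate are illustrative assumptions.
    import torch
    from transformers import AutoModelForSequenceClassification

    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=1  # num_labels=1 -> regression head
    )

    # Head-only finetuning: freeze every encoder parameter so that only
    # the top classification layer is updated during training.
    for param in model.bert.parameters():
        param.requires_grad = False

    # Full finetuning: unfreeze everything so the encoder weights also
    # adapt to the downstream task.
    for param in model.parameters():
        param.requires_grad = True

    # Only parameters with requires_grad=True are handed to the optimizer.
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(trainable, lr=2e-5)

Under the first setting the optimizer sees only the small classification head; under the second it sees all of the model's parameters, which is the configuration the letter argues is needed for a fair comparison.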

Type
Letter
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of the Society for Political Methodology


Footnotes

Edited by: John Doe

References

Bestvater, S. E., and Monroe, B. L. 2023. "Sentiment Is Not Stance: Target-Aware Opinion Classification for Political Text Analysis." Political Analysis 31 (2): 235–256. https://doi.org/10.1017/pan.2022.10
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. 2019. "BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding." In Proceedings of NAACL-HLT, edited by Burstein, J., Doran, C., and Solorio, T., 4171–4186. Minneapolis: Association for Computational Linguistics.
Ding, N., et al. 2023. "Parameter-Efficient Fine-Tuning of Large-Scale Pre-Trained Language Models." Nature Machine Intelligence 5: 220–235.
Dodge, J., Ilharco, G., Schwartz, R., Farhadi, A., Hajishirzi, H., and Smith, N. 2020. "Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping." Preprint, arXiv:2002.06305.
Häffner, S., Hofer, M., Nagl, M., and Walterskirchen, J. 2023. "Introducing an Interpretable Deep Learning Approach to Domain-Specific Dictionary Creation: A Use Case for Conflict Prediction." Political Analysis 31 (4): 481–499. https://doi.org/10.1017/pan.2023.7
Houlsby, N., et al. 2019. "Parameter-Efficient Transfer Learning for NLP." In Proceedings of the 36th International Conference on Machine Learning, 2790–2799. PMLR.
Howard, J., and Ruder, S. 2018. "Universal Language Model Fine-Tuning for Text Classification." In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 328–339. Melbourne: Association for Computational Linguistics.
Hu, Y., et al. 2022. "ConfliBERT: A Pre-Trained Language Model for Political Conflict and Violence." In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics, 5469–5482. Seattle: Association for Computational Linguistics.
Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., and Soricut, R. 2020. "ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations." ICLR.
Liu, Y., et al. 2019. "RoBERTa: A Robustly Optimized BERT Pretraining Approach." Preprint, arXiv:1907.11692.
Mosbach, M., Andriushchenko, M., and Klakow, D. 2021. "On the Stability of Fine-Tuning BERT: Misconceptions, Explanations, and Strong Baselines." ICLR.
Wang, Y. 2019. "Comparing Random Forest with Logistic Regression for Predicting Class-Imbalanced Civil War Onset Data: A Comment." Political Analysis 27 (1): 107–110.
Wang, Y. 2023a. "Replication Data for: On Finetuning Large Language Models." Harvard Dataverse. https://doi.org/10.7910/DVN/7PCLRI
Wang, Y. 2023b. "Topic Classification for Political Texts with Pretrained Language Models." Political Analysis 31 (4): 662–668.
Zhang, T., Wu, F., Katiyar, A., Weinberger, K. Q., and Artzi, Y. 2021. "Revisiting Few-Sample BERT Fine-Tuning." ICLR.
Supplementary material

Wang supplementary material: Appendix (PDF, 194.7 KB)