Abstract
Large Language Models (LLMs) are a machine learning technique that has transformed natural language processing. However, their high computational demands limit their accessibility, which has led to the development of Small Language Models (SLMs). Because SLMs can run locally on a microcomputer, they make AI-driven language processing widely accessible and give users greater control over text analysis. In this work, we use an SLM to analyze the evolution of Chemistry in Brazil by comparing data from the 2019 and 2024 Brazilian Chemical Society meetings (RASBQ). We demonstrate the viability of SLMs for extracting and structuring large volumes of text from the books of abstracts of scientific events, enabling comprehensive comparative analyses that would otherwise be impractical. Our methodology extracts abstracts from the RASBQ digital proceedings and processes them using SLMs. The models convert the textual content into structured, manipulable data, on which we conduct a semantic and statistical analysis of the two events. The results highlight how SLMs can efficiently transform unstructured scientific proceedings into tractable data, saving significant time and resources. The comparison between the 2019 and 2024 events reveals notable changes in thematic distribution, institutional participation, and potential regional impacts, and it underscores the importance of data standardization in automated analyses. A Portuguese version of the Results and Discussion section is presented in the Supplementary Information.
Supplementary materials
Title
Supplementary Material
Description
Results and Discussion section in Portuguese.