A general small language model (SLM) approach to examining scientific trends through conference proceedings: application to the 2019 and 2024 annual meetings of the Brazilian Chemical Society

Rubens Souza; Nathalia Rosa; Julio Duarte; Itamar Borges Jr

doi:10.26434/chemrxiv-2025-vjqhg-v2

Chemical Education

02 September 2025, Version 2

Working Paper

This content is an early or alternative research output and has not been peer-reviewed by Cambridge University Press at the time of posting.

Abstract

Large Language Models (LLMs) are machine learning models that have transformed natural language processing. However, their large computational demands limit their accessibility, which has led to the development of Small Language Models (SLMs). Because SLMs can run locally on a microcomputer, they make AI-driven language processing widely accessible and give users greater control over text analysis. In this work, we use an SLM to analyze the evolution of Chemistry in Brazil by comparing data from the 2019 and 2024 annual meetings of the Brazilian Chemical Society (RASBQ). We demonstrate the viability of SLMs for extracting and structuring large volumes of text from the books of abstracts of scientific events, enabling comprehensive comparative analyses that would otherwise be impractical. Our methodology extracts abstracts from the RASBQ digital proceedings and processes them with SLMs, which convert the textual content into structured, manipulable data and thus enable a semantic and statistical analysis of the two events. The results show that SLMs can efficiently transform unstructured scientific proceedings into tractable data, saving significant time and resources. The comparison between the 2019 and 2024 events revealed notable changes in thematic distribution, institutional participation, and potential regional impacts, and underscored the importance of data standardization in automated analyses. A Portuguese version of the Results and Discussion section is presented in the Supplementary Information.
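As a rough illustration of the comparative step described above, the following Python sketch shows how structured records could be standardized and compared across two meetings. It is not the authors' pipeline: the record fields (`year`, `theme`), the helper names, and all sample values are hypothetical placeholders, and the SLM extraction step is assumed to have already produced the records.

```python
from collections import Counter

# Hypothetical records standing in for the structured output that an SLM
# might produce from each abstract. Field names and values are invented
# for illustration and are not data from the RASBQ proceedings.
records = [
    {"year": 2019, "theme": "Organic Chemistry"},
    {"year": 2019, "theme": "organic chemistry "},
    {"year": 2019, "theme": "Inorganic Chemistry"},
    {"year": 2019, "theme": "Inorganic  Chemistry"},
    {"year": 2024, "theme": "Organic Chemistry"},
    {"year": 2024, "theme": "Materials"},
    {"year": 2024, "theme": "materials"},
    {"year": 2024, "theme": "MATERIALS"},
]


def normalize(label):
    """Standardize a label: collapse whitespace and ignore letter case."""
    return " ".join(label.split()).casefold()


def theme_shares(records, year):
    """Return each theme's fraction of the abstracts for one meeting year."""
    counts = Counter(normalize(r["theme"]) for r in records if r["year"] == year)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {theme: n / total for theme, n in counts.items()}


def share_changes(records, year_a, year_b):
    """Return the change in each theme's share from year_a to year_b."""
    a = theme_shares(records, year_a)
    b = theme_shares(records, year_b)
    return {t: b.get(t, 0.0) - a.get(t, 0.0) for t in sorted(set(a) | set(b))}


if __name__ == "__main__":
    for theme, delta in share_changes(records, 2019, 2024).items():
        print(f"{theme}: {delta:+.2f}")
```

Without the `normalize` step, variants such as "Materials" and "MATERIALS" would be counted as separate themes, which is one simple way in which unstandardized labels can distort an automated comparison.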

Keywords

Small Language Models (SLM)

Natural Language Processing

Scientific Text Analysis

Chemistry in Brazil

Supplementary materials

Supplementary Material: Results and Discussion section in Portuguese.

Version History

Sep 02, 2025 Version 2

Jun 25, 2025 Version 1

Version Notes

Stylistic corrections to the original text to improve readability.

Metrics

Views: 932

Downloads: 424

License

The content is available under CC BY NC ND 4.0

Funding

Conselho Nacional de Desenvolvimento Científico e Tecnológico: 300281/2025-0

Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro: E-26/204.294/2024 and E-26/205.922/2022

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
