An exploratory hybrid AI workflow for Brazilian federal budget allocation

Published online by Cambridge University Press: 08 April 2026

Saulo de Oliveira Nonato* (Independent Scholar)
Marina Figueiredo Moreira (Universidade de Brasília, Brazil)
David Nadler Prata (Universidade Federal do Tocantins, Brazil)
Diana Vaz de Lima (Universidade de Brasília, Brazil)
Gabriel Reis Nadler Prata (Universidade Federal do Tocantins, Brazil)

* Corresponding author: Saulo de Oliveira Nonato; Email: saulo11340@gmail.com

Abstract

This study assesses whether a hybrid prediction–optimisation workflow can be used as an exploratory exercise for Brazilian federal budget allocation under severe data constraints. Using executed expenditure by budgetary function (2000–2023; N = 24), a multi-output XGBoost model is estimated to link spending profiles to GDP growth, inflation, and the Gini index; Bayesian optimisation (Tree-structured Parzen Estimator/Optuna) is then applied to search, within explicit bounds and penalties, for allocation vectors that maximise a stated objective function favouring higher growth and lower inflation and inequality. To mitigate data scarcity, the short series is augmented with 1048 synthetic observations generated through controlled noise injection, bootstrapped resampling and variational autoencoder reconstruction. Under randomised K-fold cross-validation on the augmented dataset, the model achieves mean R² = 0.97 and mean MSE = 0.04, while diagnostics indicate larger errors at extreme values and a persistent training–validation gap. A secondary robustness check uses an anti-leakage design by applying cross-validation to the 24 real observations and generating synthetic data only within each training fold. This yields markedly weaker generalisation for GDP growth and inflation (overall mean MSE = 1.03; overall mean R² = −0.45), with positive performance remaining only for the Gini index (R² = 0.60). Under these conditions, the optimisation step identifies a scenario that satisfies the objective function on standardised outputs (GDP growth = 1.15; inflation = −0.04; Gini = −0.17). The results support the use of the workflow to compare scenarios under explicit assumptions, rather than to produce prescriptive budget guidance.
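
The anti-leakage design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the multi-output XGBoost model is replaced here by a constant mean predictor, only noise injection is shown (bootstrapping and the variational autoencoder are omitted), and every name and value (`toy_series`, `kfold_indices`, `augment_with_noise`, `leakage_free_cv`, `copies`, `sigma`) is hypothetical. The point is only the ordering of steps: the 24 real observations are split first, and synthetic rows are generated from the training fold alone, so no held-out observation influences the augmented data.

```python
import random
import statistics


def kfold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def augment_with_noise(values, copies, sigma, rng):
    """Return the real values plus `copies` Gaussian-jittered replicas of each."""
    out = list(values)
    for v in values:
        out.extend(v + rng.gauss(0.0, sigma) for _ in range(copies))
    return out


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def leakage_free_cv(y, k=4, copies=5, sigma=0.1, seed=0):
    """Cross-validate on real data only, augmenting inside each training fold."""
    rng = random.Random(seed)
    scores = []
    for held_out in kfold_indices(len(y), k, seed):
        held = set(held_out)
        # Synthetic rows are derived from the training fold only.
        train_real = [y[i] for i in range(len(y)) if i not in held]
        train_aug = augment_with_noise(train_real, copies, sigma, rng)
        # Stand-in model (assumption): predict the augmented training mean.
        prediction = statistics.fmean(train_aug)
        y_val = [y[i] for i in held_out]  # real, unseen observations
        scores.append(r_squared(y_val, [prediction] * len(y_val)))
    return scores


# Hypothetical toy series of 24 distinct values standing in for one output.
toy_series = [((i * 7) % 24) / 10 for i in range(24)]
scores = leakage_free_cv(toy_series)
```

Because the stand-in model predicts a single constant, its held-out R² cannot exceed zero: a negative R² means the predictions fit the held-out observations worse than their own mean would. Augmenting the full dataset before splitting would instead place noisy copies of held-out observations in the training data, which tends to inflate validation scores.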

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press
Figure 1. Hybrid AI model for budgetary resource allocation. Note: Author’s own elaboration.

Table 1. Budgetary functions

Table 2. Variable description

Figure 2. Stages of the process for optimising public budgetary resources using AI. Note: Author’s own elaboration.

Table 3. Descriptive statistics of output variables (original versus synthetic data)

Figure 3. Distribution comparison: boxplots (a–c). Note: Author’s own elaboration.

Figure 4. Learning curve. Note: Author’s own elaboration.

Table 4. Performance of the machine-learning model under two validation protocols

Table 5. Comparison between actual and predicted values by the model (z-score)

Figure 5. Comparison of actual and predicted values (a–c). Note: Author’s own elaboration.

Figure 6. Scatter plots (a–c). Note: Author’s own elaboration.

Figure 7. Optimised allocation.