Game-theoretic policy computing and simulation for blockchained buffering system via diffusion approximation

Published online by Cambridge University Press:  12 January 2024

Wanyang Dai*
Affiliation:
Department of Mathematics and State Key Laboratory of Novel Software Technology, Nanjing University, Nanjing, China
*Corresponding author: Wanyang Dai; Email: nan5lu8@nju.edu.cn

Abstract

We study 3-stage service policy computing driven by a 2-stage game-theoretic problem, convolutional neural network (CNN) based algorithm design, and simulation for a blockchained buffering system with federated learning. More precisely, based on a game-theoretic problem consisting of both “win-lose” and “win-win” 2-stage competitions, we derive a 3-stage dynamic service policy via a saddle point of a zero-sum game problem and a Nash equilibrium point of a non-zero-sum game problem. This policy concerns users-selection, dynamic pricing, and online rate resource allocation via stable digital currency for the system. The main focus is on the design and analysis of the joint 3-stage service policy for given queue/environment state dependent pricing and utility functions. The asymptotic optimality and fairness of this dynamic service policy are justified by diffusion modeling with approximation theory. A general CNN based policy computing algorithm flow chart along the lines of the so-called big model framework is presented. Simulation case studies are conducted for the system with three users, where only two of the three users can be selected for service at any time point by a zero-sum dual cost game competition policy. The two selected users then enter service and share the system rate service resource through a non-zero-sum dual cost game competition policy. Applications of our policy to the future blockchain based Internet (e.g., metaverse and web3.0) and to supply chain finance are also briefly illustrated.
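
As a generic illustration of the saddle-point notion mentioned in the abstract (not the paper's policy, which is defined over queue/environment state dependent cost functions), the following minimal sketch finds pure-strategy saddle points of a small zero-sum matrix game. The function name `pure_saddle_points` and the example payoff matrix are assumptions made for illustration only; the row player is assumed to maximize and the column player to minimize.

```python
def pure_saddle_points(payoff):
    """Return all (row, col) pairs that are pure-strategy saddle points.

    Assumption: the row player maximizes the entry and the column player
    minimizes it. An entry is a saddle point when it is the minimum of its
    row and the maximum of its column, so neither player gains by deviating.
    """
    row_min = [min(row) for row in payoff]
    col_max = [max(col) for col in zip(*payoff)]
    return [
        (i, j)
        for i, row in enumerate(payoff)
        for j, value in enumerate(row)
        if value == row_min[i] and value == col_max[j]
    ]


# Hypothetical 3x3 payoff matrix, chosen only so that a saddle point exists.
example = [
    [3, 1, 4],
    [5, 2, 6],
    [0, -1, 7],
]
saddles = pure_saddle_points(example)
print(saddles)  # [(1, 1)]
```

In the example, the entry 2 at position (1, 1) is the smallest value in its row and the largest in its column, so it is the game value. A matrix game need not have a pure-strategy saddle point, in which case the function returns an empty list.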

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press.

Figure 1. A blockchained buffering system with federated learning, which consists of J users and V pools.

Figure 2. A 3-stage processing flow chart of users-selection, dynamic pricing, and rate scheduling for a multiple pool service system with J users, where J is taken to be 3 for illustration.

Figure 3. A generalized supply chain with finance transactions via stable digital currencies. In this figure, ATO means assemble to order, MTO means make to order, DD means digital dollar, DC/EP means (China Central Bank) digital currency/electronic payment, CBDC means (European) Central Bank digital currency, TCP means transmission control protocol, IoB means Internet of Blockchains, and Asym and sym mean asymmetry and symmetry, respectively.

Figure 4. In this simulation, the number of simulation iterations is $N=6,000$ and the simulation time interval is $[0,T]$ with $T=20$, which is further divided into $n=5,000$ subintervals as explained in Subsection 5. The other simulation parameters, introduced in Definition 3.1 and Subsubsection 4, take the following values: initialprice1 = 2.25, initialprice2 = 1.5, initialprice3 = 2.25, upperboundprice1 = 4, upperboundprice2 = 2, upperboundprice3 = 4, lowerboundprice1 = 0.49, lowerboundprice2 = 0.7, lowerboundprice3 = 0.49, queuepolicylowerbound1 = 0, queuepolicylowerbound2 = 0, queuepolicylowerbound3 = 0, $\lambda_{1}=10/3$, $\lambda_{2}=5$, $\lambda_{3}=10/3$, $m_{1}=3$, $m_{2}=1$, $m_{3}=3$, $\mu_{1}=1/10$, $\mu_{2}=1/20$, $\mu_{3}=1/10$, $\alpha_{1}=\sqrt{10/3}$, $\alpha_{2}=\sqrt{20}$, $\alpha_{3}=\sqrt{10/3}$, $\beta_{1}=\sqrt{10}$, $\beta_{2}=\sqrt{20}$, $\beta_{3}=\sqrt{10}$, $\zeta_{1}=1$, $\zeta_{2}=\sqrt{2}$, $\zeta_{3}=1$, $\rho_{1}=\rho_{2}=\rho_{3}=1,000$, $\theta_{1}=-1$, $\theta_{2}=-1.2$, $\theta_{3}=-1$.
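
The time discretization quoted in this caption can be sketched as follows. This is a minimal sketch under stated assumptions: only the values of $T$, $n$, and $N$ come from the caption, while the helper `brownian_path`, the fixed seed, and the use of a standard Brownian motion as the driving noise are hypothetical choices made for illustration, not the article's simulation code.

```python
import math
import random

# Values quoted in the Figure 4 caption.
T = 20.0   # simulation horizon, i.e. the interval [0, T]
n = 5000   # number of subintervals of [0, T]
N = 6000   # number of simulation iterations (recorded only; not looped over here)

dt = T / n                              # length of one subinterval
grid = [k * dt for k in range(n + 1)]   # grid points t_0 = 0, ..., t_n = T


def brownian_path(rng):
    """Sample one standard Brownian path on the grid.

    Hypothetical helper for illustration: each increment over a subinterval
    is drawn as an independent Gaussian with mean 0 and variance dt.
    """
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


rng = random.Random(0)  # fixed seed so the sketch is reproducible
path = brownian_path(rng)
print(dt, len(grid), len(path))
```

With these values each subinterval has length $T/n = 0.004$, and a sampled path has $n+1 = 5{,}001$ points, one per grid point.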

Figure 5. In this simulation, the number of simulation iterations is $N=6,000$ and the simulation time interval is $[0,T]$ with $T=20$, which is further divided into $n=5,000$ subintervals as explained in Subsection 5. The other simulation parameters, introduced in Definition 3.1 and Subsubsection 4, take the following values: initialprice1 = 1, initialprice2 = 1, initialprice3 = 1, upperboundprice1 = 1, upperboundprice2 = 1, upperboundprice3 = 1, lowerboundprice1 = 1, lowerboundprice2 = 1, lowerboundprice3 = 1, queuepolicylowerbound1 = 0, queuepolicylowerbound2 = 0, queuepolicylowerbound3 = 0, $\lambda_{1}=10/3$, $\lambda_{2}=5$, $\lambda_{3}=10/3$, $m_{1}=3$, $m_{2}=1$, $m_{3}=3$, $\mu_{1}=1/10$, $\mu_{2}=1/20$, $\mu_{3}=1/10$, $\alpha_{1}=\sqrt{10/3}$, $\alpha_{2}=\sqrt{20}$, $\alpha_{3}=\sqrt{10/3}$, $\beta_{1}=\sqrt{10}$, $\beta_{2}=\sqrt{20}$, $\beta_{3}=\sqrt{10}$, $\zeta_{1}=1$, $\zeta_{2}=\sqrt{2}$, $\zeta_{3}=1$, $\rho_{1}=\rho_{2}=\rho_{3}=1,000$, $\theta_{1}=-1$, $\theta_{2}=-1.2$, $\theta_{3}=-1$.

Figure 6. CNN based algorithm flow chart.

Figure 7. A 2-stage processing flow chart of dynamic pricing and rate scheduling for a single pool service system with 2-users.

Figure 8. A 3-stage processing flow chart of users-selection, dynamic pricing, and rate scheduling for a single pool service system with 3-users.

Figure 9. A 3-stage processing flow chart of users-selection, dynamic pricing, and rate scheduling for a service system with 2-pools and 3-users.

Figure 10. Pareto optimal Nash equilibrium policies with dynamic pricing, where Price1/10,000 in the lower-right graph means that Price1 is divided by 10,000.

Figure 11. In this simulation, the number of simulation iterations is $N=6,000$ and the simulation time interval is $[0,T]$ with $T=200$, which is further divided into $n=5,000$ subintervals as explained in Subsection 5. The other simulation parameters, introduced in Definition 3.1 and Subsubsection 4, take the following values: initialprice1 = 9, initialprice2 = 3, lowerboundprice1 = 0.64, lowerboundprice2 = 0.8, $\lambda_{1}=10/3$, $\lambda_{2}=5$, $m_{1}=3$, $m_{2}=1$, $\mu_{1}=1/10$, $\mu_{2}=1/20$, $\alpha_{1}=\sqrt{10/3}$, $\alpha_{2}=\sqrt{20}$, $\beta_{1}=\sqrt{10}$, $\beta_{2}=\sqrt{20}$, $\zeta_{1}=1$, $\zeta_{2}=\sqrt{2}$, $\rho_{1}=\rho_{2}=1,000$, $c_{1}^{2}=c_{2}^{1}=1,500$, $\theta_{1}=-1$, $\theta_{2}=-1.2$.