Abstract
In the standard transformer architecture, increasing model parameters leads to linear growth in computational cost and activation memory. To address this issue, we propose a novel Infinite Parameter Large Language Model (IP-LLM) architecture that decouples model size from computational cost and device memory. Existing large language models are all fixed-parameter models, while human knowledge is infinite and expands daily; finite parameters are inherently limited in their capacity to accommodate this boundless knowledge. Our IP-LLM architecture can potentially accommodate infinite knowledge, resolving this issue and laying the foundation for realizing a truly omniscient and omnipotent artificial general intelligence in the future. Our architecture surpasses MoE in performance while requiring significantly less memory.

Figure 1: Parameters A, B, C, and D store knowledge.


