
Singular value automata and approximate minimization

Published online by Cambridge University Press: 27 May 2019

Borja Balle*
Amazon Research, Cambridge, UK
Prakash Panangaden
School of Computer Science, McGill University, Montreal, Canada
Doina Precup
School of Computer Science, McGill University, Montreal, Canada
*Corresponding author. Email:


This paper uses the spectral theory of linear operators to construct approximately minimal realizations of weighted languages. Our new contributions are: (i) a new algorithm for the singular value decomposition (SVD) of finite-rank infinite Hankel matrices based on their representation in terms of weighted automata, (ii) a new canonical form for weighted automata arising from the SVD of the corresponding Hankel matrix, and (iii) an algorithm to construct approximate minimizations of given weighted automata by truncating the canonical form. We give bounds on the quality of our approximation.

© Cambridge University Press 2019 



This work was completed while the authors were at Lancaster University.

