
Efficient probabilistic grammar induction for design

  • Mark E. Whiting, Jonathan Cagan and Philip LeDuc


The use of grammars in design and analysis has been held back by the lack of automated ways to induce them from arbitrarily structured datasets. Machine translation methods provide a construct for inducing grammars from coded data, and these methods have been extended to design through the use of pre-coded design data. This work introduces a four-step process for inducing grammars from un-coded structured datasets, which can comprise a wide variety of data types, including many used in design. The method includes: (1) extracting objects from the data, (2) forming structures from objects, (3) expanding structures into rules based on frequency, and (4) finding rule similarities that lead to consolidation or abstraction. To evaluate the method, grammars are induced from generated data, architectural layouts and three-dimensional design models, demonstrating that it automatically produces usable grammars that are functionally similar to grammars produced by hand.
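The four steps named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the data are plain token sequences rather than graphs or 3D models, treats contiguous n-grams as "structures", uses relative frequency as the rule probability, and stands in for consolidation with a simple one-position wildcard merge. All function names, the `size` and `min_count` parameters, and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def extract_objects(dataset):
    """Step 1 (assumed form): collect the distinct objects, here tokens."""
    return sorted({tok for seq in dataset for tok in seq})


def form_structures(dataset, size=2):
    """Step 2 (assumed form): structures are contiguous n-grams of objects."""
    return [tuple(seq[i:i + size])
            for seq in dataset
            for i in range(len(seq) - size + 1)]


def expand_rules(structures, min_count=1):
    """Step 3 (assumed form): keep frequent structures as rules and
    weight each rule by its relative frequency."""
    counts = Counter(structures)
    kept = {s: c for s, c in counts.items() if c >= min_count}
    total = sum(kept.values())
    return {s: c / total for s, c in kept.items()}


def consolidate(rules):
    """Step 4 (assumed form): merge pairs of rules that differ in exactly
    one position into an abstract rule with a '*' wildcard there."""
    merged = {}
    used = set()
    for a, b in combinations(sorted(rules), 2):
        if a in used or b in used:
            continue  # each rule takes part in at most one merge
        diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
        if len(diff) == 1:
            abstract = tuple('*' if i == diff[0] else x
                             for i, x in enumerate(a))
            merged[abstract] = merged.get(abstract, 0.0) + rules[a] + rules[b]
            used.update((a, b))
    # Rules that found no partner are carried over unchanged.
    for rule, prob in rules.items():
        if rule not in used:
            merged[rule] = merged.get(rule, 0.0) + prob
    return merged


def induce_grammar(dataset, size=2, min_count=1):
    """Run the four toy steps in order and return (objects, rules)."""
    objects = extract_objects(dataset)
    structures = form_structures(dataset, size)
    rules = expand_rules(structures, min_count)
    return objects, consolidate(rules)


# Hypothetical sample data: three tiny "layout" sequences.
layouts = [
    ["wall", "door", "wall"],
    ["wall", "window", "wall"],
    ["wall", "door", "wall"],
]
objects, grammar = induce_grammar(layouts)
```

On this sample the door and window rules collapse into two abstract rules, `('wall', '*')` and `('*', 'wall')`, each carrying roughly half of the probability mass. A real induction method over graphs would need structure matching in place of the position-wise comparison used here.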


Corresponding author

Authors for correspondence: Jonathan Cagan and Philip LeDuc



