Efficient parallel and incremental parsing of practical context-free languages

  • JEAN-PHILIPPE BERNARDY (a1) and KOEN CLAESSEN (a1)

Abstract

We present a divide-and-conquer algorithm for parsing context-free languages efficiently. Our algorithm is an instance of Valiant's algorithm (Valiant, 1975; General context-free recognition in less than cubic time. J. Comput. Syst. Sci. 10(2), 308–314), which reduces the problem of parsing to matrix multiplication. We show that, while the conquer step of Valiant's algorithm is O(n^3), it improves to O(log^2 n) under certain conditions satisfied by many useful inputs that occur in practice, provided one uses a sparse representation of matrices. The improvement happens because the overwhelming majority of the matrices being multiplied are empty. This result is relevant to modern computing: divide-and-conquer algorithms with a polylogarithmic conquer step can be parallelized relatively easily.
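
To make the matrix view concrete, here is a minimal Python sketch of chart parsing in which each chart cell is a set of nonterminals and the "product" of two cells applies the binary grammar rules. It is not the authors' implementation and it does not perform Valiant's divide-and-conquer recursion: it fills the chart in the plain CYK order, and only illustrates the cell product and a sparse representation in which empty cells are never stored. The toy grammar and the names `BINARY`, `UNARY`, `cell_product`, `parse_chart`, and `recognizes` are assumptions made for illustration.

```python
# Hypothetical toy grammar in Chomsky normal form for balanced parentheses:
#   S -> L R | L X | S S ;  X -> S R ;  L -> '(' ;  R -> ')'
BINARY = {
    ("L", "R"): {"S"},
    ("L", "X"): {"S"},
    ("S", "S"): {"S"},
    ("S", "R"): {"X"},
}
UNARY = {"(": {"L"}, ")": {"R"}}


def cell_product(xs, ys):
    """Multiply two chart cells: collect every A with a rule A -> B C,
    where B is in xs and C is in ys."""
    out = set()
    for b in xs:
        for c in ys:
            out |= BINARY.get((b, c), set())
    return out


def parse_chart(tokens):
    """Fill a sparse upper-triangular chart keyed by (start, end).
    Empty cells are simply absent from the dictionary."""
    n = len(tokens)
    chart = {}
    # Width-1 cells come from the terminal rules.
    for i, tok in enumerate(tokens):
        cell = UNARY.get(tok, set())
        if cell:
            chart[(i, i + 1)] = set(cell)
    # Wider cells are sums of products of narrower cells (plain CYK order).
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):
                left, right = chart.get((i, k)), chart.get((k, j))
                if left and right:  # skip products with an empty operand
                    cell |= cell_product(left, right)
            if cell:
                chart[(i, j)] = cell
    return chart


def recognizes(tokens, start="S"):
    """True if the whole input derives from the start symbol."""
    return start in parse_chart(tokens).get((0, len(tokens)), set())
```

For example, `recognizes("(())()")` holds while `recognizes("(()")` does not. Storing only the non-empty cells is what lets the products with an empty operand be skipped outright.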

References

Allison, L. (1992) Lazy dynamic-programming can be eager. Inform. Process. Lett. 43 (4), 207–212.
Bernardy, J.-P. (2008) Yi: An editor in Haskell for Haskell. In Proceedings of the 1st ACM SIGPLAN Symposium on Haskell. ACM, pp. 61–62.
Bernardy, J.-P. (2009) Lazy functional incremental parsing. In Proceedings of the 2nd ACM SIGPLAN Symposium on Haskell. ACM, pp. 49–60.
Bernardy, J.-P. & Claessen, K. (2013) Efficient divide-and-conquer parsing of practical context-free languages. In Proceedings of the 18th ACM SIGPLAN International Conference on Funct. Programming, pp. 111–122.
Bird, R. (1986) An Introduction to the Theory of Lists. Programming Research Group, Oxford University Comp. Laboratory.
Burckhardt, S., Leijen, D., Sadowski, C., Yi, J. & Ball, T. (2011) Two for the price of one: A model for parallel and incremental computation. In Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications. ACM, pp. 427–444.
Chomsky, N. (1959) On certain formal properties of grammars. Inform. Control 2 (2), 137–167.
Chytil, M., Crochemore, M., Monien, B. & Rytter, W. (1991) On the parallel recognition of unambiguous context-free languages. Theor. Comput. Sci. 81 (2), 311–316.
Claessen, K. (2004) Parallel parsing processes. J. Funct. Program. 14 (6), 741–757.
Cocke, J. (1969) Programming Languages and their Compilers: Preliminary Notes. Courant Institute of Mathematical Sci., New York University.
Cormen, T. H., Leiserson, C. E., Rivest, R. L. & Stein, C. (2001) Introduction to Algorithms, 2nd ed. MIT Press.
Forsberg, M. & Ranta, A. BNFC Quick reference, chapter Appendix A, London: College Publications, pp. 175–192.
Free Software Foundation. (1991) GNU General Public License.
Gibbons, J. (1996) The third homomorphism theorem. J. Funct. Program. 6 (4), 657–665.
Hinze, R. & Paterson, R. (2006) Finger trees: A simple general-purpose data structure. J. Funct. Program. 16 (2), 197–218.
Hughes, R. J. M. & Swierstra, S. D. (2003) Polish parsers, step by step. In Proceedings of the Eighth ACM SIGPLAN International Conference on Funct. Programming. ACM, pp. 239–248.
Kasami, T. (1965) An Efficient Recognition and Syntax Analysis Algorithm for Context-Free Languages. Technical Report, DTIC Document.
Lange, M. & Leiß, H. (2009) To CNF or not to CNF? An efficient yet presentable version of the CYK algorithm. Inform. Didactica 8, 2008–2010.
Morita, K., Morihata, A., Matsuzaki, K., Hu, Z. & Takeichi, M. (2007) Automatic inversion generates divide-and-conquer parallel programs. ACM SIGPLAN Not. 42 (6), 146–155.
Okhotin, A. (2014) Parsing by matrix multiplication generalized to boolean grammars. Theor. Comput. Sci. 516 (0), 101–120.
O'Sullivan, B. (2013) The Criterion benchmarking library.
Rytter, W. & Giancarlo, R. (1987) Optimal parallel parsing of bracket languages. Theor. Comput. Sci. 53 (2), 295–306.
Sikkel, K. & Nijholt, A. (1997) Parsing of Context-Free Languages. Berlin: Springer-Verlag, pp. 61–100.
Strassen, V. (1969) Gaussian elimination is not optimal. Numer. Math. 13, 354–356. DOI: 10.1007/BF02165411.
Tomita, M. (1986) Efficient Parsing for Natural Language. Dordrecht: Kluwer Academic Publishers.
Valiant, L. (1975) General context-free recognition in less than cubic time. J. Comput. Syst. Sci. 10 (2), 308–314.
Wagner, T. A. & Graham, S. L. (1998) Efficient and flexible incremental parsing. ACM Trans. Program. Lang. Syst. 20 (5), 980–1013.
Younger, D. (1967) Recognition and parsing of context-free languages in time n^3. Inform. Control 10 (2), 189–208.