A variety of logical frameworks support the use of higher-order abstract syntax in representing formal systems. Although these frameworks seem superficially the same, they differ in a variety of ways, for example in how they handle a context of assumptions and in which theorems about a given formal system can be concisely expressed and proved. Our contributions in this paper are twofold: (1) we develop a common infrastructure and language for describing benchmarks for systems supporting reasoning with binders, and (2) we present several concrete benchmarks, which highlight a variety of different aspects of reasoning within a context of assumptions. Our work provides the background for the qualitative comparison of different systems that we have completed in a separate paper. It also allows us to outline future fundamental research questions regarding the design and implementation of meta-reasoning systems.