Assessing the Promise and Performance of Agencies in the Government of Canada

Abstract
Canada has not escaped trends in most liberal democracies with the rapid growth of agencies created by government to deliver public goods, often justified on the grounds that elements of their mandate (service delivery, adjudication of disputes, regulatory oversight, among others) benefit from an arm's-length relationship to the government of the day. Yet Canadian studies of this phenomenon remain mostly absent from the robust comparative literature theorizing and documenting the emergence of widespread "agencification" and its relationship to performance. This article draws on the Government of Canada's Public Service Employee Survey (PSES) microdata from 2017 to test key hypotheses advanced by proponents of agencification, specifically that agencies are more innovative, autonomous and efficient public organizations. We find that those working in agencies generally report a weaker climate of innovation and less work autonomy than those working in departments, though some types of agencies, namely regulatory and parliamentary ones, defy these trends.


Introduction
When citizens access a service provided or highly regulated by government today, they are more likely than not interacting with a government agency rather than a government department. In Canada, border control officers, food inspectors, tax agents, corrections officers, national park rangers and public health officials, to name a few, are employed by government agencies, yet in previous eras, they were housed within government departments under the direct leadership of ministers. Government agencies were traditionally created to establish an arm's-length relationship between a public authority and the government in order to reduce day-to-day political interference or manage highly technical regulatory activity, but in more recent decades, agencies have been created to promote efficiencies and even competition in a sector. Even more recently, others are seemingly created more cynically, with an aim to reduce the normal oversight of departments by legislative actors (for example, Alberta's Canadian Energy Centre, sometimes called the Energy War Room, which is a communications-focused propaganda outlet).
The Government of Canada has well over 200 arm's-length agencies or boards that carry out public duties (Zussman, 2012). These bodies perform a range of tasks, including regulating sectors of the economy, undertaking research or funding the arts, protecting human rights and providing services to individuals. Thus agencies may be administrative in focus, regulatory, quasi-judicial (rendering enforceable decisions) or parliamentary, and the organizational design, accountabilities and tasks of agencies exhibit substantial diversity. Agencies have a different relationship to ministers than departments do, but they always fall within a portfolio of a minister to whom they are ultimately accountable and who answers for the agency in Parliament and to the public. Parliamentary agencies (for example, Office of the Auditor General, Office of the Privacy Commissioner) are an exception, as they are accountable directly to Parliament and have distinct kinds of autonomy, financial and otherwise. Maintaining arm's-length or partial autonomy is for many agencies the key to their effectiveness and perceptions of fairness, as some agencies are tasked with arbitrating or settling claims among conflicting interests (Privy Council Office, Government of Canada, 1999). Thus, ministers are responsible for the policies governing agencies but tend not to intervene on specific decisions or actions or the general day-to-day management of the organization.
Canada and similar countries have always had arm's-length agencies or authorities, some pre-confederation; the first major agency in Canada (the Board of Railway Commissioners, which had regulatory functions, including the approval of rates) was created in 1851 (MacDonald, 1993). However, the explosion in the number of agencies from the 1990s onward is generally attributed to the influence of the new public management (NPM) philosophy that tore through Western governments in this period and beyond. While the justification for some public functions being arm's-length from the government in order to promote good governance was long established (for example, central bank autonomy), the NPM agenda maintained that certain public bodies would perform better with a more corporate structure, typically with a board, ostensibly more managerial autonomy, and often performance-based reporting. For example, in the late 1990s, Canada created an integrated food inspection agency, a national revenue agency and a national parks agency by hiving off previous functions of government departments.
The growth of arm's-length agencies and boards in Canada mirrors trends in comparator countries that even more eagerly embraced NPM principles, particularly by infusing business practices and organizational designs into government structures. Yet scholars in the United Kingdom, continental Europe, the United States, and Oceania have found limited evidence in both case studies and cross-country quantitative comparative studies that "agencification" (the increasing hiving off of department tasks to arm's-length bodies) contributes to higher public sector performance (Andrews, 2010; James, 2003; Overman and van Thiel, 2016; Yamamoto, 2006). Those who do find some evidence of positive effects caution that it is difficult to isolate cause and effect (Caulfield, 2006). Yet Canada has never been part of these cross-country analyses, and arm's-length bodies in the country have not been given the attention they deserve, especially given that in some provinces they have been claimed to be responsible for upward of 50 per cent of government operating expenditures (McCrank et al., 2007).
In short, we have no basis to make claims, positive or negative, about broad trends in agencification in Canada. Indeed, Tupper (2018) observed that "Canadian public administration lacks a coherent body of rigorous research on public agencies. A balanced perspective is lacking. For example, in what areas are semi-independent agencies effective?" (222). This article addresses this gap by drawing on the Government of Canada's Public Service Employee Survey (PSES) microdata from 2017, which surveys all employees in 86 departments and agencies on a host of questions related to their work, including organizational cultures, employee experience and ability to deliver high-quality service to Canadians. Using various responses from these surveys, we are able to assemble a set of answers to the following research question: Do the agencies within the Government of Canada show evidence of the three central claims from agencification proponents: more employee autonomy, a more efficient organization, and a more innovative work climate than departments? The quasi-experimental method of Mahalanobis distance matching (MDM) on the survey microdata is used in this study to estimate the "agency effect" on various organizational performance metrics from the 2017 PSES survey.
The article proceeds as follows. First, the article explores the agencification debate in the context of the NPM movement and reviews the empirical findings from comparator countries. Second, the PSES survey and microdata of Government of Canada employees are described, and hypotheses are presented for the relevant questions on the survey that connect with agencification and its normative principles identified in the literature. Third, the quasi-experimental method of MDM is articulated and the PSES data are analyzed in relation to the hypotheses articulated. We discover that those working in agencies generally report a lower climate of innovation and work autonomy than those working in departments, though some types of agencies, namely regulatory and parliamentary ones, defy these trends. The final section contemplates possible explanations for agency-type divergence and their implications.

Literature Review
Government agencies can be distinguished from traditional government departments by a few common criteria in the literature. Talbot (2004) defines agencies as typically having three key features: (1) they are structurally separated in some way from a ministry, (2) they carry out a public task and (3) they operate under more businesslike conditions than traditional government departments or ministries. This definition aligns with that of Aulich et al. (2010), who contributed to a cross-national agency analysis that held that agencies are structurally differentiated from departments, perform a public function, have the capacity for autonomous decision making and have an expectation of continuity over time. Pollitt et al.'s (2004) definition is similarly aligned; it emphasizes that agencies typically have a separate legal status (a stricter definition) but agrees with the others that no agencies in democracies have full statutory independence from a parent ministry (except for parliamentary agencies). Agencies thus have more autonomy than traditional government departments to perform a public function, but an agency will remain arm's-length to a minister who conducts oversight and broad steering functions of its mandate. Arm's-length agencies around the world perform a diverse set of tasks, but they usually involve some form of policy implementation, such as delivering a service (for example, border control), regulation (for example, energy pipelines) or exercising different kinds of public authority (for example, collection of national statistics) (Pollitt et al., 2004; Thynne, 2004; van Thiel, 2012).
The number of agencies created by government has grown in most liberal democracies in recent decades, as has the extent to which they perform key government functions and activities. And while agencies have garnered significant attention from scholars in the United Kingdom, Europe and the United States, the story of agencification in Canada has largely been untold. Pollitt et al. (2004) summarize the largely international scholarship on agencies as motivated by three main questions: Why has the agency form become so popular over the past couple of decades? How can agencies be steered and controlled by their parent ministries? Under what conditions do agencies perform well?
Most scholars attribute the growth of arm's-length agencies in recent decades to the NPM movement that first took root in the early 1990s in the United States and United Kingdom (Pollitt et al., 2004; van Thiel and Yesilkagit, 2014), while some identify a parallel development of the depoliticized or technocratic regulatory state (Bach and Jann, 2010). The history of the development and implementation of NPM in many democracies is covered exhaustively elsewhere and thus will not be reviewed comprehensively here. However, its link to agencification is that agencies are an organizational structure that can more easily separate policy from implementation (a key NPM objective) and offer organizational flexibility outside of the ministerial structure in a way that can mimic private sector business practices that focus on efficiency, leanness, performance, and service responsiveness.
The normative foundation of government agencies within the NPM and technocratic regulatory state literatures centres on the professionalization potential of an arm's-length relationship to government. NPM scholars Osborne and Gaebler (1992) document how reformers claimed that agencies would be more professionally managed, more businesslike or leaner in approach, would offer higher-quality outcomes (whether services, administration, or regulatory enforcement) and reduce political interference. This was in part because the agencies would be granted more flexibility on corporate governance such as decision making, human resources, and financial management, as well as being usually smaller in size and more singular in purpose. They similarly hypothesized that in agencies, managers would be given more operational freedom to spend their budgets, which could contribute to more innovative use of funds and better value for money (see also Pollitt et al., 2001). Laegreid et al. (2011) find in a study of 121 Norwegian and Flemish state agencies that the level of innovation is high among them and that this is primarily attributable to results-oriented control in these agencies. Much NPM-informed agency scholarship is rooted in principal-agent theory that holds that agencification would cultivate specialized public services that are a better fit with client demands, which will improve quality and efficiency (Boston et al., 1996). Other scholarship echoes the idea that agencies would be more likely to work closely with citizens and stakeholders in policy implementation as a nimble and public-facing organization than would traditional government departments (MacCarthaigh and Boyle, 2014).
While agencification reformers had many champions, particularly in the first wave in the early 1990s, others were skeptical of the above promises of the movement, and many more today express caution about these structural trends in government administration and implementation. Whereas some see benefits in agency autonomy from arbitrary and politicized intervention, Christensen and Laegreid (2003) see harmful ambiguity in roles and the risks of undermining political control of key public tasks. These authors later elaborate that agencification may contribute to increased complexity, problems of coordination, higher transaction costs and serious loss of effective political control and accountability (Christensen and Laegreid, 2007). Bach and Jann (2010) echo this concern when they describe Germany as a highly differentiated "administrative zoo" with a large number of species (443), as do Bouckaert et al. (2010) when they claim that agencification has led to fragmentation of the public sector and that there is a need to restore coordination between agencies and bureaucracies.
After decades of research within and across countries, the evidence is mixed about the performance effects of agencification. A recent 20-country study compared public sector outputs against the scale of agencification and found that while long-standing agencies (pre-NPM revolution) are associated with higher performance, overall public sector output efficiency decreases with greater agencification (Overman and van Thiel, 2016). This suggests that the slate of agencies created in the contemporary period has not realized expected results, as the agencies are no longer reserved for use in market-driven sectors. Other macro-level studies of agencification on performance have found mixed or no effects (Andrews, 2010; James, 2003; Yamamoto, 2006). On the other hand, Caulfield (2006) finds greater agencification in sub-Saharan Africa to be associated with higher economic growth, lower budget deficits and low inflation, but she cautions against attributing those outcomes to specific reform initiatives. Dan's (2014) review of the agencification literature finds positive effects from agencification on an orientation toward results and service users' needs but also claims that the conditions under which agencification increases public sector performance are still elusive to researchers (see also Dan et al., 2012). Chun and Rainey (2005) add further nuance by challenging any broad-based claims about agencies, using empirical analysis to show that agencies have different dynamics based on their policy responsibility (for example, regulatory versus non-regulatory). As such, the performance of agencies may exhibit more variation by type (for example, regulatory, administrative, adjudicative, enforcement, and so forth) than do departments.
The Canadian story on agencification has largely gone untold in the comparative public administration literature, notwithstanding the growth in agencies in Canadian provinces and territories and within the federal government. The 1970s-1990s was a period in Canadian public administration scholarship that grappled with key questions of agencies, focusing principally on accountability deficits (Eichmanus and White, 1985; Hodgetts, 1973; Schultz, 1982; Silcox, 1975), though Johnson (1991) pushed back on this consensus. But most of this work was conducted prior to the explosion in the number of agencies in Canada in concert with the NPM movement, and Canada now hosts not only many more agencies but also agencies in many more parts of the public sector than when most of the Canadian scholarship was examining them. An exception is the continued work on local government agencies that has seen a consistent through line in Canadian scholarship, documenting their creation, scope and evolution (Lucas, 2013, 2016; Richmond et al., 1994). Fitzpatrick and Fyfe (2002) document that since the mid-1980s Canada has seen a shift toward service delivery bodies or agencies that are more removed from the administrative or political oversight and control of departments and ministers.
They provide examples of very large service- or operational-focused agencies that were created in Canada in the late 1990s (for example, the Canadian Food Inspection Agency and the Canada Revenue Agency), but these operate under rather distinct levels of administrative authority according to their context. The justification from the Government of Canada for the creation of these large service agencies was that "Government should be focused on the needs of citizens, not the needs of bureaucracy. Canadians want their governments to co-operate, not compete. And they want better service delivered at a lower cost. Legislation will be introduced that will allow for the creation of fewer, more effective government agencies" (quoted in Fitzpatrick and Fyfe, 2002: 85). And there are indeed examples of high-functioning and esteemed government agencies in Canada, such as Statistics Canada and the Canada Mortgage and Housing Corporation, that are associated with good governance and high performance in their mandate. Yet recent controversies in various provincial arm's-length agencies focused on service delivery, such as Ornge (emergency air transport services) and eHealth in Ontario, both plagued by spending scandals and performance concerns (MacDonald, 2013), have brought to the fore perennial questions of how agencies can run amok without appropriate administrative and political oversight (Zussman, 2012).
Based on the normative theoretical basis for agencification identified in the literature above, this study examines the following hypotheses with respect to organizational type. If proponents of agencification are correct, we should find evidence of the following organization-wide traits:
H1: Respondents working in agencies will report a better climate of innovation than those working in departments.
H2: Respondents working in agencies will report that they have more autonomy and flexibility than those working in departments.
H3: Respondents working in agencies will report a work environment that is leaner and more efficient than those in departments.
H4: The performance of agencies will exhibit systematic variation by type (for example, regulatory, administrative, adjudicative, enforcement, and so forth).
The hypotheses advanced above focus on the internal workings of the organization rather than broader issues of how it carries out its mandate, principally due to the type of data available, as described below. With the literature reviewed and hypotheses articulated, the next section describes the methods used to assess the normative claims identified in the agencification literature but in a Canadian context that has not been examined systematically in the way that comparator countries have.

Methods
The PSES has been conducted every few years (and as of 2017, annually) to measure federal government employees' opinions about their workplace, including engagement, leadership, and organizational performance. The survey is a census of all eligible employees and, therefore, no sampling is done (and thus no error due to sampling). A total of 169,703 employees in 86 (out of 205) federal departments and agencies (or authorities) participated in the 2017 survey, representing a response rate of 61.3 per cent. Statistics Canada applies weights to adjust for nonresponse bias, so that the respondents and population for each agency or department have the same overall distribution with respect to occupational groups, resulting in an N of 258,502. Further details on how agencies were categorized by mandate are in Table A1 in the appendix.
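The nonresponse adjustment described above can be illustrated with a generic post-stratification sketch. This is not Statistics Canada's actual procedure, which the article does not detail beyond the sentence above; the occupational group names, counts, and the function `poststratification_weights` are hypothetical and chosen only to show how weights make respondents reproduce the population's occupational distribution within one organization.

```python
from collections import Counter

def poststratification_weights(respondent_groups, population_counts):
    """Return one weight per respondent so that weighted group totals
    equal the known population count of each occupational group."""
    respondents_per_group = Counter(respondent_groups)
    return [population_counts[g] / respondents_per_group[g]
            for g in respondent_groups]

# Hypothetical organization: 10 eligible employees, 6 of whom responded.
population_counts = {"clerical": 6, "scientific": 4}
respondent_groups = ["clerical", "clerical", "clerical", "clerical",
                     "scientific", "scientific"]

weights = poststratification_weights(respondent_groups, population_counts)

# Weighted totals per group now match the population counts.
weighted_totals = Counter()
for group, weight in zip(respondent_groups, weights):
    weighted_totals[group] += weight

print(weights)
print(dict(weighted_totals))
```

In this toy case the group that responded at a lower rate receives the larger weight, which is the intuition behind a weighted N that exceeds the number of respondents.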
The 86 participating organizations are those in "core public administration departments and separate agencies," but there are very important agencies or crown corporations excluded such as the Bank of Canada, Canada Mortgage and Housing Corporation, Canada Post, and Elections Canada, in addition to over 100 smaller agencies or authorities who choose not to participate for risk of revealing the identities of respondents, given the nature of the questions. Thus, the dataset cannot be considered a complete representation of the federal public sector in Canada, though it is a very large segment of it. The researcher gained access to the PSES 2017 microdata file through the Research Data Centre at the University of British Columbia, which provides strictly supervised access to sensitive personal data held by the Government of Canada. Statistical outputs may be released outside of the facility, but microdata is only available to authorized researchers.
It is critical to acknowledge that the PSES is conceptualized primarily as a human resources management tool for use within the public service and not principally as a tool to assess the performance of departments or agencies. However, there are a number of questions within the PSES that can serve as credible estimates of organizational traits that have been purported to be relevant for arm's-length agency analysis: the amount of burden of bureaucratic processes within the organization, the number of approval stages, the climate of innovation, and employee operational freedom. The survey is especially powerful to leverage for these organizational traits given the large response size and the ability to easily separate the responses of those working in traditional government departments from the responses of those working in various types of agencies. The primary focus of the analysis on all variables is to compare departments versus agencies in the Government of Canada. While this is generally a simple distinction to make, there are some departments that are colloquially known as "agencies" (for example, the "central agencies" of the government: Privy Council Office, Treasury Board, and Finance) but indeed are departments. We follow Schedule IV of the Financial Administration Act to differentiate departments from agencies according to their legal and financial relationship to the Executive and Parliament.
However, there is wide variation among agencies in terms of the sector in which they operate and their central tasks, and thus an analysis that collapses all respondents from agencies together in a comparison with respondents from departments would potentially miss important nuance about those work environments. As such, respondents were also differentiated by function: administrative, regulatory, adjudicative, enforcement, and parliamentary. This was done by the researcher by inspecting the mandate of the agency and determining its best categorical fit and is summarized in Table A1 in the appendix. In virtually all cases the categorization is obvious, as most agencies have singular and clearly articulated mandates, but in a handful of cases it requires researcher interpretation if the mandate spans multiple functions.
This study uses the quasi-experimental method of MDM, which is increasingly applied in the analysis of large government employee surveys (Kim and Lee, 2020; Mullins et al., 2021). Quasi-experimental methods such as matching are useful for examining the effect of a particular experience (for example, working in an agency) when a randomized experiment is not feasible. We cannot randomly assign employees to agencies (treatment) and departments (control) to compare their assessments of the level of innovation, autonomy and efficiency within these organizations, but we can use matching methods to find individuals in departments who are most similar to each "treated" individual on several observable characteristics (Stuart, 2010). In essence we are creating counterfactuals by identifying and matching control respondents in departments who are most comparable to treated respondents in agencies through a systematic process of matching via theoretically relevant observable attributes. Matching is a powerful research design approach to reduce model dependence and bias (Ho et al., 2007).
The attributes used to match respondents in the case of this study are the following, transformed into dummy variables: gender, age cohort, occupational grouping, region, and job satisfaction (consistent with Mullins et al., 2021); these are detailed further in Table A2 of the appendix. MDM will select a control respondent for each treated respondent in such a way that the distance (that is, difference) between them is as small as possible, such that the treatment and control groups are balanced in terms of observable characteristics, which simulates the conditions of random assignment in experimental studies. This allows for relatively simple direct comparisons in the outcomes of interest (in our case, level of innovation, autonomy and efficiency within these organizations), such that the average difference in outcome of interest is an estimate of the impact of being treated (that is, working in an agency). Matching was conducted in R using the MatchIt package created by Ho et al. (2007).
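The matching logic described above can be sketched in a few lines. The study itself used R and MatchIt; what follows is only an illustrative Python stand-in, assuming two numeric covariates (a hypothetical age-cohort code and job-satisfaction score), greedy one-to-one nearest-neighbour matching without replacement, and a made-up binary outcome `y`. The actual specification (the full set of dummy covariates, matching options, weights) is not reproduced here.

```python
from statistics import mean

def covariance_2d(rows):
    """Sample covariance terms (sxx, sxy, syy) for rows of two covariates."""
    n = len(rows)
    mx = mean(r[0] for r in rows)
    my = mean(r[1] for r in rows)
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_sq(a, b, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = a[0] - b[0], a[1] - b[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def match_without_replacement(treated, controls, cov):
    """Greedily pair each treated unit with its closest unused control."""
    available = set(range(len(controls)))
    pairs = []
    for t_idx, t in enumerate(treated):
        best = min(sorted(available),
                   key=lambda c: mahalanobis_sq(t["x"], controls[c]["x"], cov))
        available.remove(best)
        pairs.append((t_idx, best))
    return pairs

# Hypothetical respondents: x = (age cohort code, job satisfaction), y = outcome.
treated = [{"x": (1, 4), "y": 0}, {"x": (3, 2), "y": 1}]            # agencies
controls = [{"x": (3, 2), "y": 1}, {"x": (1, 4), "y": 1},
            {"x": (2, 3), "y": 0}, {"x": (3, 5), "y": 1}]            # departments

cov = covariance_2d([r["x"] for r in treated + controls])
pairs = match_without_replacement(treated, controls, cov)

# Average treated-minus-matched-control difference in the outcome.
effect = mean(treated[t]["y"] - controls[c]["y"] for t, c in pairs)
print(pairs, effect)
```

Scaling differences by the covariance matrix is what distinguishes Mahalanobis distance from plain Euclidean distance: covariates on different scales, or correlated with one another, do not dominate the choice of match.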

Data Analysis
The four hypotheses are tested with multiple measures from the PSES survey instrument. There are two different questions asked for each of the three dependent variables of interest in this study, which provides additional measurement reliability when it comes to hypothesis testing. Recall that these measures capture internal organizational features rather than output measures of organizational effectiveness, which is a function of the data available. Table 1 below details the multiple measures utilized. The descriptive characteristics of the key variables of interest when differentiating employees in departments compared to agencies are suggestive of modest effects of agencification, summarized in Figure 1 below. Employees in agencies, when compared to their equivalents in departments, are slightly less likely to agree on their organization fostering a climate of innovation, slightly more likely to claim impediments to worker autonomy, and slightly less likely to claim organizational inefficiencies interfering with work.
The descriptive data can be further differentiated by agency type, given the intention to explore with the fourth hypothesis that agencies may exhibit more variation among types (for example, regulatory, administrative, adjudicative, enforcement, and so forth) than they do in the aggregate when compared to departments. These descriptive data are presented in Figures A1-3 in the appendix and reveal several noteworthy patterns across the variables of interest: enforcement agencies are consistently weaker on key metrics; regulatory, parliamentary and adjudicative agencies tend to be consistently stronger; and administrative agencies tend to mirror the employee sentiment from those in departments.
The descriptive data are informative of general trends by organization type, but there are a number of other drivers of employee sentiment that must be controlled for in order to separate out the agency effects on the key variables of interest. This is primarily done via matching methods with pretreatment covariates of gender, age, job satisfaction, occupation groups, and region (consistent with Mullins et al., 2021), but the results presented below are multivariate regressions using the same covariates. While there are different views in the literature on how to proceed with estimates in matched samples, we take the advice of Ho et al. (2007) that the analysis that you would have done before matching is the one that you should do after matching. Therefore, we perform logistic regression on the matched sample for the variables of interest against organization type, controlling for the covariates of gender, age, job satisfaction, occupation groups, and region. All of the dependent variable measures are therefore transformed into binary values, combining "strongly agree" and "agree" as 1 and combining "neutral," "disagree" and "strongly disagree" as 0. This is a methodological choice to isolate respondents who report positive orientations on the dependent variables of interest from those who are neutral or negative. This is a conservative measurement choice, as it holds that neutrality on response is a deliberate choice not to report a positive orientation in response to an organizational attribute.
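The binary coding described above, and the odds ratio it feeds into, can be shown with a toy example. This is a hedged sketch rather than the study's model: the responses are invented, and the odds ratio here is the unadjusted cross-product ratio from a 2x2 table, which equals the exponentiated coefficient of a logistic regression containing only the agency indicator. The article's models additionally control for gender, age, job satisfaction, occupation group, and region, which this sketch omits.

```python
POSITIVE = {"strongly agree", "agree"}

def binarize(response):
    """Code 'strongly agree'/'agree' as 1; neutral and negative answers as 0."""
    return 1 if response.strip().lower() in POSITIVE else 0

def odds_ratio(agency_outcomes, department_outcomes):
    """Unadjusted odds ratio of agreement, agencies relative to departments."""
    a = sum(agency_outcomes)               # agency respondents who agree
    b = len(agency_outcomes) - a           # agency respondents who do not
    c = sum(department_outcomes)           # department respondents who agree
    d = len(department_outcomes) - c       # department respondents who do not
    return (a / b) / (c / d)

# Hypothetical matched respondents answering one innovation item.
agency = [binarize(r) for r in
          ["agree", "neutral", "disagree", "strongly agree", "neutral"]]
department = [binarize(r) for r in
              ["agree", "strongly agree", "agree", "disagree", "neutral"]]

ratio = odds_ratio(agency, department)
print(agency, department, round(ratio, 3))
```

Because "neutral" is coded 0, a respondent must actively agree to count as reporting a positive orientation, which is the conservative choice the text describes.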
The summary logistic regression results for the key variables of interest are depicted in Figure 2 below, shown as easily interpreted odds ratios (full results in Tables A3-A8 in the appendix). An odds ratio of less than 1 indicates that those in agencies are less likely to report agreement on the respective dependent variable measure than those in departments, and an odds ratio greater than 1 indicates employees in agencies are more likely. An odds ratio of 1 indicates no difference between the two groups on that variable, and an estimate whose confidence interval includes 1 is not statistically significant. On both measures of what we have defined as a climate of innovation ("new idea support" and "encourage innovation"), we see that employees in agencies report a lower climate of innovation, after controlling for relevant covariates such as gender, age, job satisfaction, occupation type, and region. For example, in response to the statement "I feel I would be supported by my department or agency if I proposed a new idea," employees in agencies are 9 per cent less likely to report agreement than those in departments, and they are 14 per cent less likely to agree that "I am encouraged to be innovative or to take initiative in my work" compared to departmental counterparts.
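For readers less familiar with odds ratios, the following sketch shows how a logistic regression coefficient maps to a percentage change in odds, and what that change implies for the probability of agreement at a given baseline. The odds ratio of 0.91 is an assumption used to mirror the "9 per cent" figure above if it is read as a change in odds, and the 60 per cent baseline agreement rate is purely hypothetical; neither value is taken from the appendix tables.

```python
import math

def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds of agreement implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0

def probability_after(baseline_probability, odds_ratio):
    """Probability of agreement after multiplying the baseline odds by the ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)

coefficient = math.log(0.91)           # hypothetical logit coefficient
ratio = math.exp(coefficient)          # odds ratio = exp(coefficient)

change = percent_change_in_odds(ratio)
agency_probability = probability_after(0.60, ratio)  # hypothetical 60% baseline

print(round(change, 1), round(agency_probability, 3))
```

Strictly speaking an odds ratio describes odds rather than probabilities, so the size of the corresponding gap in agreement rates depends on the baseline rate in departments.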
Furthermore, employees in agencies are 7 per cent and 22 per cent more likely to say work suffers from lack of autonomy ("changing priorities" and "lack stability," respectively), yet they are 14 per cent and 15 per cent less likely to claim organizational inefficiencies interfering with work ("approval stages" and "unnecessary processes," respectively) than their equivalents in departments. The evidence is clear from this level of differentiation of organization type (agency versus department) that neither Hypothesis 1 nor Hypothesis 2 is confirmed. In fact, the data show the opposite in those cases, thus providing additional evidence for the agency skeptics in the literature. The results do, however, lend support to H3: that employees working in agencies report higher organizational efficiency than do employees working in departments.
Yet the literature is quite clear that not all agencies are the same in function, and thus a mere agency versus department analysis is incomplete. As such, and consistent with H4, we probe further by conducting regression analysis with more differentiated agency types to measure against departments. The agencies can be subcategorized into five types (shown in Table A1 of the appendix): administrative, adjudicative, regulatory, enforcement, and parliamentary. The summary logistic regression results for the key variables of interest are depicted in Figure 3 below, once again shown as easily interpreted odds ratios (full results in Tables A3-A8 in the appendix). A clear pattern emerges across all variables: enforcement agencies are consistently weaker on key metrics, whereas regulatory and parliamentary agencies tend to be consistently stronger. Recall that the analysis controls for various demographic and positional characteristics in the estimation and that these agency effects are not driven, for example, by employees in enforcement agencies being more dissatisfied with their job (they are, but that is controlled and isolated from the agency effect, and full results in Tables A3-A8 of the appendix show this).
Beyond the big picture findings by agency type, there are noteworthy patterns for each of the dependent variables of interest. Figure 3 above shows that employees in regulatory and parliamentary agencies, when compared to employees working in departments, are 24-69 per cent more likely to report a climate of innovation (depending on the measure), whereas those in enforcement agencies are 29-33 per cent less likely. Those in regulatory, parliamentary, and administrative agencies, when compared to those working in departments, are 13-43 per cent less likely to claim limits to their autonomy at work, and those in enforcement agencies are 76 per cent more likely. Finally, those in parliamentary and administrative agencies, when compared to those working in departments, are 49-56 per cent less likely to claim organizational inefficiencies, but those in enforcement agencies are 26 per cent more likely. Thus what stands out on the three measures of internal organizational performance is that those in regulatory, parliamentary (and to a certain degree, adjudicative) agencies seem to align with pro-agencification views, whereas those in enforcement agencies align with agency-skeptical views.
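To make the interpretation of these figures concrete, the short sketch below illustrates how an odds ratio from a logistic regression translates into the percentage statements used above. It assumes that each reported percentage corresponds to (odds ratio - 1) x 100. The odds ratios shown are back-computed from the percentages in the text under that assumption rather than taken from the appendix tables, and the function names and the baseline probability are hypothetical.

```python
import math


def odds_ratio_from_coef(beta: float) -> float:
    """Convert a logistic regression coefficient (log-odds) into an odds ratio."""
    return math.exp(beta)


def percent_change_in_odds(odds_ratio: float) -> float:
    """Express an odds ratio as a percentage change in odds versus the reference group."""
    return (odds_ratio - 1.0) * 100.0


def probability_under_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds = baseline_prob / (1.0 - baseline_prob) * odds_ratio
    return odds / (1.0 + odds)


# Assumption: odds ratios implied by the percentages quoted in the text,
# with departments as the reference category (odds ratio = 1).
implied_odds_ratios = {
    "enforcement, limits to autonomy": 1.76,
    "enforcement, organizational inefficiencies": 1.26,
    "all agencies, lack of stability": 1.22,
    "all agencies, approval stages": 0.86,
}

for label, ratio in implied_odds_ratios.items():
    print(f"{label}: OR = {ratio:.2f}, {percent_change_in_odds(ratio):+.0f}% change in odds")

# Hypothetical baseline: if half of department employees reported a problem,
# an odds ratio of 1.76 would imply roughly 64 per cent among the comparison group.
print(round(probability_under_odds_ratio(0.5, 1.76), 3))
```

Strictly, these are percentage changes in the odds rather than in the probability. The two are close when the baseline probability is small and diverge as it grows, which is why the last function is included.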

Discussion
The agencification literature has largely been developed without insights from Canada, despite the considerable growth of arm's-length agencies in Canadian governments. And while country-based and comparative analyses of agencies around the world have not yielded any definitive answers with respect to agencification trends and organizational innovation and efficiency, until now we have had no basis to make claims, positive or negative, about what is happening in Canada in a broad sense. This study aimed to address this gap by drawing on high-quality data available from the Government of Canada, responding to recent efforts to make better use of PSES microdata (Cooper and Turgeon, 2021; McGrandle, 2019; Mullins et al., 2021). The data do not include provincial government departments or agencies but represent an original and systematic first look into agency organizational dynamics in Canada.
The results reveal that those working in agencies generally report less innovation and less work autonomy than those working in departments (thus providing evidence contrary to H1 and H2), whereas they do report a work environment that is leaner and more efficient than that of departments (consistent with H3). That said, some types of agencies (namely, regulatory and parliamentary ones) appear to defy these broad trends and in fact seem to be stronger than other agencies (in particular, enforcement agencies) and departments (consistent with H4). It is therefore clear that any broad-based statement about agencification in the Government of Canada vis-à-vis performance lacks important nuance related to agencies' core tasks.
The most consistently weaker types of agencies based on employee perception data on the internal organizational metrics analyzed are enforcement agencies. It is perhaps understandable that these kinds of agencies are not leaders in fostering a climate of innovation given their mandate to apply, systematically, the laws and regulations set by elected decision makers, even if there is a certain amount of discretion. Yet adjudicative agencies (for example, Immigration and Refugee Board, Parole Board of Canada) split from the weaker environment evident in enforcement agencies (for example, Canada Border Services Agency, Correctional Service of Canada) in this analysis. Prior work on adjudicative agencies suggests that there is a policy-making role in the process of creating jurisprudential guides and internal processes, thus generating innovation potential, even as they aim to foster efficiency and consistency in decision making (Houle and Sossin, 2006; Tomkinson, 2018, 2020). Yet on the key measures of autonomy, those working in enforcement agencies are considerably more likely to report that their work suffers due to changing priorities and a lack of stability in the organization. Adjudicative agencies deviate from enforcement agencies on the autonomy measures, consistent with what we would expect given their uniquely defined arm's-length mandate. And on organizational efficiency, employees in enforcement agencies are more likely than departments (and other agencies) to report that their work suffers due to too many approval stages and unnecessary procedures, flying in the face of common rationales for arm's-length bodies as opportunities to create leaner and more efficient organizations.
By contrast, the most consistently stronger agencies on internal organizational metrics of innovation, autonomy and efficiency (based on employee perception) are those in the regulatory and parliamentary realms. Certainly this is not a surprise on the autonomy measures for the parliamentary agencies (for example, Office of the Auditor General, Office of the Commissioner of Official Languages), given that they are more formally and normatively separated from Cabinet and ministerial intervention. They are also smaller organizations, and thus one can understand how employees could report higher organizational efficiency. It is perhaps surprising, however, that employees in parliamentary agencies are also more likely to report a climate of innovation, measured by whether they feel a new idea they offer will receive support and whether their organization encourages innovation. Yet this is consistent with early agency researchers in the NPM tradition, who argued that when leaders of agencies have more operational control over their budgets (as parliamentary agencies do), they are more likely to innovate and create higher value to the public. The consistently higher perceptions of these organizational attributes by those working in regulatory agencies conform to the same logic as the parliamentary agencies in terms of the effects of autonomy.
The findings from this study therefore ought to spark a renewed conversation in Canada with respect to agencification, in particular for governments when contemplating the creation, reconfiguration or dissolution of arm's-length agencies. Something is clearly happening within the enforcement agencies in Canada (the Canada Border Services Agency, the Correctional Service of Canada, the Royal Canadian Mounted Police, among others) that points to organizational challenges. Size does not explain this, as there are similarly large agencies in other realms (for example, Canada Revenue Agency, Statistics Canada, Parks Canada) that do not exhibit such strong negative employee perceptions of the organizational climate of innovation, autonomy and efficiency. While this study cannot resolve this question, important elements of the mandate of such agencies and the cultures developed within them may help explain this phenomenon. Whatever the cause, the pattern has implications as governments consider reforming these agencies to deliver high-quality services to the public.

Limitations
Despite the innovative approaches used to analyze the Government of Canada's PSES microdata in this study, there are limitations. First, although matching methods are a powerful research design approach to reduce model dependence and bias, they require the assumption of unconfoundedness (that is, no hidden bias caused by unobservable characteristics) so that treatment assignment is considered exogenous (Rubin, 1990). Yet in the case of this study, the selection for "treatment" (working in an agency as opposed to a department) may very well be based on unobservable variables. Even after matching on observable characteristics, there may be unobservable sources of selection bias that shape one's choice to work in an agency.
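The logic of this limitation can be illustrated with a small, fully synthetic example. The sketch below is not the matching procedure used in this study: it substitutes exact stratification for matching, and every quantity in it (the covariates x and u, the cell sizes, the selection shares and the outcome rates) is a hypothetical value chosen for illustration. The true "agency" effect is zero by construction, yet comparing treated and control groups within strata of the observed covariate alone still yields a non-zero difference, because the unobserved covariate drives both selection and the outcome.

```python
from itertools import product


def build_cells():
    """Hypothetical population cells: x is observed, u is unobserved, t is treatment."""
    cells = []
    for x, u in product((0, 1), repeat=2):
        treated_share = 0.7 if u == 1 else 0.3  # u drives selection into "agencies"
        outcome_rate = 0.3 + 0.2 * u            # u drives the outcome; t has no effect
        for t in (0, 1):
            share = treated_share if t == 1 else 1.0 - treated_share
            cells.append({"x": x, "u": u, "t": t, "n": 100 * share, "rate": outcome_rate})
    return cells


def stratified_difference(cells, keys):
    """Treated-minus-control outcome rate, averaged over strata defined by `keys`."""
    strata = {}
    for cell in cells:
        strata.setdefault(tuple(cell[k] for k in keys), []).append(cell)
    weighted_sum, total_n = 0.0, 0.0
    for group in strata.values():
        arm_means = {}
        for t in (0, 1):
            arm = [c for c in group if c["t"] == t]
            arm_n = sum(c["n"] for c in arm)
            arm_means[t] = sum(c["n"] * c["rate"] for c in arm) / arm_n
        group_n = sum(c["n"] for c in group)
        weighted_sum += group_n * (arm_means[1] - arm_means[0])
        total_n += group_n
    return weighted_sum / total_n


cells = build_cells()
# Conditioning only on the observed covariate leaves a spurious "effect".
print(round(stratified_difference(cells, ("x",)), 3))
# Conditioning on the unobserved covariate as well recovers the true effect of zero.
print(round(stratified_difference(cells, ("x", "u")), 3))
```

In practice the second comparison is unavailable, since u is by definition unobserved, which is why the unconfoundedness assumption cannot be verified from the data alone.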
A second limitation is that all of the dependent variable data in this analysis are self-reported or perception-based, rather than objective measures of the climate of innovation, employee autonomy, or efficiency of the organizations. Perceptions may differ from reality (Taylor and Westover, 2011) or what a value-for-money audit may conclude on performance, but perceptions can be argued to be more important than objective conditions (Schyns and Croon, 2006), particularly if those perceptions deviate strongly from the average in the aggregate. Perceptions of high organizational performance and reputation significantly and positively relate to employee engagement (Men, 2012), and various studies have found measures of perceived organizational performance to be correlated positively with objective measures of organizational performance (Dollinger and Golden, 1992; McCracken et al., 2001). That said, more objective (or less perceptual) measures of organizational performance could be used if the unit of analysis were to shift from the individual to the organizational level and rely on public administration performance awards (as done by Bernier and Deschamps, 2020, with the analysis of policy innovation), but one would sacrifice the ability to draw on the rich PSES microdata.
A third limitation stemming from the use of PSES data is that it is a single source of data, and thus common method variance could be a problem due to systematic measurement errors (Favero and Bullock, 2015). It is most problematic when both dependent and independent variables are measured via respondents' perceptions. In the analysis presented in this study, only the dependent variables are measured as perceptions, whereas the main independent variable (organization type) and covariates (gender, age, occupational group, and region) are objective measures (with the exception of job satisfaction as a covariate). Furthermore, the questionnaire design of the PSES followed recognized procedural strategies to reduce this common source bias threat, including ensuring the anonymity of respondents, using different scale properties and, notably, providing proximal separation between like measures (George and Pandey, 2017).

Future Research Directions
The results from this study contribute to a critical ongoing discussion in the literature on the agencification of government, in particular by bringing in the Canadian experience that has otherwise gone unexamined. A systematic analysis of agencification in Canada is needed, and while this study has presented an opening shot, there is much more research to do in order to make definitive claims about such trends in the country. First, researchers ought to continue to mine the various waves of the Government of Canada's PSES which, as a consistently designed and delivered workforce census, can allow for trend analysis on the central questions motivating research on agencification. The available data for the PSES begin in 2005, and the PSES was conducted every three years prior to becoming annual in 2017. Drawing on these prior waves of survey data, researchers could examine if there are particular trends associated with partisan control of government on how arm's-length agencies conduct their work. There may be additional opportunities in Canada to draw on similar workforce surveys from the provincial governments, though none appear as default open-data (to approved researchers) by the relevant statistical agency, as is the case with the Government of Canada. Provincial-level data would allow for even more comparative data with which to confirm or challenge the findings from this study.
A second avenue for future research on agencification in Canada is to triangulate the perception-based survey analysis with outcomes-based analysis drawing on performance reviews conducted on federal departments and agencies by the Treasury Board Secretariat (TBS). Typically the TBS selects three federal organizations (departments and agencies) annually for rigorous performance reviews, and the reviews follow a similar format to assist with comparative analysis. Additionally, in response to the Policy on Results in 2016, the TBS (n.d.) has assembled an online database for all annual departmental and agency results reports (GC InfoBase) and provides high-level comparative information (for example, percentage of annual performance targets met), as well as detailed reports on departments and agencies, allowing researchers to conduct both quantitative and qualitative comparative analysis using key outcome indicators.
A third, and complementary, approach for future research on agencification in Canada would be through interviews with key current and former leaders of agencies on the climate of innovation, the level of autonomy, and the efficiency of their organizations relative to their departmental affiliates. While these data would be perception-based, as in the PSES, they would allow for more textured insight into the governance dynamics and pressures within agencies. For example, beyond learning about agency autonomy from questions in surveys, interviews can explore how leaders navigate mandate letters, ministerial oversight, their boards, and public-facing accountability. Given that arm's-length agencies exhibit considerable variation, researchers might focus on particular types of agencies (for example, transit, health, service, and so forth) to enable a more controlled comparison across levels of government and build out the knowledge base with careful qualitative work.