March/mars 2015 | Volume 58 | Number/numéro 1
Canadian Public Administration / Administration publique du Canada
Analyzing national, provincial, territorial, municipal, aboriginal and international governance practice in a changing world / Analyse la pratique de la gouvernance nationale, provinciale, territoriale, municipale, autochtone et internationale dans un monde en évolution

Special issue / Numéro spécial
Performance measurement, performance management, and accountability: Theory, Practice, Metrics, Benchmarking; Health, Social and Employment Services; Institutions, Utilization, Impact, Accountability
Mesure de la performance, gestion de la performance, et imputabilité : Théorie, pratique, métrologie, analyse comparative d'institutions de services de santé, sociaux et d'emploi, utilisation, impact, imputabilité

Canadian Public Administration / Administration publique du Canada
Editorial Office / Bureau de rédaction: 1075 rue Bay Street, Suite 401, Toronto ON M5S 2B1 CANADA. Tel/Tél.: (416) 924-8787; Fax/Téléc.: (416) 924-4992; e-mail/courriel : [email protected] – [email protected]

editor/rédacteur – Evert A. Lindquist, Professor and Director, School of Public Administration, University of Victoria
associate editor/rédacteur associé – Denis Saint-Martin, Professeur, Université de Montréal
associate editor/rédactrice associée – Cynthia Whitaker, Vancouver, British Columbia
managing editor/directrice de la rédaction – Christy Paddick
CEO/directeur général – Robert Taylor

editorial board/comité de rédaction – Frances Abele, Carleton University; Luc Bernier, ENAP; Sandford Borins, University of Toronto; Keith Brownsey, Mount Royal University; Fred Carden, International Development Research Centre; Ian D. Clark, University of Toronto; Louis Côté, Observatoire de l'administration publique, Québec; Katherine Fierlbeck, Dalhousie University; Toby Fyfe, University of Ottawa; Monica Gattinger, University of Ottawa; Andrew Graham, Queen's University; Victor Y. Haines III, Université de Montréal; Joseph Kushner, Brock University; James A. McAllister, Ministry of Finance, Ontario; Paul F. McKenna, Public Safety Innovation, Inc., Nova Scotia; Janine O'Flynn, University of Melbourne; Glen Randall, McMaster University; Ken Rasmussen, University of Regina; Alasdair Roberts, Suffolk University Law School; Lloyd Robertson, Ministry of Transportation, Ontario; Jeffrey Roy, Dalhousie University; Tania Saba, Université de Montréal; David Siegel, Brock University; Paul 't Hart, University of Utrecht; Annis May Timpson, University of Edinburgh; Graham White, University of Toronto

Canadian Public Administration/Administration publique du Canada (ISSN 0008-4840 [print], ISSN 1754-7121 [online]) is published quarterly on behalf of the Institute of Public Administration of Canada by Wiley Subscription Services, Inc., a Wiley Company, 111 River St., Hoboken, NJ 07030-5774. Administration publique du Canada (ISSN 0008-4840 [imprimée], ISSN 1754-7121 [en ligne]) est publiée une fois par trimestre au nom de l'Institut d'administration publique du Canada par Wiley Subscription Services, Inc., a Wiley Company, 111 River St., Hoboken, NJ 07030-5774.

We are grateful to the Social Sciences and Humanities Research Council of Canada for financial support in the publication of this Journal / Nous remercions le Conseil de recherches en sciences humaines du Canada de l'aide financière qu'il apporte à la publication de cette Revue.
ISSN 0008-4840. December/décembre 2015. Published quarterly / publiée trimestriellement. © The Institute of Public Administration of Canada/L'Institut d'administration publique du Canada 2015. All rights reserved/Tous droits réservés. PAP Registration No. 09891/Enregistrement postal des publications no 09891. Postage paid at Toronto/Port payé à Toronto. "We acknowledge the assistance of the Government of Canada through the Publications Assistance Program toward our mailing costs." / Nous reconnaissons l'aide financière du gouvernement du Canada, par l'entremise du Programme d'aide aux publications (PAP), pour nos dépenses d'envoi postal.

Canadian Public Administration / Administration publique du Canada
March/mars 2015, Volume 58, Number/numéro 1

Special Issue on Performance Measurement, Performance Management, and Accountability — Editors' Introduction / Numéro spécial sur la mesure de la performance, gestion de la performance, et imputabilité — Introduction des rédacteurs
Nathalie Dubois, Jim McDavid, Etienne Charbonneau, Jean-Louis Denis / 1

Original articles / Articles originaux

Health system performance reporting in Canada: Bridging theory and practice at pan-Canadian level
Jeremy Veillard, Brenda Tipper, Sara Allin / 15
In Canada, performance reporting in the health sector is still in development. What are the results of a recent intervention by the Canadian Institute for Health Information to develop a platform for pan-Canadian performance reporting? What are the conceptual, methodological and operational challenges?
Au Canada, l'établissement de rapports sur la performance dans le secteur de la santé est toujours en cours de développement. Quels sont les résultats d'une récente intervention de l'Institut canadien d'information sur la santé (ICIS) pour mettre au point une plateforme qui présente l'information sur la performance à l'échelle pancanadienne? Quels sont les défis conceptuels, méthodologiques et opérationnels?

Les enjeux de l'évaluation de la performance : Dépasser les mythes
Georges-Charles Thiébaut, François Champagne, André-Pierre Contandriopoulos / 39
L'évaluation de la performance est au cœur des débats contemporains sur la réforme des systèmes de santé. Peu de consensus existe cependant quant à la définition et la mesure de la performance. Cet article propose un modèle d'évaluation globale et intégrée mieux adapté aux besoins des décideurs.
Performance assessment is at the core of contemporary debates on Health systems reform. However, there is very little consensus about the definition and the measurement of performance. This article proposes a global and integrated assessment model that would be better suited to meet decision makers' needs.

L'appréciation de la performance du système de santé et des services sociaux du Québec : L'approche du Commissaire à la santé et au bien-être
Olivier Sossa, Isabelle Ganache / 63
Depuis 2009, un Commissaire évalue de façon globale et intégrée le système de santé et de services sociaux du Québec et rend compte de sa performance annuellement. Pour ce faire, le Commissaire se fonde sur un ensemble de renseignements provenant de diverses sources, des indicateurs de monitorage ou de résultats d'enquêtes. Comme cet article l'explique, cette méthode d'évaluation suppose un effort d'intégration plus qu'un simple exercice d'ordonnancement ou d'analyse de quelques indicateurs isolés.
Since 2009, a Commissioner assesses in a global and integrated way the Health and Social Services system in Quebec and accounts for its performance on an annual basis.
To do so, the Commissioner relies on a set of information from various sources, monitoring indicators or survey results. As explained in this article, this method of assessment implies an integrating effort, which is more than a simple scheduling or benchmarking exercise of some isolated indicators.

Volatilité dans l'utilisation des indicateurs de performance municipale : Bilan et nouvelle perspective d'analyse
Etienne Charbonneau, Gérard Divay, Damien Gardey / 89
Malgré l'attention croissante qui leur est accordée, les indicateurs de rendement ont une influence limitée sur la pratique des fonctionnaires et élus municipaux. Pourquoi en est-il ainsi? Cet article fait une revue de la littérature et développe un cadre d'analyse de l'interface politico-administrative pour répondre à cette question.
In spite of the increasing attention they are receiving, performance indicators have a limited impact on the practice of public servants and local elected officials. Why is it so? To answer this question, this article reviews the literature and develops an analytical framework of the political and administrative interface.

Performance management in a benchmarking regime: Quebec's Municipal Management Indicators
Etienne Charbonneau, François Bellavance / 110
What are the factors, whether controllable or uncontrollable, that account for the uses of performance measurement by municipal managers? What do senior managers indicate as the strongest predictors of performance management? This article analyses data from a survey of 321 municipalities in a provincially mandated, yet flexible, municipal benchmarking performance regime in Quebec.
Quels sont les facteurs, contrôlables ou incontrôlables, qui rendent compte de l'utilisation de la mesure de la performance par les administrateurs municipaux? Quels sont les plus forts indicateurs prévisionnels de gestion de la performance qu'indiquent les cadres supérieurs? Cet article analyse les données d'un sondage réalisé auprès de 321 municipalités dans un régime d'analyse comparative de la performance municipale au Québec, prescrit par la province mais souple cependant.

What metrics? On the utility of measuring the performance of policy research: An illustrative case and alternative from Employment and Social Development Canada
Edward Nason, Michael A. O'Neill / 138
Why have government policy research activities seldom been the object of performance measurement? Is it because existing models are rooted in outputs and outcomes, often at the expense of relationships and networks? Is a purpose-built model needed? The Sphere of Influence of Research Policy model is provided as an illustration.
Pourquoi les activités de recherche en politiques gouvernementales ont-elles rarement fait l'objet de mesure de la performance? Est-ce parce que les modèles actuels sont axés sur le rendement et les résultats, souvent au détriment des relations et des réseaux? Est-il nécessaire d'avoir un modèle conçu spécifiquement dans ce but? Le modèle Sphère d'influence en recherche des politiques est fourni à titre d'illustration.

Performance measurement in Canadian employment service delivery, 1996-2000
John Grundy / 161
Governmentality research sheds light on the discourses and techniques involved in managing the conduct of individuals and organizations. A central focus of this scholarship is the proliferation of managerial practices of measurement and monitoring that enable governance at a distance.
Can this approach help us to appraise how the Human Resource Development Canada’s Results-Based Accountability Framework was used to measure the results of employment services for the unemployed in the 1990s? La recherche sur la gouvernementalite fournit des informations sur les discours et les techniques qui interviennent dans la gestion de la conduite des personnes et des organismes. Cet article met essentiellement l’accent sur la proliferation des pratiques de gestion relatives a la mesure et au contr^ ole qui permettent la gouvernance a distance. Cette approche peut-elle nous aider a evaluer la manière dont le Cadre de responsabilisation axe sur les resultats de Developpement des ressources humaines Canada a ete utilise pour mesurer les resultats des services d’emploi pour les ch^ omeurs dans les annees 1990? Bringing accountability up to date with the realities of public sector management in the 21st century Burt Perrin/183 What are the shortcomings of traditional approaches to accountability? Why do these approaches persist when they are known to provide an inaccurate view of performance, inhibit improved performance, and lessen confidence in government? What might be a vision of accountability more in keeping with the realities of public sector management in the 21st century? Quelles sont les deficiences dans les façons traditionnelles d’aborder l’imputabilite? Pourquoi continue-t-on de recourir a ces approches lorsque l’on sait qu’elles fournissent une fausse representation de la performance, qu’elles emp^echent l’amelioration de la performance et reduisent la confiance envers le gouvernement? Pourrait-il y avoir une vision de l’imputabilite qui corresponde mieux aux realites de la gestion du secteur public au 21e siècle? Reviewers / Evaluateurs 204 CANADIAN PUBLIC ADMINISTRATION is the refereed scholarly publication of the Institute of Public Administration of Canada (IPAC). The Journal is committed to the examination of the structures, processes, outputs and outcomes of public policy and public management related to executive, legislative, judicial and quasi-judicial functions at all three levels of Canadian government. Articles must be empirical in their methodology and accessible to the non-technical reader. Published quarterly, the Journal focuses primarily on Canadian issues, but manuscripts are welcome if they compare Canadian institutions and public-sector institutions and practices with those in other countries, or if they examine matters in other countries or in international organizations that are of particular relevance to the public administration community in Canada. Authors’ new style guide – Manuscripts submitted for publication should not exceed 7,500 words and should be accompanied by a 100-word abstract and five pullout quotations from the text of the article. Authors are asked to limit the number of text citations, notes and references, the style guide for which can be found on the IPAC website (www.ipac.ca). Non-compliance with these requirements may result in delay in publication. Contributions to the ‘‘book review’’ and ‘‘research note’’ sections should not exceed 1,250 and 5,500 words, respectively. How to submit a manuscript – Canadian Public Administration prefers to receive all manuscript submissions electronically using Manuscript Central. To submit a manuscript, please go to the Journal’s Manuscript Central homepage (http://mc.manuscriptcentral.com/capa). Log in or click the ‘‘Create Account’’ option if you are a first-time user of Manuscript Central. 
After submitting your manuscript, you will receive a confirmation e-mail. You can also access Manuscript Central any time to check the status of your manuscript. The Journal will inform you by e-mail once a decision has been made.

Getting help with your submission – Each page of the Manuscript Central website has a "Get Help Now" icon connecting directly to the online support system at http://mchelp.manuscriptcentral.com/gethelpnow/contact.htm, and telephone support is available through the US ScholarOne support office at 888-503-1050. If you do not have Internet access or cannot submit online, the Editorial Office will help with online submissions. Please contact the Editorial Office by telephone or by e-mail at [email protected]. All manuscripts and editorial correspondence should be sent via electronic mail to the Editor at the Institute of Public Administration of Canada ([email protected]).

ADMINISTRATION PUBLIQUE DU CANADA est la revue savante, soumise à l'évaluation des pairs, de l'Institut d'administration publique du Canada (IAPC). La Revue s'engage à examiner les structures, procédés, rendements et résultats des politiques publiques et du management public concernant les fonctions administrative, législative, judiciaire et quasi-judiciaire aux trois paliers de gouvernement du Canada. Les articles doivent être empiriques dans leur méthodologie et accessibles au lecteur non spécialisé. La Revue, publiée une fois par trimestre, porte essentiellement sur des questions canadiennes; toutefois, les auteurs peuvent soumettre des manuscrits dans lesquels ils comparent les institutions et pratiques du secteur public canadien à celles d'autres pays, ou examinent des questions dans d'autres pays ou dans des organismes internationaux, qui ont une pertinence particulière pour le secteur de l'administration publique canadienne.

Guide de style – Les manuscrits soumis pour publication ne doivent pas dépasser 7 500 mots et doivent être accompagnés d'un sommaire de 100 mots et de cinq citations en exergue tirées du texte de l'article. Les auteurs sont priés de limiter l'usage qu'ils font des citations de texte, de notes et de références et de se tenir au Guide de style qui se trouve sur le site Web de l'IAPC (www.iapc.ca). Le non-respect de ces exigences pourrait entraîner des retards de publication. Les articles soumis pour les rubriques « Comptes rendus » et « Notes de recherche » ne doivent pas dépasser 1 250 mots et 5 500 mots respectivement.

Comment soumettre un manuscrit – La revue Administration publique du Canada préfère recevoir électroniquement toutes les soumissions de manuscrits par le biais de Manuscript Central. Pour soumettre un manuscrit, veuillez vous rendre à la page d'accueil de Manuscript Central pour la Revue (http://mc.manuscriptcentral.com/capa). Ouvrez une session ou cliquez sur l'option « Créer un compte » si vous utilisez Manuscript Central pour la première fois. Une fois votre soumission envoyée, vous recevrez une confirmation par voie électronique. Vous pouvez également accéder à Manuscript Central à n'importe quel moment pour vérifier le statut de votre manuscrit. La Revue vous informera par courriel de la décision qui a été prise.

Obtenir de l'aide avec votre soumission – Chaque page du site Internet de Manuscript Central a une icône « Obtenir de l'aide maintenant » qui est reliée directement au système de soutien en ligne à http://mchelp.manuscriptcentral.com/gethelpnow/contact.htm.
Le bureau de soutien US ScholarOne offre également de l'aide en téléphonant au 888-503-1050. Si vous n'avez pas accès à Internet ou si vous ne pouvez pas soumettre en ligne, le Bureau de la rédaction peut vous aider. Veuillez communiquer avec lui par téléphone ou courriel à [email protected].

Nathalie Dubois, Jim McDavid, Etienne Charbonneau, Jean-Louis Denis

Special issue on performance measurement, performance management, and accountability — Editors' introduction / Numéro spécial sur la mesure de la performance, gestion de la performance, et imputabilité — Introduction des rédacteurs

Performance measurement, performance management and public reporting have grown from New Public Management (NPM)-inspired beginnings in the 1990s¹ to being a near universal requirement in both the public and non-profit sectors. In Western countries, the movement to tie public management to performance objectives, targets, measurable indicators, and regular assessments of performance for both internal (managerial) and external organizational stakeholders is now a key part of orthodox public administration. In developing countries, where international aid organizations and multilateral donors offer assistance, aid is typically tied to instituting performance management systems with many of the same measurement and reporting expectations as are required in jurisdictions at all levels of government in Western countries.

In this special issue of Canadian Public Administration we canvass realizations of performance evaluation, offering both expository and critical perspectives that range from descriptions of function-specific frameworks to reviews of the whole field of performance management. The articles reflect the diversity of approaches to conceptualizing, measuring and analyzing program and organizational performance, as well as different epistemological assumptions about how to interpret performance information and assess its worth. Several of the articles can be linked, examining the same function across jurisdictions or the same jurisdictions from complementary perspectives. Other articles stand alone and bring forward the findings from decades of research that have tracked and assessed the movement to use performance results to improve accountability and performance. We briefly summarize each article and then highlight underlying themes that connect the papers to broader themes in the literature.

The first three articles offer complementary perspectives on performance measurement in the health field, which reflect a long-standing emphasis on science-based research and practice along with the use of evidence in making program and policy-related decisions. Health-related programs are increasingly costly to deliver, and in Canada there is evidence that health care expenditures are a growing percentage of provincial budgets (Evans 2010). Because health is a provincial responsibility in Canada, funded in part by the federal government (Kirby 2001), efforts to set standards, control costs and improve the efficiency and effectiveness of services have involved two levels of government.
In 1994, the Canadian Institute for Health Information (CIHI) was created by provincial and federal stakeholders as a source of comparative performance data for evidence-based policy and program decision-making across Canada. In "Health System Performance Reporting in Canada: Bridging Theory and Practice at a Pan-Canadian Level," Jeremy Veillard, Brenda Tipper and Sara Allin describe how CIHI built an interactive health system performance framework for measuring, comparing and explaining variations in performance at the national, regional (provinces and health regions) and institutional levels. Building on ten years of public reporting (2000-2014), the framework will help users build tailored comparisons of their own institutional performance against groups of matched peers. Veillard et al. offer examples of how "high performers" (institutions in the top decile) can be identified from the suite of performance indicators. They suggest that the system will be useful for improving both accountability and performance, with the latter influenced by the peer pressure that comes from comparing one's performance with a peer group. Veillard et al. also describe how in 2003, in response to the Romanow Commission on the Future of Health Care in Canada (Romanow 2002), the federal government and nine provinces agreed to create a national Health Council to support the dissemination of best practices and innovations by gathering performance-related information and making it available to provincial and sub-provincial jurisdictions. The Health Council operated as a repository and a forum for pan-Canadian discussions of health-related policy issues for ten years until it was disbanded by the Conservative government in 2013.

Quebec has created its own institution. The Commissioner of Health and Well-Being (CEBS) is mandated to measure health system performance, examine health policy issues, issue reports and serve as a source of information and ideas to influence health system policy and performance in Quebec. The paper by Georges Thiébaut, François Champagne and André-Pierre Contandriopoulos, "Les enjeux de l'évaluation de la performance : Dépasser les mythes," situates and describes the performance measurement framework used by the Quebec Commission of Health and Well-Being: the Comprehensive and Integrated Framework for Health System Performance Assessment. Thiébaut et al. critique the system, observing that measuring the performance of complex systems is intrinsically judgmental. They also point out that it is simplistic to assume, as many proponents of performance measurement have done, that such systems can be used to simultaneously improve accountability and improve performance. We return to this point in our summary of cross-cutting themes below.

In "L'appréciation de la performance du système de santé et des services sociaux du Québec : L'approche du Commissaire à la santé et au bien-être," Olivier Sossa and Isabelle Ganache focus more directly on how the CEBS Commissioner assesses existing health-related programs and policies and advocates for new ones. The Commissioner uses the EGIPSS framework to describe system and sub-system performance and to analyze the causal patterns among factors that comprise the complex health and well-being system in the province.
Sossa and Ganache point out that for the CEBS, performance assessment includes ethical and stakeholder dimensions, the latter involving experts and citizens in the review of performance results and of the performance framework itself. Periodic reporting is intended to influence the development and implementation of health-related public policies; for example, the 2012 report focused on mental health issues in Quebec and included an ethical dimension in assessing mental health-related policies and programs.

Etienne Charbonneau and François Bellavance, in "Performance Management in a Benchmarking Regime: Quebec's Municipal Management Indicators," offer an empirical analysis of the factors that might explain the extent to which municipal managers in Quebec perceive their local governments as using the 14 provincially mandated (since 2004) performance indicators. Charbonneau and Bellavance surveyed the population of general managers, receiving 321 responses. Their analysis includes four dependent variables: general purpose uses; management uses; budgeting uses; and external reporting uses. Predictors include measures of the local government's comparative performance rankings, perceived barriers to using performance information, indices of selected internal features of the local government, selected socio-demographic factors and a measure of political competition in the last mayoral election. Although very few of these factors are statistically significant, one factor that stands out is managerial willingness to use performance information: the less willing managers are, the less likely they are to report that their local government uses the 14 performance indicators.

The paper by Etienne Charbonneau, Gérard Divay, and Damien Gardey, "Volatilité dans l'utilisation des indicateurs de performance municipale," builds on the central finding of the previous paper: a manager's willingness to use the information. It develops a model to explain uses of performance information in local governments by focusing on the interface between managers and elected officials, who each bring their own knowledge, values, interests and emotions to these relationships. These factors can change over time. Because local government work is cyclical, Charbonneau et al. hypothesize that different uses – passive (reporting results as required by law), purposeful (used to improve program performance), political (used to communicate with the public) and perverse (performance results are manipulated so that they are biased) – will vary over time. Indeed, the real payoff is to move beyond snapshots and support dynamic research that measures and correlates these patterns over time.

Michael O'Neill and Eddy Nason's "What Metrics? On the Utility of Measuring the Performance of Policy Research: An Illustrative Case and Alternative from Employment and Social Development Canada" offers a conceptual model which addresses several problems in measuring the performance of policy research functions in governments. They focus on Employment and Social Development Canada and critique the default logic modelling approach suggested by Treasury Board as well as the conventional payback framework for measuring policy research performance.
Instead, O'Neill and Nason outline a Spheres of Influence for Research Performance (SIRP) model that relies on an empirical understanding of how research products influence actors in a nested network ranging from the researchers to stakeholders outside of government.

The paper by John Grundy, "Performance Measurement in Canadian Employment Service Delivery, 1996-2000," marks a transition in our issue from pragmatic solutions to measuring and reporting performance to more critical looks at this field. Grundy takes a historical look at the Employment Insurance Commission's transition from process and output measures to using high-stakes outcome indicators, and highlights the organizational impacts. Using a Foucaultian governmentality lens, he illuminates power relationships and the mechanisms of control (governing conduct at a distance) built into a New Public Management emphasis on the vertical alignment of performance measures and targets. His case shows us that when national headquarters mandated the use of two outcome performance measures (numbers of clients getting jobs and EI savings from clients returning to work), this generated a complex response from managers and front-line workers that was far more nuanced than a Foucaultian analysis would have predicted.

The special issue concludes with Burt Perrin's "Bringing Accountability Up to Date with the Realities of Public Sector Management in the 21st Century." He critically examines our current approach to accountability and the consequences of widespread efforts to fuse results-focused performance measurement systems with the traditional compliance-based approach to accountability central to parliamentary systems. These structures and processes, Perrin argues, do not strengthen accountability even when more and more accountability expectations are layered onto existing administrative systems. For Perrin, too much is expected from performance systems, particularly when a focus on measuring and reporting outcomes is mandated. Performance measurement is at best a work in progress, and it invites those involved to undermine the process of measuring and reporting results. Performance measurement can be useful, but realistically, performance results will rarely make it possible to pinpoint what (or who) was right or wrong in explaining why a pattern of results was observed. Perrin argues for a different approach to accountability: one that re-introduces trust into the relationship between public servants and their political masters and has the goal of stimulating an ongoing performance dialogue between administrators and policy-makers.

Cross-cutting themes

Performance measurement in the public and non-profit sectors is generally tied to the expectation that reporting performance results will improve accountability and, hence, improve program and organizational performance. Several of our papers (Veillard et al., Thiébaut et al., and Perrin) mention this issue. Thiébaut et al. compare these two rationales for performance measurement (Table 3), indicating that trying to achieve both may not be possible (see Myth 6 about simultaneously pursuing two objectives: accountability and continuous improvement of performance). Perrin goes further by suggesting that tying performance measurement to accountability for results will not only weaken accountability, particularly in high-stakes political contexts, but vitiate the uses of performance results for performance improvement.
We suggest this question for future research: under what conditions, if any, is it possible for performance measurement and reporting systems to both improve accountability and improve performance?

A second theme, related to the first, is the challenge of demonstrating that performance measurement systems and their products are actually used. Charbonneau and his colleagues identified four categories of uses and suggest that accountability and performance improvement uses are possible in local government contexts, but the empirical challenge is discovering what mix of "interface" factors produces what kinds of uses over time. Of interest as well is Charbonneau, Divay and Gardey's mention of perverse uses as one of the four types. Perrin elaborates on this latter use by pointing to research which suggests that requiring managers to collect and then report performance results engenders gaming behaviours. Although gaming is often attributed to managers wanting to avoid reporting negative results, politicians can use performance measurement systems to ensure that only positive performance stories are communicated publicly.

Our contributors offer different perspectives on the challenges of using performance information. Veillard, Tipper and Allin outline a tailored "open data" approach to facilitate users constructing their own performance reports – presumably this will encourage health-related institutions to compare their performance and, at least indirectly, use CIHI's performance measurement system. Sossa and Ganache point to strategies that the Commissioner of Health and Well-Being in Quebec has used to engage stakeholders so as to ensure the relevance of its performance-related products. But using performance measurement results continues to be a significant problem. Charbonneau and Bellavance's empirical study of Quebec municipalities suggests that most local government managers report little to no use of the performance information mandated by the provincial government. Perrin and others have pointed out that even though many of the performance reports that are prepared and conveyed are intended to influence political decision-making, there is scant evidence that this happens, particularly where government departments are mandated to produce reports. We see instead instances of performance reports tabled in a near-ritualistic manner. If they are intended to improve accountability, they do so primarily in symbolic ways.

In the evaluation field, which is similar to the policy research function that O'Neill and Nason describe, program evaluators have developed frameworks for classifying uses. In addition to describing uses for evaluation products, evaluators have looked at process uses – doing an evaluation itself can engender uses (Kirkhart 2000; Mark and Henry 2004). For researchers and practitioners looking at performance measurement systems, there would be value both conceptually and empirically in applying what program evaluators have learned from several decades of looking at the usefulness of evaluations.

Our final observation concerns the usefulness of performance results and the desire to (simultaneously) improve accountability and performance. Bouckaert (2005) refers to the "grand canyon" of public sector performance measurement – on one rim are program and organizational outputs and across the canyon are outcomes.
Credibly connecting outputs to outcomes is perhaps the most challenging part of measuring and managing performance. Usually, our approaches to measuring performance provide us (at best) with partial pictures of what organizations are accomplishing. The linear logic models used to build performance measurement systems do not typically acknowledge the complexity embedded in programs and policies. Attribution is a key challenge (Bovaird 2014). Performance measurement and performance reporting can play a valuable role in public management, but perhaps that role is more formative, focused on improving performance, than summative, focused on accountability for results.

Depuis leurs débuts inspirés par le courant du Nouveau management public (NPM) dans les années 1990¹, les mécanismes de mesure de la performance, de gestion de la performance et d'imputabilité se sont développés jusqu'à devenir une exigence quasi universelle tant dans le secteur public que dans celui des organismes à but non lucratif. Dans les pays occidentaux, le mouvement visant à associer la gestion publique à des objectifs de performance, à des cibles, à des indicateurs mesurables et à des évaluations régulières de la performance destinés aux différentes parties prenantes de l'organisation, tant à l'interne (gestion) qu'à l'externe, fait maintenant partie intégrante de l'administration publique orthodoxe. Dans les pays en voie de développement où interviennent des organismes d'aide internationaux et des donateurs multilatéraux, l'aide fournie est généralement subordonnée à la mise en place de systèmes d'évaluation de la performance. Les attentes en matière de mesure de la performance et de production de rapports sont très semblables à celles qui sont requises dans les juridictions et à tous les paliers de gouvernement des pays occidentaux.

Dans ce numéro spécial de la revue Administration publique du Canada, nous passons en revue plusieurs réalisations en matière d'évaluation de la performance à travers une série d'exposés et de comptes rendus critiques, allant de la description de cadres adaptés à une fonction spécifique jusqu'à l'examen de tout le champ de la gestion de la performance. Les articles de ce numéro reflètent la diversité des approches utilisées pour conceptualiser, mesurer et analyser la performance des programmes et des organisations, ainsi que les différents postulats épistémologiques qui les sous-tendent quant à la façon d'interpréter les données et d'évaluer leur valeur sous l'angle de la performance. Plusieurs de ces articles ont des liens communs, du fait qu'ils examinent soit la même fonction dans des juridictions diverses, soit la même juridiction à partir de points de vue complémentaires. D'autres articles sont autonomes et font état des résultats de plusieurs décennies de recherche consacrée à suivre et à évaluer le mouvement dans le but d'utiliser ces résultats pour améliorer l'imputabilité et la performance. Nous résumons brièvement chacun des articles constituant ce numéro spécial, puis soulignons les thèmes sous-jacents qui les relient à des thèmes plus généraux de la littérature.
8 NATHALIE DUBOIS, JIM MCDAVID, ETIENNE CHARBONNEAU, JEAN-LOUIS DENIS Les trois premiers articles offrent des points de vue complementaires sur la mesure de la performance dans le domaine de la sante, qui reflètent la valorisation de la recherche scientifique et de l’apport de la pratique et, par consequent, l’importance accordee aux donnees probantes pour la prise de decisions. Il faut dire que les programmes lies a la sante sont de plus en plus co^ uteux, et il a ete demontre qu’au Canada, les depenses de sante representent un pourcentage de plus en plus important du budget des provinces (Evans 2010). Au Canada, la sante depend de la juridiction provinciale, tout en etant en partie financee par le gouvernement federal (Kirby 2001), les efforts entrepris pour fixer des normes, contr^ oler les co^ uts et ameliorer l’efficacite et la productivite des services ont implique les deux paliers de gouvernement. En 1994, l’Institut canadien d’information sur la sante (ICIS) a ete cree par differents acteurs engages aux niveaux provincial et federal pour fournir des donnees comparatives sur la performance permettant de prendre des decisions sur les programmes et les politiques en s’appuyant sur des donnees probantes provenant de partout au Canada. Dans leur article, « Health System Performance Reporting in Canada: Bridging Theory and Practice at a Pan-Canadian Level », Jeremy Veillard, Brenda Tipper et Sara Allin decrivent le processus mis en place par l’ICIS pour b^ atir un cadre de mesure de la performance du système de sante pour la mesure, la comparaison et l’explication des differences de performance observees aux niveaux national, regional (provinces et regions sociosanitaires), et institutionnel. S’appuyant sur dix annees de rapports publics (2000-2014), l’approche actuelle aidera les utilisateurs a etablir, en fonction de leurs besoins, des comparaisons entre les performances de leur propre institution et celles de groupes d’institutions comparables. S’appuyant sur des exemples, l’article de Veillard et coll. montre comment identifier les institutions « ultraperformantes » (institutions qui se situent dans le decile superieur) a partir d’une serie d’indicateurs de performance. Ils suggèrent que le système propose sera utile pour ameliorer aussi bien l’imputabilite que la performance des organisations. Selon les auteurs, le niveau de performance d’une institution devrait ^etre influence par la pression exercee par la comparaison des resultats de l’institution avec ceux d’institutions comparables. En 2003, a la suite de la Commission Romanow sur l’avenir du système de sante au Canada (Romanow 2002), le gouvernement federal et neuf provinces se sont mis d’accord sur la creation d’un Conseil national de la sante qui favoriserait la diffusion des exemples de pratique exemplaire et d’innovation en colligeant les donnees concernant la performance et en les mettant a la disposition des juridictions provinciales et infraprovinciales. Pendant dix ans, le Conseil national de la sante a servi a la fois de referentiel de donnees et de forum de discussion pancanadien avant d’^etre demantele par le gouvernement conservateur en 2013. SPECIAL ISSUE ON PERFORMANCE MEASUREMENT, MANAGEMENT AND ACCOUNTABILITY 9 Le Quebec a choisi de creer sa propre institution. 
Le Commissaire a la sante et au bien-^etre (CSBE) a le mandat de mesurer la performance du système de sante, d’examiner les problèmes concernant les politiques de sante, de produire des rapports et d’^etre une source d’information et de recommandations visant l’amelioration des politiques et de la performance du système de sante au Quebec. L’article de Georges-Charles Thiebaut, François Champagne et AndrePierre Contandriopoulos, « Les enjeux de l’evaluation de la performance : Depasser les mythes », situe et decrit le cadre de mesure de la perfor mance EGIPSS (Evaluation globale de la performance des systèmes de services de sante). Thiebaut et coll. examinent et critiquent les approches d’evaluation de la performance puis donnent un aperçu du cadre EGIPSS. Leur article demontre que mesurer la performance de systèmes complexes relève intrinsèquement du jugement. Ils soulignent aussi qu’il est simpliste d’assumer – comme l’ont fait de nombreux defenseurs de la mesure de la performance – que ces systèmes puissent ^etre utilises pour ameliorer a la fois la performance et l’imputabilite. Il s’agit d’une question sur laquelle nous reviendrons dans notre recapitulation des thèmes qui traversent les differents articles de ce numero special. Olivier Sossa et Isabelle Ganache se concentrent, eux, directement sur le travail du CSBE au Quebec. Leur article, « L’appreciation de la performance du système de sante et des services sociaux du Quebec : L’approche du Commissaire a la sante et au bien-^etre », se concentre directement sur le cadre EGIPSS dans le contexte du travail accompli par le commissaire pour evaluer les programmes et les politiques relies aux problèmes de sante, ainsi que pour defendre de nouvelles politiques. Le Commissaire utilise le cadre EGIPSS pour decrire la performance du système et des sous-systèmes et pour identifier les modèles causaux a travers les differents facteurs dont depend le système complexe de la sante et du bien-^etre dans la province. Sossa et Ganache soulignent que pour le CSBE, l’evaluation de la performance comprend des dimensions ethique et participative, cette dernière couvrant la participation tant d’experts que de citoyens a l’examen des mesures de performance et du cadre de mesure lui-m^eme. La publication de rapports periodiques a pour but d’influencer l’elaboration et la mise en œuvre des politiques publiques. Par exemple, dans un rapport de 2012 centre sur les questions de sante mentale au Quebec, il a fallu prendre en compte la dimension ethique pour l’evaluation des politiques et des programmes afferents. Dans « Performance Management in a Benchmarking Regime: Que bec’s Municipal Management Indicators », Etienne Charbonneau et François Bellavance font une analyse empirique des facteurs susceptibles d’expliquer l’adhesion des dirigeants municipaux du Quebec a un exercice de mesure de la performance auquel les a contraint le gouvernement 10 NATHALIE DUBOIS, JIM MCDAVID, ETIENNE CHARBONNEAU, JEAN-LOUIS DENIS provincial. L’exercice consiste a utiliser 14 indicateurs predetermines. Sur la base d’un questionnaire autoadministre, Charbonneau et Bellavance ont obtenu l’avis de 321 directeurs generaux. L’analyse des donnees inclut quatre variables dependantes : l’utilisation generale, l’utilisation pour la gestion, l’utilisation pour l’etablissement du budget et l’utilisation pour la redaction de rapports externes. 
Parmi les variables predictives, les auteurs retiennent les mesures du classement comparatif de la performance de l’administration locale, les limites perçues a l’utilisation des donnees sur la performance, les indices de certaines caracteristiques internes de l’administration locale, certains facteurs sociodemographiques et le degre de realisations des engagements politiques annonces lors de la dernière election municipale. Quoique très peu de ces facteurs soient significatifs du point de vue statistique, un des facteurs importants du modèle d’analyse est la volonte des dirigeants de se servir des resultats : moins ils sont desireux d’y avoir recours, moins il y a de chances qu’ils se servent de l’analyse des 14 indicateurs pour faire rapport de la performance de leur administration publique. L’article d’Etienne Charbonneau, Gerard Divay et Damien Gardey, « Volatilite dans l’utilisation des indicateurs de performance municipale », developpe le concept central de l’article precedent : la volonte des directeurs d’utiliser l’information. Il elabore un modèle permettant d’expliquer l’utilisation de ces renseignements au niveau des administrations locales en se concentrant sur l’interface entre les directeurs et les elus, qui investissent leurs propres connaissances, valeurs, inter^ets et emotions. Ces facteurs peuvent changer avec le temps. Et, comme le travail des administrations locales est cyclique, Charbonneau et coll. font l’hypothèse que les differentes utilisations – passive (transmission des resultats exiges par la loi), resolue (resultats utilises pour ameliorer la performance des programmes), politique (resultats utilises pour communiquer avec le public) et perverse (resultats manipules et donc biaises) – varient avec le temps. Toutefois, le principal enjeu est de mettre en place une recherche dynamique qui mesure ces cadres a travers le temps et les met en correlation. L’article de Michael O’Neill et Eddy Nason, « What Metrics? On the Utility of Measuring the Performance of Policy Research: An Illustrative Case and Alternative from Employment and Social Development Canada » presente un modèle conceptuel qui repond a plusieurs des problèmes qui se posent lorsqu’on mesure la performance des fonctions de recherche sur les politiques dans les gouvernements. Ils se concentrent sur Emploi et Developpement social Canada et critiquent l’approche de modelisation basee sur le raisonnement par defaut suggere par le Conseil du Tresor ainsi que le cadre conventionnel de la rentabilite pour mesurer la performance de la recherche sur les politiques. O’Neill et SPECIAL ISSUE ON PERFORMANCE MEASUREMENT, MANAGEMENT AND ACCOUNTABILITY 11 Nason proposent plut^ ot le modèle SIRP (Spheres of Influence for Research Performance) qui repose sur une comprehension empirique de la façon dont les resultats de recherche peuvent influencer des agents dans un reseau imbrique allant des chercheurs jusqu’aux parties prenantes exterieures au gouvernement. L’article de John Grundy, « Performance Measurement in Canadian Employment Service Delivery, 1996-2000 » marque une transition entre l’etude de solutions pragmatiques pour la mesure de la performance et la production des rapports et l’adoption d’un point de vue plus critique sur ce champ de recherche. 
Grundy adopte un point de vue historique sur la façon dont on est passe, a l’interieur de la Commission de l’assuranceemploi du Canada, de la mesure des processus et des resultats directs a l’utilisation d’indicateurs portant sur des effets de plus large portee, mettant en relief les impacts organisationnels. En utilisant la notion foucaldienne de gouvernementalite, il met en lumière les relations de pouvoir et les mecanismes de contr^ ole qui font partie integrante du Nouveau management public concernant l’alignement vertical des mesures et des cibles de performance. Le cas etudie montre que, quand l’administration centrale a exige l’utilisation de deux indicateurs de performance (nombre de clients obtenant un travail et economie sur l’assurance-emploi lorsque des clients retrouvent un travail), cela a suscite, de la part des dirigeants et des travailleurs de première ligne, une reponse complexe beaucoup plus nuancee qu’une analyse foucaldienne aurait pu le predire. Le dernier article de ce numero special, par Burt Perrin, « Bringing Accountability Up to Date with the Realities of Public Sector Management in the 21st Century », examine de façon critique notre approche actuelle de l’imputabilite et les consequences des efforts considerables employes a fusionner des systèmes de mesure de la performance axes sur les resultats et l’approche traditionnelle, basee sur l’observance de règles, qui est encore centrale dans les systèmes parlementaires. Selon Perrin, ces structures et processus ne renforcent pas l’imputabilite m^eme si l’on a introduit de plus en plus d’attentes quant a l’imputabilite a tous les niveaux des systèmes administratifs. L’on attend trop des systèmes de performance, surtout quand on demande de se concentrer sur la mesure et la transmission de donnees concernant les resultats. La mesure de la performance est au mieux un travail en cours, et il pousse les personnes concernees a miner le processus de mesure et de production de rapports. La mesure de la performance peut ^etre utile, mais il faut ^etre realiste : les resultats de ces mesures permettront rarement de mettre le doigt sur le facteur (ou la personne) dont l’action pourrait expliquer l’observation d’une configuration de resultats donnee. Perrin plaide pour une approche differente de l’imputabilite qui reintroduirait la confiance dans la relation entre les fonctionnaires et les chefs politiques. 12 NATHALIE DUBOIS, JIM MCDAVID, ETIENNE CHARBONNEAU, JEAN-LOUIS DENIS Thèmes transversaux L’etude de la performance dans le secteur public et dans les organismes a but non lucratif est generalement liee a la croyance qu’en collectant et rapportant les resultats de performance, on puisse accro^ıtre l’imputabilite et, donc, ameliorer la performance des programmes et des organisations. Plusieurs des articles ici publies (Veillard et coll., Thiebaut et coll., et Perrin) abordent cette question. En comparant ces deux justifications de la mesure de la performance (Tableau 3), Thiebaut et coll. indiquent que tenter de les atteindre conjointement pourrait bien ^etre impossible. En reference a l’article, il est enonce comme sixième mythe que l’on pourrait poursuivre simultanement les deux objectifs : imputabilite et amelioration continue de la performance. 
Perrin va plus loin en suggerant que lier la mesure de la performance a l’imputabilite quant aux resultats n’affaiblira pas seulement l’imputabilite, en particulier dans les contextes o u les enjeux politiques sont importants, mais risque aussi de vicier l’utilisation des resultats pour ameliorer la performance. Voici donc la question que nous posons pour une recherche ulterieure : dans quelles conditions est-il possible, le cas echeant, que les systèmes de mesure de la performance et de production de rapports ameliorent a la fois l’imputabilite et la performance? Un second thème, relie au premier, est la difficulte de demontrer que les systèmes de mesure de la performance et les resultats qu’ils produisent sont reellement utilises. Charbonneau et ses collègues ont identifie quatre categories d’utilisation et suggèrent que celles axees sur l’imputabilite et l’amelioration de la performance sont possibles dans des contextes locaux. Le defi au niveau empirique etant de decouvrir quelle combinaison de facteurs « d’interface » produit telle ou telle sorte signaler aussi, la mention par Charbonneau, d’usage au fil du temps. A Divay et Gardey d’usages pervers comme un des quatre types. Perrin elabore sur ce dernier type en soulignant des recherches permettant de penser que demander aux directeurs de recueillir puis de faire rapport des resultats de performance engendre des comportements visant a travestir les resultats. Bien que ce genre de comportement soit souvent attribue a des directeurs qui veulent eviter de faire etat de resultats negatifs, les politiciens aussi peuvent utiliser les systèmes de mesure de la performance pour s’assurer que seules les histoires de performances positives soient communiquees publiquement. Nos contributeurs presentent differents enjeux lies a l’utilisation des resultats de performance. Veillard, Tipper et Allin decrivent une approche ouverte et adaptee pour faciliter la construction par les utilisateurs de leurs propres rapports de performance – ce qui devrait supposement inciter les institutions liees a la sante a comparer leurs performances et, au moins indirectement, a utiliser le système de mesure de la performance de l’ICIS. Sossa et SPECIAL ISSUE ON PERFORMANCE MEASUREMENT, MANAGEMENT AND ACCOUNTABILITY 13 Ganache soulignent les strategies utilisees par le Commissaire a la sante et au bien-^etre pour collaborer avec les parties concernees et garantir la pertinence des resultats lies a la performance. Mais l’utilisation des resultats de la mesure de la performance continue d’^etre un enjeu important. L’etude empirique proposee par Charbonneau et Bellavance concernant les municipalites du Quebec permet de penser que la plupart des directeurs d’administrations locales n’utilisent que très peu – ou pas du tout – les informations collectees et analysees dans le cadre de l’exercice exige par le gouvernement provincial. Comme d’autres chercheurs, Perrin a souligne que m^eme si la plupart des rapports de performance qui sont prepares et envoyes ont pour but d’influencer la prise de decisions politiques, il y a peu de donnees permettant d’appuyer cette hypothèse, notamment quand les ministères sont obliges de produire ces rapports. On voit au contraire des cas de rapports sur la performance tablettes de façon quasi-rituelle. S’ils sont censes ameliorer l’imputabilite, ils le font donc surtout de façon symbolique. 
Dans le domaine de l’evaluation, similaire a la fonction de recherche sur les politiques que decrivent O’Neill et Nason, les evaluateurs de programme ont elabore des cadres pour classer les utilisations. En plus de decrire les utilisations des resultats d’evaluation, ils ont examine les utilisations de processus : le fait m^eme de produire une evaluation peut en effet engendrer des utilisations (Kirkhart 2000; Mark et Henry 2004). Pour les chercheurs et les praticiens qui s’interessent aux systèmes de mesure de la performance, il pourrait ^etre interessant, tant d’un point de vue conceptuel qu’empirique, d’appliquer ce que les evaluateurs de programme ont appris a conna^ıtre en verifiant l’utilite des evaluations pendant plusieurs decennies. Notre dernière observation concerne l’utilite des resultats de performance et le desir d’ameliorer simultanement l’imputabilite et la performance. Bouckaert (2005) parle du « grand canyon » de la mesure de la performance dans le secteur public : d’un c^ ote, se trouvent les resultats directs des programmes et des organismes (extrants), et de l’autre, les effets a court ou long terme (resultats). Relier les extrants aux resultats est peut-^etre la partie la plus difficile de la mesure de la performance. En general, nos approches de la mesure de la performance nous donnent (au mieux) un tableau partiel de ce que les organismes accomplissent. Les modèles de logique lineaire utilises pour b^atir les systèmes de mesure de la performance ne peuvent pas refleter la complexite intrinsèque des programmes et des politiques. L’attribution est l’un des principaux defis (Bovaird 2014). La mesure de la performance et la communication des donnees sur celle-ci peuvent jouer un r^ ole positif dans l’administration publique, mais il se pourrait que ce dernier soit plus formatif (centre sur l’amelioration de la performance) que sommatif (centre sur l’imputabilite). 14 NATHALIE DUBOIS, JIM MCDAVID, ETIENNE CHARBONNEAU, JEAN-LOUIS DENIS Note Performance measurement in local governments began in American cities in the early 1900s and was tied to the good-government Progressive Movement (Williams 2003). New Public Management advocates (Osborne and Gaebler 1993) made results-focused performance measurement a part of a broad movement to introduce business-like practices in the public sector. La mesure de la performance dans les administrations locales a vu le jour dans les villes americaines au debut des annees 1900, en liaison avec le Good-government Progressive Movement (Williams 2003). Les avocats du Nouveau Management Public (Osborne et Gaebler 1993) avaient fait de la mesure de la performance centree sur les resultats un des elements d’un large programme visant a introduire des pratiques entrepreneuriales dans le secteur public. References Bovaird, T. 2014. “Attributing outcomes to social policy interventions – ‘Gold standard’ or ‘fool’s gold’ in public policy and management?” Social Policy & Administration 48 (1): 1–23. Evans, R. 2010. The “unsustainability Myth” don’t believe claims that medicare is becoming unaffordable. Ottawa, ON: Canadian Centre for Policy Alternatives. Retrieved December [website]. Available at https://www.policyalternatives.ca/publications/monitor/unsustainabilitymyth. Kirby, M. 2001. The Health of Canadians – The Federal Role: Interim Report. The Standing Senate Committed on Social Affairs, Science and Technology. Ottawa, ON: The Parliament of Canada. Kirkhart, K. E. 2000. 
"Reconceptualizing evaluation use: An integrated theory of influence." New Directions for Evaluation 88 (1): 5–23.
Mark, M. M., and G. T. Henry. 2004. "The mechanisms and outcomes of evaluation influence." Evaluation 10 (1): 35–57.
Osborne, D., and T. Gaebler. 1993. Reinventing Government: How the Entrepreneurial Spirit is Transforming Government. New York, NY: Addison-Wesley Publishing Company.
Romanow, R. 2002. Building on Values: The Future of Health Care in Canada: Final Report. Ottawa, ON: Commission on the Future of Health Care in Canada.
Williams, D. 2003. "Measuring government in the early twentieth century." Public Administration Review 63 (6): 643–59.
Jeremy Veillard, Brenda Tipper, Sara Allin
Health system performance reporting in Canada: Bridging theory and practice at pan-Canadian level
Abstract: Public reporting is increasingly used to enhance accountability and transparency and stimulate performance improvement in the public sector. In Canada, performance reporting in the health sector is still in development, and involves a large number of actors. This article reports on the results of a recent intervention by the Canadian Institute for Health Information (CIHI) to develop a platform for pan-Canadian performance reporting (http://www.yourhealthsystem.cihi.ca). It describes approaches taken to: develop a conceptual framework; engage the public in the definition of performance reporting priorities; and select indicators for public reporting. This article also discusses conceptual, methodological and operational challenges as well as a proposed evaluation strategy.
Sommaire : Public reporting is increasingly being used to enhance accountability and transparency and to stimulate performance improvement in the public sector. In Canada, performance reporting in the health sector is still under development and involves a large number of actors. This article presents the results of a recent intervention by the Canadian Institute for Health Information (CIHI) to develop a platform for pan-Canadian performance reporting (http://www.yourhealthsystem.cihi.ca). It describes the approaches followed to develop a conceptual framework, to engage the public in defining priorities for public performance reporting, and to select indicators for public reporting. The article also examines conceptual, methodological and operational challenges, as well as a proposed evaluation strategy.
Introduction
In the health sector, numerous countries are releasing regular performance reports with an increased emphasis on outcomes and value for money. In unitary systems, provisions for public reporting include, for example, annual Quality Accounts for all health care organizations in England (Department of Health 2011). In federal systems, the Patient Protection and Affordable Care Act in the United States (United States of America Jeremy Veillard is Vice-President, Research and Analysis, Canadian Institute for Health Information (CIHI), Toronto. Brenda Tipper is Senior Program Consultant, Division of Research and Analysis, CIHI. Sara Allin is Senior Researcher, Division of Research and Analysis, CIHI. CANADIAN PUBLIC ADMINISTRATION / ADMINISTRATION PUBLIQUE DU CANADA VOLUME 58, NO. 1 (MARCH/MARS 2015), PP.
15–38 C The Institute of Public Administration of Canada/L’Institut d’administration publique du Canada 2015 V 16 JEREMY VEILLARD, BRENDA TIPPER, SARA ALLIN Congress 2010) mandates quarterly public reporting of performance information by institutions caring for Medicare patients, while in Australia new legislation mandates quarterly and annual reporting on health system performance (Council of Australian Governments 2011). In general, the growth in public reporting initiatives on health system performance is driven by a growing interest in promoting transparency and accountability, and a belief that publicly reporting performance indicators will lead to performance improvement. This belief is very much anchored in the literature on new public sector management (Groot and Budding 2008). The performance measurement literature identifies four pathways to improvement through public reporting (Berwick, James and Coye 2003; Hibbard, Stockard and Tusler 2003; Hibbard 2008): the change pathway (providers of services use comparative information to improve performance); the selection pathway (health system users apply comparative information to change care consumption from poor to good performers); pay-forperformance (providers who achieve standards or targets receive financial rewards); and reputational damage (providers who perform poorly suffer damage to their public reputation from regular public reports). These four pathways are consistent with the public sector literature on performance management, which classifies performance improvement pathways depending on their internal or external source of control and on the supportive or punitive actions derived from the controls (Boland and Fowler 2000; Veillard et al. 2005). In Canada, the health system performance reporting agenda in the health sector is still largely under construction. It includes multiple players at all levels from national organizations to provincial health quality councils (Health Council of Canada 2012), leading to a situation described as “indicator chaos” (Saskatchewan Health Quality Council 2011). Strikingly, besides the general objective of greater transparency and accountability, the objectives and incentives related to public reporting initiatives in the health sector are often unclear or unspecified. The Canadian Institute for Health Information (CIHI) has the mandate to lead the development and maintenance of comprehensive and integrated health information that enables sound policy and effective health system management that improve health and health care (Canadian Institute for Health Information 2012). In particular, CIHI has reported regularly on pan-Canadian health system performance comparisons at various levels (regional health indicators, wait times, hospital level measures such as the Hospital Standardized Mortality Ratio) in the last fifteen years (Canadian Institute for Health Information 2007; Canadian Institute for Health Information 2013a, 2013b). In 2012, CIHI launched a new intervention focusing its public performance reporting efforts on a smaller number of cascading measures HEALTH SYSTEM PERFORMANCE IN CANADA 17 determined by a clarified health system performance framework and implementing a number of related initiatives supporting the performance improvement efforts of Canadian jurisdictions. 
This performance reporting initiative pursues two concomitant objectives: first, to enhance accountability and transparency about health system performance; and second, to stimulate and support performance improvement efforts of provincial and territorial governments. Specifically, this initiative aims to: stimulate performance improvement by reporting publicly on a small number of indicators aligned with priorities of the general public and of Canadian jurisdictions; focus public reporting instruments on the information needs of well-segmented audiences defined through various engagement mechanisms; and implement complementary analytical, research and capacity building initiatives supporting the performance improvement efforts of jurisdictions. This initiative is composed of five streams of work: (i) the development of a unifying health system performance measurement framework, aligned with current transformation strategies of jurisdictions; (ii) a set of integrated interactive performance websites, updated periodically when data becomes available; (iii) analytical instruments allowing policy makers, health system managers and analysts to carry out peer comparisons and deeper analyses of performance drivers in an environment respectful of privacy; (iv) a research agenda aligned with performance improvement priorities of provinces and territories; and (v) complementary activities helping build capacities of health system managers to use performance measurement and data for performance improvement. The first deliverable for this initiative was the public release in November 2013 of an interactive website dedicated to the general public and available at: http://www.yourhealthsystem.cihi.ca. The performance website interactively presents results for fifteen core performance indicators along the themes of access to care; quality of care; health system outcomes; value for money; and health promotion and disease prevention. All results can be shared easily through a social media platform using most popular social media tools. This paper has three objectives: first to present the methods used to develop and deliver the public performance reports related to this intervention; second, to report the results related to: (1) the definition of priorities of the general public for health sector performance reporting; (2) a national health system performance framework; (3) indicators selected to report publicly on system performance at various levels (national and provincial, regional, hospital, long-term care homes); and (4) the identification of top performers for the indicators already released; and third, to discuss the challenges related to the design and implementation of this intervention as well as evaluation methods in use to measure the impact of this intervention. 18 JEREMY VEILLARD, BRENDA TIPPER, SARA ALLIN Methods Conceptual framework development In order to achieve the objectives of comprehensiveness in performance reporting and parsimony in the selection of performance indicators, an initial step in this initiative was to develop a health system performance measurement framework that would take into consideration the evolving performance information needs of its various users (defined in this case as the general public, health system policy makers, and health system managers). This framework was built on the current state of scientific knowledge and offers an analytical and interpretative framework, which had to be theoretically justified and actionable for performance improvement. 
The desirable characteristics of a sound conceptual framework were defined as the following: Comprehensive: the framework should incorporate a wide range of performance dimensions that are clearly positioned within the boundaries of health systems Integrated: the framework should include various models and different theoretical and disciplinary perspectives on performance Theoretically justified: the choice of performance dimensions should be built on robust theoretical foundations Actionable: the framework should be designed to be amenable to action and show the expected causal relationships between its performance dimensions Strategically aligned: the framework should reflect health system improvement priorities of jurisdictions while keeping within its theoretical foundations (Champagne et al. 2005). The steps in the development of the conceptual framework supporting this initiative were the following: 1. Review existing international frameworks for health system performance reporting (Murray and Frank 2000; Hurst and Jee-Hughes 2001; Commonwealth Fund 2006; Kelley and Hurst 2006). 2. Review literature and evidence on organizational effectiveness and health system quality improvement reporting (Kaplan and Norton 1996; Veillard et al. 2010; Champagne et al. 2005; Institute of Medicine 2001; Stiefel and Nolan 2012; Champagne and Guisset 2005). 3. Develop a first draft of the health system performance framework, followed by review and discussion within CIHI. 4. Share first draft of performance framework with key stakeholders and national and international experts. HEALTH SYSTEM PERFORMANCE IN CANADA 19 5. Revise the first draft based on feedback to develop a proposed health system performance framework and related technical report for general review. 6. Post the framework and technical report on CIHI’s website for general comments and feedback. 7. Revise proposed framework to develop the final version of the health system performance framework presented in this document. A discussion of the process of developing the conceptual framework and the references that supported this work can be found elsewhere (Canadian Institute for Health Information 2013d). Identifying performance reporting priorities for the general public The objective of comprehensiveness in public reporting was important for performance management purposes and key to meeting the information needs of policy makers and system managers at various levels of the health system. However, the priorities and interests of the general public were less clear at the outset. Therefore, CIHI engaged directly with the general public in order to identify performance reporting priorities for Canadians (Canadian Institute for Health Information 2013c). We used a complementary approach of in-person dialogues with small groups of randomly recruited participants in urban and rural settings across Canada (Newfoundland and Labrador, New Brunswick, Ontario, Saskatchewan and British Columbia) and an online consultation mechanism of a representative sample of 3069 Canadians from all provinces and territories excluding Quebec (the government of Quebec elected to not participate in this phase of the project). This approach allowed us to hear a broad cross-section of views online, as well as to explore the subject matter in more depth and time through the in-person dialogues. 
Therefore, CIHI engaged directly with the general public in order to identify performance reporting priorities for Canadians The key engagement tool for both online and in-person consultations used fictional scenarios and accessible language to present participants with a set of measures from which to identify their priorities. Participants were asked to consider four inter-related performance areas as defined in a simplified version of the performance framework described 20 JEREMY VEILLARD, BRENDA TIPPER, SARA ALLIN in Results: health system outcomes, social determinants of health, health system outputs, and health system inputs and characteristics. At each of the 3.5-hour in-person dialogues, we engaged diverse groups of local residents about the health system performance measures of greatest interest to them. To foster deliberation and informed participation, each participant was given a Conversation Guide that presented them with the same learning information as the online instrument, and asked them to identify their priorities using the same question formats to allow direct comparison across engagement channels. Our research methodology used electronic voting keypads to generate quantitative data on participants’ views. We partnered with EKOS Research Associates’ ProbIt online panel to engage a randomized, representative sample of Canadians in the online instrument and in-person dialogues. There was an over-representation from provinces with smaller populations to allow us to identify any potential differences based on geographic location. Also, the sample was slightly older than the Canadian population so results were weighted to better represent the general population. Five in-person dialogues, with a total of 100 participants, were held across Canada in order to achieve geographical representation from the western, central and eastern provinces (British Columbia, Saskatchewan, Ontario, New Brunswick, and Newfoundland). In order to include French speakers, the online instrument was also available in French and an in-person dialogue was held in French in New Brunswick. Indicator selection Once the themes of interest to the public and their relationship to the health system performance framework had been determined, we proceeded to use these as the basis for selecting performance indicators for reporting. The objective was to align the indicators with the results from the public engagement work, and to satisfy the criteria of reliability, validity and importance. In addition, the indicators had to be readily reportable, ideally with historical results and available at the health region level; and they had to be presentable in a way that the general public could understand. To narrow down to a parsimonious number of indicators offering strong face validity from a system performance perspective, we selected a panel of international and national experts and information users based on their experience in indicator selection, technical knowledge of the various performance domains covered by the framework, and use of performance information for system management and decision-making. We applied a modified Delphi panel technique through the various steps of an indicator selection process described below. HEALTH SYSTEM PERFORMANCE IN CANADA 21 The steps in the indicator selection process were: 1. 
Development of an inventory of existing performance indicators available at a pan-Canadian level and review of preliminary inventory against review criteria of validity and reliability, feasibility, and importance to the public. From these 6001 indicators, a working list of close to 150 indicators, categorized into five themes based on public engagement results was developed. 2. The working list was reviewed independently by three external experts (a senior health ministry analytics expert, an expert in knowledge translation, health policy and citizen engagement, and a regional medical officer of health with expertise in population health indicators). Based on their feedback, a short list of 50 indicators was developed for consideration by an external expert group composed of national and international experts. 3. A summary descriptive sheet was developed for each of the 50 preselected indicators and gathered information on the definitions of numerator, denominator and excluded cases, the validity and reliability of the indicator, existing data collection mechanisms, and main references to the scientific literature supporting the use of this indicator. This background material was used by the members of the expert advisory group to complete an on-line survey in which experts were asked to rate the indicators on criteria of reliability, validity, importance, and understandability on a five-point Likert scale; and to identify the top 15 indicators they would include in the public report. 4. Results of the survey were summarized and used as the starting point for a face-to-face consensus meeting of the expert group to recommend a list of 10-15 indicators to be publicly reported. 5. The recommendations from the consensus meeting were then reviewed internally by CIHI senior management and endorsed formally by the board of directors of CIHI. A similar process was used to select an expanded suite of indicators for public reporting on regional health authorities, hospitals, and longterm care facilities. Identification of top performers One of the objectives of publicly reporting on health system performance was to stimulate performance improvement through peer pressure by identifying top performers, where comparisons could be made and 22 JEREMY VEILLARD, BRENDA TIPPER, SARA ALLIN where results were available. Two criteria were used to define top performers: 1. The confidence interval around the region or facility result should place it in the top decile of results for the current and previous two time periods 2. The top performers’ results should be statistically significantly better than the Canadian average result. These two criteria ensured that only consistently high (over the past three measured time periods) and significantly better results were identified as top results. This methodology was applied across various measurement levels of the CIHI health system performance website and therefore applied to health regions, hospitals (first release in November 2013, second release in November 2014) and long-term care facilities (to be released in June 2015). For hospital results, top performers were identified within each of four hospital peer groups (teaching; and large, medium and small community hospitals). 
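To make the two top-performer criteria described above concrete, the following is a minimal sketch in Python, an illustration rather than CIHI's actual code. It assumes a "higher is better" indicator (the comparison would be reversed for rates such as readmissions), approximates the significance test against the Canadian average by comparing confidence intervals, and uses hypothetical data structures throughout.

    # Illustrative sketch (not CIHI's implementation) of the two top-performer
    # rules: the result's confidence interval must sit in the top decile for the
    # current and previous two periods, and the result must be significantly
    # better than the national average. Assumes "higher is better".
    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class PeriodResult:
        estimate: float   # point estimate for the region or facility
        ci_low: float     # lower bound of the 95% confidence interval
        ci_high: float    # upper bound of the 95% confidence interval

    def top_decile_cutoff(estimates: List[float]) -> float:
        """Value separating the top 10% of comparable results (higher = better)."""
        ranked = sorted(estimates)
        return ranked[int(0.9 * (len(ranked) - 1))]

    def is_top_performer(history: List[PeriodResult],
                         peers_by_period: List[List[float]],
                         national_avg_ci: List[Tuple[float, float]]) -> bool:
        """history: this region's last three periods (oldest to newest).
        peers_by_period: all comparable results for each of those periods.
        national_avg_ci: (low, high) CI of the national average per period."""
        for result, peers, (_, nat_high) in zip(history, peers_by_period, national_avg_ci):
            # Criterion 1: the whole confidence interval lies in the top decile.
            if result.ci_low < top_decile_cutoff(peers):
                return False
            # Criterion 2: significantly better than the Canadian average,
            # approximated here by non-overlapping confidence intervals.
            if result.ci_low <= nat_high:
                return False
        return True

In this sketch, failing either criterion in any of the three periods disqualifies the region or facility, which mirrors the requirement that only consistently high and significantly better results be flagged.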
It was also important to the objectives of the reporting initiative, that people visiting the website be able to tell, at a glance, how a particular province, health region or hospital was performing overall, and with respect to a particular quadrant or dimension of the performance framework. To achieve this, dashboards, displaying all available results for a selected province, health region or facility were developed. The indicators and results shown on the dashboards were on one page, organized according to the performance measurement framework, for a province, health region or hospital. Use of color-coding and symbols indicated how the results compared to the national average (significantly better, the same as, or significantly worse) and whether the trend over time had been improving, staying the same or getting worse. Addressing the importance of context A particular challenge was the inclusion of contextual information in the interactive website http://www.yourhealthsystem.cihi.ca to facilitate the understanding of complex performance information. We used content matter experts for each indicator to gather key facts from the scientific literature about international comparisons, performance variations across Canada and over time, and information related to disparities in results. We then presented these key facts as visually appealing infographics conveying key information we deemed important to facilitate interpretation of the performance results. We also included in the interactive website a small number of key contextual measures supporting the interpretation of results at regional and facility level. HEALTH SYSTEM PERFORMANCE IN CANADA 23 Figure 1. CIHI’s New Framework for Measuring Health System Performance Results Health system performance framework The health system performance measurement framework developed to guide performance reporting is composed of four interrelated quadrants – health system outcomes, social determinants of health, health system outputs, and health system inputs and characteristics – with each quadrant containing a set of performance dimensions (Canadian Institute for Health Information 2013d). These quadrants sit within demographic, political, economic and cultural contexts which influence each of them as well as the way they interact with each other. The health system performance measurement framework developed to guide performance reporting is composed of four interrelated quadrants – health system outcomes, social determinants of health, health system outputs, and health system inputs and characteristics – with each quadrant containing a set of performance dimensions. The framework depicts the expected causal relationships among the four quadrants and their performance dimensions with arrows (see 24 JEREMY VEILLARD, BRENDA TIPPER, SARA ALLIN Figure 1), leading to the end goal of better health system outcomes. Consistent with other international frameworks (Murray and Frank 2000, Hurst and Jee-Hughes 2001, Commonwealth Fund 2006, Kelley and Hurst 2006), outcomes represent the ultimate goals of the health system: the health status of Canadians, the responsiveness of the health system, equity in health and responsiveness, and value for money (Canadian Institute for Health Information 2013d). Health system inputs quadrant refers to characteristics of health services and providers, along with the resources and equipment they use, and the facilities in which they work (Donabedian 1980). 
The inputs include a range of dimensions from characteristics of leadership and governance, to the resources available for the health system, to their efficient allocation and to innovation and learning capacity. These dimensions potentially explain performance (Murray and Frank 2000) and can be seen as levers of improvement for the quality of health services (Champagne et al. 2005). Health services are represented in the “outputs” quadrant which includes the accessibility of health services, as well as the quality of services as described by the dimensions for this quadrant – patient-centered, safe, appropriate and effective, and efficiently delivered (Institute of Medicine 2001, Stiefel and Nolan 2012). In this framework, the outputs quadrant – the health services – includes the full range of health system services, including acute, community, primary, continuing, rehabilitation, health promotion and health protection and public health services. The quadrant representing the social determinants of health depicts the well documented effects of structural and intermediary factors on health system outcomes (Solar and Irwin 2010), as well as the important interactions between social determinants of health and health system inputs and outputs. For example, the dimension of leadership and governance within the health system inputs quadrant includes the effective partnerships health system leaders make with other sectors to develop and implement healthy public policies more generally. Also, improving accessibility and equitable access to health services depends on the tailoring of services to the different needs of diverse populations. While most performance frameworks are static, this framework describes performance as a dynamic process with interrelationships among its quadrants and dimensions and within the broader demographic, political, economic and cultural contexts, a view particularly relevant for performance improvement. The framework developed by CIHI aligns largely with the health system performance improvement strategies and goals of Canadian provinces and territories. This alignment is documented elsewhere (Canadian Institute for Health Information 2013d). Also, many of the concepts and performance dimensions in HEALTH SYSTEM PERFORMANCE IN CANADA 25 CIHI’s framework overlap with the international frameworks that it drew on. The main differences are: its strategic alignment with Canadian provinces and territories; and its depiction of expected causal relationships among the different quadrants (Canadian Institute for Health Information 2013d). Results of consultations with the general public The consultations with the general public revealed that they considered access to care to be the aspect of health system performance of most interest. Interestingly, however, the public rated their interest in all four quadrants of the HSP framework very highly. Online respondents reported their level of interest in the quadrants on a scale from one (not interested at all) to seven (very interested), and average ratings ranged from 5.7 (inputs) to 6.2 (outcomes) (Canadian Institute for Health Information 2013c). The consultations with the general public revealed that they considered access to care to be the aspect of health system performance of most interest. Participants were asked to provide their ratings of interest both before and after completing the full survey, and the results were relatively unchanged. 
Preference for the two quadrants rated most highly – outcomes and outputs – was equally high or higher at the end of the process (Canadian Institute for Health Information 2013c). While it is possible that some members of the public have no opinion on the health system and its performance, it appears that the participants in this study all held strong and consistent views before and after learning about these areas of performance. Within each of the four quadrants, respondents were asked to consider the dimensions of performance that were most important to them and to allocate 100 hypothetical dollars across the performance dimensions within that quadrant. While access, which is part of the outputs quadrant, was considered to be the most important dimension, others that were rated highly included equity (outcomes), responsiveness (outcomes), quality (outputs), health promotion and disease prevention (outputs) and value for money (outcomes). In-person focus groups yielded similar results to the online surveys, though they offered additional insights, such as that wait times were considered to be the most important component of access to care, and that access to family physicians and specialists was of key importance. Subgroup analyses were conducted by province/territory, urban/rural place of residence, age, and extent of prior health system use; the results were largely consistent across these subgroups. Quantitative results of the online ratings are presented in Table 1 below (Canadian Institute for Health Information 2013c).
Table 1. Results of On-line Ratings of Interest and Allocation of Hypothetical Dollars (interest rated post-survey on a scale of 1–7; dimension scores on a scale of 1–100, weighted to 50)
Inputs (interest: 5.9) – Allocation of resources: 52; Innovation: 49; Planning the right services: 49
Outputs (interest: 6.1) – Access: 63; Appropriateness: 43; Efficiency: 44; Patient experience: 44; Health promotion & prevention: 55; Quality: 55; Safety: 46
Outcomes (interest: 6.2) – Health status: 36; Responsiveness: 56; Equity: 56; Value for money: 53
Social determinants of health (interest: 5.8) – Structural factors (people's circumstances): 53; Intermediary factors (neighbourhood characteristics): 47
Indicators selected
Using the indicator selection process described above, the indicators listed in Appendix A were selected for reporting in the aligned performance-reporting websites, with target audiences and measurement levels as noted. While only fifteen indicators were selected to address the performance information needs of the general public, these indicators were also included in cascading performance reports targeted to meet the
needs of system managers. Additionally, all indicators were reported at the provincial/territorial level. Results for most, but not all, indicators were also reported at the health region level. Finally, there were a number of indicators that measure performance at the level of an individual hospital or long-term care facility. All of these indicators are already or will be publicly available, and are described in Appendix A.
Addressing the importance of context
We tested our approach to addressing the importance of context through a survey of 250 members of the general public who had participated in the initial on-line survey. Overall, the respondents reported that the contextual information was well understood and that it facilitated their interpretation of the performance indicators (Canadian Institute for Health Information 2013c). One particular difficulty that we were not able to address adequately at this point was describing the importance of rurality and remoteness in the interpretation of performance results.
Top performers
For the first release of the performance website to the general public, there were five performance indicators reported for health regions and one hospital indicator for which we identified top performers. These indicators and the number of top performers identified are listed in Table 2 below.
Table 2. Top Performers Results for CIHI Health System Performance Website for the General Public (http://www.ourhealthsystem.ca)
2a. Indicators reported for regions (total number of regions with results; number of top performers identified; top performers as % of total)
% of the population with a regular MD: 55 regions; 15 top performers; 27.3%
% of repeat hospitalizations for mental illness: 80 regions; 3 top performers; 3.8%
% of the population who are current smokers: 55 regions; 2 top performers; 3.6%
% of adults considered obese (self-reported): 55 regions; 1 top performer; 1.8%
Life expectancy at birth: 72 regions; 9 top performers; 12.5%
2b. 30-day overall readmission rate, reported for hospitals (number of hospitals in peer group; number of top performers identified; top performers as % of total)
Teaching hospitals: 30 hospitals; 3 top performers; 10.0%
Large community hospitals: 66 hospitals; 2 top performers; 3.0%
Medium community hospitals: 90 hospitals; 4 top performers; 4.4%
Small community hospitals: 329 hospitals; 2 top performers; 0.6%
As can be seen, for most indicators the criterion of having results in the top decile for three consecutive time periods meant that relatively few regions or hospitals were identified. However, due to larger confidence intervals around results for indicators based on survey samples, there were two indicators (percent of the population with a regular medical doctor, and life expectancy at birth) for which a larger proportion of top performers was identified (27.3% and 12.5% respectively).
Discussion
This performance reporting initiative pursues two concomitant objectives: to enhance accountability and transparency about health system performance; and to stimulate and support the performance improvement efforts of provincial and territorial governments. Public reporting is a key component of this initiative, and it is complemented by other instruments and initiatives such as analytical tools, research and indicator development, and capacity building activities. This initiative takes place in the context of policy interventions by provincial and territorial governments to improve health system performance. Having a full program of work in which performance reporting is only one instrument among a variety of initiatives aimed at supporting health system performance improvement through benchmarking and capacity building differs from other public performance reporting initiatives solely focused on making comparable performance information publicly available.
Building for use: a new approach to performance reporting A major challenge with this approach to performance reporting was to both ensure that the information needs of different audiences would be met and that reporting would be parsimonious and focused on a small HEALTH SYSTEM PERFORMANCE IN CANADA 29 number of indicators aligned with system transformation priorities of jurisdictions. This tension is supported by recent literature on health system performance management (Veillard et al. 2010) and health services research (De Korne et al. 2010) pointing to the need to build relevant instruments for use by potential information users. Performance management indeed is a social construct using scientific methods that should be socially acceptable to the users of the information (Harding 2003). Therefore, a first step was to clarify the intended audience(s) of this health system performance reporting initiative. We focused on the general public, policy makers and health system managers which are an integral part of the mandate of the Canadian Institute for Health Information. We then engaged with information and report users in: the development of the conceptual framework; selection of key themes for reporting to the general public; definition of functionalities of performance websites; selection of performance indicators and related contextual measures; and review of prototypes of the websites. Consensus building methods such as modified Delphi panels were used to select the cascading sets of performance indicators. Additional criteria used for the design of the core set of cascading indicators were a high level of face, construct, and content validity of the set of indicators. In practice, it meant that the indicator set should provide a view of performance that intuitively reflects the performance of the system or organization as a whole and provides insights for performance improvement including sensitivity to context. These processes offered the advantage of involving decision-makers and experts in clarifying or developing concepts, defining terms used, selecting performance indicators and defining the functionalities of the public performance reporting websites. In addition, they helped strike a balance between the needs of information users and the integrity of the research and development process. Processes engaging policy-makers in the identification of emergent strategies have been documented in earlier research on strategic management in the cancer sector in Ontario (Greenberg et al. 2005). A result of this engagement is that the framework developed and performance measures selected raised a greater interest on the side of decision-makers. This is consistent with several studies and systematic reviews showing that linkage and exchange of knowledge between health services researchers and those who can use their results is the best predictor of when and how research gets used (Lavis 2006). This is also consistent with research findings demonstrating the need to combine content, context and process in health services research, in order to effect change (ten Asbroek et al. 2005). These general lessons should be applicable to other public sectors than the health sector in Canada and more broadly. 
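The modified Delphi step described above, in which experts rated candidate indicators on reliability, validity, importance and understandability and nominated the fifteen they would include in the public report, can be summarized with a simple aggregation of ratings and votes to seed the consensus meeting. The following is a minimal sketch under those assumptions; the data structures and function names are hypothetical, not the survey tool actually used.

    # Illustrative aggregation of one modified Delphi survey round.
    # Each expert rates each candidate indicator on four criteria (1-5 Likert)
    # and nominates up to 15 indicators for the public report. The output is a
    # ranked shortlist to start the face-to-face consensus discussion.
    from statistics import mean
    from collections import Counter
    from typing import Dict, List

    CRITERIA = ("reliability", "validity", "importance", "understandability")

    def summarize_round(ratings: Dict[str, List[Dict[str, int]]],
                        nominations: List[List[str]],
                        shortlist_size: int = 15) -> List[str]:
        """ratings: indicator -> one dict of criterion scores per expert.
        nominations: each expert's list of nominated indicators."""
        votes = Counter(ind for expert in nominations for ind in expert)
        scores = {}
        for indicator, expert_ratings in ratings.items():
            # Mean rating across experts and criteria: a simple, transparent summary.
            all_scores = [r[c] for r in expert_ratings for c in CRITERIA]
            scores[indicator] = (votes[indicator], mean(all_scores))
        # Rank by nomination count first, then by mean rating.
        ranked = sorted(scores, key=lambda ind: scores[ind], reverse=True)
        return ranked[:shortlist_size]

The point of the sketch is only that the survey output can be reduced to a transparent starting ranking; the final selection, as described above, rested on the expert group's deliberation rather than on the arithmetic alone.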
30 JEREMY VEILLARD, BRENDA TIPPER, SARA ALLIN Implementation considerations The development of this performance reporting initiative was a complex process involving a partnership with the private sector to carry out the public engagement and development phase of the website; engagement with the federal, provincial and territorial levels of government across Canada; and engagement with the general public as well as other national stakeholders and professional organizations. This initiative required mobilizing a large set of competences across CIHI programs to carry out the different phases of the project including methods development, a data validation phase with all health care providers whose results were published, prototype testing, knowledge transfer from the private partner to the organization, and communication and social media activity. Overall, it took eighteen months to deliver the website from design to delivery. This was only possible because it was made a top priority for the organization and additional resources were identified to support the fast track implementation of the project. In addition, a number of implementation challenges had to be addressed. For one performance theme (value for money) we could not identify indicators meeting all the criteria applied and had to select proxies which we complemented with additional contextual information to present a narrative about the policy issue at hand. For other themes such as quality of care, standardized measures of patient experience and harm to patients are under development and could not yet be reported on. This reflects the iterative nature of such initiatives. As a result, a process has been developed through which performance indicators included in the project will be reviewed every year by an expert committee. We also had to consult with provinces and territories on methods to identify top performers and get enough support to move forward with the proposed approach. Finally, the government of Quebec decided to participate in the project but opted out of two components of the project (consultation with the general public, identification of top performers) which created some limitations for this initiative. Impact evaluation We developed a strategy to evaluate the impact of this intervention through a multi-pronged approach. CIHI conducts a broad electronic survey of its stakeholders every two years, most recently in May 2014, during which up to 4000 clients/stakeholders are invited to participate. This survey is designed to ask key questions regarding stakeholder satisfaction with the products and services that CIHI provides and to monitor, over time, their overall sense of satisfaction with them. In 2014 a smaller survey, designed HEALTH SYSTEM PERFORMANCE IN CANADA 31 as an impact evaluation of CIHI’s analytical and performance reporting products, was conducted at the same time as the larger survey. It was targeted at a smaller set (approximately 850) of stakeholders with more specific questions focused on understanding and use of these products as well as the outcomes and impact of their use. Overall, respondents rated the usefulness of CIHI’s performance reporting tools as 8 out of 10, 70% reported that a CIHI report or tool had directly informed initiatives in their organization, and 40% reported that such use had led to documented improvements. 
At the time of writing, national focus groups for different types of stakeholders (policy makers, system managers, clinicians, and general public) are being organized to further explore the nature of use and impact. The impact of this performance reporting intervention is highly dependent on complementary interventions taken by a range of stakeholders as a result of or concomitantly to the intervention. We will also use pop-up surveys on our website to gather feedback from information users. In addition, we will continue to monitor initiatives of jurisdictions to align health system performance public reporting efforts. Finally, we are also collaborating with researchers to measure impact more systematically at an organizational level, using qualitative and quantitative approaches to identify improvements in processes and outcomes linked to public reporting, as well as exploring the use of innovative methods such as concept mapping (Jackson and Trochim 2002). The impact of this performance reporting intervention is highly dependent on complementary interventions taken by a range of stakeholders as a result of or concomitantly to the intervention. Attribution is notoriously difficult to determine. However, results so far have been encouraging in terms of visits to the website (over 80,000 for the first 11 months after the date of release), time visitors spend on the website (over 5 minutes on average) and social media activity and we have received overall support from jurisdictions and from the media in general for the initiative. Still, it is very early to assess the overall value of this initiative, and we hope that the evaluation strategy presented above will help us assess how to maximize the utility of performance reporting to support performance improvement in the health sector in Canada. Research efforts and next steps CIHI is collaborating with researchers at the University of Montreal through a research grant of the Canadian Institutes for Health Research to 32 JEREMY VEILLARD, BRENDA TIPPER, SARA ALLIN define methods to enhance the communication of performance and benchmarking information to system managers so that information better supports performance improvement. This is one step to improve the way we build relevant health information instruments for use by different audiences. We will also continue to work to better represent the importance of context through additional contextual measures and to improve the functionalities of the website and the way we present variations in results through the use of statistical tools such as funnel plots (Spiegelhalter 2005). As the use of public reporting by Canadian jurisdictions evolves, it will also be important to document the use of the different pathways to performance improvement through public reporting and evaluate the impact of each pathway. As the use of public reporting by Canadian jurisdictions evolves, it will also be important to document the use of the different pathways to performance improvement through public reporting and evaluate the impact of each pathway. Canadian jurisdictions have mainly used the change pathway until now, hoping that making performance variations publically available would stimulate change through peer pressure, with a number of provinces also implementing pay-for-performance initiatives in the last few years. 
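The funnel plots mentioned above (Spiegelhalter 2005) plot each institution's rate against its denominator, with control limits that narrow as the denominator grows, so that unusual results are flagged rather than institutions being simply ranked. The following is a minimal sketch for a proportion-type indicator using a normal approximation to the binomial; it is a simplification of the methods in the cited paper, and the function and variable names are hypothetical.

    # Illustrative funnel-plot limits for a proportion indicator (for example,
    # a readmission rate). Institutions whose observed rate falls outside the
    # limits for their denominator are flagged for attention.
    import math
    from typing import List, Tuple

    def funnel_limits(overall_rate: float, denominator: int,
                      z: float = 1.96) -> Tuple[float, float]:
        """Approximate control limits around the overall (e.g., national) rate.
        z = 1.96 gives roughly 95% limits; z = 3.09 gives roughly 99.8% limits."""
        se = math.sqrt(overall_rate * (1.0 - overall_rate) / denominator)
        return (max(0.0, overall_rate - z * se), min(1.0, overall_rate + z * se))

    def flag_outliers(overall_rate: float,
                      institutions: List[Tuple[str, int, int]]) -> List[str]:
        """institutions: (name, numerator, denominator) triples."""
        flagged = []
        for name, events, cases in institutions:
            low, high = funnel_limits(overall_rate, cases)
            rate = events / cases
            if rate < low or rate > high:
                flagged.append(name)
        return flagged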
Other interventions going forward might use other pathways and a mix of the different pathways, providing opportunities to evaluate further the impact of the impact of public reporting as an integral component of health system performance improvement efforts in Canada. References Berwick, D. M., B. James, and M. J. Coye. 2003. “Connections between quality measurement and improvement.” Medical Care 41(Suppl. 1): I30–I38. Boland, Tony, and Alan Fowler. 2000. “A systems perspective on performance management in the public sector.” International Journal of Public Sector Management 13(5): 417–446. Canadian Institute for Health Information. 2007. HSMR: A New Approach for Measuring Hospital Mortality Trends in Canada. Ottawa: Canadian Institute for Health Information. ——. 2012. Strategic Plan 2012 to 2017: Better Data. Better Decisions. Healthier Canadians. Ottawa: Canadian Institute for Health Information. ——. 2013a. Health Indicators 2013. Ottawa: Canadian Institute for Health Information. ——. 2013b. Wait Times for Priority Procedures in Canada 2013. Ottawa: Canadian Institute for Health Information. ——. 2013c. Health System Performance Interactive Indicators Website: Public Engagement Summary Report. Ottawa: Canadian Institute for Health Information. ——. 2013d. A Performance Measurement Framework for the Canadian Health System. Ottawa: Canadian Institute for Health Information. HEALTH SYSTEM PERFORMANCE IN CANADA 33 Champagne, François, and Guisset, Ann-Lise. 2005. The Assessment of Hospital Performance : Collected Background Papers. Groupe de recherche interdisciplinaire en sante, Universite de Montreal. Champagne, François, Andre-Pierre Contandriopoulos, Julie Picot-Touche, François Beland, and Hung Nguyen. 2005. Un cadre d’Evaluation de la Performance des Systèmes de Sante : le modèle EGIPSS. Quebec: Le Conseil de la sante et du bien-^etre. Commonwealth Fund. 2006. Commission on a High Performance Health System. Framework for a High Performance Health System for the United States. New York: The Commonwealth Fund. Council of Australian Governments. 2011. National healthcare agreement 2011. Canberra: Council of Australian Governments. De Korne, D. F., K. Sol, J. D. H. van Wijngaarden, E. J. van Vliet, T. Custers, M. Cubbon, W. Spileers, J. Ygge, C. L. Ang, and N. S. Klazinga. 2010. “Evaluation of an international benchmarking initiative in nine eye hospitals.” Health Care Management Review 35(1): 23–35. Department of Health. 2011. The NHS outcomes framework 2012/13. London, UK: National Health Service. Donabedian, Avedis. 1980. The Definition of Quality and Approaches to Its Assessment. Chicago: Health Administration Press. Greenberg, Anna, Helen Angus, Terrence Sullivan, and Adalsteinn D. Brown. 2005. “Development of a set of strategy-based, system-level cancer care performance indicators in Ontario, Canada.” International Journal of Quality in Health Care 17: 107–14. Groot, Tom, and Tjerk Budding. 2008. “New Public Management’s current issues and future prospects.” Financial Accountability & Management 24(1): 1–13. Harding, Nancy. 2003. The social construction of management: Texts and identities. London: Routledge. Health Council of Canada. 2012. Measuring and reporting on health system performance in Canada: Opportunities for improvement. Toronto: Health Council of Canada. Hibbard, Judith H. 2008. “What can we say about the impact of public reporting? Inconsistent execution yields variable results.” Annals of Internal Medicine 148: 160–161. 
Hibbard, Judith H., Jean Stockard, and Martin Tusler. 2003. “Does publicizing hospital performance stimulate quality improvement efforts?” Health Affairs 22: 84–94. Hurst J., and Jee-Hughes M. 2001. Performance Measurement and Performance Management in OECD Health Systems. OECD Publishing. Institute of Medicine, Committee on Quality of Health Care in America. 2001. Crossing the Quality Chasm: A New Health System for the 21st Century. The National Academies Press. http://www.nap.edu/openbook.php?record_id510027. Jackson, Kristin M., and William M. K. Trochim. 2002. “Concept Mapping as an Alternative Approach for the Analysis of Open-Ended Survey Responses.” Organizational Methods 5(4): 307–336 Kelley, E., and Hurst J. 2006. Health Care Quality Indicators Project. Conceptual Framework Paper. Organization for Economic Co-operation and Development. Lavis, John. 2006. “Research, public policymaking, and knowledge-translation processes: Canadian efforts to build bridges.” Journal of Continuing Education in the Health Professions 26: 37–45. Murray, Christopher J. L., and Julio Frank. 2000. “A framework for assessing the performance of health systems.” Bulletin of the World Health Organization 78 (6): 717–731. Saskatchewan Health Quality Council. 2011. Think Big, Start Small, Act Now: Tackling Indicator Chaos. A Report on a National Summit. Saskatoon: Saskatchewan Health Quality Council. Solar, O., and A. Irwin. 2010. A conceptual framework for action on the social determinants of health. Geneva: World Health Organization. 34 JEREMY VEILLARD, BRENDA TIPPER, SARA ALLIN Spiegelhalter, David J. 2005. “Funnel plots for comparing institutional performance.” Statistics in Medicine 24: 1185–1202. Stiefel M., and Nolan K. 2012. A Guide to Measuring the Triple Aim: Population Health, Experience of Care, and Per Capita Cost. Cambridge, Massachusetts: Institute for Healthcare Improvement. http://www.ihi.org/knowledge/Pages/IHIWhitePapers/AGuidetoMeasur ingTripleAim.aspx. ten Asbroek A. H. A, D. M. J. Delnoij, L. W. Niessen, R. W. Scherpbier, N. Shrestha, D. S. Bam, C. Gunneberg, C. W. van der Hor, and N. S. Klazinga. 2005. “Implementing global knowledge in local practice: a WHO lung health initiative in Nepal.” Health Policy and Planning 20(5): 290–301. United States of America Congress. 2010. Patient Affordable Care and Protection Act. Available at: http://www.gpo.gov/fdsys/pkg/BILLS-111hr3590enr/pdf/BILLS-111hr3590enr. pdf. Veillard, J., F. Champagne, N. Klazinga, V. Kazandjian, O. A. Arah, and A. L. Guisset. 2005. “A performance assessment framework for hospitals: the WHO regional office for Europe PATH project.” International Journal of Quality in Health Care 17(6): 487–496. Veillard, J., T. Huynh, S. Ardal, S. Kadandale, N. Klazinga, and A. D. Brown. 2010. “Making health system performance measurement useful to policy-makers: Aligning strategies, measurement and local health system accountability” Healthcare Policy 2010, 5(3): 49–65. Appendix A: Suite of CIHI Health System Performance Indicators The table below lists the performance indicators selected for reporting in CIHI’s health system performance websites. The two columns under “Target Audience” show whether the indicator was selected primarily to meet the information needs of the “General Public” or “System Managers” or both. The columns under “Measurement Level” show whether the indicator can be reported for provinces, health regions and facilities (hospital or long-term care). 
Indicators available at the facility level can be aggregated into health regions and provinces. However, not all indicators are applicable at the hospital or long-term care facility level (e.g., life expectancy, immunization rates). Further, for some of these indicators (three of the access indicators), data are not available at the health region level, so only provincial results can be reported.
HEALTH SYSTEM INPUTS AND CHARACTERISTICS: Contextual measures reported.
HEALTH SYSTEM OUTPUTS
Access to timely, comprehensive, high-quality health services: 1 Have a regular doctor; 2 Specialist wait times; 3 Joint replacement wait times; 4 Radiation treatment wait times; 5 Total time spent in emergency department (admitted patients); 6 Emergency wait time for physician assessment; 7 Hip fracture surgery wait times.
Person-centered: 8 Repeat hospital stays for mental illness; 9 Potentially inappropriate medication in long-term care; 10 Restraint use in long-term care; 11 Patient flow for hip replacement.
Safe: 12 Obstetric trauma (with instrument); 13 In-hospital sepsis; 14 Potentially inappropriate use of antipsychotics in long-term care; 15 Falls in the last 30 days in long-term care; 16 Worsened pressure ulcer in long-term care.
Appropriate and effective: 17 Hospital deaths (HSMR); 18 Hospital deaths following major surgery; 19 All patients readmitted to hospital; 20 Medical patients readmitted to hospital; 21 Surgical patients readmitted to hospital; 22 Obstetric patients readmitted to hospital; 23 Patients 19 and younger readmitted to hospital; 24 Low-risk Caesarean section rate; 25 Breastfeeding initiation; 26 Influenza immunization for seniors; 27 Ambulatory care sensitive conditions.
Efficiently delivered: 28 Administrative expense; 29 Cost of a standard hospital stay.
HEALTH SYSTEM OUTCOMES
Improve health status of Canadians: 30 Life expectancy at birth; 31 Life expectancy at age 65; 32 Avoidable deaths; 33 Avoidable deaths from preventable causes; 34 Avoidable deaths from treatable causes; 35 Perceived health; 36 Hospitalizations for self-injury, stroke and heart attack; 37 Worsened depressive mood in long-term care; 38 Improved physical functioning in long-term care; 39 Worsened physical functioning in long-term care; 40 Experiencing worsened pain in long-term care; 41 Experiencing pain in long-term care.
Improve health system responsiveness: No indicators selected for reporting.
Improve value for money: No indicators selected for reporting.
SOCIAL DETERMINANTS OF HEALTH
Biological, material, psychosocial and behavioural factors: 42 Children vulnerable in areas of early development; 43 Obesity; 44 Smoking; 45 Physical activity during leisure time; 46 Heavy drinking.
Structural factors influencing health: Contextual measures reported.
Notes: All indicators will be publicly available, even though only some explicitly target the general public. Structural factors influencing health refer to those that shape individuals' and families' socioeconomic position, such as income and social status, education and literacy, and gender and ethnicity. Taken together, the structural factors can expose individuals to, and make them more vulnerable to, unhealthy conditions, as represented by the biological, material, psychosocial and behavioural factors listed here (Solar and Irwin 2010).
Georges-Charles Thiébaut, François Champagne, André-Pierre Contandriopoulos
Les enjeux de l'évaluation de la performance : dépasser les mythes [The challenges of performance evaluation: Moving beyond the myths]
Sommaire : Since the 1990s, most Western countries have adopted frameworks and tools for evaluating the performance of health care organizations and systems. Paradoxically, despite the abundance of these instruments, the field still faces several unresolved theoretical, methodological and use-related issues. Conceptually, the notion of performance and the elements that make it up are vague and weakly anchored in theory. Methodologically, we are witnessing a proliferation of measures that do not always meet the criteria of reliability, validity and usefulness, and that do not capture the complexity of the concept. Finally, in terms of use, evaluation results are not exploited to their full potential to support continuous performance improvement. The objective of this article is to analyze these issues by deconstructing the myths that hold back the development of innovative frameworks better suited to decision-makers' needs. In order to respond to these issues, it also proposes innovations based on the global and integrated health system evaluation model.
Abstract: Since the 1990s, most Western countries have equipped themselves with frameworks and tools to measure health organizations and systems' performance. Paradoxically, several theoretical, methodological and usage issues remain unresolved. Conceptually, the notion of performance is vague and not well rooted in theory. From a methodological perspective, we are faced with numerous measures that don't always meet reliability, validity and usefulness criteria and don't help to grasp the complexity of the concept. Finally, the assessment results are not fully used to continuously improve performance. This article focuses on analyzing these issues by deconstructing the myths that slow down the development of better-suited frameworks to meet decision-makers' needs. It also proposes innovations based on the global and integrated assessment model of health systems.
Georges-Charles Thiébaut, Ph.D., is a postdoctoral researcher at ENAP in Montreal. François Champagne, Ph.D., is a full professor in the Department of Health Administration (School of Public Health) at the Université de Montréal. André-Pierre Contandriopoulos, Ph.D., is a full professor in the Department of Health Administration (School of Public Health) at the Université de Montréal.
CANADIAN PUBLIC ADMINISTRATION / ADMINISTRATION PUBLIQUE DU CANADA VOLUME 58, NO. 1 (MARCH/MARS 2015), PP. 39–62 © The Institute of Public Administration of Canada/L'Institut d'administration publique du Canada 2015
The concept of performance and its evaluation appeared across all public services, and more particularly in the health system, in the early 1990s. The advent of this concept stems from a twofold observation: a lack of information about the actual
François Champagne, Ph.D, est professeur titulaire au departement d’administration de la sante (Ecole de sante publique) de l’Universite de Montreal. Andre-Pierre Contandriopoulos, Ph.D, est professeur titulaire au departement d’administration de la sante (Ecole de sante publique) de l’Universite de Montreal. CANADIAN PUBLIC ADMINISTRATION / ADMINISTRATION PUBLIQUE DU CANADA VOLUME 58, NO. 1 (MARCH/MARS 2015), PP. 39–62 C The Institute of Public Administration of Canada/L’Institut d’administration publique du Canada 2015 V 40 GEORGES-CHARLES THIEBAUT, ET COLL. reel des services publics et une augmentation constante des co^ uts de ceux-ci (Butler 2000). Dans ce contexte, l’evaluation de la performance a ete conçue comme un instrument de gestion et de gouvernance utilise pour atteindre trois objectifs : fournir des informations sur l’activite des organisations et du système de sante, accro^ıtre l’imputabilite des acteurs œuvrant dans celui-ci et ameliorer la performance (Boland et Fowler 2000). Les chercheurs et les gouvernements ont developpe plusieurs cadres d’evaluation de la performance a tel point qu’il semblerait que ce champ ait atteint sa maturite. Pourtant, la conception de la performance, les outils de mesure et les objectifs associes a l’evaluation ne font pas consensus et de nombreux enjeux freinent le developpement de cadres plus robustes conceptuellement, dotes de meilleures mesures et mieux adaptes aux besoins des utilisateurs. Cet article a donc pour objectif d’analyser ces enjeux d’ordres theorique, methodologique et d’utilisation. Ces enjeux sont traverses par des mythes qui symbolisent certaines idees preconçues qui perdurent dans ce champ. En effet, d’un point de vue conceptuel, il n’existe pas de definition partagee de la performance, ni m^eme d’entente sur les dimensions la composant. Methodologiquement, la mesure de la performance soulève des enjeux quant a la conciliation entre la disponibilite, la fiabilite, la validite, l’utilite des indicateurs pour la mesurer et le developpement d’approches compatibles avec la complexite du concept de performance. Enfin, les resultats des evaluations de la performance ne sont pas utilises a leur plein potentiel du fait de l’absence d’une grille d’analyse et de jugement pour les interpreter (Freeman 2002), et d’une ambigu€ıte dans les finalites de l’utilisation des resultats de l’evaluation (Halachmi 2002). La majorite des cadres implantes semble osciller entre la volonte d’accro^ıtre l’imputabilite et le desir d’amelioration continue de la performance. Dans un second temps, nous presenterons le modèle d’evaluation globale et integre de la performance des systèmes de sante (EGIPSS) (Champagne et coll. 2005), qui semble le plus a m^eme de repondre a ces mythes. Cependant, malgre ses qualites, nous proposerons plusieurs innovations qui pourraient y ^etre adjointes afin de l’ameliorer. La multiplication des cadres valuation de la performance d’e La plupart des ministères de la sante ainsi que de nombreuses agences internationales ont developpe et implante des cadres de mesure de la performance. L’Australie, l’Angleterre ou encore le Canada font figure de precurseurs dans ce domaine. 
En effet, en Australie, l’autorite nationale de la performance (National Health Performance Authority) a developpe un cadre d’evaluation de la LES ENJEUX DE L’EVALUATION DE LA PERFORMANCE 41 performance dont les objectifs sont d’ameliorer la transparence et l’imputabilite, ainsi que l’instauration d’une dynamique d’amelioration de la performance (NHPA 2011). Ce cadre est constitue de trois grandes dimensions, l’equite, l’efficacite, et l’efficience, exprimees en termes d’activites et de resultats. En Angleterre, les reformes recentes ont ete accompagnees par l’implantation d’un cadre de mesure de la performance, l’Operating Framework (NHS 2011), comprenant un module specifique de mesure des resultats, the Outcomes framework (NHS 2012). L’emphase est mise sur l’innovation, la productivite et la prevention. Le cadre de mesure est principalement considere comme un outil d’imputabilite puisque les dimensions et les mesures de la performance sont orientees vers les resultats du système et appreciees en fonction de cibles designees par un organisme regulateur. Au Canada, l’Institut canadien d’information sur la sante a propose en 2000, un cadre conceptuel d’evaluation de la performance des systèmes de sante comprenant huit dimensions relatives a la qualite des soins : l’acceptabilite, l’accessibilite, la pertinence, la competence, la continuite, l’efficacite, l’efficience et la securite. Plusieurs institutions internationales dont l’Organisation mondiale de la sante (OMS), l’Organisation de cooperation et de developpement economiques (OCDE) ainsi que le Commonwealth Fund ont propose des cadres d’evaluation pour soutenir leurs Etats membres. L’OMS, au debut des annees 2000, a elabore un cadre permettant d’evaluer l’atteinte des objectifs intrinsèques des systèmes de sante (Murray et Evans 2003). Soit l’amelioration globale de l’etat de sante de la population, la reduction des iniquites de sante, la reactivite du système, la distribution equitable a travers la population de la contribution financière. En 2001, l’OCDE a egalement elabore un cadre d’evaluation de la performance des systèmes de sante (Hurst et Hughes 2001). Ce cadre est plus large que celui de l’OMS, puisqu’il prend en compte l’accessibilite comme composante de la reactivite, l’amelioration de l’etat de sante de la population ainsi que le niveau et la distribution des contributions au système de sante, ainsi que l’efficience declinee en efficience micro et macro economique. Enfin, le Commonwealth Fund a egalement propose un modèle de mesure de la performance des systèmes de sante. Ce cadre a pour objectif de rendre publique la performance des differents Etats americains et des h^ opitaux. Le cadre est construit autour du but ultime de tout système de sante, soit promouvoir une vie longue et productive. Pour ce faire, le système de soins de sante doit ^etre performant dans quatre dimensions, la qualite des soins regroupant la justesse des soins, la securite, la coordination, les soins centres sur le patient. Deuxièmement, l’accessibilite qui 42 GEORGES-CHARLES THIEBAUT, ET COLL. refère a la participation universelle, la protection financière et l’equite. S’ensuit l’efficience. Enfin, la capacite du système a s’ameliorer o u l’on trouve plusieurs elements, dont l’innovation, la structure d’information et le système d’education. Au niveau conceptuel, aucun de ces cadres ne definit clairement le concept de performance et son application aux organisations et aux systèmes de sante. 
Comme l'indique le Tableau 1, on constate que les dimensions de la performance sont principalement orientées vers la mesure de la qualité des soins (justesse, sécurité), de l'accessibilité, des résultats de santé et des coûts du système. De même, il n'y a pas de théorie évaluative structurant ou justifiant la construction de ces cadres. Du point de vue de l'utilisation, ils sont surtout orientés vers un accroissement de l'imputabilité devant entraîner l'amélioration de la performance.

Tableau 1. Comparaison des dimensions constitutives des différents cadres d'évaluation de la performance. Le tableau original indique, pour chacun des cadres comparés (Canada (ICIS), Royaume-Uni, Commonwealth Fund, OMS, OCDE, Australie et EGIPSS), lesquelles des dimensions suivantes il couvre : accessibilité, justesse des soins, expérience des patients, sécurité, globalité, continuité des soins, productivité, viabilité, disponibilité des ressources, coûts et dépenses, efficacité populationnelle, efficacité, efficience, équité, ajustement aux besoins de la population, compétence des professionnels, innovation, qualité de vie au travail, satisfaction de la population et maintien des valeurs. ICIS : Institut canadien d'information sur la santé.

Les enjeux théoriques de l'évaluation de la performance

Certains chercheurs avancent que les développements théoriques sont suffisamment avancés pour s'intéresser principalement aux questions de l'utilisation des résultats de l'évaluation de la performance (Adair et coll. 2006a). Nous pensons au contraire qu'il est important de revenir sur certains mythes associés à la définition de la performance et à la conception même des cadres d'évaluation. En effet, l'utilisation est conditionnée par la nature et les propriétés du cadre d'évaluation.

Mythe 1 : La performance est synonyme de la qualité des soins. L'évaluation de la performance consiste à mesurer l'atteinte des buts des organisations et du système de santé

Ce mythe renvoie à deux éléments majeurs relatifs à la conception des cadres d'évaluation de la performance. Premièrement, la notion de performance n'est pas formellement définie, ce qui contribue à rendre le concept de performance relativement flou. Deuxièmement, les dimensions constitutives de la performance se rapportent aux différentes composantes de la qualité des soins et aux objectifs du système de santé. L'Institut de médecine définit la performance comme « le degré par lequel les services de santé améliorent les résultats de santé autant au niveau individuel que populationnel, cela au regard de la connaissance scientifique en vigueur » (cité par Lied et Kazandjian 1999 : 394). David Eddy (1998) abonde dans le même sens en abordant la performance en fonction des résultats que l'on peut attendre d'une prise en charge. Pourtant, la notion de performance devrait transcender celle de qualité pour inclure d'autres dimensions reflétant la complexité des systèmes de santé, les déterminants de la performance ainsi que les multiples perspectives des parties prenantes (Davies 1998; Lied et Kazandjian 1999; Adair et coll. 2006b). Pour bien comprendre la performance, il est nécessaire de l'appréhender comme un phénomène contingent et paradoxal (Champagne et coll.
2005) Cette perspective est renforcee par l’existence d’une abondante litterature en theorie des organisations ayant conceptualise les differentes perspectives a considerer pour apprecier la performance d’une organisation ou d’un système. François Champagne (2003) ainsi que Claude Sicotte et coll. (1998) ont decrit neuf modèles theoriques : – Le modèle rationnel qui correspond a une approche fonctionnaliste des organisations. La performance est apprehendee comme l’atteinte des buts specifiques de l’organisation (Sicotte et coll. 1998; Champagne 2003). – Le modèle des processus internes pour lequel la performance correspond a la capacite d’une organisation a fonctionner sans heurts suivant les normes en vigueur. La performance decoule de la qualite des processus de production (Champagne 2003). – Le modèle de l’acquisition des ressources apprehende l’organisation comme un système ouvert o u la performance decoule de l’acquisition des ressources, le maintien de celles-ci et l’adaptation de l’organisation a son environnement (Sicotte et coll. 1998; Champagne 2003). – Le modèle des relations humaines o u l’emphase est mise sur la qualite de l’environnement de travail et la satisfaction des besoins des employes. Une organisation performante maintient un environnement de travail sain (Sicotte et coll. 1998; Champagne 2003). – Le modèle politique selon lequel une organisation performante est celle qui parvient a satisfaire ses enjeux internes et externes. Ce modèle repose sur une vision politique ou strategique selon laquelle les organisations sont des arènes politiques dans lesquelles les acteurs interagissent en fonction de leurs propres inter^ets strategiques (Champagne 2003). – Le modèle de la legitimite social considère qu’une organisation est efficace dans la mesure o u elle maintient et survit en mettant en accord les processus et les resultats avec les valeurs sociales, les normes et les objectifs. La reputation, le prestige sont alors des indicateurs de performance (Champagne 2003). LES ENJEUX DE L’EVALUATION DE LA PERFORMANCE 45 – Le modèle zero defaut apprehende la performance d’une organisation comme sa capacite a ne pas faire d’erreurs et/ou supprimer toute trace de non-performance (Champagne 2003). – Le modèle comparatif de la performance selon lequel une organisation est jugee en comparaison a d’autres organisations similaires selon des critères de performance correspondant aux donnees disponibles (Champagne 2003). – Le modèle normatif du système d’action rationnelle, developpe par Donabedian, propose d’analyser la performance en fonction de normes attachees aux structures, aux processus et aux resultats (Champagne 2003). Les cadres d’evaluation de la performance nationaux et internationaux correspondent principalement aux modèles rationnel et comparatif. Pourtant, pour bien comprendre la performance, il est necessaire de l’apprehender comme un phenomène contingent et paradoxal (Champagne et coll. 2005). En effet, la performance est contingente, car tous les modèles reflètent des points de vue legitimes qui dependent des differentes perspectives et points de vue des acteurs et des differents contextes. Elle est egalement paradoxale, car pour ^etre performante, une organisation doit accomplir simultanement toutes les fonctions decrites par les differents modèles, m^eme si ces fonctions sont contradictoires. Dès lors, il est necessaire de postuler que le concept de performance doit ^etre aborde de façon globale et multidimensionnelle afin d’en respecter la complexite. 
Mythe 2 : Une conception des cadres valuation comme outil de classification d’e des indicateurs Les cadres d’evaluation nationaux et internationaux vehiculent une approche descriptive de la performance fondee sur la classification des indicateurs par dimension. Ceci apporte une information sur le niveau de performance de chaque indicateur, et parfois d’une dimension, si les indicateurs sont agreges. Cependant une telle approche ne permet pas d’analyser et de comprendre la performance d’une organisation a travers la creation d’un savoir decoulant d’un jugement evaluatif base sur la configuration des differents indicateurs et dimensions. Pour ce faire, il faut prendre en consideration les relations qui existent entre les dimensions et les indicateurs de la performance. Cet aspect central n’est malheureusement que très peu developpe dans les cadres d’evaluation actuellement implantes. Pourtant, David Norton et Robert Kaplan (1996) ainsi que Philippe Lorino (1997) soutiennent que 46 GEORGES-CHARLES THIEBAUT, ET COLL. la comprehension des cha^ınes causales entre les dimensions constitutives de la performance est un element central a considerer pour analyser la performance puis prendre des decisions menant a son amelioration. Cette vision est le corollaire du caractère multidimensionnel et paradoxal de la performance. Dès lors, la performance ne peut plus ^etre apprehendee uniquement par dimension, mais en fonction des tensions et des equilibres entre les dimensions qui la composent. La representation des cha^ınes causales pourrait ^etre realisee par la creation de modèles logiques permettant de decrire les configurations de dimensions en specifiant les determinants de la performance de chaque dimension et leur consequence sur d’autres. Ces deux mythes touchant au caractère multidimensionnel de la performance ainsi qu’a la necessite de construire des outils permettant de representer les configurations de relations entre les dimensions nous amènent a affirmer que de nombreux developpements theoriques sont encore a realiser. Ceux-ci sont imperatifs pour accro^ıtre l’utilisation des resultats des evaluations. En effet, le succès d’implantation et d’utilisation des cadres d’evaluation de la performance est dependant de la conception de la performance et du cadre pour la mesurer. thodologiques de Les enjeux me valuation de la performance l’e Deux aspects methodologiques sont centraux dans le developpement des cadres d’evaluation de la performance : le choix des indicateurs appropries pour mesurer les differentes dimensions de la performance, et l’elaboration de methodes d’evaluation susceptibles de representer la complexite de la performance. cessaire multiplication Mythe 3 : De la ne des indicateurs de performance pour produire des informations utiles et pertinentes L’implantation de dispositifs formels d’evaluation de la performance s’accompagne très generalement d’une multiplication d’indicateurs de performance, et ce independamment des considerations necessaires de validite, de fiabilite et d’utilite de la mesure. Pourtant, les mesures pour evaluer la performance des systèmes de sante doivent repondre a des exigences methodologiques similaires a celles des demarches evaluatives scientifiques afin de s’assurer de leurs qualites psychometriques (Contandriopoulos et coll. 2012). Cependant, en plus de ces critères classiques, d’autres exigences s’imposent. En effet, ces mesures doivent ^etre 47 LES ENJEUX DE L’EVALUATION DE LA PERFORMANCE Tableau 2. 
Critères de selection des indicateurs (adapte de Adair et coll. 2006b, Champagne et Contandriopoulos (2005), Champagne et coll. 2005) Critère Description Proprietes des indicateurs L’indicateur presente un bon niveau d’evidence, decoulant de la recurrence d’utilisation dans diverses etudes scientifiques. Les mesures sont reliees a des questions importantes de sante ou du fonctionnement des services de sante. Strategique et susceptible d’^etre Ils sont en lien direct avec des elements influence par les acteurs du strategiques du système sur lesquels il est système possible d’agir et sur lesquels il y a un besoin d’amelioration. Cela signifie qu’il devrait y avoir une variation entre les h^ opitaux, symbolisant le besoin de disposer d’une mesure. Attribuable et relie Ces mesures font partie d’une cha^ıne causale permettant de comprendre et d’analyser les phenomènes. De plus, il existe un lien de causalite entre ces mesures et l’amelioration des services ou de l’etat de sante. Ces mesures ont un sens pour les acteurs Significative et interpretable cles et les parties prenantes du système de sante Non ambigu€e Les mesures fournissent une direction claire en termes d’amelioration et de changement a apporter Proprietes de la mesure Fiabilite Les mesures sont reproductibles dans le temps et permettent d’identifier de façon coherente les phenomènes dans differentes organisations. Validite apparente Il existe un consensus entre les utilisateurs et les experts sur le fait que la mesure est associee a la dimension ou la sousdimension analysee. Validite de contenu Les mesures selectionnees couvrent tous les aspects associes avec la dimension ou la sous-dimension mesuree. Validite de construit Les mesures selectionnees sont associees a d’autres indicateurs mesurant le m^eme phenomène. Precision L’indicateur est suffisamment precis pour que les resultats ne soient pas attribuables a des variations aleatoires. Degre d’evidence scientifique et importance de la mesure 48 GEORGES-CHARLES THIEBAUT, ET COLL. Tableau 2 : (suite) Critère Faisabilite Possible Disponibilite Description Le recueil des donnees est faisable et co^ ut/ efficient (le benefice est superieur au co^ ut) Les donnees sont disponibles acceptees par les futurs utilisateurs, facilement implantables, comparables entre h^ opitaux ou systèmes de sante (Smith et coll. 2009; Wallace, Lemaire et Ghali 2009), utiles a la prise de decision et surtout representer des marqueurs de la performance. Le respect de ces critères soulève trois exigences contradictoires. Tout d’abord une exigence de parcimonie et d’exhaustivite qui doit ^etre respectee lors de la selection des indicateurs. En effet, aucun indicateur n’est suffisant pour apprecier une dimension de la performance ou la qualite d’un processus. Pourtant, on ne peut multiplier les indicateurs sans risquer de perdre l’utilite des mesures et creer plus de confusion. La mesure doit ^etre choisie en fonction de ce qui est valorise et de ce que l’on veut valoriser. Une exigence de disponibilite et d’ajout parcimonieux de nouvelles donnees afin de fournir une serie de mesures suffisantes pour apprecier adequatement le niveau de performance de chaque dimension. En effet, on ne dispose que très rarement des meilleurs indicateurs possibles pour evaluer la performance. 
De ce fait, lors de l’implantation d’un système d’evaluation de la performance, la selection des indicateurs se fait en plusieurs etapes allant de l’utilisation des donnees disponibles, a l’implantation de nouvelles mesures, jusqu’a l’institutionnalisation de celles-ci, cela en minimisant le poids associe au developpement de nouvelles mesures (Friedberg et coll. 2011). Enfin, une exigence de validite et fiabilite des donnees tout en preservant l’utilite des mesures. Pour reconcilier ces exigences, Champagne et Contandriopoulos (2005), François Champagne et coll. (2005) et Carole Adair et coll. (2006b) ont propose une serie de critères guidant la selection des mesures afin d’assurer la qualite psychometrique, l’utilite et la facilite de collecte (Tableau 2). valuation de la performance Mythe 4 : L’e thodes est possible avec les me valuation classiques de l’e Les systèmes et les organisations de sante sont des systèmes complexes qui doivent ^etre evalues en tenant compte de cette complexite. Les systèmes complexes sont des systèmes ouverts qui co-evoluent avec leur environnement (Barnes, Matka et Sullivan 2003; McDaniel 2007). Les LES ENJEUX DE L’EVALUATION DE LA PERFORMANCE 49 elements du système sont en interrelations et interdependants et ces relations sont caracterisees par des boucles de retroaction positives ou negatives (McDaniel 2007). Ces systèmes ont la capacite de s’autoorganiser en generant l’emergence de nouvelles structures et de nouvelles pratiques (McDaniel 2007). De ce fait, celles-ci ne sont pas uniquement imposees par la hierarchie, mais naissent des interactions entre les acteurs. Finalement, l’analyse de ces systèmes ne peut se faire par l’etude de ses parties, mais necessite une approche globale (Meyer, Tsui et Hinings 1993) car un système complexe n’est ni la somme des elements le constituant, ni la somme des comportements individuels, mais le resultat collectif d’interactions non lineaires (McDaniel 2007). Pourtant, les cadres classiques d’evaluation de la performance tendent a conceptualiser la performance de façon lineaire, sans modeliser les interactions ni prendre en compte l’influence de l’environnement, par une classification des indicateurs selon qu’ils correspondent aux entrants, aux produits ou aux resultats. La prise en compte de la complexite pose a l’evaluateur des questions difficiles dans l’etat actuel des connaissances sur l’evaluation. Pour Michael Patton (2011), il s’agit du problème central auquel il faut trouver des reponses pour que l’evaluation soit a la hauteur des attentes des decideurs : « Evaluation has to explore merit and worth, processes and outcome, formative and summative evaluation; we have a good sense of the lay of the land. The great unexplored frontier is evaluation under conditions of complexity » (Patton 2011 : 1). Pour ce faire, les methodes d’evaluation doivent ^etre construites pour se rapprocher des caracteristiques des systèmes et des organisations complexes. Ainsi, pour evaluer la performance il est necessaire, non pas d’utiliser de façon sequentielle et interdependante les methodes et les approches habituelles de l’evaluation (Brousselle et coll. 2011), mais de mobiliser des approches configuratives et interpretatives (Contandriopoulos et coll. 2012) generant des formes de jugement adaptees aux realites des systèmes de sante et aux besoins des decisionnaires. Ces approches configurationnelles permettent de modeliser une constellation de dimensions representant les caracteristiques theoriques d’un phenomène (p. 
ex : la performance et ses composantes, telles que l’efficacite des soins, la securite, la disponibilite des ressources ou l’innovation) en interaction (Meyer, Tsui et Hining 1993). Ces configurations devraient ^etre accompagnees de grille de lecture fournissant un support a l’interpretation et l’appreciation des resultats du phenomène etudie. Une telle grille pourrait comprendre des synthèses narratives resumant les evidences scientifiques sur la nature des relations entre les composantes des configurations, ainsi que des jugements sur le niveau de performance de chaque dimension en fonction d’une norme. Ces jugements de type normatifs 50 GEORGES-CHARLES THIEBAUT, ET COLL. chaque dimension puis seraient regroupes au sein de s’appliqueraient a configurations. Ces methodes permettraient de disposer d’outils permettant a la fois de representer la complexite d’un phenomène tout en fournissant des grilles pour l’interpreter et porter un jugement. Cependant, l’interpretation et l’appreciation des configurations devraient ^etre completees par un processus permettant d’inclure les perspectives des acteurs concernes afin de prendre en compte ce qu’ils pensent, ce qui devrait ^etre entrepris selon leur point de vue et comment certaines visions devraient prevaloir sur d’autres (Barnes Matka et Sullivan 2003). Les enjeux d’utilisation des cadres valuation de la performance d’e Il ne suffit pas de disposer des resultats de l’evaluation, il est imperatif de pouvoir les interpreter a travers des grilles permettant de porter un jugement. En effet, evaluer consiste a porter un jugement de valeur permettant d’apprecier la qualite ou l’existence d’un objet (Brousselle et coll. 2011; Angers 2010). Il faut mettre en place un dispositif permettant de fournir des informations evaluatives scientifiquement valides et socialement legitimes, mais aussi un savoir permettant d’accro^ıtre la comprehension de la performance, afin que les differents acteurs concernes soient en mesure de prendre position et qu’ils puissent construire, individuellement et collectivement, un jugement susceptible de se traduire en action (Brousselle et coll. 2011). des donne es Mythe 5 : La disponibilite garantit leur utilisation Les cadres d’evaluation fournissent des donnees classees en fonction des dimensions de la performance selectionnees. Ces donnees sont parfois associees a des normes permettant d’en apprecier le niveau de performance. Cependant, l’existence de ces donnees et leur classement ne garantissent ni leur utilisation, ni m^eme un renforcement des capacites des decideurs. En effet, il y a souvent trop de donnees, inadaptees et non soutenues par un raisonnement permettant de produire une connaissance utile pour l’action (Pfeffer et Sutton 2006; Rousseau 2006), mais surtout les resultats de l’evaluation ne sont pas analyses a travers une grille de jugement permettant de creer un sens. Dès lors, l’enjeu de l’utilisation est d’ameliorer la transformation, la traduction des donnees pour permettre aux decideurs d’elaborer un jugement sur la performance qui decoule d’un raisonnement scientifique. La construction de modèles logiques, decrivant les configurations de causes a effets entre les dimensions de la performance, represente LES ENJEUX DE L’EVALUATION DE LA PERFORMANCE 51 l’element central permettant de construire des jugements plus adaptes aux besoins des decideurs. En effet, ces modèles logiques, bases sur des evidences scientifiques fourniraient plus que des resultats evaluatifs, mais un savoir. 
Autrement dit, « un ensemble de thèses et de questions a partir desquelles une activite peut ^etre conduite ou une information acquerir un sens en generant, le cas echeant, de nouvelles thèses ou de nouvelles questions » (Hatchuel et Weil 1992 : 16). Une telle approche permettrait de depasser l’utilisation classique des evaluations de type normatif ou instrumental ayant pour objectif d’informer et d’orienter les prises de decisions organisationnelles et politiques, contraindre et modifier les actions des fournisseurs de soins afin qu’ils se conforment aux normes et aux cibles (Weiss 1998; Henry et Mark 2003). En effet, ce type d’utilisation est insuffisant pour assurer une amelioration continue de la performance. Il est necessaire d’y adjoindre d’autres types d’utilisation et de nouvelles formes de jugement. Pour repondre a ces enjeux, trois formes de jugement devraient ^etre articulees : un jugement normatif, un jugement configurationnel et un jugement deliberatif Pour repondre a ces enjeux, trois formes de jugement devraient ^etre articulees : un jugement normatif, un jugement configurationnel et un jugement deliberatif. Le jugement normatif permet d’apprecier le niveau de performance en fonction d’une norme. C’est une première etape necessaire a la construction des jugements configurationnels et deliberatifs. En effet, le jugement configurationnel base sur les modèles logiques permettrait de mettre en relation et d’analyser plusieurs indicateurs a travers une configuration de dimension. Ce jugement pourrait generer des leviers d’action a partir des relations de causes a effets entre les dimensions. Cette utilisation de type conceptuel accro^ıtrait la connaissance des utilisateurs et la comprehension des phenomènes (Champagne et Contandriopoulos 2011a). Sur cette base, le jugement configurationnel mènerait a une modification des schemas cognitifs des utilisateurs, l’acquisition de nouvelles competences et d’habiletes et finalement un changement de comportement (Henry et Mark 2003; Weiss 1998). Cependant, les jugements normatifs et configurationnels ne sont pas suffisants pour garantir l’utilisation des resultats de l’evaluation et l’appropriation des connaissances generees par les modèles logiques. Il est necessaire d’y adjoindre un jugement deliberatif, soit un processus permettant a un groupe de recevoir et d’echanger des informations, d’examiner de façon critique un problème et de creer une comprehension rationnelle soutenant la prise de decision (Abelson et coll. 2003). En effet, 52 GEORGES-CHARLES THIEBAUT, ET COLL. les decisions manageriales sont rarement prises individuellement. Elles impliquent toujours d’autres acteurs et donc une negociation (Walshe et Rundall 2001). Cet aspect renvoie au processus politique inherent a toutes prises de decision au sein d’une organisation. Ainsi, le jugement deliberatif devrait generer une comprehension intersubjective permettant « d’accorder mutuellement les plans d’action sur le fondement de definitions communes des situations. » (Habermas 1987 : 295). Dès lors, ce n’est plus ni les resultats de l’evaluation ni la connaissance produite qui ont de la valeur, mais l’interpretation collective faite par les utilisateurs qui mène a l’action (Denis, Lehoux et Champagne 2004). 
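À titre d'illustration seulement, et sans prétendre reproduire la démarche des auteurs, l'esquisse suivante montre comment un modèle logique pourrait être représenté par un réseau de relations entre dimensions, chaque dimension portant un jugement normatif à partir duquel une lecture configurationnelle repère des leviers d'action possibles. Les dimensions, les valeurs et les seuils y sont entièrement hypothétiques.

```python
# Esquisse hypothétique : un modèle logique minimal reliant des dimensions
# de la performance, avec un jugement normatif par dimension.
from dataclasses import dataclass

@dataclass
class Dimension:
    nom: str
    valeur: float   # résultat observé (indicateur agrégé, fictif)
    norme: float    # seuil normatif (fictif)

    def jugement_normatif(self) -> str:
        return "atteint" if self.valeur >= self.norme else "non atteint"

# Relations cause-effet hypothétiques : déterminant -> dimensions influencées
modele_logique = {
    "qualité de vie au travail": ["sécurité des soins"],
    "disponibilité des ressources": ["accessibilité"],
    "sécurité des soins": ["état de santé"],
    "accessibilité": ["état de santé"],
}

dimensions = {
    "qualité de vie au travail": Dimension("qualité de vie au travail", 0.55, 0.70),
    "disponibilité des ressources": Dimension("disponibilité des ressources", 0.80, 0.75),
    "sécurité des soins": Dimension("sécurité des soins", 0.62, 0.90),
    "accessibilité": Dimension("accessibilité", 0.88, 0.85),
    "état de santé": Dimension("état de santé", 0.70, 0.80),
}

# Lecture configurationnelle : pour chaque dimension sous la norme,
# repérer les déterminants eux-mêmes sous la norme (leviers d'action possibles).
for nom, dim in dimensions.items():
    if dim.jugement_normatif() == "non atteint":
        determinants = [d for d, effets in modele_logique.items() if nom in effets]
        leviers = [d for d in determinants
                   if dimensions[d].jugement_normatif() == "non atteint"]
        print(f"{nom} : non atteint; leviers possibles : {leviers or 'aucun identifié'}")
```

Le jugement délibératif décrit plus haut interviendrait ensuite : les acteurs interpréteraient collectivement de telles configurations avant de convenir des actions à entreprendre.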
valuation peuvent Mythe 6 : Les cadres d’e ment permettre de poursuivre simultane et l’ame lioration deux objectifs : l’imputabilite continue de la performance Les cadres, aussi bien nationaux qu’internationaux, affirment poursuivre simultanement ces deux objectifs dans leur utilisation. De m^eme, de nombreux chercheurs pensent que ces deux objectifs peuvent ^etre concilies dans un m^eme cadre d’evaluation et obtenus par les m^emes donnees (Smith et coll. 2009; Murray et Frenk 2000). La plupart des cadres d’evaluation de la performance ont pour objectif principal l’accroissement de l’imputabilite qui devrait entra^ıner une amelioration de la performance. Les cadres et les resultats de l’evaluation sont ainsi utilises comme outil de gouvernance pour encadrer et surveiller les dispensateurs de soins (Dubnick 2005). Smith et coll. (2009 : 5) affirment d’ailleurs que « peu importe le design du système de sante, le r^ ole fondamental de la mesure de la performance est d’aider les differents acteurs a rendre des comptes en permettant aux parties prenantes de prendre des decisions eclairees ». L’accroissement de l’imputabilite devrait mener de façon indirecte a l’amelioration de la performance par des incitatifs imposes aux dispensateurs de soins afin de se conformer aux normes et aux standards. Cependant, les liens etablis entre les mecanismes d’imputabilite et l’amelioration de la performance n’ont pas ete suffisamment demontres (Dubnick 2005). Plusieurs auteurs affirment que l’utilisation excessive de mesures de la performance orientees vers l’imputabilite et concentrees uniquement vers les resultats pouvait avoir des impacts negatifs sur l’organisation et des effets indesirables sur le comportement des gestionnaires (Smith 1995; Lied et Kazandjian 1999). Ainsi, il semble improbable d’utiliser les m^emes resultats de l’evaluation et les m^emes grilles de jugement pour poursuivre simultanement ces deux objectifs. Des differences majeures en termes d’objectifs, 53 LES ENJEUX DE L’EVALUATION DE LA PERFORMANCE Tableau 3. Differences entre imputabilite et amelioration dans la construction et l’utilisation des resultats de l’evaluation de la performance (base sur Freeman 2002) IMPUTABILITE AMELIORATION Emphase Verification et assurance Orientee vers la mesure Objectif Contr^ ole et conformite Logique Fournir une imputabilite externe et assurer la legitimite Imposee, hierarchique ou institutionnelle Apprentissage. Promotion de l’amelioration continue. Orientee vers le changement Amelioration, renforcement et construction des capacites individuelles et organisationnelles, Comprehension des processus Promouvoir le changement et ameliorer la qualite de soins Participative (soit dans l’elaboration des outils, soit dans leur utilisation) Plus faible precision, Besoin de grille d’interpretation des donnees Demarche Precision de la mesure Type de mesure Haute precision. Utilisation de statistiques pour identifier les differences reelles Efficacite, volume de production, accessibilite Auditoires Gouvernements Public Assureurs Type d’utilisation Sommative Instrumentale Multidimensionnelle (structure-processusresultat) Professionnels Administrateurs Equipe en charge de l’amelioration continue Formative Instrumentale Conceptuelle Deliberative de logiques, de demarches, de mesures, d’auditoires et de types d’utilisation opposent ces deux finalites d’utilisation (Tableau 3). 
Comme le montre ce tableau, ces deux finalites diffèrent principalement au niveau de la logique et des auditoires auxquels elles s’adressent. En effet, l’imputabilite s’appuie sur la theorie de l’agence dont le but est de surveiller et encadrer le comportement de l’agent (Dubnick 2005). Ce type d’utilisation des resultats de l’evaluation est principalement destine 54 GEORGES-CHARLES THIEBAUT, ET COLL. aux institutions regulatrices au niveau regional et ministeriel. Ainsi, les mesures de performance servent a juger de la conformite des pratiques et de l’atteinte des resultats par les dispensateurs de soins. En outre, l’imputabilite est fondee sur des standards techniques exterieurs et sur une logique de reddition de compte envers le superieur, le ministère et la population (Halachmi 2002). Cette logique, fondee sur la rationalite instrumentale visant une utilisation sommative, pourrait ^etre resumee par la question : est-ce que les choses ont ete bien faites au regard des standards et des cibles a atteindre? A contrario, l’utilisation des resultats de l’evaluation dans une perspective d’amelioration continue de la performance tend a renforcer les capacites d’analyse des acteurs, leur niveau de comprehension des phenomènes permettant de changer de l’interieur (Halachmi 2002). Dans ce sens, ce n’est pas un principal qui contraint ou encadre un agent, mais des gestionnaires et des professionnels qui s’interrogent sur leurs pratiques et le fonctionnement de l’organisation pour l’ameliorer. L’utilisation est donc formative et s’interesse a la comprehension des phenomènes symbolisee par les questions : est-ce que c’etait la bonne chose a faire? et comment l’avons-nous faite? Dès lors, la performance organisationnelle est abordee comme un construit multidimensionnel qui devrait permettre aux differentes parties prenantes de debattre et d’elaborer un jugement sur les qualites essentielles et specifiques de l’organisation en fonction de leurs croyances, connaissances, responsabilites, inter^ets et projets Ainsi, nous pensons qu’il est difficile et inapproprie d’utiliser les m^emes resultats d’evaluation de la performance et les m^emes formes de jugement pour poursuivre les objectifs d’accroissement de l’imputabilite et d’amelioration de la performance. En effet, nous avançons que le jugement normatif correspondrait davantage a une recherche d’imputabilite, alors que l’utilisation des resultats pour porter des jugements configurationnels et deliberatifs serait plus appropriee pour ameliorer la performance. Par contre, un m^eme cadre d’evaluation pourrait ^etre decline en differents systèmes d’evaluation visant specifiquement l’accroissement de l’imputabilite ou l’amelioration continue de la performance. Un m^eme cadre d’evaluation pourrait donc ^etre compose de differents systèmes d’appreciation de la performance qui seraient utilises par des acteurs differents et ne contiendraient ni exactement les m^emes indicateurs, ni les m^emes grilles d’interpretation des resultats. LES ENJEUX DE L’EVALUATION DE LA PERFORMANCE 55 passer les mythes : Innover sur la De valuation globale et base du modèle d’e gre e de la performance des inte (EGIPSS) systèmes de sante Ces mythes referant a des enjeux theoriques, methodologiques et d’utilisation contraignent le developpement de cadre d’evaluation de la performance permettant reellement d’apporter un savoir adapte et utile aux decideurs et au gestionnaire afin d’ameliorer la performance des systèmes de sante. 
Dans l’optique de depasser plusieurs de ces mythes, François Champagne et coll. (2005) ont developpe, un modèle d’evaluation globale et integree de la performance des systèmes de sante (EGIPSS). Cependant, dix après son developpement et son operationnalisation dans de nombreux contextes dont les centres de sante et des services sociaux du Quebec ou les h^ opitaux de l’Etat du Mato Grosso du sud au Bresil, plusieurs ameliorations pourraient ^etre apportees a ce modèle afin d’accro^ıtre sa capacite a rendre compte de la complexite de la performance et favoriser l’utilisation des resultats qu’il genère. Nous avançons que la capacite du modèle EGIPSS a repondre adequatement aux six mythes que nous venons de decrire depend de l’introduction de modèles logiques representant les configurations de relations entre les dimensions les composant. Sur la base de ces modèles, il serait possible de developper des jugements configurationnels et deliberatifs promouvant la generation d’un savoir plus adapte pour favoriser l’utilisation des resultats et l’amelioration de la performance. Le modèle EGIPSS : une approche e sur la the orie de l’action fonde sociale Le modèle EGIPSS est ancre dans la theorie de l’action sociale developpee par Talcott Parsons (1951a et 1951b) dans laquelle les systèmes d’action sociaux sont apprehendes selon leur capacite a remplir simultanement plusieurs fonctions essentielles a leur survie. Inspire de cette theorie, le modèle EGIPSS est constitue de quatre fonctions en interaction : l’adaptation, le maintien des valeurs, la production et l’atteinte des buts permettant d’evaluer la performance d’un système d’action (Figure 1). Chacune de ces fonctions est declinee en plusieurs dimensions qui les composent. – L’adaptation : Tout système de sante ou organisation doit tenir compte de son environnement pour acquerir des ressources et s’adapter. Le système ou l’organisation de sante doit, a court terme, se procurer les ressources necessaires au maintien et au developpement de ses activites, ^etre oriente vers les besoins de la 56 GEORGES-CHARLES THIEBAUT, ET COLL. population, attirer les clientèles et avoir des habiletes a mobiliser la plus long terme, l’organisation de sante doit communaute. A developper son habilete a se transformer afin de s’adapter aux changements technologiques, populationnels, politiques et sociaux. – L’atteinte des buts : Cette fonction est liee a la capacite du système ou de l’organisation a atteindre ses buts fondamentaux. Pour une organisation publique de sante, il peut s’agir de l’amelioration de l’etat de sante des individus et de la population, de l’efficacite, de l’efficience, de l’equite et de la satisfaction des groupes d’inter^et. – La production : La fonction de production est le noyau technique de l’organisation. Traditionnellement, c’est a ce niveau qu’on retrouve la majorite des indicateurs qui sont generalement utilises pour mesurer la performance des etablissements de sante. – Le maintien des valeurs : Cette fonction symbolise la capacite d’une organisation ou d’un système a maintenir et promouvoir un système de valeurs partagees ainsi que la qualite de vie au travail des employes permettant d’assurer un niveau de motivation necessaire pour une action individuelle et collective performante. Elle est composee de deux dimensions : le consensus sur les valeurs et la qualite de vie au travail. 
Etant interdependantes, ces quatre fonctions et les dimensions qui les composent sont reliees par l’intermediaire d’equilibre au nombre de six (Figure 1) : – L’equilibre strategique (Adaptation-Atteinte des Buts) : Cette dimension de la performance evalue la compatibilite de la mise en œuvre des moyens (adaptation) en fonction des finalites organisationnelles (les buts), ainsi que la pertinence des buts etant donne l’environnement et la recherche d’une plus grande adaptation organisationnelle. – L’equilibre allocatif (adaptation-production) : Cette dimension de la performance evalue la justesse d’allocation des moyens (l’adaptation), et comment les mecanismes d’adaptation demeurent compatibles avec les imperatifs et les resultats de la production. – L’equilibre tactique (atteinte des buts-production) : Cette dimension de la performance evalue la capacite des mecanismes de contr^ ole decoulant du choix des buts organisationnels a gouverner le système de production; et comment les imperatifs et les resultats de la production viennent modifier le choix des buts de l’organisation. On s’interroge alors sur la pertinence des buts. – L’equilibre operationnel (maintien des valeurs-production) : Cette dimension de la performance evalue la capacite des mecanismes de LES ENJEUX DE L’EVALUATION DE LA PERFORMANCE 57 Figure 1. Le modèle d’evaluation globale et integree de la performance des systèmes de sante (Champagne et coll. 2005) generation de valeurs et du climat organisationnel a mobiliser positivement ou (negativement) le système de production, ainsi que l’impact des imperatifs et des resultats de la production et du climat et des valeurs organisationnelles. – L’equilibre legitimatif (maintien des valeurs-atteinte des buts) : Cette dimension de la performance evalue la capacite des mecanismes de generation des valeurs et du climat organisationnel a contribuer a l’atteinte des buts organisationnels, et comment le choix et la poursuite des buts de l’organisation viennent modifier et renforcer (ou miner) les valeurs et le climat organisationnel. – L’equilibre contextuel (maintien des valeurs-adaptation) : Cette dimension de la performance evalue la capacite des mecanismes de generation des valeurs et du climat organisationnel a mobiliser positivement le système d’adaptation, et comment les imperatifs et les resultats de l’adaptation viennent modifier et renforcer (ou miner) les valeurs et le climat organisationnel. Dès lors, la performance organisationnelle est abordee comme un construit multidimensionnel qui devrait permettre aux differentes parties prenantes de debattre et d’elaborer un jugement sur les qualites essentielles et specifiques de l’organisation en fonction de leurs croyances, connaissances, responsabilites, inter^ets et projets (Champagne et Contandriopoulos 2011bb). La performance d’une organisation se manifeste par sa capacite : 58 GEORGES-CHARLES THIEBAUT, ET COLL. realiser chacune de ses quatre fonctions, atteindre ses buts, s’adapter 1. A a son environnement (acquerir des ressources et repondre aux besoins), a produire des services de qualite avec productivite et a maintenir et developper des valeurs communes (culture organisationnelle) et; etablir et a maintenir une tension dynamique entre la realisation de 2. a ces quatre fonctions. 
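Pour fixer les idées, la structure qui vient d'être décrite peut se représenter comme quatre fonctions reliées deux à deux par six équilibres. L'esquisse suivante est purement illustrative et ne provient pas des auteurs; les listes de dimensions y sont abrégées.

```python
# Esquisse hypothétique de la structure du modèle EGIPSS :
# quatre fonctions et les six équilibres qui les relient deux à deux.
from itertools import combinations

fonctions = {
    "adaptation": ["acquisition des ressources", "orientation vers les besoins",
                   "mobilisation de la communauté", "capacité de transformation"],
    "atteinte des buts": ["amélioration de l'état de santé", "efficacité",
                          "efficience", "équité", "satisfaction des groupes d'intérêt"],
    "production": ["qualité des services", "productivité"],
    "maintien des valeurs": ["consensus sur les valeurs", "qualité de vie au travail"],
}

equilibres = {
    frozenset({"adaptation", "atteinte des buts"}): "équilibre stratégique",
    frozenset({"adaptation", "production"}): "équilibre allocatif",
    frozenset({"atteinte des buts", "production"}): "équilibre tactique",
    frozenset({"maintien des valeurs", "production"}): "équilibre opérationnel",
    frozenset({"maintien des valeurs", "atteinte des buts"}): "équilibre légitimatif",
    frozenset({"maintien des valeurs", "adaptation"}): "équilibre contextuel",
}

# Chaque paire de fonctions est couverte par exactement un équilibre (6 paires).
for paire in combinations(fonctions, 2):
    print(f"{equilibres[frozenset(paire)]} : {paire[0]} <-> {paire[1]}")
```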
L’appreciation de la performance repose ainsi, non seulement, sur la mesure d’indicateurs de reussite dans chacune des quatre fonctions de l’organisation, mais aussi sur le caractère dynamique de la tension qui existe entre les quatre p^ oles, c’est-a-dire sur la capacite de la gouvernance a orchestrer les echanges et les negociations entre les quatre fonctions par les differents acteurs. Ainsi, on constate que le modèle EGIPSS (Champagne et coll. 2005) repond adequatement a plusieurs mythes. Evaluer la performance de façon multidimensionnelle et selon quatre fonctions en tension permet de depasser le premier mythe et partiellement, le second. En effet, au-dela des tensions entre les fonctions, l’evaluation de la performance necessite une meilleure comprehension et modelisation sous forme de configuration des relations entre toutes les dimensions, autant a l’interieur de chaque fonction qu’entre les dimensions composant les autres fonctions. D’un point de vue methodologique, la robustesse et la finesse theorique (25 dimensions refletant des construits theoriques très circonscrits) du modèle EGIPSS facilitent la selection d’indicateurs pertinents et utilisables, directement relies au concept evalue. Par contre, certaines innovations devraient ^etre apportees pour affiner l’approche configurationnelle et le developpement de grille d’analyse pour l’interpretation et l’appreciation des resultats de l’evaluation. Ceci est le corollaire du cinquième mythe relatif a l’utilisation des donnees de l’evaluation de la performance. Le modèle EGIPSS se base principalement sur des jugements normatifs et des analyses relationnelles entre certaines des dimensions et les fonctions, bien que de façon non systematique. Finalement, ce modèle permet de repondre au sixième mythe puisqu’il peut ^etre decline en deux systèmes d’evaluation adaptes aux objectifs d’accroissement de l’imputabilite et d’amelioration continue de la performance. Ainsi, il apparait que certaines innovations pourraient ^etre apportees au modèle EGIPSS afin de complètement depasser le second, le quatrième et le cinquième mythe. Innover sur la base du modèle EGIPSS Ces innovations theoriques, methodologiques et d’utilisation sont profondement interreliees et touchent a la capacite de conceptualiser l’ensemble des interrelations entre les dimensions de la performance LES ENJEUX DE L’EVALUATION DE LA PERFORMANCE 59 pour proposer des configurations refletant la complexite des systèmes, afin de generer des jugements de type configurationnel et deliberatif. Pour ce faire, il est necessaire de developper des modèles logiques afin de representer les reseaux de relation relatifs a chaque dimension du modèle EGIPSS. Ces modèles logiques synthetiseraient, combineraient et organiseraient un ensemble de dimensions sur la base d’une representation graphique de leurs relations. Ces modèles, pour ^etre theoriquement valides, doivent ^etre justifies par la convergence d’etudes scientifiques demontrant a la fois leur influence sur la performance, ainsi que la nature des relations entre chacune des dimensions. Ces modèles logiques seraient la base conceptuelle permettant de porter un jugement configurationnel. Ce jugement emergerait de la combinaison de l’ensemble des jugements normatifs relatifs a chaque dimension et regroupes a l’interieur des modèles logiques. 
L’utilisation d’un jugement configurationnel faciliterait une conception de la performance non plus comme un phenomène lineaire, segmente o u chaque dimension serait traitee separement, mais comme reseau de relations entre des dimensions en interdependance Ce jugement configurationnel serait complete par un jugement deliberatif, issu d’un processus structure d’echanges permettant la confrontation des points de vue sur la base des modèles logiques et des jugements configurationnels. En effet, les seuls jugements normatifs et configurationnels ne paraissent pas suffisamment pour generer une interpretation commune a la fois des situations problematiques et des actions a entreprendre. Dans ce cadre, il faut apprehender le jugement deliberatif comme un processus de diagnostic et de resolution de problèmes permettant a des individus dont les valeurs et les inter^ets sont differents, d’ecouter, de comprendre, de se convaincre potentiellement, et finalement de parvenir a des decisions raisonnees et justifiees sur la base des resultats evaluatifs inseres dans des modèles logiques permettant leur appreciation (Abelson et coll. 2003). Le but ultime est l’emergence d’une comprehension commune et de solutions innovantes permettant l’amelioration de la performance. Les jugements configurationnels et deliberatifs pourraient s’appliquer autant au niveau macro (le système de sante), meso (les organisations de soins), que micro (les services et programmes), mais aussi aux differents paliers de gouvernance (strategique, tactique et operationnel). Les innovations que nous proposons d’adjoindre aux modèles EGIPSS permettraient de depasser les mythes que nous venons de decrire. En 60 GEORGES-CHARLES THIEBAUT, ET COLL. outre, l’ajout de modèles logiques supportant le developpement de jugements configurationnels et deliberatifs aurait plusieurs retombees theoriques et pratiques pour la gouvernance des organisations et du système de sante. Premièrement, l’utilisation d’un jugement configurationnel faciliterait une conception de la performance non plus comme un phenomène lineaire, segmente o u chaque dimension serait traitee separement, mais comme reseau de relations entre des dimensions en interdependance. Deuxièmement, le jugement deliberatif semble egalement avoir le potentiel d’ameliorer la capacite des acteurs a construire une vision partagee des problèmes et des solutions a y apporter afin de soutenir l’action. Ainsi, ces deux formes de jugements genereraient plus que des donnees informant les decideurs sur le niveau de performance, mais un savoir construit par les acteurs eux-m^emes, a la fois sur les determinants de la performance et sur les leviers d’action possibles pour l’ameliorer. L’ajout de modèles logiques et de nouvelles formes de jugement semblent ^etre des strategies prometteuses pour ameliorer le modèle EGIPSS, pour favoriser l’utilisation des resultats des evaluations de la performance dans la gestion quotidienne et pour soutenir les processus de prise de decision. Le modèle EGIPSS serait de ce fait mieux adapte aux besoins des utilisateurs, ce qui accro^ıtrait le potentiel des organisations a mettre en place des actions susceptibles de rehausser la performance. Bibliographie Abelson, Julia, Pierre-Gerlier Forrest, John Eyles, Patricia Smith, Elisabeth Martin et FrançoisPierre Gauvin. 2003. “Deliberations about deliberative methods: issue in the design and evaluation of public participation processes.” Social Science & Medicine 57: 239–51. Adair, Carol. E., Elizabeth Simpson, Ann L. 
Casebeer, Judith. M. Birdsell, Katharine A. Hayden et Steven Lewis. 2006a. “Performance Measurement in Healthcare: Part I- Concepts and Trends from a State of the Science Review.” Healthcare Policy 1(4) May: 85–104 ——. 2006b. “Performance Measurement in Healthcare: Part II- State of science findings by stage of the performance measurement process.” Healthcare Policy 2 (1), July: 56–78. Angers, Pierre. 2010. « La formation du jugement. » Dans La Formation du jugement, edite par Michael Schleifer. Quebec : Presses de l’Universite du Quebec. Barnes, Marian, Elizabeth Matka et Helen Sullivan. 2003. “Evidence, Understanding and complexity. Evaluation in Non-linear systems.” Evaluation 9 (3): 265–84. Boland, Tony, et Alan Fowler. 2000. “A systems perspective of performance management in public sector organisations.” The International Journal of Public Sector Management 13 (5): 417–46. Brousselle, Astrid, François Champagne, Andre-Pierre Contandriopoulos et Zulmira Hartz. 2011. L’evaluation : concepts et methodes. Montreal : Presses de l’Universite de Montreal. Butler, Michelle. 2000. Performance Measurement in Health Sector, Ireland public service development agency : Committee for Public Management Research, Discussion paper 14. Champagne, François. 2003. Defining a model of hospital performance for European hospitals. Barcelone. Rapport de recherche, Barcelone: WHO regional office for Europe. LES ENJEUX DE L’EVALUATION DE LA PERFORMANCE 61 ements d’architecture des sysChampagne, François et Andre-Pierre Contandriopoulos. 2005. El tèmes d’evaluation de la performance. Presentation a la conference de la COLUFRAS, Montreal. ——. 2011a. Utiliser l’evaluation, dans L’evaluation : Concepts et methodes, edite par Astrid Brousselle, François Champagne, Andre-Pierre Contandriopoulos et Zulmira Hartz, Deuxième edition. Montreal : Les Presses de l’Universite de Montreal. ——. 2011b. Development of a conceptual framework for health system performance assessment at CIHI. Rapport de recherche. Universite de Montreal. Champagne, François et Andre-Pierre Contandriopoulos, Julie Picot-Touche, François Beland et Hung Nguyen. 2005. Un cadre d’Evaluation de la Performance des Systèmes de Sante: Le modèle EGIPSS. Rapport de recherche : Universite de Montreal. Contandriopoulos, Andre-Pierre et François Champagne. 2010. Evaluation globale et integree de la performance des systèmes de sante : presentation generale. Presentation auprès du secretariat a la Sante du Mato Grosso du Sud, Bresil. Contandriopoulos, Andre-Pierre, Linda Rey, Astrid Brousselle et François Champagne. 2012. « Evaluer une intervention complexe : Enjeux conceptuels, methodologiques et operationnels. » The Canadian Journal of Program Evaluation 26 (3) : 1–16. Davies, Huw Talfryn Oakley. 1998. “Performance Management using health outcomes: In search of instrumentality.” Journal of Evaluation in Clinical Practice 4 (4): 359–62. Denis, Jean-Louis, Pascal Lehoux et François Champagne. 2004. “A knowledge utilization perspective on fine-tuning dissemination and contextualizing knowledge”, in Using knowledge and evidence in health care, edited by Louise Lemieux-Charles et François Champagne. Toronto: University of Toronto Press. Dubnick, Melvin. 2005. “Accountability and the promise of performance: In search of the mechanisms.” Public Performance and Management review 28 (3) March: 376–417. Eddy, David. 1998. “Performance Measurement: Problems and Solutions.” Health Affairs 17 (4): 7–25. Freeman, Tim. 2002. 
“Using performance indicators to improve health care quality in the public sector: A review of the literature.” Health Services Management Research 15: 126–37. Friedberg, Mark W., Cheryl L. Damberg, Elizabeth McGlynn et John L. Adams. 2011. Methodological considerations in generating provider scores for use in public reporting for community quality collaborative. White paper, Agency for Healthcare Research and Quality, N 11– 0093. Habermas, J€ urgen. 1987. L’agir communicationnel Tome 2. Paris : Librairie Arthème Fayard. Halachmi, Arie. 2002. “Performance measurement, accountability, and improved performance.” Public performance & Management review 25 (4), June: 370–74. Hatchuel, Armand et Beno^ıt Weil. 1992. L’expert et le système. Paris : Economica. Henry, Gary T. et Melvin M. Mark. 2003. “Beyond Use: Understanding Evaluation’s Influence and Attitudes and Actions.” American Journal of Evaluation 24 (3): 293–314. Hurst, Jeremy et Melissa Jee-Hughes. 2001. Performance measurement and performance management in OECD health system. OECD Labour market and social policy occasional papers 47. Lied, Terry R. et Vahe A. Kazandjian. 1999. “Performance: A multi-disciplinary and conceptual model.” Journal of Evaluation in Clinical Practice 5 (4): 393–400. Lorino, Philippe. 1997. Methodes et pratiques de la performance. Paris : Editions des organisations. McDaniel Jr., Reuben. 2007. “Management Strategies for Complex Adaptative Systems. Sensemaking, Learning and Improvisation.” Performance improvement quarterly 20 (2): 21–42. Meyer, Allan D., Anne S. Tsui et Christopher R. Hinings. 1993. “Configurational approaches to organizational analysis.” Academy of Management Journal 36 (6): 1175–95. Murray, Christopher et David Evans. 2003. “Health Systems Performance Assessment: Goals, Framework and Overview”, In Health system performance assessment, Debates, 62 GEORGES-CHARLES THIEBAUT, ET COLL. Methods and Empiricism edited by Christopher Murray C. et David Evans. Geneva: World Health Organization. Murray, Christopher et Julio Frenk. 2000. “A framework for assessing the performance of the health systems.” Bulletin of the World Health Organization, 78 (6): 717–31. NHPA. (National Health Performance Authority). 2011. National Health Reform, Performance and Accountability Framework. NHS. Department of Health. 2011. The operating framework for the NHS in England, 2011/2013. ——. 2012. The NHS Outcomes framework 2012/2013. Norton, David et Robert Kaplan. 1996. The Balanced Scorecard, Translating strategy into action. Boston: Harvard Business School Press. Parsons, Talcott. 1951a. The Social System. London: Routledge. ——. 1951b. Toward a general Theory of Action. Cambridge: Harvard University Press. Patton, Michael Quinn. 2011. Developmental Evaluation: Applying Complexity Concepts to Enhance Innovation and Use. New York: The Guilford Press. Pfeffer, Jeffrey et Robert. I. Sutton. 2006. “Evidence-based management.”, Harvard Business Review, January. Rousseau, Denise. M. 2006. “Is there such a thing as “evidence-based management?” Academy of Management Review 31(2): 256–69. Sicotte, Claude, François Champagne, Andre-Pierre Contandriopoulos, Jan Barnsley, François Beland, Jean-Louis Denis, Ann Langley, Marc Bremond et Ross. G. Baker. 1998. “A conceptual framework analysis of health care organizations performance.” Health services management research 11(1). Smith, Peter. C., Elias Mossalios, Irene Papanicolas et Sheila Leatherman. 2009. 
Performance measurement for health system improvement: Experience, challenges and prospects. Cambridge: Cambridge University Press. Smith, Peter C. 1995. “The unintended consequences of publishing performance data in the public sector.” International Journal of Public Administration 18 (2): 277–310. Wallace, Jean E., Jane B. Lemaire et William A. Ghali. 2009. “Physician Wellness: A Missing Quality Indicator.” Lancet 374 (9702), novembre : 1714–21. Walshe, Kieran et Thomas G. Rundall. 2001. “Evidence-based Management: From theory to practice in health care.” The Milbank Quarterly 79 (3): 429–57. Weiss, Carol H. 1998. “Have we learned anything new about the use of evaluation?” American Journal of Evaluation 19 (1): 21–33.

Olivier Sossa
Isabelle Ganache

L’appréciation de la performance du système de santé et des services sociaux du Québec : l’approche du Commissaire à la santé et au bien-être

Sommaire : Le Québec s’est doté d’un Commissaire à la santé et au bien-être pour apporter un éclairage pertinent au débat public et à la prise de décision gouvernementale dans le but de contribuer à l’amélioration de l’état de santé et de bien-être des citoyens. Au fil des années, le Commissaire a développé une expertise dans le domaine de l’évaluation de la performance fondée sur le croisement de différentes connaissances et sources d’information, et qui intègre l’éthique et la participation citoyenne. Dans cet article, nous présentons la méthode du Commissaire, ses résultats, ainsi que les enjeux et les défis qui y sont associés.

Abstract: Quebec established a Health and Welfare Commissioner to shed a relevant light on the public debate and government decision-making process to help improve citizens’ health and welfare. Over the years, the Commissioner has developed an expertise in the performance measurement field based on the cross-section of various knowledge and information sources, and that integrates ethics and citizen engagement. In this article, we present the Commissioner’s approach, its results, and related issues and challenges.

Le développement des technologies et des connaissances sur les déterminants de la santé, l’évolution démographique, l’apparition de nouvelles maladies, les contraintes budgétaires, etc., complexifie l’environnement dans lequel évoluent les systèmes de santé, exerçant ainsi des pressions considérables sur ces derniers. Le fait de porter un regard externe et indépendant sur la performance des systèmes de santé permet de proposer des pistes de solutions pour l’adaptation du système et l’amélioration de sa performance (Québec 2005). Au Québec, le Commissaire à la santé et au bien-être, le « CSBE », remplit ce mandat.

Olivier Sossa, PhD en Santé publique (Organisation des soins), et Isabelle Ganache sont tous deux professionnels de recherche au bureau du Commissaire à la santé et au bien-être. Isabelle Ganache, PhD en Bioéthique, est aussi professeure adjointe de clinique, Programmes de bioéthique au Département de médecine sociale et préventive à la Faculté de médecine de l’Université de Montréal. Les auteurs tiennent à remercier tout le personnel du bureau du Commissaire.

CANADIAN PUBLIC ADMINISTRATION / ADMINISTRATION PUBLIQUE DU CANADA VOLUME 58, NO. 1 (MARCH/MARS 2015), PP.
63–88 © The Institute of Public Administration of Canada/L’Institut d’administration publique du Canada 2015

En créant le CSBE, le gouvernement québécois désirait accroître l’imputabilité et la reddition de comptes vis-à-vis de la population en ce qui concerne les résultats atteints par le système de santé et de services sociaux québécois (SSSS). Cette préoccupation émanait notamment des discussions entourant l’Accord 2003 sur le renouvellement des soins de santé, pendant lequel les premiers ministres du Canada et des provinces ont prévu la création du Conseil canadien de la santé. Le Québec a toutefois décidé de ne pas adhérer au Conseil en tant que membre; il a plutôt prévu de créer sa propre entité (CSBE 2013a). La mission du CSBE est d’apporter un éclairage pertinent au débat public et à la prise de décision gouvernementale dans le but de contribuer à l’amélioration de l’état de santé et de bien-être des Québécois (Québec 2005). À cet effet, le CSBE apprécie les résultats atteints par le système; consulte les citoyens, les experts et les acteurs du système; informe le ministre, l’Assemblée nationale et la population des résultats obtenus; fait des recommandations en vue d’accroître sa performance, et évalue les enjeux et les implications de celles-ci. Le CSBE produit annuellement un rapport d’appréciation qui est déposé à l’Assemblée nationale, et largement diffusé pour informer et éclairer la prise de décisions relative à l’amélioration du système. Dans cet article, après avoir sommairement présenté le contexte dans lequel s’inscrit le CSBE, nous aborderons les éléments distinctifs de sa démarche et ses principaux résultats. Nous illustrerons, à partir d’exemples, la contribution de ses travaux à la gouvernance du SSSS et aborderons quelques enjeux et défis liés à l’évaluation de sa performance. Les auteurs œuvrent tous deux au bureau du CSBE, l’une ayant une formation en bioéthique et l’autre en organisation des soins.

Méthodes utilisées par le CSBE et ses particularités

Hormis le CSBE, quelques organismes gouvernementaux jouent un rôle à différents niveaux en ce qui concerne l’évaluation de la performance du SSSS québécois, même si le mandat premier de ces organismes n’est pas l’appréciation globale de la performance. Il s’agit principalement du ministère de la Santé et des Services sociaux (MSSS), de l’Institut national de santé publique du Québec (INSPQ) et de l’Institut national d’excellence en santé et en services sociaux (INESSS). D’autres entités telles que le Vérificateur général du Québec (VGQ) et le Protecteur du citoyen, de par la nature de leurs travaux, apportent aussi un éclairage sur la performance du système de santé. En plus de ces acteurs gouvernementaux, mentionnons l’Association québécoise des établissements de santé et de services sociaux (AQESSS), qui, depuis 2007, apprécie la performance de ses établissements membres. Mentionnons aussi le Conseil québécois d’agrément (CQA), qui examine périodiquement la capacité des établissements de santé à satisfaire les besoins et les attentes des usagers. Il en est de même pour Agrément Canada, qui fournit aux établissements de santé un processus d’examen externe dans le but d’évaluer et d’améliorer les soins et services offerts aux patients.
Si plusieurs organismes ont des mandats qui leur permettent de s’intéresser à la performance du SSSS au Québec, le CSBE se distingue d’une part par ses origines, soit une volonté politique clairement exprimée par l’Assemblée nationale, reconnue dans le projet de loi 38 portant sur la création de l’organisme et définissant ses obligations; et d’autre part, par sa démarche d’appréciation de la performance, dont les principaux éléments seront présentés dans cet article.

La diversité des approches et méthodes d’appréciation de la performance

La complexité du SSSS requiert de disposer de plusieurs approches méthodologiques qui permettent d’en cerner l’étendue et la portée, d’apporter un éclairage sur sa performance et de proposer des actions ciblées pour l’améliorer. Dans ses travaux, le CSBE met l’accent sur la triangulation (combinaison d’approches théoriques, des données et des méthodes), l’éthique et la participation citoyenne. La décision politique ne peut se fonder uniquement sur des données probantes, mais aussi sur des données contextuelles qui informent sur la pertinence des actions à poser (Lavis 2006; Denis, Lehoux et Champagne 2004; Lomas et coll. 2005). Dans cette perspective, les travaux du CSBE reposent sur les notions de pertinence et d’utilité (les travaux doivent servir à améliorer le système), de rigueur (l’analyse doit être construite sur des faits justes et basée sur une observation rigoureuse) et de triangulation (prise en compte de différentes sources d’information). La triangulation, dans la démarche du CSBE, consiste à combiner différentes approches dans chacune des étapes de sa démarche. Elle vise à augmenter la validité et la qualité des résultats afin d’aboutir à une lecture juste du niveau de performance atteint par le système de santé, et de proposer des recommandations pertinentes susceptibles de l’améliorer (Bekker 2007; Levesque et Cleret de Langavant 2010).

Figure 1. Trois sources de connaissances pour juger de la performance

La triangulation des connaissances

Les travaux du CSBE s’inspirent du champ théorique concevant la démonstration ou la preuve comme étant le produit de différentes sources de connaissances (Klein 2003). Ainsi, ils reposent sur trois types de connaissances interdépendantes : scientifiques, organisationnelles et citoyennes (Figure 1). Une recension des écrits sur les questions à aborder permet de faire le point sur l’état des connaissances scientifiques, mais également sur la réponse du SSSS aux besoins en matière de soins et services sur cette question. Ces connaissances reposent sur des données de sources variées, notamment des indicateurs de performance validés permettant des comparaisons à l’échelle régionale, provinciale et internationale. Ces connaissances sont enrichies par une consultation prenant la forme d’un séminaire de deux jours réunissant une vingtaine d’experts provenant des milieux de recherche, d’enseignement universitaire et de cliniques. Ce séminaire permet de documenter l’état de situation des soins et des services dans un domaine spécifique, de déterminer les principaux enjeux et défis concernant ces soins et services, en plus de cibler les pratiques exemplaires qui devraient être mises en application au Québec pour en améliorer la prestation.
Les connaissances organisationnelles sont issues de consultations auprès d’une vingtaine de professionnels, d’administrateurs et de decideurs provenant de divers horizons – dont des milieux de pratique – reunis pendant deux jours. Cet exercice vise a identifier les facteurs qui influencent le fonctionnement et la performance du système et, a cibler les ameliorations pouvant ^etre apportees a l’organisation et a la gestion des soins et services. Ces connaissances permettent de tenir compte des realites du terrain et de degager les pratiques exemplaires ou innovatrices qui pourraient ^etre mises en œuvre dans le contexte quebecois, ainsi que la faisabilite des actions qui seront proposees. ET DES SERVICES SOCIAUX DU QUEBEC LA PERFORMANCE DU SYSTEME DE SANTE 67 Les connaissances citoyennes proviennent du Forum de consultation et d’autres formes de consultation. Unique en son genre au Quebec, ce Forum est une instance deliberative permanente formee de 27 membres, nommes pour trois ans, dont 18 proviennent de chacune des regions du Quebec et 9 autres possèdent une expertise particulière en sante et en services sociaux. Le mandat du Forum consiste a fournir sa perspective sur diverses questions que le CSBE lui soumet; ce dernier cherchant a identifier particulièrement les valeurs, les preoccupations citoyennes et a documenter l’acceptabilite sociale des changements proposes. Selon les sujets abordes et la nature de l’information souhaitee, le CSBE procède egalement a d’autres formes de consultation, telles que : temoignages sur son site web, appels de memoires, groupes de discussion ou consultations individuelles auprès de divers acteurs. Ces consultations permettent au CSBE de recueillir un savoir fonde sur l’experience, conserver une dimension emotionnelle tout en creant une interface entre les individus et les decideurs, dans le but d’eclairer la prise de decision sur les orientations du système de sante. Pour ^etre performant, le SSSS doit assumer quatre grandes fonctions : 1) s’adapter pour se donner les ressources et les structures organisationnelles necessaires, repondre aux besoins et aux attentes des citoyens; 2) produire des services de qualite en quantite adequate et en maintenant une bonne productivite; 3) maintenir et developper des valeurs et la qualite du milieu de travail; 4) atteindre ses buts, qui sont de reduire l’incidence, la duree et les effets negatifs des maladies et des problèmes sociaux La complexite du SSSS, ainsi que l’ampleur et l’articulation des differentes perspectives de sa performance, rendent necessaire l’utilisation de plusieurs types de triangulation pour temoigner des resultats. La necessite de s’appuyer sur une triangulation des sources de connaissances s’est imposee comme une evidence dès la conceptualisation des travaux ainsi que pendant l’analyse et la structuration des recommandations du CSBE, la mise en commun de ces dernières aidant a saisir les differents niveaux de comprehension selon differents groupes d’informateurs. Ainsi, la combinaison de differentes sources de donnees peut permettre une explication plus convaincante et plus solide lorsque les deux types de donnees conduisent a des resultats similaires (Hammond 2005), ou de faire emerger des contradictions ou des 68 OLIVIER SOSSA, ISABELLE GANACHE paradoxes non observables autrement (Teddlie et Tashakkori 2009). 
Dans le cas des travaux du CSBE, les chercheurs et experts apportent une comprehension de la thematique a l’etude qui provient de leurs travaux de recherche et ils enoncent une vision de ce qui devrait ^etre fait pour ameliorer la performance. Les gestionnaires et les decideurs, en se basant sur les connaissances du terrain et leur capacite en matière de prise de decision, apportent leur savoir pour eclairer sur la faisabilite de la mise en œuvre des visions evoquees par les experts. Quant aux citoyens, ils eclairent le Commissaire sur leurs valeurs et les enjeux ethiques rattaches aux visions et leur acceptabilite sociale. Le but recherche est la convergence ou la corroboration des constats sur un m^eme phenomène afin de renforcer les arguments en faveur d’une recommandation. Toutefois, il importe d’^etre sensibilise au fait que, dans un processus de consultation, peuvent transpara^ıtre des prejuges ou des positionnements influences par des inter^ets particuliers. Ainsi, il est necessaire de comprendre pleinement chaque source de connaissances et ce qu’elle represente. Le Commissaire n’est en aucun cas lie aux propos qu’il recueille dans le cadre de ces projets. L’interpretation et l’utilisation qu’il fait de ces donnees sont modulees par sa mission et surtout par l’analyse des implications des recommandations a emettre. La combinaison des theories gr^ace a un modèle integrateur Il existe plusieurs modèles d’analyse de la performance qui illustrent chacun une perspective de la performance organisationnelle, que ce soit l’atteinte des buts, le maintien de valeurs et normes, la capacite d’acquisition des ressources, la reponse aux besoins de la population, par exemple. (Sicotte 2007; Champagne et coll. 2005). Pour le Commissaire, l’appreciation de la performance du SSSS devrait depasser une vision parcellaire et integrer chacune de ces perspectives, c’est pourquoi il a construit son cadre d’analyse en s’inspirant du modèle EGIPSS (Evaluation globale et integree de la performance des systèmes de sante) (Champagne et coll. 2005). Ce modèle se fonde sur la premisse selon laquelle la performance des organisations de sante est paradoxale (Contandriopoulos et coll. 2012). Ainsi, les differents modèles d’analyse de la performance, m^eme s’ils peuvent appara^ıtre contradictoires, doivent ^etre consideres simultanement. Ce qui implique le besoin d’une approche holistique, qui permette de prendre en consideration les tensions entre des exigences contradictoires de chacune des perspectives d’analyse (Figure 2). Sur cette base, le CSBE adhère a l’idee que, pour ^etre performant, le SSSS doit assumer quatre grandes fonctions : 1) s’adapter pour se donner les ressources et les structures organisationnelles necessaires, repondre ET DES SERVICES SOCIAUX DU QUEBEC LA PERFORMANCE DU SYSTEME DE SANTE 69 Figure 2. Cadre d’analyse de la performance (Inspire de Champagne et coll., 2005) aux besoins et aux attentes des citoyens; 2) produire des services de qualite en quantite adequate tout en maintenant une bonne productivite; 3) maintenir et developper des valeurs et la qualite du milieu de travail; 4) atteindre ses buts, qui sont de reduire l’incidence, la duree et les effets negatifs des maladies et des problèmes sociaux. Dans ce cadre d’analyse, on reconna^ıt que les quatre fonctions sont liees, et l’examen de leurs interactions permet une appreciation plus fine, qui reflète mieux le fonctionnement du système. 
Ces interactions ou alignements sont d’ordre strategique (lien entre l’adaptation et l’atteinte des buts); allocatif (lien entre l’adaptation et la production); tactique (lien entre l’atteinte des buts et la production); operationnel (lien entre le maintien et developpement et la production); legitimatif (lien entre le maintien et developpement et l’atteinte des buts) ou contextuel (lien entre le maintien et developpement et l’adaptation). Le recours aux approches quantitative et qualitative Il importe de souligner, comme nous l’avons vu dans les sections precedentes, que les travaux du CSBE combinent un volet quantitatif et un volet qualitatif. Le volet quantitatif de l’appreciation de la performance repose sur des indicateurs. Les indicateurs proviennent de registres administratifs 70 OLIVIER SOSSA, ISABELLE GANACHE Figure 3. Fonctions, dimensions et sous-dimensions du cadre d’appreciation de la performance (MED-ECHO, RAMQ, rapports financiers, etc.), d’enqu^etes quebecoises (donnees de l’ISQ, INSPQ, etc.) d’enqu^etes canadiennes (Sondage national des medecins, donnees de l’ICIS, etc.) et internationales (Commonwealth Fund, OCDE, OMS, etc.). Les indicateurs sont regroupes en sous-dimensions. Les sousdimensions quant a elles sont regroupees en dimensions qui composent les fonctions du modèle (Figure 3). En dehors des critères generaux (validite, sensibilite, pertinence, stabilite), le choix des indicateurs, leur pertinence et leur categorisation dans une sous-dimension sont faits avec l’appui d’universitaires, notamment l’Institut de recherche en sante publique de l’Universite de Montreal et d’autres partenaires institutionnels qui evaluent aussi la performance du système de sante tant au Quebec qu’ailleurs. Les indicateurs sont selectionnes en fonction de la documentation, des experts mais aussi en fonction de la disponibilite de l’information. Ce dernier critère reste la principale limite en matière d’evaluation de la ET DES SERVICES SOCIAUX DU QUEBEC LA PERFORMANCE DU SYSTEME DE SANTE 71 performance (Voyer 2002; Calmette 2008), limite soulevee par les organisations qui evaluent la performance des systèmes de sante, et se reflète notamment par le decalage entre le moment de la publication des rapports d’appreciation de la performance et les informations les plus recentes contenues dans les banques de donnees utilisees. Les alignements sont determines par les interrelations entre les sousdimensions au sein des fonctions. L’operationnalisation de ces interrelations et alignements necessite une reconstruction de la cha^ıne causale des liens entre les differentes sous-dimensions du modèle. Les relations qui lient les differentes sous-dimensions du modèle sont de nature differente. Elles peuvent ^etre des relations compensatoires, c’est-a-dire qu’il existe un lien de causalite directe positif ou negatif entre deux sous-dimensions. Cela signifie que le niveau de performance d’une sous-dimension (par exemple « disponibilite des ressources ») a un impact direct positif ou negatif sur une autre sous-dimension du modèle (par exemple, « adequation aux besoins de la population » ou « capacite a innover »). On observe aussi des relations d’arbitrage quand il n’y a pas de relation causale directe entre deux sous-dimensions, mais que toutes deux possèdent des determinants communs qui jouent dans le sens contraire. Ainsi la performance provient d’un certain equilibre car les exigences de chaque sous-dimension – par exemple « productivite », « securite », « justesse » – peuvent-^etre contradictoires. 
Une relation parabolique est observee quand la performance d’une sous-dimension est liee a celle d’une autre sous-dimension par une relation en U. Ainsi, la performance depend du niveau de performance d’une sous-dimension par une relation curvilineaire. C’est le cas par exemple de la relation entre le « climat organisationnel » et la « productivite ». La productivite aura un impact positif sur le climat organisationnel dans un certain intervalle seulement. Une productivite trop elevee ou trop faible aurait un impact negatif sur le climat organisationnel. Enfin, des relations contingentes sont observees lorsqu’une troisième sous-dimension influence l’interaction entre deux sous-dimensions expliquant ainsi la performance de la sous-dimension analysee. C’est le cas entre la sous-dimension « disponibilite des ressources » et « la securite », « l’efficacite », « l’humanisation », etc. Le cadre d’analyse de la performance adopte par le CSBE permet d’evaluer la performance de chaque dimension gr^ace aux sousdimensions qui la composent. Chaque sous-dimension est mesuree par une serie d’indicateurs qui, une fois mise en relation et configuree, permet d’apprecier la performance de chacune des fonctions du système de sante. De plus, il offre la possibilite de mesurer la performance en fonction des tensions, equilibres ou alignements qui existent entre les 72 OLIVIER SOSSA, ISABELLE GANACHE dimensions et les sous-dimensions de la performance. Toutes les logiques et tous les liens qu’il est possible d’etablir entre les differentes composantes du SSSS montrent la complexite des analyses visant a mesurer leur equilibre. Et, chaque annee, des ameliorations sont apportees au modèle pour integrer l’analyse des alignements. L’approche qualitative est utilisee dans les differentes consultations du CSBE. Elle permet de structurer la demarche visant a tirer profit des connaissances et de la comprehension qu’ont differents acteurs des questions qui leur sont soumises. L’analyse des comptes rendus des consultations est realisee en equipe, permettant ainsi d’evaluer le degre de concordance ou de divergence entre les differentes perspectives sur la performance. De plus, que ce soit pour le Forum de consultation, le panel des decideurs ou le seminaire d’experts, une validation des synthèses est faite auprès des participants pour evaluer et s’assurer qu’ils y retrouvent l’essentiel de leurs propos. Une mise en commun des conclusions, provenant de differents groupes d’acteurs, oriente le CSBE dans sa demarche pour la formulation de ses recommandations. l’instar de demarches de recherche dites mixtes, il faut admettre A qu’il peut exister des combinaisons differentes selon le poids qu’on decide d’accorder aux volets quantitatif et qualitatif. Le CSBE cherche a appliquer, selon la nature de la thematique abordee, la demarche maximisant la qualite du recueil d’information et de traitement des donnees. La necessite d’integrer une reflexion sur les aspects ethiques lies a la sante et au bien-^etre dans l’appreciation du SSSS quebecois se trouve au cœur de la demarche du Commissaire Ces travaux s’inscrivent ainsi dans un processus simultane d’analyse de donnees quantitatives et qualitatives afin de fournir un portrait de la thematique abordee. En effet, les deux types de donnees sont recueillis simultanement et sont ensuite integres dans l’interpretation des resultats globaux pour plus de precision et de profondeur (Zandvanian et Darya l’image de certains travaux de recherche qui poor 2013; Mobini, 2011). 
A utilisent a la fois des donnees quantitatives et qualitatives, les travaux d’appreciation de la performance du CSBE font usage de ces donnees de manière parallèle, concurrente ou sequentielle (Ostlund, Kidd, Wengstrom et Rowa-Dewar 2011). En effet, pour certains travaux tels que le rapport d’appreciation thematique sur la première ligne de soins et celui sur la perinatalite et la petite enfance, la collecte et l’analyse des indicateurs et des connaissances issues des consultations ont ete realisees en parallèle pour ^etre ensuite consolidees a l’etape de la production des ET DES SERVICES SOCIAUX DU QUEBEC LA PERFORMANCE DU SYSTEME DE SANTE 73 recommandations. Dans d’autres exercices d’appreciation de la performance, les sources de donnees sont integrees concurremment et plus t^ ot dans le processus pour permettre de presenter un paysage plus complet de l’objet d’analyse; ce fut le cas en particulier du rapport d’appreciation thematique sur la sante mentale et celui a venir sur le vieillissement de la population. Il est a noter que les resultats des differentes sources de donnees sont parfois convergents, c’est-a-dire qu’ils vont dans le m^eme sens (Ostlund et coll. 2011; Mobini, 2011; Venkatesh, Brown et Bala 2013). Par exemple, dans le rapport sur la première ligne de soins, a la fois les experts, les decideurs, les citoyens et les donnees tendaient a appuyer la recommandation sur l’intensification de l’interdisciplinarite au sein des equipes de soins, l’implantation des mecanismes d’amelioration continue de la performance clinique, etc. D’autres fois, les donnees etaient plut^ ot complementaires, alors que les unes et les autres se completaient (Ostlund et coll. 2011; Mobini 2011; Venkatesh, Brown et Bala 2013). C’est le cas par exemple dans les travaux sur les maladies chroniques o u, dans l’ensemble des consultations, il ressortait que le système devrait favoriser le developpement de la capacite des personnes et de leurs proches a mieux prendre soins de leur maladie. Cependant, pour les membres du Forum, cette orientation m^eme si elle est juste, doit tenir compte de la capacite de ces derniers. Enfin, il arrive que les differentes donnees semblent contradictoires (Ostlund, et coll. 2011; Mobini 2011; Venkatesh, Brown et Bala 2013). Y a-t-il penurie de medecins, tel que le suggèrent certains groupes consultes, ou les services sont-ils a reorganiser, ce que suggèrent certaines donnees quantitatives sur le nombre de ces professionnels de la sante par habitants? Avec le temps et l’experience, le CSBE perfectionne ses methodes et ameliore l’integration des differentes sources de donnees qu’il utilise. Cependant, des defis a ce niveau demeurent. La documentation qui guide ce type de reflexion et de pratique est pauvre (Ostlund et coll. 2011; Venkatesh, Brown et Bala 2013). En pratique, il demeure parfois malaise de combiner differentes informations qui ne semblent pas se combiner, ou dont la nature est differente (Cronholm et Hjalmarsson 2011). En effet, quelle place accorder a une valeur exprimee par un groupe de citoyens par rapport a des indicateurs? La transparence dans les choix qui sont realises menant aux recommandations est un principe devant guider les travaux, alors m^eme que ces choix sont parfois implicites. Les defis d’organisation – contraintes de temps pour favoriser la decision pertinente et autres contraintes de ressources, expertise a developper et soutenir pour qu’elle perdure dans le temps – sont aussi a ne pas negliger. 
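Pour fixer les idées sur le volet quantitatif décrit plus haut, l’esquisse suivante, en Python, illustre de façon très simplifiée deux mécanismes évoqués : l’agrégation d’indicateurs normalisés en sous-dimensions puis en fonctions, et une relation parabolique (en U inversé) entre deux sous-dimensions. Il s’agit d’une construction hypothétique proposée à titre d’illustration seulement : les noms de fonctions et de sous-dimensions, les valeurs, la moyenne simple et la forme de la courbe ne proviennent ni du CSBE ni du modèle EGIPSS.

```python
# Esquisse hypothetique, a titre d'illustration seulement : la structure, les noms,
# les valeurs et les formules ne correspondent ni aux donnees ni a la methode du CSBE.
from statistics import mean

# Hierarchie simplifiee (le niveau intermediaire des « dimensions » est omis) :
# fonction -> sous-dimensions -> indicateurs normalises entre 0 et 1.
modele = {
    "Production": {
        "Productivite": {"volume_par_etp": 0.72, "cout_par_episode": 0.65},
        "Securite": {"evenements_indesirables_evites": 0.80},
    },
    "Adaptation": {
        "Disponibilite des ressources": {"effectifs": 0.61, "budget": 0.70},
    },
}

def score_sous_dimension(indicateurs):
    """Agregation volontairement naive : moyenne des indicateurs normalises."""
    return mean(indicateurs.values())

def score_fonction(sous_dimensions):
    """Score d'une fonction : moyenne des scores de ses sous-dimensions."""
    return mean(score_sous_dimension(ind) for ind in sous_dimensions.values())

def relation_parabolique(x, optimum=0.7, largeur=0.4):
    """Relation en U inverse entre deux sous-dimensions : l'effet est maximal
    autour d'un niveau intermediaire et decroit quand x est trop faible ou trop eleve."""
    return max(0.0, 1.0 - ((x - optimum) / largeur) ** 2)

if __name__ == "__main__":
    for fonction, sous_dims in modele.items():
        print(fonction, round(score_fonction(sous_dims), 2))
    productivite = score_sous_dimension(modele["Production"]["Productivite"])
    # Exemple d'alignement : effet attendu du niveau de productivite sur le climat organisationnel.
    print("Effet estime sur le climat organisationnel :", round(relation_parabolique(productivite), 2))
```

Dans la démarche réelle du Commissaire, le choix, la pondération et la mise en relation des indicateurs reposent sur le jugement d’experts et sur le modèle EGIPSS, et non sur une formule d’agrégation unique comme celle esquissée ici.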
En résumé, les différentes approches de recherche utilisées par le CSBE visent à dépasser les limites de chaque approche théorique et méthodologique et conduisent à une observation et une analyse plus fines et plus fiables, appuyant ainsi la qualité, la validité et la rigueur des travaux. Ces approches de triangulation permettent alors de mieux appréhender la complexité des sujets et thématiques abordés, par le croisement de différents regards sur ces derniers.

Intégrer l’éthique à l’exercice d’appréciation

La nécessité d’intégrer une réflexion sur les aspects éthiques liés à la santé et au bien-être dans l’appréciation du SSSS québécois se trouve au cœur de la démarche du Commissaire. Cette nécessité se concrétise d’abord dans la visée même de soutenir les décisions à l’égard de choix sociaux que sous-tend notre SSSS, ses priorités et leurs conséquences, sur la base des valeurs sociales qui ont cours, tout en intégrant la voix de tous. Elle se concrétise également dans les méthodes et les moyens pour ce faire, incluant ceux concernant la mise en œuvre des processus de consultation.

Soutenir des décisions qui sous-tendent des choix sociaux

De nombreuses décisions à l’égard du SSSS québécois constituent des choix de société. « Quels sont les besoins en santé et en services sociaux auxquels le gouvernement devrait répondre en priorité? »; « Comment offrir un accès équitable et raisonnable à certains soins et services? »; « Comment résoudre la tension entre la liberté individuelle et la solidarité lorsqu’elle est rencontrée par les acteurs de notre système? » ne sont que quelques-unes des questions globales qui interpellent les valeurs sociales.

Les méthodes et les moyens

L’éthique, dans la tradition de la sagesse pratique, constitue une aide à la décision qui amène à poser des jugements en situation et interpelle la responsabilité des acteurs concernés. Interdisciplinaire par définition, l’éthique au CSBE est donc portée par l’ensemble des membres et des collaborateurs de l’organisation – aux expertises variées (santé publique, économie, administration et gestion de la santé, pharmacie, dentisterie, psychologie, médecine, sciences politiques, droit, communication, anthropologie, éthique, etc.) – sous la responsabilité du Commissaire adjoint à l’éthique. La méthode pour la complexité en éthique, fondée sur la théorie de la complexité, inspire la démarche d’intégration de l’éthique à l’exercice d’appréciation menée par le CSBE (Cleret de Langavant 2001). Les concepts associés à cette méthode permettent de poser un regard sensible au contexte sur le SSSS, de saisir les interactions entre les éléments qui le composent et leur évolution. Ils permettent notamment d’appréhender le SSSS, avec ses multiples acteurs qui interagissent, évoluent et s’adaptent, comme un système complexe. Ils amènent par exemple à voir les problèmes de performance du système soulevant des enjeux, éthiques ou autres, comme des « nœuds systémiques » sur lesquels il est possible d’agir en identifiant des « leviers de changement » systémiques.
Cette vision de l’ethique s’eloigne d’une approche deductive fondee uniquement sur l’application de principes, telle que proposee par certains sur l’ethique des politiques de sante (Masse 2001), puisque le CSBE considère que les enjeux ethiques lies a un objet d’evaluation sont intimement lies au contexte de son developpement et de son utilisation, a l’instar d’autres auteurs (Burls et coll. 2011). Ainsi, le CSBE met en œuvre differentes approches allant d’une explicitation des enjeux et des choix sociaux rencontres (et de leur contexte historique et social), a l’identification des acteurs et des niveaux decisionnels concernes incluant les inter^ets en jeu, les valeurs sous-jacentes et les normativites implicites. Par la mise en œuvre de cette vision, ces methodes et ces approches, le CSBE s’inscrit ainsi dans une reflexion sur l’ethique des politiques de sante. Dans le cadre de cette reflexion, des analystes s’interessant a l’integration de l’ethique aux demarches d’evaluation ont identifie differents moyens pour y arriver (Kenny et Giacomini 2005; Churchill 2002; Burls et coll. 2011) : diverses formes de participation citoyenne; la reflexivite sur les normativites implicites, soit l’identification explicite des valeurs, principes et normes qui president aux actions et aux decisions des acteurs concernes; une analyse historique et sociale, afin de prendre en compte le contexte et les raisons qui expliquent une situation actuelle; l’identification formelle et une analyse de l’ethos des acteurs concernes, incluant les motivations de ces derniers et les valeurs qui ont cours dans un milieu (le milieu des affaires, une communaute, etc.), et; l’analyse de l’objet d’evaluation a partir de questions specifiques visant a faire ressortir les enjeux ethiques. Le souci ethique dans l’organisation des processus de consultation La demarche du CSBE reflète l’importance qu’il accorde au fait d’entendre la voix de tous et de fournir une aide a la decision pertinente et ancree dans les valeurs sociales. Les nombreuses consultations 76 OLIVIER SOSSA, ISABELLE GANACHE qu’il met en œuvre dans le cadre de chacun de ses rapports d’appreciation thematique et les approches qu’il experimente, en temoignent. Il importe d’abord de noter la preoccupation constante du CSBE d’inclure la voix des citoyens, a l’instar d’autres organisations qui ont a prendre des decisions d’evaluation refletant des choix ethiques et sociaux importants et impliquant des jugements de valeur, et ce, dans un contexte d’allocation de ressources limitees. Mentionnons a ce titre le Conseil de citoyens cree par le National Institute for Health and Care Excellence (NICE), responsable d’etablir les standards cliniques du système de sante au Royaume-Uni (NICE 2013), ou celui cree dans le cadre de l’Ontario Drug Benefit Act par le gouvernement ontarien dans le but d’atteindre une legitimite pour determiner les valeurs sociales guidant les choix a faire (NICE 2008; Rawlins et Culyer 2004) ou d’augmenter la transparence et la reddition de compte envers le public dans le developpement des politiques de sante (Ontario 2012). 
Plus près de nous au Quebec, soulignons egalement l’Institut national d’excellence en sante et en services sociaux (INESSS), charge de l’evaluation des technologies, des medicaments et des interventions en sante et en services sociaux, qui reconna^ıt egalement que certaines de ses decisions necessitent de poser des jugements de valeur et se propose d’accro^ıtre la participation des citoyens dans ses differents processus de decisions dans les annees a venir (INESSS 2011, 2012). Au-dela de l’application de methodes, approches et moyens, le Commissaire conçoit l’ethique comme un exercice de mediation sociale qui, en incluant les differents acteurs concernes, y compris les citoyens, dans les decisions portant sur des choix sociaux a la base de notre SSSS, participe a la promotion de la democratie Pour le CSBE, le Forum de consultation constitue l’une des voies privilegiees visant a inclure les preoccupations des citoyens, dans la mesure o u il doit faire etat dans ses rapports des conclusions et recommandations auxquelles parvient le Forum. Le Forum constitue une instance deliberative unique, o u les membres reçoivent de l’information pertinente et objective sur les questions soumises a leur examen et echangent dans un contexte favorable a une confrontation respectueuse des idees. Leurs deliberations permettent d’arriver a des positions argumentees, et ce, sur la base de leurs connaissances, croyances, valeurs et experiences personnelles. La diversite des membres selectionnes favorise la deliberation a partir de multiples perspectives. ET DES SERVICES SOCIAUX DU QUEBEC LA PERFORMANCE DU SYSTEME DE SANTE 77 Dans l’interpellation des acteurs et des decideurs du SSSS, de m^eme que des experts pour chaque thème d’evaluation traite, le souci de representer diverses perspectives et de faire preuve d’une ouverture preside encore lors de l’organisation de ces consultations, qu’il s’agisse de tables rondes ou de seminaires organises, ou encore de consultations individuelles ou de groupes. Cette combinaison des processus consultatifs lui permet de valider certaines tendances, d’explorer le sens pouvant ^etre donne a differentes positions et de mieux illustrer des situations vecues. L’ethique dans les publications du CSBE La reflexion du CSBE sur l’integration des aspects ethiques lies a la sante et au bien-^etre s’est traduite jusqu’a maintenant de diverses manières dans ses publications. Tout d’abord, il importe de mentionner que certaines publications du CSBE ont ete produites sur la base m^eme de considerations ethiques. C’est le cas du rapport intitule « Consultation sur les enjeux ethiques du depistage prenatal de la trisomie 21, ou syndrome de Down, au Quebec – des choix individuels qui nous interpellent collectivement » (CSBE 2009a); de l’avis « Informer des droits et sensibiliser aux responsabilites en matière de sante » (CSBE 2010a); du guide « L’importance du debat public et les conditions qui y sont propices » (CSBE 2012b), et de l’avis « sur les services de procreation assistee au Quebec », (CSBE 2014). Plut^ ot que d’aborder l’ethique de manière isolee, les publications du CSBE la traite par le choix de certains thèmes de m^eme que par le compte rendu des processus mis en œuvre pour nourrir la prise de decision. L’integration d’une reflexion ethique a plusieurs etapes de la production d’une analyse ou d’une demarche d’appreciation de la performance influence de differentes manières le produit final. Une telle reflexion est a la base de la conception m^eme de certaines productions. 
Par exemple, certaines des questions aux fondements de la sélection des thèmes à apprécier ont comme source une préoccupation éthique explicite : « Comment allouer adéquatement les ressources du SSSS québécois alors que des postes budgétaires comme ceux des médicaments augmentent de manière croissante? »; « Comment desservir adéquatement des populations présentant des vulnérabilités particulières, telles que, dans le domaine de la santé mentale, les populations avec des maladies chroniques ou vieillissantes? ». De même, plusieurs démarches de consultation, comme celle menée dans le cadre du dossier sur les médicaments, sont pensées pour intégrer la voix de tous les acteurs concernés et leurs différents intérêts, tout en rendant compte de la légitimité de différentes positions. Par ailleurs, les outils développés pour sonder la population, le Forum, les experts et les décideurs incluent des questions qui touchent les choix de société à faire, par exemple des questions concernant la distribution des ressources ou les priorités à considérer pour notre système de santé. De nombreuses recommandations sont le reflet de ces préoccupations éthiques soulevées tout au long d’une production, sans nécessairement que les enjeux éthiques fassent l’objet d’une recommandation à part.

Les défis de l’intégration de l’éthique

L’intégration de l’éthique à l’exercice d’évaluation comporte plusieurs défis et soulève plusieurs questions. Comment, méthodologiquement, intégrer de manière systématique et équitable les différentes informations recueillies? Quel poids attribuer aux différentes sources de données et aux différents résultats des consultations de manière à porter la légitimité de chaque perspective, et à tenir véritablement compte des points de vue exprimés? Comment s’assurer qu’un point de vue n’a pas été écarté par le processus même de consultation? Comment tenir compte de l’information transmise aux personnes consultées, toujours susceptible d’orienter les propos recueillis, et du contexte – issu du milieu interne de l’organisation et de l’environnement externe – entourant les consultations? Quelle est la meilleure manière de rendre compte des délibérations du Forum? Comment s’assurer que la perspective personnelle des auteurs des publications ne porte pas indûment ombrage à certaines informations obtenues par le processus de collecte des données, soit pour le choix des thèmes à aborder ou des recommandations à proposer? Comment rendre compte des différentes visions sous-tendant parfois des intérêts divergents, et les intégrer dans la perspective du bien commun souhaitée pour le SSSS? Pour conclure cette section sur l’intégration de l’éthique à l’exercice d’évaluation, il importe de souligner qu’au-delà de l’application de méthodes, approches et moyens, le Commissaire conçoit l’éthique comme un exercice de médiation sociale qui, en incluant les différents acteurs concernés, y compris les citoyens, dans les décisions portant sur des choix sociaux à la base de notre SSSS, participe à la promotion de la démocratie.

Résultats des travaux du Commissaire

L’approche distinctive du CSBE repose sur l’intégration des connaissances recueillies tout au long de sa démarche d’appréciation de la performance (Figure 4).
Elle lui permet ainsi d’appuyer ses recommandations sur une appreciation integree de la performance visant a determiner les propositions qui sont a la fois pertinentes, realisables, demontrees comme etant efficaces et repondant aux problèmes recenses, ET DES SERVICES SOCIAUX DU QUEBEC LA PERFORMANCE DU SYSTEME DE SANTE 79 Figure 4. Processus de recommandations tout en tenant compte des enjeux et des implications emergeant de la perspective citoyenne (CSBE 2013a). Les principales recommandations des rapports du CSBE Le premier rapport d’appreciation de la performance produit par le Commissaire est paru en 2009 et portait sur la première ligne de soins (CSBE 2009b). Dans ce rapport, le Commissaire reconnaissait que la première ligne de soins est un maillon indispensable pour ameliorer la performance globale du système de sante. Pour soutenir une telle orientation, le Commissaire recommande d’agir sur quatre aspects fondamentaux du système, 80 OLIVIER SOSSA, ISABELLE GANACHE Figure 5. Recommandations première ligne de soins comme le montre la Figure 5. Il s’agit de s’accorder une organisation et des ressources modernes, de favoriser une plus grande participation des personnes aux soins, de mieux planifier, organiser et evaluer les soins ainsi que se munir d’un financement approprie de la première ligne de soins. En 2010, le Commissaire a choisi d’apprecier les soins et services offerts aux personnes atteintes de maladies chroniques (CSBE 2010b), dont la prevalence et les problèmes de sante associes representent un defi majeur pour le système de sante. Le Commissaire a articule 10 recommandations autour de 5 avenues d’amelioration (Figure 6). Pour repondre de façon appropriee a l’ensemble des besoins des personnes atteintes de maladies chroniques, les ajustements organisationnels du système, au centre desquels se trouve la personne, sont incontournables. Selon le Commissaire, plusieurs aspects lies a la planification, a l’organisation et a la prestation des soins et services doivent ^etre revus. ET DES SERVICES SOCIAUX DU QUEBEC LA PERFORMANCE DU SYSTEME DE SANTE 81 Figure 6. Recommandations maladies chroniques Pour son rapport d’appreciation de 2011 (CSBE 2011b), le Commissaire a choisi d’analyser la gamme de soins et services offerts en perinatalite et en petite enfance (PPE). Fonctionnant en synergie, les 12 recommandations du Commissaire permettent d’agir, dès la grossesse, sur la sante et le bien-^etre des enfants, par une action concertee repondant aux besoins et aux problèmes presentes par la clientèle en PPE. Comme l’illustre la Figure 7, trois niveaux d’actions sont vises : prestation des soins et services; organisation du reseau de la sante et des services sociaux; niveau societal. Les recommandations se regroupent en quatre grands axes : la hierarchisation des services en PPE; le continuum « promotionprevention-soins-readaptation-protection » en PPE; une reponse adaptee aux besoins en matière d’« information-soutien services » en PPE; l’enfant comme priorite sociale. En 2012, le Commissaire s’est interesse au secteur de la sante mentale (CSBE 2012b). Le rapport met en evidence l’importance des enjeux ethiques dans le secteur de la sante mentale. On y souligne les lacunes qui justifient de revoir l’allocation des ressources pour une offre de services optimale en sante mentale, d’accorder une importance particulière a la lutte contre la stigmatisation ainsi qu’a la promotion de la sante et a la 82 OLIVIER SOSSA, ISABELLE GANACHE Figure 7. 
Recommandations PPE

prévention des troubles mentaux et de favoriser l’accès à la psychothérapie. Le rapport décrit comment la réorganisation de services autour des soins de collaboration entre les médecins de famille, les psychiatres répondants et les équipes en santé mentale, par des mécanismes de liaison plus efficaces, de même qu’un accès rehaussé au niveau des guichets des centres de santé et de services sociaux, peut contribuer à réduire les délais d’attente et à favoriser le rétablissement des personnes atteintes de troubles mentaux.

En 2013, le Commissaire a rendu public son rapport d’appréciation globale de la performance du système de santé, qui permet de faire ressortir les forces et les faiblesses du système, ce qui donne aux décideurs et aux gestionnaires des pistes de réflexion et d’action pour une amélioration continue de la performance (CSBE 2013b). En comparant le Québec aux autres provinces canadiennes et à certains pays de l’OCDE, de même qu’en comparant les régions du Québec entre elles, le Commissaire fournit une information juste et rigoureuse afin de contribuer à améliorer la performance du système. On observe, entre autres, que la qualité des soins au Québec et l’état de santé général de la population se démarquent favorablement par rapport à l’ensemble du Canada. L’utilisation des technologies informatiques par les médecins et la coordination des soins et services sont par contre des aspects pour lesquels beaucoup d’efforts doivent encore être consentis, en plus de l’accès à un médecin traitant régulier pour une plus grande proportion de la population.

La diffusion et les retombées des travaux du CSBE

La question des retombées des travaux est une préoccupation générale, particulièrement dans le domaine de la santé où les résultats des travaux de recherche sont très peu traduits en pratique (Grimshaw et coll. 2012). Comment mesure-t-on les retombées des recommandations du CSBE lorsqu’on sait que les travaux du CSBE ne constituent qu’une des sources de recommandations visant à soutenir les actions du réseau? L’imputation d’une action à ses recommandations constitue un défi réel et, à défaut de pouvoir la mesurer, mentionnons quelques exemples de l’éclairage que ces travaux ont apporté sur la performance du système en informant différents acteurs (décideurs, gestionnaires, praticiens, etc.). Par exemple, le CSBE fournit un portrait détaillé de chacune des régions du Québec, à l’exception des Terres-Cries-de-la-Baie-James, du Nunavik et du Nord-du-Québec. Il offre ainsi un portrait individualisé des résultats spécifiques pour chaque région. Les gestionnaires du réseau de la santé et des services sociaux peuvent donc avoir accès à une information utile et détaillée pour déterminer les enjeux de performance propres à leur région respective. De plus, pour rendre plus accessibles les résultats de ses travaux, le Commissaire a développé au cours des années une application interactive disponible à tous sur son site Internet : l’Atlas CSBE. Cet outil de visualisation permet, au moyen de cartes géographiques qui découpent le Canada par provinces et le Québec par régions sociosanitaires, d’accéder à une multitude d’informations sur différents aspects du SSSS à l’aide des indicateurs utilisés par le Commissaire, ce qui contribue à sa fonction d’information.
Les portraits régionaux mis à jour annuellement renseignent sur les forces et les faiblesses des régions et sur l’efficience des services qui y sont dispensés. Ils visent notamment à permettre au personnel des agences de la santé et des services sociaux d’étudier en détail l’ensemble des indicateurs utilisés par le Commissaire, en plus de constituer un outil d’amélioration de la performance pour les décideurs du réseau. En effet, suite à la transmission aux agences régionales de leurs rapports régionaux de performance, plusieurs ont mentionné que les informations se retrouvant dans ces rapports leur permettent de synthétiser l’information sur leur réseau et de contribuer à la réflexion sur la performance de leurs continuums de services. Certaines agences ont également mentionné qu’il s’agissait d’un outil de gestion supplémentaire pour faire ressortir les secteurs à potentiel d’amélioration et les forces de leur réseau. Certains travaux du Commissaire ont des retombées à d’autres niveaux. Par exemple, à la suite de la publication de son rapport sur la santé mentale, incluant une recommandation concernant l’accès à la psychothérapie, le ministre de la Santé et des Services sociaux a confié à l’INESSS le mandat de faire une analyse comparée de la psychothérapie et des médicaments psychotropes et de formuler des recommandations sur les modèles de remboursement de la psychothérapie. Cette recommandation du Commissaire a également mené à la création, en mars 2013, du Collectif pour l’accès à la psychothérapie, visant à jouer un rôle actif auprès des instances gouvernementales afin que cette recommandation soit actualisée. De plus, le gouvernement s’apprête à rendre publique une politique nationale en santé, un élément qui avait fait l’objet d’une recommandation dans le rapport du Commissaire portant sur les maladies chroniques. De la même manière, plusieurs décisions récentes avaient également fait l’objet de recommandations de la part du Commissaire, telles que : l’augmentation et l’ajout de plus de professionnels non médecins dans les GMF (Groupes de médecine de famille); l’informatisation du réseau; le soutien à la prévention de la santé; le soutien au développement de la capacité des personnes à participer à leurs soins et services; le soutien et l’accompagnement des aidants, etc. Mentionnons enfin que d’autres travaux du Commissaire, particulièrement dans le cadre de la consultation sur les enjeux éthiques du dépistage prénatal de la trisomie 21, ont également eu des retombées auprès de différents acteurs du SSSS du Québec. À la suite de ces travaux, plusieurs régions ont souhaité et obtenu le soutien du Commissaire dans la mise en œuvre de programmes de dépistage. Bien que l’on puisse constater que les travaux du Commissaire participent à certaines de ces transformations, comme en témoignent les retombées ci-haut mentionnées, il importe cependant de considérer avec humilité la contribution du Commissaire aux changements observés.
Le système de sante est perçu comme une entite complexe, interactive, ET DES SERVICES SOCIAUX DU QUEBEC LA PERFORMANCE DU SYSTEME DE SANTE 85 organique o u les experimentations, les etudes empiriques et la reflexion sont centrales pour creer une culture d’innovation, d’amelioration et par consequent d’efficacite (Kitson 2009). Le Commissaire contribue dans ces travaux a l’implantation d’une telle culture. Conclusion Le Commissaire a la sante et au bien-^etre souhaite, par cet article decrivant ses methodes, les caracteristiques de son exercice et les defis qu’il rencontre, contribuer a la reflexion sur l’appreciation de la performance dans le domaine de la sante. L’appreciation de la performance que realise le Commissaire est un exercice complexe, impliquant l’integration de differentes methodes et sources de connaissances, l’utilisation et le developpement d’un modèle integrateur qu’est EGIPSS, la mise en commun de differentes connaissances, de m^eme que l’integration de l’ethique et de la perspective citoyenne. Cette complexite est garante, aux yeux du Commissaire, de resultats d’inter^et et pertinents, fondes scientifiquement, ancres dans la realite des milieux concernes, de m^eme que dans les valeurs sociales constitutives de la societe quebecoise. Dans un tel exercice d’appreciation de la performance, de nombreux defis a la fois d’ordre methodologique ou pratique sont rencontres. Tout d’abord, l’utilisation et le developpement du modèle EGIPSS, exercice stimulant s’il en est, comportent leur part de defis. Par exemple, l’accès a des indicateurs fiables et stables dans le temps pour permettre un suivi de l’evolution de l’objet d’evaluation, constitue un premier enjeu qu’il ne faut pas minimiser : les resultats qui en seront tires, a la fois en termes de constats comparatifs qu’en interpretation des alignements, en dependent. Le Commissaire fait egalement face au defi d’integrer et de ponderer de manière exacte et pertinente les differentes sources de connaissances et les resultats issus des diverses methodes et types d’evaluation mis en œuvre. Un autre groupe de defis rencontres est lie a la diffusion des travaux d’appreciation realises par le Commissaire et leur communication aux divers publics vises. L’un des premiers defis a ce niveau est de concretiser les resultats obtenus – souvent nuances et tenant compte de nombreuses considerations – de manière accessible et directement pertinente pour les decisions politiques qu’ils visent a informer. Le Commissaire doit ainsi tenir compte des considerations propres a l’univers de la decision politique dont fait partie l’importance de la pertinence en temps opportun, les enjeux politiques et sociaux ou les inter^ets divergents de groupes d’acteurs. En effet, l’impact des recommandations d’un exercice d’appreciation en depend. Cependant, la reelle possibilite de mise en 86 OLIVIER SOSSA, ISABELLE GANACHE œuvre de ces recommandations est egalement tributaire de la complexite de l’objet d’evaluation, soit le SSSS quebecois, dont il est necessaire de tenir compte. Le Commissaire se mesure egalement au difficile exercice de diffusion et de communication qui consiste a rejoindre la population de manière plus generale pour eclairer le debat public. 
Bien qu’il s’agisse de l’un de ses axes d’intervention prioritaires visant a accro^ıtre le rayonnement de ses travaux et a faciliter leur accès a un large public d’ici 2017, les defis constitues par la vulgarisation et la multiplication des activites de communication et des outils de diffusion n’en demeurent pas moins de taille. Bibliographie Bekker, Marleen. 2007. The Politics of Healthy Policies: Redesigning health impact assessment to integrate health in public policy. Delft: Eburon Academic Publishers. Burls, Amanda, Lorraine Caron, Ghislaine Cleret de Langavant, Wybo Dondorp, Christa Harstall, Ela Pathak-Senand et Bjorn Hofmann. 2011. “Tackling ethical issues in health technology assessment: A proposed Framework.” International Journal of Technology Assessment in Health Care 27 (3): 230–37. Calmette, J-F. 2008. « La LOLF comme nouvelle approche des politiques publiques. » Informations Sociales (150) : 22–31 Champagne François, Andre-Pierre Contandriopoulos, Julie Picot-Touche, François Beland et Hung Nguyen. 2005. Un cadre d’evaluation de la performance des systèmes de services de sante : le modèle EGIPSS, Rapport technique du Groupe de recherche interdisciplinaire en sante, n R05-05. Montreal : Universite de Montreal. Churchill, Larry R. 2002. “What ethics can contribute to health policy” In Ethical dimensions of health policy, edited by M. Danis, C. Clancy, et L.R. Churchill. New York : Oxford University Press. Cleret de Langavant, Ghislaine. 2001. Bioethique : methode et complexite. Sainte-Foy : Presses de l’Universite du Quebec. Commissaire a la sante et au bien-^etre (CSBE). 2009a. Consultation sur les enjeux du depistage prenatal de la trisomie 21, ou syndrome de Down au Quebec. Rapport de consultation, Consultation sur les enjeux ethiques du depistage prenatal. Quebec, Gouvernement du Quebec. ——. 2009b. Rapport d’appreciation de la performance du système de sante et de services sociaux 2009 – Construire sur les bases d’une première ligne de soins renouvelee : recommandations, enjeux et implications. Quebec, Gouvernement du Quebec. ——. 2010a. Informer des droits et sensibiliser aux responsabilites en matière de sante – Synthèse et recommandations. Quebec, Gouvernement du Quebec. ——. 2010b. Rapport d’appreciation de la performance du système de sante et de services sociaux 2010 – Adopter une approche integree de prevention et de gestion des maladies chroniques : recommandations, enjeux et implications. Quebec, Gouvernement du Quebec. ——. 2011a. Document explicatif sur la methode de balisage pour l’analyse globale et integree de la performance. Quebec, Gouvernement du Quebec. ——. 2011b. Rapport d’appreciation de la performance du système de sante et de services sociaux 2011 – Pour une vision a long terme en perinatalite et en petite enfance : enjeux et recommandations. Quebec, Gouvernement du Quebec. ET DES SERVICES SOCIAUX DU QUEBEC LA PERFORMANCE DU SYSTEME DE SANTE 87 ——. 2012a. L’importance du debat public et les conditions qui y sont propices, Un guide du Commissaire a la sante et au bien-^etre. Quebec, Gouvernement du Quebec. ——. 2012b. Rapport d’appreciation de la performance du système de sante et services sociaux – Pour plus d’equite et de resultats en sante mentale au Quebec. Quebec, Gouvernement du Quebec. ——. 2013a. Rapport sur la mise en œuvre de la loi sur le commissaire a la sante et au bien^etre. Quebec, Gouvernement du Quebec. ——. 2013b. La performance du système de sante et de services sociaux quebecois 2013 – Resultats et analyses. 
Quebec, Gouvernement du Quebec. ——. 2014. Avis detaille sur les activites de procreation assistee au Quebec. Quebec, Gouvernement du Quebec. Contandriopoulos, Andre-Pierre, Lynda Rey, Astrid Brousselle et François Champagne. 2012. « Evaluer une intervention complexe : enjeux conceptuels, methodologiques, et operationnels. » Revue canadienne d’evaluation de programme 26 (3) : 1–16. Cronholm S. et A. Hjalmarsson. 2011. “Experiences from sequential use of mixed methods.” Electronic Journal of Business Research Methods 9 (2): 87–95 Denis, Jean-Louis, Pascale Lehoux et François Champagne. 2004. “A Knowledge Utilization Perspective on Fine-Tuning Dissemination and Contextualizing Knowledge.” In Using Knowledge and Evidence in Health Care: Multidisciplinary perspectives edited by L. LemieuxCharles et F. Champagne. Toronto: University of Toronto Press. Grimshaw J.M., M.P. Eccles, J.N. Lavis, S.J. Hill et J.E. Squires. 2012. “Knowledge translation of research findings.” Implement Sci. 31 (7:50). Hammond, C. 2005. “The wider benefits of adult learning: An illustration of the advantages of multi-method research.” International Journal of Social Research Methodology 8: 239–55. Institut national d’excellence en sante et en services sociaux (INESSS). 2011. Projet pilote sur l’evaluation de quatre medicaments anticancereux. Quebec, Gouvernement du Quebec. ——. 2012. Accessibilite a des medicaments anticancereux a caractère juge prometteur. Etat des lieux et bilan du projet pilote, Quebec. Kenny, Nuela, et Mita Giacomini. 2005. “Wanted: A new ethics field for health policy analysis.” Health Care Analysis 13 (4): 247–60. Kitson, Alison L. 2009. “The need for systems change: reflections on knowledge translation and organizational change.” Journal of Advanced Nursing 65 (1): 217–28. Klein, Rudolf. 2003. “Evidence and policy: interpreting the Delphic oracle.” J R Soc Med (96): 429–31. Lavis, John N. 2006. “Moving forward on both systematic reviews and deliberative processes.” Healthcare Policy 1 (2): 59–63. Levesque, Jean-Frederic et Ghislaine Cleret de Langavant. 2010. “Les grands defis lies a l’expertise et la preuve en politiques de la sante : de la connaissance a la recommandation.” Dans Les grands defis en droit et politiques de la sante, edite par R. P. Kouri et C. Regis. Editions Yvon Blais. Lomas, Jonathan, Tony Culyer, Chris McCutcheon, Laura McAuley, et Susan Law. 2005. Conceptualiser et combiner les donnees probantes pour guider le système de sante. Ottawa : Fondation canadienne de la recherche sur les services de sante. Consulte en ligne a: Available at: http://www.fcrss.ca/migrated/pdf/insightAction/evidence_f.pdf. Masse, R. 2001. “Analyse anthropologique et ethique des conflits de valeurs en promotion de la e par C. Fournier, C. Ferron, S. Tessier, B. sante.” Dans Education pour la sante et ethique, Edit Sandrin Berthon et B. Roussille. Editions du Comite Français pour l’Education a la sante. Ministère de la sante et des services sociaux (MSSS). 2012. Cadre de reference ministeriel d’evaluation de la performance du système public de sante et de services sociaux a des fins de gestion. Quebec : Ministère de la sante et des services sociaux, Direction generale de la planification, de la performance et de la qualite. 88 OLIVIER SOSSA, ISABELLE GANACHE Mobini Dehkordi, A. 2011. “Introducing the models and designs in mixed method research.” Journal of Rahbord 20 (60): 217–34. NICE. National Institute for Health and Care Excellence. 2008. 
Social value judgments: Principles for the development of NICE guidance, Second edition. http://www.nice.org.uk/media/C18/30/SVJ2PUBLICATION2008.pdf (page consultée le 28 novembre 2013).
——. 2013. Citizen's council. http://www.nice.org.uk/aboutnice/howwework/citizenscouncil/citizens_council.jsp (page consultée le 28 novembre 2013).
Ontario. Ministère de la santé et des soins de longue durée, Conseil des citoyens. 2012. http://www.health.gov.on.ca/fr/public/programs/drugs/councils/ (page consultée le 28 novembre 2013).
Ostlund, U., L. Kidd, Y. Wengstrom, et N. Rowa-Dewar. 2011. "Combining qualitative and quantitative research within mixed method research designs: A methodological review." International Journal of Nursing Studies (48): 369–83.
Québec. 1998. Loi sur l'Institut national de santé publique du Québec : L.R.Q., chapitre I-13.1.1, à jour au 1er novembre 2013 [Québec], Éditeur officiel du Québec.
——. 2005. Loi sur le Commissaire à la santé et au bien-être : L.R.Q., chapitre C-32.1.1, à jour au 1er novembre 2013 [Québec], Éditeur officiel du Québec.
——. 2010. Loi sur l'Institut national d'excellence en santé et en services sociaux : L.R.Q., chapitre I-13.03, à jour au 1er novembre 2013 [Québec], Éditeur officiel du Québec.
Rawlins, Michael D. et Anthony J. Culyer. 2004. "National institute for clinical excellence and its value judgments." BMJ (329): 224–7.
Sicotte, C. 2007. « Comment donner du sens à un système de santé complexe? Reddition de comptes et systèmes d'information. » Dans Le système sociosanitaire au Québec, édité par M.-J. Fleury, M. Tremblay, H. Nguyen et L. Bordeleau. Montréal : Les Éditions de la Chenelière Inc., Gaëtan Morin éditeur.
Teddlie, C. et A. Tashakkori. 2009. Foundations of Mixed Methods Research: Integrating Quantitative and Qualitative Approaches in the Social and Behavioral Sciences. Thousand Oaks, CA: Sage.
Venkatesh, V., S.A. Brown et H. Bala. 2013. "Bridging the qualitative-quantitative divide: Guidelines for conducting mixed methods research in information systems." MIS Quarterly 37 (1): 21–54.
Vérificateur général du Québec (VGQ). 2013. Personnes âgées en perte d'autonomie : Services à domicile. Gouvernement du Québec, Assemblée nationale.
Voyer, P. 2002. Tableaux de bord de gestion et indicateurs de performance, 2e édition. Sainte-Foy : Presses de l'Université du Québec.
Zandvanian, A. et E. Daryapoor. 2013. "Mixed methods research: A new paradigm in educational research." J. Educ. Manage. Stud. 3 (4): 525–31.

Etienne Charbonneau
Gérard Divay
Damien Gardey

Volatilité dans l'utilisation des indicateurs de performance municipale : Bilan et nouvelle perspective d'analyse

Sommaire : Les indicateurs de performance municipale font l'objet d'un intérêt renouvelé par les provinces et villes canadiennes, comme dans plusieurs autres pays. À partir d'une revue des études publiées, cet essai enrichit la littérature sur deux points. Il établit un bilan des connaissances sur leur utilisation qui prend en compte la diversité des types d'indicateurs. Il propose aussi un cadre d'analyse de l'interface politico-administrative locale qui aide à comprendre la variabilité et la volatilité dans l'utilisation de divers types d'indicateurs. Cette nouvelle perspective permet de mieux comprendre pourquoi l'utilisation des indicateurs de performance par les élus et administrateurs est limitée.

Abstract: The provinces and cities in Canada, like in several other countries, are studying the indicators of municipal performance with a renewed interest.
Based on a review of published studies, this article expands on the existing literature in two areas. It establishes a record of their use, taking into account the diversity of indicators used. It also proposes an analytical framework of the local political-administrative interface that helps in understanding the variability and the volatility in the various uses of indicators. This new perspective allows a better understanding of why the use of performance indicators by elected officials and administrators is limited.

Etienne Charbonneau est professeur adjoint à l'École nationale d'administration publique de Montréal, Québec, et membre du CERGO. Gérard Divay est professeur titulaire à l'École nationale d'administration publique de Montréal, Québec. Damien Gardey est professeur en management à Groupe ESC Pau, Pau, France.

L'amélioration de la performance publique et de la reddition de compte connaît un regain d'intérêt au Canada à tous les paliers de gouvernement. L'Ontario, le Québec (Schatteman et Charbonneau 2010) et la Nouvelle-Écosse ont imposé à toutes les municipalités l'obligation de colliger annuellement les valeurs d'indicateurs prédéterminés sur les services et les finances. Même si les municipalités obtempèrent, l'utilisation de ces indicateurs se révèle fort inégale, autant dans les décisions de gestion (Bellavance et coll. 2010) que dans la communication aux citoyens (Schatteman 2010). Les observations canadiennes concordent avec les constats faits dans d'autres pays sur trois tendances : l'utilisation des indicateurs de performance reste en général limitée; les élus en tiennent moins compte que les gestionnaires; l'utilisation est variable dans le temps.

Comment peut-on expliquer ces trois tendances dans les pratiques, malgré le discours sur l'importance de la mesure de la performance? Un cadre d'analyse est proposé où les trois tendances sont liées au même phénomène : la dynamique de l'interface politico-administrative. Selon Mazouz et coll. (2010 : 142), la performance publique serait « la résultante d'un pacte institutionnel » entre les élus et les gestionnaires publics. La compréhension de la relation entre ces deux groupes d'acteurs est donc centrale pour comprendre l'utilisation des indicateurs de performance. Le cadre offert s'appuie sur une revue et une combinaison de deux courants de documentation : l'utilisation de la mesure de la performance municipale et les relations entre politique et administration au niveau local. La première partie dresse un bilan des études sur l'utilisation de la performance. La deuxième partie rappelle la diversité des types d'indicateurs pour mesurer la performance municipale. La troisième partie expose la nouvelle perspective d'analyse où l'utilisation différenciée des types d'indicateurs peut se comprendre par la dynamique de l'interface politico-administrative locale, avec ses divers registres et jeux de position.

L'utilisation des indicateurs de performance

La mesure de la performance est un outil informationnel de gestion. Jusqu'à tout récemment, l'utilisation n'était pas un objet d'étude saillant (Van de Walle et Van Dooren 2008).
L'attention était centrée sur les déterminants d'initiatives de mesure de la performance au niveau municipal (Streib et Poister 1999; Holzer et Yang 2004), l'état d'avancement de la mise en œuvre (de Lancer-Julnes et Holzer 2001; Ho et Chan 2002), les bénéfices perçus (McGowan et Poister 1985) et les distorsions possibles (Bouckaert et Balk 1991; Smith 1995; Hood 2007). L'utilisation des informations véhiculées par les indicateurs de performance a été identifiée comme l'une des questions fondamentales pour la recherche en management public (Moynihan et Pandey 2010). Depuis, un flot important d'études a été produit sur l'utilisation faite surtout par les gestionnaires. Ces études sont regroupées sous le thème de la gestion de la performance (performance management). Il existe plusieurs manières d'organiser cette littérature. Gerrish et Wu (2013) regroupent les articles sur ce sujet en quatre catégories : (a) études de cas qualitatives qui identifient en quoi l'utilisation des indicateurs devrait améliorer la performance dans le secteur public, et quelles en sont les dérives possibles; (b) littérature professionnelle sur les meilleures pratiques et les leçons apprises; (c) études empiriques basées sur des questionnaires auto-administrés; et (d) études empiriques étoffées intégrant des données d'archives d'organisations publiques. Des revues systématiques de documentation ont été publiées ailleurs sur le sujet (entre autres, par Van Dooren et coll. 2010). Nous présentons ici seulement des études récentes des catégories (a) et (c) de Gerrish et Wu (2013).

La mesure de la performance consiste à documenter les ressources, les activités et les opérations quotidiennes, ainsi que les résultats attribués à ces activités, avec un nombre limité d'indicateurs prédéfinis. L'information colligée peut ensuite servir aux décideurs pour diverses fonctions. Van Dooren, Bouckaert et Halligan (2010) énumèrent 44 fonctions. Il est possible de concevoir les différents types d'utilisation en quatre catégories : l'utilisation passive, l'utilisation résolue (purposeful), l'utilisation politique et l'utilisation perverse (Moynihan et Lavertu 2012, 2012b). L'utilisation passive se limite aux demandes minimales d'un régime de performance, défini par les exigences nationales (Kuhlmann 2010), qui se distinguent notamment par le degré de contrainte, l'éventail des activités couvertes, la nature des indicateurs et leur diffusion (Moynihan et Lavertu 2012). Dans le cas des municipalités québécoises, le transfert de la valeur des indicateurs au ministère des Affaires municipales, des Régions et de l'Occupation du territoire (MAMROT) et le dépôt du rapport au conseil municipal sont des exemples d'utilisation passive individuelle. En Ontario, les informations transmises au ministère des Affaires municipales et du Logement (MAH) sont consignées sur un site web depuis 2010. Il s'agit d'utilisation passive collective. Dans l'utilisation résolue, les données sont utilisées pour l'amélioration de la gestion des programmes et services, dans les modifications de processus, l'allocation de ressources, les attentes signifiées aux employés, etc. (Moynihan et Lavertu 2012).
L'utilisation politique incorpore surtout la reddition de compte au public, y compris la sélection, la dissémination et l'interprétation des résultats pour bien faire paraître une administration (Moynihan, Pandey et Wright 2012a). L'utilisation perverse inclut la manipulation de données (Moynihan, Pandey et Wright 2012a). Il existe donc des types d'utilisation plutôt administrative (passive, perverse), plutôt politique, ou mixte (résolue). Cette classification est moins sommaire qu'une autre, plus répandue, qui sépare, voire oppose, les fonctions d'amélioration interne de la performance et les fonctions de reddition de compte au public (Melkers et Willoughby 2005; Ammons et Rivenbark 2008), soit les fonctions externes et internes (Torres, Pina et Yetano 2011).

Les études sur l'utilisation des indicateurs portent davantage sur le comportement des gestionnaires que sur celui des élus, et très rarement sur les deux types d'acteurs. Pour les gestionnaires, de nombreuses études récentes arrivent au même constat dans plusieurs pays : la fréquence et l'importance de l'utilisation des indicateurs ne gagnent guère de terrain. Pour les États-Unis, Ammons (2013) décèle cependant quelques progrès dans la mesure de la performance dans des villes leaders, sans qu'ils ne se traduisent automatiquement par des améliorations effectives des services. Les indicateurs de gestion sont peu ou pas utilisés dans les fonctions budgétaires et de management, que ce soit aux États-Unis (Wang et Berman 2001; Chung 2005; Rogers 2006), au Québec (Charbonneau 2010), ou en Espagne et en Italie (Montesinos et coll. 2013). Un inventaire récent aux États-Unis n'a repéré que 27 villes ayant un système de performance jugé exceptionnel, qui se démarque par plus d'indicateurs de résultats, d'efficience et de qualité qu'ailleurs. Cependant, même dans ces municipalités remarquables, le système de performance apparaît fragile, à la merci notamment des changements de leadership (Sanger 2013).

Du côté des élus, une revue de la littérature sur l'utilisation qu'ils font des indicateurs, effectuée en 2006, révélait que le sujet n'était à peu près pas couvert (Pollitt 2006), et mentionnait même que les citoyens, les élus et les ministres étaient les chaînons manquants de ce champ de recherche. Demaj et Summermatter (2012) identifient un plus grand nombre d'études sur l'utilisation d'indicateurs de la part des politiciens; mais elles se penchent sur les parlementaires à divers niveaux, plus que sur les élus municipaux. Le constat est que les parlementaires démontrent peu d'intérêt pour l'information de performance; il en est de même pour les élus locaux néerlandais (ter Bogt 2007). Cependant, certaines enquêtes laissent percevoir un portrait plus nuancé. Selon Ho (2006), une forte proportion des maires du Midwest américain trouve utile l'information de performance et, selon Askim (2007), les élus norvégiens l'utilisent à tous les stades du pilotage des politiques, de manière plus ou moins accentuée. L'utilisation concourante faite par les gestionnaires et les élus est encore plus rarement étudiée (Liguori et coll. 2012). Certaines études donnent des indications sur l'intérêt des élus, mais à partir de l'opinion des gestionnaires. Au Québec, seulement 17 % des gestionnaires municipaux estiment que leurs élus ont de l'intérêt pour l'information de performance du régime officiel (Bellavance, Charbonneau et Messier 2010).
Sanger (2013) mentionne que le support des élus n'est pas constant. Des constats semblables ont été faits en France sur la différence d'implication entre élus et gestionnaires (Carassus et Gardey 2009) et sur l'hétérogénéité des situations municipales (Maurel, Carassus et Gardey 2012).

Tableau 1. La mesure de l'utilisation des indicateurs de performance au niveau local
Colonnes : Source; Indicateur d'utilisation (analyse documentaire, présence dans les documents, opinion des répondants, sondage, étude de cas, entrevues); Types d'indicateurs mentionnés; Analyse de l'utilisation différenciée des divers types d'indicateurs.
Études recensées : Ammons (2013); Ammons et Rivenbark (2008); Askim (2007); Bellavance, Charbonneau et Messier (2010); ter Bogt (2004); Carassus et Gardey (2009); Charbonneau (2010); Chung (2005); Ho (2006); Liguori, Sicilia et Steccolini (2012); Maurel, Carassus et Gardey (2012); Melkers et Willoughby (2005); Montesinos et coll. (2013); Moynihan et Pandey (2010); Moynihan, Pandey et Wright (2012a); Rogers (2006); Sanger (2013); Torres, Pina et Yetano (2011); Wang et Berman (2001).

Dans la plupart de ces études, l'utilisation est estimée à partir des opinions émises par les répondants, gestionnaires et parfois élus (Tableau 1). Elle l'est dans quelques cas à partir d'une analyse des documents disponibles. L'appréciation de la performance est le plus souvent globale. Quelques études la désagrègent cependant selon divers types d'indicateurs, pour certains domaines d'action municipale. Des différences dans l'utilisation y apparaissent selon les types d'indicateurs et selon les domaines, mais elles sont rarement analysées. L'utilisation des indicateurs de performance reste limitée, moins fréquente chez les élus que chez les gestionnaires. Ces derniers se contenteraient d'une utilisation passive plutôt que d'une utilisation résolue.
Le degre d’utilisation fluctue beaucoup entre municipalites, m^eme a caracteristiques semblables, et a l’interieur de chaque municipalite entre les domaines et selon les types d’indicateurs. L’utilisation est volatile dans chaque municipalite, au gre des conjonctures. Cette volatilite est aussi manifeste en longue periode sur l’ensemble des municipalites aux Etats-Unis (William 2004). Plusieurs analyses quantitatives (listees dans le Tableau 1) mettent en relation le degre d’utilisation et diverses variables de conditions locales, de contexte organisationnel et de caracteristiques personnelles des utilisateurs pour expliquer le niveau d’utilisation, sa variabilite et sa volatilite. Si elles mettent en evidence des facteurs favorables a l’utilisation et expliquent, partiellement, la variabilite intermunicipale, elles oblitèrent pour la plupart les differences selon la nature des indicateurs et negligent l’evolution de leur utilisation. Le degre d’utilisation, la variabilite et la volatilite peuvent se comprendre plus facilement si on tient compte de la difficulte de mesurer la performance municipale. La comprehension est aussi bonifiee si l’on resitue l’utilisation des indicateurs de performance dans la dynamique de l’interface politico-administrative locale. la fois Une performance municipale, a fractaire a la mesure propice et re En Amerique du Nord, la mesure de la performance municipale a un siècle d’experience (Williams 2004). Elle represente un travail de Sisyphe DANS L’UTILISATION DES INDICATEURS DE PERFORMANCE MUNICIPALE VOLATILITE 97 qui se heurte a trois difficultes majeures : l’etendue et l’heterogeneite des responsabilites municipales, la multidimensionnalite de la performance, et la variete des perspectives. Les responsabilites municipales couvrent de nombreux secteurs d’action publique, avec quelques variantes selon les provinces ou Etats, qui ont chacun des logiques professionnelles differentes. Dans une large recension des mesures, Ammons (2012) identifie 32 secteurs. Le nombre et l’heterogeneite de ces secteurs rendent difficile toute apprehension synthetique de la performance municipale, d’autant plus qu’il faut tenir compte de sa multidimensionnalite (Maurel, Carassus et Gardey 2012). Si on se refère aux grandes dimensions de la performance publique en general, proposees par Van Dooren, Bouckaert et Halligan (2010), on constate que les indicateurs les plus frequemment utilises, notamment les extrants et les resultats (voir Tableau 1), se rapportent a la fonction de production. D’autres dimensions aussi importantes en termes de valeurs publiques ne sont pas abordees, par exemple, la qualite du processus decisionnel democratique. En tenant compte des pratiques et preoccupations actuelles, les indicateurs peuvent ^etre regroupes en quatre grandes categories, selon leur objet : Fonction de production : tout ce qui a trait aux activites et a l’utilisation des ressources (humaines, physiques et financières); interactions de service : les mesures relatives a la qualite de la relation avec les citoyens, telles que developpees et promues par l’Institut des services axes sur les citoyens; etat de situation locale : les mesures relatives a l’etat des infrastructures et de l’espace public, de l’environnement, de l’economie locale et des problematiques sociales (par exemple, criminalite); viabilite organisationnelle : les indicateurs relatifs aux processus democratiques et a la capacite institutionnelle de relever les defis locaux. 
Par ailleurs, dans chaque categorie, les indicateurs peuvent ^etre elabores dans plusieurs perspectives. Une perspective descriptive peut ^etre privilegiee pour rendre compte de l’ampleur de ce qui est fait; une perspective analytique pour etablir les principaux rapports habituels (efficience, efficacite, pertinence, productivite); une perspective comparative aux fins d’etalonnage; une perspective evolutive pour retracer les changements; et une perspective evaluative pour expliciter des jugements soit a partir de normes professionnelles, soit en fonction de l’appreciation des citoyens. La prise en compte de l’eventail des responsabilites, des nombreuses dimensions et des diverses perspectives aboutit a une multiplication des indicateurs necessaires pour apprehender la performance municipale dans son ensemble. Ce rappel des caracteristiques de base de la mesure de la performance municipale suggère que la diversite des types d’indicateurs devrait 98 ETIENNE CHARBONNEAU, GERARD DIVAY, DAMIEN GARDEY ^etre davantage prise en compte dans l’analyse de l’utilisation. Les divers types d’indicateurs n’ont pas tous la m^eme pertinence pour les trois grands groupes de parties prenantes a la performance municipale : les gestionnaires, les elus et les citoyens. Nous ne traitons ici que les deux premiers groupes, puisque l’essai est centre sur l’utilisation a des fins de gestion. hension : Une nouvelle piste de compre Les indicateurs de performance dans l’interface politico-administrative Les differences entre elus et gestionnaires dans l’utilisation des divers types d’indicateurs ne viendraient-elles pas de ce que les divers types d’indicateurs correspondent a des enjeux differents pour les deux groupes qui composent la direction bicephale d’une municipalite? Elus et gestionnaires municipaux se differencient sur plusieurs plans selon Nalbandian (1994 : 532) : « activity, players, conversation, pieces, currency, dynamics ». Plusieurs constatations suggèrent qu’ils se differencient aussi dans leur utilisation des types d’indicateurs, aussi bien dans la litterature sur la performance que dans les analyses sur les relations entre elus et gestionnaires. Pour systematiser ces constatations, nous proposons un modèle de comprehension de l’interface articule sur deux axes : les registres d’interaction et les logiques de position. Les registres de l’interface Les registres font reference aux quatre plans qui entrent simultanement en jeu dans la relation entre elus et gestionnaires : les savoirs, les valeurs, les inter^ets et les emotions. rence des savoirs Interfe Dans l’interface politico-administrative, divers types de savoirs sont a l’œuvre. Ce registre n’est pas expose comme tel dans la litterature, m^eme si plusieurs articles y font indirectement allusion. Les savoirs sont entendus ici dans le sens large donne par Alavi et Leidner (2001:109) : « Knowledge is information possessed in the mind of individuals: it is personalized information (which may or may not be new, unique, useful, or accurate) related to facts, procedures, concepts, interpretations, ideas, observations, and judgments ». Les savoirs sont de divers types. Nonaka et coll. (2000) proposent quatre types : les savoirs experientiels, conceptuels, systemiques et routiniers. Une differenciation des savoirs mobilises dans l’interface politico-administrative locale selon la source principale d’information donnerait quatre formes : experientielle, analytique, preconceptionnelle et generique. 
Le savoir experientiel est DANS L’UTILISATION DES INDICATEURS DE PERFORMANCE MUNICIPALE VOLATILITE 99 constitue de l’information acquise couramment par la presence dans le milieu local et la pratique professionnelle. Pour les elus, il constituerait le mode d’apprehension usuel du système d’acteurs, de la situation et de certains elements de l’administration. Demaj et Summermatter (2012) mentionnent en ce sens que les elus ont tendance a evaluer la performance a partir de l’observation des operations et non d’indicateurs. Les gestionnaires comptent aussi sur ce type de savoir en utilisant des sources d’information plus informelles et non routinières (Kroll 2013). Compte tenu de l’importance des facteurs d’influence exogène, les indicateurs minimaux des regimes de performance rendent difficilement compte du travail specifique des gestionnaires, diminuant de ce fait leur inter^et pour ces derniers Le savoir analytique provient du traitement d’informations codifiees: statistiques, indicateurs, rapports, et directives. Les indicateurs de performance qui definissent des rapports analytiques (notamment l’efficience et l’efficacite) incarnent le savoir analytique. Ils presupposent des pratiques professionnelles standardisees et rigoureuses. Ils rendent lisibles pour les elus l’organisation, les interventions et la situation, de manière très partielle ou plus exhaustive, selon le type d’indicateurs et leur degre de couverture des activites. Le savoir preconceptionnel est compose des prejuges, idees reçues, cliches populaires et renommees qui conditionnent le jugement et la perception des autres. Dans cette perspective, Nalbandian (1994) specifie que ce qui importe pour les elus est « ce qu’on entend ». Le savoir preconceptionnel est incompatible avec le langage de la performance mesuree. Le savoir generique est constitue des modes dans les idees communes, des representations collectives professionnelles, des meilleures pratiques vehiculees par les reseaux de pairs, les consultants et les instituts de formation. Il porte surtout sur l’intervention et l’organisation. Les gestionnaires y sont souvent davantage exposes que les elus, par leurs differentes associations professionnelles. Les indicateurs de performance font desormais partie de ce savoir generique; Sanger (2013) mentionne qu’une des motivations des municipalites a produire des indicateurs est de se conformer a leur « vogue generale ». Conciliation des valeurs Surtout au niveau le plus eleve de l’interface (executif politique-direction administrative), l’exercice du jugement est frequent (par exemple, sur l’opportunite de telle action ou de tel contact) et fait appel a des valeurs. 100 ETIENNE CHARBONNEAU, GERARD DIVAY, DAMIEN GARDEY Ces valeurs refèrent a des balises morales de comportement personnel et organisationnel (Kernaghan 2003; Durat et Bollecker 2012; Molina et McKeown 2012), ou a des orientations ideologiques (Askim 2009; Demaj et Summermatter 2012; Cristofoli et Crugnola 2012; Albalate Del sol 2013). Les deux groupes n’accordent pas forcement la m^eme priorite aux diverses valeurs en toute circonstance. Or, les differentes categories de valeurs-balises morales ne sont pas neutres pour l’utilisation des indicateurs de performance. Les valeurs professionnelles, pour reprendre la categorisation de Kernaghan (2003) et Molina et McKeown (2012), sont congruentes avec les indicateurs de rapports analytiques. Les valeurs humaines peuvent davantage inciter a porter attention aux donnees sur la qualite de l’interaction de service. 
Mais les deux autres categories de valeurs, ethique et democratique, ne trouvent guère d’echo dans les indicateurs de performance les plus frequents. re ^ ts Agencement des inte Pour les elus comme pour les gestionnaires, les indicateurs de performance peuvent devenir un atout de mise en valeur personnelle. De surcroit, pour les gestionnaires, cela constitue un facteur de remuneration additionnelle dans les systèmes de paie au merite. Cependant, la valorisation repose sur des types d’indicateurs differents selon les deux groupes. La renommee professionnelle d’un gestionnaire tient plus specifiquement aux indicateurs d’efficience et d’efficacite, surtout dans des perspectives evolutives et comparatives. Compte tenu de l’importance des facteurs d’influence exogène, les indicateurs minimaux des regimes de performance rendent difficilement compte du travail specifique des gestionnaires, diminuant de ce fait leur inter^et pour ces derniers. L’elu local n’est pas juge seulement sur son r^ole de superviseur des realisations administratives, mais aussi, entre autres, sur sa sensibilite aux attentes des electeurs De manière analogue, l’elu va pouvoir plus facilement faire valoir sa valeur ajoutee avec des indicateurs de realisation de projets annonces en campagne electorale et inseres dans des tableaux de bord. Par contre, les indicateurs de nature administrative (rapports analytiques ou fonction de production) ne sont pas forcement un atout electoral (James et John 2007), et les elus auront tendance a ne pas les utiliser (Kuhlmann 2010), sauf eventuellement pour mettre en relief les changements positifs dans certains indicateurs de gestion, durant leur gouverne. Par ailleurs, les indicateurs DANS L’UTILISATION DES INDICATEURS DE PERFORMANCE MUNICIPALE VOLATILITE 101 modifient les rapports de pouvoir entre elus et gestionnaires, au fur et a mesure de leur utilisation (Hanssen, 2007; Kroll et Proeller 2013). ^ le des e motions Contro Le registre des emotions est le moins etudie, bien que Nalbandian (1994) ait dej a insiste sur l’importance des symboles pour les elus. Dans notre societe ultra mediatisee, le registre emotif tient une place majeure dans la politique locale. Les maires sont aussi les premiers narrateurs de l’histoire de leur ville, evalues entre autres par leurs talents discursifs (McNeill 2001). Le discours et les pratiques sur la performance relèvent aussi du spectacle (Talbot 2000). Les indicateurs de fonction de production, dans leur facture statistique et comptable, ne se pr^etent guère aux vibrations mediatiques, ce qui contribue sans doute a leur profil bas. Cependant, la première generation de CitiStat a Baltimore a montre qu’ils pouvaient faire l’objet d’une mise en scène qui attire l’attention (Behn 2008). Comme the^atralisation de l’interface sur le thème de la performance, l’experience de Baltimore a ses debuts est exceptionnelle; elle rappelle cependant que le registre emotif est partie integrante de la relation entre elus et gestionnaires. Les logiques de position Elus et gestionnaires occupent dans le système politico-administratif local diverses positions qui se situent a trois niveaux, du plus general au plus specifique : statut, poste et r^ ole. Chacun de ces trois niveaux peut avoir une influence sur l’utilisation des divers types d’indicateurs, comme plusieurs indications dans la documentation le suggèrent. 
dispositions statutaires Pre Plusieurs auteurs estiment que les comportements des elus et des gestionnaires sont guides par des logiques d’action (Nalbandian 1994; Hansen et Ejersbo 2002; Bergstr€ om et coll. 2008). Le terme de logique laisse supposer une forme de determinisme des comportements par le statut occupe. Celui de predispositions qui, tout en associant au statut d’elu ou de dirigeant administratif des ressorts specifiques de comportements, laisse neanmoins une place significative pour les effets propres a d’autres facteurs. Selon les logiques d’action, le fait d’^etre elu ou gestionnaire inciterait ces deux groupes d’acteurs a se positionner differemment sur les divers registres et a developper des attitudes differentes a l’egard des divers types d’indicateurs. Nalbandian (1994) et Bergstr€ om et coll. (2008) soulignent que la resolution des conflits est centrale pour les elus; or, les indicateurs de performance n’y refèrent pas. La performance politique a des critères propres 102 ETIENNE CHARBONNEAU, GERARD DIVAY, DAMIEN GARDEY qui debordent les realisations administratives. L’elu local n’est pas juge seulement sur son r^ ole de superviseur des realisations administratives, mais aussi, entre autres, sur sa sensibilite aux attentes des electeurs. Un sondage realise au Quebec, en 2012, sur un echantillon de 4 230 personnes montre que les citoyens privilegient davantage chez leur maire les qualites d’ecoute et d’attention plut^ ot que de gestion (Union des Municipalites du Quebec et Ad hoc recherche 2012). Au registre des savoirs, de leur contenu et de leur mode d’acquisition, Hansen et Ejersbo (2002) estiment que les elus suivent une logique inductive, alors que les gestionnaires ont une logique deductive. Les elus sont plus sensibles aux cas particuliers (Crozet 2001), et donc au savoir experientiel. Les biais des postes La position qu’un acteur occupe dans le système politico-administratif local, aux plans formel (poste et secteur d’activite) et informel (influence, place dans la structure du pouvoir) affecte tous les registres. Les responsables administratifs sectoriels auraient tendance a utiliser davantage les indicateurs de performance que les dirigeants corporatifs (Moynihan et Pandey 2010). Les elus utilisent l’information de performance plus souvent dans certains secteurs que dans d’autres (Ter Bogt 2007; Askim 2007). En position de responsabilite, ils recourent davantage a l’information sur la performance que les conseillers ordinaires (Askim 2009). ^ les L’entrecroisement des ro Les recherches sur le partage des r^ oles politique et administratif ont remplace la dichotomie classique par une collaboration multiforme (Mouritzen et Svara 2002; Svara 2006a; Cheong et coll. 2009; Demir et Reddick 2012). Les r^ oles respectifs d’elu et de gestionnaire sont complexes et se chevauchent (Bergstr€ om et coll. 2008). Il est a noter qu’un r^ ole se definit par rapport aux comportements attendus de la part d’un titulaire de poste. Les attentes reciproques conditionnent donc les comportements des deux categories d’acteurs dans leurs interactions (Haidar et coll. 2009; Demir et Reddick 2012). L’exercice de certains r^ oles est susceptible de rendre leur titulaire plus ou moins sensible a divers types d’indicateurs de performance. En raison des affinites dans les schèmes de reference, les indicateurs de fonction de production sont susceptibles d’interesser davantage les elus qui se veulent managers (Kerrouche 2006) ou entrepreneurs (Huron 2001; Belley et coll. 
2011) que ceux qui se voient surtout comme des representants des citoyens (Vabo et Aars 2013) ou comme des emissaires d’un parti (Karlsson 2012). Les multiples r^ oles jouables par les dirigeants administratifs (Loveridge 1968; DANS L’UTILISATION DES INDICATEURS DE PERFORMANCE MUNICIPALE VOLATILITE 103 Wheeland 2000; Sancino et Turrini 2009) peuvent ^etre regroupes selon les trois grandes directions proposees par Jones (2011) et par Siegel (2010) : l’interne (operational edge ou leading down), le superieur (political edge ou leading up) et l’externe (stakeholders edge ou leading out). Par exemple, les r^ oles de conseil dans les relations avec les superieurs elus, et de mobilisateur dans les reseaux externes ne trouvent aucun echo dans les indicateurs les plus frequemment utilises dans les municipalites quebecoises et ontariennes. ^ le sur l’utilisation des L’influence du ro indicateurs se modifie avec le temps Pour les elus, la duree en poste a tendance a restreindre le recours a l’information de performance (Askim 2009); les savoirs evoluent aussi avec l’experience (Jacobsen 2006). Par ailleurs, le jeu de r^ ole est aussi influence par des variables personnelles; les plus frequemment soulignees sont les attitudes et, en particulier, la motivation de service public (Kroll 2014), le support ou le leadership des dirigeants politiques et administratifs (Yang et Hsieh 2007; Moynihan et Pandey 2010; Hoontis et Kim, 2012; Sanger 2013; Carassus et coll. 2014). La dependance de l’utilisation a l’egard de ces variables personnelles est en elle-m^eme source de volatilite. Le changement de leader – soit administratif, soit politique – entra^ıne souvent une retrogradation de la mesure de la performance dans les priorites (Sanger 2013). L’utilisation de la mesure de la performance est donc sensible a la conjoncture organisationnelle. Conclusion Cet essai croise deux courants de documentation : un sur la mesure de la performance municipale, et l’autre sur les relations entre politique et administration. Le niveau d’utilisation des indicateurs se revèle souvent faible et variable dans les municipalites, en depit de multiples preconisations manageriales et parfois, d’exigences gouvernementales. Par contre, peu d’etudes essaient d’analyser deux autres constats : l’utilisation differenciee des types d’indicateurs et sa volatilite dans le temps. Pour rendre compte de tous ces constats, un cadre de comprehension de l’interface politico-administrative locale est propose avec l’hypothèse que le sort des indicateurs de performance se decide dans la dynamique de cette interface. L’interaction entre elus et gestionnaires intègre constamment quatre registres (savoirs, valeurs, inter^ets et emotions) sur lesquels les indicateurs de performance sont plus ou moins pertinents selon leur type. L’attitude des deux groupes a l’egard des indicateurs depend a la fois de logiques de position et d’affinites personnelles, influencees par les contextes locaux et organisationnels specifiques. Il n’est donc pas surprenant que l’interface 104 ETIENNE CHARBONNEAU, GERARD DIVAY, DAMIEN GARDEY politico-administrative soit sujette a d’importantes fluctuations conjoncturelles. Cette mouvance conjoncturelle aide a comprendre la variabilite et la volatilite observees dans l’utilisation des indicateurs. Notre approche analytique debouche sur de nouvelles pistes de recherche et presente un triple inter^et pour les praticiens. 
Pour la recherche, davantage d’analyses longitudinales (quelles que soient leurs difficultes) devraient ^etre effectuees pour mieux comprendre l’impact des fluctuations conjoncturelles de l’interface sur la viabilite de la mesure de la performance. Les analyses devraient porter simultanement sur l’utilisation des divers types d’indicateurs par les elus et par les gestionnaires. Notre cadre d’analyse peut aussi ^etre inspirant pour les praticiens. Premièrement, il permet de donner sens a la volatilite de l’utilisation des indicateurs et de ne pas les decourager de leur faible utilisation par les elus. Les gestionnaires ont un inter^et professionnel a l’utilisation des indicateurs sur la fonction de production et sur l’interaction de services, a la fois pour ameliorer l’offre de services et pour se b^atir une reputation d’efficacite. Cependant ces indicateurs ne repondent que partiellement aux preoccupations des elus et de leurs mandants, les citoyens. Ces preoccupations sont changeantes au gre des resultats electoraux et des conjonctures locales. La volatilite dans l’utilisation des indicateurs de gestion s’explique en partie par des raisons autres que leur degre de pertinence dans la gestion courante. Deuxièmement, le cadre d’analyse permet au gestionnaire de mieux decoder les tensions dans la dynamique de l’interface politico-administrative, en distinguant les divers registres sur lesquels elles peuvent se manifester et en prenant en compte les logiques de statut et de position. La comprehension de cette dynamique est cruciale pour l’etablissement de la « legitimite individuelle des dirigeants » (Durat et Bollecker 2012 : 147), en particulier dans les contextes de changement frequent soit dans la direction politique, soit dans la direction administrative; les tandems maires-directeurs generaux ont une longevite variable. Une meilleure comprehension facilite les ajustements dans les relations formelles et informelles et permet de jauger l’utilite des divers types d’indicateurs dans l’evolution de ces relations. Troisièmement, le cadre d’analyse peut guider le developpement strategique des divers types d’indicateurs. L’instauration d’un système de mesure de la performance depend du soutien des parties prenantes internes, son maintien et son utilisation du support des parties prenantes externes (de Lancer-Julnes et Holzer 2001). Si les gestionnaires souhaitent que les indicateurs les plus proches de leur gestion courante soient aussi utilises par les elus et plus largement, les citoyens, il faut qu’ils se soucient en m^eme temps de developper des indicateurs que ces deux groupes trouvent significatifs en fonction de leurs experiences et de leurs preoccupations. Les elus n’ont pas d’inter^et electoral a valoriser des indicateurs de performance qui ne trouvent pas d’echo auprès de la population. Deux avenues DANS L’UTILISATION DES INDICATEURS DE PERFORMANCE MUNICIPALE VOLATILITE 105 sont possibles pour rendre les indicateurs de performance significatifs pour les citoyens. D’une part, dans la foulee des experiences de definition d’indicateurs avec les citoyens (Ho et Coates 2004; Woolum 2011), il faudrait trouver des indicateurs et des modes de presentation de ces indicateurs qui accroissent leur pertinence sur tous les registres de l’interface politicoadministrative. 
D’autre part, des mecanismes d’animation democratique a partir des indicateurs devraient ^etre experimentes; la viabilite organisationnelle de la mesure de la performance passe sans doute aussi par le developpement d’une capacite locale de lecture et d’interpretation des indicateurs, et par l’instauration de communautes interpretatives (Cornford et coll. 2013). Cette voie ferait retrouver certaines racines du mouvement de la performance locale au debut du siècle dernier, qui etait tiraille entre une vision plus manageriale axee sur l’amelioration de la gestion et une vision plus large cherchant a b^atir une democratie effective, an Efficient Democracy, selon le titre evocateur du livre d’Allen (1912). La première vision l’a emporte et prevaut toujours, avec les resultats que l’on constate. Ne devraiton pas davantage s’inspirer de la seconde? Bibliographie Albalate Del Sol, Daniel. 2013. “The institutional, economic and social determinants of local government transparency.” Journal of Economic Policy Reform 16 (1): 90–107. Alavi, Maryam et Dorothy E. Leidner. 2001. “Review: Knowledge management and knowledge management systems: Conceptual foundations and research issues.” MIS Quarterly 25 (1): 107–136. Allen, William Harvey. 1912. Efficient Democracy. New York: Dodd, Mead and Company. Ammons, David N. 2012. Municipal Benchmarks: Assessing Local Performance and Establishing Community Standards. Armonk, NY, M.E. Sharpe. ——. 2013. “Signs of performance measurement progress among prominent city governments.” Public Performance & Management Review 36 (4): 507–28. Ammons, David N. et William C. Rivenbark. 2008. “Factors influencing the use of performance data to improve municipal services: Evidence from the North Carolina benchmarking project.” Public Administration Review 68 (2): 304–18. Askim, Joestein. 2007. “How do politicians use performance information? An analysis of the Norwegian local government experience. ” International Review of Administrative Sciences 73 (3): 453–72. ——. 2009. “The demand side of performance measurement: Explaining councillors’ utilization of performance information in policymaking.” International Public Management Journal 12 (1): 24–47. Behn, Robert D. 2008. “The Core Drivers of CitiStat” dans D. Ammons, (sous la direction de), Leading Performance Management in Local Government. Washington: ICMA: 155–77. Bellavance, François, Etienne Charbonneau et Stephane Messier. 2010. Resultats du sondage sur l’utilisation des indicateurs de gestion. Quebec : MAMROT. Belley, Serge, Marc-Andre Lavigne, Louise Quesnel et Paul Villeneuve. 2011. « Les maires entrepreneurs : un nouveau style de leadership local », dans INM, Etat du Quebec 2011. Montreal : Boreal. Bergstr€ om, Tomas, Hakan Magnusson et Ulf Ramberg. 2008. “Through a glass darkly: Leadership complexity in Swedish local government.” Local Government Studies 34 (2): 203–20. 106 ETIENNE CHARBONNEAU, GERARD DIVAY, DAMIEN GARDEY Berman, Evan, et XiaoHu Wang. 2000. “Performance measurement In U.S. Counties: Capacity for reform.” Public Administration Review 60 (5): 409–20. Bouckaert, Geert, et Walter Balk. 1991. “Public productivity measurement: Diseases and cures.” Public Productivity & Management Review 15 (2): 229–35. Carassus, David et Damien Gardey. 2009. « Une analyse de la gestion de la performance par les collectivites locales françaises : un modèle administratif ou politique? » Revue Française de Finances Publiques 107 : 101–30. Carassus, David, Christophe Favoreu et Damien Gardey. 2014. 
“Factors that determine or influence managerial innovation in public contexts: The case of local performance management.” Public Organization Review 14 (2): 245–66. Charbonneau, Etienne. 2010. “Use and Sensemaking of Performance Measurement Information by Local Government Managers: The Case of Quebec’s Municipal Benchmarking System.” Thèse de doctorat. Newark, NJ: Rutgers University. Charbonneau, Etienne et François Bellavance. 2012. Performance Management in a Canadian Municipal Benchmarking Regime, Working paper. Montreal, QC : ENAP et HEC Montreal. Cheong, Jong One, Kim ChulowooRhee, Dong Young, et Yahong Zhang. 2009. “The policy role of city managers: An empirical analysis of cooperative relationship in policy process between city managers and elected officials.” International Review of Public Administration 14 (2): 25–36. Chung, Yeonsoo. 2005. Factors Affecting Uses and Impacts of Performance Measures in Mid-Sized U.S. Cities. Thèse de doctorat. Knoxville, TN: University of Tennessee. Cornford, James, Rob Wilson, Suzan Baines et Ranal Richardson. 2013. “Local governance in the new information ecology: The challenge of building interpretative communities. ” Public Money & Management 33 (3): 201–08. Cristofoli, Daniela, et Paolo Crugnola. 2012. “To run or not to run (Again) for political office... at the crossroads between public values and self-interested benefits.” Yearbook of Swiss Administrative Sciences 2012: 91–105. Crozet, Paul. 2001. « La mise en jeu de la responsabilite des acteurs territoriaux: une nouvelle donne pour la gestion locale? » Politiques et management public 19 (3) : 55–77. Damanpour, Fariborz, et Marguerite Schneider. 2006. “Phases of the adoption of innovation in organizations: Effects of environment, organization and top managers.” British Journal of Management 17(3): 215–36. de Lancer-Julnes, Patricia, et Mark Holzer, 2001. “Promoting the utilization of performance measures in public organizations: An empirical study of factors affecting adoption and implementation.” Public Administration Review 61(6): 693–708. Demaj, Labinot, et Lucas Summermatter. 2012. “What should we know about politicians’ performance information need and use?” International Public Management Review 13 (2): 85–111. Demir, Tansu, et Christopher Reddick. 2012. “Understanding shared roles in policy and administration: An empirical study of council-manager relations.” Public Administration Review 72 (4): 526–35. Durat, Laurence, et Marc Bollecker. 2012. « La legitimite manageriale : le cas des directeurs generaux de service. » Politiques et management Public 29 (2) : 145–65. Folz, David. H, Reem Abdelrazek et Yeonsoo Chung. 2009. “The adoption, use and impacts of performance measure in medium-size cities: Progress toward performance management.” Public Performance & Management Review 33 (1): 63–87. Gerrish, E, et P-J Wu. 2013. “Performance Management in the Public Sector”, dans E.J. Ringquist (sous la direction de), Meta-Analysis for Public Management and Policy. San Francisco, CA: Jossey-Bass: 352–97. Haidar, Ali, Mark Reid et Keri Spooner. 2009. “Politicization and managerial values: Responses from New Zealand councillors, 2009 IERA Conference-Book of Proceedings.” Available at https://epress.lib.uts.edu.au/research/handle/10453/11483 (consulte le 5-07-2013) DANS L’UTILISATION DES INDICATEURS DE PERFORMANCE MUNICIPALE VOLATILITE 107 Hammerschmid, Gerhard, Steven Van deWalle et Vid Stimac. 2013. 
“Internal and external use of performance information in public organizations: Results from an international survey.” Public Money & Management 33 (4): 261–68. Hanssen, GroSandkjaer. 2007. “ICT in norwegian local government – empowering the politicians?” Local Government Studies 33 (3): 355–82. Hansen, Kasper M, et Niels Ejersbo. 2002. “The relationship between politicians and administrators – A logic of disharmony.” Public Administration 80 (4): 733–50. Ho, Alfred Tat-Kei. 2006. “Accounting for the value of performance measurement from the perspective of midwestern mayors.” Journal of Public Administration Research and Theory 16 (2): 217–37. Ho, Alfred Tat-Kei, et Paul Coates. 2004. “Citizen-initiated performance assessment: The Initial Iowa experience.” Public Performance & Management Review 37 (3): 29–50. Ho, Shih-Jen Kathy et Yee-Ching Lilian Chan. 2002. “Performance measurement and the implementation of balanced scorecards in municipal governments.” Journal of Government Financial Management 51 (4): 8–19. Holzer, Mark, et Kaifeng Yang. 2004. “Performance measurement and improvement: An assessment of the state of the art.” International Review of Administrative Sciences 70 (1): 15– 31. Hood, Christopher. 2007. “Public service management by numbers: Why does it vary? Where has it come from? What are the gaps and the puzzles?” Public Money & Management 27 (2): 95–102. Hoontis, Peter, et Taehee Kim. 2012. “Antecedents to municipal performance measurement implementation.” Public Performance & Management Review 36 (1): 158–73. Huron, David. 2001. « Une typologie de maires entrepreneurs politiques comme aide au conseil dans les mairies. » Politiques et management public 19 (2) : 63–81. Jacobsen, Dag Ingvar. 2006. “The relationship between politics and administration: The importance of contingency factors, formal structure, demography, and time.” Governance 19 (2): 303–23. James, Oliver, et Peter John. 2007. “Public management at the ballot box: Performance information and electoral support for incumbent english local governments.” Journal of Public Administration Research and Theory 17 (4): 567–80. Jones, Stephen. 2011. Superheroes or puppets? “Local government chief executive officers in victoria and queensland.” Journal of Economic and Social Policy 14 (2), article 6. Karlsson, Martin. 2012. “Participatory initiatives and political representation: The case of local councillors in sweden.” Local Government Studies 38 (6): 795–815. Kernaghan, Kenneth. 2003. “Integrating values into public service: The values statement as centerpiece.” Public Administration Review 63 (6): 711–19. Kerrouche, Eric. 2006. « Les maires français, des managers professionnels ? » dans Annuaire des collectivites locales. Tome 26, La gouvernance territoriale : 83–98. Kroll, Alexander. 2013. “The other type of performance information: Non routine feedback, Its relevance and use.” Public Administration Review 73 (2): 265–76. ——. 2014. “Why performance information use varies among public managers: Testing manager-related explanations.” International Public Management Journal 17 (2): 174–201. Kroll, Alexander, et Isabella Proeller. 2013. “Controlling the control system: Performance information in the german childcare administration.” International Journal of Public Sector Management 26 (1): 74–85. Kuhlmann, Sabine. 2010. “Performance measurement in European local governments: a comparative analysis of reform experiences in Great Britain, France, Sweden and Germany. 
International Review of Administrative Sciences 76 (2): 331–45.
Liguori, Mariannunziata, Mariafrancesca Sicilia, et Ileana Steccolini. 2012. “Some like it non-financial – politicians’ and managers’ views on the importance of performance information.” Public Management Review 14 (7): 903–22.
Loveridge, Ronald O. 1968. “The city manager in legislative politics: A collision of role conceptions.” Polity 1 (2): 213–36.
Maurel, Christophe, David Carassus, et Damien Gardey. 2012. « Les démarches locales de performance publique face à la LOLF : mimétisme ou innovation ? » Politiques et management public 28 (4) : 417–42.
Mazouz, Bachir, Jean Leclerc, et Marcel Tardif. 2010. « Des préalables à la mesure et à l’évaluation dans la sphère publique : La gestion publique à l’interface politico-administrative », dans B. Mazouz (sous la direction de), La gestion intégrée par résultats : Concevoir et gérer autrement la performance dans l’Administration publique. Québec : Presses de l’Université du Québec : 131–60.
McGowan, Robert P., et Theodore H. Poister. 1985. “Impact of productivity measurement systems on municipal performance.” Policy Studies Review 4 (3): 532–40.
McNeill, Donald. 2001. “Embodying a Europe of the cities: Geographies of mayoral leadership.” Area 33 (4): 353–59.
Melkers, Julia, et Katherine Willoughby. 2005. “Models of performance-measurement use in local governments: Understanding budgeting, communication, and lasting effects.” Public Administration Review 65 (2): 180–90.
Molina, Anthony DeForest, et Cassandra L. McKeown. 2012. “The heart of the profession: Understanding public service values.” Journal of Public Affairs Education 18 (2): 375–96.
Montesinos, Vicente, Isabel Brusca, Francesca Manes Rossi, et Natalia Aversano. 2013. “Usefulness of performance reporting in local government: Comparing Italy and Spain.” Public Money & Management 33 (3): 171–76.
Mouritzen, Poul Erik, et James Svara. 2002. Leadership at the Apex: Politicians and Administrators in Western Local Governments. Pittsburgh, PA: University of Pittsburgh Press.
Moynihan, Donald P. 2008. The Dynamics of Performance Management: Constructing Information and Reform. Washington, DC: Georgetown University Press.
Moynihan, Donald P., et Sanjay K. Pandey. 2010. “The big question for performance management: Why do managers use performance information?” Journal of Public Administration Research and Theory 20 (4): 849–66.
Moynihan, Donald P., et Stephane Lavertu. 2012. “Do performance reforms change how federal managers manage?” Issues in Governance Studies (52): 1–9.
Moynihan, Donald P., Sanjay K. Pandey, et Bradley E. Wright. 2012a. “Prosocial values and performance management theory: Linking perceived social impact and performance information use.” Governance 25 (3): 463–83.
——. 2012b. “Setting the table: How transformational leadership fosters performance information use.” Journal of Public Administration Research and Theory 22 (1): 143–64.
Nalbandian, John. 1994. “Reflections of a ‘Pracademic’ on the logic of politics and administration.” Public Administration Review 54 (6): 531–36.
——. 2006. “Politics and administration in local government.” International Journal of Public Administration 29 (12): 1049–63.
Nonaka, Ikujiro, Ryoko Toyama, et Noboru Konno. 2000. “SECI, Ba and leadership: A unified model of dynamic knowledge creation.” Long Range Planning 33 (1): 5–34.
Poister, Theodore H., et Gregory Streib. 1999.
“Performance measurement in municipal government: Assessing the state of the practice.” Public Administration Review 59 (4): 325–35.
Pollitt, Christopher. 2006. “Performance information for democracy: The missing link?” Evaluation 12 (1): 38–55.
Rogers, Martha Kinney. 2006. Explaining Performance Measurement Utilization and Benefits: An Examination of Performance Measurement Practices in Local Governments. Thèse de doctorat. Raleigh, NC: North Carolina State University.
Sancino, Alessandro, et Alex Turrini. 2009. “The managerial work of Italian city managers: An empirical analysis.” Local Government Studies 35 (4): 475–91.
Sanger, Mary Bryna. 2013. “Does measuring performance lead to better performance?” Journal of Policy Analysis and Management 32 (1): 185–207.
Schatteman, Alicia. 2010. “The state of Ontario’s municipal performance reports: A critical analysis.” Canadian Public Administration 53 (4): 531–50.
Schatteman, Alicia, et Etienne Charbonneau. 2010. “A comparative study of municipal performance measurement systems in Ontario and Quebec, Canada.” International Journal of Public Sector Performance Management 1 (4): 360–75.
Siegel, David. 2010. “The leadership role of the municipal chief administrative officer.” Canadian Public Administration 53 (2): 139–61.
Smith, Peter. 1995. “On the unintended consequences of publishing performance data in the public sector.” International Journal of Public Administration 18 (2–3): 277–310.
Streib, Gregory, et Theodore Poister. 1999. “Assessing the validity, legitimacy, and functionality of performance measurement systems in municipal governments.” American Review of Public Administration 29 (2): 107–23.
Svara, James H. 2006a. “The search for meaning in political-administrative relations in local government.” International Journal of Public Administration 29 (12): 1065–90.
Talbot, Colin. 2000. “Performing ‘Performance’: A comedy in five acts.” Public Money and Management 20 (4): 63–9.
Ter Bogt, Henk J. 2007. “Politicians in search of performance information? – Survey research on Dutch aldermen’s use of performance information.” Financial Accountability & Management 20 (3): 221–52.
Torres, Lourdes, Vicente Pina, et Ana Yetano. 2011. “Performance measurement in Spanish local governments: A cross-case comparison study.” Public Administration 89 (3): 1081–1109.
Union des Municipalités du Québec et Ad hoc recherche. 2012. Les perceptions et les attentes des Québécois et Québécoises à l’égard de leur municipalité. Livre blanc sur l’avenir des municipalités. Montréal, QC : UMQ.
Vabo, Signy Irene, et Jacob Aars. 2013. “New Public Management reforms and democratic legitimacy: Notions of democratic legitimacy among West European local councillors.” Local Government Studies 39 (5): 703–20.
Van De Walle, Steven, et Wouter Van Dooren. 2008. “Introduction: Using public performance information.” Dans W. Van Dooren et S. Van De Walle (sous la direction de), Performance Information in the Public Sector: How It Is Used. Houndmills, UK: Palgrave Macmillan: 1–8.
Van Dooren, Wouter, Geert Bouckaert, et John Halligan. 2010. Performance Management in the Public Sector. New York, NY: Routledge.
Wang, XiaoHu, et Evan Berman. 2001. “Hypotheses about performance measurement in counties: Findings from a survey.” Journal of Public Administration Research and Theory 11 (3): 403–27.
Weible, Christopher M. 2011.
“Political-administrative relations in collaborative environmental management.” International Journal of Public Administration 34 (7): 424–35.
Wheeland, Craig. 2000. “City management in the 1990s: Responsibilities, roles and practices.” Administration & Society 32 (3): 255–81.
Williams, Daniel W. 2004. “Evolution of performance measurement until 1930.” Administration & Society 36 (2): 131–59.
Woolum, Janet. 2011. “Citizen involvement in performance measurement and reporting.” Public Performance & Management Review 35 (1): 79–102.
Yang, Kaifeng, et Jun Yi Hsieh. 2007. “Managerial effectiveness of government performance measurement: Testing a middle-range model.” Public Administration Review 67 (5): 861–79.

Etienne Charbonneau
François Bellavance

Performance management in a benchmarking regime: Quebec’s municipal management indicators

Canadian Public Administration / Administration publique du Canada 58 (1) (March/mars 2015): 110–137.
© The Institute of Public Administration of Canada / L’Institut d’administration publique du Canada 2015.

Abstract: We study the occurrence of performance management in a provincially mandated, yet flexible, municipal performance regime in Quebec. Statistical analyses of the determinants of general, management, budgeting and reporting uses of performance information are performed for 321 municipalities. We combined perception data on uses and internal characteristics with archival data on socioeconomic and political characteristics and on the performance of municipal services. Our results reveal that the propensity to use performance information is not affected by operating in a small municipality or evolving in a hotly disputed political environment. The strongest predictors of performance management are performance itself and managerial attitudes toward performance measurement.

Sommaire : Nous étudions la fréquence de l’utilisation des indicateurs de performance dans un régime de performance municipal au Québec, prescrit par la province, mais souple. Des analyses statistiques des déterminants de l’usage de l’information sur la performance à des fins générales, de gestion, de budgétisation et de présentation de rapports ont été réalisées pour 321 municipalités. Nous avons combiné des données de perception des usages et des caractéristiques internes à des données d’archives sur les caractéristiques socioéconomiques et politiques et sur la performance des services municipaux. Nos résultats révèlent que la propension à utiliser les renseignements sur la performance n’est pas affectée par le fait que l’on fonctionne dans une petite municipalité ou que l’on évolue dans un milieu politique très disputé. Les plus forts indicateurs prévisionnels de la gestion de la performance sont la performance elle-même et les attitudes des gestionnaires à l’égard de la mesure de la performance.

Etienne Charbonneau is assistant professor at the École nationale d’administration publique, and a member of the CREXE research center. François Bellavance is professor of management sciences at HEC Montréal. They would like to thank Gregg G. Van Ryzin and Donald P. Moynihan for their comments on a previous version of the paper.

It has long been taken for granted by researchers that the presence of performance data would lead to (better) informed decisions (for a recent review, see Aubert and Bourdeau 2012). As a result, “while the production of performance information has received considerable attention in the public sector performance measurement and management literature, actual use of this information has traditionally not been very high on the research agenda” (Van De Walle and Van Dooren 2008: 2). Until recently, most of
the large body of research on performance measurement in the public sector examined the determinants of its existence (Streib and Poister 1999; Holzer and Yang 2004; Chung 2005), its implementation (Palmer 1993; Johnsen 1999; de Lancer-Julnes and Holzer 2001; Ho and Chan 2002; Jordan and Hackbart 2005), its perceived benefits (McGowan and Poister 1985; Berman and Wang 2000; Rogers 2006) and shortcomings (Radin 1998; Hood 2007a). There is now a mounting number of studies on the determinants of the use of performance information at the local level by managers (de Lancer-Julnes and Holzer 2001; Moynihan and Pandey 2010; Kwon and Jang 2011; Moynihan and Hawes 2012; Moynihan, Pandey and Wright 2012b) and by elected officials (Ho 2006; Askim 2009). This is the first Canadian study on the determinants of performance management at the local level.

This study answers the same research question formulated by this recent stream of research on the statistical determinants of performance management: What are the factors, whether controllable or uncontrollable, that account for the uses of performance measurement by municipal managers? After all, performance measurement is an information-based managerial tool. It is only valuable if it is utilized. This study will not only investigate the general use of performance information, but also the management, budgeting and reporting uses of performance information. Moynihan, Pandey and Wright (2012a: 470) would label management and budgeting uses as “purposeful” and reporting uses as “political.” In step with de Lancer-Julnes and Holzer (2001), Rogers (2006), and Moynihan and Pandey (2010), we try to identify the determinants of the use of performance information.

The Indicateurs de gestion municipaux [Municipal Management Indicators] initiative, a mandatory municipal benchmarking regime in operation in the province of Quebec, constitutes the setting of this study. Quebec is an interesting case in which to study performance management at the local level. First, it is one of the three mandatory municipal performance regimes in North America, with Ontario and Nova Scotia. Second, the performance measurement regime is only mandatory regarding the collection and transmission of standardized indicators. It is not a rigid performance measurement regime like England’s defunct Comprehensive Area Assessment. Rather, it is a compulsory and publicly funded version of the performance consortiums in operation in the United States, like the North Carolina Benchmarking Network, the Southeastern Results Network, and the Florida Benchmarking Consortium.

There are at least three limitations of the existing literature on the use of performance information in the public sector. First, the dependent and independent variables of the regression models come from survey instruments that were previously used in other governmental settings. Second, the information carried by the performance indicators themselves will be controlled for in the analyses, as suggested by Boyne (2010: 216, 218). Third, and more importantly, the present study encompasses many small and very small municipalities.
Most studies on the use of municipal performance information are targeted at medium to large municipalities: between 10,000 and 100,000 residents (Ho 2003), more than 25,000 residents (Melkers and Willoughby 2005; Rogers 2006), more than 50,000 residents (Moynihan and Pandey 2010; Moynihan, Pandey and Wright 2012b), between 25,000 and 250,000 residents (Chung 2005). To our knowledge, only two studies include smaller municipalities. In Johansson and Siverbo’s (2009) study of Swedish municipalities, the smallest municipality in Sweden, Bjurholm, has 2,500 residents (Statistics Sweden 2009). In Rivenbark and Kelly’s (2003) study of smaller municipalities, the sample is focused on localities between 2,500 and 24,999 residents. In comparison, two-thirds of the 1,113 municipalities in Quebec have fewer than 2,000 residents. Rural municipalities make up a large share of municipalities in Canada. Patterns of use of performance information in small and very small municipalities are close to unknown.

The article proceeds as follows. The next section covers in more depth the previous research on the use of performance information, followed by a description of the data and methods used to test our hypotheses. Lastly, we present the results from our regression analyses and conclude with a discussion of the importance of the findings.

Previous research

The general use of performance information by local managers (Poister and Streib 1999; Wang and Berman 2001; Chung 2005; Moynihan and Pandey 2010) and local politicians (Askim 2007 and 2008) has been studied. Nevertheless, studies assessing the extent of the use of performance measurement seldom take into account subtle symbolic uses (de Lancer-Julnes 2006: 227) or attitudes towards activities (Pollitt 2013: 349). In past research, when the use of performance measurement was addressed, it was often referred to in a general way. The proportion of local or subnational agencies that used certain kinds of measures was analyzed. The typical findings are that output measures were more prevalent than outcome measures (Usher and Cornia 1981: 233; McGowan and Poister 1985: 534; Palmer 1993: 32; Berman and Wang 2000: 413; de Lancer-Julnes and Holzer 2001: 699; Wang and Berman 2001: 414). Indicatively, the use of performance measurement was operationalized by researchers in simple, straightforward ways, such as asking managers if “we use performance measures in this municipality” or “we do not use performance measures in this municipality” (emphasis in the original survey used for Poister and Streib 1999, and Streib and Poister 1999). The Government Finance Officers Association of the United States and Canada (GFOA) asked jurisdictions if performance measurement was being used in their governments in ‘some way’ (Kinney 2008: 47). The National Administrative Studies Project (NASP-IV) asked local managers the extent to which they “regularly use performance information to make decisions” (Moynihan and Pandey 2010: 857). Swedish local managers were asked, if they had adopted the Relative Performance Evaluations system, the extent to which “one make[s] use of ratio comparisons” in their municipality (Johansson and Siverbo 2009: 206).
On other occasions, the general use of performance measurement was operationalized as the proportion of municipalities reporting consumption of performance information in general terms like decision making (Fountain 1997) or the use “in selected departments or program areas” (Streib and Poister 1999: 111). Berman and Wang (2000) sought to bridge the “input/output/outcomes” and “general use” studies of performance information with the “task relying on performance information” use studies. Their study of U.S. counties with populations over 50,000 establishes whether organizations saying they use performance information would have the capacity to include this information in their operations. The authors found that among those using performance measurement, about one-third had what the authors regarded as an “adequate” level of capacity to actually use the information (Berman and Wang 2000: 417). Recent studies look at actual, or purposeful (Moynihan, Pandey and Wright 2012b), uses of performance information instead of a more passive general use.

Performance management

In the “task relying on performance information” use studies, the use of performance measurement information concentrated less on the proportion of input, output or outcome measures or on sweeping generalities. Instead, these studies looked more at the activities and functionalities where performance information was used. Using Berman and Wang’s (2000) data, Wang and Berman (2001) assessed the link between what they called the “deployment” and “purposes” of output and outcome measures in county government. The authors asked county managers if their jurisdiction uses performance measurement to accomplish nine tasks, including “communicating between public officials and resident,” “monitor the efficiency/effectiveness of services,” and “determine funding levels for individual programs” (Wang and Berman 2001: 415). The proclaimed use of performance information was quite high. It spanned from 53 percent for “determine funding priorities across programs” to 82 percent for “communicate between managers and commission.” A task like “communicating between public officials and resident” in the survey did not encompass “actual use” of performance measurement, but reporting (Ammons and Rivenbark 2008: 305). The actual use of performance information excludes “(. . .) simply reporting measures or somewhat vaguely considering measures when monitoring operations” and implies “(. . .) evidence of an impact on decisions at some level of the organization” (Ammons and Rivenbark 2008: 305). According to that definition, only three out of six of the specific uses of de Lancer-Julnes and Holzer (2001: 700) would qualify as “actual uses” of performance. The result of de Lancer-Julnes and Holzer (2001: 708) is that few state, county and municipal managers reported using performance for specific activities “sometimes” or “always.” At the municipal level, results slightly different from those for county-level governments were obtained using ICMA data from managers in U.S. municipalities with populations between 25,000 and 250,000 (Chung 2005). The results from the 173 managers who reported having a performance measurement initiative were that performance information was reported to be used by approximately 25 percent of municipalities (for strategic planning) to just under 50 percent (for managing/evaluating programs), depending on the function (Chung 2005: 116).
Rogers’ (2006) study used 277 GASB-generated surveys to assess the use of performance measurement in diverse local initiatives in the United States. The unit of analysis here is different from Chung’s (2005). “Use” in Rogers’ (2006) study looks at the proportion of departments within a municipality that reportedly use performance information, whereas Chung (2005) looks at municipalities as a whole. Once more, the results point out that the reported actual use of performance information at the municipal level in the departments is rarely higher than 50 percent.

Use of comparative performance information

One of the earliest studies covering performance management behavior in general, and the use of benchmarking tangentially, focused on local authorities in England (Palmer 1993), before the implementation of the Best Value system and the Comprehensive Performance Assessment regime. Managers identified if they used certain indicators as referents, or benchmarks. Her results on the use of benchmarking do not specify what is meant by uses, what the frequency of use is, what percentage of decisions they are used for, or whether they impacted decisions. Still, it is informative to see that 63 percent of managers expressed that they used internal (historical) benchmarking and 56 percent indicated that they used external benchmarking (comparisons with other local authorities) (Palmer 1993: 33). In 1999, postal surveys of General Managers and management accountants based in the UK showed that only a third of all respondents, when identifying the reasons for participating in benchmarking activities, “saw benchmarking as a source of new ideas, or route to improvement building on observed best practice” (Holloway, Francis and Hinton 1999: 355). Later, Boyne and colleagues (2002) performed content analysis of “performance plans” in Wales, under the Best Value system. They paid specific attention to the presence of benchmarking information contained in these “performance plans.” The authors found that:

The percentage of plans including comparisons of performance is extremely low. This limited use of comparisons is surprising because benchmarking was one of the key elements of the review. Some plans contained comparative data gained through benchmarking, but not all pilots who were members of the same benchmarking club included the data. Some PPs utilized the Citizen’s Charter indicators published by the Audit Commission. Only a few pilots produced extensive comparative data in the PP. In some cases comparative data are provided, but are difficult to interpret as there is little or no information on the comparator organizations (Boyne et al. 2002: 703).

One of the few studies that specifically present their results on managers’ use of benchmarking information is Johansson and Siverbo’s (2009) study of 207 (out of 290) Swedish municipalities. They find that 40 percent of Swedish municipal managers reported using relative performance evaluations, that is, comparative benchmarking information, “to a great extent” (Johansson and Siverbo 2009: 207). Establishing from the literature the precise proportion of local government managers claiming to use performance information is difficult. The expression “use of performance measurement” or “use of benchmarking information” takes different meanings in different studies. The samples and the questionnaires vary between studies.
If the reader can tolerate some discrepancies between the few different perception surveys that studied the role of performance information for local managers and politicians, and the inherent bias for respondents to overstate socially desirable management behaviors, an overall picture emerges. Performance indicators are not widely used.

Data

The present research is based on municipalities in Quebec. The use of performance information by managers is analysed. Since 2004, all municipalities in that province have been mandated to collect and report a set of standardized performance indicators. Quebec is the third and last province to have implemented a provincially mandated municipal performance regime. The e-mail addresses of General Managers were transmitted to our research team by the Ministère des Affaires municipales, des Régions et de l’Occupation du territoire (MAMROT). An electronic survey1 was sent to all General Managers in the 1,113 municipalities in Quebec in the winter of 2009-2010. The survey instrument was pretested with twelve active members of the Partners on Municipal Management Indicators Committee in the fall of 2009. Many of them are current or former municipal General Managers and/or Chief Financial Officers. The data for this study cover the whole population of municipalities in Quebec. The fact that MAMROT had electronic contact information for every municipality dissipates some of the concerns about a digital divide that could potentially be present in the province of Quebec, given the proportion of municipalities of less than 5,000 residents. Table 1 summarizes the distribution of these municipalities according to population size. It also includes the population size of survey respondents.

Table 1. Quebec Municipalities by Population Size in 2009, and Survey Participation

Size of municipalities | Number of municipalities | Number of survey participants (%) | Number of observations in the statistical models, due to missing values on independent variables (%)
0 to 499 | 206 | 84 (40.8%) | 67 (32.5%)
500 to 999 | 272 | 98 (36.0%) | 83 (30.5%)
1,000 to 1,999 | 261 | 82 (31.4%) | 67 (25.7%)
2,000 to 2,999 | 114 | 35 (30.7%) | 28 (24.6%)
3,000 to 4,999 | 91 | 30 (33.0%) | 23 (25.3%)
5,000 to 9,999 | 73 | 22 (30.1%) | 19 (26.0%)
10,000 to 24,999 | 55 | 19 (34.6%) | 16 (29.1%)
25,000 to 49,999 | 23 | 12 (52.2%) | 12 (52.2%)
50,000 to 99,999 | 9 | 5 (55.6%) | 4 (44.4%)
100,000 and over | 9 | 4 (44.4%) | 2 (22.2%)
Total | 1,113 | 391 (35.1%) | 321 (28.8%)

As we observe in Table 1, population-stratified reminders contributed to receiving a response from 391 municipalities out of 1,113, with at least 30% of municipalities in each stratum. This is very similar to the 33% response rate registered by Schatteman’s (2010) study of municipal performance reports in Ontario. The unit of analysis for this study is the municipality. The survey was addressed to the General Manager, the highest-ranking municipal administrative employee. Most of the returned surveys, 312 out of 391, were filled out by the General Manager. The rest were filled out by someone else in the organization in charge of the indicators, often the Assistant General Manager or the Chief Financial Officer.

The data used in this research come from two sources. A self-administered survey on the behaviors and perceptions of managers toward the management indicators was merged with the values of the so-called “hard performance information” (that is, the mandatory indicators).
An identification code on the surveys made it possible to join the values of the performance information from MAMROT’s dataset with the values obtained from the survey for the different dependent variables and the independent variables. Due to missing values on one or more independent variables for 70 respondents, the data from 321 survey questionnaires were used in the statistical analyses (last column of Table 1).

Dependent variable: performance management

Use of performance information

At least four different kinds of uses can be derived from the survey: general use, management use, budgeting use and reporting use. In this study, similarly to Moynihan and Pandey (2010), General Managers were asked how often the Municipal Management Indicators are used in their municipality: “According to your observations, what is the utilization level in your municipality?” This question differentiates between managers who do not use performance information at all and those who do occasionally, often, and very often. It does not differentiate between symbolic or passive uses and actual or purposeful uses of performance information1. Managers who do not use management indicators skipped the next questions on more specific uses of performance information.

The descriptive statistics at the top of Table 2 reveal that the performance information carried through the management indicators is not widely used. Of the 321 respondents with complete survey data, 51.1% said that the performance indicators are never used in their municipalities. The rest of the managers say that performance information is used. Of these, 44.2% are from municipalities where the information is occasionally used, 4.7% are from municipalities where this information is used often, and none use it very often. The distribution is very similar in the sample of 391 respondents, with one manager reporting that the management indicators are used very often.

Table 2. Descriptive Statistics of the Dependent Variables – General Use, Management Use, Budgeting Use, Reporting Use (n = 321)

Statements and dependent variables | n | %
General use – “Data collection and reporting of Municipal Management Indicators has been mandatory for all municipalities since 2003. According to your observations, what is the utilization level in your municipality?”
Very often | 0 | 0
Often | 15 | 4.7
Occasionally | 142 | 44.2
Never | 164 | 51.1
At least occasionally | 157 | 48.9
Management use – “From what you have observed, indicate what are the reasons for which management indicators are used in your municipality:”
In establishing contracts for services | 7 | 2.2
Managing operations or routine decisions | 17 | 5.3
Evaluation to establish underlying reasons for results | 49 | 15.3
Specific performance improvement initiatives | 30 | 9.3
At least one of the above | 75 | 23.4
Budget use – “From what you have observed, indicate what are the reasons for which management indicators are used in your municipality:” and “Indicate whether at least one mandatory indicator was explicitly mentioned in:”
To prepare budgets, including resource allocations or discussion of resource reallocations | 27 | 8.4
Annual budget | 78 | 24.3
Annual report on the financial situation | 77 | 24.0
At least one of the above | 113 | 35.2
Reporting use – “From what you have observed, indicate what are the reasons for which management indicators are used in your municipality:”
To provide feedback to managers and employees | 32 | 10.0
To report to elected officials | 97 | 30.2
To report to citizens, citizen groups or to inform the media | 48 | 14.9
At least one of the above | 125 | 38.9

A number of existing survey instruments measuring the use of performance information were presented in the literature review. Rogers (2006)
differentiates between different uses: use for management, use for budgeting and use for reporting. This differentiation of uses makes this Governmental Accounting Standards Board (GASB) instrument more precise than other survey instruments measuring the managerial use of performance information (for example, NASP-IV). For this reason, the GASB instrument was adapted to the reality of Quebec’s municipal environment. In a similar way to Rogers (2006), survey takers were asked: “From what you have observed, indicate what are the reasons for which management indicators are used in your municipality,” and “Indicate if, since their implementation, mandatory management indicators for the different functions and activities have been explicitly mentioned [appeared] in the preparation of the budget and the annual report on the financial situation in your municipality.” In Rogers’ (2006) dissertation, there were seven items for management uses, six items for budgeting uses, and three items for reporting uses. For all the different question items, respondents were given five options about the proportion of departments using them, from no department at all to all departments. An additive index was set up for each use.

To keep true to the realities of Quebec’s benchmarking regime, and after suggestions from the twelve people who pretested the survey questionnaire, the number of items for each of the different uses was reduced. In the current survey, there are four items for management uses, three items for budgeting uses, and three items for reporting uses. Also, to reflect the fact that the survey was sent to many small and very small municipalities, the survey takers only had to express whether items for specific uses were indeed used. Because the majority of municipalities in Quebec are very small, it did not make sense to ask the proportion of departments that were using indicators for specific functions. For the three specific use indices, a binary variable was computed to indicate whether the Municipal Management Indicators are used for at least one item or not. What can be noticed in Table 2 is that uses related to reporting, with 38.9% of respondents indicating at least one reporting use, have a slightly higher occurrence than budgeting (35.2%), and especially managing (23.4%), uses. In total, there are four dependent binary variables in this study, one for every type of use.
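To make the construction of these binary dependent variables concrete, the following is a minimal sketch, assuming a hypothetical pandas DataFrame in which each survey item is coded 1 if it was checked and 0 otherwise; the column names are illustrative and are not the authors’ actual variable names.

```python
import pandas as pd

# Hypothetical survey extract: one row per municipality, one column per item,
# coded 1 if the item was checked ("indicators are used for this purpose"), 0 otherwise.
survey = pd.DataFrame({
    "contracts": [0, 1, 0], "operations": [0, 1, 0], "evaluation": [1, 0, 0], "improvement": [0, 0, 0],
    "prepare_budget": [0, 1, 0], "annual_budget": [1, 1, 0], "financial_report": [0, 0, 0],
    "feedback": [0, 0, 0], "report_elected": [1, 1, 0], "report_citizens": [0, 0, 0],
    "general_use": ["occasionally", "often", "never"],
})

management_items = ["contracts", "operations", "evaluation", "improvement"]
budgeting_items = ["prepare_budget", "annual_budget", "financial_report"]
reporting_items = ["feedback", "report_elected", "report_citizens"]

# Binary dependent variables: 1 if at least one item of the index is checked, 0 otherwise.
survey["management_use"] = (survey[management_items].sum(axis=1) > 0).astype(int)
survey["budgeting_use"] = (survey[budgeting_items].sum(axis=1) > 0).astype(int)
survey["reporting_use"] = (survey[reporting_items].sum(axis=1) > 0).astype(int)
# General use is dichotomized as "used at least occasionally" versus "never".
survey["general_use_bin"] = (survey["general_use"] != "never").astype(int)
print(survey[["general_use_bin", "management_use", "budgeting_use", "reporting_use"]])
```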
Independent variables: performance, barriers, and internal, socio-demographic and political characteristics of the municipality

Performance of the municipality

Fourteen mandatory indicators form the basis on which the performance of municipalities can be assessed. Two clarifications should be made about the measurement of performance in Quebec. First, the fourteen mandatory indicators do not cover all municipal services performed by municipalities in Quebec. Public libraries, fire services and police services were not included in the list of mandatory indicators at the time of data collection. Second, performance should be understood as internal and external performance. Internal benchmarking is akin to historical trends within a municipality. The portal where managers are invited to assess themselves consists of comparisons of the municipality’s indicator values to the appropriate quartile values for municipalities of its size. The comparative information contained in this portal constitutes external benchmarking.

Beside a passing remark to the effect that performance usually translates as effectiveness and efficiency (MAMSL 2004: 3), no judgment is offered by MAMROT in defining what performance is, and what differentiates good from bad, or even better from worse performance. This ambiguity goes against recommendations for performance management in complex settings (Van Dooren 2010: 430). In the online portal, where the comparative data are presented to municipal managers, the fourth quartile represents municipalities with the highest values for the indicators. For example, higher plowing costs and more frequent boil-water notices are in the fourth quartile. In the context of this research, it is hypothesized that costs should be lower rather than higher; higher cost of snow removal per km and higher average length of health-related leaves of absence should be minimized; the debt should represent a lower percentage of assets rather than a higher percentage. Thus, better performance can be defined as being in the first quartile (lower cost of snow removal per km); worse performance can be defined as being in the fourth quartile (higher percentage of debt compared to assets)2. This evaluation by quartiles for municipalities of similar characteristics is precisely how Zafra-Gómez, López-Hernández and Hernández-Bastida (2009: 157) evaluated performance in a study of Spanish municipalities.

It should be noted that only the 2008 quartile values were available to municipal managers for external benchmarking when they could use and had to report their 2009 performance data to MAMROT. Therefore, the quartile of each management indicator in 2009 for a given municipality was obtained by comparing its 2009 value to the 2008 quartile values of the municipalities of the same population size group. The quartiles of the fourteen indicators were further averaged to obtain the externally benchmarked performance of each municipality in the sample3. The average quartile for the sample of survey participants is 2.61 (Table 3). The difference in performance for each available indicator from the year 2009 to the year 2008 was also computed and coded 0 if performance was stable or better in 2009 (that is, equal or lower indicator value in 2009 compared to 2008)4, and coded 1 if the performance was in decline (that is, higher indicator value in 2009). Internally benchmarked performance is defined in this study as the proportion of management indicators with a declining performance between 2009 and 2008. The municipalities had on average 49% of their management indicators in decline (Table 3).
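The externally and internally benchmarked performance scores described above can be illustrated with a small sketch. The data layout, cut-point handling and example values below are hypothetical assumptions for illustration only; the article does not publish the underlying files, and indicators for which higher values are better would need to be reversed first.

```python
import pandas as pd

def external_quartile(value_2009, q2008_cuts):
    """Quartile (1 = best, 4 = worst) of a 2009 indicator value against the 2008
    quartile cut-points (Q1, Q2, Q3) of municipalities in the same size group."""
    q1, q2, q3 = q2008_cuts
    if value_2009 <= q1:
        return 1
    if value_2009 <= q2:
        return 2
    if value_2009 <= q3:
        return 3
    return 4

# Hypothetical data for one municipality: 2008 and 2009 values of each mandatory
# indicator, plus the 2008 cut-points for its population size group.
indicators = pd.DataFrame({
    "value_2008": [4200.0, 1.2, 0.35],
    "value_2009": [4350.0, 1.1, 0.40],
    "q2008_cuts": [(3900.0, 4300.0, 4800.0), (0.8, 1.3, 1.9), (0.25, 0.38, 0.55)],
})

# Externally benchmarked performance: average quartile across the indicators
# (2.61 on average in the sample of survey participants).
indicators["quartile_2009"] = [
    external_quartile(v, cuts)
    for v, cuts in zip(indicators["value_2009"], indicators["q2008_cuts"])
]
external_score = indicators["quartile_2009"].mean()

# Internally benchmarked performance: share of indicators whose value worsened
# (increased) from 2008 to 2009 (49% on average in the sample).
internal_decline = (indicators["value_2009"] > indicators["value_2008"]).mean()
print(external_score, internal_decline)
```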
Having no guidance from previous research on the influence of performance itself on performance management, it is unclear to us how performance should influence the use of performance indicators. Thus, although externally and internally defined performance are the main independent variables, we do not have formal hypotheses on their expected impacts.

Barriers to uses of performance information

Even in voluntary initiatives in Canada, the importance of understanding barriers to the use of performance information has been recognized (Hildebrand and McDavid 2011: 57, 60, 62). In the context of the mandatory and systematic benchmarking regime in Quebec, municipalities are bounded and constrained, at the very least, to collect and transmit information on standardized performance indicators. Although it is not impossible that managers who are to use performance information would do so because of perceived benefits, it is more likely that they would practice performance management because they do not encounter barriers. The level of government mandating the collection and transmission often defines the benefits that are expected from the benchmarking regime. It was the case for Quebec’s benchmarking regime from the beginning (MAMSL 2004). Siverbo and Johansson (2006: 283–284), in their study of the voluntary Relative Performance Evaluations municipal
system in Sweden, used a survey instrument to measure the perceived barriers to performance measurement implementation and use. Siverbo and Johansson (2006) established three categories of barriers to use: those related to one’s unwillingness to use performance information, those related to one’s inability, and those related to one being prevented from using performance information. Each barrier is constituted of four items. After the pretest, at the suggestion of the pretest survey takers, one item measuring “being prevented from using performance information” (“the municipality has an explicit or implicit policy against the Municipal Management Indicators”) was dropped from our survey. The rest of the eleven items from Siverbo and Johansson’s (2006: 283–284) instrument were barely altered. Two more items were added to the list. In the survey, four items constitute the “unwilling” portion of barriers, six items constitute the “unable” portion, and three items constitute the “prevented” portion. On every item, surveyed managers could identify whether the statements described the reality in their municipality by expressing if they “agree,” “somewhat agree,” “somewhat disagree,” or “disagree” with the statement. A full disagreement with the perception that the barrier applied in their municipality was coded “1”; a full agreement was coded “4.” Additive indices akin to the ones Rogers (2006) used for the dependent variables were developed. Therefore, all three barriers are coded by the average score from one to four, as measured by the items.

The descriptive data on the barriers, performance and the control variables discussed in this section and the following ones are presented in Table 3.

Table 3. Descriptive Statistics of the Independent Variables (n = 321)

Variable | Mean | SD | Min | Max
1. Performance
Quartile (external benchmarking) | 2.61 | 0.41 | 1.33 | 3.50
Decline (internal benchmarking) | 0.49 | 0.17 | 0.00 | 0.90
2. Barriers
Unwillingness (Cronbach’s alpha = 0.81) | 3.07 | 0.66 | 1 | 4
Inability (Cronbach’s alpha = 0.83) | 3.04 | 0.69 | 1 | 4
Prevented (Cronbach’s alpha = 0.68) | 2.20 | 0.82 | 1 | 4
3. Internal characteristics
Strategic planning (Cronbach’s alpha = 0.72) | 3.02 | 0.63 | 1 | 4
Citizen outreach (Cronbach’s alpha = 0.78) | 3.42 | 0.52 | 1 | 4
Political and administrative leadership (Cronbach’s alpha = 0.73) | 3.60 | 0.51 | 1 | 4
4. Socio-demographic characteristics
Size of population (logarithm) | 7.31 | 1.38 | 4.62 | 12.84
Population density of municipality (per km2) | 0.12 | 0.36 | 0.00 | 3.36
Size of budget in 2009 (logarithm) | 14.45 | 1.42 | 12.31 | 20.25
Fiscal stress | 21.87 | 14.10 | 0.00 | 102.07
Devitalisation index | 0.30 | 5.12 | -16.76 | 25.99
5. Political characteristic
Presence of political competition for the mayoral seat in 2009 election (proportion) | 51% | | |

The research hypotheses about the barriers are:

H1. It is expected that managers who express their unwillingness to use performance indicators will indeed use performance measurement less than managers who do not perceive this barrier.

H2. It is expected that managers who express their inability to use performance indicators will indeed use performance measurement less than managers who do not perceive this barrier.

H3. It is expected that managers who express being prevented from using performance indicators will indeed use performance measurement less than managers who do not perceive this barrier.
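Table 3 reports Cronbach’s alphas for the three barrier indices (0.81, 0.83 and 0.68). As a rough illustration of how such a reliability coefficient and the accompanying additive index could be computed for, say, the four “unwillingness” items, here is a sketch with made-up Likert responses; the item matrix is hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an items matrix with shape (respondents, items),
    here Likert scores coded 1 ("disagree") to 4 ("agree")."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses to the four "unwillingness" items (rows = managers).
unwillingness_items = np.array([
    [3, 3, 4, 3],
    [2, 2, 1, 2],
    [4, 3, 4, 4],
    [1, 2, 2, 1],
    [3, 4, 3, 3],
])

alpha = cronbach_alpha(unwillingness_items)
# The additive index used in the regressions is simply the mean item score per manager.
unwillingness_score = unwillingness_items.mean(axis=1)
print(round(alpha, 2), unwillingness_score)
```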
Internal characteristics of the municipality

Managerial practices can help explain the use of performance information. Initially developed to understand the characteristics of local authorities in the U.K. under the Best Value system (Enticott et al. 2002), the instrument was later used by the Audit Commission and academics. Academic researchers sampled the lengthy survey to circumscribe the instrument to five dimensions that had been identified by the Audit Commission (2002: 3-4, in Boyne and Enticott 2004: 12). In their study of local authorities’ performance, Boyne and Enticott (2004) used twenty-five questions related to the five dimensions identified by the Audit Commission to measure internal characteristics of local authorities. From the same initial instrument, Andrews, Boyne and Enticott (2006) developed a different instrument to explain poor performance in English local authorities. The current survey instrument on municipalities’ internal characteristics is based on the items used by Boyne and Enticott (2004) and Andrews, Boyne and Enticott (2006). An exploratory factor analysis resulted in eight items constituting three factors: strategic planning, citizen outreach, and leadership. We forced the presence of three factors in the analyses. The factor loadings can be consulted in Appendix 1. On every item, survey takers could identify whether the statements described their municipality by agreeing with the statement on a four-point Likert scale. A full disagreement with the perception that the management characteristic applied in their municipality was coded “1”; a full agreement was coded “4.” Hence, all characteristics are coded by the average score from one to four, as measured by the items each contains.

Socio-demographic and political characteristics of the municipality

To follow previous research on performance measurement (de Lancer-Julnes and Holzer 2001: 695; Askim, Johnsen and Christophersen 2008: 303; Moynihan and Pandey 2010: 858), socio-demographic variables are included in the regression models. In this research, three control variables that are expected to favour the use of performance information are included: the size of the population (Askim, Johnsen and Christophersen 2008; Andrews et al. 2010) and the size of the budget (Askim, Johnsen and Christophersen 2008; Schatteman 2010: 539; Zaltsman 2009: 464), both in logarithmic form, and the population density of municipalities (Williams 2005; Andrews et al. 2010). Three control variables are expected to have a diminishing effect on the use of performance information: devitalisation, a seven-item socio-economic index calculated by MAMROT that is akin to deprivation (Andrews 2004; Andrews et al. 2010); fiscal stress (Hendrick 2004; Askim, Johnsen and Christophersen 2008), which is the proportion of the long-term net debt relative to the net value of assets; and the level of political competition (Askim, Johnsen and Christophersen 2008; Kwon and Jang 2011: 604), measured by whether the mayoral seat was disputed in the 2009 general municipal elections or the mayor was elected by acclamation. In a recent article, Hildebrand and McDavid (2011: 68) suggested that local governments with a polarized political culture would be less likely to use performance information.

Results

As we have seen previously in Table 2, most managers are either from municipalities where performance information is perceived as not being used or from municipalities where it is only occasionally used. Before moving forward with the results of the multiple logistic regression analyses for the four binary dependent variables, the differences in barrier occurrence and in performance between users and non-users of performance information are presented. Managers using management indicators systematically perceived fewer and less intense barriers on average than managers not using them. Users of performance information tend to have slightly lower externally benchmarked performance (that is, a higher average quartile), and fewer indicators with declining values between 2009 and 2008.
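As a sketch of how the four logistic regressions reported below could be estimated, the following uses statsmodels on synthetic stand-in data. The column names mirror the variables of Table 3 but are hypothetical; in the actual study the values come from the survey and from MAMROT’s archival files, and the same specification is fit separately for each of the four binary use variables.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 321  # same number of municipalities as in the statistical models

# Synthetic stand-in data; real values would come from the merged survey/archival dataset.
cols = ["quartile", "decline", "unwillingness", "inability", "prevented",
        "strategic_planning", "citizen_outreach", "leadership",
        "log_population", "density", "log_budget", "fiscal_stress",
        "devitalisation", "political_competition"]
data = pd.DataFrame(rng.normal(size=(n, len(cols))), columns=cols)
data["general_use"] = rng.integers(0, 2, size=n)  # binary dependent variable

predictors = " + ".join(cols)
# Multiple logistic regression; the same formula would be refit with
# management_use, budgeting_use and reporting_use as the outcome.
model = smf.logit(f"general_use ~ {predictors}", data=data).fit(disp=False)
print(model.summary())  # raw coefficients and standard errors, as in Table 4
```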
The regression analyses will determine whether the differences are statistically significant, even when other possible influences like internal, socio-demographic, and political characteristics are taken into account. The results of the four multiple logistic regression models are presented in Table 4. The coefficients of the independent variables are raw regression coefficients.

Table 4. Results of the Logistic Regression Analysis of Performance Information (n = 321) (Regression coefficients, with their standard errors in parentheses, are reported in the table; a p < .10, b p < .05, c p < .01)

Variable | General use | Management use | Budgeting use | Reporting use
Intercept | 2.16 (3.56) | 1.79 (4.06) | 4.64 (3.63) | 3.58 (3.52)
1. Performance
Quartile (external benchmarking) | 0.77b (0.34) | 0.47 (0.40) | 0.13 (0.34) | 0.26 (0.33)
Decline (internal benchmarking) | -1.38a (0.76) | -0.98 (0.90) | -1.43a (0.78) | -0.58 (0.74)
2. Barriers
Unwillingness | -1.18c (0.23) | -1.19c (0.27) | -1.17c (0.24) | -1.02c (0.22)
Inability | -0.47b (0.23) | -0.19 (0.27) | -0.30 (0.24) | -0.31 (0.23)
Prevented | 0.01 (0.22) | -0.50a (0.27) | -0.34 (0.23) | -0.04 (0.21)
3. Internal characteristics
Strategic planning | 0.24 (0.24) | -0.22 (0.28) | -0.03 (0.25) | 0.13 (0.24)
Citizen outreach | 0.33 (0.32) | 0.31 (0.37) | 0.43 (0.33) | 0.33 (0.32)
Political and administrative leadership | -0.11 (0.32) | -0.23 (0.38) | 0.40 (0.33) | 0.13 (0.32)
4. Socio-demographic characteristics
Size of population (logarithm) | 0.25 (0.38) | 0.21 (0.46) | 0.27 (0.41) | 0.54 (0.39)
Population density of municipality | 0.33 (0.41) | 0.40 (0.42) | 0.84b (0.43) | 0.62 (0.40)
Size of budget in 2009 (logarithm) | -0.10 (0.38) | 0.05 (0.45) | -0.28 (0.40) | -0.44 (0.39)
Fiscal stress | -0.02b (0.01) | -0.02 (0.01) | -0.02 (0.01) | -0.01 (0.01)
Devitalisation index | -0.03 (0.03) | 0.01 (0.03) | -0.05a (0.03) | -0.04 (0.03)
5. Political characteristic
Political competition in 2009 election | 0.05 (0.27) | -0.12 (0.32) | -0.12 (0.28) | 0.41 (0.27)
Adjusted Pseudo R-squared | 0.265 | 0.302 | 0.257 | 0.202
P-value of the Hosmer and Lemeshow Goodness-of-Fit Test | 0.182 | 0.531 | 0.573 | 0.550

Performance of the municipality

The internal and external performance of these municipalities is measured by the Municipal Management Indicators. Externally benchmarked performance, measured by the average quartile of the fourteen management indicators, proved to be statistically related only to the general use of performance information. Managers from municipalities where comparative performance is low (that is, a higher average quartile value) tend to use the indicators more than others for general use (p < 0.05). Internally benchmarked performance, measured by the proportion of indicators with a declining performance in a municipality between 2009 and 2008, is significantly associated, at the 10% level, with the general and budgeting uses of performance information. The predicted probabilities of using performance information for a municipality with an internally declining performance on 80% of its indicators, when all other independent variables are set to their mean values, are respectively 39% (95% C.I.: 27% to 52%) and 23% (95% C.I.: 15% to 35%) for general and budgeting uses, compared to 59% (95% C.I.: 47% to 70%) and 42% (95% C.I.: 31% to 54%) for a municipality with an internally declining performance on 20% of its indicators. Overall, the indicators are used more often as a source of information in municipalities where they can be seen as a vindication of encouraging results than where they suggest suboptimal results.

Barriers to uses of performance information

The hypotheses are related to the perceived presence of barriers hindering the use of performance information. The expectations were that more barriers would be associated with diminished uses of performance information. The results for our four types of use are that only one of the three barriers has a consistent, verifiable impact. The perceived barriers of being unable or prevented to use performance information only seem to impact the general use of the management indicators. On the other hand, the barrier reflecting an unwillingness to use the indicators is statistically significant for all four types of use at the 1% significance level. For H1, we can reject the null hypothesis, but the evidence to support hypotheses H2 and H3 is very modest. This is due for the most part to the relatively high correlation between the three perceived barriers (0.38 < r < 0.48), implying non-negligible collinearity in the regression models5.

We observe that a marginally sterner attitude of unwillingness to use indicators decreases the logit of the probability that a manager would report using the indicators in general by 1.18. The same pattern is found for the specific uses of performance information. The marginal effects of the unwillingness barrier on the management, budgeting and reporting uses of performance information are comparable. When all other independent variables are set at their mean value (see Table 3)6, a manager who on average mostly agrees with the barriers related to unwillingness (that is, unwillingness = 3.8) has a predicted probability of 28% (95% C.I.: 20% to 37%) for general use of the indicators and 16% (95% C.I.: 11% to 24%) for budgeting purposes. When the unwillingness barrier score is at the average value of 3, which means that the typical manager “somewhat agrees” on average with the four unwillingness barriers in his/her municipality, the predicted probabilities of use of performance information increase respectively to 50% (95% C.I.: 44% to 56%) and 33% (95% C.I.: 28% to 39%) for general and budgeting uses. For managers with an unwillingness score of 2, who on average “somewhat disagree” that the four unwillingness barriers are present, the predicted probabilities of use respectively jump to 76% (95% C.I.: 65% to 85%) and 61% (95% C.I.: 49% to 73%). The effect of this independent variable is substantial.
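The predicted probabilities quoted above can be reproduced approximately from the reported unwillingness coefficient of -1.18 in the general-use model, under the simplifying assumption that the linear predictor is about zero (a 50% probability) when unwillingness equals 3 and the other covariates are held at their means.

```python
import math

def predicted_probability(unwillingness, coef_unwillingness=-1.18, baseline_logit_at_3=0.0):
    """Predicted probability of general use as unwillingness varies, holding the other
    covariates fixed. The baseline corresponds to a predicted probability of roughly
    50% at an unwillingness score of 3 (the sample mean is 3.07)."""
    logit = baseline_logit_at_3 + coef_unwillingness * (unwillingness - 3.0)
    return 1.0 / (1.0 + math.exp(-logit))

for score in (3.8, 3.0, 2.0):
    print(score, round(predicted_probability(score), 2))
# Roughly 0.28, 0.50 and 0.76 - in line with the probabilities reported in the text.
```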
Internal characteristics of the municipality

The relationship between the internal characteristics of municipalities and the different types of use is seldom conclusive. Contrary to what Moynihan, Pandey and Wright (2012b: 157) found with American local government jurisdictions with populations over 50,000, the positive influence of leadership is not felt for performance management in Quebec. Citizen outreach does not have an effect on performance management when performance, management, socio-demographic and political characteristics are taken into account. We do not find a discernible influence of strategic planning on the use of performance information.

Socio-demographic and political characteristics of the municipality

Six additional variables that were included in the regression models were not covered by hypotheses. These variables were related to the socio-demographic and political characteristics of municipalities. The size of a municipality, as measured in population, is not statistically associated with the different uses of performance information. Neither is the size of the budget. Higher devitalisation is negatively correlated with the budgeting use of performance information, but only at the 10% level (p < 0.1). Other uses do not seem to be impacted by devitalisation. More densely populated municipalities would have a higher probability of using performance information in their budgeting (p < 0.05). Municipalities in more stringent fiscal situations would be slightly less likely to use performance data for general use (p < 0.05). The presence of political competition in elections for the mayoral seat does not seem to impact the use of performance information by managers.

Discussion

Do local managers in the province of Quebec use performance information a lot in their operations? The question is difficult to answer, because “use” is measured in different ways in different studies. Findings from previous studies are that “use,” however defined, usually does not go further than 50%. Less than half of the departments within a city, less than half of decisions, and less than half of management functions integrate information from performance measures. In Quebec, less than half of General Managers say that the indicators are used at least sometimes. In addition, 41.1% identify at least one management or budgeting function, also known as actual use, for which performance information is enshrined.
The proportions are somewhat comparable to those of large municipalities in national studies in the United States. The proportions are much lower than for Swedish municipalities in the voluntary Relative Performance Evaluations system. Siverbo and Johansson (2006: 278) classified nine out of ten Swedish municipalities as “high-intensity” users. The same could be said for about one municipality out of twenty in Quebec.

The results of the analysis of uses are that there are two constant influences explaining the variations in performance management. The first of these influences is performance itself. Municipalities which tend to feature better internal performance (that is, a lower proportion of indicators declining) report higher uses of the information. This is observed for general, budgeting and actual uses. Confirming recent findings on performance reporting by municipalities in Quebec (Charbonneau and Bellavance 2012), there seems to be behavior akin to blame avoidance when it comes to municipalities involved in internal benchmarking. Blame avoidance behaviors are observed for declining internal performance. This is to say that managers with a higher number of indicators showing declining performance for their municipalities are more prone to dismiss the information out of hand. Lastly, despite frequent claims from managers in Quebec (Charbonneau and Nayer 2012), it could not be demonstrated that the size of municipalities impacts the different uses of performance.

The second of these influences is the unwillingness of managers to use indicators. The barriers related to the perception of an inability to use data and of being prevented from using data could not be identified as statistically significant forces, but it is important to mention that they are correlated with unwillingness. The case for pointing out the unwillingness barrier as a prime influence on performance information uses is strong. The manifestation of management behaviors stemming from an unwillingness to use the indicators is constant across the board. Once the discrete changes in unwillingness are taken into account, it becomes clear that this barrier is why certain managers choose not to use the indicators in their decision making and operations.

The analyses of performance management reveal that the factors associated with variation in uses are linked to managers. For the most part, uses are not impacted by influences outside managers’ control or by immutable characteristics, such as being a very small municipality or evolving in a hotly disputed political environment. A sizable portion of the variation in use can be explained by two main factors: the internal benchmarking performance of municipal services over the years and the unwillingness of managers to use the indicators. It bears repeating that these two factors are not correlated between themselves. In our sample, taking all factors into account, the lowest predicted probability for using the indicators in a general manner is 7.6%.
It is observed for a municipality with the highest score for the unwillingness, inability and prevented barriers (that is, a score of 4) and 60% of indicators in decline between 2008 and 2009. The highest predicted probability of 98% is observed for a municipality with very low scores for the unwillingness, inability and prevented barriers (1.25, 1.40 and 1.33 respectively) and 33% of indicators in decline.

Limitations

There are four main limitations in the present study: two have to do with survey items, one with the survey itself, and one has to do with sampling. The substantive nuance between occasional users and non-users of performance information is thin. This makes it harder to find differences between users and non-users of performance measures, as municipalities with no use and those with a low intensity of use can be similar. This constitutes the nuance problem. There is also a novelty problem: the nuance problem is compounded by the fact that only a handful of studies on the use of performance information employ regression analyses. Many of the survey items used in the survey of Quebec's General Managers come from previous studies of managerial behavior related to performance measurement and management. Survey instruments developed for the realities and specificities of performance measurement regimes in other countries were adapted to fit the reality of Quebec's municipal benchmarking regime. However, there were no survey items to measure group-grid culture. Potential influences of egalitarian and fatalistic managerial cultures on non-use were not studied (Van Dooren, Bouckaert and Halligan 2010: 141–142). Additionally, newer measures of some variables surfaced after data collection. For example, we might have found leadership to be statistically correlated with performance management if Moynihan, Pandey and Wright's (2012b) instrument had been used. That being said, the independent variables were not survey items fine-tuned for regression purposes. Previous studies related to use asked research questions that could be answered with descriptive statistics or simple correlation tables. Nevertheless, the logistic regressions performed in this study reveal that there are significant differences between municipal non-users and users of performance information.

It has been recognized in past research on the use of performance measurement information that surveying a single manager per organization constitutes a limitation (Lægreid, Roness and Rubecksen 2008: 45). Municipal managers working in finances and budgeting do not report their activities according to the same criteria as other managers (Marcuccio and Steccolini 2009: 160). A significant difference also exists between finance and budgeting managers and other managers regarding perceived problems of performance measurement (Willoughby 2004: 36). Despite its limitations, the present study follows the lead of a landmark performance measurement study on Norwegian local politicians'
attitudes towards comparative evaluation of local bureaus' performances against other jurisdictions, by juxtaposing "hard" performance and survey data (Revelli and Tovmo 2007). It also sheds light on performance management in small rural municipalities of fewer than 2,000 residents.

Table 4. Results of the Logistic Regression Analysis of Performance Information (n = 321)
Regression coefficients and their standard errors are reported for each of the four models: general use, management use, budgeting use, and reporting use. a p < .10, b p < .05, c p < .01.
Independent variables:
Intercept
1. Performance: Quartile (external benchmarking); Decline (internal benchmarking)
2. Barriers: Unwillingness; Inability; Prevented
3. Internal characteristics: Strategic planning; Citizen outreach; Political and administrative leadership
4. Socio-demographic characteristics: Size of population (logarithm); Population density of municipality; Size of budget in 2009 (logarithm); Fiscal stress; Devitalisation index
5. Political characteristic: Political competition in 2009 election
Model fit: Adjusted pseudo R-squared; p-value of the Hosmer and Lemeshow goodness-of-fit test

To deepen our understanding of the determinants of performance management at the local level, future research should find ways to reduce dependence on perception data on the part of managers (Boyne 2010). Directly observing the real use of performance information would enhance our knowledge of performance measurement.

Conclusion

The goal of this research was to uncover what influences the use of performance information by local managers in the province of Quebec. There are two main factors accounting for the use of performance measurement by municipal managers. The first factor influencing the use of performance information is performance itself, defined historically. Municipalities that perform worse than last year on many indicators tend to use performance indicators to a lesser extent. This is an interesting finding. It reiterates the fact that performance measurement is an information-based management tool. Information about how the organization is doing is but one source among many. Even when other factors are taken into account, if the performance information reflects badly on a manager and his/her municipality, he/she is less likely to include it in the municipality's operations. This is akin to blame avoidance strategies described by Hood (2002, 2007b and 2011). The second factor is related to the unwillingness of managers to use performance indicators.
No correlations were found between the various uses of performance information and internal characteristics such as citizen outreach, strategic planning, and leadership. Contrary to the claims of managers found in a complementary qualitative study of comments and focus groups (Charbonneau and Nayer 2012), no link was established between the size of a municipality and the use of performance information. Our findings draw attention to the fact that in an intelligence-type performance regime like Quebec's, which shares much with other North American municipal performance consortia, unwillingness and blame avoidance can be relied on to explain the low occurrence of performance management.

Notes

1 What motivates this choice is that official correspondence from MAMROT is sent electronically; MAMROT routinely contacts municipalities by electronic means and sets up performance transition tools that are computer-based. The contact list of General Managers provided by MAMROT consisted of electronic addresses. The sheer number of municipalities in the province of Quebec involves prohibitive costs for mail surveys. It also has other methodological implications: the large number of municipalities is a serious hindrance to a large number of face-to-face interviews. Additionally, the distances between municipalities are great in Quebec, which is the largest Canadian province. An e-mail survey seemed like the best compromise between an online survey and a mail survey.
2 The quartile classification of two of the fourteen management indicators was reversed: training effort per employee and percentage of training cost compared to total payroll. A higher value for these two indicators was considered a better performance. The reason motivating this reverse coding is that the intent behind these indicators is to draw attention to training and foster more training, not less. Therefore the classification into the quartiles was reversed for these two indicators so that a classification in the first (fourth) quartile represents a lower (better) performance.
3 The coding was reversed for training effort per employee and percentage of training cost.
4 The coding for declining performance was also reversed for the training effort per employee and percentage of training cost compared to total payroll management indicators. As for the quartiles, the number of indicators differs across municipalities.
5 If we remove unwillingness and prevented (inability) from the regression models, inability (prevented) becomes statistically significant at the 1% level with a negative regression coefficient.
6 Because all other independent variables have a very low or no correlation with the three barrier variables, setting them to their respective means is representative of the observed data patterns in our sample for the different values of unwillingness considered to show the predicted probabilities of use of performance information by the multiple logistic regression models.

References

Ammons, D. N., and W. C. Rivenbark. 2008. "Factors influencing the use of performance data to improve municipal services: Evidence from the North Carolina benchmarking project." Public Administration Review 68 (2): 304–18.
Andrews, R.
2004. "Analysing deprivation and local authority performance: The implications for CPA." Public Money & Management 24 (1): 19–26.
Andrews, R., G. Boyne, and G. Enticott. 2006. "Performance failure in the public sector: Misfortune or mismanagement?" Public Management Review 8 (2): 273–96.
Andrews, R., G. A. Boyne, M. J. Moon, and R. M. Walker. 2010. "Assessing organizational performance: Exploring differences between internal and external measures." International Public Management Journal 13 (2): 105–29.
Askim, J. 2007. "How do politicians use performance information? An analysis of the Norwegian local government experience." International Review of Administrative Sciences 73 (3): 453–72.
——. 2008. "Determinants of Performance Information Utilization in Political Decision Making." In Performance Information in the Public Sector: How It Is Used, edited by W. Van Dooren and S. Van De Walle. Houndmills, UK: Palgrave Macmillan: 125–39.
——. 2009. "The demand side of performance measurement: Explaining councillors' utilization of performance information in policymaking." International Public Management Journal 12 (1): 24–47.
Askim, J., Å. Johnsen, and K.-A. Christophersen. 2008. "Factors behind organizational learning from benchmarking: Experiences from Norwegian municipal benchmarking networks." Journal of Public Administration Research and Theory 18 (2): 297–320.
Aubert, B. A., and S. Bourdeau. 2012. "Public sector performance and decentralization of decision rights." Canadian Public Administration 55 (4): 575–98.
Berman, E., and X. Wang. 2000. "Performance measurement in U.S. counties: Capacity for reform." Public Administration Review 60 (5): 409–20.
Boyne, G. A. 2010. "Performance Management: Does It Work?" In Public Management and Performance: Research Directions, edited by R. M. Walker, G. A. Boyne, and G. A. Brewer. Cambridge, UK: Cambridge University Press: 207–26.
Boyne, G., and G. Enticott. 2004. "Are the 'Poor' different? The internal characteristics of local authorities in the five comprehensive performance assessment groups." Public Money & Management 24 (1): 11–18.
Boyne, G. A., Gould-Williams, J., Law, J., and R. Walker. 2002. "Plans, performance information and accountability: The case of best value." Public Administration 80 (4): 691–710.
Charbonneau, E., and F. Bellavance. 2012. "Blame avoidance in public reporting: Evidence from a provincially-mandated municipal performance measurement regime." Public Performance & Management Review 35 (3): 399–421.
Charbonneau, E., and G. Nayer. 2012. "Barriers to the use of benchmarking information: Narratives from local government managers." Journal of Public Management & Social Policy 18 (2): 25–47.
Chung, Y. 2005. Factors Affecting Uses and Impacts of Performance Measures in Mid-Sized U.S. Cities. Department of Political Science. Knoxville, TN: University of Tennessee.
de Lancer-Julnes, P. 2006. "Performance measurement: An effective tool for government accountability? The debate goes on." Evaluation 12 (2): 219–35.
de Lancer-Julnes, P., and M. Holzer. 2001. "Promoting the utilization of performance measures in public organizations: An empirical study of factors affecting adoption and implementation." Public Administration Review 61 (6): 693–708.
Enticott, G., Walker, R. M., Boyne, G. A., Martin, S., and R. E. Ashworth. 2002. Best Value in English Local Government: Summary Results from the Census of Local Authorities in 2001. Cardiff, Wales: Local and Regional Government Research Unit.
Fountain, J. 1997.
"Are state and local governments using performance measures?" PA Times 20 (1).
Johansson, T., and S. Siverbo. 2009. "Explaining the utilization of relative performance evaluation in local government: A multi-theoretical study using data from Sweden." Financial Accountability & Management 25 (2): 197–224.
Jordan, M. M., and M. Hackbart. 2005. "The goals and implementation success of state performance-based budgeting." Journal of Public Budgeting, Accounting & Financial Management 17 (4): 471–87.
Hendrick, R. 2004. "Assessing and measuring the fiscal health of local governments: Focus on Chicago suburban municipalities." Urban Affairs Review 40 (1): 78–114.
Hildebrand, R., and J. C. McDavid. 2011. "Joining public accountability and performance management: A case study of Lethbridge, Alberta." Canadian Public Administration 54 (1): 41–72.
Ho, A. T.-K. 2003. "Perceptions of performance measurement and the practice of performance reporting by small cities." State and Local Government Review 35 (2): 161–73.
——. 2006. "Accounting for the value of performance measurement from the perspective of Midwestern mayors." Journal of Public Administration Research and Theory 16 (2): 217–37.
Ho, S.-J. K., and Y.-C. L. Chan. 2002. "Performance measurement and the implementation of balanced scorecards in municipal governments." Journal of Government Financial Management 51 (4): 8–19.
Holloway, J., G. Francis, and M. Hinton. 1999. "A vehicle for change? A case study of performance improvement in the "New" public sector." International Journal of Public Sector Management 12 (4): 351–65.
Holzer, M., and K. Yang. 2004. "Performance measurement and improvement: An assessment of the state of the art." International Review of Administrative Sciences 70 (1): 15–31.
Hood, C. 2002. "The risk game and the blame game." Government and Opposition 37 (1): 15–37.
——. 2007a. "Public service management by numbers: Why does it vary? Where has it come from? What are the gaps and the puzzles?" Public Money & Management 27 (2): 95–102.
——. 2007b. "What happens when transparency meets blame-avoidance?" Public Management Review 9 (2): 191–210.
——. 2011. The Blame Game: Spin, Bureaucracy, and Self-Preservation in Government. Princeton, NJ: Princeton University Press.
Johnsen, Å. 1999. "Implementation mode and local government performance measurement: A Norwegian experience." Financial Accountability & Management 15 (1): 41–66.
Kinney, A. S. 2008. "Current approaches to citizen involvement in performance measurement and questions they raise." National Civic Review 97 (1): 46–54.
Kwon, M., and H. S. Jang. 2011. "Motivations behind using performance measurement: city-wide vs. selective users." Local Government Studies 37 (6): 601–20.
Lægreid, P., P. G. Roness, and K. Rubecksen. 2008. "Performance Information and Performance Steering: Integrated System or Loose Coupling?" In Performance Information in the Public Sector: How It Is Used, edited by W. Van Dooren and S. Van De Walle. Houndmills, UK: Palgrave Macmillan: 42–57.
Marcuccio, M., and I. Steccolini. 2009. "Patterns of voluntary extended performance reporting in Italian local authorities." International Journal of Public Sector Management 22 (2): 146–67.
McGowan, R. P., and T. H. Poister. 1985.
"Impact of productivity measurement systems on municipal performance." Policy Studies Review 4 (3): 532–40.
Melkers, J., and K. Willoughby. 2005. "Models of performance-measurement use in local governments: Understanding budgeting, communication, and lasting effects." Public Administration Review 65 (2): 180–90.
Ministère des Affaires municipales, du Sport et du Loisir. 2004. Les Indicateurs de Gestion Municipaux: Un regard neuf sur notre municipalité. Gouvernement du Québec, QC.
Moynihan, D. P., and D. P. Hawes. 2012. "Responsiveness to reform values: The influence of the environment on performance information use." Public Administration Review 72 (s1): s95–s105.
Moynihan, D. P., and S. K. Pandey. 2010. "The big question for performance management: Why do managers use performance information?" Journal of Public Administration Research and Theory 20 (4): 849–66.
Moynihan, D. P., S. K. Pandey, and B. E. Wright. 2012a. "Prosocial values and performance management theory: Linking perceived social impact and performance information use." Governance 25 (3): 463–83.
——. 2012b. "Setting the table: How transformational leadership fosters performance information use." Journal of Public Administration Research and Theory 22 (1): 143–64.
Palmer, A. J. 1993. "Performance measurement in local government." Public Money & Management 13 (4): 31–36.
Pollitt, C. 2013. "The logics of performance management." Evaluation 19 (4): 346–363.
Poister, T. H., and G. Streib. 1999. "Performance measurement in municipal government: Assessing the state of the practice." Public Administration Review 59 (4): 325–35.
Radin, B. A. 1998. "The government performance and results act (GPRA): Hydra-headed monster or flexible management tool?" Public Administration Review 58 (4): 307–16.
Revelli, F., and P. Tovmo. 2007. "Revealed yardstick competition: Local government efficiency patterns in Norway." Journal of Urban Economics 62 (1): 121–34.
Rivenbark, W. C., and J. M. Kelly. 2003. "Management innovation in smaller municipal government." State and Local Government Review 35 (3): 196–205.
Rogers, M. K. 2006. Explaining Performance Measurement Utilization and Benefits: An Examination of Performance Measurement Practices in Local Governments. Doctoral dissertation, Department of Public Administration. Raleigh, NC: North Carolina State University.
Schatteman, A. 2010. "The state of Ontario's municipal performance reports: A critical analysis." Canadian Public Administration 53 (4): 531–50.
Siverbo, S., and T. Johansson. 2006. "Relative performance evaluation in Swedish local government." Financial Accountability & Management 22 (3): 271–90.
Streib, G., and T. H. Poister. 1999. "Assessing the validity, legitimacy, and functionality of performance measurement systems in municipal governments." American Review of Public Administration 29 (2): 107–23.
Statistics Sweden. 2009. "Population in the country, counties and municipalities by sex and age." Available at: http://www.scb.se/Pages/TableAndChart____159278.aspx. Accessed July 28, 2010.
Usher, C. L., and G. C. Cornia. 1981. "Goal setting and performance assessment in municipal budgeting." Public Administration Review 41 (2): 229–35.
Van De Walle, S., and W. Van Dooren. 2008. "Introduction: Using Public Performance Information." In Performance Information in the Public Sector: How It Is Used, edited by W. Van Dooren and S. Van De Walle. Houndmills, UK: Palgrave Macmillan.
Van Dooren, W. 2010.
"Better performance management: Some single- and double-loop strategies." Public Performance & Management Review 34 (3): 420–433.
Van Dooren, W., G. Bouckaert, and J. Halligan. 2010. Performance Management in the Public Sector. New York, NY: Routledge.
Wang, X., and E. Berman. 2001. "Hypotheses about performance measurement in counties: Findings from a survey." Journal of Public Administration Research and Theory 11 (3): 403–27.
Williams, M. C. 2005. Can Local Government Comparative Benchmarking Improve Efficiency? Leveraging Multiple Analytical Techniques to Provide Definitive Answers and Guide Practical Action. Doctoral dissertation. Richmond, VA: Virginia Commonwealth University.
Willoughby, K. G. 2004. "Performance measurement and budget balancing: State government perspective." Public Budgeting & Finance 24 (2): 21–39.
Zafra-Gómez, J. L., A. M. López-Hernández, and A. Hernández-Bastida. 2009. "Evaluating financial performance in local government: Maximizing the benchmarking value." International Review of Administrative Sciences 75 (1): 151–67.
Zaltsman, A. 2009. "The effects of performance information on public resource allocations: A study of Chile's performance-based budgeting system." International Public Management Journal 12 (4): 450–83.

Appendix 1. Results of the Factor Analyses
Rotated factor loadings (varimax rotation)

Barriers
Some managers identified barriers which would limit the use of management indicators in decision making. Indicate your level of agreement regarding the following statements on the management indicators (1 = disagree to 4 = agree).

Item: Mean, SD; loadings on Factor 1, Factor 2, Factor 3
Management indicators are not considered useful: 3.11, 0.87; 0.206, 0.734a, −0.022
Management indicators are not trustworthy: 2.61, 0.95; 0.206, 0.737a, 0.084
Management indicators are felt to convey an incomplete picture of the organization: 3.08, 0.85; 0.015, 0.758a, 0.084
We fear that management indicators are misunderstood and misinterpreted: 3.31, 0.78; 0.315, 0.600a, 0.105
We do not know how to integrate management indicators into decision making: 2.92, 0.90; 0.552a, 0.286, 0.162
We are not able to access data that would enable us to compare our results to similar municipalities: 2.99, 0.95; 0.440a, 0.291, 0.098
We lack the time to use management indicators: 3.33, 0.85; 0.662a, 0.130, 0.200
We lack the staff with the expertise to work with management indicators: 3.15, 0.98; 0.809a, 0.133, 0.218
We lack the computerized tools to gather the detailed data on the management indicators: 2.85, 0.93; 0.682a, 0.128, 0.143
We need additional information to use the management indicators: 2.98, 0.93; 0.654a, 0.152, 0.102
Our officials are uninterested in the management indicators: 3.24, 0.85; 0.216, 0.483a, 0.287b
Management indicators are seen as a threat: 2.20, 0.93; 0.193, 0.234, 0.737a
Management indicators will expose our weaknesses: 2.21, 0.91; 0.285, −0.009, 0.739a

Internal characteristics
Indicate your level of agreement regarding the following statements
on the current situation in your municipality (1 = disagree to 4 = agree).

Item: Mean, SD; loadings on Factor 1, Factor 2, Factor 3
There are clear links between the objectives and priorities of our service and those for the municipality as a whole: 2.96, 0.82; 0.126, 0.671a, 0.061
The municipality's objectives are clearly and widely communicated by managers of different services: 2.95, 0.78; 0.319, 0.465a, 0.219
Co-ordination and joint working among the different municipal services is a major part of our approach to the organization of services: 3.13, 0.77; 0.166, 0.814a, 0.187
The general manager and most managers place the needs of users first and foremost when planning and delivering services: 3.53, 0.63; 0.525a, 0.312, 0.250
Working more closely with our citizens is a major part of our approach to service delivery: 3.29, 0.64; 0.729a, 0.189, 0.182
Citizens' demands are important in driving service improvement: 3.44, 0.59; 0.722a, 0.130, 0.391
Political leadership is important in driving performance improvement: 3.59, 0.57; 0.293, 0.111, 0.848a
The general manager is important in guiding decision making to drive performance improvement: 3.61, 0.57; 0.270, 0.246, 0.584a

a Loadings greater than 0.4 are flagged; the associated item is used in the computation of the factor score.
b Although this item loads on the second factor, we left Johansson and Siverbo's (2009) instrument unaltered (that is, we used it in the computation of the third factor).

Edward Nason and Michael A. O'Neill

What metrics? On the utility of measuring the performance of policy research: An illustrative case and alternative from Employment and Social Development Canada

Abstract: This article examines the state of performance measurement of policy research in government. The article observes that, to date, government policy research activities have seldom been the object of performance measurement, a factor we ascribe to the relative unsuitability of existing models rooted in a focus on outputs and outcomes, often at the expense of relationships and networks. Drawing on the literature and the case study, the article argues that existing performance measurement models are ill-suited to the task of assessing policy research performance. As a result, the article proposes that a purpose-built model may be needed to achieve this objective. Such a model, the Sphere of Influence of Research Policy model, is provided as an illustration.

Sommaire : Le présent article examine l'état de la mesure de la performance dans le domaine de la recherche stratégique au gouvernement. Les auteurs font remarquer que jusqu'à présent les activités en recherche stratégique gouvernementale ont rarement fait l'objet de mesure de performance, ce qu'ils attribuent au fait que les modèles actuels axés sur les résultats et les moyens mis en œuvre, souvent au détriment des relations et des réseaux, sont relativement inadaptés. Pour ce qui est de la documentation et de l'étude de cas, l'article laisse entendre que les modèles actuels de mesure de la performance sont inadaptés à la tâche visant à évaluer les résultats de la recherche stratégique. Pour cette raison, l'article suggère qu'il pourrait être nécessaire d'avoir un modèle conçu spécifiquement pour atteindre cet objectif. Un tel modèle, le modèle Sphère d'influence des politiques et de la recherche, est fourni à titre d'illustration.

Edward Nason is Director, Health and Innovation, Institute on Governance, Toronto. Michael A. O'Neill is Lecturer, School of Political Studies, University of Ottawa.
This research was initially commissioned by Employment and Social Development Canada, whose permission to disseminate is acknowledged. The financial support of the Institute on Governance and the University of Ottawa's Academic and Professional Development Fund is gratefully acknowledged. The authors thank the journal's anonymous reviewers and acknowledge the editorial assistance of Bryon Rogers.

Canadian Public Administration / Administration publique du Canada, Volume 58, No. 1 (March/mars 2015), pp. 138–160. © The Institute of Public Administration of Canada / L'Institut d'administration publique du Canada 2015.

Introduction

"Not everything that can be counted counts, and not everything that counts can be counted." – William Bruce Cameron (1963)

In recent years, governments have increasingly expanded the scope of their activities that are subjected to measures of performance. This has resulted in not-inconsequential human and financial resources being directed to the development and implementation of performance measurement (PM) systems (see Feller 2002; Thomas 2004; Thomas 2008). An outgrowth of the implementation of New Public Management (NPM), the introduction of PM raised hopes for a new approach to public sector management and decision-making (Emery 2005: 11). While the introduction of PM has undoubtedly proved positive in many areas, such as in the relatively quantifiable areas of public services (Hood, Dixon and Wilson 2009: 2), its applicability and utility remain dependent on the area (Feller 2002: 437). One such area, investigated here, is that of policy research.

With the goal of improving the efficiency and effectiveness of public policy and administration, PM has been introduced to track public sector performance over time in order to provide feedback and demonstrate accountability for actions and decisions to the public, decision-makers, and the host of other stakeholders that gravitate toward public institutions. In this light, we consider below whether the area of policy research is one that can suitably be assessed using PM, through a case study provided by Employment and Social Development Canada (ESDC).1

Policy-making is inextricably linked to government, and the turn towards evidence-based policy-making has increased the attention paid to research as a means of evidence acquisition (Howlett 2009; Nutley, Davies and Walter 2003). How to assess the performance of this area of government activity has, until recently, not received much attention from governments, though, as the audit at the genesis of this research highlights (HRSDC 2010), this is changing. Internationally, one of the more interesting efforts to determine the cost of the policy function was undertaken in New Zealand (see New Zealand 2010). However, this study considered the costs associated with the policy-making process, not its performance. In academia, policy research has received limited attention in comparison with research performance more generally, though some work has emerged on this topic (Carden 2004; Davies, Hutley and Walker 2005; Immonen and Cooksy 2014; Landry, Amara and Lamari 2001; Landry, Lamari and Amara 2003; Nason et al. 2007; SSHRC 2008; Stephenson and Hennink 2002; Thomson Reuters 2010). We contribute to this area of research and discussion in these pages.2
In this article we consider the applicability and utility of PM as it applies to government policy research conducted for the purpose of informing the process of public policy development. Though all research findings have the potential to influence future public policy, for our purposes we are only concerned with the research that is carried out as a distinct activity within the mandate of a government department for the explicit purpose of informing decision making. In the case of ESDC, this consists primarily of social science research, done by its in-house staff or by academics, private research organizations, NGOs, and consultants under contract with the department. Policy research, as it is practiced at ESDC, is an insular activity that seeks not to influence external interests, but chiefly internal ones: policy staff, senior officials and, ultimately, ministers and lawmakers.

It is our argument that the prevalent PM models applied in the federal public sector are ill-suited to policy research activities. This is a particularly germane issue because many of the indicators for PM of policy research sit firmly in the conceptual, rather than empirical, realm. For example, as we expand upon later, many of the measures of final outcomes for policy research may be empirical in terms of data collection, but are often highly conceptual in their link to policy research activities.3 In light of this argument, we propose the outlines of an alternative model, the Spheres of Influence of Research Performance (SIRP) model. This model, we propose, better captures the essence and reality of the policy research function within a public sector organization such as ESDC. We arrive at this proposal after a critical consideration of PM and its use in policy research within government and academia.

We begin with a discussion of our research methodology, followed by an overview of ESDC's research functions and the context in which it sought to introduce PM for its research activities. We follow with a brief discussion of the prevailing PM models in the Canadian federal public sector and consider the experiences of two federal institutions in assessing their research activities.

Methodology and research limitations

This paper originates in a project undertaken under contract with ESDC. The project sought to assess existing models of PM of research, and specifically of policy research, and to recommend a model that could be implemented and managed centrally by the Department's Strategic Policy and Research Branch. Our research was conducted largely between September 2010 and May 2011. It was based on a methodology that triangulates among several sources of information: a review of the scholarly and grey literature on the topic of PM, and a dozen confidential research interviews with senior officials at the rank of assistant deputy minister and director general within ESDC and other federal organizations, as well as with external experts on PM in Canada and the United Kingdom. All research interviews blended the use of a formal questionnaire with informal discussion of issues volunteered by the interviewee. In addition, two workshops with ESDC staff were held to confirm our research findings and obtain additional information on ESDC's research practices and previous experience with PM.
Our research findings were further discussed at an informal workshop attended by senior officials responsible for policy research from several federal departments.

The generalizability of our findings may be limited by our focus on a single organization, ESDC. We mitigated this limitation through the inclusion of other organizations as part of our review. Furthermore, given the scope of the project, our primary focus has been on social science research for decision making, as this is principally what is carried out at ESDC. We mitigated this by including in our research the experiences of science-based organizations. Per Feller (2002), we recognize that our findings and proposed PM model may be influenced by the nature of the commissioning organization and therefore risk being setting-dependent.

This article and its findings will be of interest to a broad segment of public sector practitioners with a role in policy research, evaluation and audit. In particular, managers of policy research programs may find in this paper inspiration for new approaches and tools to demonstrate the value for money of their activities and a reason to devote some of their resources to undertake such efforts. Scholars with an interest in public administration and PM will find in this paper a call for further understanding of this relatively neglected area of research (Immonen and Cooksy 2014: 97, 99).

Why measure the performance of policy research?

Though there is variability in the terminology used, most scholars and practitioners agree on the following as the main objectives sought by the application of PM:
Reducing costs, improving efficiency and effectiveness, and improving the quality of service delivery;
Improving accountability to external authorities, such as legislatures;
Improving the process of budgetary decision-making by linking funding to performance;
Providing benchmarks and targets; and
Motivating personnel (see Alberta 2012; Auger 1997; Emery 2005; Feller 2002; Hood, Dixon and Wilson 2009; Immonen and Cooksy 2014; Quebec 2002; Treasury Board of Canada Secretariat [TBS] 2010).

From the outset we recognize that there is a live and lively debate in Canada among academics and practitioners on both sides of the discussion concerning the effective use and misuse of PM (see, for example, Graham 2008; Heintzman 2009; Paquet 2009; Savoie 2013a, 2013b; Thomas 2004, 2005, 2008). In the United States, the application of PM to policy research has also generated scholarly attention, likely as a result of the Government Performance and Results Act of 1993 (see, for example, Perrin 1998; Bernstein 1999). As summarized by Feller (2002), the issues raised by these scholars focus on the broader question of whether policy research – in academia or in government – is an area suitable for the application of PM. In his discussion of the technical obstacles to PM, Graham (2008) notes the following challenges:
PM's subjective and value-laden nature;
Lack of understanding of "production processes" linking inputs through to outcomes;
Outcome measurement difficulties;
Consistency and comparability of PM assessments over time and between organizations;
PM's retrospective nature;
"Dumb data" problems; and
The debatable nature of "success" (2008: 179).
Clearly, the objectives pursued by PM are as relevant to policy research as they would be to any other area of government activity – especially in periods of diminishing or plateauing resources. However, the obstacles Graham points to are more acute in the area of policy research. As Immonen and Cooksy note: "the challenges of monitoring outcomes and impacts . . . is well-known as research organizations have little or no control over the longer term results of their research" (2014: 107; see also Feller 2002: 435–438). In short, it is not PM per se that is at issue but whether it can confidently and usefully be employed to assess performance in a research setting.

About policy research at ESDC and other federal organizations

ESDC is a large policy department with a broad policy mandate spanning the economic and social spheres. Its research function comprises some 180 research staff spread across six directorates (and approximately 25 units) within its four branches. Its research priority setting and activities cascade from the broader departmental priorities, with varying degrees of autonomy in terms of how these are translated into research projects at the unit level (research interview). In the 2007 and 2008 fiscal years, the department spent approximately $11 million on its policy research activities, roughly half the amount spent in 2006 (HRSDC 2010). At the time of analysis in 2011, ESDC's policy research activities were spread across three research streams:
Program-specific research: typically undertaken to support the administration of a specific program, such as the Canada Student Loans Program. Program-specific research leverages the policy capacity across the different branches of the department in support of a specific program;
Policy-relevant research: typically undertaken to provide research in support of policy development activities that have been identified as departmental priorities. This work is either undertaken by research divisions within the Strategic Policy and Research (SPR) Branch, which is itself divided into specialised units that support the department's three business lines (that is, labour market policy, learning and skills development, and social development), or it may be coordinated by SPR but conducted on a horizontal basis by policy and research units across the department's program branches; and
Medium- to long-term research: the "over the horizon" research that addresses emerging or anticipated priorities, issues that cut across several departmental mandates, or strategic gaps in knowledge that would not otherwise be filled. These activities are most frequently handled by SPR (HRSDC 2009; HRSDC 2010; and research interviews).

For the most part, ESDC performs research for internal consumption, though some of the research performed by academic researchers on contract for the department is published and disseminated (research interview). This approach is not out of step with research activities conducted in other government agencies, though it limits the use of peer review and dissemination results as indicators of research performance. It is useful to note, however, that while internal research is not formally published beyond the department, it does percolate into the external environment, for example as a result of access to information requests, parliamentary hearings and tabling, media reports, leaks and occasional presentations at research conferences.
Research projects at ESDC are internally directed and defined by the various research units within the department, although this is commonly done in consultation with program and operational units of the department. More recently, the department has moved towards the development of a coordinated departmental research planning process that more directly links research with departmental policy priorities (research interview).

In 2010, the department published the results of an internal audit of its research activities (HRSDC 2010: s. 2.2). The audit made a number of findings and recommendations concerning the areas of planning and coordination of departmental research activities, centralized oversight of research initiatives, and expenditure tracking (HRSDC 2010: s. 2.2). In particular, the audit made the following observations:

Measuring performance for research activities can be complex as it may be difficult to attribute a particular policy or program outcome directly to a research result. However, indicators or practices that demonstrate the effectiveness of research activities are important to the success of a research function. In order to ensure that research is providing relevant and timely information, it is necessary to establish meaningful expectations against which to compare the results achieved. A formal performance measurement process has not been established to determine if policy research meets the needs of clients within HRSDC. [Policy Research Directorate] developed a logic model with performance indicators but this was in draft form at the time of the audit. Most directorates stated that performance was monitored informally through discussions with clients. Without performance information, the Department cannot determine if resources are allocated to activities that provide the greatest value to policy research clients (HRSDC 2010: s. 2.2).

The department responded to the audit findings by instituting an intranet-based "knowledge portal" where research can be posted and shared across researchers and other departmental users, and further introduced an intranet-based Research Management System (RMS) to track and manage research projects (research interview). The audit further recommended that the department "implement PM indicators that identify client expectations and track research results" without specifying how this should be approached or implemented. The department's management response readily agreed with this recommendation (research interview). Though the department had been working on a logic model and performance indicators, this project was still at the development stage at the time of the audit. Otherwise, the audit found that most directorates approached PM on an informal basis. As the audit observed, "without performance information, the Department cannot determine if resources are allocated to activities that provide the greatest value to policy research clients" (HRSDC 2010: s. 2.2). For insiders at ESDC, the recommended development of PM was seen primarily as a matter of accountability reporting to the Treasury Board Secretariat, and only secondarily as a way of improving how ESDC approached research performance (research interview).

In Canada, federal organizations' approaches to PM are framed by policies issued under the authority of the Treasury Board (TB), such as the Policy on Evaluation (TBS 2009), under which PM is used to support accountability, improve decision-making, and support results-based management and value for money.
To support the application of these policies, the Treasury Board of Canada Secretariat's principal guide to PM, Supporting Effective Evaluations (TBS 2010), outlines an approach that frames and guides departments without being prescriptive about how PM is to be developed. In short, all programs must have a PM component, but how that component is operationalized is left to departments to determine. Nevertheless, because it is linked to government-wide policies, TBS guidance has become the de facto PM model used across most departmental operational, administrative, and program areas (research interview). As outlined in TBS guidance documents (2010), the logic model is the preferred model for federal government PM. The choice of the logic model is based on its relative simplicity and applicability to most federal program areas. As described by TBS, the logic model is meant to:
Assist program managers to confirm program and outcome soundness;
Assist program managers to interpret the data collected and identify implications for the program;
Ensure that PM is linked to program logic in order to yield meaningful information;
Provide a reference point for program evaluations; and
Facilitate communication about the program to staff and stakeholders (TBS 2010: s. 5).

Though agnostic on the format of the logic model to be developed, TBS guidance provides for a series of standard components (that is, program inputs, activities, outputs and outcomes) that are to be included in the logic model (TBS 2010: s. 5). For our purposes it is also important to note that TBS guidance and Treasury Board policies are program-centric and tend to be silent on whether or how to approach PM for policy-related activities.

As Feller (2002) and more recently Immonen and Cooksy (2014) have noted, most existing PM models applicable to publicly funded research have been developed to assess research in an academic context (see also Grant et al. 2010; Davies, Hutley and Walker 2005). It is also noteworthy that, when considering policy or policy-relevant social science research (as opposed to scientific research), there are few clear examples of policy research PM approaches used in a decision-making context. PM approaches are either aligned with scientific research in government, or with policy-relevant research in the academic sphere. Thus, ESDC's task of introducing PM to its policy research is made that much more difficult given that few federal organizations have made significant inroads in this area (research interviews). Of those that have instituted such practices, Environment Canada (EC) stands out as having among the most developed, although this still aligns most closely with its scientific research. As a science-based department, EC has developed a systematic approach to measuring the performance of its research and development activities that combines elements of a traditional logic model and impact categorization. These categories include: alignment with departmental priorities and mandate; linkages to existing knowledge; scientific excellence; and the ability to enable decision making (Environment Canada 2009).
Because EC's research and development activities have both an inward (that is, departmental) and outward (that is, broad scientific community) focus, its PM framework is able to use external data sources, such as commercial citation indexes, to determine the impacts of its research (research interview). Because EC's scientists can use the traditional means of research dissemination (that is, peer-reviewed publications), it is relatively easy to measure the department's research performance and quality using academic metrics (for example, citations and readership). At ESDC, however, where research is produced for an internal client, these same metrics may be unavailable, which introduces into the discussion the issue of indicator selection.

The development and use of the indicators used to measure organizational results is a matter of debate (see Perrin 1998; Bernstein 1999). In part, this debate rests on whether it is possible to develop indicators that can be generalized across research settings and on the availability of the data enabling this. The federal government's logic model-based approach to PM relies on formalized indicators that function well at the program or institutional level, but would appear to have limited utility in measuring policy research performance. Thus, input indicators (for example, financial and human resources) and output indicators (for example, the number of reports or publications) could be used to measure policy research performance, although these say little about the results, impacts, and outcomes of policy research. The application in government of indicators used to measure academic research performance may be of limited utility as the outputs of policy research are typically inwardly directed. And, as Feller (2002) notes, even if applicable, these would be of limited value given that they offer little indication of policy research quality. Existing means of measuring policy research performance in government, therefore, tend to revolve around qualitative measures; for example, through the use of peer review (from internal or external reviewers on contract for this purpose) or surveys of user satisfaction with research quality (research interview).

The use of impact indicators to measure the final outcomes from policy research is relatively consistent across research frameworks because they all tend towards the ways in which policy impacts individuals and society. Therefore, indicators of research impact such as increased employment, well-being or the cost-effectiveness of policies are all potentially relevant to government agencies. Specific indicators will depend partly on the framework or PM approach taken. With all outcome indicators, however, there are concerns over issues of attribution and causality (see CAHS 2009). It is also germane to note that the ability to translate research into policy action is dependent on the successful communication of research findings (outputs) between the policy researcher and the policy analyst (Stephenson and Hennink 2002). Dissemination should be an integral element of a research program and a factor in the measurement of its performance (see Landry, Amara and Lamari 2001; Lamari, Landry and Amara 2013).

Figure 1. Logic model
Because ESDC policy research is inwardly directed, it faces two challenges: first, demonstrating how its policy research is affecting the external discussion of policy options; and second, testing the quality of its policy research in light of that of other research centers and the public.

Performance measurement: models and applicability to policy research

Having outlined the context in which ESDC has been tasked with instituting PM for its policy research, and related considerations about the applicability of PM to policy research, we turn to a discussion of the PM frameworks available to ESDC. As a general consideration, we propose that any framework for policy research would need to be generalizable across different types of research, be replicable, enable the development of benchmarks, and generate performance information that could be used to improve results. Two models are discussed below: a logic model-based approach (such as the payback framework) which, as noted above, aligns with the approach supported by TB policy and guidance (2009 and 2010); and an alternative approach in the form of the Spheres of Influence of Research Performance (SIRP) model. Though a departure from the federal government's preferred approach to PM, we suggest that the SIRP model may be better suited to an inward-facing function such as policy research.

Payback framework

As a federal institution, ESDC had a ready incentive to seek to introduce a logic model-based PM framework for its policy research activities, and such a framework was in development at the time of the audit (HRSDC 2010: s. 2.2). One available variant of the logic model approach lies in the payback framework (Buxton and Hanney 1994), which is a common approach to PM of research internationally (see CAHS 2009). As noted above, the logic model is the preferred model for use in program evaluations across the federal government. Its attractiveness lies in the ability to outline processes from inputs through to outcomes. The payback framework consists of two elements: a logic model representation of the complete research process (Figure 1) for the purposes of research impact evaluation, and a series of categories to classify the individual paybacks (or impacts) from research (Donovan and Hanney 2011: 182). Simply stated, the payback framework provides the ability to measure return on investment from research by tracking research inputs to outcomes and attributing impacts along the way to different impact categories. It should be noted that within the payback framework, and in research evaluation more generally, the trend is to consider impacts to be any result of the research or its application, be it a process efficiency, output, secondary output or longer-term outcome (see CAHS 2009).

The logic model (Figure 1) provides a simplified view of how research moves from inputs to outcomes. It shows each stage in the process and provides a different source of information on research processes, but it does not provide a way to categorize impacts and compare them easily; that is the role of the five payback categories (see Table 1). Overall, using the payback framework has many advantages. The most obvious advantage is that its logic-model approach is similar to that used in governments generally and in the Government of Canada specifically.
It is also conceptually understandable for researchers and policy makers, and the information needed to identify its outputs can be accessed through existing information systems. As Immonen and Cooksy (2014) caution in light of their own research, however, the difficulty inherent in developing repeatable indicators "when applied to research where results are uncertain and [impacts] . . . are complex and protracted" (p. 109) limits the utility of this approach to policy research PM. Forcing policy research into such a rigid PM framework may, in the end, prove counterproductive.

Table 1. Payback Categories

New knowledge
Sub-categories/description: Output refers to the end product of the research process; quality refers to the subjective measure of research excellence.
Benefit: This payback category, combining both output and quality measures, will provide information about the quantity of new knowledge developed and whether it meets user expectations.

Capacity building
Sub-categories/description: Research capacity refers to the ability to link available research resources (that is, staff and funding) to the departmental strategic policy plans and priorities; receptor capacity refers to the capacity to utilize research findings.
Benefit: This payback category, combining both capacity dimensions considered in relative balance, enables the assessment of the utility and relevance of the research activity.

Informing policy
Description: Closely aligned to the traditional definition of the impact of policy research, this category considers how research was used in formal policy or program proposals.
Benefit: This payback category enables the assessment of the utility and relevance of the research activity.

Specific departmental benefits
Description: This category measures the changes that would occur to the internal administration of programs or policies as a result of research findings.
Benefit: This payback category allows for a lag between the policy research and the end policy or program change and therefore provides an indicator of the impact of research on policy.

Broader social and economic benefits
Description: The category refers to the final outcomes from the policy research – the changes for Canadians based on policy implementation.
Benefit: Perhaps the most difficult to assess, this payback category enables the assessment of broader changes associated with or resulting from the policy research activity.

An alternative: the spheres of influence for research performance (SIRP)

Escaping the rigidity of the logic model-based framework and providing information about how research is used may require a different approach. From its genesis, this project sought to provide ESDC with a tailored means of measuring the impact of its research in response to the recommendations of the audit (HRSDC 2010). The preferred approach would be one built upon existing knowledge but one that would also look to specifically address the challenges associated with measuring the performance of internal government policy research. This resulted in the development of the SIRP model as an alternative to standard approaches, to meet ESDC's needs and specific circumstances (Figure 2).4 We provide SIRP as an illustration of the type of purpose-built model that we believe is necessary for the PM of policy research.
Figure 2. The Spheres of Influence for Research Performance (SIRP) model

The conceptual basis for SIRP rests on a common approach to knowledge transfer, the spheres of influence approach, in order to highlight the importance of the transfer of policy research knowledge. While this approach is novel for internal government policy research, it does relate to other research impact models, such as the SSHRC analysis of fine arts impacts (Picard-Aitken and Bertrand 2008), which include sections on who is impacted by research and how. It is also noteworthy that previous work on social science research for decision making noted the importance of relationships and networks in achieving greater impacts (see Klautzer et al. 2011).

The SIRP model speaks to the linkages between policy researchers and research users, illustrated through successive layers of influence. At the most basic level is the client of the work, with whom there should be a direct linkage. Moving out, the next level represents policy analysts, whose role it is to develop policies based on research findings. At the third level are senior policy makers within ESDC (including the ADMs responsible for program areas). Further out are other federal government departments that may benefit from ESDC's research, and finally wider policy-making organizations such as provincial governments, NGOs or international organizations, given that ESDC's research percolates beyond the department. Theoretically, it is possible that ESDC's research could influence the wider sphere of public opinion, though we have not illustrated this sphere. Within this framework, while it is anticipated that impacts from research are likely to work their way outwards from the more closely aligned spheres to those more distally aligned, there is no conceptual reason that research cannot directly affect more distal spheres without affecting the closer spheres (that is, there is not a rigid linear flow of impacts out from the research). Indeed, it is possible that impacts in more distal spheres could ripple back inwards to affect the spheres closer to the research. For example, if a research project is archived and then used by another government department, it would be possible for the impacts of that use to feed back to ESDC and affect its view and use of the research. For simplicity's sake, we have depicted the model without such complex feedback loops or jumps of influence, but that does not preclude them from occurring.

This framework focuses on the need to measure performance in terms of research reach and impacts. As such, the indicators aligned with the framework are more subjective and based on whether research is "fit for purpose" in the eyes of those using its results. Linking the transfer of knowledge to its impact is particularly relevant for policy research, where the research is only one of a number of pieces of evidence and influence on policy making. This leads to indicators requiring input from research users, but this can be done in a low-burden way that helps ensure future research findings better meet the needs of research users. These indicators are presented in Table 2. It is worth noting that, like the payback framework, SIRP aligns multiple spheres with multiple impact categories.
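Purely by way of illustration, and not as part of the SIRP model as developed for ESDC, the engagement of successive spheres could be summarized in a single reach score. The notation and weights below are ours: index the spheres from s = 1 (the research client) to s = 5 (wider policy-making organizations), let e_s be a count or score of documented uses of a project's findings in sphere s, and let w_s be a weight that rises with distance from the researcher:

$$R \;=\; \sum_{s=1}^{5} w_s\, e_s$$

A higher value of R would indicate diffusion beyond the researcher–client nexus; the choice of weights, and whether e_s is drawn from the RMS, the Knowledge Portal or user surveys, would be matters for the department to settle.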
This could potentially lead to complications in understanding and reporting impact results. However, it is clear from the increasing and consistent use of the payback framework to assess research internationally that most groups consider the comprehensiveness of impacts across multiple levels more important than any resulting complication in the reporting of impacts.

Table 2. Indicators of Research Performance Under SIRP

Quality
Description: The quality indicator exists to provide insight into the excellence of the research in terms of how it influences decision-making.
Benefit: In SIRP, the ability to influence from the core internal user group outward to external parties is predicated on producing research deemed of sufficient quality. This can be easily assessed at the research client and policy analyst level by asking a series of scoring questions about the research. For higher levels, quality could be assessed by identifying where the research is referenced beyond the department as a proxy for peer review (essentially a citation approach).

Engagement
Description: The engagement indicator provides information on the level of participation in and take-up of research.
Benefit: In the previous models, engagement was a largely binary measure (researcher–client). In SIRP, engagement is tracked through the succeeding layers of users and potential users. At ESDC, tools such as the research and information management systems may provide information on these interactions. The further out from the core the research results are accessed, the further the influence of that research can be established. In addition to these measures, the RMS and Knowledge Portal can also be used to track the networks of researchers and who makes up these networks (i.e., their position within the successive layers of influence). This is an important indicator of engagement and is simple to collect and analyze for the department.

Impact
Description: The impact indicator in SIRP speaks to the perceived or actual utility of the research as it transits from the researcher through each succeeding layer of influence.
Benefit: Measuring impact in this fashion would require the development of bespoke utility indices that reflect the type and nature of the research conducted. These indices should be constructed in such a way as to assess the impact of research on decision-making and employ a rating system that would enable comparability over time. Impact assessment could also be gleaned from direct references to departmental research in official documents and possibly the grey or academic literature. Centralized information management and collection systems would provide the necessary information.

Efficiency
Description: The efficiency indicator provides the means to assess the return on investment of policy research activities, which was one of the key issues raised in the 2010 audit.
Benefit: The SIRP model is not ideally suited to measuring efficiency, as it is principally concerned with the reach of the research across succeeding layers of influencers. The efficiency indicators we propose include intensity measures such as the number of hours or the financial resources devoted to a particular project. This information is accessible to ESDC through its RMS and financial reporting systems. In addition to these objective measures, we suggest that this indicator would gain in strength through the inclusion of client perceptions of efficiency. This could include scored perceptions of timeliness, value for money, and efficiency of knowledge translation. This approach is already in use by some think-tanks to assess their project performance internally (research interview), suggesting it is an appropriate tool for policy-relevant research. Finally, using SIRP, the focus of the indicator would be on internal and core measures of efficiency. However, with the addition of other metrics, the efficiency measures could be made more powerful by linking a research project's range of influence to the resources it consumes.

Observations

The following table reviews the strengths and weaknesses of the two models we have discussed, based on our understanding of the two framework approaches, our reading of the literature, and feedback from ESDC interviewees and international experts in research performance measurement.

Table 3. Comparison of Options

Payback framework
Strengths: Describes how research gets to impacts; logic models are well understood in the Government of Canada; can link to evaluations easily; well understood and established PM framework for research; replicable to enable year-over-year benchmarks.
Weaknesses: May be difficult to match to research processes across the department; can create large amounts of data, increasing the data burden; traditional application is in academic settings, not policy ones; difficult to generalize across different areas of research.
Aspects that can be integrated: Impact categories speak to a way of understanding impacts that plays well with politicians and the public.

Spheres of Influence for Research Performance (SIRP)
Strengths: Relates to the groups that policy research inside a government setting is trying to impact; close links between indicators and the "fit" of research to policy purpose; generalizable across different areas and types of research.
Weaknesses: Difficult to align with objective measures; requires input from research users; challenge of multiple impact categories across different spheres in the SIRP model; innovative approach that has not been tested in this area; limited applicability for providing performance information to improve results.
Aspects that can be integrated: Some of the indicators may be used in other frameworks because they relate well to the needs of policy makers.

As we noted, performance measurement of policy research is relatively new. Proceeding down this avenue will require either modifying existing approaches or considering a new model that is more suitable to the activity being measured. Furthermore, this approach has to be calibrated against need and the ability of the organization to implement PM with due regard to resources and administrative feasibility. Finally, the options we reviewed for PM at ESDC are not mutually exclusive, and elements of each could be blended to create a more tailored approach. However, combining aspects of different models creates the potential for a model so complex that its utility would be undermined by its very weight, without necessarily yielding better performance information.
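As a further illustration only (again our notation, not a proposal drawn from the audit or the department), the suggestion in Table 2 that efficiency measures could be strengthened by linking a project's range of influence to the resources it consumes could be expressed by relating a reach score such as R above to a project's recorded cost C, measured in staff hours or dollars drawn from the RMS and financial reporting systems:

$$E \;=\; \frac{R}{C}$$

Comparisons of E across projects or years would be meaningful only if both the reach scores and the cost data were collected on a consistent basis.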
While Table 3 identifies some of the specific attributes of the different models for assessing policy research performance, it is important also to consider the potential impacts of measuring policy research performance on both the field of performance measurement and the practical assessment of policy research. For the field of performance measurement of research, it is noteworthy that while the majority of work in the field acknowledges the importance of knowledge translation, its models and frameworks still tend to focus on outputs and outcomes rather than relationships. It is possible that by focusing on relationships, the field of research performance measurement could integrate more closely with that of knowledge translation. This would provide new opportunities for the academic fields of research performance evaluation and knowledge translation, but also potentially more appropriate evaluation evidence for the management of policy research in practice. In reality, better performance measurement of research is becoming increasingly linked to the management of research activities and resources (such as the use of impact evaluation in the English Research Excellence Framework for university research funding decisions).

Conclusion

In this article we set out to examine the state of PM and its application to the policy research activities of government through a project focused on the practices and needs of ESDC. To this end, we compared a prevalent model of research impact assessment with a proposed alternative model for social science research in government, the SIRP. We further noted that government policy research activities have seldom been the object of performance measurement, in contrast to other areas of government activity such as policies and programs, a gap we ascribe to the relative unsuitability of existing models rooted in a focus on outputs and outcomes. Can existing models be used or tailored to assess the performance of government policy research? Certain elements of the prevailing model for research impact assessment, the payback framework, could be adapted. The payback framework would likely have resonance in a government context, as its logic-model approach is commonly used in the public sector. Though we propose SIRP as a better alternative, there are undoubtedly advantages for government departments in shortening implementation through the adaptation of an existing PM approach. However, an equally important consideration, and one that we believe militates in favour of SIRP, is the ability to provide performance information that reflects the area of activity. In this sense, we believe that the ability of research to be diffused beyond the narrow researcher-client nexus is a far more important measure of performance than metrics concerned with output and resource intensity. In short, SIRP may sow the seed for less inward-facing policy research in government. In closing, this research highlights the relative paucity of PM frameworks that are available for the specific needs of governmental policy research.
In this regard, ESDC's foray into the development of a PM model for policy research can be considered an important development, with the possibility of lesson learning and model diffusion across the public sector. "Without performance information, the Department cannot determine if resources are allocated to activities that provide the greatest value to policy research clients" (HRSDC 2010: s. 2.2). This statement is equally true for other government departments with an internal policy research capacity.

Notes

1 The Department's applied title was changed in 2013 from Human Resources and Skills Development Canada.
2 For example, a bibliographic survey of this Journal found no research on this topic published within its pages since 2000.
3 Examples of these sorts of measures include: long-term social outcomes such as socioeconomic status; economic measures such as GDP; and well-being measures such as life satisfaction (see OECD 2013). All of these measures face significant challenges in linking the measure to the policy research providing evidence to decision making.
4 The SIRP model was developed conceptually and specifically for ESDC. While of conceptual interest, it remains to be developed fully to allow its broader use.

References

Alberta. 2012. "Measuring up: Progress report on the Government of Alberta strategic plan." Available at: http://www.finance.alberta.ca/publications/measuring/.../measuring-up2012.pdf. Accessed 9 August 2013.
Auger, J. 1997. "La mesure de la performance dans le secteur public." Coup d'œil 3 (1).
Bernstein, D. 1999. "Comments on Perrin's effective use and misuse of performance measurement." American Journal of Evaluation 20 (1): 85–93.
Buxton, M., and S. Hanney. 1994. Assessing Payback from Department of Health Research and Development: Preliminary Report. HERG research report.
CAHS. 2009. Making an Impact: A Preferred Framework and Indicators to Measure Returns on Investment in Health Research. Ottawa, ON: Canadian Academy of Health Sciences.
Cameron, W.B. 1963. Informal Sociology: A Casual Introduction to Sociological Thinking. New York: Random House, p. 13.
Carden, F. 2004. "Issues in assessing the policy influence of research." International Social Science Journal 56 (1): 135–151.
Davies, H., S. Nutley, and I. Walter. 2005. Approaches to Assessing the Non-Academic Impact of Social Science Research: Report of the ESRC Symposium on Assessing the Non-Academic Impact of Research, 12–13 May 2005. London, UK: Economic and Social Research Council.
Donovan, C., and S. Hanney. 2011. "The 'Payback Framework' explained." Research Evaluation 20 (3): 181–183.
Emery, Y. 2005. « La gestion par résultats dans les organisations publiques : de l'idée aux défis de la réalisation. » Télescope 12 (3): 1–11.
Environment Canada. 2009. Measuring Environment Canada's Research and Development Performance. Gatineau, QC: Environment Canada.
Feller, I. 2002. "Performance measurement redux." American Journal of Evaluation 23 (4): 435–452.
Graham, A. 2008. "Integrating financial and other performance information: Striking the right balance of usefulness, relevance and cost." In KPMG, Holy Grail or Achievable Quest? International Perspectives on Public Sector Performance Management. Geneva: KPMG International, pp. 85–104.
Grant, J., P.-B. Brutscher, S.E. Kirk, L. Butler, and S. Wooding. 2010. Capturing Research Impacts: A Review of International Practice. Cambridge, UK: RAND Europe.
Heintzman, R. 2009.
"Measurement in public management: The case for the defence." Optimum On-Line 39 (1): 66–80. Available at: www.optimumonline.ca/article.phtml?id=325. Accessed 4 November 2013.
Hood, C., R. Dixon, and D. Wilson. 2009. "Managing by numbers: The way to make public services better?" Working Paper. London, UK: Economic and Social Research Council Public Service Programme.
Howlett, M. 2009. "Policy analytical capacity and evidence-based policy-making: Lessons from Canada." Canadian Public Administration 52 (2): 153–175.
HRSDC. 2009. 2010–2011 Estimates: Report on Plans and Priorities. Available at: http://www.tbs-sct.gc.ca/rpp/2010-2011/inst/csd/csd00-eng.asp. Accessed 10 June 2014.
——. 2010. Audit of the Management Framework for Research Activities – May 2010. Available at: http://www.hrsdc.gc.ca/eng/publications/audits/2010/18/index.shtml. Accessed 10 June 2014.
Immonen, S., and L.L. Cooksy. 2014. "Using performance measurement to assess research: Lessons learned from the international agricultural research centres." Evaluation 20 (1): 96–114.
Klautzer, L., S. Hanney, E. Nason, J. Rubin, J. Grant, and S. Wooding. 2011. "Assessing policy and practice impacts of social science research: The application of the Payback Framework to assess the Future of Work programme." Research Evaluation 20 (3): 201–209.
Lamari, M., J. Landry, and N. Amara. 2013. "From 'need to know' to 'need to use': What determines knowledge utilisation by public managers in governmental agencies in Quebec, Canada." Proceedings of the 20th International Business Research Conference, 4–5 April 2013, Dubai, UAE.
Landry, R., N. Amara, and M. Lamari. 2001. "Utilization of social science research knowledge in Canada." Research Policy 20: 333–349.
Landry, R., M. Lamari, and N. Amara. 2003. "The extent and determinants of the utilization of university research in government agencies." Public Administration Review 63 (2): 192–205.
Nason, E., L. Klautzer, J. Rubin, S. Hanney, S. Wooding, and J. Grant. 2007. Policy and Practice Impacts of Research Funded by the Economic and Social Research Council: A Case Study of the Future of Work Programme, Supporting Data. Cambridge: RAND Europe.
New Zealand. 2010. Review of Expenditure on Policy Advice: Improving the Quality and Value of Policy Advice. Wellington: Treasury.
Nutley, S., H. Davies, and I. Walter. 2003. "Evidence based policy and practice: Cross sector lessons from the UK." Keynote paper for the Social Policy Research and Evaluation Conference, Wellington, NZ.
OECD. 2013. "OECD guidelines on measuring subjective well-being." OECD Publishing. Available at: http://dx.doi.org/10.1787/9789264191655-en. Accessed 24 October 2014.
Paquet, G. 2009. "Quantophrenia." Optimum On-Line 39 (1): 14–27. Available at: http://www.optimumonline.ca/article.phtml?id=329. Accessed 4 November 2013.
Perrin, B. 1998. "Effective use and misuse of performance measurement." American Journal of Evaluation 19 (4): 367–379.
Picard-Aitken, M., and F. Bertrand. 2008. Review and Conceptualization of Impacts of Research/Creation in the Fine Arts. Montreal, QC: Science-Metrix.
Quebec. 2002. "Guide de gestion axée sur les résultats." Available at: http://www.tresor.gouv.qc.ca/fileadmin/PDF/publications/guide_gest-axee-resultat_02.pdf. Accessed 9 August 2013.
Savoie, D. 2013a. Whatever Happened to the Music Teacher? How Government Decides and Why. Kingston and Montreal: MQUP.
——. 2013b. "Running government like a business has been a dismal failure." The Globe and Mail, 7 January 2013.
Available at: www.theglobeandmail.com/commentary/running-government-like-a-business-has-been-a-dismal-failure/article6968196. Accessed 5 September 2013.
SSHRC. 2008. Dialogue: Measuring Impact – Weighing the Value of Social Sciences and Humanities Research. Ottawa, ON: Social Sciences and Humanities Research Council.
Stephenson, R., and M. Hennink. 2002. Moving Beyond Research to Inform Policy. United Kingdom: University of Southampton.
Thomas, P.G. 2004. Performance Measurement, Reporting and Accountability: Recent Trends and Future Directions. Saskatchewan Institute of Public Policy, Public Policy Paper Series No. 23.
——. 2005. "Performance measurement and management in the public sector." Optimum On-Line 35 (2): 16–26. Available at: http://www.optimumonline.ca/article.phtml?id=225. Accessed 11 June 2014.
——. 2008. "Why is performance-based accountability so popular in theory and difficult in practice? Paper presented to the World Summit on Public Governance: Improving the Performance of the Public Sector." In KPMG, Holy Grail or Achievable Quest? International Perspectives on Public Sector Performance Management. Geneva: KPMG International, pp. 169–191.
Thomson Reuters. 2010. Finding Meaningful Performance Measures for Higher Education: A Report for Executives. Philadelphia, US: Thomson Reuters. Available at: researchanalytics.thomsonreuters.com/m/pdfs/higher-ed-exec-report.pdf. Accessed 9 August 2013.
Treasury Board of Canada Secretariat (TBS). 2009. "Policy on evaluation." Available at: http://www.tbs-sct.gc.ca/pol/doc-eng.aspx?id=15024&section=text#cha3. Accessed 10 June 2014.
——. 2010. "Supporting effective evaluations: A guide to developing performance measurement strategies." Available at: http://www.tbs-sct.gc.ca/cee/dpms-esmr/dpms-esmrpr-eng.asp?format=print. Accessed 10 June 2014.

John Grundy

Performance measurement in Canadian employment service delivery, 1996–2000

Abstract: This article draws on the theoretical insights of Foucauldian governmentality scholarship to analyze a performance measurement system developed by the federal government in the 1990s to assess employment services for the unemployed. An examination of Human Resources Development Canada's Results-Based Accountability Framework yields insights into contestation and incoherence in performance measurement, often overlooked in governmentality research. This argument is developed by detailing three obstacles to implementing the performance measurement system: dilemmas of technical coordination, contestation by actors invested in different terms of measurement, and widespread recognition of the ambiguities of the performance data. The paper concludes by calling for more variegated accounts of governance in governmentality scholarship.

Sommaire : Le présent article s'appuie sur les notions de gouvernementalité pour analyser un système de mesure de la performance que le gouvernement fédéral a mis en place, dans les années 1990, pour mesurer les résultats des services d'emploi pour les chômeurs. Il prétend qu'un examen du Cadre de responsabilisation axé sur les résultats de Développement des ressources humaines Canada offre un éclairage sur les formes de contestation et d'incohérence dans la mesure de la performance, que la recherche sur la gouvernementalité néglige souvent. L'article développe cet argument en présentant en détail trois obstacles à la mise en œuvre du système de mesure de la performance.
Ceux-ci comprennent les dilemmes de coordination technique, la contestation de la part des acteurs investis dans différentes modalités de mesure, et une vaste reconnaissance des ambiguïtés des données de performance. En conclusion, l'article confirme les appels récents à des comptes de gouvernance plus variés dans les travaux de recherche sur la gouvernementalité.

John Grundy is a postdoctoral fellow, School of Occupational Therapy, University of Western Ontario, [email protected]. This research was supported by a Social Sciences and Humanities Research Council of Canada Postdoctoral Fellowship. The author thanks the journal's anonymous reviewers for helpful comments.

Canadian Public Administration / Administration publique du Canada, Volume 58, No. 1 (March/mars 2015), pp. 161–182. © The Institute of Public Administration of Canada / L'Institut d'administration publique du Canada 2015.

Introduction

Performance measurement is at the centre of public sector reform initiatives across a range of jurisdictions. Establishing performance measures and benchmarks, tracking organizational activity and publicly reporting results are now essential tasks for organizations under mounting pressure to demonstrate value for money. The current emphasis on measuring results has transformed the administration of employment services for the unemployed in many jurisdictions, including Canada. From the early 1990s onward, the OECD stressed rigorous results measurement of employment service delivery as a key component of active labour market policy, and it became a major site of policy transfer and expertise in the area. While the performance measurement of employment services in the U.S., Europe and Australia is well documented (Kerr, Carson and Goddard 2002; Nunn, Bickerstaffe and Mitchell 2009; Soss, Fording and Schram 2011; Weishaupt 2010; Brodkin 2011), the practice in Canada remains under-explored. This paper presents a historical case study of an employment service performance measurement system implemented by Human Resources Development Canada in the mid-1990s, known as the Results-Based Accountability Framework (RBAF). Adopted at the height of the federal government's embrace of new public management (NPM), and complementing labour market policy reforms associated with the Employment Insurance (EI) Act (1996), the RBAF was a key means of embedding a rapid re-employment orientation in service delivery. It altered the terms for measuring public and third-party employment service providers. In place of process-based measures such as the number of individuals served, the RBAF imposed two primary results indicators: the number of individuals returned to work and the savings in unpaid EI benefits generated as a result of re-employment. Administrators repeatedly asserted that these metrics would engender a culture of accountability for results.

The paper approaches the RBAF through the analytical lens of Foucauldian governmentality, defined most succinctly as "the conduct of conduct" (Foucault 1991: 93). Researchers across many disciplines have elaborated Foucault's initial formulation of governmentality into a critical analytical approach that illuminates any deliberate attempt to govern people and things. This approach does not attempt to explain governance in terms of causal variables such as parties, institutions, or economic forces.
It remains agnostic about questions of "why" and focuses instead on the "how" questions of governance: how do the problems concerning authorities in different sites emerge and how do they change over time? Through what discourses and techniques do authorities seek to bring about desired outcomes? As a pervasive technique of governance, the practice of performance measurement is a prominent topic within governmentality research. Scholars in this field emphasize how performance measurement governs conduct at a distance by imposing new forms of calculative scrutiny and self-surveillance. Numerous studies stress the disciplinary capacity of performance measurement regimes to "re-shape in their own image the organizations they monitor" and fabricate calculating selves in the process (Shore and Wright 1999: 570; see also Miller 1994). The analysis of the RBAF presented below departs from this image of performance measurement, however. Its central claim is that the RBAF did not gain the traction often implied in governmentality scholarship. The efforts of administrators to impose calculability encountered a series of dilemmas. These include the mundane but significant difficulties of technical coordination, which led many to doubt the RBAF's validity from the outset; forms of contestation on the part of organizational actors and external stakeholders invested in different terms of measurement; and, finally, growing recognition that the meaning of the performance data was inherently ambiguous. The RBAF's implementation difficulties warrant reflection on the part of researchers of governmentality. A dominant theme in this literature is the diffusion of neoliberal governmentality, a mode of governance characterized by the entrenchment of logics of enterprise and calculation, including auditing and performance measurement, in more and more areas of social life (Rose 1996; Dean 1999). While undoubtedly enriching our understanding of neoliberalism, studies in this vein are under mounting criticism for too often attributing a false coherence and effectiveness to governmental practices, and for making them "appear settled and sometimes even complete in ways that they are not" (Brady 2011: 266; see also O'Malley 1996; Mckee 2009; Walters 2012). A number of scholars call for governmentality research to focus less on programs of governance as expressed in official policy reports and statements, and to pay greater attention to what Li (2007: 28) describes as "the beyond of programing" – the refractory processes that precede and invariably confound governmental ambitions. This entails more attentiveness to the limits to governance posed by difficulties of technical coordination, realities of political contestation, and the limits and ambiguities of expertise. Analyses that do so are yielding more complex accounts of the work of governing. They show how governance practices that often appear in governmentality scholarship as unproblematically implemented and successful in their intended reach and effects are actually deeply contested, incoherent, and often failure-prone (Larner and Walters 2000; Higgins 2004; Howard 2006; Li 2007; Mckee 2009; Best 2014; Brady 2011 and 2014).1 Along these lines, the case study presented here illustrates how performance measurement appears very differently when we elevate rather than elide these realities.
It also shows how public administration research on performance measurement has many insights that can inform inquiry into the fragility and incoherence of calculative techniques of governance. The analysis proceeds in the following manner. The first section provides a brief overview of performance measurement and its growing use in employment service delivery. Turning to the empirical study, section two traces the implementation of the RBAF and its embroilment in technical difficulties, political contestation and the ambiguities of measurement. Drawing out the practical and theoretical implications of the analysis, the conclusion points to the limits of performance measurement as a means of coordinating employment service delivery. It also confirms recent calls for more variegated accounts of governance in governmentality studies. A final note on methods is warranted here. The case study is based on the analysis of department reports and administrative records acquired through a series of requests made under the Access to Information Act2 and research conducted in the now defunct Library and Archives of Human Resources and Skills Development Canada (renamed Employment and Social Development Canada). This yielded extensive departmental documentation including technical and consultancy reports on various aspects of employment service delivery, meeting minutes, department memos, presentation notes and other correspondence on the RBAF and employment services more generally. For another perspective on administrative reforms to employment services during the nineties, I consulted back issues of a newsletter produced by members of the Canada Employment and Immigration Union (CEIU), housed at the union's Toronto office. Documentary research was supplemented by semi-structured, anonymous interviews with two HRDC staff at different levels of the organization.3 Interview participants were asked questions relating to the implementation of the RBAF and other reforms to employment service delivery in the 1990s.

Conceptualizing performance measurement

Performance measurement has both a short and a long history. On one hand, its diffusion reflects the ascendance of NPM over the past three decades in nearly all advanced industrialized countries (Pollitt and Bouckaert 2000). Premised on a dim view of Weberian bureaucratic administration, NPM seeks to remake public sector bureaucracy in the image of the private sector through measures such as performance measurement and performance pay, cost-unit accounting, competitive tendering of government services, and privatization. As Brodkin (2012: 5) notes, few managerial techniques have been as widely replicated as performance measurement. On the other hand, the current enthusiasm for performance measurement reflects a preoccupation with the efficiency and effectiveness of government dating back more than a century. It is the most recent in a long list of managerial innovations, including scientific management of the progressive era, management by objectives of the fifties, and experiments in cost-benefit analysis during the sixties and seventies, which sought to bring rigorous quantitative visibility to all government functions "in search of the public sector approximation of private enterprise's 'bottom line' and for the operational control and clarified political choices consequent thereon" (French 1984: 33).
Studies based on Foucault's lectures on governmentality open up new ways to interpret the diffusion of performance measurement. This scholarship starts from a broad understanding of governance as "any more or less calculated and rational activity, undertaken by a multiplicity of authorities and agencies, employing a variety of techniques and forms of knowledge, that seeks to shape conduct. . ." (Dean 1999: 11). Governmentality studies investigate the forms of expertise and discourses involved in defining problems to be solved, the often mundane technical practices and procedures that enable governmental ambitions to become practical interventions, and the modes of selfhood and subjectivity fostered through governance projects (Walters 2000). A central theme in this literature is the diffusion of neoliberal governance characterized by logics of enterprise and competition, and the proliferation of techniques such as contractualism and performance measurement that aim to autonomize and responsibilize organizations and actors at a distance (Rose and Miller 1992; Miller 1994). From the perspective of governmentality studies, performance measurement appears as a key technique of neoliberal governance that enables the activities of widely dispersed actors to be "made inscribable and comparable in numerical form, in figures that can be transported to centres of calculation, aggregated, related, plotted over time, represented in league tables, judged against national averages, and utilized for future decisions about the allocation of contracts and budgets" (Rose 1999: 153; Larner and Le Heron 2004). Governmentality scholarship also emphasizes how performance measurement can induce self-monitoring on the part of scrutinized individuals and organizations. Those under performance measurement may internalize its norms and values and conduct themselves accordingly (Miller 1994; Lambert and Pezet 2012). Studies of the adoption of performance measurement in sites such as universities, hospitals, cultural organizations and social service agencies emphasize its capacity to delineate what activities constitute performance, and to exclude others from the organizational record (Shore and Wright 1999; Doolin 2004; McDonald 2006; Suspitsyna 2010). Governmentality-based approaches to performance measurement are highly relevant for understanding changes to the administration of employment services for the unemployed. The rigorous performance measurement of employment services is a key plank of reforms associated with the "activation" paradigm of labour market policy, now dominant in nearly all advanced welfare states (Nunn, Bickerstaffe and Mitchell 2009; van Berkel 2009; Weishaupt 2010; Soss, Fording and Schram 2011; Brodkin 2012). According to the activation paradigm, so-called passive income security programs such as Employment Insurance produce work disincentives for unemployed individuals and harmful labour market rigidities. It calls on governments to activate the unemployed through "help and hassle" employment service measures oriented primarily toward rapid re-employment. According to the activation paradigm's logic, traditional bureaucratic administration is inadequate to effect this transformation in the governance of the unemployed. Instead, it calls on employment service providers to embrace a new management culture in which "outcomes are tracked, program impacts estimated and less effective programs are replaced with more effective ones" (OECD 2005: 214).
Soss, Fording and Schram (2011: i205) characterize this new regime of activation as entailing the "interplay of paternalist systems for disciplining clients (e.g., sanctions) and neoliberal systems for disciplining service-providers (e.g., performance management)". The results-based measures used in most jurisdictions relate to the number and speed of job seekers re-employed and the off-flow of individuals from benefits as a result of service interventions. Performance measures may also be established for specific populations such as the long-term unemployed, youth or older workers. Many jurisdictions use performance measurement along with performance pay, competitive tendering, customer satisfaction surveys or the engineering of quasi-markets of employment service providers. Reflecting the disciplinary capacity of performance measurement that governmentality scholarship highlights, policy makers often set performance benchmarks continually higher to induce service providers to increase job placements. In a study of U.S. welfare-to-work programs, Schram et al. (2010) describe how performance measurement operates as a hierarchical chain of disciplinary relationships that runs from the federal government through lower levels of government, to individual offices, case workers, and ultimately to the individual client: "At each point in this cascade, benchmarks for outcomes are established and monitored, and managerial techniques, incentives, and penalties are used to discipline actors below" (Schram et al. 2010: 746). Breaking with overly systematized accounts of neoliberal governmentality, recent scholarly interventions call for better recognition of the fragile and uncertain work involved in constituting centers of calculation, the difficulties of making disparate spaces and subjects calculable, and the forms of contestation that arise from such efforts (Higgins and Larner 2010a; Best 2014; Prince 2014). The large public administration literature that details the technical challenges and unintended consequences associated with performance measurement can advance this new direction in governmentality scholarship on calculative techniques. Numerous studies undertaken by public administration scholars underscore the difficulties of management information system design and utilization, and the effects such difficulties can have on the interpretability and legitimacy of performance data (Doolin 2004; Rist and Stame 2006). Other studies emphasize how performance measurement can skew organizational activity toward those aspects that are measured, often at the expense of the substantive objectives of an organization, and often in conflict with administrative due process or equity (Perrin 1998; Kerr, Carson and Goddard 2002; Radin 2006; Chan and Rosenbloom 2010; Brodkin 2005 and 2011). Public administration literature also highlights the fundamental ambiguity of measurement in light of the thorny question of causality. In the context of social service delivery, including employment services, many factors beyond the particular agency can be the cause of the observed outcomes, and the more obvious those other factors are, the less credible performance measurement becomes (Mayne 1999: 7).
As the following sections illustrate, foregrounding these dilemmas within governmentality-based analyses of performance measurement can yield more complex and multifaceted accounts of this key technique of neoliberal governance.

Human Resources Development Canada's Results-Based Accountability Framework (RBAF)

The development of an employment service performance measurement system reflected government-wide shifts in the administration of federal bureaucracies during the nineties. The Liberal government, which was elected in 1993, was deeply influenced by the tenets of NPM and put questions of performance and efficiency at the centre of its agenda (Ilcan 2009). Under the Liberal government, the Treasury Board assumed new powers as a catalyst of managerial reform and implemented a twice-yearly departmental performance reporting process. In this context, employment service delivery increasingly stood out as an evaluative challenge. Employment service outcomes are not directly observable and are notoriously difficult to specify (Breslau 1998). The primary measures of organizational activity that existed for the employment service were input or process measures, such as the money spent providing services or the number of clients served, which provided no indication of outcomes. The Auditor General of Canada also criticized the employment service over the lack of results-based assessment (Office of the Auditor General of Canada 1988). In response to such criticism, employment service administrators ramped up net impact evaluation of service delivery in the late eighties. They also established a National Working Group on the Impact of Employment Counseling, which explored options for rendering frontline staff accountable for results (Employment and Immigration Canada 1991 and 1993). In short, results measurement was increasingly at the centre of service delivery reform initiatives. With the adoption of the new Employment Insurance (EI) Act in 1996, officials at HRDC NHQ set out to institute a new performance measurement system for Employment Benefits and Support Measures (EBSMs), which included training, targeted wage subsidies, self-employment assistance, as well as more short-term services including counseling, resume preparation and group sessions. Discussions among officials over possible performance metrics were framed by the priorities of the day. Given the emphasis on rapid re-employment in both the influential OECD Jobs Study of 1994 and the federal government's social policy reform agenda, officials adopted the measures of 1) the number of clients employed or self-employed as a result of a service intervention, and 2) the amount of savings in unpaid EI benefits resulting from client re-employment. Staff in HRDC's NHQ developed benchmarks for these indicators using administrative data on service users and benefit claimants in previous years. Regional targets were derived from these benchmarks and distributed to regional headquarters in 1996.4 A computerized performance tracking system went online shortly thereafter along with the EI Act's new employment services. For active EI claimants who received services and returned to work before the end of their benefit entitlement, the system credited the found-work count and benefits savings to the office where services were provided.
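A stylized reading of the savings measure may help fix ideas; the following expression is our illustration, not the department's published formula. For a claimant i who returned to work w_i weeks before exhausting an entitlement paying a weekly benefit of b_i dollars, the unpaid benefits credited to the serving office would be on the order of w_i b_i, so that an office's recorded savings over a reporting period would be approximately

$$S \;\approx\; \sum_{i} w_i\, b_i ,$$

summed over the active claimants it served who returned to work before exhausting their benefits.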
Clients who attained employment after benefit exhaustion would count only for the found-work measure, and would be captured through a follow-up telephone survey. Data from service interventions were compiled at NHQ and posted monthly on a departmental website. Office managers were encouraged to use the data to monitor the performance of their offices. The intended effect of this procedure was to induce a cultural change throughout the organization, and to communicate the message that staff would be made responsible for their performance in achieving results rather than following administrative processes (HRDC 1996a: 2). HRDC's performance measurement system undoubtedly exerted an influence over service delivery. Shortly following its implementation along with the new EI Act, service delivery shifted toward a short-term, rapid re-employment orientation. The proportion of service users participating in short-term services quickly increased, and the cost per participant fell sharply (Canada Employment Insurance Commission [CEIC] 1997 and 1999). One evaluation concluded that "the emphasis on short-term results has dominated the implementation of EBSMs" (HRDC 1998: 73). These effects are consistent with governmentality-based accounts of the way performance measurement can discipline and reshape organizational practice. Yet the implementation of the RBAF generated other effects that cannot be accounted for within governmentality narratives of discipline and surveillance. The RBAF was quickly entangled in dilemmas of technical coordination, forms of political contestation, and the indeterminacies of quantification. In turning to these now, the case study is intended to help redress what Bröckling, Krasmann and Lemke (2011: 20) suggest is the failure of governmentality research to adequately accentuate "the dislocations, translations, subversions, and collapses of power with as much meticulousness as [its] programs and strategic operations."

Problems with technical coordination

The case of the RBAF is instructive to scholars of governmentality because it illustrates the difficulties involved in making performance measurement workable in everyday practice. The mundane challenges of technical coordination undermined the legitimacy and integrity of the technology. Problems associated with data entry were perhaps the most immediate challenge to the RBAF. The process of tracking results relied on the standardized entry of client information in case management software at the outset of a client's employability intervention. Perhaps not surprisingly, an evaluation report indicated that a sizable portion of frontline staff was not consistently carrying out this task (HRDC 2001: 15). There was also a range of difficulties with client follow-up surveys after the end of service provision.5 Despite the promise of adequate funding by NHQ, offices reported having scant human and financial resources to devote to client follow-up (HRDC 1998). Ensuring standardized data collection was made even more challenging by the diverse arrangements with third-party agencies providing employment assistance services.
One administrator's assessment of the management information system was that it was more sensitive to changes in data entry and follow-up practices than it was to picking up changes in client employment (Personal Interview, HRDC Administrator). Such formidable challenges associated with office-level data entry confirm what Higgins and Larner (2010b: 6) describe as the precariousness of technologies of calculation that rely on the standardization of activity across different sites. Difficulties of data entry were compounded by growing concerns over the possibility of gaming and creaming at the individual office level (Working Group on Medium-Term Savings [WGMTS] 1999). Departmental documents note two different strategies adopted by offices. Given that performance results were only counted for case-managed clients (that is, clients who had a return-to-work action plan entered in case management software), some offices had begun to expand their case management practices to "everyone who walks through the door" (HRDC 1997: 3). A computer record was thus generated in the hope of later accumulating employment and savings counts. This effect of performance measurement is common in employment service delivery as agencies seek to accumulate performance points through unnecessary service provision (Nunn, Bickerstaffe and Mitchell 2009: 15). On the other hand, there was widespread concern that many offices had found ways to limit services to those who facilitate performance achievement, essentially those without more complex or multiple employment-related needs (WGMTS 1999). This practice was acknowledged in the formative evaluation of EBSMs released by HRDC in 1998. It conveyed service providers' concern that "some organizations have adapted or changed the clientele they served in order to obtain results. Consequently, community partners believed that some clients were 'falling through the cracks'" (HRDC 1998: 42-43). In response, HRDC's senior management took measures to mitigate the risk of creaming. They produced communications and organized field workshops admonishing office managers and staff to maintain what they called a "balanced portfolio" of clients. Their message was that reconciling short-term placement and savings targets with equitable service provision was possible, but required skillful decision-making on the part of frontline staff (HRDC 1996b: np). However, records indicate that a contingent of NHQ administrators recognized that exhorting offices simply to avoid creaming would have little effect (WGMTS 1999). The RBAF therefore illustrates the tenuousness of officials' efforts to impose new forms of calculability, a dynamic that remains under-explored in governmentality literature. In this case, difficulties related to data entry and growing suspicions of office-level gaming quickly generated doubts among staff over the integrity of the performance data, and over the very possibility of measuring results. Reflecting such difficulties, one administrative report noted "confusion at the local level concerning how results are calculated and, very importantly from management's point of view, what they mean and how to use them once they have been reported" (HRDC 1998: 43).
Attending to difficulties of technical coordination such as these can productively complicate governmentality studies, and counter the risk of reifying the coherence of governmental techniques such as performance measurement.

Challenges to performance measurement

Examination of the RBAF's implementation can yield another set of insights for researchers of governmentality into the contested nature of performance measurement. Governmentality scholarship on performance measurement tends to emphasize the formation of calculable spaces and disciplined subjects rather than the forms of contestation that take shape in and around calculative practices. As the case of the RBAF shows, however, the power of officials to define what counted as organizational performance, and to embed this definition in the way service delivery was measured, was deeply contested from the outset. The RBAF was implemented in a politically charged organizational environment made increasingly turbulent by successive managerial reforms. Its implementation followed a downsizing exercise amounting to a twenty percent reduction in the department's full-time staff and involving much greater use of community-based and for-profit service providers (Bakvis 1997; Good 2003). The Canada Employment and Immigration Union (CEIU), an active union among the federal public service (McElligott 2001), was deeply critical of these initiatives. The newsletter produced by members of CEIU's Ontario branch, Paranoia (a name inspired by the department's newsletter Panorama), reported numerous times on how funding for government-run employment services was being redirected to third-party providers including for-profit agencies (CEIU Ontario 1994: 3). One vocal HRDC staff member stated that the new contractual and performance-based service delivery model was eliminating the role of the employment counselor: "We've been in situations where counselors are being sent out to train community partners how to do our jobs. . .there's no counseling going to be left under this model. If you're just going around monitoring contracts, then you're no longer a counselor" (CEIU Ontario 1997: 3). The National President of the CEIU later characterized this period as one in which employment counselors were "reduced to passive compilers of paperwork" (Meunier McKay 2005). These developments were criticized by many staff whose sense of professionalism remained closely tied to human service work rather than the rituals of verification of NPM (Personal Interview, HRDC Employment Counselor 2008). Perhaps not surprisingly, as senior administrators went out into the field to promote the new performance measurement framework, many staff objected to the idea of being rendered accountable for the outcomes of service users (Office of the Auditor General of Canada 1997; Personal Interview, HRDC Administrator 2010). This reaction is consistent with much public administration research that highlights the negative effects of performance measurement on staff morale (Dias and Maynard-Mooney 2007; Diefenbach 2009). It also challenges simplistic narratives, common in the literature on neoliberal governmentality, of the fabrication of calculating selves who self-govern in accordance with calculative techniques.
A practice administrators adopted in the years following the RBAF's implementation was to have individual Human Resource Canada Centres (HRCCs) establish their own numerical performance targets in consultation with NHQ. This was intended to mitigate conflict likely to arise from the top-down, mechanical imposition of targets, and ideally to facilitate local-level ownership over the results measurement process. Such ownership and participation did not extend to the initial determination of the primary short-term-oriented results measures, which remained controversial. A common concern among staff was that its short-term measures did not provide a way to account for the intermediate steps many service users with more extensive needs were required to undertake prior to securing employment (WGMTS 1999). Equally problematic for many was the lack of any way to account for the quality of work found in terms of duration or wages. Such concerns illustrate how, in complex systems of social welfare provision, there is often no single obvious measure of outcomes, but instead a range of differently situated actors who are invested in different terms of measurement (Paton 2003: 45). Such multivocality within governance requires careful theorization in governmentality scholarship as it has important implications for how governance plays out in everyday practice. As the practice of having offices set their own numerical targets illustrates, it often requires forms of negotiation, compromise, and often some degree of mutual accommodation (O'Malley 1996: 313; see also Brady 2011). Amidst growing controversy over developments in employment service delivery, the Unemployed Workers Council (UWC), established in 1992 by the Toronto Labour and Building Trades Councils, organized a twenty-seven-city tour of Ontario with a former HRDC employee. The tour sought to raise awareness about the increasingly exclusionary nature of employment service delivery under the new EI Act and HRDC's performance-based regime. The UWC claimed that the department's emphasis on generating EI savings amounted to discrimination against the disabled, immigrants, women, and others with special needs. It even sought to initiate a complaint with the Canadian Human Rights Commission. While the UWC failed in its bid to initiate a complaint, it established a hotline and encouraged anyone who felt they were denied a service unfairly to call (CEIU Ontario 1998: 3; Fort Frances Times Online 1998). The RBAF generated forms of contestation poorly captured in much governmentality scholarship on performance measurement and neoliberalism more generally. It was uneasily grafted onto an organizational context characterized by a range of actors with divergent visions of service delivery goals. Values of due process, equity and quality service provision posed obstacles to the new performance-based regime. The RBAF thus underscores how techniques of neoliberal governmentality do not simply extend themselves unproblematically across social and organizational fields, steamrolling over past formations in the process. Instead, as Brodie (2008: 148) argues, "[p]reviously cultivated identities, political consensus, and cultural ideals
Instead, as Brodie (2008: 148) argues, "[p]reviously cultivated identities, political consensus, and cultural ideals. . .constitute obstacles to the promotion of a new governing order, and its particular way of representing and intervening." Documentation of these obstacles and their implications is necessary to avoid overstating the reach and effects of calculative technologies associated with neoliberal governmentality.

Limits of organizational knowledge within performance measurement

The RBAF offers a third insight for scholars of governmentality concerning the role and limits of expertise in governance. One of the central tenets of this literature is that governance is a knowledge-intensive activity. Foucault's account of the emergence of modern governmentality highlights the formation of arrangements through which formal state apparatuses incorporated expert knowledge in areas such as public health and statistics into projects of social administration. Scholars of governmentality continue to explore the interactions between political authorities and the experts that assist in governance. Many studies document an ongoing shift whereby experts able to wield powerful know-hows of calculation and monitoring are gaining influence over other forms of professional power established over the 20th century by teachers, social workers, counselors, doctors and others (Rose and Miller 1992; Isin 2002). While accounts of this development tend to emphasize the increasing clout of calculative expertise, a growing strand of governmentality research highlights the persistence of ambiguity and incoherence within calculative practices associated with neoliberal governance (Higgins and Larner 2010a; Best 2014; Prince 2014). The RBAF exemplifies the ambiguities that can confound calculative technologies. While promotional material from NHQ stressed the ability of the RBAF to capture organizational results, many recognized that there was no necessary relation between the work of staff and the performance data recorded in the management information system. While the RBAF could provide some information as to how an employment office functioned, it could not indicate whether the outcomes recorded were a result of service interventions rather than of any number of other factors, including chance (HRDC [Strategic Evaluation and Monitoring] 1998). This fueled concern among both staff and management that the RBAF was unfairly penalizing offices where the failure to meet placement and savings targets reflected poor local economic conditions rather than any deficiency in service delivery. Conversely, there was concern that it was crediting offices with performance points that were more likely the result of a buoyant local economy (WGMTS 1999). Given that the majority of EI claimants do not exhaust benefits even when they do not receive services, many questioned the logic of treating savings in unpaid benefits as an attribute of service delivery and a measure of performance (Personal Interview, HRDC Administrator 2010). These dilemmas of attribution undermined the capacity of the RBAF to discipline and reshape organizational practice in the ways typically stressed in governmentality studies. The deficiencies of the RBAF had been a serious concern for a number of the department's program evaluators.
They asserted that determining the results of service delivery in a manner that met a bare minimum of scientific legitimacy required a "net" impact evaluation that could isolate program impacts from other influences. Only in this way could officials determine the benefit that would not have occurred in the program's absence. The concerns of program evaluators reflected divisions between program evaluation, which emphasizes methodological sophistication and is both time and resource intensive, and performance measurement, which privileges managerial utility over scientific rigour. Over the past few decades performance measurement has gained prominence over the practice of program evaluation, given officials' preference for continual streams of easily understood performance data that can facilitate managerial control (Bastoe 2006; McDavid and Huse 2006). A number of HRDC administrators well versed in program evaluation established a working group to develop a measure of office performance over the medium term. The group sought to devise an "analytically meaningful operational measure of medium-term [EI] savings to help mitigate the effects of undue reliance on short-term measures and confusing signals the accountability regime was sending to HRCCs" (WGMTS 1999: i). It settled on an evaluation method known as difference-in-differences to determine medium-term "net" EI savings.6 This involved comparing EI use among claimants three years before and three years after an employability intervention. It then entailed a non-experimental exercise to estimate what claimants' EI use would have been without receiving an employment service. Medium-term "net" EI savings resulted if, over the three years following an employability intervention, claimants' actual EI use was less than their estimated EI use in the absence of an intervention. According to the working group, this method of calculation would correct for external influences on employment office results such as local economic conditions. By allowing for a three-year time horizon in which results could be documented, it would alleviate the pressure placed on frontline staff to generate short-term results, and allow agencies to better accommodate the needs of individuals and communities. For these reasons, the development of medium-term measures was a priority among many program evaluators, senior regional managers, office managers and staff (WGMTS 1997a). HRDC staff involved in the working group recognized the need to present their message carefully to secure senior management support (WGMTS 1997b). Performance measurement systems always advance certain organizational interests over others, and changes to them can shift the balance of organizational power. Administrative records indicate that the response of senior management to the working group's proposal was mixed. Some members of HRDC's Audit and Evaluation Committee were unreceptive to the working group's report, and perceived its method to be in competition with the existing RBAF (HRDC [Strategic Evaluation and Monitoring] 1998: 3). Medium-term operational measures might have allowed for an alteration of service delivery practices in ways not consistent with the broader policy orientation of the Liberal government toward short-term, rapid re-employment interventions. As research on program evaluation frequently demonstrates, evaluation methods at odds with the political preferences of policy makers often remain marginal in policy deliberation (Weiss 1999).
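Expressed schematically, the working group's calculation can be summarized as follows; the notation here is an illustrative reconstruction rather than the working group's own. Writing $\bar{B}$ for average EI benefit use over a three-year window, measured either before (pre) or after (post) an employability intervention, for participating claimants ($P$) and for a non-experimentally constructed comparison group of similar claimants ($C$), the medium-term measure amounts to

\[
\widehat{S}_{\mathrm{net}} = \left(\bar{B}^{\mathrm{pre}}_{P} - \bar{B}^{\mathrm{post}}_{P}\right) - \left(\bar{B}^{\mathrm{pre}}_{C} - \bar{B}^{\mathrm{post}}_{C}\right).
\]

On this reading, positive "net" savings arise only when participants' EI use fell by more than the comparison group's; the second difference is what, in principle, corrects for local economic conditions and other external influences on office results.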
Nevertheless, some among senior management did support efforts to develop medium-term measures of performance (see Good 1999), and longer-term outcome tracking was included in periodic employment service evaluations carried out in the different provinces. The point to stress here is that the RBAF did not bear out the coherence usually attributed to performance measurement in the governmentality literature. The performance measurement system did not generate a clear picture of organizational results, as its measures were widely recognized to be inherently ambiguous. As one report put it: "both HRDC and [the] TB [Treasury Board] are in transition towards a new government-wide accountability and performance measurement regime that neither may be fully comfortable with, nor understands" (WGMTS 1999: 13). One administrator similarly recalled that, "as we went to implement it, there was a lot of struggle. . .if you pulled at these measures, they weren't that robust" (Personal Interview, HRDC Administrator, 2010). In this way, the RBAF gave rise to new ambiguities around the effects of service delivery and the measurability of service outcomes.

Beginning in 1996, Labour Market Development Agreements (LMDAs) were established between the federal and provincial governments. The agreements eventually transferred the administration of EI Act related employment services to the provinces. This development saw the externalization of the RBAF as a means of coordinating intergovernmental accountability. Two of the three primary measures of provincial performance in the LMDAs were drawn directly from HRDC's internal performance measurement system: the number of EI claimants returned to work and the EI savings generated as a result of service interventions. The LMDAs incorporated the number of active EI claimants served as a third measure. While the performance measures of the LMDAs are analyzed elsewhere (Wood and Klassen 2009), it is important to note how they reproduced the dilemmas of HRDC's internal RBAF. Critiques initially leveled against HRDC's internal performance measures were soon made of the LMDAs. As an administrator recalled: "[a]fter the agreements were signed. . . there was some legitimate criticisms that the definition of found work wasn't rigorous enough or the benefits [measure] wasn't meaningful" (Personal Interview, HRDC Administrator 2010). These concerns underscore scholarly criticism of public performance reporting as a mechanism for ensuring intergovernmental accountability in Canada (Anderson and Findlay 2010; Graefe and Levesque 2013).

Conclusion

HRDC's Results-Based Accountability Framework was a key element of the neoliberal labour market policy reforms adopted by the federal government in the 1990s. Through its implementation, administrators sought to mobilize the disciplinary capacity of performance measurement, amply documented in governmentality scholarship, to reshape the activities of frontline staff and managers in accordance with the objective of rapid re-employment of EI claimants. This entailed the interplay of new pressures aimed at the unemployed, who were increasingly subject to rapid re-employment measures, and at employment service staff, who were made accountable for results.
Yet the case of the RBAF exemplifies complexities too often discounted in governmentality scholarship on calculative techniques. Discipline and self-surveillance on the part of staff were by no means the RBAF's only organizational effects. It became embroiled in technical and political challenges and numerous unintended consequences. Far from simply generating calculating subjectivities, the RBAF did not sit easily with the values of many staff and was contested. Rather than imposing a grid of calculability, the RBAF generated considerable confusion over the validity and meaning of the performance data. Actors throughout the employment service recognized that the integrity of the data could not be assured given the difficulties involved in standardized data entry and client follow-up. An even more fundamental ambiguity arose over the question of attribution and causality. Ultimately, rather than furnishing a bottom-line measure of organizational results, the performance measurement regime gave rise to new ambiguities around the measurability of service delivery. The RBAF provides a vivid illustration of the difference induced in governmentality analysis when more attention is given to the difficulties and unintended effects of governance practices. This study therefore confirms the need for further elaboration of governmentality studies more closely attuned to the fragility and contestability of practices of governance (O'Malley 1996; Mckee 2009; Brady 2011; Walters 2012). It also underscores the capacity of public administration research to deepen and extend this line of governmentality-based inquiry. Public administration scholarship on the technical and political obstacles that confound performance measurement warrants close consideration in studies of calculative governmentalities. Finally, the foregoing analysis points to several broader implications. It confirms previous research which shows that the imposition of narrow performance metrics on a complex service delivery organization is likely to lack legitimacy and generate contestation (Dias and Maynard-Moody 2007). This study also highlights how performance measurement should be considered a form of policy making, rather than simply a neutral technical or administrative exercise (Brodkin 2011). The matter of how officials define performance warrants a much more prominent place in public deliberation over policy implementation. This is especially pressing given the well documented tendency of performance measurement to induce organizations to "make the numbers" in ways that may run up against legislation, the entitlements of service users, and norms of equity or quality service delivery. Without careful assessment of the full effects of performance measurement, this central pillar of managerialism may exacerbate deficits of organizational transparency and democratic accountability (Brodkin 2011).

Notes

1 For an in-depth discussion of this new direction in governmentality studies, see the October 2014 special issue of Foucault Studies, titled "Ethnographies of Neoliberal Governmentalities."
2 The Access to Information Requests made to HRDC were broad in scope. They sought all records related to the Results-Based Accountability Framework as well as the Service Outcome Measurement System, an outcome measurement initiative administrators began work on in 1994.
3 The interview with the HRDC administrator was conducted on September 10, 2010. The interview with the Employment Counselor took place on January 4, 2008.
4 According to one report, regional headquarters tended to allot performance targets to individual offices based on the proportion of resources they used. For instance, an office that absorbed ten percent of the regional budget would be responsible for achieving ten percent of the region's targets (HRDC 1998: 43).
5 Departmental documents convey a lack of uniformity in methods for conducting such surveys. Some indicate that surveys would be conducted by national or regional headquarters. Others suggest that individual offices would be provided with resources to conduct the surveys, either by using local third parties, office staff or regional tele-centre facilities.
6 In using the difference-in-differences method, the working group built on previous efforts undertaken by staff in HRDC's evaluation branch.

References

Anderson, Lynell, and Tammy Findlay. 2010. "Does public reporting measure up? Federalism, accountability and child-care policy in Canada." Canadian Public Administration 53 (3): 417–438.
Bakvis, Herman. 1997. "Getting the giant to kneel: A new human resources delivery network for Canada." In Alternative Service Delivery: Transcending Boundaries, edited by Robin Ford, and David Zussman. Toronto: IPAC/KPMG.
Bastoe, Per Oyvind. 2006. "Implementing results-based management." In From Studies to Streams: Managing Evaluative Systems, edited by Ray Rist, and Nicoletta Stame. New Brunswick, NJ: Transaction Publishers.
Best, Jacqueline. 2014. Governing Failure: Provisional Expertise and the Transformation of Global Development Finance. Cambridge: Cambridge University Press.
Brady, Michelle. 2011. "Researching governmentalities through ethnography: The case of Australian welfare reforms and programs for single parents." Critical Policy Studies 5 (3): 265–283.
——. 2014. "Ethnographies of neoliberal governmentalities: From the neoliberal apparatus to neoliberalism and governmental assemblages." Foucault Studies 18: 11–33.
Breslau, Daniel. 1998. In Search of the Unequivocal: The Political Economy of Measurement in U.S. Labor Market Policy. Westport: Praeger.
Bröckling, Ulrich, Susanne Krasmann, and Thomas Lemke. 2011. "From Foucault's lectures at the Collège de France to studies of governmentality: An introduction." In Governmentality: Current Issues and Future Challenges, edited by Ulrich Bröckling, Susanne Krasmann, and Thomas Lemke. New York: Taylor and Francis.
Brodie, Janine. 2008. "We are all equal now: Contemporary gender politics in Canada." Feminist Theory 9 (2): 145–164.
Brodkin, Evelyn. 2005. "Toward a contractual welfare state? The case of work activation in the United States." In Contractualism in Employment Services: A New Form of Welfare State Governance, edited by Els Sol, and Maria Westerveld. The Hague: Kluwer Law International.
——. 2011. "Policy work: Street-level organizations under new managerialism." Journal of Public Administration Research and Theory 21: i253–i277.
——. 2012. "Reflections on street-level bureaucracy: Past, present, and future." Public Administration Review 72 (6): 940–949.
Canada Employment Insurance Commission (CEIC). 1997. 1997 Employment Insurance Monitoring and Assessment Report. Ottawa: Her Majesty the Queen in Right of Canada.
——. 1999. 1999 Employment Insurance Monitoring and Assessment Report. Ottawa: Her Majesty the Queen in Right of Canada.
Canada Employment and Immigration Union (CEIU), Ontario Region. 1994. "CEC jobs threatened by contracting out." Paranoia: The Workers Magazine of CEIU 12 (3) May-June: 3.
——. 1997. "CEIU members resist devolution in Metro Toronto." Paranoia: The Workers Magazine of CEIU 14 (3) January-February: 3.
——. 1998. "Former CEIU member slams HRDC." Paranoia: The Workers Magazine of CEIU 15 (3) June-July: 3.
Chan, Hon, and David Rosenbloom. 2010. "Four challenges to accountability in contemporary public administration: Lessons from the United States and China." Administration and Society 42: 11S–33S.
Dean, Mitchell. 1999. Governmentality: Power and Rule in Modern Society. London: Sage.
Dias, Janice Johnson, and Steven Maynard-Moody. 2007. "For-profit welfare: Contracts, conflicts, and the performance paradox." Journal of Public Administration Research and Theory 17 (2): 189–211.
Diefenbach, Thomas. 2009. "New public management in public sector organizations: The dark sides of managerialistic 'enlightenment'." Public Administration 87 (4): 892–909.
Doolin, Bill. 2004. "Power and resistance in the implementation of a medical management information system." Information Systems Journal 14: 343–362.
Employment and Immigration Canada (Working Group on the Impact of Counseling). 1991. Employment Counseling Into the 1990's. Gatineau: Employment and Immigration Canada.
——. 1993. Employment Counseling Measurement and Accountability Framework. Gatineau: Employment and Immigration Canada.
Fort Frances Times Online. 1998. "Workers' group fighting government over EI." July 8 1998. Available at: http://newsite.fftimes.com/node/55041.
Foucault, Michel. 1991. "Governmentality." In The Foucault Effect: Studies in Governmentality, edited by Graham Burchell, Colin Gordon, and Peter Miller. Chicago: University of Chicago Press.
French, Richard. 1984. How Ottawa Decides. Planning and Industrial Policy-Making 1968–1984. Toronto: James Lorimer and Company with the Canadian Institute for Economic Policy.
Good, David. 1999. "Management response." In Managing by Results: Employment Benchmarking and Savings Impacts for Employment Insurance, Final Report, by Ging Wong and Lesle Wesa. Strategic Evaluation and Monitoring, Evaluation and Data Development, Human Resources Development Canada. Available at: http://dsp-psd.pwgsc.gc.ca/Collection/RH63-2-062-02-99E.pdf.
——. 2003. The Politics of Public Management: The Human Resources Development Canada Audit of Grants and Contributions. Toronto: University of Toronto Press.
Graefe, Peter, and Mario Levesque. 2013. "Accountability in labour market policies for persons with disabilities." In Overpromising and Underperforming? Understanding and Evaluating New Intergovernmental Accountability Regimes, edited by Peter Graefe, Julie M. Simmons, and Linda Ann White. Toronto: University of Toronto Press.
Higgins, Vaughan. 2004. "Government as a failing operation: Regulating administrative conduct 'at a distance' in Australia." Sociology 38 (3): 457–476.
Higgins, Vaughan, and Wendy Larner (eds). 2010a. Calculating the Social: Standards and the Reconfiguration of Governing. Houndmills: Palgrave Macmillan.
Higgins, Vaughan, and Wendy Larner. 2010b. "Standards and standardization as a social science problem." In Calculating the Social: Standards and the Reconfiguration of Governing, edited by Vaughan Higgins, and Wendy Larner. Houndmills: Palgrave Macmillan.
Howard, Cosmo. 2006. "The new governance of Australian welfare: Street-level contingencies." In Administering Welfare Reform: International Transformations in Welfare Governance, edited by Paul Henman, and Menno Fenger. Bristol: Policy Press.
Human Resources Development Canada (HRDC). 1996a. Results-Based Accountability Framework. Ottawa: Her Majesty the Queen in Right of Canada.
——. 1996b. "Corporate incremental EI savings objective." Memo from Mel Cappe, Deputy Minister, Human Resources Development Canada, to Regional Director Generals. Gatineau: Human Resources Development Canada.
——. 1997. "National workshop to take stock of employment targets." Interoffice Memorandum (07/21/97). Gatineau: Human Resources Development Canada.
——. 1998. Formative Evaluation of the Employment Benefits and Support Measures, Final Report. Gatineau: Human Resources Development Canada.
——. 2001. Canada-Newfoundland and Labrador LMDA/EBSM Evaluation of Support Measures, Final Report. Gatineau: Human Resources Development Canada.
Human Resources Development Canada (Strategic Evaluation and Monitoring). 1998. "The Role of Evaluation in the Human Resources Development Canada Results-Based Accountability Framework, Draft." Memo. Gatineau: Human Resources Development Canada.
Ilcan, Suzan. 2009. "Privatizing responsibility: Public sector reform under neoliberal government." Canadian Review of Sociology 46 (3): 207–234.
Isin, Engin. 2002. Being Political: Genealogies of Citizenship. Minneapolis, MN: University of Minnesota Press.
Kerr, Lorraine, Ed Carson, and Jodi Goddard. 2002. "Contractualism, employment services and mature age job-seekers: The tyranny of tangible outcomes." The Drawing Board: An Australian Review of Public Affairs 3 (2): 83–104.
Lambert, Caroline, and Eric Pezet. 2012. "Accounting and the making of homo liberalis." Foucault Studies 13: 67–81.
Larner, Wendy, and Richard Le Heron. 2004. "Global benchmarking: Participating 'at a distance' in the globalizing economy." In Global Governmentality: Governing International Spaces, edited by Wendy Larner, and William Walters. London: Routledge.
Larner, Wendy, and William Walters. 2000. "Privatization, governance and identity: The United Kingdom and New Zealand compared." Policy and Politics 28 (3): 361–377.
Li, Tania. 2007. The Will to Improve: Governmentality, Development and the Practice of Politics. Durham: Duke University Press.
Mayne, John. 1999. "Addressing attribution through contribution analysis: Using performance measures sensibly." Available at: http://www.oag-bvg.gc.ca/internet/docs/99dp1_e.pdf.
McDavid, James C., and Irene Huse. 2006. "Will evaluation prosper in the future?" The Canadian Journal of Program Evaluation 21 (3): 47–72.
McDonald, Catherine. 2006. "Institutional transformation: The impact of performance measurement on professional practice in social work." Social Work and Society 4 (1): 25–37.
McElligott, Greg. 2001. Beyond Service: State Workers, Public Policy, and the Prospects for Democratic Administration. Toronto: University of Toronto Press.
Mckee, Kim. 2009. "Post-Foucauldian governmentality: What does it offer critical social policy analysis?" Critical Social Policy 29: 465–486.
Meunier McKay, Janet. 2005. "The 'call for proposals' process: A view from inside HRSDC." A presentation to the House of Commons Standing Committee on Human Resources, Skills Development, Social Development and the Status of Persons with Disabilities, 38th Parliament, 1st Session, April 12th 2005.
Miller, Peter. 1994. "Accounting and objectivity: The invention of calculating selves and calculable spaces." In Rethinking Objectivity, edited by Allan Megill. Durham: Duke University Press.
Nunn, Alex, Tim Bickerstaffe, and Ben Mitchell. 2009. "International review of performance management systems in public employment services." Department for Work and Pensions Research Report No 616. Available at: http://campaigns.dwp.gov.uk/asd/asd5/rports20092010/rrep616.pdf.
OECD. 2005. "Public employment services: Managing performance." In OECD Employment Outlook 2005. Available at: http://www.oecd.org/dataoecd/2/40/36780883.pdf.
Office of the Auditor General of Canada. 1988. Report of the Auditor General of Canada to the House of Commons Fiscal Year Ended 31 March 1988. Ottawa: Minister of Supply and Services Canada.
——. 1997. October Report of the Auditor General of Canada. Chapter 17—Human Resources Development Canada—A Critical Transition Toward Results-Based Management. Ottawa: Supply and Services Canada.
O'Malley, Pat. 1996. "Indigenous governance." Economy and Society 25 (3): 310–326.
Paton, Rob. 2003. Managing and Measuring Social Enterprises. London: Sage Publications.
Perrin, Burt. 1998. "Effective use and misuse of performance measurement." American Journal of Evaluation 19 (3): 367–379.
Pollitt, Christopher, and Geert Bouckaert. 2000. Public Management Reform: A Comparative Analysis. Oxford: Oxford University Press.
Prince, Russell. 2014. "Calculative cultural expertise? Consultants and politics in the UK cultural sector." Sociology 48 (4): 747–762.
Radin, Beryl. 2006. Challenging the Performance Movement: Accountability, Complexity, and Democratic Values. Washington: Georgetown University Press.
Rist, Ray, and Nicoletta Stame (eds). 2006. From Studies to Streams: Managing Evaluative Systems. New Brunswick, NJ: Transaction Publishers.
Rose, Nikolas. 1996. "Governing advanced liberal democracies." In Foucault and Political Reason, edited by Andrew Barry, Thomas Osborne, and Nikolas Rose. Chicago: University of Chicago Press, 37–65.
——. 1999. Powers of Freedom: Reframing Political Thought. Cambridge: Cambridge University Press.
Rose, Nikolas, and Peter Miller. 1992. "Political power beyond the state: Problematics of government." The British Journal of Sociology 43 (2): 173–205.
Shore, Chris, and Susan Wright. 1999. "Audit culture and anthropology: Neo-liberalism in British higher education." Journal of the Royal Anthropological Institute 5 (4): 557–575.
Schram, Sanford, Joe Soss, Linda Houser, and Richard Fording. 2010. "The third level of U.S. welfare reform: Governmentality under neoliberal paternalism." Citizenship Studies 14 (6): 739–754.
Soss, Joe, Richard Fording, and Sanford Schram. 2011. "The organization of discipline: From performance management to perversity and punishment." Journal of Public Administration Research and Theory 21 (supplement 2): i203–i232.
Suspitsyna, Tatiana. 2010. "Accountability in American education as a rhetoric and a technology of governmentality." Journal of Education Policy 25 (5): 567–586.
van Berkel, Rik. 2009. "The provision of income protection and activation services for the unemployed in 'active' welfare states. An international comparison." Journal of Social Policy 39 (1): 17–34.
Walters, William. 2000. Unemployment and Government: Genealogies of the Social. Cambridge: Cambridge University Press.
——. 2012. Governmentality: Critical Encounters. New York: Routledge.
Weishaupt, J. Timo. 2010. "A silent revolution? New management ideas and the reinvention of European public employment services." Socio-Economic Review 8: 461–486.
Weiss, Carol H. 1999. "The interface between evaluation and public policy." Evaluation 5 (4): 468–486.
Wood, Donna, and Thomas R. Klassen. 2009. "Bilateral federalism and workforce development policy in Canada." Canadian Public Administration 52 (2): 249–270.
Working Group on Medium-Term Savings (WGMTS). 1997a. "Minutes of the meeting of the Working Group on Medium-Term Savings, November 17, 1997." Gatineau: Human Resources and Skills Development Canada.
——. 1997b. "Minutes of the meeting of the Working Group on Medium-Term Savings, December 12, 1997." Gatineau: Human Resources and Skills Development Canada.
——. 1999. Report of the Working Group on Medium-Term Savings. Gatineau: Human Resources and Skills Development Canada.

Burt Perrin

Bringing accountability up to date with the realities of public sector management in the 21st century

Abstract: This article identifies basic shortcomings of traditional approaches to accountability and considers some of the reasons for the persistence of an approach that is known to: provide an inaccurate and distorted view of actual performance; inhibit rather than facilitate improved performance; and contribute to less rather than more confidence in government. The article then presents a vision of accountability more in keeping with the realities of public sector management in the twenty-first century.

Sommaire : Le présent article indique les principales déficiences dans les façons traditionnelles d'aborder l'obligation de rendre compte et examine certaines des raisons pour lesquelles on continue de recourir à cette démarche qui, on le sait, entraîne les conséquences suivantes : cela fournit une fausse et inexacte représentation de la performance réelle, empêche plutôt que ne favorise une meilleure performance, et contribue à avoir moins plutôt que davantage confiance dans le gouvernement. L'article présente ensuite une vision de l'imputabilité qui correspond mieux aux réalités de la gestion du secteur public au XXIe siècle.

Burt Perrin is an independent consultant based in France who assists governments and other organizations internationally with advice about planning, evaluation management, design and quality assurance ([email protected]). He would like to thank Harry Jones, John Mayne, Nicoletta Stame, Jim McDavid, and Alison Perrin for many helpful suggestions on an earlier draft of this article. In particular, he would like to thank Jim McDavid for his encouragement, support, and editorial assistance.

Introduction

In recent years, there has been a major shift in public sector management from a focus on activities and processes to a focus on results – benefits to citizens arising from government processes. There also is increasing recognition that government interventions, by their very nature, are complex and operate in uncertain environments where adaptability is needed rather than sticking to a plan that can quickly become out of date. As well, significant outcomes rarely follow from just a single action or actor. These shifts have implications for all aspects of public sector management, including accountability. But, as this article documents, traditional approaches to accountability, designed primarily for checking compliance with rules and procedures and expenditures against budgets, have not
kept pace with these developments. Holding programs accountable for compliance with expectations seems clear and fair enough. But, by itself, this inevitably misleads and distorts. Such practices do not provide a proper account of the appropriateness and value of public sector activities and performance. Worse, they act as a disincentive to good performance and can lead to a wide range of perverse activities and outcomes. They provide the illusion rather than the reality of a focus on results and on accountability. This in turn contributes to the paradox that despite more resources and attention devoted to accountability-related activities, the public often feels that government is less and less accountable. This article, after a brief consideration of what is meant by "accountability" and the nature and key implications of a results or outcome orientation, discusses shortcomings of traditional approaches to accountability. It considers reasons for the persistence of an approach to accountability whose shortcomings have been so well documented, and then presents a vision of accountability more in keeping with current realities of public sector management.

The meaning of accountability in a results context

Gray and Jenkins (1993: 55) define accountability as "the obligation to present an account of and answer for the execution of responsibilities to those who entrusted those responsibilities." Bemelmans-Videc (2007), the Canadian Comprehensive Audit Foundation (CCAF, in Leclerc et al. 1996) and others present similar definitions that, however, leave it open to whom and for what one is expected to be accountable, and how one appropriately can answer for a responsibility or power that has been conferred. As Lonsdale and Bemelmans-Videc (2007: 3) observe, "Like honesty and clean water, 'accountability' is invariably seen as a 'good thing'." Nevertheless, Behn (2001) observes, in a book devoted specifically to the meaning of democratic accountability, that no one knows exactly what it means to "hold someone accountable" – except those who are being held accountable, who understand that accountability in practical terms means punishment. Leclerc et al. (1996) and Thomas (2007) make similar observations, referring specifically to the Canadian context. Lack of clarity about what is meant when referring to accountability limits the meaningfulness of the term. In practice, as Light (1993), Leclerc et al. (1996) and others have observed, accountability generally has been viewed as assessing compliance with tightly drawn rules and regulations. Such a traditional approach to accountability may well be appropriate to control against the abuse or misuse of power, to provide assurance that resources have been used for their intended purposes with due regard for fairness and good stewardship, to attest to the accuracy of accounts, and to guard against various forms of fraud and misrepresentation. However, it is less so with respect to performance. But there is another face of accountability that has emerged with New Public Management (NPM) and now overshadows traditional accountability. The Auditor General of Canada (2002) has observed that one of the key purposes of accountability is to lead to improved performance of programs and policies.
But as Dubnick (2005: 378) observes, the underlying assumption that "greater accountability will mean improved performance" remains largely unchallenged. As this article indicates, there is extensive evidence that traditional approaches to accountability are ill suited for the purpose of improving performance, leading to excessive focus on processes and on following procedures. Worse, this frequently results in unintended perverse effects, where accountability practices lose sight of the meaning of the concept. The concept of accountability represents an important ethical and moral principle, basic to the concept and exercise of authority within a democracy (for example, Leclerc et al. 1996; Bemelmans-Videc 2007). A major purpose of accountability (some would say its prime function) is the legitimization of the exercise of authority, including the most appropriate use of public resources. In this sense, accountability can be viewed as an end in itself, with the objective of providing for greater confidence or assurance in what government is doing and how. But from an outcome perspective, one must ask if accountability activities help contribute to more relevant, efficient, and effective public services. Looked at in this light, it is appropriate to ask if certain types of accountability approaches are more effective than others (for example, Jarvis and Thomas 2012; Perrin, Bemelmans-Videc and Lonsdale 2007). If accountability does not contribute in at least some way to improved government performance and effectiveness, then one may very well question its value. Worse is when questionable approaches to accountability negatively impact the legitimacy of our governance system.

A focus on results and what this means for public sector management and accountability

Arguably, the most important change affecting public sector management has been a shift from a primary focus on process or outputs to a focus on outcomes and impacts. Outcomes are fundamentally different in nature from processes, inputs, and outputs (or products). For example, they typically have a long-term trajectory; progress is not linear or incremental in nature and is likely to happen unpredictably, in fits and starts, with the likelihood of tipping points; and outcomes by their very nature may not be evident or measurable for some time. Typically there are numerous steps in the results chain, so that outcomes generally arise indirectly and invariably not just as the result of a single (program) intervention, but in combination with other interventions and the actions of other players, mediated by social, economic and environmental factors. There is no direct cause-and-effect relationship between activities and outcomes. Theories of change that take into account multiple actors, influencing factors, and feedback loops are required, in contrast to linear results chains that are inaccurate and can mislead (Rogers 2008). Rigid plans based upon pre-identified goals and targets (and logic models) are almost certain to become out of date, in response to changing priorities, needs, opportunities, threats, and feedback. "Managing in a context of uncertainty" requires a flexible and adaptive style rather than a rigid approach based upon pre-identified outputs and targets (Freedman 2013; Handy 1995; Mintzberg 1994).
Traditional approaches to accountability "customarily become associated with the judgment of whether a program or a policy has achieved its objectives" (Lehtonen 2005: 175), invariably through checking compliance against predetermined targets. But as the Canadian Comprehensive Audit Foundation has acknowledged, "rendering an account against an original plan without taking into account changed circumstances . . . is not good accountability" (Leclerc et al. 1996: 34). Assessing performance and rewarding managers or programs for doing what they were expected to do sounds, on the surface, reasonable. But the reality is that this is inappropriate for an outcome focus when it involves assessing performance against pre-defined targets or indicators, as is the norm with most traditional accountability or results-based management (RBM) approaches. As Mayne and Zapico-Goñi (1997) observed: "With an increasingly diverse, interdependent, and uncertain public sector environment, . . . meeting objectives fixed some time ago may not be as important as the capacity to adapt to current and future change. . . In a rapidly changing world, responsive programs should be changing and modifying, and should be rewarded rather than penalized for doing so." Implementing an outcome focus requires a significant change in the ways of thinking and approach to management concerning all aspects of government, including reward mechanisms and accountability approaches. A roundtable discussion convened by the World Bank (Perrin 2006), involving international experts, considered what is needed for governments to move from a focus on outputs to outcomes. While affirming that such a strategic focus is central to the raison d'être of government, the experts highlighted the challenges of such a change, indicating that one should not expect perfection and that, to encourage trying, one should reward "failure," provided that learning comes from this. The Auditor General of Canada (2002: 3) has observed: "Public sector management and governance are changing, becoming more complex and creating new pressures on traditional notions of accountability. Thus it should hardly be surprising that this has not proved easy to implement." An expert meeting OECD convened specifically to discuss challenges to results-focused management and budgeting provides a good illustration of the challenges to a results-oriented approach, and of how it can be sabotaged by inappropriate measures more suited for keeping track of activities. Despite the stated objective and title of the event, many of the countries represented said that outcomes were too difficult to measure and, in particular, too difficult to be useful for accountability because of difficulties of attribution. As a result, there was minimal attention to outcomes, some to outputs, but with primary attention still placed on the development of better means of controlling inputs (Perrin 2002). One may well ask (for example, as Ohemeng and McCall-Thomas 2013 and Radin 2011, among others, have done) if the "results orientation" is more rhetoric than reality.
Connecting accountability to performance measurement

As many have indicated (for example, Perrin 1998, 1999, 2002, 2012; Hatry 2013; Hildebrand and McDavid 2011; McDavid, Huse and Hawthorn 2013; Nielsen and Hunter 2013), performance measures can be a useful tool for management in many circumstances. In particular, performance measures need to be viewed as indicators, as one element of a more comprehensive monitoring and evaluation strategy, for raising rather than answering questions that could then be explored through means ranging from some telephone calls to a comprehensive evaluation. The main problem occurs when consequences are placed upon achieving pre-established targets, such as for external reporting, rewards and punishments, and accountability, leading to distortion and misrepresentation of actual performance. There are numerous reasons why it is inappropriate to use indicators and targets for accountability purposes. Their use is inconsistent with evidence that it is not appropriate to reduce complex undertakings, representing almost all significant public sector initiatives, to one or a small number of (primarily) quantitative indicators (for example, Forss, Marra and Schwartz 2011; Freedman 2013; Handy 1995; Hummelbrunner and Jones 2013a; Mintzberg 1994, 1996; Newcomer 1997; Rogers 2008; Williams and Hummelbrunner 2011). Even more basically, reliance on indicators assumes that they present a valid picture of what they purport to measure. But as Dubnick (2005), Tsoukas (2004) and others have indicated, this is highly questionable on epistemological grounds. The choice of indicators is invariably shaped by negotiation and by bureaucratic and political pressures; thus, in spite of their apparent objectivity, indicators are inherently selective, subjective and value laden. At best, indicators present a narrow window on reality. Inappropriate quantification and a high-stakes approach to accountability often mean that attention is paid to what is easy to measure, to inputs, activities and, perhaps, outputs, rather than to outcomes. This can sabotage a meaningful outcome focus. Consequently, traditional accountability approaches may provide a misleading account of the appropriateness and effectiveness of performance and, worse, lead to perverse incentives and to the undermining of effectiveness. Even when intended as a management tool, RBM approaches frequently become de facto accountability and control mechanisms, or at least are frequently viewed and used in this manner (for example, Gill 2011; Mintzberg 1996). Given these shortcomings, it also is hardly surprising that traditional approaches to accountability, based upon performance measures, fail in their objective of providing for greater confidence in what government is doing and benefits arising.
Performance measures can fail to provide a meaningful account of actual performance

There is extensive evidence documenting the limitations of performance measures in presenting an accurate and meaningful view of actual performance, thereby misrepresenting what programs are actually doing.1 It is only possible to discuss a few of these factors below (but see, for example, Perrin 1998, for discussion of other barriers and considerations that limit the ability of performance measures to give a valid and accurate accounting of actual performance).

Meaningless and inaccurate data

Transocean, the deep-water drilling company responsible in part for the explosion of the BP Deepwater Horizon oil rig in the Gulf of Mexico that killed a number of people and resulted in a major ecological disaster, nevertheless awarded its executives substantial bonuses, claiming in its annual report that: "Notwithstanding the tragic loss of life in the Gulf of Mexico, we achieved an exemplary statistical safety record as measured by our total recordable incident rate and total potential severity rate . . . As measured by these standards, we recorded the best year in safety performance in our company's history" (cited by BBC News 2011, emphasis added). This represents just one poignant example, among many that are well documented in the literature, of how performance indicators can be meaningless. As another example, the European Union Cohesion (regional development) Policy uses numbers of people "employed" and "in training" to hold member states to account. But what do these terms mean? Numbers of people "in employment" can include people in "good", well-paying, steady full-time jobs with a future, as well as others in part-time work of a few days' duration in an abusive situation; "self employment" (which is included as a form of employment) can include street hawkers with insufficient revenue to cover their expenses as well as others running major hi-tech enterprises with a high level of income. Numbers of people "trained" can include those who slept through a half-day session of dubious value (or count the same person multiple times for registering for many such sessions) as well as those who have completed a comprehensive six-month training program (Perrin 2011). There are many cases where definitions of "obvious" measures vary tremendously (for example, "client," which can be, and is, defined in numerous ways, even among agencies providing similar services, and then inappropriately aggregated). Often clerical staff required to submit data, along with busy professionals who resent the time they need to devote to "paperwork" rather than to delivering services, make it up as they go along.

Goal displacement

When indicators become the objective, they can result in "goal displacement," which leads to emphasis on "making the numbers" rather than on doing what the program was supposed to be doing. At best, this can result in performance measures misrepresenting actual performance. Examples of goal displacement are well documented. For example, the UK Work Programme uses a payment-by-results mechanism whereby private sector contractors are paid only on the successful placement of welfare recipients into employment. An independent evaluation (Rees, Taylor and Damm 2013) not only found little success, but also that "creaming" and "parking" were embedded in the approach, where those closest to the labour market received the most help and those with the greatest need received minimal attention.
In an empirical study concerning standardized testing in Ontario public schools, Ohemeng and McCall-Thomas (2013) document how centrally imposed targets for "results" have led to undesired behaviours such as "the opening of the test before test day, giving students the questions before the test, erasing test answers after the exams, teaching to the test, and circumventing the instructions for test administration" (466). There are numerous other ways of meeting targets without improving performance, such as: selective definitions, for example, relabeling dropouts from a program as having moved or even as a "success" in becoming independent; "encouraging" those who do not seem to be doing well to go elsewhere in order to increase the overall success rate; providing a minimal level of service when the target is numbers of people served; hospitals addressing waiting time standards by providing perfunctory attention and thereby stopping the clock while keeping patients waiting for hours to receive any real treatment; and ignoring those who are not going to meet a target, such as call centres that terminate calls that cannot be answered in time. Ways of gaming the system in order to meet targets without addressing real needs are widespread and well documented (for example, Perrin 1998; Thomas 2006; Wheelan 2013).

Incentive to distort results

Holding managers accountable to meet targets can produce strong pressure to misrepresent and to distort results, with potentially devastating consequences. It is now apparent that many of the recent corporate scandals are at least partially the result of inappropriate incentives and inordinate pressure on managers to "meet their numbers" — indeed all too often to "meet their number." For example, Enron indicated in its annual report that it was "laser-focused on earnings per share," and its former chief executive, Kenneth Lay, reportedly received a bonus of US$123M for achieving his target, at the same time that the company was virtually defunct. This is hardly an isolated incident. For example, an article in the Harvard Business Review indicates that the use of targets to determine compensation "encourages managers to lie and cheat, lowballing targets and inflating results, and it penalizes them for telling the truth. It turns business decisions into elaborate exercises in gaming. It sets colleague against colleague, creating distrust and ill will. And it distorts incentives, motivating people to act in ways that run counter to the best interests of their companies" (Jensen 2001: 94). The same sort of thing also happens in the public sector. Linking pay to target achievement is happening more and more, in spite of evidence that pay-for-performance does not work (for example, Bevan 2013; Chartered Institute of Personnel and Development [CIPD] 2009; Toynbee 2013). Threatening to "punish failure" puts a high degree of pressure on even public sector managers to meet their targets, no matter how, if they want to keep their jobs and to preserve their programs. Bevan and Hamblin, citing a UK Audit Commission report, indicate that: "In a culture where managers' jobs depend on achieving specific targets, there will be pressure to meet those targets" (2009: 182).
Studies in jurisdictions such as France (L'Inspection Générale de l'Administration et Inspection Générale de la Police Nationales 2014) and the UK (House of Commons 2014) have documented how police have intentionally manipulated statistics on crime in order for "crime" to more closely match political expectations.

Critical subgroup differences hidden

As Hatry (2013: 25) has observed: "Performance measurement systems typically provide only aggregate outcome data." However, aggregate (average) scores can misrepresent what is really happening. For example, Winston (1993) indicates how a program with 40 percent women participants can achieve a 60 percent "success" rating, with all – or none – of the women being successful. This finding would be hidden by an indicator that only looked at the overall success rate. Similarly, average scores showing overall improvement in household income may disguise the fact that while a few people (typically those in the top decile) are doing better, most people are no better off, with those at the bottom end actually worse off. As Pawson and Tilley (1997) have observed, with almost any program, invariably some people are better off, and others worse off. Performance measurement data that fail to show this are at best misleading.

Inhibits rather than contributes to improved performance

The above discussion indicates how performance measures may fail to provide a meaningful perspective on actual performance, and thus undermine accountability or decision-making-related uses. But more than this, there is also extensive evidence about how traditional accountability approaches combined with results-based schemes based upon indicators adversely affect performance.2

Silo thinking and action

Few significant objectives (for example, poverty reduction, economic development, employment creation, crime reduction, health, climate change) can be addressed without coordinated action across multiple program areas or without the involvement of non-governmental partners. Outcomes also depend upon factors beyond the direct control of a single manager or program. Nevertheless, most reward and accountability structures remain vertical. More appropriate shared accountability mechanisms are rare. This can serve as a powerful disincentive to cooperative action, and may result in cost shifting to other areas and in inappropriate attention to processes and outputs more within the control of a manager.

Diversion of limited resources

Accountability does not come cheaply (for example, Power 1997; Gray and Jenkins 2007; Martin 2005). Resources that otherwise could go toward program improvement or delivering services must instead be devoted to documentation, a major complaint of program staff. As Gray and Jenkins (2007) ask, what should be the appropriate balance between checking and doing? Programs charged with delivering services are expected (often repeatedly) to demonstrate their value and cost effectiveness. Surely the same should apply to accountability-related functions (for example, RBM specialists, audit, inspection, evaluation, other forms of monitoring) that are of value only if they can assist in improving the relevance, effectiveness, and efficiency of public services.
It is incumbent upon all those engaged in accountability to pay attention to whether they are delivering benefits; otherwise they are part of the problem rather than part of the solution.

Less, rather than more, focus on results and on innovation

The sad, supreme irony is that "results-oriented" accountability approaches typically lead to less – rather than to more – focus on outcomes, on innovation and improvement. As Gill (2011) and Mintzberg (1996) have indicated, this approach is rooted in a top-down hierarchical "control" model. A narrow focus on measurement is inconsistent with a focus on change and improvement that requires constant questioning about what else can be done or done better (Senge 1990). The outcome of judging programs or staff by their "numbers" is not empowerment, critical self-examination, and experimentation with new approaches. Instead, it leads to impaired performance, an emphasis on justifying and defending what was done, and a reluctance to admit that improvement is needed. A compliance approach may indeed be appropriate with respect to accountability for the proper use of funds, conformity with legal or safety requirements, and perhaps other matters that can be clearly specified. But a different approach is required regarding accountability for results.

Meaningless for decision making

Perhaps the most frequently mentioned rationale for performance measurement, including (indeed, often particularly) for accountability purposes, is to provide for more informed decision making and budgeting (for example, Hatry 1997). But performance measures, by themselves, are useless for this purpose. As Newcomer (1997: 10) put it: "Performance measurement typically captures quantitative indicators that tell what is occurring with regard to program outputs and perhaps outcomes but, in itself, will not address the how and why questions." Programs may fail to meet targets due to limitations of program design – but also due to faulty management or implementation, under (or over) funding, unique circumstances, inappropriate targets or measurement at the wrong time, or for other reasons (Perrin 1998). Thus performance measures by themselves provide no direct implications for action, unless other means are used to explore the reasons for results and the potential for future impact. Indeed, using them for accountability purposes, including making decisions about the future of programs without other forms of supporting evidence, is likely to lead to inappropriate actions.

Less rather than more confidence in government

Lonsdale and Bemelmans-Videc (2007: 3) have indicated that in spite of the "paradox of there being more accountability-related activities today
A study concerning accountability for the funding of child-care services in Canada found a lack of interest and trust in public reporting, with widespread distrust about the veracity of government figures (Anderson and Findlay 2010). While many factors are at play, current accountability approaches do little to address public concerns about the effective use or misuse of government resources. This is hardly surprising given the failure of accountability approaches to reflect actual performance and meaningful results, what is seen as excessive and inappropriate bureaucracy and sophistry with statistics, and the failure to support improved performance.

Why such persistence with an inappropriate approach to accountability?

There has been extensive documentation over the years of the limitations of traditional approaches to accountability and RBM – within Canada as well as in numerous other jurisdictions, including several articles in recent issues of this journal (for example, Anderson and Findlay 2010; Hildebrand and McDavid 2011; Ohemeng and McCall-Thomas 2013; Thomas 2007). These limitations and distortions are, or at least should be, well known. Nevertheless, there is an apparent reluctance to consider anything different and, indeed, a proliferation of RBM approaches and accountability measures that reflect many of the difficulties with these approaches. Why is this?

Do governments, or at least politicians and high-ranking officials, really want to improve? Perhaps politicians and other senior officials are not really concerned about the reality of government performance, but primarily want a means of making themselves look good (and a means of blaming others where there are problems) and a way of symbolically suggesting that they are concerned about accountability and results? Perhaps the "results agenda" is more a matter of rhetoric and ideology than of reality and rational analysis? This is, arguably, a cynical perspective. But it is one frequently suggested by observers (for example, Radin 2011), and it is one that I hear frequently, in many different jurisdictions, from people both inside and outside of government. One can, perhaps, understand this attitude among politicians and some managers who might feel threatened by a meaningful results orientation. But I am convinced that most public officials are in public service because they are concerned about doing the best for their constituency. And is there not an obligation for auditors, evaluators, and others engaged in the accountability enterprise to speak truth to power, rather than exacerbating the problem?

Control trumps good management. One reason for accountability is to control. Government officials want control over what is done with "their" funds and over how the direction they have mandated is implemented. The problem – or at least one of the problems – with this approach is that "control" approaches simply do not work, at least with respect to bringing out the best in managers and staff. The control or machine model of human resources management went out of fashion some time ago because it is ineffective. A top-down, command-and-control approach to management is not suitable for creating a meaningful results focus (for example, see Mintzberg 1996). As a report of an expert OECD meeting considering challenges to results-focused management and budgeting observed: "One can order people to undertake specific activities. But it is impossible to order or to direct people how to think or what to believe.
Indeed, this is most likely to be counterproductive" (Perrin 2002: 14). There are much more effective ways of getting others to do what one wants.3 Leadership from the top is needed to bring about and to support needed organizational renewal and change. So, why the persistence with control mechanisms that impede performance? Is the appearance of exerting power and the illusion of control more important than what actually results from it? As Freedman (2013: 557) states: "Power certainly could become an end in itself, a source of status and opportunities to boss others around [that is] detrimental to overall efficiency as well as to morale."

Demand for simple (or simplistic) answers to complex policy problems. There often is a demand for simplicity, in particular from politicians, the media, and the public. Thirty-second sound bites have become the medium for communicating governance issues. Explaining a complex issue through just a few indicators, however, confuses being simple with being simplistic. In part, this demand follows from an inappropriate obsession with numbers and quantification, a simplistic belief that numbers are never wrong.

Unaware of alternatives. Even those well aware of the limitations of indicators sometimes do not seem able to conceive of any other approach, and instead call for more of the same, just done better, in spite of the substantial evidence that indicators, by themselves, are almost certain to be misused, in both the private (for example, The Economist 2002; Handy 1995; Jensen 2001; Wheelan 2013) and public sectors (for example, Bevan and Hood 2006; Le Grand 2010; Perrin 2008). There seems to be a lack of openness to considering means of assessing performance and of providing accountability other than through quantitative indicators, despite ample evidence of their shortcomings.

Resistance to change. Inertia is a powerful force. Why change unless one is really forced to do so? And there are many with an entrenched interest in staying with the status quo, which is generally easier – at least in the short run. A shift to a true outcome orientation, including realignment of accountability approaches, represents a profound paradigm shift and a major change in mindset. The implications of moving to a focus on outcomes rather than on process are still not widely understood or accepted. There is a still limited but increasing array of resources available about how to manage for outcomes in a way that embraces complexity.4 Organizations that remain static and fail to evolve and improve soon become out of date and may struggle to survive, at least in the long term.

A better approach to accountability

Perrin, Bemelmans-Videc and Lonsdale (2007) present a vision of accountability appropriate to undertakings that are intended to achieve outcomes, that take place in a complex policy environment, and that are inevitably influenced by a variety of factors. This approach to accountability encompasses three essential characteristics:
A primary orientation toward results rather than process.
A focus on continuous and responsive learning.
A dynamic rather than a static approach that reflects the complexities and uncertainties inherent in most public policy areas.

This model of accountability involves holding programs accountable for asking the tough questions about what works and why, for innovating and engaging in risk taking rather than playing it safe, and for seeking – and using – feedback. Holding programs accountable for asking the difficult questions, for doing and using evaluations, and for demonstrating the use of learning – such as through changes in policies and in program approaches – may represent a harder standard than demonstrating compliance with procedures, as under traditional accountability.

This approach to accountability is based upon the following principles:
Acting responsibly – being trustworthy, being true to the mandate, and demonstrating responsibility in taking decisions (for example, Gregory 1995).
Addressing the overall need or rationale for why the program is in place.
Doing the best possible job given the circumstances, resources and constraints, consistent with the overall mandate (for example, Light 1993).

In short, programs should be accountable for demonstrating good management and for keeping outcomes in view, which includes (but definitely is not limited to) a true results orientation. Such an approach to accountability requires a change in the processes that characterize traditional compliance-oriented accountability approaches.

This approach is consistent with views advocated by many respected authorities. For example, it is consistent with the approach advocated by the Auditor General of Canada (2002: 10): "Our enhanced concept of accountability supports managing for results and hence a culture of learning. It asks ministers and managers to demonstrate credibly that they are learning (from mistakes as well as successes), taking corrective action where appropriate, and following up on weaknesses, rather than focussing only on who is at fault when things go wrong." The Auditor General further says that holding to account for results asks whether "everything reasonable has been done with available authorities and resources to influence the achievement of expected results" (2002: 8) and that, while people should accept responsibility for their mistakes: "If we wish to empower employees and encourage them to innovate . . . we should focus on learning from the experience rather than assigning blame" (2002: 10). Mayne (2007) has further elaborated upon these points.

The World Bank roundtable discussion of experts identified a number of principles for moving towards a true results-oriented approach:
Recognition that a results-oriented focus represents a new way of thinking and managing that needs to be reflected in all aspects of management.
Need for strong, demonstrated, tangible and visible commitment from the top political and administrative levels, in order to provide legitimacy and priority to an outcome orientation and to mobilize resources as required.
Need for bottom-up support and engagement; otherwise an outcome focus runs the risk of becoming a mere administrative exercise rather than an actual change in approach.
Need to support and reward innovation and risk taking, being careful not to punish those who try, even if initial efforts are not perfect.
Need to be strategic, relating all aspects of the results-oriented approach to the strategic direction and goals. Monitoring and evaluation approaches should not be developed in a vacuum, but in response to the information requirements that will be needed to inform decisions and future directions.

One might take the approach developed initially within the Canadian Office of the Auditor General (Mayne 2001, 2011), where indicators, along with other sources of information, are used to develop the performance story, or, as Winston (1999) has put it, one assembles a variety of forms of data in order to provide performance information that is balanced between being formative (intended to improve existing policies and programs) and summative (intended to publicly sum up achievements and shortcomings).

A corollary of the above is that there is a need for program evaluation as well as for monitoring (or RBM, or performance management) as part of a more comprehensive monitoring and evaluation strategy, in order to provide a meaningful picture of how effectively programs are moving towards outcomes, and in particular to provide the explanation that can better inform improvements as well as future policies and strategies. It makes little sense, as is too frequently the case, to have RBM and evaluation functions completely separate (for example, see Newcomer and Brass 2015, or Nielsen and Hunter 2013, a special issue of New Directions for Evaluation devoted specifically to this topic).

Conclusion and next steps

There is no question that implementing the above vision represents a hard sell. In particular, it requires recognition that performance information regarding what is really important cannot be reduced to a few numbers in a database. This may represent a different view of the world for many of those whose primary training and expertise have been with such means, and where "accountability", as Behn (2001) has stated, is often equated with punishment. Given the above, how can one get leadership buy-in for a more meaningful approach to accountability in political cultures where, some say, the adversarial nature of politics means that any moves to decouple performance from a hard-edged view are regarded, at best, with scepticism? It is well beyond the scope of this article to discuss this in any detail, and I can only offer a couple of ideas.

First, there is a need for greater public acknowledgement of the widely known shortcomings of current approaches to accountability, and then for discussion about possible alternatives. Perhaps this journal and/or others with an interest in the area (for example, the Institute of Public Administration of Canada, Canada's Treasury Board or the Office of the Auditor General) could provide a forum where leading thinkers, including public sector leaders, can discuss such considerations. Internationally, a Global Parliamentarian Forum on Evaluation is being launched during 2015. This might provide an opportunity for those parliamentarians interested in outcome-oriented and effective public services to discuss with their peers possible roles for parliamentarians in supporting such a change.
Moving to a model of accountability appropriate to the current realities of public governance represents a major change effort. Nevertheless, if reforms are not undertaken, "accountability" practices will continue to become more and more rhetorical, inhibiting improved performance and contributing to less rather than more confidence in government.

Notes

1 For example, Bemelmans-Videc et al. 2007; Behn 2001; CIPD 2009; de Lancer Julnes 2006; Feller 2002; Hummelbrunner and Jones 2013a, 2013b; McDavid and Huse 2012; McDavid, Huse, and Hawthorn 2013; Nielsen and Hunter 2013; Perrin 1998, 2002; Pollitt 2013; Power 1997; Thomas 2006; van der Knaap 2006; Winston 1999.
2 For example, see Note 1.
3 Discussion of this is well beyond the scope of this article, but see, for example: Behn 2004; CIPD 2009; Mintzberg 1996; Hummelbrunner and Jones 2013a, 2013b; Perrin 2002, 2006; or almost any human resources text.
4 To give but some examples: Bemelmans-Videc et al. 2007; Behn 2001; Forss, Marra and Schwartz 2011; Perrin 2002, 2007; Williams and Hummelbrunner 2011; Williams and Imam 2007.

References

Anderson, Lynell, and Tammy Findlay. 2010. "Does public reporting measure up? Federalism, accountability and child-care policy in Canada." Canadian Public Administration 53 (3): 417–438.
Auditor General of Canada. 2002. Modernizing Accountability in the Public Sector. Chapter 9, Report of the Auditor General of Canada to the House of Commons. Ottawa.
BBC News. 2011. "Transocean gives bonuses after Gulf of Mexico BP spill." 3 April.
Behn, Robert D. 2001. Rethinking Democratic Accountability. Washington: Brookings.
——. 2004. "Performance leadership: 11 better practices that can ratchet up performance." IBM Center for the Business of Government. Available at: http://www.businessofgovernment.org/pdfs/Behn_Report.pdf.
Bemelmans-Videc, Marie-Louise. 2007. "Accountability, a classic concept in modern contexts: Implications for evaluation and for auditing roles." In Making Accountability Work: Dilemmas for Evaluation and for Audit, edited by Marie-Louise Bemelmans-Videc, Jeremy Lonsdale, and Burt Perrin. New Brunswick, NJ: Transaction Publishing.
Bemelmans-Videc, Marie-Louise, Jeremy Lonsdale, and Burt Perrin, eds. 2007. Making Accountability Work: Dilemmas for Evaluation and for Audit. New Brunswick, NJ: Transaction Publishing.
Bevan, Steven. 2013. "Performance pay won't perform in the classroom." The Work Foundation Blog. Posted 22 October. Available at: http://www.theworkfoundation.com/blog/1438/Performance-pay-wont-perform-in-the-classroom#.UoH7SJBk0hQ.email.
Bevan, Gwyn, and Christopher Hood. 2006. "What's measured is what matters: Targets and gaming in the English public health care system." Public Administration 84 (3): 517–538.
Bevan, Gwyn, and Richard Hamblin. 2009. "Hitting and missing targets by ambulance services for emergency calls: Effects of different systems of performance measurement within the UK." Journal of the Royal Statistical Society 172, Part 1: 161–190.
Chartered Institute of Personnel and Development (CIPD). 2009. Performance Management in Action: Current Trends and Practice. London: author.
de Lancer Julnes, Patria. 2006. "Performance measurement: An effective tool for government accountability? The debate goes on." Evaluation 12 (2): 219–235.
Dubnick, Melvin. 2005. Public Performance & Management Review 28 (3): 376–417.
Economist, The. 2002. "Why honesty is the best policy: Corporate deceit is a slippery slope." The Economist 362 (8263), March.
Feller, Irwin. 2002. "Performance measurement redux." American Journal of Evaluation 23 (4): 435–452.
Forss, Kim, Mita Marra, and Robert Schwartz, eds. 2011. Evaluating the Complex: Attribution, Contribution, and Beyond. New Brunswick, NJ: Transaction Publishers.
Freedman, Lawrence. 2013. Strategy: A History. New York: Oxford University Press.
Gill, Derek, ed. 2011. The Iron Cage Recreated: The Performance Management of State Organizations in New Zealand. Wellington: Institute of Policy Studies, Victoria University.
Gray, Andrew, and Bill Jenkins. 1993. "Codes of accountability in the new public sector." Accounting, Auditing, and Accountability Journal 6 (3): 52–67.
——. 2007. "Checking out? Accountability and evaluation in the British regulatory state." In Making Accountability Work: Dilemmas for Evaluation and for Audit, edited by Marie-Louise Bemelmans-Videc, Jeremy Lonsdale, and Burt Perrin. New Brunswick, NJ: Transaction Publishing.
Gregory, Robert. 1995. "Accountability, responsibility and corruption: Managing the public production process." In The State Under Contract, edited by Jonathan Boston. Wellington, NZ: Bridget Williams Books.
Handy, Charles B. 1995. Beyond Certainty: The Changing Worlds of Organizations. Boston: Harvard Business School Press.
Hatry, Harry P. 1997. "Where the rubber meets the road: Performance measurement for state and local public agencies." In Using Performance Measurement to Improve Public and Non-profit Programs, edited by Kathryn E. Newcomer. New Directions for Evaluation No. 75. San Francisco: Jossey-Bass.
——. 2013. "Sorting the relationship among performance measurement, program evaluation, and performance management." In Performance Management and Evaluation, edited by Steffen Bohni Nielsen and David E. K. Hunter. New Directions for Evaluation No. 137. San Francisco: Jossey-Bass.
Hildebrand, Richard, and James C. McDavid. 2011. "Joining public accountability and performance management: A case study of Lethbridge, Alberta." Canadian Public Administration 54 (1): 41–72.
House of Commons (UK) Public Administration Select Committee (PASC). 2014. Caught Red-Handed: Why We Can't Count on Police Recorded Crime Statistics. Thirteenth Report of Session 2013–14. London: The Stationery Office Limited. Also available at: http://www.publications.parliament.uk/pa/cm201314/cmselect/cmpubadm/760/760.pdf.
Hummelbrunner, Richard, and Harry Jones. 2013a. A Guide to Managing in the Face of Complexity. Background Note. London: Overseas Development Institute. Available at: http://www.odi.org.uk/sites/odi.org.uk/files/odi-assets/publications-opinion-files/8662.pdf.
——. 2013b. A Guide for Planning and Strategy Development in the Face of Complexity. Background Note. London: Overseas Development Institute. Available at: http://www.odi.org.uk/sites/odi.org.uk/files/odi-assets/publications-opinion-files/8287.pdf.
L'Inspection Générale de l'Administration et Inspection Générale de la Police Nationales. 2014. L'Enregistrement des Plaintes par les Forces de Sécurité Intérieure sur le Ressort de la Préfecture de Police. N 14-011/13-093/01 et IGPN-SG-N 13-1000 I. Paris: Ministère de l'Intérieur. Available at: http://www.ladocumentationfrancaise.fr/var/storage/rapports-publics/144000135/0000.pdf.
Jarvis, Mark D., and Paul Thomas. 2012. "The limits of accountability: What can and cannot be accomplished in the dialectics of accountability?" In From New Public Management to New Political Governance: Essays in Honour of Peter C. Aucoin, edited by Herman Bakvis and Mark D. Jarvis. Montreal: McGill-Queen's University Press.
Leclerc, Guy, David Moynagh, Jean-Pierre Boisclair, and Hugh R. Hanson. 1996. Accountability, Performance Reporting, Comprehensive Audit – An Integrated Perspective. Ottawa: Canadian Comprehensive Audit Foundation (CCAF).
Le Grand, Julian. 2010. "Knights and knaves return: Public service motivation and the delivery of public services." International Public Management Journal 13 (1): 56–71.
Lehtonen, Markku. 2005. "OECD environmental performance review programme: Accountability (f)or learning?" Evaluation 11 (2): 169–188.
Light, Paul C. 1993. Monitoring Government: Inspectors General and the Search for Accountability. Washington: Brookings Institution Press.
Lonsdale, Jeremy, and Marie-Louise Bemelmans-Videc. 2007. In Making Accountability Work: Dilemmas for Evaluation and for Audit, edited by Marie-Louise Bemelmans-Videc, Jeremy Lonsdale, and Burt Perrin. New Brunswick, NJ: Transaction Publishing.
Martin, Steve. 2005. "Evaluation, inspection and the improvement agenda: Contrasting fortunes in an era of evidence-based policy making." Evaluation 11 (4): 496–504.
Mayne, John. 2001. "Addressing attribution through contribution analysis: Using performance measures sensibly." Canadian Journal of Program Evaluation 16 (1): 1–24.
——. 2007. "Evaluation for accountability: Myth or reality?" In Making Accountability Work: Dilemmas for Evaluation and for Audit, edited by Marie-Louise Bemelmans-Videc, Jeremy Lonsdale, and Burt Perrin. New Brunswick, NJ: Transaction Publishing.
——. 2011. "Contribution analysis: Addressing cause and effect." In Evaluating the Complex: Attribution, Contribution, and Beyond, edited by Kim Forss, Mita Marra, and Robert Schwartz. New Brunswick, NJ: Transaction Publishers.
Mayne, John, and Eduardo Zapico-Goñi, eds. 1997. Monitoring Performance in the Public Sector: Future Directions from International Experience. New Brunswick, NJ: Transaction Publishers.
McDavid, James C., and Irene Huse. 2012. "Legislator uses of public performance reports: Findings from a five-year study." American Journal of Evaluation 33 (1): 7–25.
McDavid, James C., Irene Huse, and Laura R.L. Hawthorn. 2013. Program Evaluation and Performance Measurement: An Introduction to Practice. Thousand Oaks, CA: Sage.
Mintzberg, Henry. 1994. The Rise and Fall of Strategic Planning. New York: The Free Press.
——. 1996. "Managing government, governing management." Harvard Business Review: 75–83.
Newcomer, Kathryn E. 1997. "Using performance measurement to improve public and nonprofit programs." In Using Performance Measurement to Improve Public and Nonprofit Programs, edited by Kathryn E. Newcomer. New Directions for Evaluation No. 75. San Francisco: Jossey-Bass.
Newcomer, Kathryn E., and C. Brass. 2015. "Forging a strategic and comprehensive approach to evaluation within public and nonprofit organizations: Integrating measurement and analytics within evaluation." American Journal of Evaluation. In press.
Nielsen, Steffen Bohni, and David E. K. Hunter, eds. 2013. Performance Management and Evaluation. New Directions for Evaluation No. 137. San Francisco: Jossey-Bass.
Ohemeng, Frank, and Elyse McCall-Thomas. 2013. "Performance management and 'undesirable' organizational behaviour: Standardized testing in Ontario schools." Canadian Public Administration 56 (3): 456–477.
Pawson, Ray, and Nick Tilley. 1997. Realistic Evaluation. London: Sage Publications.
Perrin, Burt. 1998. "Effective use and misuse of performance measurement." American Journal of Evaluation 19 (3): 367–379.
——. 1999. "Performance measurement: Does the reality match the rhetoric? A rejoinder to Bernstein and Winston." American Journal of Evaluation 20 (1): 101–110.
——. 2002. "Implementing the vision: Addressing challenges to results-focused management and budgeting." Paris: OECD. Available at: http://www.oecd.org/dataoecd/4/10/2497163.pdf.
——. 2006. Moving from Outputs to Outcomes: Practical Advice from Governments Around the World. Washington: World Bank and IBM Center for the Business of Government. Available at: http://www.worldbank.org/oed/outcomesroundtable/outputs_to_outcomes.pdf.
——. 2007. "Towards a new view of accountability." In Making Accountability Work: Dilemmas for Evaluation and for Audit, edited by Marie-Louise Bemelmans-Videc, Jeremy Lonsdale, and Burt Perrin. New Brunswick, NJ: Transaction Publishing.
——. 2011. "What is a true results-performance-based delivery system?" Invited expert presentation to the European Parliament. Evaluation 17 (4): 417–424.
——. 2012. Linking Monitoring and Evaluation to Impact Evaluation. Guidance Note No. 2. Washington: InterAction. Available at: http://www.interaction.org/impact-evaluation-notes.
Perrin, Burt, Marie-Louise Bemelmans-Videc, and Jeremy Lonsdale. 2007. "How evaluation and auditing can help bring accountability into the twenty-first century." In Making Accountability Work: Dilemmas for Evaluation and for Audit, edited by Marie-Louise Bemelmans-Videc, Jeremy Lonsdale, and Burt Perrin. New Brunswick, NJ: Transaction Publishing.
Pollitt, Christopher. 2013. "The logics of performance management." Evaluation 19: 346–363.
Power, Michael. 1997. The Audit Society: Rituals of Verification. Oxford: Oxford University Press.
Radin, Beryl A. 2011. "Does performance measurement actually improve accountability?" In Accountable Governance: Problems and Promises, edited by Melvin J. Dubnick and H. George Frederickson. Armonk, NY: M.E. Sharpe, 98–110.
Rees, James, Rebecca Taylor, and Chris Damm. 2013. Does Sector Matter? Understanding the Experiences of Providers in the Work Programme. Third Sector Research Centre Working Paper 92. Available at: http://www.birmingham.ac.uk/generic/tsrc/documents/tsrc/working-papers/working-paper-92.pdf.
Rogers, Patricia. 2008. "Using programme theory to evaluate complicated and complex aspects of interventions." Evaluation 14 (1): 29–48.
Senge, Peter. 1990. The Fifth Discipline: The Art and Practice of the Learning Organization. New York: Doubleday/Currency.
Thomas, Paul G. 2006. Performance Measurement, Reporting, Obstacles and Accountability: Recent Trends and Future Directions. Canberra: The Australian National University.
——. 2007. "Reflections of a 90s-era editor (the 1990s, that is, not the 1890s!)." Canadian Public Administration 50: 513–516.
Toynbee, Polly. 2013. "The push for performance-related pay is driven by faith, not facts." The Guardian, 12 November. Available at: http://www.theguardian.com/commentisfree/2013/nov/12/performance-related-pay-risky-behaviour.
Tsoukas, Haridimos. 2004. "The tyranny of light." In Complex Knowledge: Studies in Organizational Epistemology. Oxford: Oxford University Press.
van der Knaap, Peter. 2006. "Responsive evaluation and performance management: Overcoming the downsides of policy objectives and performance indicators." Evaluation 12 (3).
Wheelan, Charles. 2013. Naked Statistics. New York: Norton.
Williams, Bob, and Iraj Imam, eds. 2007. Systems Concepts in Evaluation: An Anthology. Point Reyes, CA: EdgePress of Inverness.
Williams, Bob, and Richard Hummelbrunner. 2011. Systems Concepts in Action: A Practitioner's Toolkit. Stanford, CA: Stanford University Press.
Winston, Jerome A. 1993. "Performance indicators: Do they perform?" Evaluation News and Comment 2 (3): 22–29.
——. 1999. "Understanding performance measurement: A response to Perrin." American Journal of Evaluation 20 (1): 95–100.

Reviewers / Évaluateurs

CANADIAN PUBLIC ADMINISTRATION is Canada's foremost venue for the analysis of public management issues. The journal's high quality is heavily shaped by the excellent and often inadequately appreciated contributions of those public servants and scholars who carefully referee our manuscript submissions. These referees probe the logic of arguments, the soundness of evidence and the relevance of submissions to a diverse group of readers. Their thoughtful comments, which often embody hours of work, substantially improve our publication. As a small recognition, the journal hereby acknowledges the contributions and dedication of the reviewers noted below, who assessed manuscripts during 2014.

Christopher Alcantara, Catherine Althaus, Herman Bakvis, Colin Bennett, Laurence Bherer, Robert Bish, Deborah Blackman, Christian Bordeleau, Sandford Borins, Emilio Bouliane, Jacques Bourgault, Astrid Brousselle, Dale Christenson, Ian Clark, Amanda Clarke, Jonathan Craft, Christian Dagenais, Raisa Deber, Paul Dufour, Peter Elson, Chantal Faucher, Alan Fenna, Katherine Fierlbeck, Yves Gingras, Maurice Gosselin, J.I. Gow, Mary Hall, Brian Head, Damian Hodgson, Brett Holfeld, Michael Howlett, Irene Huse, Louis Imbeau, Steve Jacob, Kenneth Kernaghan, Tom Klassen, John Langford, Alexandre Laurin, Claude Laurin, Jonathan Malloy, Daniel Maltais, Patrik Marier, Justin Massie, Tim Mau, John Mayne, Jim McDavid, Kathleen McNutt, Jennifer Menzies, Frederic Merand, Andrea Migone, Jason Millar, Eric Monpetit, Lynne Moore, Dwight Newman, Joshua Newman, Frank Ohemeng, Mathieu Ouimet, Pamela Palmeter, Andrew Parkin, Charles Pascal, Robert Pellerin, Burt Perrin, Michael Prince, Ken Rasmussen, Paul Ross, Christian Rouillard, Denis Roy, Jeffrey Roy, Peter Russell, Robert Shepherd, David Siegel, Lynne Siemens, Matti Siemiatycki, Lorne Sossin, John Sullins, Nancy Taylor, Paul Thomas, Heather Treacy, Diane-Gabrielle Tremblay, Marie-Soleil Tremblay, Allan Tupper, Jean Turgeon, Richard Van Loon, Graham White, Tom Wileman

Information for subscribers

Canadian Public Administration is published in four issues per year.
Institutional subscription prices for 2015 are: Print & Online: US$507 (US), US$545 (Rest of World), €353 (Europe), £280 (UK). Prices are exclusive of tax. Asia-Pacific GST, Canadian GST and European VAT will be applied at the appropriate rates. For more information on current tax rates, please go to www.wileyonlinelibrary.com/tax-vat. The price includes online access to the current issue and all online back files to January 1st 2011, where available. For other pricing options, including access information and terms and conditions, please visit www.wileyonlinelibrary.com/access.

Journal Customer Services: For ordering information, claims and any inquiry concerning your journal subscription please go to www.wileycustomerhelp.com/ask or contact your nearest office. Americas: Email: [email protected]; Tel: +1 781 388 8598 or +1 800 835 6770 (toll free in the USA & Canada). Europe, Middle East and Africa: Email: [email protected]; Tel: +44 (0) 1865 778315. Asia Pacific: Email: [email protected]; Tel: +65 6511 8000. Japan: For Japanese speaking support, Email: [email protected]; Tel: +65 6511 8010 or Tel (toll-free): 005 316 50 480. Visit our Online Customer Help, available in 7 languages, at www.wileycustomerhelp.com/ask.

Delivery Terms and Legal Title: Where the subscription price includes print issues and delivery is to the recipient's address, delivery terms are Delivered at Place (DAP); the recipient is responsible for paying any import duty or taxes. Title to all issues transfers FOB our shipping point, freight prepaid. We will endeavour to fulfil claims for missing or damaged copies within six months of publication, within our reasonable discretion and subject to availability.
Back issues: Single issues from current and recent volumes are available at the current single issue price from [email protected]. Earlier issues may be obtained from Periodicals Service Company, 351 Fairview Avenue – Ste 300, Hudson, NY 12534, USA. Tel: +1 518 822-9300; Fax: +1 518 822-9305; Email: [email protected].

Mailing: Canadian Public Administration/Administration publique du Canada is mailed Standard Rate. Mailing to the rest of the world is by IMEX (International Mail Express). Canadian mail is sent under Canadian publications mail agreement number 40573520. POSTMASTER: Send all address changes to Canadian Public Administration/Administration publique du Canada, Journal Customer Services, John Wiley & Sons Inc., 350 Main St., Malden, MA 02148-5020.

Access to this journal is available free online within institutions in the developing world through the AGORA initiative with the FAO, the HINARI initiative with the WHO and the OARE initiative with UNEP. For information, visit www.aginternetwork.org, www.healthinternetwork.org, www.oaresciences.org.

Copyright and Copying: © 2015 The Institute of Public Administration of Canada/L'Institut d'administration publique du Canada. All rights reserved. No part of this publication may be reproduced, stored or transmitted in any form or by any means without the prior permission in writing of the copyright holder. Authorization to copy items for internal and personal use is granted by the copyright holder for libraries and other users registered with their local Reproduction Rights Organisation (RRO), e.g. Copyright Clearance Center (CCC), 222 Rosewood Drive, Danvers, MA 01923, USA (www.copyright.com), provided the appropriate fee is paid directly to the RRO. This consent does not extend to other kinds of copying such as copying for general distribution, for advertising or promotional purposes, for creating new collective works or for resale. Special requests should be addressed to: [email protected].
Online Access: This journal is available online at Wiley Online Library. Visit www.onlinelibrary.wiley.com to search the articles and register for table of contents e-mail alerts.

ISSN 0008-4840 (Print); ISSN 1754-7121 (Online)

For submission instructions, subscriptions and all other information visit: www.wileyonlinelibrary.com.

Canadian Public Administration accepts articles for Open Access publication. Please visit http://olabout.wiley.com/WileyCDA/Section/id-406241.html for further information about OnlineOpen.

Production Editor: Muhammad Haider Md Sahle (email: [email protected]). Advertising: Kristin McCarthy (email: [email protected]).

Periodical Postage Paid at Hoboken, NJ and additional offices. Postmaster: Send all address changes to Canadian Public Administration, John Wiley & Sons Inc., C/O The Sheridan Press, PO Box 465, Hanover, PA 17331.

Wiley's Corporate Citizenship initiative seeks to address the environmental, social, economic, and ethical challenges faced in our business and which are important to our diverse stakeholder groups. Since launching the initiative, we have focused on sharing our content with those in need, enhancing community philanthropy, reducing our carbon impact, creating global guidelines and best practices for paper use, establishing a vendor code of ethics, and engaging our colleagues and other stakeholders in our efforts. Follow our progress at www.wiley.com/go/citizenship.
Disclaimer: The Publisher, the Institute of Public Administration of Canada, and the Editor(s) cannot be held responsible for errors or any consequences arising from the use of information contained in this journal; the views and opinions expressed do not necessarily reflect those of the Publisher, the Institute of Public Administration of Canada, and the Editor(s), nor does the publication of advertisements constitute any endorsement by the Publisher, the Institute of Public Administration of Canada, and the Editor(s) of the products advertised.

Printed in the USA by The Sheridan Group.

Governing in the Now. / Gouverner dans le Moment. IPAC 2015 Annual Conference, Halifax Marriott Harbourfront Hotel, August 2015. INITIATE. INTEGRATE. INNOVATE. / INVITER. INTÉGRER. INNOVER.

Canadian Public Administration / Administration publique du Canada
march⁄mars 2015 | volume 58 | number⁄numéro 1

N. Dubois, J. McDavid, É. Charbonneau, J.-L. Denis / 1
Special Issue on Performance Measurement, Performance Management, and Accountability – Editors' Introduction / Numéro spécial sur la mesure de la performance, gestion de la performance, et imputabilité – Introduction des rédacteurs
original articles ⁄ articles originaux

J. Veillard, B. Tipper, S. Allin / 15
Health system performance reporting in Canada: Bridging theory and practice at pan-Canadian level

G.-C. Thiebaut, F. Champagne, A.-P. Contandriopoulos / 39
Les enjeux de l'évaluation de la performance : Dépasser les mythes

O. Sossa, I. Ganache / 63
L'appréciation de la performance du système de santé et des services sociaux du Québec : L'approche du Commissaire à la santé et au bien-être

É. Charbonneau, G. Divay, D. Gardey / 89
Volatilité dans l'utilisation des indicateurs de performance municipale : bilan et nouvelle perspective d'analyse

É. Charbonneau, F. Bellavance / 110
Performance management in a benchmarking regime: Quebec's Municipal Management Indicators

E. Nason, M.A. O'Neill / 138
What metrics? On the utility of measuring the performance of policy research: An illustrative case and alternative from Employment and Social Development Canada

J. Grundy / 161
Performance measurement in Canadian employment service delivery, 1996–2000

B. Perrin / 183
Bringing accountability up to date with the realities of public sector management in the 21st century

reviewers ⁄ évaluateurs / 204