3.2 Performance indices should encourage politicians to make good decisions
In a democratic society, the citizen has the right to be
informed about politics, so that:
- she or he can form an opinion on the
importance of a given issue, and on the right way to deal with it (e.g.
“Should income taxes be lowered, if that implies less available money
for supporting families with young children?”);
- she or he can judge whether the
government acts in the preferred way (e.g. lowers or raises income taxes), so
that the voter can decide whether or not to re-elect this
government.
Many nations have laws that guarantee the citizen’s right to be informed
[16]; to my knowledge, these laws or directives [17] do not
deal with the subtle difference between “information” and
“communication”. The right to be informed should imply the right to
be informed in an understandable way, so that the information is effectively
communicated to the citizen.
Indicator systems are a means of communication. Beyond pure
information, they make complex problems digestible by structuring them, by
highlighting what is essential and omitting what is not absolutely necessary to
understand a given issue.
A democratic information system, whether consisting of
indicators, databases, newspaper columns, TV broadcasts or any other form of
reporting and communication, should help the citizen to evaluate the performance
of the elected government.
More specifically, a “Policy Performance Index”
should represent, not determine, the perception of importance of a
given policy issue. As said earlier, we might be scared by the power of a PPI to
drive policy decisions; however, if that power lets politicians take decisions
that are in line with their citizens’ expectations, it would be
beneficial.
As illustrated in Figure 6: Indicators, Media, Voters and Politics, two main
features of a PPI drive policy decisions: the share of the respective issue in
the index, and the policy valuation. If we take the example of GDP (an index
measuring production on the basis of a monetary unit), then the car industry
would have a higher share than the bicycle industry in this index; and for both
industries an annual production increase of 5% would be valued as a
“good” result, while a decrease of 10% within one year would
probably be called a “crisis”.
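The interplay of share and valuation can be sketched in a few lines of code. This is an illustration only, not the formula of any published index: it assumes that each component has a share and a valuation score on a hypothetical scale from 0 (worst) to 1 (best), and that the overall index is the share-weighted mean of the valuations. The function name and all figures are invented for the example.

```python
def composite_index(shares, valuations):
    """Share-weighted mean of component valuations (each on a 0..1 scale)."""
    if set(shares) != set(valuations):
        raise ValueError("shares and valuations must cover the same components")
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    return sum(shares[k] * valuations[k] for k in shares) / total

# Hypothetical figures, chosen only to show the mechanics.
shares = {"economy": 0.5, "social_care": 0.25, "environment": 0.25}
valuations = {"economy": 0.8, "social_care": 0.4, "environment": 0.2}

overall = composite_index(shares, valuations)
print(round(overall, 2))
```

A component with a large share and a poor valuation pulls the overall figure down much more than a small component with the same valuation, which is why both features matter for policy decisions.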
3.2.1 Policy Performance Index: defining the share of the components
While we have a clear idea of how to define the respective shares
of the car and bicycle industries in GDP (through their value added measured in
euros), there are no market prices for issues like poverty, gender equality,
education, CO2 emissions or destruction of habitats.
There is no easily accessible common unit for these issues;
and yet, politicians, when looking at the PPI example above, would probably
declare “my friend, you have exaggerated the share of Environment a
little bit, but I could live with the 35% you attached to Social
Care”.
We all have a feeling for importance, for the weight
that such issues have in policy-making. We intuitively know that in Europe
unemployment is more important than drugs, while in the U.S. it is
the other way round. Quantifying such intuition is not too difficult; for
example, one could put the following question to a representative sample of
citizens:
Question: For the purpose of judging the performance of the government, we
want to construct a “Policy Performance Index”, containing economic, social
and environmental indicators. The weight of the indicators should represent
the importance of each area for policy-making. If you had 100 points to
distribute on the three issues, how many would you give to each of them?
Economy (e.g. GDP, inflation, investments, ...): ___
Social Care (e.g. unemployment, pensions, health system, ...): ___
Environment (e.g. climate change, air pollution, waste, noise, ...): ___
(total: 100%)
This very straightforward method of determining the weights of
an index will work fine as long as the respondent has an opinion on the weight
of the issues in real life. An average citizen with a basic knowledge of
mathematics who occasionally reads newspapers or watches the news on TV will be
perfectly able to allocate 100 points among economy, social care and
environment.
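How such answers could be turned into index weights can be sketched as follows. The procedure is an assumption made for illustration: each respondent's points are rescaled to a total of 100 (in case someone miscounts), and the weights are the plain average over respondents. The respondents and their answers are invented.

```python
from statistics import mean

AREAS = ("economy", "social_care", "environment")

def normalise(answer):
    """Rescale one respondent's points so that they sum to 100."""
    total = sum(answer[a] for a in AREAS)
    if total <= 0:
        raise ValueError("a respondent must allocate at least one point")
    return {a: 100 * answer[a] / total for a in AREAS}

def index_weights(answers):
    """Average the normalised allocations over all respondents."""
    rescaled = [normalise(ans) for ans in answers]
    return {a: mean(r[a] for r in rescaled) for a in AREAS}

# Three hypothetical respondents; the second handed out 120 points by mistake.
answers = [
    {"economy": 50, "social_care": 30, "environment": 20},
    {"economy": 60, "social_care": 40, "environment": 20},
    {"economy": 20, "social_care": 40, "environment": 40},
]

weights = index_weights(answers)
for area, weight in weights.items():
    print(f"{area}: {weight:.1f}")
```

A real survey would also have to weight respondents so that the sample is representative; that step is left out here.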
However, the same citizen will have more difficulty, when
defining aggregation level 2 of the PPI, in allocating 100 points among Social
Care issues like poverty, health system, child care, pension schemes,
education, gender equality, drugs and crime etc. Although citizens could and
should be asked for their opinion on the importance of Social
Care issues, one might get more consistent results if the question
were put to a panel of persons working in this policy area; for example,
senior experts of the health insurance and pension systems, trade unions, the
churches, journalists, doctors, street workers, labour market specialists, and
so on (and it will be interesting to compare how the experts’ perceptions differ
from those of ordinary citizens, and why...).
Even more difficult would be the allocation of the 100-point
budget among the various components that constitute the policy area
“Environment”. Again, one could try asking citizens how
many points they would give to “Climate Change”, and how many to
“Ozone Layer Depletion”. Given that most people do not even
understand the difference between the two issues, one should not expect
meaningful results. It makes sense to “delegate” the definition of
the weights of the environmental sub-index of PPI to a panel of experts who are
perfectly familiar with environmental issues. This method was actually tested
(using a simple “budget allocation” question) in a 1991 survey [18]
among a panel of 660 German senior
environment experts, comprising NGO people, journalists, university professors,
administrators, politicians (including the members of an environmental Bundestag
committee), and industry experts. The results, i.e. the weight attached
to each of the eight items used, are presented below as pie charts:
Figure 8: Defining the shares of the PPI’s environmental sub-index
There is a remarkable consensus on the weight of issues even
between groups that are “ideologically” far apart, like
environmentalists and industry experts. For example, Climate Change was
consistently given about 50% more weight than the depletion of the Ozone
Layer. (Note that in this figure the colours do not represent a
valuation - they just serve to distinguish the eight “policy fields”
used in this survey.)
3.2.2 Policy Performance Index: defining the valuation of the components
3.2.2.1 Valuation and science: the role of basic attitudes
The reader will have noted that the size of the pie charts
above differs: the environmentalists’ pie is much bigger than that of
the industry experts. This reflects the observation that opinions on the overall
importance of environmental problems differ a lot between the main societal
actors in environmental policy.
In the same survey, the panelists had been asked to reveal
their general attitudes towards environmental problems, using four questions
along an “optimism vs. pessimism” axis. The results for the
two most controversial statements are presented below:
Figure 9: Societal actors and differences in basic attitudes
While over 80% of the industry experts were convinced that
science and technology will save us in the end (37% “fully”
agreed!), very few environmentalists shared this view. Politicians also showed a
lot of confidence in progress and science (more than the researchers
themselves). The most skeptical groups are again NGO experts and
journalists.
Neither politicians nor industry experts
“fully” accepted this radical statement. Not surprisingly, the most
pessimistic group were the NGO experts, followed by the journalists. Virtually
none of the industrial and policy experts was fully convinced of the "doomsday
scenario", but 60% of the environmentalists agreed or fully agreed that it was
too late for action.
Striking, but not surprising, is the symmetry between the two
figures. Obviously, it will be difficult to convince environmentalists and
industry representatives to agree on a common judgement of environmental policy
performance. For example, it is likely that a stabilisation of CO2
emissions will be judged “a great success” by industry, but
“another step towards the climate catastrophe” by environmentalists
- while both groups agree, as shown in
Figure 8: Defining the shares of the PPI’s environmental sub-index,
that Climate Change is among the three most important environmental
themes.
One should not expect help from science when trying to
solve this dilemma. Estimates of, for example, the monetary damage of one
kg of CO2 emissions differ by several orders of magnitude, reflecting
again differences in basic attitudes, and the enormous sensitivity of
such valuation methods to changes in assumptions:
Figure 10: Monetary valuation of CO2 emissions and the sensitivity of assumptions
Starting from the neutral assumption that “Climate
Change is a serious problem” (a judgement that is shared even by the
extreme poles of the environmental policy spectrum), any scientist can easily
produce damage estimates that are six orders of magnitude apart - depending on
“simple” assumptions such as whether Climate Change impacts should
be discounted or not, or whether the Canadians will help the Africans or not.
In practice, scientists will not reveal their basic attitudes so openly
(they have a reputation to lose), but published analyses still differ by four
orders of magnitude, a range of 10,000:1. [19]
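How a single assumption can move a damage estimate across orders of magnitude can be shown with plain present-value arithmetic. The sketch below is not taken from the studies cited here; the damage figure, the time horizon and the discount rates are hypothetical and serve only to show the effect of discounting.

```python
import math

def present_value(damage, years, rate):
    """Discount a damage occurring `years` from now at an annual `rate`."""
    return damage / (1 + rate) ** years

damage = 1_000_000.0   # hypothetical damage, in arbitrary money units
years = 200            # hypothetical time horizon

undiscounted = present_value(damage, years, 0.0)
discounted = present_value(damage, years, 0.07)

# Express the gap between the two estimates in powers of ten.
orders_apart = math.log10(undiscounted / discounted)
print(f"{orders_apart:.1f} orders of magnitude")
```

With these made-up inputs the two estimates of the same physical damage differ by close to six orders of magnitude, although nothing but the discount rate has changed.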
3.2.2.2 Objective valuation I: policy targets as “anchors”?
The objective of a Policy Performance Index (PPI) is to inform
the citizen whether the government has done a good or a lousy job. Presenting
“scientific” results that differ by orders of magnitude (depending
on whether the study was financed by Shell or by Greenpeace) obviously will not
have the same political power as the yearly publication of GDP growth and
unemployment rates - non-controversial figures produced by statistical
services.
And the valuation of CO2 damages is only one
example; others may be less controversial, but we cannot wait until a consensus
on “what is a policy success” for 20-30 indicators in the economic,
social and environmental spheres has been reached, especially since the deep
differences between societal groups mean that on many issues a consensus will,
as a matter of principle, never be reached.
Some indicator experts want to use “anchors” for
defining policy success or failure; for example, if a government promised at
the Kyoto summit to stabilise CO2 emissions at the 1990 levels (or to
raise GDP by 3% per year; to push unemployment below 8%; to increase life
expectancy to 99 years; ...), and if the government manages to reach this
target, then this should be considered a policy success.
At first sight, this sounds like a plausible and objective
valuation method. However, the targets approach suffers from two minor
shortcomings:
- Which target should be taken? The Kyoto target
certainly has official legitimation, but only a few years ago the
Intergovernmental Panel on Climate Change (IPCC) asked, with equal
legitimacy, for a reduction of CO2 emissions by 75% (which would
rightly put the climate policies of all UN Member States into the ugly
category “complete failure”). The European Environment Agency
(EEA) has collected approx. 5,000 targets related to environmental policy - who
will define which of them are the “right” and “valid” targets?
- Assuming that we would declare only
government targets as “valid” (not such a bad idea because at least
EU governments are democratically elected): would an intelligent prime minister
ever formulate a target that can only be reached with great sacrifices from the
voters? Or would she/he rather declare targets that will be reached anyway
with a business-as-usual policy, thus making certain that the PPI segment for
the respective indicator (e.g. CO2 emissions) appears in dark green
(= “very good”) shortly before the elections? [20]
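The second shortcoming can be illustrated with a few lines of code. The sketch assumes a simple pass/fail valuation against a declared target; the function name, the emissions index (1990 = 100) and both targets are hypothetical.

```python
def target_met(actual, target, lower_is_better=True):
    """Return True if the indicator reached the declared target."""
    return actual <= target if lower_is_better else actual >= target

# Hypothetical emissions index (1990 = 100); business as usual ends at 104.
business_as_usual = 104

ambitious_target = 75   # demands real cuts
easy_target = 105       # will be reached without any new policy

print(target_met(business_as_usual, ambitious_target))  # False
print(target_met(business_as_usual, easy_target))       # True
```

The same emissions outcome is a failure or a success depending only on which target the government chose to declare.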
3.2.2.3 Objective valuation II: relative performance as “anchors”
In the 1960s, many European countries had unemployment rates
around 1%; inflation was low, and GDP growth was in general higher than
nowadays.
In the 1990s, unemployment reached historical peaks of well
over 10% for some countries; inflation was high, and GDP growth was judged
“insufficient” by political parties, media and even
governments.
If voters had used an “absolute” yardstick for
“unemployment performance”, we would have seen victories of the
opposition in all elections, given that unemployment rates were ten times higher
than in the 1960s, and given that the newspapers and TV news were dominated by
self-appointed economic experts unanimously declaring that GDP growth was too
slow.
Citizens have a feeling for what they can reasonably expect from
their governments; the loud propaganda from both sides does not really impress
them. What they want to know is whether the current government performs well
relative to what it could achieve under the given
constraints; and their yardsticks will usually be:
- how previous governments or opposition parties
have dealt with important issues; and
- how the governments of neighbouring countries cope in comparison.
Generally, what the voter expects as objective
information is a differentiated picture:
Figure 11: Index messages: the Importance of Differentiation
The relevance of an index for politics depends strongly on the
credibility of its message: neither a "deep red" nor a "deep green" will be
taken seriously outside those small sections of the population that either
believe in doomsday scenarios or are convinced that scientific progress will
solve all problems. The message with the greatest political impact is one that
gives a differentiated picture of policy success and failure. For example, the
index in the middle might say “waste problems have been successfully addressed,
but climate policy was a complete failure”. Such a balanced message may help
to define political priorities, and to spend the available "budget" (both in
terms of money, and of the willingness of the population to make other
sacrifices for the environment) in an efficient way.
The overall valuation (i.e. the small circles in the middle)
should rarely differ much from “yellow” - voters know that the
opposition parties aren’t any better; but they will check carefully how
the government performs on issues that voters consider to be important for
themselves.
How can such a differentiated, credible and objective message
be produced? Again, one should not expect help from science: valuations produced
by academics will only by accident be neutral enough to be accepted both by
Greenpeace and by Shell... [21]
However, since the index user expects a relative
valuation anyway, one could formulate simple “benchmarking rules”, to be
uniformly applied to all component indicators, such as:
- The scale for describing the performance of the
current government is delimited by the best (“dark green”) and the
worst performance (“dark red”) of the last five governments:
Figure 12: Relative valuation against past policy performance: the unemployment example
- The scale for determining the performance of the
current government is delimited by the worst and the best of a group of
countries, e.g. the fifteen European Union Member States:
Figure 13: Comparison to countries of the same class: the CO2 emission reductions example
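Both benchmarking rules follow the same logic, which can be sketched as a min-max scale. The sketch makes several assumptions that are not part of the rules as stated above: performance is placed linearly between the worst and the best reference value, values outside that range are clamped, and the scale is cut into five colour classes. The function names and the unemployment figures are invented.

```python
COLOURS = ["dark red", "red", "yellow", "green", "dark green"]

def benchmark_score(value, reference, lower_is_better=True):
    """Place `value` on a 0..1 scale spanned by the worst and best reference values."""
    best = min(reference) if lower_is_better else max(reference)
    worst = max(reference) if lower_is_better else min(reference)
    if best == worst:
        raise ValueError("reference values are identical; no scale can be built")
    score = (worst - value) / (worst - best)
    return min(1.0, max(0.0, score))  # clamp values outside the reference range

def colour(score):
    """Map a 0..1 score to one of five colour classes."""
    return COLOURS[min(int(score * len(COLOURS)), len(COLOURS) - 1)]

# Hypothetical unemployment rates (%) under the last five governments.
past_rates = [4.0, 6.5, 9.0, 11.0, 8.0]
current = 7.5

s = benchmark_score(current, past_rates)
print(round(s, 2), colour(s))
```

For the second rule, the same function can be fed with the values of a group of countries instead of past governments.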
Although such a “benchmarking” procedure can be much
less controversial than e.g. monetary valuation, there will still be enough room
for debates. The figure above, for example, shows countries’ performance
with regard to reducing CO2 emissions in 1996 relative to 1990; the
resulting valuations are thus highly policy-relevant for the Kyoto process.
However, if we had chosen the per capita emissions as the yardstick, then
the picture would look different:
Figure 14: Comparison to countries of the same class: per capita CO2 emissions
Suddenly, Germany (D) loses its green label and becomes a
“serious” case, while Portugal (P) and Spain (E) improve their
performance and appear as “green” countries.
As in the example of CO2 damage monetisation, the
“performance evaluation” is determined by sensitive
assumptions:
- the first graph portrays the efforts of EU
Member States to stabilise or reduce their emissions compared to the 1990
levels, and here Portugal and Spain are among the least successful
countries;
- the second graph introduces an element
of justice: countries with low emissions appear in a better light, even
if their emission trends point dangerously upwards. [22]
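The effect of switching the yardstick can be reproduced with a toy example. The three countries and all figures below are hypothetical and are not the data behind the figures above; the sketch only shows that ranking by change since 1990 and ranking by per capita emissions can give opposite results.

```python
def rank(values):
    """Order country codes from best (lowest value) to worst."""
    return sorted(values, key=values.get)

# Hypothetical countries and figures, for illustration only.
emissions_1990 = {"A": 1000.0, "B": 50.0, "C": 200.0}   # Mt CO2
emissions_1996 = {"A": 900.0, "B": 60.0, "C": 200.0}    # Mt CO2
population = {"A": 80.0, "B": 10.0, "C": 40.0}          # millions

# Yardstick 1: percentage change since 1990 ("reduction" version).
change = {c: 100 * (emissions_1996[c] / emissions_1990[c] - 1) for c in emissions_1990}

# Yardstick 2: tonnes per capita in 1996 ("justice" version).
per_capita = {c: emissions_1996[c] / population[c] for c in emissions_1996}

print(rank(change))      # ['A', 'C', 'B']
print(rank(per_capita))  # ['C', 'B', 'A']
```

Country A leads on reductions but comes last per capita, while country B, with rising emissions, moves up once the per capita yardstick is used.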
However, in contrast to many valuation methods that are accessible only to
the expert community (and sometimes only to the experts who calculated the
estimates...), the two figures above are not “black boxes” - everybody can
understand why there are differences between them, and why some countries are
“greener” than others. Furthermore,
- the numbers themselves are non-controversial
(CO2 emission statistics are relatively solid, compared to other
environment statistics); and
- non-experts can
intuitively grasp the logic of the valuation system (but will still have to
decide for themselves whether they prefer the “reduction” or the
“justice” version for judging their governments’ policy
performance in comparison to other EU states).
Another feature of the “benchmarking” approach is its responsiveness:
modest efforts of a country to solve a problem (i.e. a red spot in the PPI) can
lead to quick improvements in the ranking that determines the valuation.
However, since all “members” of the same class of countries could do
the same, the benchmarking leads to a permanent competition - a country that
neglects a certain policy field can equally quickly become the owner of the
“red light” at the bottom of the classification. If the indicators
are defined according to real policy needs, then this is a healthy competition -
much healthier than the competition we observe for economic growth measured as
GDP.
A “benchmarking” system is also the basis for the
well-known Human Development Index (HDI, see HDR99: The Report, at
http://www.undp.org/hdro/HDI.html),
and several other indices such as the popular “Ecosistema Urbano”,
an index (composed of 20 indicators, see
http://members.tripod.com/legambiente/document/class98.htm)
comparing the environmental performance of 103 Italian cities. Ecosistema Urbano
has now been produced for the fifth time for the NGO Legambiente
(“environment league”, the Italian equivalent of
Friends of the Earth), and is increasingly becoming a standard
management tool for the cities that are so mercilessly ranked every
year.
There are two main disadvantages of the
“benchmarking” approach to valuation:
- It does not reveal policy failure if all members
of a class (e.g. all EU Member States) commit the same errors - for example, not
reducing their CO2 emissions to the levels recommended by the IPCC;
one should balance this disadvantage, however, against the political weakness of
such far-away targets.
- It requires steady (with
regard to comparability over time) and/or internationally compatible indicator
sets; such sets exist for OECD and EU Member States (Eurostat and EEA
publications), but for Developing Countries progress is slow, and depends
strongly on the successful testing and implementation of the UN CSD indicator set.
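The first of these disadvantages can be made concrete with a short sketch. It assumes a linear min-max scale within the class of countries, which is only one possible reading of the benchmarking rules; the countries and emission indices are hypothetical.

```python
def relative_scores(values):
    """Min-max scores within a class; lower values (e.g. emissions) score higher."""
    best, worst = min(values.values()), max(values.values())
    if best == worst:
        raise ValueError("all members perform identically; no scale can be built")
    return {k: (worst - v) / (worst - best) for k, v in values.items()}

# Hypothetical emission indices (1990 = 100) for three countries.
scenario_good = {"A": 70.0, "B": 80.0, "C": 90.0}

# Same ranking, but every country emits 40 index points more.
scenario_bad = {k: v + 40.0 for k, v in scenario_good.items()}

print(relative_scores(scenario_good) == relative_scores(scenario_bad))  # True
```

Because only the position within the class counts, the colours stay the same when all members get worse by the same amount.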
[16] For the U.S. “Freedom of Information Act” see
http://foia.state.gov/about.htm
[17] European Union: see for
example the Council Directive 90/313/EEC of 7 June 1990 on the freedom of access
to information on the environment, Official Journal L 158, 23/06/1990, p. 0056–0058,
http://europa.eu.int/eur-lex/en/lif/dat/1990/en_390L0313.html
[18] Conducted by the
University of Mannheim (Forschungsstelle für Gesellschaftliche
Entwicklungen, FGE)
[19] "Reasonable people
find environmental externalities from the production of electricity to be
anywhere from 0.01 mils per kilowatt hour to over 100 mils per kilowatt hour, a
range of four orders of magnitude." Stephen Wiel (Lawrence Berkeley
Laboratories): The Science and Art of Valuing Externalities: A Recent History of
Electricity Sector Experiences. DG XII/IEA ExternE Workshop, 26.1.1995
[20] A closer look at the
Kyoto targets will convince the reader that governments choose the second
solution.
[21] A common belief is that
using lots of “red warning lights” would force politicians to act
more sustainably; but even the IPCC target, “minus 75% CO2
emissions or there will be a catastrophe”, was completely ignored by
politicians, and subsequently made ridiculous by the Kyoto targets. However,
politicians never ignore GDP changes. We should accept that figures are
not a substitute for scientists’ warnings; and that such warnings must
find better channels to conquer the political agenda, e.g. through scenarios
translated into TV serials...
[22] It is noteworthy that
the logic of the Kyoto process seems to be closer to the second version,
allowing Portugal significant emission increases.