For those who don't want to click through to the PDF but want to be able to pretend they read the paper to comment, here are the abstract and conclusion sections:<p>Abstract:
Metrics are useful for measuring systems and motivating behaviors. Unfortunately, naive application of metrics to a system can distort the system in ways that undermine the original goal. The problem was noted independently by first Campbell, then Goodhart, and in some forms it is not only common, but unavoidable due to the nature of metrics. There are two distinct but interrelated problems that must be overcome in building better metrics; first, specifying metrics more closely related to the true goals, and second, preventing the recipients from gaming the difference between the reward system and the true goal. This paper describes several approaches to designing metrics, beginning with design considerations and processes, then discussing specific strategies including secrecy, randomization, diversification, and post-hoc specification. The discussion will then address important desiderata and the trade-offs involved in each approach, and examples of how they differ, and how the issues can be addressed. Finally, the paper outlines a process for metric design for practitioners who need to design metrics, and as a basis for further elaboration in specific domains.<p>Conclusion:<p>Despite the intrinsic limitations of metrics, the frequent use of poorly thought-out and badly constructed metrics do not imply that metrics are doomed to eventually fail, or that they should not be used because they will be exploited. Instead, forethought and consideration of the problems with metrics is often worthwhile. This process starts by identifying and agreeing on coherent goals, then considering both what leads to the goals, and what parts of the system can be measured. After identifying measurable parts of the system, and considering how participant behavior might exploit the measurement methods or the measured outcomes, measures can be constructed. 
The construction of these metrics to avoid exploitation may involve multiple diverse measures, secret metrics, intentional reliance on post-hoc specification of details, and randomization. This may also include decisions about where subjective measurements are important, and consideration whether measurement will be beneficial. In building the metrics and deciding whether to implement them, attention should be paid to various important factors in the system, including immediacy of feedback, simplicity and understandability of the measurement system, fairness, and the potential for both actual and appearance of corruption in the metric and reward system.<p>Metric design is an engineering problem, and good solutions involve both science and art. Following these guidelines will not make metrics unexploitable, nor will it keep everyone happy with the results of a process. This is true of metrics used for employees, metrics used for monitoring systems, and even metrics used within machine learning algorithms - in each case, poorly designed metrics will be exploited. Occasionally, the suggested process will lead to investigation of potential improvements or strategies that are ultimately decided against. Despite this, it is a vast improvement on the too-common strategy of using whatever metric seems at first glance to be useful, or deploying metrics without considering what they in fact promote. Putting in the effort to build elegant and efficient solutions won’t fix every problem, but it will lead to less flawed metrics and better results overall.<p>Direct PDF link: <a href="https://mpra.ub.uni-muenchen.de/98288/5/MPRA_paper_98288.pdf" rel="nofollow">https://mpra.ub.uni-muenchen.de/98288/5/MPRA_paper_98288.pdf</a>