A popular idea in government circles at the moment is that the quantity and quality of research produced in a country can be improved by introducing what could be called a research assessment system.
The pioneering system of this kind was the research assessment exercise, or RAE, introduced in the UK under Margaret Thatcher as long ago as 1986. The RAE continued in the UK until 2008, when it was replaced by a new research assessment system, known as the REF. Now there are plans to introduce research assessment systems in other countries, such as France and Italy. Naturally all these research assessment systems differ from each other in detail, but they do have a common feature which can be used to define what is meant by such a system. I will give this definition in the next paragraph, and then argue that research assessment systems, as thus characterised, have a fundamental flaw. As a result of this flaw, the effect of introducing such a system is to lower the quality of research produced, instead of raising it.
I will define a research assessment system as a system in which groups of researchers are assessed at intervals. If the assessment is good, the group retains its funding or gets more, while, if the assessment is bad, the group’s funds are reduced or perhaps removed altogether.
Now a research assessment system might seem, at first sight, to be an obvious and common sense procedure. We want to produce good research. So let us first find out who the good researchers are by an assessment, and then give funding to good researchers while removing it from bad researchers. In this way we will obviously improve the quality of the research produced. What could be wrong with such a system?
In this short article, I will examine the problem in the context of research in the natural sciences. The situation is somewhat different in other areas. For example, in my (2011), I argue that the damaging effect of research assessment systems is greater in economics than in the natural sciences, but there is not space to go into this problem here.
The fundamental flaw in a research assessment system for the natural sciences is shown by a study of the history of science. Such a study shows that it is not in fact possible for researchers to give accurate assessments of contemporary research. After twenty or thirty years, the assessments of a piece of research have normally reached a consensus which does not change much thereafter. However, this consensus after twenty or thirty years may be wildly different from the judgements which were made at the time the research was first produced. Research which was then thought to be really important may, after twenty or thirty years, be seen as the exploration of a blind alley, while research which was thought then to be of no value may after twenty or thirty years be seen to be a crucial breakthrough.
The phenomenon to which I wish to draw attention could be described as delayed recognition. Let us suppose that a scientist, Mr S say, publishes a paper in which he proposes a new theory based on his research, which, after thirty years, is recognised as a major advance in the field. His fellow scientists working in that field may well not immediately recognise that Mr S’s new theory is a good one. They may initially think that Mr S’s theory is completely wrong, and largely ignore his work. Mr S may then have to continue developing his theory through his research, and perhaps that of a few supporters, for many years before its value is recognised by the scientific community.
Delayed recognition is a very common phenomenon in the history of science, and, interestingly, it most often occurs for advances which, with hindsight, are seen to be among the most important breakthroughs. It is moreover fairly easy to explain why this happens. According to Kuhn, and I think he is correct here, scientists always work within a framework of assumptions or paradigm, which they accept for the time being as correct. Now a major advance in research is likely to go against some of the assumptions in the dominant paradigm. Working scientists are likely to reject, at least initially, a theory which contradicts any of the basic assumptions of their paradigm. Hence we would expect there to be initially a negative reaction to what later turns out to be a major advance.
The phenomenon of delayed recognition is extremely common in the history of science, and I give many examples of it in my book How Should Research be Organised? (published in 2008 to coincide with the results of the last RAE). For this short article, however, I will confine myself to one recent example.
In 2008, Harald zur Hausen was awarded the Nobel prize for the discovery that a form of cervical cancer is caused by a preceding infection by the papilloma virus. At the time of the research which led to this discovery, however, the majority of researchers favoured the view that the causal agent for cervical cancer was a herpes virus and not a papilloma virus. This was the dominant paradigm at the time, and zur Hausen and his group were the only ones who favoured the papilloma virus.
One of the reasons why the research community favoured the idea that a herpes virus was the cause of cervical cancer was that it had been shown that a herpes virus, the Epstein-Barr virus, was the cause of another cancer: Burkitt’s lymphoma. The dominance of the herpes virus approach is shown by the fact that, in December 1972, there was an international conference of researchers in the field at Key Biscayne in Florida, which had the title: Herpesvirus and Cervical Cancer. Zur Hausen attended this conference and made some criticisms of the herpes virus approach. He said that he believed “that the results indicate at least a basic difference in the reaction of herpes simplex virus type 2 with cervical cancer cells, as compared to another herpes virus, Epstein-Barr virus. In Burkitt’s lymphomas and nasopharyngeal carcinomas, the tumor cells seem to be loaded with viral genomes, and obviously the complete viral genomes are present in those cells. Thus a basic difference seems to exist between these 2 systems” (cf. Goodheart, 1973, p. 1417). It is reported that the audience listened to zur Hausen in stony silence (McIntyre, 2005, p. 35). The summary of the conference written by George Klein (Klein, 1973) does not mention zur Hausen. Clearly at that time, contemporary assessments of zur Hausen’s research by the scientific community would have given him a low rating. He was regarded as a fringe crank, and his work was not referred to or taken seriously by the mainstream. In the long run, however, zur Hausen proved to be correct.
At the time when zur Hausen was working, there was, fortunately for European science, no research assessment system operating in Germany. Let us now consider what would have happened to him and his group had such a system been in place. From the account I have just given, it is obvious that if a research assessment had been conducted in 1973, then zur Hausen and his group would have got a very low rating. Their research funding would have been cut off, and the discovery of the cause of cervical cancer would have been long delayed. Millions of dollars would still have been spent on searching for a herpes virus causing cervical cancer, but no result would have been produced. Moreover, it would have been very difficult for zur Hausen or anyone else to challenge the dominant paradigm (that a herpes virus caused cervical cancer), because anyone who initially advocated such a view would have received a low rating in the research assessment and consequently been denied funding. As a result, the development of a vaccine which protects against this unpleasant and often fatal disease would have been delayed for several decades, while huge sums of money would have continued to be spent on research. It is worth noting that sales of the vaccine have generated large profits for pharmaceutical companies. These profits, too, would not have been made.
Let us now turn to the general effect of research assessment systems. I remarked earlier that the phenomenon of delayed recognition occurs most frequently in the case of big innovations, significant advances and major breakthroughs. This is explained by the fact that advances of this kind usually contradict some features of the dominant paradigm accepted by most scientists working in the field. Hence we can conclude that the effect of the use of research assessment systems will be to stifle big innovations, significant advances and major breakthroughs in research.
Research assessment systems are very expensive to implement. A great deal of administration is needed to mount such a system, and the administrators need to be paid. In addition, researchers have to devote much time to preparing their submissions for the research assessment system, and to helping in the assessment of the work of their fellow researchers. The time spent on such activities has to be deducted from the time they can spend on the more productive work of getting on with their own research. This causes an indirect increase in the costs of research. Thus research assessment systems are an expensive way of reducing the quality of research output.
It is also worth noting that it is precisely the big innovations which generate the largest profits for the private sector. So a subsidiary consequence would be to reduce profits in the private sector. As most governments are dedicated above all to generating large profits in the private sector, their advocacy of research assessment is a clear instance of shooting themselves in the foot!
It could still be asked whether there are realistic ways of organising research which do not use research assessment systems. The answer is that there are indeed such ways. One suggestion is made in my (2008), Part 3, pp. 63-129.
Clarke, B. (2011) Causality in Medicine with particular reference to the viral causation of cancers. PhD thesis. University College London.
Gillies, D. (2008) How Should Research be Organised? College Publications.
Gillies, D. (2011) Economics and Research Assessment Systems, submitted to Economic Thought.
Goodheart, C.R. (1973) Summary of informal discussion on general aspects of herpesviruses, Cancer Research, 33(6), p. 1417.
Klein, G. (1973) Summary of Papers Delivered at the Conference on Herpesvirus and Cervical Cancer (Key Biscayne, Florida), Cancer Research, 33(June 1973), pp. 1557-1563.
McIntyre, P. (2005) Finding the viral link: the story of Harald zur Hausen, Cancer World, July-August, pp. 32-37.
This account of zur Hausen’s work is based on discussions with Brendan Clarke and on his (2011).