Discovery of a scientific fraud that artificially boosts the impact of research

This article is published in collaboration with Binaire, the blog for understanding digital issues.


The image of the researcher working alone, cut off from the scientific community, is a myth. Research rests on constant exchange: first to understand the work of others, then to make one’s own results known. Reading and writing articles published in scientific journals and conferences is therefore at the heart of a researcher’s activity. When writing an article, it is essential to cite the work of one’s peers, whether to describe the context, to acknowledge one’s sources of inspiration, or to explain differences in approaches and results. Being cited by other researchers, when it is for “good reasons”, is therefore one measure of the importance of one’s own results. But what happens when this citation system is manipulated? Our recent study reveals an insidious method of artificially inflating citation counts: “stealth references.”

The underside of manipulation

How scientific publishing works, along with its potential shortcomings and their causes, is a recurring topic of popular science writing. Here, let us focus on a new kind of abuse affecting citations between scientific articles, which are supposed to reflect the intellectual contribution and influence of a cited article on the citing one.

Citations of scientific works rely on a standardized referencing system: in the text of their article, the authors explicitly mention at least the title of the cited article, the names of its authors, the year of publication, the name of the journal or conference, the page numbers, and so on. This information appears in the article’s bibliography (the list of references) and is also recorded as additional data, not visible in the text of the article, known as metadata, notably when the DOI (Digital Object Identifier), a unique identifier for each scientific publication, is assigned.
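
For the curious reader, this deposited metadata is publicly visible: anyone can query the Crossref REST API and inspect the reference list a publisher has registered under a given DOI. The short Python sketch below illustrates this, assuming the widely used requests library and a purely hypothetical placeholder DOI; which fields appear for each reference depends on what the publisher actually deposited.

```python
# Minimal sketch: list the references a publisher deposited with Crossref
# for one article. The DOI below is a hypothetical placeholder, not a real paper.
import requests

DOI = "10.1234/example-article"  # placeholder for illustration only
response = requests.get(f"https://api.crossref.org/works/{DOI}", timeout=30)
response.raise_for_status()
metadata = response.json()["message"]

# The "reference" field is only present if the publisher deposited references.
for ref in metadata.get("reference", []):
    # Each entry may carry a DOI, an article title, or just an unstructured string.
    print(ref.get("DOI") or ref.get("article-title") or ref.get("unstructured"))
```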

Put simply, the references of a scientific publication allow authors to justify methodological choices or to recall the results of past studies. The references listed in each scientific article are the visible manifestation of the iterative and collaborative nature of science. However, some unscrupulous actors have evidently added extra references that are invisible in the text but present in the article’s metadata as registered by the publishing houses. The result? The citation counts of some researchers or journals explode for no good reason, because these references are not present in the articles that are supposed to cite them.

A new type of fraud and a fortuitous discovery

It all started when Guillaume Cabanac published a post-publication assessment on PubPeer, a site where scientists discuss and analyze publications. He noticed an inconsistency: an article from a journal of the scientific publisher Hindawi, probably fraudulent because it contained “tortured phrases”, had obtained many more citations than downloads, which is highly unusual. The post attracted the attention of several “scientific detectives”, and a team quickly formed around Lonni Besançon, Guillaume Cabanac, Cyril Labbé and Alexander Magazinov.

We then tried to find the articles citing this initial article through scientific search engines: Google Scholar returned no results, while others (Crossref, Dimensions) found some. It turns out that Google Scholar on the one hand and Crossref or Dimensions on the other do not use the same process to retrieve citations: Google Scholar relies on the actual text of the scientific article, whereas Crossref and Dimensions rely on the metadata that publishing houses provide.

To understand the extent of the manipulation, we then examined three scientific journals that appeared to cite the Hindawi article heavily. Here is our three-step approach (a simplified code sketch of the comparison follows the list).

  • We first listed the references explicitly present in the HTML or PDF versions of the articles;

  • We then compared these lists with the metadata recorded by Crossref, the agency that assigns DOIs and their metadata, and discovered that additional references had been added there that did not appear in the articles;

  • Finally, we checked a third source, Dimensions, a bibliometric platform that uses Crossref metadata to compute citation counts, and found the same inconsistencies there.
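
To make the principle of this comparison concrete, here is a simplified Python sketch, not our actual pipeline, that contrasts the references visible in an article’s text with those deposited in its Crossref metadata. It assumes the requests library, placeholder DOIs, and that references can be matched by DOI; in practice, references without DOIs require fuzzier matching on titles and authors.

```python
# Illustrative sketch (not the study's actual pipeline): flag references that
# exist only in Crossref metadata ("stealth") or only in the article text ("lost").
import requests

def deposited_reference_dois(article_doi: str) -> set[str]:
    """DOIs of the references deposited with Crossref for this article."""
    message = requests.get(
        f"https://api.crossref.org/works/{article_doi}", timeout=30
    ).json()["message"]
    return {ref["DOI"].lower() for ref in message.get("reference", []) if "DOI" in ref}

def compare(article_doi: str, visible_dois: set[str]) -> None:
    """Compare metadata references with those extracted from the HTML/PDF text."""
    deposited = deposited_reference_dois(article_doi)
    visible = {doi.lower() for doi in visible_dois}
    print("Stealth references (metadata only):", sorted(deposited - visible))
    print("Lost references (text only):", sorted(visible - deposited))

# Hypothetical usage: visible_dois would come from parsing the article itself.
compare("10.1234/example-citing-article", {"10.5678/ref-a", "10.5678/ref-b"})
```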

The result? In these three journals, at least 9% of the recorded references were “stealth references”. These extra references do not appear in the articles, only in the metadata, thereby skewing citation counts and giving some authors an unfair advantage. Conversely, some references that are actually present in the articles are “lost” from the metadata.

Implications and potential solutions

Why is this discovery important? Citation counts significantly influence research funding, academic promotions, and institutional rankings. They are used differently depending on institutions and countries, but always play a role in these kinds of decisions.

Citation manipulation can therefore lead to injustices and to decisions based on false data. More worrying still, this discovery adds to questions about the integrity of systems for measuring scientific impact that have been raised for several years now. Many researchers have already stressed that these metrics can be manipulated and, above all, that they fuel unhealthy competition, tempting researchers to take shortcuts in order to publish faster or to obtain more striking, and therefore more cited, results. An even more serious consequence of these productivity metrics is the waste of scientific effort and resources caused by the competition they create.

To combat this practice, the “Invisible College”, an informal collective of scientific detectives to which our team contributes, recommends several measures:

  • Rigorous metadata verification by publishers and agencies like Crossref.

  • Independent audits to ensure data reliability.

  • Increased transparency in reference and citation management.

This study highlights the importance of the accuracy and integrity of metadata, since they too can be manipulated. It is also worth noting that Crossref and Dimensions have confirmed the study’s findings, and that the publishing house which manipulated the metadata entrusted to Crossref, and by ripple effect to bibliometric platforms such as Dimensions, appears to have made some corrections. While waiting for corrective measures, which can be very slow in coming or never come at all, this discovery reminds us of the need for constant vigilance in the academic world.
