La Presse compared the content of its articles with that of the French daily Le Monde. Although most of the words are identical, each newspaper still has its particularities, which reveal the differences between the French used in Quebec and in France.
Published at 5:00 a.m.
“You say?” French cousins who don’t understand our language often amuse Quebecers. The lexicographer Marie-Éva de Villers devoted an entire doctorate to this amusement.
For her thesis, Mme de Villers, who is best known for having published the Multidictionnaire de la langue française, compared the text of articles published in 1997 in Le Devoir and Le Monde in order to identify the words specific to each newspaper.
One of the main objectives of her analysis, completed in 2005, was to feed the Multidictionnaire, which was already in its fourth edition. The description of French in Quebec has often been qualitative, she explains. “I wanted to try a quantitative, quantified analysis.”
An updated exercise
Twenty years later, La Presse is trying the exercise again, comparing Le Monde on the French side with La Presse, this time, on the Quebec side, for the year 2024 (see methodology below). Marie-Éva de Villers says she is “delighted” with the initiative and notes that “the results are very stable compared to those [she] obtained at the time.”
In 2005, the linguist confided that she had been surprised: “I thought it would be the archaisms that characterized Quebec French,” she said. Among these old words that Quebecers have preserved, which come from different regions of France, are, for example, tuque and chandail. The French now prefer bonnet and pull.
“But it’s the words we created to name our realities” that distinguish the language spoken here, she says. “These are the most common words by far.” The best examples are motoneige (snowmobile), motomarine (personal watercraft) and croustilles (potato chips). Among the more recent terms identified by La Presse’s exercise is balado (podcast), which appeared 264 times this year.
Let us also note téléavertisseur (pager), used 34 times by La Presse in 2024 because of the simultaneous explosions of dozens of these communications devices, which killed a dozen Hezbollah officials in Lebanon in September. Le Monde wrote bipeur (beeper) instead (23 occurrences).
Innovative Quebecers
All these new words were the subject of recommendations from the Office québécois de la langue française (or its predecessor, the OLF). “The receptivity of Quebecers, journalists and editors, too, is much greater in Quebec than in France,” explains Marie-Éva de Villers.
However, a language enrichment commission exists in France. “But for most French people, it is not up to the State to say what words to use. So, they boycott the recommendations,” laments Mme de Villers, while in Quebec, those of the OQLF are authoritative.
Visit the FranceTerme website, which includes the Commission for the Enrichment of the French Language
What is surprising, when we analyze the words found only in Le Monde, is the large number of borrowings from English, such as smartphone (used 161 times), manager (94 times) or sponsor (80 times, including sponsoring).
The one that irritates Mme de Villers the most is newsletter (468 occurrences, probably to invite readers to subscribe): “It still annoys me, newsletter, even though infolettre is so practical!”
Nearly 90% common lexicon
Our data reveals that approximately 90,000 different words were used by each newspaper, indicating roughly equal vocabulary richness.
If we retain only the words that were used five times or more and remove the proper nouns, we obtain approximately 52,600 words across the two publications.
Of this number, 45,600 are common to both (87%), 3,800 are specific to La Presse (7%) and 3,200 to Le Monde (6%).
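The percentages above follow directly from the rounded counts. As a minimal sketch (the variable and function names are illustrative and not taken from La Presse's analysis), the arithmetic can be checked like this:

```python
# Rounded counts reported in the article (words used at least five times,
# proper nouns removed).
common = 45_600          # words found in both newspapers
only_la_presse = 3_800   # words found only in La Presse
only_le_monde = 3_200    # words found only in Le Monde

total = common + only_la_presse + only_le_monde


def share(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


shares = {
    "common": share(common, total),
    "La Presse only": share(only_la_presse, total),
    "Le Monde only": share(only_le_monde, total),
}
print(total, shares)
```

The three counts add up to the 52,600 words mentioned above, and the rounded shares match the 87%, 7% and 6% quoted in the text.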
Regionalisms
Most of the differences observed between the two corpora can be explained by regional differences. Geography means that only La Presse talks about Laval residents (28 times) or Gaspésiens (27 times), and only Le Monde writes about Île-de-France residents (104 times) or Brestois (50 times).
Fauna and flora give us 116 caribou and 59 dandelions in the pages of La Presse. The philatelic and numismatic pages of Le Monde, for their part, produce 94 vermillons (for the vermilion franc), 82 jagged and 43 bistres.
Politics explains why La Presse used the word caquiste 416 times and why Le Monde mentioned macronie or the Macronians 104 times.
The differences are also linked to administration, which gives cégep (455 cases, including on a company computer) and deputy minister (91 cases), words unique to La Presse, or Parcoursup (102 cases) and vice minister (26 cases), specific to Le Monde.
It is also surprising that francisation was found only in La Presse, which published this word 221 times in 2024.
A useful analysis
Mme de Villers gave the title Le vif désir de durer to the work she drew from her thesis, a title inspired by the poet Paul Éluard. “It seemed to me to characterize the history of French in America well,” she said, “the story of French speakers whose loyalty to their language, unfailing tenacity, and determination are remarkable.”
The author (a word that is found in both newspapers) is working on the eighth edition of her Multidictionnaire. She confides that La Presse’s analysis allowed her to spot a few words that weren’t there yet, such as affordability, principal, wrongly, minimally and webcast, among others.
Methodology
Pierre Meslin, from the business intelligence team at La Presse, first provided all of the articles that we published in 2024 (from January 1 to November 30). We also collected all the articles from Le Monde for the same period, with the newspaper’s permission.
We kept only articles written by each publication’s staff or external contributors, excluding agency articles, opinion texts and commissioned texts. That left more than 15,000 articles for La Presse and nearly 21,000 for Le Monde. As several articles in the French daily are reserved for subscribers and we could only capture the first paragraphs, our corpus ultimately gave us 7.1 million words for Le Monde and 10.1 million for La Presse.
The two corpora were split into lexical units using an open-source language model included in the spaCy software library.
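To give a rough idea of what such a comparison involves, here is a minimal sketch, not La Presse's actual code. It swaps the spaCy model for a simple regular-expression tokenizer so that it runs with the standard library alone, and it does not remove proper nouns. The function names `count_words` and `compare_lexicons` are hypothetical, and the rule that a word belongs to a newspaper's lexicon once it appears at least `min_count` times there is an assumption about how the five-occurrence threshold was applied.

```python
import re
from collections import Counter

# Stand-in tokenizer (assumption): the article used a spaCy language model.
# This pattern matches runs of letters, optionally joined by hyphens.
WORD_RE = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")


def count_words(texts):
    """Count lowercased word forms across an iterable of article texts."""
    counts = Counter()
    for text in texts:
        counts.update(m.group(0).lower() for m in WORD_RE.finditer(text))
    return counts


def compare_lexicons(counts_a, counts_b, min_count=5):
    """Return (common, only_a, only_b) as sets of words.

    Assumption: a word is part of a corpus's lexicon if it occurs at least
    `min_count` times in that corpus.
    """
    vocab_a = {w for w, n in counts_a.items() if n >= min_count}
    vocab_b = {w for w, n in counts_b.items() if n >= min_count}
    return vocab_a & vocab_b, vocab_a - vocab_b, vocab_b - vocab_a
```

Calling `compare_lexicons(count_words(articles_a), count_words(articles_b))` on two lists of article texts returns the shared words and the words specific to each list. A real pipeline would also need to filter out proper nouns, as described above.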