International audience
November 24, 2017 (v1)Publication
Uploaded on: December 4, 2022
December 10, 2021 (v1)Publication
This thesis presents a new methodology for text analysis which is situated at the intersection of textual statistics, automatic language analysis and deep learning. It draws on the architecture of neural networks and its potential to extract information from texts. The accuracy of convolutional models for text classification depends on the...
Uploaded on: December 3, 2022 -
June 7, 2021 (v1)Conference paper
The question of data reuse lies at the heart of the Base Louis Meigret project. Beyond the technical steps involved in making reusable data available, it is a principle suited to the singularity of Louis Meigret's work. The project was born on the occasion of the conference devoted to this author in Nice in 2018. The aim was...
Uploaded on: December 3, 2022 -
2022 (v1)Journal article
This contribution presents the DeepFLE platform, a tool designed for everyone involved in French as a foreign language (FLE). It is the result of ongoing research whose interdisciplinary approach brings the didactics of French as a foreign language into dialogue with deep learning and textual data analysis (ADT)...
Uploaded on: December 3, 2022 -
2019 (v1)Journal article
Using Deep Learning to attribute authorship of French literary texts. While problems of attributing authorship or dating a text can be tackled using the usual methods of literary historians, it is equally possible to turn to statistical and computing tools. A range of intertextual measures have been proposed to describe variation within and...
Uploaded on: December 4, 2022 -
July 1, 2023 (v1)Journal article
At the limits of referential semantics, we propose a proferential semantics to decipher Emmanuel Macron's discourse. At the limits of the political referent, we propose, with Thierry Melchior (1998) or Clément Viktorovitch (2021), the proferent. After having affirmed novelty in 2017, E. Macron pleads for renewal in 2022. After having brought...
Uploaded on: October 11, 2023 -
June 7, 2016 (v1)Conference paper
Finding word co-occurrences and calculating specificity scores are among the most popular statistical methods in the analysis of textual data. Within Hyperbase, there is a "theme" feature for this purpose, which is capable of locating words that are used more commonly near a given word form, grammatical structure or lemma. The graphical...
Uploaded on: February 28, 2023 -
April 1, 2021 (v1)Book section
Can the arts and sciences of the text take advantage of the new power of machines? What can deep learning algorithms teach us about a work, an author, a genre, a period? Can artificial intelligence offer the analyst new reading paths and bring out new textual observables...
Uploaded on: December 4, 2022 -
2022 (v1)Journal article
The study of the past participle has prompted much reflection in the field of textual data analysis: because of the ambiguities of its grammatical profile, it has always been considered difficult to define for textometric research (Brunet, 1988; Engwall, 1966). In this article, we address this question...
Uploaded on: February 22, 2023 -
June 16, 2020 (v1)Conference paper
The present paper suggests that intertextuality can be brought out objectively by resorting to specific methodological tools. The case in point is political intertextuality in the speeches of the French president Emmanuel Macron. Deep learning (convolutional model) is first used to "learn" (satisfactory accuracy rate of 92.3%) the French...
Uploaded on: December 4, 2022 -
2022 (v1)Journal article
This contribution returns to the definition of a text and its units by assuming that artificial intelligence is likely to modify our representations and our reading paths. It proposes a popularization of deep learning from the perspective of textual linguistics. In doing so, it returns, in the light of digital technology, to some fundamental...
Uploaded on: December 3, 2022 -
2021 (v1)Book section
International audience
Uploaded on: December 4, 2022 -
December 4, 2017 (v1)Publication
International audience
Uploaded on: December 4, 2022 -
June 3, 2014 (v1)Conference paper
The amount of data contained within Google Books has doubled over the last two years and now exceeds 500 billion words. A new treatment of the data has included a re-examination of scanned images, offering a more accurate recognition of the text. In addition, for the first time, included texts have been subjected to disambiguation and...
Uploaded on: March 26, 2023 -
April 2021 (v1)Book
Can the arts and sciences of the text take advantage of the new power of machines? What can deep learning algorithms teach us about a work, an author, a genre, a period? Can artificial intelligence offer the analyst new reading paths and bring out new textual observables...
Uploaded on: February 22, 2023 -
December 11, 2023 (v1)Journal article
Artificial Intelligence raises questions about corpus semantics. By taking into account the syntagmatic axis (CNN, convolution) and the paradigmatic axis or "associative relationship" (RNN, transformer), the architecture we present provides an interpretation aid for enhanced corpus semantics. The algorithm implemented in Hyperbase software is...
Uploaded on: December 25, 2023 -
December 29, 2023 (v1)Journal article
The concept of proference exploits the praxematic approach theorized by the Cahiers 40 years ago. Emmanuel Macron's speeches become autonomous from the reality that they might want to describe in order to constitute, in themselves, for themselves, a linguistic and political praxis. Utterances or proferential phrases like "project", "great...
Uploaded on: February 28, 2024 -
June 25, 2024 (v1)Conference paper
This contribution shows the value of supplementary variables in Correspondence Analysis (CA). From a CA vector space crossing the main morpho-syntactic categories and the French Presidents of the Fifth Republic, we project the lemma "indiquer" (to indicate) as a supplementary variable. Will it be located in the "verb subspace" of the...
Uploaded on: July 16, 2024 -
December 31, 2021 (v1)Journal article
Time constrains language, just as it constrains people, the economy or culture. Continuous evolution of discourse? Lexical permanence? Wear and tear on words and competition of vocabulary over time? Message breakage and ideological flip-flop? This article proposes a methodological protocol for dealing diachronically with the centenary corpus of...
Uploaded on: December 3, 2022 -
June 12, 2018 (v1)Conference paper
This contribution compares ADT and machine learning. The extraction of key statistical passages is first proposed according to several calculations implemented in the Hyperbase software. An evaluation of these calculations according to the filters applied (taking into account positive specificities only, substantives only, etc.) is...
Uploaded on: December 4, 2022 -
June 3, 2014 (v1)Conference paper
From 9 word-by-word matrices, or co-occurrence matrices (one per presidential term since 1958), we produce a dissimilarity matrix recording the distances between the presidents of the Fifth Republic. We give a tree representation of this matrix and improve the performance of the representation here thanks to a...
Uploaded on: March 26, 2023 -
June 7, 2016 (v1)Conference paper
With the exponential development of the Internet, new discourse genres and situations have expanded. These new web genres, which are still little described, are complex objects challenging our methodologies and our analysis tools: the encyclopedic project Wikipedia is one of these new objects which are part of Computer-mediated communication...
Uploaded on: February 28, 2023 -
April 26, 2021 (v1)Book section
From Homer to Shakespeare, questions of literary paternity or dating have fascinated critics. DNA decoding, for its part, settles problems of criminality or paternity beyond dispute. Can artificial intelligence play the same role in deciphering texts? That is the subject of the present study, conducted jointly...
Uploaded on: December 4, 2022 -
July 6, 2022 (v1)Conference paper
Convolutional neural networks allow new representations of texts that extend the standard statistical approaches. By combining frequency and context of words as well as allowing multidimensional treatments (graphical form, lemma and part of speech), convolution leads to the extraction of motifs, i.e. complex linguistic patterns that are likely...
Uploaded on: December 3, 2022 -
July 6, 2022 (v1)Conference paper
Is there an ADT method that can deal with non-aligned bilingual corpora? Does the textual genre exert a sufficiently strong constraint on the discourse that would make texts written in different languages comparable, provided they are of identical genre? To answer these two questions, one methodological, the other linguistic, this contribution...
Uploaded on: December 3, 2022