VERCOUSTRE, Anne-Marie

Last name: VERCOUSTRE

First name: Anne-Marie

December 10, 2007 (v1)

Conference paper

Metadata-only

Use of Wikipedia Categories in Entity Ranking

Thom, James; Pehcevski, Jovan; Vercoustre, Anne-Marie

Wikipedia is a useful source of knowledge that has many applications in language processing and knowledge representation. The Wikipedia category graph can be compared with the class hierarchy in an ontology; it has some characteristics in common as well as some differences. In this paper, we present our approach for answering entity ranking...

Uploaded on: April 5, 2025
April 2008 (v1)

Conference paper

Metadata-only

Exploiting Locality of Wikipedia Links in Entity Ranking

Pehcevski, Jovan; Vercoustre, Anne-Marie; Thom, James

Information retrieval from web and XML document collections is ever more focused on returning entities instead of web pages or XML elements. There are many research fields involving named entities; one such field is known as entity ranking, where one goal is to rank entities in response to a query supported with a short list of entity examples....

Uploaded on: April 5, 2025
2007 (v1)

Report

Metadata-only

Entity Ranking in Wikipedia

Vercoustre, Anne-Marie; Thom, James; Pehcevski, Jovan

The traditional entity extraction problem lies in the ability to extract named entities from plain text using natural language processing techniques and intensive training on large document collections. Examples of named entities include organisations, people, locations, or dates. There are many research activities involving named...

Uploaded on: April 5, 2025
December 2006 (v1)

Conference paper

Metadata-only

Report on the XML Mining Track at INEX 2005 and INEX 2006, Categorization and Clustering of XML Documents

Denoyer, Ludovic; Gallinari, Patrick; Vercoustre, Anne-Marie

This article reports on the two years of the XML Mining track at INEX (2005 and 2006). We focus here on the classification and clustering of XML documents. We detail these two tasks and the corpus used for this challenge, and then present a summary of the different methods proposed by the participants. Finally, we compare the results...

Uploaded on: April 5, 2025
March 2008 (v1)

Conference paper

Metadata-only

Entity Ranking in Wikipedia

Vercoustre, Anne-Marie; Thom, James; Pehcevski, Jovan

The traditional entity extraction problem lies in the ability to extract named entities from plain text using natural language processing techniques and intensive training on large document collections. Examples of named entities include organisations, people, locations, or dates. There are many research activities involving named...

Uploaded on: April 5, 2025
January 23, 2007 (v1)

Conference paper

Metadata-only

Extraction d'entités dans des collections évolutives

Despeyroux, Thierry; Fraschini, Eduardo; Vercoustre, Anne-Marie

The goal of our work is to extract named entities from a set of reports, in our case the names of industrial or academic partners. Starting with an initial list of entities, we use a first set of documents to identify syntactic patterns that are then validated in a supervised learning phase on a set of annotated documents. The complete...

Uploaded on: April 5, 2025
November 2005 (v1)

Conference paper

Metadata-only

A Flexible Structured-based Representation for XML Document Mining

Vercoustre, Anne-Marie; Fegas, Mounir; Gul, Saba

This paper reports on the INRIA group's approach to XML mining while participating in the INEX XML Mining track 2005. We use a flexible representation of XML documents that allows taking into account the structure only or both the structure and content. Our approach consists of representing XML documents by a set of their sub-paths, defined...

Uploaded on: April 5, 2025
January 2005 (v1)

Conference paper

Metadata-only

Expériences de classification d'une collection de documents XML de structure homogène

Despeyroux, Thierry; Lechevallier, Yves; Trousse, Brigitte

This paper presents some experiments in clustering homogeneous XML documents to validate an existing classification or, more generally, an organisational structure. Our approach integrates techniques for extracting knowledge from documents with unsupervised classification (clustering) of documents. We focus on the feature selection used for...

Uploaded on: April 5, 2025
January 16, 2006 (v1)

Conference paper

Metadata-only

Classification de documents XML à partir d'une représentation linéaire des arbres de ces documents

Vercoustre, Anne-Marie; Fegas, Mounir; Lechevallier, Yves

In this work, we propose a new document representation for clustering collections of semi-structured documents. Our approach consists of representing XML documents by their sub-paths, defined according to some criteria (length, root beginning, leaf ending), using the structure only or both the structure and the content. By considering...

Uploaded on: April 5, 2025
June 29, 2005 (v1)

Conference paper

Metadata-only

Experiments in Clustering Homogeneous XML Documents to Validate an Existing Typology

Despeyroux, Thierry; Lechevallier, Yves; Trousse, Brigitte

This paper presents some experiments in clustering homogeneous XML documents to validate an existing classification or, more generally, an organisational structure. Our approach integrates techniques for extracting knowledge from documents with unsupervised classification (clustering) of documents. We focus on the feature selection used for...

Uploaded on: April 5, 2025
2007 (v1)

Book section

Metadata-only

Mining XML Documents

Candillier, Laurent; Denoyer, Ludovic; Gallinari, Patrick

XML documents are becoming ubiquitous because of their rich and flexible format that can be used for a variety of applications. Given the increasing size of XML collections as information sources, mining techniques that traditionally exist for text collections or databases need to be adapted and new methods to be invented to exploit the...

Uploaded on: April 5, 2025
