Grids are interesting platforms for supporting the development of medical image analysis applications: they enable the sharing of data and algorithms and provide huge amounts of computing power and data storage. In this thesis, we investigate a medical image analysis problem that turns out to be a typical dimensioning application for grids, thus...
November 20, 2007 (v1) Publication
Uploaded on: December 3, 2022 -
April 8, 2013 (v1) Conference paper
Report on the operations of the biomed VO
Uploaded on: February 28, 2023 -
May 2008 (v1) Conference paper
In this paper, the expressiveness of the simple Scufl data-flow language is studied by showing how it can be used to implement Turing machines. To do that, several non-trivial Scufl patterns such as self-looping or sub-workflows are required, and we describe them precisely. The main result of this work is to show how a complex workflow can be...
Uploaded on: December 3, 2022 -
November 2009 (v1) Publication
Production-grid users experience many system faults as well as high and variable latencies due to the scale, complexity and sharing of such infrastructures. To improve performance, they adopt different submission strategies that are potentially aggressive toward the infrastructure. This work studies the impact of three different strategies. It is...
Uploaded on: December 3, 2022 -
November 2006 (v1) Report
This paper presents a method to optimize the timeout value of grid computing jobs. It relies on a model of the job execution time that considers the job management system latency through a random variable. It also takes into account a proportion of outliers to model either reliable clusters or production grids characterized by faults causing...
Uploaded on: February 22, 2023 -
June 19, 2006 (v1) Conference paper
The problem we address in this paper is to build complex applications by reusing and assembling scientific codes on a production grid infrastructure. We first review the two main paradigms for executing application code on a grid: (a) the task-based approach, associated with global computing, characterized by its efficiency, and (b) the service...
Uploaded on: December 3, 2022 -
November 2006 (v1) Report
This paper presents a method to optimize the timeout value of grid computing jobs. It relies on a model of the job execution time that considers the job management system latency through a random variable. It also takes into account a proportion of outliers to model either reliable clusters or production grids characterized by faults causing...
Uploaded on: December 3, 2022 -
February 2006 (v1) Conference paper
Production grids have a potential for parallel execution of a very large number of tasks but also introduce a high overhead that significantly impacts the execution of short tasks. In this work, we present a strategy to optimize the partitioning of jobs on a grid infrastructure. This method takes into account the variability and the difficulty...
Uploaded on: December 3, 2022 -
June 2006 (v1) Conference paper
Workflow engines are powerful tools to implement data-intensive scientific applications exploiting parallel grid resources transparently. We discuss the advantages of implementing applications as workflows of services when dealing with large data sets. We show how the graph of services associated with data composition operators enables the...
Uploaded on: December 3, 2022 -
May 2007 (v1) Conference paper
This paper presents a method to optimize the timeout value of computing jobs. It relies on a model of the job execution time that considers the job management system latency through a random variable. It also takes into account a proportion of outliers to model either reliable clusters or production grids characterized by faults causing...
Uploaded on: December 3, 2022 -
June 7, 2006 (v1) Conference paper
Medical image registration is a pre-processing step needed for many medical image analysis procedures. A very large number of registration algorithms are available today, but their performance is often not known and very difficult to assess due to the lack of a gold standard. The Bronze Standard algorithm is a very data- and compute-intensive statistical...
Uploaded on: December 4, 2022 -
May 18, 2008 (v1) Conference paper
Production grids are complex and highly variable systems whose behavior is not well understood and difficult to anticipate. The goal of this study is to estimate the impact of the variability of those infrastructures on the performance of workflow-based applications. A probabilistic model of workflow execution time is proposed and evaluated....
Uploaded on: December 3, 2022 -
May 2012 (v1) Conference paper
Production operation of large distributed computing infrastructures (DCI) still requires a lot of human intervention to reach acceptable quality of service. This may be achievable for scientific communities with solid IT support, but it remains a show-stopper for others. Some application execution environments are used to hide runtime technical...
Uploaded on: December 4, 2022 -
May 19, 2008 (v1) Conference paper
In this paper, we study grid job submission latencies. Latency strongly affects performance on production grids, due to its high values and variations as well as the presence of outliers. It is particularly detrimental to determining the status and expected duration of jobs. In a previous work, a probabilistic model of the latency is...
Uploaded on: December 3, 2022 -
February 28, 2008 (v1) Conference paper
Previous work has presented a probabilistic model of grid latency that depends on parameters characterizing the workload. In this paper, we study both the validity of these parameters over several weeks and the influence of the day of the week. We show that performance can be improved by updating the model parameters.
Uploaded on: December 4, 2022 -
June 2005 (v1) Conference paper
Data-intensive medical image processing applications can easily benefit from grid capabilities. However, setting up complex medical experiments is not straightforward on current grid infrastructures. To ease such experiments, we are developing a generic and grid-enabled workflow framework, relying on current standards. We show results on...
Uploaded on: December 4, 2022 -
November 2007 (v1) Publication
In this paper, we study grid job submission latencies. Latency strongly affects performance on production grids, due to its high values and variations as well as the presence of outliers. It is particularly detrimental to determining the status and expected duration of jobs. In a previous work, a probabilistic model of the latency is...
Uploaded on: December 3, 2022 -
October 2005 (v1) Publication
Data-intensive applications benefit from an intrinsic data parallelism that should be exploited on parallel systems to lower execution time. In recent years, data grids have been developed to handle, process, and analyze the tremendous amount of data produced in many scientific areas. Although very large, these grid infrastructures are under...
Uploaded on: December 3, 2022 -
November 2007 (v1) Publication
In this paper, we study grid job submission latencies. Latency strongly affects performance on production grids, due to its high values and variations as well as the presence of outliers. It is particularly detrimental to determining the status and expected duration of jobs. In a previous work, a probabilistic model of the latency is...
Uploaded on: February 22, 2023 -
June 2006 (v1) Conference paper
In this paper, we present a set of experiments comparing the EGEE production infrastructure and the Grid5000 experimental one. Our goal is to better understand and quantify how these systems behave under load. We first identify specific characteristics of the workload and data management systems of these two infrastructures, underlining some of...
Uploaded on: December 3, 2022 -
October 2008 (v1) Conference paper
The impact of lossy compression has often been discussed in the medical area. In this study, an evaluation of the impact of lossy compression on the performance of rigid registration algorithms for medical images is proposed. The robustness, repeatability and accuracy of these algorithms are estimated through a statistical procedure for each...
Uploaded on: December 3, 2022 -
October 2009 (v1) Journal article
In this paper, we study grid job latency. Together with outliers, latency strongly affects application performance on production grids, due to its order of magnitude and large variations. It is particularly detrimental to determining the expected duration of applications handling a high number of jobs and it makes outlier detection...
Uploaded on: December 3, 2022 -
August 27, 2007 (v1) Conference paper
Grids are promising tools to tackle massively parallel medical applications such as medical image database indexing, but their complexity makes the optimization of computation tasks and the design of computing models difficult. We tackle this complexity with a probabilistic approach. We have shown that a probabilistic model of the computing tasks...
Uploaded on: December 4, 2022 -
June 2006 (v1) Conference paper
In this paper, we present a generic wrapper that enables the optimization of legacy codes assembled in application workflows on grid infrastructures. We first describe the advantages of a service-based approach for job management. We then introduce our wrapper, which works at execution time, thus allowing service grouping strategies to optimize the...
Uploaded on: December 4, 2022 -
June 11, 2009 (v1) Conference paper
Production-grid users experience many system faults as well as high and variable latencies due to the scale, complexity and sharing of such infrastructures. To improve performance, they adopt different submission strategies that are potentially aggressive toward the infrastructure. This work studies the impact of three different strategies. It is...
Uploaded on: December 3, 2022