Summary
The main objective of the project is the development of a Web site where researchers may upload experimental data and run a series of data analysis experiments using an Inductive Logic Programming (ILP) system without any knowledge of the system's workings.
Our intended users are Life Science researchers, who are familiar with other web services provided by universities worldwide to perform genomic, proteomic, and metabolomic data analysis, such as FASTA. Our proposal is, however, generic and can be used in domains other than the Life Sciences.
We believe our project will enable researchers to expand and extend their analyses by making available to non-expert users a technology that is suitable for integrative biology because of its ability to learn relations between facts. The project thus provides a framework for studying biological problems at the system level in an integrated way, paving the way for important applications such as the study of high blood pressure, the secondary structure of proteins, and the problem of disjoint exons.
To accomplish our objective we propose three lines of research:
i) speed up the execution of the ILP system, improve its numerical capabilities, and generally improve the quality of the models found;
ii) simplify the usage of ILP systems by non-expert users;
iii) develop a distributed computing platform and a scheduler to allow several instances of an ILP system to be executed simultaneously, where each instance runs sequentially or in parallel.
To serve a large number of simultaneous users and to make the ILP system run faster, the service will be able to run several instances of the ILP system simultaneously. Each instance runs in either sequential or distributed execution mode. Furthermore, both modes can be deployed on a group of ordinary desktop machines, on a high-performance cluster, or in a Grid computing setting. To accomplish this, a scheduler will be developed. Apart from controlling the different execution modes and environments in which the ILP system runs, the scheduler will have the following additional characteristics. Like the Condor system, it will be able to use idle machines in an organization. It will be constantly aware of the workload of the available computational resources and network connections. It will use historical data and statistics to estimate the execution times of tasks on the different machines available.
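As an illustration only, the history-based estimation described above could be sketched as follows. The names used here (`Machine`, `pick_machine`, `submit`, `record_run`), the use of a plain mean of past run times as the estimator, and the 60-second default for machines without history are assumptions made for this sketch; they are not part of the proposed scheduler's design.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List


@dataclass
class Machine:
    """Hypothetical record of one computational resource."""
    name: str
    queued_seconds: float = 0.0  # estimated work already assigned
    history: List[float] = field(default_factory=list)  # past run times (s)

    def estimate(self, default: float) -> float:
        """Mean of past run times, or a default when no history exists."""
        return mean(self.history) if self.history else default


def pick_machine(machines, default_estimate=60.0):
    """Return the machine with the earliest estimated completion time."""
    return min(
        machines,
        key=lambda m: m.queued_seconds + m.estimate(default_estimate),
    )


def submit(machines, default_estimate=60.0):
    """Assign one task to the best machine and update its estimated load."""
    chosen = pick_machine(machines, default_estimate)
    chosen.queued_seconds += chosen.estimate(default_estimate)
    return chosen


def record_run(machine, seconds):
    """Store an observed run time and release the corresponding load."""
    machine.history.append(seconds)
    machine.queued_seconds = max(0.0, machine.queued_seconds - seconds)
```

In this sketch a faster machine keeps receiving tasks until its estimated backlog exceeds the estimated run time on a slower, idle machine. A real scheduler would also need to account for network load and machine availability, which the sketch leaves out.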
A useful characteristic of the proposed web service is that data and results will be stored in a database directly accessible by the ILP system. This enables each user to decide which of their data and results are public or private.
The project will be evaluated using applications from the Life Sciences. It will be tested by non-ILP experts in the following domains and problems:
1. A Structure-Activity Relationship problem where ILP will help explain the anti-high blood pressure effect of some small peptides.
2. A Genomics problem where ILP will help explain the workings of disjoint exons.
3. A Proteomics problem where ILP will be applied to predict the secondary structure of proteins.
4. Two problems in Medicine where ILP will be used to help explain:
i) the effect or lack of effect of some vaccines against tropical diseases; and
ii) the workings of a special kind of virus that is suspected to have the same infection mechanism in plants and animals.