Abstract (EN):
Monitoring the machines in a network, and the traffic between them, is essential to launching and executing parallel network applications. These distributed applications must take advantage of as many machines as possible and cope with communication difficulties that may arise at any moment.
Because existing software for monitoring machine and network status is not readily adaptable to the automatic support of distributed applications, a system for the decentralized monitoring and management of a network of Unix computers is being developed as a small project at FEUP. The system is initially aimed at supporting classical (i.e. serial) programs; afterwards, its framework will be used for parallel applications. This paper presents and discusses the software module responsible for monitoring each machine in the network and, whenever possible, compares it with similar modules of public-domain or commercial systems.
The monitoring module records the processor usage, memory availability and free disk space of the machine on which it runs. Because peer monitoring modules are in constant communication, it also records the same information for the other machines in the network. Packet flow and traffic on the communication routes are monitored as well. In this way, each machine builds an image of the whole network and its degree of utilization that should be fairly accurate and up-to-date.
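As an illustration only, the record keeping described above might be sketched as follows in Python. This is not the system's actual code: the names `Sample`, `local_sample`, `NetworkView` and `max_age_s` are hypothetical, the 1-minute load average stands in for processor usage, the free-memory query relies on `sysconf` names that not every Unix provides, and the exchange of samples between peers (as well as packet and traffic monitoring) is left out.

```python
import os
import shutil
import time
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading of a machine's state (hypothetical record layout)."""
    host: str
    load_1min: float      # 1-minute load average, a proxy for processor usage
    free_mem_bytes: int   # -1 when the platform does not expose it
    free_disk_bytes: int
    timestamp: float      # seconds since the epoch


def local_sample(host, path="/"):
    """Take one reading of the local machine (Unix only)."""
    load_1min = os.getloadavg()[0]
    try:
        # Assumption: these sysconf names exist on Linux but not on every Unix.
        free_mem = os.sysconf("SC_AVPHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (ValueError, OSError):
        free_mem = -1
    free_disk = shutil.disk_usage(path).free
    return Sample(host, load_1min, free_mem, free_disk, time.time())


class NetworkView:
    """Each machine's image of the network: the newest sample per host."""

    def __init__(self, max_age_s=30.0):
        self.max_age_s = max_age_s
        self.samples = {}

    def update(self, sample):
        # Keep a sample only if it is newer than the one already stored.
        current = self.samples.get(sample.host)
        if current is None or sample.timestamp > current.timestamp:
            self.samples[sample.host] = sample

    def fresh(self, now):
        # Hosts whose last report is recent enough to be trusted.
        return {h: s for h, s in self.samples.items()
                if now - s.timestamp <= self.max_age_s}

    def least_loaded(self, now):
        # Name of the fresh host with the lowest load, or None if none is fresh.
        fresh = self.fresh(now)
        if not fresh:
            return None
        return min(fresh.values(), key=lambda s: s.load_1min).host
```

In such a sketch, each module would call `update(local_sample(...))` for its own machine and `update(...)` for every sample received from a peer; discarding stale entries is one simple way to keep the image of the network reasonably up-to-date when a peer stops reporting.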
The first iteration of the whole software system is being tested, with the main goal of verifying its correct overall functioning and the adequacy of the algorithms used. Following this phase, the efficiency of the modules' routines will be improved. Afterwards, their use in supporting parallel applications will be considered.
Language:
Portuguese
Type (Professor's evaluation):
Scientific