Institute for high performance computing and networking (ICAR)

Research activities

The research activities of the Institute cover the main topics of methodological and applied research in computer science and are organized into the following research lines.

(A) Warehousing and mining of large data sets and knowledge representation and discovery
The aim of the research line is the development of advanced tools that combine algorithms, languages, methodologies and new techniques for databases, data mining, and knowledge representation in order to (i) manage large data sets, both traditional and Web-derived, including data exposed as services, (ii) discover useful information buried in documents, services and usage traces, and (iii) produce new knowledge and new services.
The research activities are: (1) investigation and development of data mining methods for complex data from new application domains, such as graphs, trees, sequences, and high-dimensional data (genomic and textual data); (2) design of sophisticated techniques for advanced database systems, such as aggregation and summarization techniques, analysis and management of streaming data, management and querying of XML data, and data integration. These activities are carried out in collaboration with national and international research groups and with firms interested in knowledge management.
Specific research activities include the definition of algorithms for classification and outlier detection in streaming data and RFID data; nonlinear optimization methods for finding minima, applied to logistics and oriented toward mathematical classification; clustering and classification of HTML documents and of trajectories of moving objects; information-theoretic co-clustering of multi-relational and high-dimensional data, based on greedy techniques, to discover functional groups in protein-protein interaction graphs; integration of open-source techniques to analyze process logs, based on structural clustering of the traces contained in the logs; extensions of process clustering techniques that exploit information on system performance; XML data compression, querying in the compressed domain, and heterogeneous data integration; segmentation techniques and record matching.
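As an illustration of what outlier detection in streaming data can look like, the following is a minimal sketch, not the Institute's algorithm. It assumes a single numeric stream and flags a value whose z-score against the running mean exceeds a threshold, with the statistics maintained in one pass by Welford's update. The class name, the threshold and the warm-up length are hypothetical choices made for the example.

```python
import math


class StreamingOutlierDetector:
    """Flags values far from the running mean (illustrative sketch only)."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # z-score cut-off (assumed value)
        self.warmup = warmup        # observations to see before flagging anything
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def update(self, x):
        """Return True if x looks like an outlier, then absorb x into the statistics."""
        is_outlier = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_outlier = True
        # Welford's single-pass update of mean and squared deviations.
        # Note: flagged values are still absorbed, which inflates later variance.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)
        return is_outlier


readings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 25.0, 10.1]
detector = StreamingOutlierDetector()
flags = [detector.update(r) for r in readings]
```

Each reading is processed once and only three numbers are kept in memory, which is the property that makes this kind of detector suitable for unbounded streams. A production detector would typically also decide whether flagged values should be excluded from the running statistics.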

(B) Cognitive agent systems for robotics and for the intelligent delivery of sensory data and advanced services
The research activity aims to provide methodologies and tools for defining autonomous cognitive agents in order to deliver innovative and intelligent services, such as software systems able to perceive, understand, learn, and act autonomously; systems that learn through human interaction and by understanding human actions; and systems that support decisions and analysis.
The main research activities are: (1) design and experimentation of software architectures based on the cognitive paradigm, and of hierarchical models based on neural networks and Bayesian networks for representing knowledge at different abstraction levels; (2) anthropomorphic robotic applications, automatic language understanding, facial expression analysis, agent-based systems and design methodologies; (3) processing of sensory data from autonomous platforms or wireless sensor networks to monitor complex scenes, and extraction of relevant features from images and videos.
The activities belong to the field of knowledge engineering and develop various research themes (such as autonomous intelligent systems, data mining, ontologies, and the semantic web) from a cognitive-agent point of view. The methodologies studied are tested in a number of application fields: cultural heritage, security, geographical information systems, environmental monitoring, agricultural applications, support for disabled persons, e-learning, and biomedical applications.
Specific research activities deal with the study and development of theoretical and methodological tools supporting intelligent autonomous systems, using machine learning techniques, innovative neural networks, self-organizing maps, Bayesian networks, dynamic Bayesian networks, latent semantic analysis, artificial intelligence, semantic networks, geometric techniques to represent knowledge, multi-agent systems, image and video processing, wireless sensor networks, and distributed processing.

(C) Intelligent services for computational grids and peer-to-peer systems
The aim of the research line is to investigate the methodologies, techniques and algorithms needed to make current Computational Grids evolve towards Intelligent Grids, in which knowledge services and P2P systems enable service-oriented infrastructures. Such infrastructures ease interoperability among users, applications and resources and provide scalable on-demand services to support innovation, cooperative work, problem solving and decision support.
The research activities are: (1) investigation and realization of a robust and effective system for the management and enactment of dynamic workflows, based on decentralized and self-organizing P2P algorithms for the search and discovery of the resources to compose; (2) study of models and algorithms to build Grid information systems based on multi-agent systems cooperating through social networks (small-world, scale-free, etc.) in P2P networks; (3) design of intelligent Grid problem solving environments (PSE), based on ontologies and metadata for the semantic modelling of Grid services, specialized in the solution of Geoscience problems; (4) development of distributed and high-performance algorithms for Grid/P2P data mining, based on innovative paradigms such as ensemble and swarm intelligence techniques, to support the discovery of patterns in repositories of existing data and/or data generated by the Grid operation; (5) development of algorithms for the querying and mining of data streams in Grid/P2P systems.
Specific activities include: studying and creating a system for the management of autonomic workflows, able to provide an open platform for dynamic service composition and for the enactment of evolving services that continuously adapt to environmental changes; use of swarm intelligence and evolutionary computation techniques for the design of self-organizing Grids that automatically guarantee the negotiation of contracts regarding the Service Level Agreement (SLA), using Quality of Service (QoS) parameters specified by users; development of a Grid/Web services and workflow architecture to support new distributed data mining algorithms on data streams, based on the cooperative co-evolution paradigm and fractal theory; studying and creating a distributed data-driven system for the diagnosis of anomalous behaviours that affect the productivity of a transhipment terminal.

(D) Pervasive computational grids for high performance computing
The main aim of the research line is the development of methods, algorithms and software for Computational Grids, providing high-performance computing services for the solution of engineering and scientific applications. Specific goals are the design and development of high-performance algorithms and software, to be included in software architectures for Grid Computing, for applications such as the simulation of clean engines, the simulation of aerodynamic and environmental flows, and image processing. These applications are related to interdisciplinary collaborations with national and international partners.
Main research activities include the design and development of high-performance algorithms and software for computational simulation and image processing; design and development of Grid Middleware with emphasis on security and concurrent process management; the design and development of a Grid Platform for Modeling Internal Combustion Engines, within the activities of the Commission of the Department of Energy & Transport of CNR.

(E) Highly immersive virtual reality systems and advanced algorithms for image analysis
The research line aims at realizing highly immersive virtual environments that also offer multimodal interaction facilities and pervasive characteristics. Specific goals are the definition of architectural models for virtual reality environments and the modeling and realization of middleware infrastructure for pervasive computing environments. The definition and implementation of algorithms for image and video processing is also of interest. Currently, the main field of application is medicine. In particular, many activities are aimed at the realization of enhanced virtual environments for medical diagnosis and of applications for pervasive healthcare.
Main activities are: (1) definition of architectural models and services suitable for rapid prototyping of pervasive computing environments; (2) design of techniques and algorithms for acquiring and reasoning on data coming from uncertain contexts in pervasive healthcare and e-health applications; (3) definition of sub-attributes of usability and related metrics in applications for the manipulation of 3D objects; (4) investigation of techniques for semantic handling of collisions in RFID-based location and tracking applications; (5) design and implementation of parallel algorithms for direct volume rendering on multicore architectures; (6) construction of ontologies for medical applications; (7) development of algorithms for processing real-time sequences of images.

(F) Service-oriented multi-multimedia content management and integration
The main goals of the research line are to develop modules and tools for content representation in video and images through sharable conceptual schemes, and to investigate tools, operators and metrics for retrieval driven by the content of images and video sequences.
The research activities are: (1) the definition of a methodology to describe processes for manipulating and monitoring multi-dimensional documents using semantic annotation formalisms (this methodology is iterative and incremental, so that algorithms, descriptions and comparison methods can be integrated in a CAE platform for hypermedia authoring); (2) semi-automatic annotation of video objects using the MPEG-7 standard; (3) implementation of an API for accessing and manipulating MPEG-7 content; (4) development of graphical interfaces for video file navigation by means of structured metadata describing visual and semantic content.

(G) Evolutionary Computing methodologies and tools and their application to modelling and optimization in Complex Systems
This research has two goals. The first, and the most immediate, is to design and implement, in both sequential and parallel versions, a set of tools that, after a suitable tuning phase carried out either automatically or manually to set a small number of parameters, can easily be used to provide sub-optimal solutions for data processing in applications related to optimization, data mining and forecasting. The second goal, less immediate since it requires a deep study of the main features of adaptive pervasive systems, is the definition of evolutionary heuristics and related operators and their implementation as algorithms. These algorithms must not only provide sub-optimal solutions for highly complex systems such as pervasive ones, but must also be able to modify those solutions when possible and unforeseeable changes occur in the system itself.
The activities involve: (1) the development of machine learning techniques based on Evolutionary Algorithms for data processing in optimization, forecasting and rule discovery; (2) the development, experimentation and evaluation of parallel and distributed versions of the above techniques on test problems known in the literature; (3) retrieval of real data, taken from the SME sector or from other areas, to be processed with the above tools; (4) the study, within the Evolutionary Computation field, of new models and new operators to address pervasive adaptation issues effectively and efficiently; implementation, experimentation and evaluation of such new algorithms on test problems; retrieval of data from real problems in the pervasive systems area to be tackled with the above methodologies.
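To make the notion of an Evolutionary Algorithm with a small number of tunable parameters concrete, here is a minimal sketch under stated assumptions; it is not one of the Institute's tools. It minimizes the standard sphere test function using truncation selection, uniform crossover and Gaussian mutation. The function names and all parameter values (`pop_size`, `generations`, `sigma`, `seed`) are hypothetical choices for illustration.

```python
import random


def sphere(x):
    """Standard test problem: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def evolve(fitness, dim=5, pop_size=20, generations=50, sigma=0.3, seed=0):
    """Minimize `fitness` with a simple elitist evolutionary algorithm."""
    rng = random.Random(seed)
    population = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [v + rng.gauss(0, sigma) for v in child]  # Gaussian mutation
            children.append(child)
        population = parents + children  # elitism: parents survive unchanged
    best = min(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve(sphere)
```

Because the parents survive each generation unchanged, the best fitness recorded in `history` can never get worse. The solution returned is generally close to, but not exactly at, the optimum, which matches the sub-optimal solutions the text refers to.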

(H) Intelligent data analysis for comprehensive security
The research line is focused on studying and developing methods, techniques, models and technologies for comprehensive information security. The idea is to reconsider the traditional view of managing secure information in a broader sense that includes several issues, such as risks coming from networking activities, risks coming from data management, and risks coming from terrorist threats.
Research activities are: (1) analysis of complex/textual data, with particular emphasis on homeland security/computer security issues; (2) outlier detection; (3) privacy-preserving OLAP; (4) a number of meaningful applications of data mining techniques in the field of security, e.g., fiscal fraud detection (in collaboration with the Italian Revenue Agency) and secure management of the Gioia Tauro seaport (within the framework of the Logistics Technological District of the area).

(I) Machine learning models and techniques for bioinformatics
The research line deals with artificial intelligence and machine learning techniques for the interpretation of biological data and with the development of intelligent applications on high-performance distributed architectures for analysing, managing and simulating biological and molecular data.
Research activities are: (1) application of machine learning techniques to pattern recognition and motif finding in genomic sequences and molecular structures, to the support of the drug discovery process, and to the visualization and exploration of large collections of molecular data; (2) investigation of new methodologies for the evaluation of software for the analysis of DNA microarray images and of content-based indexing methodologies for biomedical images; (3) development of methodologies for data-driven induction of conceptual spaces in order to analyse large amounts of heterogeneous bioinformatics sources; (4) modelling and simulation of specific bioinformatics domains using artificial intelligence techniques (ontologies, multi-agent systems).

(J) Advanced algorithms and architectures for bioinformatics
The research line aims at planning and developing secure and innovative computer architectures, biological databases, mathematical models and software modules for the analysis of genomic, proteomic and transcriptomic data.
The research activities are: (1) planning and implementation of algorithms for the analysis of gene expression data produced by high-throughput microarray technology and for subset selection of regulatory genes discriminating between different states (healthy/ill, response to drugs,...), as well as efficient algorithms and software for the solution of differential problems deriving from kinetic biochemical models of energy metabolism in human cells; (2) definition of an integrated architectural model to satisfy the needs of life scientists at the different scales of interest, from the local processor with parallel components to the global level of grids involving different organizations; (3) specification of the levels of the architecture components, defining their computational characteristics and the associated inter-level interfaces, with respect to program instructions and from microgrids to global grids, possibly using P2P and SOAP approaches; (4) evaluation and comparison of the selected integrated multilevel functional programming model CHIARA (developed by ICAR) with both Haskell and Microsoft F#.