Focus

Rulex®: software for knowledge extraction from data

How often, in daily research activity, do we face datasets that contain relevant knowledge which direct visual inspection cannot extract? And how often would we like an integrated system capable of performing advanced statistical analysis or running complex data mining algorithms, without having to deal with obscure options and unfamiliar formats?

Extracting knowledge from real-world data, and producing effective models that can forecast the evolution of a physical system of interest, has been the main goal of the Machine Learning group, which has worked for more than 25 years at the Institute of Electronics, Computer and Telecommunication Engineering of CNR. After gaining experience with the most advanced statistical and machine learning methods (neural networks, decision trees, support vector machines, ...), the group focused its research on a new paradigm based on the reconstruction of Boolean functions from examples.

This novel approach made it possible to develop new models, named Switching Neural Networks (SNN), whose behavior can be entirely described by simple, intelligible if-then rules. Applying these models in different scientific fields (in particular to the analysis of biological data) has led to relevant results published in important international journals.
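To give a concrete picture of what "entirely described by if-then rules" can mean, here is a minimal Python sketch of a model that is nothing more than an ordered list of rules. It is only an illustration: it does not reproduce the SNN learning algorithm, and the names `Rule`, `predict`, `marker_a` and `marker_b`, as well as the thresholds, are hypothetical and not taken from any real model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class Rule:
    """One if-then rule: if every condition holds, predict `output`."""
    conditions: List[Callable[[Record], bool]]
    output: str

    def matches(self, record: Record) -> bool:
        # A rule fires only when all of its conditions are satisfied.
        return all(cond(record) for cond in self.conditions)


def predict(rules: List[Rule], record: Record, default: str = "unknown") -> str:
    """Return the output of the first rule whose conditions all hold."""
    for rule in rules:
        if rule.matches(record):
            return rule.output
    return default


# Hypothetical rules over made-up features (assumption, for illustration only).
RULES = [
    Rule([lambda r: r["marker_a"] > 2.5, lambda r: r["marker_b"] <= 1.0], "positive"),
    Rule([lambda r: r["marker_a"] <= 2.5], "negative"),
]

print(predict(RULES, {"marker_a": 3.0, "marker_b": 0.5}))  # positive
print(predict(RULES, {"marker_a": 1.0, "marker_b": 0.5}))  # negative
```

Because every prediction can be traced back to the single rule that fired, a model of this kind can be read and checked by a domain expert, which is the sense in which the rules are intelligible.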

To make these models and the related innovative algorithms easier to use, a CNR spin-off called Impara Srl was founded in 2007. Its aim was to start a prototyping activity that would make the learning algorithms for SNN more efficient, while building an integrated platform for applying them to the analysis of data from any real-world source.

These two targets have been reached by creating the Rulex suite (an acronym of RULe EXtraction) for the management, visualization and analysis of data: an integrated visual platform allows the user to perform any operation in a simple and direct way, without needing to know implementation details about storage and execution. An analysis process is built simply by connecting elementary blocks to the data flow, following a visual programming procedure.
The computational kernel of Rulex, written entirely in highly optimized C code, achieves great efficiency: datasets containing a billion data items have been analyzed in reasonable time on a conventional personal computer. In particular, Impara has implemented an optimized version of the SNN model, named Logic Learning Machine (LLM), which turns out to be a valuable tool for solving many application problems.
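The idea of building an analysis by connecting elementary blocks to a data flow can be pictured, very loosely, in code. The following Python sketch is a textual analogy only: Rulex itself is a visual platform, and none of the names used here (`connect`, `drop_missing`, `add_total`, the columns `a` and `b`) correspond to its actual blocks or interfaces.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[float]]
Block = Callable[[List[Row]], List[Row]]


def connect(*blocks: Block) -> Block:
    """Chain blocks so that each one consumes the previous block's output."""
    def process(rows: List[Row]) -> List[Row]:
        for block in blocks:
            rows = block(rows)
        return rows
    return process


def drop_missing(rows: List[Row]) -> List[Row]:
    """Hypothetical cleaning block: discard rows that contain a missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def add_total(rows: List[Row]) -> List[Row]:
    """Hypothetical transformation block: add a derived column `total`."""
    return [{**r, "total": r["a"] + r["b"]} for r in rows]


# The whole analysis is just the ordered connection of elementary blocks.
process = connect(drop_missing, add_total)

data = [{"a": 1.0, "b": 2.0}, {"a": None, "b": 1.0}]
print(process(data))  # [{'a': 1.0, 'b': 2.0, 'total': 3.0}]
```

In this analogy each block knows nothing about how the data are stored or how the other blocks run; it only receives rows and returns rows, which is what lets blocks be rearranged freely.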

Currently Rulex is employed, as a standalone program or in its OEM version, by several academic researchers as well as by major private enterprises, such as Danone, Granarolo, De Cecco (for forecasting sales volumes during promotional events), Lennox (for predicting seasonality effects on product demand), Ansaldo (for the preventive diagnosis of plants), Novacoop (for the analysis of inventory differences), Poste Mobile (for customer segmentation) and RS Components (for estimating the success probability of new products).