gCube: a Service-Oriented Application Framework on Grid

The DLib group, led by Donatella Castelli, of the ISTI-CNR Networked Multimedia Information Systems Laboratory has developed a highly innovative application framework, gCube, for building and operating e-Infrastructures that promote effective and efficient sharing of computing, content, and application service resources. These e-Infrastructures simplify the creation of Virtual Research Environments (VREs), i.e. collaborative digital environments through which scientists addressing common research challenges exchange information and produce new knowledge.
To implement this scenario, gCube provides threefold support for building e-Infrastructures capable of enabling VRE development and operation by relying on grid technologies. It provides (i) runtime and design frameworks for the development of services that can be outsourced to a grid-enabled infrastructure; (ii) a service-oriented, new-generation grid middleware for exploiting the computing and storage facilities of first-generation grids and for dynamically hosting and managing Web Services on them; and (iii) a set of application services for distributed information management and retrieval of structured and unstructured data.
Runtime frameworks are distinguished workflows that are partially pre-defined within the system; they include infrastructure operation services and application services, where the former coordinate the actions of the latter in a purely distributed way, relying on a high-level characterisation of their semantics. Design frameworks are patterned blueprints, software libraries, and partial implementations of state-of-the-art application functionality that can be configured, extended, and instantiated into bespoke grid-enabled application services.
The service-oriented, new-generation grid middleware provides all the capabilities needed to manage grid infrastructures supporting VREs. It implements facilities that eliminate manual service deployment and management overheads by guaranteeing optimal placement of services within the infrastructure, and it opens unique opportunities for outsourcing state-of-the-art implementation operations. Rather than interfacing with the infrastructure, the software that implements the application services is literally handed over to it, so as to be transparently deployed across its constituent nodes according to functional constraints and Quality of Service requirements. By integrating the gLite system released by the Enabling Grids for E-sciencE (EGEE) project for batch processing and management of unstructured data, gCube also makes it possible to exploit the large amount of computing and storage facilities provided by the EGEE infrastructure. With over 60,000 CPUs and 20 Petabytes of storage provided through 250 sites hosted in 48 countries, EGEE is today the largest operational grid infrastructure ever built.
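The idea of deploying a service according to functional constraints and Quality of Service requirements can be illustrated with a small sketch. This is not gCube code: the `Node` and `ServiceSpec` classes, the `place` function, and all field names and values are hypothetical, and the selection rule (most spare CPUs, then lowest latency) is an assumed stand-in for whatever policy the middleware actually applies.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Node:
    """A hosting node in the infrastructure (hypothetical model)."""
    name: str
    capabilities: Set[str]   # functional features the node offers
    free_cpus: int           # CPUs currently available
    latency_ms: float        # typical response latency


@dataclass
class ServiceSpec:
    """What a service asks of its host (hypothetical model)."""
    name: str
    requires: Set[str]       # functional constraints
    cpus: int                # CPUs needed
    max_latency_ms: float    # Quality of Service requirement


def place(service: ServiceSpec, nodes: List[Node]) -> Optional[Node]:
    """Return the chosen node, or None if no node qualifies."""
    eligible = [
        n for n in nodes
        if service.requires <= n.capabilities        # functional constraints
        and n.free_cpus >= service.cpus              # capacity
        and n.latency_ms <= service.max_latency_ms   # QoS requirement
    ]
    if not eligible:
        return None
    # Assumed policy: most spare CPUs first, ties broken by lower latency.
    chosen = max(eligible, key=lambda n: (n.free_cpus, -n.latency_ms))
    chosen.free_cpus -= service.cpus   # reserve the capacity on that node
    return chosen


nodes = [
    Node("node-a", {"java", "wsrf-container"}, free_cpus=4, latency_ms=40.0),
    Node("node-b", {"java"}, free_cpus=16, latency_ms=10.0),
    Node("node-c", {"java", "wsrf-container"}, free_cpus=8, latency_ms=25.0),
]
index_service = ServiceSpec("index", {"wsrf-container"}, cpus=2, max_latency_ms=50.0)
host = place(index_service, nodes)
print(host.name if host else "no eligible node")
```

In this example node-b is excluded because it lacks the required container, and node-c is preferred over node-a because it has more spare CPUs. The service provider only describes its requirements; choosing the host is left to the infrastructure.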
gCube application services offer a full platform for distributed hosting, management and retrieval of data and information, and a framework for extending state-of-the-art indexing, selection, fusion, extraction, description, annotation, transformation, and presentation of content. In particular, gCube is equipped with services for manipulating information objects, importing external objects, managing their metadata in multiple formats, securing the information objects against unauthorised access, and transparently managing replication and partitioning on the grid.
To promote interoperability, gCube services are implemented in accordance with second-generation Web Service standards, most notably SOAP, BPEL, WSRF, WS-Notification, WS-Security, and WS-Addressing, as well as the JSR168 Portal and Portlets specifications.
gCube has been successfully used to develop and operate the DILIGENT e-Infrastructure, built in the framework of the EU IST project of the same name. This e-Infrastructure supports different VREs serving research communities in Environmental Monitoring, Cultural Heritage and other domains. The Environmental Monitoring community, led by ESA, the European Space Agency, brings together leading actors in the environmental sector who prepare periodic conventions and reports by integrating and processing huge amounts of Earth-related data. The Cultural Heritage community is a worldwide community of scholars working to establish a novel discipline that merges heterogeneous experiences to investigate the relationships between images and texts. Thanks to its openness and customisability, the gCube software has also been used to satisfy the needs of two completely different scenarios arising in the context of the Health-e-Child and SAPIR EU projects. In particular, thanks to gCube support, the SAPIR project has been able to process 37 million images in only 16 weeks and produce approximately 112 million text and image objects (about 5 TB of data) containing more than 150 million extracted features.
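To put the SAPIR figures in perspective, the short calculation below derives an average processing rate and an average object size from the numbers quoted above. It is only a rough estimate under two assumptions that the text does not state: that processing ran without interruption for the full 16 weeks, and that "5 TB" means decimal terabytes.

```python
# Figures quoted in the text for the SAPIR run.
images = 37_000_000
weeks = 16
objects = 112_000_000        # "approximately 112 million"
data_bytes = 5 * 10**12      # "about 5 TB", assuming decimal terabytes

# Assumes uninterrupted processing, 24 hours a day, 7 days a week.
seconds = weeks * 7 * 24 * 3600
images_per_second = images / seconds
avg_object_kb = data_bytes / objects / 1000

print(f"{images_per_second:.1f} images/s, {avg_object_kb:.0f} kB per object")
# prints: 3.8 images/s, 45 kB per object
```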
gCube will also be at the core of the e-Infrastructure developed and operated by the D4Science EU IST project. This project will develop a more powerful e-Infrastructure serving the needs of the Environmental Monitoring community (led by ESA, the European Space Agency) and the Fishery and Aquaculture Resources Management community (led by FAO, the Food and Agriculture Organisation of the United Nations). By providing these communities with seamless access to numerous data sources, such as satellite (ocean colour and reef maps), climate, hydrographic and trade data, and with tools to manage them in the organised environment of gCube-based VREs, D4Science will provide the basis for innovative socio-ecosystem modelling approaches.