Graph-Massivizer researches and develops a high-performance, scalable, and sustainable platform for information processing and reasoning based on the massive graph representation of extreme data. It delivers a toolkit of five open-source software tools and FAIR graph datasets covering the sustainable lifecycle of processing extreme data as massive graphs. The tools focus on holistic usability (from extreme data ingestion to massive graph creation), automated intelligence (through analytics and reasoning), performance modelling, and environmental sustainability tradeoffs, supported by credible data-driven evidence across the computing continuum. Automated operation based on the emerging serverless computing paradigm enables experienced and novice stakeholders from a broad range of large and small organisations to capitalise on extreme data through massive graph programming and processing.
The repositories in this GitHub organization are currently private while development is ongoing. As toolkit integration proceeds, the tools will be made public on this page.
GraphMa, a component of the Graph-Inceptor tool, applies pipeline-computation principles through modular, composable functions for structured graph data analysis and processing, built on computational abstractions such as computation as type, higher-order traversal abstraction, and a directed data-transfer protocol.
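The sketch below illustrates the composable-pipeline idea in plain Python, assuming a networkx graph as the data carrier; the stage names and the `compose` helper are illustrative only and not part of the actual GraphMa API.

```python
# Minimal sketch of composable graph-pipeline stages (illustrative, not the GraphMa API).
from functools import reduce
import networkx as nx

def compose(*stages):
    """Chain pipeline stages left to right into a single callable."""
    return lambda graph: reduce(lambda g, stage: stage(g), stages, graph)

def drop_isolated_nodes(g: nx.Graph) -> nx.Graph:
    """Stage 1: remove nodes without any edges."""
    g = g.copy()
    g.remove_nodes_from(list(nx.isolates(g)))
    return g

def annotate_degree(g: nx.Graph) -> nx.Graph:
    """Stage 2: attach each node's degree as an attribute."""
    nx.set_node_attributes(g, dict(g.degree()), "degree")
    return g

pipeline = compose(drop_isolated_nodes, annotate_degree)

g = nx.Graph([(1, 2), (2, 3)])
g.add_node(4)                      # isolated node, removed by the first stage
result = pipeline(g)
print(sorted(result.nodes(data=True)))
```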
The Graph-Inceptor ETL Pipeline creates knowledge graphs (KGs) from large data sources using semantic mappings and stores them in batches, deployed on a scalable cloud infrastructure consisting of servers and storage systems.
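As a rough illustration of such a batch ETL step, the following sketch maps tabular records onto RDF triples and serialises them batch by batch with rdflib; the namespace, column names, and batch layout are assumptions, not the tool's actual mapping.

```python
# Illustrative batch ETL step: apply a simple semantic mapping to records and store RDF batches.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/kg/")   # hypothetical namespace

def rows_to_kg(rows):
    """Map dict records onto a small knowledge graph."""
    g = Graph()
    g.bind("ex", EX)
    for row in rows:
        company = EX[row["company_id"]]
        g.add((company, EX.name, Literal(row["name"])))
        g.add((company, EX.sector, Literal(row["sector"])))
    return g

def store_in_batches(rows, batch_size=1000, prefix="kg_batch"):
    """Serialize the mapped graph one batch at a time."""
    for i in range(0, len(rows), batch_size):
        kg = rows_to_kg(rows[i:i + batch_size])
        kg.serialize(destination=f"{prefix}_{i // batch_size}.ttl", format="turtle")

store_in_batches([{"company_id": "acme", "name": "ACME", "sector": "manufacturing"}])
```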
Graph-Scrutinizer provides various basic graph operation (BGO) analytics, such as sampling, summarisation, traversal, and machine learning (e.g., graph neural network, GNN) algorithms, translated into optimised implementations for heterogeneous hardware (HPC, edge, cloud).
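The following sketch shows one such basic graph operation, a random-walk sampler over a networkx graph; it is a generic illustration rather than Graph-Scrutinizer's optimised implementation.

```python
# Generic random-walk sampling sketch (not the tool's optimised implementation).
import random
import networkx as nx

def random_walk_sample(g: nx.Graph, steps: int, seed: int = 0) -> nx.Graph:
    """Return the subgraph induced by the nodes visited in a simple random walk."""
    rng = random.Random(seed)
    node = rng.choice(list(g.nodes))
    visited = {node}
    for _ in range(steps):
        neighbours = list(g.neighbors(node))
        if not neighbours:
            break
        node = rng.choice(neighbours)
        visited.add(node)
    return g.subgraph(visited).copy()

g = nx.karate_club_graph()
sample = random_walk_sample(g, steps=50)
print(sample.number_of_nodes(), sample.number_of_edges())
```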
Graph-Optimizer combines analytical models, micro-benchmarking, graph sampling, simulation, and automated validation to predict the performance and energy footprint of a given graph processing workload.
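A toy analytical model along these lines is sketched below: traversal throughput and average power (which the real tool would obtain from micro-benchmarking) are plugged in to estimate runtime and energy; all figures are placeholder assumptions.

```python
# Toy analytical runtime/energy model; throughput and power values are placeholders.
from dataclasses import dataclass

@dataclass
class HardwareProfile:
    name: str
    edges_per_second: float   # sustained traversal throughput (micro-benchmarked)
    avg_power_watts: float    # average power draw under graph workloads

def predict(edges: int, passes: int, hw: HardwareProfile):
    """Predict runtime (s) and energy (J) of a traversal-style workload."""
    runtime = edges * passes / hw.edges_per_second
    energy = runtime * hw.avg_power_watts
    return runtime, energy

cloud_node = HardwareProfile("cloud-vm", edges_per_second=2e8, avg_power_watts=180.0)
runtime, energy = predict(edges=1_000_000_000, passes=10, hw=cloud_node)
print(f"{runtime:.1f} s, {energy / 3600:.1f} Wh")
```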
Graph-Greenifier is a simulation tool that lets data centre operators and application developers create scenarios quantifying the carbon impact of workloads across different locations and hardware, supporting informed operational decisions.
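The sketch below illustrates the kind of what-if scenario this enables, multiplying a workload's energy by per-location grid carbon intensities; the region names and intensity values are hypothetical.

```python
# What-if carbon scenario sketch; regions and intensities are hypothetical placeholders.
GRID_INTENSITY_G_PER_KWH = {
    "region-a": 450.0,
    "region-b": 120.0,
    "region-c": 30.0,
}

def carbon_footprint(energy_kwh: float, region: str) -> float:
    """Return estimated emissions in kg CO2e for running a workload in a region."""
    return energy_kwh * GRID_INTENSITY_G_PER_KWH[region] / 1000.0

workload_energy_kwh = 12.5
for region in GRID_INTENSITY_G_PER_KWH:
    print(region, f"{carbon_footprint(workload_energy_kwh, region):.2f} kg CO2e")
```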
Graph-Choreographer is a serverless orchestration tool that executes single, ensemble, and batch graph applications across the computing continuum, scheduling them according to performance and energy tradeoffs.
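A minimal sketch of such a tradeoff-based placement decision follows, scoring candidate targets by normalised runtime and energy predictions; the candidate names, figures, and weighting scheme are illustrative, not the tool's actual scheduler.

```python
# Illustrative tradeoff-based placement decision; candidates and numbers are made up.
candidates = [
    # (target, predicted runtime in s, predicted energy in J)
    ("edge-node", 95.0, 3_500.0),
    ("cloud-vm", 40.0, 9_000.0),
    ("hpc-partition", 12.0, 22_000.0),
]

def pick_target(candidates, energy_weight=0.5):
    """Choose the target with the best normalised runtime/energy tradeoff."""
    max_rt = max(rt for _, rt, _ in candidates)
    max_en = max(en for _, _, en in candidates)
    def score(candidate):
        _, rt, en = candidate
        return (1 - energy_weight) * rt / max_rt + energy_weight * en / max_en
    return min(candidates, key=score)[0]

print(pick_target(candidates, energy_weight=0.3))   # favours performance
print(pick_target(candidates, energy_weight=0.8))   # favours energy
```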
Synthetic financial data generation for extreme volumes of stocks and commodity futures, adaptable to additional financial securities such as options, bonds, exchange-traded funds, mutual funds, and currencies (a minimal generation sketch follows this list)
Analysis of company-related events from historical data, identification of patterns in common event sequences, and prediction of the most likely subsequent events by matching new events against those patterns
Integration of traditional expert knowledge with sensor data for quality monitoring in manufacturing, combining KGs with time-series sensor data models to enhance explainability, accuracy, and flexibility in quality predictions, and provisioning of expert insights and real-time measurements for superior quality control
Continuous prediction of compute node failures in a high-performance computing system, based on an anomaly prediction model that leverages the nodes’ physical layout and is integrated into the monitoring system through a continuous graph neural network (GNN) deployment pipeline (see the GNN sketch below)
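For the synthetic financial data use case above, the following sketch generates stock-price paths with geometric Brownian motion using NumPy; the drift, volatility, and volume parameters are made-up assumptions, not the project's actual generator.

```python
# Synthetic price-path generation via geometric Brownian motion (illustrative parameters).
import numpy as np

def synthetic_prices(n_series, n_steps, s0=100.0, mu=0.05, sigma=0.2,
                     dt=1 / 252, seed=0):
    """Simulate n_series GBM price paths of n_steps daily observations."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal((mu - 0.5 * sigma**2) * dt,
                        sigma * np.sqrt(dt),
                        size=(n_series, n_steps))
    return s0 * np.exp(np.cumsum(shocks, axis=1))

prices = synthetic_prices(n_series=1_000, n_steps=252)
print(prices.shape, prices[0, -1])
```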
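For the compute-node failure use case, the sketch below scores per-node failure likelihood with a small graph convolutional network over a toy node-topology graph, assuming PyTorch Geometric is available; the features, topology, and model are illustrative and not the project's actual pipeline.

```python
# Toy per-node failure-likelihood scorer over a node-topology graph (not the project's model).
import torch
from torch_geometric.nn import GCNConv

class NodeFailureGNN(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, 1)

    def forward(self, x, edge_index):
        h = torch.relu(self.conv1(x, edge_index))
        return torch.sigmoid(self.conv2(h, edge_index)).squeeze(-1)

# Toy physical layout: 4 compute nodes wired as a chain, with bidirectional edges.
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                           [1, 0, 2, 1, 3, 2]], dtype=torch.long)
x = torch.rand(4, 8)            # e.g. per-node temperature/load telemetry (random here)
model = NodeFailureGNN(in_dim=8, hidden_dim=16)
scores = model(x, edge_index)   # per-node failure likelihood in [0, 1]
print(scores)
```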