BigDataStack - High-performance data-centric stack for big data applications and operations

Brief description

BigDataStack will deliver a complete high-performance data-centric stack of technologies as a unique combined and cross-optimized offering that addresses the emerging needs of data operations and applications.

BigDataStack introduces the paradigm of a new frontrunner data-driven architecture and system ensuring that infrastructure management will be fully efficient and optimized for data operations and data-intensive applications. The management system will be scalable and runtime adaptable to facilitate the deployment and management of computing, storage and networking resources, while also considering their interdependencies. As a holistic solution, BigDataStack incorporates approaches that range from data-focused application analysis and dimensioning, process modelling, cluster resources / nodes characterisation, management and runtime optimization, to information-driven networking.

BigDataStack provides Data as a Service by addressing and interlinking the complete data path operations: cleaning, modelling and interoperability, distributed storage and analytics. Addressing the need for analytics on both data in flight and at rest, BigDataStack provides a seamless data analytics framework to analyse data in a holistic fashion across multiple data stores and locations, along with advanced modelling techniques defining flexible data schemas that can be exploited across multiple processing frameworks to eliminate the need to adapt to new schemas.

BigDataStack is fully exploitable and open, putting emphasis on usability, through the envisioned Data Toolkit that allows the specification of any analytics task in a declarative way and its efficient integration in the aforementioned data path operations.

Main Features

BigDataStack provides a complete infrastructure management system, which bases its management and deployment decisions on data from current and past application and infrastructure deployments. This complete infrastructure management system is delivered as a full “stack” that serves the needs of data operations and applications in an optimized way.

Areas of Application

The BigDataStack project has been validated and challenged by three commercial use cases: real-time shipment management, connected consumer, and smart insurance. Nevertheless, BigDataStack solutions can be applied in many vertical sectors that need real-time data analytics.

Customer Benefits

BigDataStack delivers benefits for customers on both the supply side and the demand side:

Supply side (covering entities providing services and products):

Infrastructure providers: BigDataStack data-driven infrastructure / cluster management will provide the means for wider offerings facilitating big data needs through efficient and performant management of all resources.
Data providers: Data as a Service on top of data-optimized infrastructure resources allowing them to offer cleaned, modelled, stored and analysed data of value.
Application providers: Application dimensioning workbench and exploitation of meaningful data through advanced infrastructure services enabling them to provide data-intensive applications with specific performance and quality guarantees.
Data practitioners: Data toolkit and visualization environment allowing them to develop and ingest their own algorithms and offer them through BigDataStack deployments.
Infrastructure brokers: Acting as second-step entities (following infrastructure providers) that take advantage of the BigDataStack data-driven infrastructure management solution.
Data aggregators & Data (re-)sellers: Acting as second-step entities (following data providers) that take advantage of the Data as a Service according to their business models and goals.
Marketplace owners: Acting as second-step entities (following application providers) that take advantage of data-intensive application provisioning.

User / demand side (covering actual end-users):

Citizens: Using applications, services and products with guaranteed levels of quality.
SMEs and big industries: Utilizing BigDataStack for their deployments facilitating their internal data needs, using the Process Modelling Framework to optimize their processes, and / or utilizing BigDataStack services offered by different providers.
Public organisations: Exploiting BigDataStack offerings with optimized operations across the complete data path as required for data being handled by public organisations.
Entrepreneurs: Deploying and offering data-intensive applications through the efficient BigDataStack infrastructure.
Decision makers such as chief data / information / marketing officers: Using the adaptive visualization environment and the process modelling framework to drive business decisions based on accurate, timely, meaningful data and analytics outcomes.

Technological novelty

Address the need for analytics on both data in flight and at rest and optimize the deployment and execution of big data applications and operations through data-driven management of the underlying resources.
Provide a seamless data analytics framework to analyse data across multiple data stores and locations by eliminating the need to perform application-level adaptations to address (and query) the various heterogeneous underlying stores.
Provide Data-as-a-Service covering the complete data path operations, including data quality assessment, efficient distributed storage, optimized performance through data skipping, and data analytics.
Facilitate usability and extensibility by delivering a fully exploitable, open source solution through a Data Toolkit that enables new analytics (and analytics pipelines) to be ingested and executed in the underlying infrastructure.
Deliver an innovative Process Modelling framework, accompanied by process analytics / mining and process mapping techniques, to address the needs of business analysts.
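
The list above mentions data skipping as a way to optimize performance. As a rough illustration of the general idea only (not BigDataStack's actual implementation), the minimal Python sketch below keeps minimum and maximum values for each stored chunk and uses them to avoid reading chunks that cannot contain a match. The names Chunk, make_chunk and query_greater_than, as well as the chunk layout, are hypothetical and chosen purely for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Chunk:
    """A block of stored values plus min/max metadata (hypothetical layout)."""
    values: List[int]
    min_value: int
    max_value: int


def make_chunk(values: List[int]) -> Chunk:
    # The metadata is computed once, when the chunk is written.
    return Chunk(values=values, min_value=min(values), max_value=max(values))


def query_greater_than(chunks: List[Chunk], threshold: int) -> Tuple[List[int], int]:
    """Return the matching values and the number of chunks actually read."""
    matches: List[int] = []
    chunks_read = 0
    for chunk in chunks:
        # Skip the chunk when its maximum proves that no value can match.
        if chunk.max_value <= threshold:
            continue
        chunks_read += 1
        matches.extend(v for v in chunk.values if v > threshold)
    return matches, chunks_read


chunks = [make_chunk([1, 4, 7]), make_chunk([10, 12, 15]), make_chunk([3, 20, 5])]
result, read = query_greater_than(chunks, 9)
print(result, read)
```

In this toy run the first chunk is skipped because its maximum (7) is below the threshold, so only two of the three chunks are scanned. The saving grows when data is laid out so that many chunks can be excluded from metadata alone.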