QualiMaster - A Configurable Real-Time Data Processing Infrastructure for Autonomous Quality Adaptation
The recent global financial crisis has shown that even in today's technological age, in which large parts of both private and working life are supported and simplified by information technology (IT), the early detection of cross-market risk situations remains difficult. Especially in areas such as the financial sector, where enormous volumes of data have to be processed immediately at the stock exchanges every day, IT still reaches its limits in such analyses. For example, up to 250 gigabytes of data (roughly 54 DVDs) of current stock trading and foreign exchange data are generated every day in Europe and America. Cross-country and cross-market risk analyses, as required for early intervention by central banks, demand real-time analysis of these data as well as the ability to focus on individual phenomena, i.e., to process more and more detailed data in risk situations. Data from social networks also play an increasingly important role: the downfall of Lehman Brothers began with the rumour that the bank could not raise its daily capital.
Currently, however, the corresponding IT systems are dimensioned for the maximum case, in which the maximum amount of data flows and, at the same time, the maximum processing power is required. This is neither effective nor cost-efficient, because capacity that could be used, for example, for additional detail analyses remains idle during periods of lower data volumes. The vision is to adapt such systems automatically and dynamically to the current situation, so that the existing capacity is used optimally (including for further analyses).
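To make this idea concrete, the following minimal sketch (in Java) shows what a load-dependent decision about spare capacity could look like; the class name, the thresholds and the possible actions are hypothetical illustrations and are not part of the QualiMaster software.

```java
// Illustrative sketch only: class name, thresholds and actions are hypothetical
// and not taken from the QualiMaster implementation.
public final class CapacityManager {

    public enum Action { ENABLE_DETAIL_ANALYSES, KEEP_CONFIGURATION, REDUCE_TO_CORE_ANALYSES }

    /**
     * Decides how to use the statically available resources for the current load
     * instead of permanently provisioning for the maximum case.
     *
     * @param utilization current cluster utilization between 0.0 (idle) and 1.0 (peak)
     */
    public Action decide(double utilization) {
        if (utilization < 0.5) {
            // plenty of head room: spend spare capacity on additional detail analyses
            return Action.ENABLE_DETAIL_ANALYSES;
        }
        if (utilization > 0.9) {
            // close to the maximum case: fall back to the essential analyses only
            return Action.REDUCE_TO_CORE_ANALYSES;
        }
        return Action.KEEP_CONFIGURATION;
    }
}
```

Such a decision component would be consulted periodically by the processing infrastructure, for example whenever updated load measurements arrive.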
The "Software Systems Engineering" working group led by Prof. Dr. Klaus Schmid works on methods and techniques for adapting software efficiently and for enabling the software to carry out this adaptation autonomously. With this and other work, the group has made a name for itself nationwide and is therefore a partner (alongside the L3S Research Center Hannover, the Telecommunication Systems Institute of the Technical University of Crete, and the companies Maxeler Technologies Ltd in London and Spring Techno in Bremen) in the three-year research project "QualiMaster", which is funded by the EU with approximately 2.9 million euros. In this EU project, the researchers of the working group focus in particular on the automatic configuration and adaptation of processing mechanisms for large data volumes with respect to different quality attributes.
With almost 100 million messages per second at the stock exchanges (Europe and America), time in particular is an important quality attribute. The processing and analysis of these messages must not delay the delivery of results to the respective recipient beyond the agreed quality criteria, because time is money. In the "QualiMaster" project, the SSE working group takes on the challenge of considering the relevant quality attributes and, based on their weighting, automatically adapting the processing mechanisms while taking into account the volume of data currently being processed.
In this way, the researchers support the development of a configurable real-time data processing system for autonomous quality adaptation. In the long term, such a system is intended to enable effective and efficient predictions about developments in the financial market (in the project, the analysis and prediction of systemic risks), but also in other areas with high data volumes, such as macroeconomic analyses, weather analyses, the analysis of social networks, or large scientific experiments.
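A minimal sketch of how weighted quality attributes could trigger such an adaptation is shown below; the quality dimensions, weights and the threshold are assumptions chosen purely for illustration and do not describe the actual QualiMaster configuration.

```java
import java.util.Map;

// Illustrative sketch only: quality dimensions, weights and the threshold are assumptions.
public final class QualityMonitor {

    // Relative importance of the quality dimensions; latency dominates because
    // "time is money" in financial analyses.
    private static final Map<String, Double> WEIGHTS =
            Map.of("latency", 0.6, "throughput", 0.3, "precision", 0.1);

    /** observed: normalized measurements per dimension, 1.0 = target met, 0.0 = target missed. */
    public double weightedQuality(Map<String, Double> observed) {
        return WEIGHTS.entrySet().stream()
                .mapToDouble(e -> e.getValue() * observed.getOrDefault(e.getKey(), 0.0))
                .sum();
    }

    /** Signals that the processing pipeline should be reconfigured. */
    public boolean adaptationNeeded(Map<String, Double> observed) {
        return weightedQuality(observed) < 0.8; // hypothetical quality threshold
    }
}
```

In practice, a drop of the weighted score below the threshold would start a reconfiguration of the processing pipeline, for example by switching to a faster but less precise analysis algorithm.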
Further information:
Duration: 3 years
Contact: Dr. Holger Eichelberger, eichelberger(at)sse.uni-hildesheim.de
The QualiMaster project is funded by the European Commission under grant 619525 in the area of Scalable Data Analytics within the 7th Framework Programme. The EU supports the project with approximately 2.9 million euros.
Deliverables
Number | Name |
---|---|
D1.1 | Initial Use Cases and Requirements |
D1.2 | Full Use Cases and Requirements |
D2.1 | Approach for Scalable, Quality-aware Data Processing |
D2.2 | Scalable, Quality-aware Data Processing Algorithms V1 |
D2.3 | Scalable, Quality-aware Data Processing Algorithms V2 |
D2.4 | Final report on Scalable, Quality-aware Data Processing Methods |
D3.1 | Translation of Data Processing Algorithms to Hardware |
D3.2 | Hardware-based Data Processing Algorithms V1 |
D3.3 | Hardware-based Data Processing Algorithms V2 |
D3.4 | Optimized Translation of Data Processing Algorithms to Hardware |
D4.1 | Quality-aware Processing Pipeline Modelling |
D4.2 | Quality-aware Processing Pipeline Adaptation V1 |
D4.3 | Quality-aware Processing Pipeline Adaptation V2 |
D4.4 | Quality-aware Processing Pipeline Modelling and Adaptation |
D5.1 | QualiMaster Infrastructure Set-up |
D5.2 | Basic QualiMaster Infrastructure |
D5.3 | QualiMaster Infrastructure V1 |
D5.4 | QualiMaster Infrastructure V2 |
D6.1 | QualiMaster Applications V1 (internal) |
D6.2 | Intermediary Evaluation Report |
D6.3 | QualiMaster Applications V2 (internal) |
D6.4 | Final Evaluation Report |
D7.1 | Initial Project Fact Sheet |
D7.2 | Project Presentation and Project Web Site |
D7.3 | Dissemination Plan |
The research leading to these results has received funding from the European Union Seventh Framework Programme [FP7/2007-2013] under grant agreement n° 619525.
Publications
2019

16. Cui Qin, Holger Eichelberger and Klaus Schmid (2019): Enactment of Adaptation in Data Stream Processing with Latency Implications - A Systematic Literature Review. In: Information and Software Technology, 111: 1-21. Free download: https://authors.elsevier.com/a/1Yvdh3O8rCSPcx
Abstract: [Context] Stream processing is a popular paradigm to continuously process huge amounts of data. Runtime adaptation plays a significant role in supporting the optimization of data processing tasks. In recent years runtime adaptation has received significant interest in scientific literature. However, so far no categorization of the enactment approaches for runtime adaptation in stream processing has been established. [Objective] This paper identifies and characterizes different approaches towards the enactment of runtime adaptation in stream processing with a main focus on latency as quality dimension. [Method] We performed a systematic literature review (SLR) targeting five main research questions. An automated search, resulting in 244 papers, was conducted. 75 papers published between 2006 and 2018 were finally included. From the selected papers, we extracted data like processing problems, adaptation goals, enactment approaches of adaptation, enactment techniques, evaluation metrics as well as evaluation parameters used to trigger the enactment of adaptation in their evaluation. [Results] We identified 17 different enactment approaches and categorized them into a taxonomy. For each, we extracted the underlying technique used to implement this enactment approach. Further, we identified 9 categories of processing problems, 6 adaptation goals, 9 evaluation metrics and 12 evaluation parameters according to the extracted data properties. [Conclusion] We observed that the research interest in enactment approaches to the adaptation of stream processing has significantly increased in recent years. The most commonly applied enactment approaches are parameter adaptation to tune parameters or settings of the processing, load balancing used to re-distribute workloads, and processing scaling to dynamically scale up and down the processing. In addition to latency, most adaptations also address resource fluctuation / bottleneck problems. For presenting a dynamic environment to evaluate enactment approaches, researchers often change input rates or processing workloads.

2017

15. Holger Eichelberger, Cui Qin and Klaus Schmid (2017): Experiences with the Model-based Generation of Big Data Applications. In: Lecture Notes in Informatics (LNI) - Datenbanksysteme für Business, Technologie und Web (BTW '17) - Workshopband, pp. 49-56.
Abstract: Developing Big Data applications implies a lot of schematic or complex structural tasks, which can easily lead to implementation errors and incorrect analysis results. In this paper, we present a model-based approach that supports the automatic generation of code to handle these repetitive tasks, enabling data engineers to focus on the functional aspects without being distracted by technical issues. In order to identify a solution, we analyzed different Big Data stream-processing frameworks, extracted a common graph-based model for Big Data streaming applications and developed a tool to graphically design and generate such applications in a model-based fashion (in this work for Apache Storm). Here, we discuss the concepts of the approach, the tooling and, in particular, experiences with the approach based on feedback of our partners.

14. Holger Eichelberger, Cui Qin and Klaus Schmid (2017): From Resource Monitoring to Requirements-based Adaptation: An Integrated Approach. In: Proceedings of the 8th ACM/SPEC International Conference on Performance Engineering Companion (ICPE '17), pp. 91-96. ACM.
Abstract: In large and complex systems there is a need to monitor resources as it is critical for system operation to ensure sufficient availability of resources and to adapt the system as needed. While there are various (resource-)monitoring solutions, these typically do not include an analysis part that takes care of analyzing violations and responding to them. In this paper we report on experiences, challenges and lessons learned in creating a solution for performing requirements-monitoring for resource constraints and using this as a basis for adaptation to optimize the resource behavior. Our approach rests on reusing two previous solutions (one for resource monitoring and one for requirements-based adaptation) that were built in our group.

13. Klaus Schmid and Holger Eichelberger (2017): Variability Modeling with EASy-Producer. In: Proceedings of the 21st International Systems and Software Product Line Conference, Vol. A, p. 251. ACM.
Abstract: EASy-Producer is an open-source research toolset for engineering product lines, variability-rich software ecosystems, and dynamic software product lines. In this tutorial, we will focus on its (textual) variability modeling capabilities as well as its configuration and validation functionality. Further, we will provide an outlook on how EASy-Producer can be applied to variability instantiation.

2016

12. Holger Eichelberger, Claudia Niederée, Apostolos Dollas, Ekaterini Ioannou, Cui Qin, Grigorios Chrysos, Christoph Hube, Tuan Tran, Apostolos Nydriotis, Pavlos Malakonakis, Stefan Burkhard, Tobias Becker and Minos Garofalakis (2016): Configure, Generate, Run - Model-based Development for Big Data Processing. In: European Project Space on Intelligent Technologies, Software Engineering, Computer Vision, Graphics, Optics and Photonics - Volume 1: EPS Rome 2016, pp. 124-148. SciTePress.
Abstract: The development of efficient and robust algorithms for Big Data processing is a demanding task, which has to cope with the characteristics of this type of data (3Vs). Putting such algorithms as processing elements into larger pipelines adds an extra level of complexity, which can be alleviated by relying on a model-based approach including code generation. This allows data analysts to compose such pipelines on a higher level of abstraction, reducing the development effort as well as the risk of errors. In this chapter, we outline a model-based and adaptive approach to the development of data processing pipelines in heterogeneous processing contexts. It relies on a flexible, tool-supported approach to configuration, which embraces three levels: (a) a heterogeneous processing infrastructure - including reconfigurable hardware, (b) the pipelines as well as (c) the stakeholder applications built upon the pipelines. Furthermore, selected aspects of implementing the approach, which is validated in the context of the financial domain, are presented.

11. Cui Qin and Holger Eichelberger (2016): Impact-minimizing Runtime Switching of Distributed Stream Processing Algorithms. In: Proceedings of the Workshops of the EDBT/ICDT 2016 Joint Conference. CEUR-WS.org.
Abstract: Stream processing is a popular paradigm to process huge amounts of data. During processing, the actual characteristics of the analyzed data streams may vary, e.g., in terms of volume or velocity. To provide a steady quality of the analysis results, runtime adaptation of the data processing is desirable. While several techniques for changing data stream processing at runtime do exist, one specific challenge is to minimize the impact of runtime adaptation on the data processing, in particular for real-time data analytics. In this paper, we focus on the runtime switching among alternative distributed algorithms as a means for adapting complex data stream processing tasks. We present an approach, which combines stream re-routing with buffering and stream synchronization to reduce the impact on the data streams. Finally, we analyze and discuss our approach in terms of a quantitative evaluation.

10. Holger Eichelberger, Cui Qin, Roman Sizonenko and Klaus Schmid (2016): Using IVML to Model the Topology of Big Data Processing Pipelines. In: Proceedings of the 20th International Systems and Software Product Line Conference, pp. 204-208. ACM.
Abstract: Creating product lines of Big Data stream processing applications introduces a number of novel challenges to variability modeling. In this paper, we discuss these challenges and demonstrate how advanced variability modeling capabilities can be used to directly model the topology of processing pipelines as well as their variability. We also show how such processing pipelines can be modeled, configured and validated using the Integrated Variability Modeling Language (IVML).

9. Holger Eichelberger (2016): A Matter of the Mix: Integration of Compile and Runtime Variability. In: 2016 IEEE 1st International Workshops on Foundations and Applications of Self-* Systems, Proceedings of the 9th International Workshop on Dynamic Software Product Lines (DSPL '16), pp. 12-15. IEEE.
Abstract: While dynamic software product lines focus on runtime variability, traditional software product lines typically aim at development-time variability. In this paper, we argue that integrating both kinds of binding times into a single variability model can be beneficial for modeling the adaptation space as well as for controlling runtime decision making. We achieve this by a mix of modeling and constraint capabilities. We illustrate the integration of compile time and runtime variability using a general-purpose variability modeling language and an example from the field of adaptive data stream processing. We also discuss advantages and disadvantages of our approach.

8. Robert Heinrich, Holger Eichelberger and Klaus Schmid (2016): Performance Modeling in the Age of Big Data - Some Reflections on Current Limitations. In: Proceedings of the 3rd International Workshop on Interplay of Model-Driven and Component-Based Software Engineering (ModComp '16), pp. 37-38.
Abstract: Big Data aims at the efficient processing of massive amounts of data. Performance modeling is often used to optimize performance of systems under development. Based on experiences from modeling Big Data solutions, we describe some problems in applying performance modeling and discuss potential solution approaches.

2015

7. Holger Eichelberger and Klaus Schmid (2015): IVML: A DSL for Configuration in Variability-rich Software Ecosystems. In: Proceedings of the 19th International Conference on Software Product Line, pp. 365-369. ACM.
Abstract: Variability-rich software ecosystems need configuration capabilities just as any product line does. However, various additional capabilities are required, taking into account the software ecosystem characteristics. In order to address these specific needs, we developed the Integrated Variability Modeling Language (IVML) for describing configurations of variability-rich software ecosystems. IVML is a variability modeling and configuration language along with accompanying reasoning facilities.

6. Klaus Schmid and Holger Eichelberger (2015): EASy-Producer: From Product Lines to Variability-rich Software Ecosystems. In: Proceedings of the 19th International Conference on Software Product Line, pp. 390-391. ACM.
Abstract: The EASy-Producer product line environment is a novel open-source tool that supports the lightweight engineering of software product lines and variability-rich software ecosystems. It has been applied in several industrial case studies, showing its practical applicability both from a stability and a capability point of view. The tool set integrates both interactive configuration capabilities and a DSL-based approach to variability modeling, configuration definition and product derivation. The goal of the tutorial is to provide the participants with an overview of the tool. However, the main focus will be on a brief introduction of the DSLs. After participating in the tutorial, the participants will understand the capabilities of the toolset and will have a basic practical understanding of how to use it to define software ecosystems and derive products from them.

5. Holger Eichelberger, Cui Qin, Klaus Schmid and Claudia Niederée (2015): Adaptive Application Performance Management for Big Data Stream Processing. In: Softwaretechnik-Trends, 35 (3): 35-37.
Abstract: Big data applications with their high-volume and dynamically changing data streams impose new challenges to application performance management. Efficient and effective solutions must balance performance versus result precision and cope with dramatic changes in real-time load and needs without over-provisioning resources. Moreover, a developer should not be burdened too much with addressing performance management issues, so he can focus on the functional perspective of the system. For addressing these challenges, we present a novel comprehensive approach, which combines software configuration, model-based development, application performance management and runtime adaptation.

4. Holger Eichelberger and Klaus Schmid (2015): Mapping the Design-Space of Textual Variability Modeling Languages: A Refined Analysis. In: International Journal of Software Tools for Technology Transfer, 17 (5): 559-584.
Abstract: Variability modeling is a major part of modern product line engineering. Graphical or table-based approaches to variability modeling are focused around abstract models and specialized tools to interact with these models. However, more recently textual variability modeling languages, comparable to some extent to programming languages, were introduced. We consider the recent trend in product line engineering towards textual variability modeling languages as a phenomenon, which deserves deeper analysis. In this article, we report on the results and approach of a literature survey combined with an expert study. In the literature survey, we identified 11 languages, which enable the textual specification of product line variability and which are sufficiently described for an in-depth analysis. We provide a classification scheme, useful to describe the range of capabilities of such languages. Initially, we identified the relevant capabilities of these languages from a literature survey. The result of this has been refined, validated and partially improved by the expert survey. A second recent phenomenon in product line variability modeling is the increasing scale of variability models. Some authors of textual variability modeling languages argue that these languages are more appropriate for large-scale models. As a consequence, we would expect specific capabilities addressing scalability in the languages. Thus, we compare the capabilities of textual variability modeling languages with those of graphical variability modeling approaches and, in particular, analyze their specialized capabilities for large-scale models.

2014

3. Holger Eichelberger and Klaus Schmid (2014): Flexible Resource Monitoring of Java Programs. In: Journal of Systems and Software, 93: 163-186. Elsevier.

2. Holger Eichelberger, Sascha El-Sharkawy, Christian Kröher and Klaus Schmid (2014): EASy-Producer: Product Line Development for Variant-rich Ecosystems. In: Proceedings of the 18th International Software Product Line Conference: Companion Volume for Workshops, Demonstrations and Tools, Vol. 2, pp. 133-137. ACM.
Abstract: Development of software product lines requires tool support, e.g., to define variability models, to check variability models for consistency and to derive the artifacts for a specific product. Further capabilities are required when product lines are combined to software ecosystems, i.e., management and development of distributed product lines across multiple different organizations. In this paper, we describe EASy-Producer, a prototypical tool set for the development of software product lines in general and variant-rich ecosystems in particular. To support the product line engineer, EASy-Producer differentiates between simplified views limiting the capabilities and expert views unleashing its full power. We will discuss how these two views support the definition of variability models, the derivation of product configurations and the instantiation of artifacts.

1. Holger Eichelberger and Klaus Schmid (2014): Resource-optimizing Adaptation for Big Data Applications. In: Proceedings of the 18th International Software Product Line Conference: Companion Volume for Workshops, Demonstrations and Tools, Vol. 2, pp. 10-11. ACM.
Abstract: The resource requirements of Big Data applications may vary dramatically over time, depending on changes in the context. If resources should not be defined for the maximum case, but available resources are mostly static, there is a need to adapt resource usage by modifying the processing behavior. The QualiMaster project researches such an approach for the analysis of systemic risks in the financial markets.