Département Informatique

Computer Science Department of Telecom SudParis

New paper “Process mining approach for multi-cloud SLA reporting” at IEEE Big Data 2023

Authors: Jeremy Mechouche, Mohamed Sellami, Zakaria Maamar, Roua Touihri, and Walid Gaaloul

Abstract

Cloud consumers’ requirements are inherently dynamic, with reliability and high-availability needs that fluctuate with their workload. To satisfy these requirements, service reconfiguration strategies are put in place to ensure, first, adaptable service provisioning and, second, compliance with the Service Level Agreements (SLAs) agreed upon between consumers and providers. However, deviations between SLAs and “real” observed behaviours can occur even after reconfiguration strategies are triggered. Additionally, as organizations increasingly embrace multi-cloud environments, careful consideration must be given to the challenges that arise in satisfying these requirements. In this paper, we represent these strategies as state machines, which are used to report their conformance to collected logs that track what really happened at run-time. The collected logs are processed to construct state machines suitable for conformance checking. The paper also discusses experiments that demonstrate the technical feasibility of using conformance checking to detect deviations between SLAs and logs and to verify the suitability of reconfiguration strategies.
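As a rough illustration of the idea of checking logs against a strategy modelled as a state machine, the sketch below replays a list of logged events on a transition table and reports the events that the strategy does not allow. It is a simplified stand-in, not the paper's method: the state names, event names, and the replay rule (stay in the current state after a deviation) are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical reconfiguration strategy as a state machine:
# (current_state, event) -> next_state. All names are illustrative.
STRATEGY: Dict[Tuple[str, str], str] = {
    ("normal", "load_high"): "scaled_out",
    ("scaled_out", "load_normal"): "normal",
    ("normal", "vm_failure"): "migrating",
    ("migrating", "migration_done"): "normal",
}


def find_deviations(log: List[str], start: str = "normal") -> List[Tuple[int, str, str]]:
    """Replay logged events on the strategy and return the deviations.

    Each deviation is (position in the log, state at that point, unexpected event).
    Assumption: after a deviation the replay stays in the same state and continues.
    """
    state = start
    deviations = []
    for position, event in enumerate(log):
        next_state = STRATEGY.get((state, event))
        if next_state is None:
            deviations.append((position, state, event))
        else:
            state = next_state
    return deviations


# Made-up run-time log: a VM failure is observed while scaled out,
# which the hypothetical strategy above does not cover.
observed = ["load_high", "vm_failure", "load_normal", "vm_failure", "migration_done"]
print(find_deviations(observed))
```

Each reported tuple points at the log position where observed behaviour left the modelled strategy, which is the kind of information an SLA report could build on.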

New paper “Uncovering Implicit Bundling Constraints: Empowering Cloud Network Service Discovery” at ICSOC 2023

Authors: Hayet Brabra, Imen Jerbi, Mohamed Sellami, Walid Gaaloul, and Djamal Zeghlache

Abstract

Cloud service providers (CSPs) offer their networking services (NSs) in the form of service bundles containing underlying services that are not necessarily requested by the users. While service bundling, which provides multiple components as a single service, is a common practice in the cloud, unawareness of this hidden structure of services at design time may limit their portability, compatibility, and interoperability across multiple providers. This calls for service discovery solutions that can identify and reveal such hidden bundling to cloud users, so that they become aware of the consequences of existing bundling before any deployment stage. This paper presents a new NS discovery approach that takes network service bundling into account and makes it transparent to cloud users.

New paper “Discovering Guard Stage Milestone Models Through Hierarchical Clustering” at CoopIS 2023

Authors: Leyla Moctar M’Baba, Mohamed Sellami, Nour Assy, Walid Gaaloul, and Mohamedade Farouk Nanne

Abstract

Processes executed on enterprise Information Systems (IS), such as ERP and CMS, are artifact-centric. The execution of these processes is driven by the creation and evolution of business entities called artifacts. Several artifact-centric modeling languages have been proposed to capture the specificity of these processes. One of the most widely used is the Guard Stage Milestone (GSM) language. It represents an artifact-centric process as an information model and a lifecycle. The lifecycle groups activities into stages with data conditions as guards. The hierarchy between the stages is based on common conditions. However, existing works discover neither this hierarchy nor the data conditions, as they assume both to be already available. They also do not discover GSM models directly from event logs; instead, they discover Petri nets and translate them into GSM models. To fill this gap, we propose in this paper a discovery approach based on hierarchical clustering. We use invariant detection to discover data conditions and the information gain of common conditions to cluster stages. The approach relies on neither domain knowledge nor translation mechanisms. It was implemented and evaluated using a blockchain case study.

New paper “Bringing privacy, security and performance to the Internet of Things using IOTA and usage control”

by Nathanaël Denis, Sophie Chabridon and Maryline Laurent

Annals of Telecommunications, Jan. 2024

hal.science link

Abstract
The Internet of Things (IoT) is bringing new ways to collect and analyze data to develop applications answering or anticipating users’ needs. These data may be privacy-sensitive, requiring efficient privacy-preserving mechanisms. The IoT is a distributed system of unprecedented scale, creating challenges for performance and security. Classic blockchains could be a solution by providing decentralization and strong security guarantees. However, they are not efficient and scalable enough for large scale IoT systems, and available tools designed for preserving privacy in blockchains, e.g. coin mixing, have a limited effect due to high transaction costs and insufficient transaction rates. This article provides a framework based on several technologies to address the requirements of privacy, security and performance of the Internet of Things. The basis of the framework is the IOTA technology, a derivative of blockchains relying on a directed acyclic graph to create transactions instead of a linear chain. IOTA improves distributed ledger performance by increasing transaction throughput as more users join the network, making the network scalable. As IOTA is not designed for privacy protection, we complement it with privacy-preserving mechanisms: merge avoidance and decentralized mixing. Finally, privacy is reinforced by introducing usage control mechanisms for users to monitor the use and dissemination of their data. A Proof of Concept is proposed to demonstrate the feasibility of the proposed framework. Performance tests are conducted on this Proof of Concept, showing the framework can work on resource-constrained devices and within a reasonable time. The originality of this contribution is also to integrate an IOTA node within the usage control system, to support privacy as close as possible to the objects that need it.

New paper “Integrating Usage Control Into Distributed Ledger Technology for Internet of Things Privacy”

by Nathanaël Denis, Maryline Laurent and Sophie Chabridon

IEEE Internet of Things Journal, Volume: 10, Issue: 22, Jun. 2023

arXiv link

Abstract
The Internet of Things (IoT) brings new ways to collect privacy-sensitive data from billions of devices. Well-tailored distributed ledger technologies (DLTs) can provide high transaction processing capacities to IoT devices in a decentralized fashion. However, privacy aspects are often neglected or handled unsatisfactorily, with the focus mainly on performance and security. In this article, we introduce decentralized usage control mechanisms to empower IoT devices to control the data they generate. Usage control defines obligations, i.e., actions to be fulfilled to be granted access, and conditions on the system, in addition to data dissemination control. The originality of this article is to consider the usage control system as a component of distributed ledger networks, instead of an external tool. With this integration, both technologies work in synergy, benefiting their privacy, security, and performance. We evaluated the performance improvements of the integration using the IOTA technology, which is particularly suitable due to the participation of small devices in the consensus. The results of the tests on a private network show a decrease of approximately 90% in the time needed for the usage control system to push a transaction and make its access decision in the integrated setting, regardless of the number of nodes in the network.
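To make the notions of obligations and conditions more concrete, here is a minimal sketch of a usage control decision. It is not the system described in the article: the `Policy` class, the `decide` function, and the subject, obligation, and state names are hypothetical, and the ledger integration is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Policy:
    """Hypothetical usage control policy attached to a piece of IoT data."""
    allowed_subjects: Set[str]
    # Obligations: actions the requester must have fulfilled to be granted access.
    obligations: Set[str]
    # Conditions: predicates on the system state (e.g. network load, time).
    conditions: List[Callable[[Dict[str, float]], bool]] = field(default_factory=list)


def decide(policy: Policy, subject: str, fulfilled: Set[str],
           system_state: Dict[str, float]) -> bool:
    """Grant access only if the subject is allowed, all obligations are
    fulfilled, and every condition holds in the current system state."""
    if subject not in policy.allowed_subjects:
        return False
    if not policy.obligations <= fulfilled:
        return False
    return all(condition(system_state) for condition in policy.conditions)


# Illustrative policy: one obligation and one condition on the network load.
policy = Policy(
    allowed_subjects={"analytics-service"},
    obligations={"accept_terms"},
    conditions=[lambda state: state["network_load"] < 0.8],
)

print(decide(policy, "analytics-service", {"accept_terms"}, {"network_load": 0.5}))
print(decide(policy, "analytics-service", set(), {"network_load": 0.5}))
```

In a real usage control system the decision would also be re-evaluated while the data is in use; this sketch only covers a single decision point.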

PhD Defense of Nathanaël DENIS – For a Private and Secure Internet of Things with Usage Control and Distributed Ledger Technology

October 3, 2023

Abstract: IoT devices represent one of the major targets for malicious activities. The grounds for this are manifold: first, to reduce the cost of security, manufacturers may sell vulnerable products, leaving users with security concerns. Second, many IoT devices have performance constraints and lack the processing power to execute security software. Third, the heterogeneity of applications, hardware, and software widens the attack surface. As a result, IoT networks are subject to a variety of cyber threats. To counter such a variety of attacks, the IoT calls for security and privacy-preserving technologies. For privacy concerns, usage control grants the users the power to specify how their data can be used and by whom. Usage control extends classic access control by introducing obligations, i.e., actions to be performed to be granted access, and conditions that are related to the system state, such as the network load or the time. This thesis aims to address the challenges of the Internet of Things in terms of performance, security and privacy. To this end, distributed ledger technologies (DLTs) are a promising solution to Internet of Things constraints, in particular for micro-transactions, due to the decentralization they provide. This leads to three related contributions: 1. a framework for zero-fee privacy-preserving transactions in the Internet of Things designed to be scalable; 2. an integration methodology of usage control and distributed ledgers to enable efficient protection of users’ data; 3. an extended model for data usage control in distributed systems, to incorporate decentralized information flow control and IoT aspects. A proof of concept of the integration (2) has been designed to demonstrate feasibility and conduct performance tests. It is based on IOTA, a distributed ledger using a directed acyclic graph for its transaction graph instead of a blockchain. The results of the tests on a private network show a decrease of approximately 90% in the time needed to push transactions and make access decisions in the integrated setting.

New paper “KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training”, to be presented at NeurIPS’23

Available online: https://hal.archives-ouvertes.fr/hal-03750441/document

Code available at https://github.com/TruongThaoNguyen/kakurenbo

Authors: Thao Truong Nguyen, Balazs Gerofi, Edgar Josafat Martinez-Noriega, François Trahay, Mohamed Wahib.

Abstract: This paper proposes a method for hiding the least-important samples during the training of deep neural networks to increase efficiency, i.e., to reduce the cost of training. Using information about the loss and prediction confidence during training, we adaptively find samples to exclude in a given epoch based on their contribution to the overall learning process, without significantly degrading accuracy. We explore the convergence properties when accounting for the reduction in the number of SGD updates. Empirical results on various large-scale datasets and models used directly in image classification and segmentation show that while the with-replacement importance sampling algorithm performs poorly on large datasets, our method can reduce total training time by up to 22% while impacting accuracy by only 0.4% compared to the baseline.
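The following sketch illustrates one plausible way to pick samples to hide from per-sample loss and prediction confidence. It is not taken from the KAKURENBO code base: the function name, the selection rule (lowest-loss samples that are also predicted correctly with high confidence), and the `max_fraction` and `min_confidence` parameters are assumptions made for this example.

```python
from typing import List, Sequence


def select_hidden(losses: Sequence[float],
                  correct: Sequence[bool],
                  confidence: Sequence[float],
                  max_fraction: float = 0.5,
                  min_confidence: float = 0.7) -> List[int]:
    """Return the indices of samples to skip in the next epoch.

    Assumed rule: take the `max_fraction` of samples with the lowest loss,
    then keep only those the model already predicts correctly with a
    confidence of at least `min_confidence`.
    """
    n = len(losses)
    budget = int(max_fraction * n)
    # Candidates: the lowest-loss samples, in increasing order of loss.
    candidates = sorted(range(n), key=lambda i: losses[i])[:budget]
    # Samples that are low-loss but wrong or uncertain stay in the training set.
    return [i for i in candidates if correct[i] and confidence[i] >= min_confidence]


# Made-up statistics recorded during the previous epoch for ten samples.
losses = [0.05, 2.10, 0.02, 0.90, 0.10, 1.50, 0.01, 0.30, 0.60, 0.04]
correct = [True, False, True, True, False, False, True, True, True, True]
confidence = [0.95, 0.40, 0.99, 0.60, 0.55, 0.30, 0.65, 0.80, 0.75, 0.97]

hidden = select_hidden(losses, correct, confidence)
print(hidden)
```

Because the statistics change from epoch to epoch, a sample hidden in one epoch can return to the training set in a later one, which is what makes the selection adaptive.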

New paper “Request relaxation based-on provider constraints for a capability-based NaaS services discovery” at CAiSE 2023

Authors: Imen Jerbi, Hayet Brabra, Mohamed Sellami, Walid Gaaloul, Sami Bhiri, Boualem Benatallah, Djamal Zeghlache, and Olivier Tirat

Abstract

Network as a Service (NaaS) enables cloud customers to connect their distributed services across multiple clouds without relying exclusively on their infrastructures. The discovery of NaaS services remains challenging not only because of their scale and diversity but also because of the hidden constraints that cloud providers impose on these services at the networking layer. NaaS services are usually offered in the form of service bundles containing underlying services and constraints not requested by the customers. This creates undesirable dependencies and constraints that hamper portability, compatibility and interoperability across providers. The problem of service discovery becomes more challenging when these constraints are the primary cause preventing a customer’s request from being fulfilled. Without a mechanism that enables customers to identify these constraints and to adjust their requests accordingly, existing service discovery solutions are likely to fall short. We propose to complement existing service discovery solutions by not only identifying unmatched constraints but also recommending relaxations of discovery requests to retrieve optimal and compliant services.
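The idea of identifying unmatched constraints and recommending a relaxed request can be sketched as follows. This is only an illustration under strong assumptions, not the approach of the paper: requests and offers are reduced to numeric attributes, and the attribute names and the relaxation rule (lower each unmet requirement to what the offer delivers) are hypothetical.

```python
from typing import Dict, List, Tuple

Request = Dict[str, float]  # attribute -> minimum value the customer asks for
Offer = Dict[str, float]    # attribute -> maximum value the provider can deliver


def unmatched(request: Request, offer: Offer) -> List[Tuple[str, float, float]]:
    """List (attribute, requested, offered) for every requirement the offer misses."""
    return [(attr, need, offer.get(attr, 0.0))
            for attr, need in request.items()
            if offer.get(attr, 0.0) < need]


def relax(request: Request, offer: Offer) -> Request:
    """Return a copy of the request where each unmet requirement is lowered
    to what the offer can actually deliver (assumed relaxation rule)."""
    relaxed = dict(request)
    for attr, _requested, offered in unmatched(request, offer):
        relaxed[attr] = offered
    return relaxed


# Made-up request and provider offer.
request = {"bandwidth_gbps": 10.0, "tunnels": 20.0}
offer = {"bandwidth_gbps": 5.0, "tunnels": 50.0}

print(unmatched(request, offer))
print(relax(request, offer))
```

The list returned by `unmatched` tells the customer which constraints blocked the request, and `relax` proposes an adjusted request that this offer would satisfy.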

New article “Process mining for Artifact-Centric Blockchain Applications” in the SIMPAT journal

Authors: Leyla Moctar M’Baba, Nour Assy, Mohamed Sellami, Walid Gaaloul, and Mohamedade Farouk Nanne

Abstract

Process mining can provide valuable insights into user behavior, performance, and security for blockchain applications. In return, process mining benefits from the trustworthiness of blockchain data. One obstacle to realizing these benefits is that blockchain data is inadequate for process mining. This issue has been previously explored in the literature, but mainly with regard to workflow-centric processes, leaving out the more common artifact-centric applications. This article introduces ACEL (Artifact-Centric Event Log), an extension to the OCEL (Object-Centric Event Log) standard, specifically designed for artifact-centric processes. Additionally, we present a method for extracting ACEL logs from the Ethereum blockchain platform and demonstrate its effectiveness and the perspectives of process discovery through two case studies of public Ethereum applications.

PhD defense: Alexis Colin – November 28 – From trace collection to the prediction of the behaviour of parallel applications

Bonjour,

J’ai le plaisir de vous inviter à ma soutenance de thèse intitulée “De la collecte de trace à la prédiction du comportement d’applications parallèles” [pdf]. Le résumé est disponible ci-dessous.

La soutenance aura lieu en français le lundi 28 novembre à 14h, dans l’amphithéâtre 3 des locaux de Télécom SudParis au 19 place Marguerite Perey, 91120 Palaiseau. Un accès en visioconférence sera disponible au lien suivant : https://webconf.imt.fr/frontend/fra-v2m-fsg-cuu.

Le jury sera composé de :
– Mme Amel Bouzeghoub, Professeure – Télécom SudParis (Examinatrice)
– M. Patrick Carribault, Chercheur – CEA/DAM (Examinateur)
– M. Denis Conan, Maître de conférences HDR – Télécom SudParis (Directeur de thèse)
– Mme Camille Coti, Professeure – Université du Québec à Montréal (Rapporteuse)
– M. Arnaud Legrand, Directeur de recherche – INRIA Grenoble (Examinateur)
– M. Samuel Thibault, Professeur – Université de Bordeaux (Rapporteur)
– M. François Trahay, Maître de conférences HDR – Télécom SudParis (Encadrant)

La soutenance sera suivie d’un pot.

Résumé :

Afin d’exploiter les ressources des serveurs et des supercalculateurs, les développeurs ont recours à des modèles de programmations spécifiques qui sont mis en œuvre par des runtimes dont le rôle est de permettre à chaque programme d’exploiter pleinement les capacités de la machine qui l’exécute. Pour cela, les runtimes doivent prendre des décisions qui ont un impact direct sur les performances. Pour prendre de bonnes décisions, les runtimes essaient d’anticiper le comportement futur des programmes, mais les moyens à leur disposition sont limités.

Nous présentons Pythia, un oracle générique permettant aux runtimes de prédire le comportement futur d’un programme. Nous décrivons comment enregistrer une trace d’exécution d’un programme pour en capturer la structure sous la forme d’une grammaire. Nous développons un algorithme performant capable de construire une telle grammaire à la volée pendant l’exécution d’un programme sans dégrader ses performances. Nous montrons ensuite comment utiliser une grammaire représentant la structure d’une exécution d’un programme pour prédire son comportement futur lors de ses exécutions ultérieures. Pythia permet en particulier d’explorer un arbre probabilisé des prochaines actions potentielles d’un programme.

L’évaluation de notre travail montre que les prédictions de Pythia peuvent être utilisées pour implémenter des optimisations au sein d’un runtime. Nous faisons aussi la démonstration de l’utilisabilité de Pythia en l’utilisant pour mettre en œuvre une stratégie de parallélisme adaptatif au sein d’un runtime OpenMP existant.

——————————————————-

[English]

Dear colleagues,

I am pleased to invite you to the defense of my PhD thesis entitled “From trace collection to the prediction of the behaviour of parallel applications”. The abstract is below. The defense will take place in French on Monday, November 28 at 2:00 pm, in amphitheater 3 of the Télécom SudParis building at 19 place Marguerite Perey, 91120 Palaiseau. Videoconference access will be available at the following URL: https://webconf.imt.fr/frontend/fra-v2m-fsg-cuu.

The jury will be composed of:
– Mrs. Amel Bouzeghoub, Professor – Télécom SudParis (Examiner)
– Mr. Patrick Carribault, Researcher – CEA/DAM (Examiner)
– Mr. Denis Conan, Associate Professor HDR – Télécom SudParis (Director)
– Mrs. Camille Coti, Professor – Université du Québec à Montréal (Reviewer)
– Mr. Arnaud Legrand, Research director – INRIA Grenoble (Examiner)
– Mr. Samuel Thibault, Professor – Université de Bordeaux (Reviewer)
– Mr. François Trahay, Associate Professor HDR – Télécom SudParis (Co-director)

The defense will be followed by a buffet.

Abstract:

To exploit the resources of servers and supercomputers, developers use specific programming models that are implemented by runtimes. Runtimes allow each program to fully exploit the capacities of the machine that executes it. To do this, runtimes make decisions that have a direct impact on the performance of the programs. To make good decisions, runtimes try to anticipate the future behavior of the programs, but the means at their disposal are limited.

We present Pythia, a generic oracle allowing runtimes to predict the future behavior of a program. We describe how to record an execution trace of a program and capture its structure in the form of a grammar. We develop an efficient algorithm capable of building such a grammar on the fly during the execution of a program without degrading its performance. We then show how to use a grammar representing the structure of a program execution to predict its future behavior during its subsequent executions. In particular, Pythia makes it possible to explore a probability-weighted tree of the potential next actions of a program.
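To give a feel for what a probability-weighted tree of next actions looks like, the sketch below learns from a recorded trace how often each event follows another, then enumerates the possible futures up to a given depth. This is a deliberately simpler stand-in and not Pythia's algorithm: it uses first-order successor counts instead of a grammar, and the function and event names are made up for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def learn_successors(trace: List[str]) -> Dict[str, Counter]:
    """Count, for each event in a recorded trace, which events followed it."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(trace, trace[1:]):
        counts[current][following] += 1
    return counts


def predict(counts: Dict[str, Counter], event: str,
            depth: int) -> List[Tuple[Tuple[str, ...], float]]:
    """Enumerate the paths of `depth` future events after `event`,
    with their estimated probabilities, most likely first."""
    paths = [((), 1.0, event)]
    for _ in range(depth):
        expanded = []
        for path, probability, last in paths:
            successors = counts.get(last, Counter())
            total = sum(successors.values())
            for following, count in successors.items():
                expanded.append((path + (following,), probability * count / total, following))
        paths = expanded
    return sorted(((path, probability) for path, probability, _ in paths),
                  key=lambda item: -item[1])


# Made-up trace of a previous execution of a parallel program.
trace = ["init", "compute", "send", "compute", "send",
         "compute", "barrier", "compute", "send"]
counts = learn_successors(trace)

print(predict(counts, "compute", 1))
print(predict(counts, "compute", 2))
```

A runtime could query such a model during a later execution to decide, for instance, whether a communication is likely to follow the current computation phase.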

The evaluation of our work shows that the predictions of Pythia can be used to implement optimizations within a runtime. We have also demonstrated the usability of Pythia by using it to implement an adaptive parallelism strategy within an existing OpenMP runtime.