Horizon Europe project
Autonomous, scalablE, tRustworthy, intelligent European meta Operating System for the IoT edge-cloud continuum
aerOS is a three-year European research project (Horizon Europe CL4-2021-DATA-01-05) that aims to use the resources of the edge-to-cloud computing continuum transparently, so that applications incorporating multiple services can run effectively. The overarching goal of aerOS is to design and build a virtualised, platform-agnostic meta operating system for the IoT edge-cloud continuum.
What we are aiming for
The IoT ecosystem is a dynamic aggregation of resources (e.g., sensors, actuators, processing and storage) populating the edges of current infrastructures (e.g., edge computing with local or ad-hoc clouds, fog computing, far edge and federated approaches). AI (with explainability) and real-time processing may require high computing power close to events and, sometimes, distributed across Infrastructure Elements. EU-funded projects such as ACCORDION and DECENTER already address continuum challenges by associating edge computing with 5G and by realising a Fog Computing platform. In such a distributed data and compute scenario, the so-called network compute fabric, the network should host computing intertwined with communication for the highest level of efficiency, in order to support heterogeneous systems ranging from simple terminals to performance-sensitive robots and augmented reality nodes. However, edge meta operating systems require flexibility to serve any dynamic combination of Infrastructure Elements and to provide globally orchestrated services, for example policy services specifying behaviour, data governance, or even cognitive services. Examples of current and extended state-of-the-art edge meta operating systems include Thin Edge, ROS for robotic environments, EOS for virtualised telco networks, and VirtuOS for the cloud.
Breakthrough: aerOS will continue developments leading toward the IoT edge-cloud continuum by integrating relevant technologies and elements of connectivity, IoT, AI, data autonomy and cybersecurity. The proposed meta operating system will support distribution and data sharing across the IoT edge-cloud continuum and will enable orchestration of resources and services by providing mechanisms for data processing and the application of intelligence, in particular close to where the data is produced.
Service orchestration follows recent advances in SDN/NFV, e.g., Cloud-Native Network Functions (CNFs). Orchestration provides seamless, elastic service deployment for verticals while efficiently reusing the available resources, reducing incurred costs and consumed energy. One of the major challenges is to orchestrate services efficiently across a heterogeneous continuum of federated resources. In single-domain orchestration the orchestrator has full control over resources, whereas multi-domain orchestration requires coordination across domains. Alternatives exist for centralised, distributed and hierarchical orchestrators, and their growing complexity calls for automated orchestration and management of services. Different initiatives exist, such as ETSI ZSM ISG, ETSI ENI ISG, TMF’s ZOOM, Open RAN, NWDAF (Network Data Analytics Function), ETSI OSM, and ETSI MEC ISG. Network and service providers build their business logic around microservices and AI. Orchestrators map high-level QoS requirements into an appropriate set of tasks characterised by their resource requirements, their locations, and their level of isolation. Currently, resource allocations to network components are handcrafted by operators, leading to over- or under-provisioning of resources. Therefore, data- and event-driven service orchestration is needed to allocate the right amount of resources to each slice.
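The mapping from QoS requirements to placed tasks can be illustrated with a minimal sketch. It is not the aerOS orchestrator: the names Node, Task and place, the attribute set, and the best-fit heuristic are all assumptions chosen for illustration. Each task carries resource requirements, a latency bound (standing in for location) and an isolation level, and the function picks a feasible node.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical Infrastructure Element offered to the orchestrator."""
    name: str
    free_cpu: float      # free CPU cores
    free_mem_mb: int     # free memory in MB
    latency_ms: float    # latency between the data source and this node
    isolation: str       # isolation level offered, e.g. "container" or "vm"


@dataclass
class Task:
    """Hypothetical task derived from a high-level QoS requirement."""
    name: str
    cpu: float
    mem_mb: int
    max_latency_ms: float
    isolation: str


def place(task: Task, nodes: List[Node]) -> Optional[Node]:
    """Best-fit placement (an assumed heuristic): among nodes that satisfy
    every requirement, choose the one left with the least spare CPU, then
    reserve the resources on it. Returns None if no node is feasible."""
    feasible = [
        n for n in nodes
        if n.free_cpu >= task.cpu
        and n.free_mem_mb >= task.mem_mb
        and n.latency_ms <= task.max_latency_ms
        and n.isolation == task.isolation
    ]
    if not feasible:
        return None
    best = min(feasible, key=lambda n: n.free_cpu - task.cpu)
    best.free_cpu -= task.cpu
    best.free_mem_mb -= task.mem_mb
    return best


# Illustrative data only; the values are made up.
nodes = [
    Node("far-edge-1", 1.0, 512, 2.0, "container"),
    Node("edge-1", 4.0, 4096, 8.0, "container"),
    Node("cloud-1", 64.0, 65536, 40.0, "vm"),
]
chosen = place(Task("detector", 2.0, 1024, 10.0, "container"), nodes)
print(chosen.name if chosen else "no feasible node")
```

A handcrafted allocation would fix the node in advance. Here the decision is recomputed from current free resources each time, which is the property that data- and event-driven orchestration relies on.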
Breakthrough: aerOS will deliver automated service orchestration by developing a robust, high-performance algorithmic framework that supports full automation of service orchestration. The framework will adopt and fine-tune innovative AI/ML techniques (with respect to training time and accuracy) and address different topologies, from hierarchical to fully distributed. aerOS will provide zero-touch orchestration that leverages ongoing standards and open source initiatives, and will advance shared learning between domains beyond the state of the art to speed up the learning process of AI/ML models.
Service deployment and reconfiguration across the IoT edge-cloud continuum is challenging, mostly because of the heterogeneity of the network. Standalone services can have network requirements concerning data sources, which can be fulfilled by leveraging technologies related to NFV and SDN, but also 5G Network Programmability via the native service APIs (3GPP NEF/SEAL/CAPIF) and the 3GPP vertical application enablers, such as the EDGE_APP. Composition of services with heterogeneous requirements (e.g., latency) can also be enacted vertically, where reconfiguration of services (and of the network, if necessary) is even more complex. Besides, devices are becoming ever smarter in collecting, processing and transmitting data. In addition, the exponential growth of connected devices and sensors promotes new, computationally intensive IoT applications that can cause network bottlenecks, impacting overall performance. Hence, novel techniques are needed to provide better support for IoT operations across the IoT edge-cloud continuum and to prevent any needless communication that would affect the performance of the network, while reducing the costs of data storage and computation. Networks are key to achieving increasingly demanding levels of reconfigurability, self-* and automation, in order to scale efficiently, manage resources, and optimise operation while handling multi-vertical traffic with distinct demands.
Breakthrough: aerOS will leverage smart networking capabilities (5G Native Exposed APIs such as NEF, SEAL and CAPIF, programmable network fabric, etc.) to improve scalability and real-time processing within the network. It will support data and knowledge distribution mechanisms, including automatic monitoring and dynamic (self-)configuration of the network by means of SDN/NFV components, to bring about a smart network paradigm. Besides, aerOS will integrate Time-Sensitive Networking (TSN) for timely and reliable delivery of critical control data in the complex, distributed IoT edge-cloud continuum infrastructures derived from the use cases.
IoT ecosystems comprise heterogeneous multi-vendor nodes. Consequently, there is a large discrepancy in their capabilities and resources (e.g., processing power or storage capacity) and in their underlying hardware. Virtualisation allows services and applications to run in a homogeneous environment, no matter the hardware or operating system. Standardised APIs allow services to access specific hardware, e.g., GPUs, memory or storage. In addition, clustering multiple virtualised nodes delivers a large federated pool of resources. To allow adequate resource continuity, the compute continuum architecture needs a common infrastructure virtualisation framework. Although VMs are common in the cloud, they are not suitable for constrained devices and edge nodes because of the large overhead they add. For optimal resource allocation and high QoS, virtualisation frameworks should be tailor-made for each specific domain and its requirements, while remaining entirely hardware-independent. Different frameworks strive to achieve this goal: Docker Swarm, Kubernetes, FITOR, EPOS Fog, Apache Mesos, etc. They are already well established in the cloud, but they still have to be adapted to the heterogeneous nature of distributed and federated IoT edge-cloud continuum deployments in order to provide a scalable continuum of resources.
Breakthrough: aerOS will address the dynamic nature of constrained resources in the IoT edge-cloud continuum, including reconfiguration of smart networking elements and re-evaluation of orchestrators. aerOS will develop effective mechanisms to distribute data across the IoT edge-cloud continuum, so that the integrity and the performance of latency-sensitive applications are not compromised.
Data sovereignty is the ability to keep data within a particular realm, together with explicit knowledge of and control over how data is processed, stored, and forwarded. Data autonomy is related to the ability to homogenise data models at the edge, i.e., to query, interoperate or prepare data to be used by AI modules. Current practices in data processing focus on access control and on the enforcement of secure forwarding and storage, with different identity schemas (centralised, distributed, federated…), authorisation models and access policies. Intensive use of data evidence for control and management processes needs: (i) usability: data is provided according to the structure required by consumers; (ii) sufficiency: data is generated by the required sources and processors, according to a planned topology; (iii) safety: properties related to data provenance (e.g., origin, timeframe) can be verified; (iv) steadiness: availability and continuity of data flows are assured. Most, if not all, of these properties depend on the availability of well-structured and sufficient metadata to manage data access, forwarding and processing.
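The four properties above can be read as checks over the metadata attached to a data flow. The following is a minimal sketch under stated assumptions: the Record type, the check_flow function and all of its parameters are hypothetical, and real provenance verification would involve far more than comparing a source name and a timestamp.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Record:
    """Hypothetical data record with the metadata the checks rely on."""
    source: str                 # origin of the data (sensor or processor)
    timestamp: float            # seconds since an agreed epoch
    payload: Dict[str, float]   # the data fields themselves


def check_flow(
    records: List[Record],
    required_fields: Set[str],     # structure the consumer expects
    required_sources: Set[str],    # sources named in the planned topology
    trusted_sources: Set[str],     # origins accepted as verified
    window: Tuple[float, float],   # accepted timeframe (start, end)
    max_gap_s: float,              # longest tolerated silence in the flow
) -> Dict[str, bool]:
    """Evaluate the four properties on a batch of records."""
    # (i) usability: every record exposes the fields the consumer needs.
    usability = all(required_fields.issubset(r.payload) for r in records)
    # (ii) sufficiency: every planned source has contributed data.
    sufficiency = required_sources.issubset({r.source for r in records})
    # (iii) safety: origin and timeframe of every record are acceptable.
    start, end = window
    safety = all(
        r.source in trusted_sources and start <= r.timestamp <= end
        for r in records
    )
    # (iv) steadiness: no gap between consecutive records exceeds the limit.
    ts = sorted(r.timestamp for r in records)
    steadiness = all(b - a <= max_gap_s for a, b in zip(ts, ts[1:]))
    return {
        "usability": usability,
        "sufficiency": sufficiency,
        "safety": safety,
        "steadiness": steadiness,
    }
```

Each check only reads metadata (field names, source, timestamp), which is the point of the last sentence of the paragraph: without well-structured metadata, none of the four properties could be evaluated.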
Breakthrough: aerOS will comprehensively address data autonomy through an integral data infrastructure, relying on well-established yet innovative solutions already in use for IoT (CIM) and network telemetry (YANG), which will need to be extended and made to scale. The infrastructure will support: (i) user-defined policies integrated in data models; (ii) compositional models to define data processing topologies, for verification and validation; (iii) syntactic and semantic interoperability; (iv) runtime operation and management of data pipelines; and (v) automated policy enforcement in heavily virtualised environments.
According to RAMI4.0, AI can be beneficial not just at the functional level but also at the business level (e.g., IEEE Ethically Aligned Design for Business), provided that concerns about its reliability and safety are addressed. AI may support efficient decision-making, e.g., by optimising the sequencing of activities that run at different IoT/edge nodes and/or in the cloud (referring to critical operations such as those found in aerOS use cases: forecasts and planning in logistics, production, downtimes, resource availability, etc.). Edge resource constraints bring challenges, but frugal AI methods may provide solutions. While frugal AI approaches are a hot research topic, they are studied using “cloud resources”. Besides, AI explainability may be needed in the real world; it requires additional resources and must overcome problems caused by streaming data, so it is also pursued (mostly) in the cloud. Separately, IoT/edge ecosystems naturally match federated and distributed AI/ML scenarios. However, existing frameworks still have constraints to address. Finally, aerOS needs distributed AI/ML components “internally”, e.g., to deliver self-* behaviours or smart orchestration.
Breakthrough: aerOS will deliver comprehensive support for distributed/federated frugal AI with explainability (as needed) in data pipelines within the IoT edge-cloud continuum. Research will focus on the efficient implementation (and orchestration) on resource-constrained devices of distributed frugal and/or explainable AI methods, selected according to the needs of the use cases and of the orchestrator itself. The implemented AI modules will be validated in the laboratory and in actual use cases (both to support application intelligence and within the aerOS orchestrator that will manage them) and complemented with lessons learned, assuring ease of use.
Meta operating system support for cybersecurity is a multi-dimensional problem of protecting data at rest, in transit, and during processing. The increase in security needs raised by processing data locally creates novel challenges to be addressed: (i) requirements for lightweight data encryption and fine-grained data sharing, (ii) heterogeneous data dissemination control and secure data management, (iii) balancing security between large-scale edge services and resource-constrained edge devices, (iv) efficient privacy-preserving mechanisms. Also, privacy and trust mechanisms should be addressed in every Infrastructure Element in the IoT edge-cloud continuum, even the smart network. Data governance is also a challenge, since data is scattered across the “levels” and needs to be stored, deleted, processed, searched, transmitted and accessed while preserving security, integrity, trust and privacy. The inherently distributed nature of the IoT edge-cloud continuum poses security and privacy challenges due to the heterogeneity of edge infrastructural elements and the migration of services among them. A potential solution could be based on DLT, providing reliable access to and control of the network and enhancing data integrity and computation validity. There are research challenges to be addressed in relation to security, privacy and trust, with a focus on scalability and on the extension of the DevSecOps methodology to include privacy by design.
Breakthrough: aerOS will deliver the highest levels of security, privacy and trust while maintaining high performance, using lightweight SotA techniques such as Concise Binary Object Representation (CBOR) signing and encryption, lightweight attestation, and lightweight consensus. Information, knowledge and decisions shared amongst peers will be trusted thanks to traceability and accounting mechanisms, while leveraging a newly defined DevPrivSecOps methodology (including Security and Privacy in the DevOps processes). By design, aerOS will support any existing and future cybersecurity mechanism, thanks to its modular architecture.