Under the Hood
Highly advanced, yet easy to use
The Stream Analyze Platform provides easy and interactive development, deployment, and evolution of applications that process and analyze real-time streams of data in distributed and edge environments. The platform allows non-expert programmers such as analysts and engineers to interactively change, query, and define mathematical models on the fly, without the need for deep programming skills. User friendliness and functionality are provided without any performance loss. This is achieved by combining advanced database and computer algebra optimization with state-of-the-art dynamic compilation into native binary code. Altogether, this makes the platform uniquely powerful and efficient for developing, deploying, and evolving intelligent analytics models (including machine learning) on edge devices of any kind and size.
This page provides an extended overview of the Stream Analyze Engine, the kernel system powering the platform. The SA Engine has a highly optimized architecture, making it independent of operating systems and other software, resulting in an extremely small and efficient footprint.
Explore the following sections to understand the architecture of the SA Engine and its role in enabling the robust performance of the Stream Analyze Platform.
1. Introduction
The Stream Analyze Platform provides interactive search and analysis of large data streams in real time, directly on devices without relying on the cloud. Streams that are produced by sensors and other data sources on mobile or edge devices can be analyzed online and interactively. An edge device can be, e.g., an Android unit, a desktop computer, a Raspberry Pi, or an MCU such as an ARM Cortex M0/3/4/7 or a RISC-V core. This is possible since the kernel of the platform, the Stream Analyze Engine, has a very small footprint (from 17kB to 8MB depending on configuration), is hardware and OS agnostic, and is fully independent of any third-party software.
The combination in SA Engine of a main-memory database, a computational engine, a data stream processor, and an inference engine running on edge devices allows analytics directly on the edge devices, rather than the contemporary approach of first uploading all data from the devices and then doing the analytics centrally on a server or in the cloud. Processing data directly when and where it is produced allows for drastic data reduction [1]. Reducing the amount of data streamed to a central location (e.g., the cloud) substantially improves scalability across large fleets of edge devices: only the data required for device management and population analysis needs to be streamed from the edge devices.
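As a rough illustration of this data reduction (in Python rather than OSQL, with a hypothetical magnitude threshold), an edge-side filter forwards only the readings that matter and drops the rest locally:

```python
# Hypothetical illustration (not SA Engine code): reducing data volume by
# filtering a sensor stream at the edge and forwarding only significant events.

def edge_filter(readings, threshold):
    """Yield only the (time, value) readings whose magnitude exceeds the threshold."""
    for t, value in readings:
        if abs(value) > threshold:
            yield (t, value)

# A simulated accelerometer stream: mostly small noise, two strong spikes.
stream = [(0, 0.01), (1, 0.02), (2, 5.3), (3, 0.015), (4, 4.8), (5, 0.01)]
events = list(edge_filter(stream, threshold=1.0))
# Only 2 of the 6 readings would be streamed to the server.
```

In SA Engine such a filter would be a declarative continuous query rather than an explicit loop, but the effect on the transmitted data volume is the same.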
Stream Analyze Engine systems can be configured as stream servers for collecting and combining data from edges and other systems. For example, an analysis model in SA Engine on some edge device can do computations that detect strong vibrations based on real-time readings from an accelerometer. When strong vibrations are detected, a data stream containing the frequency spectrum of the vibrations along with the geographical position of the device is transmitted to a stream server. If the stream server receives many such streams at the same time from devices in geographical positions close to each other, it may indicate an earthquake. The stream server may furthermore forward the received data to other systems, e.g. for permanent central storage or batch analysis.
To analyze streaming data interactively at a very high, user-oriented level, SA Engine allows analysts and engineers to develop computations, filters, and transformations over data using a computational query language called OSQL (Object Stream Query Language) [18], which seamlessly extends queries and updates in the standard query language SQL with functions, numerical objects, and data streams for defining numerical models over streaming data. Using computational queries, computations and filters over real-time streaming data are defined as mathematical formulas and expressions, called stream models. A stream model is a set of definitions of mathematical functions, filters, and other computations over streams of measurements. Different kinds of models, algorithms, and engines can be combined through such computational queries. The models are specified non-procedurally at a very high level without deep programming knowledge: the user specifies what to do rather than writing detailed programs expressing how to execute the models, and need not worry about how to implement the analysis efficiently. Instead, an advanced query optimizer together with a dynamic compiler generates optimized native binary code for the devices on the fly [13]. Non-procedural specifications of queries and models over dense multi-dimensional arrays [14] are shown in [20][21] to be at least as efficient as corresponding manually coded computations in C/C++, while being specified as domain-oriented formulas with around 1/60th of the code volume of C/C++.
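The "what, not how" style can be conveyed with a small analogy in Python (OSQL syntax differs; the vibration-magnitude model here is hypothetical). The model is stated as a formula over a stream instead of an explicit processing loop:

```python
# Conceptual illustration in Python (not OSQL): a "stream model" expressed as
# a formula over a stream rather than an explicit loop.
import math

def magnitude(v):
    # Euclidean norm of a 3-axis accelerometer sample.
    return math.sqrt(sum(x * x for x in v))

def strong_vibrations(stream, limit):
    # Declarative-style definition: "the magnitudes in the stream above limit".
    return (magnitude(v) for v in stream if magnitude(v) > limit)

samples = [(0.1, 0.0, 0.0), (3.0, 4.0, 0.0), (0.2, 0.1, 0.1)]
result = list(strong_vibrations(samples, limit=1.0))  # one qualifying sample
```

In SA Engine the corresponding definition would be an OSQL function over an object stream, and the optimizer, not the user, decides how to execute it efficiently.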
SA Engine includes a library of over 1000 predefined OSQL-functions for math/stat computations, object stream filtering and transformation, signal processing, model and data management, and much more. The function library is continuously extended for new customer needs, and it is easy to define and deploy new user functions on-the-fly.
Common machine learning algorithms such as DBSCAN, DenStream, k-NN, k-means, and random forests are available as predefined OSQL models, and the user can easily extend them with other algorithms defined as queries or as foreign functions accessing external libraries. Learning and inference are supported on both edges and servers. Inference and computations require efficient representation of the multi-dimensional arrays used in models and queries. To support this, the internal array representation in SA Engine is compatible with many other systems working with dense arrays, including NumPy and other BLAS-based systems. This compatibility makes SA Engine fully interoperable with TensorFlow Lite and OpenVINO, including hardware-accelerated delegates, as shown in our vision-pipeline-demo¹.
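To give a flavor of how compact such algorithms are, here is a minimal k-NN classifier in plain Python (illustrative only; in SA Engine, k-NN ships as a predefined OSQL model and is expressed as queries over arrays):

```python
# Minimal k-NN classifier sketch (illustrative, not the SA Engine implementation).
from collections import Counter

def knn_classify(train, point, k):
    """train: list of (vector, label) pairs; returns majority label of k nearest."""
    def dist2(a, b):
        # squared Euclidean distance (monotone in distance, so sorting is unaffected)
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda vl: dist2(vl[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0, 0), "low"), ((0, 1), "low"), ((5, 5), "high"), ((6, 5), "high")]
label = knn_classify(train, (5, 4), k=3)
```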
Machine learning requires pre-processing of sensor data before applying the learned inference algorithm, followed by post-processing of the inferred knowledge. With SA Engine, both pre- and post-processing are easily expressed using its powerful computational queries.
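As a rough sketch of this pre-process / infer / post-process pattern (in Python rather than OSQL; the normalization, stand-in model, and threshold are all hypothetical):

```python
# Hypothetical pre-/post-processing pipeline around an inference step.

def preprocess(raw):
    # e.g. normalize raw sensor values to the range [0, 1]
    lo, hi = min(raw), max(raw)
    return [(x - lo) / (hi - lo) for x in raw]

def infer(features):
    # stand-in for a learned model: here simply the mean of the features
    return sum(features) / len(features)

def postprocess(score, threshold=0.5):
    # turn the inferred score into actionable knowledge
    return "anomaly" if score > threshold else "normal"

result = postprocess(infer(preprocess([10, 20, 40])))
```

In SA Engine, each of these three stages would be an OSQL function over streams, so the whole pipeline is a composition of queries that the optimizer can compile as a unit.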
SA Engine allows users to efficiently process dense sensor readings (including audio and video), numerical computations, and machine learning including neural networks. This is enabled by built-in data types and operators for arrays, which provide very efficient processing of vectors, matrices, and tensors. Using SA Engine, a neural network and its inference and training are specified as queries, which are optimized and compiled into binary code [21]. For example, neural network models defined in TensorFlow/TensorBoard are represented as dense arrays and functions over them. When SA Engine executes a neural network, the weights are stored in its main-memory database as dense arrays. For inference, the network is executed as binary code, e.g. over local sensor readings.
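Conceptually, storing the weights as dense arrays and running inference as array arithmetic looks like the following Python sketch (hypothetical 3-input, 2-output layer; SA Engine would express this as optimized queries over its built-in array types):

```python
# Tiny dense-layer inference sketch: weights held as in-memory dense arrays,
# inference as plain matrix arithmetic.

def relu(v):
    return [max(x, 0.0) for x in v]

def matvec(W, x):
    # matrix-vector product over nested lists
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# Hypothetical weights and bias for a 3-input, 2-output layer.
W = [[1.0, -1.0, 0.5],
     [0.0,  2.0, -0.5]]
b = [0.1, -0.2]

def dense_layer(x):
    return relu([wx + bi for wx, bi in zip(matvec(W, x), b)])

out = dense_layer([1.0, 2.0, 2.0])  # two activations
```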
The approach allows training of machine learning models including neural networks on edge devices². Thus, centrally trained models can be further retrained and modified on edge devices to adapt their behavior to their environment³.
SA Engine is independent of other systems, while providing powerful interoperability mechanisms for tight integration with other software on a variety of hardware platforms. This agnosticism has made it possible both to port the system to many different hardware and software environments and to scale it down to run on small edge devices. The architecture provides a mechanism to keep track of all running SA Engine systems. Since the number of edge devices can be massive, the system can scale out to run as many SA Engine copies on large multi-cores, clusters, and clouds. It is always the same SA Engine kernel software running in all these system configurations.
For interoperability of the Stream Analyze Platform with other systems and infrastructures, SA Engine is tightly embedded in several common programming languages, such as Python [17], C [1][12], C++ [15], Lisp [19], and Java [16]. This enables existing algorithms and libraries implemented in those languages to be plugged into the system as foreign functions. A foreign function implements an OSQL function in an external programming language using language-specific application programming interfaces (APIs), and can then be transparently used in queries and expressions. For example, a large library of basic mathematical, statistical, and machine learning algorithms is implemented as foreign functions in C and Lisp. New foreign functions can easily be developed.
A query that continuously computes or filters measurements in a data stream is called a continuous query (CQ). The Stream Analyze Platform allows analysts to interactively specify CQs for continuously analyzing measurements flowing through edge devices and stream servers in real time. The result of an OSQL CQ is a real-time object stream of processed and filtered measurements, for example a stream of the position vectors of a device, measured every second while the device is close to a given geo-position.
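The position-stream example can be mimicked with a Python generator (illustrative only; the reference point and radius are hypothetical, and a real CQ is declared in OSQL, not coded as a loop):

```python
# A continuous-query-like generator: emit a device's position once per
# second only while it is within `radius` of a reference point.
import math

def near(pos, ref, radius):
    return math.dist(pos, ref) <= radius

def position_cq(positions, ref, radius):
    # positions: iterable of (x, y) points sampled once per second
    return ((t, p) for t, p in enumerate(positions) if near(p, ref, radius))

track = [(0.0, 0.0), (0.5, 0.5), (10.0, 10.0)]
hits = list(position_cq(track, ref=(0.0, 0.0), radius=1.0))
```

Unlike a one-shot query, a CQ never terminates on its own: it keeps producing results as long as the device emits positions.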
Each SA Engine instance includes an object-oriented main memory database, which is used to store both stream models and user data. Tables stored in these databases can be queried and updated using standard SQL. The local databases are important for data stream processing, which usually involves matching in real-time fast flowing stream objects against data in a database. For example, to locally determine that the frequency spectrum of a measured vibration may later destroy an edge device, the frequencies measured by vibration sensors on a device are matched against a local database of known resonance frequencies of the device [8].
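The resonance-matching example amounts to joining a fast stream against a small local table. A Python sketch of the idea (component names, frequencies, and tolerance are hypothetical):

```python
# Sketch: matching measured vibration peaks against a local table of known
# resonance frequencies. Names and values are hypothetical illustration data.
RESONANCES = {"mount": 50.0, "blade": 120.0}  # component -> resonance (Hz)

def dangerous_frequencies(measured_peaks, tolerance=2.0):
    """Return component names whose resonance is close to any measured peak."""
    return sorted(
        name
        for name, f in RESONANCES.items()
        if any(abs(f - m) <= tolerance for m in measured_peaks)
    )

alerts = dangerous_frequencies([49.2, 300.0])  # 49.2 Hz is near "mount"
```

In SA Engine the table would live in the local main-memory database and the match would be a CQ joining the vibration stream with that table.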
To combine object streams from several edges, the Stream Analyze Platform supports fusion queries that combine object streams [25][26]. An example of a fusion query is a CQ observing when several edge devices in an area detect strong vibrations at the same time to detect anomalies. The user is alerted when the fusion query produces results, perhaps together with a real-time visualization stream of the magnitude of the observed vibrations. A user can then interactively send new queries on-the-fly to the affected devices to find out details of their causes. Fusion queries require the integrated data streams to be comparable even though the involved object streams from different devices may represent the same or similar data in different ways. For example, one device may represent temperature in Fahrenheit while another one uses Celsius.
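The core of such a fusion query, reduced to its simplest form, is grouping events from many devices by time window and alerting when enough devices agree. A toy Python stand-in (the quorum and window scheme are hypothetical):

```python
# Fusion sketch: flag time windows in which at least `quorum` distinct
# devices report strong vibrations.
from collections import defaultdict

def fuse(events, quorum):
    """events: (window, device_id) pairs; return windows with >= quorum devices."""
    per_window = defaultdict(set)
    for window, device in events:
        per_window[window].add(device)
    return sorted(w for w, devs in per_window.items() if len(devs) >= quorum)

events = [(0, "a"), (0, "b"), (1, "a"), (0, "c"), (2, "b")]
alarm_windows = fuse(events, quorum=3)  # window 0 has three distinct devices
```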
To be able to combine such heterogeneous data streams from different devices, the Stream Analyze Platform allows mediator models [5][10] to be defined as queries and functions that harmonize arriving heterogeneous object streams by transforming them to a universal model (ontology) in stream servers that integrate data streams from different edges. Mediation can be, e.g., mapping local names of sensors to a universally known nomenclature, measurement unit conversions, or calibrations of local measurements.
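A minimal mediation step might look like the following Python sketch (the local sensor names, the name mapping, and the unit handling are all hypothetical; in SA Engine the mediator would be defined as OSQL queries and functions):

```python
# Mediator sketch: harmonizing heterogeneous readings into a common model.

def f_to_c(f):
    # Fahrenheit to Celsius conversion
    return (f - 32.0) * 5.0 / 9.0

def mediate(reading):
    """Map a device-local (name, unit, value) reading to the common model."""
    name, unit, value = reading
    # map local sensor names to a universal nomenclature
    common_name = {"temp1": "temperature", "t_out": "temperature"}.get(name, name)
    # convert all temperatures to Celsius
    celsius = f_to_c(value) if unit == "F" else value
    return (common_name, celsius)

harmonized = [mediate(r) for r in [("temp1", "F", 212.0), ("t_out", "C", 25.0)]]
```

After mediation, streams from both devices are directly comparable in a fusion query.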
To access external data streams, SA Engine provides wrappers: APIs that process incoming data stream objects as they arrive and inject them into the SA Engine kernel so that the accessed streams can be used in CQs. Wrappers are defined as functions that return streams of objects from the wrapped data sources. There is a library of predefined wrappers for interoperating with common data infrastructures such as relational databases through JDBC and data processing systems through Kafka, Azure IoT Hub, CANBUS, or MQTT. New wrappers can easily be developed⁴.
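The shape of a wrapper, reduced to its essence, is a function that turns an external source into a stream of objects. A Python generator sketch (the CSV source and field names are hypothetical; real SA Engine wrappers are written against its language APIs):

```python
# Wrapper sketch: a function that exposes an external data source as a
# stream of objects consumable by queries.
import csv
import io

def csv_wrapper(text):
    """Wrap CSV text as a stream of (sensor, value) objects."""
    for row in csv.DictReader(io.StringIO(text)):
        yield (row["sensor"], float(row["value"]))

data = "sensor,value\nacc_x,0.12\nacc_y,4.50\n"
stream = list(csv_wrapper(data))
```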
SA Engine comes in two different versions:
• The SA Engine Prime system provides all capabilities needed for distributed real-time stream analytics. It requires 8MB of RAM in its smallest configuration. Large fleets of edge devices are handled by scaling out large federations of distributed SA Engine Prime stream servers.
• The SA Engine Nano system is a scaled down version of SA Engine that in its smallest configuration can run on very small devices, as it requires only 17kB of RAM and 350kB of flash. SA Engine Nano needs occasional assistance from an SA Engine Prime system, called an SA Engine Twin. An SA Engine Twin stores meta-data about the devices it represents and compensates for their missing functionality.
¹ https://youtu.be/Nbuj76ZQd6Q
² https://youtu.be/lWGi6ixIrKs
³ https://youtu.be/RBOX2P7-3L4
⁴ https://studio.streamanalyze.com/docs/reference/osql-ref/external-data/
2. Stream Analyze Engine Prime
The SA Engine Prime system provides general data analytics and inference capabilities to the device or computer where it is running. It is designed to be easily integrated with other systems and extended with models and plug-ins. The figure below illustrates the SA Engine Prime system and how it is extensible by interfaces to external systems and data.
3. Stream Analyze Engine Nano
The SA Engine kernel system can be scaled down to run directly on small devices with limited or no OS support. The scaled-down version of SA Engine is called SA Engine Nano. The figure below illustrates the architecture of SA Engine Nano.
4. Architecture of Cloud Components
The cloud components of the Stream Analyze Platform are designed to handle:
• a large number of connected edge devices,
• a large number of users,
• a large number of models managed by the users across the edge devices.
5. Query optimization
The distributed query processing and dynamic native code generation of SA Engine is documented in [13]. The optimization of queries and models involving dense numerical arrays is documented in [14] and benchmarked in [20]. See [21] for a detailed walk-through of the query processing steps when processing an optimized neural network with tensors fully defined in OSQL. This section provides a summarized description of the query optimizer. The steps of the query processor in SA Engine are illustrated by the figure below:
b. Overlapping predicates are identified and bound to variables by unification [1].
c. Expressions that never change over time are evaluated once and for all at query optimization time by the query rewriter and replaced by their values, i.e. partial evaluation. For example, the expression sqrt(2 * 3.1416) is replaced with 2.5066.
d. Special rewrite rules are applied to certain expressions in the query: e.g., array [14] and numerical [6] expressions are translated into predicates that are further reduced and combined with other predicates, and expressions accessing the local database are rewritten [23]. Furthermore, specific computer algebra rules are applied based on knowledge about mathematical functions. For example, for addition and multiplication the rules x+0 -> x and x*1 -> x are applicable.
e. Redundant predicates whose values are not used anywhere are removed, a form of dead code elimination for queries [1].
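A toy rewriter in Python can illustrate two of these steps, partial evaluation of constant subexpressions and the algebraic rules x+0 -> x and x*1 -> x (the expression representation is hypothetical; the SA Engine rewriter works on its internal predicate calculus, not on tuples like these):

```python
# Toy query rewriter: constant folding (partial evaluation) plus the
# algebraic simplifications x+0 -> x and x*1 -> x.
import math

def rewrite(expr):
    """expr: nested tuples like ('+', a, b), ('*', a, b), ('sqrt', a),
    numbers, or variable-name strings."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [rewrite(a) for a in args]          # rewrite bottom-up
    if all(isinstance(a, (int, float)) for a in args):
        # partial evaluation: fold constant subexpressions at "optimization time"
        fns = {'+': lambda a, b: a + b,
               '*': lambda a, b: a * b,
               'sqrt': lambda a: math.sqrt(a)}
        return fns[op](*args)
    if op == '+' and 0 in args:                # x+0 -> x
        return args[0] if args[1] == 0 else args[1]
    if op == '*' and 1 in args:                # x*1 -> x
        return args[0] if args[1] == 1 else args[1]
    return (op, *args)

folded = rewrite(('sqrt', ('*', 2, 3.1416)))   # constant-folds to ~2.5066
reduced = rewrite(('+', ('*', 'x', 1), 0))     # simplifies to the variable 'x'
```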