How StreamX implements observability

First published on LinkedIn

This article was originally published on Michał's LinkedIn. We are republishing it here with subheadings added for clarity.

What worked in a monolith won't work in modern systems

In monolithic applications, we used to rely on:

Stack traces
Memory dumps
JMX beans
Log files

While we can still use these for a single service, it's almost impossible to track every single instance this way. It's also a matter of the cattle vs. pets approach: in modern systems, you focus mostly on the system as a whole rather than on an individual service.

What we use for monitoring

In StreamX, we use several standards to monitor the state of our meshes:

OpenTelemetry for tracing
Micrometer for gathering metrics
Loki as a logs database

For the UI, we rely on Grafana dashboards and Jaeger tracing. I especially like the tracing, where each operation in a microservice adds a span to the trace produced by a single publication. It visualizes the pipelines executed by the system.
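To make the span idea concrete, here is a minimal sketch in plain Python. It is not StreamX code and does not use the OpenTelemetry API. The `Trace` and `Span` classes, the `publish` function, and the service names are hypothetical and only illustrate how every operation triggered by one publication can attach a span to the same trace.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    service: str
    operation: str
    start: float
    end: float = 0.0


@dataclass
class Trace:
    """All spans produced by one publication share the same trace_id."""
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    spans: List[Span] = field(default_factory=list)

    @contextmanager
    def span(self, service: str, operation: str,
             parent: Optional[Span] = None) -> Iterator[Span]:
        current = Span(
            trace_id=self.trace_id,
            span_id=uuid.uuid4().hex[:16],
            parent_id=parent.span_id if parent else None,
            service=service,
            operation=operation,
            start=time.monotonic(),
        )
        try:
            yield current
        finally:
            # A span is recorded when its operation finishes.
            current.end = time.monotonic()
            self.spans.append(current)


def publish() -> Trace:
    """Hypothetical pipeline: one publication flows through three services."""
    trace = Trace()
    with trace.span("data-collector", "collect") as root:
        with trace.span("rendering-engine", "render", parent=root) as render:
            with trace.span("web-server", "ingest", parent=render):
                pass  # the actual work would happen here
    return trace
```

The parent links are what let a tracing UI draw the pipeline as a tree: each span points at the operation that triggered it, and the shared trace ID groups them under one publication.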

What can we read in the graph below?

[Update: In April 2025, we replaced Jaeger tracing with Tempo.]

Fig. 1: Jaeger tracing in StreamX

A single data publication used to generate pages triggered the execution of 9 services: data collector, data aggregation, rendering engine, relay, sitemap generation, indexable item extraction, search and web server ingestion.

Some of the services were invoked multiple times (e.g. multiple pages were rendered from a single data publication event, and there are multiple instances of web servers and search engines).

The 409 spans, generated by 11 services and in some cases processed by multiple replicas, took around 200 ms.

There were no errors.
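The same kind of reading can be done programmatically. The sketch below assumes spans are available as simple records with a service name, start and end timestamps, and an error flag. `SpanRecord` and `summarize` are hypothetical names, not part of Jaeger or Tempo.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SpanRecord:
    service: str
    start_ms: float
    end_ms: float
    error: bool = False


def summarize(spans: List[SpanRecord]) -> Dict[str, float]:
    """Summarize one trace: span count, distinct services, duration, errors."""
    if not spans:
        return {"spans": 0, "services": 0, "duration_ms": 0.0, "errors": 0}
    return {
        "spans": len(spans),
        "services": len({s.service for s in spans}),
        # Spans overlap when replicas work in parallel, so the trace duration
        # is the wall-clock window, not the sum of span durations.
        "duration_ms": max(s.end_ms for s in spans) - min(s.start_ms for s in spans),
        "errors": sum(s.error for s in spans),
    }
```

Measuring the window from the earliest start to the latest end is why hundreds of spans can fit in a couple of hundred milliseconds: much of the work runs concurrently.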

By reading cumulative (aggregated) metrics based on spans, we can find out things like:

processing rate (number of service invocations per second)
p95 latency
error rates
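As a rough illustration of how these three numbers can be derived from spans, here is a small sketch. It assumes finished spans collected over a fixed time window and uses the nearest-rank method for the percentile. Real backends typically compute percentiles from histograms instead, and `FinishedSpan`, `percentile`, and `service_metrics` are hypothetical names.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class FinishedSpan:
    service: str
    duration_ms: float
    error: bool = False


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def service_metrics(spans: Iterable[FinishedSpan],
                    window_s: float) -> Dict[str, Dict[str, float]]:
    """Aggregate spans observed during window_s seconds, per service."""
    by_service: Dict[str, List[FinishedSpan]] = defaultdict(list)
    for span in spans:
        by_service[span.service].append(span)
    return {
        name: {
            "rate_per_s": len(group) / window_s,
            "p95_ms": percentile([s.duration_ms for s in group], 95),
            "error_rate": sum(s.error for s in group) / len(group),
        }
        for name, group in by_service.items()
    }
```

Calling `service_metrics(spans, window_s=60.0)` on one minute of spans would return one entry per service, which is roughly the shape of data a dashboard panel plots over time.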

The tricky part

Designing observability takes significant time, but once it is in place, the team can operate with much more information. Therefore, I believe it is essential for modern systems.

From the comments

Albin Paul
AyataCommerce

"Observability is the weakest aspect for composable commerce architecture. For now this isn't such a big issue only because even the world's largest composable commerce ecosystem probably have less than 30 microservice saas platforms integrated."

Michał Cukierman
StreamX.dev

"@Albin Paul This is because most MACH/Composable solutions are built in-house. When planning, things like observability are often out of scope. Let’s be clear: no one wins a client because of distributed tracing or well-tailored metrics.

This is where the build vs. buy decision comes into play. By building composable solutions on a product, you get most of these features already integrated. It’s not only cheaper and faster than in-house development, but also of higher quality and well-tested, and we can demonstrate this with measurable outcomes."

by Michał Cukierman

StreamX co-founder and CTO