by Michał Cukierman
6 min read
Observability is key in modern software systems, especially those with a microservices architecture, because it helps teams diagnose, debug, and improve system performance and reliability.
In monolithic applications, we used to rely on:
Stack traces
Memory dumps
JMX beans
Log files
While we can still use these for a single service, it's almost impossible to track every single instance this way. It's also a matter of the cattle vs. pets approach: in modern systems, you focus on the system as a whole rather than on an individual service.
In StreamX, we use several standards to monitor the state of our meshes:
OpenTelemetry for tracing
Micrometer for gathering metrics
Loki as a logs database
For the UI, we rely on Grafana dashboards and Jaeger tracing. I especially like the tracing, where each operation in a microservice adds a span to the trace produced by a single publication. It visualizes the pipelines executed by the system.
What can we read in the graph below?
[Update: In April 2025, we’ve replaced Jaeger tracing with Tempo.]
Fig. 1: Jaeger tracing in StreamX
A single data publication, used to generate pages, triggered the execution of 9 services: data collector, data aggregation, rendering engine, relay, sitemap generation, indexable item extraction, search and web server ingestion.
Some of the services were triggered multiple times (for example, multiple pages were rendered based on a single data publication event, and there are multiple instances of web servers and search engines).
The whole trace, 409 spans generated by 11 services (in some cases processed by multiple replicas), took around 200 ms.
There were no errors.
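To make the reading above more concrete, here is a minimal Python sketch of how the headline numbers of a trace (span count, services involved, invocations per service, total duration, errors) can be derived from raw spans. The `Span` class, its fields, the service names and the timings are all made up for illustration; this is not the OpenTelemetry data model and not how Jaeger or Tempo compute their views.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One operation recorded by a service (hypothetical, simplified model)."""
    trace_id: str
    service: str
    start_ms: float      # start time relative to the beginning of the trace
    duration_ms: float
    error: bool = False


def summarize_trace(spans):
    """Return the headline numbers a trace view typically shows for one trace."""
    if not spans:
        raise ValueError("a trace needs at least one span")
    per_service = Counter(span.service for span in spans)
    start = min(span.start_ms for span in spans)
    end = max(span.start_ms + span.duration_ms for span in spans)
    return {
        "span_count": len(spans),
        "service_count": len(per_service),
        "invocations_per_service": dict(per_service),
        "duration_ms": end - start,
        "error_count": sum(1 for span in spans if span.error),
    }


# Made-up trace: one publication fans out into two renders and two ingestions.
example_trace = [
    Span("t1", "data-collector", 0.0, 12.0),
    Span("t1", "rendering-engine", 12.0, 40.0),
    Span("t1", "rendering-engine", 14.0, 35.0),
    Span("t1", "web-server-ingestion", 52.0, 8.0),
    Span("t1", "web-server-ingestion", 53.0, 9.0),
]
summary = summarize_trace(example_trace)
print(summary)
```

Note that the total duration is measured from the earliest span start to the latest span end, so spans that run in parallel on different replicas do not add up.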
By reading cumulative (aggregated) metrics based on spans, we can find out things like:
processing rate (number of service invocations per second)
p95 latency
error rates
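The three metrics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each span is reduced to a hypothetical `(duration_ms, is_error)` pair, all spans fall into one fixed time window, and p95 uses the simple nearest-rank method. Real metric backends usually work on histograms and sliding windows instead.

```python
def p95(values):
    """95th percentile using the nearest-rank method."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    # 1-based rank = ceil(0.95 * n), computed with integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def span_metrics(spans, window_seconds):
    """Aggregate (duration_ms, is_error) pairs observed within one time window."""
    if not spans:
        raise ValueError("need at least one span")
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    durations = [duration for duration, _ in spans]
    errors = sum(1 for _, is_error in spans if is_error)
    return {
        "rate_per_second": len(spans) / window_seconds,
        "p95_latency_ms": p95(durations),
        "error_rate": errors / len(spans),
    }


# Made-up sample: 20 invocations in a 10-second window, one of them slow and failed.
sample = [(float(10 + i), False) for i in range(19)] + [(250.0, True)]
metrics = span_metrics(sample, window_seconds=10)
print(metrics)
```

In this sample the single 250 ms outlier does not move the p95, which is exactly why percentiles are often preferred over averages when watching latency.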
Designing observability requires significant time, but once completed, it enables the team to operate with much more information. Therefore, I believe it is essential for modern systems.
"Observability is the weakest aspect for composable commerce architecture. For now this isn't such a big issue only because even the world's largest composable commerce ecosystem probably have less than 30 microservice saas platforms integrated."
"@Albin Paul This is because most MACH/Composable solutions are built in-house. When planning, things like observability are often out of scope. Let’s be clear: no one wins a client because of distributed tracing or well-tailored metrics.
This is where the build vs. buy decision comes into play. By building composable solutions on a product, you get most of these features already integrated. It’s not only cheaper and faster than in-house development, but also of higher quality and well-tested, and we can demonstrate this with measurable outcomes."