Observability with Hyperf and OpenTelemetry


At PicPay, to improve the scalability of our PHP applications, we use Swoole as a high-performance runtime, making applications asynchronous and non-blocking.
Adopting Swoole brought some challenges, and one of them is observability. We will look at why below.

Photo by Davyn Ben on Unsplash
Observability
Observability is the methodology that the microservices and cloud-native market uses to monitor applications, understand how resources are being used, determine whether the application is doing what it is supposed to do, and, if there are errors, identify them.
To do that, three pillars are defined (and maybe a fourth):
Metrics
Metrics are quantitative measurements commonly used to compare and track performance or output. How many times it happened.
Traces
A trace is a set of spans: the collection of places a transaction passed through during its life cycle. Where it happened.
Logs
Information about the current context at the moment something happened during the life cycle of a transaction. What happened.
Events?
It is increasingly common to separate logs from events, and many people see events as the fourth pillar of observability. Events carry information about business rules: custom data about what happened in a given flow. The biggest difference from logs is that infrastructure data does not belong here.
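To make the pillars concrete, here is a minimal Python sketch of the three signals recorded for a single transaction. The Telemetry class, the metric name, and the order_id value are hypothetical names invented for this illustration. It is not the API of any real observability library.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Telemetry:
    """Hypothetical in-memory holder for the signals of one transaction."""
    counters: dict = field(default_factory=dict)  # metrics: how many times
    spans: list = field(default_factory=list)     # traces: where it happened
    logs: list = field(default_factory=list)      # logs: what happened

    def count(self, name):
        # Increment a named counter by one.
        self.counters[name] = self.counters.get(name, 0) + 1

    def span(self, name, started_at):
        # Record one step of the transaction and how long it took.
        self.spans.append({"name": name, "duration": time.monotonic() - started_at})

    def log(self, message, **context):
        # Record a message together with the context at that moment.
        self.logs.append({"message": message, **context})


telemetry = Telemetry()
started_at = time.monotonic()
telemetry.count("payment.requests")
telemetry.log("payment authorized", order_id="order-123")
telemetry.span("authorize_payment", started_at)
```

In a real system each of these signals would be exported to a backend instead of being kept in memory.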
Hyperf
Hyperf is the web framework we adopted for PHP applications on Swoole. What led us to choose it is that it treats coroutines as first-class citizens. Hyperf was built for Swoole rather than adapted to it, as is the case with other popular frameworks such as Laravel, Symfony, and Laminas.
All of its components are prepared to work with coroutines: they use connection pools, avoid global state by relying on coroutine contexts, and avoid memory leaks as much as possible, since process memory is no longer cleared on every request. That also brings the care needed to keep data from one request from affecting another.
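The difference between global state and a per-coroutine context can be illustrated outside PHP. The sketch below is a Python analogy using asyncio and contextvars, not Hyperf's own context API. The function names and the "alice"/"bob" values are made up for the example.

```python
import asyncio
import contextvars

# Global state: shared by every request handled in this process.
shared = {"user": None}

# Per-task state: each asyncio task gets its own copy of the context,
# which is similar in spirit to a coroutine context.
current_user = contextvars.ContextVar("current_user", default=None)


async def handle_with_global(user):
    shared["user"] = user
    await asyncio.sleep(0)  # yield, so the other request runs in between
    return shared["user"]


async def handle_with_context(user):
    current_user.set(user)
    await asyncio.sleep(0)  # yield, so the other request runs in between
    return current_user.get()


async def main():
    leaked = await asyncio.gather(handle_with_global("alice"), handle_with_global("bob"))
    isolated = await asyncio.gather(handle_with_context("alice"), handle_with_context("bob"))
    return leaked, isolated


leaked, isolated = asyncio.run(main())
```

With the global dictionary, the first request reads the value written by the second one. With the context variable, each request only sees its own data.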
In short, Hyperf is prepared to work in a stateful way, which is the complete opposite of the stateless model of traditional PHP-FPM.
Made for microservices
Another very nice point that supported our choice is the way Hyperf focuses on microservices. It does not worry about things like views and sessions, which are more commonly used when building websites.
Instead, Hyperf delivers components designed for microservices and the cloud world, such as circuit breaker, rate limiting, service discovery, remote configuration, gRPC, etc.
And, of course, alongside these cloud-native components focused on microservices, there are also components for Observability.
Problems with APMs
Application observability is usually done through APMs with little effort. The monitoring service provider, for example New Relic, supplies an agent that is installed on the server, runs alongside the application, and instruments what is happening.
APMs do this using a technique called monkey patching, a way of adding behavior at runtime. When you call cURL through the curl_* functions, for example, you are actually calling the New Relic library, which in turn calls PHP’s standard cURL, but in between it adds the behavior that performs the instrumentation.
Swoole has a feature called Runtime Hooks, which makes existing PHP features such as cURL, PDO, Redis, etc. work asynchronously inside its event loop. Swoole does this using the same monkey patching technique, overriding PHP’s native functions to add this new behavior.
Then comes the problem: two extensions, each trying to monkey patch the same native PHP functions. They conflict, and in the end neither works.
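The mechanics can be sketched in a few lines of Python. This is only an analogy: native.fetch is a hypothetical stand-in for a native function such as the cURL ones, and the two patches stand in for the APM and for the runtime hooks. In this simplified version the second patch silently wins and the instrumentation is lost, while with the real extensions the outcome was that neither worked.

```python
import types

# Hypothetical stand-in for a native function (no real HTTP call is made).
native = types.SimpleNamespace(fetch=lambda url: f"response from {url}")
calls = []  # what the "APM" managed to observe

original_fetch = native.fetch


def instrumented_fetch(url):
    """APM-style patch: record the call, then delegate to the original."""
    calls.append(url)
    return original_fetch(url)


native.fetch = instrumented_fetch  # first patch: instrumentation
first = native.fetch("https://example.com/a")


def non_blocking_fetch(url):
    """Runtime-hook-style patch: replaces the function outright."""
    return f"non-blocking response from {url}"


native.fetch = non_blocking_fetch  # second patch overwrites the first
second = native.fetch("https://example.com/b")
```

After the second patch, calls still contains only the first URL: the instrumentation no longer sees any traffic.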
OpenTelemetry
One of the ways we found to solve this problem with APMs was to do the instrumentation in PHP itself instead of relying on APMs. It worked really well. Fortunately, New Relic has REST APIs that map 1:1 to the pillars: one API for metrics, one for traces, and one for events. We instrumented things manually and sent the data to those APIs.
The problem with that approach was too much mixing between business rule code and instrumentation code; classes and methods gained lines that had nothing to do with their actual purpose.
In any case, it was through this idea of instrumenting with PHP that we discovered OpenTelemetry.
The OTel project is the merger of the OpenTracing and OpenCensus projects; in other words, initiatives for observability using open formats already existed. OpenTelemetry brought these communities together under one organization.
There were already OpenTracing projects for PHP that emitted open formats such as Jaeger and StatsD, and OpenTelemetry inherited them. Its Collector component was the missing piece: it connects this observability ecosystem, which already worked very well, to New Relic.
Remember that Hyperf is entirely focused on microservices, including observability? It already provides two components, hyperf/trace and hyperf/metric, that serve precisely to instrument Hyperf applications and export the data in open formats.
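As an example of how simple such an open format can be, the basic StatsD line protocol encodes a metric as name:value|type, where c is a counter, ms a timer, and g a gauge. The Python sketch below only formats such lines. The metric names are hypothetical, and it does not reflect how hyperf/metric builds them internally.

```python
def statsd_line(name, value, metric_type):
    """Format one metric in the basic StatsD line protocol: name:value|type."""
    if metric_type not in {"c", "ms", "g"}:
        raise ValueError(f"unsupported StatsD metric type: {metric_type}")
    return f"{name}:{value}|{metric_type}"


counter_line = statsd_line("payment.requests", 1, "c")   # counter
timer_line = statsd_line("payment.duration", 320, "ms")  # timer in milliseconds
gauge_line = statsd_line("pool.connections", 12, "g")    # gauge
```

A client would typically send each line as a UDP datagram to a StatsD-compatible receiver.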
Instrumentation with AOP
A very nice point about instrumentation in Hyperf is that it avoids mixing instrumentation with business rules by using a technique called AOP (Aspect-Oriented Programming). It is a way to implement monkey patching (remember it?), meaning we can add behavior (in this case, instrumentation) to classes, methods, and functions without actually changing their code.
This time, however, the monkey patching is done by Hyperf's AOP in PHP itself, so it does not conflict with the monkey patching Swoole does to make PHP components non-blocking.
I highly recommend reading about the technique: https://en.wikipedia.org/wiki/Aspect-oriented_programming
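The idea can be sketched in Python as an "around" wrapper applied from outside the class. This is an analogy only: PaymentService, apply_tracing_aspect, and recorded_spans are hypothetical names, and Hyperf's own aspect API is not shown here.

```python
import functools
import time


class PaymentService:
    """Business code: contains no instrumentation at all."""

    def charge(self, amount):
        return f"charged {amount}"


recorded_spans = []  # (span name, duration in seconds)


def apply_tracing_aspect(cls, method_name):
    """Wrap cls.method_name so each call records a span.

    The source of the class itself is never edited.
    """
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def around(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return original(self, *args, **kwargs)
        finally:
            # Runs whether the call succeeded or raised.
            duration = time.perf_counter() - start
            recorded_spans.append((f"{cls.__name__}.{method_name}", duration))

    setattr(cls, method_name, around)


# The aspect is wired up in one place, away from the business rules.
apply_tracing_aspect(PaymentService, "charge")
result = PaymentService().charge(100)
```

The business class stays focused on its own job, and removing the instrumentation means deleting a single line.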
Conclusion

Application built with Hyperf
It is instrumented and exports data in open formats, such as Jaeger and StatsD.
OpenTelemetry Collector
Because it is the combination of OpenTracing and OpenCensus, it inherits the ability to receive open formats (such as Jaeger and StatsD), and New Relic itself supports exporting the data to its platform.
New Relic
A strong partner of the project, supporting the initiative and increasingly adding support to receive the data generated by OpenTelemetry components.


