More and more organisations are breaking up monoliths (formerly known as “apps”) into smaller services, adopting the microservices architecture pattern, and having those services communicate asynchronously via Event Sourcing. The main benefits are:
Make sure everyone is happy with the current architecture. If you need to deliver fast and leave some fixes for later, plan when those fixes will be made, make sure everyone is on the same page, and document those decisions.
Additionally, you want your service to be highly available:
And secure (security is a huge topic, and the list below is by no means exhaustive):
Ensure your documentation (README/Wiki entries) provides enough information about your app:
--help output for binaries, with the available flags.
Your microservice should be fairly simple and do very specific tasks. If you find it does too many things, consider breaking it down into more microservices.
There are many benefits to containerizing your application, described here.
Expose a healthcheck endpoint that the container orchestration framework can use to determine whether the app is healthy and ready to accept connections.
Possible healthcheck values:
Degraded performance means some operations will fail.
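To make the healthcheck idea concrete, here is a minimal sketch using only the Python standard library. Everything specific in it is an assumption for illustration: the `/health` path, the port, the status names other than "degraded", the split between critical and non-critical dependencies, and the choice to answer 200 for a degraded service and 503 for an unhealthy one. Adjust all of these to what your orchestration framework expects.

```python
import json
from enum import Enum
from http.server import BaseHTTPRequestHandler, HTTPServer


class Health(Enum):
    # Hypothetical status names; use whatever values your service defines.
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    UNHEALTHY = "unhealthy"


def evaluate_health(checks):
    """Combine dependency checks into a single status.

    `checks` maps a dependency name to a pair (ok, critical). A failed
    critical dependency makes the service unhealthy; a failed non-critical
    one only degrades it, meaning some operations will fail.
    """
    failed = [name for name, (ok, _critical) in checks.items() if not ok]
    if any(checks[name][1] for name in failed):
        return Health.UNHEALTHY
    return Health.DEGRADED if failed else Health.HEALTHY


def http_status_for(health):
    # Assumption: a degraded service still answers 200 so it keeps receiving traffic.
    return 503 if health is Health.UNHEALTHY else 200


def current_checks():
    # Placeholder: replace with real probes (database ping, queue connection, ...).
    return {"database": (True, True), "cache": (True, False)}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        health = evaluate_health(current_checks())
        body = json.dumps({"status": health.value}).encode()
        self.send_response(http_status_for(health))
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    # Blocks forever; call this from your service's entry point.
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Keeping `evaluate_health` separate from the HTTP handler means the status logic can be unit tested without starting a server.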
Metrics let you build meaningful dashboards (for example in Grafana) that show how your app is behaving: memory consumption, GC pauses, error rate, HTTP status codes, and anything else that helps when troubleshooting your application.
A popular tool for metrics collection is Prometheus.
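As a rough illustration of what exposing a metric involves, the sketch below hand-rolls a labelled counter and renders it in the Prometheus text exposition format. The `Counter` class and the metric name `http_requests_total` are hypothetical helpers written for this example, not the official client library, which you would normally use in a real service.

```python
import threading


class Counter:
    """A tiny thread-safe counter with labels (illustrative, not the official client)."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = {}
        self._lock = threading.Lock()

    def inc(self, **labels):
        # Each distinct label combination gets its own time series.
        key = tuple(sorted(labels.items()))
        with self._lock:
            self._values[key] = self._values.get(key, 0) + 1

    def render(self):
        # Note: label values are not escaped here; a real client handles that.
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        with self._lock:
            for key, value in sorted(self._values.items()):
                label_str = ",".join(f'{k}="{v}"' for k, v in key)
                suffix = f"{{{label_str}}}" if label_str else ""
                lines.append(f"{self.name}{suffix} {value}")
        return "\n".join(lines) + "\n"


http_requests = Counter("http_requests_total", "Total HTTP requests by status code.")
http_requests.inc(code="200")
http_requests.inc(code="200")
http_requests.inc(code="500")
print(http_requests.render(), end="")
# Prints:
# # HELP http_requests_total Total HTTP requests by status code.
# # TYPE http_requests_total counter
# http_requests_total{code="200"} 2
# http_requests_total{code="500"} 1
```

Serving the rendered text from an HTTP endpoint (conventionally `/metrics`) is what lets a Prometheus server scrape it.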
Create alerts based on business and operational logic. Here are some examples:
Note: Finding the right value for a threshold sometimes takes time, and it should be adjusted as you learn. If you are alerted too often, you will end up ignoring alerts or, even worse, missing important ones in the noise.
Think about what threshold is acceptable for your service, investigate why each alert was triggered, and adjust the threshold if necessary.
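One common way to cut alert noise, sketched below, is to fire only when a value stays above the threshold for several consecutive samples rather than on a single spike. The class name `ThresholdAlert`, the 5% error-rate threshold, and the window of three samples are all assumptions chosen for the example.

```python
from collections import deque


class ThresholdAlert:
    """Fires only when the value exceeds `threshold` for `for_samples` samples in a row."""

    def __init__(self, threshold, for_samples):
        self.threshold = threshold
        # Only the most recent `for_samples` comparisons are remembered.
        self.recent = deque(maxlen=for_samples)

    def observe(self, value):
        self.recent.append(value > self.threshold)
        # Fire once the window is full and every sample in it breached the threshold.
        return len(self.recent) == self.recent.maxlen and all(self.recent)


# Hypothetical error rates sampled once per minute; alert above 5% for 3 minutes.
alert = ThresholdAlert(threshold=0.05, for_samples=3)
fired = [alert.observe(rate) for rate in [0.10, 0.02, 0.08, 0.09, 0.07, 0.01]]
print(fired)  # [False, False, False, False, True, False]
```

The single spike at the start does not page anyone; only the sustained run of three high samples does. Tuning `for_samples` and `threshold` is exactly the adjustment the note above describes.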
Have a backup strategy as well as a way to restore from previous backups. It is very important that your teammates are familiar with these processes and that you have tested that they work.
You don’t want restoring from a backup to fail in a critical moment.
You also want an alert if backups are not being taken. You can achieve this by having your backup service expose relevant metrics and setting up alerts for every app that has its data backed up.
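A backup-freshness check can be as simple as comparing the timestamp of the last successful backup against a maximum allowed age. The sketch below assumes the backup service can report that timestamp per app; the function name `stale_backups`, the app names, and the 24-hour limit are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_backups(last_success, max_age, now=None):
    """Return the apps whose latest successful backup is missing or older than `max_age`.

    `last_success` maps an app name to the timezone-aware datetime of its last
    successful backup, or None if no backup has ever succeeded.
    """
    now = now or datetime.now(timezone.utc)
    stale = []
    for app, finished_at in last_success.items():
        if finished_at is None or now - finished_at > max_age:
            stale.append(app)
    return sorted(stale)


# Example with a fixed "now" so the result is reproducible.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
last_success = {
    "orders": now - timedelta(hours=3),
    "billing": now - timedelta(hours=30),
    "users": None,
}
print(stale_backups(last_success, max_age=timedelta(hours=24), now=now))
# ['billing', 'users']
```

Treating a missing timestamp as stale matters: an app whose backups have never run is the case you most want to be alerted about.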
For more on distributed systems, reliable microservices, and event sourcing, I recommend the following resources:
I’d love to hear your thoughts and about your experience in the microservices world. Feel free to reach out to me on LinkedIn!