In this piece, which originally appeared here, we’ll look at the challenges of refactoring SOAs to MSAs, in light of different communication types between microservices, and see how pub-sub message transmission — as a managed Apache Kafka Service — can mitigate or even eliminate these challenges.
If you’ve developed or updated any kind of cloud-based application in the last few years, chances are you’ve done so using a Microservices Architecture (MSA), rather than the slightly more dated Service-Oriented Architecture (SOA). So, what’s the difference?
As Jeff Myerson wrote:
“If Uber were built with an SOA, their services might be:
“Whereas, if Uber were built instead with microservices, their APIs might be more like:
“More APIs, smaller sets of responsibilities.”
With smaller sets of responsibilities for each service, it’s easier to isolate functionality. But what are microservices and how do MSAs compare to SOAs and monolithic applications?
What Are Microservices?
Simply put, microservices are a software development method where applications are structured as loosely coupled services. The services themselves are minimal atomic units which together, comprise the entire functionality of the entire app. Whereas in an SOA, a single component service may combine one or several functions, a microservice within an MSA does one thing — only one thing — and does it well.
Microservices can be thought of as minimal units of functionality, can be deployed independently, are reusable, and communicate with each other via various network protocols like
HTTP (more on that in a moment).
Today, most cloud-based applications that expose a REST API are built on microservices (or may actually be one themselves). These architectures are called Microservice Architectures, or MSAs.
On the continuum from single-unit, monolithic applications to coarse-grained service-oriented architectures, MSAs offer the finest granularity, with a number of specialized, atomic services supporting the application.
Source: Samarpit Tuli’s Quora Answer: “What’s the difference between SOAs and Microservices?”
From this, one starts to get a sense of how asynchronous communication at scale could serve as a benefit in the context of apps that pull and combine data from several APIs. Still, while most organizations are considering implementing their applications as MSAs — or already have — the task, especially when refactoring from MSAs or monoliths, is not exactly straightforward.\
For example, 37% of respondents building web apps reported that monitoring was a significant issue in one sample. Why?
Some clues can be seen in some of the challenges cited by those refactoring legacy apps to MSAs — overcoming tight coupling was cited by 28% of respondents, whereas finding where to break up monolithic components was cited by almost as many.
These types of responses suggest a few different, but actually related, conclusions:
- Monitoring services built on MSAs is more complicated (as opposed to SOAs or Monolithic apps) because of multiple points of failure (which exist potentially everywhere a service integrates with another).
- Tight coupling suggests components and protocols are inflexibly integrated point-to-point in monolithic apps or SOAs (making them difficult to maintain and build functionality around).
- Breaking up monolithic apps or large SOA components into atomic, independent, reusable microservices is challenging for exactly those first two reasons.
Also, what sort of problems can one expect when your application scales? We’ll look at these and suggest a solution below. But there’s one question that underlies all of the above concerns: Once we do manage to break up our apps into atomic services, what’s the best way for these services to communicate with each other?
Some Microservices Communication Patterns
In her article “Introduction to Microservices Messaging Protocols,” Sarah Roman provides an excellent breakdown of the taxonomy of communication patterns used by and between microservices:
Synchronous communication is when the sender of the event waits for processing and some kind of reply, and only then proceeds to other tasks. This is typically implemented as REST calls, where the sender submits an HTTP request, and then the service processes this and returns an HTTP response. Synchronous communication suggests tight coupling between services.
Asynchronous communication means that a service doesn’t need to wait on another to conclude its current task. A sender doesn’t necessarily wait for a response, but either polls for results later or records a callback function or action. This typically is done over message buses like Apache Kafka and/or RabbitMQ. Asynchronous communication actually invokes loose coupling between component services, because there can be no time dependencies between sending events and a receiver acting on them.
In this case, each request has one sender and one receiver. If there are multiple requests, they should be staggered, because a single receiver cannot receive and process them all at once. Again, this suggests tight coupling between sender and receiver.
As the category indicates, there are multiple receivers processing multiple requests.
We believe that, while each of these methods (in combination) have their purpose within an MSA, the most loosely coupled arrangement of all is when microservices within a distributed application communicate with each other asynchronously, and via multiple receivers. This option implies that there are no strict dependencies between the sender, time of send, protocol, and receiver.
The pub-sub communication method is an elaboration on this latter method. The sender merely sends events — whenever there are events to be sent— and each receiver chooses, asynchronously, which events to receive.
Apache Kafka may be one of the more recent evolutions of pub-sub. Apache Kafka works by passing messages via a publish-subscribe model, where software components called producers publish (append) events in time-order to distributed logs called topics (conceptually a category-named data feed to which records are appended).
Consumers are configured to separately subscribe from these topics by offset (the record number in the topic). This latter idea — the notion that consumers simply decide what they will consume — removes the complexity of having to configure complicated routing rules into the producer or other components of the system at the beginning of the pipe.
We argue that, when asynchronous communication to multiple receivers is required, Apache Kafka is a promising way to go, as it solves the problem of tight-coupling between components and communication, is monitorable, and facilitates breaking up larger components into atomic, granular, independent, reusable services.
Why Apache Kafka?
Routing Rules Configured by Consumer
When the routing rules are configured by the consumer (a feature of pub-sub and Apache Kafka generally), then, as mentioned, there is no need to build additional complexity into the data pipe itself. This makes it possible to decouple components from the message bus (and each other) and develop and test them independently, without worrying about dependencies.
Built-in Support for Asynchronous Messaging
All of the above makes it reasonably simple to decouple components and focus on a specific part of the application. Asynchronous messaging, when used correctly, removes yet another point of complexity by letting your services be ready for events without being synced to them.
High Throughput/Low Latency
It’s easier to have peace of mind about breaking up larger, SOA-type services into smaller, more atomic ones when you don’t have to worry about communication latency issues. Aiven managed Kafka services have been benchmarked and feature the highest throughput and lowest latency of any hosted service in the industry.
Why Managed Apache Kafka?
Apache Kafka was built to leverage the best of what came before while improving on it even more.
However, Apache Kafka can be challenging to set up. There are many options to choose from, and these vary widely depending on whether you are using an open-source, proprietary, free, or paid version (depending on the vendor). What are your future requirements?
If you were choosing a bundled solution, then your choice of version and installation type, for example, may come back to haunt you in the future, depending on the functionality and performance you later decide you need.
These challenges alone may serve as a compelling argument for a managed version. With the deployment, hardware outlay costs and effort and configuration out of your hands, you can focus entirely on the development for which you originally intended your Kafka deployment.
What’s more, managed is monitorable. Are you tracking throughput? You need not worry about where the integration points are in your app to instrument custom logging and monitoring; simply monitor each of your atomic services’ throughput via your provider’s Kafka backend and metrics infrastructure.
What sort of problems can you expect when your application scales? Bottlenecks? Race conditions? A refactoring mess to accommodate for them?
A managed Kafka solution can scale automatically for you when the size of your data stream grows. As such, you needn’t worry when it’s time to refactor your services atomically, and you needn’t force your teams to maintain blob-style, clustered services with complicated dependencies just for the sake of avoiding latency between them.
Centralized, No-Fuss Management
If you’re managing your own cluster, you can expect to be tied down with installs, updates, managing version dependencies, and related issues. A managed solution handles all of that for you, so you can focus on your core business.
Apache Kafka is already known for its high availability, so you never have to worry about your services being unable to communicate because a single node supporting your middleware is down.
Kafka’s ability to handle massive amounts of data and scale automatically lets you scale your data processing capabilities as your data load grows. And a managed solution has redundancy built right in.
We looked at some common challenges of refactoring from SOAs to MSAs: monitorability, tight coupling of components, and the challenges of breaking up larger monolithic or SOA components into microservices.
We considered different types of communication and looked at how a pub-sub mechanism serves asynchronous message transmission between multiple services. Finally, we examined how a managed Apache Kafka solution may be an excellent candidate for such a use case.q