Invoking REST APIs From Java Microservices

Previously, I blogged about how to implement and document REST APIs in Java EE applications with Eclipse MicroProfile. In this article, I describe the inverse scenario — how services can invoke other services via REST APIs over HTTP.

MicroProfile comes with a REST Client that defines a type-safe client programming model. The REST Client makes it easier to convert between JSON data and Java objects in both directions.

There is pretty good documentation about the REST Client available (see below). In this article, I describe how I’ve used the client in my sample application. The application has a Web API service, which implements the BFF (backend for frontend) pattern. The Web API service uses the REST Client to invoke another ‘Authors’ service.

Microservice clients

Get the code of the cloud-native starter application.

First, you need to define the interface of the service you want to invoke.

import javax.ws.rs.GET;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import org.eclipse.microprofile.rest.client.annotation.RegisterProvider;
import com.ibm.webapi.business.Author;
import com.ibm.webapi.business.NonexistentAuthor;

@RegisterProvider(ExceptionMapperAuthors.class)
public interface AuthorsService {
  @GET
  @Produces(MediaType.APPLICATION_JSON)
  public Author getAuthor(String name) throws NonexistentAuthor;
}

The getAuthor method returns an object of the Author class.

public class Author {
   public String name;
   public String twitter;
   public String blog;
}

The actual invocation of the Authors service happens in AuthorsServiceDataAccess.java. The RestClientBuilder is used to get an implementation of the AuthorsService interface. The deserialization of the data into a Java object is done automatically.

import java.net.URL;
import org.eclipse.microprofile.rest.client.RestClientBuilder;
import com.ibm.webapi.business.Author;
import com.ibm.webapi.business.NonexistentAuthor;

public class AuthorsServiceDataAccess {
   static final String BASE_URL = "http://authors/api/v1/";

   public AuthorsServiceDataAccess() {}

   public Author getAuthor(String name) throws NoConnectivity, NonexistentAuthor {
      try {
         URL apiUrl = new URL(BASE_URL + "getauthor?name=" + name);
         // Build a type-safe client for the AuthorsService interface and invoke it
         AuthorsService customRestClient = RestClientBuilder.newBuilder()
                  .baseUrl(apiUrl)
                  .build(AuthorsService.class);
         return customRestClient.getAuthor(name);
      } catch (NonexistentAuthor e) {
         throw new NonexistentAuthor(e);
      } catch (Exception e) {
         throw new NoConnectivity(e);
      }
   }
}

In order to use the RestClientBuilder, you need to understand the concept of the ResponseExceptionMapper. This mapper is used to translate certain HTTP response error codes back into Java exceptions.

import javax.ws.rs.core.MultivaluedMap;
import javax.ws.rs.core.Response;
import javax.ws.rs.ext.Provider;
import org.eclipse.microprofile.rest.client.ext.ResponseExceptionMapper;
import com.ibm.webapi.business.NonexistentAuthor;

@Provider
public class ExceptionMapperAuthors implements ResponseExceptionMapper<NonexistentAuthor> {

   @Override
   public boolean handles(int status, MultivaluedMap<String, Object> headers) {
      // Only handle responses that indicate a missing author
      return status == 204;
   }

   @Override
   public NonexistentAuthor toThrowable(Response response) {
      switch (response.getStatus()) {
         case 204:
            return new NonexistentAuthor();
      }
      return null;
   }
}

Read the following resources to learn more about the MicroProfile REST Client.

Low-Latency Java: Introduction

This is the first article of a multi-part series on low latency programming in Java. At the end of this introductory article, you will have grasped the following concepts:

  • What is latency, and why should I worry about it as a developer?
  • How is latency characterized, and what do percentile numbers mean?
  • What factors contribute to latency?

So, without further ado, let’s begin.

What Is Latency and Why Is it Important?

Latency is simply defined as the time taken for one operation to happen.

Although operation is a rather broad term, what I am referring to here is any behavior of a software system that is worth measuring, where a single run of that kind of operation is observed at some point in time.

For example, in a typical web application, an operation could be submitting a search query from your browser and viewing the results of that query. In a trading application, it could be the automatic dispatching of a Buy or Sell order for a financial instrument to an exchange upon receiving a price tick for it. The less time these operations take, the more they usually benefit the user. Users prefer web applications that do not keep them waiting. Recall that blazing fast searches were what initially gave Google a winning edge over the other search engines prevalent at the time. The faster a trading system reacts to market changes, the higher the probability of a successful trade. Hundreds of trading firms are routinely obsessed with their trading engines having the lowest latency on the street because of the competitive edge they gain from it.

Where the stakes are high enough, lowering latencies can make all the difference between a winning business and a losing one!

How Is Latency Characterized?

Every operation has its own latency. With hundreds of operations, there are hundreds of latency measurements. Hence, we cannot use a single measure like number_of_operations/second or num_of_seconds/operation to describe latency in a system; at best, such a figure describes a single run of that operation, not the spread across all runs.

At first impulse, you may think it then makes sense to quantify latency as an average of all the operations of the same kind that were measured. Bad idea!

The problem with averages? Consider the latency graph shown below.

[Diagram 1: latency measurements against a 60 ms SLA target, with the average well below the target but several outliers above it]

Several measurements (seven, in fact) exceed the SLA target of 60 ms, but the average response time happily shows us to be well within SLA bounds. All the performance outliers lying within the red zone get smoothed over as soon as you start dealing in terms of average response times. Ignoring these outliers is like throwing away the part of the data that matters most to you as a software engineer — this is where you need to focus to tease out performance-related issues in the system. Worse still, chances are that the underlying problems these outliers mask will rear their heads more often than not in a production system under real conditions.

Another thing to note: a lot of latency measurements in practice end up looking like the graph above, with a few heavy random outliers seen every now and then. Latency rarely follows a normal (Gaussian) or Poisson distribution; you see more of a multi-modal shape to latency numbers. That is also why it is not very useful to talk about latency in terms of averages or standard deviations.

Latency is best characterized in percentile terms.

What are percentiles? Consider a population of numbers. The nth percentile (where 0 < n < 100) divides this population into two portions so that the lower portion contains n percent of the data and the upper portion contains (100 – n)% of the data. As such, the two portions will always add up to 100 percent of the data.

For example, the 50th percentile is the point where half of the population falls below and the other half falls above it. This percentile is better known as the median.

Let’s take a few examples in the context of measuring latency.

When I state that the 90th percentile latency is 75 ms, it means that 90 out of 100 operations suffer a delay of at most 75 ms, and the remaining 100 – 90 = 10 operations suffer a delay of at least 75 ms. Note that my statement doesn’t place any upper bound on the maximum latency observed in the system.

Now, if I further add that the 98th percentile latency is 170 ms, it means that 2 out of 100 operations suffer a delay of 170 ms or more.

If I further add that the 99th percentile latency is 313 ms, it implies that 1 out of every 100 operations suffers a substantial delay as compared to the rest.

In fact, many systems exhibit such characteristics where you find that latency numbers sometimes grow significantly, even exponentially as you go higher up the percentile ladder.
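To make the definition concrete, here is a small illustrative sketch (not from the original article) that computes percentile latencies from a list of recorded measurements using the simple nearest-rank method; production-grade tools such as HdrHistogram use more sophisticated approaches.

import java.util.Arrays;

public class PercentileExample {

   // Nearest-rank percentile: the smallest value such that at least
   // n percent of the measurements are less than or equal to it.
   static long percentile(long[] latenciesMillis, double n) {
      long[] sorted = latenciesMillis.clone();
      Arrays.sort(sorted);
      int rank = (int) Math.ceil((n / 100.0) * sorted.length);
      return sorted[Math.max(rank - 1, 0)];
   }

   public static void main(String[] args) {
      long[] latencies = {12, 15, 18, 21, 25, 30, 45, 75, 170, 313};
      System.out.println("50th percentile: " + percentile(latencies, 50) + " ms");
      System.out.println("90th percentile: " + percentile(latencies, 90) + " ms");
      System.out.println("99th percentile: " + percentile(latencies, 99) + " ms");
   }
}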


But why should we worry about long tail latencies? I mean, if only 1 in 100 operations suffers a higher delay, isn’t the system performance good enough?

Well, to gain some perspective, imagine a popular website with 90th, 95th, and 99th percentile latencies of 1, 2, and 25 seconds respectively. If any page on this website crosses a million page views per day, it means that 10,000 of those page views take more than 25 seconds to load — at which point 10,000 users have possibly stifled a yawn, closed their browser, and moved on to other things, or even worse, are now actively relating their dismal user experience on this website to friends and family. An online business can ill afford tail latencies of such a high order.

What Contributes to Latency?

The short answer is: everything!

Latency “hiccups” that lead to its characteristic shape, with random outliers and everything, can be attributed to a number of things like:

  • Hardware Interrupts
  • Network/IO delays
  • Hypervisor Pauses
  • OS activities like rebuilding internal structures, flushing buffers, etc.
  • Context Switches
  • Garbage Collection Pauses

These events are generally random, and they don’t resemble normal distributions themselves.

Also, consider what your running Java program looks like as a high-level schematic:

[Diagram: a running Java application as a stack of layers: hardware, hypervisor/containers, operating system, JVM, and application code]

(The hypervisor/containers are optional layers on bare-metal hardware but very relevant in a virtualised/cloud environment)

Latency reduction is intimately tied to considerations like:

  • The CPU/Cache/Memory architecture
  • JVM architecture and design
  • Application design — concurrency, data structures and algorithms, and caching
  • Networking protocols, etc.

Every layer in the diagram above is as complex as it gets and adds considerably to the surface area of the knowledge and expertise required to optimize and extract the maximum performance possible — that, too, within the ubiquitous constraints of reasonable cost and time.

But then, that is what makes performance engineering so interesting!

Top Three Strategies for Moving From a Monolith to Microservices

One of the primary problems facing enterprises is the problem of moving from monolith to microservices. The larger the enterprise, the bigger their monolithic applications become, and it gets harder to refactor them into a microservices architecture.

Everyone seems to agree on the benefits of microservices. We covered this topic at some length in this post. However, not many seem to agree on how to undertake the migration journey. Too often, the decision-making process turns into a chicken-and-egg problem. The bigger the monolith, the bigger the stakeholders’ and management’s footprint becomes. Too much management often leads to decision paralysis and, ultimately, the journey ends up as a mess.

However, many organizations have successfully managed to make this transition. Often, it is a mix of good leadership and a well-defined strategy that determines success or failure.

Good leadership is often not in the hands of an architect or developer undertaking this journey. However, a strategy is. So, let’s look at some strategies that can help in this journey:

Implement New Functionalities as Services

I know it is hard. But if you’ve decided to transition from a monolith to microservices, you have to follow this strategy. Think of the monolithic system as a hole in the ground. To get out of a hole, you don’t dig more. In other words, you don’t add to your problems.

Often, organizations miss this part completely. They think about a grand migration strategy that will take years. However, business requirements come fast. Due to the lack of budget or time, teams end up implementing those requirements into the monolithic application. The grand migration strategy never starts for whatever reason. And, each addition to the monolith makes the goal-post move further ahead.

In order to get around this, stop increasing the size of the monolith. Don’t implement new features in the monolithic code base. Every new feature or functionality should be implemented as a service. This, in turn, reduces the growth rate of the monolithic application. In other words, new features implemented as services create momentum for the migration. It also helps demonstrate the value of the approach and ensures continuous investment.

Separate Presentation Layer From the Backend

This is an extremely powerful strategy to migrate from a monolith to microservices. A typical enterprise application usually has three layers:

  • Presentation logic that consists of modules implementing the web UI. This tier of the system is responsible for handling HTTP requests and generating HTML pages. In any respectable application, this tier has a substantial amount of code.
  • Business logic that consists of modules handling the business rules. Often, this can be quite complex in an enterprise application.
  • Data access logic that consists of modules handling the persistence of data. In other words, it deals with databases.

Usually, there is a clean separation between presentation logic and business logic. The business tier exposes a set of coarse-grained APIs. Basically, these APIs form an API Layer. In other words, this layer acts as a natural border based on which you can split the monolith. This approach is also known as horizontal slicing.

If done successfully, you’ll end up with smaller applications. One application will handle the presentation. The other application will handle the business and data access logic. There are various data management patterns for microservices that can be explored.

There are advantages to this approach:

  • You can develop, deploy, and scale both applications independently. In other words, UI developers can rapidly introduce changes to the interface. They don’t have to worry about the impact on the backend.
  • You will also have a set of well-defined APIs. Also, these APIs will be remotely accessible for use in other microservices.
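As a rough illustration of such a coarse-grained API boundary (all names below are hypothetical and not from the original article), the business tier of, say, an order-management backend might expose an interface like the following, which the separately deployed presentation application then calls remotely over HTTP:

import java.util.List;

// Hypothetical coarse-grained API exposed by the business/backend application.
// The presentation application calls these operations remotely instead of
// linking against the business logic inside the same deployable.
public interface OrderManagementApi {

   OrderSummary createOrder(CreateOrderRequest request);

   OrderSummary getOrder(String orderId);

   List<OrderSummary> listOrdersForCustomer(String customerId);
}

// Simple transfer types used across the API boundary (each in its own file).
class CreateOrderRequest {
   public String customerId;
   public List<String> productIds;
}

class OrderSummary {
   public String orderId;
   public String status;
}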

Extract Business Functionalities Into Services

This is the pinnacle of moving to a microservices architecture. The previous two strategies will take you only so far. Even if you successfully implement them, you’ll still have a very monolithic code base. Basically, they can act only as a springboard to the real deal.

If you want to make a significant move towards microservices, you need to break apart the monolithic code base. The best way to do this is by breaking up the monolith based on business functionality. Each business functionality is handled by one microservice. Each of these microservices should be independently deployable and scalable. Communication between these services should be implemented using remote API calls or through message brokers.

By using this strategy, over time, the number of business functions implemented as services will grow and your monolith will gradually shrink. This approach is also known as vertical slicing. Basically, you are dividing your domain into vertical slices or functionalities.

Conclusion

Moving from a monolith to microservices is not easy. It requires significant investment and management buy-in. On top of all that, it requires incredible discipline and skill from the team actually developing it. However, the advantages are many.

Often, it is a good idea to start small on this journey. Rather than waiting for a big one-shot move, try to take incremental steps, learn from mistakes, iterate, and try again. Also, don’t try to go for a perfect design from the get-go. Instead, be willing to iterate and improve.

Lastly, remember that microservices is not a destination but a journey. A journey of continuous improvement.

Let me know your thoughts and experiences about this in the comments section below.

Reducing Boilerplate Code in Java POJOs — Eliminating Getters/Setters and Minimizing POJO Mappings

“If you want to avoid writing or even manually auto-generating the Getters and Setters in POJOs (Plain Old Java Objects) and want to get two POJOs mapped with less code, then read on for step-by-step instructions. Skip to Figure 1 for a visual summary.”

Java has faced criticism for being verbose in its syntax from fans of other languages, and it’s quite fair, to be honest. With time, Java has evolved to reduce its syntax verbosity, with milestones from the enhanced for-each loop in Java 5 to try-with-resources in Java 7 to lambda expressions in Java 8. All of these improvements address this issue elegantly and make Java coding a lot easier and more concise. Combine this with Spring Boot and you have a coding environment that’s miles ahead of the Java coding environment from just a few years back in terms of speed of coding!

Still, there are things that a Java programmer has to do over and over again just to get the code complete from a language/syntax point of view rather than spending time on business logic, the actual task at hand. Having to write or even auto-generate Getters and Setters in POJOs is one such issue. Having to map POJOs for data transfer from one layer to another is another cumbersome task that mostly just feels like overhead. For example, the presentation layer sees an entity differently than the service layer, which in turn shapes the same entity differently than the persistence/DB layer. A lot of the time, developers find themselves writing mappings between these POJOs. What is even more painful is that maybe only a few fields differ and most of the rest have the same name in the source and target objects, but we end up having to get and set them all from one object to another.

The above two problems can be addressed quite elegantly using a behind-the-scenes code generator like Project Lombok (introduced to me by my colleague, Carter) and a POJO mapper like MapStruct. This post will go through the example of eliminating the need to write or auto-generate Getters and Setters and of more concise mapping of one POJO to another (rather than manually doing something like target.setField(source.getField())).

Fig 1 displays the summary of this post and compares traditional POJOs and mappings with tool-facilitated lightweight code that achieves the same functionality.

Fig 1. TL;DR — Reducing Code: Summary of What This Post Shows Step by Step Instructions For.

Lombok provides annotations that can be declared on POJOs instead of writing Getters and Setters, Constructors, Loggers, etc. MapStruct provides an automatic way of mapping POJOs, especially ones whose attributes have the same name and type, also using annotations. This can save a lot of development time and lets developers focus on implementing business logic.


Code Walk Through

Following is a step-by-step walk-through to achieve the results in Fig 1. The environment is Eclipse (Spring Tool Suite IDE), Gradle (for building project and running jUnit test-cases), and Java 8 (on Mac OS X, although this shouldn’t matter other than menu buttons walk-through etc.).

(The code in this post will be shared here through GitHub soon. You can then download it from there instead of going through all the steps and use these steps as a reference to read through wherever you need more details.)

  1. Let’s start by creating a new project in Eclipse:

 

Fig 2. Create New Gradle Project

2. Final Settings:

3. The project should look like this, with a default build.gradle file:

4. Next, we’ll change the default package to our own and start adding POJOs to it as follows:

We are assuming a scenario where we have a DTO (Data Transfer Object) POJO on one hand and an Entity POJO on the other (it can be a JPA, Hibernate, or other ORM entity, or even a manual SQL entry POJO), and we need to convert and copy data from one object to the other, bidirectionally.

5. Let’s add FoodSampleDTO and use Eclipse to generate Getters and Setters (we will eventually eliminate this step):

6. And Select All fields and Finish:

This is what FoodSampleDTO looks like:
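(The original screenshot is not reproduced here. As a stand-in, here is a hypothetical version; only the product field is named in the article text, the remaining field names are assumptions, and the Eclipse-generated Getters and Setters are abbreviated.)

public class FoodSampleDTO {
   private ProductDTO product;   // sub-object holding the product id
   private String sampleName;    // assumed field
   private String labName;       // assumed field
   private String result;        // assumed field
   private String sampleDate;    // assumed field

   public ProductDTO getProduct() { return product; }
   public void setProduct(ProductDTO product) { this.product = product; }
   public String getSampleName() { return sampleName; }
   public void setSampleName(String sampleName) { this.sampleName = sampleName; }
   // ... generated getters and setters for the remaining fields
}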

7. Follow same steps to create ProductDTO, which is a sub-object of FoodSampleDTO:
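(Similarly, a hypothetical ProductDTO with its id field:)

public class ProductDTO {
   private String id;

   public ProductDTO() {}
   public ProductDTO(String id) { this.id = id; }
   public String getId() { return id; }
   public void setId(String id) { this.id = id; }
}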

8. Let’s do the same for the POJO for our (superficial) persistence layer, FoodSampleEntity, to get:
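(And a hypothetical FoodSampleEntity, where the product reference is a plain String field rather than a sub-object:)

public class FoodSampleEntity {
   private String productId;     // plain String instead of a ProductDTO sub-object
   private String sampleName;
   private String labName;
   private String result;
   private String sampleDate;

   public String getProductId() { return productId; }
   public void setProductId(String productId) { this.productId = productId; }
   public String getSampleName() { return sampleName; }
   public void setSampleName(String sampleName) { this.sampleName = sampleName; }
   // ... generated getters and setters for the remaining fields
}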

You’ll notice that most of the fields are the same in these POJOs, with the exception of productId, which is a direct String field in FoodSampleEntity while it’s a sub-object in FoodSampleDTO. This is just an example; in the real world, the POJOs will be much larger and the differences can be much greater.

So now we want some mapper class to map data between DTO and Entity POJOs. This mapper will basically get data from one object and set it into another, with any necessary transformations.

9. Add FoodSampleMapper class in …mapper package and add two methods, one to convert DTO to Entity and other to convert Entity to DTO:
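(A hypothetical version of this manual mapper, consistent with the sketches above — the method names match the ones used later for the MapStruct variant and are assumptions for the manual mapper — might look like this:)

public class FoodSampleMapper {

   public FoodSampleEntity mapFoodSampleDTOToEntity(FoodSampleDTO dto) {
      FoodSampleEntity entity = new FoodSampleEntity();
      entity.setSampleName(dto.getSampleName());
      // ... get/set every other shared field the same way
      // the one structural difference: nested ProductDTO id -> flat String
      entity.setProductId(dto.getProduct().getId());
      return entity;
   }

   public FoodSampleDTO mapFoodSampleEntityToDTO(FoodSampleEntity entity) {
      FoodSampleDTO dto = new FoodSampleDTO();
      dto.setSampleName(entity.getSampleName());
      // ... get/set every other shared field the same way
      // flat String -> nested ProductDTO
      dto.setProduct(new ProductDTO(entity.getProductId()));
      return dto;
   }
}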

This is the second place where we want to reduce the manual work. As you can see, even though most fields have same name and type, we still have to get from one object and set into the other one. This can be cumbersome (and boring!) when the POJOs are large and in large number, which is typically the case.

Before looking at how to reduce these, let’s first write and run jUnit test-cases.

10. Under the source folder ‘src/test/java’, add a package …mapper and add a class FoodSampleMapperTests. This will have two methods to test our mapper’s two methods:
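(As a hypothetical, abbreviated sketch of one of the two test methods — test data and assertions are assumptions — using JUnit 4:)

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class FoodSampleMapperTests {

   private final FoodSampleMapper mapper = new FoodSampleMapper();

   @Test
   public void testMapFoodSampleDTOToEntity() {
      String testProductId = "PRD-001";
      FoodSampleDTO dto = new FoodSampleDTO();
      dto.setSampleName("sample-1");
      dto.setProduct(new ProductDTO(testProductId));

      FoodSampleEntity entity = mapper.mapFoodSampleDTOToEntity(dto);

      assertEquals("sample-1", entity.getSampleName());
      assertEquals(testProductId, entity.getProductId());
   }
}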

As you can see, we create our source object, populate some data in it, call our mapper to convert it to the other object and assert to ensure the data got copied and matches our original test data variables.

11. Run jUnit test-cases:

This is all good and business as usual. But it is cumbersome and boring to write and manually auto-generate this code.

Let’s reduce some coding!

The first thing we are going to reduce is Getters and Setters, using Lombok. Lombok provides annotations that identify at compile-time that you need Getters and Setters, and it hooks into the build process to generate the code. The POJOs stay lean (for check-in, check-out, and editing purposes) but get the functionality of Getters and Setters auto-generated for run-time. Let’s set up Lombok.

12. Go to https://projectlombok.org/download and get the latest version of Lombok jar. Instructions for different IDEs and Build Tools can be found under “Install” drop down menu on the top bar at Lombok’s site.

13. Execute this jar to get the installer screen:

14. Select your IDE, hit Install/Update, then Quit Installer and restart your IDE:

15. Following the instructions on Lombok’s page, we’ll add a plugin and dependencies in our build.gradle file (the top section and last two lines, remember to save the file):

16. In Eclipse, right-click the project and select Gradle > Refresh Gradle Project. This should resolve all new dependencies added to the build.gradle file and bring them into Eclipse’s dependencies list. (This is where you might need a Gradle plugin for Eclipse; I have this one, http://marketplace.eclipse.org/content/buildship-gradle-integration, installed through Help > Eclipse Marketplace.) Alternatively, you can use the command line to build Gradle projects (you don’t even need to install Gradle; just use the gradlew files created in the project for commands like “./gradlew clean build”).

If all goes well, you should have Lombok provided annotations available in your project.

17. Remove all Getter and Setter methods and constructors from your POJOs and just annotate the POJO with Getter and Setter annotations:

Repeat this for all POJOs. Our POJOs now look like this:
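(Continuing the hypothetical sketch from above, the Lombok-annotated POJOs might look like this; each class lives in its own file, and field names other than product, productId, and id remain assumptions:)

import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.NonNull;
import lombok.RequiredArgsConstructor;
import lombok.Setter;

@Getter
@Setter
public class FoodSampleDTO {
   private ProductDTO product;
   private String sampleName;
   private String labName;
   private String result;
   private String sampleDate;
}

@Getter
@Setter
@RequiredArgsConstructor
public class ProductDTO {
   @NonNull
   private String id;   // required field: gives us the one-arg constructor used by the tests
}

@Getter
@Setter
@NoArgsConstructor
public class FoodSampleEntity {
   private String productId;
   private String sampleName;
   private String labName;
   private String result;
   private String sampleDate;
}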

Quite slick, eh? We’ve eliminated the need to auto-generate Getters and Setters and even to write constructors. We added a @NoArgsConstructor to FoodSampleEntity as it will be needed for MapStruct to work later on. We also added a @RequiredArgsConstructor to ProductDTO and made id a required field by using @NonNull; this way, we get the one-arg constructor auto-generated for us that’s needed by our existing test-cases: ProductDTO testProduct = new ProductDTO(testProductId);

Now we need to Clean/Build the project and re-run the jUnit test-cases to ensure they pass after these changes. We’ll set up Eclipse to invoke the Gradle build from inside Eclipse instead of going to the terminal for command-line execution.

18. Right click the build.gradle file, select Run As > Run Configurations, Choose Gradle Project in the left pane, click ‘New’ icon on top left of this pane and configure the Gradle build as follows and click Run:

This should result in a successful Gradle build, if all goes well:

The Console View should show something like:

The test task is successfully executed, passing our test-cases. Good!

Now, it’s time to get rid of those boring getFromSomeObject and setToAnotherObject mappings and replace them with simple annotations (and only where field name or type differ! Otherwise, no mapping needs to be written at all, as is the case for four of our five fields in these POJOs).

So let’s set up MapStruct!

19. All we have to do is add MapStruct dependencies and Refresh Gradle Project:

20. Let’s create an interface ‘FoodSampleAutoMapper’ next to our (manual) mapper ‘FoodSampleMapper’, and annotate it with @Mapper (should be available if above dependencies have been successfully resolved):
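(A sketch of what this interface might look like; the productId mapping is intentionally not handled yet, which is why the tests fail in step 22:)

import org.mapstruct.Mapper;
import org.mapstruct.factory.Mappers;

@Mapper
public interface FoodSampleAutoMapper {

   // MapStruct generates an implementation class for this interface at build time
   FoodSampleAutoMapper INSTANCE = Mappers.getMapper(FoodSampleAutoMapper.class);

   FoodSampleEntity mapFoodSampleDTOToEntity(FoodSampleDTO foodSampleDTO);

   FoodSampleDTO mapFoodSampleEntityToDTO(FoodSampleEntity foodSampleEntity);
}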

We added two (abstract) methods to this interface, one for conversion of DTO to Entity and the other for Entity to DTO. This small interface definition is all that’s needed to copy all fields with the same name and type from one POJO to another! Incredible!

We’ll get to the fields that don’t have the same name and type (like productId in our case) shortly. We also added an INSTANCE variable that refers to an actual implementation class that implements this interface. But we don’t have to write that class; MapStruct will auto-generate this impl class for us behind the scenes and make it available at run-time, much like Lombok does its magic. We don’t have to use the INSTANCE variable either; we can obtain the impl class in line with how our project wires up classes (e.g. if you use dependency injection with Spring, the @Mapper will need to be changed to @Mapper(componentModel = “spring”), and this way the generated class will have an @Component annotation for Spring’s initialization to detect and make it available for DI).

21. We can simply switch our mapper to this interface. We’ll add another test class instead of modifying the existing one: just copy/paste the FoodSampleMapperTest class to a FoodSampleAutoMapperTest class and change one line of code to fetch the new mapper:

//Convert the DTO to entity object

FoodSampleEntity foodSampleEntity = FoodSampleAutoMapper.INSTANCE.mapFoodSampleDTOToEntity(foodSampleDTO);

We are using our interface’s INSTANCE field, which has the implementation class looked up by the Mappers.getMapper(…) call, and this mapper class is then available for our use. We can invoke the mapFoodSampleDTOToEntity and mapFoodSampleEntityToDTO methods on the INSTANCE variable.

There’s no change in our test-cases except for the above line:

22. Let’s try to run the unit-tests, which should fail. (Just go to the Gradle Executions View (use Window > View > Other > (type) Gradle if it doesn’t show up) and press the re-run icon to execute the build again. Alternatively, right-click the project, select Run As > Run Configurations, select the Gradle Project config created in step 18, and click Run.)

As you can see, our previous test-cases are passing, but the new ones failed. The console on the right shows two warnings: there is an unmapped target property ‘productId’ in the first mapper method and ‘product’ in the other one. This is because it is the only field that does not have the same name and type in both POJOs, so MapStruct doesn’t know how to convert it; it ignores the field and just prints a warning. If we didn’t need this field, we could ignore the warning, but our test-cases expect productId to be converted as well, hence the actual failure of our test-cases, throwing a NullPointerException and an AssertionError in lines 71 and 40, respectively, of our FoodSampleAutoMapperTest class. This is expected since we didn’t tell MapStruct how to map this field.

23. Now, let’s tell MapStruct how to map fields that are not that straightforward. Let’s use the powerful @Mapping annotation with its source and target attributes to tell MapStruct how to convert productId to product.id:
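(Applied to the sketch above, the two @Mapping lines might look like this; method and parameter names are carried over from the earlier sketch:)

import org.mapstruct.Mapper;
import org.mapstruct.Mapping;
import org.mapstruct.factory.Mappers;

@Mapper
public interface FoodSampleAutoMapper {

   FoodSampleAutoMapper INSTANCE = Mappers.getMapper(FoodSampleAutoMapper.class);

   // nested product.id on the DTO side maps to the flat productId on the entity side
   @Mapping(source = "product.id", target = "productId")
   FoodSampleEntity mapFoodSampleDTOToEntity(FoodSampleDTO foodSampleDTO);

   @Mapping(source = "productId", target = "product.id")
   FoodSampleDTO mapFoodSampleEntityToDTO(FoodSampleEntity foodSampleEntity);
}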

We are telling MapStruct that, for the mapFoodSampleDTOToEntity method, when it is looking for productId (which is a String in the FoodSampleEntity target object), it should simply get the value from the source foodSampleDTO’s product field’s id field. And vice versa in mapFoodSampleEntityToDTO. Please note that we can add as many @Mapping lines as we need to specify the differences for each method; here we only have one difference per method, and therefore only one @Mapping line.

In our example, we are just showing a difference in structure, but in typical scenarios you will also find differences in types, etc., which will need more than just a source/target mapping (see MapStruct’s excellent documentation for those cases; e.g. you can use expression and provide a line of Java code which will be placed as-is in the generated class, providing a concise but powerful touch to these auto-mappings). You can also use dateFormat, and MapStruct will automatically write the date-parsing code, including handling of parsing exceptions. It can also do null checks, and that strategy is also customizable; again, refer to the MapStruct documentation.

24. Run the test-cases again with the above change, and they should pass with flying colors:


Conclusion

So there you have it: we have successfully reduced the code from Getters, Setters, and manual mappings to a few mere annotations:

Please let me know in the comments if you found this post helpful, or if you have any questions, issues, or feedback.

Happy programming!

One Method to Rule Them All: Map.merge()

I don’t often explain a single method in the JDK, but when I do, it’s about Map.merge(). This is probably the most versatile operation in the key-value universe. But it’s also rather obscure and rarely used.

merge() can be explained as follows: it either puts a new value under the given key (if absent) or updates the existing key with a given value (an UPSERT). Let’s start with the most basic example: counting unique word occurrences. Pre-Java 8 (read: pre-2014!) code was quite messy, and the essence was lost in implementation details:

var map = new HashMap<String, Integer>();
words.forEach(word -> {
    var prev = map.get(word);
    if (prev == null) {
        map.put(word, 1);
    } else {
        map.put(word, prev + 1);
    }
});

However, it works, and for the given input, it produces the desired output:

var words = List.of("Foo", "Bar", "Foo", "Buzz", "Foo", "Buzz", "Fizz", "Fizz");
//...
{Bar=1, Fizz=2, Foo=3, Buzz=2}

OK, but let’s try to refactor it to avoid conditional logic:

words.forEach(word -> {
    map.putIfAbsent(word, 0);
    map.put(word, map.get(word) + 1);
});

That’s nice!

putIfAbsent() is a necessary evil; otherwise, the code breaks on the first occurrence of a previously unknown word. Also, I find map.get(word) inside map.put() to be a bit awkward. Let’s get rid of it as well!

words.forEach(word -> {
    map.putIfAbsent(word, 0);
    map.computeIfPresent(word, (w, prev) -> prev + 1);
});

computeIfPresent() invokes the given transformation only if the key in question (word) exists. Otherwise, it does nothing. We made sure the key exists by initializing it to zero, so incrementation always works. Can we do better? Well, we can cut the extra initialization, but I wouldn’t recommend it:

words.forEach(word ->
        map.compute(word, (w, prev) -> prev != null ? prev + 1 : 1)
);

compute() is like computeIfPresent(), but it is invoked irrespective of the existence of the given key. If the value for the key does not exist, the prev argument is null. Moving a simple if to a ternary expression hidden in a lambda is far from optimal. This is where the merge() operator shines. Before I show you the final version, let’s see a slightly simplified default implementation of Map.merge():

default V merge(K key, V value, BiFunction<V, V, V> remappingFunction) {
    V oldValue = get(key);
    V newValue = (oldValue == null) ? value :
               remappingFunction.apply(oldValue, value);
    if (newValue == null) {
        remove(key);
    } else {
        put(key, newValue);
    }
    return newValue;
}

The code snippet is worth a thousand words. merge() works in two scenarios. If the given key is not present, it simply becomes put(key, value). However, if the said key already holds some value, our remappingFunction may merge (duh!) the old and the new one. This function is free to:

  • overwrite old value by simply returning the new one: (old, new) -> new
  • keep the old value by simply returning the old one: (old, new) -> old
  • somehow merge the two, e.g.: (old, new) -> old + new
  • or even remove old value: (old, new) -> null

As you can see, merge() is quite versatile. So, what does our academic problem look like with merge()? It’s quite pleasing:

words.forEach(word ->
        map.merge(word, 1, (prev, one) -> prev + one)
);

You can read it as follows: put 1 under the word key if absent; otherwise, add 1 to the existing value. I named one of the parameters “one” because in our example it’s always… 1.

Sadly, remappingFunction takes two parameters, where the second one is the value we are about to upsert (insert or update). Technically, we know this value already, so (word, 1, prev -> prev + 1) would be much easier to digest. But there’s no such API.

All right, but is merge() really useful? Imagine you have an account operation (constructor, getters, and other useful properties omitted):

class Operation {
    private final String accNo;
    private final BigDecimal amount;
}

And a bunch of operations for different accounts:

var operations = List.of(
    new Operation("123", new BigDecimal("10")),
    new Operation("456", new BigDecimal("1200")),
    new Operation("123", new BigDecimal("-4")),
    new Operation("123", new BigDecimal("8")),
    new Operation("456", new BigDecimal("800")),
    new Operation("456", new BigDecimal("-1500")),
    new Operation("123", new BigDecimal("2")),
    new Operation("123", new BigDecimal("-6.5")),
    new Operation("456", new BigDecimal("-600"))
);

We would like to compute the balance (the total of the operations’ amounts) for each account. Without merge(), this is quite cumbersome:

var balances = new HashMap<String, BigDecimal>();
operations.forEach(op -> {
    var key = op.getAccNo();
    balances.putIfAbsent(key, BigDecimal.ZERO);
    balances.computeIfPresent(key, (accNo, prev) -> prev.add(op.getAmount()));
});

But with a little help of merge():

operations.forEach(op ->
        balances.merge(op.getAccNo(), op.getAmount(), 
                (soFar, amount) -> soFar.add(amount))
);

Do you see a method reference opportunity here?

operations.forEach(op ->
        balances.merge(op.getAccNo(), op.getAmount(), BigDecimal::add)
);

I find this astoundingly readable. For each operation, add the given amount to the given accNo. The results are as expected:

{123=9.5, 456=-100}

ConcurrentHashMap

Map.merge() shines even brighter when you realize it’s properly implemented in ConcurrentHashMap. This means we can atomically perform an insert-or-update operation — single line and thread-safe.

ConcurrentHashMap is obviously thread-safe, but not across many operations, e.g. get() and then put(). However, merge() makes sure no updates are lost.
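As a small illustrative sketch (not from the original article), here is the word-count example again, this time with two threads updating a shared ConcurrentHashMap through merge(); because merge() is atomic on ConcurrentHashMap, no increments are lost:

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentMergeExample {

    public static void main(String[] args) throws InterruptedException {
        var words = List.of("Foo", "Bar", "Foo", "Buzz", "Foo", "Buzz", "Fizz", "Fizz");
        Map<String, Integer> counts = new ConcurrentHashMap<>();

        // Each thread counts the same list; merge() performs the
        // insert-or-update atomically, so concurrent updates are not lost.
        Runnable task = () -> words.forEach(word -> counts.merge(word, 1, Integer::sum));

        var t1 = new Thread(task);
        var t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        // Every count is exactly twice the single-threaded result
        System.out.println(counts);
    }
}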

Life Beyond Java 8

New versions of Java are coming out every six months. What has changed, should we upgrade, and if so, how?

Abstract

Wasn’t Java 8 a fantastic update to the language? Lambdas and streams were a huge change and have helped to improve Java developers’ productivity and introduce some functional ideas to the language. Then came Java 9… and although the module system is really interesting for certain types of applications, the lack of exciting language features and uncertainty around how painful it might be to migrate to Java 9 left many applications taking a wait-and-see approach, happy with Java 8.

But now, Java has a new version every six months, and suddenly, Java 12 is here. But we’re all still on Java 8, wondering whether we should move to a later version, which one to choose, and how painful it might be to upgrade.

In this session, we’ll look at:

  • Why upgrade from Java 8, including language features from Java 9, 10, 11, and 12 (a few of these are sketched in code after this list)
  • What sorts of issues might we run into if we do choose to upgrade
  • How the support and license changes that came in with Java 11 might impact us.
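As a brief, hedged sketch of a few post-Java 8 additions touched on above (illustrative code, not from the original session): collection factory methods arrived in Java 9, local-variable type inference with var in Java 10, and new String convenience methods in Java 11.

import java.util.List;
import java.util.Map;

public class BeyondJava8 {

    public static void main(String[] args) {
        // Java 9: compact collection factory methods
        List<String> releases = List.of("9", "10", "11", "12");
        Map<String, Integer> cadence = Map.of("monthsBetweenReleases", 6);

        // Java 10: local-variable type inference
        var message = "New Java versions arrive every "
                + cadence.get("monthsBetweenReleases") + " months";

        // Java 11: new String convenience methods
        if (!message.isBlank()) {
            "multi\nline\ntext".lines().forEach(System.out::println);
            System.out.println(" trimmed ".strip() + ": " + releases);
        }
    }
}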

Resources

Updates, Licenses, and Support

Where to Get Your JDK From

Migrating From Java 8

Features

Java 11

Java 10

Java 9

Java 12

Java Future

Performance

Garbage Collectors

String Performance

Other

How Are Your Microservices Talking?

In this piece, which originally appeared here, we’ll look at the challenges of refactoring SOAs to MSAs, in light of different communication types between microservices, and see how pub-sub message transmission — as a managed Apache Kafka Service — can mitigate or even eliminate these challenges.

If you’ve developed or updated any kind of cloud-based application in the last few years, chances are you’ve done so using a Microservices Architecture (MSA), rather than the slightly more dated Service-Oriented Architecture (SOA). So, what’s the difference?

As Jeff Myerson wrote:

“If Uber were built with an SOA, their services might be:

GetPaymentsAndDriverInformationAndMappingDataAPI
AuthenticateUsersAndDriversAPI

“Whereas, if Uber were built instead with microservices, their APIs might be more like:

SubmitPaymentsService
GetDriverInfoService
GetMappingDataService
AuthenticateUserService
AuthenticateDriverService

“More APIs, smaller sets of responsibilities.”

With smaller sets of responsibilities for each service, it’s easier to isolate functionality. But what are microservices and how do MSAs compare to SOAs and monolithic applications?

What Are Microservices?

Simply put, microservices are a software development method where applications are structured as loosely coupled services. The services themselves are minimal, atomic units which, together, comprise the entire functionality of the app. Whereas in an SOA, a single component service may combine one or several functions, a microservice within an MSA does one thing — only one thing — and does it well.

Microservices can be thought of as minimal units of functionality, can be deployed independently, are reusable, and communicate with each other via various network protocols like HTTP (more on that in a moment).

Today, most cloud-based applications that expose a REST API are built on microservices (or may actually be one themselves). These architectures are called Microservice Architectures, or MSAs.

On the continuum from single-unit, monolithic applications to coarse-grained service-oriented architectures, MSAs offer the finest granularity, with a number of specialized, atomic services supporting the application.

[Diagram: Microservices vs. SOA vs. monolithic. Source: Samarpit Tuli’s Quora answer, “What’s the difference between SOAs and Microservices?”]

Some Challenges

From this, one starts to get a sense of how asynchronous communication at scale could serve as a benefit in the context of apps that pull and combine data from several APIs. Still, while most organizations are considering implementing their applications as MSAs — or already have — the task, especially when refactoring from SOAs or monoliths, is not exactly straightforward.

For example, in one survey, 37% of respondents building web apps reported that monitoring was a significant issue. Why?

Some clues can be seen in some of the challenges cited by those refactoring legacy apps to MSAs — overcoming tight coupling was cited by 28% of respondents, whereas finding where to break up monolithic components was cited by almost as many.

These types of responses suggest a few different, but actually related, conclusions:

  1. Monitoring services built on MSAs is more complicated (as opposed to SOAs or Monolithic apps) because of multiple points of failure (which exist potentially everywhere a service integrates with another).
  2. Tight coupling suggests components and protocols are inflexibly integrated point-to-point in monolithic apps or SOAs (making them difficult to maintain and build functionality around).
  3. Breaking up monolithic apps or large SOA components into atomic, independent, reusable microservices is challenging for exactly those first two reasons.

Also, what sort of problems can one expect when your application scales? We’ll look at these and suggest a solution below. But there’s one question that underlies all of the above concerns: Once we do manage to break up our apps into atomic services, what’s the best way for these services to communicate with each other?

Some Microservices Communication Patterns

In her article “Introduction to Microservices Messaging Protocols,” Sarah Roman provides an excellent breakdown of the taxonomy of communication patterns used by and between microservices:

Synchronous

Synchronous communication is when the sender of the event waits for processing and some kind of reply, and only then proceeds to other tasks. This is typically implemented as REST calls, where the sender submits an HTTP request, and then the service processes this and returns an HTTP response. Synchronous communication suggests tight coupling between services.

Asynchronous

Asynchronous communication means that a service doesn’t need to wait on another to conclude its current task. A sender doesn’t necessarily wait for a response, but either polls for results later or registers a callback function or action. This is typically done over message buses like Apache Kafka and/or RabbitMQ. Asynchronous communication actually encourages loose coupling between component services, because there need be no time dependencies between sending events and a receiver acting on them.

Single Receiver

In this case, each request has one sender and one receiver. If there are multiple requests, they should be staggered, because a single receiver cannot receive and process them all at once. Again, this suggests tight coupling between sender and receiver.

Multiple Receivers

As the category indicates, there are multiple receivers processing multiple requests.

We believe that, while each of these methods (in combination) have their purpose within an MSA, the most loosely coupled arrangement of all is when microservices within a distributed application communicate with each other asynchronously, and via multiple receivers. This option implies that there are no strict dependencies between the sender, time of send, protocol, and receiver.

Pub-Sub

The pub-sub communication method is an elaboration on this latter method. The sender merely sends events — whenever there are events to be sent — and each receiver chooses, asynchronously, which events to receive.

Apache Kafka may be one of the more recent evolutions of pub-sub. Apache Kafka works by passing messages via a publish-subscribe model, where software components called producers publish (append) events in time-order to distributed logs called topics (conceptually a category-named data feed to which records are appended).

Consumers are configured to subscribe to these topics separately and read from them by offset (the record number in the topic). This latter idea — the notion that consumers simply decide what they will consume — removes the complexity of having to configure complicated routing rules into the producer or other components of the system at the beginning of the pipe.

[Diagram: Apache Kafka producers writing to, and consumers reading from, a topic]
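To make this concrete, here is a minimal, hedged sketch using the Kafka Java client (the broker address, topic name, and consumer group are assumptions): a producer appends a record to a topic, and a consumer subscribes to the topic and reads records from it.

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PubSubSketch {

    public static void main(String[] args) {
        // Producer: publish (append) an event to the hypothetical "orders" topic
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (var producer = new KafkaProducer<String, String>(producerProps)) {
            producer.send(new ProducerRecord<>("orders", "order-123", "{\"amount\": 10}"));
        }

        // Consumer: subscribe to the same topic and poll for records
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "billing-service");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (var consumer = new KafkaConsumer<String, String>(consumerProps)) {
            consumer.subscribe(List.of("orders"));
            var records = consumer.poll(Duration.ofSeconds(1));
            records.forEach(r -> System.out.println(r.offset() + ": " + r.value()));
        }
    }
}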

We argue that, when asynchronous communication to multiple receivers is required, Apache Kafka is a promising way to go, as it solves the problem of tight-coupling between components and communication, is monitorable, and facilitates breaking up larger components into atomic, granular, independent, reusable services.

Why Apache Kafka?

Routing Rules Configured by Consumer

When the routing rules are configured by the consumer (a feature of pub-sub and Apache Kafka generally), then, as mentioned, there is no need to build additional complexity into the data pipe itself. This makes it possible to decouple components from the message bus (and each other) and develop and test them independently, without worrying about dependencies.

Built-in Support for Asynchronous Messaging

All of the above makes it reasonably simple to decouple components and focus on a specific part of the application. Asynchronous messaging, when used correctly, removes yet another point of complexity by letting your services be ready for events without being synced to them.

High Throughput/Low Latency

It’s easier to have peace of mind about breaking up larger, SOA-type services into smaller, more atomic ones when you don’t have to worry about communication latency issues. Aiven managed Kafka services have been benchmarked and feature the highest throughput and lowest latency of any hosted service in the industry.

Why Managed Apache Kafka?

Apache Kafka was built to leverage the best of what came before while improving on it even more.

However, Apache Kafka can be challenging to set up. There are many options to choose from, and these vary widely depending on whether you are using an open-source, proprietary, free, or paid version (depending on the vendor). What are your future requirements?

If you were choosing a bundled solution, then your choice of version and installation type, for example, may come back to haunt you in the future, depending on the functionality and performance you later decide you need.

These challenges alone may serve as a compelling argument for a managed version. With deployment, hardware outlay costs, and configuration effort out of your hands, you can focus entirely on the development for which you originally intended your Kafka deployment.

What’s more, managed is monitorable. Are you tracking throughput? You need not worry about where the integration points are in your app to instrument custom logging and monitoring; simply monitor each of your atomic services’ throughput via your provider’s Kafka backend and metrics infrastructure.

Auto-Scaling

What sort of problems can you expect when your application scales? Bottlenecks? Race conditions? A refactoring mess to accommodate for them?

A managed Kafka solution can scale automatically for you when the size of your data stream grows. As such, you needn’t worry when it’s time to refactor your services atomically, and you needn’t force your teams to maintain blob-style, clustered services with complicated dependencies just for the sake of avoiding latency between them.

Centralized, No-Fuss Management

If you’re managing your own cluster, you can expect to be tied down with installs, updates, managing version dependencies, and related issues. A managed solution handles all of that for you, so you can focus on your core business.

High Availability

Apache Kafka is already known for its high availability, so you never have to worry about your services being unable to communicate because a single node supporting your middleware is down.

Kafka’s ability to handle massive amounts of data and scale automatically lets you scale your data processing capabilities as your data load grows. And a managed solution has redundancy built right in.

Wrapping Up

We looked at some common challenges of refactoring from SOAs to MSAs: monitorability, tight coupling of components, and the challenges of breaking up larger monolithic or SOA components into microservices.

We considered different types of communication and looked at how a pub-sub mechanism serves asynchronous message transmission between multiple services. Finally, we examined how a managed Apache Kafka solution may be an excellent candidate for such a use case.