50 Common Java Errors and How to Avoid Them

There are many types of errors that could be encountered while developing Java software, but most are avoidable. We’ve rounded up 50 of the most common Java software errors, complete with code examples and tutorials to help you work around common coding problems.

For more tips and tricks for coding better Java programs, download our Comprehensive Java Developer’s Guide, which is jam-packed with everything you need to up your Java game – from tools to the best websites and blogs, YouTube channels, Twitter influencers, LinkedIn groups, podcasts, must-attend events, and more.

If you’re working with .NET, you should also check out our guide to the 50 most common .NET software errors and how to avoid them. But if your current challenges are Java-related, read on to learn about the most common issues and their workarounds.

Compiler Errors

Compiler error messages are produced when Java source code is run through the compiler. Keep in mind that a single mistake can produce many error messages, so fix the first error and recompile. That alone may resolve many of the others.

1. “… Expected”

This error occurs when something is missing from the code. Often this is created by a missing semicolon or closing parenthesis.

private static double volume(String solidom, double alturam, double areaBasem, double raiom) {
    double vol;
    if (solidom.equalsIgnoreCase("esfera"){ // missing ")" before "{" triggers the error
        vol = (4.0 / 3) * Math.PI * Math.pow(raiom, 3);
    }
    else {
        if (solidom.equalsIgnoreCase("cilindro") { // same mistake: missing ")"
            vol = Math.PI * Math.pow(raiom, 2) * alturam;
        }
        else {
            vol = (1.0 / 3) * Math.PI * Math.pow(raiom, 2) * alturam;
        }
    }
    return vol;
}

Often this error message does not pinpoint the exact location of the issue. To find it:

  • Make sure every opening parenthesis has a corresponding closing parenthesis.
  • Look at the line before the one the compiler indicates. The compiler often doesn’t notice this error until further along in the code.
  • Sometimes a character such as an opening parenthesis shouldn’t be in the Java code in the first place, so the developer never added a closing parenthesis to balance it.

Check out an example of how a missed parenthesis can create an error (@StackOverflow).

2. “Unclosed String Literal”

The “unclosed string literal” error message is created when a string literal ends without a closing quotation mark, and the message will appear on the same line as the error. (@DreamInCode) A literal is the source code representation of a value.

 public abstract class NFLPlayersReference {
 private static Runningback[] nflplayersreference;
 private static Quarterback[] players;
 private static WideReceiver[] nflplayers;
 public static void main(String args[]){
 Runningback r = new Runningback("Thomlinsion");
 Quarterback q = new Quarterback("Tom Brady");
 WideReceiver w = new WideReceiver("Steve Smith");
 NFLPlayersReference[] NFLPlayersReference;
 Run();// {
 NFLPlayersReference = new NFLPlayersReference [3];
 nflplayersreference[0] = r;
 players[1] = q;
 nflplayers[2] = w;
 for ( int i = 0; i < nflplayersreference.length; i++ ) {
 System.out.println("My name is " + " nflplayersreference[i].getName());
 nflplayersreference[i].run();
 nflplayersreference[i].run();
 nflplayersreference[i].run();
 System.out.println("NFL offensive threats have great running abilities!");
  }
  }
  private static void Run() {
  System.out.println("Not yet implemented");
  } 
}

Commonly, this happens when:

  • The string literal does not end with quote marks. This is easy to correct by closing the string literal with the needed quote mark.
  • The string literal extends beyond a line. Long string literals can be broken into multiple literals and concatenated with a plus sign (“+”).
  • Quote marks that are part of the string literal are not escaped with a backslash (“\”).

Read a discussion of the unclosed string literal Java software error message. (@Quora)

3. “Illegal Start of an Expression”

There are numerous reasons why an “illegal start of an expression” error occurs, which makes it one of the less helpful error messages. It usually points to a structural mistake in the code, such as a misplaced brace.

Usually, expressions are created to produce a new value or assign a value to a variable. The compiler expects to find an expression and cannot find it because the syntax does not match expectations. (@StackOverflow) It is in these statements that the error can be found.

} // ADD IT HERE
  public void newShape(String shape) {
  switch (shape) {
  case "Line":
  Shape line = new Line(startX, startY, endX, endY);
  shapes.add(line);
  break;
  case "Oval":
  Shape oval = new Oval(startX, startY, endX, endY);
  shapes.add(oval);
  break;
  case "Rectangle":
  Shape rectangle = new Rectangle(startX, startY, endX, endY);
  shapes.add(rectangle);
  break;
  default:
  System.out.println("ERROR. Check logic.");
  }
  }
  } // REMOVE IT FROM HERE
  }

Browse discussions of how to troubleshoot the “illegal start of an expression” error. (@StackOverflow)

4. “Cannot Find Symbol”

This is a very common issue because all identifiers in Java need to be declared before they are used. When the code is being compiled, the compiler does not understand what the identifier means.

There are many reasons you might receive the “cannot find symbol” message:

  • The spelling of the identifier when declared may not be the same as when it is used in the code.
  • The variable was never declared.
  • The variable is not being used in the same scope it was declared.
  • The class was not imported.
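As a minimal sketch of the scope case from the list above (the class and variable names are illustrative, not taken from the linked discussion), the commented-out declaration would make the return statement fail with “cannot find symbol,” because the variable would only exist inside the loop. Declaring it before the loop keeps it in scope:

```java
import java.util.Arrays;
import java.util.List;

class SymbolScopeDemo {
    // Sums a list; total is declared before the loop so it is still
    // in scope when the method returns it.
    static int sum(List<Integer> values) {
        int total = 0; // declared in the method scope
        for (int v : values) {
            // Declaring "int total = 0;" here instead would make the
            // "return total;" below fail with "cannot find symbol".
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3))); // prints 6
    }
}
```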

Read a thorough discussion of the “cannot find symbol” error and examples of code that create this issue. (@StackOverflow)

5. “Public Class XXX Should Be in File”

The “public class XXX should be in file” message occurs when the name of the public class XXX and the name of the Java source file do not match. The code compiles only when the public class and its file share the same name (@coderanch):

package javaapplication3; 
  public class Robot { 
  int xlocation; 
  int ylocation; 
  String name; 
  static int ccount = 0; 
   public Robot(int xxlocation, int yylocation, String nname) { 
  xlocation = xxlocation; 
  ylocation = yylocation; 
  name = nname; 
  ccount++; 
  } 
  }
  public class JavaApplication1 { 
  public static void main(String[] args) { 
  robot firstRobot = new Robot(34,51,"yossi"); 
  System.out.println("numebr of robots is now " + Robot.ccount); 
  }
  }

To fix this issue:

  • Name the class and file the same.
  • Make sure the case of both names is consistent.

See an example of the “Public class XXX should be in file” error. (@StackOverflow)

6. “Incompatible Types”

“Incompatible types” is a type error that occurs when an assignment statement tries to pair a variable with an expression of an incompatible type. It often appears when the code tries to place a text string into an integer variable, or vice versa. This is not a Java syntax error. (@StackOverflow)

test.java:78: error: incompatible types
return stringBuilder.toString();
^
required: int
found: String
1 error

There is no one-size-fits-all fix when the compiler gives an “incompatible types” message:

  • There are functions that can convert between types, such as Integer.parseInt().
  • The developer may need to change what the code is expected to do.
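The conversion option can be sketched roughly as follows. The class and method names are made up for illustration; Integer.parseInt and String.valueOf are standard library calls:

```java
class IncompatibleTypesDemo {
    // Converts text to an int explicitly; a plain assignment such as
    // "int n = text;" would be rejected with "incompatible types".
    static int toInt(String text) {
        return Integer.parseInt(text.trim());
    }

    // Converts an int back to text.
    static String toText(int number) {
        return String.valueOf(number);
    }

    public static void main(String[] args) {
        int n = toInt(" 42 ");
        System.out.println(n + 1);         // prints 43 (numeric addition)
        System.out.println(toText(n) + 1); // prints 421 (string concatenation)
    }
}
```

Note that Integer.parseInt throws a NumberFormatException at run time if the text is not a valid integer, so the conversion moves the check from compile time to run time.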

Check out an example of how trying to assign a string to an integer created the “incompatible types” error. (@StackOverflow)

7. “Invalid Method Declaration; Return Type Required”

This Java software error message means the return type of a method was not explicitly stated in the method signature.

public class Circle
 {
  private double radius;
  public CircleR(double r)
  {
  radius = r;
  }
  public diameter()
  {
  double d = radius * 2;
  return d;
  }
 }

There are a few ways to trigger the “invalid method declaration; return type required” error:

  • Forgetting to state the return type.
  • Omitting “void”: if the method does not return a value, “void” needs to be stated as the return type in the method signature.
  • Misnaming a constructor: constructors do not state a return type, but if the constructor name does not match the class name, the compiler treats it as a method without a stated return type.

Follow an example of how constructor naming triggered the “invalid method declaration; return type required” issue. (@StackOverflow)

8. “Method <X> in Class <Y> Cannot Be Applied to Given Types”

This Java software error message is one of the more helpful error messages. It explains how the method call is passing arguments that do not match the method’s declared parameters.

RandomNumbers.java:9: error: method generateNumbers in class RandomNumbers cannot be applied to given types;
 generateNumbers();
 required: int[]
 found:generateNumbers();
 reason: actual and formal argument lists differ in length

The method called is expecting certain arguments defined in the method’s declaration. Check the method declaration and call carefully to make sure they are compatible.
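A rough sketch of the mismatch follows. The method body is hypothetical and only modeled on the error output above, which says generateNumbers requires an int[]:

```java
class RandomNumbersDemo {
    // Hypothetical method: it declares one int[] parameter.
    static int[] generateNumbers(int[] numbers) {
        for (int i = 0; i < numbers.length; i++) {
            numbers[i] = i * 2; // deterministic stand-in for random values
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Calling generateNumbers() with no arguments would not compile:
        // the actual and formal argument lists differ in length.
        int[] result = generateNumbers(new int[3]); // matches the declaration
        System.out.println(result.length); // prints 3
    }
}
```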

This discussion illustrates how a Java software error message identifies the incompatibility created by arguments in the method declaration and method call. (@StackOverflow)

9. “Missing Return Statement”

The “missing return statement” message occurs when a method that declares a return type does not have a return statement on every code path. Each method that returns a value (a non-void type) must have a statement that literally returns that value so the caller can use it.

public String[] OpenFile() throws IOException { 
Map<String, Double> map = new HashMap(); 
FileReader fr = new FileReader("money.txt"); 
BufferedReader br = new BufferedReader(fr); 
try{ 
  while (br.ready()){ 
  String str = br.readLine(); 
  String[] list = str.split(" "); 
  System.out.println(list);                
  } 
}catch(IOException e){ 
System.err.println("Error - IOException!"); 
}
}

There are a couple of reasons why a compiler throws the “missing return statement” message:

  • A return statement was simply omitted by mistake.
  • The method did not return any value but type void was not declared in the method signature.

Check out an example of how to fix the “missing return statement” Java software error. (@StackOverflow)

10. “Possible Loss of Precision”

“Possible loss of precision” occurs when more information is assigned to a variable than it can hold. If the assignment were allowed, part of the value would be discarded. If that is acceptable, the code needs an explicit cast to the target type.

A “possible loss of precision” error commonly occurs when:

  • Trying to assign a real number to a variable with an integer data type.
  • Trying to assign a double to a variable with an integer data type.
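As a small sketch (names are illustrative), an explicit cast tells the compiler that the loss is intended. Newer compilers word the same problem as a possible lossy conversion:

```java
class PrecisionDemo {
    // Explicit cast: the fractional part is discarded (truncated toward zero).
    static int truncate(double value) {
        return (int) value;
    }

    // Rounds to the nearest whole number before narrowing to int.
    static int round(double value) {
        return (int) Math.round(value);
    }

    public static void main(String[] args) {
        double price = 9.99;
        // "int whole = price;" would not compile without a cast.
        System.out.println(truncate(price)); // prints 9
        System.out.println(round(price));    // prints 10
    }
}
```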

This explanation of Primitive Data Types in Java shows how the data is characterized. (@Oracle)

11. “Reached End of File While Parsing”

This error message usually occurs in Java when the program is missing the closing curly brace (“}”). Sometimes it can be quickly fixed by placing it at the end of the code.

public class mod_MyMod extends BaseMod
 public String Version()
 {
  return "1.2_02";
 }
 public void AddRecipes(CraftingManager recipes)
 {
  recipes.addRecipe(new ItemStack(Item.diamond), new Object[] {
  "#", Character.valueOf('#'), Block.dirt
  });
 }

The above code results in the following error:

java:11: reached end of file while parsing }

Coding utilities and proper code indenting can make it easier to find these unbalanced braces.

This example shows how missing braces can create the “reached end of file while parsing” error message. (@StackOverflow)

12. “Unreachable Statement”

“Unreachable statement” occurs when a statement is written in a place that prevents it from being executed. Usually, this is after a break or return statement.

for(;;){
  break;
  ... // unreachable statement
 }
 
 int i=1;
 if(i==1)
  ...
 else
  ... // dead code

Often simply moving the return statement will fix the error. Read the discussion of how to fix the “unreachable statement” Java software error. (@StackOverflow)

13. “Variable <X> Might Not Have Been Initialized”

This occurs when a local variable declared within a method has not been initialized. It can occur when a variable without an initial value is part of an if statement.

int x;
 if (condition) {
  x = 5;
 }
 System.out.println(x); // x may not have been initialized
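One common way to avoid the error, sketched here with illustrative names, is to give the variable a value on every path, for instance by initializing it at its declaration:

```java
class InitDemo {
    static int pick(boolean condition) {
        int x = 0; // default value covers the path where the condition is false
        if (condition) {
            x = 5;
        }
        return x; // x is definitely assigned on every path
    }

    public static void main(String[] args) {
        System.out.println(pick(true));  // prints 5
        System.out.println(pick(false)); // prints 0
    }
}
```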

Read this discussion of how to avoid triggering the “variable <X> might not have been initialized” error. (@reddit)

14. “Operator … Cannot Be Applied to <X>”

This issue occurs when operators are used with types they are not defined for.

operator < cannot be applied to java.lang.Object,java.lang.Object

This often happens when the Java code tries to use a String in a calculation. To fix it, the string needs to be converted to an integer or float.
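A minimal sketch of that conversion, using illustrative names, parses both strings before applying the operator:

```java
class OperatorDemo {
    // Parses both strings to numbers before comparing; applying "<"
    // directly to two String operands would not compile.
    static boolean isLess(String a, String b) {
        return Double.parseDouble(a) < Double.parseDouble(b);
    }

    public static void main(String[] args) {
        System.out.println(isLess("9", "10"));        // prints true (numeric order)
        System.out.println("9".compareTo("10") < 0);  // prints false (text order)
    }
}
```

The second line shows why the conversion matters: comparing the raw strings with compareTo orders them as text, which gives a different answer than comparing them as numbers.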

Read this example of how non-numeric types were causing a Java software error warning that an operator cannot be applied to a type. (@StackOverflow)

15. “Inconvertible Types”

The “inconvertible types” error occurs when the Java code tries to perform an illegal conversion.

TypeInvocationConversionTest.java:12: inconvertible types
 found : java.util.ArrayList<java.lang.Class<? extends TypeInvocationConversionTest.Interface1>>
 required: java.util.ArrayList<java.lang.Class<?>>
  lessRestrictiveClassList = (ArrayList<Class<?>>) classList;
 
 ^

For example, booleans cannot be converted to an integer.

Read this discussion about finding ways to convert inconvertible types in Java software. (@StackOverflow)

16. “Missing Return Value”

You’ll get the “missing return value” message when a return statement in a method with a non-void return type does not return a value. For example, the following code:

public class SavingsAcc2 {
  private double balance;
  private double interest;
  public SavingsAcc2() {
  balance = 0.0;
  interest = 6.17;
  }
  public SavingsAcc2(double initBalance, double interested) {
  balance = initBalance;
  interest = interested;
  }
  public SavingsAcc2 deposit(double amount) {
  balance = balance + amount;
  return;
  }
  public SavingsAcc2 withdraw(double amount) {
  balance = balance - amount;
  return;
  }
  public SavingsAcc2 addInterest(double interest) {
  balance = balance * (interest / 100) + balance;
  return;
  }
  public double getBalance() {
  return balance;
  }
 }

Produces the following errors:

SavingsAcc2.java:29: missing return value 
return; 
^ 
SavingsAcc2.java:35: missing return value 
return; 
^ 
SavingsAcc2.java:41: missing return value 
return; 
^ 
3 errors

Usually, there is a return statement that doesn’t return anything.
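Two possible fixes are sketched below with a simplified, hypothetical version of the class: either return a value of the declared type (here, the object itself) or declare the method void.

```java
class SavingsAccountSketch {
    private double balance;

    // Returns this, so the declared return type is satisfied
    // and calls can be chained.
    SavingsAccountSketch deposit(double amount) {
        balance = balance + amount;
        return this;
    }

    // Alternative: declare void when nothing needs to be returned.
    void withdraw(double amount) {
        balance = balance - amount;
    }

    double getBalance() {
        return balance;
    }

    public static void main(String[] args) {
        SavingsAccountSketch account = new SavingsAccountSketch();
        account.deposit(100.0).deposit(50.0);
        account.withdraw(30.0);
        System.out.println(account.getBalance()); // prints 120.0
    }
}
```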

Read this discussion about how to avoid the “missing return value” Java software error message. (@coderanch)

17. “Cannot Return a Value From Method Whose Result Type Is Void”

This Java error occurs when a void method tries to return any value, such as in the following example:

public static void move()
 {
  System.out.println("What do you want to do?");
  Scanner scan = new Scanner(System.in);
  int userMove = scan.nextInt();
  return userMove;
 }
 public static void usersMove(String playerName, int gesture)
 {
  int userMove = move();
  if (userMove == -1)
  {
  break;
  }

Often this is fixed by changing the method signature to match the type in the return statement. In this case, instances of void can be changed to int:

public static int move()
 {
  System.out.println("What do you want to do?");
  Scanner scan = new Scanner(System.in);
  int userMove = scan.nextInt();
  return userMove;
 }

Read this discussion about how to fix the “cannot return a value from method whose result type is void” error. (@StackOverflow)

18. “Non-Static Variable … Cannot Be Referenced From a Static Context”

This error occurs when code in a static method tries to access non-static variables (@javinpaul):

public class StaticTest {
  private int count=0;
  public static void main(String args[]) throws IOException {
  count++; //compiler error: non-static variable count cannot be referenced from a static context
  }
 }

To fix the “non-static variable … cannot be referenced from a static context” error, two things can be done:

  • The variable can be declared static.
  • The code can create an instance of a non-static object in the static method.
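Both options can be sketched in one small class (the names are illustrative):

```java
class StaticCounterDemo {
    private static int sharedCount = 0; // option 1: static field, usable from static code
    private int count = 0;              // instance field, needs an object

    static int bumpShared() {
        return ++sharedCount;
    }

    int bump() {
        return ++count;
    }

    public static void main(String[] args) {
        System.out.println(bumpShared());                 // prints 1
        StaticCounterDemo demo = new StaticCounterDemo(); // option 2: create an instance
        System.out.println(demo.bump());                  // prints 1
    }
}
```

Which option fits depends on whether the value belongs to the class as a whole or to each individual object.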

Read this tutorial that explains the difference between static and non-static variables. (@sitesbay)

19. “Non-Static Method … Cannot Be Referenced From a Static Context”

This issue occurs when the Java code tries to call a non-static method from a static context, such as a static main method. For example, the following code:

class Sample
 {
  private int age;
  public void setAge(int a)
  {
  age=a;
  }
  public int getAge()
  {
  return age;
  }
  public static void main(String args[])
  {
  System.out.println("Age is:"+ getAge());
  }
 }

Would return this error:

Exception in thread "main" java.lang.Error: Unresolved compilation problem:
Cannot make a static reference to the non-static method getAge() from the type Sample

To call a non-static method from a static method, create an instance of the class and call the non-static method on that instance.
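Applied to the Sample class above, a corrected version might look like this sketch (renamed SampleFixed here to keep it distinct from the original):

```java
class SampleFixed {
    private int age;

    public void setAge(int a) {
        age = a;
    }

    public int getAge() {
        return age;
    }

    public static void main(String[] args) {
        SampleFixed sample = new SampleFixed(); // create an instance first
        sample.setAge(30);
        System.out.println("Age is: " + sample.getAge()); // prints Age is: 30
    }
}
```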

Read this explanation of the difference between non-static methods and static methods.

20. “(array) <X> Not Initialized”

You’ll get the “(array) <X> not initialized” message when an array has been declared but not initialized. Arrays are fixed in length so each array needs to be initialized with the desired length.

The following code is acceptable:

AClass[] array = {object1, object2};

As is:

AClass[] array = new AClass[2];
...
array[0] = object1;
array[1] = object2;

But not:

AClass[] array;
...
array = {object1, object2};

Read this discussion of how to initialize arrays in Java software. (@StackOverflow)

21. “ArrayIndexOutOfBoundsException”

This is a runtime error message that occurs when the code attempts to access an array index that is outside the bounds of the array. The following code would trigger this exception:

String[] name = {
  "tom",
  "dick",
  "harry"
 };
 for (int i = 0; i <= name.length; i++) {
  System.out.print(name[i] + '\n');
 }

Here’s another example (@DukeU):

int[] list = new int[5];
list[5] = 33; // illegal index, maximum index is 4

Array indexes start at zero and end at one less than the length of the array. Often it is fixed by using “<” instead of “<=” when defining the limits of the array index.
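A corrected version of the first loop might look like the following sketch, where the wrapper class and method are illustrative:

```java
class ArrayBoundsDemo {
    // Joins the names, one per line, without stepping past the last index.
    static String joinNames(String[] names) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < names.length; i++) { // "<" stops at the last valid index
            sb.append(names[i]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] name = {"tom", "dick", "harry"};
        System.out.print(joinNames(name)); // prints the three names, one per line
    }
}
```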

Check out this example of how an index triggered the “ArrayIndexOutOfBoundsException” Java software error message. (@StackOverflow)

22. “StringIndexOutOfBoundsException”

This is an issue that occurs when the code attempts to access part of the string that is not within the bounds of the string. Usually, this happens when the code tries to create a substring whose bounds extend beyond the length of the string. Here’s an example (@javacodegeeks):

public class StringCharAtExample {
    public static void main(String[] args) {
       String str = "Java Code Geeks!";
        System.out.println("Length: " + str.length());
        //The following statement throws an exception, because
        //the request index is invalid.
        char ch = str.charAt(50);
    }
}

Like array indexes, string indexes start at zero. When indexing a string, the last character is at one less than the length of the string. The “StringIndexOutOfBoundsException” Java software error message usually means the index is trying to access characters that aren’t there.
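One defensive approach, sketched here with a hypothetical helper method, is to check the index against the string length before calling charAt:

```java
class StringBoundsDemo {
    // Returns the character at index, or fallback when index is out of range.
    static char charAtOrDefault(String s, int index, char fallback) {
        return (index >= 0 && index < s.length()) ? s.charAt(index) : fallback;
    }

    public static void main(String[] args) {
        String str = "Java Code Geeks!";
        System.out.println(str.length());                  // prints 16
        System.out.println(charAtOrDefault(str, 0, '?'));  // prints J
        System.out.println(charAtOrDefault(str, 50, '?')); // prints ?
    }
}
```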

Here’s an example that illustrates how the “StringIndexOutOfBoundsException” can occur and be fixed. (@StackOverflow)

23. “NullPointerException”

A “NullPointerException” will occur when the program tries to use an object reference that does not have a value assigned to it (@geeksforgeeks).

// A Java program to demonstrate that invoking a method
// on null causes NullPointerException
import java.io.*;
class GFG{
    public static void main (String[] args)    {
        // Initializing String variable with null value
        String ptr = null;
        // Checking whether ptr equals "gfg".
       try {
           // This line of code throws NullPointerException
            // because ptr is null
            if (ptr.equals("gfg"))
               System.out.print("Same");
            else
                System.out.print("Not Same");
        } catch(NullPointerException e)
       {
            System.out.print("NullPointerException Caught");
        }
    }
}

The program often raises this exception when:

  • A statement calls a method or accesses a field on a reference whose value is null.
  • A variable is declared with a class type but is never assigned an object.

Here’s a discussion of when developers may encounter the “NullPointerException” and how to handle it. (@StackOverflow)
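Rather than catching the exception, the comparison can be written so that a null value is handled safely. The following sketch uses illustrative names; Objects.equals is part of java.util:

```java
import java.util.Objects;

class NullSafeDemo {
    // Putting the literal on the left means equals is never called on null.
    static boolean isGfg(String ptr) {
        return "gfg".equals(ptr);
    }

    // Objects.equals handles null on either side.
    static boolean same(String a, String b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(isGfg(null));      // prints false
        System.out.println(isGfg("gfg"));     // prints true
        System.out.println(same(null, null)); // prints true
    }
}
```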

24. “NoClassDefFoundError”

The “NoClassDefFoundError” occurs when the JVM cannot find the definition of a class that was available at compile time, such as the file containing a class with the main method. Here’s an example from DZone (@DZone):

If you compile this program:

class A
 {
  // some code
 }
 public class B
 {
  public static void main(String[] args)
  {
  A a = new A();
  }
 }

Two .class files are generated: A.class and B.class. Removing the A.class file and running the B.class file, you’ll get the NoClassDefFoundError:

Exception in thread "main" java.lang.NoClassDefFoundError: A
 at MainClass.main(MainClass.java:10)
 Caused by: java.lang.ClassNotFoundException: A
 at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

This can happen if:

  • The file is not in the right directory.
  • The name of the class is not the same as the name of the file (without the file extension). The names are case sensitive.

Read this discussion of why “NoClassDefFoundError” occurs when running Java software. (@StackOverflow)

25. “NoSuchMethodError”

This error message will occur when the Java software tries to call a method of a class and the method no longer has a definition (@myUND):

Error: Could not find or load main class wiki.java

Often the “NoSuchMethodError” Java software error occurs when there is a typo in the declaration.

Read this tutorial to learn how to avoid the error message “NoSuchMethodError.” (@javacodegeeks)

26. “NoSuchProviderException”

“NoSuchProviderException” occurs when a security provider is requested that is not available (@alvinalexander):

javax.mail.NoSuchProviderException

When trying to find why “NoSuchProviderException” occurs, check:

  • The JRE configuration.
  • The Java home is set in the configuration.
  • Which Java environment is used.
  • The security provider entry.

Read this discussion of what causes “NoSuchProviderException” when Java software is run. (@StackOverflow)

27. “AccessControlException”

“AccessControlException” indicates that requested access to system resources such as a file system or network is denied, as in this example from JBossDeveloper (@jbossdeveloper):

ERROR Could not register mbeans java.security.AccessControlException: WFSM000001: Permission check failed (permission "("javax.management.MBeanPermission" "org.apache.logging.log4j.core.jmx.LoggerContextAdmin#-
[org.apache.logging.log4j2:type=51634f]" "registerMBean")" in code source "(vfs:/C:/wildfly-10.0.0.Final/standalone/deployments/mySampleSecurityApp.war/WEB-INF/lib/log4j-core-2.5.jar )" of "null")

Read this discussion of a workaround used to get past an “AccessControlException” error. (@github)

28. “ArrayStoreException”

An “ArrayStoreException” occurs when the rules of casting elements in Java arrays are broken. Arrays are very careful about what can go into them. (@Roedyg) For instance, this example from JavaScan.com illustrates that this program (@java_scan):

 /* ............... START ............... */
public class JavaArrayStoreException {
     public static void main(String...args) {
         Object[] val = new Integer[4];
         val[0] = 5.8;
     }
} /* ............... END ............... */

Results in the following output:

Exception in thread "main" java.lang.ArrayStoreException: java.lang.Double
at ExceptionHandling.JavaArrayStoreException.main(JavaArrayStoreException.java:7)

When an array is initialized, the type of object allowed into the array needs to be declared. Each array element then needs to be of that type or one of its subtypes.

Read this discussion of how to solve for the “ArrayStoreException.” (@StackOverflow)

29. “Bad Magic Number”

This Java software error message means something may be wrong with the class definition files on the network. Here’s an example from The Server Side (@TSS_dotcom):

Java(TM) Plug-in: Version 1.3.1_01
Using JRE version 1.3.1_01 Java HotSpot(TM) Client VM
User home directory = C:\Documents and Settings\Ankur
Proxy Configuration: Manual Configuration
Proxy: 192.168.11.6:80
java.lang.ClassFormatError: SalesCalculatorAppletBeanInfo (Bad magic number)
at java.lang.ClassLoader.defineClass0(Native Method)
at java.lang.ClassLoader.defineClass(Unknown Source) 
at java.security.SecureClassLoader.defineClass(Unknown Source)
at sun.applet.AppletClassLoader.findClass(Unknown Source)
at sun.plugin.security.PluginClassLoader.access$201(Unknown Source)
at sun.plugin.security.PluginClassLoader$1.run(Unknown Source) 
at java.security.AccessController.doPrivileged(Native Method)
at sun.plugin.security.PluginClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.applet.AppletClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.beans.Introspector.instantiate(Unknown Source)
at java.beans.Introspector.findInformant(Unknown Source)
at java.beans.Introspector.(Unknown Source)
at java.beans.Introspector.getBeanInfo(Unknown Source)
at sun.beans.ole.OleBeanInfo.(Unknown Source)
at sun.beans.ole.StubInformation.getStub(Unknown Source)
at sun.plugin.ocx.TypeLibManager$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.plugin.ocx.TypeLibManager.getTypeLib(Unknown Source)
at sun.plugin.ocx.TypeLibManager.getTypeLib(Unknown Source)
at sun.plugin.ocx.ActiveXAppletViewer.statusNotification(Native Method)
at sun.plugin.ocx.ActiveXAppletViewer.notifyStatus(Unknown Source)
at sun.plugin.ocx.ActiveXAppletViewer.showAppletStatus(Unknown Source)
at sun.applet.AppletPanel.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

The “bad magic number” error message could happen when:

  • The first four bytes of a class file are not the hexadecimal number CAFEBABE.
  • The class file was uploaded in ASCII mode instead of binary mode.
  • The Java program is run before it is compiled.

Read this discussion of how to find the reason for a “bad magic number.” (@coderanch)

30. “Broken Pipe”

This error message means that the data stream from a file or network socket has stopped working or has been closed from the other end (@ExpertsExchange).

Exception in thread "main" java.net.SocketException: Broken pipe
      at java.net.SocketOutputStream.socketWrite0(Native Method)
      at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
      at java.net.SocketOutputStream.write(SocketOutputStream.java:115)
      at java.io.DataOutputStream.write

The causes of a broken pipe often include:

  • Running out of disk scratch space.
  • RAM may be clogged.
  • The datastream may be corrupt.
  • The process reading the pipe might have been closed.

Read this discussion of what the Java error “broken pipe” means. (@StackOverflow)

31. “Could Not Create Java Virtual Machine”

This Java error message usually occurs when the code tries to invoke Java with the wrong arguments (@ghacksnews):

Error: Could not create the Java Virtual Machine
Error: A fatal exception has occurred. Program will exit.

It is often caused by a mistake in the declaration in the code or by failing to allocate the proper amount of memory to the JVM.

Read this discussion of how to fix the Java software error “Could not create Java Virtual Machine.” (@StackOverflow)

32. “Class File Contains Wrong Class”

The “class file contains wrong class” issue occurs when the Java code tries to find the class file in the wrong directory, resulting in an error message similar to the following:

MyTest.java:10: cannot access MyStruct 
bad class file: D:\Java\test\MyStruct.java 
file does not contain class MyStruct 
Please remove or make sure it appears in the correct subdirectory of the classpath. 
MyStruct ms = new MyStruct(); ^

To fix this error, these tips could help:

  • Make sure the name of the source file and the name of the class match — including case.
  • Check if the package statement is correct or missing.
  • Make sure the source file is in the right directory.

Read this discussion of how to fix a “class file contains wrong class” error. (@StackOverflow)

33. “ClassCastException”

The “ClassCastException” message indicates the Java code is trying to cast an object to the wrong class. In this example from Java Concept of the Day, running the following program:

package com;
class A{
    int i = 10;
}
class B extends A{
    int j = 20;
}
class C extends B{
    int k = 30;
} 
public class ClassCastExceptionDemo{
   public static void main(String[] args)    {
        A a = new B();   //B type is auto up casted to A type
        B b = (B) a;     //A type is explicitly down casted to B type.
        C c = (C) b;    //Here, you will get class cast exception
        System.out.println(c.k);
    }
}

Results in this error:

Exception in thread "main" java.lang.ClassCastException: com.B cannot be cast to com.
at com.ClassCastExceptionDemo.main(ClassCastExceptionDemo.java:23)

The Java code will create a hierarchy of classes and subclasses. To avoid the “ClassCastException” error, make sure the new type belongs to the right class or one of its parent classes. If Generics are used, these errors can be caught when the code is compiled.
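A guarded downcast can be sketched as follows. The A, B, and C classes mirror the example above, while the wrapper class, the method name, and the -1 sentinel are illustrative choices:

```java
class CastDemo {
    static class A { int i = 10; }
    static class B extends A { int j = 20; }
    static class C extends B { int k = 30; }

    // Returns k when the object really is a C, otherwise -1
    // (an arbitrary sentinel chosen for this sketch).
    static int kOrDefault(A a) {
        if (a instanceof C) {
            C c = (C) a; // safe: the runtime type was checked first
            return c.k;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(kOrDefault(new B())); // prints -1
        System.out.println(kOrDefault(new C())); // prints 30
    }
}
```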

Read this tutorial on how to fix “ClassCastException” Java software errors. (@java_concept)

34. “ClassFormatError”

The “ClassFormatError” message indicates a linkage error and occurs when a class file cannot be read or interpreted as a class file.

Caused by: java.lang.ClassFormatError: Absent Code attribute in method that is
        not native or abstract in class file javax/persistence/GenerationType
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(Unknown Source)
at java.lang.ClassLoader.defineClass(Unknown Source)
at java.security.SecureClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.access$000(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)

There are several reasons why a “ClassFormatError” can occur:

  • The class file was uploaded in ASCII mode instead of binary mode.
  • The web server sends class files as ASCII instead of binary.
  • There could be a classpath error that prevents the code from finding the class file.
  • If the class is loaded twice, the second time will cause the exception to be thrown.
  • An old version of Java runtime is being used.

Read this discussion about what causes the “ClassFormatError” in Java. (@StackOverflow)

35. “ClassNotFoundException”

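“ClassNotFoundException” occurs only at run time, when the code tries to load a class by name and no definition for it can be found on the classpath, for example because a class that was there during compilation is missing at run time. Unlike “NoClassDefFoundError,” it is a checked exception rather than a linkage error.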

Much like the “NoClassDefFoundError,” this issue can occur if:

  • The file is not in the right directory.
  • The name of the class is not the same as the name of the file (without the file extension). The names are case sensitive.
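The exception is typically seen when a class is loaded by name. The following sketch uses an illustrative helper and a made-up class name (com.example.DoesNotExist) to show how the lookup can be guarded:

```java
class ClassLookupDemo {
    // Returns true if a class with the given fully qualified name
    // can be loaded from the current classpath.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAvailable("java.lang.String"));         // prints true
        System.out.println(isAvailable("com.example.DoesNotExist")); // prints false, assuming no such class exists
    }
}
```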

Read this discussion of what causes “ClassNotFoundException” for more cases. (@StackOverflow)

36. “ExceptionInInitializerError”

This Java issue will occur when something goes wrong with a static initialization (@GitHub). When the Java code later uses the class, the “NoClassDefFoundError” error will occur.

java.lang.ExceptionInInitializerError
  at org.eclipse.mat.hprof.HprofIndexBuilder.fill(HprofIndexBuilder.java:54)
  at org.eclipse.mat.parser.internal.SnapshotFactory.parse(SnapshotFactory.java:193)
  at org.eclipse.mat.parser.internal.SnapshotFactory.openSnapshot(SnapshotFactory.java:106)
  at com.squareup.leakcanary.HeapAnalyzer.openSnapshot(HeapAnalyzer.java:134)
  at com.squareup.leakcanary.HeapAnalyzer.checkForLeak(HeapAnalyzer.java:87)
  at com.squareup.leakcanary.internal.HeapAnalyzerService.onHandleIntent(HeapAnalyzerService.java:56)
  at android.app.IntentService$ServiceHandler.handleMessage(IntentService.java:65)
  at android.os.Handler.dispatchMessage(Handler.java:102)
  at android.os.Looper.loop(Looper.java:145)
  at android.os.HandlerThread.run(HandlerThread.java:61)
Caused by: java.lang.NullPointerException: in == null
  at java.util.Properties.load(Properties.java:246)
  at org.eclipse.mat.util.MessageUtil.(MessageUtil.java:28)
 at org.eclipse.mat.util.MessageUtil.(MessageUtil.java:13)
  ... 10 more

The error itself rarely carries enough information to fix the problem. Calling getCause() on the error returns the exception that was thrown inside the static initializer.
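
As a minimal sketch of the behavior described above, the following class (all names here, such as `InitializerErrorDemo` and `BrokenConfig`, are hypothetical and not taken from the stack trace) has a static initializer that always fails. The first access should surface an “ExceptionInInitializerError” whose getCause() is the real culprit, and later accesses should fail with “NoClassDefFoundError.”

```java
final class InitializerErrorDemo {

    // Hypothetical class whose static initializer always fails.
    static class BrokenConfig {
        static final int VALUE = Integer.parseInt("not-a-number");
    }

    /** Touches BrokenConfig and returns whatever error the JVM raises, or null if none. */
    static Throwable touch() {
        try {
            int unused = BrokenConfig.VALUE; // triggers class initialization on first use
            return null;
        } catch (LinkageError e) {
            // Both ExceptionInInitializerError and NoClassDefFoundError are LinkageErrors.
            return e;
        }
    }

    public static void main(String[] args) {
        Throwable first = touch();
        System.out.println(first.getClass().getName() + ", cause: " + first.getCause());
        Throwable second = touch();
        System.out.println(second.getClass().getName());
    }
}
```

Logging the result of getCause() at the point where the error is caught is usually the quickest way to find out which line of the static initializer actually failed.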

Read this discussion about how to track down the cause of the ExceptionInInitializerError. (@StackOverflow)

37. “IllegalBlockSizeException”

An “IllegalBlockSizeException” will occur during decryption when the length of the message is not a multiple of the cipher’s block size (8 bytes for DES, 16 bytes for AES). Here’s an example from ProgramCreek.com (@ProgramCreek):

@Override
protected byte[] engineWrap(Key key) throws IllegalBlockSizeException, InvalidKeyException {
    try {
        byte[] encoded = key.getEncoded();
        return engineDoFinal(encoded, 0, encoded.length);
    } catch (BadPaddingException e) {
        IllegalBlockSizeException newE = new IllegalBlockSizeException();
        newE.initCause(e);
        throw newE;
    }
}

The “IllegalBlockSizeException” could be caused by:

  • Different encryption and decryption algorithm options used.
  • The message to be decrypted could be truncated or garbled in transmission.
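
The truncation case can be reproduced with a short sketch. The class name `BlockSizeDemo`, the all-zero key, and the use of ECB mode are assumptions made purely to keep the example small; none of them is a recommendation for real code.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.IllegalBlockSizeException;
import javax.crypto.spec.SecretKeySpec;

final class BlockSizeDemo {

    // Fixed all-zero 128-bit key, for illustration only.
    private static final SecretKeySpec KEY = new SecretKeySpec(new byte[16], "AES");

    /** Encrypts with padding, so the result is always a whole number of 16-byte blocks. */
    static byte[] encrypt(byte[] plaintext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, KEY);
            return cipher.doFinal(plaintext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if decrypting the input fails with IllegalBlockSizeException. */
    static boolean hasIllegalBlockSize(byte[] ciphertext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.DECRYPT_MODE, KEY);
            cipher.doFinal(ciphertext);
            return false;
        } catch (IllegalBlockSizeException e) {
            return true; // input length is not a multiple of the block size
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Passing the full output of encrypt() to hasIllegalBlockSize() should return false, while passing a copy with the last byte cut off should return true. Checking the ciphertext length before decrypting is a cheap way to detect truncated messages early.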

Read this discussion about how to prevent the IllegalBlockSizeException Java software error message. (@StackOverflow)

38. “BadPaddingException”

A “BadPaddingException” will occur during decryption when the cipher expects the message to have been padded to a multiple of the block size (8 bytes for DES, 16 bytes for AES) but the padding it finds is not valid. Here’s an example from Stack Overflow (@StackOverflow):

javax.crypto.BadPaddingException: Given final block not properly padded
at com.sun.crypto.provider.SunJCE_f.b(DashoA13*..)
at com.sun.crypto.provider.SunJCE_f.b(DashoA13*..)
at com.sun.crypto.provider.AESCipher.engineDoFinal(DashoA13*..)
at javax.crypto.Cipher.doFinal(DashoA13*..)

This usually means that the data was not padded properly during encryption, that the wrong key was used, or that the ciphertext was corrupted. Encrypted data is binary, so don’t try to store it in a string.
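
A small sketch can make both points concrete. It encrypts one block without padding and then decrypts it with a cipher that expects PKCS#5 padding, and it shows Base64 as one way to carry ciphertext as text. The class name `BadPaddingDemo`, the all-zero key, and ECB mode are assumptions chosen only for brevity.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.BadPaddingException;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

final class BadPaddingDemo {

    // Fixed all-zero 128-bit key, for illustration only.
    private static final SecretKeySpec KEY = new SecretKeySpec(new byte[16], "AES");

    private static byte[] run(String transformation, int mode, byte[] input)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(transformation);
        cipher.init(mode, KEY);
        return cipher.doFinal(input);
    }

    /** Encrypts one all-zero block WITHOUT padding, then decrypts it expecting padding. */
    static boolean mismatchedPaddingFails() {
        try {
            byte[] ciphertext = run("AES/ECB/NoPadding", Cipher.ENCRYPT_MODE, new byte[16]);
            run("AES/ECB/PKCS5Padding", Cipher.DECRYPT_MODE, ciphertext);
            return false;
        } catch (BadPaddingException e) {
            return true; // the last plaintext byte is 0x00, which is not valid PKCS#5 padding
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Base64 keeps every byte intact when ciphertext has to travel as text. */
    static boolean base64RoundTrips(byte[] ciphertext) {
        String text = Base64.getEncoder().encodeToString(ciphertext);
        return Arrays.equals(ciphertext, Base64.getDecoder().decode(text));
    }
}
```

The design point is that the transformation string (algorithm, mode, and padding) must be identical on both sides, and that binary ciphertext should be encoded (Base64 here) rather than converted directly to a String.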

Read this discussion about how to prevent the BadPaddingException. (@StackOverflow)

39. “IncompatibleClassChangeError”

An “IncompatibleClassChangeError” is a form of LinkageError that can occur when a base class changes after the compilation of a child class. This example is from How to Do in Java (@HowToDoInJava):

Exception in thread "main" java.lang.IncompatibleClassChangeError: Implementing class
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(Unknown Source)
at java.security.SecureClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.access$000(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClassInternal(Unknown Source)
at net.sf.cglib.core.DebuggingClassWriter.toByteArray(DebuggingClassWriter.java:73)
at net.sf.cglib.core.DefaultGeneratorStrategy.generate(DefaultGeneratorStrategy.java:26)
at net.sf.cglib.core.AbstractClassGenerator.create(AbstractClassGenerator.java:216)
at net.sf.cglib.core.KeyFactory$Generator.create(KeyFactory.java:144)
at net.sf.cglib.core.KeyFactory.create(KeyFactory.java:116)
at net.sf.cglib.core.KeyFactory.create(KeyFactory.java:108)
at net.sf.cglib.core.KeyFactory.create(KeyFactory.java:104)
at net.sf.cglib.proxy.Enhancer.(Enhancer.java:69)

When the “IncompatibleClassChangeError” occurs, it is possible that:

  • The static on the main method was forgotten.
  • A legal class was used illegally.
  • A class was changed and another class still refers to it by its old signatures. Try deleting all class files and recompiling everything.

Try these steps to resolve the “IncompatibleClassChangeError.” (@javacodegeeks)

40. “FileNotFoundException”

This Java software error message is thrown when a file with the specified pathname does not exist.

@Override public ParcelFileDescriptor openFile(Uri uri, String mode) throws FileNotFoundException {
    if (uri.toString().startsWith(FILE_PROVIDER_PREFIX)) {
        int m = ParcelFileDescriptor.MODE_READ_ONLY;
        if (mode.equalsIgnoreCase("rw")) m = ParcelFileDescriptor.MODE_READ_WRITE; 
        File f = new File(uri.getPath());
        ParcelFileDescriptor pfd = ParcelFileDescriptor.open(f, m);
        return pfd;
    } else {
        throw new FileNotFoundException("Unsupported uri: " + uri.toString());
    }
}

In addition to the file not existing at the specified pathname, this could mean that an existing file is inaccessible.
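
To tell these cases apart, one option is to inspect the file after the exception has been caught. The following is only a sketch; the class `FileOpenDemo` and its return values are made up for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

final class FileOpenDemo {

    /** Tries to open the file for reading and reports why that failed, or "ok". */
    static String diagnose(File file) {
        try (InputStream in = new FileInputStream(file)) {
            return "ok";
        } catch (FileNotFoundException e) {
            // The same exception type covers several different situations.
            if (!file.exists()) {
                return "missing";
            }
            if (file.isDirectory()) {
                return "directory";
            }
            return "inaccessible"; // e.g. insufficient permissions
        } catch (IOException e) {
            return "close failed";
        }
    }
}
```

For example, diagnose(new File(System.getProperty("java.io.tmpdir"))) should report "directory", because opening a directory as a stream also raises “FileNotFoundException.”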

Read this discussion about why the “FileNotFoundException” could be thrown. (@StackOverflow)

41. “EOFException”

An “EOFException” is thrown when an end of file or end of stream has been reached unexpectedly during input. Here’s an example from JavaBeat of an application that throws an EOFException:

import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
public class ExceptionExample {
    public void testMethod1() {
        File file = new File("test.txt");
        DataInputStream dataInputStream = null;
        try {
            dataInputStream = new DataInputStream(new FileInputStream(file));
            while (true) {
                dataInputStream.readInt();
            }
        } catch (EOFException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (dataInputStream != null) {
                    dataInputStream.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    public static void main(String[] args) {
        ExceptionExample instance1 = new ExceptionExample();
        instance1.testMethod1();
    }
}

Running the program above results in the following exception:

java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at logging.simple.ExceptionExample.testMethod1(ExceptionExample.java:16)
at logging.simple.ExceptionExample.main(ExceptionExample.java:36)

“EOFException” is thrown when DataInputStream tries to read data but there is no more data left in the stream. It can also occur in the ObjectInputStream and RandomAccessFile classes.
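
Because DataInputStream has no other way to signal the end of the data, a common pattern is to treat “EOFException” as the normal loop exit. Here is a hedged sketch of that pattern; the class `ReadIntsDemo` is hypothetical and reads from a byte array rather than test.txt so that it is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class ReadIntsDemo {

    /** Reads 4-byte integers until the stream is exhausted. */
    static List<Integer> readAllInts(byte[] data) {
        List<Integer> values = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (true) {
                values.add(in.readInt());
            }
        } catch (EOFException endOfData) {
            // Expected: there is no more data (or fewer than 4 bytes were left over).
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The try-with-resources statement also replaces the manual close() in a finally block.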

Read this discussion about when the “EOFException” can occur while running Java software. (@StackOverflow)

42. “UnsupportedEncodingException”

This Java software error message is thrown when character encoding is not supported (@Penn).

public UnsupportedEncodingException()

It is possible that the Java Virtual Machine being used doesn’t support a given character set.
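
One way to handle this, sketched below under the assumption that UTF-8 is an acceptable fallback, is to catch the exception when a charset is looked up by name. The class `EncodingDemo` is hypothetical. Using the constants in StandardCharsets avoids the checked exception altogether.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

final class EncodingDemo {

    /** Encodes with the named charset, falling back to UTF-8 if the JVM does not support it. */
    static byte[] encodeOrUtf8(String text, String charsetName) {
        try {
            return text.getBytes(charsetName);
        } catch (UnsupportedEncodingException e) {
            // StandardCharsets constants are guaranteed to exist, so no checked exception here.
            return text.getBytes(StandardCharsets.UTF_8);
        }
    }
}
```

When the charset is known at compile time, passing StandardCharsets.UTF_8 (or another constant) directly is the simpler choice.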

Read this discussion of how to handle “UnsupportedEncodingException” while running Java software. (@StackOverflow)

43. “SocketException”

A “SocketException” indicates there is an error creating or accessing a socket (@ProgramCreek).

public void init(String contextName, ContextFactory factory) {
    super.init(contextName, factory);
    String periodStr = getAttribute(PERIOD_PROPERTY);
    if (periodStr != null) {
        int period = 0;
        try {
            period = Integer.parseInt(periodStr);
        } catch (NumberFormatException nfe) {}
        if (period <= 0) {
            throw new MetricsException("Invalid period: " + periodStr);
        }
        setPeriod(period);
    }
    metricsServers =
        Util.parse(getAttribute(SERVERS_PROPERTY), DEFAULT_PORT);
    unitsTable = getAttributeTable(UNITS_PROPERTY);
    slopeTable = getAttributeTable(SLOPE_PROPERTY);
    tmaxTable = getAttributeTable(TMAX_PROPERTY);
    dmaxTable = getAttributeTable(DMAX_PROPERTY);
    try {
        datagramSocket = new DatagramSocket();
    } catch (SocketException se) {
        se.printStackTrace();
    }
}

This exception is usually thrown when the maximum number of connections is reached due to:

  • No more network ports available to the application.
  • The system doesn’t have enough memory to support new connections.

Read this discussion of how to resolve “SocketException” issues while running Java software. (@StackOverflow)

44. “SSLException”

This Java software error message occurs when there is a failure in SSL-related operations. The following example is from Atlassian (@Atlassian):

com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLException: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
  at com.sun.jersey.client.apache.ApacheHttpClientHandler.handle(ApacheHttpClientHandler.java:202)
  at com.sun.jersey.api.client.Client.handle(Client.java:365)
  at com.sun.jersey.api.client.WebResource.handle(WebResource.java:556)
  at com.sun.jersey.api.client.WebResource.get(WebResource.java:178)
  at com.atlassian.plugins.client.service.product.ProductServiceClientImpl.getProductVersionsAfterVersion(ProductServiceClientImpl.java:82)
  at com.atlassian.upm.pac.PacClientImpl.getProductUpgrades(PacClientImpl.java:111)
  at com.atlassian.upm.rest.resources.ProductUpgradesResource.get(ProductUpgradesResource.java:39)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
  at java.lang.reflect.Method.invoke(Unknown Source)
  at com.atlassian.plugins.rest.common.interceptor.impl.DispatchProviderHelper$ResponseOutInvoker$1.invoke(DispatchProviderHelper.java:206)
  at com.atlassian.plugins.rest.common.interceptor.impl.DispatchProviderHelper$1.intercept(DispatchProviderHelper.java:90)
  at com.atlassian.plugins.rest.common.interceptor.impl.DefaultMethodInvocation.invoke(DefaultMethodInvocation.java:61)
 at com.atlassian.plugins.rest.common.expand.interceptor.ExpandInterceptor.intercept(ExpandInterceptor.java:38)
 at com.atlassian.plugins.rest.common.interceptor.impl.DefaultMethodInvocation.invoke(DefaultMethodInvocation.java:61)
  at com.atlassian.plugins.rest.common.interceptor.impl.DispatchProviderHelper.invokeMethodWithInterceptors(DispatchProviderHelper.java:98)
  at com.atlassian.plugins.rest.common.interceptor.impl.DispatchProviderHelper.access$100(DispatchProviderHelper.java:28)
  at com.atlassian.plugins.rest.common.interceptor.impl.DispatchProviderHelper$ResponseOutInvoker._dispatch(DispatchProviderHelper.java:202)
  ...
 Caused by: javax.net.ssl.SSLException: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
  ...
 Caused by: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
  ...
 Caused by: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty

This can happen if:

  • Certificates on the server or client have expired.
  • Server port has been reset to another port.

Read this discussion of what can cause the “SSLException” error in Java software. (@StackOverflow)

45. “MissingResourceException”

A “MissingResourceException” occurs when a resource is missing. If the resource is in the correct classpath, this is usually because a properties file is not configured properly. Here’s an example (@TIBCO):

java.util.MissingResourceException: Can't find bundle for base name localemsgs_en_US, locale en_US
java.util.ResourceBundle.throwMissingResourceException
java.util.ResourceBundle.getBundleImpl
java.util.ResourceBundle.getBundle
net.sf.jasperreports.engine.util.JRResourcesUtil.loadResourceBundle
net.sf.jasperreports.engine.util.JRResourcesUtil.loadResourceBundle

Read this discussion of how to fix “MissingResourceException” while running Java software.

46. “NoInitialContextException”

A “NoInitialContextException” occurs when the Java application wants to perform a naming operation but can’t create a connection (@TheASF).

[java] Caused by: javax.naming.NoInitialContextException: Need to specify class name in environment or system property, or as an applet parameter, or in an application resource file: java.naming.factory.initial
 [java] at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:645)
 [java] at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:247)
 [java] at javax.naming.InitialContext.getURLOrDefaultInitCtx(InitialContext.java:284)
 [java] at javax.naming.InitialContext.lookup(InitialContext.java:351)
 [java] at org.apache.camel.impl.JndiRegistry.lookup(JndiRegistry.java:51)

This can be a complex problem to solve, but here are some possible causes of the “NoInitialContextException” Java error message:

  • The application may not have the proper credentials to make a connection.
  • The code may not identify the implementation of JNDI needed.
  • The InitialContext class may not be configured with the right properties.

Read this discussion of what “NoInitialContextException” means when running Java software. (@StackOverflow)

47. “NoSuchElementException”

A “NoSuchElementException” happens when an iteration (such as a “for” loop) tries to access the next element when there is none.

import java.util.Enumeration;
import java.util.Hashtable;

public class NoSuchElementExceptionDemo {
    public static void main(String[] args) {
        Hashtable<String, String> sampleMap = new Hashtable<>();
        Enumeration<String> enumeration = sampleMap.elements();
        enumeration.nextElement(); // java.util.NoSuchElementException here because the enumeration is empty
    }
}

Output:

Exception in thread "main" java.util.NoSuchElementException: Hashtable Enumerator
  at java.util.Hashtable$EmptyEnumerator.nextElement(Hashtable.java:1084)
  at test.ExceptionTest.main(NoSuchElementExceptionDemo.java:23)

The “NoSuchElementException” can be thrown by these methods:

  • Enumeration::nextElement()
  • NamingEnumeration::next()
  • StringTokenizer::nextElement()
  • Iterator::next()
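
Each of the methods listed above has a companion check (hasMoreElements(), hasMore(), hasMoreTokens(), or hasNext()) that can be called first. A minimal sketch of that guard for Iterator, using a hypothetical helper class `SafeIterationDemo`:

```java
import java.util.Iterator;
import java.util.Optional;

final class SafeIterationDemo {

    /** Returns the first element, or an empty Optional instead of throwing. */
    static <T> Optional<T> first(Iterable<T> items) {
        Iterator<T> it = items.iterator();
        // hasNext() must be checked before every call to next().
        // Note: a null first element also yields an empty Optional.
        return it.hasNext() ? Optional.ofNullable(it.next()) : Optional.empty();
    }
}
```

Calling first() on an empty collection returns Optional.empty() rather than raising “NoSuchElementException.”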

Read this tutorial of how to fix “NoSuchElementException” in Java software. (@javinpaul)

48. “NoSuchFieldError”

This Java software error message is thrown when an application tries to access a field in an object but the specified field no longer exists in the object (@sourceforge).

public NoSuchFieldError()

Usually, this error is caught by the compiler. It is only thrown at run time if a class definition has changed between compilation and execution.

Read this discussion of how to find what causes the “NoSuchFieldError” when running Java software. (@StackOverflow)

49. “NumberFormatException”

This Java software error message occurs when the application tries to convert a string to a numeric type, but the string is not in a valid numeric format (@alvinalexander).

package com.devdaily.javasamples;

public class ConvertStringToNumber {
    public static void main(String[] args) {
        try {
            String s = "FOOBAR";
            int i = Integer.parseInt(s);
            // this line of code will never be reached
            System.out.println("int value = " + i);
        }
        catch (NumberFormatException nfe) {
            nfe.printStackTrace();
        }
    }
}

The “NumberFormatException” can be thrown when:

  • The string has leading or trailing spaces.
  • The sign is not ahead of the number.
  • The number has commas.
  • Localisation may not categorize it as a valid number.
  • The number is too big to fit in the numeric type.
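
Several of these cases can be handled in one place. The sketch below, built around a hypothetical helper `SafeParseDemo.parseInt`, trims whitespace and turns every remaining failure into an empty result instead of an exception. It does not try to interpret commas or locale-specific formats.

```java
import java.util.OptionalInt;

final class SafeParseDemo {

    /** Parses a decimal int, returning empty instead of throwing on bad input. */
    static OptionalInt parseInt(String raw) {
        if (raw == null) {
            return OptionalInt.empty();
        }
        try {
            // trim() removes the leading/trailing spaces that parseInt rejects.
            return OptionalInt.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            // Covers non-digits, commas, misplaced signs, and values outside the int range.
            return OptionalInt.empty();
        }
    }
}
```

With this helper, parseInt(" 42 ") yields 42, while "1,000", "FOOBAR", and "2147483648" (one more than Integer.MAX_VALUE) all yield an empty result.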

Read this discussion of how to avoid “NumberFormatException” when running Java software. (@StackOverflow).

50. “TimeoutException”

This Java software error message occurs when a blocking operation times out.

private void queueObject(ComplexDataObject obj) throws TimeoutException, InterruptedException {
    if (!queue.offer(obj, 10, TimeUnit.SECONDS)) {
        TimeoutException ex = new TimeoutException("Timed out waiting for parsed elements to be processed. Aborting.");
        throw ex;
    }
}
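
“TimeoutException” is also what Future.get() throws when a result does not arrive in time. The following sketch shows one way to catch it and fall back to a default value; the class `TimeoutDemo` and the fallback strategy are assumptions for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimeoutDemo {

    /** Runs the task on a helper thread; returns the fallback if it takes too long. */
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis, T fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it does not linger
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Cancelling the future in the TimeoutException branch matters: without it, the slow task keeps running in the background even though nobody is waiting for its result.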

Read this discussion about how to handle “TimeoutException” when running Java software. (@StackOverflow).

Conclusion

And that wraps it up! If you’ve followed along the whole way, you should be ready to handle a variety of runtime and compiler errors and exceptions. Feel free to keep both of these articles saved or otherwise bookmarked for quick recall. And for the ultimate Java developer’s toolkit, don’t forget to download The Comprehensive Java Developer’s Guide.

Thorough Introduction to Apache Kafka

Introduction

Kafka is a word that gets heard a lot nowadays… A lot of leading digital companies seem to use it as well. But what is it actually?

Kafka was originally developed at LinkedIn in 2011 and has improved a lot since then. Nowadays it is a whole platform, allowing you to redundantly store absurd amounts of data, have a message bus with huge throughput (millions/sec) and use real-time stream processing on the data that goes through it all at once.

This is all well and great, but stripped down to its core, Kafka is a distributed, horizontally-scalable, fault-tolerant, commit log.

Those were some fancy words, so let’s go through them one by one and see what they mean. Afterwards, we will dive deep into how it works.

Distributed

A distributed system is one which is split into multiple running machines, all of which work together in a cluster to appear as one single node to the end user. Kafka is distributed in the sense that it stores, receives and sends messages on different nodes (called brokers).

The benefits to this approach are high scalability and fault-tolerance.

Horizontally-scalable

Let’s define the term vertical scalability first. Say, for instance, you have a traditional database server which is starting to get overloaded. The way to get this solved is to simply increase the resources (CPU, RAM, SSD) on the server. This is called vertical scaling — where you add more resources to the machine. There are two big disadvantages to scaling upwards:

  1. There are limits defined by the hardware. You cannot scale upwards indefinitely.
  2. It usually requires downtime, something which big corporations cannot afford.

Horizontal scalability is solving the same problem by throwing more machines at it. Adding a new machine does not require downtime, nor is there a hard limit on the number of machines you can have in your cluster. The catch is that not all systems support horizontal scalability, as they are not designed to work in a cluster, and those that do are usually more complex to work with.

Horizontal scaling becomes much cheaper after a certain threshold

Fault-tolerant

Something that emerges in non-distributed systems is that they have a single point of failure (SPoF). If your single database server fails (as machines do) for whatever reason, you’re screwed.

Distributed systems are designed in such a way to accommodate failures in a configurable way. In a 5-node Kafka cluster, you can have it continue working even if 2 of the nodes are down. It is worth noting that fault-tolerance is at a direct tradeoff with performance, as in the more fault-tolerant your system is, the less performant it is.

Commit Log

A commit log (also referred to as write-ahead log, transaction log) is a persistent ordered data structure which only supports appends. You cannot modify nor delete records from it. It is read from left to right and guarantees item ordering.
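
To make the data structure concrete, here is a deliberately tiny, in-memory sketch of an append-only log. It is not persistent and has nothing to do with Kafka's actual implementation; the class `CommitLog` and its method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class CommitLog<T> {

    private final List<T> records = new ArrayList<>();

    /** Appends a record to the end of the log and returns its offset. */
    synchronized long append(T record) {
        records.add(record);
        return records.size() - 1;
    }

    /** Reads the record stored at the given offset. */
    synchronized T read(long offset) {
        return records.get((int) offset);
    }

    /** The offset the next appended record will receive. */
    synchronized long endOffset() {
        return records.size();
    }

    // Intentionally no update or delete methods: the log only supports appends.
}
```

Because records are only ever added at the end, the offset returned by append() identifies a record permanently, and reading in offset order always replays the records in the order they were written.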

Sample illustration of a commit log, taken from here

– Are you telling me that Kafka is such a simple data structure?

In many ways, yes. This structure is at the heart of Kafka and is invaluable, as it provides ordering, which in turn provides deterministic processing. Both of which are non-trivial problems in distributed systems.

Kafka actually stores all of its messages to disk (more on that later) and having them ordered in the structure lets it take advantage of sequential disk reads.

  • Reads and writes take constant time, O(1), when the record ID is known. Compared to the O(log N) disk operations of other structures, this is a huge advantage, as each disk seek is expensive.
  • Reads and writes do not affect one another. Writing does not lock reading and vice versa (as opposed to balanced trees).

These two points have huge performance benefits, since the data size is completely decoupled from performance. Kafka has the same performance whether you have 100KB or 100TB of data on your server.

How does it work?

Applications (producers) send messages (records) to a Kafka node (broker) and said messages are processed by other applications called consumers. Said messages get stored in a topic and consumers subscribe to the topic to receive new messages.

As topics can get quite big, they get split into partitions of a smaller size for better performance and scalability. (For example, if you were storing user login requests, you could split them by the first character of the user’s username.)
Kafka guarantees that all messages inside a partition are ordered in the sequence they came in. The way you distinguish a specific message is through its offset, which you can think of as a normal array index: a sequence number which is incremented for each new message in a partition.
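
The relationship between keys, partitions, and offsets can be sketched in a few lines. This is an in-memory toy, not Kafka's client API: the class `PartitionedTopic` is hypothetical, and the hashCode-based key-to-partition mapping is an assumption chosen for simplicity (real Kafka clients use a different hash function).

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionedTopic {

    private final List<List<String>> partitions = new ArrayList<>();

    PartitionedTopic(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Maps a key to a partition; the same key always lands in the same partition. */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    /** Appends the message to the key's partition and returns its offset there. */
    long send(String key, String message) {
        List<String> partition = partitions.get(partitionFor(key));
        partition.add(message);
        return partition.size() - 1;
    }

    /** Reads one message by partition and offset. */
    String read(int partition, long offset) {
        return partitions.get(partition).get((int) offset);
    }
}
```

Offsets are only meaningful within a single partition, which is why ordering is guaranteed per partition and not across the whole topic.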

Kafka follows the principle of a dumb broker and smart consumer. This means that Kafka does not keep track of which records have been read by the consumer and delete them afterwards, but rather stores them for a set amount of time (e.g. one day) or until some size threshold is met. Consumers themselves poll Kafka for new messages and say which records they want to read. This allows them to increment/decrement the offset they’re at as they wish, thus being able to replay and reprocess events.
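
The following sketch illustrates the “smart consumer” idea with a plain list standing in for one partition on a broker. The class `ReplayingConsumer` is hypothetical; its poll and seek methods are only loosely modelled on the idea and are not the real Kafka consumer API.

```java
import java.util.ArrayList;
import java.util.List;

final class ReplayingConsumer {

    private final List<String> partitionLog; // stands in for one partition held by a broker
    private int offset = 0;                  // tracked by the consumer, not by the broker

    ReplayingConsumer(List<String> partitionLog) {
        this.partitionLog = partitionLog;
    }

    /** Returns up to maxRecords records starting at the current offset and advances it. */
    List<String> poll(int maxRecords) {
        int end = Math.min(offset + maxRecords, partitionLog.size());
        List<String> batch = new ArrayList<>(partitionLog.subList(offset, end));
        offset = end;
        return batch;
    }

    /** Moves to an arbitrary offset, e.g. 0 to reprocess everything from the start. */
    void seek(int newOffset) {
        if (newOffset < 0 || newOffset > partitionLog.size()) {
            throw new IllegalArgumentException("offset out of range: " + newOffset);
        }
        offset = newOffset;
    }
}
```

Since the log itself is never modified by reading, calling seek(0) and polling again simply returns the same records a second time, which is what makes replaying events possible.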

It is worth noting that consumers are actually consumer groups which have one or more consumer processes inside. In order to avoid two processes reading the same message twice, each partition is tied to only one consumer process per group.

Representation of the data flow

Persistence to Disk

As I mentioned earlier, Kafka actually stores all of its records to disk and does not keep anything in RAM. You might be wondering how this is in the slightest way a sane choice. There are numerous optimizations behind this that make it feasible:

  1. Kafka has a protocol which groups messages together. This allows network requests to carry batches of messages and reduces network overhead; the server in turn persists a chunk of messages in one go, and consumers fetch large linear chunks at once.
  2. Linear reads/writes on a disk are fast. The notion that modern disks are slow comes from disk seeks, something that is not an issue in big linear operations.
  3. Said linear operations are heavily optimized by the OS, via read-ahead (prefetching large block multiples) and write-behind (grouping small logical writes into big physical writes) techniques.
  4. Modern OSes cache the disk in free RAM. This is called pagecache.
  5. Since Kafka stores messages in a standardized binary format unmodified throughout the whole flow (producer->broker->consumer), it can make use of the zero-copy optimization. That is when the OS copies data from the pagecache directly to a socket, effectively bypassing the Kafka broker application entirely.

All of these optimizations allow Kafka to deliver messages at near network speed.

Data Distribution & Replication

Let’s talk about how Kafka achieves fault-tolerance and how it distributes data between nodes.

Data Replication

Partition data is replicated across multiple brokers in order to preserve the data in case one broker dies.

At all times, one broker “owns” a partition and is the node through which applications write/read from the partition. This is called a partition leader. It replicates the data it receives to N other brokers, called followers. They store the data as well and are ready to be elected as leader in case the leader node dies.

This helps you configure the guarantee that any successfully published message will not be lost. Having the option to change the replication factor lets you trade performance for stronger durability guarantees, depending on the criticality of the data.

 
4 Kafka brokers with a replication factor of 3

In this way, if a leader ever fails, a follower can take its place.

You may be asking, though:

– How does a producer/consumer know who the leader of a partition is?

For a producer/consumer to write/read from a partition, they need to know its leader, right? This information needs to be available from somewhere.
Kafka stores such metadata in a service called Zookeeper.

What is Zookeeper?

Zookeeper is a distributed key-value store. It is highly-optimized for reads but writes are slower. It is most commonly used to store metadata and handle the mechanics of clustering (heartbeats, distributing updates/configurations, etc).

It allows clients of the service (the Kafka brokers) to subscribe and have changes sent to them once they happen. This is how brokers know when to switch partition leaders. Zookeeper is also extremely fault-tolerant and it ought to be, as Kafka heavily depends on it.

It is used for storing all sorts of metadata, to mention some:

  • Consumer groups’ offsets per partition (although modern clients store offsets in a separate Kafka topic)
  • ACL (Access Control Lists): used for limiting access/authorization
  • Producer & Consumer Quotas: maximum message/sec boundaries
  • Partition Leaders and their health

How does a producer/consumer know who the leader of a partition is?

Producers and consumers used to directly connect and talk to Zookeeper to get this (and other) information. Kafka has been moving away from this coupling and since versions 0.8 and 0.9 respectively, clients fetch metadata information from Kafka brokers directly, who themselves talk to Zookeeper.

 
Metadata Flow

Streaming

In Kafka, a stream processor is anything that takes continual streams of data from input topics, performs some processing on this input and produces a stream of data to output topics (or external services, databases, the trash bin, wherever really…)

It is possible to do simple processing directly with the producer/consumer APIs. However, for more complex transformations like joining streams together, Kafka provides an integrated Streams API library.

This API is intended to be used within your own codebase; it does not run on a broker. It works similarly to the consumer API and helps you scale out the stream processing work over multiple applications (similar to consumer groups).

Stateless Processing

Stateless processing of a stream is deterministic processing that does not depend on anything external. For any given input you will always produce the same output, independent of anything else. An example would be a simple data transformation, such as appending something to a string: "Hello" -> "Hello, World!".

 

Stream-Table Duality

It is important to recognize that streams and tables are essentially the same. A stream can be interpreted as a table and a table can be interpreted as a stream.

Stream as a Table

If you look at how synchronous database replication is achieved, you’ll see that it is through the so-called streaming replication, where each change in a table is sent to a replica server. A Kafka stream can be interpreted in the same way — as a stream of updates for data, in which the aggregate is the final result of the table. Such streams get saved in a local RocksDB (by default) and are called a KTable.

 
Each record increments the aggregated count

Table as a Stream

A table can be looked at as a snapshot of the latest value for each key in a stream. In the same way stream records can produce a table, table updates can produce a changelog stream.

 
Each update produces a snapshot record in the stream
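
The duality can be illustrated without any Kafka dependency. In this sketch (the class `StreamTableDemo` and the "key=count" record format are made up for illustration and are not the Kafka Streams API), a stream of keys is folded into a table of counts, and every table update is emitted again as a changelog record.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class StreamTableDemo {

    /** Stream -> table: fold a stream of keys into the latest count per key. */
    static Map<String, Integer> countByKey(List<String> stream) {
        Map<String, Integer> table = new LinkedHashMap<>();
        for (String key : stream) {
            table.merge(key, 1, Integer::sum);
        }
        return table;
    }

    /** Table -> stream: emit one "key=count" changelog record per table update. */
    static List<String> changelog(List<String> stream) {
        Map<String, Integer> table = new LinkedHashMap<>();
        List<String> changes = new ArrayList<>();
        for (String key : stream) {
            int updated = table.merge(key, 1, Integer::sum);
            changes.add(key + "=" + updated);
        }
        return changes;
    }
}
```

For the input stream alice, bob, alice the table ends up holding alice=2 and bob=1, while the changelog contains three records: alice=1, bob=1, alice=2. Taking the last changelog record per key gives back exactly the table.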

Stateful Processing

Some simple operations like map() or filter() are stateless and do not require you to keep any data regarding the processing. However, in real life, most operations you’ll do will be stateful (e.g. count()) and as such will require you to store the currently accumulated state.

The problem with maintaining state on stream processors is that the stream processors can fail! Where would you need to keep this state in order to be fault-tolerant?

A naive approach is to simply store all state in a remote database and join over the network to that store. The problem with this is that there is no locality of data and lots of network round-trips, both of which will significantly slow down your application. A more subtle but important problem is that your stream processing job’s uptime would be tightly coupled to the remote database and the job will not be self-contained (a change in the database from another team might break your processing).

So what’s a better approach?
Recall the duality of tables and streams. This allows us to convert streams into tables that are co-located with our processing. It also provides us with a mechanism for handling fault tolerance — by storing the streams in a Kafka broker.

A stream processor can keep its state in a local table (e.g. RocksDB), which will be updated from an input stream (after perhaps some arbitrary transformation). When the process fails, it can restore its data by replaying the stream.
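
A rough sketch of this recovery idea follows, with a plain list standing in for the durable stream and a HashMap standing in for the local table. The class `RestorableCounter` is hypothetical and is not how Kafka Streams is implemented.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RestorableCounter {

    private final List<String> durableStream;                   // stands in for a Kafka topic
    private final Map<String, Integer> state = new HashMap<>(); // stands in for the local table

    /** On startup, rebuild the local state by replaying everything in the stream. */
    RestorableCounter(List<String> durableStream) {
        this.durableStream = durableStream;
        for (String key : durableStream) {
            state.merge(key, 1, Integer::sum);
        }
    }

    /** Records an event in the stream first, then updates the local state. */
    int record(String key) {
        durableStream.add(key);
        return state.merge(key, 1, Integer::sum);
    }

    int count(String key) {
        return state.getOrDefault(key, 0);
    }
}
```

If the processor crashes, its local map is lost, but a fresh RestorableCounter constructed over the same stream arrives at the same counts, because the stream rather than the map is the source of truth.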

You could even have a remote database be the producer of the stream, effectively broadcasting a changelog with which you rebuild the table locally.

Stateful processing, joining a KStream with a KTable

KSQL

Normally, you’d be forced to write your stream processing in a JVM language, as that is where the only official Kafka Streams API client is.

 
Sample KSQL setup

Currently in a developer preview, KSQL is a new feature which allows you to write your simple streaming jobs in a familiar SQL-like language.

You set up a KSQL server and interactively query it through a CLI to manage the processing. It works with the same abstractions (KStream & KTable), guarantees the same benefits of the Streams API (scalability, fault-tolerance) and greatly simplifies work with streams.

This might not sound like a lot, but in practice it is very useful for testing things out and even allows people outside of development (e.g. product owners) to play around with stream processing. I encourage you to take a look at the quick-start video and see how simple it is.

Streaming alternatives

Kafka Streams is a perfect mix of power and simplicity. It arguably has the best capabilities for stream jobs on the market and it integrates with Kafka way easier than other stream processing alternatives (Storm, Samza, Spark, Wallaroo).

The problem with most other stream processing frameworks is that they are complex to work with and deploy. A batch processing framework like Spark needs to:

  • Control a large number of jobs over a pool of machines and efficiently distribute them across the cluster.
  • Dynamically package up your code and physically deploy it to the nodes that will execute it (along with configuration, libraries, etc.).

Unfortunately tackling these problems makes the frameworks pretty invasive. They want to control many aspects of how code is deployed, configured, monitored, and packaged.

Kafka Streams lets you roll out your own deployment strategy when you need it, be it Kubernetes, Mesos, Nomad, Docker Swarm, or others.

The underlying motivation of Kafka Streams is to enable all your applications to do stream processing without the operational complexity of running and maintaining yet another cluster. The only potential downside is that it is tightly coupled with Kafka, but in the modern world where most if not all real-time processing is powered by Kafka that may not be a big disadvantage.


When would you use Kafka?

As we already covered, Kafka allows you to have a huge amount of messages go through a centralized medium and store them without worrying about things like performance or data loss.

This means it is perfect for use as the heart of your system’s architecture, acting as a centralized medium that connects different applications. Kafka can be the center piece of an event-driven architecture and allows you to truly decouple applications from one another.

 

Kafka allows you to easily decouple communication between different (micro)services. With the Streams API, it is now easier than ever to write business logic which enriches Kafka topic data for service consumption. The possibilities are huge and I urge you to explore how companies are using Kafka.

Summary

Apache Kafka is a distributed streaming platform capable of handling trillions of events a day. Kafka provides low-latency, high-throughput, fault-tolerant publish and subscribe pipelines and is able to process streams of events.

We went over its basic semantics (producer, broker, consumer, topic), learned about some of its optimizations (pagecache), learned how it’s fault-tolerant by replicating data and were introduced to its powerful streaming abilities.

Kafka has seen large adoption at thousands of companies worldwide, including a third of the Fortune 500. With the continual improvement of Kafka and the recently released first major version 1.0 (1st November, 2017), there are predictions that this streaming platform is going to be as big and central a data platform as relational databases are.

I hope that this introduction helped familiarize you with Apache Kafka and its potential.

Further Reading Resources & Things I did not mention

The rabbit hole goes deeper than this article was able to cover. Here are some features I did not get the chance to mention but are nevertheless important to know:

Connector API — API helping you connect various services to Kafka as a source or sink (PostgreSQL, Redis, ElasticSearch)

Log Compaction — An optimization which reduces log size. Extremely useful in changelog streams

Exactly-once Message Semantics — Guarantee that messages are received exactly once. This is a big deal.

Resources

Confluent Blog — a wealth of information regarding Apache Kafka

Kafka Documentation — Great, extensive, high-quality documentation

Kafka Summit 2017 videos

Thank you for taking the time to read this.

7 Techniques for thread-safe classes

Almost every Java application uses threads. A web server like Tomcat processes each request in a separate worker thread, fat clients process long-running requests in dedicated worker threads, and even batch processes use the java.util.concurrent.ForkJoinPool to improve performance.

It is, therefore, necessary to write classes in a thread-safe way, which can be achieved by one of the following techniques:

No state

When multiple threads access the same instance or static variable, you must somehow coordinate access to this variable. The easiest way to do this is simply to avoid instance and static variables. Methods in classes without instance variables use only local variables and method arguments. The following example shows such a method, which is part of the class java.lang.Math:

public static int subtractExact(int x, int y) {
  int r = x - y;
  if (((x ^ y) & (x ^ r)) < 0) {
      throw new ArithmeticException("integer overflow");
  }
  return r;
}

No shared state

If you cannot avoid state, do not share it. The state should be owned by a single thread only. An example of this technique is the event processing thread of the SWT or Swing graphical user interface frameworks.

You can achieve thread-local instance variables by extending the Thread class and adding an instance variable. In the following example, the fields pool and workQueue are local to a single worker thread.

package java.util.concurrent;
public class ForkJoinWorkerThread extends Thread {
    final ForkJoinPool pool;               
    final ForkJoinPool.WorkQueue workQueue;
}

The other way to achieve thread-local variables is to use the class java.lang.ThreadLocal for the fields you want to make thread-local. Here is an example of an instance variable using java.lang.ThreadLocal:

public class CallbackState {
    public static final ThreadLocal<CallbackStatePerThread> callbackStatePerThread =
        new ThreadLocal<CallbackStatePerThread>() {
            @Override
            protected CallbackStatePerThread initialValue() {
                return getOrCreateCallbackStatePerThread();
            }
        };
}

You wrap the type of your instance variable inside the java.lang.ThreadLocal. You can provide an initial value for your java.lang.ThreadLocal through the method initialValue().

The following shows how to use the instance variable:

CallbackStatePerThread callbackStatePerThread = CallbackState.callbackStatePerThread.get();

By calling the method get(), you receive the object associated with the current thread.
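A minimal, self-contained sketch of the same mechanism, assuming Java 8 or later for ThreadLocal.withInitial. The class ThreadLocalDemo and its id counter are hypothetical and only serve to make the per-thread value observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadLocalDemo {

    // Hands out a distinct id the first time each thread calls get().
    private static final AtomicInteger NEXT_ID = new AtomicInteger(1);
    private static final ThreadLocal<Integer> THREAD_ID =
            ThreadLocal.withInitial(NEXT_ID::getAndIncrement);

    // Calls get() twice on a fresh thread and returns {first, second}.
    static int[] readTwiceOnNewThread() {
        int[] seen = new int[2];
        Thread t = new Thread(() -> {
            seen[0] = THREAD_ID.get();
            seen[1] = THREAD_ID.get(); // same thread, so same value
        });
        t.start();
        try {
            t.join(); // join() makes the writes to 'seen' visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        int[] a = readTwiceOnNewThread();
        int[] b = readTwiceOnNewThread();
        System.out.println(a[0] == a[1]); // prints true: stable within one thread
        System.out.println(a[0] != b[0]); // prints true: each thread has its own value
    }
}
```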

Since application servers use a pool of many threads to process requests, java.lang.ThreadLocal can lead to high memory consumption in this environment. java.lang.ThreadLocal is therefore not recommended for classes executed by the request processing threads of an application server.

Message passing

If you do not share state, as in the techniques above, you need a way for the threads to communicate. One way to do this is by passing messages between threads. You can implement message passing using a concurrent queue from the package java.util.concurrent. Or, better yet, use a framework like Akka, a framework for actor-style concurrency. The following example shows how to send a message with Akka:

target.tell(message, getSelf());

and receive a message:

@Override
public Receive createReceive() {
   return receiveBuilder()
      .match(String.class, s -> System.out.println(s.toLowerCase()))
      .build();
}
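For comparison, here is a rough sketch of the other option mentioned above, message passing over a concurrent queue from java.util.concurrent, without Akka. The class name QueueMessagePassing and the STOP marker are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueMessagePassing {

    private static final String STOP = "__stop__"; // hypothetical end-of-stream marker

    // A producer thread sends the messages; a consumer thread collects the
    // lower-cased form of each one in arrival order.
    static List<String> sendAndReceive(List<String> messages) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> received = new ArrayList<>(); // touched only by the consumer

        Thread consumer = new Thread(() -> {
            try {
                String message;
                while (!(message = queue.take()).equals(STOP)) {
                    received.add(message.toLowerCase());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread producer = new Thread(() -> {
            for (String message : messages) {
                queue.add(message);
            }
            queue.add(STOP);
        });

        consumer.start();
        producer.start();
        try {
            producer.join();
            consumer.join(); // after join(), 'received' is safe to read
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive(Arrays.asList("Hello", "WORLD"))); // prints [hello, world]
    }
}
```

The two threads never touch each other's data directly; the queue is the only shared object, and it handles the synchronization.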

Immutable state

To avoid the problem of the sending thread changing the message while it is being read by another thread, messages should be immutable. The Akka framework, therefore, has the convention that all messages have to be immutable.

When you implement an immutable class, you should declare its fields as final. This not only lets the compiler check that the fields are in fact immutable but also ensures that they are correctly initialized even when the object is incorrectly published. Here is an example of a final instance variable:

public class ExampleFinalField
{
  private final int finalField;
  public ExampleFinalField(int value)
  {
   this.finalField = value;
  }
}

final is a field modifier. It makes the field immutable, not the object the field refers to. So the type of a final field should be a primitive type, as in the example, or an immutable class.
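When a final field has to refer to a mutable type such as a List, one common approach is a defensive copy wrapped in an unmodifiable view. The ImmutableMessage class below is a hypothetical sketch of that approach, not something taken from Akka.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ImmutableMessage {

    private final String topic;       // String is itself immutable
    private final List<String> lines; // must never expose a mutable list

    ImmutableMessage(String topic, List<String> lines) {
        this.topic = topic;
        // Copy first so later changes to the caller's list cannot leak in,
        // then wrap so callers of getLines() cannot modify the copy.
        this.lines = Collections.unmodifiableList(new ArrayList<>(lines));
    }

    String getTopic() {
        return topic;
    }

    List<String> getLines() {
        return lines;
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>();
        source.add("first");
        ImmutableMessage message = new ImmutableMessage("demo", source);

        source.add("second"); // does not affect the message
        System.out.println(message.getLines()); // prints [first]
    }
}
```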

Use the data structures from java.util.concurrent

Message passing uses concurrent queues for the communication between threads. Concurrent queues are one of the data structures provided in the package java.util.concurrent. This package provides classes for concurrent maps, queues, deques, sets, and lists. These data structures are highly optimized and tested for thread safety.
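As a small illustration, the sketch below lets several threads update one ConcurrentHashMap entry through merge(), which performs the read-modify-write atomically. The class ConcurrentHitCounter and the "page" key are made up for the example.

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentHitCounter {

    // Each of 'threads' workers records 'perThread' hits for the same key.
    static int countHits(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    hits.merge("page", 1, Integer::sum); // atomic per-key update
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return hits.getOrDefault("page", 0);
    }

    public static void main(String[] args) {
        System.out.println(countHits(4, 1000)); // prints 4000
    }
}
```

No explicit lock appears in the calling code; the map guarantees that no increment is lost.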

Synchronized blocks

If you cannot use one of the above techniques, use synchronized blocks. By putting code inside a synchronized block, you make sure that only one thread at a time can execute this section.

synchronized(lock)
{
 i++;
}

Beware that when you use multiple nested synchronized blocks, you risk deadlocks. A deadlock happens when two threads each try to acquire a lock held by the other thread.
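One widely used way to avoid this kind of deadlock is to always acquire nested locks in the same global order. The sketch below assumes every Account has a unique id that defines that order; the OrderedLocking and Account names are hypothetical.

```java
class OrderedLocking {

    static final class Account {
        final int id;  // assumed unique; used only to fix the lock order
        long balance;  // guarded by this account's monitor

        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    // Always locks the account with the smaller id first, so two opposite
    // transfers can never each hold one lock while waiting for the other.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = first == from ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Runs a->b and b->a transfers concurrently and returns both balances.
    static long[] runOpposingTransfers(int rounds) {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] {a.balance, b.balance};
    }

    public static void main(String[] args) {
        long[] balances = runOpposingTransfers(10_000);
        System.out.println(balances[0] + " " + balances[1]); // prints 1000 1000
    }
}
```

If transfer() instead locked 'from' first and 'to' second, the two threads in runOpposingTransfers could each take one lock and wait forever for the other.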

Volatile fields

Normal, non-volatile fields can be cached in registers or caches. By declaring a variable volatile, you tell the JVM and the compiler to always return the latest written value. This applies not only to the variable itself but to all values written by the thread which has written to the volatile field. The following shows an example of a volatile instance variable:

public class ExampleVolatileField
{
  private volatile int  volatileField;
}

You can use volatile fields if the writes do not depend on the current value, or if you can make sure that only one thread at a time can update the field.

volatile is a field modifier. It makes the field itself volatile, not the object it references. In the case of an array, you need to use java.util.concurrent.atomic.AtomicReferenceArray to access the array elements in a volatile way. See the race condition in org.springframework.util.ConcurrentReferenceHashMap as an example of this error.
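A short sketch of element-level access through AtomicReferenceArray. The SlotTable class and its method names are invented for illustration; only the AtomicReferenceArray calls (set, get, compareAndSet) are part of the JDK.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class SlotTable {

    // Every element read and write below has volatile semantics, which a
    // 'volatile String[]' field would NOT give to its elements.
    private final AtomicReferenceArray<String> slots;

    SlotTable(int size) {
        this.slots = new AtomicReferenceArray<>(size);
    }

    void publish(int index, String value) {
        slots.set(index, value); // volatile write of one element
    }

    String read(int index) {
        return slots.get(index); // volatile read of one element
    }

    // Fills the slot only if it is still empty; returns whether we won.
    boolean claim(int index, String value) {
        return slots.compareAndSet(index, null, value);
    }

    public static void main(String[] args) {
        SlotTable table = new SlotTable(2);
        System.out.println(table.claim(0, "a")); // prints true
        System.out.println(table.claim(0, "b")); // prints false: slot already taken
        System.out.println(table.read(0));       // prints a
    }
}
```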

Even more techniques

I excluded the following more advanced techniques from this list: atomic updates, a technique in which you call atomic instructions like compare-and-set provided by the CPU; java.util.concurrent.locks.ReentrantLock, a lock implementation which provides more flexibility than synchronized blocks; java.util.concurrent.locks.ReentrantReadWriteLock, a lock implementation in which reads do not block reads; and java.util.concurrent.locks.StampedLock, a non-reentrant read-write lock with the possibility to optimistically read values.
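To give a flavor of the first two, here is a hedged sketch: a compare-and-set retry loop on an AtomicInteger and a counter guarded by a ReentrantLock. The class AtomicAndLockSketch and its methods are hypothetical names chosen for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class AtomicAndLockSketch {

    private final AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);
    private final ReentrantLock lock = new ReentrantLock();
    private int lockedCounter; // guarded by 'lock'

    // Atomic update: retry the compare-and-set until no other thread interfered.
    void offerMax(int candidate) {
        while (true) {
            int current = max.get();
            if (candidate <= current) {
                return; // stored value is already at least as large
            }
            if (max.compareAndSet(current, candidate)) {
                return; // our value was installed
            }
            // another thread changed 'max' in between: loop and re-read
        }
    }

    int currentMax() {
        return max.get();
    }

    // ReentrantLock: mutual exclusion like synchronized, but acquired explicitly.
    int incrementLocked() {
        lock.lock();
        try {
            return ++lockedCounter;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    public static void main(String[] args) {
        AtomicAndLockSketch sketch = new AtomicAndLockSketch();
        sketch.offerMax(3);
        sketch.offerMax(7);
        sketch.offerMax(5);
        System.out.println(sketch.currentMax());      // prints 7
        System.out.println(sketch.incrementLocked()); // prints 1
    }
}
```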

Conclusion

The best way to achieve thread safety is to avoid shared state. For the state you need to share, you can use either message passing together with immutable classes, or the concurrent data structures together with synchronized blocks and volatile fields.

I would be glad to hear from you about the techniques you use to achieve thread-safe classes.

What Does RandomAccess Mean?

RandomAccess is a marker interface, like the Serializable and Cloneable interfaces. None of these marker interfaces defines methods. Instead, they identify a class as having a particular capability. In the case of Serializable, the interface specifies that if the class is serialized using the serialization I/O classes, a NotSerializableException will not be thrown (unless the object contains some other class that cannot be serialized). Cloneable similarly indicates that the use of the Object.clone( ) method for a Cloneable class will not throw a CloneNotSupportedException.

The RandomAccess interface identifies that a particular java.util.List implementation has fast random access. (A more accurate name for the interface would have been “FastRandomAccess.”) This interface tries to define an imprecise concept: what exactly is fast? The documentation provides a simple guide: if repeated access using the List.get( ) method is faster than repeated access using the Iterator.next( ) method, then the List has fast random access. The two types of access are shown in the following code examples.

Repeated access using List.get( ):

Object o;
for (int i=0, n=list.size(  ); i < n; i++)
  o = list.get(i);

Repeated access using Iterator.next( ):

Object o;
for (Iterator itr=list.iterator(  ); itr.hasNext(  ); )
  o = itr.next(  );

A third loop combines the previous two loops to avoid the repeated Iterator.hasNext( ) test on each loop iteration:

Object o;
Iterator itr=list.iterator(  );
for (int i=0, n=list.size(  ); i < n; i++)
  o = itr.next(  );

This last loop relies on the normal situation where List objects cannot change in size while they are being iterated through without an exception of some sort occurring. So, because the loop size remains the same, you can simply count the accessed elements without testing at each iteration whether the end of the list has been reached. This last loop is generally faster than the previous loop with the Iterator.hasNext( ) test. In the context of the RandomAccess interface, the first loop using List.get( ) should be faster than both the other loops that use Iterator.next( ) for a list to implement RandomAccess.

How Is RandomAccess Used?

So now that we know what RandomAccess means, how do we use it? There are two aspects to using the other marker interfaces, Serializable and Cloneable: defining classes that implement them and using their capabilities via ObjectInput/ObjectOutput and Object.clone( ), respectively. RandomAccess is a little different. Of course, we still need to decide whether any particular class implements it, but the possible classes are severely restricted: RandomAccess should be implemented only in java.util.List classes. And most such classes are created outside of projects. The SDK provides the most frequently used implementations, and subclasses of the SDK classes do not need to implement RandomAccess because they automatically inherit the capability where appropriate.

The second aspect, using the RandomAccess capability, is also different. Whether a class is Serializable or Cloneable is automatically detected when you use ObjectInput/ObjectOutput and Object.clone( ). But RandomAccess has no such automatic support. Instead, you need to explicitly check whether a class implements RandomAccess using the instanceof operator:

if (listObject instanceof RandomAccess)
  ...

You must then explicitly choose the appropriate access method, List.get( ) or Iterator.next( ). Clearly, if we tested for RandomAccess on every loop iteration, we would be making a lot of redundant calls and probably losing the benefit of RandomAccess as well. So the pattern to follow in using RandomAccess makes the test outside the loop. The canonical pattern looks like this:

Object o;
if (listObject instanceof RandomAccess)
{
  //fast random access: index-based loop
  for (int i=0, n=listObject.size(  ); i < n; i++)
  {
    o = listObject.get(i);
    //do something with object o
  }
}
else
{
  //no fast random access: walk the list with its iterator
  Iterator itr = listObject.iterator(  );
  for (int i=0, n=listObject.size(  ); i < n; i++)
  {
    o = itr.next(  );
    //do something with object o
  }
}
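The same pattern can be written as a self-contained, generics-based sketch for a current JDK. The class RandomAccessSum and its sum method are hypothetical; summing the elements merely stands in for "do something with object o".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RandomAccessSum {

    // Sums the list, choosing the access method once, outside the loop.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // e.g. ArrayList: indexed access is cheap
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            // e.g. LinkedList: get(i) would walk the nodes every time
            Iterator<Integer> itr = list.iterator();
            for (int i = 0, n = list.size(); i < n; i++) {
                total += itr.next();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        System.out.println(sum(new ArrayList<>(values)));  // prints 6
        System.out.println(sum(new LinkedList<>(values))); // prints 6
    }
}
```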

Speedup from RandomAccess

I tested the four code loops shown in this section, using the 1.4 release, separately testing the -client (default) and -server options. To test the effect of the RandomAccess interface, I used the java.util.ArrayList and java.util.LinkedList classes. ArrayList implements RandomAccess, while LinkedList does not. ArrayList has an underlying implementation consisting of an array with constant access time for any element, so using the ArrayList iterator is equivalent to using the ArrayList.get( ) method but with some additional overhead. LinkedList has an underlying implementation consisting of linked node objects with access time proportional to the shortest distance of the element from either end of the list, whereas iterating sequentially through the list can shortcut the access time by traversing one node after another.

Times shown are the average of three runs, and all times have been normalized to the first table cell, i.e., the time taken by the ArrayList to iterate the list using the List.get( ) method in client mode.

Loop type (loop test) and access method             | ArrayList java -client | LinkedList java -client | ArrayList java -server | LinkedList java -server
----------------------------------------------------|------------------------|-------------------------|------------------------|------------------------
loop counter (i<n) and List.get( )                  | 100%                   | too long                | 77.5%                  | too long
iterator (Iterator.hasNext( )) and Iterator.next( ) | 141%                   | 219%                    | 109%                   | 213%
iterator (i<n) and Iterator.next( )                 | 121%                   | 205%                    | 98%                    | 193%
RandomAccess test with loop from row 1 or 3         | 100%                   | 205%                    | 77.5%                  | 193%

The most important results are in the last two rows. The last row shows the times obtained by making full use of the RandomAccess interface, and the row before that shows the best general technique for iterating lists if RandomAccess is not available. The size of the lists I used for the test (and consequently the number of loop iterations required to access every element) was sufficiently large that the instanceof test had no measurable cost in comparison to the time taken to run the loop. Consequently, we can see that there was no cost (but also no benefit) in adding the instanceof RandomAccess test when iterating the LinkedList, whereas the ArrayList was iterated more than 20% quicker when the instanceof test was included.

Forward and Backward Compatibility

Can you use RandomAccess and maintain backward compatibility with VM versions prior to 1.4? There are three aspects to using RandomAccess:

  • You may want to include code referencing RandomAccess without moving to 1.4.

  • Many projects need their code to be able to run in any VM, so the code needs to be backward-compatible to run in VMs using releases earlier than 1.4, where RandomAccess does not exist.

  • You will want to make your code forward-compatible so that it automatically takes advantage of RandomAccess when running in a 1.4+ JVM.

Making RandomAccess available to your development environment is the first issue, and if you are using an environment prior to 1.4, this can be as simple as adding the RandomAccess interface to your classpath. Any version of the SDK can create the RandomAccess interface. The definition for RandomAccess is:

package java.util;
public interface RandomAccess {  }

We also need to handle RandomAccess in the runtime environment. For pre-1.4 environments, the test:

if (listObject instanceof RandomAccess)

generates a NoClassDefFoundError at runtime when the JVM tries to load the RandomAccess class (for the instanceof test to be evaluated, the class has to be loaded). However, we can guard the test so that it is executed only if RandomAccess is available. The simplest way to do this is to check whether RandomAccess exists, setting a boolean guard as the outcome of that test:

static boolean RandomAccessExists;
...
   
  //execute this as early as possible after the application starts
  try
  {
    Class c =  Class.forName("java.util.RandomAccess");
    RandomAccessExists = true;
  }
  catch (ClassNotFoundException e)
  {
    RandomAccessExists = false;
  }

Finally, we need to change our instanceof tests to use the RandomAccessExists variable as a guard:

if (RandomAccessExists && (listObject instanceof RandomAccess) )

With the guarded instanceof test, the code automatically reverts to the Iterator loop if RandomAccess does not exist and should avoid throwing a NoClassDefFoundError in pre-1.4 JVMs. And, of course, the guarded instanceof test also automatically uses the faster loop branch when RandomAccess does exist and the list object implements it.

Java Volatile Keyword

The Java volatile keyword is used to mark a Java variable as “being stored in main memory”. More precisely, that means that every read of a volatile variable will be read from the computer’s main memory, and not from the CPU cache, and that every write to a volatile variable will be written to main memory, and not just to the CPU cache.

Actually, since Java 5 the volatile keyword guarantees more than just that volatile variables are written to and read from main memory. I will explain that in the following sections.

The Java volatile Visibility Guarantee

The Java volatile keyword guarantees visibility of changes to variables across threads. This may sound a bit abstract, so let me elaborate.

In a multithreaded application where the threads operate on non-volatile variables, each thread may copy variables from main memory into a CPU cache while working on them, for performance reasons. If your computer contains more than one CPU, each thread may run on a different CPU. That means that each thread may copy the variables into the CPU cache of a different CPU. This is illustrated here:

[Figure: java-volatile-1]

With non-volatile variables there are no guarantees about when the Java Virtual Machine (JVM) reads data from main memory into CPU caches, or writes data from CPU caches to main memory. This can cause several problems which I will explain in the following sections.

Imagine a situation in which two or more threads have access to a shared object which contains a counter variable declared like this:

public class SharedObject {

    public int counter = 0;

}

Imagine too, that only Thread 1 increments the counter variable, but both Thread 1 and Thread 2 may read the counter variable from time to time.

If the counter variable is not declared volatile, there is no guarantee about when the value of the counter variable is written from the CPU cache back to main memory. This means that the counter variable value in the CPU cache may not be the same as in main memory. This situation is illustrated here:

[Figure: java-volatile-2]

The problem of threads not seeing the latest value of a variable because it has not yet been written back to main memory by another thread is called a “visibility” problem. The updates of one thread are not visible to other threads.

By declaring the counter variable volatile all writes to the counter variable will be written back to main memory immediately. Also, all reads of the counter variable will be read directly from main memory. Here is how the volatile declaration of the counter variable looks:

public class SharedObject {

    public volatile int counter = 0;

}

Declaring a variable volatile thus guarantees the visibility for other threads of writes to that variable.
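A runnable sketch of this single-writer scenario, reusing the SharedObject shape from above. The surrounding class VolatileVisibilityDemo and the readerSees helper are assumptions made for the example; the reader simply busy-waits until it observes the writer's final value.

```java
class VolatileVisibilityDemo {

    static class SharedObject {
        public volatile int counter = 0;
    }

    // One writer thread increments the counter up to 'target'; one reader
    // thread spins until it observes that value and then reports it.
    static int readerSees(int target) {
        SharedObject shared = new SharedObject();
        int[] observed = new int[1];

        Thread reader = new Thread(() -> {
            int value;
            while ((value = shared.counter) < target) {
                // busy-wait; every loop iteration performs a volatile read
            }
            observed[0] = value;
        });
        Thread writer = new Thread(() -> {
            for (int i = 0; i < target; i++) {
                // safe only because this is the single writing thread
                shared.counter = shared.counter + 1;
            }
        });

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join(); // join() makes 'observed' visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println(readerSees(1000)); // prints 1000
    }
}
```

Without volatile on counter, the reader would have no guarantee of ever seeing the writer's updates, so the loop might not terminate.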

The Java volatile Happens-Before Guarantee

Since Java 5 the volatile keyword guarantees more than just the reading from and writing to main memory of variables. Actually, the volatile keyword guarantees this:

  • If Thread A writes to a volatile variable and Thread B subsequently reads the same volatile variable, then all variables visible to Thread A before writing the volatile variable, will also be visible to Thread B after it has read the volatile variable.
  • The reading and writing instructions of volatile variables cannot be reordered by the JVM (the JVM may reorder instructions for performance reasons as long as the JVM detects no change in program behaviour from the reordering). Instructions before and after can be reordered, but the volatile read or write cannot be mixed with these instructions. Whatever instructions follow a read or write of a volatile variable are guaranteed to happen after the read or write.

These statements require a deeper explanation.

When a thread writes to a volatile variable, not just the volatile variable itself is written to main memory. All other variables changed by the thread before writing to the volatile variable are also flushed to main memory. When a thread reads a volatile variable, it will also read from main memory all other variables which were flushed to main memory together with the volatile variable.

Look at this example:

Thread A:
    sharedObject.nonVolatile = 123;
    sharedObject.counter     = sharedObject.counter + 1;

Thread B:
    int counter     = sharedObject.counter;
    int nonVolatile = sharedObject.nonVolatile;

Since Thread A writes the non-volatile variable sharedObject.nonVolatile before writing to the volatile sharedObject.counter, both sharedObject.nonVolatile and sharedObject.counter are written to main memory when Thread A writes to sharedObject.counter (the volatile variable).

Since Thread B starts by reading the volatile sharedObject.counter, both sharedObject.counter and sharedObject.nonVolatile are read from main memory into the CPU cache used by Thread B. By the time Thread B reads sharedObject.nonVolatile, it will see the value written by Thread A.

Developers may use this extended visibility guarantee to optimize the visibility of variables between threads. Instead of declaring each and every variable volatile, only one or a few need be declared volatile. Here is an example of a simple Exchanger class written according to that principle:

public class Exchanger {

    private Object   object       = null;
    private volatile boolean hasNewObject = false;

    public void put(Object newObject) {
        while(hasNewObject) {
            //wait - do not overwrite existing new object
        }
        object = newObject;
        hasNewObject = true; //volatile write
    }

    public Object take(){
        while(!hasNewObject){ //volatile read
            //wait - don't take old object (or null)
        }
        Object obj = object;
        hasNewObject = false; //volatile write
        return obj;
    }
}

Thread A may be putting objects from time to time by calling put(). Thread B may take objects from time to time by calling take(). This Exchanger can work just fine using a volatile variable (without the use of synchronized blocks), as long as only Thread A calls put() and only Thread B calls take().

However, the JVM may reorder Java instructions to optimize performance, if the JVM can do so without changing the semantics of the reordered instructions. What would happen if the JVM switched the order of the reads and writes inside put() and take()? What if put() was really executed like this:

while(hasNewObject) {
    //wait - do not overwrite existing new object
}
hasNewObject = true; //volatile write
object = newObject;

Notice the write to the volatile variable hasNewObject is now executed before the new object is actually set. To the JVM this may look completely valid. The values of the two write instructions do not depend on each other.

However, reordering the instruction execution would harm the visibility of the object variable. First of all, Thread B might see hasNewObject set to true before Thread A has actually written a new value to the object variable. Second, there is now not even a guarantee about when the new value written to object will be flushed back to main memory (other than the next time Thread A writes to a volatile variable somewhere).

To prevent situations like the one described above from occurring, the volatile keyword comes with a “happens-before guarantee”. The happens-before guarantee ensures that read and write instructions of volatile variables cannot be reordered. Instructions before and after can be reordered, but the volatile read/write instruction cannot be reordered with any instruction occurring before or after it.

Look at this example:

sharedObject.nonVolatile1 = 123;
sharedObject.nonVolatile2 = 456;
sharedObject.nonVolatile3 = 789;

sharedObject.volatileFlag = true; //a volatile variable

int someValue1 = sharedObject.nonVolatile4;
int someValue2 = sharedObject.nonVolatile5;
int someValue3 = sharedObject.nonVolatile6;

The JVM may reorder the first 3 instructions, as long as all of them happen before the volatile write instruction (they must all be executed before the volatile write instruction).

Similarly, the JVM may reorder the last 3 instructions as long as the volatile write instruction happens before all of them. None of the last 3 instructions can be reordered to before the volatile write instruction.

That is basically the meaning of the Java volatile happens before guarantee.

volatile is Not Always Enough

Even if the volatile keyword guarantees that all reads of a volatile variable are read directly from main memory, and all writes to a volatile variable are written directly to main memory, there are still situations where it is not enough to declare a variable volatile.

In the situation explained earlier where only Thread 1 writes to the shared counter variable, declaring the counter variable volatile is enough to make sure that Thread 2 always sees the latest written value.

In fact, multiple threads could even be writing to a shared volatile variable, and still have the correct value stored in main memory, if the new value written to the variable does not depend on its previous value. In other words, if a thread writing a value to the shared volatile variable does not first need to read its value to figure out its next value.

As soon as a thread needs to first read the value of a volatile variable, and based on that value generate a new value for the shared volatile variable, a volatile variable is no longer enough to guarantee correct visibility. The short time gap between the reading of the volatile variable and the writing of its new value creates a race condition where multiple threads might read the same value of the volatile variable, generate a new value for the variable, and, when writing the value back to main memory, overwrite each other’s values.

The situation where multiple threads are incrementing the same counter is exactly such a situation where a volatile variable is not enough. The following sections explain this case in more detail.

Imagine that Thread 1 reads a shared counter variable with the value 0 into its CPU cache, increments it to 1, and does not write the changed value back into main memory. Thread 2 could then read the same counter variable from main memory, where the value of the variable is still 0, into its own CPU cache. Thread 2 could then also increment the counter to 1, and also not write it back to main memory. This situation is illustrated in the diagram below:

[Figure: java-volatile-3]

Thread 1 and Thread 2 are now practically out of sync. The real value of the shared counter variable should have been 2, but each of the threads has the value 1 for the variable in their CPU caches, and in main memory the value is still 0. It is a mess! Even if the threads eventually write their value for the shared counter variable back to main memory, the value will be wrong.

When is volatile Enough?

As I have mentioned earlier, if two threads are both reading and writing to a shared variable, then using the volatile keyword for that is not enough. You need to use a synchronized block in that case to guarantee that the reading and writing of the variable is atomic. Reading or writing a volatile variable does not block other threads from reading or writing it. For this to happen you must use the synchronized keyword around critical sections.

As an alternative to a synchronized block, you could also use one of the many atomic data types found in the java.util.concurrent.atomic package, for instance AtomicLong or AtomicReference.
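Both options can be sketched side by side. In the hypothetical SharedCounters class below, several threads increment one counter through synchronized methods and another through AtomicLong; in both cases no increment is lost.

```java
import java.util.concurrent.atomic.AtomicLong;

class SharedCounters {

    private long syncCount; // guarded by 'this'
    private final AtomicLong atomicCount = new AtomicLong();

    synchronized void incrementSynchronized() {
        syncCount++; // read-modify-write is atomic inside the monitor
    }

    synchronized long getSynchronized() {
        return syncCount;
    }

    void incrementAtomic() {
        atomicCount.incrementAndGet(); // atomic without an explicit lock
    }

    long getAtomic() {
        return atomicCount.get();
    }

    // Starts 'threads' threads that each increment both counters 'perThread' times.
    static SharedCounters run(int threads, int perThread) {
        SharedCounters counters = new SharedCounters();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counters.incrementSynchronized();
                    counters.incrementAtomic();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        SharedCounters counters = run(4, 10_000);
        System.out.println(counters.getSynchronized()); // prints 40000
        System.out.println(counters.getAtomic());       // prints 40000
    }
}
```

A plain volatile long with counter++ in the same test could end up below 40000, because the increment is a separate read and write.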

In case only one thread reads and writes the value of a volatile variable and other threads only read the variable, then the reading threads are guaranteed to see the latest value written to the volatile variable. Without making the variable volatile, this would not be guaranteed.

The volatile keyword is guaranteed to work on 32-bit and 64-bit variables.

Performance Considerations of volatile

Reading and writing of volatile variables causes the variable to be read from or written to main memory. Reading from and writing to main memory is more expensive than accessing the CPU cache. Accessing volatile variables also prevents instruction reordering, which is a normal performance enhancement technique. Thus, you should only use volatile variables when you really need to enforce visibility of variables.

Comparable and Comparator in Java Example

Comparable and Comparator in Java are very useful for sorting collections of objects. Java provides some built-in methods to sort arrays of primitive types, arrays of wrapper classes, and lists. Here we will first learn how we can sort an array/list of primitive types and wrapper classes, and then we will use the java.lang.Comparable and java.util.Comparator interfaces to sort an array/list of custom classes.

Let’s see how we can sort primitive types or Object array and list with a simple program.

package com.journaldev.sort;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class JavaObjectSorting {

    /**
     * This class shows how to sort primitive arrays, 
     * Wrapper classes Object Arrays
     * @param args
     */
    public static void main(String[] args) {
        //sort primitives array like int array
        int[] intArr = {5,9,1,10};
        Arrays.sort(intArr);
        System.out.println(Arrays.toString(intArr));
        
        //sorting String array
        String[] strArr = {"A", "C", "B", "Z", "E"};
        Arrays.sort(strArr);
        System.out.println(Arrays.toString(strArr));
        
        //sorting list of objects of Wrapper classes
        List<String> strList = new ArrayList<String>();
        strList.add("A");
        strList.add("C");
        strList.add("B");
        strList.add("Z");
        strList.add("E");
        Collections.sort(strList);
        for(String str: strList) System.out.print(" "+str);
    }
}

Output of the above program is:

[1, 5, 9, 10]
[A, B, C, E, Z]
 A B C E Z

Now let’s try to sort an array of objects.

package com.journaldev.sort;

public class Employee {

    private int id;
    private String name;
    private int age;
    private long salary;

    public int getId() {
        return id;
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    public long getSalary() {
        return salary;
    }

    public Employee(int id, String name, int age, int salary) {
        this.id = id;
        this.name = name;
        this.age = age;
        this.salary = salary;
    }

    @Override
    //this is overriden to print the user friendly information about the Employee
    public String toString() {
        return "[id=" + this.id + ", name=" + this.name + ", age=" + this.age + ", salary=" +
                this.salary + "]";
    }

}

Here is the code I used to sort the array of Employee objects.

//sorting object array
Employee[] empArr = new Employee[4];
empArr[0] = new Employee(10, "Mikey", 25, 10000);
empArr[1] = new Employee(20, "Arun", 29, 20000);
empArr[2] = new Employee(5, "Lisa", 35, 5000);
empArr[3] = new Employee(1, "Pankaj", 32, 50000);

//sorting employees array using Comparable interface implementation
Arrays.sort(empArr);
System.out.println("Default Sorting of Employees list:\n"+Arrays.toString(empArr));

When I tried to run this, it threw the following runtime exception.

Exception in thread "main" java.lang.ClassCastException: com.journaldev.sort.Employee cannot be cast to java.lang.Comparable
	at java.util.ComparableTimSort.countRunAndMakeAscending(ComparableTimSort.java:290)
	at java.util.ComparableTimSort.sort(ComparableTimSort.java:157)
	at java.util.ComparableTimSort.sort(ComparableTimSort.java:146)
	at java.util.Arrays.sort(Arrays.java:472)
	at com.journaldev.sort.JavaSorting.main(JavaSorting.java:41)

Comparable and Comparator

Java provides the Comparable interface, which any custom class should implement if we want to use the Arrays or Collections sorting methods on it. The Comparable interface has a compareTo(T obj) method that the sorting methods use; you can check any wrapper class, String, or Date to confirm this. We should implement this method so that it returns a negative integer, zero, or a positive integer if “this” object is less than, equal to, or greater than the object passed as the argument.

After implementing Comparable interface in Employee class, here is the resulting Employee class.

package com.journaldev.sort;

import java.util.Comparator;

public class Employee implements Comparable<Employee> {

    private int id;
    private String name;
    private int age;
    private long salary;

    public int getId() {
        return id;
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    public long getSalary() {
        return salary;
    }

    public Employee(int id, String name, int age, int salary) {
        this.id = id;
        this.name = name;
        this.age = age;
        this.salary = salary;
    }

    @Override
    public int compareTo(Employee emp) {
        //let's sort the employee based on id in ascending order
        //returns a negative integer, zero, or a positive integer as this employee id
        //is less than, equal to, or greater than the specified object.
        return (this.id - emp.id);
    }

    @Override
    //this is required to print the user friendly information about the Employee
    public String toString() {
        return "[id=" + this.id + ", name=" + this.name + ", age=" + this.age + ", salary=" +
                this.salary + "]";
    }

}

Now when we execute the above snippet for Arrays sorting of Employees and print it, here is the output.

Default Sorting of Employees list:
[[id=1, name=Pankaj, age=32, salary=50000], [id=5, name=Lisa, age=35, salary=5000], [id=10, name=Mikey, age=25, salary=10000], [id=20, name=Arun, age=29, salary=20000]]

As you can see, the Employees array is sorted by id in ascending order.
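One caveat worth illustrating: a compareTo that subtracts one int from another works for small ids like these, but the subtraction can overflow for values of large magnitude and then report the wrong sign. The sketch below contrasts it with Integer.compare from the JDK. The class and method names are hypothetical and exist only for this illustration.

```java
public class CompareOverflowSketch {

    // Subtraction-based comparison: gives the wrong sign when a - b overflows int.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Overflow-safe comparison from the standard library.
    static int bySafeCompare(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        int small = Integer.MIN_VALUE;
        int large = 1;
        // small is less than large, so a correct comparison must be negative.
        System.out.println(bySubtraction(small, large) < 0); // prints false (overflow)
        System.out.println(bySafeCompare(small, large) < 0); // prints true
    }
}
```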

But in most real-life scenarios, we want to sort on different parameters. For example, a CEO might want to sort employees by salary, while an HR manager might sort them by age. This is where we need the Java Comparator interface, because a Comparable.compareTo(Object o) implementation defines only one ordering and we can’t choose the field to sort on at the call site.

Java Comparator

The Comparator interface requires implementing the compare(Object o1, Object o2) method, which takes two arguments. It should return a negative int if the first argument is less than the second, zero if they are equal, and a positive int if the first argument is greater than the second.

The Comparable and Comparator interfaces use Generics for compile-time type checking.

Here is how we can create different Comparator implementations in the Employee class.

/**
     * Comparator to sort employees list or array in order of Salary
     */
    public static Comparator<Employee> SalaryComparator = new Comparator<Employee>() {

        @Override
        public int compare(Employee e1, Employee e2) {
            //Long.compare avoids truncating the long difference to an int
            return Long.compare(e1.getSalary(), e2.getSalary());
        }
    };

    /**
     * Comparator to sort employees list or array in order of Age
     */
    public static Comparator<Employee> AgeComparator = new Comparator<Employee>() {

        @Override
        public int compare(Employee e1, Employee e2) {
            return e1.getAge() - e2.getAge();
        }
    };

    /**
     * Comparator to sort employees list or array in order of Name
     */
    public static Comparator<Employee> NameComparator = new Comparator<Employee>() {

        @Override
        public int compare(Employee e1, Employee e2) {
            return e1.getName().compareTo(e2.getName());
        }
    };

All the above implementations of Comparator interface are anonymous classes.

We can pass these comparators as an argument to the sort methods of the Arrays and Collections classes.

//sort employees array using Comparator by Salary
Arrays.sort(empArr, Employee.SalaryComparator);
System.out.println("Employees list sorted by Salary:\n"+Arrays.toString(empArr));

//sort employees array using Comparator by Age
Arrays.sort(empArr, Employee.AgeComparator);
System.out.println("Employees list sorted by Age:\n"+Arrays.toString(empArr));

//sort employees array using Comparator by Name
Arrays.sort(empArr, Employee.NameComparator);
System.out.println("Employees list sorted by Name:\n"+Arrays.toString(empArr));

Here is the output of the above code snippet:

Employees list sorted by Salary:
[[id=5, name=Lisa, age=35, salary=5000], [id=10, name=Mikey, age=25, salary=10000], [id=20, name=Arun, age=29, salary=20000], [id=1, name=Pankaj, age=32, salary=50000]]
Employees list sorted by Age:
[[id=10, name=Mikey, age=25, salary=10000], [id=20, name=Arun, age=29, salary=20000], [id=1, name=Pankaj, age=32, salary=50000], [id=5, name=Lisa, age=35, salary=5000]]
Employees list sorted by Name:
[[id=20, name=Arun, age=29, salary=20000], [id=5, name=Lisa, age=35, salary=5000], [id=10, name=Mikey, age=25, salary=10000], [id=1, name=Pankaj, age=32, salary=50000]]

So now we know that to sort an array or list of Java objects, we implement the Comparable interface to provide the default ordering and implement the Comparator interface to provide alternative orderings.
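Since Java 8, the Comparator interface also offers static factory methods such as comparing, comparingInt and comparingLong, which build a comparator from a getter. The sketch below is one possible way to express the three orderings above with them. The nested Emp class is a hypothetical, trimmed-down stand-in for the article's Employee class so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ComparatorFactorySketch {

    // Minimal stand-in for Employee (hypothetical, only three fields).
    static final class Emp {
        private final String name;
        private final int age;
        private final long salary;

        Emp(String name, int age, long salary) {
            this.name = name;
            this.age = age;
            this.salary = salary;
        }

        String getName() { return name; }
        int getAge() { return age; }
        long getSalary() { return salary; }
    }

    // Each comparator is built from a getter instead of an anonymous class.
    static final Comparator<Emp> BY_SALARY = Comparator.comparingLong(Emp::getSalary);
    static final Comparator<Emp> BY_AGE = Comparator.comparingInt(Emp::getAge);
    static final Comparator<Emp> BY_NAME = Comparator.comparing(Emp::getName);

    // Sorts the same sample data as the article and returns the names in order.
    static List<String> namesSortedBy(Comparator<Emp> order) {
        List<Emp> emps = new ArrayList<>(Arrays.asList(
                new Emp("Mikey", 25, 10000),
                new Emp("Arun", 29, 20000),
                new Emp("Lisa", 35, 5000),
                new Emp("Pankaj", 32, 50000)));
        emps.sort(order);
        List<String> names = new ArrayList<>();
        for (Emp e : emps) {
            names.add(e.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesSortedBy(BY_SALARY)); // [Lisa, Mikey, Arun, Pankaj]
        System.out.println(namesSortedBy(BY_AGE));    // [Mikey, Arun, Pankaj, Lisa]
        System.out.println(namesSortedBy(BY_NAME));   // [Arun, Lisa, Mikey, Pankaj]
    }
}
```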

We can also create a separate class that implements the Comparator interface and then use it.

Here are the final classes we have for explaining Comparable and Comparator in Java.

package com.journaldev.sort;

import java.util.Comparator;

public class Employee implements Comparable<Employee> {

    private int id;
    private String name;
    private int age;
    private long salary;

    public int getId() {
        return id;
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    public long getSalary() {
        return salary;
    }

    public Employee(int id, String name, int age, int salary) {
        this.id = id;
        this.name = name;
        this.age = age;
        this.salary = salary;
    }

    @Override
    public int compareTo(Employee emp) {
        //let's sort the employee based on id in ascending order
        //returns a negative integer, zero, or a positive integer as this employee id
        //is less than, equal to, or greater than the specified object.
        return (this.id - emp.id);
    }

    @Override
    //this is required to print the user friendly information about the Employee
    public String toString() {
        return "[id=" + this.id + ", name=" + this.name + ", age=" + this.age + ", salary=" +
                this.salary + "]";
    }

    /**
     * Comparator to sort employees list or array in order of Salary
     */
    public static Comparator<Employee> SalaryComparator = new Comparator<Employee>() {

        @Override
        public int compare(Employee e1, Employee e2) {
            //Long.compare avoids truncating the long difference to an int
            return Long.compare(e1.getSalary(), e2.getSalary());
        }
    };

    /**
     * Comparator to sort employees list or array in order of Age
     */
    public static Comparator<Employee> AgeComparator = new Comparator<Employee>() {

        @Override
        public int compare(Employee e1, Employee e2) {
            return e1.getAge() - e2.getAge();
        }
    };

    /**
     * Comparator to sort employees list or array in order of Name
     */
    public static Comparator<Employee> NameComparator = new Comparator<Employee>() {

        @Override
        public int compare(Employee e1, Employee e2) {
            return e1.getName().compareTo(e2.getName());
        }
    };
}

Here is a separate class implementing the Comparator interface that compares two Employee objects first by id and, if the ids are equal, by name.

package com.journaldev.sort;

import java.util.Comparator;

public class EmployeeComparatorByIdAndName implements Comparator<Employee> {

    @Override
    public int compare(Employee o1, Employee o2) {
        int flag = o1.getId() - o2.getId();
        if(flag==0) flag = o1.getName().compareTo(o2.getName());
        return flag;
    }

}
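The same two-level ordering can also be sketched with Comparator.thenComparing, available since Java 8, which applies a second comparison only when the first one reports a tie. The nested Emp class here is again a hypothetical stand-in with just an id and a name, and the sample data mirrors the duplicate-id case used in the test class.

```java
import java.util.Arrays;
import java.util.Comparator;

public class ChainedComparatorSketch {

    // Minimal stand-in for Employee (hypothetical, id and name only).
    static final class Emp {
        private final int id;
        private final String name;

        Emp(int id, String name) {
            this.id = id;
            this.name = name;
        }

        int getId() { return id; }
        String getName() { return name; }

        @Override
        public String toString() { return id + ":" + name; }
    }

    // Compare by id first; fall back to name only when the ids are equal.
    static final Comparator<Emp> BY_ID_THEN_NAME =
            Comparator.comparingInt(Emp::getId).thenComparing(Emp::getName);

    // Sorts a small sample containing a duplicate id and returns it as text.
    static String sortedSample() {
        Emp[] emps = {
            new Emp(1, "Pankaj"),
            new Emp(5, "Lisa"),
            new Emp(10, "Mikey"),
            new Emp(1, "Mikey")
        };
        Arrays.sort(emps, BY_ID_THEN_NAME);
        return Arrays.toString(emps);
    }

    public static void main(String[] args) {
        System.out.println(sortedSample()); // [1:Mikey, 1:Pankaj, 5:Lisa, 10:Mikey]
    }
}
```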

Here is the test class where we are using different ways to sort Objects in java using Comparable and Comparator.

package com.journaldev.sort;

import java.util.Arrays;

public class JavaObjectSorting {

    /**
     * This class shows how to sort custom objects array/list
     * implementing Comparable and Comparator interfaces
     * @param args
     */
    public static void main(String[] args) {

        //sorting custom object array
        Employee[] empArr = new Employee[4];
        empArr[0] = new Employee(10, "Mikey", 25, 10000);
        empArr[1] = new Employee(20, "Arun", 29, 20000);
        empArr[2] = new Employee(5, "Lisa", 35, 5000);
        empArr[3] = new Employee(1, "Pankaj", 32, 50000);
        
        //sorting employees array using Comparable interface implementation
        Arrays.sort(empArr);
        System.out.println("Default Sorting of Employees list:\n"+Arrays.toString(empArr));
        
        //sort employees array using Comparator by Salary
        Arrays.sort(empArr, Employee.SalaryComparator);
        System.out.println("Employees list sorted by Salary:\n"+Arrays.toString(empArr));
        
        //sort employees array using Comparator by Age
        Arrays.sort(empArr, Employee.AgeComparator);
        System.out.println("Employees list sorted by Age:\n"+Arrays.toString(empArr));
        
        //sort employees array using Comparator by Name
        Arrays.sort(empArr, Employee.NameComparator);
        System.out.println("Employees list sorted by Name:\n"+Arrays.toString(empArr));
        
        //Employees list sorted by ID and then name using Comparator class
        empArr[0] = new Employee(1, "Mikey", 25, 10000);
        Arrays.sort(empArr, new EmployeeComparatorByIdAndName());
        System.out.println("Employees list sorted by ID and Name:\n"+Arrays.toString(empArr));
    }

}

Here is the output of the above program:

Default Sorting of Employees list:
[[id=1, name=Pankaj, age=32, salary=50000], [id=5, name=Lisa, age=35, salary=5000], [id=10, name=Mikey, age=25, salary=10000], [id=20, name=Arun, age=29, salary=20000]]
Employees list sorted by Salary:
[[id=5, name=Lisa, age=35, salary=5000], [id=10, name=Mikey, age=25, salary=10000], [id=20, name=Arun, age=29, salary=20000], [id=1, name=Pankaj, age=32, salary=50000]]
Employees list sorted by Age:
[[id=10, name=Mikey, age=25, salary=10000], [id=20, name=Arun, age=29, salary=20000], [id=1, name=Pankaj, age=32, salary=50000], [id=5, name=Lisa, age=35, salary=5000]]
Employees list sorted by Name:
[[id=20, name=Arun, age=29, salary=20000], [id=5, name=Lisa, age=35, salary=5000], [id=10, name=Mikey, age=25, salary=10000], [id=1, name=Pankaj, age=32, salary=50000]]
Employees list sorted by ID and Name:
[[id=1, name=Mikey, age=25, salary=10000], [id=1, name=Pankaj, age=32, salary=50000], [id=5, name=Lisa, age=35, salary=5000], [id=10, name=Mikey, age=25, salary=10000]]

The java.lang.Comparable and java.util.Comparator interfaces are powerful tools for sorting objects in Java.

Comparable vs Comparator

  1. The Comparable interface provides a single way of sorting, whereas the Comparator interface can provide many different ways of sorting.
  2. To use Comparable, the class itself needs to implement it, whereas using Comparator requires no change to the class.
  3. The Comparable interface is in the java.lang package, whereas the Comparator interface is in the java.util package.
  4. Using Comparable requires no code changes on the client side: Arrays.sort() and Collections.sort() automatically use the compareTo() method of the class. With Comparator, the client must pass a Comparator instance to the sort method, which then calls its compare() method.
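To make the last point concrete, here is a small sketch using String, which already implements Comparable. The class and method names are hypothetical. The first call relies on String's own compareTo(), and the second passes a Comparator from the JDK (String.CASE_INSENSITIVE_ORDER) to override that ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class NaturalVsCustomOrderSketch {

    // Uses the elements' own compareTo(); the caller supplies no ordering.
    static List<String> naturalOrder(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    // Uses the Comparator supplied by the caller instead of compareTo().
    static List<String> customOrder(List<String> input, Comparator<String> order) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy, order);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "Apple", "fig", "Zebra");
        // Natural String order puts uppercase letters before lowercase ones.
        System.out.println(naturalOrder(words)); // [Apple, Zebra, fig, pear]
        System.out.println(customOrder(words, String.CASE_INSENSITIVE_ORDER)); // [Apple, fig, pear, Zebra]
    }
}
```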

What does RandomAccess mean?

RandomAccess is a marker interface, like the Serializable and Cloneable interfaces. None of these marker interfaces define methods; instead, they identify a class as having a particular capability.

In the case of Serializable, the interface specifies that if the class is serialized using the serialization I/O classes, then a NotSerializableException will not be thrown (unless the object contains some other class that cannot be serialized). Cloneable similarly indicates that the use of the Object.clone() method for a Cloneable class will not throw a CloneNotSupportedException.

The RandomAccess interface identifies that a particular java.util.List implementation has fast random access. A more accurate name for the interface would have been FastRandomAccess. This interface tries to define an imprecise concept: how fast is fast? The documentation provides a simple guide: if repeated access using the List.get() method is faster than repeated access using the Iterator.next() method, then the List has fast random access. The two types of access are shown in the following code examples:

Object o;
for (int i=0, n=list.size(); i < n; i++)
  o = list.get(i);

Object o;
for (Iterator itr=list.iterator(); itr.hasNext(); )
  o = itr.next();

There is a third loop that combines the previous two loops to avoid the repeated Iterator.hasNext() test on each loop iteration.

Object o;
Iterator itr=list.iterator();
for (int i=0, n=list.size(); i < n; i++)
  o = itr.next();

This last loop relies on the normal situation, where a List object cannot change in size while it is being iterated without an exception of some sort occurring. Since the list size remains the same, you can simply count the accessed elements without testing at each iteration whether the end of the list has been reached. This last loop is generally faster than the one in Example 2 (the Iterator.hasNext() loop). In the context of the RandomAccess interface, for a list to implement RandomAccess, the first loop using List.get() should be faster than both of the loops that use Iterator.next().

How is RandomAccess used?

So now that we know what RandomAccess means, how do we use it? With the other two marker interfaces, Serializable and Cloneable, there are two aspects to using them:

  • Defining classes which implement them, and
  • Using their capabilities via ObjectInput/ObjectOutput and Object.clone().

RandomAccess is a little different. Of course, we still need to decide whether any particular class implements it, but the possible classes are severely restricted: RandomAccess should only be implemented in java.util.List classes. And most such classes are created outside of projects; e.g., the SDK provides the most frequently used implementations, and subclasses of the SDK classes do not need to implement RandomAccess, as they will automatically inherit the capability where appropriate.
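The inheritance point can be checked with a short sketch. AuditedList is a hypothetical subclass invented for this illustration; it declares nothing about RandomAccess itself, yet it is still an instance of the interface because ArrayList implements it.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

public class RandomAccessCheckSketch {

    // Hypothetical subclass of an SDK list; it does not mention RandomAccess.
    static class AuditedList<E> extends ArrayList<E> {
        private static final long serialVersionUID = 1L;
    }

    // True when the list's class is marked as having fast random access.
    static boolean hasFastRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        System.out.println(hasFastRandomAccess(new ArrayList<String>()));   // prints true
        System.out.println(hasFastRandomAccess(new LinkedList<String>()));  // prints false
        System.out.println(hasFastRandomAccess(new AuditedList<String>())); // prints true
    }
}
```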

The second aspect, using the RandomAccess capability, is also different. Whether a class is Serializable or Cloneable is automatically detected when you use ObjectInput/ObjectOutput and Object.clone(). But RandomAccess has no such automatic support. You need to explicitly check whether a class implements RandomAccess using the instanceof operator:

if (listObject instanceof RandomAccess)

Then you must explicitly choose the appropriate access method, List.get() or Iterator.next(). Clearly, if we test for RandomAccess on every loop iteration, we would be making a lot of redundant calls, and probably losing the benefit of RandomAccess as well. So the pattern to follow in using RandomAccess makes the test outside the loop. The canonical pattern looks like:

Object o;
//test once, outside the loop
if (list instanceof RandomAccess)
{
  for (int i=0, n=list.size(); i < n; i++)
  {
    o = list.get(i);
    //do something with object o
  }
}
else
{
  Iterator itr = list.iterator();
  for (int i=0, n=list.size(); i < n; i++)
  {
    o = itr.next();
    //do something with object o
  }
}
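A self-contained, runnable variant of this pattern might look like the sketch below. It sums a list of integers so that the "do something" step is concrete. The class name RandomAccessLoopSketch and the method sum are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

public class RandomAccessLoopSketch {

    // Sums the list, choosing the access method once, outside the loop.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // Fast random access: indexed get() avoids iterator overhead.
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            // No fast random access: walk the list sequentially.
            Iterator<Integer> itr = list.iterator();
            for (int i = 0, n = list.size(); i < n; i++) {
                total += itr.next();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4);
        System.out.println(sum(new ArrayList<>(values)));  // prints 10
        System.out.println(sum(new LinkedList<>(values))); // prints 10
    }
}
```

Both branches return the same result; only the access path differs.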

The speedup from using RandomAccess

I tested the four code loops shown in this article, using the 1.4 beta release, separately testing the -client and -server options. To test the effect of the RandomAccess interface, I used the java.util.ArrayList and java.util.LinkedList classes. ArrayList implements RandomAccess, while LinkedList does not. ArrayList has an underlying implementation consisting of an array with constant access time for any element, so using the ArrayList iterator is equivalent to using the ArrayList.get() method, but with some additional overhead. LinkedList has an underlying implementation consisting of linked node objects, so it has access time proportional to the shortest distance of the element from either end of the list; iterating sequentially through the list can shortcut the access time by traversing one node after another.

Times shown are the average of three runs, and all times have been normalized to the first table cell; i.e., the time taken by the ArrayList to iterate the list using the List.get() method, using java -client.

Table 1: Access times for loop types and access methods

Loop type (loop test) and access method            | ArrayList, java -client | LinkedList, java -client | ArrayList, java -server | LinkedList, java -server
loop counter (i<n) and list.get()                  | 100%                    | too long                 | 77.5%                   | too long
iterator (Iterator.hasNext()) and Iterator.next()  | 141%                    | 219%                     | 109%                    | 213%
iterator (i<n) and iterator.next()                 | 121%                    | 205%                     | 98%                     | 193%
RandomAccess test with loop from row 1 or 3        | 100%                    | 205%                     | 77.5%                   | 193%

Note that HotSpot is capable of optimizing away accesses that are unnecessary, so the test accessed and operated on the list elements in a way that could not eliminate the list element access. The test code is available here.

The most important results are in the last two rows of the table. The last row shows the times obtained by making full use of the RandomAccess interface; the row before that shows the best general technique for iterating lists if RandomAccess were not available. The size of the lists I used for the test (and consequently, the number of loop iterations required to access every element) was sufficiently large that the instanceof test had no measurable cost in comparison to the time taken to run the loop. Consequently, we can see that there was no cost (but also no benefit) in adding the instanceof RandomAccess test when iterating the LinkedList, whereas the ArrayList was iterated more than 20% quicker when the instanceof test was included.

Forward and backward compatibility

What should you do if you are implementing code now? Obviously, you can start developing with a 1.4 (beta) release, but this is not an option everywhere. There are three aspects to using RandomAccess if you are developing code now:

  1. You may want to include code referencing RandomAccess without moving to 1.4; many development environments cannot be upgraded rapidly or to a beta release.
  2. Many projects need their code to be able to run in any JVM, so the code needs to be backwards-compatible to run in JVMs using releases earlier than 1.4, where RandomAccess does not exist.
  3. You will want to make your code forward-compatible so that it will automatically take advantage of RandomAccess when running in a 1.4+ JVM.

Making RandomAccess available to your development environment is the first issue, and this can be as simple as adding the RandomAccess interface to your classpath. Any version of the SDK can create the RandomAccess interface. The definition for RandomAccess is

package java.util;
public interface RandomAccess {}

This interface can be created using javac, as follows:

  1. Create a directory called temp
  2. In temp, create a directory called java
  3. In java, create a directory called util
  4. In util, create a file called RandomAccess.java, containing the definition just given
  5. Compile RandomAccess.java using javac: javac RandomAccess.java

Now including temp in your classpath should enable classes that refer to RandomAccess to be compiled.

Some Java integrated development environments (IDEs) can make it difficult to add a class to the core SDK packages. If this is the case for your IDE, your only hope is that it accepts an external classpath for compilation purposes, in which case the custom-generated RandomAccess class will need to be held in that external classpath.

We also need to handle RandomAccess in the runtime environment. For pre-1.4 environments, the test if (listObject instanceof RandomAccess) will generate a NoClassDefFoundError at runtime, when the JVM tries to load the RandomAccess class. For the instanceof test to be evaluated, the class has to be loaded; however, we can guard the test so that it is only executed if RandomAccess is available. The simplest way to do this is to check if RandomAccess exists, setting a boolean guard as the outcome of that test:

static boolean RandomAccessExists;
..

  //execute this as early as possible after the 
  //application starts
  try
  {
    Class c = Class.forName("java.util.RandomAccess");
    RandomAccessExists = true;
  }
  catch (ClassNotFoundException e)
  {
    RandomAccessExists = false;
  }

Then, finally, we will need to change our instanceof tests to use the RandomAccessExists variable as a guard:

if (RandomAccessExists && (listObject instanceof RandomAccess) )

Now we have the solution for all three aspects mentioned at the beginning of this section:

  1. The RandomAccess interface can be created and compiled easily with any SDK, and this manually-compiled version can be used at compilation time as a stand-in, to compile any code which refers to RandomAccess.
  2. The guarded instanceof test will automatically revert to the Iterator loop if RandomAccess does not exist, and should avoid throwing a NoClassDefFoundError in pre-1.4 JVMs.
  3. The guarded instanceof test will also automatically use the faster loop branch when RandomAccess does exist and the list object implements it.
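The guard described in this section can be sketched as a small self-contained class. The names RandomAccessGuardSketch, classExists and useIndexedLoop are hypothetical and introduced only for this example. On any current JVM the guard is always true, so the sketch mainly shows the shape of the technique, and the pre-1.4 behavior is not something it can demonstrate.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

public class RandomAccessGuardSketch {

    // Evaluated once when this class is loaded, i.e. early in the application.
    static final boolean RANDOM_ACCESS_EXISTS = classExists("java.util.RandomAccess");

    // Returns true if the named class can be loaded by this JVM.
    static boolean classExists(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // The guard short-circuits, so instanceof is only evaluated
    // when the RandomAccess interface is known to exist.
    static boolean useIndexedLoop(List<?> list) {
        return RANDOM_ACCESS_EXISTS && (list instanceof RandomAccess);
    }

    public static void main(String[] args) {
        System.out.println(useIndexedLoop(new ArrayList<String>()));  // prints true
        System.out.println(useIndexedLoop(new LinkedList<String>())); // prints false
    }
}
```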