Top Differences between Hadoop 1.0 & Hadoop 2.0


Early adopters of the Hadoop ecosystem were restricted to MapReduce-based processing models only. Hadoop 2 has brought with it effective processing models that lend themselves to many Big Data uses, including interactive SQL queries over big data, analysis of Big-Data-scale graphs, and scalable machine learning capabilities. The evolution from Hadoop 1's limited processing model, comprising batch-oriented MapReduce tasks, to the more specialized and interactive models of Hadoop 2 showcases the potential value contributed by large-scale distributed processing systems. Read on for the major differences between Hadoop 1 and 2.

Hadoop–YARN and HDFS 
While other available solutions tend to be unsuitable for interactive analytics, I/O intensive, or constrained in their support for graphs, memory-intensive algorithms, and other machine learning processes, Hadoop proves to be far ahead in the race. Creating a reliable, scalable, and strong foundation for Big Data architectures, the Hadoop ecosystem has been positioned as one of the most dominant Big Data platforms for analytics. It deserves mention here that Hadoop developers rewrote major components of the Hadoop 1 file system to produce Hadoop 2. The resource manager YARN and HDFS federation were introduced as the important advances of Hadoop 2.

HDFS– Hadoop file system with a difference

HDFS, the Hadoop file system, comprises two main components: a block storage service and namespaces. While the block storage service deals with block operations, replication, and cluster management of DataNodes, namespaces manage all operations on files and directories, especially the creation and modification of files and directories.

In Hadoop 1, a single NameNode was responsible for managing the complete namespace of a Hadoop cluster. With the advent of HDFS federation, several NameNode servers can be used to manage namespaces, which allows for performance improvements, horizontal scaling, and multiple namespaces. The HDFS federation implementation lets existing single-NameNode configurations operate without changes. A shift to HDFS federation requires Hadoop administrators to format additional NameNodes, update the configuration for use with the latest Hadoop cluster applications, and add the new NameNodes to the Hadoop cluster.
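As a rough sketch of what a federated setup looks like, hdfs-site.xml can declare several independent nameservices, each backed by its own NameNode. The nameservice IDs, hostnames, and port below are illustrative placeholders, not values from this article:

```xml
<!-- hdfs-site.xml: two independent namespaces, each served by its own NameNode.
     The IDs (ns1, ns2) and the hostnames are placeholder values. -->
<property>
  <name>dfs.nameservices</name>
  <value>ns1,ns2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1</name>
  <value>nn-host1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns2</name>
  <value>nn-host2:8020</value>
</property>
```

Each DataNode registers with every NameNode, while each NameNode manages only its own namespace.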

YARN—Supports additional performance enhancements for Hadoop 2

While HDFS federation brings measures of reliability and scalability to Hadoop, YARN brings significant performance enhancements for certain applications, implements an overall more flexible execution engine, and offers support for additional processing models. As a recap, YARN, a resource manager, was developed by separating the resource-management capabilities from the processing engine of MapReduce as implemented in Hadoop 1.

Often referred to as the operating system of Hadoop because of its role in managing and monitoring diverse workloads, implementing security controls, maintaining multi-tenant environments, and managing Hadoop's high-availability features, YARN is designed for multiple, diverse user applications operating on a multi-tenant platform. In addition to MapReduce, YARN supports multiple other processing models.

High Availability (HA) Mode of NameNode

The NameNode stores all metadata in the Hadoop cluster. It is extremely important because an event such as an unexpected machine crash can bring down the entire Hadoop cluster. Hadoop 2.0 offers a solution to this problem: the High Availability feature of HDFS allows two redundant NameNodes to run in the same cluster. These NameNodes run in an active/passive configuration, with one operating as the primary NameNode and the other as a hot standby.

Both NameNodes share an edits log, in which all changes are recorded in shared NFS storage. At any point in time, only a single writer is allowed to access this shared storage. The passive NameNode also reads from this storage and keeps its metadata about the cluster up to date. If the active NameNode fails, the passive NameNode takes over as the active one and starts writing to the shared storage.
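A rough sketch of such an active/standby pair in hdfs-site.xml, tying two NameNodes to one logical nameservice and pointing both at a shared edits directory. The nameservice ID, hostnames, and NFS mount path are illustrative placeholders:

```xml
<!-- hdfs-site.xml: one logical nameservice backed by two NameNodes.
     "mycluster", the hostnames, and the NFS path are placeholder values. -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1-host:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2-host:8020</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>file:///mnt/nfs/ha-edits</value>
</property>
```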

Enhanced Utilization of Resources

In Hadoop 1.0, the JobTracker held the dual responsibility of driving the execution of MapReduce jobs and managing the resources dedicated to the cluster. With YARN on the scene, the two major functions of the overburdened JobTracker, job scheduling/monitoring and resource management, are split into separate daemons. These are:

A Resource Manager (RM), which focuses on the management of cluster resources;

An Application Master (AM), typically one per running application, which manages an individual running application, for instance a MapReduce job.

It is essential to note that the inflexible map and reduce slots no longer exist. With YARN as the central resource manager, multiple applications can now share a common resource pool and run on Hadoop.

Batch Oriented application

In its 2.0 version, Hadoop goes much beyond its batch-oriented roots and runs interactive and streaming applications as well.

Native Windows Support

Originally, Hadoop was developed to support the UNIX family of operating systems. With Hadoop 2.0 offering native support for the Windows operating system, the reach of Hadoop has extended significantly: it now caters to the ever-growing Windows Server market.

Non-MapReduce Applications on Hadoop 2.0

Hadoop 1.0 was compatible with MapReduce framework tasks only; they could process all data stored in HDFS, but there were no other models for data processing. For needs such as graph processing or real-time analysis of data stored in HDFS, users had to move the data to alternate storage facilities like HBase. YARN lets Hadoop run non-MapReduce applications too: the YARN APIs can be used to write other frameworks that run on top of HDFS. This enables a variety of non-MapReduce applications on Hadoop, with MPI, Giraph, Spark, and HAMA being some applications that have been ported to run within YARN.

Data node caching for faster access

Hadoop 2.0 users and applications such as Pig, Hive, or HBase can identify sets of files that should be cached. For instance, Hive dimension tables can now be configured for caching in DataNode RAM, allowing faster reads for Hive queries against the most frequently looked-up tables.
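HDFS centralized caching is driven by the hdfs cacheadmin command. A rough sketch of the workflow follows; the pool name and the table path are hypothetical examples, not values from this article:

```shell
# Create a cache pool, then pin a hot Hive dimension table into DataNode memory.
# "hivePool" and the warehouse path are placeholder names.
hdfs cacheadmin -addPool hivePool
hdfs cacheadmin -addDirective -path /warehouse/dim_dates -pool hivePool
# List active cache directives to confirm
hdfs cacheadmin -listDirectives
```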

HDFS- Multiple Storage

Another important difference between Hadoop 1.0 and Hadoop 2.0 is the latter's support for heterogeneous storage. Whether SSDs or spinning disks, Hadoop 1.0 treats all storage devices on a DataNode as a single uniform pool. So, while Hadoop 1.0 users could store their data on an SSD, they had no control over that placement. Heterogeneous storage is an integral part of Hadoop from version 2.0 onwards. The approach is quite general and even permits users to treat memory as a storage tier for temporary and cached data.

HDFS Snapshots

Hadoop 2.0 adds support for file system snapshots: point-in-time images of the complete file system or of sub-trees of the file system. The many uses of snapshots include:

Protection against user errors: an admin-driven process can take snapshots periodically, so if users accidentally delete files, the lost data can be restored from a snapshot containing it.

Reliable backups: the admin can use snapshots of the entire file system, or of sub-trees in the file system, as a starting point for full backups. Incremental backups can be taken by copying the differences between any two snapshots.

Disaster recovery: snapshots may also be used to copy point-in-time images to remote sites for disaster recovery.
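These uses rest on a handful of snapshot commands. A sketch follows; the directory, snapshot, and file names are placeholders, not values from this article:

```shell
# Enable snapshots on a directory (an admin operation), then take a named one
hdfs dfsadmin -allowSnapshot /user/data
hdfs dfs -createSnapshot /user/data backup1
# An accidentally deleted file can be copied back from the read-only
# .snapshot subtree; "report.csv" is a placeholder name
hdfs dfs -cp /user/data/.snapshot/backup1/report.csv /user/data/
```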

Insert a node in the middle of a given linked list in Java

Given a singly linked list and a value x, insert a new node holding x in the middle of the list. If the list has an odd number of nodes n, the new node goes after the ((n+1)/2)-th node; if n is even, it goes after the (n/2)-th node. For example:

Input : list: 1->2->4->5
        x = 3
Output : 1->2->3->4->5

Input : list: 5->10->4->32->16
        x = 41
Output : 5->10->4->41->32->16

Method 1 (Using the length of the linked list):
Find the number of nodes, i.e., the length of the linked list, using one traversal; let it be len. Calculate c = len/2 if len is even, and c = (len+1)/2 if len is odd. Traverse the first c nodes again and insert the new node after the c-th node.
// Java implementation to insert node
// at the middle of the linked list
import java.util.*;
import java.lang.*;
import java.io.*;
class LinkedList
{
    static Node head; // head of list
    /* Node Class */
    static class Node {
        int data;
        Node next;
        
        // Constructor to create a new node
        Node(int d) {
            data = d;
            next = null;
        }
    }
    // function to insert node at the
    // middle of the linked list
    static void insertAtMid(int x)
    {
        // if list is empty
        if (head == null)
            head = new Node(x);
        else {
            // get a new node
            Node newNode = new Node(x);
            Node ptr = head;
            int len = 0;
            // calculate the length of the linked
            // list, i.e., the number of nodes
            while (ptr != null) {
                len++;
                ptr = ptr.next;
            }
            // 'count' the number of nodes after which
            // the new node is to be inserted
            int count = ((len % 2) == 0) ? (len / 2) :
                                        (len + 1) / 2;
            ptr = head;
            // 'ptr' points to the node after which
            // the new node is to be inserted
            while (count-- > 1)
                ptr = ptr.next;
            // insert the 'newNode' and adjust
            // the required links
            newNode.next = ptr.next;
            ptr.next = newNode;
        }
    }
    // function to display the linked list
    static void display()
    {
        Node temp = head;
        while (temp != null)
        {
            System.out.print(temp.data + " ");
            temp = temp.next;
        }
    }
    // Driver program to test above
    public static void main (String[] args)
    {
        // Creating the list 1->2->4->5
        head = null;
        head = new Node(1);
        head.next = new Node(2);
        head.next.next = new Node(4);
        head.next.next.next = new Node(5);
        
        System.out.println("Linked list before "+
                           "insertion: ");
        display();
        int x = 3;
        insertAtMid(x);
        System.out.println("\nLinked list after"+
                           " insertion: ");
        display();
    }
}
// This article is contributed by Chhavi

Output:

Linked list before insertion: 1 2 4 5
Linked list after insertion: 1 2 3 4 5

Time Complexity: O(n)

Method 2 (Using two pointers):
This method is based on the tortoise-and-hare technique, which uses two pointers, one known as slow and the other known as fast. While fast advances two nodes per step, slow advances one, so when fast reaches the end of the list, slow points at the middle node (as in the front-and-back split procedure). The new node is then inserted after the middle node. This approach requires only a single traversal of the list.

// Java implementation to insert node
// at the middle of the linked list
import java.util.*;
import java.lang.*;
import java.io.*;
class LinkedList
{
    static Node head; // head of list
    /* Node Class */
    static class Node {
        int data;
        Node next;
        
        // Constructor to create a new node
        Node(int d) {
            data = d;
            next = null;
        }
    }
    // function to insert node at the
    // middle of the linked list
    static void insertAtMid(int x)
    {
        // if list is empty
        if (head == null)
            head = new Node(x);
        else {
            // get a new node
            Node newNode = new Node(x);
            // assign values to the slow
            // and fast pointers
            Node slow = head;
            Node fast = head.next;
            while (fast != null && fast.next
                                  != null)
            {
                // move slow pointer to next node
                slow = slow.next;
                // move fast pointer two nodes
                // at a time
                fast = fast.next.next;
            }
            // insert the 'newNode' and adjust
            // the required links
            newNode.next = slow.next;
            slow.next = newNode;
        }
    }
    // function to display the linked list
    static void display()
    {
        Node temp = head;
        while (temp != null)
        {
            System.out.print(temp.data + " ");
            temp = temp.next;
        }
    }
    // Driver program to test above
    public static void main (String[] args)
    {
        // Creating the list 1->2->4->5
        head = null;
        head = new Node(1);
        head.next = new Node(2);
        head.next.next = new Node(4);
        head.next.next.next = new Node(5);
        
        System.out.println("Linked list before"+
                           " insertion: ");
        display();
        int x = 3;
        insertAtMid(x);
        System.out.println("\nLinked list after"+
                           " insertion: ");
        display();
    }
}

Output:

Linked list before insertion: 1 2 4 5
Linked list after insertion: 1 2 3 4 5

Time Complexity: O(n)

Java 8 Features

Java 8 was released on 18th March 2014, so it's high time to look into Java 8 features. In this tutorial, we will look into Java 8 features with examples.

Some of the important Java 8 features are:

  1. forEach() method in Iterable interface
  2. default and static methods in Interfaces
  3. Functional Interfaces and Lambda Expressions
  4. Java Stream API for Bulk Data Operations on Collections
  5. Java Time API
  6. Collection API improvements
  7. Concurrency API improvements
  8. Java IO improvements
  9. Miscellaneous Core API improvements

Let’s have a brief look at these Java 8 features. I will provide some code snippets for better understanding, so if you want to run the programs in Java 8, you will have to set up a Java 8 environment with the following steps.

  • Download JDK 8 and install it. Installation is simple, like other Java versions. A JDK installation is required to write, compile, and run programs in Java.
  • Download the latest Eclipse IDE; it now provides support for Java 8. Make sure your project’s build path is using the Java 8 library.
    1. forEach() method in Iterable interface

      Whenever we need to traverse a Collection, we create an Iterator whose whole purpose is to iterate, and then we write our business logic in a loop for each of the elements in the Collection. We might get ConcurrentModificationException if the iterator is not used properly.

      Java 8 has introduced the forEach method in the java.lang.Iterable interface so that, while writing code, we can focus on business logic only. The forEach method takes a java.util.function.Consumer object as an argument, so it helps keep our business logic in a separate location that we can reuse. Let’s see forEach usage with a simple example.

      package com.journaldev.java8.foreach;
      
      import java.util.ArrayList;
      import java.util.Iterator;
      import java.util.List;
      import java.util.function.Consumer;
      
      public class Java8ForEachExample {
      
      	public static void main(String[] args) {
      		
      		//creating sample Collection
      		List<Integer> myList = new ArrayList<Integer>();
      		for(int i=0; i<10; i++) myList.add(i);
      		
      		//traversing using Iterator
      		Iterator<Integer> it = myList.iterator();
      		while(it.hasNext()){
      			Integer i = it.next();
      			System.out.println("Iterator Value::"+i);
      		}
      		
      		//traversing through forEach method of Iterable with anonymous class
      		myList.forEach(new Consumer<Integer>() {
      
      			public void accept(Integer t) {
      				System.out.println("forEach anonymous class Value::"+t);
      			}
      
      		});
      		
      		//traversing with Consumer interface implementation
      		MyConsumer action = new MyConsumer();
      		myList.forEach(action);
      		
      	}
      
      }
      
      //Consumer implementation that can be reused
      class MyConsumer implements Consumer<Integer>{
      
      	public void accept(Integer t) {
      		System.out.println("Consumer impl Value::"+t);
      	}
      
      
      }

      The number of lines might increase, but the forEach method helps keep the iteration logic and the business logic in separate places, resulting in better separation of concerns and cleaner code.

    2. default and static methods in Interfaces

      If you read the forEach method details carefully, you will notice that it’s defined in the Iterable interface, even though interfaces previously couldn’t have method bodies. From Java 8, interfaces are enhanced to have methods with implementations: we can use the default and static keywords to create interface methods with implementations. The forEach method implementation in the Iterable interface is:

      	default void forEach(Consumer<? super T> action) {
              Objects.requireNonNull(action);
              for (T t : this) {
                  action.accept(t);
              }
          }

      We know that Java doesn’t allow multiple inheritance of classes because it leads to the diamond problem. So how is this handled with interfaces, now that interfaces are similar to abstract classes? The solution is that the compiler will raise an error in this scenario, and we will have to provide the implementation logic in the class implementing the interfaces.

      package com.journaldev.java8.defaultmethod;
      
      @FunctionalInterface
      public interface Interface1 {
      
      	void method1(String str);
      	
      	default void log(String str){
      		System.out.println("I1 logging::"+str);
      	}
      	
      	static void print(String str){
      		System.out.println("Printing "+str);
      	}
      	
      	//trying to override Object method gives compile time error as
      	//"A default method cannot override a method from java.lang.Object"
      	
      //	default String toString(){
      //		return "i1";
      //	}
      	
      }
      package com.journaldev.java8.defaultmethod;
      
      @FunctionalInterface
      public interface Interface2 {
      
      	void method2();
      	
      	default void log(String str){
      		System.out.println("I2 logging::"+str);
      	}
      
      }

      Notice that both the interfaces have a common method log() with implementation logic.

      package com.journaldev.java8.defaultmethod;
      
      public class MyClass implements Interface1, Interface2 {
      
      	@Override
      	public void method2() {
      	}
      
      	@Override
      	public void method1(String str) {
      	}
      
      	//MyClass won't compile without having it's own log() implementation
      	@Override
      	public void log(String str){
      		System.out.println("MyClass logging::"+str);
      		Interface1.print("abc");
      	}
      	
      }

      As you can see, Interface1 has a static method implementation that is used in the MyClass.log() method implementation. Java 8 uses default and static methods heavily in the Collection API, and default methods were added so that our code remains backward compatible.

      If any class in the hierarchy has a method with the same signature, then default methods become irrelevant. Since any class implementing an interface already has Object as a superclass, default methods matching equals() or hashCode() in an interface would be irrelevant. That’s why, for better clarity, interfaces are not allowed to declare default methods that override Object class methods.

      For complete details of interface changes in Java 8, please read Java 8 interface changes.

    3. Functional Interfaces and Lambda Expressions

      If you look at the interface code above, you will notice the @FunctionalInterface annotation. Functional interfaces are a new concept introduced in Java 8: an interface with exactly one abstract method is a functional interface. We don’t need to use the @FunctionalInterface annotation to mark an interface as functional; the annotation is a facility to avoid the accidental addition of abstract methods to functional interfaces. You can think of it like the @Override annotation, and it’s best practice to use it. java.lang.Runnable, with its single abstract method run(), is a great example of a functional interface.

      One of the major benefits of functional interfaces is the possibility of using lambda expressions to instantiate them. We can instantiate an interface with an anonymous class, but the code looks bulky.

      Runnable r = new Runnable(){
      			@Override
      			public void run() {
      				System.out.println("My Runnable");
      			}};

      Since functional interfaces have only one abstract method, lambda expressions can easily provide the method implementation. We just need to provide the method arguments and the business logic. For example, we can write the above implementation using a lambda expression as:

      Runnable r1 = () -> {
      			System.out.println("My Runnable");
      		};

      If the method implementation has a single statement, we don’t need curly braces either. For example, the Interface1 anonymous class can be instantiated using a lambda as follows:

      Interface1 i1 = (s) -> System.out.println(s);
      		
      i1.method1("abc");

      So lambda expressions are a means to create anonymous implementations of functional interfaces easily. There are no runtime benefits to using lambda expressions, so I use them cautiously; I don’t mind writing a few extra lines of code.

      A new package, java.util.function, has been added with a bunch of functional interfaces to provide target types for lambda expressions and method references. Lambda expressions are a huge topic; I will write a separate article on them in the future.

      You can read complete tutorial at Java 8 Lambda Expressions Tutorial.

    4. Java Stream API for Bulk Data Operations on Collections

      A new package, java.util.stream, has been added in Java 8 to perform filter/map/reduce-like operations on collections. The Stream API allows sequential as well as parallel execution. This is one of the best features for me, because I work a lot with collections, and with Big Data we usually need to filter them based on some conditions.

      The Collection interface has been extended with the stream() and parallelStream() default methods to get a Stream for sequential and parallel execution. Let’s see their usage with a simple example.

      package com.journaldev.java8.stream;
      
      import java.util.ArrayList;
      import java.util.List;
      import java.util.stream.Stream;
      
      public class StreamExample {
      
      	public static void main(String[] args) {
      		
      		List<Integer> myList = new ArrayList<>();
      		for(int i=0; i<100; i++) myList.add(i);
      		
      		//sequential stream
      		Stream<Integer> sequentialStream = myList.stream();
      		
      		//parallel stream
      		Stream<Integer> parallelStream = myList.parallelStream();
      		
      		//using lambda with Stream API, filter example
      		Stream<Integer> highNums = parallelStream.filter(p -> p > 90);
      		//using lambda in forEach
      		highNums.forEach(p -> System.out.println("High Nums parallel="+p));
      		
      		Stream<Integer> highNumsSeq = sequentialStream.filter(p -> p > 90);
      		highNumsSeq.forEach(p -> System.out.println("High Nums sequential="+p));
      
      	}
      
      }

      If you run the above example code, you will get output like this:

      High Nums parallel=91
      High Nums parallel=96
      High Nums parallel=93
      High Nums parallel=98
      High Nums parallel=94
      High Nums parallel=95
      High Nums parallel=97
      High Nums parallel=92
      High Nums parallel=99
      High Nums sequential=91
      High Nums sequential=92
      High Nums sequential=93
      High Nums sequential=94
      High Nums sequential=95
      High Nums sequential=96
      High Nums sequential=97
      High Nums sequential=98
      High Nums sequential=99

      Notice that the parallel processing values are not in order; parallel processing can be very helpful when working with huge collections.
      Covering everything about the Stream API is not possible in this post; you can read everything about it at Java 8 Stream API Example Tutorial.

    5. Java Time API

      It has always been hard to work with dates, times, and time zones in Java; there was no standard approach or API for date and time. One of the nice additions in Java 8 is the java.time package, which streamlines the process of working with time in Java.

      Just by looking at the Java Time API packages, I can sense that they will be very easy to use. There are sub-packages such as java.time.format, which provides classes to print and parse dates and times, and java.time.zone, which provides support for time zones and their rules.

      The new Time API prefers enums over integer constants for months and days of the week. One of the useful classes is DateTimeFormatter, for converting date-time objects to strings.
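      As a small illustration of these points (the date and format pattern below are arbitrary choices, not taken from this article):

      ```java
      import java.time.LocalDate;
      import java.time.format.DateTimeFormatter;

      public class TimeApiExample {
          public static void main(String[] args) {
              // a fixed date keeps the output deterministic
              LocalDate date = LocalDate.of(2014, 3, 18);
              // months are enum values now, not zero-based integers
              System.out.println(date.getMonth());
              // DateTimeFormatter converts date-time objects to strings
              DateTimeFormatter fmt = DateTimeFormatter.ofPattern("dd-MM-yyyy");
              System.out.println(date.format(fmt));
          }
      }
      ```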

      For complete tutorial, head over to Java Date Time API Example Tutorial.

    6. Collection API improvements

      We have already seen the forEach() method and the Stream API for collections. Some new methods added in the Collection API are:

      • Iterator default method forEachRemaining(Consumer action) to perform the given action for each remaining element until all elements have been processed or the action throws an exception.
      • Collection default method removeIf(Predicate filter) to remove all of the elements of this collection that satisfy the given predicate.
      • Collection spliterator() method returning Spliterator instance that can be used to traverse elements sequentially or parallel.
      • Map replaceAll(), compute(), and merge() methods.
      • Performance improvement for the HashMap class under key collisions.
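      A short sketch of removeIf() and the related List.replaceAll() in action (the sample data is arbitrary):

      ```java
      import java.util.ArrayList;
      import java.util.List;

      public class CollectionApiExample {
          public static void main(String[] args) {
              List<Integer> myList = new ArrayList<>();
              for (int i = 0; i < 10; i++) myList.add(i);
              // remove every element matching the Predicate in one call
              myList.removeIf(n -> n % 2 != 0);
              System.out.println(myList); // [0, 2, 4, 6, 8]
              // replace each remaining element in place
              myList.replaceAll(n -> n * 10);
              System.out.println(myList); // [0, 20, 40, 60, 80]
          }
      }
      ```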
    7. Concurrency API improvements

      Some important concurrent API enhancements are:

      • ConcurrentHashMap compute(), forEach(), forEachEntry(), forEachKey(), forEachValue(), merge(), reduce() and search() methods.
      • CompletableFuture that may be explicitly completed (setting its value and status).
      • Executors newWorkStealingPool() method to create a work-stealing thread pool using all available processors as its target parallelism level.
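      As a minimal sketch of CompletableFuture chaining (the values are arbitrary):

      ```java
      import java.util.concurrent.CompletableFuture;

      public class CompletableFutureExample {
          public static void main(String[] args) {
              // supplyAsync runs the Supplier on the common ForkJoinPool;
              // thenApply chains a transformation onto the eventual result
              CompletableFuture<Integer> future = CompletableFuture
                      .supplyAsync(() -> 21)
                      .thenApply(n -> n * 2);
              // join() blocks until the value is available
              System.out.println(future.join()); // 42
          }
      }
      ```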
    8. Java IO improvements

      Some IO improvements known to me are:

      • Files.list(Path dir) that returns a lazily populated Stream, the elements of which are the entries in the directory.
      • Files.lines(Path path) that reads all lines from a file as a Stream.
      • Files.find() that returns a Stream that is lazily populated with Path by searching for files in a file tree rooted at a given starting file.
      • BufferedReader.lines() that return a Stream, the elements of which are lines read from this BufferedReader.
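      A small self-contained sketch of Files.lines(); it writes its own temporary file, so the contents are made up:

      ```java
      import java.io.IOException;
      import java.nio.file.Files;
      import java.nio.file.Path;
      import java.util.Arrays;
      import java.util.stream.Stream;

      public class FilesLinesExample {
          public static void main(String[] args) throws IOException {
              // write a small temp file so the example is self-contained
              Path tmp = Files.createTempFile("java8io", ".txt");
              Files.write(tmp, Arrays.asList("alpha", "beta", "gamma"));
              // Files.lines reads the file lazily as a Stream<String>
              try (Stream<String> lines = Files.lines(tmp)) {
                  lines.filter(l -> l.startsWith("b"))
                       .forEach(System.out::println); // beta
              }
              Files.delete(tmp);
          }
      }
      ```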
  9. Miscellaneous Core API improvements

    Some miscellaneous API improvements that might come in handy are:

    1. ThreadLocal static method withInitial(Supplier supplier) to create instance easily.
    2. Comparator interface has been extended with a lot of default and static methods for natural ordering, reverse order etc.
    3. min(), max() and sum() methods in Integer, Long and Double wrapper classes.
    4. logicalAnd(), logicalOr() and logicalXor() methods in Boolean class.
    5. ZipFile.stream() method to get an ordered Stream over the ZIP file entries. Entries appear in the Stream in the order they appear in the central directory of the ZIP file.
    6. Several utility methods in Math class.
    7. jjs command is added to invoke Nashorn Engine.
    8. jdeps command is added to analyze class files.
    9. JDBC-ODBC Bridge has been removed.
    10. PermGen memory space has been removed.
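    A brief sketch touching a few of these helpers (the sample values are arbitrary):

    ```java
    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;

    public class MiscApiExample {
        public static void main(String[] args) {
            // new static helpers on the wrapper classes
            System.out.println(Integer.sum(3, 4));               // 7
            System.out.println(Long.max(10L, 20L));              // 20
            System.out.println(Boolean.logicalXor(true, false)); // true
            // Comparator gained static factories and default methods
            List<String> names = Arrays.asList("bob", "alice", "carol");
            names.sort(Comparator.reverseOrder());
            System.out.println(names); // [carol, bob, alice]
        }
    }
    ```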

That’s all for Java 8 features with example programs. If I have missed some important features of Java 8, please let me know through comments.

XML Signing and Validating

The following class signs an XML document with an enveloped XML-DSig signature, using a private key and certificate loaded from a JKS keystore, and then validates the signed file.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.security.InvalidAlgorithmParameterException;
import java.security.KeyStore;
import java.security.KeyStoreException;
import java.security.NoSuchAlgorithmException;
import java.security.Security;
import java.security.UnrecoverableEntryException;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

import javax.xml.crypto.MarshalException;
import javax.xml.crypto.dsig.CanonicalizationMethod;
import javax.xml.crypto.dsig.DigestMethod;
import javax.xml.crypto.dsig.Reference;
import javax.xml.crypto.dsig.SignatureMethod;
import javax.xml.crypto.dsig.SignedInfo;
import javax.xml.crypto.dsig.Transform;
import javax.xml.crypto.dsig.XMLSignature;
import javax.xml.crypto.dsig.XMLSignatureException;
import javax.xml.crypto.dsig.XMLSignatureFactory;
import javax.xml.crypto.dsig.dom.DOMSignContext;
import javax.xml.crypto.dsig.dom.DOMValidateContext;
import javax.xml.crypto.dsig.keyinfo.KeyInfo;
import javax.xml.crypto.dsig.keyinfo.KeyInfoFactory;
import javax.xml.crypto.dsig.keyinfo.X509Data;
import javax.xml.crypto.dsig.spec.C14NMethodParameterSpec;
import javax.xml.crypto.dsig.spec.TransformParameterSpec;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class XmlSigningAndValidate {

	public void signXml() throws NoSuchAlgorithmException, InvalidAlgorithmParameterException, KeyStoreException,
			UnrecoverableEntryException, CertificateException, IOException, SAXException, ParserConfigurationException,
			MarshalException, XMLSignatureException, TransformerException {
		XMLSignatureFactory fac = XMLSignatureFactory.getInstance("DOM");
		Reference ref = fac.newReference("", fac.newDigestMethod(DigestMethod.SHA512, null),
				Collections.singletonList(fac.newTransform(Transform.ENVELOPED, (TransformParameterSpec) null)), null,
				null);

		SignedInfo si = fac.newSignedInfo(
				fac.newCanonicalizationMethod(CanonicalizationMethod.INCLUSIVE, (C14NMethodParameterSpec) null),
				fac.newSignatureMethod(SignatureMethod.RSA_SHA1, null), Collections.singletonList(ref));

		// Load the KeyStore and get the signing key and certificate.
		KeyStore ks = KeyStore.getInstance("JKS");
		ks.load(new FileInputStream("keystore.jks"), "pushstart".toCharArray());
		KeyStore.PrivateKeyEntry keyEntry = (KeyStore.PrivateKeyEntry) ks.getEntry("nettr",
				new KeyStore.PasswordProtection("pushstart".toCharArray()));
		X509Certificate cert = (X509Certificate) keyEntry.getCertificate();

		// Create the KeyInfo containing the X509Data.
		KeyInfoFactory kif = fac.getKeyInfoFactory();
		List<Object> x509Content = new ArrayList<>();
		x509Content.add(cert.getSubjectX500Principal().getName());
		x509Content.add(cert);
		X509Data xd = kif.newX509Data(x509Content);
		KeyInfo ki = kif.newKeyInfo(Collections.singletonList(xd));

		// Instantiate the document to be signed.
		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
		dbf.setNamespaceAware(true);
		Document doc = dbf.newDocumentBuilder().parse(new FileInputStream("purchaseOrder.xml"));

		// Create a DOMSignContext and specify the RSA PrivateKey and
		// location of the resulting XMLSignature's parent element.
		DOMSignContext dsc = new DOMSignContext(keyEntry.getPrivateKey(), doc.getDocumentElement());

		// Create the XMLSignature, but don't sign it yet.
		XMLSignature signature = fac.newXMLSignature(si, ki);

		// Marshal, generate, and sign the enveloped signature.
		signature.sign(dsc);

		TransformerFactory tf = TransformerFactory.newInstance();
		Transformer trans = tf.newTransformer();
		trans.transform(new DOMSource(doc), new StreamResult(new FileOutputStream("SignedFile.xml")));
		System.out.println("Signed Xml file written : SignedFile.xml");
	}

	public void validateXml() throws Exception {
		Security.addProvider(new org.apache.jcp.xml.dsig.internal.dom.XMLDSigRI());

		DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
		dbf.setNamespaceAware(true);
		Document doc2 = dbf.newDocumentBuilder().parse(new FileInputStream("SignedFile.xml"));
		NodeList nl = doc2.getElementsByTagNameNS(XMLSignature.XMLNS, "Signature");
		if (nl.getLength() == 0) {
			throw new Exception("Cannot find Signature element");
		}

		XMLSignatureFactory fac = XMLSignatureFactory.getInstance("DOM", java.security.Security.getProvider("XMLDSig"));

		// Create a DOMValidateContext and specify a KeySelector and document context.
		DOMValidateContext valContext = new DOMValidateContext(new X509KeySelector(), nl.item(0));

		// Unmarshal the XMLSignature.
		XMLSignature signature = fac.unmarshalXMLSignature(valContext);

		// Validate the XMLSignature.
		boolean coreValidity = signature.validate(valContext);

		// Check core validation status.
		if (!coreValidity) {
			String validateError;
			validateError = "Signature core validation status:false";
			boolean sv = signature.getSignatureValue().validate(valContext);
			validateError = validateError + " | Signature validation status:" + sv;
			validateError = validateError + " | References: ";
			// Check the validation status of each Reference.
			@SuppressWarnings("rawtypes")
			Iterator g = signature.getSignedInfo().getReferences().iterator();
			while (g.hasNext()) {
				Reference r = (Reference) g.next();
				boolean refValid = r.validate(valContext);
				validateError = validateError + "{ref[" + r.getURI() + "] validity status: " + refValid + "}";
			}
			throw new Exception(validateError);
		} else {
			System.out.println("Signature passed core validation");
		}

	}

	public static void main(String[] args) throws Exception {
		XmlSigningAndValidate xmlSigningAndValidate = new XmlSigningAndValidate();
		xmlSigningAndValidate.signXml();
		xmlSigningAndValidate.validateXml();
	}

}

import java.security.Key;
import java.security.PublicKey;
import java.security.cert.X509Certificate;
import java.util.Iterator;

import javax.xml.crypto.AlgorithmMethod;
import javax.xml.crypto.KeySelector;
import javax.xml.crypto.KeySelectorException;
import javax.xml.crypto.KeySelectorResult;
import javax.xml.crypto.XMLCryptoContext;
import javax.xml.crypto.XMLStructure;
import javax.xml.crypto.dsig.SignatureMethod;
import javax.xml.crypto.dsig.keyinfo.KeyInfo;
import javax.xml.crypto.dsig.keyinfo.X509Data;

public class X509KeySelector extends KeySelector {
	public KeySelectorResult select(KeyInfo keyInfo, KeySelector.Purpose purpose, AlgorithmMethod method,
			XMLCryptoContext context) throws KeySelectorException {
		@SuppressWarnings("rawtypes")
		Iterator ki = keyInfo.getContent().iterator();
		while (ki.hasNext()) {
			XMLStructure info = (XMLStructure) ki.next();
			if (!(info instanceof X509Data))
				continue;
			X509Data x509Data = (X509Data) info;
			@SuppressWarnings("rawtypes")
			Iterator xi = x509Data.getContent().iterator();
			while (xi.hasNext()) {
				Object o = xi.next();
				if (!(o instanceof X509Certificate))
					continue;
				final PublicKey key = ((X509Certificate) o).getPublicKey();
				// Make sure the algorithm is compatible
				// with the method.
				if (algEquals(method.getAlgorithm(), key.getAlgorithm())) {
					return new KeySelectorResult() {
						public Key getKey() {
							return key;
						}
					};
				}
			}
		}
		throw new KeySelectorException("No key found!");
	}

	boolean algEquals(String algURI, String algName) {
		return (algName.equalsIgnoreCase("DSA") && algURI.equalsIgnoreCase(SignatureMethod.DSA_SHA1))
				|| (algName.equalsIgnoreCase("RSA") && algURI.equalsIgnoreCase(SignatureMethod.RSA_SHA1));
	}
}

 


GC Explained: Heap :: Generational Garbage Collectors

The JVM heap is divided into two Generations. One is called the Young Generation and the other the Old Generation (sometimes referred to as Tenured). The Young Generation is further separated into two main logical sections: Eden and the Survivor spaces. There are also Virtual spaces for both the Young and Old Generations, which are used by Garbage Collectors to resize the other regions – mainly to meet different GC goals.


Weak Generational Hypothesis

Why is the heap divided into the Young and Old Generations? Because lots of objects are usually created and used for only a relatively short period of time. This observation is called the Weak Generational Hypothesis in GC theory. Imagine objects created and used only inside a loop – assuming they are not going to be scalarized, every iteration discards the previously created objects and creates new ones.

Object Lifecycle

Objects start their journey in Eden of the Young Generation. When Eden fills up, a so-called Minor GC is performed: all application threads are stopped (a stop-the-world pause), objects which are no longer used are discarded, and all other objects from Eden are moved to the first Survivor space (S0). The next time a Minor GC is performed, the objects go from S0 to the second Survivor space (S1). All live objects from Eden go to S1 as well. Notice that this leads to differently aged objects in the Survivor space – we have objects fresh from Eden and objects which were already in a Survivor space. The next iteration of Minor GC moves the objects from S1 back to S0, so the Survivor spaces switch on every GC. Why do we have two Survivor spaces, and why do we switch them? It's pretty simple – when an object reaches a certain age threshold, it is promoted to the Old Generation. This leads to Survivor space fragmentation, which can easily be eliminated by moving all objects between S0 and S1 on every Minor GC.
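The short-lived objects that drive this design can be sketched in a few lines (the class name and allocation size below are invented for illustration). Running it with GC logging enabled, e.g. `java -Xlog:gc ShortLivedObjects` on JDK 9+, shows Minor GC ("Pause Young") entries reclaiming the per-iteration garbage from Eden.

```java
// Illustration of the Weak Generational Hypothesis: almost every object
// allocated here is garbage by the next loop iteration, so it dies in Eden.
public class ShortLivedObjects {

    // Allocates n short-lived byte arrays and returns how many were created.
    static int allocateGarbage(int n) {
        int created = 0;
        for (int i = 0; i < n; i++) {
            byte[] scratch = new byte[1024]; // becomes unreachable immediately
            scratch[0] = 1;                  // touch it so it is actually used
            created++;
        }
        return created;
    }

    public static void main(String[] args) {
        System.out.println("allocated " + allocateGarbage(1_000_000) + " arrays");
    }
}
```

Note that the JIT compiler may scalarize such allocations via escape analysis, as mentioned above; in a real benchmark you would make the arrays escape to force heap allocation.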

Eventually, when the Old Generation fills up, a Major GC is performed on the Old Generation, cleaning it up and compacting that space. Whether and how stop-the-world pauses occur during a Major GC depends on the specific GC algorithm used.

Besides Minor and Major GC, there is also a Full GC, which cleans the entire heap – both the Young (by Minor GC) and the Old (Tenured) (by Major GC) Generations. Because a Full GC includes a Minor GC, it also causes stop-the-world pauses.

Summary

There are two main advantages of having the heap divided into generations. First, it is always faster to process only a portion of the heap, so stop-the-world pauses are shorter. Second, during a Minor GC all objects in Eden are either moved or discarded, which automatically means that this part of the heap is compacted.

Rules for method overriding:

  • In Java, a method can only be overridden in a subclass, not in the same class.
  • The argument list should be exactly the same as that of the overridden method.
  • The return type should be the same or a subtype of the return type declared in the original overridden method in the super class.
  • The access level cannot be more restrictive than the overridden method’s access level. For example: if the superclass method is declared public, then the overriding method in the subclass cannot be private or protected.
  • Instance methods can be overridden only if they are inherited by the subclass.
  • A method declared final cannot be overridden.
  • A method declared static cannot be overridden but can be re-declared.
  • If a method cannot be inherited then it cannot be overridden.
  • A subclass within the same package as the instance’s superclass can override any superclass method that is not declared private or final.
  • A subclass in a different package can only override the non-final methods declared public or protected.
  • An overriding method can throw any unchecked exceptions, regardless of whether the overridden method throws exceptions or not. However, the overriding method must not throw checked exceptions that are new or broader than the ones declared by the overridden method. The overriding method can throw narrower or fewer exceptions than the overridden method.
  • Constructors cannot be overridden.
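A compact sketch of several of these rules (the Animal/Dog classes are invented for illustration): the override keeps the same argument list, widens access from protected to public, uses a covariant return type, narrows the checked exception, and the static method is re-declared (hidden) rather than overridden.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

class Animal {
    // protected access, returns Animal, may throw the broad IOException
    protected Animal reproduce() throws IOException {
        return new Animal();
    }

    // static methods are hidden, not overridden
    static String kind() {
        return "Animal";
    }
}

class Dog extends Animal {
    // same argument list, wider access (public), covariant return type (Dog),
    // narrower checked exception (FileNotFoundException extends IOException)
    @Override
    public Dog reproduce() throws FileNotFoundException {
        return new Dog();
    }

    // re-declaration of the static method; which one runs depends on the
    // compile-time type of the reference, not the runtime object
    static String kind() {
        return "Dog";
    }
}
```

Calling `reproduce()` through an `Animal` reference bound to a `Dog` dispatches to the subclass method at runtime, while `kind()` resolves at compile time.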

Iterator vs ListIterator

In this tutorial we will learn about the difference between Iterator and ListIterator. We will also look at examples of Iterator and ListIterator one by one. We have already discussed the different types of iterators, i.e. fail-fast and fail-safe iterators.

Difference between Iterator and ListIterator in Java with Example
 
1. Traversal Direction : ListIterator allows programmers to iterate list objects in both directions, i.e. forward as well as backward, using the previous() and next() methods.
Iterator can be used to iterate List, Map, and Set objects in one direction only, i.e. forward.

2. Set and Map implemented Objects Traversal : ListIterator can be used to traverse List objects only, but Iterator can be used to traverse Map, List, and Set implemented objects.

For example, given a List named list:
// ListIterator object is created
ListIterator<String> listIteratorObject = list.listIterator();
// Iterator object is created
Iterator<String> iteratorObject = list.iterator();

3. Add or Set operation at any index : According to the ListIterator Oracle docs, ListIterator can modify the list during iteration using add(E e), remove(), or set(E e).
Iterator cannot add elements during traversal, but it can remove elements from the underlying collection during iteration, as it only provides the remove() method. There are no add(E e) or set(E e) methods in Iterator.

4. Determine Iterator’s current position : ListIterator can obtain the iterator’s current position in the list. The current position during traversal cannot be determined using Iterator.

5. Retrieve Index of the element : ListIterator can obtain the index of an element using the previousIndex() or nextIndex() methods (note that these take no arguments). We cannot obtain the index using Iterator, as no such methods are present.
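The ListIterator-only operations from points 3–5 can be sketched as follows (the class name and list contents are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

public class ListIteratorOps {

    // Walks the list, replacing "b" via set() and inserting a new element
    // via add() -- operations a plain Iterator does not offer.
    public static List<String> edit(List<String> input) {
        List<String> list = new ArrayList<>(input);
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            int pos = it.nextIndex();   // position query, takes no argument
            String value = it.next();
            if (value.equals("b")) {
                it.set("B" + pos);      // replace the element just returned
                it.add("b2");           // insert immediately after it
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(edit(List.of("a", "b", "c")));
    }
}
```

Attempting the same with a plain Iterator would leave only remove() available; there is no way to replace or insert.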

Example of Iterator and ListIterator 

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;

public class IteratorListIteratorExample {
	public static void main(String[] args) {

		List<String> listObject = new ArrayList<>();
		listObject.add("Alive is awesome");
		listObject.add("Love yourself");

		ListIterator<String> listIteratorObject = listObject.listIterator();
		System.out.println("ListIterator object output in forward direction:");
		while (listIteratorObject.hasNext()) {
			System.out.println(listIteratorObject.next());
		}

		System.out.println("ListIterator object output in backward direction:");
		while (listIteratorObject.hasPrevious()) {
			System.out.println(listIteratorObject.previous());
		}

		List<String> iteratorListObject = new ArrayList<>();

		iteratorListObject.add("Facebook");
		iteratorListObject.add("Google");
		iteratorListObject.add("Apple");

		Iterator<String> javaHungryIterator = iteratorListObject.iterator();
		System.out.println("Iterator object output in forward direction:");

		while (javaHungryIterator.hasNext()) {
			System.out.println(javaHungryIterator.next());
		}
	}
}

Output :

ListIterator object output in forward direction:
Alive is awesome
Love yourself
ListIterator object output in backward direction:
Love yourself
Alive is awesome
Iterator object output in forward direction:
Facebook
Google
Apple

Similarities between Iterator and ListIterator in Java

1. Interfaces : Both Iterator and ListIterator are interfaces. ListIterator extends the Iterator interface.

2. Collection Framework : Both Iterator and ListIterator are members of the Java Collections Framework.

3. Traversal : Both are used to iterate over a collection of objects.

4. Interfaces added to the JDK : Both interfaces were added to the JDK in Java 1.2.

Recap : Difference between Iterator and ListIterator in Java with Example 

Comparison                    ListIterator                 Iterator
Traversal Direction           Both forward and backward    Forward only
Objects traversal             List only                    Map, Set and List
Add and Set operations        Allows both operations       Not possible
Iterator’s current position   Can be determined            Cannot be determined
Retrieve Index                Yes                          Not possible