Translate

Friday, August 17, 2012

Performance comparison of Executor framework vs Fork/Join framework’s RecursiveTask feature in java or JDK7

I was testing new RecursiveTask feature of ForkJoin framework introduced in JDK7 or Java 1.7.

There are two types of ForkJoinTask specializations:
  1. Instances of RecursiveAction represent executions that do not yield a return value.
  2. In contrast, instances of RecursiveTask yield return values.
I have already posted an example of RecursiveAction in previous post. To know more about theoretic details and RecursiveAction sample code, you can visit the link: How to use Fork-Join Framework features in JDK7?


In this post I will do comparison of Executor Framework Vs ForkJoin framework's RecursiveTask feature. Following example fills primitive long array of size 100000000 with values from 0 to 10000000.

I use executor framework to calculate the sum of whole array  then I perform same logic using ForkJoin framework's RecursiveTask feature. To setup the example, I will first show the code which uses executor framework and its result then I will show ForkJoin framework's RecursiveTask feature implementing same logic and its result. Array sum logic in both samples was executed 10 times to get the average output and performance difference.

Update 1: Anonymous commenter advised to split the executor pool in even size (4 = number of available processors in my case) to maximize the through put. Advise well taken and updated blog as well. There is defiantly improvement in performance.

Executor framework code sample:

import static java.util.Arrays.asList;

import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorSum {
    Random random = new Random();

    // Fill array with its own index value
    public void fillArray(long[] array) {
        for (int i = 0; i < array.length; i++) {
            array[i] = i;
        }
    }

    public static void main(String[] args) throws InterruptedException,
            ExecutionException {
        ExecutorSum sum = new ExecutorSum();
        long[] array = new long[10_00_00_000];
        sum.fillArray(array);

        for (int i = 0; i < 10; i++) {
            int processors = Runtime.getRuntime().availableProcessors();
            ExecutorService executor = Executors.newFixedThreadPool(processors);
            List<Future<Long>> results;
            long start = System.currentTimeMillis();
          
            // array size/No. of processors
            int splitCount = array.length / processors;
          
            //Split pool size into even size for maximum through put
            results = executor.invokeAll(asList(new ArraySum(array, 0,
                    splitCount), new ArraySum(array, splitCount + 1,
                    splitCount * 2), new ArraySum(array, (splitCount * 2) + 1,
                    splitCount * 3), new ArraySum(array, (splitCount * 3) + 1,
                    array.length)

            ));

            executor.shutdown();

            // Calculating final result sum
            long count = 0;
            for (Future<Long> result : results) {
                count = count + result.get();
            }

            System.out.println("result: " + count);
            System.out.println("Sequential processing time: "
                    + (System.currentTimeMillis() - start) + " ms");
        }
    }
}

class ArraySum implements Callable<Long> {
    private final long from;
    private final long to;
    private long[] array;

    ArraySum(long[] array, long from, long to) {
        this.from = from;
        this.to = to;
        this.array = array;
    }

    @Override
    public Long call() throws Exception {
        long count = 0L;
        // Calculating sum of the given array range
        for (int i = (int) from; i < to; i++) {
            count = count + array[i];
        }

        return count;
    }
}

Output:

result: 4999999800000000
Sequential processing time: 113 ms
result: 4999999800000000
Sequential processing time: 108 ms
result: 4999999800000000
Sequential processing time: 113 ms
result: 4999999800000000
Sequential processing time: 124 ms
result: 4999999800000000
Sequential processing time: 122 ms
result: 4999999800000000
Sequential processing time: 107 ms
result: 4999999800000000
Sequential processing time: 123 ms
result: 4999999800000000
Sequential processing time: 134 ms
result: 4999999800000000
Sequential processing time: 122 ms
result: 4999999800000000
Sequential processing time: 109 ms

Fork/Join framework code sample:

import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinSumTask {
   
    public void fillArray(long[] array) {
        for (int i=0; i<array.length; i++) {
            array[i] = i;
        }
    }

    public static void main(String[] args) {
        ForkJoinSumTask sum = new ForkJoinSumTask();
        long[] array = new long[10_00_00_000];
        sum.fillArray(array);
       
        System.out.println("Number of processors available: " + Runtime.getRuntime().availableProcessors());
       
        ForkJoinPool fjpool = new ForkJoinPool(32); //Default parallelism level = Runtime.getRuntime().availableProcessors()
       
        for (int i=0; i<10; i ++) {
            RecursiveSumTask task = new RecursiveSumTask(array, 0, array.length);
            long start = System.currentTimeMillis();
            System.out.println("Result: " + fjpool.invoke(task));
            System.out.println("Parallel processing time: "    + (System.currentTimeMillis() - start)+ " ms");
        }
       
        System.out.println("Number of steals: " + fjpool.getStealCount() + "\n");
    }
}

class RecursiveSumTask extends RecursiveTask<Long> {
    private static final long serialVersionUID = 1L;
    private final long from;
    private final long to;
    private long[] array;
    final int splitSize = 10_00_0000; //Some threshold size to spit the task

    RecursiveSumTask(long[] array, long from, long to) {
        this.from = from;
        this.to = to;
        this.array = array;
    }

    @Override
    protected Long compute() {
        long count = 0L;
        List<RecursiveTask<Long>> forks = new LinkedList<>();
       
        if ( to - from > splitSize) {
            // task is huge so divide in half
            long mid = (from + to) >>> 1;
           
            //Divided the given task into task1 and task2
            RecursiveSumTask task1 = new RecursiveSumTask(array, from, mid);
            forks.add(task1);
            task1.fork();
           
            RecursiveSumTask task2 = new RecursiveSumTask(array, mid, to);
            forks.add(task2);
            task2.fork();
        } else {
            //Calculating sum of the given array range
            for (int i = (int) from; i < to; i++) {
                count = count + array[i];
            }
        }
       
        //Waiting for the result
        for (RecursiveTask<Long> task : forks) {
            count = count + task.join();
        }
       
        return count;
    }
}

Output:

Number of processors available: 4
Result: 4999999950000000
Parallel processing time: 135 ms
Result: 4999999950000000
Parallel processing time: 123 ms
Result: 4999999950000000
Parallel processing time: 105 ms
Result: 4999999950000000
Parallel processing time: 115 ms
Result: 4999999950000000
Parallel processing time: 111 ms
Result: 4999999950000000
Parallel processing time: 114 ms
Result: 4999999950000000
Parallel processing time: 114 ms
Result: 4999999950000000
Parallel processing time: 109 ms
Result: 4999999950000000
Parallel processing time: 126 ms
Result: 4999999950000000
Parallel processing time: 128 ms
Number of steals: 301

Conclusion:

Both code samples calculate array sum 10 times to warmup HotSpot JVM and get average result. As you can see from both outputs, there is no visible improvement in the performance using RecursiveTask. May be my sum logic is small and not so complex. If the logic is more complex then performance difference will be lot more.

References:
  1. How to use Fork-Join Framework features in JDK7?  
  2. Fork and Join: Java Can Excel at Painless Parallel Programming 
  3. JDK 7 Adoption Guide

Thursday, August 16, 2012

How to use Fork/Join Framework's RecursiveAction feature in JDK7?

JDK 7 has new additions for supporting parallelism using ForkJoinPool executor that is dedicated to running instances implementing ForkJoinTask.

ForkJoinTask objects feature two specific methods:
  1. The fork() method allows a ForkJoinTask to be planned for asynchronous execution. This allows a new ForkJoinTask to be launched from an existing one. fork() only schedules a new task within a ForkJoinPool, but no child Java Virtual Machine is ever created.
  2. In turn, the join() method allows a ForkJoinTask to wait for the completion of another one.
There are two types of ForkJoinTask specializations:
  1. Instances of RecursiveAction represent executions that do not yield a return value.
  2. In contrast, instances of RecursiveTask yield return values. You can checkout RecursiveTask code sample at link: Performance comparison of Executor framework vs ForkJoin framework’s RecursiveTask feature in java or JDK7
Following example illustrates how to use RecursiveAction type of ForkJoinTask implementation:

Code Sample:

import static java.util.Arrays.asList;

import java.util.Random;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ForkJoinRandomFillAction {
    Random random = new Random();
   
    public void loadArray(int[] array) {
        for (int i=0; i<array.length; i++) {
            array[i] = random.nextInt(10000); //Generates numbers from 0 to 10000
        }
    }
   
    public static void main(String[] args) {
        ForkJoinRandomFillAction sort = new ForkJoinRandomFillAction();
      
        int arrayLength = 1_00_00_0000;
        int array[] = new int[arrayLength];

        //No. of times sequential & Parallel operation should be performed to warm up HotSpot JVM
        final int iterations = 10;
      
        for (int i = 0; i < iterations; i++) {
            long start = System.currentTimeMillis();
            sort.loadArray(array);

            System.out.println("Sequential processing time: " + (System.currentTimeMillis() - start) + " ms");
          
        }
      
        System.out.println("Number of processor available: " + Runtime.getRuntime().availableProcessors());
      
        ForkJoinPool fjpool = new ForkJoinPool(64); //Default parallelism level = Runtime.getRuntime().availableProcessors()
      
        for (int i = 0; i < iterations; i++) {
            // Create a task with the complete array
            RecursiveAction task = new RandomFillAction(array, 0, array.length);
            long start = System.currentTimeMillis();
            fjpool.invoke(task);

            System.out.println("Parallel processing time: "    + (System.currentTimeMillis() - start)+ " ms");
        }
      
        System.out.println("Number of steals: " + fjpool.getStealCount() + "\n");
    }
}

class RandomFillAction extends RecursiveAction {
    private static final long serialVersionUID = 1L;
    final int low;
    final int high;
    private int[] array;
    final int splitSize = 40000; //Some threshold size to spit the task
   
    public RandomFillAction(int[] array, int low, int high) {
        this.low = low;
        this.high = high;
        this.array = array;
    }

    @Override
    protected void compute() {
        if (high - low > splitSize) {
            // task is huge so divide in half
            int mid = (low + high) >>> 1;
            invokeAll(asList(new RandomFillAction(array, low, mid), new RandomFillAction(array, mid, high)));
        } else {
            //Some calculation logic
            Random random = new Random();
            for (int i = low; i < high; i++) {
                array[i] = random.nextInt(10000);
            }
        }
    }
}

Output:

Sequential processing time: 1180 ms
Sequential processing time: 1178 ms
Sequential processing time: 1167 ms
Sequential processing time: 1165 ms
Sequential processing time: 1162 ms
Sequential processing time: 1158 ms
Sequential processing time: 1154 ms
Sequential processing time: 1169 ms
Sequential processing time: 1168 ms
Sequential processing time: 1161 ms
Number of processor available: 4
Parallel processing time: 469 ms
Parallel processing time: 393 ms
Parallel processing time: 389 ms
Parallel processing time: 373 ms
Parallel processing time: 375 ms
Parallel processing time: 371 ms
Parallel processing time: 587 ms
Parallel processing time: 384 ms
Parallel processing time: 385 ms
Parallel processing time: 376 ms
Number of steals: 1498

Conclusion:
  1. ForkJoin framework gives almost 300% improvement in performance compare to sequential execution.
  2. I was just trying to demonstrate ForkJoin framework in simplest way possible. You can definitely use executor framework in java to split the task in different threads to gain better performance.
  3. This post focused on the new fork/join tasks provided by Java SE 7 for making it easier to write parallel programs. Though it is not that as easy as marking some methods or long running loops with annotations to make them run in parallel still it is a good addition to JDK 7.
References:
  1. Performance comparison of Executor framework vs ForkJoin framework’s RecursiveTask feature in java or JDK7
  2. Fork and Join: Java Can Excel at Painless Parallel Programming
  3. JDK 7 Adoption Guide

Tuesday, August 7, 2012

How to use node.js?

Introduction:

Node.js is a platform built on Chrome's JavaScript run-time for easily building fast, scalable network applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, perfect for data-intensive real-time applications that run across distributed devices. Node.js is a server-side version of JavaScript. That means all the things all them cool things about JavaScript apply here.

What makes Node any different from the rest?
Node is evented I/O for V8 JavaScript. V8 is Google’s super fast JavaScript implementation that’s used in their Chrome browser. JavaScript’s ability to pass around closures makes event-based programming dead simple.

Node is not strictly for web development. You can think of Node as a framework for server development of any kind. With Node you can build an IRC server, a chat server, or HTTP server as done in following hello world example. 

You can use node.js to create light weight web server. See the following example from node.js tutorial site:

var http = require('http');
  http.createServer(function (req, res) {
      res.writeHead(200, {'Content-Type': 'text/plain'});
      res.write("Dynamic contents...");
      res.end('Hello World\n');
  }).listen(1337, "127.0.0.1");

console.log('Server running at http://127.0.0.1:1337/');
 
Above code is java script so you can write your dynamic code generation logic in java script and start using it in no time.

You will have to download and setup node.js from link:http://nodejs.org/ for Windows or whatever platform you prefer to use.

Start the web server using command:
node hello.js

To test if above sample works open your browser and hit url:  http://127.0.0.1:1337/

You should see familiar Hello World message.

References:
node.js Download: http://nodejs.org/
node.js Hello World Example: http://www.theprojectspot.com/tutorial-post/Node-js-for-beginners-part-1-hello-world/2
More detailed discussion: http://net.tutsplus.com/tutorials/javascript-ajax/this-time-youll-learn-node-js/