[SOLVED] String vs long performance in Java

Issue

As I read in other questions, int has much better performance than a long, and a little better performance than a string.

Let’s say that I have a number too large for an int and I want to compare it with similar numbers. Which type of variable performs better in that case: long or String?

For example, to compare 111.222.333.444 with 555.666.777.888:

long x = 111222333444L;
long y = 555666777888L;
if (x == y) { /*code*/ }

VS

String x = "111222333444";
String y = "555666777888";
if (x.equals(y)) { /*code*/ }

Which case has the best performance? Is the difference significant?

Solution

As I read in other questions, int has much better performance than a long …

Well, what you have read is probably wrong. Or more likely, what you have understood from what you have read is wrong.

It is true that on some machines an arithmetic operation on a long may take longer than the analogous int operation. But on a modern machine there is likely to be no difference (for the arithmetic itself), and even when there is a difference it will be just 1 or 2 clock cycles, i.e. nanoseconds at most.

So "much better" is an exaggeration.

… and a little better performance than a string.

That is wrong too. In fact, operations on int values are going to be a LOT faster than operations on String objects. For instance, consider

String x = "111222333444";
String y = "555666777888";
if (x.equals(y)) { /*code*/ }

The equals method will do the following:

  1. Test if x == y and return true if it is.
  2. Test if y is a String, and return false if it isn’t.
  3. Test that the lengths of the 2 strings are the same and return false if they aren’t.
  4. Loop over the characters in the string, comparing the characters at the same position, and returning false if they aren’t equal.
  5. Return true if the loop finishes without finding non-equal characters.

Plus the overhead of a method call (that is too large to inline).

We are talking about (I estimate) a minimum of 20 to 30 machine instructions, and many more if the two strings are equal, or equal at the start.
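To make the steps above concrete, here is a minimal sketch of that logic. It is an illustration only: the class StringEqualsSketch and the method sketchEquals are hypothetical names, and the real java.lang.String.equals is implemented differently depending on the JDK version.

```java
// Illustrative sketch only; not the actual java.lang.String.equals source.
final class StringEqualsSketch {

    // Hypothetical helper that mirrors the five steps listed above.
    static boolean sketchEquals(String x, Object y) {
        if (x == y) {                          // step 1: same reference
            return true;
        }
        if (!(y instanceof String)) {          // step 2: not a String (also covers null)
            return false;
        }
        String other = (String) y;
        if (x.length() != other.length()) {    // step 3: lengths differ
            return false;
        }
        for (int i = 0; i < x.length(); i++) { // step 4: compare position by position
            if (x.charAt(i) != other.charAt(i)) {
                return false;
            }
        }
        return true;                           // step 5: no mismatch found
    }

    public static void main(String[] args) {
        System.out.println(sketchEquals("111222333444", "555666777888")); // false
        System.out.println(sketchEquals("111222333444", "111222333444")); // true
    }
}
```

Note that the first call exits the loop at the very first character, while two equal strings held in different objects would have to be scanned to the end.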

By contrast, x == y using int or long is just one instruction [1].
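If the numbers arrive as text, one option is to parse each of them once and compare the resulting primitives. The sketch below assumes the dotted format from the question; LongCompareSketch and parseDotted are hypothetical names chosen for illustration.

```java
final class LongCompareSketch {

    // Hypothetical helper: strips the dot separators and parses the digits as a long.
    // Assumes the input contains only digits and dots and fits in a long.
    static long parseDotted(String text) {
        return Long.parseLong(text.replace(".", ""));
    }

    public static void main(String[] args) {
        long x = parseDotted("111.222.333.444");
        long y = parseDotted("555.666.777.888");
        // Primitive comparison: no method call, no loop over characters.
        System.out.println(x == y); // false
    }
}
```

The parsing itself is not free, so this only pays off when each value is compared many times after being parsed.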


However, there is a bigger point here.

The chances are that you are actually wasting your time here. Unless you are doing these operations for millions of numbers, the chances are that the difference in performance is not going to be noticeable.

My advice is:

  • Don’t optimize unless you have to.
  • Don’t even think about this kind of thing unless you have to.
  • If you do have to (i.e. you have real, hard performance numbers to say that your code is too slow), then do it scientifically. Write some benchmarks involving running your >>real<< code on >>real<< data, and use a profiler to figure out where the code is actually spending most of its time. Then optimize only those parts of the code.
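As a rough starting point for the last item, here is a naive timing sketch. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the data is synthetic (a stand-in for your real data), and hand-rolled System.nanoTime timing is easily distorted by JIT compilation and garbage collection. A harness such as JMH, plus a profiler run on the real application, will give more trustworthy numbers.

```java
import java.util.Random;

// Naive timing sketch; treat the printed numbers as a rough indication only.
final class CompareTimingSketch {

    // Counts adjacent equal pairs using primitive ==.
    static int countEqualLongs(long[] values) {
        int matches = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] == values[i - 1]) {
                matches++;
            }
        }
        return matches;
    }

    // Counts adjacent equal pairs using String.equals.
    static int countEqualStrings(String[] values) {
        int matches = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i].equals(values[i - 1])) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        Random random = new Random(42); // fixed seed so runs use the same data
        long[] longs = new long[n];
        String[] strings = new String[n];
        for (int i = 0; i < n; i++) {
            // Synthetic 12-digit values that share a long common prefix.
            longs[i] = 111_222_333_000L + random.nextInt(1000);
            strings[i] = Long.toString(longs[i]);
        }

        // Warm-up rounds so the JIT has a chance to compile both loops.
        for (int round = 0; round < 10; round++) {
            countEqualLongs(longs);
            countEqualStrings(strings);
        }

        long t0 = System.nanoTime();
        int longMatches = countEqualLongs(longs);
        long t1 = System.nanoTime();
        int stringMatches = countEqualStrings(strings);
        long t2 = System.nanoTime();

        System.out.println("long:   " + longMatches + " matches, " + (t1 - t0) / 1_000 + " us");
        System.out.println("String: " + stringMatches + " matches, " + (t2 - t1) / 1_000 + " us");
    }
}
```

Both loops count the same matches, so any gap in the timings comes from the comparison and memory access pattern rather than from different work. Whether that gap matters depends on how much of your program's total time is spent in such comparisons.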

[1] I’m no expert on instruction-level performance, but this document seems to be saying that an Intel CMP instruction takes 1, 2 or 3 clock cycles.

Answered By – Stephen C

Answer Checked By – Robin (BugsFixing Admin)
