Java priority queue implementation - memory locality

I am trying to implement an efficient priority queue in Java. I got to a good implementation of a binary heap but it doesn't have the ideal cache performance. For this I started studying the Van Emde Boas layout in a binary heap which led me to a "blocked" version of a binary heap, where the trick is to calculate the children and parent indices.
Although I was able to do this, the cache behavior (and running time) got worse. I suspect the problem is that locality of reference is not being achieved because this is Java: I'm not sure whether an array of objects actually places the objects contiguously in memory. Can anyone confirm this?
I would also very much like to know what data structure Java's built-in PriorityQueue uses, if anyone knows.

In general, there is no good way to force your objects in the queue to occupy a contiguous chunk of memory. There are, however, some techniques that are suitable for special cases.
At a high level, the techniques involve using byte arrays and 'serializing' data to and from the array. This is actually quite effective if you are storing very simple objects. For example, if you are storing a bunch of 2D points plus weights, you can simply write the byte equivalents of the weight, x-coordinate, and y-coordinate.
The problem at this point, of course, is in allocating instances while peeking/popping. You can avoid this by using a callback.
Note that even in cases where the object being stored itself is complex, using a technique similar to this where you keep one array for the weights and a separate array of references for the actual objects allows you to avoid following the object reference until absolutely necessary.
Going back to the approach for storing simple immutable value types, here's an incomplete sketch of what you could do:
abstract class LowLevelPQ<T> {
interface DataHandler<R, T> {
R handle(byte[] source, int startLoc);
}
LowLevelPQ(int entryByteSize) { ... }
abstract void encode(T element, byte[] target, int startLoc);
abstract T decode(byte[] source, int startLoc);
abstract int compare(byte[] data, int startLoc1, int startLoc2);
// peek and pop have bodies, so they are not abstract
<R> R peek(DataHandler<R, T> handler) { ... }
<R> R pop(DataHandler<R, T> handler) { ... }
}
class WeightedPoint {
WeightedPoint(int weight, double x, double y) { ... }
int weight() { ... }
double x() { ... }
...
}
class WeightedPointPQ extends LowLevelPQ<WeightedPoint> {
WeightedPointPQ() {
super(4 + 8 + 8); // int,double,double
}
int compare(byte[] data, int startLoc1, int startLoc2) {
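// Compares the 4-byte weight prefix byte by byte. This relies on encode()
// writing the int big-endian (most significant byte first) and on the
// weights being non-negative.
for (int i = 0; i < 4; ++i) {
int v1 = 0xFF & data[startLoc1 + i];
int v2 = 0xFF & data[startLoc2 + i];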
if (v1 < v2) { return -1; }
if (v1 > v2) { return 1; }
}
return 0;
}
...
}
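The parallel-array idea mentioned above (primitive weights in one array, object references in another) can be sketched as a small binary min-heap. This is only a minimal sketch, not the answer author's code; the class name `ParallelArrayHeap` and its methods are made up for illustration. Comparisons read only the `int[]`, so the stored objects are not dereferenced until they are returned.

```java
import java.util.Arrays;

/** Hypothetical min-heap: priorities live in an int[], payloads in a parallel Object[]. */
final class ParallelArrayHeap<T> {
    private int[] weights = new int[16];
    private Object[] values = new Object[16];
    private int size;

    void add(int weight, T value) {
        if (size == weights.length) {               // grow both arrays together
            weights = Arrays.copyOf(weights, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        int i = size++;
        // Sift up: only the primitive weights array is read while comparing.
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (weights[parent] <= weight) break;
            weights[i] = weights[parent];
            values[i] = values[parent];
            i = parent;
        }
        weights[i] = weight;
        values[i] = value;
    }

    @SuppressWarnings("unchecked")
    T poll() {
        if (size == 0) return null;
        T result = (T) values[0];
        int last = --size;
        int w = weights[last];
        Object v = values[last];
        values[last] = null;                        // drop the reference in the vacated slot
        int i = 0;
        // Sift the former last element down from the root.
        while (true) {
            int child = 2 * i + 1;
            if (child >= size) break;
            if (child + 1 < size && weights[child + 1] < weights[child]) child++;
            if (w <= weights[child]) break;
            weights[i] = weights[child];
            values[i] = values[child];
            i = child;
        }
        if (size > 0) {
            weights[i] = w;
            values[i] = v;
        }
        return result;
    }

    int size() { return size; }
}
```

The trade-off is that every move in the heap writes to two arrays instead of one, so whether this helps in practice has to be measured.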

I don't think it would. Remember, "arrays of objects" aren't arrays of objects, they are arrays of object references (unlike arrays of primitives which really are arrays of the primitives). I'd expect the object references are contiguous in memory, but since you can make those references refer to any objects you want whenever you want, I doubt there's any guarantee that the objects referred to by the array of references will be contiguous in memory.
For what it's worth, the JLS section on arrays says nothing about any guarantees of contiguousness.

I think there is some FUD going on here. It is basically inconceivable that any implementation of arrays would not use contiguous memory. And the way the term is used in the JVM specification when describing the .class file format makes it pretty clear that no other implementation is contemplated.
java.util.PriorityQueue uses a binary heap, as it says in the Javadoc, implemented via an array.
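For reference, an array-backed binary heap needs no pointers: the parent and children of a slot are found by index arithmetic. Below is a minimal sketch of the standard implicit layout for a zero-based array; the helper names are made up for illustration and are not part of the java.util.PriorityQueue API.

```java
/** Hypothetical helpers for the implicit binary-heap layout in a zero-based array. */
final class HeapIndex {
    /** Parent slot of i; only meaningful for i > 0 (the root has no parent). */
    static int parent(int i) { return (i - 1) / 2; }

    /** Left child slot of i. */
    static int left(int i) { return 2 * i + 1; }

    /** Right child slot of i. */
    static int right(int i) { return 2 * i + 2; }
}
```

Note that with an array of object references, only the references sit next to each other; the objects they point to may be anywhere on the heap.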

Related

Java set variable and method types from one place

I am trying to make a flexible data structure that I can copy paste around. I want it to handle only one data type, but be able to change this easily when copy pasting this around. Consider this simple example:
class Data {
//customize these
int a;
int b;
int val;
private int f() {
return a + b;
}
//built-in functions that I don't customize
public void build() {
int temp = f();
this.val = temp;
}
public int query() {
return this.val;
}
}
Suppose I want a and b to be arrays and f to be concatenation. I would have to change the type of a, b, val, f, query, and temp. This would be a pain to change if I have more code that is dependent on the type. I want something like this:
ClassType c = Integer; //or maybe Long, int[], ArrayList ...
c a;
c b;
private c f() {
...
I am not looking for generics. In any particular program, I will have a single fixed type. This is purely to make copy-pasting this data structure around easier, across multiple independent programs. i.e., I do not want to specify these data types on each instantiation (with generics).
Also, ideally, this would not add overhead for int which is a common use case for me (so simply creating a class of whatever data type would not be ideal).
More specifics (kind of unrelated): I want a Segment Tree that I can "customize" to be range min query, sort the range as array, 2D Segment Tree, etc. (note that these 2 cases have different data types, and thus different method signatures) I already have the methods and data types I need figured out, and now I need to implement it.

In Java, the data type of a variable is resolved at compile time, while its value is resolved at runtime. That is why a variable's type cannot be set from a value in Java.

How to efficiently store a set of tuples/pairs in Java

I need to perform a check if the combination of a long value and an integer value were already seen before in a very performance-critical part of an application. Both values can become quite large, at least the long will use more than MAX_INT values in some cases.
Currently I have a very simple implementation using a Set<Pair<Integer, Long>>, however this will require too many allocations, because even when the object is already in the set, something like seen.add(Pair.of(i, l)) to add/check existence would allocate the Pair for each call.
Is there a better way in Java (without libraries like Guava, Trove or Apache Commons), to do this check with minimal allocations and in good O(?)?
Two ints would be easy because I could combine them into one long in the Set, but the long cannot be avoided here.
Any suggestions?

Here are two possibilities.
Both of the following suggestions store a bunch of pairs together as int triples in an int[]. The first int would be the int, and the next two ints would be the upper and lower halves of the long.
If you didn't mind a 33% extra space disadvantage in exchange for an addressing speed advantage, you could use a long[] instead and store the int and long in separate indexes.
You'd never call an equals method. You'd just compare the three ints with three other ints, which would be very fast. You'd never call a compareTo method. You'd just do a custom lexicographic comparison of the three ints, which would be very fast.
B* tree
If memory usage is the ultimate concern, you can make a B* tree using an int[][] or an ArrayList<int[]>. B* trees are relatively quick and fairly compact.
There are also other types of B-trees that might be more appropriate to your particular use case.
Custom hash set
You can also implement a custom hash set with a custom, fast-calculated hash function (perhaps XOR the int and the upper and lower halves of the long together, which will be very fast) rather than relying on the hashCode method.
You'd have to figure out how to implement the int[] buckets to best suit the performance of your application. For example, how do you want to convert your custom hash code into a bucket number? Do you want to rebucket everything when the buckets start getting too many elements? And so on.
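As one possible shape for the custom hash set, here is a minimal open-addressing sketch. It is an illustration under assumptions, not a tuned implementation: the class name `IntLongSet`, the linear-probing scheme, the 50% load limit, and the use of parallel `int[]`/`long[]` arrays instead of int triples are all choices made for this example. The hash is the XOR of the three 32-bit words suggested above, with one extra mixing step.

```java
/** Hypothetical allocation-free set of (int, long) pairs using linear probing. */
final class IntLongSet {
    private int[] ints = new int[16];        // capacity is always a power of two
    private long[] longs = new long[16];
    private boolean[] used = new boolean[16];
    private int size;

    private static int hash(int i, long l) {
        int h = i ^ (int) (l >>> 32) ^ (int) l;  // XOR the int with both halves of the long
        return h ^ (h >>> 16);                   // fold high bits into the low bits
    }

    /** Adds the pair; returns true if it was not already present (like Set.add). */
    boolean add(int i, long l) {
        if ((size + 1) * 2 > used.length) resize();  // keep the table at most half full
        int mask = used.length - 1;
        int slot = hash(i, l) & mask;
        while (used[slot]) {
            if (ints[slot] == i && longs[slot] == l) return false;
            slot = (slot + 1) & mask;                // probe the next slot
        }
        used[slot] = true;
        ints[slot] = i;
        longs[slot] = l;
        size++;
        return true;
    }

    boolean contains(int i, long l) {
        int mask = used.length - 1;
        int slot = hash(i, l) & mask;
        while (used[slot]) {
            if (ints[slot] == i && longs[slot] == l) return true;
            slot = (slot + 1) & mask;
        }
        return false;
    }

    int size() { return size; }

    private void resize() {
        int[] oldInts = ints;
        long[] oldLongs = longs;
        boolean[] oldUsed = used;
        int newCapacity = oldUsed.length * 2;
        ints = new int[newCapacity];
        longs = new long[newCapacity];
        used = new boolean[newCapacity];
        size = 0;
        for (int s = 0; s < oldUsed.length; s++) {
            if (oldUsed[s]) add(oldInts[s], oldLongs[s]);
        }
    }
}
```

A single `add(i, l)` call then covers both the "seen before?" check and the insertion, with allocations happening only when the table grows.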

How about creating a class that holds two primitives instead? You would drop at least 24 bytes just for the headers of Integer and Long in a 64 bit JVM.
Under these conditions you are looking for a pairing function, which generates a unique number from two numbers. The Wikipedia page on pairing functions has a very good (and simple) example of one such possibility.

How about
class Pair {
int v1;
long v2;
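@Override
public boolean equals(Object o) {
// guard against null and foreign types instead of throwing ClassCastException
if (!(o instanceof Pair)) { return false; }
Pair other = (Pair) o;
return v1 == other.v1 && v2 == other.v2;
}
@Override
public int hashCode() {
return 31 * (31 + Integer.hashCode(v1)) + Long.hashCode(v2);
}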
}
class Store {
// initial capacity should be tweaked
private static final Set<Pair> store = new HashSet<>(100*1024);
private static final ThreadLocal<Pair> threadPairUsedForContains = new ThreadLocal<>();
void init() { // each thread has to call init() first
threadPairUsedForContains.set(new Pair());
}
boolean contains(int v1, long v2) { // zero allocation contains()
Pair pair = threadPairUsedForContains.get();
pair.v1 = v1;
pair.v2 = v2;
return store.contains(pair);
}
void add(int v1, long v2) {
Pair pair = new Pair();
pair.v1 = v1;
pair.v2 = v2;
store.add(pair);
}
}

Code duplication caused by primitive types: How to avoid insanity?

In one of my Java projects I am plagued by code repetition due to the way Java handles (not) primitives. After having to manually copy the same change to four different locations (int, long, float, double) again, for the third time, again and again I came really close (?) to snapping.
In various forms, this issue has been brought up now and then on StackOverflow:
Managing highly repetitive code and documentation in Java
How to avoid repetition when working with primitive types?
Passing dynamic list of primitives to a Java method
The consensus seemed to converge to two possible alternatives:
Use some sort of code generator.
What can you do? C'est la vie!
Well, the second solution is what I am doing now and it is slowly becoming dangerous for my sanity, much like the well known torture technique.
Two years have passed since these questions were asked and Java 7 came along. I am, therefore, hopeful for an easier and/or more standard solution.
Does Java 7 have any changes that might ease the strain in such cases? I could not find anything in the condensed change summaries, but perhaps there is some obscure new feature somewhere?
While source code generation is an alternative, I'd prefer a solution supported using the standard JDK feature set. Sure, using cpp or another code generator would work, but it adds more dependencies and requires changes to the build system.
The only code generation system of sorts that seems to be supported by the JDK is via the annotations mechanism. I envision a processor that would expand source code like this:
@Primitives({ "int", "long", "float", "double" })
@PrimitiveVariable
int max(@PrimitiveVariable int a, @PrimitiveVariable int b) {
return (a > b)?a:b;
}
The ideal output file would contain the four requested variations of this method, preferably with associated Javadoc comments etc. Is there an annotation processor somewhere that handles this case? If not, what would it take to build one?
Perhaps some other trick that has popped up recently?
EDIT:
An important note: I would not be using primitive types unless I had a reason. Even now there is a very real performance and memory impact by the use of boxed types in some applications.
EDIT 2:
Using max() as an example allows the use of the compareTo() method that is available in all numeric boxed types. This is a bit trickier:
int sum(int a, int b) {
return a + b;
}
How could one go about supporting this method for all numeric boxed types without actually writing it six or seven times?

I tend to use a "super type" like long or double if I still want a primitive. The performance is usually very close and it avoids creating lots of variations. BTW: registers in a 64-bit JVM will all be 64-bit anyway.

Why are you hung up on primitives? The wrappers are extremely lightweight, and auto-boxing and generics do the rest:
public static <T extends Number & Comparable<T>> T max(T a, T b) {
return a.compareTo(b) > 0 ? a : b;
}
This all compiles and runs correctly:
public static void main(String[] args) {
int i = max(1, 3);
long l = max(6,7);
float f = max(5f, 4f);
double d = max(2d, 4d);
byte b = max((byte)1, (byte)2);
short s = max((short)1, (short)2);
}
Edited
OP has asked about a generic, auto-boxed solution for sum(), and well, here it is.
public static <T extends Number> T sum(T... numbers) throws Exception {
double total = 0;
for (Number number : numbers) {
total += number.doubleValue();
}
if (numbers[0] instanceof Float || numbers[0] instanceof Double) {
return (T) numbers[0].getClass().getConstructor(String.class).newInstance(total + "");
}
return (T) numbers[0].getClass().getConstructor(String.class).newInstance((total + "").split("\\.")[0]);
}
It's a little lame, but not as lame as doing a large series of instanceof and delegating to a fully typed method. The instanceof is required because while all Numbers have a String constructor, Numbers other than Float and Double can only parse a whole number (no decimal point); although the total will be a whole number, we must remove the decimal point from the Double.toString() before sending it into the constructor for these other types.

Does Java 7 have any changes that might ease the strain in such cases?
No.
Is there somewhere an annotation processor to handle this case?
Not that I am aware of.
If not, what would it take to build one?
Time, or money. :-)
This seems to me like a problem-space where it would be difficult to come up with a general solution that works well ... beyond trivial cases. Conventional source code generation or a (textual) preprocessor seems more promising to me. (I'm not an Annotation processor expert though.)

If the extraordinary verbosity of Java is getting to you, look into some of the new, higher-level languages which run on the JVM and can interoperate with Java, like Clojure, JRuby, Scala, and so on. Your out-of-control primitive repetition will become a non-issue. But the benefits will go much further than that -- there are all kinds of ways which the languages just mentioned allow you to get more done with less detailed, repetitive, error-prone code (as compared to Java).
If performance is a problem, you can drop back into Java for the performance-critical bits (using primitive types). But you might be surprised at how often you can still get a good level of performance in the higher-level language.
I personally use both JRuby and Clojure; if you are coming from a Java/C/C#/C++ background, both have the potential to change the way you think about programming.

Heh. Why not get sneaky? With reflection, you can pull the annotations for a method (annotations similar to the example you've posted). You can then use reflection to get the member names and put in the appropriate types... in a System.out.println statement.
You would run this once, or each time you modded the class. The output could then be copy-pasted in. This would probably save you significant time, and not be too hard to develop.
Hm, as for the contents of the methods... I mean, if all your methods are trivial, you could hard-code the style (i.e. if methodName.equals("max"), print return a>b?a:b etc., where methodName is determined via reflection), or you could, ummmmm... Hm. I'm imagining the contents can be easily copy-pasted, but that just seems like more work.
Oh! Why not make another annotation called "contents", give it a string value of the method contents, add that to the member, and now you can print out the contents too.
In the very least, the time spent coding up this helper, even if about as long as doing the tedious work, well, it would be more interesting, riiiight?

Your question is pretty elaborate, as you already seem to know all the 'good' answers. Since the language design does not allow primitives as generic type parameters, the best practical answer is where @PeterLawrey is heading.
public class PrimitiveGenerics {
public static double genericMax( double a, double b) {
return (a > b) ?a:b;
}
public int max( int a, int b) {
return (int) genericMax(a, b);
}
public long max( long a, long b) {
return (long) genericMax(a, b);
}
public float max( float a, float b) {
return (float) genericMax(a, b);
}
public double max( double a, double b) {
return (double) genericMax(a, b);
}
}
The list of primitive types is small and hopefully constant in future evolution of the language and double type is the widest/most general.
In the worst case, you compute using 64-bit variables where 32 bits would suffice. There is a performance penalty for the conversion (tiny) and for passing by value into one more method (small), but no objects are created, and object creation is the main (and really huge) penalty of using primitive wrappers.
I also made the shared method static so it is bound early rather than at run time. It is just one method, and JVM optimization usually takes care of this anyway, but it won't hurt. It may depend on the real-case scenario.
Would be lovely if someone tested it, but I believe this is the best solution.
UPDATE:
Based on @thkala's comment, double can represent longs exactly only up to a certain magnitude; beyond that it loses precision:
public class Asdf2 {
public static void main(String[] args) {
System.out.println(Double.MAX_VALUE); //1.7976931348623157E308
System.out.println( Long.MAX_VALUE); //9223372036854775807
System.out.println((double) Long.MAX_VALUE); //9.223372036854776E18
}
}
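To make the precision limit concrete: a double has a 53-bit significand, so 2^53 + 1 is the first positive long that cannot be stored exactly. The sketch below assumes a helper `maxViaDouble` with the same shape as genericMax above (widen to double, compare, narrow back); the class and method names are made up for this example.

```java
final class LongPrecision {
    /** Same approach as genericMax: widen to double, compare, narrow back to long. */
    static long maxViaDouble(long a, long b) {
        double da = a;
        double db = b;
        return (long) (da > db ? da : db);
    }

    public static void main(String[] args) {
        long exact = (1L << 53) + 1;  // 9007199254740993
        System.out.println((long) (double) exact == exact);          // false: rounded to 2^53
        System.out.println(maxViaDouble(exact, exact - 1) == exact); // false: both inputs collapse to 2^53
        System.out.println(Math.max(exact, exact - 1) == exact);     // true: the long overload stays exact
    }
}
```

So the double-based delegation is safe for int and float arguments, but a long overload needs its own comparison if large values can occur.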

From the performance point of view (I write a lot of CPU-bound algorithms too), I use my own boxings that are not immutable. This allows mutable numbers to be used in collections like ArrayList and HashMap with high performance.
It takes one long preparation step to make all the primitive containers with their repetitive code, and then you just use them. As I also deal with 2-dimensional, 3-dimensional etc values, I also created those for myself. The choice is yours.
like:
Vector1i - 1 integer, replaces Integer
Vector2i - 2 integer, replaces Point and Dimension
Vector2d - 2 doubles, replaces Point2D.Double
Vector4i - 4 integers, could replace Rectangle
Vector2f - 2-dimensional float vector
Vector3f - 3-dimensional float vector
...etc...
All of them represent a generalized 'vector' in mathematics, hence the name for all these primitives.
One downside is that you cannot write a+b; you have to make methods like a.add(b), and for a=a+b I chose to name the methods like a.addSelf(b). If this bothers you, take a look at Ceylon, which I discovered very recently. It's a layer on top of Java (JVM/Eclipse compatible) created especially to address its limitations (like operator overloading).
One other thing, watch out when using these classes as a key in a Map, as sorting/hashing/comparing will go haywire when the value changes.
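The hashing hazard can be shown in a few lines. This is a sketch under assumptions: `MutableInt` is a made-up stand-in for a class like Vector1i, with equals and hashCode derived from its mutable field.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical mutable box, standing in for something like Vector1i. */
final class MutableInt {
    int value;

    MutableInt(int value) { this.value = value; }

    void addSelf(MutableInt other) { value += other.value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableInt && ((MutableInt) o).value == value;
    }

    @Override
    public int hashCode() { return value; }
}

final class MutableKeyDemo {
    /** Returns true when the set can no longer find an element that was mutated after insertion. */
    static boolean lostAfterMutation() {
        Set<MutableInt> set = new HashSet<>();
        MutableInt key = new MutableInt(1);
        set.add(key);                    // stored under the hash of value 1
        key.addSelf(new MutableInt(1));  // value is now 2, so the hash changed
        return !set.contains(key);       // lookup uses the new hash and misses
    }
}
```

The element is still physically in the set (its size is still 1), but lookups by value no longer reach it, so mutable boxes are best kept out of keys or treated as frozen while they are stored.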

I'd agree with previous answers/comments that say there isn't a way to do exactly what you want "using the standard JDK feature set." Thus, you are going to have to do some code generation, although it won't necessarily require changes to the build system. Since you ask:
... If not, what would it take to build one?
... For a simple case, not too much, I think. Suppose I put my primitive operations in a util class:
public class NumberUtils {
// #PrimitiveMethodsStart
/** Find maximum of int inputs */
public static int max(int a, int b) {
return (a > b) ? a : b;
}
/** Sum the int inputs */
public static int sum(int a, int b) {
return a + b;
}
// #PrimitiveMethodsEnd
// #GeneratedPrimitiveMethodsStart - Do not edit below
// #GeneratedPrimitiveMethodsEnd
}
Then I can write a simple processor in less than 30 lines as follows:
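import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.PrintWriter;
public class PrimitiveMethodProcessor {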
private static final String PRIMITIVE_METHODS_START = "#PrimitiveMethodsStart";
private static final String PRIMITIVE_METHODS_END = "#PrimitiveMethodsEnd";
private static final String GENERATED_PRIMITIVE_METHODS_START = "#GeneratedPrimitiveMethodsStart";
private static final String GENERATED_PRIMITIVE_METHODS_END = "#GeneratedPrimitiveMethodsEnd";
public static void main(String[] args) throws Exception {
String fileName = args[0];
BufferedReader inputStream = new BufferedReader(new FileReader(fileName));
PrintWriter outputStream = null;
StringBuilder outputContents = new StringBuilder();
StringBuilder methodsToCopy = new StringBuilder();
boolean inPrimitiveMethodsSection = false;
boolean inGeneratedPrimitiveMethodsSection = false;
try {
for (String line;(line = inputStream.readLine()) != null;) {
if(line.contains(PRIMITIVE_METHODS_END)) inPrimitiveMethodsSection = false;
if(inPrimitiveMethodsSection)methodsToCopy.append(line).append('\n');
if(line.contains(PRIMITIVE_METHODS_START)) inPrimitiveMethodsSection = true;
if(line.contains(GENERATED_PRIMITIVE_METHODS_END)) inGeneratedPrimitiveMethodsSection = false;
if(!inGeneratedPrimitiveMethodsSection)outputContents.append(line).append('\n');
if(line.contains(GENERATED_PRIMITIVE_METHODS_START)) {
inGeneratedPrimitiveMethodsSection = true;
String methods = methodsToCopy.toString();
for (String primitive : new String[]{"long", "float", "double"}) {
outputContents.append(methods.replaceAll("int\\s", primitive + " ")).append('\n');
}
}
}
outputStream = new PrintWriter(new FileWriter(fileName));
outputStream.print(outputContents.toString());
} finally {
inputStream.close();
if(outputStream!= null) outputStream.close();
}
}
}
This will fill the #GeneratedPrimitiveMethods section with long, float and double versions of the methods in the #PrimitiveMethods section.
// #GeneratedPrimitiveMethodsStart - Do not edit below
/** Find maximum of long inputs */
public static long max(long a, long b) {
return (a > b) ? a : b;
}
...
This is an intentionally simple example, and I'm sure it doesn't cover all cases, but you get the point and can see how it could be extended, e.g. to search multiple files or to use normal annotations and detect method ends.
Furthermore, whilst you could set this up as a step in your build system, I set it up to run as a builder before the Java builder in my Eclipse project. Now whenever I edit the file and hit save, it's updated automatically, in place, in less than a quarter of a second. Thus, this becomes more of an editing tool than a step in the build system.
Just a thought...

Is a switch statement the fastest way to implement operator interpretation in Java

Is a switch statement the fastest way to implement operator interpretation in Java
public boolean accept(final int op, int x, int val) {
switch (op) {
case OP_EQUAL:
return x == val;
case OP_BIGGER:
return x > val;
case OP_SMALLER:
return x < val;
default:
return true;
}
}
In this simple example, obviously yes. Now imagine you have 1000 operators. would it still be faster than a class hierarchy? Is there a threshold when a class hierarchy becomes more efficient in speed than a switch statement? (in memory obviously not)
abstract class Op {
abstract public boolean accept(int x, int val);
}
And then one class per operator.
EDIT:
Sorry, I should have been more specific by the look of the answers.
The operator is totally unknown and I'm using JDK 1.4. No choice. No enums. No closures. :(
The operator is chosen by the user among many, many, many choices. For simplicity's sake, imagine a GUI list with 1000 operations; when the user selects one, the op code of the switch statement is chosen. Using a class hierarchy, the user would select a class.
I'm asking this question because someone must have tested it before. I don't feel like creating 1000 classes and 1000 bogus op codes to test it. If nobody has done it, I will test it and report the results, if they have any meaning at all.

EDIT:
Okay, since you have to use JDK 1.4, my original answer is a no-go (left below for reference). I would guess that the switch is not as fast as the abstract class-based solution when you're just looking at the apply(which,a,b) vs which.apply(a,b) call. You'll just have to test that.
However, when testing, you may also want to consider start-up time, memory footprint, etc.
ORIGINAL:
public enum OPERATION {
// ...operators+implementation, e.g.:
GREATER_THAN { public boolean apply(int a, int b) { return a > b; } };
public abstract boolean apply(int a, int b);
}
usage:
OPERATION x = //..however you figure out which
boolean result = x.apply(a,b);
This is one of the use cases for enums in Effective Java. It works exactly like the switch, only less confusing.

Because of the way a switch statement is usually implemented in a jvm, with a lookup table, it is likely it is going to be faster, with a small or big number of operators. This is just guessing; to have a definite answer you need to benchmark it on the system it is intended to run.
But, that is just a micro-optimization which you shouldn't care about unless profiling shows that it could really make a difference. Using integers instead of a specific class (or enum) makes code less readable. A huge switch statement with 1000 cases is a sign of a bad design. And that will have an influence on the code that is using the operators; less readable, more bugs, harder to refactor,...
And to get back to performance, which seems to be the goal here. In hard to read, badly designed code, the changes required for macro-optimizations become harder. And those optimizations are usually a lot more important than micro-optimizations like this switch.

I do not know what is fastest, nor do I think there are any guarantees. Optimization of code is very much dependent on both compiler and runtime.
I do think that it's hard to beat a switch statement. Due to the limitations Java puts on the types which can be switched it can fairly easily be compiled to a lookup table, which is about the fastest access you can get.

Use a table-driven method: as a previous poster pointed out, you can use the operator as the index of an array. The value stored in the array could be an instance of a class that performs the comparison. The array can be initialized statically, or better, on demand (lazy loading pattern).
e.g.
// Interface and classes
interface Operator {
boolean operate(int x, int y);
}
class EqualsOperator implements Operator {
// interface methods are implicitly public, so implementations must be public
public boolean operate(int x, int y){
return x == y;
}
}
class NotEqualsOperator implements Operator {
public boolean operate(int x, int y){
return x != y;
}
}
...
// Static initialization
Operator[] operators = new Operator[n];
operators[0] = new EqualsOperator();
operators[1] = new NotEqualsOperator();
...
// Replaces the switch: the op code is the array index
public boolean accept(final int op, int x, int val) {
return operators[op].operate(x,val);
}

If the calling method already has to decide which operator value to use and call accept(), then the fastest thing would be to do the comparisons inline in that same calling method.
Alternatively, use three methods (or strategies):
public boolean acceptGreater(int x, int val) {
return x > val;
}
public boolean acceptLess(int x, int val) {
return x < val;
}
public boolean acceptEquals(int x, int val) {
return x == val;
}

I wouldn't look at this purely from a raw performance point of view, but I'd evaluate this as a refactoring candidate, see c2:Refactor Mercilessly. I liked the answer given to code reusability:
If you repeat it once, copy it.
If you repeat it twice, refactor it.
I'd identify the adding of multiple case statements as repetition, and then I'd refactor to implement the Strategy Pattern.
I'd name the operator classes with a strategy suffix, and implement the execute method.
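A minimal sketch of that refactoring, assuming an interface named `OperatorStrategy` with an `execute` method (both names are illustrative, chosen to follow the naming suggested above). It avoids generics, enums, and annotations, so it should also fit the JDK 1.4 constraint from the question.

```java
/** Hypothetical strategy interface: one implementation per operator. */
interface OperatorStrategy {
    boolean execute(int x, int val);
}

final class EqualStrategy implements OperatorStrategy {
    public boolean execute(int x, int val) { return x == val; }
}

final class BiggerStrategy implements OperatorStrategy {
    public boolean execute(int x, int val) { return x > val; }
}

final class SmallerStrategy implements OperatorStrategy {
    public boolean execute(int x, int val) { return x < val; }
}

/** The caller holds the chosen strategy instead of an int op code. */
final class Filter {
    private final OperatorStrategy strategy;

    Filter(OperatorStrategy strategy) { this.strategy = strategy; }

    boolean accept(int x, int val) { return strategy.execute(x, val); }
}
```

Adding an operator then means adding a class rather than another case label; whether the virtual call is faster or slower than the switch still has to be measured.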

I've always found that the Java switch statement is not as powerful as I need. In its latest release, lambdaj implements one with a smart use of closures and Hamcrest matchers.

Grab a segment of an array in Java without creating a new array on heap

I'm looking for a method in Java that will return a segment of an array. An example would be to get the byte array containing the 4th and 5th bytes of a byte array. I don't want to have to create a new byte array in the heap memory just to do that. Right now I have the following code:
void doSomethingWithTwoBytes(byte[] twoByteArray);
void someMethod(byte[] bigArray)
{
byte[] x = {bigArray[4], bigArray[5]};
doSomethingWithTwoBytes(x);
}
I'd like to know if there was a way to just do doSomething(bigArray.getSubArray(4, 2)) where 4 is the offset and 2 is the length, for example.

Disclaimer: This answer does not conform to the constraints of the question:
I don't want to have to create a new byte array in the heap memory just to do that.
(Honestly, I feel my answer is worthy of deletion. The answer by @unique72 is correct. Imma let this edit sit for a bit and then I shall delete this answer.)
I don't know of a way to do this directly with arrays without additional heap allocation, but the other answers using a sub-list wrapper have additional allocation for the wrapper only – but not the array – which would be useful in the case of a large array.
That said, if one is looking for brevity, the utility method Arrays.copyOfRange() was introduced in Java 6 (late 2006?):
byte [] a = new byte [] {0, 1, 2, 3, 4, 5, 6, 7};
// get a[4], a[5]
byte [] subArray = Arrays.copyOfRange(a, 4, 6);

Arrays.asList(myArray) delegates to new ArrayList(myArray) (the private nested class java.util.Arrays.ArrayList, not java.util.ArrayList), which doesn't copy the array but just stores the reference. Using List.subList(start, end) after that makes a SubList which just references the original list (which still just references the array). No copying of the array or its contents, just wrapper creation, and all lists involved are backed by the original array. (I thought it'd be heavier.)
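One caveat for the byte[] case in the question: this only gives an element view for arrays of object references. A small sketch (class and method names are made up for illustration) showing that a subList of a Byte[] writes through to the array, while passing a primitive byte[] to Arrays.asList yields a one-element list containing the array itself:

```java
import java.util.Arrays;
import java.util.List;

final class AsListDemo {
    /** A subList view over a boxed array writes through to the original array. */
    static boolean writesThrough() {
        Byte[] boxed = {0, 1, 2, 3, 4, 5};
        List<Byte> view = Arrays.asList(boxed).subList(4, 6);
        view.set(0, (byte) 42);
        return boxed[4] == 42;
    }

    /** With a primitive array, the whole byte[] becomes the single list element. */
    static int primitiveArrayListSize() {
        byte[] raw = {0, 1, 2, 3, 4, 5};
        List<byte[]> wrapped = Arrays.asList(raw);
        return wrapped.size();
    }
}
```

So for a byte[], the list approach requires boxed storage, which costs far more memory than the two-byte copy it was meant to avoid.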

If you're seeking a pointer style aliasing approach, so that you don't even need to allocate space and copy the data then I believe you're out of luck.
System.arraycopy() will copy from your source to destination, and efficiency is claimed for this utility. You do need to allocate the destination array.

One way is to wrap the array in java.nio.ByteBuffer, use the absolute put/get functions, and slice the buffer to work on a subarray.
For instance:
void doSomething(ByteBuffer twoBytes) {
byte b1 = twoBytes.get(0);
byte b2 = twoBytes.get(1);
...
}
void someMethod(byte[] bigArray) {
int offset = 4;
int length = 2;
doSomething(ByteBuffer.wrap(bigArray, offset, length).slice());
}
Note that you have to call both wrap() and slice(), since wrap() by itself only affects the relative put/get functions, not the absolute ones.
ByteBuffer can be a bit tricky to understand, but is most likely efficiently implemented, and well worth learning.
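To check that the slice is a view rather than a copy, a short sketch can help. The class and method names are made up for illustration; `wrap`, `slice`, and the absolute `get`/`put` are the standard java.nio.ByteBuffer calls used above.

```java
import java.nio.ByteBuffer;

final class SliceDemo {
    /** Returns a buffer whose index 0 corresponds to bigArray[offset]. */
    static ByteBuffer view(byte[] bigArray, int offset, int length) {
        return ByteBuffer.wrap(bigArray, offset, length).slice();
    }

    public static void main(String[] args) {
        byte[] big = {0, 1, 2, 3, 4, 5, 6, 7};
        ByteBuffer twoBytes = view(big, 4, 2);
        System.out.println(twoBytes.get(0));   // 4: absolute index 0 maps to big[4]
        twoBytes.put(1, (byte) 99);            // writes through to the backing array
        System.out.println(big[5]);            // 99
    }
}
```

The only allocation is the small buffer object itself; the two bytes are never copied.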

Use the java.nio Buffer classes. A Buffer is a lightweight wrapper for buffers of various primitive types and helps manage slicing, position, conversion, byte ordering, etc.
If your bytes originate from a Stream, the NIO Buffers can use "direct mode" which creates a buffer backed by native resources. This can improve performance in a lot of cases.

You could use ArrayUtils.subarray in Apache Commons. Not perfect, but a bit more intuitive than System.arraycopy. The downside is that it does introduce another dependency into your code.

I see the subList answer is already here, but here's code that demonstrates that it's a true sublist, not a copy:
public class SubListTest extends TestCase {
public void testSubarray() throws Exception {
Integer[] array = {1, 2, 3, 4, 5};
List<Integer> list = Arrays.asList(array);
List<Integer> subList = list.subList(2, 4);
assertEquals(2, subList.size());
assertEquals((Integer) 3, subList.get(0));
list.set(2, 7);
assertEquals((Integer) 7, subList.get(0));
}
}
I don't believe there's a good way to do this directly with arrays, however.

List.subList(int startIndex, int endIndex)

Lists allow you to use and work with a subList of something transparently. Primitive arrays would require you to keep track of some kind of offset and limit yourself. ByteBuffers have similar options, as far as I've heard.
Edit:
If you are in charge of the useful method, you could just define it with bounds (as done in many array-related methods in Java itself):
void doUseful(byte[] arr, int start, int len) {
// implementation here
}
void doUseful(byte[] arr) {
doUseful(arr, 0, arr.length);
}
It's not clear, however, whether you work on the array elements themselves, e.g. compute something and write back the result.

One option would be to pass the whole array and the start and end indices, and iterate between those instead of iterating over the whole array passed.
void method1(byte[] array) {
method2(array,4,5);
}
void method2(byte[] smallarray,int start,int end) {
for ( int i = start; i <= end; i++ ) {
....
}
}

Java references always point to an object. The object has a header that, amongst other things, identifies the concrete type (so casts can fail with ClassCastException). For arrays, the start of the object also includes the length; the data then follows immediately after in memory (technically an implementation is free to do what it pleases, but it would be daft to do anything else). So, you can't have a reference that points somewhere into an array.
In C pointers point anywhere and to anything, and you can point to the middle of an array. But you can't safely cast or find out how long the array is. In D the pointer contains an offset into the memory block and length (or equivalently a pointer to the end, I can't remember what the implementation actually does). This allows D to slice arrays. In C++ you would have two iterators pointing to the start and end, but C++ is a bit odd like that.
So getting back to Java, no you can't. As mentioned, NIO ByteBuffer allows you to wrap an array and then slice it, but gives an awkward interface. You can of course copy, which is probably very much faster than you would think. You could introduce your own String-like abstraction that allows you to slice an array (the current Sun implementation of String has a char[] reference plus a start offset and length; higher-performance implementations just have the char[]). byte[] is low level, but any class-based abstraction you put on that is going to make an awful mess of the syntax, until JDK7 (perhaps).
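The "String-like abstraction" can be as small as an array reference plus an offset and a length. A minimal sketch follows; the class name `ByteSlice` and its methods are made up for illustration and do not correspond to any JDK class.

```java
/** Hypothetical read-only view over a range of a byte[]; no bytes are copied. */
final class ByteSlice {
    private final byte[] array;
    private final int offset;
    private final int length;

    ByteSlice(byte[] array, int offset, int length) {
        if (offset < 0 || length < 0 || offset > array.length - length) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        }
        this.array = array;
        this.offset = offset;
        this.length = length;
    }

    int length() { return length; }

    byte get(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index=" + index);
        }
        return array[offset + index];
    }

    /** Sub-slice relative to this slice; shares the same backing array. */
    ByteSlice slice(int from, int len) {
        if (from < 0 || len < 0 || from > length - len) {
            throw new IndexOutOfBoundsException("from=" + from + ", len=" + len);
        }
        return new ByteSlice(array, offset + from, len);
    }
}
```

Each slice still allocates one small wrapper object, and callers must use `get(i)` instead of `[]`, which is the syntax cost mentioned above.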

@unique72's answer as a simple function or one-liner; you may need to replace Object with the class type you wish to 'slice'. Two variants are given to suit various needs. Note that toArray() allocates a new array, so the result is a copy rather than a view.
/// Extract out array from starting position onwards
public static Object[] sliceArray( Object[] inArr, int startPos ) {
return Arrays.asList(inArr).subList(startPos, inArr.length).toArray();
}
/// Extract out array from starting position to ending position
public static Object[] sliceArray( Object[] inArr, int startPos, int endPos ) {
return Arrays.asList(inArr).subList(startPos, endPos).toArray();
}

How about a thin List wrapper?
List<Byte> getSubArrayList(final byte[] array, final int offset, final int size) {
return new AbstractList<Byte>() {
// overriding methods must be public, as in AbstractList
@Override
public Byte get(int index) {
if (index < 0 || index >= size)
throw new IndexOutOfBoundsException();
return array[offset+index];
}
@Override
public int size() {
return size;
}
};
}
(Untested)

I needed to iterate through the end of an array and didn't want to copy the array. My approach was to make an Iterable over the array.
// requires: import java.util.Iterator; import java.util.NoSuchElementException;
public static Iterable<String> sliceArray(final String[] array,
final int start) {
return new Iterable<String>() {
@Override
public Iterator<String> iterator() {
return new Iterator<String>() {
// position is kept per iterator so the Iterable can be traversed more than once
int posn = start;
@Override
public boolean hasNext() {
return posn < array.length;
}
@Override
public String next() {
if (!hasNext()) {
throw new NoSuchElementException();
}
return array[posn++];
}
@Override
public void remove() {
throw new UnsupportedOperationException("No remove");
}
};
}
};
}

This is a little more lightweight than Arrays.copyOfRange: no explicit range or negative-length checks. It still allocates a new array.
public static final byte[] copy(byte[] data, int pos, int length )
{
byte[] transplant = new byte[length];
System.arraycopy(data, pos, transplant, 0, length);
return transplant;
}
