Java Resizable Array（）

　　本篇文章为你整理了Java Resizable Array（）的详细内容，包含有 Java Resizable Array，希望能帮助你了解 Java Resizable Array。

　　 Sometimes you want to keep data (typically raw bytes) in a single, consecutive array for fast and easy access, but need the array to be

　　 resizable, or at least expandable. Java arrays are not resizable, so using an array by itself is not sufficient.

　　 Thus, to get a resizable array for primitive types you need to implement

　　 it yourself. In this tutorial I will show you how to implement a resizable array in Java.

　　
You might wonder why you cannot just use a Java ArrayList for this purpose. The answer is - you can if

　　 one of these conditions are true:

　　
You are storing some object type in the array.

　　 You are storing primitive types in the array and performance and memory usage doesnt matter.

　　
The Java ArrayList class only works for objects - not for primitive types (byte, int, long etc). If you use a

　　 Java ArrayList to store primitive data it will be autoboxed into an object when inserted into the ArrayList, and

　　 unboxed out of this object back to a primitive type. This boxing and unboxing adds an overhead to insertion of and

　　 access to each element - which you do not want when you are trying to optimize for performance (this article is

　　 part of my Java Performance trail).

　　 Additionally, you have no guarantee for where in memory the autoboxed objects are stored. They can be scattered all

　　 over the heap. Thus consecutive access

　　 to them can be much, much slower than when stored consecutively in memory - in a primitive type array.

　　 Additionally, storing a primitive in its object counterpart (e.g. a long in a Long object) incurs significant

　　 extra memory overhead per element.

　　Java Resizable Array - GitHub Repository

　　 The code for the Java resizable array implemented in this tutorial can be found on GitHub (for easy access):

　　 https://github.com/jjenkov/java-resizable-array

　　 The code consist of 3 Java classes and 2 unit tests.

　　
Imagine you have a server receiving messages of varying sizes. Some messages are very small (e.g. less than 4KB),

　　 others as large as 1MB or even larger. If the server cannot see from the beginning of the message how big the message

　　 will end up being, there is no way of pre-allocating the correct amount of memory for that message. The

　　 alternative is to "over-allocate", meaning allocate a memory block that we believe is big enough to hold the

　　 message.

　　 If the server is receiving messages from many (100K +) connections at the same time, we need to limit how much

　　 memory we allocate for each message. We cannot just allocate the maximal message size (e.g. 1MB or 16MB etc.) for

　　 each message buffer. That would deplete the server memory fast when the number of connections and messages go up!

　　 100.000 x 1MB = 100GB (approximately - not precisely - but you get the picture).

　　 The alternative is to start with a smaller buffer and if that buffer turns out not to be big enough to hold the

　　 full message, we expand the buffer.

　　
This is an expensive set of operations! Especially the copying of the old data to the new data gets more and

　　 more expensive the larger the old array is.

　　 Note, that in languages with built-in garbage collection (like Java, C# etc.) you do not explicitly have to

　　 deallocate the old array, but the VM still has to do it, so your application still pays the deallocation price

　　 sooner or later.

　　
To minimize the price paid for array expansion you want to expand the array as few times as possible, in order

　　 to make it large enough to hold the total message.

　　 If we assume that most messages are small, we can start with a small buffer. If a message grows beyond

　　 the small buffer size, we allocate a new, medium sized array and copy the data to the new, medium sized array.

　　 If the message outgrows the medium sized array too, a large array is allocated, and the message is copied to that.

　　 The large array should be the largest size allowed for a message. If the message outgrows the large message buffer,

　　 it is too large to be processed and should be rejected by the server.

　　 Using this strategy, most messages will only ever use a small buffer. This means that the server memory is

　　 used more efficiently. 100.000 x 4KB (small buffer size) is only 400MB. Most servers should be able to handle that.

　　 Even at 4GB (1.000.000 x 4KB) modern servers should be able to handle it.

　　
The ResizableArrayBuffer contains a single, large array. This array is divided into three sections.

　　 One section for small arrays, one for medium size arrays and one for large arrays.

　　 The ResizableArray represents a single, resizable array, with its data stored in the array in

　　 the ResizableArrayBuffer.

　　 Here is a diagram illustrating the large array divided into sections, and each section divided into blocks.

　　 By reserving an area of the large array in the ResizableArrayBuffer for each message size interval

　　 (small, medium, large), we make sure that the array cannot be fully filled with either size messages. For instance,

　　 receiving a lot of small messages cannot take all the memory and thus block for medium and large messages. Similarly,

　　 receiving a lot of large messages cannot take up all memory and block for medium and small messages.

　　 Since all messages start as small messages, if the amount of small arrays is depleted, no new arrays can be allocated,

　　 regardless of whether or not there is space in the medium and large array sections. However, it is possible to

　　 make the small array section large enough that the probability of this happening is very low.

　　 It is still possible to grow small messages into medium and large messages, even if the small message section

　　 is fully used.

　　
A possible optimization is to use only a single block size. Thus, sometimes a newly allocated block might

　　 be allocted just after the block that needs to be expanded. In that case you dont need to copy the data from

　　 the old array to the new. You can simply "expand" the block to include both of the two allocated blocks, and

　　 write new data into the expanded block from where the second block was added. This saves copying of all array

　　 data when a memory block can be "expanded" into the next consecutive block.

　　 A drawback of the above mentioned possible optimization is, that in the cases where it is not possible to expand

　　 into the next memory block, copying of data is still necessary. Thus you have a small overhead of checking

　　 if expansion is possible. Additionally, if you use a small block size, you may have to expand blocks more often

　　 than if you use e.g. a small, medium and large block size.

　　
The large array inside the ResizableArrayBuffer is split up into three sections. Each section is split

　　 up into smaller blocks. Each block in each section has the same size. Blocks in the small message section has the

　　 same small block size. Blocks in the medium message section has the same medium block size. And blocks in the

　　 large message section has the same large block size.

　　 When all blocks within a section has the same size, it is easier to keep track of used and unused blocks.

　　 You can simply use a queue containing the start indexes of each block. One queue is needed for each section of

　　 the shared array. Thus, one queue is needed to keep track of free small blocks, one queue for free medium blocks

　　 and one queue for free large blocks.

　　 Allocating a block from any of the sections can be accomplished simply by taking the next free block start index

　　 from the queue associated with the desired section. Freeing a block is done by putting the start index back into

　　 the corresponding queue.

　　 As a queue I have used a simple Ring Buffer implementation. The implementation

　　 used in the GitHubRepository is called QueueIntFlip.

　　
The Resizable array will expand itself when you write data to it. If you attempt to write more data to it than

　　 it has space for in its currently allocated block, it will allocate a new, larger block and copy all its data

　　 to that new block. The previous smaller block is then freed.

　　
Once you are done with a resizable array you should free it again, so that it can be used for other messages.

　　
I want to show you how to use the ResizableArrayBuffer (from the GitHub repository).

　　
This example creates a ResizableArrayBuffer with a small array size of 4KB, medium array size of

　　 128KB and a larger array size of 1MB. The ResizableArrayBuffer contains space for 1024 small arrays (4MB in total),

　　 32 medium arrays (4MB in total) and 4 large arrays (4MB in total) - for a full shared array size of 12MB.

　　
To obtain a ResizableArray instance, call the ResizableArrayBuffers getArray()

　　 method, like this:

　　ResizableArray resizableArray = arrayBuffer.getArray();

　　
This example obtains a ResizableArray of the smallest size (4KB the small size was set to earlier).

　　
You write to a ResizableArray by calling its write() method. The ResizableArray

　　 class in the GitHub repository only contains a single write() method which takes a ByteBuffer

　　 as parameter. It should be pretty easy to add more write() methods yourself, though.

　　 Here is a write example:

　　ByteBuffer byteBuffer = ByteBuffer.allocate(1024);

　　for(int i=0; i ByteBuffer into the array (block) of the ResizableArray.

　　 The value returned by write() is the number of bytes copied from the ByteBuffer.

　　 In case the ByteBuffer had contained more data than the ResizableArray, the

　　 ResizableArray will attempt to expand itself to make space for the data in the ByteBuffer.

　　 If the ResizableArray cannot contain all the data in the ByteBuffer after expanding

　　 itself to the max size, the write() method will return -1 and no data will have been

　　 copied at all!

　　
When you read from a ResizableArray you do so directly in the shared array which all

　　 ResizableArray instances share. The ResizableArray contains the following

　　 public field:

　　public byte[] sharedArray = null;

　　public int offset = 0;

　　public int capacity = 0;

　　public int length = 0;

　　 The sharedArray field references the shared array which all ResizableArray instances

　　 keep their data in. That is the array kept internally in the ResizableArrayBuffer too.

　　 The offset field contains the start index in the shared array where this ResizableArray

　　 keeps its data (the block assigned to it).

　　 The capacity field contains the size of the block in the shared array assigned to this

　　 ResizableArray instance.

　　 The length field contains how much of the assigned block the ResizableArray is actually

　　 using.

　　 To read the data written to a ResizableArray simply read the bytes from sharedArray[offset]

　　 to sharedArray[offset + length -1].

　　
Once you are done using a ResizableArray instance you should free it. You do so simply by calling

　　 the free() method on the ResizableArray, like this:

　　resizableArray.free();

　　 Calling free() takes care of returning the used block to the correct block queue, regardless of

　　 the size of the block allocated to the ResizableArray.

　　
You can change the design of my ResizableArrayBuffer to meet your needs. For instance, you can

　　 create more than 3 sections inside it. It should be fairly easy to do. Just look at the code in the GitHub

　　 repository and modify it.

　　以上就是Java Resizable Array（）的详细内容，想要了解更多 Java Resizable Array的内容，请持续关注盛行IT软件开发工作室。

郑重声明：本文由网友发布，不代表盛行IT的观点，版权归原作者所有，仅为传播更多信息之目的，如有侵权请联系，我们将第一时间修改或删除，多谢。

相关文章阅读