/dev/null

Cameron Purdy
Tuesday Apr 26, 2011

Feedback on Elastic Data

Wow! The new Elastic Data capability has been a hit with everyone I've talked to about the release. There are still a couple of areas (like indexes) that Elastic Data doesn't help with (yet), but for the "big data" use cases we're working with, it seems to be a great fit.

Tuesday Apr 19, 2011

Coherence training in NYC on May 2

There's an upcoming public 5-day training course in New York City the week of 2 May: Oracle Coherence 3.6: Share and Manage Data In Clusters. A month later, there's the new 3-day Coherence administrator course in New York City: Oracle Coherence 3.6: Administer and Troubleshoot Clusters.

Links to available training courses, documentation, screencasts, and everything else related to Coherence are located on coherence.oracle.com.

Monday Apr 18, 2011

Elastic Data: Terabytes of Cache

The Oracle Coherence 3.7 release went live today. You can read more about it on InfoQ and PC World.

There are close to 500 improvements in the release, but the headline feature is Elastic Data, which enables applications to scale elastically both horizontally (scale-out) and vertically (scale-up), transparently managing "Big Data" across RAM and solid state storage (SSD, or "flash drives"). With a total of two lines of configuration (both optional), you specify the amount of RAM to allocate per node to Elastic Data, and what drive to transparently expand storage onto. On my Mac notebook, for example, I can easily run a 100GB cache across two local cluster nodes with full HA (thanks to a second synchronous copy of the data). That's a total of 200GB of data on a notebook that "only" has 8GB of RAM -- and my test scripts run Java with just a 1GB heap! To top it off, Elastic Data runs as fast as on-heap data storage!

We'll be releasing more video walk-throughs of new Coherence features over the next few days, and hopefully I'll get some time to talk about some of the R&D we're doing focused on high-performance and ultra-low-latency I/O, and how it will eventually make its way into the Java platform. If you think what we're doing with flash storage is cool, wait until you see what we can do with an ultra-low-latency 80-gigabit Infiniband network! :-)

Wednesday Nov 03, 2010

On a Train

On the occasion of taking a train this morning, and finding this one in some old notes:

On a Train

I saw your face
A beautiful reflection in a window
Eternal youth and grace
My gaze, transfixed, thus nudged you
From your listless sleep
I turned to see your eyes
And the grey came out of your hair
Like foam comes out of a wave
As the color faded from your face.

Sunday Sep 12, 2010

/dev/null is web scale

It turns out that my blog (/dev/null) is web scale!

Saturday Jul 24, 2010

Serialization Maps

What do you get when you combine Binary serialization with an AbstractKeyBasedMap or AbstractKeySetBasedMap? Let's take a look. Let's start with the BinaryStore interface, which represents the ability to store raw (i.e. opaque) Binary data, for example in a file:

public interface BinaryStore
    {
    /**
    * Return the value associated with the specified key, or null if the
    * key does not have an associated value in the underlying store.
    *
    * @param binKey  key whose associated value is to be returned
    *
    * @return the value associated with the specified key, or
    *         null if no value is available for that key
    */
    public Binary load(Binary binKey);

    /**
    * Store the specified value under the specific key in the underlying
    * store. This method is intended to support both key/value creation
    * and value update for a specific key.
    *
    * @param binKey    key to store the value under
    * @param binValue  value to be stored
    *
    * @throws UnsupportedOperationException  if this implementation or the
    *         underlying store is read-only
    */
    public void store(Binary binKey, Binary binValue);

    /**
    * Remove the specified key from the underlying store if present.
    *
    * @param binKey key whose mapping is to be removed from the map
    *
    * @throws UnsupportedOperationException  if this implementation or the
    *         underlying store is read-only
    */
    public void erase(Binary binKey);

    /**
    * Remove all data from the underlying store.
    *
    * @throws UnsupportedOperationException  if this implementation or the
    *         underlying store is read-only
    */
    public void eraseAll();

    /**
    * Iterate all keys in the underlying store.
    *
    * @return a read-only iterator of the keys in the underlying store
    *
    * @throws UnsupportedOperationException  if the underlying store is not
    *         iterable
    */
    public Iterator keys();

    /**
    * If a BinaryStore is aware of the number of keys that it stores, then it
    * should implement this optional interface in order to allow that
    * information to be efficiently communicated to an intelligent consumer of
    * the BinaryStore interface.
    * 
    * @since Coherence 3.7
    */
    public interface SizeAware
            extends BinaryStore
        {
        /**
        * Determine the number of keys in the BinaryStore.
        *
        * @return the number of keys in the BinaryStore
        *
        * @since Coherence 3.7
        */
        public int size();
        }
    }
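To make the contract concrete, here is a minimal in-memory sketch of the same five operations plus size(). It deliberately does not use the Coherence classes: Bytes is a hypothetical stand-in for Binary, and InMemoryStore is a hypothetical class that mirrors the BinaryStore operations rather than implementing the real interface.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for Binary: an immutable byte[] with value equality. */
final class Bytes
    {
    private final byte[] m_ab;

    Bytes(byte[] ab)
        {
        m_ab = ab.clone(); // defensive copy keeps the value immutable
        }

    @Override
    public boolean equals(Object o)
        {
        return o instanceof Bytes && Arrays.equals(m_ab, ((Bytes) o).m_ab);
        }

    @Override
    public int hashCode()
        {
        return Arrays.hashCode(m_ab); // depends only on the content
        }
    }

/** Hypothetical in-memory store mirroring the BinaryStore operations. */
final class InMemoryStore
    {
    private final Map<Bytes, Bytes> m_map = new ConcurrentHashMap<>();

    public Bytes load(Bytes binKey)
        {
        return m_map.get(binKey); // null if there is no value for the key
        }

    public void store(Bytes binKey, Bytes binValue)
        {
        m_map.put(binKey, binValue); // covers both creation and update
        }

    public void erase(Bytes binKey)
        {
        m_map.remove(binKey);
        }

    public void eraseAll()
        {
        m_map.clear();
        }

    public Iterator<Bytes> keys()
        {
        // read-only view: remove() on this iterator is not supported
        return Collections.unmodifiableSet(m_map.keySet()).iterator();
        }

    public int size()
        {
        return m_map.size(); // the SizeAware part of the contract
        }
    }
```

A store like this keeps everything on the heap, so it only illustrates the shape of the API, not the off-heap or on-disk benefit.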

The BinaryStore interface is implemented by a number of storage facilities, such as the one for Oracle Berkeley DB, allowing Coherence to store data off the Java heap, and even on disk. The consumer of the BinaryStore interface that we're looking at today is the serialization map, i.e. a Map implementation that appears to contain object keys and object values, but actually manages a mapping from Binary keys to Binary values by delegating to an underlying BinaryStore. There are a number of serialization map implementations, arranged here by their inheritance relationships:

AbstractKeyBasedMap
    SimpleSerializationMap - a basic serialization map that avoids keeping keys in memory
    AbstractKeySetBasedMap
        SerializationMap - a basic serialization map that relies on the entire key set being in memory
            AbstractSerializationCache - an abstract serialization map that adds support for the ObservableMap API
                SerializationCache - a serialization map that supports size-limited eviction and time-based expiry via the CacheMap and ConfigurableCacheMap API
                SerializationPagedCache - a significantly more complicated expiry-based cache that expires an entire BinaryStore's worth of data at a time

The general implementations of each of these (except for the SerializationPagedCache) are relatively straightforward. For example, the get() implementation on the SimpleSerializationMap:

/**
* {@inheritDoc}
*/
public Object get(Object oKey)
    {
    Object  oValue   = null;
    long    ldtStart = getSafeTimeMillis();
    Binary  binValue = getBinaryStore().load(toBinary(oKey));
    if (binValue == null)
        {
        m_stats.registerMiss(ldtStart);
        }
    else
        {
        oValue = fromBinary(binValue);
        m_stats.registerHit(ldtStart);
        }
    return oValue;
    }

Note that when a serialization map is used as a backing map, all of its data (both keys and values) are already serialized in a Binary form. As a result, it can pass through those Binary values without any additional serialization/deserialization.

The primary difference between the SimpleSerializationMap and the SerializationMap (and its sub-classes) is that the SimpleSerializationMap doesn't maintain a data structure for the set of keys that it manages, allowing a much larger set of data (e.g. hundreds of millions of objects) to be managed without additional memory pressure.
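Here is a rough sketch of that delegation pattern, under some loud assumptions: SerializingMapSketch is a hypothetical class, String keys and values stand in for arbitrary objects, UTF-8 encoding stands in for a real serializer, and a ConcurrentHashMap of read-only ByteBuffers stands in for the BinaryStore. The facade itself keeps no key set; every operation converts the key and goes straight to the store.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical facade: object keys/values outside, binary keys/values inside. */
final class SerializingMapSketch
    {
    // stand-in for a BinaryStore; a real one could live off-heap or on disk
    private final Map<ByteBuffer, ByteBuffer> m_store = new ConcurrentHashMap<>();

    static ByteBuffer toBinary(String s)
        {
        // read-only so that the stored form cannot be modified afterwards
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8)).asReadOnlyBuffer();
        }

    static String fromBinary(ByteBuffer buf)
        {
        // duplicate() so that decoding does not move the stored buffer's position
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
        }

    public String get(String sKey)
        {
        ByteBuffer bufValue = m_store.get(toBinary(sKey));
        return bufValue == null ? null : fromBinary(bufValue);
        }

    public void put(String sKey, String sValue)
        {
        m_store.put(toBinary(sKey), toBinary(sValue));
        }

    public void remove(String sKey)
        {
        m_store.remove(toBinary(sKey));
        }
    }
```

The trade-off shows up as soon as you want size() or keySet(): without an in-memory key set, those questions have to be answered by the store itself.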

Trivia: The "async" configuration option for the serialization maps simply wraps the configured BinaryStore implementation with an instance of AsyncBinaryStore, which is described in its JavaDoc as:

An AsyncBinaryStore is a BinaryStore wrapper that performs the "O" (output) portion of its I/O asynchronously on a daemon thread. The output portion consists of store and erase processing.

Since the "O" portion is passed along to the wrapped BinaryStore on a separate thread, only read operations are blocking, thus the BinaryStore operations on a whole appear much faster. As such, it is somewhat analogous to a write-behind cache.

If an operation fails on the daemon thread, all further operations will occur synchronously, so that exceptions will propagate successfully up. It is assumed that once one exception occurs, the underlying BinaryStore is in a state that will cause more exceptions to occur. Even when an exception occurs on the daemon thread, that write-behind data will not be "lost" because it will still be available in the internal data structures that keeps track of pending writes.
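The write-behind idea in that description can be sketched in a few dozen lines. This is not the AsyncBinaryStore source: AsyncStoreSketch, its Pending helper and flush() are hypothetical names, String keys and values stand in for Binary, a Map stands in for the wrapped BinaryStore, and the fall-back-to-synchronous failure handling is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical write-behind wrapper; String keys/values stand in for Binary. */
final class AsyncStoreSketch
    {
    /** One queued write; a null value marks an erase. Identity equality is intended. */
    private static final class Pending
        {
        final String value;
        Pending(String sValue) { value = sValue; }
        }

    private final Map<String, String>  m_wrapped;
    private final Map<String, Pending> m_pending = new ConcurrentHashMap<>();
    private final ExecutorService      m_writer  = Executors.newSingleThreadExecutor(r ->
        {
        Thread thread = new Thread(r, "async-store-writer");
        thread.setDaemon(true); // do not keep the JVM alive for pending output
        return thread;
        });

    /** The wrapped map must be safe for concurrent use, e.g. a ConcurrentHashMap. */
    AsyncStoreSketch(Map<String, String> wrapped)
        {
        m_wrapped = wrapped;
        }

    /** Reads stay synchronous, but see not-yet-written data first. */
    public String load(String sKey)
        {
        Pending pending = m_pending.get(sKey);
        return pending != null ? pending.value : m_wrapped.get(sKey);
        }

    /** The value must not be null; null is reserved for the erase marker. */
    public void store(String sKey, String sValue) { enqueue(sKey, new Pending(sValue)); }

    public void erase(String sKey) { enqueue(sKey, new Pending(null)); }

    // synchronized so that the pending map and the queue agree on write order
    private synchronized void enqueue(String sKey, Pending pending)
        {
        m_pending.put(sKey, pending);
        m_writer.execute(() ->
            {
            if (pending.value == null)
                {
                m_wrapped.remove(sKey);
                }
            else
                {
                m_wrapped.put(sKey, pending.value);
                }
            // forget the pending entry only if no newer write has replaced it
            m_pending.remove(sKey, pending);
            });
        }

    /** Block until everything queued so far has reached the wrapped store. */
    public void flush()
        {
        try
            {
            m_writer.submit(() -> {}).get();
            }
        catch (InterruptedException | ExecutionException e)
            {
            throw new IllegalStateException(e);
            }
        }
    }
```

The pending map is what keeps reads correct while writes are still in flight, which is the same reason the quoted JavaDoc can say that write-behind data is not "lost".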

Binary, Buffers and Streams, Oh My!

It's worth spending some time explaining the evolution of one of the most central classes in Coherence: Binary. The Binary class was introduced in early 2002 as the basis for the partitioned cache, which is the technological cornerstone of today's In-Memory Data Grid (IMDG) products. At the time, we needed a class that would encapsulate the binary form of a key, guarantee immutability, provide a predictable (and machine-independent) hash code, implement equals(), and be readable and writable, for example to and from a message, packet or stream.

When faced with this set of requirements, one had to look no farther than Java's own String class for an example API that would be well-known on the Java platform, and that is exactly what we did. String is immutable, it represented a char[] (while we required a byte[] for a binary value), it provided a stable and predictable hash code, an equals() implementation, and could easily be read from and written to a stream. The early version of Binary provided this rough set of capabilities.

The idea behind the partitioned cache was to take an infinite domain of keys and reduce that infinite domain down to a finite domain of partitions, with the cluster responsible for (1) assigning ownership of those partitions to cluster members, and (2) assigning backup responsibilities to other members. Even before we introduced the partitioned cache, we had to be able to transmit keys over the wire -- for example in a Coherence replicated cache -- and thus we already had support for key serialization. With the introduction of the Binary class, not only was there something to hold the result of that serialization, but there was also a predictable hash code for that result, thus reducing the infinite domain of keys down to a set of 4 billion hash codes, with the understanding that no matter where the key was serialized, it would result in the same hash code. The partition identifier (PID) is calculated from that hash code modulo the maximum (preferably prime) number of partitions.
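As a small worked example of that last step, here is one way to reduce a 32-bit hash code to a partition identifier. The class and method names are hypothetical, Arrays.hashCode() merely stands in for Binary's own hash function, and treating the hash as unsigned is my assumption for keeping the result non-negative, not a statement about the exact Coherence arithmetic.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch of "hash code modulo partition count". */
final class PartitionSketch
    {
    /**
    * @param nHash        the hash code of the serialized key
    * @param cPartitions  the (preferably prime) number of partitions
    *
    * @return a partition id in the range 0 .. cPartitions-1
    */
    static int partitionOf(int nHash, int cPartitions)
        {
        // widen to an unsigned 32-bit value so the remainder is never negative
        return (int) ((nHash & 0xFFFFFFFFL) % cPartitions);
        }

    /** Stand-in for hashing the binary form of a key. */
    static int partitionOf(byte[] abKey, int cPartitions)
        {
        return partitionOf(Arrays.hashCode(abKey), cPartitions);
        }

    public static void main(String[] asArg)
        {
        byte[] abKey = "order-42".getBytes(StandardCharsets.UTF_8);
        System.out.println("partition = " + partitionOf(abKey, 257));
        }
    }
```

Because the result depends only on the bytes of the key, every member that serializes the same key arrives at the same partition, without asking anyone.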

Given some implementation of serialization, such as Java Serializable/Externalizable, Coherence ExternalizableLite, Google Protocol Buffers, or the Portable Object Format (POF), and given a context (i.e. such necessary things as a ClassLoader), there exists a function f() such that Binary=f(Object), some inverse function f⁻¹() such that Object=f⁻¹(Binary), and subject to Java's rules of equality as defined by equals(), the following must hold true for keys:

For a given Binary key b, b == f(f⁻¹(b))

For a given Object key o, o == f⁻¹(f(o))

For two given Object keys o1 and o2, o1 == o2 iff f(o1) == f(o2)

Generally speaking, we refer to that function f() as toBinary(), and that inverse function f⁻¹() as fromBinary(), although the actual interface is far less reminiscent of Smalltalk:

/**
* The Serializer interface provides the capability of reading and writing a
* Java object from and to an in-memory buffer.
* <p/>
* Serializer implementations should implement the ClassLoaderAware interface
* if they need access to a ClassLoader. However, to support hot-deploying
* containers, it is important that a Serializer <b>not</b> hold any strong
* references to that ClassLoader, or to any Class objects obtained from that
* ClassLoader.
*  
* @author cp/jh  2007.07.21
*
* @see ReadBuffer
* @see WriteBuffer
*
* @since Coherence 3.2
*/
public interface Serializer
    {
    /**
    * Serialize an object to a WriteBuffer by writing its state using the
    * specified BufferOutput object.
    *
    * @param out  the BufferOutput with which to write the object's state
    * @param o    the object to serialize
    *
    * @exception IOException  if an I/O error occurs
    */
    public void serialize(WriteBuffer.BufferOutput out, Object o)
            throws IOException;

    /**
    * Deserialize an object from a ReadBuffer by reading its state using the
    * specified BufferInput object.
    *
    * @param in  the BufferInput with which to read the object's state
    *
    * @return the deserialized user type instance
    *
    * @exception IOException  if an I/O error occurs
    */
    public Object deserialize(ReadBuffer.BufferInput in)
            throws IOException;
    }
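The three key rules stated above are easy to check for a toy serializer. The sketch below is only an illustration: RoundTripSketch and its methods are hypothetical, keys are limited to String, and plain java.io streams are used for brevity rather than the Coherence Serializer, WriteBuffer or ReadBuffer types.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical toy serializer used to check the three key rules. */
final class RoundTripSketch
    {
    /** f(): serialize a String key to its binary form. */
    static byte[] toBinary(String sKey)
        {
        ByteArrayOutputStream streamRaw = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(streamRaw))
            {
            out.writeUTF(sKey);
            }
        catch (IOException e)
            {
            throw new UncheckedIOException(e);
            }
        return streamRaw.toByteArray();
        }

    /** The inverse of f(): deserialize a String key from its binary form. */
    static String fromBinary(byte[] abKey)
        {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(abKey)))
            {
            return in.readUTF();
            }
        catch (IOException e)
            {
            throw new UncheckedIOException(e);
            }
        }

    /** Check all three rules for a pair of keys. */
    static boolean rulesHold(String o1, String o2)
        {
        byte[] b1 = toBinary(o1);
        byte[] b2 = toBinary(o2);

        boolean fBinaryRoundTrip = Arrays.equals(b1, toBinary(fromBinary(b1)));
        boolean fObjectRoundTrip = o1.equals(fromBinary(b1));
        boolean fSameEquality    = o1.equals(o2) == Arrays.equals(b1, b2);
        return fBinaryRoundTrip && fObjectRoundTrip && fSameEquality;
        }
    }
```

The third rule is the one that tends to bite in practice: a key class whose serialized form can differ between two equal instances will hash to different places, even though equals() says the keys are the same.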

What has changed significantly since that time is how Binary objects are constructed. Early on, a DataOutputStream around a ByteArrayOutputStream would have been the likeliest route, with the resulting byte[] being passed to the Binary constructor. There are a few problems with this:

Binary is immutable, so just like when passing a char[] to the String constructor, Binary would have to copy the passed byte[], which is inefficient.

Writing via a DataOutputStream tends to be inefficient, including synchronization and many calls to the underlying OutputStream.

ByteArrayOutputStream is heavily synchronized.

ByteArrayOutputStream itself makes a copy of the byte[], which is yet another memory allocation and memory copy!

Streams are handy for dealing with unlimited and/or unknown amounts of data, but keys are small and we're serializing them for the purpose of creating a Binary (i.e. on-heap) object.
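To see where those allocations and copies come from, here is a sketch of the "early" route for an 8-byte key. StreamCopySketch and ImmutableBytes are hypothetical names (ImmutableBytes is a stand-in, not Coherence's Binary); the java.io calls are the standard ones.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical illustration of the stream-based route and its copies. */
final class StreamCopySketch
    {
    /** Hypothetical immutable holder; it copies, just as String copies a char[]. */
    static final class ImmutableBytes
        {
        private final byte[] m_ab;

        ImmutableBytes(byte[] ab)
            {
            m_ab = ab.clone(); // the caller still holds ab, so a copy is required
            }

        byte byteAt(int of)
            {
            return m_ab[of];
            }

        int length()
            {
            return m_ab.length;
            }
        }

    static ImmutableBytes serializeKey(long lKey)
        {
        // allocation 1: the stream's internal, growable buffer
        ByteArrayOutputStream streamRaw = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(streamRaw))
            {
            out.writeLong(lKey);
            }
        catch (IOException e)
            {
            throw new UncheckedIOException(e);
            }

        // allocation 2 and copy 1: toByteArray() returns a fresh array
        byte[] ab = streamRaw.toByteArray();

        // allocation 3 and copy 2: the immutable holder cannot trust ab
        return new ImmutableBytes(ab);
        }
    }
```

That is three arrays and two copies for eight bytes of payload, before counting the stream objects themselves.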

The straw that broke the camel's back was the introduction of the NIO APIs, which were buffer-centric, and largely incongruent (and without custom adapters, incompatible) with the original Java IO stream-based APIs. With pressure to utilize NIO in our TCMP clustering protocol, in our client/server TCP*Extend protocol, and for off-heap and memory-mapped storage, and with the added impetus of profiling information showing gains to be had in message- and data-serialization, we redesigned our I/O around a bonded buffer/stream model. The concept was simple: All stream-based I/O had to continue to be supported, and all buffer-based I/O had to continue to be supported, and in both cases, the end result (e.g. a Binary, an NIO buffer, a packet, or a TCP send) would have to be the same, and would have to be efficiently produced. To accomplish this, we defined I/O both for buffer-based reads and writes, but bonded that with stream-based access to the same data. The resulting interfaces are ReadBuffer and WriteBuffer for representing buffer access, and their inner interfaces BufferInput and BufferOutput for representing stream access:

/**
* The ReadBuffer interface represents an in-memory block of binary data,
* such as that represented by a byte[], a Binary object, or an NIO buffer.
*
* @author cp  2005.01.18
*/
public interface ReadBuffer
        extends ByteSequence, Cloneable
    {
    /**
    * Determine the length of the buffer.
    *
    * @return the number of bytes of data represented by this ReadBuffer
    */
    public int length();

    /**
    * Returns the byte at the specified offset. An offset ranges
    * from <code>0</code> to <code>length() - 1</code>. The first byte
    * of the sequence is at offset <code>0</code>, the next at offset
    * <code>1</code>, and so on, as for array indexing.
    *
    * @param of  the offset (index) of the byte
    *
    * @return the byte at the specified offset in this ReadBuffer
    *
    * @exception  IndexOutOfBoundsException  if the <code>of</code>
    *             argument is negative or not less than the length of this
    *             ReadBuffer
    */
    public byte byteAt(int of);

    /**
    * Copies bytes from this ReadBuffer into the destination byte array.
    * <p>
    * The first byte to be copied is at offset <code>ofBegin</code>;
    * the last byte to be copied is at offset <code>ofEnd-1</code>
    * (thus the total number of bytes to be copied is <code>ofEnd -
    * ofBegin</code>). The bytes are copied into the subarray of
    * <code>abDest</code> starting at offset <code>ofDest</code>
    * and ending at index:
    * <p><blockquote><pre>
    *     ofDest + (ofEnd - ofBegin) - 1
    * </pre></blockquote>
    * <p>
    * This method is the ReadBuffer equivalent of
    * {@link String#getChars(int, int, char[], int)}. It allows the caller
    * to extract a chunk of bytes into the caller's own array.
    *
    * @param ofBegin  offset of the first byte in the ReadBuffer to copy
    * @param ofEnd    offset after the last byte in the ReadBuffer to copy
    * @param abDest   the destination byte array
    * @param ofDest   the offset in the destination byte array to copy the
    *                 first byte to
    *
    * @exception IndexOutOfBoundsException  Thrown if any of the following
    *            is true:
    *   <ul>
    *   <li><code>ofBegin</code> is negative;
    *   <li><code>ofBegin</code> is greater than <code>ofEnd</code>
    *   <li><code>ofEnd</code> is greater than the length of this
    *       ReadBuffer;
    *   <li><code>ofDest</code> is negative
    *   <li><code>ofDest + (ofEnd - ofBegin)</code> is larger than
    *       <code>abDest.length</code>
    *   </ul>
    * @exception NullPointerException if <code>abDest</code> is
    *   <code>null</code>
    */
    public void copyBytes(int ofBegin, int ofEnd, byte abDest[], int ofDest);

    /**
    * Get a BufferInput object to read data from this buffer. Note that each
    * call to this method will return a new BufferInput object, with the
    * possible exception being that a zero-length ReadBuffer could always
    * return the same instance (since there is nothing to read).
    *
    * @return a BufferInput that is reading from this buffer starting at
    *         offset zero
    */
    public BufferInput getBufferInput();

    /**
    * Obtain a ReadBuffer for a portion of this ReadBuffer.
    *
    * @param of  the beginning index, inclusive
    * @param cb  the number of bytes to include in the resulting ReadBuffer
    *
    * @return a ReadBuffer that represents a portion of this ReadBuffer
    *
    * @exception  IndexOutOfBoundsException  if <code>of</code> or
    *             <code>cb</code> is negative, or <code>of + cb</code> is
    *             larger than the length of this <code>ReadBuffer</code>
    *             object
    */
    public ReadBuffer getReadBuffer(int of, int cb);

    /**
    * Get the contents of the ReadBuffer as a byte array.
    * <p>
    * This is the equivalent of <code>toByteArray(0, length())</code>.
    *
    * @return a byte[] with the contents of this ReadBuffer object
    */
    public byte[] toByteArray();

    /**
    * Get a portion of the contents of the ReadBuffer as a byte array.
    * <p>
    * This is the equivalent of
    * <code>getReadBuffer(of, cb).toByteArray()</code>.
    *
    * @param of  the beginning index, inclusive
    * @param cb  the number of bytes to include in the resulting byte[]
    *
    * @return  a byte[] containing the specified portion of this ReadBuffer
    *
    * @exception  IndexOutOfBoundsException  if <code>of</code> or
    *             <code>cb</code> is negative, or <code>of + cb</code> is
    *             larger than the length of this <code>ReadBuffer</code>
    *             object
    */
    public byte[] toByteArray(int of, int cb);

    /**
    * Returns a new Binary object that holds the complete contents of this
    * ReadBuffer.
    * <p>
    * This is the equivalent of <code>toBinary(0, length())</code>.
    *
    * @return  the contents of this ReadBuffer as a Binary object
    */
    public Binary toBinary();

    /**
    * Returns a Binary object that holds the specified portion of this
    * ReadBuffer.
    * <p>
    * This is the equivalent of
    * <code>getReadBuffer(of, cb).toBinary()</code>.
    *
    * @param of  the beginning index, inclusive
    * @param cb  the number of bytes to include in the Binary object
    *
    * @return  a Binary object containing the specified portion of this
    *          ReadBuffer
    *
    * @exception  IndexOutOfBoundsException  if <code>of</code> or
    *             <code>cb</code> is negative, or <code>of + cb</code> is
    *             larger than the length of this <code>ReadBuffer</code>
    *             object
    */
    public Binary toBinary(int of, int cb);

    /**
    * {@inheritDoc}
    * 
    * @since Coherence 3.7
    */
    public ByteSequence subSequence(int ofStart, int ofEnd);
    

    // ----- Object methods -------------------------------------------------

    /**
    * Compare two ReadBuffer objects for equality.
    *
    * @param o  a ReadBuffer object
    *
    * @return true iff the other ReadBuffer is identical to this
    */
    public boolean equals(Object o);

    /**
    * Create a clone of this ReadBuffer object.
    *
    * @return a ReadBuffer object with the same contents as this
    *         ReadBuffer object
    */
    public Object clone();


    // ----- inner interface: BufferInput -----------------------------------

    /**
    * The BufferInput interface represents a DataInputStream on top of a
    * ReadBuffer.
    *
    * @author cp  2005.01.18
    */
    public interface BufferInput
            extends InputStreaming, DataInput
        {
        // ----- InputStreaming methods ---------------------------------

        /**
        * Returns the number of bytes that can be read (or skipped over) from
        * this input stream without causing a blocking I/O condition to
        * occur. This method reflects the assumed implementation of various
        * buffering InputStreams, which can guarantee non-blocking reads up
        * to the extent of their buffers, but beyond that the read operations
        * will have to read from some underlying (and potentially blocking)
        * source.
        * <p>
        * BufferInput implementations must implement this method to return
        * the extent of the buffer that has not yet been read; in other
        * words, the entire un-read portion of the buffer <b>must</b> be
        * available.
        *
        * @return  the number of bytes that can be read from this InputStream
        *          without blocking
        *
        * @exception IOException  if an I/O error occurs
        */
        public int available()
                throws IOException;

        /**
        * Close the InputStream and release any system resources associated
        * with it.
        * <p>
        * BufferInput implementations do not pass this call down onto an
        * underlying stream, if any.
        *
        * @exception IOException  if an I/O error occurs
        */
        public void close()
                throws IOException;

        /**
        * Marks the current read position in the InputStream in order to
        * support the stream to be later "rewound" (using the {@link #reset}
        * method) to the current position. The caller passes in the maximum
        * number of bytes that it expects to read before calling the
        * {@link #reset} method, thus indicating the upper bounds of the
        * responsibility of the stream to be able to buffer what it has read
        * in order to support this functionality.
        * <p>
        * BufferInput implementations ignore the <code>cbReadLimit</code>;
        * they must support an unlimited read limit, since they appear to the
        * user as an input stream on top of a fully realized read buffer.
        *
        * @param cbReadLimit  the maximum number of bytes that caller expects
        *                     the InputStream to be able to read before the
        *                     mark position becomes invalid
        */
        public void mark(int cbReadLimit);

        /**
        * Determine if this InputStream supports the {@link #mark} and
        * {@link #reset} methods.
        * <p>
        * BufferInput implementations <b>must</b> support the {@link #mark}
        * and {@link #reset} methods, so this method always returns
        * <code>true</code>.
        *
        * @return  <code>true</code> if this InputStream supports the mark
        *          and reset method; <code>false</code> otherwise
        */
        public boolean markSupported();


        // ----- DataInput methods --------------------------------------

        /**
        * Read <code>ab.length</code> bytes and store them in
        * <code>ab</code>.
        * <p>
        * This method blocks until input data is available, the end of the
        * stream is detected, or an exception is thrown.
        *
        * @param ab  the array to store the bytes which are read from the
        *            stream
        *
        * @exception NullPointerException  if the passed array is null
        * @exception java.io.EOFException  if the stream is exhausted before the
        *            number of bytes indicated by the array length could be
        *            read
        * @exception IOException  if an I/O error occurs
        */
        public void readFully(byte ab[])
                throws IOException;

        /**
        * Read <code>cb</code> bytes and store them in <code>ab</code>
        * starting at offset <code>of</code>.
        * <p>
        * This method blocks until input data is available, the end of the
        * stream is detected, or an exception is thrown.
        *
        * @param ab  the array to store the bytes which are read from the
        *            stream
        * @param of  the offset into the array that the read bytes will be
        *            stored
        * @param cb  the maximum number of bytes to read
        *
        * @exception NullPointerException  if the passed array is null
        * @exception IndexOutOfBoundsException  if <code>of</code> or
        *            <code>cb</code> is negative, or <code>of+cb</code> is
        *            greater than the length of the <code>ab</code>
        * @exception java.io.EOFException  if the stream is exhausted before the
        *            number of bytes indicated by the array length could be
        *            read
        * @exception IOException  if an I/O error occurs
        */
        public void readFully(byte ab[], int of, int cb)
                throws IOException;

        /**
        * Skips over up to the specified number of bytes of data. The number
        * of bytes actually skipped over may be fewer than the number
        * specified to skip, and may even be zero; this can be caused by an
        * end-of-file condition, but can also occur even when there is data
        * remaining to be read. As a result, the caller should check the
        * return value from this method, which indicates the actual number of
        * bytes skipped.
        *
        * @param cb  the maximum number of bytes to skip over
        *
        * @return  the actual number of bytes that were skipped over
        *
        * @exception IOException  if an I/O error occurs
        */
        public int skipBytes(int cb)
                throws IOException;

        /**
        * Read a boolean value.
        * <p>
        * This method is the counterpart for the
        * {@link java.io.DataOutput#writeBoolean} method.
        *
        * @return either <code>true</code> or <code>false</code>
        *
        * @exception EOFException  if the value could not be read because no
        *            more data remains to be read
        * @exception IOException  if an I/O error occurs
        */
        public boolean readBoolean()
                throws IOException;

        /**
        * Read a byte value.
        * <p>
        * This method is the counterpart for the
        * {@link java.io.DataOutput#writeByte} method.
        *
        * @return a <code>byte</code> value
        *
        * @exception EOFException  if the value could not be read because no
        *            more data remains to be read
        * @exception IOException  if an I/O error occurs
        */
        public byte readByte()
                throws IOException;

        /**
        * Read an unsigned byte value.
        * <p>
        * This method is the counterpart for the
        * {@link java.io.DataOutput#writeByte} method when it is used with
        * unsigned 8-bit values.
        *
        * @return an <code>int</code> value in the range 0x00 to 0xFF
        *
        * @exception EOFException  if the value could not be read because no
        *            more data remains to be read
        * @exception IOException  if an I/O error occurs
        */
        public int readUnsignedByte()
                throws IOException;

        /**
        * Read a short value.
        * <p>
        * This method is the counterpart for the
        * {@link java.io.DataOutput#writeShort} method.
        *
        * @return a <code>short</code> value
        *
        * @exception EOFException  if the value could not be read because no
        *            more data remains to be read
        * @exception IOException  if an I/O error occurs
        */
        public short readShort()
                throws IOException;

        /**
        * Read an unsigned short value.
        * <p>
        * This method is the counterpart for the
        * {@link java.io.DataOutput#writeShort} method when it is used with
        * unsigned 16-bit values.
        *
        * @return an <code>int</code> value in the range of 0x0000 to 0xFFFF
        *
        * @exception EOFException  if the value could not be read because no
        *            more data remains to be read
        * @exception IOException  if an I/O error occurs
        */
        public int readUnsignedShort()
                throws IOException;

        /**
        * Read a char value.
        * <p>
        * This method is the counterpart for the
        * {@link java.io.DataOutput#writeChar} method.
        *
        * @return a <code>char</code> value
        *
        * @exception EOFException  if the value could not be read because no
        *            more data remains to be read
        * @exception IOException  if an I/O error occurs
        */
        public char readChar()
                throws IOException;

        /**
        * Read an int value.
        * <p>
        * This method is the counterpart for the
        * {@link java.io.DataOutput#writeInt} method.
        *
        * @return an <code>int</code> value
        *
        * @exception EOFException  if the value could not be read because no
        *            more data remains to be read
        * @exception IOException  if an I/O error occurs
        */
        public int readInt()
                throws IOException;

        /**
        * Read a long value.
        * <p>
        * This method is the counterpart for the
        * {@link java.io.DataOutput#writeLong} method.
        *
        * @return a <code>long</code> value
        *
        * @exception EOFException  if the value could not be read because no
        *            more data remains to be read
        * @exception IOException  if an I/O error occurs
        */
        public long readLong()
                throws IOException;

        /**
        * Read a float value.
        * <p>
        * This method is the counterpart for the
        * {@link java.io.DataOutput#writeFloat} method.
        *
        * @return a <code>float</code> value
        *
        * @exception EOFException  if the value could not be read because no
        *            more data remains to be read
        * @exception IOException  if an I/O error occurs
        */
        public float readFloat()
                throws IOException;

        /**
        * Read a double value.
        * <p>
        * This method is the counterpart for the
        * {@link java.io.DataOutput#writeDouble} method.
        *
        * @return a <code>double</code> value
        *
        * @exception EOFException  if the value could not be read because no
        *            more data remains to be read
        * @exception IOException  if an I/O error occurs
        */
        public double readDouble()
                throws IOException;

        /**
        * Reads the next "line" of text.
        * <p>
        * This method does not have a counterpart in the
        * {@link java.io.DataOutput} interface. Furthermore, this method is
        * defined as operating on bytes and not on characters, and thus it
        * should be selected for use only after careful consideration, as if
        * it were deprecated (which it has been in java.io.DataInputStream).
        *
        * @return a line of text as a String
        * @exception  IOException  if an I/O error occurs.
        */
        public String readLine()
                throws IOException;

        /**
        * Reads a String value.
        * <p>
        * This method is the counterpart for the
        * {@link java.io.DataOutput#writeUTF} method.
        *
        * @return a String value
        *
        * @exception java.io.UTFDataFormatException  if the bytes that were
        *            read were not a valid UTF-8 encoded string
        * @exception EOFException  if the value could not be read because no
        *            more data remains to be read
        * @exception IOException  if an I/O error occurs
        */
        public String readUTF()
                throws IOException;

        // ----- BufferInput methods ------------------------------------

        /**
        * Get the ReadBuffer object that this BufferInput is reading from.
        *
        * @return the underlying ReadBuffer object
        */
        public ReadBuffer getBuffer();

        /**
        * Read a variable-length encoded UTF packed String. The major
        * differences between this implementation and DataInput are that
        * this implementation supports null String values and is not limited
        * to 64KB UTF-encoded values.
        *
        * @return a String value; may be null
        *
        * @exception IOException  if an I/O error occurs
        */
        public String readSafeUTF()
                throws IOException;

        /**
        * Read an int value using a variable-length storage format as described
        * by {@link WriteBuffer.BufferOutput#writePackedInt(int)}.
        *
        * @return  an int value
        *
        * @exception IOException  if an I/O error occurs
        */
        public int readPackedInt()
                throws IOException;

        /**
        * Read a long value using a variable-length storage format as described
        * by {@link WriteBuffer.BufferOutput#writePackedLong(long)}.
        *
        * @return  a long value
        *
        * @exception IOException  if an I/O error occurs
        */
        public long readPackedLong()
                throws IOException;

        /**
        * Read <code>cb</code> bytes and return them as a ReadBuffer object.
        *
        * @param cb  the number of bytes to read
        *
        * @return a ReadBuffer object composed of <code>cb</code> bytes read
        *         from the BufferInput
        *
        * @exception EOFException  if the stream is exhausted before
        *            the number of bytes indicated could be read
        * @exception IOException  if an I/O error occurs
        */
        public ReadBuffer readBuffer(int cb)
                throws IOException;

        /**
        * Determine the current offset of this BufferInput within the
        * underlying ReadBuffer.
        *
        * @return the offset of the next byte to read from the ReadBuffer
        */
        public int getOffset();

        /**
        * Specify the offset of the next byte to read from the underlying
        * ReadBuffer.
        *
        * @param of  the offset of the next byte to read from the ReadBuffer
        *
        * @exception  IndexOutOfBoundsException  if <code>of < 0</code> or
        *             <code>of > getBuffer().length()</code>
        */
        public void setOffset(int of);
        }
    }

/**
* The WriteBuffer interface represents an in-memory block of binary data
* that is being accumulated (written to). It is analogous to the byte[]
* inside a Java ByteArrayOutputStream.
*
* @author cp  2005.01.18 created
* @author cp  2005.03.21 defining
*/
public interface WriteBuffer
    {
    // ----- buffer write operations ----------------------------------------

    /**
    * Store the specified byte at the specified offset within the buffer.
    * <p>
    * For purposes of side-effects and potential exceptions, this method is
    * functionally equivalent to the following code:
    * <pre><code>
    * byte[] abSrc = new byte[1];
    * abSrc[0] = b;
    * write(ofDest, abSrc, 0, abSrc.length);
    * </code></pre>
    *
    * @param ofDest  the offset within this buffer to store the passed data
    * @param b       the byte to store in this buffer
    */
    public void write(int ofDest, byte b);

    /**
    * Store the specified bytes at the specified offset within the buffer.
    * <p>
    * For purposes of side-effects and potential exceptions, this method is
    * functionally equivalent to the following code:
    * <pre><code>
    * write(ofDest, abSrc, 0, abSrc.length);
    * </code></pre>
    *
    * @param ofDest  the offset within this buffer to store the passed data
    * @param abSrc   the array of bytes to store in this buffer
    *
    * @exception NullPointerException  if <code>abSrc</code> is
    *            <code>null</code>
    * @exception IndexOutOfBoundsException  if <tt>ofDest</tt> is negative,
    *            or if <tt>ofDest + abSrc.length</tt> is
    *            greater than <tt>{@link #getMaximumCapacity()}</tt>
    */
    public void write(int ofDest, byte[] abSrc);

    /**
    * Store the specified number of bytes from the specified location within
    * the passed byte array at the specified offset within this buffer.
    * <p>
    * As a result of this method, the buffer length as reported by the
    * <tt>{@link #length()}</tt> method will become
    * <tt>Math.max({@link #length()}, ofDest + cbSrc)</tt>.
    * <p>
    * As a result of this method, the buffer capacity as reported by the
    * <tt>{@link #getCapacity()}</tt> method will not change if the new value
    * returned by <tt>{@link #length()}</tt> would not exceed the old value
    * returned by <tt>{@link #getCapacity()}</tt>; otherwise, the capacity
    * will be increased such that
    * <tt>{@link #getCapacity()} >= {@link #length()}</tt>. Regardless, it is
    * always true that <tt>{@link #getCapacity()} >= {@link #length()}</tt>
    * and <tt>{@link #getMaximumCapacity()} >= {@link #getCapacity()}</tt>.
    * If the buffer capacity cannot be increased due to resource constraints,
    * an undesignated Error or RuntimeException will be thrown, such as
    * OutOfMemoryError.
    *
    * @param ofDest  the offset within this buffer to store the passed data
    * @param abSrc   the array containing the bytes to store in this buffer
    * @param ofSrc   the offset within the passed byte array to copy from
    * @param cbSrc   the number of bytes to copy from the passed byte array
    *
    * @exception NullPointerException  if <code>abSrc</code> is
    *            <code>null</code>
    * @exception IndexOutOfBoundsException  if <tt>ofDest</tt>,
    *            <tt>ofSrc</tt> or <tt>cbSrc</tt> is negative, if
    *            <tt>ofSrc + cbSrc</tt> is greater than
    *            <tt>abSrc.length</tt>, or if <tt>ofDest + cbSrc</tt> is
    *            greater than <tt>{@link #getMaximumCapacity()}</tt>
    */
    public void write(int ofDest, byte[] abSrc, int ofSrc, int cbSrc);
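
/*
Illustrative sketch, not part of this interface: the length and capacity rules
described for write(ofDest, abSrc, ofSrc, cbSrc) can be modelled with a plain
byte array. The class name GrowableBufferSketch, the doubling growth policy and
the zero-filled gap between the old length and ofDest are assumptions made for
this example only; real WriteBuffer implementations may behave differently.

```java
import java.util.Arrays;

// Hypothetical byte[]-backed model of the write() contract.
final class GrowableBufferSketch {
    private byte[] ab;        // current allocation; ab.length is the capacity
    private int cb;           // number of bytes of data (the length)
    private final int cbMax;  // maximum capacity

    GrowableBufferSketch(int cbInitial, int cbMax) {
        this.ab = new byte[cbInitial];
        this.cbMax = cbMax;
    }

    void write(int ofDest, byte[] abSrc, int ofSrc, int cbSrc) {
        // abSrc.length raises NullPointerException when abSrc is null
        if (ofDest < 0 || ofSrc < 0 || cbSrc < 0
                || ofSrc > abSrc.length - cbSrc
                || ofDest > cbMax - cbSrc) {
            throw new IndexOutOfBoundsException();
        }
        int cbNew = Math.max(cb, ofDest + cbSrc);
        if (cbNew > ab.length) {
            // assumed growth policy: double, but never beyond the maximum
            int cbCap = Math.min(cbMax, Math.max(cbNew, ab.length * 2));
            ab = Arrays.copyOf(ab, cbCap);
        }
        System.arraycopy(abSrc, ofSrc, ab, ofDest, cbSrc);
        cb = cbNew;
    }

    int length() { return cb; }
    int getCapacity() { return ab.length; }
    int getMaximumCapacity() { return cbMax; }
    byte[] toByteArray() { return Arrays.copyOf(ab, cb); }

    public static void main(String[] args) {
        GrowableBufferSketch buf = new GrowableBufferSketch(4, 64);
        buf.write(0, new byte[] {1, 2, 3}, 0, 3);
        System.out.println(buf.length() + " / " + buf.getCapacity());
        buf.write(6, new byte[] {9, 8, 7, 6}, 1, 2);
        System.out.println(buf.length() + " / " + buf.getCapacity());
    }
}
```

The second write in main() lands past the current capacity, so the capacity
grows while getCapacity() >= length() continues to hold.
*/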

    /**
    * Store the contents of the specified ReadBuffer at the specified offset
    * within this buffer.
    * <p>
    * For purposes of side-effects and potential exceptions, this method is
    * functionally equivalent to the following code:
    * <pre><code>
    * byte[] abSrc = bufSrc.toByteArray();
    * write(ofDest, abSrc, 0, abSrc.length);
    * </code></pre>
    *
    * @param ofDest  the offset within this buffer to store the passed data
    * @param bufSrc  the ReadBuffer whose contents are stored in this buffer
    *
    * @exception NullPointerException  if <code>bufSrc</code> is
    *            <code>null</code>
    * @exception IndexOutOfBoundsException  if <tt>ofDest</tt> is negative,
    *            or if <tt>ofDest + bufSrc.length()</tt> is
    *            greater than <tt>{@link #getMaximumCapacity()}</tt>
    */
    public void write(int ofDest, ReadBuffer bufSrc);

    /**
    * Store the specified portion of the contents of the specified ReadBuffer
    * at the specified offset within this buffer.
    * <p>
    * For purposes of side-effects and potential exceptions, this method is
    * functionally equivalent to the following code:
    * <pre><code>
    * byte[] abSrc = bufSrc.toByteArray(ofSrc, cbSrc);
    * write(ofDest, abSrc, 0, abSrc.length);
    * </code></pre>
    *
    * @param ofDest  the offset within this buffer to store the passed data
    * @param bufSrc  the ReadBuffer containing the bytes to store in this buffer
    * @param ofSrc   the offset within the passed ReadBuffer to copy from
    * @param cbSrc   the number of bytes to copy from the passed ReadBuffer
    *
    * @exception NullPointerException  if <code>bufSrc</code> is
    *            <code>null</code>
    * @exception IndexOutOfBoundsException  if <tt>ofDest</tt>,
    *            <tt>ofSrc</tt> or <tt>cbSrc</tt> is negative, if
    *            <tt>ofSrc + cbSrc</tt> is greater than
    *            <tt>bufSrc.length()</tt>, or if <tt>ofDest + cbSrc</tt> is
    *            greater than <tt>{@link #getMaximumCapacity()}</tt>
    */
    public void write(int ofDest, ReadBuffer bufSrc, int ofSrc, int cbSrc);

    /**
    * Store the remaining contents of the specified InputStreaming object at
    * the specified offset within this buffer.
    * <p>
    * For purposes of side-effects and potential exceptions, this method is
    * functionally <i>similar</i> to the following code:
    * <pre><code>
    * ByteArrayOutputStream streamOut = new ByteArrayOutputStream();
    * int b;
    * while ((b = stream.read()) >= 0)
    *     {
    *     streamOut.write(b);
    *     }
    * byte[] abSrc = streamOut.toByteArray();
    * write(ofDest, abSrc, 0, abSrc.length);
    * </code></pre>
    *
    * @param ofDest  the offset within this buffer to store the passed data
    * @param stream  the stream of bytes to read and store in this buffer
    *
    * @exception IOException  if an IOException occurs reading from the
    *            passed stream
    */
    public void write(int ofDest, InputStreaming stream)
            throws IOException;

    /**
    * Store the specified number of bytes from the specified InputStreaming
    * object at the specified offset within this buffer.
    * <p>
    * For purposes of side-effects and potential exceptions, this method is
    * functionally <i>similar</i> to the following code:
    * <pre><code>
    * DataInputStream streamData = new DataInputStream(
    *         new WrapperInputStream(stream));
    * byte[] abSrc = new byte[cbSrc];
    * streamData.readFully(abSrc);
    * write(ofDest, abSrc, 0, abSrc.length);
    * </code></pre>
    *
    * @param ofDest  the offset within this buffer to store the passed data
    * @param stream  the stream of bytes to read and store in this buffer
    * @param cbSrc   the exact number of bytes to read from the stream and
    *                put in this buffer
    *
    * @exception IOException  if an IOException occurs reading from the
    *            passed stream
    */
    public void write(int ofDest, InputStreaming stream, int cbSrc)
            throws IOException;


    // ----- buffer maintenance ----------------------------------------------

    /**
    * Determine the length of the data that is in the buffer. This is the
    * actual number of bytes of data that have been written to the buffer,
    * not the capacity of the buffer.
    *
    * @return the number of bytes of data represented by this WriteBuffer
    */
    public int length();

    /**
    * Starting with the byte at offset <tt>of</tt>, retain the remainder
    * of this WriteBuffer, such that the byte at offset <tt>of</tt> is
    * shifted to offset 0, the byte at offset <tt>of + 1</tt> is shifted to
    * offset 1, and so on up to the byte at offset
    * <tt>{@link #length()} - 1</tt>, which is shifted to offset
    * <tt>{@link #length()} - of - 1</tt>. After this method, the length
    * of the buffer as indicated by the {@link #length()} method will be
    * equal to its previous value minus <tt>of</tt>.
    * <p>
    * This method is functionally equivalent to the following code:
    * <pre><code>
    * retain(of, length() - of);
    * </code></pre>
    *
    * @param of  the offset of the first byte within the WriteBuffer that
    *            will be retained
    *
    * @exception IndexOutOfBoundsException  if <tt>of</tt> is negative or if
    *            <tt>of</tt> is greater than <tt>{@link #length()}</tt>
    */
    public void retain(int of);

    /**
    * Starting with the byte at offset <tt>of</tt>, retain <tt>cb</tt> bytes
    * in this WriteBuffer, such that the byte at offset <tt>of</tt> is
    * shifted to offset 0, the byte at offset <tt>of + 1</tt> is shifted to
    * offset 1, and so on up to the byte at offset <tt>of + cb - 1</tt>,
    * which is shifted to offset <tt>cb - 1</tt>. After this method, the
    * length of the buffer as indicated by the {@link #length()} method will
    * be equal to <tt>cb</tt>.
    * <p>
    * Legal values for the offset of the first byte to retain <tt>of</tt> are
    * <tt>(of >= 0 && of <= {@link #length()})</tt>. Legal values for the
    * number of bytes to retain <tt>cb</tt> are
    * <tt>(cb >= 0 && cb <= {@link #length()})</tt>, such that
    * <tt>(of + cb <= {@link #length()})</tt>.
    * <p>
    * If <tt>cb</tt> is zero, then this method will have the same effect as
    * clear. If <tt>of</tt> is zero, then this method will have the effect
    * of truncating the data in the buffer, but no bytes will be shifted
    * within the buffer.
    * <p>
    * The effect on the capacity of the buffer is implementation-specific;
    * some implementations are expected to retain the same capacity while
    * others are expected to shrink accordingly.
    *
    * @param of  the offset of the first byte within the WriteBuffer that
    *            will be retained
    * @param cb  the number of bytes to retain
    *
    * @exception IndexOutOfBoundsException  if <tt>of</tt> or <tt>cb</tt> is
    *            negative or if <tt>of + cb</tt> is greater than
    *            <tt>{@link #length()}</tt>
    */
    public void retain(int of, int cb);
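
/*
Illustrative sketch, not part of this interface: the shifting behaviour of
retain(of, cb) and retain(of) can be modelled on a fixed byte array. The class
name RetainSketch is hypothetical, and keeping the capacity unchanged is just
one of the implementation-specific choices mentioned above.

```java
import java.util.Arrays;

// Hypothetical in-place model of the retain() contract.
final class RetainSketch {
    private final byte[] ab;  // fixed allocation; capacity is left unchanged
    private int cb;           // current length

    RetainSketch(byte[] abInit) {
        ab = abInit.clone();
        cb = abInit.length;
    }

    void retain(int of, int cbRetain) {
        if (of < 0 || cbRetain < 0 || of > cb - cbRetain) {
            throw new IndexOutOfBoundsException();
        }
        if (of > 0 && cbRetain > 0) {
            // shift the retained bytes down to offset 0 (overlap-safe copy)
            System.arraycopy(ab, of, ab, 0, cbRetain);
        }
        cb = cbRetain;
    }

    void retain(int of) {
        // a negative count (of > length) is rejected by the two-argument form
        retain(of, cb - of);
    }

    int length() { return cb; }
    byte[] toByteArray() { return Arrays.copyOf(ab, cb); }

    public static void main(String[] args) {
        RetainSketch buf = new RetainSketch(new byte[] {10, 20, 30, 40, 50});
        buf.retain(1, 3);
        System.out.println(Arrays.toString(buf.toByteArray()));
        buf.retain(1);
        System.out.println(Arrays.toString(buf.toByteArray()));
    }
}
```

With of equal to zero the copy is skipped and only the length changes, which
matches the truncation case described in the comment above.
*/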

    /**
    * Set the length of the buffer as indicated by the {@link #length()}
    * method to zero.
    * <p>
    * The effect on the capacity of the buffer is implementation-specific;
    * some implementations are expected to retain the same capacity while
    * others are expected to shrink accordingly.
    */
    public void clear();

    /**
    * Determine the number of bytes that the buffer can hold without resizing
    * itself. In other words, a WriteBuffer has <tt>getCapacity() -
    * {@link #length()}</tt> bytes that can be written to it without
    * overflowing the current underlying buffer allocation. Since the buffer
    * is an abstract concept, the actual mechanism for the underlying buffer
    * is not known, but it could be a Java NIO buffer, or a byte array, etc.
    * <p>
    * Note that if the maximum size returned by {@link #getMaximumCapacity()}
    * is greater than the current size returned by this method, then the
    * WriteBuffer will automatically resize itself to allocate more space
    * when the amount of data written to it passes the current size.
    *
    * @return the number of bytes of data that this WriteBuffer can hold
    *         without resizing its underlying buffer
    */
    public int getCapacity();

    /**
    * Determine the maximum number of bytes that the buffer can hold. If the
    * maximum size is greater than the current size, then the buffer is
    * expected to resize itself as necessary up to the maximum size in order
    * to contain the data given to it.
    *
    * @return the maximum number of bytes of data that the WriteBuffer can
    *         hold
    */
    public int getMaximumCapacity();


    // ----- obtaining different "write views" to the buffer ----------------

    /**
    * Obtain a WriteBuffer starting at a particular offset within this
    * WriteBuffer.
    * <p>
    * This is functionally equivalent to:
    * <pre><code>
    * return getWriteBuffer(of, getMaximumCapacity() - of);
    * </code></pre>
    *
    * @param of  the beginning index, inclusive
    *
    * @return a WriteBuffer that represents a portion of this WriteBuffer
    *
    * @exception  IndexOutOfBoundsException  if <tt>of</tt> is
    *             negative, or <tt>of</tt> is larger than the
    *             <tt>{@link #getMaximumCapacity()}</tt> of this
    *             <tt>WriteBuffer</tt> object
    */
    public WriteBuffer getWriteBuffer(int of);

    /**
    * Obtain a WriteBuffer for a portion of this WriteBuffer.
    * <p>
    * Use of the resulting buffer will correspond to using this buffer
    * directly but with the offset being passed to the buffer methods
    * automatically having <tt>of</tt> added. As a result, the length of this
    * buffer can be modified by writing to the new buffer; however, changes
    * made directly to this buffer will not affect the length of the new
    * buffer.
    * <p>
    * Note that the resulting WriteBuffer is limited in the number of bytes
    * that can be written to it; in other words, its
    * <tt>{@link #getMaximumCapacity()}</tt> must return the same value as
    * was passed in <tt>cb</tt>.
    *
    * @param of  the offset of the first byte within this WriteBuffer
    *            to map to offset 0 of the new WriteBuffer
    * @param cb  the number of bytes to cover in the resulting WriteBuffer
    *
    * @return a WriteBuffer that represents a portion of this WriteBuffer
    *
    * @exception  IndexOutOfBoundsException  if <tt>of</tt> or <tt>cb</tt>
    *             is negative, or <tt>of + cb</tt> is larger than
    *             the <tt>{@link #getMaximumCapacity()}</tt> of this
    *             <tt>WriteBuffer</tt> object
    */
    public WriteBuffer getWriteBuffer(int of, int cb);
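
/*
Illustrative sketch, not part of this interface: the relationship between a
buffer and a "write view" obtained from getWriteBuffer(of, cb) can be modelled
with two small classes. The names WriteViewSketch, Buf and View are
hypothetical, and the parent here has a fixed capacity to keep the example
short.

```java
// Hypothetical model of a parent buffer and an offset-shifted view onto it.
final class WriteViewSketch {
    static final class Buf {
        final byte[] ab;  // fixed allocation (growth omitted for brevity)
        int cb;           // length: highest written offset + 1

        Buf(int cbCapacity) { ab = new byte[cbCapacity]; }

        void write(int of, byte b) {
            if (of < 0 || of >= ab.length) {
                throw new IndexOutOfBoundsException("of=" + of);
            }
            ab[of] = b;
            cb = Math.max(cb, of + 1);
        }

        int length() { return cb; }

        View getWriteBuffer(int of, int cbView) {
            if (of < 0 || cbView < 0 || of > ab.length - cbView) {
                throw new IndexOutOfBoundsException();
            }
            return new View(this, of, cbView);
        }
    }

    static final class View {
        private final Buf parent;
        private final int ofBase;  // added to every offset passed to the view
        private final int cbMax;   // maximum capacity, equal to the cb passed in
        private int cb;            // the view tracks its own length

        View(Buf parent, int ofBase, int cbMax) {
            this.parent = parent;
            this.ofBase = ofBase;
            this.cbMax = cbMax;
        }

        void write(int of, byte b) {
            if (of < 0 || of >= cbMax) {
                throw new IndexOutOfBoundsException("of=" + of);
            }
            parent.write(ofBase + of, b);  // may extend the parent's length
            cb = Math.max(cb, of + 1);
        }

        int length() { return cb; }
        int getMaximumCapacity() { return cbMax; }
    }

    public static void main(String[] args) {
        Buf buf = new Buf(16);
        View view = buf.getWriteBuffer(4, 8);
        view.write(2, (byte) 7);
        System.out.println(buf.length() + " " + view.length());
        buf.write(15, (byte) 1);
        System.out.println(buf.length() + " " + view.length());
    }
}
```

Writing through the view changes the parent's length, while writing directly
to the parent leaves the view's length untouched.
*/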

    /**
    * Get a BufferOutput object to write data to this buffer, starting at
    * the beginning of the WriteBuffer.
    * <p>
    * This is functionally equivalent to:
    * <pre><code>
    * return getBufferOutput(0);
    * </code></pre>
    *
    * @return a BufferOutput that is writing to this buffer starting at
    *         offset zero
    */
    public BufferOutput getBufferOutput();

    /**
    * Get a BufferOutput object to write data to this buffer starting at a
    * particular offset.
    * <p>
    * Note that each call to this method will return a new BufferOutput
    * object, with the possible exception being that a zero-length
    * non-resizing WriteBuffer could always return the same instance (since
    * it is not writable).
    * <p>
    * This is functionally equivalent to:
    * <pre><code>
    * BufferOutput bufout = getBufferOutput();
    * bufout.setOffset(of);
    * return bufout;
    * </code></pre>
    *
    * @param of  the offset of the first byte of this buffer that the
    *            BufferOutput will write to
    *
    * @return a BufferOutput that will write to this buffer
    */
    public BufferOutput getBufferOutput(int of);

    /**
    * Get a BufferOutput object to write data to this buffer. The
    * BufferOutput object returned by this method is set to append to the
    * WriteBuffer, meaning that its offset is pre-set to the
    * {@link #length()} of this buffer.
    * <p>
    * This is functionally equivalent to:
    * <pre><code>
    * return getBufferOutput(length());
    * </code></pre>
    *
    * @return a BufferOutput configured to append to this buffer
    */
    public BufferOutput getAppendingBufferOutput();


    // ----- accessing the buffered data ------------------------------------

    /**
    * Get a ReadBuffer object that is a snapshot of this WriteBuffer's data.
    * <p>
    * This method is functionally equivalent to the following code:
    * <pre><code>
    * ReadBuffer buf = getUnsafeReadBuffer();
    * byte[] ab = buf.toByteArray();
    * return new ByteArrayReadBuffer(ab);
    * </code></pre>
    *
    * @return a ReadBuffer that reflects the point-in-time contents of this
    *         WriteBuffer
    */
    public ReadBuffer getReadBuffer();

    /**
    * Get a ReadBuffer object to read data from this buffer. This method is
    * not guaranteed to return a snapshot of this buffer's data, nor is it
    * guaranteed to return a live view of this buffer, which means that
    * subsequent changes to this WriteBuffer may or may not affect the
    * contents and / or the length of the returned ReadBuffer.
    * <p>
    * To get a snapshot, use the {@link #getReadBuffer()} method.
    *
    * @return a ReadBuffer that reflects the contents of this WriteBuffer
    *         but whose behavior is undefined if the WriteBuffer is modified
    */
    public ReadBuffer getUnsafeReadBuffer();

    /**
    * Returns a new byte array that holds the complete contents of this
    * WriteBuffer.
    * <p>
    * This method is functionally equivalent to the following code:
    * <pre><code>
    * return getUnsafeReadBuffer().toByteArray();
    * </code></pre>
    *
    * @return  the contents of this WriteBuffer as a byte[]
    */
    public byte[] toByteArray();

    /**
    * Returns a new Binary object that holds the complete contents of this
    * WriteBuffer.
    * <p>
    * This method is functionally equivalent to the following code:
    * <pre><code>
    * return getUnsafeReadBuffer().toBinary();
    * </code></pre>
    *
    * @return  the contents of this WriteBuffer as a Binary object
    */
    public Binary toBinary();


    // ----- Object methods -------------------------------------------------

    /**
    * Create a clone of this WriteBuffer object. Changes to the clone will
    * not affect the original, and vice-versa.
    *
    * @return a WriteBuffer object with the same contents as this
    *         WriteBuffer object
    */
    public Object clone();


    // ----- inner interface: BufferOutput ----------------------------------

    /**
    * The BufferOutput interface represents a DataOutputStream on top of a
    * WriteBuffer.
    *
    * @author cp  2005.01.18
    */
    public interface BufferOutput
            extends OutputStreaming, DataOutput
        {
        // ----- OutputStreaming methods --------------------------------

        /**
        * Close the OutputStream and release any system resources associated
        * with it.
        * <p>
        * BufferOutput implementations do not pass this call down onto an
        * underlying stream, if any.
        *
        * @exception IOException  if an I/O error occurs
        */
        public void close()
                throws IOException;

        // ----- DataOutput methods -------------------------------------

        /**
        * Writes the boolean value <tt>f</tt>.
        *
        * @param f  the boolean to be written
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writeBoolean(boolean f)
                throws IOException;

        /**
        * Writes the eight low-order bits of the argument <tt>b</tt>. The 24
        * high-order bits of <tt>b</tt> are ignored.
        *
        * @param b  the byte to write (passed as an integer)
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writeByte(int b)
                throws IOException;

        /**
        * Writes a short value, comprised of the 16 low-order bits of the
        * argument <tt>n</tt>; the 16 high-order bits of <tt>n</tt> are
        * ignored.
        *
        * @param n  the short to write (passed as an integer)
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writeShort(int n)
                throws IOException;

        /**
        * Writes a char value, comprised of the 16 low-order bits of the
        * argument <tt>ch</tt>; the 16 high-order bits of <tt>ch</tt> are
        * ignored.
        *
        * @param ch  the char to write (passed as an integer)
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writeChar(int ch)
                throws IOException;

        /**
        * Writes an int value.
        *
        * @param n  the int to write
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writeInt(int n)
                throws IOException;

        /**
        * Writes a long value.
        *
        * @param l  the long to write
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writeLong(long l)
                throws IOException;

        /**
        * Writes a float value.
        *
        * @param fl  the float to write
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writeFloat(float fl)
                throws IOException;

        /**
        * Writes a double value.
        *
        * @param dfl  the double to write
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writeDouble(double dfl)
                throws IOException;

        /**
        * Writes the String <tt>s</tt>, but only the low-order byte from each
        * character of the String is written.
        *
        * @param s  the String to write
        *
        * @exception IOException  if an I/O error occurs
        * @exception NullPointerException  if <tt>s</tt> is <tt>null</tt>
        */
        public void writeBytes(String s)
                throws IOException;

        /**
        * Writes the String <tt>s</tt> as a sequence of characters.
        *
        * @param s  the String to write
        *
        * @exception IOException  if an I/O error occurs
        * @exception NullPointerException  if <tt>s</tt> is <tt>null</tt>
        */
        public void writeChars(String s)
                throws IOException;

        /**
        * Writes the String <tt>s</tt> as a sequence of characters, but using
        * UTF-8 encoding for the characters, and including the String length
        * data so that the corresponding {@link java.io.DataInput#readUTF}
        * method can reconstitute a String from the written data.
        *
        * @param s  the String to write
        *
        * @exception IOException  if an I/O error occurs
        * @exception NullPointerException  if <tt>s</tt> is <tt>null</tt>
        */
        public void writeUTF(String s)
                throws IOException;

        // ----- BufferOutput methods -----------------------------------

        /**
        * Get the WriteBuffer object that this BufferOutput is writing to.
        *
        * @return the underlying WriteBuffer object
        */
        public WriteBuffer getBuffer();

        /**
        * Write a variable-length encoded UTF packed String. The major
        * differences between this implementation and DataOutput are that
        * this implementation supports null values and is not limited to
        * 64KB UTF-encoded values.
        * <p>
        * The binary format for a Safe UTF value is a "packed int" for the
        * binary length followed by the UTF-encoded byte stream. The length
        * is either -1 (indicating a null String) or in the range
        * <tt>0 .. Integer.MAX_VALUE</tt> (inclusive). The UTF-encoded
        * portion uses a format identical to DataOutput.
        *
        * @param s  a String value to write; may be null
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writeSafeUTF(String s)
                throws IOException;

        /**
        * Write an int value using a variable-length storage format.
        * <p>
        * The format differs from DataOutput in that DataOutput always uses
        * a fixed-length 4-byte Big Endian binary format for int values.
        * The "packed" format stores a continuation bit (0x80), a sign bit
        * (0x40) and the 6 least significant bits of the int value in the
        * first byte. Subsequent bytes (each appearing only if
        * the previous byte had its continuation bit set) include a
        * continuation bit (0x80) and the next 7 least significant bits of
        * the int value. In this way, a 32-bit value is encoded into 1-5
        * bytes, depending on the magnitude of the int value being encoded.
        *
        * @param n  an int value to write
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writePackedInt(int n)
                throws IOException;
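
/*
Illustrative sketch, not part of this interface: the packed int layout
described for writePackedInt can be exercised with a small stand-alone
encoder and decoder. The class name PackedIntSketch is hypothetical. The
comment above does not say how the magnitude of a negative value is stored;
this sketch assumes the bitwise complement (~n) is written after the sign bit
is set, which should be checked against the actual implementation.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical encoder/decoder for the packed int layout.
final class PackedIntSketch {
    static byte[] encode(int n) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        int b = 0;
        if (n < 0) {
            b = 0x40;  // sign bit in the first byte
            n = ~n;    // assumption: negative values store their complement
        }
        b |= n & 0x3F; // first byte carries the 6 least significant bits
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);  // continuation bit: another byte follows
            b = n & 0x7F;         // each later byte carries the next 7 bits
            n >>>= 7;
        }
        out.write(b);
        return out.toByteArray();
    }

    static int decode(byte[] ab) {
        int of = 0;
        int b = ab[of++] & 0xFF;
        boolean fNeg = (b & 0x40) != 0;
        int n = b & 0x3F;
        int cBits = 6;
        while ((b & 0x80) != 0) {
            b = ab[of++] & 0xFF;
            n |= (b & 0x7F) << cBits;
            cBits += 7;
        }
        return fNeg ? ~n : n;
    }

    public static void main(String[] args) {
        int[] values = {0, 63, 64, -1, -65, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int n : values) {
            byte[] ab = encode(n);
            System.out.println(n + " -> " + ab.length + " byte(s), decoded " + decode(ab));
        }
    }
}
```

Small magnitudes fit in a single byte, and the largest int values need five,
consistent with the 1-5 byte range stated above.
*/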

        /**
        * Write a long value using a variable-length storage format.
        * <p>
        * The format differs from DataOutput in that DataOutput always uses
        * a fixed-length 8-byte Big Endian binary format for long values.
        * The "packed" format stores a continuation bit (0x80), a sign bit
        * (0x40) and the 6 least significant bits of the long value in the
        * first byte. Subsequent bytes (each appearing only if
        * the previous byte had its continuation bit set) include a
        * continuation bit (0x80) and the next 7 least significant bits of
        * the long value. In this way, a 64-bit value is encoded into 1-10
        * bytes, depending on the magnitude of the long value being encoded.
        *
        * @param l  a long value to write
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writePackedLong(long l)
                throws IOException;

        /**
        * Write all the bytes from the passed ReadBuffer object.
        * <p>
        * This is functionally equivalent to the following code:
        * <pre><code>
        * getBuffer().write(getOffset(), buf);
        * </code></pre>
        *
        * @param buf  a ReadBuffer object
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writeBuffer(ReadBuffer buf)
                throws IOException;

        /**
        * Write <code>cb</code> bytes from the passed ReadBuffer object
        * starting at offset <code>of</code> within the passed ReadBuffer.
        * <p>
        * This is functionally equivalent to the following code:
        * <pre><code>
        * getBuffer().write(getOffset(), buf, of, cb);
        * </code></pre>
        *
        * @param buf  a ReadBuffer object
        * @param of   the offset within the ReadBuffer of the first byte to
        *             write to this BufferOutput
        * @param cb   the number of bytes to write
        *
        * @exception IOException  if an I/O error occurs
        */
        public void writeBuffer(ReadBuffer buf, int of, int cb)
                throws IOException;

        /**
        * Write the remaining contents of the specified InputStreaming
        * object.
        * <p>
        * This is functionally equivalent to the following code:
        * <code><pre>
        * getBuffer().write(getOffset(), stream);
        * </pre></code>
        *
        * @param stream  the stream of bytes to write to this BufferOutput
        *
        * @exception IOException  if an I/O error occurs, specifically if an
        *            IOException occurs reading from the passed stream
        */
        public void writeStream(InputStreaming stream)
                throws IOException;

        /**
        * Write the specified number of bytes of the specified InputStreaming
        * object.
        * <p>
        * This is functionally equivalent to the following code:
        * <code><pre>
        * getBuffer().write(getOffset(), stream, cb);
        * </pre></code>
        *
        * @param stream  the stream of bytes to write to this BufferOutput
        * @param cb      the exact number of bytes to read from the stream
        *                and write to this BufferOutput
        *
        * @exception EOFException  if the stream is exhausted before
        *            the number of bytes indicated could be read
        * @exception IOException  if an I/O error occurs, specifically if an
        *            IOException occurs reading from the passed stream
        */
        public void writeStream(InputStreaming stream, int cb)
                throws IOException;

        /**
        * Determine the current offset of this BufferOutput within the
        * underlying WriteBuffer.
        *
        * @return the offset of the next byte to write to the WriteBuffer
        */
        public int getOffset();

        /**
        * Specify the offset of the next byte to write to the underlying
        * WriteBuffer.
        *
        * @param of  the offset of the next byte to write to the
        *            WriteBuffer
        *
        * @exception  IndexOutOfBoundsException  if <code>of < 0</code> or
        *             <code>of > getBuffer().getMaximumCapacity()</code>
        */
        public void setOffset(int of);

        // ----- constants ------------------------------------------------------

        /**
        * Maximum encoding length for a packed int value.
        */
        public static final int MAX_PACKED_INT_SIZE = 5;

        /**
        * Maximum encoding length for a packed long value.
        */
        public static final int MAX_PACKED_LONG_SIZE = 10;
        }
    }

Suddenly there was no either/or trade-off. Data could be efficiently written either as if to an underlying buffer or as if to a stream i.e. a BufferOutput (or to both in tandem!) with the same result. A write of an integer (or a long, or a String, etc.) wouldn't require a series of delegations from one stream to another, and could be optimized based on the type of buffer actually in use. Here is the hierarchy of the WriteBuffer implementations:

AbstractWriteBuffer - implements most of the API, but not optimized for a specific underlying buffer implementation
ByteArrayWriteBuffer - optimized for the underlying buffer being a byte[]
BinaryWriteBuffer - optimized for creating Binary objects when the writing is done
ByteBufferWriteBuffer - optimized for the underlying buffer being an NIO buffer
DelegatingWriteBuffer - eases the implementation of WriteBuffer.getWriteBuffer()
MultiBufferWriteBuffer - allows buffers to be strung together to create large virtual buffers

Likewise, for the ReadBuffer implementations:

AbstractReadBuffer - implements most of the API, but not optimized for a specific underlying buffer implementation
AbstractByteArrayReadBuffer - optimized for the underlying buffer being a byte[]
ByteArrayReadBuffer - for use when the byte[] doesn't have an immutability guarantee
Binary - you've probably seen this class name before!
ByteBufferReadBuffer - optimized for the underlying buffer being an NIO buffer
MultiBufferReadBuffer - allows buffers to be strung together to create large virtual buffers

As you can see, the best part of this architecture is that Binary itself became a ReadBuffer. Now, reading information out of a Binary is as simple as:

BufferInput in = bin.getBufferInput();
String s = in.readSafeUTF();
int n = in.readInt();
Binary bin2 = in.readBuffer(n).toBinary(); // no byte[] alloc or copy!

This buffer design ended up being a crucial element in the success of a number of Coherence features, including the TCP/IP-based Extend protocol, the Portable Object Format (POF), and the 75% reduction in memory allocations (as reported by JRockit profiling tools) that we were able to achieve in our recent Coherence 3.6 release. Furthermore, it serves as the basis for a number of projects that are currently in our R&D pipeline, such as super-efficient on-heap and off-heap storage for even larger data sets, better support for Infiniband and 10G networks, and much more.
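For the curious, the "packed" long format described in the writePackedLong() Javadoc earlier in this post can be sketched in a few lines. This is only an illustration built on the JDK, not the Coherence source: the class name PackedLongSketch and its encode()/decode() helpers are made up for the example, and the handling of negative values (storing the bitwise complement alongside the sign bit) is an assumption that the Javadoc does not spell out.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative sketch of a packed long format; hypothetical, not Coherence code. */
final class PackedLongSketch {
    /** Encode a long into 1-10 bytes: sign + 6 bits first, then 7 bits per byte. */
    static byte[] encode(long l) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(10);
        int b = 0;
        if (l < 0) {
            b = 0x40;  // sign bit in the first byte
            l = ~l;    // ASSUMPTION: negative values are stored complemented
        }
        b |= (int) (l & 0x3F);        // least significant 6 bits
        l >>>= 6;
        while (l != 0) {
            out.write(b | 0x80);      // continuation bit: another byte follows
            b = (int) (l & 0x7F);     // next least significant 7 bits
            l >>>= 7;
        }
        out.write(b);                 // last byte has no continuation bit
        return out.toByteArray();
    }

    /** Decode a value produced by encode(). */
    static long decode(byte[] ab) {
        int i = 0;
        int b = ab[i++] & 0xFF;
        boolean negative = (b & 0x40) != 0;
        long l = b & 0x3F;
        int shift = 6;
        while ((b & 0x80) != 0) {
            b = ab[i++] & 0xFF;
            l |= ((long) (b & 0x7F)) << shift;
            shift += 7;
        }
        return negative ? ~l : l;
    }
}
```

Small magnitudes (anything from -64 to 63) fit in a single byte, and only values near the ends of the 64-bit range need all ten bytes, which is the whole point of the format.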

Friday Jul 23, 2010

It's Maps all the way down

From "A Brief History of Time":

A well-known scientist (some say it was Bertrand Russell) once gave a public lecture on astronomy. He described how the earth orbits around the sun and how the sun, in turn, orbits around the center of a vast collection of stars called our galaxy. At the end of the lecture, a little old lady at the back of the room got up and said: "What you have told us is rubbish. The world is really a flat plate supported on the back of a giant tortoise." The scientist gave a superior smile before replying, "What is the tortoise standing on?" "You're very clever, young man, very clever", said the old lady. "But it's turtles all the way down!"

Coherence uses an analogous architecture, but replace "turtle" in the above story with "Java Map interface". For example, a common use case is to cache a limited amount of data in an application server tier, backed by an In-Memory Data Grid (IMDG). The topology for this use case is referred to as "size-limited near caching with a partitioned cache"; in other words, the application server uses a near cache to keep recently- and commonly-used data locally in memory, while the entire data set (up to a few terabytes) is cached by the In-Memory Data Grid. Coherence, as an IMDG, automatically partitions all of the data across the available cache servers (aka storage-enabled nodes).

In this scenario, the cache that is being used on the application server is a Java Map, implemented by Near Cache. The Near Cache is backed by two Java Maps: The first is an in-memory size-limited Java Map (where the "near" data gets stored), implemented by Local Cache, and the second is the partitioned Java Map, implemented by Partitioned Cache Service. The partitioned Map is actually a thin ClassLoader-aware veneer that handles serialization and deserialization, and delegates to an underlying binary-only Java Map, also implemented by Partitioned Cache Service. The binary implementation uses TCMP messaging to communicate across the cluster to the server(s) that own the partition(s) involved with a given client-side operation. When an operation request is received by one of those servers, it is translated into an invocation against a member-specific Java Map, implemented by the Storage module of the Partitioned Cache Service. That invocation is then typically delegated to a multiplexing Java Map, implemented by a Partition-Aware Backing Map (PABM). The invocation is then delivered to an actual data-managing Java Map, which is the backing map that was configured for that cache. The backing map may in turn be composed of a sequence and/or tree of delegating maps, such as when the cache is backed by a database; in this case, the backing map is a database-aware Java Map, implemented by Read/Write Backing Map (RWBM). The Read/Write Backing Map will store cache data in memory by delegating to a configured backing Java Map, implemented for example by a Local Cache or a thread-safe Hash Map. (There are even more turtles than this, since there are fault-tolerant "safe" layers, the option for client/server proxy layers, client- and subject-specific security layers, and "backup" redundancy layers.)
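The top two turtles in that stack can be pictured with nothing but the JDK. The following is a rough sketch of the layering idea only, a Map that consults a small size-limited "front" Map before delegating to a "back" Map. The class name NearMapSketch is hypothetical, the LRU eviction via LinkedHashMap is my own choice for the example, and none of this reflects how the actual Near Cache handles invalidation, concurrency or serialization.

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical two-layer Map: a size-limited front Map over a back Map. Not thread-safe. */
final class NearMapSketch<K, V> extends AbstractMap<K, V> {
    private final Map<K, V> front;
    private final Map<K, V> back;

    NearMapSketch(final int maxFrontSize, Map<K, V> back) {
        this.back = back;
        // access-ordered LinkedHashMap that drops its eldest entry: a simple LRU
        this.front = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxFrontSize;
            }
        };
    }

    @Override
    public V get(Object key) {
        V value = front.get(key);
        if (value == null) {
            value = back.get(key);              // miss: go to the next Map down
            if (value != null) {
                @SuppressWarnings("unchecked")
                K typedKey = (K) key;
                front.put(typedKey, value);     // keep a local copy for next time
            }
        }
        return value;
    }

    @Override
    public V put(K key, V value) {
        front.put(key, value);
        return back.put(key, value);
    }

    @Override
    public V remove(Object key) {
        front.remove(key);
        return back.remove(key);
    }

    @Override
    public boolean containsKey(Object key) {
        return front.containsKey(key) || back.containsKey(key);
    }

    @Override
    public int size() {
        return back.size();                     // the back Map holds the full data set
    }

    @Override
    public Set<Map.Entry<K, V>> entrySet() {
        // read-only view, so that nothing bypasses the front Map
        return Collections.unmodifiableSet(back.entrySet());
    }

    /** Number of entries currently held locally (for illustration only). */
    int frontSize() {
        return front.size();
    }
}
```

Because the result is itself a Map, it can in turn serve as the back Map of another layer, which is exactly the "all the way down" part.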

All in all, Coherence contains roughly 80 Map extensions and implementations, and a few dozen more in the unit and functional tests packages. Many of the early Map implementations in Coherence were based on the AbstractMap implementation that was part of the Java Collections framework, but AbstractMap turned out to be a rather poor base class for Map implementations for a number of reasons, including:

At first glance, AbstractMap implements everything except for one method. Unfortunately, that one method is entrySet(), which means that instead of implementing a Map of keys to values (which makes sense), you have to implement a Set of key/value ("Entry") pairs (which doesn't typically make sense).

In most cases, you manage the internal data structures by key, not by entry, so you end up overriding the keySet() implementation with a custom key set, and writing an entrySet() implementation that delegates back to the Map methods themselves or to the key set.

The AbstractMap method implementations all delegate to the Entry Set, so simple operations such as isEmpty(), size(), remove() and even get() iterate through the entries -- in most cases, through all of the entries! To create an efficient Map implementation, you thus have to override most of the concrete implementations from AbstractMap. Furthermore, if you implement entrySet() as described above, the Map methods must be overridden to avoid infinite recursion.
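To make that last point concrete, here is a minimal sketch of a Map that does exactly what AbstractMap asks for and nothing more: it implements entrySet() and inherits everything else. The class EntrySetOnlyMap and its entriesVisited counter are invented for this example; the counter simply records how many entries the inherited get() and containsKey() walk through.

```java
import java.util.AbstractMap;
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical read-only Map of 0..count-1 that only implements entrySet(). */
final class EntrySetOnlyMap extends AbstractMap<Integer, String> {
    private final int count;

    /** How many entries have been handed out by entry set iterators so far. */
    int entriesVisited;

    EntrySetOnlyMap(int count) {
        this.count = count;
    }

    @Override
    public Set<Map.Entry<Integer, String>> entrySet() {
        return new AbstractSet<Map.Entry<Integer, String>>() {
            @Override
            public int size() {
                return count;
            }

            @Override
            public Iterator<Map.Entry<Integer, String>> iterator() {
                return new Iterator<Map.Entry<Integer, String>>() {
                    private int index = 0;

                    @Override
                    public boolean hasNext() {
                        return index < count;
                    }

                    @Override
                    public Map.Entry<Integer, String> next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        EntrySetOnlyMap.this.entriesVisited++;
                        Integer key = index++;
                        return new AbstractMap.SimpleImmutableEntry<Integer, String>(key, "value-" + key);
                    }
                };
            }
        };
    }
}
```

With 1000 entries, looking up the last key through the inherited get() visits all 1000 entries, and a containsKey() for a missing key does the same, even though the "data structure" here could answer either question with simple arithmetic.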

As a result, in many cases AbstractMap ends up providing only a handy implementation of hashCode(), toString() and equals(). To help address this, Coherence provides two abstract Map implementations:

When a Map knows all of its keys, i.e. when a Map has a data structure that contains all of its keys, then it should extend AbstractKeySetBasedMap.

When a Map doesn't hold all of its keys in a data structure, then it should extend AbstractKeyBasedMap. A good example of this is when the keys and values are stored on disk or in another system, or in another Map that is delegated to, and thus the delegating Map implementation does not have to store all of the keys redundantly.

In both cases, the keySet(), entrySet() and values() methods are implemented by the abstract class, and typically do not have to be overridden. To override any of them, there is a standard inner class factory pattern that allows sub-classes to override the KeySet, EntrySet and ValuesCollection inner classes, for example:

protected Set<Map.Entry<K, V>> instantiateEntrySet()

Like AbstractMap, a number of methods are implemented, but should generally be overridden for efficiency purposes. For example, implementations extending AbstractKeyBasedMap should implement and/or override the following methods:

/**
* Create an iterator over the keys in this Map. The Iterator must
* support remove() if the Map supports removal.
*
* @return a new instance of an Iterator over the keys in this Map
*/
protected abstract Iterator<K> iterateKeys();

/**
* Returns the value to which this map maps the specified key.
*
* @param oKey  the key object
*
* @return the value to which this map maps the specified key,
*         or null if the map contains no mapping for this key
*/
public abstract V get(Object oKey);

/**
* Associates the specified value with the specified key in this map.
*
* @param oKey    key with which the specified value is to be associated
* @param oValue  value to be associated with the specified key
*
* @return previous value associated with specified key, or <tt>null</tt>
*         if there was no mapping for key
*/
public V put(K oKey, V oValue)
    {
    throw new UnsupportedOperationException();
    }

/**
* Removes the mapping for this key from this map if present.
*
* @param oKey key whose mapping is to be removed from the map
*
* @return previous value associated with specified key, or <tt>null</tt>
*         if there was no mapping for key.  A <tt>null</tt> return can
*         also indicate that the map previously associated <tt>null</tt>
*         with the specified key, if the implementation supports
*         <tt>null</tt> values.
*/
public V remove(Object oKey)
    {
    throw new UnsupportedOperationException();
    }

/**
* Returns the number of key-value mappings in this map.
*
* @return the number of key-value mappings in this map
*/
public int size()
    {
    // this begs for sub-class optimization
    int c = 0;
    for (Iterator iter = iterateKeys(); iter.hasNext(); )
        {
        iter.next();
        ++c;
        }
    return c;
    }

/**
* Returns <tt>true</tt> if this map contains a mapping for the specified
* key.
*
* @return <tt>true</tt> if this map contains a mapping for the specified
*         key, <tt>false</tt> otherwise.
*/
public boolean containsKey(Object oKey)
    {
    // this begs for sub-class optimization
    for (Iterator iter = iterateKeys(); iter.hasNext(); )
        {
        if (equals(oKey, iter.next()))
            {
            return true;
            }
        }
    return false;
    }

Additionally, depending on the implementation, it may make sense to override a few additional methods:

/**
* Clear all key/value mappings.
*/
public void clear()
    {
    // this begs for sub-class optimization
    for (Iterator iter = iterateKeys(); iter.hasNext(); )
        {
        iter.next();
        iter.remove();
        }
    }

/**
* Returns <tt>true</tt> if this map contains no key-value mappings.
*
* @return <tt>true</tt> if this map contains no key-value mappings
*/
public boolean isEmpty()
    {
    return size() == 0;
    }

/**
* Removes the mapping for this key from this map if present. This method
* exists to allow sub-classes to optimize remove functionality for
* situations in which the original value is not required.
*
* @param oKey key whose mapping is to be removed from the map
*
* @return true iff the Map changed as the result of this operation
*/
protected boolean removeBlind(Object oKey)
    {
    if (containsKey(oKey))
        {
        remove(oKey);
        return true;
        }
    else
        {
        return false;
        }
    }
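Pulling the listings above together, the following sketch shows the shape of the key-based approach using only the JDK. It is a much-simplified, hypothetical stand-in and not the Coherence class: KeyBasedMapSketch and DelegatingMapSketch are names made up for the example, and building on AbstractMap (purely to inherit equals(), hashCode() and toString()) is my shortcut here. The sub-class supplies iterateKeys() and get(), and the entry set is derived from the keys instead of the other way around.

```java
import java.util.AbstractMap;
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified key-based abstract Map; not the Coherence implementation. */
abstract class KeyBasedMapSketch<K, V> extends AbstractMap<K, V> {
    /** Sub-classes say how to walk the keys ... */
    protected abstract Iterator<K> iterateKeys();

    /** ... and how to look up a value by key. */
    @Override
    public abstract V get(Object key);

    @Override
    public int size() {
        // default implementation; this begs for sub-class optimization
        int c = 0;
        for (Iterator<K> iter = iterateKeys(); iter.hasNext(); iter.next()) {
            ++c;
        }
        return c;
    }

    /** The entry set is built from the keys, so sub-classes never have to write one. */
    @Override
    public Set<Map.Entry<K, V>> entrySet() {
        return new AbstractSet<Map.Entry<K, V>>() {
            @Override
            public int size() {
                return KeyBasedMapSketch.this.size();
            }

            @Override
            public Iterator<Map.Entry<K, V>> iterator() {
                final Iterator<K> keys = KeyBasedMapSketch.this.iterateKeys();
                return new Iterator<Map.Entry<K, V>>() {
                    @Override
                    public boolean hasNext() {
                        return keys.hasNext();
                    }

                    @Override
                    public Map.Entry<K, V> next() {
                        K key = keys.next();
                        V value = KeyBasedMapSketch.this.get(key);
                        return new AbstractMap.SimpleImmutableEntry<K, V>(key, value);
                    }

                    @Override
                    public void remove() {
                        keys.remove();
                    }
                };
            }
        };
    }
}

/** Example sub-class: a view whose keys and values live in another Map. */
final class DelegatingMapSketch<K, V> extends KeyBasedMapSketch<K, V> {
    private final Map<K, V> delegate;

    DelegatingMapSketch(Map<K, V> delegate) {
        this.delegate = delegate;
    }

    @Override
    protected Iterator<K> iterateKeys() {
        return delegate.keySet().iterator();
    }

    @Override
    public V get(Object key) {
        return delegate.get(key);
    }

    // the optimizations that the default implementations "beg for"
    @Override
    public int size() {
        return delegate.size();
    }

    @Override
    public boolean containsKey(Object key) {
        return delegate.containsKey(key);
    }
}
```

Since put() is not overridden in this sketch, it falls through to AbstractMap and throws UnsupportedOperationException, which mirrors the default put() shown above.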

In fact, many of those methods are overridden by AbstractKeySetBasedMap, which extends AbstractKeyBasedMap. Implementations extending AbstractKeySetBasedMap still have to implement or override get(), put() and remove(), and may choose to override clear() and removeBlind(), but do not have to override iterateKeys(), containsKey(), isEmpty() or size(). There are two new methods that do need to be implemented or overridden:

/**
* Obtain a set of keys that are represented by this Map.
* <p/>
* The AbstractKeySetBasedMap only utilizes the internal key set as a
* read-only resource.
*
* @return an internal Set of keys that are contained by this Map
*/
protected abstract Set<K> getInternalKeySet();

/**
* Determine if this Iterator should remove an iterated item by calling
* remove on the internal key Set Iterator, or by calling removeBlind on
* the map itself.
*
* @return true to remove using the internal key Set Iterator or false to
*         use the {@link AbstractKeyBasedMap#removeBlind(Object)} method
*/
protected boolean isInternalKeySetIteratorMutable()
    {
    return false;
    }

The isInternalKeySetIteratorMutable() method allows for an optimization to be made in instantiateKeyIterator() with respect to providing an iterator over the keys of the Map:

/**
* Factory pattern: Create a mutable Iterator over the keys in the Map
*
* @return a new instance of Iterator that iterates over the keys in the
*         Map and supports element removal
*/
protected Iterator instantiateKeyIterator()
    {
    Iterator iter = getInternalKeySet().iterator();
    if (!isInternalKeySetIteratorMutable())
        {
        iter = new KeyIterator(iter);
        }
    return iter;
    }

Remember, this method was required by the AbstractKeyBasedMap super class; since the AbstractKeySetBasedMap knows all of its keys, it can implement this method by delegating to the "internal" key set.
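The wrapping branch in that listing (the one that creates a KeyIterator) is worth a quick illustration. The sketch below is a guess at the general idea and not the actual KeyIterator: the class name RemovingKeyIterator is hypothetical, the removal callback stands in for removeBlind(), and copying the keys up front is an assumption I've made so that removing through the Map can't upset an iterator over a live key set.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.function.Consumer;

/** Hypothetical iterator over read-only keys that removes via a callback. */
final class RemovingKeyIterator<K> implements Iterator<K> {
    private final Iterator<K> keys;
    private final Consumer<? super K> remover;
    private K last;
    private boolean canRemove;

    RemovingKeyIterator(Collection<K> internalKeys, Consumer<? super K> remover) {
        // ASSUMPTION: iterate over a snapshot, so the internal key set is only read
        this.keys = new ArrayList<K>(internalKeys).iterator();
        this.remover = remover;
    }

    @Override
    public boolean hasNext() {
        return keys.hasNext();
    }

    @Override
    public K next() {
        last = keys.next();
        canRemove = true;
        return last;
    }

    @Override
    public void remove() {
        if (!canRemove) {
            throw new IllegalStateException();
        }
        canRemove = false;
        remover.accept(last);   // e.g. key -> map.removeBlind(key)
    }
}
```

When the internal key set's own iterator can safely remove entries, none of this is needed, which is what returning true from isInternalKeySetIteratorMutable() signals.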

These two abstract implementations serve as the basis for over 20 Map implementations in Coherence, and when they were introduced, their use (in the place of AbstractMap) resulted in the elimination of thousands of lines of code.

Next up: The purpose of Binary, and the various "serialization" Maps.

Friday May 21, 2010

Hasta la vista, baby!

Abi has joined this session!

Connected with Abi. Your reference number for this chat session is 2357642.

Abi: Welcome to Microsoft Customer Service Chat, Cameron. Kindly give me a moment to review your question.

Abi: Thank you for waiting. I'm sorry to hear that your having issues with the 25-digit product key of your Windows 7. Did you get an error message? If yes, may you please provide it to me?

Cameron Purdy: Yes, in the title it says "Invalid product key" and then in the body of the message it says "The product key you have entered does not appear to be a valid Windows 7 product key. Please check your product key, and type it again."

Abi: I see. By the way, please note that this chat service is available to guide you to the appropriate resources for your questions. Often this can include technical support options or a phone number to the appropriate Microsoft Team.

Cameron Purdy: I have typed it and checked it several times now. (It doesn't help that the font is ridiculously bad and you can't tell whether some letters/numbers are sixes or G's.)

Abi: Cameron, did you receive this message during activation or installation?

Cameron Purdy: Activation.

Abi: I sincerely apologize for the inconvenience this has caused you.

Cameron Purdy: I have the original package in front of me with the product key. This is supposed to be an "upgrade" to the Vista OEM license that came with the computer, but when I tried to upgrade it told me that I had to install a new copy, so I did the install, and now it won't take the key.

Abi: For this particular issue, please contact our Personal Support at 800-936-5700. They are open Monday - Friday 5 AM - 9 PM, Saturday - Sunday 6 AM - 3 PM Pacific Time.

Abi: Is there anything else I can help you with today, Cameron?

Cameron Purdy: The location of a local Apple store?

Abi: I'm sorry but Microsoft doesn't support Apple.

Abi: It seems like I lost you. I hope this will be resolved soon. Thank you for using Microsoft Customer Service Chat. Please feel free to come back again. We are available 7 days a week, 24 hours a day.

Abi has left this session!

The session has ended!

Monday Apr 12, 2010

Coherence 3.5 Book

For such a successful software package that's used by thousands of companies and has been out in the market for over eight years, it's a little surprising that the only documentation available to date came from the vendor itself.

Thankfully, Aleks Seovic, who is one of the authors of Spring for .NET, wrote a hefty volume on building high-performance, scalable applications with Coherence 3.5:

[image]

There aren't too many good books on the subject of extreme scalability and scalable performance, and very few that look at those topics in tight relation to High Availability (HA), continuous availability and reliability. This book brings a lot of those ideas down to earth, and shows how to architect and build real, working applications that scale well, perform beautifully (and consistently!), and survive hardware failure without any impact on the users of the applications.

Here's Aleks in his own words talking about this topic:

The part of a typical application that is most difficult to scale is a data layer. Scaling web servers out is trivial -- you simply add more of them behind the load balancer. Because HTTP is naturally a stateless protocol, HTTP requests can be easily balanced across many machines.

The situation is similar with the application servers, but with an important difference -- in order to be able to scale application servers out, we need to make middle tier services stateless. This is not necessarily a bad thing, but it can lead to both performance and scalability problems if done incorrectly.

Unfortunately, very few services are truly stateless in real life. What is usually the case is that they need some data to process, so the common approach is for a service to load data from the data layer, process it, and persist the result back in the data layer. That means that in order to make application services stateless, we need to push all the state (even the transient state, such as HTTP sessions) to the data layer. This puts additional load on the data layer -- the more web and application servers we have, the harder our data layer will have to work to satisfy incoming requests.

The problem is that most applications use RDBMS as a data layer, and relational databases are both difficult and expensive to scale. When you use master-slave replication, clustering or sharding to scale your database, you significantly increase both the complexity and the cost of the system. Even if you use an open source database, such as MySql, you will have additional hardware and administration costs. In the case of sharding, you will also increase development cost quite a bit.

The bottom line is that relational databases were simply not designed for scale out. Even though we have come up with different solutions to scale them, all of them feel like a kludge to a certain extent, and have some fairly significant limitations.

On the other hand, Coherence and other in-memory data grids were designed with scalability as a primary goal. In the case of Coherence specifically, it is trivial to scale the cluster out by adding more machines. Coherence will dynamically repartition the cluster, which will automatically reduce both the data volume and the processing load across the cluster.

The net result is that in-memory data grids allow you to put state back into the application. This can significantly reduce the load on a database -- it is not uncommon to see database load drop 80% or more after Coherence is introduced into the architecture. They also allow you to keep transient data, such as HTTP sessions, only within the grid, reducing the load on a database even further.

Since I've been talking with Aleks about this project the whole way through, I can attest that this book was a long time in the making! It's never hard to start one, but it's always impressive to me when people finish writing a book, because it is just so much more work than one would expect it to be. At the end of the project, Aleks asked me to write the foreword, which is as close as I've ever come to writing a book ;-)

(Note: Assuming I did the above link right, a small portion of the proceeds from the Amazon transaction goes to support one of our local public radio stations.)

All Good Things ...

The big news in Javaland from this past week is that James Gosling left Oracle. That sounds so negative, though.

How about this instead: James Gosling started a new blog!

Yeah, that's a tough one. It's always disappointing to lose talented people. As for you, James, best wishes, and I hope we'll still see you at JavaOne.

Thursday Jan 21, 2010

Eight Theses

A recent post by Tim Bray started a conversation on a mailing list that I enjoy lurking (and occasionally commenting) on. In "Doing It Wrong", Tim laments the complexity of "enterprise systems", and the heavy-weight processes and cultures that surround them. Now I may be fairly evil to point this out, but Tim is widely credited for co-inventing XML, and thus (IMHO) is automatically disqualified from lamenting the complexity of enterprise systems, since most modern complexity seems to somehow be related to the use of XML ;-).

Einar Landre, speaking at Oppdat, and on much the same subject, asked the question "What is the reasons why students or startups manage to pull off things that big enterprise IT teams almost never manage to pull off?"

Dave Bartlett responded:

There are examples of groups within large companies that have successfully short circuited the bureaucracy. In the '80s the term for these groups was usually a "skunk works" project. Essentially a start-up within a large company, but for that project to succeed there had to be someone with the vision to recognize an opportunity from the changes that are emerging from technology, business and society. With the entrepreneurial culture we have today, less of these talented people go into large corporations.

And Barry Hawkins added:

Regarding "enterprises" (a term I've come to loathe), the apathy and ineffectiveness seems to stem from being so far removed from the actual business of providing people something they pay money for. So very few companies manage to reach a largish size and maintain that sense of immediacy and connection with their domain.

I added seven points to consider, which after some feedback I expanded to these Eight Theses:

1. There is a difference between version 1 and version 2. Building something with a single purpose from scratch is mind-bogglingly easy compared to changing something that already exists. While we use the term "system integration" in enterprise software, what we're really talking about is going from version 72915287 to version 72915288 of a complex system called "IT", composed of many moving parts (e.g. applications and the infrastructure that they run on). Those same people, if they were building a new "IT version 1" to include all the features of "IT version 72915288" would be able to do it in a much shorter aggregate time, at much lower aggregate cost.

2. Second system effect. While I just claimed that a veteran IT group could build an "IT version 1" from scratch quite quickly and inexpensively, there is also an incredibly high likelihood that they could not build it at all. The name given to this phenomenon is "second system effect", which basically means that people who have battle scars from a legacy system will tend to over-invest in the "flexibility" and "completeness" of a green-field replacement of that system (often out of fear of missing something important), causing the complexity of the project to sky-rocket and killing it in the process. In summary: Knowledge is paralyzing. (That could largely explain why students are fearless.)

3. The real cost of complexity increases exponentially. I usually deplore folklore about start-ups that "go big" (e.g. Google and Twitter), but there is one that I'd like to repeat (and probably butcher in the process), and here it is: One guy wrote eBay in a day -- one guy, one day! When they built version 2, the project took the same guy many weeks. Version 3 took many years. There can never be a version 4, because the system is so complex that it literally cannot be replaced in total.

Just to be clear, the same brilliant person is usually present for version 1 and 2 and 3, so why does each version take longer -- even if replacing the system in full? The answer is that small increases in perceived complexity (e.g. each new requirement) expand the real complexity of the system exponentially. Want examples? Take a hand-built web-site and add internationalization / localization. Add accessibility support. Add support for other devices, such as an iPhone. Add scale-out. Add high availability. Add security. If you know someone that thinks it's easy, then whatever you do, don't hire that person, fire them if they already work for you, and quit if you work for them.

4. Survivor bias. (i.e. Perception vs. reality.) For every successful (?) Twitter, there are dozens or hundreds of failures. While many of us have heard of every single large success, no one person has even heard of 10% of the failures. While large businesses have large IT failures, the rate of failure is far lower than that of start-ups (including all those half- and 90%-implemented ideas that never even make it to a visible "start-up" phase). Attempting to draw conclusions from a comparison of success in enterprises versus start-ups is thus pointless, because one is largely comparing the successes from one group with the successes and failures from the other, and that bias is unavoidable.

5. Managing risk when there is something to lose. When you are building a new system, there's nothing to break, and thus nothing to lose. When you are modifying a production system that is perceived as largely working and mostly available, there's plenty to lose. It is self-evident that you will make different decisions when you feel that you have something to lose.

6. Organizational overhead. In a start-up or a small company you are much more likely to spend a large percentage of your time actively working toward the goals of the company, while in a large company you may only spend a small fraction of your time (if any) working toward the goals of the company. That's why "skunk-works" projects can work: Large organizations often exist (both in aggregate and at any observable level of division) primarily to continue to exist and can best be described as a social form of Brownian motion; a skunk-works project creates a goal that is in concert with the real goal of the larger company, and then shields that project's direction and progress from the effects of the larger organization's Brownian motion. It can be done.

7. Employee bias. Ask yourself this: Who works at start-ups? The risk profile of an employee at a start-up will tend to be significantly higher than that of an employee at a large company. That concentration of risk takers may contribute to the high percentage of start-ups that fail, but it also contributes to the breakthroughs that would be impossible to realize in a large, risk-averse organization. Perhaps the scariest aspect of this is that many employees at large organizations have absolutely no power and -- managing their own risk well -- do absolutely net-nothing, which is why it does not surprise me that large organizations have such difficulty executing on projects of substantial size.

8. Conflicting goals. In a small team or a small organization, particularly one that is formed around a vision or commonality, it's actually possible to have a largely-shared set of values and goals, and a largely-shared vision. This is much more difficult (if not impossible) in a large, established organization. Furthermore, it is likely that the incentives meant to drive behavior will only approximate (at best) the goals of the organization. For example, someone working in Direct Sales will likely be rewarded for their ability to keep cold-called customers on the phone, while someone working in Technical Support will likely be rewarded for their ability to get customers off of the phone. Add politics and self-centered career- and empire-builders, and you have an environment in which a few destructive employees will freely spend company resources for their own purposes, often at the expense of the real goals and values of the organization.

These thoughts are obviously my own perceptions from my own experiences. I do find that having an understanding of these trade-offs and differences is useful in "getting stuff done" and shielding our own little skunk-works organization within a large company. On the other hand, my experience is so short and limited that all of the things that I assume to be true may yet be turned on their heads. I hope pleasantly so ....

Thursday Oct 08, 2009

Innovation

I was interviewed recently on the topic of innovation as part of the Oracle Innovation Showcase. Reading back through it, I liked this question and answer:

Q: What would you call the enemy of innovation?

A: The first is complacency. A lot of people are satisfied with what they have. If you can convince yourself you're satisfied, you'll stop looking for a better solution. The other is an inability to listen and appreciate the complexity of a problem. Everyone wants to believe they have the answer before the question even gets asked. People don't take the time to listen and appreciate the individual complexity of each customer's problem and the nuances of their environments.

The full interview can be found on the Oracle Innovation Showcase.

Wednesday Sep 30, 2009

The challenge with GC in Java and .NET

A recent topic on Artima is coincident with a topic that I was recently writing about. I've included a portion of that here:

--

There exists no shortage of opinions on the topic of what aspects are the most important in a system of execution. One voice will claim that only performance matters, while another will suggest that it no longer matters at all. One voice will claim that achieving efficiencies in development is far more valuable, while another will insist that predictability and stability in execution is critical. Opinions morph with time, as the reality of the physical units of execution evolves and the conceptual units of design are ever the more realized in languages and libraries.

Nonetheless the state of the art today bears the hallmark of a path followed far beyond its logical conclusion. In 1977, John Backus raised an early warning in his ACM Turing Award lecture:

"Surely there must be a less primitive way of making big changes in the store than by pushing vast numbers of words back and forth through the von Neumann bottleneck. Not only is this tube a literal bottleneck for the data traffic of a problem, but, more importantly, it is an intellectual bottleneck that has kept us tied to word-at-a-time thinking instead of encouraging us to think in terms of the larger conceptual units of the task at hand. Thus programming is basically planning and detailing the enormous traffic of words through the von Neumann bottleneck, and much of that traffic concerns not significant data itself, but where to find it."

While programming advances have largely digested and expelled the explicit concerns of store-addressing and word-at-a-time thinking, these advances have been repetitively accreted onto a burial mound whose foundation remains a von Neumann architecture. Perhaps the success of that underlying architecture is the result of natural selection, or perhaps we have only inertia to blame. In any case, the evolution of concurrent multi-processing and distributed systems has stretched the von Neumann architecture past its effective limits. Specifically, it appears that the recent growth in the extent of the now automatically-managed store has occurred at a pace well beyond the increase in performance of the heart of the von Neumann machine: the processor. Whether this imbalance can be rectified by further technological accretion or by the adoption of a fundamentally new execution architecture is yet to be seen, but regardless the inevitable and predictable increase in performance that has become the opiate of an industry has taken a sabbatical, and may have accepted an early retirement altogether.

There has existed a loose historic alignment in the growth of processor performance, memory capacity, memory throughput, durable storage capacity, durable storage throughput and network throughput. This relatively consistent growth has allowed a general model of assumptions to be perpetuated throughout hardware architectures, operating systems, programming languages and the various resulting systems of execution. Now we find that model to be threatened by the failed assumption that processor performance will increase at a rapid and relatively predictable rate.

To maintain the façade of progress, explicit hardware parallelism has emerged as the dominant trend in increasing processor throughput. Symmetric Multi-Processing (SMP) has a relatively long history in multi-CPU systems, but adoption of those systems was hobbled both by high prices and a lack of general software support. The advent of the Internet propelled multi-CPU systems into the mainstream for back-end servers, but it is the recent, seemingly instantaneous and near-universal commoditization of multi-core CPUs that has finalized the dramatic shift from a focus on processor performance to a focus on processor parallelism. Further compounding the adoption of multi-CPU and multi-core systems are various technologies for Concurrent Multi-Threading (CMT), which enables a single CPU core to execute multiple threads concurrently. In aggregate, the last decade has increased parallelism from 1 to 16 concurrently executing threads in an entry-level server, while the performance of an individual processing unit has increased by only a few times. Looking forward, processor performance is now expected to improve only incrementally, while the level of parallelism appears to be doubling with each new processor generation.

Since overall processing throughput has continued to increase at a dramatic pace not dissimilar from its historic trend, this shift from performance to parallelism could be safely ignored but for one problem: The von Neumann architecture is bound to a processing unit, and thus has nearly halted its forward progress in terms of the throughput of a single thread of execution. This means that for the first time, existing programs do not run significantly faster on newer generations of hardware unless they were built to explicitly take advantage of thread parallelism, which is to say unless they were built assuming their execution would occur on multiple von Neumann machines in parallel. Since the art of programming is expressed almost entirely in imperative terms, and since the imperative nature of programming languages is based on the von Neumann architecture, we have managed to accumulate generations of programs and programmers that are hard-wired to a model that has at least temporarily halted its forward progress.

Computing devices thus are providing increases in processing throughput that can only be consumed by parallelism. It is obvious that this mandates support for parallelism in any new system of execution, but there is a far less obvious implication of critical importance. Parallelism increases throughput only to the extent that coordination is not required by the threads of execution, and coordination is required only for resources that are potentially shared across multiple threads of execution. In modern systems of execution, explicit parallelism is provided by threads of execution, each representing the state of a von Neumann machine, but those machines collectively share a single store. Compounding the coordination overhead for the store is the prevalence of automatic management of the store, referred to as Garbage Collection (GC), which unavoidably requires some level of coordination. While GC algorithms have advanced dramatically in terms of parallelism, the remaining non-parallelized (and possibly non-parallelizable) portion of GC is executed as if by a single thread. The unavoidable conclusion is that growth in the shared store without a corresponding increase in processor performance will lead to unavoidable and growing pauses in the execution of the parallelized von Neumann machines.
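The limit described above is essentially what Amdahl's law formalizes: if some fraction of the work must run as if on a single thread, that fraction caps the achievable speedup no matter how many threads are added. The following is a minimal sketch of that arithmetic, not a model of any real garbage collector; the class name, the method name, and the 5% serial fraction are assumptions chosen purely for illustration.

```java
public class AmdahlSketch {

    /**
     * Amdahl's law: best-case speedup on the given number of threads when a
     * fraction of the work (serialFraction, between 0 and 1) cannot be parallelized.
     */
    static double speedup(double serialFraction, int threads) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / threads);
    }

    public static void main(String[] args) {
        // Assumed for illustration: 5% of the work (e.g. the non-parallel
        // portion of store management) is executed as if by a single thread.
        double serialFraction = 0.05;

        for (int threads : new int[] {1, 2, 4, 8, 16, 64, 1024}) {
            System.out.printf("%5d threads -> %.2fx speedup%n",
                    threads, speedup(serialFraction, threads));
        }
    }
}
```

Under that assumed 5% serial fraction, 16 threads yield a speedup of only about 9.1x, and no number of threads can exceed 20x. If the serialized portion grows with the size of the shared store while single-thread performance stays flat, that ceiling drops further.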

A series of amazing advances in GC algorithms have thus far masked this inevitable consequence, but the advances are already showing diminishing returns, while the upward pressure on the size of the store has not abated and the dramatic progress of processor performance has not resumed.

Upcoming Coherence SIGs

Unfortunately, considering that we've booked the largest room that Oracle has available, the NYC Coherence SIG scheduled for 1 October is basically "sold out". (Pre-registration is required.)

Fortunately, we have another one coming up in San Francisco and the Bay Area. This one will be held at Oracle's Redwood Shores campus on 8 October. Pre-registration is required for this one, too, but we've reserved plenty of room. Some of the topics being covered:

Aleks Seovic, who authored a book on Coherence (coming out later this year), will be talking about Coherence extension points and utilities and tools that he built as part of writing the book. Aleks (who wrote a good portion of Spring.NET) was also deeply involved with the architecture and implementation of Coherence for .NET, and he's a fun speaker.

Matt Rosen and Horea Abrudan from Orient Overseas Container Lines will be talking about how they have used Coherence, including the use cases they set out to address, the development process, and what they've learned in production.

Everett Williams, who authored the Coherence JMX Reporter and a number of monitoring features in the product, will be presenting on ... the Coherence JMX Reporter. For those of you who haven't seen it, the JMX Reporter allows the cluster to produce time-series analytics that help to diagnose issues in large-scale production environments, i.e. the types of environments that you can't take down to "debug", and that are so large that you couldn't do it even if you were allowed to.

There are quite a few other Coherence events going on this quarter as well, but the biggest is Oracle OpenWorld in San Francisco. Click on that link and save it, because it has all of the information about Coherence at OpenWorld, and will save you a ton of time! (OpenWorld is usually around 85,000 people, so it can be pure chaos!)




The views expressed on this blog are my own and do not necessarily reflect the views of my employer.
Content copyright 2002, 2003, 2004, 2005, 2006, 2007 by Cameron Purdy. All rights reserved.

