org.codehaus.jackson.smile
Class SmileConstants

java.lang.Object
  extended by org.codehaus.jackson.smile.SmileConstants

public final class SmileConstants
extends Object

Constants used by SmileGenerator and SmileParser

Author:
tatu

Field Summary
static byte BYTE_MARKER_END_OF_CONTENT
          In addition we can use a marker to allow simple framing; splitting of physical data (like file) into distinct logical sections like JSON documents.
static byte BYTE_MARKER_END_OF_STRING
           
static int HEADER_BIT_HAS_RAW_BINARY
          Indicator bit that indicates whether encoded content may contain raw (unquoted) binary values.
static int HEADER_BIT_HAS_SHARED_NAMES
          Indicator bit that indicates whether encoded content may have Shared names (back references to recently encoded field names).
static int HEADER_BIT_HAS_SHARED_STRING_VALUES
          Indicator bit that indicates whether encoded content may have shared String values (back references to recently encoded 'short' String values, where short is defined as 64 bytes or less).
static byte HEADER_BYTE_1
          First byte of data header
static byte HEADER_BYTE_2
          Second byte of data header
static byte HEADER_BYTE_3
          Third byte of data header
static byte HEADER_BYTE_4
          Fourth byte of data header; contains version nibble, may have flags
static int HEADER_VERSION_0
          Current version consists of four zero bits (nibble)
static int INT_MARKER_END_OF_STRING
          We need a byte marker to denote end of variable-length Strings.
static int MAX_SHARED_NAMES
          Longest back reference we use for field names is 10 bits; no point in keeping much more around
static int MAX_SHARED_STRING_LENGTH_BYTES
          Also: whereas we can refer to names of any length, we will only consider text values that are considered "tiny" or "short" (ones encoded with length prefix); this value thereby has to be maximum length of Strings that can be encoded as such.
static int MAX_SHARED_STRING_VALUES
          Longest back reference we use for short shared String values is 10 bits, so up to (1 << 10) values to keep track of.
static int MAX_SHORT_NAME_ASCII_BYTES
          Encoding has special "short" forms for field names that can be represented by 64 bytes of UTF-8 or less.
static int MAX_SHORT_NAME_UNICODE_BYTES
          Maximum byte length for short non-ASCII names is slightly less due to having to reserve bytes 0xF8 and above (but we get one more as values 0 and 1 are not valid)
static int MAX_SHORT_VALUE_STRING_BYTES
          Encoding has special "short" forms for value Strings that can be represented by 64 bytes of UTF-8 or less.
static int MIN_BUFFER_FOR_POSSIBLE_SHORT_STRING
          And to make encoding logic tight and simple, we can always require that output buffer has this amount of space available before encoding possibly short String (3 bytes since longest UTF-8 encoded Java char is 3 bytes).
static int[] sUtf8UnitLengths
          Additionally we can combine UTF-8 decoding info into similar data table.
static byte TOKEN_KEY_EMPTY_STRING
          Let's use same code for empty key as for empty String value
static byte TOKEN_KEY_LONG_STRING
           
static byte TOKEN_LITERAL_EMPTY_STRING
           
static byte TOKEN_LITERAL_END_ARRAY
           
static byte TOKEN_LITERAL_END_OBJECT
           
static byte TOKEN_LITERAL_FALSE
           
static byte TOKEN_LITERAL_NULL
           
static byte TOKEN_LITERAL_START_ARRAY
           
static byte TOKEN_LITERAL_START_OBJECT
           
static byte TOKEN_LITERAL_TRUE
           
static int TOKEN_MISC_BINARY_7BIT
          Type (for misc, other) used for "safe" binary data (encoded using only the 7 LSB of each byte, giving an 8/7 expansion ratio).
static int TOKEN_MISC_BINARY_RAW
          Raw binary data marker is specifically chosen as separate from other types, since it can have significant impact on framing (or rather fast scanning based on structure and framing markers).
static int TOKEN_MISC_FLOAT_32
          Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating 32-bit IEEE single precision floating point number.
static int TOKEN_MISC_FLOAT_64
          Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating 64-bit IEEE double precision floating point number.
static int TOKEN_MISC_FLOAT_BIG
          Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating BigDecimal type.
static int TOKEN_MISC_FP
          Type (for misc, other) used for regular floating-point types (float, double)
static int TOKEN_MISC_INTEGER
          Type (for misc, other) used for regular integral types (byte/short/int/long)
static int TOKEN_MISC_INTEGER_32
          Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating 32-bit integer (int)
static int TOKEN_MISC_INTEGER_64
          Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating 64-bit integer (long)
static int TOKEN_MISC_INTEGER_BIG
          Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating BigInteger type.
static int TOKEN_MISC_LONG_TEXT_ASCII
          Type (for misc, other) used for variable length UTF-8 encoded text, when it is known to only contain ASCII chars.
static int TOKEN_MISC_LONG_TEXT_UNICODE
          Type (for misc, other) used for variable length UTF-8 encoded text, when it is NOT known to only contain ASCII chars (which means it MAY have multi-byte characters) Note: 2 LSB are reserved for future use; must be zeroes for now
static int TOKEN_MISC_SHARED_STRING_LONG
          Type (for misc, other) used for shared String values where index does not fit in "short" reference range (which is 0 - 30).
static int TOKEN_PREFIX_KEY_ASCII
           
static int TOKEN_PREFIX_KEY_SHARED_LONG
           
static int TOKEN_PREFIX_KEY_SHARED_SHORT
           
static int TOKEN_PREFIX_KEY_UNICODE
           
static int TOKEN_PREFIX_MISC_OTHER
           
static int TOKEN_PREFIX_SHARED_STRING_SHORT
           
static int TOKEN_PREFIX_SHORT_UNICODE
           
static int TOKEN_PREFIX_SMALL_ASCII
           
static int TOKEN_PREFIX_SMALL_INT
           
static int TOKEN_PREFIX_TINY_ASCII
           
static int TOKEN_PREFIX_TINY_UNICODE
           
 
Constructor Summary
SmileConstants()
           
 
Method Summary
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MAX_SHORT_VALUE_STRING_BYTES

public static final int MAX_SHORT_VALUE_STRING_BYTES
Encoding has special "short" forms for value Strings that can be represented by 64 bytes of UTF-8 or less.

See Also:
Constant Field Values

MAX_SHORT_NAME_ASCII_BYTES

public static final int MAX_SHORT_NAME_ASCII_BYTES
Encoding has special "short" forms for field names that can be represented by 64 bytes of UTF-8 or less.

See Also:
Constant Field Values

MAX_SHORT_NAME_UNICODE_BYTES

public static final int MAX_SHORT_NAME_UNICODE_BYTES
Maximum byte length for short non-ASCII names is slightly less due to having to reserve bytes 0xF8 and above (but we get one more as values 0 and 1 are not valid)

See Also:
Constant Field Values

MAX_SHARED_NAMES

public static final int MAX_SHARED_NAMES
Longest back reference we use for field names is 10 bits; no point in keeping much more around

See Also:
Constant Field Values

MAX_SHARED_STRING_VALUES

public static final int MAX_SHARED_STRING_VALUES
Longest back reference we use for short shared String values is 10 bits, so up to (1 << 10) values to keep track of.

See Also:
Constant Field Values

MAX_SHARED_STRING_LENGTH_BYTES

public static final int MAX_SHARED_STRING_LENGTH_BYTES
Also: whereas we can refer to names of any length, we will only consider text values that are considered "tiny" or "short" (ones encoded with length prefix); this value thereby has to be maximum length of Strings that can be encoded as such.

See Also:
Constant Field Values

MIN_BUFFER_FOR_POSSIBLE_SHORT_STRING

public static final int MIN_BUFFER_FOR_POSSIBLE_SHORT_STRING
And to make encoding logic tight and simple, we can always require that output buffer has this amount of space available before encoding possibly short String (3 bytes since longest UTF-8 encoded Java char is 3 bytes). Two extra bytes need to be reserved as well; first for token indicator, and second for terminating null byte (in case it's not a short String after all)

See Also:
Constant Field Values
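
The sizing rule described above can be sketched as simple arithmetic. This is an illustration only: ShortStringBufferSketch, worstCaseBytes and utf8Length are hypothetical names that are not part of the library, and the formula merely restates the description (3 bytes per Java char plus two extra bytes). The actual value of the constant is listed under Constant Field Values.

```java
import java.nio.charset.StandardCharsets;

final class ShortStringBufferSketch {
    /**
     * Worst-case output size for a String of charCount Java chars:
     * 3 UTF-8 bytes per char, plus one token indicator byte and one
     * terminating byte (assumption: restates the description only).
     */
    static int worstCaseBytes(int charCount) {
        return 3 * charCount + 2;
    }

    /** Actual UTF-8 length of a String, for comparison with the worst case. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

A surrogate pair occupies two Java chars but only four UTF-8 bytes, so the 3-bytes-per-char bound still holds for supplementary characters.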

INT_MARKER_END_OF_STRING

public static final int INT_MARKER_END_OF_STRING
We need a byte marker to denote end of variable-length Strings. Although null byte is commonly used, let's try to avoid using it since it can't be embedded in Web Sockets content (similarly, 0xFF can't). There are multiple candidates for bytes UTF-8 can not have; 0xFC is chosen to allow reasonable ordering (highest values meaning most significant framing function; 0xFF being end-of-content and so on)

See Also:
Constant Field Values
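
As a rough illustration of why a byte that UTF-8 can never produce works as a terminator, the sketch below appends 0xFC to UTF-8 text and scans for it when reading. EndOfStringMarkerSketch and its methods are hypothetical names, not library API; only the marker value 0xFC comes from the description above.

```java
import java.nio.charset.StandardCharsets;

final class EndOfStringMarkerSketch {
    // Marker value as stated in the description; the constant name is local to this sketch.
    static final int END_OF_STRING = 0xFC;

    /** Encodes text as UTF-8 followed by the end-of-string marker. */
    static byte[] writeLongText(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[utf8.length + 1];
        System.arraycopy(utf8, 0, out, 0, utf8.length);
        out[utf8.length] = (byte) END_OF_STRING;
        return out;
    }

    /** Decodes text starting at offset, up to (not including) the marker. */
    static String readLongText(byte[] data, int offset) {
        int end = offset;
        while (end < data.length && (data[end] & 0xFF) != END_OF_STRING) {
            end++;
        }
        if (end == data.length) {
            throw new IllegalArgumentException("no end-of-string marker found");
        }
        return new String(data, offset, end - offset, StandardCharsets.UTF_8);
    }
}
```

Because valid UTF-8 never contains 0xFC, the scan cannot stop early inside the text, whatever characters it holds.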

BYTE_MARKER_END_OF_STRING

public static final byte BYTE_MARKER_END_OF_STRING
See Also:
Constant Field Values

BYTE_MARKER_END_OF_CONTENT

public static final byte BYTE_MARKER_END_OF_CONTENT
In addition we can use a marker to allow simple framing; splitting of physical data (like file) into distinct logical sections like JSON documents. 0xFF makes sense here since it is also used as end marker for Web Sockets.

See Also:
Constant Field Values
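
The framing idea can be sketched as a split on the 0xFF byte. FramingSketch and splitSections are hypothetical names, not library API, and the sketch assumes the content holds no raw binary values, since those may legitimately contain 0xFF (see HEADER_BIT_HAS_RAW_BINARY).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FramingSketch {
    // Marker value as stated in the description; the constant name is local to this sketch.
    static final byte END_OF_CONTENT = (byte) 0xFF;

    /** Splits data into logical sections, dropping the 0xFF separators. */
    static List<byte[]> splitSections(byte[] data) {
        List<byte[]> sections = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == END_OF_CONTENT) {
                sections.add(Arrays.copyOfRange(data, start, i));
                start = i + 1;
            }
        }
        // Trailing section without a closing marker, if any.
        if (start < data.length) {
            sections.add(Arrays.copyOfRange(data, start, data.length));
        }
        return sections;
    }
}
```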

HEADER_BYTE_1

public static final byte HEADER_BYTE_1
First byte of data header

See Also:
Constant Field Values

HEADER_BYTE_2

public static final byte HEADER_BYTE_2
Second byte of data header

See Also:
Constant Field Values

HEADER_BYTE_3

public static final byte HEADER_BYTE_3
Third byte of data header

See Also:
Constant Field Values

HEADER_VERSION_0

public static final int HEADER_VERSION_0
Current version consists of four zero bits (nibble)

See Also:
Constant Field Values

HEADER_BYTE_4

public static final byte HEADER_BYTE_4
Fourth byte of data header; contains version nibble, may have flags

See Also:
Constant Field Values
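
A minimal sketch of how the four header bytes might be assembled follows. HeaderSketch is a hypothetical class, not library API. The first three byte values (':', ')', '\n') follow the published Smile format signature; placing the version nibble in the four most significant bits and the flags in the low bits of the fourth byte is an assumption of this sketch, so consult Constant Field Values for the authoritative values.

```java
final class HeaderSketch {
    // Signature bytes per the Smile format; names are local to this sketch.
    static final byte BYTE_1 = (byte) ':';
    static final byte BYTE_2 = (byte) ')';
    static final byte BYTE_3 = (byte) '\n';
    // Current version: four zero bits.
    static final int VERSION_0 = 0x0;

    /** Builds a 4-byte header; flags occupy the low bits of the last byte (assumed layout). */
    static byte[] header(int flags) {
        int fourth = (VERSION_0 << 4) | (flags & 0x0F);
        return new byte[] { BYTE_1, BYTE_2, BYTE_3, (byte) fourth };
    }

    /** Extracts the version nibble from the fourth header byte. */
    static int version(byte fourth) {
        return (fourth >> 4) & 0x0F;
    }

    /** Extracts the flag bits from the fourth header byte. */
    static int flags(byte fourth) {
        return fourth & 0x0F;
    }
}
```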

HEADER_BIT_HAS_SHARED_NAMES

public static final int HEADER_BIT_HAS_SHARED_NAMES
Indicator bit that indicates whether encoded content may have shared names (back references to recently encoded field names). If no header is available, content must be processed as if this bit were set. Only if a header exists and the bit value is 0 can the parser omit storing seen names, as it is then guaranteed that no back references exist.

See Also:
Constant Field Values

HEADER_BIT_HAS_SHARED_STRING_VALUES

public static final int HEADER_BIT_HAS_SHARED_STRING_VALUES
Indicator bit that indicates whether encoded content may have shared String values (back references to recently encoded 'short' String values, where short is defined as 64 bytes or less). If no header is available, the bit can be assumed to be 0 (false). If a header exists and the bit value is 1, the parser has to store up to 1024 most recently seen distinct short String values.

See Also:
Constant Field Values

HEADER_BIT_HAS_RAW_BINARY

public static final int HEADER_BIT_HAS_RAW_BINARY
Indicator bit that indicates whether encoded content may contain raw (unquoted) binary values. If no header available, can be assumed to be 0 (false). If header exists, and bit value is 1, parser can not assume that specific byte values always have default meaning (specifically, content end marker 0xFF and header signature can be contained in binary values)

Note that this bit being true does not automatically mean that such raw binary content indeed exists; just that it may exist. This is because the header is written before any binary data may be written.

See Also:
Constant Field Values
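
The defaulting rules for the three indicator bits can be sketched as follows. HeaderFlagsSketch is a hypothetical class, not library API, and the bit masks used here (0x01, 0x02, 0x04) are assumptions for illustration; the actual values are listed under Constant Field Values. The defaults for the no-header case are the ones stated in the descriptions above.

```java
final class HeaderFlagsSketch {
    // Assumed bit masks, for illustration only.
    static final int HAS_SHARED_NAMES = 0x01;
    static final int HAS_SHARED_STRING_VALUES = 0x02;
    static final int HAS_RAW_BINARY = 0x04;

    final boolean sharedNames;
    final boolean sharedStringValues;
    final boolean rawBinary;

    private HeaderFlagsSketch(boolean names, boolean values, boolean raw) {
        this.sharedNames = names;
        this.sharedStringValues = values;
        this.rawBinary = raw;
    }

    /** No header: shared names must be assumed; the other two default to false. */
    static HeaderFlagsSketch withoutHeader() {
        return new HeaderFlagsSketch(true, false, false);
    }

    /** Header present: each setting is read from its indicator bit. */
    static HeaderFlagsSketch fromHeaderByte(int fourthByte) {
        return new HeaderFlagsSketch(
                (fourthByte & HAS_SHARED_NAMES) != 0,
                (fourthByte & HAS_SHARED_STRING_VALUES) != 0,
                (fourthByte & HAS_RAW_BINARY) != 0);
    }
}
```

A parser built along these lines would skip name bookkeeping only when fromHeaderByte reports sharedNames as false, never in the no-header case.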

TOKEN_PREFIX_SHARED_STRING_SHORT

public static final int TOKEN_PREFIX_SHARED_STRING_SHORT
See Also:
Constant Field Values

TOKEN_PREFIX_TINY_ASCII

public static final int TOKEN_PREFIX_TINY_ASCII
See Also:
Constant Field Values

TOKEN_PREFIX_SMALL_ASCII

public static final int TOKEN_PREFIX_SMALL_ASCII
See Also:
Constant Field Values

TOKEN_PREFIX_TINY_UNICODE

public static final int TOKEN_PREFIX_TINY_UNICODE
See Also:
Constant Field Values

TOKEN_PREFIX_SHORT_UNICODE

public static final int TOKEN_PREFIX_SHORT_UNICODE
See Also:
Constant Field Values

TOKEN_PREFIX_SMALL_INT

public static final int TOKEN_PREFIX_SMALL_INT
See Also:
Constant Field Values

TOKEN_PREFIX_MISC_OTHER

public static final int TOKEN_PREFIX_MISC_OTHER
See Also:
Constant Field Values

TOKEN_LITERAL_EMPTY_STRING

public static final byte TOKEN_LITERAL_EMPTY_STRING
See Also:
Constant Field Values

TOKEN_LITERAL_NULL

public static final byte TOKEN_LITERAL_NULL
See Also:
Constant Field Values

TOKEN_LITERAL_FALSE

public static final byte TOKEN_LITERAL_FALSE
See Also:
Constant Field Values

TOKEN_LITERAL_TRUE

public static final byte TOKEN_LITERAL_TRUE
See Also:
Constant Field Values

TOKEN_LITERAL_START_ARRAY

public static final byte TOKEN_LITERAL_START_ARRAY
See Also:
Constant Field Values

TOKEN_LITERAL_END_ARRAY

public static final byte TOKEN_LITERAL_END_ARRAY
See Also:
Constant Field Values

TOKEN_LITERAL_START_OBJECT

public static final byte TOKEN_LITERAL_START_OBJECT
See Also:
Constant Field Values

TOKEN_LITERAL_END_OBJECT

public static final byte TOKEN_LITERAL_END_OBJECT
See Also:
Constant Field Values

TOKEN_MISC_INTEGER

public static final int TOKEN_MISC_INTEGER
Type (for misc, other) used for regular integral types (byte/short/int/long)

See Also:
Constant Field Values

TOKEN_MISC_FP

public static final int TOKEN_MISC_FP
Type (for misc, other) used for regular floating-point types (float, double)

See Also:
Constant Field Values

TOKEN_MISC_LONG_TEXT_ASCII

public static final int TOKEN_MISC_LONG_TEXT_ASCII
Type (for misc, other) used for variable length UTF-8 encoded text, when it is known to only contain ASCII chars. Note: 2 LSB are reserved for future use; must be zeroes for now

See Also:
Constant Field Values

TOKEN_MISC_LONG_TEXT_UNICODE

public static final int TOKEN_MISC_LONG_TEXT_UNICODE
Type (for misc, other) used for variable length UTF-8 encoded text, when it is NOT known to only contain ASCII chars (which means it MAY have multi-byte characters) Note: 2 LSB are reserved for future use; must be zeroes for now

See Also:
Constant Field Values

TOKEN_MISC_BINARY_7BIT

public static final int TOKEN_MISC_BINARY_7BIT
Type (for misc, other) used for "safe" binary data (encoded using only the 7 LSB of each byte, giving an 8/7 expansion ratio). This is usually done to ensure that certain bytes (like 0xFF) are never included in encoded data. Note: 2 LSB are reserved for future use; must be zeroes for now

See Also:
Constant Field Values
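
To illustrate the 8/7 expansion ratio, the sketch below repacks 8-bit bytes into groups of 7 bits so that no output byte has its high bit set (and so 0xFF can never appear). SevenBitSketch is a hypothetical class and the bit layout, including how the final partial group is padded, is an assumption of this sketch; it is not claimed to match the exact layout used by SmileGenerator.

```java
final class SevenBitSketch {
    /** Packs input bits into 7-bit groups, most significant bits first. */
    static byte[] encode(byte[] in) {
        byte[] out = new byte[(in.length * 8 + 6) / 7];
        int acc = 0;   // bit accumulator
        int bits = 0;  // number of pending bits in acc
        int o = 0;
        for (byte b : in) {
            acc = (acc << 8) | (b & 0xFF);
            bits += 8;
            while (bits >= 7) {
                bits -= 7;
                out[o++] = (byte) ((acc >> bits) & 0x7F);
            }
            acc &= (1 << bits) - 1; // keep only the pending bits
        }
        if (bits > 0) {
            // Left-align the remaining bits in a final, zero-padded group.
            out[o] = (byte) ((acc << (7 - bits)) & 0x7F);
        }
        return out;
    }

    /** Reverses encode; the original length must be known to drop the padding. */
    static byte[] decode(byte[] in, int originalLength) {
        byte[] out = new byte[originalLength];
        int acc = 0;
        int bits = 0;
        int o = 0;
        for (byte b : in) {
            acc = (acc << 7) | (b & 0x7F);
            bits += 7;
            if (bits >= 8 && o < originalLength) {
                bits -= 8;
                out[o++] = (byte) ((acc >> bits) & 0xFF);
                acc &= (1 << bits) - 1;
            }
        }
        return out;
    }
}
```

Seven input bytes (56 bits) become exactly eight output bytes, which is where the 8/7 ratio comes from.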

TOKEN_MISC_SHARED_STRING_LONG

public static final int TOKEN_MISC_SHARED_STRING_LONG
Type (for misc, other) used for shared String values where index does not fit in "short" reference range (which is 0 - 30). If so, 2 LSB from here and full following byte are used to get 10-bit index. Values

See Also:
Constant Field Values
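
The 10-bit index described above can be reconstructed roughly as follows. SharedStringIndexSketch is a hypothetical class, not library API, and the sketch assumes that the 2 LSB of the token byte form the most significant bits of the index, with the following byte supplying the low 8 bits.

```java
final class SharedStringIndexSketch {
    // Upper end of the "short" reference range (0 - 30) given in the description.
    static final int MAX_SHORT_INDEX = 30;

    /** True if the index does not fit the short range and needs the long form. */
    static boolean needsLongReference(int index) {
        return index > MAX_SHORT_INDEX;
    }

    /** Combines 2 LSB of the token byte with the full following byte (assumed bit order). */
    static int longIndex(int tokenByte, int nextByte) {
        return ((tokenByte & 0x03) << 8) | (nextByte & 0xFF);
    }
}
```

Two bits plus eight bits give ten bits, so the largest index this form can express is 1023, matching the (1 << 10) shared values noted for MAX_SHARED_STRING_VALUES.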

TOKEN_MISC_BINARY_RAW

public static final int TOKEN_MISC_BINARY_RAW
Raw binary data marker is specifically chosen as separate from other types, since it can have significant impact on framing (or rather fast scanning based on structure and framing markers).

See Also:
Constant Field Values

TOKEN_MISC_INTEGER_32

public static final int TOKEN_MISC_INTEGER_32
Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating 32-bit integer (int)

See Also:
Constant Field Values

TOKEN_MISC_INTEGER_64

public static final int TOKEN_MISC_INTEGER_64
Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating 64-bit integer (long)

See Also:
Constant Field Values

TOKEN_MISC_INTEGER_BIG

public static final int TOKEN_MISC_INTEGER_BIG
Numeric subtype (2 LSB) for TOKEN_MISC_INTEGER, indicating BigInteger type.

See Also:
Constant Field Values

TOKEN_MISC_FLOAT_32

public static final int TOKEN_MISC_FLOAT_32
Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating 32-bit IEEE single precision floating point number.

See Also:
Constant Field Values

TOKEN_MISC_FLOAT_64

public static final int TOKEN_MISC_FLOAT_64
Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating 64-bit IEEE double precision floating point number.

See Also:
Constant Field Values

TOKEN_MISC_FLOAT_BIG

public static final int TOKEN_MISC_FLOAT_BIG
Numeric subtype (2 LSB) for TOKEN_MISC_FP, indicating BigDecimal type.

See Also:
Constant Field Values
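
Since the integer and floating-point subtypes above are all carried in the 2 LSB of the token byte, extracting them is a single mask. NumericSubtypeSketch is a hypothetical class, and the subtype values 0, 1 and 2 used below are assumptions for illustration; the actual values are listed under Constant Field Values.

```java
final class NumericSubtypeSketch {
    // Assumed subtype values, for illustration only.
    static final int SUBTYPE_32 = 0;
    static final int SUBTYPE_64 = 1;
    static final int SUBTYPE_BIG = 2;

    /** Returns the 2 least significant bits of the token byte. */
    static int subtype(int tokenByte) {
        return tokenByte & 0x03;
    }

    /** Maps an integer-type token to the Java type it would carry (under the assumed values). */
    static String integerType(int tokenByte) {
        switch (subtype(tokenByte)) {
            case SUBTYPE_32:  return "int";
            case SUBTYPE_64:  return "long";
            case SUBTYPE_BIG: return "BigInteger";
            default:          return "unknown";
        }
    }
}
```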

TOKEN_KEY_EMPTY_STRING

public static final byte TOKEN_KEY_EMPTY_STRING
Let's use same code for empty key as for empty String value

See Also:
Constant Field Values

TOKEN_PREFIX_KEY_SHARED_LONG

public static final int TOKEN_PREFIX_KEY_SHARED_LONG
See Also:
Constant Field Values

TOKEN_KEY_LONG_STRING

public static final byte TOKEN_KEY_LONG_STRING
See Also:
Constant Field Values

TOKEN_PREFIX_KEY_SHARED_SHORT

public static final int TOKEN_PREFIX_KEY_SHARED_SHORT
See Also:
Constant Field Values

TOKEN_PREFIX_KEY_ASCII

public static final int TOKEN_PREFIX_KEY_ASCII
See Also:
Constant Field Values

TOKEN_PREFIX_KEY_UNICODE

public static final int TOKEN_PREFIX_KEY_UNICODE
See Also:
Constant Field Values

sUtf8UnitLengths

public static final int[] sUtf8UnitLengths
Additionally we can combine UTF-8 decoding info into similar data table. Values indicate "byte length - 1"; meaning -1 is used for invalid bytes, 0 for single-byte codes, 1 for 2-byte codes and 2 for 3-byte codes.
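
A table following the "byte length - 1" convention could be built along these lines. Utf8UnitLengthSketch is a hypothetical class and is not the library's table: it classifies bytes purely by their leading bit pattern, and because the description covers only 1-, 2- and 3-byte codes, this sketch marks every other byte (continuation bytes and anything from 0xF0 up) as -1.

```java
final class Utf8UnitLengthSketch {
    /** Builds a 256-entry table indexed by unsigned byte value. */
    static int[] buildTable() {
        int[] table = new int[256];
        for (int b = 0; b < 256; b++) {
            if (b < 0x80) {
                table[b] = 0;   // single-byte (ASCII) code
            } else if (b >= 0xC0 && b <= 0xDF) {
                table[b] = 1;   // lead byte of a 2-byte code
            } else if (b >= 0xE0 && b <= 0xEF) {
                table[b] = 2;   // lead byte of a 3-byte code
            } else {
                table[b] = -1;  // not a valid lead byte in this sketch
            }
        }
        return table;
    }
}
```

A decoder would look up table[b & 0xFF] for each lead byte and read that many additional continuation bytes, rejecting input when the entry is -1.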

Constructor Detail

SmileConstants

public SmileConstants()