Skip to content

Latest commit

 

History

History
780 lines (559 loc) · 22.5 KB

README.adoc

File metadata and controls

780 lines (559 loc) · 22.5 KB

tengi - Near-Realtime Communication Foundation

Join%20Chat icon coverage license Apache%20License%202 brightgreen

tengi is designed as a cross-language communication platform. Using several distinct transport layers, tengi always finds the most efficient way to communicate between the client and server.
Available transports range from a plain TCP transmission over RDP (a reliable UDP implementation), standard UDP or HTTP Long-Polling as fallback.

Features

Cutting-Edge Technology

tengi implements industry standards like socket layer communication using TCP and UDP but also offers support cutting-edge technologies like SCTP and HTTP2 for future perspective. By transparently offering a variety of transportation layers using a common, flexible but simple and easy-to-use API, tengi is class-leading for the cross-language communication sector.

Simplicity driven APIs

The famous quote ‘if you can’t explain it to a six year old, you don’t understand it yourself’, by Albert Einstein, is not only true to Physics but also to software APIs. Complex APIs prevent people from using your solution or have a very flat learning curve and getting started is a hard task.
Good APIs are easy readable, self explanatory and fast to get started with.

tengi, for this reason, hides complexity behind a simple but powerful API. An echo-server example is less than 20 lines of code but complex tasks can be achieved using a rich set of system-hooks for advanced users.

Data Protocol and Reliability

tengi’s default protocol is a highly bandwidth optimized serialization and transmission solution. Supporting compressed int32 and int64 data types (when 64 bit types are supported by client languages) as well as providing way to serialize objects in custom way with guarantee of minimal overhead.

Depending on the selected transport layer additional overhead is necessary (e.g. HTTP, HTTP2). Due to the fact tengi always finds the best communication path, ending up using expensive transports is very unlikely. Still tengi supports communication through corporate firewalls and unreliable networks (e.g. mobile providers).

Asynchronous and Reactivity

Reactive design principles and event driven systems are most common way to start nowadays scalable solutions. Therefore tengi offers full support of the reactive pattern. From bottom up the internal implementation is completely based on a reactive design and this is also passed to the user.

Performance and the Java Power

More and more developers start using Java, or better said the JVM (Java Virtual Machine), as the foundation for their backend technologies. Powerful JIT (Just-In-Time) compilation and optimization based on runtime behavior bundled with the automatic resource management are a perfect base for server systems. In addition tengi is optimized for high performance, low latency and minimal Garbage Collector pressure.

Using latest Java 8 features like Closures (Lambdas), interface evolution and optimized data structures provides a compact and future-oriented external API.

Cross-Language Clients

tengi is designed to provide a simple communication protocol. The protocol specification is fully documented and easy to implement on further languages.

Currently planned clients include, but are not limited to, C#, C++, HTML5 (using ASM.js) and ActionScript, next to an obvious Java client.

Getting Started

API Description

The Java API (JavaDoc) is available here.

Client Example

public void createEchoClient() {
  // Create configuration using the builder
  Configuration configuration = new ConfigurationBuilder()
    // Configure available transports
    .addTransport(ClientTransports.TCP_TRANSPORT, ClientTransports.HTTP_TRANSPORT)

    // Build final configuration
    .build();

  // Create client instance using configuration
  Client client = Client.create(configuration);

  // Connect to remote server
  client.connect("127.0.0.1", this::onConnect);
}

private void onConnect(Connection connection) {
  connection.addMessageListener(this::onMessage);

  Packet packet = new Packet("request");
  packet.setValue("name", "Steven");
  connection.writeObject(packet);
}

private void onMessage(Connection connection, Message message) {
  Packet packet = message.getBody();
  String helloWorld = packet.getValue("value");
  System.out.println(helloWorld);
}

Server Example

public void createEchoServer() {
  // Create configuration using the builder
  Configuration configuration = new ConfigurationBuilder()
    // Configure available transports
    .addTransport(ServerTransports.TCP_TRANSPORT, ServerTransports.HTTP2_TRANSPORT,
                  ServerTransports.HTTP_TRANSPORT)

    // Build final configuration
    .build();

  // Create server instance using configuration
  Server server = Server.create(configuration);

  CompletableFuture<Channel> future = server.start(this::onConnect);
}

private void onConnect(Connection connection) {
  connection.addMessageListener(this::onMessage);
}

private void onMessage(Connection connection, Message message) {
  Packet packet = message.getBody();
  String name = packet.getValue("name");
  Packet response = new Packet("response");
  response.setValue("value", "Hello World " + name);
  connection.writeObject(response);
}

Maven Coordinates

Javadoc

API Walkthrough

Configuration

Core

Transports

TCP
UDP
RDP
WebSocket
HTTP2
HTTP Long-Polling

Connection

Listener

MessageListener
ConnectionListener
ConnectionConnectedListener

Logging

Serialization

Packet
Marshallable
Marshaller and MarshallerFilter
Message
Debugging

Server

Transports

Server

Broadcaster

Client

Transports

Client

Protocol Specification

This chapter describes the tengi internal default protocol and serialization techniques.

It contains information about the available built-in data types, their sizes and value ranges. In addition it describes the the protocol and packet headers, as well as definitions how compression and serialization of certain special types is handled.

The tengi default protocol is designed to be low overhead and any kind of object is expected as non-null by default. Values that can be null must be written explicitly and adding a marker byte to the stream.

Note

Even if most computer architecture these days as based on Little Endian, the protocol is completely implemented to the rules of Big Endian. If a system based on Little Endian encoding is used, conversion between Little Endian and Big Endian is necessary before writing or after reading the byte stream.

Codec (Encoder, Decoder)

The com.noctarius.tengi.serialization.codec.Codec class consists of two sub-interfaces which should never be implemented independently but always as a complete codec.

A Codec defines a way to serialize and de-serialize values of a predefined set of special built-in types and objects of various, user defined types.

tengi, by default, offers a low-overhead and fast codec implementation which is automatically picked up and instantiated.

The Codec provides reader and writer methods for the distinct data types and two methods to read or write objects. Codec::writeObject writes non-null objects and throws an exception when null is passed to the method. Codec::writeNullableObject offers an automatic way to handle values that possibly can be null. It adds an extra Byte to the stream to mark the value to be null or not.

Data Types

Table 1. Built In DataTypes
Name Java Length Min Max Note

Byte

byte

8 Bit

-128

127

Unsigned Byte

short

8 Bit

0

255

Byte-Array

byte[]

8 Bit per index

Short

short

16 Bit

−32,768

32,767

Char

char

16 Bit

\u0000 (0)

\uffff (65,535)

Int32

int

32 Bit

-231

-231 - 1

Compressed Int32

int

8 Bit - 40 Bit

-231

-231 - 1

Int32 Compression

Int64

long

64 Bit

-263

263 - 1

Compressed Int64

long

8 Bit - 72 Bit

-263

263 - 1

Int64 Compression

Float

float

32 Bit

±1.4e-45

±3.4028235e38

Single-precision IEEE 754 floating point

Double

double

64 Bit

±4.9e-324

±1.7976931348623157e308

Double-precision IEEE 754 floating point

Boolean

boolean

1 Bit

false

true

Written as 8 Bit

BitSet

boolean[]

1 Bit

false

true

BitSet Compression explained below

String

String

32 Bit length, + content

UTF-8 encoded content

Identifier

Identifier

128 Bit

Optimized UUIDv4

Byte

Unsigned Byte

Byte-Array

Short

Char

Int32

Int64

Float

Double

Boolean

Identifier

Int32 Compression

The Int32 Compression can be used to write Int32 values that are expected to be quite small in most cases but might exceed the range of smaller data types in certain cases.

Note

Int32 Compression is not supported per Encoder::writeObject but needs to be used explicitly using Encoder::writeCompressedInt32 and read by Decoder::readCompressedInt32. Integers written with Encoder::writeObject will always be written as uncompressed Int32 values.

The actual Int32 will be compressed into one to five bytes. Due to the nature of how the compression works the biggest values need a few additional bits to store required metadata, therefore an additional byte is necessary. That said the Compressed Int32 is only recommended for generally small values.

Bits of the value are stores left to right. However the first byte can only store 5 bits and uses the most significant bit to store the original signed bit and the second most significant bit stores information if the final value needs to be inverted before being returned.

The later information is necessary to nicely compress, near zero, negative or big values. When values are meant to be stored as an inverted bit sequence is up to the encoder implementation. A recommended way is to compare leading zeros in inverted and non-inverted form to take the better compressable version.

The least signification bit of every byte stores if another byte is about to follow up.

Content bits are stored from right to left in Big Endian encoding, unneccesary content bits in the first byte must be set to 0.

0               1
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|S|I| Content |F|
+-+-+-+-+-+-+-+-+
Table 2. First Byte Bits
Bits Description Values

0

Signed Bit

Stores the signed bit of the original value

1

Inverted Bit

Stores if the final value needs to be inverted

2-6

Content Bits

Stores up to 5 bits

7

Follow (F)

0 if stream ends, 1 if chunk follows

Every following byte stores 7 additional bit of data and again a follow up bit.

0               1
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|   Content   |F|
+-+-+-+-+-+-+-+-+
Table 3. Further Byte Bits
Bits Description Values

0-6

Content Bits

Stores up to 6 bits

7

Follow (F)

0 if stream ends, 1 if chunk follows

As an example on how to apply this logic in the real world let’s have a look at the following section.

Given is a value

A:Int32 = -2147483648 (minimal Int32 value)

This transformed into the bit representation as an integer looks like:

0               1               2               3               4
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

After storing the signed bit we can remove it and count the leading zeros which results in another 32 leading zeros and 0 writeable bits. In this example no additional invertation is applied and we store the minimal Int32 in one byte as follows:

0               1
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|1 0 0 0 0 0 0 0|
+-+-+-+-+-+-+-+-+

Another example is:

B:Int32 = -10;

In this case the binary representation looks like:

0               1               2               3               4
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 1 1 1 1 1 1 1|1 1 1 1 1 1 1 1|1 1 1 1 1 1 1 1|1 1 1 1 0 1 1 0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Storing the signed bit and applying value invertion we result in:

0               1               2               3               4
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 1 0 0 1|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Obviously the value now can be stored in way less bits again. Counting leading zeros and calculating values to write we end up with leadingZeros=28 and writeableBits=4. After writing the value to the byte stream we again end up with one byte of content.

0               1
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|1 1 0 1 0 0 1 0|
+-+-+-+-+-+-+-+-+

Int64 Compression

The Int64 Compression works exactly as the Int32 compression but can store more information.It is used to write Int64 values that are expected to be quite small in most cases but might exceed the range of smaller data types in certain cases.

Note

Int64 Compression is not supported per Encoder::writeObject but needs to be used explicitly using Encoder::writeCompressedInt64 and read by Decoder::readCompressedInt64. Integers written with Encoder::writeObject will always be written as uncompressed Int64 values.

The actual Int64 will be compressed into one to five bytes. Due to the nature of how the compression works the biggest values need a few additional bits to store required metadata, therefore an additional byte is necessary. That said the Compressed Int64 is only recommended for generally small values.

Bits of the value are stores left to right. However the first byte can only store 5 bits and uses the most significant bit to store the original signed bit and the second most significant bit stores information if the final value needs to be inverted before being returned.

The later information is necessary to nicely compress, near zero, negative or big values. When values are meant to be stored as an inverted bit sequence is up to the encoder implementation. A recommended way is to compare leading zeros in inverted and non-inverted form to take the better compressable version.

The least signification bit of every byte stores if another byte is about to follow up.

Content bits are stored from right to left in Big Endian encoding, unneccesary content bits in the first byte must be set to 0.

0               1
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|S|I| Content |F|
+-+-+-+-+-+-+-+-+
Table 4. First Byte Bits
Bits Description Values

0

Signed Bit

Stores the signed bit of the original value

1

Inverted Bit

Stores if the final value needs to be inverted

2-6

Content Bits

Stores up to 5 bits

7

Follow (F)

0 if stream ends, 1 if chunk follows

Every following byte stores 7 additional bit of data and again a follow up bit.

0               1
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|   Content   |F|
+-+-+-+-+-+-+-+-+
Table 5. Further Byte Bits
Bits Description Values

0-6

Content Bits

Stores up to 6 bits

7

Follow (F)

0 if stream ends, 1 if chunk follows

As an example on how to apply this logic in the real world let’s have a look at the following section.

Given is a value

A:Int64 = -9223372036854775808 (minimal Int64 value)

This transformed into the bit representation as an integer looks like:

0               1               2               3               4
5               6               7               8               9
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

After storing the signed bit we can remove it and count the leading zeros which results in another 64 leading zeros and 0 writeable bits. In this example no additional invertation is applied and we store the minimal Int32 in one byte as follows:

0               1
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|1 0 0 0 0 0 0 0|
+-+-+-+-+-+-+-+-+

Another example is:

B:Int64 = -10;

In this case the binary representation looks like:

0               1               2               3               4
5               6               7               8               9
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 1 1 1 1 1 1 1|1 1 1 1 1 1 1 1|1 1 1 1 1 1 1 1|1 1 1 1 1 1 1 1|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 1 1 1 1 1 1 1|1 1 1 1 1 1 1 1|1 1 1 1 1 1 1 1|1 1 1 1 0 1 1 0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Storing the signed bit and applying value invertion we result in:

0               1               2               3               4
5               6               7               8               9
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0 1 0 0 1|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Obviously the value now can be stored in way less bits again. Counting leading zeros and calculating values to write we end up with leadingZeros=60 and writeableBits=4. After writing the value to the byte stream we again end up with one byte of content.

0               1
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|1 1 0 1 0 0 1 0|
+-+-+-+-+-+-+-+-+

BitSet Compression

A BitSet is compressed using a chunk based encoding. Every chunk has a minium and maximum size. Chunk types are selected based on the number of necessary content slots.

Chunks are named as Single, Double, Quad after the numbers of bytes used to represent them. A Single chunk is the only chunk that can be used to store no element which represents a NULL value for the serialized BitSet.

The actual content is always written from left to right. That said, a Single chunk is filled from bit 4 onwards with data.

Chunks are marked using the first two bits of a stream as defined in the following table:

Table 6. Chunk Type Signature
Chunk Type Octal Binary

Single

0x01

0b01

Double

0x02

0b10

Quad

0x03

0b11

The last bit of a chunk always indicates if another chunk is going to follow or not. If another chunk follows up, the first two following bits again show the chunk type and encoding starts over with the same procedure.

It is up to the compressor implementation on how to select the individual chunks but it is recommended to select the best combination of chunks. For example 11 values can be stored as a Double and a Single, no matter in which order those two chunks appear, therefore it is optimal to start using Quad with at least 14 values.

Single Chunk

A Single chunk is represented by a single byte and can hold 1 to 3 content slots and the special value NULL which is represented as a zero-length chunk.

0               1
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|CTS|SZ |C    |F|
+-+-+-+-+-+-+-+-+
Table 7. Single Chunk Bits
Bits Description Values

0-1

Chunk Type Signature (CTS)

See Chunk Type Signature

2-3

Size (SZ)

1-3, 0 indicates NULL

4-6

Content Slots

1-3 values

7

Follow (F)

0 if stream ends, 1 if chunk follows

Double Chunk

A Double chunk is represented by two byte and can hold between 4 to 10 content slots. Due to the minimum number of content slots and the given number of bits to store the element count, the read element count from the bits need to be added to the minimum.

Example:
0b001 = 1
1 + 4 = 5

0               1               2
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|CTS| SZ  |      Content      |F|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Table 8. Double Chunk Bits
Bits Description Values

0-1

Chunk Type Signature (CTS)

See Chunk Type Signature

2-4

Size (SZ)

4-10

5-14

Content Slots

4-10 values

15

Follow (F)

0 if stream ends, 1 if chunk follows

Quad Chunk

A Quad chunk is represented by four byte and can hold between 11 to 25 content slots. Due to the minimum number of content slots and the given number of bits to store the element count, the read element count from the bits need to be added to the minimum.

Example:
0b0011 = 3
3 + 11 = 14

0               1               2               3               4
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|CTS|   SZ  |                     Content                     |F|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Table 9. Quad Chunk Bits
Bits Description Values

0-1

Chunk Type Signature (CTS)

See Chunk Type Signature

2-5

Size (SZ)

11-25

6-30

Content Slots

11-25 values

31

Follow (F)

0 if stream ends, 1 if chunk follows

String Serialization

TypeId

Protocol Header

Initial Packet

Last Packet

Packet Header

Object Header

Nullable Object