The V1 consensus network has the potential to become the bottleneck in V1 deployments, since by definition it must serialize transactions on a channel. The "deliver" side of consensus is probably the most straightforward application in the entire V1 architecture: Binary blobs are simply pulled from ordered storage and then transmitted to multiple clients using token-based flow control.
Profiling the current Kafka orderer in deliver-only contexts shows almost nothing but serialization overhead, generically malloc()/memset()/memcpy(). Serialization and deserialization are pure overhead: they add no value to the data beyond allowing it to be transmitted on the wire. For example, the current Kafka orderer stores blocks as the serialized form of the "Block" proto. To transmit a block:
1. The block is deserialized from Kafka storage (malloc()/memset()/memcpy())
2. Trivially converted to a DeliverResponse
3. Serialized again during gRPC transmission (malloc()/memset()/memcpy())
4. Copied yet again by the gRPC implementation in order to add a 5-byte header to the serialized proto (malloc()/memset()/memcpy())
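To make the cost of step 4 concrete, here is a minimal Go sketch of length-prefixed framing. gRPC prefixes each message with a 5-byte header (a 1-byte compression flag followed by a 4-byte big-endian length), so a straightforward implementation allocates a fresh buffer and copies the whole serialized proto into it. The helper name frameMessage is hypothetical; this illustrates the pattern and is not the grpc-go code itself.

```go
package main

import "encoding/binary"

// frameMessage is a hypothetical helper (not taken from grpc-go) that shows
// why adding the 5-byte gRPC message header costs one more allocation and
// one more full copy of the payload.
func frameMessage(payload []byte, compressed bool) []byte {
	const headerLen = 5

	// make() allocates and zeroes a new buffer: the malloc()/memset() part.
	buf := make([]byte, headerLen+len(payload))

	// Byte 0 is the compression flag: 0 = uncompressed, 1 = compressed.
	if compressed {
		buf[0] = 1
	}

	// Bytes 1..4 hold the payload length as a big-endian uint32.
	binary.BigEndian.PutUint32(buf[1:headerLen], uint32(len(payload)))

	// The payload is copied behind the header: the memcpy() part.
	copy(buf[headerLen:], payload)
	return buf
}
```

For a 3-byte payload the result is 8 bytes long: the header 00 00 00 00 03 followed by the payload bytes.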
The overhead of steps 1 and 3 can be eliminated by storing and transmitting pre-serialized DeliverResponse objects. This is possible by adding a third, recursive type to the DeliverResponse proto. This recursive type stores a pre-serialized DeliverResponse as a byte vector.
Servers can take advantage of this type by using a custom grpc.Codec for the DeliverResponse server. This Codec recognizes the recursive type and simply returns the pre-serialized response:
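A minimal sketch of such a codec follows, under several assumptions. The Codec interface is redeclared locally with the same method set as grpc.Codec so that the example compiles without external packages. DeliverResponse is a simplified stand-in for the generated proto type, and the field name Preserialized, the type passThroughCodec, and the stub protoCodecStub are all hypothetical names chosen for illustration.

```go
package main

import "errors"

// Codec has the same method set as grpc.Codec in grpc-go; it is redeclared
// here only to keep the sketch self-contained.
type Codec interface {
	Marshal(v interface{}) ([]byte, error)
	Unmarshal(data []byte, v interface{}) error
	String() string
}

// DeliverResponse is a simplified, hypothetical stand-in for the generated
// proto type. Preserialized models the proposed recursive variant: when it
// is non-nil it holds the wire bytes of a complete DeliverResponse.
type DeliverResponse struct {
	Preserialized []byte
	// The existing variants are omitted from this sketch.
}

// protoCodecStub stands in for the regular protobuf codec. A real server
// would delegate to actual proto marshaling here.
type protoCodecStub struct{}

func (protoCodecStub) Marshal(v interface{}) ([]byte, error) {
	return nil, errors.New("proto marshaling is not wired into this sketch")
}

func (protoCodecStub) Unmarshal(data []byte, v interface{}) error {
	return errors.New("proto unmarshaling is not wired into this sketch")
}

func (protoCodecStub) String() string { return "proto" }

// passThroughCodec embeds an ordinary codec and overrides only Marshal;
// Unmarshal and String are promoted from the embedded codec unchanged.
type passThroughCodec struct {
	Codec
}

// Marshal returns the stored bytes as-is for a pre-serialized response,
// with no allocation and no copy. Every other message takes the normal path.
func (c passThroughCodec) Marshal(v interface{}) ([]byte, error) {
	if r, ok := v.(*DeliverResponse); ok && r.Preserialized != nil {
		return r.Preserialized, nil
	}
	return c.Codec.Marshal(v)
}
```

Because the codec reports the same name as the embedded one and emits ordinary DeliverResponse wire bytes, the client cannot tell the difference. In grpc-go versions that expose grpc.Codec, a codec of this shape would be installed on the server with the grpc.CustomCodec server option.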
Now, deliver servers can transmit pre-serialized responses without the overhead of steps 1 and 3. Note that absolutely no change to the client is required, and this behavior is completely optional on the server side as well.
I have modified the Kafka orderer to store serialized DeliverResponse objects (instead of Block objects), and to then transmit them directly with the custom Codec. The results of these experiments appear in the attached slide.
My argument is that we should go ahead and modify the DeliverResponse proto and add the custom codec in order to support the most efficient delivery services possible.