Introduction
Karmem is a fast binary serialization format. Its priority is to be easy to use while being as fast as possible. It is optimized to get the maximum performance out of Golang and TinyGo and is efficient for repeatable reads, i.e. reading different content of the same type. Karmem has proven to be ten times faster than Google Flatbuffers, with the additional overhead of bounds checking included.
Motivation
Karmem was designed to tackle our most prominent issue: sharing data between wasm-hosts, wasm-guests, and native guests. Currently, we use something similar to a "command-event pattern", where multiple wasm-instances receive the same event/data and then return one command/data. However, calling exported/imported functions is expensive and almost prohibitive, so we tried as many options as possible.
First, why not choose Witx? It is a good project and aimed at WASM. However, it seems more complex to use and is not limited to serialization. Furthermore, it was not intended to be portable to non-WASM targets and doesn't support Golang, our primary language.
Why not use Flatbuffers? We tried, but it's not fast enough and causes panics due to the lack of bounds checking. Also, decoding into a struct (using the "Object API") is terrible and generates too much garbage.
Why not use Cap'n Proto? It's a good alternative, but it lacks implementations for Zig and AssemblyScript, which are a top priority. It also makes more allocations, and the generated API is more complicated to use than Karmem's.
Compare
How does Karmem work compared with other serializers?
- Read without parsing/decode: Karmem offers random access to the data. You can access any field directly without parsing/unpacking, and reading fields has almost zero performance overhead compared to reading native structs.
- Backward Compatibility: Karmem was designed to offer backward compatibility regardless of the language. That is achieved using tables (struct name table), allowing new fields to be appended at any time.
- Schema Defined: Karmem uses a custom schema language (see the sketch after this list). The schema eliminates the need for runtime parsing, increasing performance and reliability.
- Null-Safety: Null is hard to handle and may cause more issues than benefits, which is not a problem if null is not part of the schema. Karmem doesn't handle pointers, null, or optional fields. That may be a compromise, but in our tests it makes deserialization predictable, reusable, faster, and safer.
- Native Slices: Karmem uses native slices where possible (Zig and Golang currently support them), which means you don't need functions like SomeList.Get(index) to access each element.
- Fixed Offset Source: Unlike other formats, Karmem doesn't use offsets relative to the value's position. That may introduce some security issues, mitigated by using Limited Arrays, but it makes Karmem easier to implement in any language.
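To make the schema and table layout concrete, here is a minimal sketch of what a Karmem schema could look like. This is an illustrative assumption: the type and field names are made up, and details such as the header line may differ from the real schema grammar; only the struct name table form is taken from the description above.

```
// Hypothetical schema sketch; names are illustrative, not from a real project.
karmem example;

struct Vec3 table {
    X float32;
    Y float32;
    Z float32;
}

struct Monster table {
    Pos    Vec3;
    Health int32;
    Path   []Vec3; // becomes a native slice in Go and Zig
    // New fields can be appended at the end later without
    // breaking readers built against the older schema.
}
```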
Performance
Performance varies across compilers and languages. Golang is the baseline language for comparison; it was compiled with GC (the standard Go compiler) for native platforms and with TinyGo for WebAssembly/WASI.
In the following benchmarks, consider "Reading" as random-accessing some fields of the serialized data, "Decode" as decoding/unmarshalling the entire serialized data into a native struct, and "Encode" as encoding/marshalling a native struct into Karmem/Flatbuffers.
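As a rough, hypothetical sketch of how these three operations differ in Go, the snippet below encodes a native struct, random-accesses one field through a viewer, and then decodes the whole payload back into a struct. The generated names (MonsterData, NewMonsterDataViewer, the km package) and the exact runtime API of karmem.org/golang are assumptions here, not copied from real generated code.

```go
package example

import (
	karmem "karmem.org/golang" // Karmem Go runtime (import path assumed)

	km "example/km" // hypothetical package generated from the schema above
)

func roundTrip() int32 {
	// Encode: marshal a native struct into Karmem's wire format.
	writer := karmem.NewWriter(1024)
	monster := km.MonsterData{Health: 100}
	if _, err := monster.WriteAsRoot(writer); err != nil {
		panic(err)
	}
	payload := writer.Bytes()

	// Reading: random-access a single field without unpacking the rest.
	reader := karmem.NewReader(payload)
	viewer := km.NewMonsterDataViewer(reader, 0)
	health := viewer.Health()

	// Decode: unmarshal the entire payload back into a native struct.
	var decoded km.MonsterData
	decoded.ReadAsRoot(reader)

	return health + decoded.Health
}
```

Even if the real generated API differs, the distinction being benchmarked stays the same: "Reading" never materializes a Go struct, while "Decode" rebuilds one from the whole payload.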
Struct VS Karmem VS Flatbuffers @ Native
Performance comparison between Flatbuffers and Karmem using similar schemas and the same amount of data, also compared against a native struct.
Flatbuffers VS Karmem @ WebAssembly
Performance comparison between Flatbuffers and Karmem using similar schemas and the same amount of data, running on Wazero.
Karmem VS Karmem @ WebAssembly
Performance comparison between Karmem implementations. Note that the decoded schema contains strings, which penalizes languages whose native strings are not UTF-8.