Converge
Converge is the prototype exchange format for versioned node. It provides mutable nodes, a reasonable selection of value types, arbitrary formats, node migration, and deterministic encryption.
Versions
Converge versions encrypt the values of the node, and leave the fetch
capabilities.
Version Outer Structure
| Field | Type | Purpose |
|---|---|---|
| node | union[generation] binary[vk(generation)] |
generation and public key |
| version | binary[sg(generation)] |
signature |
| piece_fcs | array[pieces] union[g] binary[h(g)] |
piece fetch caps |
| part_fcs | array[parts] union[2*g+t] binary[t?vk(g):sg(g)] |
part fetch caps |
| parent_fcs | array[parents] union[g] binary[fc(g)] |
parent fetch caps |
| body_ct | binary[] |
encrypted inner structure |
The signature algorithm, thus the public key and signature type, and
encryption algorithm are determined by the generation field.
The node field holds the public key. The version field holds the
signature with a domain separator of “Converge Version Outer” over the
generation, the fetch capabilities, the parent fetch capabilities, and
the hash of the body ciphertext.
sign( "Converge Version Outer", write_cap
, hash(generation || piece_fcs || part_fcs || parent_fcs) || hash(body_ct));
The body ciphertext is encrypted with the read capability, using a hash of the generation, fetch capabilities, and parent capabilites, as well as the public key as context.
encrypt( "Converge Version Inner", read_key, body
, hash(node || piece_fcs || part_fcs || parent_fcs));
The piece_fcs and part_fcs are each sorted, such that the lowest
cryptographic generations come first, version fetch caps come before
node fetch caps, and smaller hashes/keys come before larger ones. This
should match the encoding of the fetch capabilities.
Version Inner Structure
Inside the body ciphertext is another atlv-encoded structure.
| Field | Type | Purpose |
|---|---|---|
| piece_rcs | pieces * binary[ek(g)] |
piece read caps |
| part_rcs | parts * binary[ek(g)] |
part read caps |
| link_frcs | array[2*n] fetch_read_cap |
link caps |
| data | union[] value | binary[] |
(compressed) value |
| formats | array[] union[g] binary[vk(g) ek(g)] |
format caps |
| parent_rcs | parents * inherit_read_cap |
parent read caps |
| write_caps_ct | binary[] |
write caps ciphertext |
The entries in the piece_rcs and part_rcs fields correspond to the
piece_fcs and part_fcs in the outer structure, and hold a symmetric
encryption key for the generation indicated by the fetch capability with
the same index.
The link_frcs field contains link fetch and read caps.
fetch_read_cap = union[2*g+t] binary[t?vk(g):sg(g)] binary[ek(g)]
The entries in the parent_rcs similarly correspond to the parent_fcs
field, but can hold two additional values:
inherit_read_cap = (binary[ek(g)] | array[0)
- a binary with data: an encryption key for appropriate cipher.
- an empty array: no read capability provided (parent caps only)
The write capabilities are encrypted with a key derived from the secret signing key, with the context of the hash of the generation, fetch capabilities, and outer parent capabilites, as well as hash of the data and inner read capabilities.
encrypt( "Converge Version Write Capabilities"
, keyFromKey("Converge Version Write", write_cap), write_caps
, hash(node || piece_fcs || part_fcs || parent_fcs)
|| hash(piece_rcs || part_rcs || data || formats || parent_rcs));
Version Write Capabilites
Inside the write capability ciphertext is series of atlv-encoded values.
| Field | Type | Purpose |
|---|---|---|
| wcs | (pieces' + parts' + links') * inherit_write_cap{ek(g)} |
write caps |
The entries correspond to the pieces, parts, and links in the value.
inherit_write_cap{x} = binary[x] | array[0]
- a binary with data: a signing key for appropriate signature scheme.
- an empty array: no write capability provided
Version Summaries
The signature generation for versions are arranged such that a summary of their content can be stored that replaces the ciphertext of their body with a hash.
| Field | Type | Purpose |
|---|---|---|
| node | union[generation] binary[vk(generation)] |
public key |
| version | binary[sg(generation)] |
signature |
| body_hash | binary[h(generation)] |
hash of body ciphertext |
| piece_fcs | array[pieces] union[g] binary[h(g)] |
piece fetch caps |
| part_fcs | array[parts] union[2*g+t] binary[t?vk(g):sg(g)] |
part fetch caps |
| parent_fcs | array[parents] union[g] binary[sg(g)] |
parent fetch caps |
This structure can be verified by in the same way as the version, except the body ciphertext is pre-hashed.
Pieces
When the value in a version gets too big, it is broken up into pieces. Pieces are identified by their hash, have no formats nor parents, and are convergently encrypted.
Piece Outer Structure
| Field | Type | Purpose |
|---|---|---|
| piece | union[generation] binary[h(generation)] |
hash of this piece |
| piece_fcs | array[pieces] union[g] binary[h(g)] |
piece fetch caps |
| part_fcs | array[parts] union[2*g+t] binary[t?vk(g):sg(g)] |
part fetch caps |
| body_ct | binary[] |
encrypted inner structure |
The hash algorithm, thus the piece identifier type, and encryption
algorithm are determined by the generation field.
The piece field holds the hash with a domain separator of “Converge
Piece Outer” over the generation, the fetch capabilities, and the hash
of the body ciphertext.
sign( "Converge Piece Outer", write_cap
, hash(generation || piece_fcs || part_fcs) || hash(body_ct));
The body is convergently encrypted with the read capability, bound to the generation and fetch capabilities.
read_key = keyFromPlaintext( "Converge Piece Inner", body
, hash(generation || piece_fcs || part_fcs));
encrypt( "Converge Piece Inner", read_key, body
, hash(generation || piece_fcs || part_fcs));
Like with versions, the piece_fcs and part_fcs are each sorted, such
that the lowest cryptographic generations come first, version fetch caps
come before node fetch caps, and smaller hashes/keys come before larger
ones. This should match the encoding of the fetch capabilities.
Piece Inner Structure
The body of a piece is essentially a version without parents or formats:
| Field | Type | Purpose |
|---|---|---|
| piece_rcs | pieces * binary[ek(g)] |
piece read caps |
| part_rcs | parts * binary[ek(g)] |
part read caps |
| link_frcs | array[2*n] fetch_read_cap |
link caps |
| data | union[] value | binary[] |
the, possibly compressed, value |
| write_caps_ct | binary[] |
encrypted write capabilites |
All fields except write_caps_ct are encoded exactly the same they are
in a version.
The write_caps_ct may be of zero length to indicate that the piece has
no write capability and thus holds no write capabilities for any of the
pieces, parts or links in the piece. If the piece does hold write
capabilities they are convergently encrypted, bound to the rest of the
data in the piece.
write_cap = keyFromPlaintext("Converge Piece Write Capabilities", write_caps
, hash(generation || piece_fcs || part_fcs)
|| hash(piece_rcs || part_rcs || link_frcs || data));
encrypt( "Converge Piece Write Capabilities", write_cap, write_caps
, hash(generation || piece_fcs || part_fcs)
|| hash(piece_rcs || part_rcs || link_frcs || data));
The write capabilites within the ciphertext are identical to those within a version.
Size Limits
- Uncompressed values may not larger than 4MB.
- Versions and parts may reference, at most, 256 pieces and 256 parts.
- Versions may not have more than 64 parents.
Values
Within the body of both versions and pieces are values. When values are compressed, they are stored as bytes, but the uncompressed form always begins with a union:
| Purpose | Encoding |
|---|---|
| natural number | union[0] binary[vlq8] |
| tagged union | union[1] union[] value |
| floating point number | union[2] binary[2,4,8,16] |
| moment in time | union[3] union[scale] binary[] |
| piece | union[4] binary[idx] |
| part | union[5] binary[idx] |
| link | union[6] array[0] |
| bytes | union[8] binary[] |
| abstract bytes | union[9] array[branches] union[idx] binary[len] |
| text | union[10] binary[] |
| abstract text | union[11] array[branches] union[idx] binary[len] |
| sequence | union[12] array[leaves] value |
| abstract sequence | union[13] array[branches] union[idx] binary[len] |
| map | union[14] array[leaves] (key value) |
| abstract map | union[15] array[branches] (union[idx] key binary[len]) |
Compressed values may simply apply general purpose compression to the atlv-encoded values here, or implement a structure-aware compression mechanism.
VLQ8
Many number types are encoded with a 256-value bijective variable length quantity, VLQ8. Essentially base 256, except there is no zero digit. Instead each byte represents a digit one more than the byte value. Zero is then represented with an empty binary value.
Mechanically, these are parsed by accumulating from 0, and for each byte multiplying by 256, adding the byte value, and then adding 1.
Natural Numbers
union[0] binary[]
Natural numbers are encoded with VLQ8.
Hosts may have size limits on the natural numbers they support.
Tagged Union
union[1] union[] value
Tagged unions encode n-ary sum types, the tag goes into the inner union.
Binary Floating Point Numbers
union[2] binary[]
Floating point numbers are encoded following IEEE 754. Their size is indicated by the number of bytes in the binary. Decimal floating point values are not suported.
Hosts may, of course, limit their support, so binary32 and binary64 are likely the most portable.
Moments in Time
union[3] union[scale] binary[]
| Value | Time Scale |
|---|---|
| 0 | International Atomic Time |
| 1 | Earth Civic Time |
| 2 | Barycentic Time Coordinate |
| 3 | Local Monotomic Time |
| 48-63 | Private Use |
| 3136-4095 | Private Use |
International Atomic Time
union[3] union[0] binary[8,12,16]
The first 8 bytes are a big-endian count of seconds biased such that 2⁶³ is 1977-01-01 00:00:00 TAI. Smaller counts correspond to proper time at the mean altitude of the atomic clocks participating prior to 1977. Larger counts correspond to the gravitationally corrected proper time, theoretically at mean sea level.
The remaining 4 or 8 bytes encode a count of nanoseconds or attoseconds after the second. Nanosecond and attosecond counts of one second or more are reserved for future use.
This time scale has a range of 584.9 billion years. However, times outside the range of negative 4.5 billions years to positive 8 billion years are ill-defined, as it measures proper time within Earth’s gravity well.
Earth Civic Time
union[3] union[1] binary[6]
A big-endian count of milliseconds biased such that 2000-01-01 00:00:00 UTC is 2⁴⁷. The unix epoch, 1970-01-01 00:00:00 UTC, is 139790803555328.
This time scale follows UTC or whatever is in fashion at the time, and may have overlaps and discontinuities due to leap seconds.
Barycentric Time Coordinate
union[3] union[2] binary[8,12,16]
TCB represents the proper time of a clock at rest following the center of mass of the Solar System. TCB seconds, nanoseconds, and attoseconds are represented by the same method as TAI. The TCB and TAI timescales match at 2⁶³, 1977-01-01 00:00:32.184 TCB and 1977-01-01 00:00:00:00.000 TAI. However, TCB has a non-linear relationship with TAI amounting to roughly half a second faster per year, and .
Local Monotonic Time
union[3] union[3] binary[]
Follows some monotonically increasing clock. Values are encoded in the same method as natural numbers.
Interpretation of these times are context dependent.
Piece References
union[4] binary[idx]
The binary contains a VLQ8 index of the piece in the piece_fcaps
field of the outer structure, and thus also in the piece_rcaps field
in the inner structure.
Semantically, the entire value of the referenced piece is embedded into the value tree here.
Part References
union[5] binary[idx]
The binary contains the a VLQ8 index of the part in the part_fcaps
field of the outer structure, and thus also in the part_rcaps field in
the inner structure.
Link Capabilities
union[6] array[0]
Links in the link_frcaps structure are in the order of appearance in
the value, and thus need no index.
Raw Bytes
union[8] binary[]
Contains raw bytes, either as top level of a small byte tree or, as the only value in the node, a leaf in a byte tree.
Abstract Bytes
union[9] array[branches] union[idx] binary[len]
A branch node in a byte tree.
idx is the index of a piece that contains either raw or abstract
bytes. These pieces do not have write capabilities.
len is a VLQ8-encoded field which indicates the one less than the
total number of bytes to incorporate from the piece. If the piece
contains more bytes than the length, the extra bytes are ignored. If the
piece contains fewer bytes than the length, the missing bytes are taken
as all zeros.
In order to minimize revealed information about the structure of the byte tree, all referenced pieces should have the same depth.
Raw Text
union[10] binary[]
Contains raw text, encoded with UTF-8, either as top level of a small text tree or, as the only value in the node, a leaf in a text tree.
Abstract Text
union[11] array[branches] union[idx] binary[len]
A branch node in a text tree.
idx is the index of a piece that contains either raw or abstract text.
These pieces do not have write capabilities.
len is a VLQ8-encoded field which indicates one less than the total
number of bytes of text to incorporate from the piece. If the piece
contains more bytes than the length, the extra bytes are ignored,
however the total number of bytes must fall on a UTF-8 character
boundary. If the piece contains fewer bytes than the length, the void is
filled with NULL characters.
In order to minimize revealed information about the structure of the text tree, all referenced pieces should have the same depth.
Raw Sequence
union[12] array[leaves] value
Contains raw sequence, either as top level of a small sequence or, as the only value in the node, a leaf of a sequence.
Abstract Sequence
union[13] array[branches] union[idx] binary[len]
A branch node in a sequence.
idx is the index of a piece that contains either a raw or abstract
sequence. These pieces may have write capabilities.
len is a VLQ8-encoded field which indicates one less than the total
number of values to incorporate from the piece. If the piece contains
more values than the length, the extra values are ignored. If the piece
contains fewer values than the length, the missing values are taken as
empty, and retreiving them will act like retreiving an index beyond the
total length of the sequence.
In order to minimize revealed information about the structure of the sequence, all referenced pieces should have the same depth.
Raw Map
union[14] array[2*leaves] (key value)
Contains key-value pairs, either as top level of a small map or, as the only value in the node, a leaf of a map. The keys must be in ascending order.
Abstract Map
union[15] array[2*branches] (union[idx] key[key] binary[len])
A branch node in a map.
idx is the index of a piece that contains either a raw or abstract
map. These pieces may have write capabilities.
len is a VLQ8-encoded field which indicates one less than the total
number of entries to incorporate from the piece. If the piece contains
more values than the length, the extra values are ignored. If the piece
contains fewer values than the length, the missing values are taken as
empty, and retreiving them by index will act like retreiving an index
beyond the total length of the map.
key indicates the maximum key of those entries. The entries must be
arranged with strictly acending maximal keys.
In order to minimize revealed information about the structure of the map, all referenced pieces should have the same depth. That is, this should be a B-tree.
Keys
Both raw and abstract maps are indexed by keys. Keys are either a simple key, or an array of at least two simplex keys.
| Purpose | Encoding |
|---|---|
| simplex key | union[] ... |
| complex key | array[2+] simplex_key |
Simplex keys are similar to a subset of values:
| Purpose | Encoding |
|---|---|
| natural number | union[0] binary[] |
| tagged union | union[1] union[] key |
| moment in time | union[3] binary[] |
| bytes | union[8] binary[] |
| text | union[10] binary[] |
Note that, via tagged unions, simplex keys may contain complex keys.
Compression Methods
The compression method is determined by the first byte of the compressed value.
| Indicator | Method |
|---|---|
| 0 | Zstandard |
Zstandard Compression
Zstandard is fast compression algorithm with very fast decompression that achieves high compression ratios. Converge modifies the framing, omitting the magic number and replacing the header with a three byte, big-endian content length.
The maximum uncompressed content length is 4MB, so the two high bits of
U below must be zero. Given a binary of:
0: 00 UV WX YZ ...
Pass the following frame to a typical Zstandard library:
0: 28 b5 2f fd c0 YZ WX UV 00 ...
That is, prepend the magic number, a frame header indicating to decompress in a single memory segment, with the provided content length. There’s no checksum or dictionary.
Finals
Finals enable key rotation. They allow the originator of a node to mark a particular version as the last one, and indicate a successor node to transition to. They convey the new read capability, to those with a read capability for the current node, but they do not convey the write capability.
| Field | Type | Purpose |
|---|---|---|
| version | union[g0] binary[sg(g0)] |
generation and signature of last valid version |
| summary | binary[] |
the summary of the version just before verifying |
| node | union[g1] binary[vk(g1)] |
generation and public key of new node |
| readcap_ct | binary[iv(g0) + ek(g1)] |
encrypted read capability of the new version |
| authority | union[g2] binary[vk(g2)] |
verifying key authorized to rotate this node |
| commitment | binary[] |
the proof that the authority is authorized |
| signature | binary[sg(g2)] |
signature on the rest of the final by the authority |
As hinted above, the cryptographic generations of the old node, new node, and authorized rotation key may all be different.
Finals don’t show up in versioned nodes directly, but are reflected in that a request for the most recent version of a node might return a version from a different node, and a node may suddenly become read-only.
The summary of the last version is extracted just before verifying, such that the final version can be verified as being of the old node by injecting the summary and immediately verifying it.
The public key for the node can be reconstructed from the commitment and the authority. The current way to do this is using committed keys, but future generations may specify other methods.
The read capability for the new node is encrypted with the read capability for the current braid, using the final version, new node, commitment, and authorized verifying key as context.
encrypt( "Converge Final Read", old_read_key, new_read_key
, version || node || commitment || authority );
The signature is over the concatenation of the version (including generation), summary, new node (including generation), and the read capability ciphertext.
sign( "Converge Final", authority
, version || summary || node || readcap_ct);
Cryptographic Generations
There are currently one defined cryptographic generation.
| Gen | Hash | Signature | Cipher | Final Menthod |
|---|---|---|---|---|
| 0 | blake3 | r255b3 | xchacha8-blake3-iv | Commited Keys |
Further generations will be added when post-quantum signature schemes become settled, additional types are defined, or vernode storage formats other than converge are defined. Generations for other node storage formats will begin at 64, and allocate 64 generations to each format. The 65th block of generations (4096-4159) is reserved for private and experimental use.
Cryptography
Hash Function
Hash functions must be collision resistance. Hashes with resistance to length extension attacks are preferred, but not required, as lengths are included in the data.
Hashes are used in two modes, with and without a domain separator.
hash(domain, buf)hashes the buffer in a specific domainhash(buf)hashes the buffer without a domain
blake3
Blake3 is a well-known hash function that is very fast on a variety of hardware and not vulnerable to length extension attacks.
hash(domain, buf)hashes using the key derivation mode.hash(buf)hashes with the normal mode.
Signature Scheme
Signature schemes are required to be entirely deterministic.
The following operations are required:
keyFromKey(domain, key_material)derives a secret key from some varible length key material.toPublic(secret_key)derives the public key from the secret key.sign(domain, secret_key, buf)sign a buffer with a secret key.verify(domain, public_key, signature, buf)verifies that the signature was produced with the secret key corresponding to this public key.
There are some additional requirements covered in Committed Keys.
r255b3
r255b3 is a traditional Schnorr signature scheme based on the Ristretto255 prime order group and the Blake3 hash function. It has 48 byte signatures, and 32 byte secret and public keys.
Deterministic Authenticated Cipher
Deterministic authenticated cipher is an encryption scheme with some slightly unusual properties. In particular, it is fully deterministic, the same inputs always yield the same outputs. This is required for automated synchronization in a delay-tolerant environment, but technically prevents it from being secure against an adaptive chosen plaintext attack. Otherwise, it must meet the usual AEAD requirements with the addition of key commitment.
Converge requires four functions of these ciphers:
keyFromPlaintext(domain, plaintext)derives a key unique to the domain from the plaintext.keyFromKey(domain, key_material)derives a key from a fixed domain string and some other key material.encrypt(domain, key, plaintext, context)takes such a key, the plaintext, and the associated data, and produces a ciphertext,decrypt(domain, key, ciphertext, context)takes a key, the ciphertext, and the associated data, and returns the plaintext if the checks pass, and nothing otherwise.encryptConvergent(domain, plaintext, context, convergence)derives the key usingkeyFromPlaintextand thenencrypts, returning both the created key and the ciphertext. Ciphers may specify an optimized version of this that, for example, encrypts in fewer passes than the general version.
xchacha8-blake3-iv
XChaCha8-Blake3-IV combines the extended-nonce 8-round variation of the IETF ChaCha20 with a truncated Blake3 digest as the IV to form a deterministic authenticated encryption with associated data scheme with a 24-byte tag.
keyFromPlaintext computes a blake3 hash with the domain over the
hashes of the plaintext followed by the string “Convergent Key for
XChaCha8-Blake3-IV”.
keyFromKey computes a blake3 hash with the domain over the provided
key material followed by the string “Derived Key for
XChaCha8-Blake3-IV”.
encrypt computes a blake3 hash with the domain over the hashes of the
plaintext and context followed by the string “Nonce for
XChaCha8-Blake3-IV” and the encryption key, takes the first 24 bytes,
and uses that at the initialization vector for XChaCha8 encryption.
decrypt decrypts the ciphertext using XChaCha8 with the key, then
performs the same hash on the plaintext and context to derive the nonce.
if the first 24 bytes of the hash match the initialization vector, the
plaintext is returned, otherwise an error is returned.
This allows for an optimized version of encryptConvergent that can
derive the key and nonce in a single pass over the plaintext.
Committed Keys
Converge can use specially constructed secret and public keys to commit them to a verifying key. This imposes some additional requirements on signatures. Namely, an additional type and a pair of operations that can irreversibly transform a valid secret-public key pair into another valid secret-public key pair, and a means to derive that scalar from a public key and a public key from some other signature scheme. The general scheme is committing to obsolescence.
class Committed s where
derive_scalar :: Domain -> PublicKey s -> PublicKey c -> Scalar s
scale_public :: Scalar s -> PublicKey s -> PublicKey s
scale_secret :: Scalar s -> SecretKey s -> SecretKey s
To generate a key pair, first generate a commitment key pair. Then derive a scalar from the public half and the verifying key. Then use that scalar to scale the commitment key pair into the in-use key pair.
When it is time to migrate to a new key, release the commitment public key, and sign the final with the verifying key associated with it. Anyone who receives that final can then derive the public key for themselves (and verify the signature) to establish its authenticity.
r255b3
As r255b3 uses ristretto255 scalars as secret keys and ristretto255 points as public keys, the scalar type for committed keys is the same as the secret key. That is a ristretto255 scalar.
derive_scalar uses a blake3 key derivation hash with the domain to
hash the serialization of the two public keys, and then reduces that to
a scalar.
scale_public is scalar multiplication of the public key, while
scale_secret is multiplication on ristretto255 scalars.