Principals

Recall that our canister that called itself learned its own identity, or principal, via a couple of system calls. Here we take a closer look at principals.




The following canister prints the identity of the caller and the identity of itself:

#define IMPORT(m,n) __attribute__((import_module(m))) __attribute__((import_name(n)))
#define EXPORT(n) asm(n) __attribute__((visibility("default")))
typedef unsigned u32;
void reply_append(void*, u32)     IMPORT("ic0", "msg_reply_data_append");
void reply(void)                  IMPORT("ic0", "msg_reply");
u32 caller_size(void)             IMPORT("ic0", "msg_caller_size");
void caller_copy(void*, u32, u32) IMPORT("ic0", "msg_caller_copy");
u32 self_size(void)               IMPORT("ic0", "canister_self_size");
void self_copy(void*, u32, u32)   IMPORT("ic0", "canister_self_copy");
unsigned char buf[128] = "DIDL\x01\x6d\x7b\x02\x00\x00";

void go() EXPORT("canister_query go");
void go() {
  unsigned char *p = buf + 10;
  // Append caller id to buffer.
  u32 n = caller_size();
  *p++ = n;
  caller_copy(p, 0, n);
  p += n;
  // Append self id to buffer.
  n = self_size();
  *p++ = n;
  self_copy(p, 0, n);
  p += n;
  // Reply.
  reply_append(buf, p - buf);
  reply();
}

We see:

$ dfx canister call callerself go
(
  blob "O\f2\c7\9fp\06}$\bb\baJ\16G7\e0\ed\ddb\80*\c6\03S\1f\a0\fc\85[\02",
  blob "\00\00\00\00\00\00\00\01\01\01",
)

These are both principals, shown in an awkward format mixing hex escapes and printable characters. Raw hex is clearer for our purposes:

4ff2c79f70067d24bbba4a164737e0eddd62802ac603531fa0fc855b02
00000000000000010101
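
The conversion from the blob text to raw hex can be sketched in Python. This is a minimal sketch, not a full Candid text parser: the helper name blob_to_hex is our own, and it assumes the only escapes present are two-digit hex escapes such as \9f.

```python
def blob_to_hex(text: str) -> str:
    # Assumption: every backslash starts a two-digit hex escape;
    # all other characters stand for their own ASCII byte.
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "\\":
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return out.hex()


# The two blobs printed by dfx above, as raw strings.
caller = blob_to_hex(r"O\f2\c7\9fp\06}$\bb\baJ\16G7\e0\ed\ddb\80*\c6\03S\1f\a0\fc\85[\02")
canister = blob_to_hex(r"\00\00\00\00\00\00\00\01\01\01")
print(caller)
print(canister)
```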

The second, shorter principal is the ID of the canister, which is chosen by the system. It ends with 0x01, indicating that it is an opaque ID.

The first principal is the user that made the call. We can confirm this by running:

$ dfx identity get-principal
m37qu-j2p6l-dz64a-gpusl-xoskc-zdtpy-hn3vr-iakwg-anjr7-ih4qv-nqe

…and then converting the output to uppercase, adding padding, decoding it as Base32 (the dashes are ignored), and discarding the initial 4-byte CRC-32 checksum:

$ (dfx identity get-principal;echo ===) | tr a-z A-Z | base32 -d -i | xxd -p -c 80 | tail -c +9
4ff2c79f70067d24bbba4a164737e0eddd62802ac603531fa0fc855b02
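
The same decoding can be sketched in Python, this time also checking the checksum rather than discarding it. The function name decode_principal is our own, and we assume the checksum is the standard CRC-32 computed by zlib, stored big-endian.

```python
import base64
import zlib


def decode_principal(text: str) -> bytes:
    """Decode a textual principal and verify its 4-byte CRC-32 prefix."""
    chars = text.replace("-", "").upper()
    chars += "=" * (-len(chars) % 8)  # restore the Base32 padding
    raw = base64.b32decode(chars)
    checksum, body = raw[:4], raw[4:]
    if zlib.crc32(body).to_bytes(4, "big") != checksum:
        raise ValueError("CRC-32 mismatch")
    return body


body = decode_principal("m37qu-j2p6l-dz64a-gpusl-xoskc-zdtpy-hn3vr-iakwg-anjr7-ih4qv-nqe")
print(body.hex())
```

The last byte of the result, body[-1], is the type tag discussed next.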

The principal ends in 0x02, indicating it is a self-authenticating ID: its first 28 bytes are the SHA-224 hash of the user's DER-encoded public key. We now walk through the details.

Seed

Internet Computer users typically run Keysmith to generate their keys, ideally on an air-gapped computer.

First, we generate a seed:

$ keysmith generate

This generates a random seed, appends a checksum, then writes them to seed.txt. The output format is interesting: rather than hexadecimal or similar, we see words from the BIP39 Word List, which makes it easier for a human to commit the seed to memory.

For example, if the system gets 1 every time for 128 consecutive coin flips, then the generated seed.txt contains:

zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo wrong

(See? Seeds are easy to remember!)

The word zoo is last on a list of size 2048, and thus represents 2047, which is 11 1-bits in a row. The word wrong represents 2037, which is 7 more 1-bits glued to a 4-bit checksum. (The checksum is the first nibble of the SHA-256 hash of 128 1-bits in a row, which turns out to be 0x5.)
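
We can reproduce this arithmetic with a short Python sketch. It computes word indices only. The 2048-word list itself is omitted, and the variable names are our own.

```python
import hashlib

entropy = b"\xff" * 16                   # 128 one-bits
checksum_bits = len(entropy) * 8 // 32   # BIP39: one checksum bit per 32 bits of entropy
digest = hashlib.sha256(entropy).digest()
checksum = digest[0] >> (8 - checksum_bits)  # first nibble of the hash

# Glue the checksum onto the entropy, then cut the result into 11-bit word indices.
bits = (int.from_bytes(entropy, "big") << checksum_bits) | checksum
total_bits = len(entropy) * 8 + checksum_bits
indices = [(bits >> shift) & 0x7FF for shift in range(total_bits - 11, -1, -11)]
print(hex(checksum), indices)
```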

Private Key

Private and public keys are derived from the seed. We write the private key to identity.pem:

$ keysmith private-key

We can use this private key with dfx by copying it to the appropriate subdirectory:

$ mkdir ~/.config/dfx/identity/zoo
$ cp identity.pem ~/.config/dfx/identity/zoo
$ dfx identity use zoo

The identity.pem file contains a magic value representing the secp256k1 curve (its DER-encoded object identifier), along with the private key:

-----BEGIN EC PARAMETERS-----
BgUrgQQACg==
-----END EC PARAMETERS-----
-----BEGIN EC PRIVATE KEY-----
MHQCAQEEIN1zCmH5j+Vzx2duWVdyfAHO3zxaBL66Od+SRF4L0uyHoAcGBSuBBAAK
oUQDQgAEPMhJx31erTrq8uqCHchda7EEg7vpeHXQEK2iYp5Khj6BV5Peaa5P/ORt
UsSxTtGjrkDoW1O1y2x+1t6J2AxDBQ==
-----END EC PRIVATE KEY-----

Let’s dump the Base64 decoding of the EC PRIVATE KEY block as hex:

30740201010420dd730a61f98fe573c7676e5957727c01cedf3c5a04beba
39df92445e0bd2ec87a00706052b8104000aa144034200043cc849c77d5e
ad3aeaf2ea821dc85d6bb10483bbe97875d010ada2629e4a863e815793de
69ae4ffce46d52c4b14ed1a3ae40e85b53b5cb6c7ed6de89d80c4305

This consists of boilerplate around the 32-byte private key:

dd730a61f98fe573c7676e5957727c01cedf3c5a04beba39df92445e0bd2ec87

and the 65-byte public key:

04
3cc849c77d5ead3aeaf2ea821dc85d6bb10483bbe97875d010ada2629e4a863e
815793de69ae4ffce46d52c4b14ed1a3ae40e85b53b5cb6c7ed6de89d80c4305
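
To make the boilerplate concrete, here is a Python sketch that slices both keys out of the decoded block. It is not a general DER parser: the fixed offsets are an assumption that holds only for the exact layout of the dump above.

```python
import base64

pem_body = (
    "MHQCAQEEIN1zCmH5j+Vzx2duWVdyfAHO3zxaBL66Od+SRF4L0uyHoAcGBSuBBAAK"
    "oUQDQgAEPMhJx31erTrq8uqCHchda7EEg7vpeHXQEK2iYp5Khj6BV5Peaa5P/ORt"
    "UsSxTtGjrkDoW1O1y2x+1t6J2AxDBQ=="
)
der = base64.b64decode(pem_body)

# Assumed layout of this particular dump:
# 30 74 | 02 01 01 | 04 20 <private key> | a0 07 06 05 <curve OID> | a1 44 03 42 00 <public key>
if der[5:7] != bytes.fromhex("0420") or der[48:53] != bytes.fromhex("a144034200"):
    raise ValueError("unexpected layout")

private_key = der[7:39]    # 32 bytes
public_key = der[53:118]   # 65 bytes, starting with 04
print(private_key.hex())
print(public_key.hex())
```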

Public Key

We could have computed the public key ourselves from the private key. Another way is to run keysmith, which derives it from the seed:

$ keysmith public-key
043cc849c77d5ead3aeaf2ea821dc85d6bb10483bbe97875d010ada2629e4a863e815793de69
ae4ffce46d52c4b14ed1a3ae40e85b53b5cb6c7ed6de89d80c4305
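
For the curious, computing the public key ourselves means multiplying the curve's base point G by the private key. Below is a minimal Python sketch using textbook double-and-add. The function names are our own, the constants are the published secp256k1 parameters, and the code is not constant-time, so it is for illustration only.

```python
# secp256k1 field prime and base point (published domain parameters).
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P       # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P      # chord line
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def multiply(k, point):
    """Double-and-add scalar multiplication (not constant-time)."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


private_key = 0xdd730a61f98fe573c7676e5957727c01cedf3c5a04beba39df92445e0bd2ec87
x, y = multiply(private_key, G)
public_key = "04" + format(x, "064x") + format(y, "064x")
print(public_key)
```

The printed value can be compared with the keysmith output above.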

This is an ECDSA public key on the secp256k1 curve, that is, it represents a point on the curve y^2 = x^3 + 7 modulo p where:

> p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F

The 04 indicates the point is in uncompressed form, that is, we have 32 bytes holding the x-coordinate followed by 32 bytes holding the y-coordinate:

> x = 0x3cc849c77d5ead3aeaf2ea821dc85d6bb10483bbe97875d010ada2629e4a863e
> y = 0x815793de69ae4ffce46d52c4b14ed1a3ae40e85b53b5cb6c7ed6de89d80c4305

In Haskell or a similar language, we can verify that this point lies on the curve:

> (y^2 - x^3 - 7) `mod` p
0

Principal

It takes a few steps to derive a principal from a public key. First, we prefix the public key with a certain magic string, then compute a SHA-224 hash:

$ (echo 3056301006072a8648ce3d020106052b8104000a034200; keysmith public-key) | xxd -r -p | sha224sum
4ff2c79f70067d24bbba4a164737e0eddd62802ac603531fa0fc855b  -
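
The same step as a Python sketch, with the magic string and the public key copied from above (the variable names are our own):

```python
import hashlib

der_prefix = bytes.fromhex("3056301006072a8648ce3d020106052b8104000a034200")
public_key = bytes.fromhex(
    "04"
    "3cc849c77d5ead3aeaf2ea821dc85d6bb10483bbe97875d010ada2629e4a863e"
    "815793de69ae4ffce46d52c4b14ed1a3ae40e85b53b5cb6c7ed6de89d80c4305"
)

# Hash the magic string followed by the 65-byte uncompressed public key.
digest = hashlib.sha224(der_prefix + public_key).hexdigest()
print(digest)
```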

Next, we append 02 to indicate a self-authenticating ID and compute its CRC-32 checksum. The gzip format happens to include a CRC-32 checksum 8 bytes from the end of the file, and we abuse this fact:

$ (echo 4ff2c79f70067d24bbba4a164737e0eddd62802ac603531fa0fc855b; echo 02) | xxd -r -p | gzip | tail -c 8 | head -c 4 | xxd -p | tac -rs .. ; echo

66ff0a27

Concatenating this checksum, the SHA-224 hash, and the 02 byte yields:

66ff0a274ff2c79f70067d24bbba4a164737e0eddd62802ac603531fa0fc855b02
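
Without the gzip trick, the checksum and the concatenation can be sketched in Python. We assume zlib.crc32 computes the same CRC-32 that gzip stores, written out here in big-endian order.

```python
import zlib

sha224_hash = bytes.fromhex("4ff2c79f70067d24bbba4a164737e0eddd62802ac603531fa0fc855b")
body = sha224_hash + b"\x02"                     # 02 marks a self-authenticating ID
checksum = zlib.crc32(body).to_bytes(4, "big")   # CRC-32, most significant byte first
blob = checksum + body
print(checksum.hex())
print(blob.hex())
```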

Lastly, we encode this in RFC 4648 Base32 except we:

  • Use lowercase letters instead of uppercase.

  • Insert dashes every five characters.

  • Drop the padding at the end.

$ echo 66ff0a274ff2c79f70067d24bbba4a164737e0eddd62802ac603531fa0fc855b02 | xxd -r -p | base32 | tr -d = | tr A-Z a-z | sed 's/...../&-/g'
m37qu-j2p6l-dz64a-gpusl-xoskc-zdtpy-hn3vr-iakwg-anjr7-ih4qv-nqe
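
The three tweaks to Base32 can also be sketched in Python. The function name encode_principal_text is our own. Unlike the sed command, the join below never leaves a trailing dash when the length is a multiple of five.

```python
import base64


def encode_principal_text(blob: bytes) -> str:
    """Base32-encode, drop the padding, lowercase, and dash-separate groups of five."""
    chars = base64.b32encode(blob).decode("ascii").rstrip("=").lower()
    return "-".join(chars[i:i + 5] for i in range(0, len(chars), 5))


blob = bytes.fromhex("66ff0a274ff2c79f70067d24bbba4a164737e0eddd62802ac603531fa0fc855b02")
text = encode_principal_text(blob)
print(text)
```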

Thankfully, keysmith and dfx can do all this for us:

$ keysmith principal
m37qu-j2p6l-dz64a-gpusl-xoskc-zdtpy-hn3vr-iakwg-anjr7-ih4qv-nqe
$ dfx identity get-principal
m37qu-j2p6l-dz64a-gpusl-xoskc-zdtpy-hn3vr-iakwg-anjr7-ih4qv-nqe

We still have to explain the above magic string:

3056301006072a8648ce3d020106052b8104000a034200

It is the header that results from encoding our public key in the DER format: everything that precedes the 65 key bytes. This can be seen by pasting the following into an online encoder:

{"seq": [
  {"seq": [
    {"oid": {"oid": "1.2.840.10045.2.1"}},
    {"oid": {"oid": "1.3.132.0.10"}}
    ]},
  {"bitstr": {"hex": "00043cc849c77d5ead3aeaf2ea821dc85d6bb10483bbe97875d010ada2629e4a863e815793de69ae4ffce46d52c4b14ed1a3ae40e85b53b5cb6c7ed6de89d80c4305"}}
  ]}

The Object Identifier 1.2.840.10045.2.1 means ecPublicKey and 1.3.132.0.10 means secp256k1.
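
Instead of an online encoder, a few lines of Python are enough to rebuild the structure. This is a deliberately tiny sketch, not a general DER encoder: the helpers tlv and oid are our own names, and they handle only short-form lengths (contents under 128 bytes), which suffices here.

```python
def tlv(tag: int, content: bytes) -> bytes:
    """Encode one DER element with a short-form length."""
    if len(content) >= 128:
        raise ValueError("long-form lengths are not supported in this sketch")
    return bytes([tag, len(content)]) + content


def oid(dotted: str) -> bytes:
    """Encode an object identifier: first two arcs share a byte, the rest are base-128."""
    first, second, *rest = (int(part) for part in dotted.split("."))
    out = bytearray([40 * first + second])
    for arc in rest:
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append(0x80 | (arc & 0x7F))  # continuation bit on all but the last byte
            arc >>= 7
        out.extend(reversed(chunk))
    return tlv(0x06, bytes(out))


public_key = bytes.fromhex(
    "04"
    "3cc849c77d5ead3aeaf2ea821dc85d6bb10483bbe97875d010ada2629e4a863e"
    "815793de69ae4ffce46d52c4b14ed1a3ae40e85b53b5cb6c7ed6de89d80c4305"
)
algorithm = tlv(0x30, oid("1.2.840.10045.2.1") + oid("1.3.132.0.10"))
bit_string = tlv(0x03, b"\x00" + public_key)  # leading 00: no unused bits
der = tlv(0x30, algorithm + bit_string)
print(der[:-len(public_key)].hex())
```

Everything before the key bytes is the magic string we prefixed before hashing.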

