Stream: wasmtime on k8s

Topic: Is nano process suitable for long-running pod?


view this post on Zulip Notification Bot (Mar 19 2020 at 00:12):

Stream created by Owen Ou.

view this post on Zulip Luke Wagner (Mar 19 2020 at 00:13):

hi!

view this post on Zulip Owen Ou (Mar 19 2020 at 00:14):

Hello :wave:

view this post on Zulip Owen Ou (Mar 19 2020 at 00:15):

Created this stream to communicate. I'm unsure if I should use the wasmtime stream or a new one (this). Let's try and see how it goes :)

view this post on Zulip Owen Ou (Mar 19 2020 at 00:20):

Posting the Twitter discussion here for visibility: Krustlet spawns a thread for a long-running pod (https://github.com/deislabs/krustlet/blob/2eb577b88041349c463cc14b80cb83f63cc27e3a/crates/wasi-provider/src/wasi_runtime.rs#L139), which isn't recommended for a multi-tenant k8s cluster due to possible Spectre attacks, IIUC


view this post on Zulip Luke Wagner (Mar 19 2020 at 00:48):

(I'm not a Zulip expert, but it seems reasonable to have a separate stream just for these topics.)

view this post on Zulip Luke Wagner (Mar 19 2020 at 00:50):

Yeah, so to restate what I said in the DM, it seems like it all depends on the context the service is being run in. In full generality (long-running tasks, either with access to time or attacker is able to externally time), I believe you do have to put separate tenants/domains in separate OS processes. It's a good question of what's the best way to achieve that with either a krustlet or containerd-shim approach.

view this post on Zulip Andrew Brown (Mar 19 2020 at 15:35):

@Owen Ou, I'm interested in this as well; I tried to get krustlet running a while back but ran into issues so I put that on hold. Have you had better luck with that or WaPC? Or are you thinking about creating something completely different?

view this post on Zulip Owen Ou (Mar 24 2020 at 16:03):

@Andrew Brown I'm trying to get more information from the krustlet folks this week. I'm thinking of building a containerd-shim that drives wasmtime.

view this post on Zulip Owen Ou (Mar 24 2020 at 16:08):

Regarding the wasmtime part, I would need wasmtime experts to chime in. It sounds like the most secure way is for each wasmtime instance to run in its own OS process. But that means we don't get nano processes :(. Is there a way to avoid Spectre attacks for nano processes?

view this post on Zulip Andrew Brown (Mar 24 2020 at 16:19):

I'm no expert but there have been discussions about this; cc: @Mingqiu Sun, @Pat Hickey

view this post on Zulip Owen Ou (Mar 24 2020 at 17:15):

@Luke Wagner @Peter Huene Circling back on the "wasmtime daemon" idea. There is a wasmtimed process that schedules wasm modules on a free wasmtime process in the wasmtime process pool. One nano process is scheduled on a wasmtime process at a time; wasmtimed won't schedule a module on a wasmtime process that's already occupied by a nano process. wasmtimed-1.png
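
To make that scheduling rule concrete, here's a minimal Rust sketch of the idea (purely illustrative, not wasmtimed code: the pool size, the bookkeeping, and driving wasmtime via its CLI through std::process are all assumptions):

use std::collections::HashMap;
use std::io::{Error, ErrorKind};
use std::process::{Child, Command};

/// One OS process per nano process: each pool slot is either free or
/// occupied by exactly one running wasmtime instance.
struct WasmtimePool {
    max_slots: usize,
    running: HashMap<String, Child>, // module name -> wasmtime child process
}

impl WasmtimePool {
    fn new(max_slots: usize) -> Self {
        WasmtimePool { max_slots, running: HashMap::new() }
    }

    /// Schedule a module on a free wasmtime process, refusing to co-locate
    /// two nano processes in the same OS process.
    fn schedule(&mut self, name: &str, wasm_path: &str) -> std::io::Result<()> {
        // Reap slots whose wasmtime process has already exited.
        self.running.retain(|_, child| matches!(child.try_wait(), Ok(None)));
        if self.running.len() >= self.max_slots {
            return Err(Error::new(ErrorKind::WouldBlock, "no free wasmtime process"));
        }
        let child = Command::new("wasmtime").arg("run").arg(wasm_path).spawn()?;
        self.running.insert(name.to_string(), child);
        Ok(())
    }
}

fn main() -> std::io::Result<()> {
    let mut pool = WasmtimePool::new(4);
    pool.schedule("hello-world", "hello-world.wasm")
}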

view this post on Zulip Peter Huene (Mar 24 2020 at 17:50):

I think that was the gist of what Luke was describing, although I wonder if he meant it at the containerd-shim level or at the krustlet-implementation level. I also think there are timing-attack mitigations we can do in WASI as well (a la limiting access to high-resolution timers, like browsers do) that may help reduce the attack surface for multi-tenant nanoprocesses in the same OS process.
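
For instance, the host's clock implementation could round timestamps before guests ever see them; a toy Rust sketch of that idea (illustrative only, not wasmtime's or WASI's actual clock code):

/// Round a nanosecond timestamp down to a coarser granularity before
/// returning it to guest code, in the spirit of browsers'
/// reduced-resolution timers.
fn coarsen_ns(ns: u64, granularity_ns: u64) -> u64 {
    ns - (ns % granularity_ns)
}

fn main() {
    // With 1ms granularity, two reads 700µs apart look identical,
    // starving in-guest timing measurements of resolution.
    let g = 1_000_000; // 1ms in ns
    assert_eq!(coarsen_ns(1_000_200_000, g), coarsen_ns(1_000_900_000, g));
}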

view this post on Zulip Pat Hickey (Mar 24 2020 at 17:53):

sorry, I don't have very much insight on this question, I'm not sure how the trust boundaries are being drawn in this problem space

view this post on Zulip Mingqiu Sun (Mar 24 2020 at 19:01):

You may consider using Intel Memory Protection Keys (MPK) for Spectre protection at the thread level. But currently there is a 16-domain limitation.
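
For reference, the setup side of that looks roughly like this Linux-only Rust sketch using the libc crate (an assumed illustration of what a protection-key domain is, not a hardened design; the per-thread access toggling happens in userspace via the PKRU register):

use libc::{mmap, pkey_alloc, pkey_mprotect, MAP_ANONYMOUS, MAP_FAILED,
           MAP_PRIVATE, PROT_READ, PROT_WRITE};

fn main() {
    unsafe {
        // Allocate a protection key; the hardware supports at most 16 keys,
        // hence the 16-domain limitation mentioned above.
        let key = pkey_alloc(0, 0);
        assert!(key >= 0, "pkey_alloc failed (needs an MPK-capable CPU/kernel)");

        // Map a page and tag it with the key. A thread can then flip its own
        // access rights to every page tagged with this key by writing the
        // PKRU register, without any syscall.
        let page = mmap(std::ptr::null_mut(), 4096, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        assert!(page != MAP_FAILED, "mmap failed");
        assert_eq!(pkey_mprotect(page, 4096, PROT_READ | PROT_WRITE, key), 0);
    }
}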

view this post on Zulip Luke Wagner (Mar 25 2020 at 00:35):

@Owen Ou Yes, that is what I was imagining. That design wasn't specific to either the krustlet or containerd-shim approach. One high-level takeaway I had from the earlier chat was that the containerd-shim approach may force process creation in a way that might be incompatible with this approach -- it'd be good to verify that claim, though.

view this post on Zulip Luke Wagner (Mar 25 2020 at 00:36):

@Peter Huene Even if we take away time via WASI impl, if the wasm can be long-running and the attacker can externally time how long it takes to run (b/c, e.g., the wasm is running as part of a request/response loop), then a timing attack is still possible (b/c the attacker wasm can vary how long it takes based on the speculatively-stolen secret value).
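
A toy Rust illustration of that channel (hypothetical attacker code, with the speculative read itself elided): the module makes its own running time depend on a secret bit, so an external observer timing the request/response loop recovers the bit even though the guest never reads a clock.

use std::hint::black_box;

/// Pretend `secret_bit` was obtained speculatively. The attacker encodes it
/// in wall-clock duration rather than reading any timer from inside the wasm.
fn encode_in_duration(secret_bit: bool) {
    if secret_bit {
        // Burn externally observable time when the bit is 1.
        for _ in 0..10_000_000u64 {
            black_box(0u64);
        }
    }
}

fn main() {
    // Whoever times this request from the outside learns the bit.
    encode_in_duration(true);
}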

view this post on Zulip Taylor Thomas (Mar 31 2020 at 17:04):

Hey all! Finally joined in here so you can just ask Krustlet questions directly :D

view this post on Zulip Matt Fisher (Mar 31 2020 at 17:20):

RE: the threading vs. separate OS process approach: we are still working through Krustlet's design. We are still weighing both approaches, and both are possible in the current architecture - just write a new wasmtime Provider that spawns wasmtime instances in new processes and let the Provider manage them (a rough sketch of that shape follows at the end of this message).

You can take a look at wasi_runtime.rs to see how this is accomplished: https://github.com/deislabs/krustlet/blob/af061f0487fdacdb407bb798501b92b95f78f978/crates/wasi-provider/src/wasi_runtime.rs#L132

I'm not familiar with the speculative timing attacks mentioned here WRT spawning untrusted wasmtime instances in separate threads. Is there some ticket or design doc that describes this attack in more detail?

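A stripped-down Rust sketch of what that process-per-instance shape might look like (hypothetical; not Krustlet's actual Provider trait or wasi_runtime.rs, and driving the wasmtime CLI is an assumption):

use std::process::{Child, Command, ExitStatus};

/// The provider owns the child-process handle so it can surface pod status.
struct ManagedInstance {
    child: Child,
}

impl ManagedInstance {
    /// Spawn the module in its own OS process instead of a thread.
    fn start(wasm_path: &str) -> std::io::Result<Self> {
        let child = Command::new("wasmtime").arg("run").arg(wasm_path).spawn()?;
        Ok(ManagedInstance { child })
    }

    /// Non-blocking status check the provider can poll when updating the pod.
    fn status(&mut self) -> std::io::Result<Option<ExitStatus>> {
        self.child.try_wait()
    }
}

fn main() -> std::io::Result<()> {
    let mut instance = ManagedInstance::start("hello-world.wasm")?;
    println!("exited yet? {:?}", instance.status()?);
    Ok(())
}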

view this post on Zulip Owen Ou (Apr 01 2020 at 16:29):

@Matt Fisher: I think @Luke Wagner and @Peter Huene have more info on ^

view this post on Zulip Luke Wagner (Apr 01 2020 at 17:34):

@Matt Fisher The issue isn't really specific to wasmtime; it's more of a general Spectre consequence that is our new reality. Basically, the only general way to prevent Spectre attacks is to use an OS process boundary (which are occasionally breached, but at least CPU/OS vendors work in concert to fix these by adding mitigations to context/ring switches). Acknowledging this fact is why browsers are all doing process-per-origin/site (https://chromium.googlesource.com/chromium/src/+/master/docs/security/side-channel-threat-model.md). Wasmtime really doesn't have a say in the matter in the absence of any sub-process "Time Protection" (https://ts.data61.csiro.au/publications/csiro_full_text//Ge_YCH_19.pdf) primitives. Of course, in constrained execution scenarios (where you can limit what the attacker can do or observe), one can avoid OS processes, as edge compute vendors have done, but making that argument takes a lot more work and context.

view this post on Zulip Owen Ou (Apr 02 2020 at 05:35):

@Taylor Thomas @Matt Fisher I'm trying to get krustlet to run on EKS. I was able to get the krustlet node to register but kubectl logs didn't work:

k get nodes -o wide
NAME                                           STATUS   ROLES    AGE   VERSION              INTERNAL-IP      EXTERNAL-IP    OS-IMAGE         KERNEL-VERSION                  CONTAINER-RUNTIME
ip-192-168-23-105.us-west-2.compute.internal   Ready    <none>   55m   v1.14.9-eks-1f0ca9   192.168.23.105   34.221.97.70   Amazon Linux 2   4.14.171-136.231.amzn2.x86_64   docker://18.9.9
ip-192-168-55-201.us-west-2.compute.internal   Ready    agent    8s    v1.17.0              192.168.55.201   <none>         <unknown>        <unknown>                       mvp
k get po
NAME                    READY   STATUS       RESTARTS   AGE
hello-world-wasi-rust   0/1     ExitCode:0   0          2s
k logs hello-world-wasi-rust
Error from server: Get https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust: x509: cannot validate certificate for 192.168.55.201 because it doesn't contain any IP SANs
KUBECONFIG=./kubeconfig-sa PFX_PATH=./krustlet.pfx PFX_PASSWORD=password ./krustlet-wasi
[2020-04-02T05:26:34Z ERROR kubelet::server] error handling connection: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate:ssl/record/rec_layer_s3.c:1544:SSL alert number 42

There seems to be some cert issue. Do you happen to know what went wrong?

On the node, though, I could curl the log by ignoring the cert:

curl https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust -v -k
*   Trying 192.168.55.201...
* TCP_NODELAY set
* Connected to 192.168.55.201 (192.168.55.201) port 3000 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: C=US; ST=.; L=.; O=.; OU=.; CN=krustlet
*  start date: Apr  2 04:44:00 2020 GMT
*  expire date: Apr  2 04:44:00 2021 GMT
*  issuer: CN=kubernetes
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
> GET /containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust HTTP/1.1
> Host: 192.168.55.201:3000
> User-Agent: curl/7.61.1
> Accept: */*
>
< HTTP/1.1 200 OK
< content-length: 116
< date: Thu, 02 Apr 2020 05:41:59 GMT
<
hello from stdout!
hello from stderr!
FOO=bar
CONFIG_MAP_VAL=cool stuff
POD_NAME=hello-world-wasi-rust
Args are: []
* Connection #0 to host 192.168.55.201 left intact

Another question: I was trying to run kubectl logs from my local machine against EKS, and 192.168.55.201 is an internal IP. Would I be able to access the log via this internal IP? Should there be an external IP field for Krustlet in kubectl get node? This is what a normal kubelet registers on the same node, with both an internal IP and an external IP:

k get nodes -o wide
NAME                                           STATUS   ROLES    AGE   VERSION              INTERNAL-IP      EXTERNAL-IP      OS-IMAGE         KERNEL-VERSION                  CONTAINER-RUNTIME
ip-192-168-23-105.us-west-2.compute.internal   Ready    <none>   50m   v1.14.9-eks-1f0ca9   192.168.23.105   34.221.97.70     Amazon Linux 2   4.14.171-136.231.amzn2.x86_64   docker://18.9.9
ip-192-168-55-201.us-west-2.compute.internal   Ready    <none>   50m   v1.14.9-eks-1f0ca9   192.168.55.201   54.187.160.172   Amazon Linux 2   4.14.171-136.231.amzn2.x86_64   docker://19.3.6

view this post on Zulip Matt Fisher (Apr 02 2020 at 15:13):

Go's TLS crypto library sure is funny... Do you have steps available for how you generated the certificate for the Krustlet node?

@Taylor Thomas have you seen this error before? Perhaps it has to do with the common-name parameter when generating the certificate...

view this post on Zulip Taylor Thomas (Apr 02 2020 at 15:43):

@Owen Ou It looks like you gave the cert the common name of "krustlet", but it has a host name of ip-192-168-55-201.us-west-2.compute.internal. That could possibly be the issue. As for the external IP thing, we should probably open an issue to have it register that as well, although for this initial case we have been targeting things that don't necessarily have a publicly accessible IP address.

view this post on Zulip Owen Ou (Apr 02 2020 at 17:54):

@Taylor Thomas @Matt Fisher:

I followed the steps in https://github.com/deislabs/krustlet/blob/master/docs/howto/krustlet-on-aks.md#step-2-create-certificate to generate the cert.

Same issue after changing the CN to ip-192-168-55-201.us-west-2.compute.internal:

k logs hello-world-wasi-rust
Error from server: Get https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust: x509: cannot validate certificate for 192.168.55.201 because it doesn't contain any IP SANs
curl https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust -v -k
*   Trying 192.168.55.201...
* TCP_NODELAY set
* Connected to 192.168.55.201 (192.168.55.201) port 3000 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: C=US; ST=.; L=.; O=.; OU=.; CN=ip-192-168-55-201.us-west-2.compute.internal
*  start date: Apr  2 17:44:00 2020 GMT
*  expire date: Apr  2 17:44:00 2021 GMT
*  issuer: CN=kubernetes
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
> GET /containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust HTTP/1.1
> Host: 192.168.55.201:3000
> User-Agent: curl/7.61.1
> Accept: */*
>
< HTTP/1.1 200 OK
< content-length: 116
< date: Thu, 02 Apr 2020 17:52:13 GMT
<
hello from stdout!
hello from stderr!
FOO=bar
CONFIG_MAP_VAL=cool stuff
POD_NAME=hello-world-wasi-rust
Args are: []
* Connection #0 to host 192.168.55.201 left intact

view this post on Zulip Taylor Thomas (Apr 02 2020 at 17:55):

"cannot validate certificate for 192.168.55.201 because it doesn't contain any IP SANs" looks like we need to set its IP as well?

view this post on Zulip Taylor Thomas (Apr 02 2020 at 17:55):

I didn't need to on the AKS example, but not sure what the differences are

view this post on Zulip Owen Ou (Apr 02 2020 at 18:17):

@Taylor Thomas Setting the IP as the CN? I did that as well and still got the same error.

view this post on Zulip Taylor Thomas (Apr 02 2020 at 18:43):

I honestly am not good with cert stuff but I think you need to set the SAN with the IP as well (not as the CN):

[ req ]
default_bits       = 2048
distinguished_name = req_distinguished_name
req_extensions     = req_ext
[ req_distinguished_name ]
countryName                 = Country Name (2 letter code)
stateOrProvinceName         = State or Province Name (full name)
localityName               = Locality Name (eg, city)
organizationName           = Organization Name (eg, company)
commonName                 = Common Name (e.g. server FQDN or YOUR name)
[ req_ext ]
subjectAltName = @alt_names
[alt_names]
IP.1 = <ip_addr>

view this post on Zulip Taylor Thomas (Apr 02 2020 at 18:43):

Let me put together a full file

view this post on Zulip Taylor Thomas (Apr 02 2020 at 18:51):

Try this:

[ req ]
distinguished_name = req_distinguished_name
x509_extensions = v3_req
prompt = no
[ req_distinguished_name ]
C = US
ST = .
L = .
OU = .
CN = krustlet
[ v3_req ]
subjectAltName = @alt_names
[alt_names]
IP.1 = 192.168.55.201

With this command:
openssl req -new -sha256 -newkey rsa:2048 -keyout krustlet.key -out krustlet.csr -nodes -config test_csr.cnf

Save that file as `test_csr.cnf` first.

view this post on Zulip Taylor Thomas (Apr 02 2020 at 18:51):

That should generate your CSR

view this post on Zulip Taylor Thomas (Apr 02 2020 at 18:52):

And maybe with that IP address it will allow it

view this post on Zulip Taylor Thomas (Apr 02 2020 at 18:53):

curl will never be successful without -k unless you have the CA cert from k8s available

view this post on Zulip Owen Ou (Apr 02 2020 at 21:49):

I tried your cnf file and regenerated the certs. Still the same "x509: cannot validate certificate for 192.168.55.201 because it doesn't contain any IP SANs" error

view this post on Zulip Owen Ou (Apr 02 2020 at 21:56):

I tried to curl with the generated crt on the host and got an error:

curl https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust -v --cacert ./krustlet.crt
*   Trying 192.168.55.201...
* TCP_NODELAY set
* Connected to 192.168.55.201 (192.168.55.201) port 3000 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: ./krustlet.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS alert, unknown CA (560):
* SSL certificate problem: unable to get local issuer certificate
* Closing connection 0
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

view this post on Zulip Taylor Thomas (Apr 02 2020 at 22:09):

@Owen Ou we might need to do a little debug session tomorrow. I'm kind of confused about how I got the certs to work so easily with AKS while it's being difficult with EKS, even though they are technically the same process. If you are available tomorrow, I can make some time.

view this post on Zulip Owen Ou (Apr 02 2020 at 22:10):

@Taylor Thomas That would be perfect. Thank you! I'm available anytime after 2pm PST tomorrow. My email o@heroku.com

view this post on Zulip Owen Ou (Apr 02 2020 at 23:06):

Hmm...I think it's an EKS issue: https://github.com/awslabs/amazon-eks-ami/issues/341. The cert signed by the kube API doesn't have the IP SAN:

$ openssl x509 -noout -text -in krustlet.crt
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            36:65:33:da:2b:34:a5:e5:e7:ee:4a:36:4b:5c:c3:c0:45:5c:de:a3
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=kubernetes
        Validity
            Not Before: Apr  2 05:42:00 2020 GMT
            Not After : Apr  2 05:42:00 2021 GMT
        Subject: C=US, ST=., L=., O=., OU=., CN=krustlet
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:ee:4a:1c:d5:24:6c:98:6c:87:0a:2b:09:74:16:
                    9e:b9:01:15:92:dd:1e:0a:8f:58:19:68:a8:5c:3a:
                    73:c1:d1:d7:ca:f0:30:c1:f3:09:40:71:f8:e1:3d:
                    c4:bd:6e:c5:08:ba:36:27:7c:ab:85:4d:01:97:81:
                    b7:e2:d0:39:8f:a4:09:e1:d0:77:3e:7c:80:60:6b:
                    e3:c0:5a:16:e3:ed:ec:06:64:40:b0:15:2a:c1:fe:
                    2a:fb:ed:ad:b6:11:d3:93:f7:88:2b:4a:0c:be:d9:
                    3f:c9:1e:0a:95:b6:50:63:5e:d4:04:95:6a:23:11:
                    b7:23:a6:8e:c0:0d:51:1b:9d:c9:f7:23:9b:ea:c5:
                    85:0a:bb:12:55:15:4c:99:61:97:5d:29:2c:6f:03:
                    02:11:44:18:fa:88:b8:9f:04:46:b4:df:e4:27:81:
                    91:ba:5b:51:b9:ea:f9:df:ff:00:99:e3:69:f7:4e:
                    af:ce:f0:9c:cb:23:0f:51:68:53:ab:0d:33:0d:27:
                    92:d5:02:41:12:d8:5e:3c:bd:00:03:bc:98:21:f4:
                    99:09:d1:19:21:02:1c:8d:5a:99:0e:f9:44:c5:6d:
                    04:82:6e:06:e7:eb:e9:d2:91:18:be:96:3a:7b:81:
                    89:e7:d6:ce:ca:26:8b:76:cc:05:25:fd:83:ee:d4:
                    64:f5
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier:
                B4:02:BB:10:95:05:31:73:2D:1E:44:E3:81:67:51:4B:7D:F2:25:E9
    Signature Algorithm: sha256WithRSAEncryption
         46:f2:e8:44:f4:f3:35:0d:56:32:df:5d:63:1d:0d:72:f4:98:
         2e:3d:c3:05:dc:86:09:90:da:83:9e:28:74:a0:39:0b:43:4e:
         90:a8:9f:a9:61:7f:2c:44:74:a0:b0:21:b6:b7:46:a5:d8:cd:
         bf:68:30:32:19:4e:84:73:57:77:26:c0:78:d5:0e:21:d5:4d:
         d4:4a:c9:8f:08:41:7f:d1:62:9b:b8:d1:4b:1f:4d:98:9a:15:
         21:d2:26:bc:b3:6f:10:80:d3:53:43:71:29:39:39:6d:8e:0c:
         67:a9:02:50:a9:37:b2:c4:3e:f0:30:eb:1a:1a:95:93:04:c4:
         04:38:e3:89:55:e4:84:a4:fa:df:24:fa:44:88:20:46:c0:7d:
         b9:c1:71:8a:63:a3:db:ee:ad:05:57:46:1b:b4:e4:1c:ff:75:
         85:85:42:7a:40:87:10:34:af:53:8d:0c:f8:0e:10:96:53:37:
         a4:97:5f:25:d2:23:9e:d4:6a:05:be:f9:a2:bd:47:ad:09:65:
         90:4b:0f:c1:63:eb:b8:62:60:ee:2e:e1:92:cd:ae:e3:04:54:
         b8:8c:b3:8e:36:22:4b:bd:97:ae:5a:51:c5:16:b2:13:cc:cc:
         17:74:92:ee:60:28:22:02:a2:e0:29:0e:f8:cf:92:cf:a8:85:
         2e:3a:b6:eb

But I can see the IP in the alt names of the CSR that I uploaded:

$ k describe CertificateSigningRequest
Name:         krustlet
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration

CreationTimestamp:  Thu, 02 Apr 2020 15:47:50 -0700
Requesting User:    kubernetes-admin
Status:             Approved,Issued
Subject:
  Common Name:    192.168.55.201
  Serial Number:
Subject Alternative Names:
         DNS Names:     ip-192-168-55-201.us-west-2.compute.internal
         IP Addresses:  192.168.55.201
Events:  <none>

view this post on Zulip Owen Ou (Apr 02 2020 at 23:08):

Can Krustlet support getting the log by DNS name? For example, https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust becomes https://ip-192-168-55-201.us-west-2.compute.internal:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust. Perhaps that's what EKS recommends. Besides, IP addresses can change; using the DNS name is more stable.

view this post on Zulip Owen Ou (Apr 03 2020 at 01:10):

Is this diagram accurate for Krustlet?
krustlet.png

I see that there is no kube-proxy equivalent for krustlet yet. I assume it's going to be in the same krustlet process.

view this post on Zulip Taylor Thomas (Apr 03 2020 at 15:57):

DNS is perfectly acceptable, I don't think that is a required change for Krustlet unless we aren't configuring something. But I know we are setting the hostname when creating the node, so it seems like a k8s thing (could be totally wrong). If it is something we need to change in Krustlet, let me know and we'll add it in

As for your diagram, it looks correct, except for the providers. There is a 1:1 mapping between each krustlet "node" and provider. To clarify: each running Krustlet process only has 1 provider, though you can run multiple krustlet processes on the same node

view this post on Zulip Taylor Thomas (Apr 03 2020 at 17:03):

@Owen Ou So it looks like AKS is using DNS names:

# Normal pod
https://aks-agentpool-81651327-vmss000000:10250/containerLogs/kube-system/tunnelfront-864c788cf6-mtkg4/tunnel-front
# Krustlet pod
https://krustlet:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust

view this post on Zulip Owen Ou (Apr 03 2020 at 17:33):

@Taylor Thomas Krustlet boots with the node IP. It seems like AKS does a reverse lookup of IP -> DNS and calls the DNS name instead?

view this post on Zulip Taylor Thomas (Apr 03 2020 at 17:38):

Krustlet sets the hostname (there is also an ability to override it) on the node object. Here is an example node from my cluster:

apiVersion: v1
kind: Node
metadata:
  annotations:
    node.alpha.kubernetes.io/ttl: "0"
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2020-04-02T18:13:32Z"
  labels:
    beta.kubernetes.io/arch: wasm32-wasi
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: wasm32-wasi
    kubernetes.io/hostname: Taylors-MacBook-Pro.local
    kubernetes.io/os: linux
    kubernetes.io/role: agent
    type: krustlet
  name: krustlet-wasi
  resourceVersion: "2182784"
  selfLink: /api/v1/nodes/krustlet-wasi
  uid: 89b11130-8f61-44d5-b92b-edd348497517
spec:
  podCIDR: 10.244.0.0/24
  podCIDRs:
  - 10.244.0.0/24
  taints:
  - effect: NoExecute
    key: krustlet/arch
    value: wasm32-wasi
  - effect: NoSchedule
    key: node.kubernetes.io/unreachable
    timeAdded: "2020-04-02T18:23:57Z"
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    timeAdded: "2020-04-02T21:21:48Z"
status:
  addresses:
  - address: 10.10.76.188
    type: InternalIP
  - address: Taylors-MacBook-Pro.local
    type: Hostname
  allocatable:
    cpu: "4"
    ephemeral-storage: 61255492Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 4032800Ki
    pods: "30"
  capacity:
    cpu: "4"
    ephemeral-storage: 61255492Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 4032800Ki
    pods: "30"
  conditions:
  - lastHeartbeatTime: "2020-04-02T18:13:32Z"
    lastTransitionTime: "2020-04-02T18:23:57Z"
    message: Kubelet stopped posting node status.
    reason: NodeStatusUnknown
    status: Unknown
    type: Ready
  - lastHeartbeatTime: "2020-04-02T18:13:32Z"
    lastTransitionTime: "2020-04-02T18:13:32Z"
    message: kubelet has sufficient disk space available
    reason: KubeletHasSufficientDisk
    status: "False"
    type: OutOfDisk
  - lastHeartbeatTime: "2020-04-02T18:13:32Z"
    lastTransitionTime: "2020-04-02T18:23:57Z"
    message: Kubelet never posted node status.
    reason: NodeStatusNeverUpdated
    status: Unknown
    type: MemoryPressure
  - lastHeartbeatTime: "2020-04-02T18:13:32Z"
    lastTransitionTime: "2020-04-02T18:23:57Z"
    message: Kubelet never posted node status.
    reason: NodeStatusNeverUpdated
    status: Unknown
    type: DiskPressure
  - lastHeartbeatTime: "2020-04-02T18:13:32Z"
    lastTransitionTime: "2020-04-02T18:23:57Z"
    message: Kubelet never posted node status.
    reason: NodeStatusNeverUpdated
    status: Unknown
    type: PIDPressure
  daemonEndpoints:
    kubeletEndpoint:
      Port: 3001
  nodeInfo:
    architecture: wasm-wasi
    bootID: ""
    containerRuntimeVersion: mvp
    kernelVersion: ""
    kubeProxyVersion: v1.17.0
    kubeletVersion: v1.17.0
    machineID: ""
    operatingSystem: linux
    osImage: ""
    systemUUID: ""

view this post on Zulip Taylor Thomas (Apr 03 2020 at 17:39):

If you look at the addresses block, it has both the IP and the hostname set up

view this post on Zulip Taylor Thomas (Apr 03 2020 at 17:39):

Basically, I don't see this as a krustlet problem; it seems to be entirely about how we get the certs configured properly

view this post on Zulip Taylor Thomas (Apr 03 2020 at 17:40):

But I could totally be wrong

view this post on Zulip Taylor Thomas (Apr 03 2020 at 17:55):

@Owen Ou Is there someone you know who has lots of EKS knowledge? It may be useful to engage them to see if we are creating certs properly for EKS

view this post on Zulip Owen Ou (Apr 04 2020 at 00:58):

@Taylor Thomas This is the node info (one EKS kubelet node and one krustlet node):

apiVersion: v1
items:
- apiVersion: v1
  kind: Node
  metadata:
    annotations:
      node.alpha.kubernetes.io/ttl: "0"
      volumes.kubernetes.io/controller-managed-attach-detach: "true"
    creationTimestamp: "2020-04-02T04:30:13Z"
    labels:
      alpha.eksctl.io/cluster-name: krustlet-o
      alpha.eksctl.io/instance-id: i-0f4560ac42f0488cb
      alpha.eksctl.io/nodegroup-name: ng-6acf5969
      beta.kubernetes.io/arch: amd64
      beta.kubernetes.io/instance-type: m5.large
      beta.kubernetes.io/os: linux
      failure-domain.beta.kubernetes.io/region: us-west-2
      failure-domain.beta.kubernetes.io/zone: us-west-2a
      kubernetes.io/arch: amd64
      kubernetes.io/hostname: ip-192-168-23-105.us-west-2.compute.internal
      kubernetes.io/os: linux
    name: ip-192-168-23-105.us-west-2.compute.internal
    resourceVersion: "245495"
    selfLink: /api/v1/nodes/ip-192-168-23-105.us-west-2.compute.internal
    uid: a5151dd5-749a-11ea-95cb-0a905beb9b08
  spec:
    providerID: aws:///us-west-2a/i-0f4560ac42f0488cb
  status:
    addresses:
    - address: 192.168.23.105
      type: InternalIP
    - address: 34.221.97.70
      type: ExternalIP
    - address: ip-192-168-23-105.us-west-2.compute.internal
      type: Hostname
    - address: ip-192-168-23-105.us-west-2.compute.internal
      type: InternalDNS
    - address: ec2-34-221-97-70.us-west-2.compute.amazonaws.com
      type: ExternalDNS
    allocatable:
      attachable-volumes-aws-ebs: "25"
      cpu: "2"
      ephemeral-storage: "19316009748"
      hugepages-1Gi: "0"
      hugepages-2Mi: "0"
      memory: 7762632Ki
      pods: "29"
    capacity:
      attachable-volumes-aws-ebs: "25"
      cpu: "2"
      ephemeral-storage: 20959212Ki
      hugepages-1Gi: "0"
      hugepages-2Mi: "0"
      memory: 7865032Ki
      pods: "29"
    conditions:
    - lastHeartbeatTime: "2020-04-04T00:47:12Z"
      lastTransitionTime: "2020-04-02T04:30:13Z"
      message: kubelet has sufficient memory available
      reason: KubeletHasSufficientMemory
      status: "False"
      type: MemoryPressure
    - lastHeartbeatTime: "2020-04-04T00:47:12Z"
      lastTransitionTime: "2020-04-02T04:30:13Z"
      message: kubelet has no disk pressure
      reason: KubeletHasNoDiskPressure
      status: "False"
      type: DiskPressure
    - lastHeartbeatTime: "2020-04-04T00:47:12Z"
      lastTransitionTime: "2020-04-02T04:30:13Z"
      message: kubelet has sufficient PID available
      reason: KubeletHasSufficientPID
      status: "False"
      type: PIDPressure
    - lastHeartbeatTime: "2020-04-04T00:47:12Z"
      lastTransitionTime: "2020-04-02T04:30:33Z"
      message: kubelet is posting ready status
      reason: KubeletReady
      status: "True"
      type: Ready
    daemonEndpoints:
      kubeletEndpoint:
        Port: 10250
    images:
    - names:
      - 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni@sha256:a6d23b9fb3d4ba549321e32a28c42d8e79da203897072e93874472ab9e80b768
      - 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.5.5
      sizeBytes: 263850871
    - names:
      - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy@sha256:d3a6122f63202665aa50f3c08644ef504dbe56c76a1e0ab05f8e296328f3a6b4
      - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy:v1.14.6
      sizeBytes: 82044796
    - names:
      - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/coredns@sha256:ff6eadc11a45d8cbad5473b0950e01230c7f23bcb53392c80550feab69f905f1
      - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/coredns:v1.6.6
      sizeBytes: 44336675
    - names:
      - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause-amd64@sha256:bea77c323c47f7b573355516acf927691182d1333333d1f41b7544012fab7adf
      - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause-amd64:3.1
      sizeBytes: 742472
    nodeInfo:
      architecture: amd64
      bootID: 058a481f-38c1-4368-8faf-1b67966bb06a
      containerRuntimeVersion: docker://18.9.9
      kernelVersion: 4.14.171-136.231.amzn2.x86_64
      kubeProxyVersion: v1.14.9-eks-1f0ca9
      kubeletVersion: v1.14.9-eks-1f0ca9
      machineID: ec2574a7ef250c7988b7b403a4de213e
      operatingSystem: linux
      osImage: Amazon Linux 2
      systemUUID: EC2574A7-EF25-0C79-88B7-B403A4DE213E
- apiVersion: v1
  kind: Node
  metadata:
    annotations:
      node.alpha.kubernetes.io/ttl: "0"
      volumes.kubernetes.io/controller-managed-attach-detach: "true"
    creationTimestamp: "2020-04-02T23:01:03Z"
    labels:
      beta.kubernetes.io/arch: wasm32-wasi
      beta.kubernetes.io/os: linux
      kubernetes.io/arch: wasm32-wasi
      kubernetes.io/hostname: ip-192-168-55-201.us-west-2.compute.internal
      kubernetes.io/os: linux
      kubernetes.io/role: agent
      type: krustlet
    name: ip-192-168-55-201.us-west-2.compute.internal
    resourceVersion: "118145"
    selfLink: /api/v1/nodes/ip-192-168-55-201.us-west-2.compute.internal
    uid: d3f725a8-7535-11ea-95cb-0a905beb9b08
  spec:
    podCIDR: 10.244.0.0/24
    taints:
    - effect: NoSchedule
      key: node.kubernetes.io/unreachable
      timeAdded: "2020-04-03T01:25:45Z"
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      timeAdded: "2020-04-03T01:25:50Z"
  status:
    addresses:
    - address: 192.168.55.201
      type: InternalIP
    - address: ip-192-168-55-201.us-west-2.compute.internal
      type: Hostname
    allocatable:
      cpu: "4"
      ephemeral-storage: 61255492Ki
      hugepages-1Gi: "0"
      hugepages-2Mi: "0"
      memory: 4032800Ki
      pods: "30"
    capacity:
      cpu: "4"
      ephemeral-storage: 61255492Ki
      hugepages-1Gi: "0"
      hugepages-2Mi: "0"
      memory: 4032800Ki
      pods: "30"
    conditions:
    - lastHeartbeatTime: "2020-04-02T23:01:03Z"
      lastTransitionTime: "2020-04-03T01:25:45Z"
      message: Kubelet stopped posting node status.
      reason: NodeStatusUnknown
      status: Unknown
      type: Ready
    - lastHeartbeatTime: "2020-04-02T23:01:03Z"
      lastTransitionTime: "2020-04-02T23:01:03Z"
      message: kubelet has sufficient disk space available
      reason: KubeletHasSufficientDisk
      status: "False"
      type: OutOfDisk
    - lastHeartbeatTime: "2020-04-02T23:01:03Z"
      lastTransitionTime: "2020-04-03T01:25:45Z"
      message: Kubelet never posted node status.
      reason: NodeStatusNeverUpdated
      status: Unknown
      type: MemoryPressure
    - lastHeartbeatTime: "2020-04-02T23:01:03Z"
      lastTransitionTime: "2020-04-03T01:25:45Z"
      message: Kubelet never posted node status.
      reason: NodeStatusNeverUpdated
      status: Unknown
      type: DiskPressure
    - lastHeartbeatTime: "2020-04-02T23:01:03Z"
      lastTransitionTime: "2020-04-03T01:25:45Z"
      message: Kubelet never posted node status.
      reason: NodeStatusNeverUpdated
      status: Unknown
      type: PIDPressure
    daemonEndpoints:
      kubeletEndpoint:
        Port: 3000
    nodeInfo:
      architecture: wasm-wasi
      bootID: ""
      containerRuntimeVersion: mvp
      kernelVersion: ""
      kubeProxyVersion: v1.17.0
      kubeletVersion: v1.17.0
      machineID: ""
      operatingSystem: linux
      osImage: ""
      systemUUID: ""
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

view this post on Zulip Owen Ou (Apr 04 2020 at 01:00):

How did you get the container log URL for the normal pod (the containerLogs one)? I want to compare it to understand why the IP (https://192.168.55.201:3000) was preferred over the DNS name.

view this post on Zulip Owen Ou (Apr 04 2020 at 01:03):

(deleted)

view this post on Zulip Ralph (Apr 04 2020 at 13:42):

Keep at it, @Owen Ou. When we figure out precisely what is going on with Krustlet and EKS, that will help us document it and then work to get it working elsewhere too: GKE, Alibaba, DO, and so on.

view this post on Zulip Taylor Thomas (Apr 06 2020 at 16:01):

How did you get the container log URL for the normal pod (the containerLogs one)? I want to compare it to understand why the IP (https://192.168.55.201:3000) was preferred over the DNS name.

Funny story: this is actually because my node was offline when I did kubectl logs, so when the request timed out, it told me what URL it had timed out trying to hit. I am guessing you might be able to find it with the right logging level in your apiserver logs

view this post on Zulip Taylor Thomas (Apr 07 2020 at 19:45):

@Owen Ou if you are still having problems, someone in the community created instructions for using Inlets to tunnel traffic to your krustlet node. Should work anywhere for any k8s cluster: https://gist.github.com/alexellis/d55d6d6a96ea9ae8d9d65b95297ec27e


view this post on Zulip Owen Ou (Apr 20 2020 at 01:08):

Trying to help automate more of the EKS setup: https://github.com/deislabs/krustlet/pull/197 cc: @Peter Huene @Taylor Thomas @Matt Fisher


view this post on Zulip Taylor Thomas (Apr 20 2020 at 16:29):

@Owen Ou Can you take a look at https://github.com/deislabs/krustlet/pull/199? We had the author of that PR join us in our weekly call today. He said he took the work you did and then modified it to parse everything according to the rules specified in the kubelet help text.


view this post on Zulip Owen Ou (Apr 20 2020 at 22:06):

@Taylor Thomas I see. I only quickly added support for labels to avoid manually labeling nodes (https://github.com/deislabs/krustlet/pull/197) but it doesn't validate according to the spec. It's a nice thing to have


view this post on Zulip Owen Ou (Apr 20 2020 at 22:06):

Is there anything I could help with on https://github.com/deislabs/krustlet/issues/187? It's a blocker to getting the full experience on EKS


view this post on Zulip Owen Ou (Apr 20 2020 at 22:08):

Also, I didn't know you guys have a weekly community call. Is it open to everybody? I'm thinking I may join and ask questions as we collaborate more closely in the future.

view this post on Zulip Taylor Thomas (Apr 20 2020 at 22:24):

Yep, we have an open call every Monday

view this post on Zulip Taylor Thomas (Apr 20 2020 at 22:24):

And feel free to take on 187. Ryan suggested something we could do there

view this post on Zulip Taylor Thomas (Apr 23 2020 at 16:57):

@Peter Huene @Owen Ou https://github.com/deislabs/krustlet/issues/187#issuecomment-618516479


view this post on Zulip Peter Huene (Apr 23 2020 at 17:49):

I'll check today to see how the fix goes.

view this post on Zulip Peter Huene (Apr 24 2020 at 03:14):

Fix is looking good so far. I'm going to leave the nodes running for a little while and monitor the service log.

view this post on Zulip Owen Ou (Apr 25 2020 at 18:50):

Trying to help validate this too: https://github.com/deislabs/krustlet/issues/187#issuecomment-619424431


view this post on Zulip Owen Ou (Apr 25 2020 at 18:54):

For wasm-to-oci, is it expected that pushing to Docker Hub fails? https://github.com/engineerd/wasm-to-oci/issues/7


view this post on Zulip Radu Matei (Apr 25 2020 at 19:57):

(https://github.com/engineerd/wasm-to-oci/issues/7#issuecomment-619432537)
TL;DR: the proposal used by wasm-to-oci, while an official proposal, is not yet implemented in most container registries, which actively reject unknown media types.

