Stream created by Owen Ou.
hi!
Hello :wave:
Created this stream to communicate. I'm unsure if I should use the wasmtime stream or a new one (this). Let's try and see how it goes :)
Posting Twitter discussion here for visibility: Krustlet spawns a thread for a long-running pod (https://github.com/deislabs/krustlet/blob/2eb577b88041349c463cc14b80cb83f63cc27e3a/crates/wasi-provider/src/wasi_runtime.rs#L139) which isn't recommended for a multi-tenant k8s cluster due to possible Spectre attack IIUC
(I'm not a Zulip expert, but it seems reasonable to have a separate stream just for these topics.)
Yeah, so to restate what I said in the DM, it seems like it all depends on the context the service is being run in. In full generality (long-running tasks, either with access to time or attacker is able to externally time), I believe you do have to put separate tenants/domains in separate OS processes. It's a good question of what's the best way to achieve that with either a krustlet or containerd-shim approach.
@Owen Ou, I'm interested in this as well; I tried to get krustlet running a while back but ran into issues so I put that on hold. Have you had better luck with that or WaPC? Or are you thinking about creating something completely different?
@Andrew Brown I'm trying to get more information from the krustlet folks this week. I'm thinking of building a containerd-shim that drives wasmtime.
Regarding the wasmtime part, I would need wasmtime experts to chime in. It sounds like the most secure way is that each wasmtime instance runs in an OS process. But that means we don't get nano processes :(. Is there a way to avoid Spectre attack for nano processes?
I'm no expert but there have been discussions about this; cc: @Mingqiu Sun, @Pat Hickey
@Luke Wagner @Peter Huene Circling back on the "wasmtime daemon" idea. There is a wasmtimed process that schedules wasm modules onto free wasmtime processes in a wasmtime process pool. One nano process is scheduled on a wasmtime process at a time, and wasmtimed won't schedule a module onto a wasmtime process that is already occupied by a nano process. wasmtimed-1.png
I think that was the gist of what Luke was describing, although I wonder if he meant it at the containerd-shim level or at the krustlet-implementation level. I also think there are timing-attack mitigations we can do in WASI as well (a la limiting access to high-resolution timers like browsers do) that may help reduce the attack surface for multi-tenant nanoprocesses in the same OS process.
sorry, I don't have very much insight on this question, I'm not sure how the trust boundaries are being drawn in this problem space
You may consider using Intel Protection Keys for Spectre protection at the thread level, but currently there is a 16-domain limitation.
@Owen Ou Yes, that is what I was imagining. That design wasn't specific to either the krustlet or containerd-shim approach. One high-level takeaway I had from the earlier chat was that the containerd-shim approach may force process creation in a way that might be incompatible with this approach -- it'd be good to verify that claim, though.
@Peter Huene Even if we take away time via WASI impl, if the wasm can be long-running and the attacker can externally time how long it takes to run (b/c, e.g., the wasm is running as part of a request/response loop), then a timing attack is still possible (b/c the attacker wasm can vary how long it takes based on the speculatively-stolen secret value).
Hey all! Finally joined in here so you can just ask Krustlet questions directly :D
RE: the threading vs. separate OS process approach: we are still working through Krustlet's design. We are still weighing both approaches, and both are possible in the current architecture - just write a new wasmtime Provider that spawns wasmtime instances in a new process and let the Provider manage the instance.
You can take a look at wasi_runtime.rs to see how this is accomplished: https://github.com/deislabs/krustlet/blob/af061f0487fdacdb407bb798501b92b95f78f978/crates/wasi-provider/src/wasi_runtime.rs#L132
I'm not familiar with the speculative timing attacks mentioned here WRT spawning untrusted wasmtime instances in separate threads. Is there some ticket or design doc that describes this attack in more detail?
@Matt Fisher: I think @Luke Wagner and @Peter Huene have more info on ^
@Matt Fisher The issue isn't really specific to wasmtime; it's more of a general Spectre consequence that is our new reality. Basically, the only general way to prevent Spectre attacks is to use an OS process boundary (which are occasionally breached, but at least CPU/OS vendors work in concert to fix these by adding mitigations to context/ring switches). Acknowledging this fact is why browsers are all doing process-per-origin/site (https://chromium.googlesource.com/chromium/src/+/master/docs/security/side-channel-threat-model.md). Wasmtime really doesn't have a say in the matter in the absence of any sub-process "Time Protection" (https://ts.data61.csiro.au/publications/csiro_full_text//Ge_YCH_19.pdf) primitives. Of course, in constrained execution scenarios (where you can limit what the attacker can do or observe), one can avoid OS processes, as edge compute vendors have done, but making that argument takes a lot more work and context.
@Taylor Thomas @Matt Fisher I'm trying to get krustlet to run on EKS. I was able to get the krustlet node to register but kubectl logs didn't work:
k get nodes -o wide
NAME                                           STATUS   ROLES    AGE   VERSION              INTERNAL-IP      EXTERNAL-IP    OS-IMAGE         KERNEL-VERSION                  CONTAINER-RUNTIME
ip-192-168-23-105.us-west-2.compute.internal   Ready    <none>   55m   v1.14.9-eks-1f0ca9   192.168.23.105   34.221.97.70   Amazon Linux 2   4.14.171-136.231.amzn2.x86_64   docker://18.9.9
ip-192-168-55-201.us-west-2.compute.internal   Ready    agent    8s    v1.17.0              192.168.55.201   <none>         <unknown>        <unknown>                       mvp
k get po
NAME                    READY   STATUS       RESTARTS   AGE
hello-world-wasi-rust   0/1     ExitCode:0   0          2s
k logs hello-world-wasi-rust
Error from server: Get https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust: x509: cannot validate certificate for 192.168.55.201 because it doesn't contain any IP SANs
KUBECONFIG=./kubeconfig-sa PFX_PATH=./krustlet.pfx PFX_PASSWORD=password ./krustlet-wasi
[2020-04-02T05:26:34Z ERROR kubelet::server] error handling connection: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate:ssl/record/rec_layer_s3.c:1544:SSL alert number 42
There seems to be some cert issue. Do you happen to know what went wrong?
On the node, I could curl the log though by ignoring the cert:
curl https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust -v -k
* Trying 192.168.55.201...
* TCP_NODELAY set
* Connected to 192.168.55.201 (192.168.55.201) port 3000 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
    CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: C=US; ST=.; L=.; O=.; OU=.; CN=krustlet
*  start date: Apr 2 04:44:00 2020 GMT
*  expire date: Apr 2 04:44:00 2021 GMT
*  issuer: CN=kubernetes
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
> GET /containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust HTTP/1.1
> Host: 192.168.55.201:3000
> User-Agent: curl/7.61.1
> Accept: */*
>
< HTTP/1.1 200 OK
< content-length: 116
< date: Thu, 02 Apr 2020 05:41:59 GMT
<
hello from stdout!
hello from stderr!
FOO=bar
CONFIG_MAP_VAL=cool stuff
POD_NAME=hello-world-wasi-rust
Args are: []
* Connection #0 to host 192.168.55.201 left intact
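A quick way to dump the SANs (if any) on the certificate the node is actually serving is openssl s_client; this is a sketch, assuming openssl is available somewhere that can reach the node:
echo | openssl s_client -connect 192.168.55.201:3000 2>/dev/null | openssl x509 -noout -text | grep -A1 "Subject Alternative Name"
# no output here means the served certificate carries no SANs at all,
# which matches the "doesn't contain any IP SANs" error from kubectl logs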
Another question: I was trying to run kubectl logs on my local machine against EKS, and 192.168.55.201 is an internal IP. Would I be able to access the log via this internal IP? Should there be an external IP field for Krustlet in kubectl get node? This is what a normal kubelet registers on the same node, with both an internal IP and an external IP:
k get nodes -o wide
NAME                                           STATUS   ROLES    AGE   VERSION              INTERNAL-IP      EXTERNAL-IP      OS-IMAGE         KERNEL-VERSION                  CONTAINER-RUNTIME
ip-192-168-23-105.us-west-2.compute.internal   Ready    <none>   50m   v1.14.9-eks-1f0ca9   192.168.23.105   34.221.97.70     Amazon Linux 2   4.14.171-136.231.amzn2.x86_64   docker://18.9.9
ip-192-168-55-201.us-west-2.compute.internal   Ready    <none>   50m   v1.14.9-eks-1f0ca9   192.168.55.201   54.187.160.172   Amazon Linux 2   4.14.171-136.231.amzn2.x86_64   docker://19.3.6
Go's TLS crypto library sure is funny... Do you have steps available for how you generated the certificate for the Krustlet node?
@Taylor Thomas have you seen this error before? Perhaps it has to do with the common-name parameter when generating the certificate...
@Owen Ou It looks like you gave the cert the common name of "krustlet" but it has a host name of ip-192-168-55-201.us-west-2.compute.internal
That could possibly be the issue. As for the external IP thing, we should probably open an issue to have it register that as well, although for this initial case, we have been targeting things that don't necessarily have a publicly accessible IP address
@Taylor Thomas @Matt Fisher:
I followed the steps in https://github.com/deislabs/krustlet/blob/master/docs/howto/krustlet-on-aks.md#step-2-create-certificate to generate the cert.
Same issue after changing the CN to ip-192-168-55-201.us-west-2.compute.internal:
k logs hello-world-wasi-rust
Error from server: Get https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust: x509: cannot validate certificate for 192.168.55.201 because it doesn't contain any IP SANs
curl https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust -v -k
* Trying 192.168.55.201...
* TCP_NODELAY set
* Connected to 192.168.55.201 (192.168.55.201) port 3000 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
    CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: C=US; ST=.; L=.; O=.; OU=.; CN=ip-192-168-55-201.us-west-2.compute.internal
*  start date: Apr 2 17:44:00 2020 GMT
*  expire date: Apr 2 17:44:00 2021 GMT
*  issuer: CN=kubernetes
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
> GET /containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust HTTP/1.1
> Host: 192.168.55.201:3000
> User-Agent: curl/7.61.1
> Accept: */*
>
< HTTP/1.1 200 OK
< content-length: 116
< date: Thu, 02 Apr 2020 17:52:13 GMT
<
hello from stdout!
hello from stderr!
FOO=bar
CONFIG_MAP_VAL=cool stuff
POD_NAME=hello-world-wasi-rust
Args are: []
* Connection #0 to host 192.168.55.201 left intact
"cannot validate certificate for 192.168.55.201 because it doesn't contain any IP SANs" looks like we need to set its IP as well?
I didn't need to on the AKS example, but not sure what the differences are
@Taylor Thomas Setting IP as the CN? I did that as well and still got the same error.
I honestly am not good with cert stuff but I think you need to set the SAN with the IP as well (not as the CN):
[ req ]
default_bits = 2048
distinguished_name = req_distinguished_name
req_extensions = req_ext
[ req_distinguished_name ]
countryName = Country Name (2 letter code)
stateOrProvinceName = State or Province Name (full name)
localityName = Locality Name (eg, city)
organizationName = Organization Name (eg, company)
commonName = Common Name (e.g. server FQDN or YOUR name)
[ req_ext ]
subjectAltName = @alt_names
[alt_names]
IP.1 = <ip_addr>
Let me put together a full file
Try this:
[ req ]
distinguished_name = req_distinguished_name
x509_extensions = v3_req
prompt = no
[ req_distinguished_name ]
C = US
ST = .
L = .
OU = .
CN = krustlet
[ v3_req ]
subjectAltName = @alt_names
[alt_names]
IP.1 = 192.168.55.201
With this command:
openssl req -new -sha256 -newkey rsa:2048 -keyout krustlet.key -out krustlet.csr -nodes -config test_csr.cnf
Saving that file as `test_csr.cnf`.
That should generate your CSR
And maybe with that IP address it will allow it
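One sanity check that might help before uploading the CSR (a sketch, assuming openssl is on the path and the CSR file is krustlet.csr as in the command above): confirm the SAN actually landed in the request.
openssl req -noout -text -in krustlet.csr | grep -A1 "Subject Alternative Name"
# expect something like:
#   X509v3 Subject Alternative Name:
#       IP Address:192.168.55.201
# if this comes back empty, the extensions section may not be getting picked up at
# request time (openssl req generally honors req_extensions rather than x509_extensions)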
curl will never be successful without -k unless you have the CA cert from k8s available
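If you do want curl to verify the node instead of using -k, one option (a sketch, assuming the signer is the cluster CA embedded in your kubeconfig) is to pull the cluster CA out of the kubeconfig and point curl at it:
kubectl config view --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 --decode > ca.crt
curl --cacert ca.crt https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust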
I tried your cnf file and regenerated certs. Still the same "x509: cannot validate certificate for 192.168.55.201 because it doesn't contain any IP SANs" error.
I tried to curl with the generated crt on the host and got an error:
curl https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust -v --cacert ./krustlet.crt
* Trying 192.168.55.201...
* TCP_NODELAY set
* Connected to 192.168.55.201 (192.168.55.201) port 3000 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: ./krustlet.crt
    CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS alert, unknown CA (560):
* SSL certificate problem: unable to get local issuer certificate
* Closing connection 0
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
@Owen Ou we might need to do a little debug session tomorrow. I am kind of confused how I got the certs to work so easily with AKS and it is being difficult with EKS, even though they are technically the same process. If you are available tomorrow, I can make some time
@Taylor Thomas That would be perfect. Thank you! I'm available anytime after 2pm PST tomorrow. My email o@heroku.com
Hmm... I think it's an EKS issue: https://github.com/awslabs/amazon-eks-ami/issues/341. The crt signed by the kube API doesn't have the IP SAN:
$ openssl x509 -noout -text -in krustlet.crt Certificate: Data: Version: 3 (0x2) Serial Number: 36:65:33:da:2b:34:a5:e5:e7:ee:4a:36:4b:5c:c3:c0:45:5c:de:a3 Signature Algorithm: sha256WithRSAEncryption Issuer: CN=kubernetes Validity Not Before: Apr 2 05:42:00 2020 GMT Not After : Apr 2 05:42:00 2021 GMT Subject: C=US, ST=., L=., O=., OU=., CN=krustlet Subject Public Key Info: Public Key Algorithm: rsaEncryption Public-Key: (2048 bit) Modulus: 00:ee:4a:1c:d5:24:6c:98:6c:87:0a:2b:09:74:16: 9e:b9:01:15:92:dd:1e:0a:8f:58:19:68:a8:5c:3a: 73:c1:d1:d7:ca:f0:30:c1:f3:09:40:71:f8:e1:3d: c4:bd:6e:c5:08:ba:36:27:7c:ab:85:4d:01:97:81: b7:e2:d0:39:8f:a4:09:e1:d0:77:3e:7c:80:60:6b: e3:c0:5a:16:e3:ed:ec:06:64:40:b0:15:2a:c1:fe: 2a:fb:ed:ad:b6:11:d3:93:f7:88:2b:4a:0c:be:d9: 3f:c9:1e:0a:95:b6:50:63:5e:d4:04:95:6a:23:11: b7:23:a6:8e:c0:0d:51:1b:9d:c9:f7:23:9b:ea:c5: 85:0a:bb:12:55:15:4c:99:61:97:5d:29:2c:6f:03: 02:11:44:18:fa:88:b8:9f:04:46:b4:df:e4:27:81: 91:ba:5b:51:b9:ea:f9:df:ff:00:99:e3:69:f7:4e: af:ce:f0:9c:cb:23:0f:51:68:53:ab:0d:33:0d:27: 92:d5:02:41:12:d8:5e:3c:bd:00:03:bc:98:21:f4: 99:09:d1:19:21:02:1c:8d:5a:99:0e:f9:44:c5:6d: 04:82:6e:06:e7:eb:e9:d2:91:18:be:96:3a:7b:81: 89:e7:d6:ce:ca:26:8b:76:cc:05:25:fd:83:ee:d4: 64:f5 Exponent: 65537 (0x10001) X509v3 extensions: X509v3 Key Usage: critical Digital Signature, Key Encipherment X509v3 Extended Key Usage: TLS Web Server Authentication X509v3 Basic Constraints: critical CA:FALSE X509v3 Subject Key Identifier: B4:02:BB:10:95:05:31:73:2D:1E:44:E3:81:67:51:4B:7D:F2:25:E9 Signature Algorithm: sha256WithRSAEncryption 46:f2:e8:44:f4:f3:35:0d:56:32:df:5d:63:1d:0d:72:f4:98: 2e:3d:c3:05:dc:86:09:90:da:83:9e:28:74:a0:39:0b:43:4e: 90:a8:9f:a9:61:7f:2c:44:74:a0:b0:21:b6:b7:46:a5:d8:cd: bf:68:30:32:19:4e:84:73:57:77:26:c0:78:d5:0e:21:d5:4d: d4:4a:c9:8f:08:41:7f:d1:62:9b:b8:d1:4b:1f:4d:98:9a:15: 21:d2:26:bc:b3:6f:10:80:d3:53:43:71:29:39:39:6d:8e:0c: 67:a9:02:50:a9:37:b2:c4:3e:f0:30:eb:1a:1a:95:93:04:c4: 04:38:e3:89:55:e4:84:a4:fa:df:24:fa:44:88:20:46:c0:7d: b9:c1:71:8a:63:a3:db:ee:ad:05:57:46:1b:b4:e4:1c:ff:75: 85:85:42:7a:40:87:10:34:af:53:8d:0c:f8:0e:10:96:53:37: a4:97:5f:25:d2:23:9e:d4:6a:05:be:f9:a2:bd:47:ad:09:65: 90:4b:0f:c1:63:eb:b8:62:60:ee:2e:e1:92:cd:ae:e3:04:54: b8:8c:b3:8e:36:22:4b:bd:97:ae:5a:51:c5:16:b2:13:cc:cc: 17:74:92:ee:60:28:22:02:a2:e0:29:0e:f8:cf:92:cf:a8:85: 2e:3a:b6:eb
But I can see the IP in the altnames for the CSR that I uploaded:
$ k describe CertificateSigningRequest
Name:               krustlet
Labels:             <none>
Annotations:        kubectl.kubernetes.io/last-applied-configuration
CreationTimestamp:  Thu, 02 Apr 2020 15:47:50 -0700
Requesting User:    kubernetes-admin
Status:             Approved,Issued
Subject:
  Common Name:    192.168.55.201
  Serial Number:
Subject Alternative Names:
  DNS Names:     ip-192-168-55-201.us-west-2.compute.internal
  IP Addresses:  192.168.55.201
Events:  <none>
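It might also be worth decoding the certificate that came back on that CSR object, to confirm it really is the signer dropping the SANs (a sketch, assuming the CSR is named krustlet as in the describe output above):
kubectl get csr krustlet -o jsonpath='{.status.certificate}' | base64 --decode > signed.crt
openssl x509 -noout -text -in signed.crt | grep -A1 "Subject Alternative Name"
# empty output here, while the SANs are still present in the request, would point at the EKS signer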
Can Krustlet support getting the log by DNS name? For example, https://192.168.55.201:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust becomes https://ip-192-168-55-201.us-west-2.compute.internal:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust. Perhaps that's what EKS recommends. Besides, the IP address can change; the DNS name is more stable.
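If the signer does keep the DNS SAN, a quick way to check whether hostname-based access would validate is to hit the node by name with the cluster CA (a hedged sketch: ca.crt as extracted above, and the hostname has to resolve from wherever this runs):
curl --cacert ca.crt https://ip-192-168-55-201.us-west-2.compute.internal:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust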
Is this diagram accurate for Krustlet?
krustlet.png
I see that there is no kube-proxy equivalent for krustlet yet. I assume it's going to be in the same krustlet process.
DNS is perfectly acceptable, I don't think that is a required change for Krustlet unless we aren't configuring something. But I know we are setting the hostname when creating the node, so it seems like a k8s thing (could be totally wrong). If it is something we need to change in Krustlet, let me know and we'll add it in
As for your diagram, it looks correct, except for the providers. There is a 1:1 mapping between each krustlet "node" and provider. To clarify: each running Krustlet process only has 1 provider, though you can run multiple krustlet processes on the same node
@Owen Ou So it looks like AKS is using DNS addresses:
# Normal pod
https://aks-agentpool-81651327-vmss000000:10250/containerLogs/kube-system/tunnelfront-864c788cf6-mtkg4/tunnel-front
# Krustlet pod
https://krustlet:3000/containerLogs/default/hello-world-wasi-rust/hello-world-wasi-rust
@Taylor Thomas Krustlet boots with the node IP. It seems like AKS does a reverse lookup of IP -> DNS and calls the DNS name instead?
Krustlet sets the hostname (there is also an ability to override it) on the node object. Here is an example node from my cluster:
apiVersion: v1 kind: Node metadata: annotations: node.alpha.kubernetes.io/ttl: "0" volumes.kubernetes.io/controller-managed-attach-detach: "true" creationTimestamp: "2020-04-02T18:13:32Z" labels: beta.kubernetes.io/arch: wasm32-wasi beta.kubernetes.io/os: linux kubernetes.io/arch: wasm32-wasi kubernetes.io/hostname: Taylors-MacBook-Pro.local kubernetes.io/os: linux kubernetes.io/role: agent type: krustlet name: krustlet-wasi resourceVersion: "2182784" selfLink: /api/v1/nodes/krustlet-wasi uid: 89b11130-8f61-44d5-b92b-edd348497517 spec: podCIDR: 10.244.0.0/24 podCIDRs: - 10.244.0.0/24 taints: - effect: NoExecute key: krustlet/arch value: wasm32-wasi - effect: NoSchedule key: node.kubernetes.io/unreachable timeAdded: "2020-04-02T18:23:57Z" - effect: NoExecute key: node.kubernetes.io/unreachable timeAdded: "2020-04-02T21:21:48Z" status: addresses: - address: 10.10.76.188 type: InternalIP - address: Taylors-MacBook-Pro.local type: Hostname allocatable: cpu: "4" ephemeral-storage: 61255492Ki hugepages-1Gi: "0" hugepages-2Mi: "0" memory: 4032800Ki pods: "30" capacity: cpu: "4" ephemeral-storage: 61255492Ki hugepages-1Gi: "0" hugepages-2Mi: "0" memory: 4032800Ki pods: "30" conditions: - lastHeartbeatTime: "2020-04-02T18:13:32Z" lastTransitionTime: "2020-04-02T18:23:57Z" message: Kubelet stopped posting node status. reason: NodeStatusUnknown status: Unknown type: Ready - lastHeartbeatTime: "2020-04-02T18:13:32Z" lastTransitionTime: "2020-04-02T18:13:32Z" message: kubelet has sufficient disk space available reason: KubeletHasSufficientDisk status: "False" type: OutOfDisk - lastHeartbeatTime: "2020-04-02T18:13:32Z" lastTransitionTime: "2020-04-02T18:23:57Z" message: Kubelet never posted node status. reason: NodeStatusNeverUpdated status: Unknown type: MemoryPressure - lastHeartbeatTime: "2020-04-02T18:13:32Z" lastTransitionTime: "2020-04-02T18:23:57Z" message: Kubelet never posted node status. reason: NodeStatusNeverUpdated status: Unknown type: DiskPressure - lastHeartbeatTime: "2020-04-02T18:13:32Z" lastTransitionTime: "2020-04-02T18:23:57Z" message: Kubelet never posted node status. reason: NodeStatusNeverUpdated status: Unknown type: PIDPressure daemonEndpoints: kubeletEndpoint: Port: 3001 nodeInfo: architecture: wasm-wasi bootID: "" containerRuntimeVersion: mvp kernelVersion: "" kubeProxyVersion: v1.17.0 kubeletVersion: v1.17.0 machineID: "" operatingSystem: linux osImage: "" systemUUID: ""
If you look at the addresses block, it has both the IP and the hostname set up
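A quick way to see exactly which addresses got registered (a sketch; substitute your node name, e.g. krustlet-wasi above or ip-192-168-55-201.us-west-2.compute.internal on the EKS side):
kubectl get node <node-name> -o jsonpath='{.status.addresses}'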
Basically, I don't see anything with this being a krustlet problem and everything with how we get certs configured properly
But I could totally be wrong
@Owen Ou Is there someone you know who has lots of EKS knowledge? It may be useful to engage them to see if we are creating certs properly for EKS
@Taylor Thomas This is the node info (one EKS kubelet node and one krustlet node):
apiVersion: v1 items: - apiVersion: v1 kind: Node metadata: annotations: node.alpha.kubernetes.io/ttl: "0" volumes.kubernetes.io/controller-managed-attach-detach: "true" creationTimestamp: "2020-04-02T04:30:13Z" labels: alpha.eksctl.io/cluster-name: krustlet-o alpha.eksctl.io/instance-id: i-0f4560ac42f0488cb alpha.eksctl.io/nodegroup-name: ng-6acf5969 beta.kubernetes.io/arch: amd64 beta.kubernetes.io/instance-type: m5.large beta.kubernetes.io/os: linux failure-domain.beta.kubernetes.io/region: us-west-2 failure-domain.beta.kubernetes.io/zone: us-west-2a kubernetes.io/arch: amd64 kubernetes.io/hostname: ip-192-168-23-105.us-west-2.compute.internal kubernetes.io/os: linux name: ip-192-168-23-105.us-west-2.compute.internal resourceVersion: "245495" selfLink: /api/v1/nodes/ip-192-168-23-105.us-west-2.compute.internal uid: a5151dd5-749a-11ea-95cb-0a905beb9b08 spec: providerID: aws:///us-west-2a/i-0f4560ac42f0488cb status: addresses: - address: 192.168.23.105 type: InternalIP - address: 34.221.97.70 type: ExternalIP - address: ip-192-168-23-105.us-west-2.compute.internal type: Hostname - address: ip-192-168-23-105.us-west-2.compute.internal type: InternalDNS - address: ec2-34-221-97-70.us-west-2.compute.amazonaws.com type: ExternalDNS allocatable: attachable-volumes-aws-ebs: "25" cpu: "2" ephemeral-storage: "19316009748" hugepages-1Gi: "0" hugepages-2Mi: "0" memory: 7762632Ki pods: "29" capacity: attachable-volumes-aws-ebs: "25" cpu: "2" ephemeral-storage: 20959212Ki hugepages-1Gi: "0" hugepages-2Mi: "0" memory: 7865032Ki pods: "29" conditions: - lastHeartbeatTime: "2020-04-04T00:47:12Z" lastTransitionTime: "2020-04-02T04:30:13Z" message: kubelet has sufficient memory available reason: KubeletHasSufficientMemory status: "False" type: MemoryPressure - lastHeartbeatTime: "2020-04-04T00:47:12Z" lastTransitionTime: "2020-04-02T04:30:13Z" message: kubelet has no disk pressure reason: KubeletHasNoDiskPressure status: "False" type: DiskPressure - lastHeartbeatTime: "2020-04-04T00:47:12Z" lastTransitionTime: "2020-04-02T04:30:13Z" message: kubelet has sufficient PID available reason: KubeletHasSufficientPID status: "False" type: PIDPressure - lastHeartbeatTime: "2020-04-04T00:47:12Z" lastTransitionTime: "2020-04-02T04:30:33Z" message: kubelet is posting ready status reason: KubeletReady status: "True" type: Ready daemonEndpoints: kubeletEndpoint: Port: 10250 images: - names: - 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni@sha256:a6d23b9fb3d4ba549321e32a28c42d8e79da203897072e93874472ab9e80b768 - 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.5.5 sizeBytes: 263850871 - names: - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy@sha256:d3a6122f63202665aa50f3c08644ef504dbe56c76a1e0ab05f8e296328f3a6b4 - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy:v1.14.6 sizeBytes: 82044796 - names: - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/coredns@sha256:ff6eadc11a45d8cbad5473b0950e01230c7f23bcb53392c80550feab69f905f1 - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/coredns:v1.6.6 sizeBytes: 44336675 - names: - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause-amd64@sha256:bea77c323c47f7b573355516acf927691182d1333333d1f41b7544012fab7adf - 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause-amd64:3.1 sizeBytes: 742472 nodeInfo: architecture: amd64 bootID: 058a481f-38c1-4368-8faf-1b67966bb06a containerRuntimeVersion: docker://18.9.9 kernelVersion: 4.14.171-136.231.amzn2.x86_64 kubeProxyVersion: v1.14.9-eks-1f0ca9 kubeletVersion: v1.14.9-eks-1f0ca9 
machineID: ec2574a7ef250c7988b7b403a4de213e operatingSystem: linux osImage: Amazon Linux 2 systemUUID: EC2574A7-EF25-0C79-88B7-B403A4DE213E - apiVersion: v1 kind: Node metadata: annotations: node.alpha.kubernetes.io/ttl: "0" volumes.kubernetes.io/controller-managed-attach-detach: "true" creationTimestamp: "2020-04-02T23:01:03Z" labels: beta.kubernetes.io/arch: wasm32-wasi beta.kubernetes.io/os: linux kubernetes.io/arch: wasm32-wasi kubernetes.io/hostname: ip-192-168-55-201.us-west-2.compute.internal kubernetes.io/os: linux kubernetes.io/role: agent type: krustlet name: ip-192-168-55-201.us-west-2.compute.internal resourceVersion: "118145" selfLink: /api/v1/nodes/ip-192-168-55-201.us-west-2.compute.internal uid: d3f725a8-7535-11ea-95cb-0a905beb9b08 spec: podCIDR: 10.244.0.0/24 taints: - effect: NoSchedule key: node.kubernetes.io/unreachable timeAdded: "2020-04-03T01:25:45Z" - effect: NoExecute key: node.kubernetes.io/unreachable timeAdded: "2020-04-03T01:25:50Z" status: addresses: - address: 192.168.55.201 type: InternalIP - address: ip-192-168-55-201.us-west-2.compute.internal type: Hostname allocatable: cpu: "4" ephemeral-storage: 61255492Ki hugepages-1Gi: "0" hugepages-2Mi: "0" memory: 4032800Ki pods: "30" capacity: cpu: "4" ephemeral-storage: 61255492Ki hugepages-1Gi: "0" hugepages-2Mi: "0" memory: 4032800Ki pods: "30" conditions: - lastHeartbeatTime: "2020-04-02T23:01:03Z" lastTransitionTime: "2020-04-03T01:25:45Z" message: Kubelet stopped posting node status. reason: NodeStatusUnknown status: Unknown type: Ready - lastHeartbeatTime: "2020-04-02T23:01:03Z" lastTransitionTime: "2020-04-02T23:01:03Z" message: kubelet has sufficient disk space available reason: KubeletHasSufficientDisk status: "False" type: OutOfDisk - lastHeartbeatTime: "2020-04-02T23:01:03Z" lastTransitionTime: "2020-04-03T01:25:45Z" message: Kubelet never posted node status. reason: NodeStatusNeverUpdated status: Unknown type: MemoryPressure - lastHeartbeatTime: "2020-04-02T23:01:03Z" lastTransitionTime: "2020-04-03T01:25:45Z" message: Kubelet never posted node status. reason: NodeStatusNeverUpdated status: Unknown type: DiskPressure - lastHeartbeatTime: "2020-04-02T23:01:03Z" lastTransitionTime: "2020-04-03T01:25:45Z" message: Kubelet never posted node status. reason: NodeStatusNeverUpdated status: Unknown type: PIDPressure daemonEndpoints: kubeletEndpoint: Port: 3000 nodeInfo: architecture: wasm-wasi bootID: "" containerRuntimeVersion: mvp kernelVersion: "" kubeProxyVersion: v1.17.0 kubeletVersion: v1.17.0 machineID: "" operatingSystem: linux osImage: "" systemUUID: "" kind: List metadata: resourceVersion: "" selfLink: ""
How did you get the container log URL for the normal pod (the containerLogs one)? I want to compare it to understand why the IP (https://192.168.55.201:3000) was preferred instead of the DNS name.
Keep at it, @Owen Ou. When we figure out precisely what is going on with Krustlet and EKS, that will help us document it and then work to get it working elsewhere, too: GKE, Alibaba, DO, and so on...
How did you get the container log URL for the normal pod (the containerLogs one)? I want to compare it to understand why the IP (https://192.168.55.201:3000) was preferred instead of the DNS name.
Funny story, this is actually because my node was offline when I did kubectl logs, so when the request timed out, it told me what URL it timed out trying to hit. I am guessing you might be able to find it with the right logging level in your apiserver logs.
@Owen Ou if you are still having problems, someone in the community created instructions for using Inlets to tunnel traffic to your krustlet node. Should work anywhere for any k8s cluster: https://gist.github.com/alexellis/d55d6d6a96ea9ae8d9d65b95297ec27e
Trying to help automate more of the EKS setup: https://github.com/deislabs/krustlet/pull/197 cc: @Peter Huene @Taylor Thomas @Matt Fisher
@Owen Ou Can you take a look at https://github.com/deislabs/krustlet/pull/199? We had the author of that PR join us in our weekly call today. He said he took the work you did and then modified it to parse everything according to the rule specified in the kubelet help text
@Taylor Thomas I see. I only quickly added support for labels to avoid manually labeling nodes (https://github.com/deislabs/krustlet/pull/197) but it doesn't validate according to the spec. It's a nice thing to have
Is there anything I could help with on https://github.com/deislabs/krustlet/issues/187? It's a blocker to getting the full experience on EKS.
Also, I didn't know you guys have a community weekly call. Is it open to everybody? Thinking that I may join and ask questions when we collaborate more closely in the future.
Yep, we have an open call every Monday
And feel free to take on 187. Ryan suggested something we could do there
@Peter Huene @Owen Ou https://github.com/deislabs/krustlet/issues/187#issuecomment-618516479
I'll check today to see how the fix goes.
Fix is looking good so far. I'm going to leave the nodes running for a little while and monitor the service log.
Trying to help validate this too: https://github.com/deislabs/krustlet/issues/187#issuecomment-619424431
For wasm-to-oci, is it expected that pushing to Docker Hub fails? https://github.com/engineerd/wasm-to-oci/issues/7
(https://github.com/engineerd/wasm-to-oci/issues/7#issuecomment-619432537)
TL;DR: the proposal used by wasm-to-oci, while an official proposal, is not yet implemented in most container registries, which actively reject unknown media types.