Stream: general

Topic: ✔ Private internal components?


view this post on Zulip James Mart (Dec 22 2023 at 19:43):

I have a question related to the static analysis of components in the WebAssembly component model.

Let's say I have a wasm component (#1) with an export A.
I have another component #2 that imports A and B.
Now I compose a third component out of #1 and #2, and therefore it only imports B (since the A import is internally satisfied).

Can someone looking at component #3 tell that it contains functionality from A? Is it guaranteed that the wit file extracted from component #3 will reveal the presence of A? Or could that information be obfuscated?

view this post on Zulip Lann Martin (Dec 22 2023 at 19:52):

The exports of component 3 won't inherently reveal anything about components 1 or 2. The actual encoded bytes of component 3 will (with existing tooling) reveal everything about the composition.
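
As a rough illustration of the question and answer above, the outer signature of a composition can be modeled with plain set arithmetic. This is only a sketch, not how any real composition tool works: the `Component` class, the `compose` helper, and the `render` export are hypothetical names invented for the example, and interfaces are reduced to bare strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    imports: frozenset
    exports: frozenset


def compose(name, provider, consumer):
    """Plug the provider's exports into the consumer's matching imports.

    Assumption: the composed component re-exports only the consumer's exports.
    """
    satisfied = consumer.imports & provider.exports
    remaining = (consumer.imports - satisfied) | provider.imports
    return Component(name, frozenset(remaining), consumer.exports)


# Component #1 exports A; component #2 imports A and B.
c1 = Component("#1", frozenset(), frozenset({"A"}))
c2 = Component("#2", frozenset({"A", "B"}), frozenset({"render"}))  # "render" is hypothetical

c3 = compose("#3", c1, c2)
print(sorted(c3.imports))  # ['B']
print(sorted(c3.exports))  # ['render']
```

In this model A does not appear anywhere in the outer signature of #3, which matches the point that the imports and exports alone do not reveal the internal wiring.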

view this post on Zulip Lann Martin (Dec 22 2023 at 19:53):

You could conceptually have a tool that "compiles" the composition of 1 and 2 into a core module that behaves the same as the composition, but I'm not aware of anyone doing that (and arguably that wouldn't be "composing" in the component model sense any more).

view this post on Zulip James Mart (Dec 23 2023 at 02:48):

Thanks for the reply, Lann. Consider a component that by its imports and exports appears only to do screen rendering, but in fact it also "secretly" makes HTTP requests to a server? I know if I parse the bytecode or monitor the network I may discover the functionality, but if I understand your answer, then I can't necessarily "trust" the declared imports and exports tell the full story.

If this is true, I don't really understand the claims I've heard about static analyzability in the component model, for example in some of the recent WasmCon talks. I suppose if I'm the author of all of the components then I can statically analyze the final product, but I can't really statically analyze a composition if it uses any registry components.

view this post on Zulip Jeff Parsons (Dec 23 2023 at 08:54):

James Mart said:

Consider a component that by its imports and exports appears only to do screen rendering, but in fact it also "secretly" makes HTTP requests to a server? I know if I parse the bytecode or monitor the network I may discover the functionality, but if I understand your answer, then I can't necessarily "trust" the declared imports and exports tell the full story.

No, that is absolutely not a thing a component can do. Regardless of how it is structured (or obfuscated) internally, the only way it can perform any kind of IO is through its imports and exports.

view this post on Zulip Jeff Parsons (Dec 23 2023 at 09:01):

I might be mistaken, but I think the source of confusion might have been what Lann said about hypothetically converting a composed component into a core Wasm module. At that point any guarantees about internal component structure would be destroyed, but the resulting core Wasm module still couldn't just decide to interact with the outside world arbitrarily; it can still only call the outside world via its imports and be called from the outside world via its exports.

view this post on Zulip Ralph (Dec 23 2023 at 11:53):

I think there's another thing here: the use of the word "secret" implies some wasm module that can't be understood. So far as I can tell, there's no such thing as an opaque module -- it's a spec, after all, so it can be examined internally easily enough.

view this post on Zulip Ralph (Dec 23 2023 at 11:55):

Where @James Mart says, its imports and exports appears only to do screen rendering, but in fact it also "secretly" makes HTTP requests to a server -- another way to say this is that the wasm runtime controls any interaction between the module inside the sandbox and the outside world. As a result, with a compliant wasm runtime, that "secret" http call will fail miserably because it's not a declared export it can call.

view this post on Zulip Ralph (Dec 23 2023 at 11:59):

I'm still, after this thread, not really clear about your objective, @James Mart, or is it merely curiosity? You say earlier can't necessarily "trust" the declared imports and exports tell the full story. That seems to imply you are not convinced that the inner code can't call the http endpoint secretly. And the answer to that is yes, you can trust them, because the runtime will not permit any outbound calls from any module without permitting it explicitly. Any such call will bonk. It doesn't matter whether you've parsed the actual wasm or merely the exports: without a) the export declared and b) the runtime permitting the use of that call, the network request will bonk.

view this post on Zulip Ralph (Dec 23 2023 at 12:00):

if you GIVE modules full OS-style permissions AND if they therefore invoke a host export http request function AND you didn't scan the module beforehand then you might be surprised to find the "secret" call will work.

view this post on Zulip Ralph (Dec 23 2023 at 12:02):

do I understand things correctly? This is, by the way, precisely the same guarantee that core wasm gives you: that you can't make calls across the sandbox boundary without the host runtime giving that permission, whether your exported http api is WIT or whether it's a custom declared api (or js bindings for that matter).

view this post on Zulip Ralph (Dec 23 2023 at 12:03):

the module or component cannot, of itself, make that outbound call actually work. Only the runtime can permit that, and only for the specific exported api it presents to the module to call.
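
The point about the runtime being the only party that can make an outbound call work can be sketched as a toy linker. This is a loose model under stated assumptions, not a real runtime API: `instantiate`, `LinkError`, `renderer`, `draw`, and `http_get` are all hypothetical names. In real wasm an undeclared import cannot even be referenced by the guest code; the `KeyError` below merely stands in for that.

```python
class LinkError(Exception):
    """Raised when the host does not provide a declared import."""


def instantiate(declared_imports, host_grants, body):
    # Instantiation fails up front if any declared import is unsatisfied.
    missing = [name for name in declared_imports if name not in host_grants]
    if missing:
        raise LinkError(f"unsatisfied imports: {missing}")
    # The guest receives only functions it declared AND the host granted.
    granted = {name: host_grants[name] for name in declared_imports}
    return lambda: body(granted)


def renderer(imports):
    imports["draw"]("hello")
    # A "secret" call to something that was never declared has nothing to bind to.
    try:
        imports["http_get"]("https://example.com")
    except KeyError:
        return "http_get unavailable"
    return "http_get succeeded"


drawn = []
host = {"draw": drawn.append, "http_get": lambda url: "response"}

# The component declares only "draw", even though the host could offer more.
run = instantiate(["draw"], host, renderer)
result = run()
print(result)  # http_get unavailable
```

Even though this toy host has an `http_get` function available, the guest never receives it, because it was not part of the declared imports.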

view this post on Zulip Ralph (Dec 23 2023 at 12:03):

Someone correct me here, because it's a fairly important set of points, if I'm misunderstanding something.

view this post on Zulip Ralph (Dec 23 2023 at 12:08):

Typically, this is the https://www.ibm.com/topics/log4j scenario, in which a dependency of a dependency used log4j without the JNDI fix, enabling an inner dependency to maliciously execute remote code inside an environment's secure boundary.

The Log4J vulnerability, also known as “Log4Shell,” is a critical vulnerability discovered in the Apache Log4J logging library in November 2021.

view this post on Zulip Ralph (Dec 23 2023 at 12:09):

this just won't work unless you give that component permission to do this.

view this post on Zulip Ralph (Dec 23 2023 at 12:09):

is that the kind of situation you're thinking about?

view this post on Zulip Lann Martin (Dec 23 2023 at 14:11):

@James Mart I think I understand your question better now. A more-concrete example may help:

Component A imports a "database" interface.

view this post on Zulip Lann Martin (Dec 23 2023 at 14:17):

All interaction with the "real world" must ultimately happen through the outermost component's imports and exports. You might not be able to tell that the composed components are interacting with a database (as in scenarios 2 and 3), but you can see all of the ways it is permitted to interact with the outside world.

view this post on Zulip Ralph (Dec 23 2023 at 14:21):

and more, even without that, you can examine the core wasm in or linked by the component itself and examine what each one tries to do. Right now that's hard but all the tools exist; I'm quite sure that we'll have this tooling very soon that will be mostly point-and-click.

view this post on Zulip Ralph (Dec 23 2023 at 14:22):

One of the very neat things about the component model is that you can accidentally ship log4j and without complete OS-style exports to use, absolutely nothing will happen.

view this post on Zulip Ralph (Dec 23 2023 at 14:24):

@Lann Martin, just to close your examples, in Scenario 3 you can only SEE the networking import, but without the host's database export support, component C's database usage will fail. Yes?

view this post on Zulip Ralph (Dec 23 2023 at 14:26):

you can't "see" it (without looking at the assembly codepath itself, which you can do) but the inner also can't USE it without the host's permission/export implementation.

view this post on Zulip Lann Martin (Dec 23 2023 at 14:29):

Ah, to clarify (can't edit :rolling_eyes:): in scenario 3, component C is an "adapter": its database export is used to fulfill component A's import.

view this post on Zulip Ralph (Dec 23 2023 at 14:43):

we do need Zulip to support Mermaid diagrams here, it would help

view this post on Zulip Ralph (Dec 23 2023 at 14:43):

:-)

view this post on Zulip Ralph (Dec 23 2023 at 14:45):

The two points are:

view this post on Zulip Ralph (Dec 23 2023 at 14:47):

  1. you can't interact with the outside world (external to the sandbox) without using the public imports and exports of the outermost component.
  2. This remains true regardless what the inner components or modules try to do.
  3. In all cases, you can examine the code paths of all modules in advance; wasm is a public specification.

view this post on Zulip Ralph (Dec 23 2023 at 14:48):

and 4. off-by-two errors

view this post on Zulip James Mart (Dec 23 2023 at 15:33):

Thanks for the replies, everyone.

@Ralph I know the internal structure can be examined with various tooling so nothing is ultimately "secret." But in terms of being able to trustlessly execute third-party components, I don't want to have to examine the bytecode of each to ensure they aren't misbehaving. That was the perspective I was coming from.

But I understand better now, thanks to all of your replies. What I'm realizing is that, even if you create a composition of multiple internal components, you can't embed IO (like an HTTP request) into a module in such a way that it hides that capability from the final component's imports (the way that you can embed a non-persistent in-memory database as in @Lann Martin's Scenario 2 above). Ultimately, IO is definitionally an interaction with the outside world and would therefore need to be imported by the final component.
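
For the "trustless execution" angle, the conclusion above suggests that a host could vet a third-party component by its declared imports alone, without reading bytecode. A minimal sketch, assuming the declared imports have already been extracted as a list of strings; the `audit` helper and the interface names `graphics/draw` and `net/http` are made up for illustration.

```python
def audit(declared_imports, allowed):
    """Return the declared imports that fall outside the host's policy."""
    return sorted(set(declared_imports) - set(allowed))


# Hypothetical policy: this host only wants to grant drawing capabilities.
policy = {"graphics/draw"}

honest = audit(["graphics/draw"], policy)
sneaky = audit(["graphics/draw", "net/http"], policy)

print(honest)  # []
print(sneaky)  # ['net/http']
```

A non-empty result would be the signal to refuse instantiation or to look more closely, since any IO the component performs has to show up in that import list.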

view this post on Zulip Ralph (Dec 23 2023 at 15:34):

Yup!

view this post on Zulip Notification Bot (Dec 23 2023 at 15:35):

James Mart has marked this topic as resolved.


Last updated: Oct 23 2024 at 20:03 UTC