Stream: wasm

Topic: Python guest runtime and bindings


view this post on Zulip Joel Dice (Feb 28 2023 at 21:04):

Like Calvin's topic, but for Python.
Please reply or DM me if you're interested in meeting and collaborating on Python guest component tooling.
If you already expressed interest in my earlier discussion of a Python guest binding generator in the #wit-bindgen stream, I'll assume you're interested in this, as well :)

view this post on Zulip Calvin Prewitt (Feb 28 2023 at 21:05):

Myself and @Daniel Macovei are interested in Python guest

view this post on Zulip Robin Brown (Feb 28 2023 at 21:08):

We don't currently have a Python Bytecode Alliance project for Componentizing Python. It'd be great for this group to talk about what approaches they're using and how to make an equivalent to Componentize-JS.

view this post on Zulip Joel Dice (Feb 28 2023 at 21:20):

Please post your availability here if you're planning to join us: https://www.when2meet.com/?19002194-nc5IB

view this post on Zulip Kevin Smith (Feb 28 2023 at 21:23):

Definitely interested.

view this post on Zulip Joel Dice (Mar 06 2023 at 15:43):

I went ahead and scheduled this for Friday at noon ET / 9am PT. We'll meet at https://meet.jit.si/PythonGuestComponents-2023-03-10
DM me your email address if you'd like an email invitation. See you then!


view this post on Zulip Joel Dice (Mar 10 2023 at 16:48):

Reminder: we're meeting at https://meet.jit.si/PythonGuestComponents-2023-03-10 in about 10 minutes. Agenda and notes here: https://hackmd.io/q5SWcHt1TWaYWcMtt-xI9g
See you soon!


view this post on Zulip Joel Dice (Mar 21 2023 at 14:33):

I'd like to schedule the next Python Component Tooling meeting for Thursday at 1pm ET (10am PT). Let me know if you'd like to attend but that time doesn't work, in which case I can reschedule. Agenda and notes here: https://hackmd.io/4l5OFAwISZuXl6MtdxOlEA
@Brett Cannon FYI


view this post on Zulip Brett Cannon (Mar 21 2023 at 19:09):

I can make it, but I will have to move a (not critical) meeting to do so. So we can keep the time if it works for folks, but I won't complain if it moves either. :wink:

view this post on Zulip Joel Dice (Mar 22 2023 at 17:15):

Folks who would like to attend: please indicate your availability here: https://www.when2meet.com/?19300681-boaTm. I'll move it if there's another time that works so Brett doesn't have to shuffle his schedule.

view this post on Zulip Joel Dice (Mar 23 2023 at 13:15):

Looks like everyone's available at 3pm ET (noon PT) -- let's meet then at https://meet.jit.si/PythonComponentTooling-2023-03-23


view this post on Zulip Joel Dice (Mar 23 2023 at 18:58):

:point_up: This is starting in a couple of minutes.

view this post on Zulip Kevin Smith (Mar 24 2023 at 14:32):

I forgot one part of the numpy / pandas work yesterday that was a pretty big missing piece. Numpy requires setjmp/longjmp and one extension in pandas is C++ and requires exceptions (specifically __cxa_allocate_exception/__cxa_throw). I just stubbed them out for now to get them to compile. I don't know exactly when they get triggered during use though. setjmp/longjmp has shown up in other work as well (Ruby, Lua) so we'll likely need a general solution for them at some point.

view this post on Zulip Kevin Smith (Mar 24 2023 at 19:14):

BTW, just for the last hour or so before I go on vacation, I thought I'd see what it might take to get SciPy compiled. This looks like a much bigger problem. Numpy could be compiled without the LAPACK/BLAS libraries, but it looks like SciPy requires them. And.... they need a FORTRAN compiler. I haven't found any information about compiling FORTRAN to WASM using anything but Emscripten.

view this post on Zulip Brett Cannon (Mar 24 2023 at 23:18):

Yep, we were once asked if we wanted to fund work to make a Fortran compiler work under WebAssembly :big_smile:

view this post on Zulip Joel Dice (Mar 24 2023 at 23:35):

I'd be curious to know whether Flang or LFortran are mature enough to meet SciPy's needs, and whether they can be made to target wasm32-wasi.

view this post on Zulip Joel Dice (Mar 24 2023 at 23:36):

Apparently LFortran can translate Fortran code to C++ code, which is interesting.

view this post on Zulip Joel Dice (Apr 04 2023 at 13:57):

Would 3pm ET (noon PT) on Thursday work for the next Python Component Tooling meeting? If so, let's meet then at https://meet.jit.si/PythonGuestComponents-2023-04-06.
@Asen Alexandrov FYI


view this post on Zulip Joel Dice (Apr 06 2023 at 18:15):

We'll meet at the above Jitsi room in ~45 minutes. Agenda and notes here: https://hackmd.io/kdXktZ8DQriSAvlm_YOfJw


view this post on Zulip Jamey Sharp (Apr 10 2023 at 20:07):

I've learned a few more things about how dynamic linking might work, and what steps we might take next, from discussion with Luke Wagner, Alex Crichton, and others.
In particular, I didn't understand that components can already contain multiple core wasm modules, with the component defining how to link together their imports and exports. Together with toolchain conventions, possibly matching what's in the existing Emscripten support for dynamic linking, something like wit-component or componentize-py could synthesize the right glue to make dlopen work.
Suggested next steps are to dig into those Emscripten conventions and also to understand how Pyodide uses Emscripten's conventions.

Conventions supporting interoperatibility between tools working with WebAssembly. - tool-conventions/DynamicLinking.md at main · WebAssembly/tool-conventions

view this post on Zulip Joel Dice (May 01 2023 at 16:02):

Shall we meet on Thursday at 3pm ET (noon PT)? If that works for everyone, we'll meet at https://meet.jit.si/PythonGuestComponents-2023-05-04


view this post on Zulip Joel Dice (May 01 2023 at 16:06):

Also, componentize-py is now feature-complete, i.e. it should be able to handle arbitrary WIT worlds. However, the generated bindings are not particularly ergonomic or idiomatic, so the big remaining TODO is to factor out wasmtime-py's type binding generator and reuse it. I'd also like some feedback on how to convert from Python exceptions to WIT results and vice-versa.

Contribute to dicej/componentize-py development by creating an account on GitHub.

view this post on Zulip Joel Dice (May 01 2023 at 16:08):

One more update: Jamey and I have a pretty solid plan for "dynamic" linking which I think will fit the Python ecosystem's needs very nicely. We should have an RFC up for feedback by Thursday.

view this post on Zulip Asen Alexandrov (May 04 2023 at 08:55):

By the way, I'm highlighting this stream in the "Future work" section of an article we'll publish today or tomorrow at WasmLabs. Let me know if you find this inappropriate and I will remove the reference - https://se2-bindings.wasm-labs.pages.dev/articles/wasm-host-to-python/#future-work

How to leverage Python and WebAssembly to securely extend your web application capabilities using Suborbital and Wasm Labs tooling and language runtimes.

view this post on Zulip Ralph (May 04 2023 at 14:31):

why would this be "inappropriate"? It's a good post about how you do this right now.

view this post on Zulip Asen Alexandrov (May 04 2023 at 14:41):

Ralph said:

why would this be "inappropriate"? It's a good post about how you do this right now.

By "inappropriate" I meant pointing people to this Zulip stream for further digging. I personally cannot think of a reason against this, but I prefer to ask before I put someone in the limelight.

view this post on Zulip Alex Crichton (May 04 2023 at 15:11):

Yeah no worries, this is a public Zulip instance and the intent is to have lots of folks take a look and discuss here, so no need to avoid linking it!

view this post on Zulip Joel Dice (May 04 2023 at 16:52):

We'll be meeting in a few minutes at https://meet.jit.si/PythonGuestComponents-2023-05-04. Agenda and notes here: https://hackmd.io/vJeeNh1KSvq1449O5OqL7w


view this post on Zulip Joel Dice (May 04 2023 at 17:01):

Oops, sorry, not in a few minutes -- I had it on my calendar wrong. It's two hours from now: noon PT / 3pm ET

view this post on Zulip Mossaka (Joe) (May 04 2023 at 17:10):

Could you please grant permission to the hackmd doc?

view this post on Zulip Joel Dice (May 04 2023 at 17:15):

Done; thanks for the reminder.

view this post on Zulip Kevin Smith (May 04 2023 at 20:07):

Joel Dice said:

We'll be meeting in a few minutes at https://meet.jit.si/PythonGuestComponents-2023-05-04. Agenda and notes here: https://hackmd.io/vJeeNh1KSvq1449O5OqL7w

I guess you shouldn't feel too bad about this, I missed the meeting because I had it in my calendar as 3pm Central...

view this post on Zulip Joel Dice (May 04 2023 at 20:21):

Sorry for the confusion. I'm going to start using https://zulip.com/help/format-your-message-using-markdown#global-times from now on.


view this post on Zulip Asen Alexandrov (May 04 2023 at 20:45):

@Joel Dice , @Brett Cannon , @Jamey Sharp I missed the meeting, but I can give you an empirical answer to this

does wasi-sdk expose a C preprocessor symbol or something with the SDK version so we can extract it while building cpython/wheels?

It does not. Was looking for this a month ago, but had to end up relying on the build script that sets up the SDK to also provide its version as a define/env_var where it was later needed for packaging. I only found the __clang_version__ among the defines, when building with wasi-sdk.

view this post on Zulip Joel Dice (May 05 2023 at 15:42):

Question for the Python experts: What's the most idiomatic way to generate Python bindings for the following WIT world?

world foo {
  import foo: interface {
    variant error {
      oops,
      oh-no(string),
      yikes
    }

    bar: func(n: u32) -> result<u32, error>
  }
}

I'm thinking something like what wasmtime-py currently generates, extended to make error usable as an exception which may be raised:

@dataclass
class ErrorOops:
    pass

@dataclass
class ErrorOhNo:
    value: str

@dataclass
class ErrorYikes:
    pass

@dataclass
class Error(Exception):
    value: Union[ErrorOops, ErrorOhNo, ErrorYikes]

# May raise `Error`
def bar(n: int) -> int: ...

The drawback of doing this is that Error doesn't appear in the type signature of bar due to https://peps.python.org/pep-0484/#exceptions, whereas it _would_ if we didn't try to "exception-ify" the result. I assume the above is more idiomatic than def bar(n: int) -> Result[int, Error], though.

view this post on Zulip Shannon Duncan (shadowcodex) (May 05 2023 at 17:22):

Not a python expert, and not sure I grasp the question 100%. But I typically use DocStrings to convey to users what kind of Exceptions a function could throw vs the statically typed throws x that Java provides. If I'm not mistaken the base exception class already accepts a string so ErrorOhNo would have that by default.

I structure my exceptions similar to the way airflow codebase does:

class Error(Exception):
    """Error Exception Type Binding"""
    pass

class ErrorOops(Error):
    pass

class ErrorOhNo(Error):
    pass

class ErrorYikes(Error):
    pass

def bar(n: int) -> int:
    """Function binding that may throw Exceptions [Error, ErrorOops, ErrorOhNo, ErrorYikes]"""
    ...

Then the type hints on bar would show the docstring.

Airflow Exception Definition File


view this post on Zulip Shannon Duncan (shadowcodex) (May 05 2023 at 17:29):

I do believe python users would rather do:

try:
    bar(123)
except ErrorOops:
    pass

than

result, error = bar(123)
if isinstance(error, ErrorOops):
    ...  # handle the exception case

view this post on Zulip Kevin Smith (May 05 2023 at 17:43):

Shannon Duncan (shadowcodex) said:

class Error(Exception):
    """Error Exception Type Binding"""
    pass

class ErrorOops(Error):
    pass

class ErrorOhNo(Error):
    pass

class ErrorYikes(Error):
    pass

def bar(n: int) -> int:
    """Function binding that may throw Exceptions [Error, ErrorOops, ErrorOhNo, ErrorYikes]"""
    ...

This feels right to me. I don't think making the exceptions as dataclasses would be required. Python exceptions take any number of arguments which are accessible through the args attribute already. Of course, there is no one-true documentation format for documenting the exceptions (I personally like the numpy docstring style, but there are others).
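
A minimal sketch of the hierarchy quoted above, documented with a numpy-style `Raises` section. The class names come from the thread; the body of `bar` is a hypothetical stand-in so the example runs, since a real binding would call into the component instead.

```python
class Error(Exception):
    """Base class for every case of the WIT `error` variant."""


class ErrorOops(Error):
    """The `oops` case (no payload)."""


class ErrorOhNo(Error):
    """The `oh-no(string)` case; the payload is available as `args[0]`."""


class ErrorYikes(Error):
    """The `yikes` case (no payload)."""


def bar(n: int) -> int:
    """Hypothetical stand-in for the generated binding.

    Parameters
    ----------
    n : int
        Value passed to the imported `bar` function.

    Returns
    -------
    int
        The `ok` payload of the WIT result.

    Raises
    ------
    ErrorOops, ErrorOhNo, ErrorYikes
        One subclass of `Error` per case of the WIT `error` variant.
    """
    # Placeholder logic (an assumption for illustration only).
    if n == 0:
        raise ErrorOhNo("n must be non-zero")
    return n * 2


try:
    bar(0)
except Error as exc:  # the base class catches every case
    caught = exc
```

Catching `Error` handles all three cases at once, while `except ErrorOhNo` narrows to a single case; the payload of `oh-no` is reachable through `caught.args[0]` without any dataclass machinery.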

view this post on Zulip Joel Dice (May 05 2023 at 17:53):

The reason I was using @dataclass is that wasmtime-py currently uses it when generating Python types from WIT variants, records, etc. and I'm trying to be consistent with that approach. Ideally we'd have some static type checking to verify that ErrorOhNo has a payload but ErrorOops does not, for example, and your IDE could warn you if you mixed them up.

view this post on Zulip Joel Dice (May 05 2023 at 17:55):

Likewise, wasmtime-py uses Union when generating code for WIT variants and I'm trying to be consistent with that. Could be that wasmtime-py's generation could use improvement, though, so I'm certainly open to that. I have basically zero Python experience, so I'll defer to just about anybody on this :)

view this post on Zulip Shannon Duncan (shadowcodex) (May 05 2023 at 17:58):

I'm hoping to dig into wasmtime-py this weekend. I'll try and keep an eye out for that and see if there are some improvements we can recommend. My plan is over the next few weeks to ramp up and start contributing to wasmtime-py.

view this post on Zulip Kevin Smith (May 05 2023 at 18:11):

variants and records are more formal data structures, so to me it makes sense for them to be dataclasses. Exceptions in Python are typically less formal. It seems like in most cases, they just get a message passed to them. I do wonder if making an Exception a dataclass would break any of the existing Exception class behaviors. I don't know off-hand. We may have to defer to @Brett Cannon on that one.

view this post on Zulip Joel Dice (May 05 2023 at 18:16):

Yeah, where it gets interesting is when a variant is used as the err case of a result _and_ elsewhere, e.g. as a parameter to a function or a field in a record. So it may not _only_ be used as an exception.

view this post on Zulip Joel Dice (May 05 2023 at 18:20):

I see that wasmtime-py defines this:

T = TypeVar('T')
@dataclass
class Ok(Generic[T]):
    value: T
E = TypeVar('E')
@dataclass
class Err(Generic[E]):
    value: E

Result = Union[Ok[T], Err[E]]

Perhaps only Err needs to extend Exception, in which case we don't need any variant or its cases to extend it. I.e. you can raise Err(ErrorOhNo("trouble")) but not raise ErrorOhNo("trouble").
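
One way this idea might look, as a sketch only: `Err` is the single class that extends `Exception`, and the variant cases stay plain dataclasses. The body of `bar` and the `describe` helper are hypothetical, added purely to make the example runnable; this is not what wasmtime-py or componentize-py actually generates.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass
class Ok(Generic[T]):
    value: T


@dataclass
class Err(Exception, Generic[E]):
    # Only the wrapper is an exception; the payload can be any type.
    value: E


@dataclass
class ErrorOops:
    pass


@dataclass
class ErrorOhNo:
    value: str


@dataclass
class ErrorYikes:
    pass


Error = Union[ErrorOops, ErrorOhNo, ErrorYikes]


def bar(n: int) -> int:
    """May raise `Err` wrapping one of the `Error` cases (placeholder logic)."""
    if n == 0:
        raise Err(ErrorOhNo("trouble"))
    return n + 1


def describe(n: int) -> str:
    """Hypothetical caller that inspects the payload of a raised `Err`."""
    try:
        return f"ok: {bar(n)}"
    except Err as err:
        payload = err.value
        if isinstance(payload, ErrorOhNo):
            return f"oh-no: {payload.value}"
        return type(payload).__name__
```

Because `ErrorOhNo` does not derive from `BaseException`, Python rejects `raise ErrorOhNo("trouble")` with a `TypeError`, which matches the restriction described above. The same dataclasses remain usable as ordinary values elsewhere, for example as function parameters or record fields.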

view this post on Zulip Kevin Smith (May 05 2023 at 20:18):

I played around a bit with dataclass and Exception. It doesn't appear to have any obvious ill effects. You still get an args attribute with the values. The repr value is slightly different because dataclass adds the names of the fields, but I don't see that as a problem. You might want to add frozen=True to the dataclass call so that the fields can't be written to. If you write to the value field, the args attribute no longer matches.
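
The experiment described above can be reproduced with a few lines. This is a sketch under the assumption of a single hypothetical `OhNo` class with one `value` field; it shows the non-frozen case only.

```python
from dataclasses import dataclass


@dataclass
class OhNo(Exception):
    value: str


exc = OhNo("trouble")
initial_args = exc.args      # filled in from the positional constructor argument
initial_repr = repr(exc)     # dataclass-generated repr, includes the field name

exc.value = "changed"        # allowed because the dataclass is not frozen
args_after_write = exc.args  # still holds the original constructor argument
```

After the write, `exc.value` and `exc.args` disagree, which is the mismatch that motivates the `frozen=True` suggestion.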

view this post on Zulip Brett Cannon (May 09 2023 at 22:47):

I don't think I have ever seen an exception written as a dataclass. Exceptions can have attributes on them, but for the common case where extra data isn't useful in code itself, the inheritance hierarchy conveys the important information and you provide a human-readable message.

And returning a Result type isn't done in Python; you raise exceptions as necessary and can document what exceptions you explicitly raise.

I'm also not sure what information you're trying to convey with your error variant. Am I to view each individual variant as a bit of information for a larger error type, or each their own type of error that are grouped together for typing convenience? My brain reads it as the latter, so I would assume it would be something more like:

class Oops(Exception): pass

class OhNo(Exception):
    def __init__(self, message, value):
        self.value = value  # I don't see a name for the parameter to the `oh-no` variant.
        super().__init__(message)

class Yikes(Exception): pass

Now you could have an error base class that they all inherit from:

class Error(Exception): pass

class Oops(Error): pass

class OhNo(Error):
    ...

class Yikes(Error): pass

Having a common exception class for an overall API that has multiple, custom exceptions is common.

view this post on Zulip Joel Dice (May 09 2023 at 23:54):

Thanks for the input, everybody. In case it wasn't clear: componentize-py needs to be able to generate Python bindings for arbitrary WIT files, and those WIT files aren't necessarily designed with Python (or any specific programming language) in mind. So when it gets something like this:

world foo {
  import foo: interface {
    variant error {
      oops,
      oh-no(string),
      yikes
    }
    record foo {
      x: u32,
      what: error
    }

    bar: func(n: u32) -> result<u32, error>
    baz: func(e: error) -> foo
  }
}

... it needs to do the best it can. What information is the error variant trying to convey? Who knows? Imagine somebody else wrote it and we have no idea what they were trying to convey. That's componentize-py's perspective -- it gets WIT someone else wrote and generates Python bindings for it. So the question is: what's the most idiomatic Python code it can generate from WIT files which may be entirely un-Pythonic (e.g. using variants, records, u32s, or who knows what to represent errors)?

view this post on Zulip Kevin Smith (May 10 2023 at 17:26):

I still feel like @Shannon Duncan (shadowcodex) 's overall exception hierarchy is the correct way, but maybe just add the dataclass features like you originally had to add formal definitions of the payload. As Brett mentioned, I've never seen exceptions as dataclasses before either, but they might be needed to make this work on both sides. While it's a little sketchy, maybe add an args property to the base exception class as well to return a tuple that contains the data value. That way e.args and e.value won't get out of sync.
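
A sketch of the `args` property idea, assuming the error cases are dataclasses that share a hypothetical `Error` base class. Deriving `args` from the dataclass fields with `dataclasses.astuple` is one possible choice, not something the thread settled on.

```python
from dataclasses import astuple, dataclass


class Error(Exception):
    """Hypothetical base class for generated error cases."""

    @property
    def args(self):
        # Derive `args` from the dataclass fields of the concrete subclass,
        # so it always reflects the current field values.
        return astuple(self)


@dataclass
class ErrorOops(Error):
    pass


@dataclass
class ErrorOhNo(Error):
    value: str


exc = ErrorOhNo("trouble")
exc.value = "changed"
current_args = exc.args
```

With this override, `exc.args` follows later writes to `exc.value`. One caveat: in CPython, `str(exc)` appears to read the arguments stored at construction time rather than going through the property, so the two can still differ after a write.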

view this post on Zulip Simon Willison (May 10 2023 at 17:40):

I'm really interested in solving the "run untrusted Python code in a WASM sandbox inside my Python programs" problem. I wrote up some notes on what I'm looking to solve here: https://gist.github.com/simonw/b9a1f080714785b7ee16c7d04db12210

Short version: I want to be able to say "result = execute_untrusted_python(untrusted_code_string, memory_limit_in_bytes=8196, time_limit_in_seconds=1.0)" and get back the result of executing that code in a safe sandbox, with enforced memory and time limits.


view this post on Zulip Shannon Duncan (shadowcodex) (May 10 2023 at 17:41):

@Kevin Smith @Brett Cannon @Joel Dice

Is the real challenge here that we only know it’s an exception cause the wit says error? But if it says e or problem or some other random word we would skip the exception stuff altogether?

To me it isn’t obvious yet how from a WIT we could generate any exception classes. We only know this case cause of how the variants are spelled, in future edge case they could label their error variant as X or something.

Does WIT have any formal way of handling error/exceptions/etc?

view this post on Zulip Joel Dice (May 10 2023 at 17:46):

@Shannon Duncan (shadowcodex) the name is not relevant -- it's the fact that the type is used as the second type argument to result. I.e. any time we have result<T, E>, where T and E are types, we need to treat E as an "error" type.

view this post on Zulip Joel Dice (May 10 2023 at 17:50):

So yes, WIT's formal way of representing failures is using result, similar to how Rust and ML-style languages do it.
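
To make the `result<T, E>` rule concrete, here is a sketch of how a binding layer might convert in both directions. `WitError`, `raising`, `returning_result`, and `raw_bar` are all hypothetical names for illustration; only `Ok`, `Err`, and `Result` mirror the wasmtime-py definitions quoted earlier in the thread.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass
class Ok(Generic[T]):
    value: T


@dataclass
class Err(Generic[E]):
    value: E


Result = Union[Ok[T], Err[E]]


class WitError(Exception):
    """Hypothetical exception carrying the `err` payload of a WIT result."""

    def __init__(self, payload):
        super().__init__(payload)
        self.payload = payload


def raising(func):
    """Import side: turn a Result-returning call into one that raises."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if isinstance(result, Err):
            raise WitError(result.value)
        return result.value
    return wrapper


def returning_result(func):
    """Export side: turn a raised WitError back into an Err value."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return Ok(func(*args, **kwargs))
        except WitError as exc:
            return Err(exc.payload)
    return wrapper


def raw_bar(n: int) -> Result[int, str]:
    # Placeholder for a low-level call that returns a WIT-style result.
    return Err("oops") if n == 0 else Ok(n + 1)


bar = raising(raw_bar)
round_trip = returning_result(bar)
```

The decision to raise depends only on whether the value arrived in the `err` position, never on the name of the type, which is the point made above.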

view this post on Zulip Kevin Smith (May 10 2023 at 17:54):

Simon Willison said:

Short version: I want to be able to say "result = execute_untrusted_python(untrusted_code_string, memory_limit_in_bytes=8196, time_limit_in_seconds=1.0)" and get back the result of executing that code in a safe sandbox, with enforced memory and time limits.

You can do this to some extent now, but there are limitations. When you are submitting code to the WASM Python instance, you are running a completely separate Python instance than the original interpreter (including a completely separate standard library and installed packages). Using a package like wasmtime-py will allow you to run a python.wasm file inside your Python interpreter and it will be completely sandboxed (although you can allow file system access to specific directories if you wish). You will need to write one export function to execute the submitted Python code and return the result. There is an example very similar to this in udf_impl.c at https://github.com/singlestore-labs/python-wasi/tree/main/udf. I'd have to double-check the wasmtime-py API, but I'm pretty sure you can set memory limits. Timeouts would likely have to be done using async or threads in your application.


view this post on Zulip Kevin Smith (May 10 2023 at 17:55):

@Shannon Duncan (shadowcodex) That is a good point. It may be that wasmtime-py's way of doing this is best we can do.

view this post on Zulip Simon Willison (May 10 2023 at 17:56):

Completely separate Python instance is exactly what I'm after - I want it to have access to the Python standard library, but I don't need it to have access to any of my other code other than what I pass into it

view this post on Zulip Simon Willison (May 10 2023 at 17:57):

The problem I've been having with this is that I don't know very much C at all, so I've been hoping to stumble across an example that does exactly what I'm looking for - I'm confident I'm far from the only person who wants to solve this problem, "python in a sandbox" is a thing that's been wanted by the wider Python community for decades

view this post on Zulip Kevin Smith (May 10 2023 at 17:59):

@Simon Willison The UDF example I pointed to is pretty much what you want, but it does take some work to put the pieces together. Although, if you build python.wasm in that parent project, then run build.sh in the udf directory, you're pretty close to having it.

view this post on Zulip Simon Willison (May 10 2023 at 18:00):

It's frustrating because I'm 100% sure this is possible using existing Python WASM runtimes and the python.wasm build from https://github.com/vmware-labs/webassembly-language-runtimes/releases/tag/python%2F3.11.3%2B20230428-7d1b259 - but actually figuring out how to do it has mostly defeated me, bar this example here which uses a tmp filesystem in a way I'd rather avoid: https://til.simonwillison.net/webassembly/python-in-a-wasm-sandbox


view this post on Zulip Joel Dice (May 10 2023 at 18:00):

@Simon Willison wasmtime-py + componentize-py should do what you need and not require writing any C or Rust code. You would need to write a bit of WIT to represent the interface the host uses to talk to the guest running in the sandbox, but otherwise it would be pure Python on both sides.

view this post on Zulip Simon Willison (May 10 2023 at 18:01):

This https://github.com/dicej/componentize-py ? interesting, hadn't seen that one


view this post on Zulip Joel Dice (May 10 2023 at 18:02):

Yes, it's quite new and still under development, but it works.

view this post on Zulip Joel Dice (May 10 2023 at 18:03):

My next goal is to publish artifacts to pypi so you can pip install it.

view this post on Zulip Simon Willison (May 10 2023 at 18:03):

I have a strong hunch that there is massive, pent-up demand for an easy way to safely run untrusted Python and JavaScript code using wasmtime-py / wasmer-python / etc, and the first project to release a "pip install" package that can do this (and hide all of the WASM / WIT / etc details) will find themselves with a massively popular project

view this post on Zulip Joel Dice (May 10 2023 at 18:04):

The tricky bit would be hiding the WIT details. We'd need some way to generate WIT from Python code, I guess.

view this post on Zulip Joel Dice (May 10 2023 at 18:04):

(which can't be done in the general case, but could be done for a subset of cases)

view this post on Zulip Simon Willison (May 10 2023 at 18:05):

it's frustrating because it feels like this should be one of the most obvious and useful applications of WASM, but it's way too hard to figure out how to do it right now

I would hope I don't need to learn WIT - I only want one function exposed to me, "run_python_code_in_sandbox_and_return_stringified_result(untrusted_string_of_python_code)" - basically I want a safe eval() alternative

view this post on Zulip Shannon Duncan (shadowcodex) (May 10 2023 at 18:07):

Joel Dice said:

Shannon Duncan (shadowcodex) the name is not relevant -- it's the fact that the type is used as the second type argument to result. I.e. any time we have result<T, E>, where T and E are types, we need to treat E as an "error" type.

Thanks Joel, I'm learning :smile: more and more! If that's the case I believe E should be of type Exception. Saw some discussion on some forum somewhere about adding dataclass attribute to Exception but I think that affects the __str__ dundermethod.

view this post on Zulip Joel Dice (May 10 2023 at 18:08):

@Simon Willison I agree it sounds great. I think the main thing missing is a sort of reverse binding generator which, instead of generating Python from WIT, generates WIT from (a subset of) Python. Not a trivial project, but doable.

view this post on Zulip Shannon Duncan (shadowcodex) (May 10 2023 at 18:09):

Joel Dice said:

The tricky bit would be hiding the WIT details. We'd need some way to generate WIT from Python code, I guess.

Wonder if function decorators could help solve this.

view this post on Zulip Simon Willison (May 10 2023 at 18:09):

I think I want something much simpler than that - literally a version of eval() that I can call where the arbitrary code I pass to it is evaluated in a WASM sandbox

view this post on Zulip Joel Dice (May 10 2023 at 18:10):

Right; I mean that the Python->WIT thing would happen under the hood and not be exposed to the app developer.

view this post on Zulip Simon Willison (May 10 2023 at 18:10):

That would solve the problem I have today - I'd be happy to adopt some brilliant future solution that lets me use function decorators and generates WIT and suchlike, but honestly I just want to run eval("3 * 5") and get back 15 safe in the knowledge that untrusted code can't break my application or take over my computer

view this post on Zulip Joel Dice (May 10 2023 at 18:11):

oh, actually I see what you're saying now -- we just want to pass a string of Python code to the sandboxed interpreter and have it eval'd there. Yeah, that wouldn't need any WIT stuff.

view this post on Zulip Joel Dice (May 10 2023 at 18:15):

So we could use componentize-py today to generate a component with a simple, general-purpose interface, e.g. func eval(code: string) -> result<string, string>. You'd need to deserialize (unpickle?) the result according to the expected Python type, I guess.

view this post on Zulip Simon Willison (May 10 2023 at 18:20):

Yup, that would solve my problem perfectly - I'm completely fine rolling my own serialization/deserialization stuff on top of that

view this post on Zulip Simon Willison (May 10 2023 at 18:20):

I'd probably use JSON for that to avoid any security concerns involving pickle
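
The string-in, string-out contract discussed above, with JSON as the wire format, could look roughly like this. It is a sketch of the protocol only: `guest_eval` and `host_eval` are hypothetical names, and `guest_eval` runs in the same process with NO sandboxing. In the design under discussion it would run inside a component behind `eval: func(code: string) -> result<string, string>`.

```python
import json
from typing import Any, Tuple


def guest_eval(code: str) -> Tuple[bool, str]:
    """Stand-in for the guest export.

    Returns (True, json_text) for the `ok` case and (False, message) for
    the `err` case. WARNING: this calls eval() in-process and provides no
    isolation whatsoever; it only illustrates the data contract.
    """
    try:
        value = eval(code, {})
        return True, json.dumps(value)
    except Exception as exc:
        # Covers both failures in the evaluated code and values that
        # JSON cannot represent.
        return False, f"{type(exc).__name__}: {exc}"


def host_eval(code: str) -> Any:
    """Host-side wrapper: decode the `ok` payload or raise on `err`."""
    ok, payload = guest_eval(code)
    if not ok:
        raise RuntimeError(payload)
    return json.loads(payload)
```

For example, `host_eval("3 * 5")` returns `15`. Using JSON means only JSON-representable results cross the boundary, so a set or an arbitrary object comes back as an error rather than being unpickled on the host.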

view this post on Zulip Joel Dice (May 10 2023 at 18:35):

If I have some time this week, I'll put together a proof-of-concept for this and report back. I haven't actually used wasmtime-py yet, so this is a good excuse to try it out.

view this post on Zulip Simon Willison (May 10 2023 at 18:37):

that would be amazing! Can't wait to see what you come up with

view this post on Zulip Milan (May 10 2023 at 20:22):

Another potential stepping stone on the way to componentize-py as the sandboxing mechanism for running python could be using a runtime like Deno with pyodide. I submitted a trivial patch to pyodide so that you can run pyodide in Deno via the npm compatibility layer. That way you can use Deno to sandbox the io and WASM to sandbox the python runtime. Still needs some docs but added some examples to the pyodide issue on Deno support.


view this post on Zulip Simon Willison (May 10 2023 at 20:26):

// example.ts
import pyodideModule from "npm:pyodide/pyodide.js";
const { loadPyodide } = pyodideModule;
const pyodide = await loadPyodide();
const result = await pyodide.runPythonAsync(`
3+4
`);
console.log("result:", result.toString());

Yeah that's exactly what I want to be able to do - I'd love to be able to do that in Python, not just in JavaScript

view this post on Zulip Simon Willison (May 10 2023 at 20:36):

Although I realize that the catch with Pyodide is that it doesn't provide an easy way to restrict memory usage - I guess because that's protection that browsers already provide. For server-side code I want the ability to restrict to a specific number of MBs of available memory for the untrusted code to operate in

view this post on Zulip Shannon Duncan (shadowcodex) (May 10 2023 at 21:20):

Simon Willison said:

Although I realize that the catch with Pyodide is that it doesn't provide an easy way to restrict memory usage - I guess because that's protection that browsers already provide. For server-side code I want the ability to restrict to a specific number of MBs of available memory for the untrusted code to operate in

Yeah that has to be provided by the runtime. Browsers vs wasmtime.

view this post on Zulip Simon Willison (May 10 2023 at 21:43):

Wrote up an experiment I did running Pyodide inside Deno inside a Python subprocess: https://til.simonwillison.net/deno/pyodide-sandbox
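
The host side of a subprocess-based approach can enforce a time limit with the standard library alone. The sketch below is an assumption-laden illustration, not the code from the linked write-up: `run_child` is a hypothetical helper, and the child is a plain Python interpreter used as a placeholder, so it is NOT sandboxed. In the experiment described above the child would be Deno running Pyodide.

```python
import subprocess
import sys


def run_child(code: str, time_limit_in_seconds: float = 1.0) -> str:
    """Run `code` in a child process and return its standard output.

    Raises subprocess.TimeoutExpired if the child exceeds the time limit
    (subprocess.run kills the child in that case) and RuntimeError if the
    child exits with a non-zero status.
    """
    completed = subprocess.run(
        [sys.executable, "-c", code],  # placeholder command, not a sandbox
        capture_output=True,
        text=True,
        timeout=time_limit_in_seconds,
    )
    if completed.returncode != 0:
        raise RuntimeError(completed.stderr.strip())
    return completed.stdout.strip()
```

A process boundary makes the time limit easy to enforce, since a runaway child can simply be killed. Memory limits would still have to come from whatever runtime executes the untrusted code.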

I continue to seek a solution to the Python sandbox problem. I want to run an untrusted piece of Python code in a sandbox, with limits on memory and time. Previous attempt: [Run Python code in a WebA
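
A minimal sketch of the time-limit half of that problem, assuming the sandbox is launched as a child process. `run_with_timeout` is a hypothetical helper, and the child used here is just the current Python interpreter standing in for whatever sandboxed command (for example a Deno invocation) would really be run. Memory limits are not covered; as noted above, those have to come from the runtime.

```python
import subprocess
import sys


def run_with_timeout(command, source, seconds):
    """Feed `source` to `command` on stdin and give up after `seconds`."""
    try:
        proc = subprocess.run(
            command,
            input=source,
            capture_output=True,
            text=True,
            timeout=seconds,  # subprocess.run kills the child if this expires
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "output": ""}
    status = "ok" if proc.returncode == 0 else "error"
    return {"status": status, "output": proc.stdout}


# Stand-in for the real sandbox command: "python -" reads a program from stdin.
STAND_IN = [sys.executable, "-"]

fast = run_with_timeout(STAND_IN, "print(3 + 4)", seconds=10)
slow = run_with_timeout(STAND_IN, "while True: pass", seconds=1)
print(fast["status"], slow["status"])
```

The timeout only bounds wall-clock time for the whole child process; it says nothing about what the child may touch while it runs, which is the part the Wasm and Deno layers are meant to handle.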

view this post on Zulip Brett Cannon (May 10 2023 at 22:34):

Shannon Duncan (shadowcodex) said:

Joel Dice said:

Shannon Duncan (shadowcodex) the name is not relevant -- it's the fact that the type is used as the second type argument to result. I.e. any time we have result<T, E>, where T and E are types, we need to treat E as an "error" type.

Thanks Joel, I'm learning :smile: more and more! If that's the case I believe E should be of type Exception. Saw some discussion on some forum somewhere about adding the dataclass attribute to Exception, but I think that affects the __str__ dunder method.

You must inherit from Exception if you are going to raise an exception. And as I said, I have never seen an exception class be a dataclass, so you're in uncharted territory in terms of compatibility.

view this post on Zulip Brett Cannon (May 10 2023 at 22:36):

Kevin Smith said:

I still feel like Shannon Duncan (shadowcodex) 's overall exception hierarchy is the correct way, but maybe just add the dataclass features like you originally had to add formal definitions of the payload. As Brett mentioned, I've never seen exceptions as dataclasses before either, but they might be needed to make this work on both sides. While it's a little sketchy, maybe add an args property to the base exception class as well to return a tuple that contains the data value. That way e.args and e.value won't get out of sync.

Every Python exception already has an args attribute thanks to Exception:
```python
>>> try:
...     raise RuntimeError("I have an args")
... except RuntimeError as exc:
...     print(dir(exc))
...     print(exc.args)
...
['__cause__', '__class__', '__context__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__suppress_context__', '__traceback__', 'add_note', 'args', 'with_traceback']
('I have an args',)
```

And you can pass an arbitrary number of arguments:

```python
>>> try:
...     raise RuntimeError("I have an args", "so many args")
... except RuntimeError as exc:
...     print(dir(exc))
...     print(exc.args)
...
['__cause__', '__class__', '__context__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__suppress_context__', '__traceback__', 'add_note', 'args', 'with_traceback']
('I have an args', 'so many args')
```

view this post on Zulip Joel Dice (May 10 2023 at 22:58):

Here's what I ended up doing in componentize-py (happy to change it if there's a better option that works in all cases):
I started with how wasmtime-py currently represents result:

    from dataclasses import dataclass
    from typing import Generic, TypeVar, Union

    T = TypeVar('T')
    @dataclass
    class Ok(Generic[T]):
        value: T
    E = TypeVar('E')
    @dataclass
    class Err(Generic[E]):
        value: E

    Result = Union[Ok[T], Err[E]]

And made the smallest change that could possibly work, which is to make Err extend Exception:

    from dataclasses import dataclass
    from typing import Generic, TypeVar, Union

    T = TypeVar('T')
    @dataclass
    class Ok(Generic[T]):
        value: T
    E = TypeVar('E')
    @dataclass
    class Err(Generic[E], Exception):
        value: E

    Result = Union[Ok[T], Err[E]]

That means the E type need not extend Exception. So values of type E can't, in general, be raised, but values of type Err[E] can. For exports, componentize-py will catch Errs and turn them into results to pass back to the host. For imports, if the host returns an error, it will be wrapped in an Err and raised.
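
A small sketch of how that might look from the guest's point of view. The `Ok`/`Err` definitions mirror the ones above; `parse_port` and `call_export` are hypothetical names invented for illustration, with `call_export` standing in for the glue that catches an `Err` and hands it back as a result value. It is not the actual componentize-py implementation.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass
class Ok(Generic[T]):
    value: T


@dataclass
class Err(Generic[E], Exception):
    value: E


Result = Union[Ok[T], Err[E]]


def parse_port(text: str) -> int:
    """Hypothetical export whose WIT signature would return result<u16, string>."""
    if not text.isdigit():
        # The payload is a plain str; only the Err wrapper is an exception.
        raise Err(f"not a number: {text!r}")
    return int(text)


def call_export(func, *args):
    """Hypothetical stand-in for the glue: turn a raised Err into a result value."""
    try:
        return Ok(func(*args))
    except Err as err:
        return err


ok = call_export(parse_port, "8080")
bad = call_export(parse_port, "http")
print(ok, bad.value)
```

Because only `Err` extends `Exception`, the payload type `E` can be any record, variant, or primitive without having to fit into Python's exception hierarchy.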

view this post on Zulip Joel Dice (May 11 2023 at 13:50):

@Brett Cannon (or anyone else who knows): Is it possible to build CPython 3.11.x for WASI on Windows without resorting to WSL? I was up until 1AM last night trying everything I could think of (MSYS2, Cygwin, various flavors of Visual Studio) but never got it working. Building the bootstrap python.exe was trouble-free using build.bat, and I was able to use the configure script to configure for wasm32-unknown-wasi, but was never able to make without either compiler or missing file errors.
For context, I'm setting up CI for componentize-py. Worst case, I can just build using Linux, publish the result, and use it on Windows.

view this post on Zulip Notification Bot (May 11 2023 at 13:50):

A message was moved here from #wasm > Interpreted Language Guests by Joel Dice.

view this post on Zulip Brett Cannon (May 11 2023 at 18:27):

@Joel Dice I have never tried to do a cross-compile under Windows, so I have no clue what would be involved (I have always done it via Linux in CI or WSL; see https://github.com/brettcannon/cpython-wasi-build for how I have currently automated it)

Unofficial WASI builds of CPython. Contribute to brettcannon/cpython-wasi-build development by creating an account on GitHub.

view this post on Zulip Joel Dice (May 11 2023 at 18:29):

Ok, no worries. If WSL is required on Windows, that's fine -- just wanted to make sure I wasn't missing anything.

view this post on Zulip Ralph (May 11 2023 at 19:31):

people build things on windows?

view this post on Zulip Ralph (May 11 2023 at 19:31):

that's news to me

view this post on Zulip Joel Dice (May 11 2023 at 22:28):

In case anyone wants to try componentize-py out, there are now pre-built binaries available: https://github.com/dicej/componentize-py/releases/tag/canary

 $ curl -Ls https://github.com/dicej/componentize-py/releases/download/canary/componentize-py-canary-macos-aarch64.tar.gz|tar xz
 $ ./componentize-py --help
A utility to convert Python apps into Wasm components

Usage: componentize-py [OPTIONS] <COMMAND>

Commands:
  componentize  Generate a component from the specified Python app and its dependencies
  bindings      Generate Python bindings for the world and write them to the specified directory
  help          Print this message or the help of the given subcommand(s)

Options:
  -d, --wit-path <WIT_PATH>  File or directory containing WIT document(s) [default: wit]
  -w, --world <WORLD>        Name of world to target (or default world if `None`)
  -q, --quiet                Disable non-error output
  -h, --help                 Print help
  -V, --version              Print version
Contribute to dicej/componentize-py development by creating an account on GitHub.

view this post on Zulip Emile Fugulin (May 12 2023 at 13:46):

@Joel Dice I will give it a shot, I am trying to find a way to run arbitrary python code from our rust code. I read https://wasmlabs.dev/articles/wasm-host-to-python/ and ended up here. Does componentize need to have the python code when building or it would be possible to load the code at runtime?

How to leverage Python and WebAssembly to securely extend your web application capabilities using Suborbital and Wasm Labs tooling and language runtimes.

view this post on Zulip Joel Dice (May 12 2023 at 14:19):

Currently it wants the Python code while building, although you can always inject code using eval at runtime.
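
A rough sketch of that idea, assuming a hypothetical exported function named `evaluate`: the component is built around a tiny evaluator, and the real code arrives as a string at runtime. The function name and the string-in/string-out shape are illustrative assumptions, not part of componentize-py.

```python
def evaluate(expression: str) -> str:
    """Hypothetical export: evaluate a Python expression supplied at runtime."""
    try:
        # A fresh globals dict keeps names from leaking between calls.
        # This is not a security boundary by itself; the Wasm sandbox is.
        return repr(eval(expression, {}))
    except Exception as exc:
        return f"error: {type(exc).__name__}: {exc}"


print(evaluate("3 + 4"))
print(evaluate("1 / 0"))
```

`eval` only handles expressions; a variant built on `exec` would be needed to accept statements such as function definitions.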

view this post on Zulip Joel Dice (May 12 2023 at 14:21):

BTW, I'm planning to make componentize-py usable as a Python library, hopefully next week. Then you'll be able to pip install it and write code that generates and runs (via wasmtime-py) components on-the-fly.

view this post on Zulip Joel Dice (May 12 2023 at 16:06):

@Simon Willison I spent some time this morning creating a wasmtime-py/componentize-py demo per our earlier conversation: https://github.com/dicej/component-sandbox-demo. Unfortunately, it doesn't actually work yet, since wasmtime does not yet have a built-in WASI Preview 2 implementation (work in progress: https://github.com/bytecodealliance/wasmtime/issues/6370). There's also a bug where the binding generator sometimes uses Python keywords as identifiers, but that should be easy to fix.

Contribute to dicej/component-sandbox-demo development by creating an account on GitHub.
We have been working on a prototype of what WASI Preview 2 support will look like in Wasmtime for 7 months now! https://github.com/bytecodealliance/preview2-prototyping/ The work is not yet totally...

view this post on Zulip Joel Dice (May 12 2023 at 17:00):

Per @Ryan Levick (rylev) 's suggestion, I'm going to try adding an option to componentize-py to replace all the WASI imports with trapping stubs and see how far that gets us. Longer-term, we'll want a general-purpose "virtual WASI" component which provides e.g. a virtual, in-memory filesystem, etc. for this kind of application.

view this post on Zulip Joel Dice (May 12 2023 at 21:00):

I've added a --stub-wasi option to componentize-py, and have updated the above demo, which now works.

view this post on Zulip Ralph (May 15 2023 at 14:35):

@Joel Dice and all, I'd love to get your thoughts on how your work here might integrate with https://github.com/microsoft/vscode-wasm, and what the benefits of that would be.

A WASI implementation that uses VS Code's extension host as the implementing API - GitHub - microsoft/vscode-wasm: A WASI implementation that uses VS Code's extension host as the implementi...

view this post on Zulip Ralph (May 15 2023 at 14:35):

open ended question. I haven't had the chance to think deeply about it yet myself, so.... just throwing that out there.

view this post on Zulip Joel Dice (May 15 2023 at 16:09):

@Ralph I'll confess I don't know much about VSCode, and I can't quite tell what that project is for. Does it support hosting WASI Preview 2 components, or just WASI Preview 1 modules? If the former, then componentize-py could certainly integrate nicely with it.

view this post on Zulip Ralph (May 15 2023 at 16:14):

it hosts preview 1 at the moment using wasi shims sitting on top of node (vscode's engine). In addition, it brings debugging wire-up directly into the ide oob. Very slick, and if we can wire things up to share, that would be coolio. Of course, it will eventually move to preview 2, but I've asked them to enable the javascript experience as well first.

view this post on Zulip Ralph (May 15 2023 at 16:14):

open ended conversation here, but at some point we should set up a demo and chat/noodle for all the python heads....

view this post on Zulip Ralph (May 15 2023 at 16:15):

python.gif

view this post on Zulip Ralph (May 15 2023 at 16:15):

webshell.gif

view this post on Zulip Ralph (May 15 2023 at 16:16):

those are just quick examples; it's not finished or smooth yet, so the ultimate form of experience is entirely malleable

view this post on Zulip Brett Cannon (May 15 2023 at 22:04):

@Joel Dice don't worry about the VS Code stuff; I work on it and it's why I'm here, so it's being looked after (and to @Ralph : the more important thing is installing Python projects which is a separate concern).

As for name clashes with keywords, FYI the convention in Python is to add a trailing _ to a name to avoid the clash.
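
A minimal sketch of that convention using the standard library's `keyword` module. `python_identifier` is a hypothetical helper name; how a binding generator would actually apply the rule is an assumption.

```python
import keyword


def python_identifier(name: str) -> str:
    """Append "_" when `name` would collide with a Python keyword."""
    return name + "_" if keyword.iskeyword(name) else name


print([python_identifier(n) for n in ["import", "from", "world", "class"]])
```

`keyword.iskeyword` covers only hard keywords; soft keywords such as `match` are legal identifiers and are left untouched here.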

view this post on Zulip Pamela McA'Nulty (May 16 2023 at 17:27):

@Joel Dice Thanks for componentize-py and the demo repos! Is there a way I could bring in 3rd party libraries that have c-extensions (specifically, numpy) in my component? Specifically, I'm getting "Original error was: No module named 'numpy.core._multiarray_umath'". I'm thinking that it's either (a) not actually possible using the wasmtime/componentize tool chain, or (b) that some part of that chain needs to be built with the correct wheels.

view this post on Zulip Joel Dice (May 16 2023 at 17:29):

@Pamela McA'Nulty I'm glad you asked, because that's what I'm working on at the moment (details here: https://hackmd.io/IlY4lICRRNy9wQbNLdb2Wg). Unfortunately, it's not possible yet, but I hope to make it possible in the near future.

view this post on Zulip bjorn3 (May 16 2023 at 19:24):

(left a comment with a question on the hackmd)

view this post on Zulip Joel Dice (May 23 2023 at 17:29):

How does sound for our next meeting? Let me know if you'd like to attend and that doesn't work for you.

view this post on Zulip Shannon Duncan (shadowcodex) (May 23 2023 at 22:24):

@Joel Dice are y'all doing official calendar invites?

view this post on Zulip Joel Dice (May 23 2023 at 22:26):

Not yet, but I can start doing that. I'll create one and add you if you DM me your email address.

view this post on Zulip wayne (May 24 2023 at 02:18):

:wave: what's this meeting about? running python apps in webassembly runtimes? or embedding runtimes in python apps?

view this post on Zulip Jamey Sharp (May 24 2023 at 06:28):

It's about running Python apps in WebAssembly runtimes, yes, and specifically constructing components implemented in Python that follow the Component Model (https://github.com/WebAssembly/component-model/).

Repository for design and specification of the Component Model - GitHub - WebAssembly/component-model: Repository for design and specification of the Component Model

view this post on Zulip Joel Dice (May 25 2023 at 18:33):

Planning to meet in about 30 minutes at https://meet.jit.si/PythonComponentTooling
Agenda and notes here: https://hackmd.io/ZXNfJqvFQ0KvaWRImWSnqg

view this post on Zulip Kevin Smith (May 25 2023 at 19:51):

I should have mentioned in the meeting that I was using wasix when building numpy, so that might have paved over some issues that they may want to fix in a more permanent way like Brett did with Python itself. That could add to the number of changes needed for WASI.

view this post on Zulip Brett Cannon (Jun 08 2023 at 18:04):

I assume there's a meeting today?

view this post on Zulip Joel Dice (Jun 08 2023 at 18:15):

Yes, 45 minutes from now at https://meet.jit.si/PythonComponentTooling

view this post on Zulip Joel Dice (Jun 08 2023 at 18:17):

Will post an agenda here shortly: https://hackmd.io/HXrhjkMXRI20jU9x46UK2A

view this post on Zulip Joel Dice (Jun 13 2023 at 17:49):

@Brett Cannon @Kushal Das I've managed to build a WASI libpython3.11.so and call into it via the C API from another .so: https://github.com/dicej/component-linking-demo. Now I'm trying to import ujson, which is somewhat predictably failing, considering only ujson.cpython-311-darwin.so is in sys.path. So I'm trying to figure out what the appropriate file name(s) for a WASI build of ujson might be (ujson.cpython-311-wasi.so, maybe?). When I add debug logging to trace all file opens and stats, I don't see anything, so it's not clear to me that importlib is looking for _anything_ on the filesystem. Any advice for debugging?

Demo of shared-everything linking using the WebAssembly Component Model - GitHub - dicej/component-linking-demo: Demo of shared-everything linking using the WebAssembly Component Model

view this post on Zulip Brett Cannon (Jun 13 2023 at 18:35):

 ./run_wasi.sh -c "import importlib.machinery; print(importlib.machinery.EXTENSION_SUFFIXES)"
[]

Looks like it's completely disabled ATM for extension modules since no one expected it to work. :sweat_smile: I opened https://github.com/python/cpython/issues/105738 to fix it.

❯ ./run_wasi.sh -c "import importlib.machinery; print(importlib.machinery.EXTENSION_SUFFIXES)" [] See https://bytecodealliance.zulipchat.com/#narrow/stream/223391-wasm/topic/Python.20guest.20runtim...
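
A short sketch for checking which file names the import system would accept for an extension module on a given build. `extension_candidates` is a hypothetical helper; it only combines a module name with `importlib.machinery.EXTENSION_SUFFIXES`, the same list printed above.

```python
import importlib.machinery


def extension_candidates(module_name: str) -> list:
    """File names the import system would accept for a C extension module."""
    return [
        module_name + suffix
        for suffix in importlib.machinery.EXTENSION_SUFFIXES
    ]


print(extension_candidates("ujson"))
```

On a build where `EXTENSION_SUFFIXES` is empty this prints `[]`, which would be consistent with seeing no file opens or stats at all: with no suffixes registered, there is no extension file name for the finder to look up.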

view this post on Zulip Brett Cannon (Jun 13 2023 at 18:42):

@Joel Dice would an experimental build or patch work? I'm realizing I don't know if I can even fix this upstream due to the lack of dlopen() to even build against. But I can probably give you a patch to apply to Python's source to test this out.

view this post on Zulip Joel Dice (Jun 13 2023 at 19:09):

Yes, a patch would be great. I've already forked the cpython repo to make it build with a patched version of wasi-sdk 21. Just trying to get everything working before I start opening upstream PRs.

view this post on Zulip Joel Dice (Jun 22 2023 at 13:51):

Python meeting today at at https://meet.jit.si/PythonComponentTooling. Feel free to add to the agenda: https://hackmd.io/DpFFGyoYRtq5UBfv1ZCT8Q

view this post on Zulip Jamey Sharp (Jun 22 2023 at 16:18):

I think you meant ?

view this post on Zulip Joel Dice (Jun 22 2023 at 16:58):

Yes :point_up:

view this post on Zulip Joel Dice (Jul 06 2023 at 13:52):

FYI, further discussion of Python guest tooling will happen in #SIG-Guest-Languages

view this post on Zulip Kevin Smith (Jul 17 2023 at 20:11):

Our next Python SIG meeting is on Thursday, July 20. @Joel Dice will be on vacation. Does anyone have any agenda items for that meeting?

view this post on Zulip Kevin Smith (Jul 19 2023 at 14:01):

If no one has agenda items for tomorrow's meeting, we can cancel this one. Any objections?

view this post on Zulip Kevin Smith (Aug 02 2023 at 18:45):

@Joel Dice is out again this week. If anyone has anything to discuss, I can host the meeting. I've been out of the Wasm loop for a little while because of other project priorities, so I don't have anything new right now. Just let me know if you have a reason to meet this week.

view this post on Zulip Brett Cannon (Aug 02 2023 at 19:02):

Only thing I had was I tried to compile MicroPython via WASI but failed due to its use of setjmp.h. I was going to ask what the status of the exceptions proposal was since it seems that's necessary to fix that for WASI-libc?

view this post on Zulip Robin Brown (Aug 04 2023 at 18:41):

Let's shift meeting announcements/scheduling over to a topic in #SIG-Guest-Languages in the future.

view this post on Zulip Robin Brown (Aug 04 2023 at 18:42):

@Brett Cannon I put the threads proposal on the main group agenda for next Tuesday. I'll also add on exceptions too.


Last updated: Oct 23 2024 at 20:03 UTC