Stream: SIG-Guest-Languages

Topic: Compile python binary packages for componentize-py


view this post on Zulip Ramon Klass (Jan 17 2024 at 21:19):

what is the status of building python packages for componentize-py? For example the regex package which is written in C

view this post on Zulip Joel Dice (Jan 17 2024 at 21:39):

I created https://github.com/dicej/wasi-wheels with an ambition to start collecting recipes for building WASI wheels for various packages. I've only added NumPy so far, though. I tried a couple of other non-trivial packages, but got stuck in both cases:

I don't know anything about the regex package, but if the C code is reasonably portable, it should work fine.

view this post on Zulip Ramon Klass (Jan 17 2024 at 21:43):

thanks I'll have a look, how do you use these wheels? As in, currently I pip install my dependencies and then run componentize-py but that uses the host site-packages

view this post on Zulip Joel Dice (Jan 17 2024 at 21:51):

Here's an example that does it without any kind of package management: https://github.com/bytecodealliance/componentize-py/tree/main/examples/matrix-math . Another approach is to use a virtual environment (e.g. python -m venv .venv && source .venv/bin/activate) and then pip install numpy-1.26.0-cp312-abi3-wasm32-wasi.whl. However, that wasi-wheels repo I linked to doesn't actually build wheels yet -- it just tars up the build dir. Should be easy to add a step to build a proper wheel, though.

view this post on Zulip Joel Dice (Jan 17 2024 at 21:53):

It's all at the proof-of-concept stage at this point, so the developer experience is still quite primitive.

view this post on Zulip Ramon Klass (Jan 19 2024 at 14:48):

@Joel Dice for the wasi-wheels repo, to get system python3.12 and pip3.12, (which are run in the build script of numpy) should I make install the fork in the repo or use a release python3.12?
I'm trying to write a Dockerfile to make it easier for people to run the toolchain

view this post on Zulip Joel Dice (Jan 19 2024 at 14:53):

You can use a normal (unforked) release of Python 3.12. The submodule that points to a fork is only used for compiling the C code to Wasm and linking against the Wasm version of libpython3.12.so -- it doesn't need to be used for installing and running setuptools.

view this post on Zulip Joel Dice (Jan 19 2024 at 14:55):

You might want to look at the .github/workflows/release.yaml file and use that as a model for the Dockerfile. Thanks for working on that!

view this post on Zulip Ramon Klass (Jan 19 2024 at 21:08):

what branch of your wasi-sdk fork do I use for this?

view this post on Zulip Joel Dice (Jan 19 2024 at 21:11):

You can use either the shared-library-alpha-4 tag or the wasi-sockets-alpha-1 tag. The makefile currently uses a pre-built version of the former.

view this post on Zulip Ramon Klass (Jan 19 2024 at 21:12):

yeah I want to build linux/arm64 binaries since running those is a lot faster than rosetta for me :)

view this post on Zulip Joel Dice (Jan 19 2024 at 21:13):

makes sense

view this post on Zulip Joel Dice (Jan 19 2024 at 21:13):

you might be able to use the stock wasi-sdk 21 release at this point

view this post on Zulip Ramon Klass (Jan 19 2024 at 21:13):

side note: wasi-sdk in general lacks linux-arm64 binaries, but I also don't know how you would build these on gh-actions

view this post on Zulip Joel Dice (Jan 19 2024 at 21:14):

yeah, we'd need to update CI to build them; probably not too hard

view this post on Zulip Ramon Klass (Jan 19 2024 at 21:22):

but, how? I'm only aware of the linux and the macos runner, if the macos runner has docker installed then you could compile the binaries in a debian arm64 docker container, or you would need to setup a cross-compiler on the linux runner, but I hope my information is outdated

view this post on Zulip Joel Dice (Jan 19 2024 at 21:24):

Yeah, you'd need a cross compiler, e.g. https://github.com/bytecodealliance/componentize-py/blob/main/.github/workflows/release.yaml#L228-L234

view this post on Zulip Ramon Klass (Jan 19 2024 at 21:25):

fascinating, I remember when setting up cross compilation was witchcraft, now you basically set a handful of environment variables it seems :)

view this post on Zulip Joel Dice (Jan 19 2024 at 21:26):

depends on what you're building -- sometimes it's still witchcraft

view this post on Zulip Ramon Klass (Jan 19 2024 at 21:26):

fortran says hi

view this post on Zulip Ramon Klass (Jan 20 2024 at 12:46):

sidetracking myself with building wasi-sdk took so long that I didn't progress further yesterday, so I kept the package, will locally use that, but the compilation time of wasi-sdk makes it impractical to compile that during building the docker image. Anyway I'll continue when I have time

view this post on Zulip Joel Dice (Jan 20 2024 at 18:43):

Yeah, building LLVM takes a long time, unfortunately. The rest of the wasi-sdk build is relatively quick, though.

view this post on Zulip Joel Dice (Jan 25 2024 at 14:37):

@Ramon Klass Here's an awesome PR adding a bunch of new packages to the wasi-wheels repo, FYI: https://github.com/dicej/wasi-wheels/pull/1

view this post on Zulip Joel Dice (Jan 25 2024 at 14:59):

^ courtesy of @Chris Dickinson

view this post on Zulip Ramon Klass (Jan 25 2024 at 15:11):

oh wow :) thanks for reminding me. Great work @Chris Dickinson

view this post on Zulip Ramon Klass (Jan 25 2024 at 15:36):

did you say wasi-wheels should be compatible with the latest upstream wasi-sdk release or am I misremembering? (the makefile still downloads your build)

view this post on Zulip Joel Dice (Jan 25 2024 at 15:37):

Yes, I think upstream wasi-sdk 21 should work. I haven't tested it yet, though.

view this post on Zulip Ramon Klass (Jan 25 2024 at 15:52):

how do I use these with componentize-py?

view this post on Zulip Joel Dice (Jan 25 2024 at 15:54):

You should be able to download the desired .tar.gz file(s) (using e.g. curl), untar them, and make sure componentize-py knows where to find them if they're not in the current directory. https://github.com/bytecodealliance/componentize-py/tree/main/examples/matrix-math is an example using numpy-wasi.tar.gz
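
A minimal sketch of the untar step described above, written in Python instead of curl and tar. The helper name `unpack_bundle` and the destination directory are hypothetical; the only assumption about the archive is that it is a gzip-compressed tarball such as numpy-wasi.tar.gz. Downloading is left out.

```python
import tarfile
from pathlib import Path, PurePosixPath


def unpack_bundle(archive: str, dest: str = ".") -> list[str]:
    """Extract a .tar.gz bundle into dest and return its top-level entries.

    Hypothetical helper for illustration; not part of componentize-py.
    """
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        # The "data" filter rejects absolute paths and path traversal.
        # It only exists on Pythons that ship tarfile extraction filters.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest_path, filter="data")
        else:
            tar.extractall(dest_path)
    # Collect the first path component of every member, ignoring "./".
    top_level = set()
    for name in names:
        parts = PurePosixPath(name).parts
        if parts:
            top_level.add(parts[0])
    return sorted(top_level)
```

The returned names are the directories that then have to be visible to componentize-py when it resolves imports; the matrix-math example linked above shows one layout that works.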

view this post on Zulip Ramon Klass (Jan 25 2024 at 16:06):

oh the python code I run expects some env variable, it seems like you did not implement that yet?

view this post on Zulip Joel Dice (Jan 25 2024 at 16:07):

How are you running the component? If you're using wasmtime run, you'll need to use the --env flag to pass environment variables to the guest.

view this post on Zulip Ramon Klass (Jan 25 2024 at 16:09):

no it's a bit uglier, they used environment variables for optional features, so not having the env variable set makes the code import a dependency that I don't want in the component

view this post on Zulip Ramon Klass (Jan 25 2024 at 16:10):

which means I need the environment variable during creation of the component already

view this post on Zulip Joel Dice (Jan 25 2024 at 16:10):

ah, so it needs to be set during pre-initialization (i.e. when running the top level of the script)? Yeah, that's not yet supported, but it shouldn't be hard to add.
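
A small sketch of the pattern being discussed, assuming a library that picks its imports at module top level based on an environment variable. The variable name `MYLIB_DISABLE_EXTRAS` and the function `choose_backend` are made up for illustration; the actual library and variable are not named in this thread.

```python
import os


def choose_backend(environ=os.environ) -> str:
    """Return which code path a module would take at import time."""
    # Hypothetical variable name, used only for this illustration.
    if environ.get("MYLIB_DISABLE_EXTRAS") == "1":
        return "minimal"
    # Without the variable, a real library would import its optional
    # dependency here, which pulls that dependency into the component.
    return "with-extras"


# Top-level code like this runs once, when the module is first imported.
# If that import happens during pre-initialization, whatever environment
# is visible at that point decides the outcome; a value passed later with
# `wasmtime run --env` arrives too late to change it.
BACKEND = choose_backend()
```

This is why a runtime `--env` flag does not help in this case: the branch has already been taken by the time the component runs.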

view this post on Zulip Ramon Klass (Jan 25 2024 at 16:21):

ModuleNotFoundError: No module named 'zlib'

componentize-py really needs a python build with zlib (just to see how far I can get I used python's way of setting environment variables, but I also forked the repo and will look into the environment variable part)
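
One way to see which modules a given Python build lacks before hitting a ModuleNotFoundError deep inside a dependency is a quick probe like the following. This is only a sketch: `missing_modules` is a hypothetical helper, and the list of names to probe is an example. It reports on whichever interpreter runs it, so it would need to run inside the WASI build to say anything about that build.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    # find_spec returns None for a top-level module that is not importable,
    # without actually importing the modules that do exist.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Example candidates: stdlib modules that depend on external C libraries.
    print(missing_modules(["zlib", "bz2", "sqlite3", "ssl"]))
```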

view this post on Zulip Joel Dice (Jan 25 2024 at 16:23):

Yeah, that's come up before. There's no fundamental reason why zlib shouldn't work AFAIK. Maybe it uses setjmp/longjmp for error handling or something?

view this post on Zulip Joel Dice (Jan 25 2024 at 16:24):

There's this, so apparently it can be done: https://github.com/ryuukk/zlib_wasm

view this post on Zulip Ramon Klass (Jan 25 2024 at 16:31):

https://github.com/vmware-labs/webassembly-language-runtimes/releases/tag/libs%2Fzlib%2F1.2.13%2B20230623-2993864
https://github.com/vmware-labs/webassembly-language-runtimes/blob/300c157844ff30799232528970fdd554f9d6a495/python/v3.11.3/wlr-build-deps.sh

wlr has build scripts for it already, their version of wasm python has zlib, bzip2, uuid and sqlite

view this post on Zulip Joel Dice (Jan 25 2024 at 16:34):

Cool. I wonder if we could just drop that into the CPython 3.12 build that componentize-py uses.

view this post on Zulip Brett Cannon (Jan 25 2024 at 20:00):

Joel Dice said:

Yeah, that's come up before. There's no fundamental reason why zlib shouldn't work AFAIK. Maybe it uses setjmp/longjmp for error handling or something?

It's because I haven't had time to start working towards having external dependencies get built into the builds I produce yet.

view this post on Zulip Richard Backhouse (Feb 05 2024 at 18:18):

I was able to get zlib into componentize-py. I wasn't aware of the version that the vmware team built, so I built my own. It was pretty straightforward using the standard environment variables. Here is the build script I used; I just built the static library.

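#!/bin/bash
# Build zlib as a static library (libz.a) using the wasi-sdk toolchain.

# Point the standard build variables at the wasi-sdk tools.
export WASI_SDK_PATH=/opt/wasi-sdk
export CC="${WASI_SDK_PATH}/bin/clang"
export CFLAGS="-fPIC"
export CXX="${WASI_SDK_PATH}/bin/clang++"
export LDSHARED="${CC}"
export AR="${WASI_SDK_PATH}/bin/ar"
export RANLIB="${WASI_SDK_PATH}/bin/ranlib"

# Run from the zlib source directory; --static skips the shared library.
./configure --static
make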

Integrating into the componentize-py cpython involved uncommenting the zlib statement in Modules/Setup (not sure if this is the correct way of enabling the extension module, maybe Setup.local is the correct place to enable but that didn't work for me)

zlib  zlibmodule.c -lz -I<path to zlib dir>/zlib -L<path to zlib dir>/zlib

I had to add an additional line to the libpython3.12.so build step in build.rs to add the zlib library. I added it under the --whole-archive flag rather than the --no-whole-archive flag. Not sure which one is correct.

        run(Command::new(wasi_sdk.join("bin/clang"))
            .arg("-shared")
            .arg("-o")
            .arg(cpython_wasi_dir.join("libpython3.12.so"))
            .arg("-Wl,--whole-archive")
            .arg(cpython_wasi_dir.join("libpython3.12.a"))
            .arg("<path to zlib dir>/libz.a")
            .arg("-Wl,--no-whole-archive")
            .arg(cpython_wasi_dir.join("Modules/_hacl/libHacl_Hash_SHA2.a"))
            .arg(cpython_wasi_dir.join("Modules/_decimal/libmpdec/libmpdec.a"))
            .arg(cpython_wasi_dir.join("Modules/expat/libexpat.a")))?;
    }

view this post on Zulip Brett Cannon (Feb 05 2024 at 21:15):

Richard Backhouse said:

Integrating into the componentize-py cpython involved uncommenting the zlib statement in Modules/Setup (not sure if this is the correct way of enabling the extension module, maybe Setup.local is the correct place to enable but that didn't work for me)

Setup.local should have worked. Did you set the marker specifying you wanted it statically compiled?

And FYI once I get WASI to tier 2 support for CPython and deal w/ preview 2 support, I'm going to tackle https://github.com/python/cpython-source-deps and getting those to compile.
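
For reference, a sketch of what a Modules/Setup.local entry for zlib might look like, assuming the marker mentioned above is the `*static*` line that Modules/Setup documents. The module line is the one Richard posted earlier with the `-lz` moved to the end; the `<path to zlib dir>` placeholders are left as he wrote them. This has not been tested against the componentize-py CPython build.

```
# Modules/Setup.local
# Modules listed after this marker are linked statically into the interpreter.
*static*
zlib zlibmodule.c -I<path to zlib dir>/zlib -L<path to zlib dir>/zlib -lz
```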

view this post on Zulip Ralph (Feb 06 2024 at 10:53):

God bless you, Brett Cannon

view this post on Zulip Richard Backhouse (Feb 06 2024 at 11:38):

Brett Cannon said:

Richard Backhouse said:

Integrating into the componentize-py cpython involved uncommenting the zlib statement in Modules/Setup (not sure if this is the correct way of enabling the extension module, maybe Setup.local is the correct place to enable but that didn't work for me)

Setup.local should have worked. Did you set the marker specifying you wanted it statically compiled?

And FYI once I get WASI to tier 2 support for CPython and deal w/ preview 2 support, I'm going to tackle https://github.com/python/cpython-source-deps and getting those to compile.

Thanks Brett not setting the marker is probably the problem.

Regarding the source packages I see openssl is included. I have been going down that path to get it in as a module as I have a package that depends on it. It looks though that this is a no go until sockets get properly supported in the wasi-sdk. Is that a correct assumption ?

view this post on Zulip Joel Dice (Feb 06 2024 at 19:05):

Richard Backhouse said:

Regarding the source packages I see openssl is included. I have been going down that path to get it in as a module as I have a package that depends on it. It looks though that this is a no go until sockets get properly supported in the wasi-sdk. Is that a correct assumption ?

I'm currently building componentize-py using a temporary fork of wasi-libc with full sockets support, so no worries there. The bigger issue with OpenSSL is that Wasm does not yet have constant time operations, which makes crypto primitives vulnerable to timing attacks, so enabling the ssl module is not necessarily a great idea from a security perspective.

At the recent BA contributors summit, we discussed starting a wasi-tls project to make TLS available to guest applications in a secure way. In the Python ecosystem, we might want to resurrect https://peps.python.org/pep-0543/ so the Python stdlib is not tied to OpenSSL specifically.

Per #447, this is an (in-progress) implementation of wasi-sockets, including a new wasm32-wasi-preview2 target. Currently, TCP, UDP, and name resolution are mostly implemented (minus a few sockopt ...
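
As a general illustration of the timing-attack concern raised above (not specific to Wasm or OpenSSL), the sketch below contrasts a comparison that exits at the first mismatch with the stdlib helper meant to avoid that leak. The function names are hypothetical. Per the point above, Wasm does not yet offer constant-time operations, so care taken at the source level like this may not carry through to what the runtime executes.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Compare byte strings, returning at the first mismatch."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            # Early exit: running time depends on where the inputs differ,
            # which an attacker can measure to guess a secret byte by byte.
            return False
    return True


def steady_equal(a: bytes, b: bytes) -> bool:
    """Compare using the stdlib helper designed to resist timing analysis."""
    return hmac.compare_digest(a, b)
```

Both functions return the same answers; the difference is only in how much their running time reveals about the inputs.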

view this post on Zulip Brett Cannon (Feb 06 2024 at 20:19):

Unfortunately Python's stdlib very much relies on the ssl module (and thus OpenSSL) for HTTPS. A way to solve this is trying to revive PEP 543 as Joel suggested, although all the people involved burned out of open source, so whoever takes it on would need to pick it up entirely.

Another option is to try and introduce an HTTP fetch API to Python that uses wasi-http. A fetch API has been brought up before, but someone needs the time to pursue it (it's on my very long list :sweat_smile:).


Last updated: Dec 23 2024 at 12:05 UTC