Stream: SIG-Guest-Languages

Topic: componentize-py with cpython-wasi-build


Ramon Klass (Aug 20 2024 at 22:09):

I've tried to use the cpython binaries in componentize-py, but didn't get very far because the libpython3.13.a needs to be compiled with -fPIC for this to work. Unless I'm reading the output wrong, it spews a lot of errors of the form
wasm-ld: error: componentize-py/pybuild/libpython3.13.a(getbuildinfo.o): relocation R_WASM_MEMORY_ADDR_LEB cannot be used against symbol initialized; recompile with -fPIC
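
When the linker emits many of these lines, it can help to collapse them into a list of affected archive members. The following is a minimal sketch, not something from the thread: the regular expression is derived only from the single error line quoted above, so other wasm-ld versions may word the diagnostic differently, and the function name `summarize_pic_errors` is hypothetical.

```python
import re
from collections import defaultdict

# Pattern for the wasm-ld diagnostic quoted above. Assumption: every error of
# this kind follows the same wording as that one sample line.
PIC_ERROR = re.compile(
    r"wasm-ld: error: (?P<archive>[^()\s]+)\((?P<member>[^()]+)\): "
    r"relocation (?P<reloc>R_WASM_\w+) cannot be used against symbol "
    r"`?(?P<symbol>[^`;\s]+)`?; recompile with -fPIC"
)


def summarize_pic_errors(linker_output: str) -> dict[str, set[str]]:
    """Map 'archive(member)' to the set of relocation types reported for it."""
    summary: dict[str, set[str]] = defaultdict(set)
    for line in linker_output.splitlines():
        match = PIC_ERROR.search(line)
        if match:
            key = f"{match['archive']}({match['member']})"
            summary[key].add(match["reloc"])
    return dict(summary)


# The error line from the message above, used as sample input.
sample = (
    "wasm-ld: error: componentize-py/pybuild/libpython3.13.a(getbuildinfo.o): "
    "relocation R_WASM_MEMORY_ADDR_LEB cannot be used against symbol "
    "initialized; recompile with -fPIC"
)

for member, relocs in summarize_pic_errors(sample).items():
    print(member, sorted(relocs))
```

Every object file listed this way was compiled without -fPIC, which is the whole archive in the case described here.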

Joel Dice (Aug 20 2024 at 22:24):

Yup, hence https://github.com/bytecodealliance/componentize-py/blob/4f3045bfe3609f0627b8be595ce6a3f1faff979d/build.rs#L289

Ramon Klass (Aug 20 2024 at 22:25):

yes, I guess what I'm asking is if Brett could add that to the other build too

Brett Cannon (Aug 26 2024 at 17:56):

What's the impact going to be? The build I create isn't entirely a shared library on its own to begin with since WASI doesn't directly support that w/o using componentize-py. Is it just binary size that's going to be unhappy? And is this something to turn on universally for all WASI builds (i.e. upstream this), or more of a special case? Basically why hasn't @Joel Dice asked for this before?

And @Ramon Klass , the build process is documented at https://devguide.python.org/getting-started/setup-building/#wasi if you ever need to do your own custom build.

Joel Dice (Aug 26 2024 at 18:51):

-fPIC does have performance and binary size costs, which is part of why wasi-libc currently builds separate .o files for the .a and .so libraries. Personally, I think those costs aren't worth worrying about, but some folks care.

As for why I haven't asked for it: I've been content to use my lightly patched fork, with no urgent need to upstream those patches. Sounds like @Ramon Klass is motivated, though :)

Ramon Klass (Aug 26 2024 at 18:57):

well if the patch for socket support is still needed then we can't move away from the fork anyway, so I'm not sure how useful it would be to prove that you can build a version with upstream cpython-wasi which does not support everything the current build does

Brett Cannon (Aug 27 2024 at 16:51):

Joel Dice said:

-fPIC does have performance and binary size costs

We talking 1% or 10% impact?

Joel Dice said:

As for why I haven't asked for it: I've been content to use my lightly patched fork, with no urgent need to upstream those patches.

Except I'm starting to get questions beyond this as to why componentize-py doesn't work w/ the latest releases of CPython or why/how does it deviate. :sweat_smile:

Ramon Klass said:

if the patch for socket support is still needed

It is because I can't even test the socket support until there is thread support and I'm not comfortable claiming socket is supported w/o the test suite working (i.e. I tried a couple of weeks ago and nearly every test errored out or was skipped; the only test that didn't, failed). And as a side note, a key reason VS Code doesn't support WASI 0.2 is because of the lack of threads and how important they are to networking, thus devaluing the work to port VS Code (and thus CPython for me) to WASI 0.2 and use new features (things build fine, just nothing new is turned on within Python).

Joel Dice (Aug 27 2024 at 17:07):

We talking 1% or 10% impact?

For libc.a, the binary size difference was about 1% (889KB vs 900KB). I never did any timing benchmarks, so I don't have any numbers for that; sorry.
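
As a quick check of the "about 1%" figure, the two quoted sizes can be compared directly. This is only a sketch of the arithmetic. It assumes the larger number (900KB) is the -fPIC build, which the message does not state explicitly.

```python
# Sizes quoted above for wasi-libc's libc.a, in KB.
# Assumption: the larger figure is the -fPIC build.
size_non_pic_kb = 889
size_pic_kb = 900

# Relative growth of the -fPIC archive over the non-PIC one.
overhead_pct = (size_pic_kb - size_non_pic_kb) / size_non_pic_kb * 100
print(f"-fPIC size overhead: {overhead_pct:.2f}%")  # prints 1.24%
```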

Joel Dice (Aug 27 2024 at 17:10):

FWIW, I had a lot of the asyncio socket tests passing at one point. The ones that didn't pass involved e.g. process forking and signal handling. Not sure if it would be practical to carve out the working subset of tests and run only those for WASI; might be a lot of work.
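
One way to picture "carving out the working subset" is to gate thread-dependent tests on a runtime probe, so a threadless build skips them while still running the rest. The sketch below uses only the standard `unittest`, `threading`, and `socket` modules. It is not how CPython's test suite is organized, the class and helper names are hypothetical, and `socket.socketpair()` is used only to keep the example self-contained (its availability on a WASI build is an assumption not confirmed here).

```python
import socket
import threading
import unittest


def can_start_thread() -> bool:
    """Probe whether this interpreter can actually start a thread."""
    try:
        worker = threading.Thread(target=lambda: None)
        worker.start()
        worker.join()
        return True
    except RuntimeError:
        # Builds without thread support raise RuntimeError on start().
        return False


HAS_THREADS = can_start_thread()


class SocketSubsetTests(unittest.TestCase):
    def test_single_threaded_roundtrip(self):
        # Both ends live in this thread, so no helper thread is needed.
        left, right = socket.socketpair()
        with left, right:
            left.sendall(b"ping")
            self.assertEqual(right.recv(4), b"ping")

    @unittest.skipUnless(HAS_THREADS, "requires working threads")
    def test_echo_with_helper_thread(self):
        # A blocking echo peer needs a second thread, so skip without one.
        left, right = socket.socketpair()
        with left, right:
            helper = threading.Thread(
                target=lambda: right.sendall(right.recv(4))
            )
            helper.start()
            left.sendall(b"ping")
            self.assertEqual(left.recv(4), b"ping")
            helper.join()


def run_subset() -> unittest.TestResult:
    """Run the test case and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SocketSubsetTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_subset()
print(f"ran={result.testsRun} skipped={len(result.skipped)}")
```

The hard part raised in this thread is not the skip mechanism itself but deciding, test by test, which of the existing socket tests can run without a helper thread.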

Ramon Klass (Aug 27 2024 at 19:11):

that's unfortunate, especially since the threads are only needed to monitor the socket. As Joel said only running the async tests might be an option but I'm not familiar with the python test suite yet, maybe I should indeed go through your guide at least once :)

Ramon Klass (Aug 27 2024 at 19:54):

@Joel Dice side note not related to the cpython builds: since you ship the libc of wasi-sdk in componentize-py, that means wasi-wheels needs to use the same wasi-sdk version for any wheels that contain compiled C code

I'd like to bump both to wasi-sdk 24 and see how it goes now that I fixed my buildchain, but I thought it would make sense to upgrade wasi-sdk alongside cpython since for python users it's clear that they need different wheels for py313, but not so clear that a componentize-py upgrade breaks certain wheel files

Brett Cannon (Aug 28 2024 at 17:27):

Joel Dice said:

FWIW, I had a lot of the asyncio socket tests passing at one point. The ones that didn't pass involved e.g. process forking and signal handling. Not sure if it would be practical to carve out the working subset of tests and run only those for WASI; might be a lot of work.

The asyncio tests will just make sure sockets are async, not that the sockets themselves work appropriately. Much like w/ anything POSIX, our socket test suite is extensive and I don't feel comfortable claiming support until I know exactly where the incompatibilities lie. And with a test suite just shy of 800 tests, I would not expect unthreading it to be quick and easy. It also doesn't help that someone who volunteered to look into socket support thought it wasn't worth it w/o more work, e.g., lack of getaddrinfo() (https://github.com/python/cpython/issues/121634#issuecomment-2271446647).

Joel Dice (Aug 28 2024 at 17:37):

Ironically, the CPython asyncio test suite was a huge help in getting getaddrinfo working and addressing corner cases. I'd be curious what specifically makes it "not usable". Issues on the wasi-libc repo would be most welcome!

Joel Dice (Aug 28 2024 at 17:45):

The asyncio tests will just make sure sockets are async

That surprises me. This test, for example, seems to be doing more than checking whether sockets are async. They're connecting, reading, writing, etc. And this one tests multiplexing. Others test UDP multicast, various ioctl settings, etc. It was all quite helpful getting the wasi-libc stuff into shape.

Brett Cannon (Aug 29 2024 at 17:50):

Joel Dice said:

The asyncio tests will just make sure sockets are async

That surprises me. This test, for example, seems to be doing more than checking whether sockets are async. They're connecting, reading, writing, etc. And this one tests multiplexing. Others test UDP multicast, various ioctl settings, etc. It was all quite helpful getting the wasi-libc stuff into shape.

But that's still unfortunately a subset of the socket tests; when I say "sockets are supported", people are going to think of https://github.com/python/cpython/blob/main/Lib/test/test_socket.py passing, not test_asyncio.

Regardless, work -- aka VS Code as the primary user of my WASI work -- isn't moving to WASI 0.2 yet no matter what support I get going, so sinking my time into setting up a new tier -- since wasm32-wasip2 is a new triple to cover w/ e.g., buildbots -- isn't worth the 20% time I'm getting for WASI starting next month. Get VS Code to support WASI 0.2 and then we can talk about worrying about network support w/o threads.

Joel Dice (Aug 29 2024 at 17:54):

Indeed, I wouldn't say "sockets are supported". I'd say "a useful subset of sockets are supported" -- enough to make non-trivial libraries like Redis-Py work. It's a big step forward from "not at all supported", even if it doesn't solve every possible case.

Joel Dice (Aug 29 2024 at 18:07):

It's a bummer that the main sockets tests require multithreading to run; I'd expect most of them would pass otherwise. I understand that it's kind of inevitable that you'd want more than one thread (or else multiple processes) when testing blocking sockets, so it's not surprising, but I hope folks will see beyond that and recognize the usefulness of what we've built already rather than assume they'll need to wait for multithreading before socket support is useful.

Ralph (Aug 29 2024 at 18:51):

actually, that's not Brett

Ralph (Aug 29 2024 at 18:51):

that's me, and others

Ralph (Aug 29 2024 at 18:51):

we get to demonstrate that it's important to do. Brett, you, and others have done amazing things here already.

Brett Cannon (Aug 29 2024 at 20:36):

Joel Dice said:

It's a bummer that the main sockets tests require multithreading to run; I'd expect most of them would pass otherwise. I understand that it's kind of inevitable that you'd want more than one thread (or else multiple processes) when testing blocking sockets, so it's not surprising, but I hope folks will see beyond that and recognize the usefulness of what we've built already rather than assume they'll need to wait for multithreading before socket support is useful.

Dirk should hopefully be at the plumber's summit so you can bug him in person to change VS Code's plans. :wink:

Joel Dice (Aug 29 2024 at 20:45):

I'm not trying to change anyone's plans or tell them what to do; just pointing out that WASIp2 sockets support in Python is already in a usable state. No problem at all if now is not the time to upstream it into CPython. componentize-py is there for anyone who wants to use it in the meantime.

Ralph (Aug 30 2024 at 09:43):

yeah, this is a Microsoft/BigCorp thing. Branded things do not invest in "usable states". Their customers demand money if bugs happen, or demand immediate fixes whether they're paying for that extra "sudden" work or not. And that sudden work, as you'll imagine, is a very expensive opportunity cost for other things suddenly NOT done.

Ralph (Aug 30 2024 at 09:44):

My job is to help Dirk's tree understand where small steps along with our work here place vscode in the position to "suddenly lean in" when it has enough stability for them to commit to unknown future customers.

Ralph (Aug 30 2024 at 09:47):

Interestingly, Kubernetes really changed all that; implementing even WASI support was speculative on Dirk's part (partly based on my sharing the beautiful vision of the future, paved with gold, that components lead us toward....). These days some customers will actually run something in a "usable state", which is great! But that remains a minority.

Ralph (Aug 30 2024 at 09:48):

my job is to break that even more :-)

Ralph (Aug 30 2024 at 09:48):

you can think of the .NET thing the same way. There's like two years of work behind the point we're at now -- lots of persuasion on my part and lots of extended vision on theirs.

Ralph (Aug 30 2024 at 09:49):

(This, not mentioning all the hard work of you and timmy and scott and others)

Ralph (Aug 30 2024 at 10:31):

now, all that said, documenting this thread somewhere and making it "the way" until we can get resources to tackle the "socket coverage problem" in cpython would be a good thing. We can't live on Joel's fork forever. :-)


Last updated: Nov 22 2024 at 17:03 UTC