Stream: wasmtime

Topic: github status and/or woes


view this post on Zulip Alex Crichton (Feb 04 2026 at 19:12):

in the spirit of starting a thread on this: I've noticed notification emails being received very late today; I'm only just now getting email notifications for stuff that happened hours ago

view this post on Zulip Victor Adossi (Feb 05 2026 at 10:08):

Continuing in that spirit: GH Actions, and GH as a whole, have been slow/buggy this week (and a little of last week). My emails have also been coming through slowly, but I haven't seen Actions failures as a symptom just yet

view this post on Zulip fitzgen (he/him) (Feb 09 2026 at 16:10):

pages are loading slowly and getting intermittent errors for me rn

https://www.githubstatus.com/ says "notifications are delayed", but it seems like it's their whole system that is being sluggish

view this post on Zulip fitzgen (he/him) (Feb 09 2026 at 16:10):

aaand now I'm getting unicorns

view this post on Zulip Alex Crichton (Feb 09 2026 at 16:36):

git pushes are failing now too

view this post on Zulip Alex Crichton (Feb 09 2026 at 16:36):

man things fell over fast

view this post on Zulip Alex Crichton (Feb 09 2026 at 16:46):

Screenshot 2026-02-09 at 10.46.38.jpg

it's like it's christmas!

view this post on Zulip Alex Crichton (Feb 09 2026 at 16:47):

so colorful!

view this post on Zulip fitzgen (he/him) (Feb 09 2026 at 16:57):

when I do manage to load a PR or something, it seems like actions are not getting scheduled for the PR at all

view this post on Zulip Till Schneidereit (Feb 09 2026 at 17:23):

going through a VPN with a European exit node might help? https://eu.githubstatus.com/

view this post on Zulip Ralph (Feb 09 2026 at 17:54):

there was a snow hour over here, too

view this post on Zulip Ralph (Feb 09 2026 at 17:55):

my US team seems like it's working again?

view this post on Zulip Alex Crichton (Feb 09 2026 at 19:00):

Till Schneidereit said:

going through a VPN with a European exit node might help? https://eu.githubstatus.com/

routing through Germany I'm getting extremely slow git pushes right now as well as unicorns -- my guess is this is more of a backend thing than a frontend one

view this post on Zulip Alex Crichton (Feb 09 2026 at 19:12):

everything got green for a bit and it's all back to very red

view this post on Zulip Alex Crichton (Feb 09 2026 at 19:26):

We're not really keeping track per se, but at some point we're going to cross the threshold of "it would be cheaper to hire someone to maintain self-hosted CI infrastructure"

view this post on Zulip Chris Fallin (Feb 09 2026 at 19:30):

Three Mondays in a row with major outages; perhaps all BA member companies should adopt 4-day workweeks Tue-Fri ¯\_(ツ)_/¯

view this post on Zulip bjorn3 (Feb 10 2026 at 12:48):

Till Schneidereit said:

going through a VPN with a European exit node might help? https://eu.githubstatus.com/

Wouldn't that only help if the bytecodealliance enterprise account itself is a European account?

view this post on Zulip Ralph (Feb 10 2026 at 14:55):

I'm fairly sure there was an underlying resource failure of some sort; it absolutely went out here in the EU as well, but was back up in about 15 minutes or so......

view this post on Zulip Ralph (Feb 10 2026 at 14:55):

FWIW, all of gh is on a stability freeze -- no new rollouts or config changes of any sort -- to fully understand and rectify what happened particularly this week.

view this post on Zulip Alex Crichton (Feb 10 2026 at 15:35):

Screenshot 2026-02-10 at 09.34.58.jpg

so it begins anew...

view this post on Zulip Ralph (Feb 10 2026 at 16:41):

good god

view this post on Zulip Ralph (Feb 10 2026 at 16:41):

are you running again, or is it STILL there?

view this post on Zulip Alex Crichton (Feb 10 2026 at 16:56):

I have done much GitHub myself this morning and the page is all green now so hopefully fine...

view this post on Zulip Alex Crichton (Feb 11 2026 at 16:17):

Screenshot 2026-02-11 at 10.17.07.jpg

Another day, more errors. I'm seeing a lot of delayed notifications this morning as well as a lot of spurious failures in this CI run

view this post on Zulip Ralph (Feb 11 2026 at 16:31):

all I can do here is listen to the pain and pass it along to Ben

view this post on Zulip Alex Crichton (Feb 11 2026 at 16:48):

Oh that's understandable, yeah; this is primarily a heads-up channel for us, so we can share what we're seeing and be aware of outages/problems on our end

view this post on Zulip Ralph (Feb 11 2026 at 16:50):

totes git it; I'm just letting you know that I'm backchanneling but also that I can't do more than that

view this post on Zulip Alex Crichton (Feb 11 2026 at 16:51):

that's much appreciated too!

view this post on Zulip Victor Adossi (Feb 11 2026 at 16:59):

Maybe we can get the CEO of GraphQL on the phone
(apologies, couldn't resist)

view this post on Zulip Ralph (Feb 11 2026 at 16:59):

hey, any port in a storm, right?

view this post on Zulip Ralph (Feb 11 2026 at 17:51):

as it happens, the ex-CEO of GH is starting his own new GH, so maybe we can all move there while they don't charge anything? :-)

view this post on Zulip Ralph (Feb 11 2026 at 17:52):

meanwhile, the poor pm who has to deal with all this from customers:
image.png

view this post on Zulip Ralph (Feb 11 2026 at 17:52):

due diligence: he IS kidding, painfully

view this post on Zulip Alex Crichton (Feb 11 2026 at 22:06):

We've talked about retries and such before, but here's an example of an exponential backoff and it just fails every time...
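For reference, a minimal retry helper of the sort being discussed might look like the sketch below. This is not the actual CI code; the function name and the example curl invocation are purely illustrative.

```shell
# Sketch of a retry wrapper with exponential backoff (hypothetical helper,
# not the real CI code).
# Usage: with_backoff <max-attempts> <command...>
with_backoff() {
  local max=$1; shift
  local attempt=1 delay=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))      # double the wait on each failure
    attempt=$((attempt + 1))
  done
}

# e.g. with_backoff 5 curl -sSfL -o wasm-tools.tar.gz "$ASSET_URL"
```

Of course, as this thread illustrates, when the platform itself is down every attempt fails and backoff only delays the inevitable.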

view this post on Zulip Lann Martin (Feb 11 2026 at 23:46):

Could be hitting the rate limit for unauthenticated requests...

view this post on Zulip Lann Martin (Feb 11 2026 at 23:55):

Could try using the gh CLI, which can download via authenticated API calls; e.g., for the example you linked this seems to work: gh release download --repo bytecodealliance/wasm-tools wasm-tools-1.0.27 -p wasm-tools-1.0.27-x86_64-linux.tar.gz
I believe gh is preinstalled on standard Actions runners, but it might require a bit more config to make it authenticate as the action: https://docs.github.com/en/actions/tutorials/authenticate-with-github_token#example-1-passing-the-github_token-as-an-input
Alternatively: https://github.com/marketplace/actions/release-downloader

view this post on Zulip Chris Fallin (Feb 11 2026 at 23:58):

It looks like each attempt there downloads ~55kB then stalls -- I'd expect a rate limit to immediately return a 429 or 500 or whatever. Looks like maybe a CDN/cache problem as each download stalls at the same chunk? In any case, points more to "flaky platform" than "problem that we can solve easily" IMHO

view this post on Zulip Till Schneidereit (Feb 12 2026 at 15:51):

we could also try to cache tool downloads, so that at least we're presumably closer to the storage the bits come from, and they all come from the same storage?
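One way to sketch that idea as a CI step: fetch each tool through a local cache directory so the release host is only hit on a cache miss. The helper, env var, and paths below are all illustrative, and on Actions the cache directory could itself be persisted with actions/cache.

```shell
# Sketch of fetching a tool through a local cache (hypothetical helper).
# Usage: fetch_cached <url> <dest-file>
fetch_cached() {
  local url=$1 dest=$2
  local cache="${CI_TOOL_CACHE:-$HOME/.cache/ci-tools}"
  local key="$cache/$(basename "$dest")"
  mkdir -p "$cache"
  if [ ! -f "$key" ]; then
    # Only touch the network on a cache miss.
    curl -sSfL -o "$key" "$url" || return 1
  fi
  cp "$key" "$dest"
}

# e.g. fetch_cached "$WASM_TOOLS_URL" wasm-tools.tar.gz
```

A cache like this turns "every CI run re-downloads" into "only cold caches touch GitHub", which would also help if the rate-limit theory is right.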

view this post on Zulip fitzgen (he/him) (Apr 14 2026 at 13:48):

https://www.githubstatus.com/ is green but I'm getting intermittent unicorns rn

view this post on Zulip Alex Crichton (Apr 16 2026 at 23:25):

If you see

---- cli_tests::test_programs::p3_cli_serve_hello_world_many_no_concurrent_reuse stdout ----
failed to wait for child or read stdio: child failed Output { status: ExitStatus(ExitStatus(1)), stdout: "", stderr: "\nthread 'tokio-rt-worker' (8740) panicked at C:\\Users\\runneradmin\\.cargo\\registry\\src\\index.crates.io-1949cf8c6b5b557f\\tokio-1.51.1\\src\\sync\\mpsc\\list.rs:278:9:\nattempt to subtract with overflow\nnote: run with `RUST_BACKTRACE=1` environment variable to display a backtrace\n" }
Error: failed to read body

Caused by:
    0: error reading a body from connection
    1: unexpected EOF during chunk size line

in CI logs it's a spurious failure. This is https://github.com/tokio-rs/tokio/issues/8061 and while this has been a bug in Tokio for a long time it seems the tokio update in https://github.com/bytecodealliance/wasmtime/pull/13104 caused scheduling changes such that it happens more frequently now.

view this post on Zulip Alex Crichton (Apr 20 2026 at 16:02):

CI is broken until https://github.com/bytecodealliance/wasmtime/pull/13150 lands

view this post on Zulip Chris Fallin (Apr 20 2026 at 16:06):

sorry about that; did I miss something on the add-a-new-crate checklist? annoying that this doesn't surface until a release

view this post on Zulip Alex Crichton (Apr 20 2026 at 16:07):

last time I dug into this, it was actually impossible to prevent this from happening; when adding a new crate, we're forced to accept that CI will be broken on the next publication

view this post on Zulip Alex Crichton (Apr 20 2026 at 16:08):

I forget exactly why, though; things have changed now that we publish ahead of time, and we add crates rarely enough that I never bothered to re-check

view this post on Zulip Alex Crichton (Apr 20 2026 at 16:08):

so, no, no mistake on your part and our docs don't mention this, it's just always a fun surprise on the next publish heh

view this post on Zulip Pat Hickey (Apr 23 2026 at 16:52):

https://www.githubstatus.com/incidents/myrbk7jvvs6p it's another day ending in y

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:51):

Ok this is a first I think -- github seems to have corrupted a merge to the main branch

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:52):

https://github.com/bytecodealliance/wasmtime/pull/13180 just landed on the tip of tree, and the diff there looks as-expected

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:52):

However the squashed commit -- https://github.com/bytecodealliance/wasmtime/commit/0c3a69f18df3e6939048b68e9d0dcb5a4d4518f3 -- seems to additionally include a revert of the parent commit -- https://github.com/bytecodealliance/wasmtime/commit/54929c175c1249b8d1978a76c54f92c0317b0181

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:52):

so github has helpfully reverted a commit for us

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:52):

I've... never seen data corruption before

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:56):

how many other PRs have landed and been silently reverted.... I have no idea

view this post on Zulip Chris Fallin (Apr 23 2026 at 20:57):

that's... extremely odd? race condition wrt base branch maybe? (clearly a GitHub bug)

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:58):

according to https://www.githubstatus.com/incidents/zsg1lk7w13cf

We have identified a regression in merge queue behavior present when squash merging or rebasing. We have identified the root-cause and are in the process of reverting the change.

view this post on Zulip Chris Fallin (Apr 23 2026 at 20:58):

the perils of opaque SaaS providers

view this post on Zulip Chris Fallin (Apr 23 2026 at 20:59):

(I say as an employee of a SaaS provider, speaking with other employees of other SaaS providers)

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:00):

well, I'll just reland Nick's patch and pray that's the only victim

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:01):

it's ... kind of insanely lucky that I caught this

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:01):

I just happened to want to do a small follow-up and couldn't find the code when I was trying to do that

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:01):

otherwise we never would have noticed this

view this post on Zulip Chris Fallin (Apr 23 2026 at 21:19):

at the risk of re-igniting the "should we be on GitHub" question, silently losing a commit is kind of the worst sin that a git host could commit

view this post on Zulip Chris Fallin (Apr 23 2026 at 21:20):

I don't know what to do about that but just want to say it out loud

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:21):

status page now says:

Update - We have resolved a regression present when using merge queue with either squash merges or rebases. If you use merge queue in this configuration, some pull requests may have been merged incorrectly between 2026-04-23 16:05-20:43 UTC.

I can only hope they realize how utterly serious this is

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:21):

uptime is almost nothing compared to data loss

view this post on Zulip Alex Crichton (Apr 23 2026 at 22:08):

whelp it happened again

view this post on Zulip Alex Crichton (Apr 23 2026 at 22:09):

Let's not land anything else today...

view this post on Zulip Till Schneidereit (Apr 24 2026 at 12:47):

Maybe others did as well, but FYI I got an email from GitHub notifying us of two PRs being dropped instead of merged, so they at least seem to realize that yes, this is quite terrible

view this post on Zulip Alex Crichton (Apr 24 2026 at 13:17):

I also got an email yeah, I'll mitigate this morning

view this post on Zulip David Bryant (Apr 24 2026 at 15:51):

Yes, I received that email notification from GitHub as well, identifying the PRs impacted and providing follow-up details. Thanks, everyone.

view this post on Zulip Ralph (Apr 25 2026 at 08:42):

update: yes, they realize it was quite terrible. :-/

view this post on Zulip Pat Hickey (Apr 27 2026 at 18:16):

https://www.githubstatus.com/

view this post on Zulip Ralph (Apr 27 2026 at 18:31):

AGAIN????

view this post on Zulip Chris Fallin (Apr 27 2026 at 19:49):

fyi: (i) it seems we missed patching v24 (LTS) for our January CVE (https://github.com/bytecodealliance/wasmtime/issues/13211 just reported); and (ii) I will not do a patch release for this today, because GitHub Status is red. Another point for "what the fuck, we need a different repository host"

view this post on Zulip Alex Crichton (Apr 28 2026 at 14:23):

according to https://github.blog/news-insights/company-news/an-update-on-github-availability/ silently reverting commits is not data loss since the previous commits were still in the history

also, as will be news to no one, it's a day ending in 'y', so there's another github outage today

view this post on Zulip Ralph (Apr 28 2026 at 14:34):

jesus, what a mess

view this post on Zulip Ralph (Apr 28 2026 at 14:34):

wish I could help, but I can't

view this post on Zulip Chris Fallin (Apr 28 2026 at 14:46):

The blog post also describes the merge-queue bug as affecting "merge groups" with more than one PR; that's not us, but we were still affected. Even aside from the PR spin about "no data loss" (sure, if you want to call it data corruption instead, we can), that's concerning from an accurate-postmortem point of view

view this post on Zulip Till Schneidereit (Apr 28 2026 at 15:10):

in all this, I do want to give credit for the fact that we were notified via email within a few hours, and the email included the affected PRs. In combination with the commits still being addressable, that at least meant that even in the extremely unlikely scenario where no other copies existed, we could've restored them, and we knew we'd have to do so pretty quickly

view this post on Zulip Ralph (Apr 28 2026 at 15:13):

speaking, or trying to speak, objectively: it sucks, and if this were the normal state of things I'd never put my work there. It hasn't been this bad before, and I don't have insight into what the issue is now (I could ask, but I already know the people I know are underwater, as you might imagine, trying to stabilize things). But hey -- make it work or lose the user is pretty much the name of the game.

view this post on Zulip Ralph (Apr 28 2026 at 15:14):

there are other things going on of course, including a yearly rate of growth that I'm not at liberty to discuss, but that is absolutely insane and makes my megacorp gasp. But again, none of that matters if they corrupt my stuff, let alone block work a bunch of times each week.

view this post on Zulip Ralph (Apr 29 2026 at 12:36):

https://mitchellh.com/writing/ghostty-leaving-github pretty much sums up most of the people I know:

Lately, I've been very publicly critical of GitHub. I've been mean about it. I've been angry about it. I've hurt people's feelings. I've been lashing out. Because GitHub is failing me, every single day, and it is personal. It is irrationally personal. I love GitHub more than a person should love a thing, and I'm mad at it. I'm sorry about the hurt feelings to the people working on it.

I've felt this way for a long time, but for the past month I've kept a journal where I put an "X" next to every date where a GitHub outage has negatively impacted my ability to work. Almost every day has an X. On the day I am writing this post, I've been unable to do any PR review for ~2 hours because there is a GitHub Actions outage. This is no longer a place for serious work if it just blocks you out for hours per day, every day.

It's not a fun place for me to be anymore. I want to be there but it doesn't want me to be there. I want to get work done and it doesn't want me to get work done. I want to ship software and it doesn't want me to ship software.

view this post on Zulip Ralph (Apr 29 2026 at 12:36):

lotsa fun

view this post on Zulip Scott Waye (Apr 30 2026 at 16:13):

From the MS people in runtime, it seems the amount of work GitHub is having to do has increased significantly due to AI

view this post on Zulip Scott Waye (Apr 30 2026 at 16:15):

https://github.blog/news-insights/company-news/an-update-on-github-availability/

view this post on Zulip Ralph (Apr 30 2026 at 18:00):

OH YES

view this post on Zulip Ralph (Apr 30 2026 at 18:01):

another data point: over the past two years, roughly 90% of the data in the entire world was created by AI

view this post on Zulip Ralph (Apr 30 2026 at 18:01):

and we can guess how much of that was worthless

view this post on Zulip Ralph (Apr 30 2026 at 18:02):

so... if you throw in the GH "universal user" bug they had to fix over the past two months, plus the AI scale-up and the human scale-up, it's a hard job. That said, uptime and consistency are their raison d'être, as they say......

view this post on Zulip Ralph (Apr 30 2026 at 18:02):

so.... do they have a raison?


Last updated: May 03 2026 at 23:15 UTC