Stream: wasmtime

Topic: github status and/or woes


view this post on Zulip Alex Crichton (Feb 04 2026 at 19:12):

in the spirit of starting a thread on this, I've noticed notification emails being received very late today, I'm only just now getting notifications in email for stuff that happened hours ago

view this post on Zulip Victor Adossi (Feb 05 2026 at 10:08):

Continuing in that spirit GH actions and GH as a whole has been slow/buggy this week (and a little of last week)? My emails have been also coming through slowly, but haven't seen actions failures as a symptom just yet

view this post on Zulip fitzgen (he/him) (Feb 09 2026 at 16:10):

pages are loading slowly and getting intermittent errors for me rn

https://www.githubstatus.com/ says "notifications are delayed" but it seems like it's their whole system that is being sluggish

view this post on Zulip fitzgen (he/him) (Feb 09 2026 at 16:10):

aaand now I'm getting unicorns

view this post on Zulip Alex Crichton (Feb 09 2026 at 16:36):

git pushes are failing now too

view this post on Zulip Alex Crichton (Feb 09 2026 at 16:36):

man things fell over fast

view this post on Zulip Alex Crichton (Feb 09 2026 at 16:46):

Screenshot 2026-02-09 at 10.46.38.jpg

it's like it's christmas!

view this post on Zulip Alex Crichton (Feb 09 2026 at 16:47):

so colorful!

view this post on Zulip fitzgen (he/him) (Feb 09 2026 at 16:57):

when I do manage to load a PR or something, it seems like actions are not getting scheduled for the PR at all

view this post on Zulip Till Schneidereit (Feb 09 2026 at 17:23):

going through a VPN with a European exit node might help? https://eu.githubstatus.com/

view this post on Zulip Ralph (Feb 09 2026 at 17:54):

there was a snow hour over here, too

view this post on Zulip Ralph (Feb 09 2026 at 17:55):

my US team seems like it's working again?

view this post on Zulip Alex Crichton (Feb 09 2026 at 19:00):

Till Schneidereit said:

going through a VPN with a European exit node might help? https://eu.githubstatus.com/

routing through germany I'm getting extremely slow git pushes right now as well as unicorns -- my guess is this is more of a backend thing than a frontend

view this post on Zulip Alex Crichton (Feb 09 2026 at 19:12):

everything got green for a bit and it's all back to very red

view this post on Zulip Alex Crichton (Feb 09 2026 at 19:26):

We're not really keeping track per-se, but at some point we're going to cross the threshold of "it would be cheaper to hire someone to maintain self-hosted CI infrastructure"

view this post on Zulip Chris Fallin (Feb 09 2026 at 19:30):

Three Mondays in a row with major outages; perhaps all BA member companies should adopt 4-day workweeks Tue-Fri ¯\_(ツ)_/¯

view this post on Zulip bjorn3 (Feb 10 2026 at 12:48):

Till Schneidereit said:

going through a VPN with a European exit node might help? https://eu.githubstatus.com/

Wouldn't that only help if the bytecodealliance enterprise account itself is a European account?

view this post on Zulip Ralph (Feb 10 2026 at 14:55):

I'm fairly sure there was an underlying resource failure of some sort; it absolutely went out here in the EU as well, but was back up in about 15 minutes or so......

view this post on Zulip Ralph (Feb 10 2026 at 14:55):

FWIW, all of gh is on a stability freeze -- no new rollouts or config changes of any sort -- to fully understand and rectify what happened particularly this week.

view this post on Zulip Alex Crichton (Feb 10 2026 at 15:35):

Screenshot 2026-02-10 at 09.34.58.jpg

so it begins anew...

view this post on Zulip Ralph (Feb 10 2026 at 16:41):

good god

view this post on Zulip Ralph (Feb 10 2026 at 16:41):

are you running again, or is it STILL there?

view this post on Zulip Alex Crichton (Feb 10 2026 at 16:56):

I haven't done much GitHub myself this morning and the page is all green now so hopefully fine...

view this post on Zulip Alex Crichton (Feb 11 2026 at 16:17):

Screenshot 2026-02-11 at 10.17.07.jpg

Another day, more errors. I'm seeing a lot of delayed notifications this morning as well as a lot of spurious failures in this CI run

view this post on Zulip Ralph (Feb 11 2026 at 16:31):

all I can do here is listen to the pain and pass it along to Ben

view this post on Zulip Alex Crichton (Feb 11 2026 at 16:48):

Oh that's understandable yeah, this is primarily a heads-up channel for us so we can share what we're seeing and be aware of outages/problems on our end

view this post on Zulip Ralph (Feb 11 2026 at 16:50):

totes git it; I'm just letting you know that I'm backchanneling but also that I can't do more than that

view this post on Zulip Alex Crichton (Feb 11 2026 at 16:51):

that's also much appreciated too!

view this post on Zulip Victor Adossi (Feb 11 2026 at 16:59):

Maybe we can get the CEO of GraphQL on the phone
(apologies, couldn't resist)

view this post on Zulip Ralph (Feb 11 2026 at 16:59):

hey, any port in a storm, right?

view this post on Zulip Ralph (Feb 11 2026 at 17:51):

as it happens, the ex CEO of GH is starting his own new GH, so maybe we can all move there while they don't charge anything? :-)

view this post on Zulip Ralph (Feb 11 2026 at 17:52):

meanwhile, the poor pm who has to deal with all this from customers:
image.png

view this post on Zulip Ralph (Feb 11 2026 at 17:52):

due diligence: he IS kidding, painfully

view this post on Zulip Alex Crichton (Feb 11 2026 at 22:06):

We've talked about retries and such before, but here's an example of an exponential backoff and it just fails every time...
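
For readers unfamiliar with the retry pattern mentioned above, here is a minimal sketch of exponential backoff in Python. It is not the project's actual CI retry logic; the function name `retry_with_backoff` and its parameters are hypothetical, chosen only for illustration. The wait doubles after each failed attempt, and the last error is re-raised once attempts run out.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    op: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call op() up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return op()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Waits are base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

As the message above notes, backoff only helps with transient failures; if every attempt stalls the same way, the loop simply fails more slowly.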

view this post on Zulip Lann Martin (Feb 11 2026 at 23:46):

Could be hitting the rate limit for unauthenticated requests...

view this post on Zulip Lann Martin (Feb 11 2026 at 23:55):

Could try using the gh CLI which can download via authenticated API calls, e.g. for the example you linked this seems to work: gh release download --repo bytecodealliance/wasm-tools wasm-tools-1.0.27 -p wasm-tools-1.0.27-x86_64-linux.tar.gz
I believe gh is preinstalled for standard actions runners but it might require a bit more config to make it authenticate as the action: https://docs.github.com/en/actions/tutorials/authenticate-with-github_token#example-1-passing-the-github_token-as-an-input
Alternatively: https://github.com/marketplace/actions/release-downloader
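
A rough sketch of the authenticated-download suggestion above, written as a small Python wrapper around the `gh` CLI. The helper names `gh_release_download_cmd` and `download_release_asset` are hypothetical; the repository, tag, and asset pattern are copied from the command quoted in the message. It assumes `gh` is on the PATH and that a token is supplied through the `GH_TOKEN` environment variable, which `gh` reads for authentication.

```python
import os
import subprocess
from typing import List, Optional


def gh_release_download_cmd(repo: str, tag: str, pattern: str, dest: str = ".") -> List[str]:
    """Build the `gh release download` argument list for one release asset."""
    return [
        "gh", "release", "download", tag,
        "--repo", repo,
        "--pattern", pattern,
        "--dir", dest,
    ]


def download_release_asset(
    repo: str, tag: str, pattern: str, dest: str = ".", token: Optional[str] = None
) -> None:
    """Run the download; raises CalledProcessError if gh exits non-zero."""
    env = dict(os.environ)
    if token:
        env["GH_TOKEN"] = token  # gh uses this token for authenticated API calls
    subprocess.run(gh_release_download_cmd(repo, tag, pattern, dest), check=True, env=env)
```

Usage would look like `download_release_asset("bytecodealliance/wasm-tools", "wasm-tools-1.0.27", "wasm-tools-1.0.27-x86_64-linux.tar.gz")`. Whether this avoids the stalls seen in CI is untested here.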

view this post on Zulip Chris Fallin (Feb 11 2026 at 23:58):

It looks like each attempt there downloads ~55kB then stalls -- I'd expect a rate limit to immediately return a 429 or 500 or whatever. Looks like maybe a CDN/cache problem as each download stalls at the same chunk? In any case, points more to "flaky platform" than "problem that we can solve easily" IMHO

view this post on Zulip Till Schneidereit (Feb 12 2026 at 15:51):

we could also try to cache tool downloads, so we presumably at least are closer to the storage the bits come from, and they all come from the same storage?

view this post on Zulip fitzgen (he/him) (Apr 14 2026 at 13:48):

https://www.githubstatus.com/ is green but I'm getting intermittent unicorns rn

view this post on Zulip Alex Crichton (Apr 16 2026 at 23:25):

If you see

---- cli_tests::test_programs::p3_cli_serve_hello_world_many_no_concurrent_reuse stdout ----
failed to wait for child or read stdio: child failed Output { status: ExitStatus(ExitStatus(1)), stdout: "", stderr: "\nthread 'tokio-rt-worker' (8740) panicked at C:\\Users\\runneradmin\\.cargo\\registry\\src\\index.crates.io-1949cf8c6b5b557f\\tokio-1.51.1\\src\\sync\\mpsc\\list.rs:278:9:\nattempt to subtract with overflow\nnote: run with `RUST_BACKTRACE=1` environment variable to display a backtrace\n" }
Error: failed to read body

Caused by:
    0: error reading a body from connection
    1: unexpected EOF during chunk size line

in CI logs it's a spurious failure. This is https://github.com/tokio-rs/tokio/issues/8061 and while this has been a bug in Tokio for a long time it seems the tokio update in https://github.com/bytecodealliance/wasmtime/pull/13104 caused scheduling changes such that it happens more frequently now.

view this post on Zulip Alex Crichton (Apr 20 2026 at 16:02):

CI is broken until https://github.com/bytecodealliance/wasmtime/pull/13150 lands

view this post on Zulip Chris Fallin (Apr 20 2026 at 16:06):

sorry about that; did I miss something on the add-a-new-crate checklist? annoying that this doesn't surface until a release

view this post on Zulip Alex Crichton (Apr 20 2026 at 16:07):

last time I dug into this it's actually impossible to prevent this from happening, we're forced to, when adding a new crate, accept that CI will be broken on the next publication

view this post on Zulip Alex Crichton (Apr 20 2026 at 16:08):

I forget exactly why though, and things have changed where we publish things ahead-of-time now, and we add crates rarely enough I never bothered to re-check

view this post on Zulip Alex Crichton (Apr 20 2026 at 16:08):

so, no, no mistake on your part and our docs don't mention this, it's just always a fun surprise on the next publish heh

view this post on Zulip Pat Hickey (Apr 23 2026 at 16:52):

https://www.githubstatus.com/incidents/myrbk7jvvs6p it's another day ending in y

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:51):

Ok this is a first I think -- github seems to have corrupted a merge to the main branch

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:52):

https://github.com/bytecodealliance/wasmtime/pull/13180 just landed on the tip of tree, and the diff there looks as-expected

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:52):

However the squashed commit -- https://github.com/bytecodealliance/wasmtime/commit/0c3a69f18df3e6939048b68e9d0dcb5a4d4518f3 -- seems to additionally include a revert of the parent commit -- https://github.com/bytecodealliance/wasmtime/commit/54929c175c1249b8d1978a76c54f92c0317b0181

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:52):

so github has helpfully reverted a commit for us

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:52):

I've... never seen data corruption before

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:56):

how many other PRs have landed and been silently reverted.... I have no idea

view this post on Zulip Chris Fallin (Apr 23 2026 at 20:57):

that's... extremely odd? race condition wrt base branch maybe? (clearly a GitHub bug)

view this post on Zulip Alex Crichton (Apr 23 2026 at 20:58):

according to https://www.githubstatus.com/incidents/zsg1lk7w13cf

We have identified a regression in merge queue behavior present when squash merging or rebasing. We have identified the root-cause and are in the process of reverting the change.

view this post on Zulip Chris Fallin (Apr 23 2026 at 20:58):

the perils of opaque SaaS providers

view this post on Zulip Chris Fallin (Apr 23 2026 at 20:59):

(I say as an employee of a SaaS provider, speaking with other employees of other SaaS providers)

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:00):

well, I'll just reland Nick's patch and pray that's the only victim

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:01):

it's ... kind of insanely lucky that I caught this

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:01):

I just happened to want to do a small follow-up and couldn't find the code when I was trying to do that

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:01):

otherwise we never would have noticed this

view this post on Zulip Chris Fallin (Apr 23 2026 at 21:19):

at the risk of re-igniting the "should we be on GitHub" question, silently losing a commit is kind of the worst sin that a git host could commit

view this post on Zulip Chris Fallin (Apr 23 2026 at 21:20):

I don't know what to do about that but just want to say it out loud

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:21):

status page now says:

Update - We have resolved a regression present when using merge queue with either squash merges or rebases. If you use merge queue in this configuration, some pull requests may have been merged incorrectly between 2026-04-23 16:05-20:43 UTC.

I can only hope they realize how utterly serious this is

view this post on Zulip Alex Crichton (Apr 23 2026 at 21:21):

uptime is almost nothing compared to data loss

view this post on Zulip Alex Crichton (Apr 23 2026 at 22:08):

whelp it happened again

view this post on Zulip Alex Crichton (Apr 23 2026 at 22:09):

Let's not land anything else today...

view this post on Zulip Till Schneidereit (Apr 24 2026 at 12:47):

Maybe others did as well, but FYI I got an email from GitHub notifying us of two PRs being dropped instead of merged, so they at least seem to realize that yes, this is quite terrible

view this post on Zulip Alex Crichton (Apr 24 2026 at 13:17):

I also got an email yeah, I'll mitigate this morning

view this post on Zulip David Bryant (Apr 24 2026 at 15:51):

Yes, I received that email notification from GitHub as well, identifying the PRs impacted and providing follow-up details. Thanks, everyone.

view this post on Zulip Ralph (Apr 25 2026 at 08:42):

update: yes, they realize it was quite terrible. :-/

view this post on Zulip Pat Hickey (Apr 27 2026 at 18:16):

https://www.githubstatus.com/

view this post on Zulip Ralph (Apr 27 2026 at 18:31):

AGAIN????

view this post on Zulip Chris Fallin (Apr 27 2026 at 19:49):

fyi that (i) it seems we missed patching v24 (LTS) from our January CVE (https://github.com/bytecodealliance/wasmtime/issues/13211 just reported); and (ii) I will not do a patch-release for this today, because GitHub Status is red. Another point for "what the fuck, we need a different repository host"

view this post on Zulip Alex Crichton (Apr 28 2026 at 14:23):

according to https://github.blog/news-insights/company-news/an-update-on-github-availability/ silently reverting commits is not data loss since the previous commits were still in the history

also as news to all it's a day ending in 'y' so there's another github outage today

view this post on Zulip Ralph (Apr 28 2026 at 14:34):

jesus, what a mess

view this post on Zulip Ralph (Apr 28 2026 at 14:34):

wish I could help, but I can't

view this post on Zulip Chris Fallin (Apr 28 2026 at 14:46):

The blog post also describes the merge-queue bug as affecting "merge groups" with more than one PR; that's not us, but we were still affected. Even aside from the PR spin about "no data loss" (sure, if you want to call it data corruption instead, we can), that's concerning from an accurate-postmortem point of view

view this post on Zulip Till Schneidereit (Apr 28 2026 at 15:10):

in all this, I do want to give credit for the fact that we were notified via email within a few hours, and the email included the affected PRs. In combination with the commits still being addressable, that at least meant that even in the extremely unlikely scenario where no other copies would've existed, we could've restored them, and we knew we'd have to pretty quickly

view this post on Zulip Ralph (Apr 28 2026 at 15:13):

speaking or trying to speak objectively, it sucks and if it were normally the case I'd never put my work there; it hasn't been that bad before, and I don't have insight into what is the issue now (I could ask, but I already know the people I know are underwater as you might imagine trying to stabilize things), but hey -- make it work or lose the user is pretty much the name of the game.

view this post on Zulip Ralph (Apr 28 2026 at 15:14):

there are other things going on of course, including the yearly rate of growth that I'm not at liberty to discuss but that is absolutely insane and which makes my megacorp gasp. But again, none of that matters if they corrupt my stuff, let alone block working each week a bunch of times.

view this post on Zulip Ralph (Apr 29 2026 at 12:36):

https://mitchellh.com/writing/ghostty-leaving-github pretty much sums up most of the people I know:

Lately, I've been very publicly critical of GitHub. I've been mean about it. I've been angry about it. I've hurt people's feelings. I've been lashing out. Because GitHub is failing me, every single day, and it is personal. It is irrationally personal. I love GitHub more than a person should love a thing, and I'm mad at it. I'm sorry about the hurt feelings to the people working on it.

I've felt this way for a long time, but for the past month I've kept a journal where I put an "X" next to every date where a GitHub outage has negatively impacted my ability to work. Almost every day has an X. On the day I am writing this post, I've been unable to do any PR review for ~2 hours because there is a GitHub Actions outage. This is no longer a place for serious work if it just blocks you out for hours per day, every day.

It's not a fun place for me to be anymore. I want to be there but it doesn't want me to be there. I want to get work done and it doesn't want me to get work done. I want to ship software and it doesn't want me to ship software.

view this post on Zulip Ralph (Apr 29 2026 at 12:36):

lotsa fun

view this post on Zulip Scott Waye (Apr 30 2026 at 16:13):

From the MS people in runtime, it seems the amount of work github is having to do has increased significantly from AI

view this post on Zulip Scott Waye (Apr 30 2026 at 16:15):

https://github.blog/news-insights/company-news/an-update-on-github-availability/

view this post on Zulip Ralph (Apr 30 2026 at 18:00):

OH YES

view this post on Zulip Ralph (Apr 30 2026 at 18:01):

another data point: over the past two years, roughly 90% of the data in the entire world was created by AI

view this post on Zulip Ralph (Apr 30 2026 at 18:01):

and we can guess how much of that was worthless

view this post on Zulip Ralph (Apr 30 2026 at 18:02):

so... if you throw in the GH "universal user" bug they had to fix the past two months and the AI scale up and the human scale up it's a hard job. That said, uptime and consistency are their raison d'etre, as they say......

view this post on Zulip Ralph (Apr 30 2026 at 18:02):

so.... do they have a raison?

view this post on Zulip Alex Crichton (May 04 2026 at 14:30):

most of my page loads right now don't have css and stop loading halfway through the page, no current incident and iunno if it's my internet, but wanted to note

view this post on Zulip Alex Crichton (May 04 2026 at 14:54):

yeah I can't load github at all so I'm limited to work currently where it doesn't involve the web ui...

view this post on Zulip Alex Crichton (May 04 2026 at 15:03):

or I just needed to restart my browser?! sometimes you never know...

view this post on Zulip Chris Fallin (May 04 2026 at 15:06):

currently loading fine for me fwiw; different ISP and geo so who knows what network weather looks like for you of course

view this post on Zulip Alex Crichton (May 04 2026 at 16:06):

well, there is now https://www.githubstatus.com/incidents/72q3n8yxthcy

view this post on Zulip Alex Crichton (May 04 2026 at 16:06):

(comments aren't or are slow to go through)

view this post on Zulip Chris Fallin (May 05 2026 at 15:19):

time for the Daily Incident: https://www.githubstatus.com/incidents/1j40g94rn22j (currently blocking the patch-release merge)

view this post on Zulip Chris Fallin (May 05 2026 at 15:20):

my two thoughts are "what the fuck" and "we should have the platform-move discussion again"

view this post on Zulip Pat Hickey (May 05 2026 at 16:34):

from what im seeing, many projects are "having the platform-move discussion", most prominently on orange site the dude who started vagrant or something has committed to move but where to is unknown

view this post on Zulip Pat Hickey (May 05 2026 at 16:34):

my best guess is we need to sit tight for 6 months or more, still, until places worth moving to get more capable of taking on projects like ours

view this post on Zulip Alex Crichton (May 05 2026 at 16:35):

Given the scale of our CI and donation from msft to host BA projects I'm not even sure where it would be possible to move to.

view this post on Zulip Pat Hickey (May 05 2026 at 16:36):

yeah the code hosting is basically inconsequential, its the CI thats a pretty massive engineering undertaking

view this post on Zulip Alex Crichton (May 05 2026 at 16:37):

like, I'm just as unhappy about this as anyone else, but I don't think we have any other options available to us which don't start with "where do we move our 200+ concurrent runners to for free"

view this post on Zulip Pat Hickey (May 05 2026 at 16:40):

and if theyre offering any nontrivial number of concurrent runners to any random account that signs up for free, theyre just as unsustainable as github, so it needs to be somewhere that would sponsor a project like ours but make actual money off the lots of other people migrating off github... doesnt seem very likely

view this post on Zulip fitzgen (he/him) (May 05 2026 at 16:41):

or the BA could pay for it in theory, but it really needs to be a service, not something that we have to sys admin ourselves and build out CI infra from scratch on top of AWS or whatever

view this post on Zulip Pat Hickey (May 05 2026 at 16:43):

agreed with alex that we really have no choice but to grit our teeth and hope that either github gets their shit together or somehow an alternative appears with an on-ramp both technically (running our existing actions stuff with a minimum of finagling) and financially (could we afford 20k/yr? very likely. 100k/yr? not gonna happen)

view this post on Zulip Pat Hickey (May 05 2026 at 16:43):

I have not even attempted to pencil out what our runners would cost at e.g. ec2 list price

view this post on Zulip Pat Hickey (May 05 2026 at 16:44):

(could someone do that exercise? just for fun?)

view this post on Zulip fitzgen (he/him) (May 05 2026 at 16:49):

Pat Hickey said:

(could someone do that exercise? just for fun?)

I don't think we can just get off-the-shelf quotes with our scale, or at least not with circle CI. requires reaching out to their sales team. FWIW, a single CI run for us seems like it would eat all the monthly credits of their $15/month tier. gotta talk to their sales team for info on larger plans than that.

view this post on Zulip Alex Crichton (May 05 2026 at 16:50):

circle ci just got bought as well :grimacing:

view this post on Zulip Pat Hickey (May 05 2026 at 16:51):

oh yeah i just meant a back of the envelope in terms of ec2 prices and figure best case if youre paying a CI provider you're paying 2x ec2 list price

view this post on Zulip Pat Hickey (May 05 2026 at 16:51):

maybe more like 10x, idk

view this post on Zulip Pat Hickey (May 05 2026 at 16:52):

but ive never even priced it out in terms of ec2

view this post on Zulip Ralph (May 05 2026 at 17:02):

I wholeheartedly support anything you all want to do in order to get work done. No question there. The only thing I have done is make sure through my connections that they hear your conversations and they definitely know you aren't the only ones. Just the ones I know best.

view this post on Zulip Ralph (May 05 2026 at 17:02):

whatever you want to do, we're in.

view this post on Zulip Chris Fallin (May 05 2026 at 17:09):

(missed the convo I kicked off, oops) yeah, I generally agree that we don't have a ready-made realistic option; but "having the conversation again" is exactly evaluating where we are on the tradeoff axis right now.

back of the envelope: per https://github.com/bytecodealliance/wasmtime/actions/metrics/usage?dateRangeType=DATE_RANGE_TYPE_PREVIOUS_MONTH, in April we spent 777,773 CPU-minutes of CI time. That's 777773 * 60 / (30 * 86400) = 18.004 CPU-seconds per second, or 18 cores steady-state

of course the other high-order bits are (i) macOS and Windows too; (ii) load is spiky of course, we want high parallelism then have long periods of zero load. Our peak load for one full CI run is a few hundred jobs, probably something like 500 cores? (Above stats give total number of job runs but there's no way to know total number of CI triggers and distribution of job-count per trigger)

Let's say we provision 96 Linux/x86-64 CPU cores; that's 12 t3.2xlarge, each with 8 cores / 32GiB RAM; on a 1-year committed rate in US-East (Ohio) or US-West (Oregon) that's $0.2399/hr per machine, or $18300/year for the 12 machines. (https://aws.amazon.com/savingsplans/compute-pricing/)

Windows machines only go to t3.xlarge on that table (4 cores / 16 GiB) but taking 4 of those at $0.1732/hr is $4400/year.

macOS machines are $1.97/hour for bare-metal M4 Pro (14 cores, 48GiB); $12.5k/yr for that alone. M4s are pretty fast so that probably still wouldn't be the bottleneck.

So that's $35k/year for a little CI fleet with pretty good Linux capacity and acceptable Windows and macOS, as a floor. Plus sysadmin overhead (no idea what software exists, but I know that there are ready-made open source CI packages out there)
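
The back-of-the-envelope numbers above can be re-run with a short Python sketch. The hourly rates, machine counts, and CPU-minute total are taken from the message; the variable and function names are hypothetical. One caveat: at a full 8,760 hours per year the fleet comes to roughly $48.5k, while the quoted figures ($18,300, $4,400, $12.5k, about $35k total) line up with roughly 6,336 billed hours per year. That hour count is an inference from the numbers, not something stated in the thread.

```python
HOURS_FULL_YEAR = 24 * 365   # 8760: machines running around the clock
HOURS_INFERRED = 22 * 24 * 12  # 6336: assumption that reproduces the quoted totals

# Steady-state load: April CPU-minutes converted to CPU-seconds per wall-clock second.
cpu_minutes_april = 777_773
seconds_in_april = 30 * 86_400
steady_state_cores = cpu_minutes_april * 60 / seconds_in_april

# (label, machine count, committed hourly rate in USD), as quoted in the message.
fleet = [
    ("Linux t3.2xlarge", 12, 0.2399),
    ("Windows t3.xlarge", 4, 0.1732),
    ("macOS M4 Pro", 1, 1.97),
]


def annual_cost(count: int, hourly_rate: float, hours: int) -> float:
    """Yearly cost of `count` machines billed for `hours` hours each."""
    return count * hourly_rate * hours


def fleet_total(hours: int) -> float:
    return sum(annual_cost(count, rate, hours) for _, count, rate in fleet)


if __name__ == "__main__":
    print(f"steady-state cores: {steady_state_cores:.3f}")
    for label, count, rate in fleet:
        print(f"{label}: ${annual_cost(count, rate, HOURS_FULL_YEAR):,.0f}/yr at 8760 h")
    print(f"total at 8760 h: ${fleet_total(HOURS_FULL_YEAR):,.0f}")
    print(f"total at 6336 h: ${fleet_total(HOURS_INFERRED):,.0f}")
```

Either way the figure is a floor: it leaves out storage, network egress, and the operations time raised later in the thread.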

view this post on Zulip Chris Fallin (May 05 2026 at 17:10):

(and that single little mac mini is ouch; I wonder if the hardware can be rented cheaper elsewhere)

view this post on Zulip Chris Fallin (May 05 2026 at 17:10):

(to be clear I'm not volunteering to maintain that! but good to know what self-hosting cost would be, at a minimum)

view this post on Zulip Pat Hickey (May 05 2026 at 17:14):

Thanks Chris thats super helpful! so the minimum back of envelope a for-profit CI service that took our actions yamls and made it all go brr could conceivably charge us would be, say, 70k

view this post on Zulip Pat Hickey (May 05 2026 at 17:15):

thats definitely getting into the range of "the BA cant really afford that unless github is down to like 90% uptime"

view this post on Zulip Chris Fallin (May 05 2026 at 17:16):

yeah, maybe less with benefit of scale; down to maybe $20k compute costs if one gets closer to the 18-core true steady-state cost, with perfect binpacking across customers, so 40-50k As A Service

view this post on Zulip Chris Fallin (May 05 2026 at 17:18):

Pat Hickey said:

thats definitely getting into the range of "the BA cant really afford that unless github is down to like 90% uptime"

the, uh, "good news" is that depending on how you count, GitHub is currently at 84.88% uptime

view this post on Zulip Pat Hickey (May 05 2026 at 17:20):

im not going to really look into exactly how thats calculated there except a gut feel that its a little bit pessimistic. not that the actual situation is good, but

view this post on Zulip Chris Fallin (May 05 2026 at 17:21):

Yeah, it's "any active incident" I think, which includes both "CI isn't running at all" and e.g. "search might not be working". 97% for Actions is still not great ("one 9") but not yet truly catastrophic

view this post on Zulip Till Schneidereit (May 05 2026 at 17:22):

if we want to have a chance to make any of this actually work in a sustainable way, I don't think we could do it without a full-time ops person. And that's on top of the investment needed to port everything we have to this new setup that we'd be setting up. Which for the record my team will definitely not be able to prioritize at all

view this post on Zulip Till Schneidereit (May 05 2026 at 17:24):

which is to say, I think these numbers are highly optimistic when it comes to what we actually have to invest including paid time investment, and that I'm also skeptical we could really do better when it comes to reliability etc without investing several times as much as what we're getting from GitHub for free

view this post on Zulip Chris Fallin (May 05 2026 at 17:33):

So the more realistic option if we have to go there is another CI provider -- the aws self-hosted option above is both a datapoint and (as Pat said) a floor-with-some-profit-multiplier for what this would likely cost via SaaS. That at least cuts out the need for ops folks and (hopefully, if a provider is competitive) has better reliability, though the porting cost is still huge. (Business opportunity for someone to build a GHA-compatible CI host though!)

Agreed this is not the practical choice today given all the above. I just wish we had better options!

view this post on Zulip Jacob Lifshay (May 05 2026 at 18:26):

AWS is pretty expensive, for comparison renting 500 cores worth of 12-core VPS instances from Contabo (what I'm using for my own web server, though I have a 6-core VPS) comes out to $12k/yr

view this post on Zulip Pat Hickey (May 05 2026 at 20:57):

Yeah agree with Till we should not consider running our own CI, whether on a cloud or renting bare metal. we'd need a fully managed offering, and it would have to be very substantially compatible with github actions and all the related machinery required to make releases.

view this post on Zulip Pat Hickey (May 05 2026 at 21:03):

And thats a big barrier. I assume that somewhere at some other hyperscaler there is a team vibe coding at that goal as hard as they possibly can right now to make hay while the sun shines, but we dont want to be the alpha testers for that solution either. which is why my "wait and see what the world looks like in 6 months" estimate is probably itself unrealistic. we are stuck with what we have got for a while, best we can do is re-evaluate what the world looks like this winter

view this post on Zulip Alex Crichton (May 12 2026 at 17:03):

mail is coming in slow today -- https://www.githubstatus.com/incidents/z3jhyg3l0dvx

view this post on Zulip Notification Bot (May 12 2026 at 17:04):

A message was moved here from #wasmtime > Should we rethink our disclosure policy? by Alex Crichton.


Last updated: May 26 2026 at 09:09 UTC