Stream: general

Topic: Porting to a custom ISA


Roger Thorpe (Dec 29 2023 at 21:53):

We have a custom processor that we use deep inside most of our ASICs. It is dual-stack based with 3 auto-increment address registers. Somewhat reminiscent of, I hate to say it, Forth. We have a new ASIC close to fabrication that has 256 of these processors and we are looking for a better software solution. It seems WebAssembly might be the answer. We are looking for insight on how to do this. Comments, please.

nagisa (Dec 29 2023 at 22:29):

One downside of Wasm for a use case like this is that its instruction set is not quite minimal (so you end up needing implementations for instructions like popcnt, as compilers generally won’t provide a way to avoid them). Another is that Wasm is not really Wasm without all of the runtime bits around it (e.g. entities like memories).
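To make the popcnt point concrete, here is a rough sketch of the kind of software fallback a target would need if it had no native bit-counting instructions. The function names are made up for illustration; the zero-input results (32 for both clz and ctz on a 32-bit value) follow the Wasm semantics for `i32.clz` and `i32.ctz`.

```python
MASK32 = 0xFFFFFFFF  # Wasm i32 values are 32 bits wide


def i32_popcnt(x: int) -> int:
    """Count set bits by repeatedly clearing the lowest set bit."""
    x &= MASK32
    count = 0
    while x:
        x &= x - 1  # clears the lowest set bit
        count += 1
    return count


def i32_clz(x: int) -> int:
    """Count leading zero bits; an all-zero input yields 32."""
    x &= MASK32
    return 32 - x.bit_length()


def i32_ctz(x: int) -> int:
    """Count trailing zero bits; an all-zero input yields 32."""
    x &= MASK32
    if x == 0:
        return 32
    return (x & -x).bit_length() - 1  # isolate the lowest set bit
```

On hardware that already has these as single instructions, each function collapses to one opcode; the sketch only shows what the toolchain has to supply otherwise.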

Roger Thorpe (Dec 29 2023 at 23:55):

In general, adding custom instructions to our architecture is what we do. Indeed, we already have popcount, and adding clz and ctz is trivial. Embedded systems generally have limited and fixed memories, so that might be a bit of a challenge, but you never know until you try.
Does anyone know if this has been done successfully before? We might learn from their agony.

Chris Fallin (Dec 29 2023 at 23:58):

You could certainly use Wasm as part of a compiler pipeline; if the machine is Turing-complete and has enough memory, then it should be possible to AOT-compile Wasm bytecode to your custom ISA. Two things though:

Roger Thorpe (Dec 30 2023 at 02:45):

Just as I thought, WASM's stackiness is very primitive, so it's only a handy way to describe things.
The real issue for us is not to run general applications but to be able to run high-level code that is targeted for our embedded device. We see the security as an asset to our systems. Memory is an issue as 64k is not far off maximum.
Should we just look to Rust?

Chris Fallin (Dec 30 2023 at 08:37):

Wasm has a pretty specific security guarantee: it gives a strong sandbox boundary. Is that what your use-case needs (e.g. downloading and running untrusted code)?

I ask because, at least in the embedded realm, security could mean “against malicious firmware” or it could mean “against memory unsafety vulnerabilities”. For the latter, note that within the sandbox the Wasm code can still corrupt its own heap if written in an unsafe language; the damage would just be contained to the sandbox. Writing the Wasm module itself in Rust protects against that. They’re really orthogonal dimensions.
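A minimal sketch of that distinction, assuming a single 64 KiB linear memory (one Wasm page). `LinearMemory` and `Trap` are hypothetical names for illustration, not part of any real runtime: every access is bounds-checked against the sandbox, but nothing stops the guest from overwriting its own data inside it.

```python
import struct


class Trap(Exception):
    """Raised when the guest tries to leave its sandbox; the host decides what happens next."""


class LinearMemory:
    """A fixed-size, bounds-checked byte array standing in for a Wasm memory."""

    def __init__(self, size: int = 64 * 1024):
        self.data = bytearray(size)

    def _check(self, addr: int, width: int) -> None:
        # Reject any access that is not fully inside the memory.
        if addr < 0 or addr + width > len(self.data):
            raise Trap(f"out-of-bounds access at {addr} (+{width})")

    def store_i32(self, addr: int, value: int) -> None:
        self._check(addr, 4)
        struct.pack_into("<I", self.data, addr, value & 0xFFFFFFFF)

    def load_i32(self, addr: int) -> int:
        self._check(addr, 4)
        return struct.unpack_from("<I", self.data, addr)[0]
```

A store at address 2 that clobbers half of a value stored at address 0 goes through without complaint (in-sandbox corruption), while a 4-byte store at address 65534 raises `Trap` because it would cross the end of the memory.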

It’d be useful to know your threat model. Is all the software within the Wasm module or is there something beneath it that the sandbox guards?

Roger Thorpe (Dec 30 2023 at 23:09):

Our ASICs are secure in that they use PUFs for security keys etc., and our code and data within the ASIC need to be secure, BUT we need to allow 3rd party applications to run as well. So, assuming that the application to be downloaded is from a secure source, we are still vulnerable to 'ourselves and our customers', be they malicious or not. So we are thinking about only accepting WASM as input, but we still seem to have security holes. We currently have a supervisor processor that receives incoming encrypted apps into a sandbox. We decrypt it and signature check it prior to loading it into a code memory anywhere on the ASIC. Even that may not be enough. We are certainly memory constrained, but could we leave the code in the sandbox and run a lightweight interpreter (or something) over it to ensure that the code cannot sneak out of its bounds? The contents of the sandbox would then be converted to our ISA and moved to code memory.

I guess we don't care if a 3rd party app corrupts its own heap as long as we can detect it, trap it, delete it or something.

It seems that if we start with a simple WASM interpreter that takes 3rd party apps and runs them regardless of optimization then we at least have another layer of security.
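As a rough sketch of what "start simple" could look like: a toy stack interpreter that traps instead of misbehaving. This is not the Wasm binary format; the program is a hypothetical list of `(opcode, operand...)` tuples covering only a few i32 instructions, and `run` and `Trap` are illustrative names.

```python
MASK32 = 0xFFFFFFFF  # keep every value inside 32 bits, as Wasm i32 arithmetic does


class Trap(Exception):
    """Raised for any guest fault so the supervisor can stop or delete the app."""


def run(program):
    """Execute a straight-line list of (opcode, *operands) tuples; return the final stack."""
    stack = []

    def pop():
        if not stack:
            raise Trap("operand stack underflow")
        return stack.pop()

    for op, *args in program:
        if op == "i32.const":
            stack.append(args[0] & MASK32)
        elif op == "i32.add":
            b, a = pop(), pop()
            stack.append((a + b) & MASK32)
        elif op == "i32.mul":
            b, a = pop(), pop()
            stack.append((a * b) & MASK32)
        elif op == "drop":
            pop()
        else:
            # Anything the interpreter does not know is rejected, not guessed at.
            raise Trap(f"unsupported opcode {op!r}")
    return stack
```

For example, `run([("i32.const", 2), ("i32.const", 3), ("i32.add",)])` leaves `[5]` on the stack, while a lone `("i32.add",)` raises `Trap`. A real module would also be validated before execution, which is where most of Wasm's guarantees come from.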

We are not afraid of stack machines. They can be made to run faster than RISC machines and have superscalar capabilities. Our current pipelines are running at over 10 GHz, so a few extra cycles for a client app is no big deal!

We can take a leaf out of Elon's book and start simple and optimize.


Last updated: Jan 24 2025 at 00:11 UTC