Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for code loading for upgradable contracts and proxying via LDC. #451

Open
otrho opened this issue Feb 4, 2023 · 14 comments
Open
Assignees
Labels
enhancement New feature or request help wanted Extra attention is needed question Further information is requested

Comments

@otrho
Copy link

otrho commented Feb 4, 2023

LDC has not really been tested thoroughly and currently the VM+compiler have many shortcomings around supporting its use.

Related issues:

Goals

Loading code via LDC is desired as a cheaper alternative to making cross contract calls to utilise library oriented functionality from already on-chain code.

I'm still a bit hazy on the actual mechanics of the following, but it's also desired for upgrading contracts and implementing proxying.

All of the discussion below assumes the granularity of 'code' to be loaded is a Sway function. They would be loaded and called using Sway's calling convention.

Problems

There are a few problems with the current implementation, some simple, some not so simple.

  1. LDC reverts if $sp != $ssp. This was made in an effort to avoid clobbering user data but it actually mostly hinders its use. The user (i.e., the Sway compiler usually) should be able to manage the stack itself, understanding that LDC will load to $ssp.
  2. The compiler does not currently produce position independent code (PIC). If LDC were to be used on some arbitrary code in a contract, placing it at an offset different to where it is in the origin, the control flow will be corrupt. This is due to the VM (and compiler) primarily using absolute addressing for jumps.
  3. There is no way to determine where the code to be loaded can be found using the compiler and tooling. LDC takes an external contract address, an offset and a length and copies the code to $ssp. To use it today would have to be done entirely in embedded ASM with hardcoded values calculated manually.

Proposed solutions

Respectively:

  1. Relax the rules for $sp. Easy.
  2. Introduce relative jumps, and deprecate absolute immediate jumps. Keep dynamic absolute jumps, currently used for returning from function calls and far, far calls. The compiler switches to only emitting PIC for all contracts. This way we don't have to decide ahead of time whether code may be loaded in the future.
  3. Here's where it gets particularly gnarly.

Tooling and compiler

One proposal is to make code loading essentially a compile time operation. That is, if a contract wants to load from an external contract it must be specified at compile time (of the loading contract) where to load from. This would be the simplest solution for a few reasons:

  • The knowledge of function offsets and lengths can be determined at compile time of any contract and emitted as an artefact, alongside the binary, ABI spec, etc. These can then be used by the compiler for contracts which wish to load those functions. The symbol tables could be published or carefully regenerated, if need be.
  • Which functions to be loaded could be specified in Forc.toml, which formalises the process somewhat.

The compiler then has two options for performing the actual loads. These methods aren't exclusive though the former is simpler than the latter:

  • The compiler could produce a binary which performs all loading before jumping to any entry points. The offsets to these functions would be baked into the binary as constants and could be called easily. The compiler would also not need to manage the stack, as prior to an entry point being called there is no user data or call frames to preserve.
  • An alternative would be to allow dynamic loading. The symbol table in Forc.toml is still used but the actual loading is done conditionally at runtime. In this case the compiler would need to manage the stack, copying it to somewhere safe first. Intrinsic functions would be needed for both reading symbol table values and for making the load.

The dynamic loading would require some extra support from the VM for stack manipulation. Currently we have only CFEI and CFSI which grow and shrink the stack (i.e., increment and decrement $sp) by immediates. I'm thinking the process of moving the stack contents would be something like:

  1. Allocate code_len bytes on the the stack, where code_len is the size of the code loaded by LDC. I.e., $sp += code_len.
  2. MCP the current stack content from where it was to up to the top of the stack, being careful of overlaps, etc.
  3. LDC the code. The new $ssp should be at the bottom of the newly copied stack.
  4. Restore $sp to the top. This is only required if LDC resets $sp to the new $ssp. If it were to leave $sp alone we'd be fine.

Step 1 and probably 4 both need to increment $sp by an arbitrary value, requiring something like a CFE instruction. #346

Outstanding questions

For each of the use cases does the above address them?

For simply loading a library function the static method seems to work well. Particularly for functions called a lot, it'd be cheaper than contract calls.

For upgrading contracts? I'm not sure how that would work -- is the idea that you can implement a new contract, copy a bunch of old code from an older version and add/replace some newer functionality? It could use the static method above. This would be cheaper than publishing a contract which redundantly contains already on-chain code, so it makes sense to me. But is that the idea?

And for proxies? I think the idea is the proxy contract has addresses in storage and loads code from them to call, with an admin API for updating those addresses and lengths. In that case the dynamic method from above would work, I think. The admin would load from the symbol table and send them to the proxy. The proxy would call the load instrinsic.

Anyway, this is an actual Request For Comments, especially from @adlerjohn @Voxelot @simonr0204 @mitchmindtree @mohammadfawaz and @nfurfaro but I'm sure it's interesting for others on the @FuelLabs/client @FuelLabs/tooling and @FuelLabs/sway-compiler teams. @sezna too, if you like. 😉

@otrho otrho added enhancement New feature or request help wanted Extra attention is needed question Further information is requested labels Feb 4, 2023
@otrho otrho self-assigned this Feb 4, 2023
@nfurfaro
Copy link
Contributor

nfurfaro commented Feb 4, 2023

Thank you for writing this up @otrho !
From a user's perpective, I feel that of the 2 primary use cases, upgrading contracts is probably the more critical of the 2, but statically loading onchain libraries allows us to have practically ~unlimited contract size so also brings a lot of value.
As for upgradability/proxies:
I want to be able to deploy my contract once, and give the users of my contract the assurance that it will always be found at a particular ContractId. This way, they can build on top of it knowing it will always be there.
But, I also want the freedom to iterate on the logic provided by my contract and improve it, especially in the early days.
Finally, I want to preserve the state(storage) of my contract, having it persist throughout potentially multiple upgrades.
On Ethereum this is achieved via Proxy contracts and delegateCall. The Proxy gives us the fixed address and the persistent storage. The delegateCall instruction is used to call Version1 of the Implementation contract (where the business logic lives). This call executes in the context of the Proxy contract, so it can mutate the state of the proxy contract.
When Version2 of the implementation is deployed, and admin must call the Proxy and update the stored implementation address to point at Version2. Now, calls to the Proxy use the new upgraded business logic, but have the same state as Version1 had.

@nfurfaro
Copy link
Contributor

nfurfaro commented Feb 4, 2023

Crazy idea:
What if we decouple the mechanisms for upgradability and loading code ?
LDC could be used for loading onchain libs, and would be static , i.e: all loading could be done at compile time which might simplify the changes needed in the compiler?
For upgrades, the proxy would still provide a persistent ContractId for users, and would still store the ContractId of the current implementation contract. it would simply forward calls to the implementation with a call.
Storage would be handled by the implementation. There would need be a mechanism to deploy version2 of the implementation with its storage initialized with a copy of version1's storage, which provides us with the persistent storage property we need.
This involves more calls than the pure ldc based approach, but is quite similar to Ethereum's delegateCall approach.

@otrho
Copy link
Author

otrho commented Feb 6, 2023

OK, so traditional upgrading and proxying are essentially the same thing, the former uses the latter. In that case the dynamic loading is how we could proceed. Solidity's delegateCall is replaced with a load_code and then a call with Sway.

But I think your crazy idea could be maybe done with a special 'upgradeable' flag for ABI methods. Each flagged method would have builtin delegation, managed entirely by the compiler, which has a stub to dynamically link those methods via storage and LDC before running the first entry point. To upgrade them would need to be formalised via a std library call from a script or something, to make sure it's done correctly.

This actually seems like a nice compromise. Simpler for the compiler (no stack crap nor CFE needed) but still pretty powerful and perhaps easier to use, from a Sway dev perspective. It might be too inflexible for some applications though, in which case we go ahead with fully dynamic loading support (and CFE).

One caveat with compiler supported upgradability would be the cost of loading functions which aren't called. Every upgraded function would have to be loaded whether its called or not. Even checking whether a function has been upgraded or not has non-zero cost, since it's a storage read.

@mitchmindtree
Copy link
Contributor

mitchmindtree commented Feb 6, 2023

Nice work on writing this up @otrho!

Upgradeable Contracts

Storage would be handled by the implementation. There would need be a mechanism to deploy version2 of the implementation with its storage initialized with a copy of version1's storage, which provides us with the persistent storage property we need.

I guess we'd also want to ensure that there was some way for users who interact with the contract to easily check what the current "implementation" contract ID is, as users interacting with upgradeable contracts probably want to be able to verify the current implementation contract ID. I.e. they need to be able to construct transactions that are invalidated if the contract is updated before their transaction is submitted. I imagine this is easy to do with the "dynamic loading" approach, as your proxy contract can just provide a non-delegate method that returns the delegate contract ID. For an "upgradeable" feature, we'd probably need to generate some method for this.

I guess this implies upgradeable contracts might also need a special case for the forc deploy command, e.g. one that:

  1. Submits a Create transaction for V2 implementation contract and then
  2. Submits a Script transaction for updating the upgradeable contract's storage to point to V2.

Edit: Come to think of it, that means forc build should probably generate both contract and script binaries for upgradeable contracts too.

This seems like it would very much bake-in the way of upgrading contracts into Forc and Sway. On the other hand, if we were to go the dynamic loading approach, maybe we could be more hands-off and let the Apps team and wider Fuel ecosystem work out the nicest way of upgrading contracts over time with custom forc plugins, without us taking on the responsibility of getting it right the first time ourselves?

Dynamic Loading Attributes?

If we're to go the dynamic loading route, do we want to require some kind of attribute decorator for functions that touch LDC code anyway? Especially in the case that the function is also #[storage(read)], as that implies that the function can potentially behave completely differently throughout it's life on-chain based on what is in storage (e.g. if the value being read from storage is a ContractId)? Devs might want to be particularly aware if the function that they're calling can do totally arbitrary logic? Or maybe the #[storage(read)] attribute is already enough of a heads-up?

LDC + Storage Interaction

Related side-note: is the idea that code loaded via LDC can interact with contract storage in an arbitrary way? Perhaps we should conservatively disallow dynamically loading code that interacts with storage in favour of only allowing to load "pure" functions, so that an author can limit what parts of storage their loaded code has access to? That said, I guess it might be hard to know what limitations you want to apply when the whole point of the feature is upgradeability 🤔

@freesig
Copy link
Contributor

freesig commented Feb 6, 2023

Just throwing in my two cents. I'm not sure we should have upgradable contracts at all.
A contract is only as secure as it's upgrade policy because an upgrade allows you to rewrite the rules of the contract. Which means you can do anything.
Since the upgrade needs to be performed synchronously, you must either have all users online at the same time to agree on the upgrade (virtually impossible) or have some form of delegated trust.
But this is delegating trust to something that can change the rules of the game.

If some part of an application is fine with delegated trust then why even run it on chain?

The alternative is to keep older versions running and bridge between the old and the new version. This allows each user to decide if they want to "upgrade" in the same way they decide whether or not they want to stay on any contract or chain.

@nfurfaro
Copy link
Contributor

nfurfaro commented Feb 6, 2023

traditional upgrading and proxying are essentially the same thing, the former uses the latter.

Yes, although upgrades are not the only use case for proxies. Along the lines of ERC-1167, they are also used for "minimal proxy" clones.

In cases where a potentially large number of contracts will be deployed and they have some portion of their functionality in common (think smart-contract wallets, where each contract is nearly identical other than perhaps the owner), a single template/implementation contract is deployed, and then any number of tiny proxies are deployed, each with perhaps a different owner. The proxies delegate all computation to the implementation (which may or may not be upgradeable). In the case of smart-contract wallets the users may need to opt-in to upgrades (pretty sure argent wallet takes this approach).
While I feel like having upgradable contracts is going to be a highly sought-after feature on Fuel, I do think it's critical that we keep the implementation flexible to allow developers to:
a. choose whether or not their contracts will be upgradeable, and
b. implement a migration plan from upgradeable => locked/no longer upgradeable after some level of feature completeness/stability is reached.

is the idea that code loaded via LDC can interact with contract storage in an arbitrary way?

Yeah, this is what delegateCall does, basically allowing remote code (from an external contract) to be executed in the context of the calling contract, which allows upgradeable contracts via proxy to work. It is on the developer to ensure their contract only loads code from a trusted/verified source!

we'd also want to ensure that there was some way for users who interact with the contract to easily check what the current "implementation" contract ID

100%. in so called "Transparent Proxies" this is usually achieved by storing the implementation address in a "public" storage slot, and there is a getter for this called implementation().
The newer UUPS style of proxy does this a bit differently, but seems to be the way that OpenZeppelin is pushing users to adopt.

@mitchmindtree
Copy link
Contributor

Yes, although upgrades are not the only use case for proxies

Considering this, my intuition is that we're probably better off going the dynamic loading route as it unlocks all this flexibility under a single feature, and let's the library ecosystem experiment with nice APIs on top for proxies and upgradability. Later on we can think about how we can enable those patterns more easily in tooling as the Fuel-y way is discovered.

It might be a shame to bake-in specific support for upgradability into the language and stock Forc commands, only to end up requiring dynamic loading in the future for other kinds of proxies anyways, and then potentially have some library come up with a nicer / more Fuel-esque way of upgrading contracts anyways that makes the in-language support redundant?

@simonr0204
Copy link
Contributor

@freesig , to your point

If some part of an application is fine with delegated trust then why even run it on chain?

This is pretty common though - a governance module proposes a change which only becomes actionable when a vote threshold is reached.

@freesig
Copy link
Contributor

freesig commented Feb 7, 2023

@freesig , to your point

If some part of an application is fine with delegated trust then why even run it on chain?

This is pretty common though - a governance module proposes a change which only becomes actionable when a vote threshold is reached.

Yeh not to derail this convo. @Voxelot gave the example of weighted staked voting as a reason that you might want this on-chain. This does make sense to me. I guess my worry is that all these approaches are by definition are a weaker security model and if it's not clear that you are interacting with a contract that does have lower security then users might get into trouble and blame it on fuel. It would be nice if you knew when interacting with a contract that it or any of it's dependencies do some form or security weakening so that it's clear.

Imagine FOO DEX that has an upgradable contract via some DOA vote. If that vote was corrupted or some how influenced so the contract was upgraded to an malicious contract and anyone using FOO DEX looses their tokens will they blame FOO DEX or Fuel? I'm honestly not sure but it's something to consider when making these patterns super easy for devs to implicitly include.

@nfurfaro
Copy link
Contributor

nfurfaro commented Feb 8, 2023

@freesig I agree that generally speaking, adding the option to upgrade contracts takes away from certain guarantees unique to blockchain, i.e: the code deployed at this ContractId will always be the same.
I feel that it's also true that it's very difficult to build and deploy non-trivial software in 1 go; having the ability to get feedback and iterate is hugely valuable when it comes to building something that will stand the test of time.
Contracts that are always upgradeable forevermore may as well just be deployed to AWS. But if there's a mechanism to disable upgrades after a specified point, I think upgrades can add value rather than detract from it. I don't think it will be Fuel's place to make this design decision for teams building on Fuel, but we should give them the tools to build great things and let their users judge their approach.

Just my 2 cents... :)

@otrho
Copy link
Author

otrho commented Feb 13, 2023

OK, there are still a bunch of questions/unknowns here. It looks like the preferred use case in Sway will be dynamic loading, implying conditional / optional calls to LDC, meaning we definitely need CFE along with dynamic jumps from the VM.

The unit of code that we'll be able to load is a function. But what does that mean? Does it need to be an entry point in the external contract? Currently they're the only functions with a published ABI and also the only functions guaranteed not to be inlined, so I think this is a must. At the same time the loaded code can't make any function calls of its own as they won't be loaded. i.e., for them to be usable they must have all calls inlined. This actually isn't really feasible.

I think the idea that we allow loading of arbitrary code from another contract doesn't work. We need to be able to prepare that code for loading. It needs to have its ABI published (via build artefacts) and to be a self contained unit with all its library calls and other calls inlined.

And if this is the case then we could keep absolute control flow for some cases and use relative jumps most of the time (and all of the time for loadable functions).

Another question which remains around dynamic loading is what this looks like in Sway. We don't have function pointers (yet) so we'll need some other magic. Or both.

Naively we could have a magic type (like the current ABI/Contract stuff) where you say:

    let delegate_obj = load_external_spec_for("function identifier of some sort pulled from storage");
    delegate_obj.call("calling the delegate");

For this to type-check the delegate_obj would need to be a special type, and knowing the args to call() would be tied to that identifier and loading it. I have a feeling making this work in a statically typed language like Sway is going to be very hacky.

A safer approach would be to just declare a function signature as external (a bit like FFI declarations) and to call it. The actual loading would be done on demand, as it's called (the first time) and type checking would be easier as the top level declaration is just there. But parameterising that declaration to be dynamically configured is tricky.

The compiler could handle it all, as in managing how the metadata around the external function specs are loaded from and stored in storage. Something like an ExternalFunction struct which can only live in storage blocks. The external function declarations reference them somehow, via an attribute? And there's an API for updating the ExternalFunction spec which would be type checked as much as possible, although that's down to trusting the artefacts.

Wow, when spelled out like that it sounds really dangerous. Obviously updating the ExternalFunction to point to somewhere new would need to be protected by an 'owner only' predicate which also feels like should be handled by the compiler, otherwise we're providing a very unsafe mechanism which could very easily be misused.

@otrho
Copy link
Author

otrho commented Feb 13, 2023

Nick said:

I feel that it's also true that it's very difficult to build and deploy non-trivial software in 1 go; having the ability to get feedback and iterate is hugely valuable when it comes to building something that will stand the test of time.

I think that this iteration, update, feedback loop could be done off chain though, with proper publishing tools, like a crates.io repository or similar. i.e., if there's a new version then make a noise. As someone with no experience in this area, this whole idea seems really unsafe and prone to exploitation.

@nfurfaro
Copy link
Contributor

I think that this iteration, update, feedback loop could be done off chain though

I feel like the gold standard for contracts is to be immutable. So even upgradeable contracts ought to be striving to disable the upgrade mechanism at some point, ideally asap. The nice thing about upgrades via the proxy pattern is that an upgradable contract can have a fixed ContractId, ,allowing other contracts to integrate with them without themselves needing to be upgradeable just to keep up with the changing ContractIds.

@otrho
Copy link
Author

otrho commented Feb 16, 2023

There is a need to try and nail the core functionality down, particularly around the VM, prior to beta-4 and definitely before mainnet.

At a minimum we need PIC, which implies relative jumps are added to the VM and the Sway compiler employs them everywhere.

@adlerjohn has suggested offline that the MVP proxy could then be implemented in ASM directly. It would be fairly coarse in that it could only load entire external contracts, but all that is required in that case is to perform the actual load and then to manage the delegation (for all selectors), along with the mechanisms for updating the metadata around the external contract description (i.e., the address and size of the external contract in storage).

After this MVP the refinements around how to implement the same safely in Sway can be designed and implemented progressively without the pressure of future milestone deadlines and breaking existing code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed question Further information is requested
Projects
None yet
Development

No branches or pull requests

6 participants