add C/C++ guide #134
base: master
Changes from 16 commits
5265b6d
1f64706
9650e38
7aecffd
85343a2
b1471d7
a52a723
696768c
c229d16
0556196
19253ee
9585d5c
1e5ee1b
63d742c
5c6a92b
c6ca269
912f6ed
3ae2678
4dab415
e86d665
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
# Compiling C/C++ to Ewasm | ||
|
||
First an introduction, then a basic step-by-step guide, then advanced things. Warning: the Ewasm spec and tools below are subject to change. | ||
|
||
## Introduction | ||
|
||
An Ewasm contract is a WebAssembly module with the following restrictions: | ||
|
||
- The module's imports must be among the [Ewasm helper functions](https://github.com/ewasm/design/blob/master/eth_interface.md) which resemble EVM opcodes to interact with the client. | ||
- The module's exports must be a `main` function which takes no arguments and returns nothing, and the `memory` of the module. | ||
- The module may not use floats or other [sources of non-determinism](https://github.com/WebAssembly/design/blob/master/Nondeterminism.md). | ||
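One practical consequence of the no-floats restriction is that fractional quantities have to be expressed with integers. The following is a minimal sketch, assuming a contract wants to take a percentage of an amount; the function name `apply_bps` and the basis-point convention are illustrative assumptions, not part of any Ewasm API.

```c
#include <stdint.h>

/* Hypothetical convention: 10000 basis points == 100%. */
#define BPS_DENOMINATOR 10000u

/* Returns amount * bps / 10000 using only integer operations,
 * rounding down. The 64-bit intermediate product cannot overflow
 * for any pair of 32-bit inputs. */
uint32_t apply_bps(uint32_t amount, uint32_t bps)
{
    uint64_t scaled = (uint64_t)amount * (uint64_t)bps;
    return (uint32_t)(scaled / BPS_DENOMINATOR);
}
```

For example, `apply_bps(2000, 250)` computes 2.5% of 2000 without any floating-point instruction being emitted.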
|
||
## Caveats | ||
|
||
When writing Ewasm contracts in C/C++, one should bear in mind the following caveats: | ||
|
||
1. WebAssembly is still young and [lacks features](https://github.com/WebAssembly/design/blob/master/FutureFeatures.md). For example, WebAssembly lacks support for exceptions, and Ewasm offers no way to make system calls. Compilers and libraries are also immature. For example, we have a patched version of libc which allows `malloc`, but the patches are not yet enough for `std::vector` because other memory management calls are unavailable. Memory management beyond plain allocation may in any case be undesirable in Ewasm contracts, since it costs gas. This situation will improve as WebAssembly, compilers, and libraries mature. | ||
I don't think this statement belongs here, it leaves the reader with too many unanswered questions. I think the best thing to do would be to compile a list of open design questions in one place (not specific to C/C++) and provide a link to it somewhere in this doc. A link to open issues on the ewasm/design repo might suffice for now if we don't have a more mature doc. To make this more helpful and constructive, it would be nice to conclude this section by saying something along the lines of, "For now, to work around these issues, ensure that you only use basic structs and
Consider dropping, I don't think this is critical.
Agreed that a guide is not a place for design discussions. I overhauled this paragraph. |
||
|
||
1. In the current Ewasm design, all communication between the contract and the client is done through the module's memory. For example, the message data ("call data") sent to the contract is accessed by calling `callDataCopy()`, which copies this data into WebAssembly memory at a location given by a pointer. This pointer must point either to a statically allocated array or to memory dynamically allocated with `malloc`. For example, before calling `callDataCopy()`, one may call `getCallDataSize()` to find out how many bytes to `malloc`. | ||
Consider linking to the EEI specs for these two methods. Also, this would all be much clearer with an example using code. Could you maybe link to the wrc20 example code in C++, or even include it inline here?
I just tried including an example, but ended up having to explain too many things. This may overwhelm a first-time user. I think it is better to just give a concise high-level explanation, and allow the user to explore concrete examples when they know the basics. I overhauled this paragraph. |
||
|
||
1. In the current Ewasm design, the Ethereum client writes data into WebAssembly memory as big-endian, but WebAssembly loads and stores are little-endian, so the bytes appear reversed when data moves between memory and the WebAssembly operand stack. For example, when the call data is brought into memory using `callDataCopy`, and those bytes are loaded onto the WebAssembly stack using `i64.load`, the eight loaded bytes are reversed. So extra C/C++ code may be needed to load bytes from the correct location and to reverse the loaded bytes. | ||
|
||
1. The output of compilers is a `.wasm` binary which may have imports and exports which do not meet Ewasm requirements. We have tools to fix the imports and exports. | ||
lrettig marked this conversation as resolved.
|
||
|
||
1. There are no tutorials for debugging/testing a contract. Hera supports extra Ewasm helper functions to print things, which have helped in writing test cases. A tutorial is needed to allow early adopters to debug/test their contracts without having to do it on the testnet. | ||
Could you link to a doc on these, or otherwise make it more explicit here? Assume you are talking about things like
We should mention the state of debugging tools since it is important for developers. I changed it to say that early adopters can debug on the testnet for now. I am left feeling that there is a great need for tools.
Agree, was not suggesting removing this, but instead linking to docs we have elsewhere in this repo on debug tools.
Can you suggest a link? Do you think that it is reasonable to link instructions on how to write test fillers? |
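The call data caveat above can be made concrete with a small sketch that copies the call data into a statically allocated array. The prototypes are assumptions modelled on the `getCallDataSize` and `callDataCopy` helpers, so check the Ewasm helper function spec for the authoritative signatures. The `#ifndef __wasm__` section is a hypothetical host-side stand-in that only exists so the sketch can be compiled and run natively; `CALL_DATA_MAX` and `read_call_data` are illustrative names.

```c
#include <stdint.h>

/* Assumed prototypes for two Ewasm helper functions. */
uint32_t getCallDataSize(void);
void callDataCopy(uint8_t *result, uint32_t data_offset, uint32_t length);

#ifndef __wasm__
/* Host-side stand-ins (not part of Ewasm) so the sketch runs natively. */
static const uint8_t fake_call_data[] = { 0xde, 0xad, 0xbe, 0xef, 0x01 };

uint32_t getCallDataSize(void)
{
    return (uint32_t)sizeof fake_call_data;
}

void callDataCopy(uint8_t *result, uint32_t data_offset, uint32_t length)
{
    for (uint32_t i = 0; i < length; i++)
        result[i] = fake_call_data[data_offset + i];
}
#endif

/* Statically allocated buffer; its address is a valid pointer argument. */
#define CALL_DATA_MAX 128
uint8_t call_data[CALL_DATA_MAX];

/* Copies all call data into call_data. Returns the number of bytes
 * copied, or -1 if the call data does not fit in the buffer. */
int32_t read_call_data(void)
{
    uint32_t size = getCallDataSize();
    if (size > CALL_DATA_MAX)
        return -1;
    callDataCopy(call_data, 0, size);
    return (int32_t)size;
}
```

Checking the size before copying matters here because the helper writes wherever the pointer says, so an oversized message would otherwise overwrite neighbouring memory.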
||
|
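For the byte-order caveat above, the extra C code might look like the following sketch. Both function names are illustrative. `reverse_bytes64` reverses a value that was loaded with little-endian semantics, while `load_be64` reads the big-endian bytes one at a time and therefore gives the same result regardless of the byte order of the machine it runs on.

```c
#include <stdint.h>

/* Reverses the byte order of a 64-bit value. */
uint64_t reverse_bytes64(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (x & 0xff);  /* move the lowest byte of x into r */
        x >>= 8;
    }
    return r;
}

/* Interprets the eight bytes at buf as a big-endian unsigned integer. */
uint64_t load_be64(const uint8_t *buf)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | buf[i];      /* the first byte is the most significant */
    return v;
}
```

The byte-wise loader avoids any dependence on how the compiler lowers a 64-bit load, at the cost of a few more instructions.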
||
## Basic Step-by-Step Guide | ||
|
||
First let's build the latest version of LLVM. Note: this section of the document allows you to build LLVM without any standard libraries. If you wish to use C/C++ standard libraries, then build the version of LLVM in the Advanced section below. That version can also be used here. | ||
|
||
```sh | ||
# check out LLVM, clang, and lld | ||
svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm | ||
Can we use git instead of svn? The instructions from Jake's doc seem reasonable no?
I mentioned that the official guide http://llvm.org/docs/GettingStarted.html uses svn. But changed our guide to use git.
I am still compiling the git version to test it. Will revert to svn if there is a problem.
The git version successfully compiles wrc20. Compiling the git version of LLVM had a few errors along the way, but restarted each time and finally it finished.
Didn't realize svn was in the official guide. Glad to hear it works with git too! |
||
cd llvm/tools | ||
svn co http://llvm.org/svn/llvm-project/cfe/trunk clang | ||
svn co http://llvm.org/svn/llvm-project/lld/trunk lld | ||
cd ../.. | ||
|
||
# build LLVM, clang, and lld | ||
mkdir llvm-build | ||
cd llvm-build | ||
# note: if you want other targets than WebAssembly, then delete -DLLVM_TARGETS_TO_BUILD= | ||
cmake -G "Unix Makefiles" -DLLVM_TARGETS_TO_BUILD= -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=WebAssembly ../llvm | ||
make -j 8 | ||
``` | ||
|
||
Warning: this build (the `make` step) can take hours, requires a lot of disk space and memory, and may cause your computer to freeze. If there is an error, try again without the `-j 8` argument (which runs up to eight parallel build jobs). | ||
|
||
Next, download and compile a wrc20 Ewasm contract written in C: | ||
|
||
```sh | ||
git clone https://gist.github.com/poemm/68a7b70ec353abaeae64bf6fe95d2d52.git cwrc20 | ||
``` | ||
|
||
Note that in `main.c`, there are many arrays in global scope: LLVM puts global arrays in WebAssembly memory, which allows them to be used as pointer arguments to Ewasm helper functions. Before compiling, make sure that the `Makefile` points to the `llvm-build` directory created above, and that `main.syms` lists the Ewasm helper functions you are using. | ||
|
||
Aside: If you are using C++, make sure to modify the Makefile to use `clang++`, and wrap the helper function declarations in `extern "C"`. | ||
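The two points above (global arrays as pointer arguments, and `extern "C"` around helper declarations) can be sketched in one small file. The prototypes are assumptions modelled on the `storageStore` and `storageLoad` helpers and should be checked against the Ewasm helper function spec. The `#ifndef __wasm__` stand-in, which models a single storage slot and ignores the key, is hypothetical and only there so the sketch runs natively. The `__cplusplus` guard lets the same declarations compile as C or C++.

```c
#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif
/* Assumed prototypes for two Ewasm helper functions. */
void storageStore(const uint8_t *key, const uint8_t *value);
void storageLoad(const uint8_t *key, uint8_t *result);
#ifdef __cplusplus
}
#endif

#ifndef __wasm__
/* Host-side stand-in (not part of Ewasm): one slot, key ignored. */
static uint8_t fake_slot[32];

void storageStore(const uint8_t *key, const uint8_t *value)
{
    (void)key;
    for (int i = 0; i < 32; i++)
        fake_slot[i] = value[i];
}

void storageLoad(const uint8_t *key, uint8_t *result)
{
    (void)key;
    for (int i = 0; i < 32; i++)
        result[i] = fake_slot[i];
}
#endif

/* Global arrays end up in WebAssembly memory, so their addresses
 * can be handed to the helper functions. */
uint8_t balance_key[32];
uint8_t balance_value[32];

/* Stores b as a 256-bit big-endian value (all other bytes zero). */
void set_balance_byte(uint8_t b)
{
    for (int i = 0; i < 31; i++)
        balance_value[i] = 0;
    balance_value[31] = b;  /* least significant byte comes last */
    storageStore(balance_key, balance_value);
}

/* Loads the stored value and returns its least significant byte. */
uint8_t get_balance_byte(void)
{
    storageLoad(balance_key, balance_value);
    return balance_value[31];
}
```

The sketch deliberately avoids `memset` and `memcpy`, since the basic guide builds without a C standard library.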
|
||
```sh | ||
cd cwrc20 | ||
# edit the Makefile and main.syms as described above | ||
make | ||
``` | ||
|
||
The output is `main.wasm`, which needs a cleanup of imports and exports to meet [Ewasm requirements](https://github.com/ewasm/design/blob/master/contract_interface.md). This cleanup can be done manually, with [PyWebAssembly](https://github.com/poemm/pywebassembly), or with [wasm-chisel](https://github.com/wasmx/wasm-chisel), a program written in Rust which can be installed with `cargo install chisel`. `wasm-chisel` is stricter and has more features, whereas PyWebAssembly is just enough for our use case, and Python is available on most machines. We therefore recommend using PyWebAssembly as follows: | ||
|
||
```sh | ||
cd .. | ||
git clone https://github.com/poemm/pywebassembly.git | ||
cd pywebassembly/examples/ | ||
python3 ewasmify.py ../../cwrc20/main.wasm | ||
cd ../../cwrc20 | ||
``` | ||
|
||
Check whether the command line output of `ewasmify.py` above lists only [valid Ewasm imports and exports](https://github.com/ewasm/design/blob/master/eth_interface.md). To troubleshoot, you may wish to also inspect `main.wasm` in its text representation, so proceed to the next step with binaryen or wabt. | ||
|
||
We can convert from the `.wasm` binary format to the `.wat` text format (sometimes given the `.wast` extension). A module can generally be converted back and forth between the two, though the round trip is not guaranteed to be byte-for-byte identical. This conversion can be done with Binaryen's `wasm-dis`. | ||
Strictly speaking, this is not true.
Agreed. I removed
Cool. It's clearly not critical but we can bring it up on a call and try to get everyone on the same page. |
||
|
||
Aside: Alternatively one can use Wabt's `wasm2wat`. But Binaryen's `wasm-dis` is recommended because Ewasm studio uses Binaryen internally, and Binaryen can be quirky and fail to read a `.wat` generated by another program. Another tip: if Binaryen's `wasm-dis` can't read the `.wasm`, try using Wabt's `wasm2wat` then `wat2wasm` before trying again with Binaryen. | ||
I think that this binaryen section should be removed and replaced with a link to Hugo's doc once it is merged. |
||
|
||
```sh | ||
cd .. | ||
git clone https://github.com/WebAssembly/binaryen.git # warning 90 MB, can also download precompiled binaries which are 15 MB | ||
cd binaryen | ||
mkdir build && cd build | ||
cmake .. | ||
make -j4 | ||
cd ../../cwrc20 | ||
../binaryen/build/bin/wasm-dis main_ewasmified.wasm > main_ewasmified.wat | ||
``` | ||
|
||
`main_ewasmified.wat` is an Ewasm contract in text format. See other notes for how to deploy it. Happy hacking! | ||
|
||
|
||
## Advanced | ||
|
||
The above guide is for compiling a C file without libc. Next we use wasmception, a minimal toolchain which includes libc and libc++ as well as patches allowing things like `malloc`. | ||
|
||
```sh | ||
git clone https://github.com/yurydelendik/wasmception.git | ||
cd wasmception | ||
make # Warning: this required lots of internet bandwidth, RAM, disk space, and one hour compiling on a mid-level laptop. | ||
cd .. | ||
``` | ||
Note the end of the output of the above `make` command; it should include something like `--sysroot=/home/user/repos/wasmception/sysroot`.
|
||
Next we will download and build a version of wrc20 which uses `malloc`. Make sure to edit the `Makefile` with the sysroot data above, and change the path of `clang` to our newly compiled version which may look something like `/home/user/repos/wasmception/dist/bin/clang`. Make sure that `main.syms` has a list of Ewasm helper functions you are using. | ||
|
||
Aside: If you are using C++, make sure to modify the Makefile to use `clang++`, wrap the helper function declarations in `extern "C"`, and follow other tips from wasmception. | ||
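With `malloc` available, the buffer for the call data can be sized at run time instead of being a fixed global array. The following is a minimal sketch of that pattern, not the code from the wrc20 gist. The prototypes are assumptions modelled on the `getCallDataSize` and `callDataCopy` helpers, the `#ifndef __wasm__` section is a hypothetical host-side stand-in so the sketch runs natively, and `copy_call_data` is an illustrative name.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed prototypes for two Ewasm helper functions. */
uint32_t getCallDataSize(void);
void callDataCopy(uint8_t *result, uint32_t data_offset, uint32_t length);

#ifndef __wasm__
/* Host-side stand-ins (not part of Ewasm) so the sketch runs natively. */
static const uint8_t fake_call_data[] = { 0xa9, 0x05, 0x9c, 0xbb };

uint32_t getCallDataSize(void)
{
    return (uint32_t)sizeof fake_call_data;
}

void callDataCopy(uint8_t *result, uint32_t data_offset, uint32_t length)
{
    for (uint32_t i = 0; i < length; i++)
        result[i] = fake_call_data[data_offset + i];
}
#endif

/* Allocates exactly as many bytes as the call data needs and copies
 * the call data into them. Writes the length to *size_out. Returns
 * NULL if the call data is empty or the allocation fails; otherwise
 * the caller owns the returned buffer. */
uint8_t *copy_call_data(uint32_t *size_out)
{
    uint32_t size = getCallDataSize();
    *size_out = size;
    if (size == 0)
        return NULL;

    uint8_t *buf = (uint8_t *)malloc(size);
    if (buf != NULL)
        callDataCopy(buf, 0, size);
    return buf;
}
```

Whether to call `free` afterwards is a design choice: a contract's memory is discarded when execution ends, and every extra call costs gas.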
|
||
```sh | ||
git clone https://gist.github.com/poemm/91b64ecd2ca2f1cb4a88d31315313b9b.git cwrc20_with_malloc | ||
cd cwrc20_with_malloc | ||
# edit the Makefile and main.syms as described above | ||
make | ||
``` | ||
|
||
Now follow the same steps above to transform the output `main.wasm` into a valid Ewasm contract. | ||
|
||
Tutorials are needed for more advanced things. For example, to statically link against other C files, one can link the LLVM IR as described at https://aransentin.github.io/cwasm/. |
This file was deleted.
I understand why you'd add this caveat here, but I don't find it constructive or helpful on its own, i.e., it just makes the reader worry that the instructions aren't going to work. Consider making this more constructive by saying something like, "Every effort is made to keep this document up to date, but if you notice anything wrong please feel free to submit a PR or an issue to report and/or fix it."
Agreed. It is understood that things are subject to change. I completely removed it so that the guide is more concise.