wasm linker: aggressive rewrite towards Data-Oriented Design #22220

Open
wants to merge 78 commits into
base: master
e435ed7
wasm linker: aggressive DODification
andrewrk Nov 5, 2024
0763942
macho linker: conform to explicit error sets
andrewrk Dec 3, 2024
678649b
remove "FIXME" from codebase
andrewrk Dec 3, 2024
ac5ed14
macho linker conforms to explicit error sets, again
andrewrk Dec 4, 2024
658795e
elf linker: conform to explicit error sets
andrewrk Dec 4, 2024
30b21a8
rework error handling in the backends
andrewrk Dec 4, 2024
c193ed2
compiler: add type safety for export indices
andrewrk Dec 5, 2024
efb2f5b
std.array_list: tiny refactor for pleasure
andrewrk Dec 6, 2024
1269ae8
rewrite wasm/Emit.zig
andrewrk Dec 6, 2024
e4c0974
wasm codegen: fix some compilation errors
andrewrk Dec 7, 2024
b95cbb5
wasm: implement errors_len as a MIR opcode with no linker involvement
andrewrk Dec 7, 2024
15b0cd6
wasm codegen: switch on bool instead of int
andrewrk Dec 7, 2024
fb5c31e
wasm codegen: rename func: CodeGen to cg: CodeGen
andrewrk Dec 7, 2024
2e0e8ba
wasm: move error_name lowering to Emit phase
andrewrk Dec 7, 2024
5d40daa
wasm: use call_intrinsic MIR instruction
andrewrk Dec 8, 2024
8e79f92
switch to ArrayListUnmanaged for machine code
andrewrk Dec 8, 2024
f19d316
wasm: fix many compilation errors
andrewrk Dec 9, 2024
2e5a23f
wasm linker: support export section as implicit symbols
andrewrk Dec 12, 2024
1cda24b
frontend: add const to more Zcu pointers
andrewrk Dec 12, 2024
98b0320
wasm linker: implement name, module name, and type for function imports
andrewrk Dec 12, 2024
bad995d
wasm linker: flush implemented up to the export section
andrewrk Dec 12, 2024
5985311
wasm linker: flush export section
andrewrk Dec 12, 2024
2aaf879
wasm linker: finish the flush function
andrewrk Dec 13, 2024
dee7a5e
fix compilation when enabling llvm
andrewrk Dec 13, 2024
8046bdf
cmake: remove deleted file
andrewrk Dec 13, 2024
08631ed
add dev env for wasm
andrewrk Dec 14, 2024
6861a4d
remove bad deinit
andrewrk Dec 15, 2024
b2a5a7d
wasm codegen: fix lowering of 32/64 float rt calls
andrewrk Dec 16, 2024
16b5749
wasm codegen: remove dependency on PerThread where possible
andrewrk Dec 16, 2024
4d5ae08
wasm linker fixes
andrewrk Dec 16, 2024
7dea29a
wasm linker: implement name subsection
andrewrk Dec 16, 2024
7428da2
fix replaceVecSectionHeader
andrewrk Dec 16, 2024
4f938b9
std.Thread: don't export wasi_thread_start in single-threaded mode
andrewrk Dec 16, 2024
f969e6f
wasm linker: implement type index method
andrewrk Dec 16, 2024
0f18e87
wasm linker: implement missing logic
andrewrk Dec 18, 2024
bea7883
complete wasm.Emit implementation
andrewrk Dec 18, 2024
8205d37
fix calculation of nav alignment
andrewrk Dec 18, 2024
0421f59
wasm codegen: fix wrong union field for locals
andrewrk Dec 18, 2024
8d3010d
add safety for calling functions that get virtual addrs
andrewrk Dec 18, 2024
958f654
wasm linker: add __zig_error_name_table data when needed
andrewrk Dec 18, 2024
f951598
wasm codegen: fix extra index not relative
andrewrk Dec 18, 2024
a8f0855
wasm linker: fix calling imported functions
andrewrk Dec 19, 2024
c4783b5
std.ArrayHashMap: allow passing empty values array
andrewrk Dec 19, 2024
691ac8d
wasm linker: fix data segments memory flow
andrewrk Dec 19, 2024
15670ad
wasm linker: handle extern functions in updateNav
andrewrk Dec 19, 2024
52b78ba
wasm linker: allow undefined imports when lib name is provided
andrewrk Dec 19, 2024
258a585
wasm codegen: fix call_indirect
andrewrk Dec 19, 2024
debaed6
wasm linker: fix eliding empty data segments
andrewrk Dec 19, 2024
54938c5
wasm linker: implement data fixups
andrewrk Dec 19, 2024
b80d52a
wasm linker: avoid recursion in lowerZcuData
andrewrk Dec 19, 2024
413de73
wasm linker: also call lowerZcuData in updateFunc
andrewrk Dec 20, 2024
b0e4d46
wasm linker: initialize the data segments table in flush
andrewrk Dec 20, 2024
fb4b739
wasm linker: zcu data fixups are already applied
andrewrk Dec 20, 2024
7993718
implement error table and error names data segments
andrewrk Dec 20, 2024
176a5bb
wasm linker: fix data section in flush
andrewrk Dec 21, 2024
eedb855
implement the prelink phase in the frontend
andrewrk Dec 21, 2024
f1aa7b2
wasm linker: implement stack pointer global
andrewrk Dec 21, 2024
0716d3a
std.io: remove the "temporary workaround" for stage2_aarch64
andrewrk Dec 21, 2024
eb314ba
wasm linker: implement indirect function calls
andrewrk Dec 21, 2024
f66073f
fix stack pointer initialized to wrong vaddr
andrewrk Dec 21, 2024
01901e7
use fixed writer in more places
andrewrk Dec 21, 2024
69e7052
wasm linker: fix missing function type entry for import
andrewrk Dec 22, 2024
714769e
wasm linker: fix active data segment offset value
andrewrk Dec 22, 2024
3066efe
Compilation: account for C objects and resources in prelink
andrewrk Dec 22, 2024
0d24808
wasm linker: fix relocation parsing
andrewrk Dec 23, 2024
bc5a1de
wasm linker: fix crashes when parsing compiler_rt
andrewrk Dec 24, 2024
049f14c
fix missing missing entry symbol error when no zcu
andrewrk Dec 24, 2024
5c47f15
resolve merge conflicts
andrewrk Dec 27, 2024
6defd59
wasm linker: fix global imports in objects
andrewrk Dec 28, 2024
8e53629
can't use source location until return from this function
andrewrk Dec 29, 2024
fa62e3c
wasm linker: fix table imports in objects
andrewrk Dec 29, 2024
cba7480
fix bad archive name calculation
andrewrk Dec 29, 2024
a411947
wasm linker: chase relocations for references
andrewrk Dec 30, 2024
19247d5
wasm linker: improve error messages by making source locations more lazy
andrewrk Dec 30, 2024
0499e9d
wasm object parsing: fix handling of weak functions and globals
andrewrk Dec 30, 2024
fb08459
type checking for synthetic functions
andrewrk Dec 30, 2024
bb084e2
implement function relocations
andrewrk Dec 31, 2024
4d9ff7b
wasm linker: implement __wasm_call_ctors
andrewrk Dec 31, 2024
3 changes: 1 addition & 2 deletions CMakeLists.txt
@@ -643,9 +643,8 @@ set(ZIG_STAGE2_SOURCES
   src/link/StringTable.zig
   src/link/Wasm.zig
   src/link/Wasm/Archive.zig
+  src/link/Wasm/Flush.zig
   src/link/Wasm/Object.zig
-  src/link/Wasm/Symbol.zig
-  src/link/Wasm/ZigObject.zig
   src/link/aarch64.zig
   src/link/riscv.zig
   src/link/table_section.zig
2 changes: 1 addition & 1 deletion lib/std/Build/Step/CheckObject.zig
@@ -2682,7 +2682,7 @@ const WasmDumper = struct {
             else => unreachable,
         }
         const end_opcode = try std.leb.readUleb128(u8, reader);
-        if (end_opcode != std.wasm.opcode(.end)) {
+        if (end_opcode != @intFromEnum(std.wasm.Opcode.end)) {
             return step.fail("expected 'end' opcode in init expression", .{});
         }
     }
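The helper `std.wasm.opcode` is removed by this change, so call sites convert the enum directly with `@intFromEnum`. A minimal sketch of that pattern, assuming a Zig toolchain in which `std.wasm.Opcode` is an `enum(u8)` as the lib/std/wasm.zig hunks in this diff indicate; the test name is illustrative only:

```zig
const std = @import("std");

test "Opcode.end converts to its byte value without a helper" {
    // 0x0B is the value the removed "opcodes" test in lib/std/wasm.zig expected for `end`.
    const end_byte: u8 = @intFromEnum(std.wasm.Opcode.end);
    try std.testing.expectEqual(@as(u8, 0x0B), end_byte);
}
```

Run with `zig test` on the file containing the block.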
6 changes: 6 additions & 0 deletions lib/std/Target.zig
@@ -1219,6 +1219,12 @@ pub const Cpu = struct {
                 } else true;
             }

+            pub fn count(set: Set) std.math.IntFittingRange(0, needed_bit_count) {
+                var sum: usize = 0;
+                for (set.ints) |x| sum += @popCount(x);
+                return @intCast(sum);
+            }
+
             pub fn isEnabled(set: Set, arch_feature_index: Index) bool {
                 const usize_index = arch_feature_index / @bitSizeOf(usize);
                 const bit_index: ShiftInt = @intCast(arch_feature_index % @bitSizeOf(usize));
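The added `count` sums the population count of every word in the set. The sketch below shows the same technique on a hypothetical stand-in type, `MiniSet`, which is not part of the standard library; it returns `usize` for simplicity, whereas the function in the diff narrows the result with `std.math.IntFittingRange` and `@intCast`.

```zig
const std = @import("std");

// Hypothetical stand-in for the feature set: a fixed array of machine words.
const MiniSet = struct {
    ints: [2]usize,

    // Same shape as the added `count`: sum the set bits of every word.
    fn count(set: MiniSet) usize {
        var sum: usize = 0;
        for (set.ints) |x| sum += @popCount(x);
        return sum;
    }
};

test "count sums set bits across all words" {
    // 0b1011 has three set bits, 0b1 has one.
    const set: MiniSet = .{ .ints = .{ 0b1011, 0b1 } };
    try std.testing.expectEqual(@as(usize, 4), set.count());
}
```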
13 changes: 8 additions & 5 deletions lib/std/Thread.zig
@@ -1018,12 +1018,15 @@ const WasiThreadImpl = struct {
         return .{ .thread = &instance.thread };
     }

-    /// Bootstrap procedure, called by the host environment after thread creation.
-    export fn wasi_thread_start(tid: i32, arg: *Instance) void {
-        if (builtin.single_threaded) {
-            // ensure function is not analyzed in single-threaded mode
-            return;
+    comptime {
+        if (!builtin.single_threaded) {
+            @export(wasi_thread_start, .{ .name = "wasi_thread_start" });
         }
+    }
+
+    /// Called by the host environment after thread creation.
+    fn wasi_thread_start(tid: i32, arg: *Instance) callconv(.c) void {
+        comptime assert(!builtin.single_threaded);
         __set_stack_pointer(arg.thread.memory.ptr + arg.stack_offset);
         __wasm_init_tls(arg.thread.memory.ptr + arg.tls_offset);
         @atomicStore(u32, &WasiThreadImpl.tls_thread_id, @intCast(tid), .seq_cst);
5 changes: 4 additions & 1 deletion lib/std/array_hash_map.zig
@@ -641,10 +641,13 @@ pub fn ArrayHashMapUnmanaged(
             return self;
         }

+        /// An empty `value_list` may be passed, in which case the values array becomes `undefined`.
         pub fn reinit(self: *Self, gpa: Allocator, key_list: []const K, value_list: []const V) Oom!void {
             try self.entries.resize(gpa, key_list.len);
             @memcpy(self.keys(), key_list);
-            if (@sizeOf(V) != 0) {
+            if (value_list.len == 0) {
+                @memset(self.values(), undefined);
+            } else {
                 assert(key_list.len == value_list.len);
                 @memcpy(self.values(), value_list);
             }
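With this change, `reinit` accepts an empty `value_list` and leaves the values undefined for the caller to fill in. A minimal usage sketch, assuming a standard library that already includes this change (older versions assert that both lists have equal length) and that provides the `.empty` initializer; the key and value types are arbitrary choices for illustration:

```zig
const std = @import("std");

test "reinit with keys only, values filled in afterwards" {
    const gpa = std.testing.allocator;
    var map: std.AutoArrayHashMapUnmanaged(u32, u64) = .empty;
    defer map.deinit(gpa);

    // Empty value list: values are undefined and must be written before being read.
    try map.reinit(gpa, &.{ 10, 20, 30 }, &.{});
    for (map.keys(), map.values()) |k, *v| v.* = @as(u64, k) * 2;

    try std.testing.expectEqual(@as(?u64, 40), map.get(20));
}
```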
6 changes: 2 additions & 4 deletions lib/std/array_list.zig
@@ -267,8 +267,7 @@ pub fn ArrayListAligned(comptime T: type, comptime alignment: ?u29) type {
         /// Never invalidates element pointers.
         /// Asserts that the list can hold one additional item.
         pub fn appendAssumeCapacity(self: *Self, item: T) void {
-            const new_item_ptr = self.addOneAssumeCapacity();
-            new_item_ptr.* = item;
+            self.addOneAssumeCapacity().* = item;
         }

         /// Remove the element at index `i`, shift elements after index
@@ -879,8 +878,7 @@ pub fn ArrayListAlignedUnmanaged(comptime T: type, comptime alignment: ?u29) typ
         /// Never invalidates element pointers.
         /// Asserts that the list can hold one additional item.
         pub fn appendAssumeCapacity(self: *Self, item: T) void {
-            const new_item_ptr = self.addOneAssumeCapacity();
-            new_item_ptr.* = item;
+            self.addOneAssumeCapacity().* = item;
         }

         /// Remove the element at index `i` from the list and return its value.
12 changes: 0 additions & 12 deletions lib/std/io.zig
@@ -16,10 +16,6 @@ const Allocator = std.mem.Allocator;

 fn getStdOutHandle() posix.fd_t {
     if (is_windows) {
-        if (builtin.zig_backend == .stage2_aarch64) {
-            // TODO: this is just a temporary workaround until we advance aarch64 backend further along.
-            return windows.GetStdHandle(windows.STD_OUTPUT_HANDLE) catch windows.INVALID_HANDLE_VALUE;
-        }
         return windows.peb().ProcessParameters.hStdOutput;
     }

@@ -36,10 +32,6 @@ pub fn getStdOut() File {

 fn getStdErrHandle() posix.fd_t {
     if (is_windows) {
-        if (builtin.zig_backend == .stage2_aarch64) {
-            // TODO: this is just a temporary workaround until we advance aarch64 backend further along.
-            return windows.GetStdHandle(windows.STD_ERROR_HANDLE) catch windows.INVALID_HANDLE_VALUE;
-        }
         return windows.peb().ProcessParameters.hStdError;
     }

@@ -56,10 +48,6 @@ pub fn getStdErr() File {

 fn getStdInHandle() posix.fd_t {
     if (is_windows) {
-        if (builtin.zig_backend == .stage2_aarch64) {
-            // TODO: this is just a temporary workaround until we advance aarch64 backend further along.
-            return windows.GetStdHandle(windows.STD_INPUT_HANDLE) catch windows.INVALID_HANDLE_VALUE;
-        }
         return windows.peb().ProcessParameters.hStdInput;
     }

184 changes: 5 additions & 179 deletions lib/std/wasm.zig
@@ -4,8 +4,6 @@
 const std = @import("std.zig");
 const testing = std.testing;

-// TODO: Add support for multi-byte ops (e.g. table operations)
-
 /// Wasm instruction opcodes
 ///
 /// All instructions are defined as per spec:
@@ -195,27 +193,6 @@ pub const Opcode = enum(u8) {
     _,
 };

-/// Returns the integer value of an `Opcode`. Used by the Zig compiler
-/// to write instructions to the wasm binary file
-pub fn opcode(op: Opcode) u8 {
-    return @intFromEnum(op);
-}
-
-test "opcodes" {
-    // Ensure our opcodes values remain intact as certain values are skipped due to them being reserved
-    const i32_const = opcode(.i32_const);
-    const end = opcode(.end);
-    const drop = opcode(.drop);
-    const local_get = opcode(.local_get);
-    const i64_extend32_s = opcode(.i64_extend32_s);
-
-    try testing.expectEqual(@as(u16, 0x41), i32_const);
-    try testing.expectEqual(@as(u16, 0x0B), end);
-    try testing.expectEqual(@as(u16, 0x1A), drop);
-    try testing.expectEqual(@as(u16, 0x20), local_get);
-    try testing.expectEqual(@as(u16, 0xC4), i64_extend32_s);
-}
-
 /// Opcodes that require a prefix `0xFC`.
 /// Each opcode represents a varuint32, meaning
 /// they are encoded as leb128 in binary.
@@ -241,12 +218,6 @@ pub const MiscOpcode = enum(u32) {
     _,
 };

-/// Returns the integer value of an `MiscOpcode`. Used by the Zig compiler
-/// to write instructions to the wasm binary file
-pub fn miscOpcode(op: MiscOpcode) u32 {
-    return @intFromEnum(op);
-}
-
 /// Simd opcodes that require a prefix `0xFD`.
 /// Each opcode represents a varuint32, meaning
 /// they are encoded as leb128 in binary.
@@ -512,12 +483,6 @@ pub const SimdOpcode = enum(u32) {
     f32x4_relaxed_dot_bf16x8_add_f32x4 = 0x114,
 };

-/// Returns the integer value of an `SimdOpcode`. Used by the Zig compiler
-/// to write instructions to the wasm binary file
-pub fn simdOpcode(op: SimdOpcode) u32 {
-    return @intFromEnum(op);
-}
-
 /// Atomic opcodes that require a prefix `0xFE`.
 /// Each opcode represents a varuint32, meaning
 /// they are encoded as leb128 in binary.
@@ -592,12 +557,6 @@ pub const AtomicsOpcode = enum(u32) {
     i64_atomic_rmw32_cmpxchg_u = 0x4E,
 };

-/// Returns the integer value of an `AtomicsOpcode`. Used by the Zig compiler
-/// to write instructions to the wasm binary file
-pub fn atomicsOpcode(op: AtomicsOpcode) u32 {
-    return @intFromEnum(op);
-}
-
 /// Enum representing all Wasm value types as per spec:
 /// https://webassembly.github.io/spec/core/binary/types.html
 pub const Valtype = enum(u8) {
@@ -608,53 +567,24 @@ pub const Valtype = enum(u8) {
     v128 = 0x7B,
 };

-/// Returns the integer value of a `Valtype`
-pub fn valtype(value: Valtype) u8 {
-    return @intFromEnum(value);
-}
-
 /// Reference types, where the funcref references to a function regardless of its type
 /// and ref references an object from the embedder.
 pub const RefType = enum(u8) {
     funcref = 0x70,
     externref = 0x6F,
 };

-/// Returns the integer value of a `Reftype`
-pub fn reftype(value: RefType) u8 {
-    return @intFromEnum(value);
-}
-
-test "valtypes" {
-    const _i32 = valtype(.i32);
-    const _i64 = valtype(.i64);
-    const _f32 = valtype(.f32);
-    const _f64 = valtype(.f64);
-
-    try testing.expectEqual(@as(u8, 0x7F), _i32);
-    try testing.expectEqual(@as(u8, 0x7E), _i64);
-    try testing.expectEqual(@as(u8, 0x7D), _f32);
-    try testing.expectEqual(@as(u8, 0x7C), _f64);
-}
-
 /// Limits classify the size range of resizeable storage associated with memory types and table types.
 pub const Limits = struct {
-    flags: u8,
+    flags: Flags,
     min: u32,
     max: u32,

-    pub const Flags = enum(u8) {
-        WASM_LIMITS_FLAG_HAS_MAX = 0x1,
-        WASM_LIMITS_FLAG_IS_SHARED = 0x2,
+    pub const Flags = packed struct(u8) {
+        has_max: bool,
+        is_shared: bool,
+        reserved: u6 = 0,
     };
-
-    pub fn hasFlag(limits: Limits, flag: Flags) bool {
-        return limits.flags & @intFromEnum(flag) != 0;
-    }
-
-    pub fn setFlag(limits: *Limits, flag: Flags) void {
-        limits.flags |= @intFromEnum(flag);
-    }
 };

 /// Initialization expressions are used to set the initial value on an object
@@ -667,18 +597,6 @@ pub const InitExpression = union(enum) {
     global_get: u32,
 };

-/// Represents a function entry, holding the index to its type
-pub const Func = struct {
-    type_index: u32,
-};
-
-/// Tables are used to hold pointers to opaque objects.
-/// This can either by any function, or an object from the host.
-pub const Table = struct {
-    limits: Limits,
-    reftype: RefType,
-};
-
 /// Describes the layout of the memory where `min` represents
 /// the minimal amount of pages, and the optional `max` represents
 /// the max pages. When `null` will allow the host to determine the
@@ -687,88 +605,6 @@ pub const Memory = struct {
     limits: Limits,
 };

-/// Represents the type of a `Global` or an imported global.
-pub const GlobalType = struct {
-    valtype: Valtype,
-    mutable: bool,
-};
-
-pub const Global = struct {
-    global_type: GlobalType,
-    init: InitExpression,
-};
-
-/// Notates an object to be exported from wasm
-/// to the host.
-pub const Export = struct {
-    name: []const u8,
-    kind: ExternalKind,
-    index: u32,
-};
-
-/// Element describes the layout of the table that can
-/// be found at `table_index`
-pub const Element = struct {
-    table_index: u32,
-    offset: InitExpression,
-    func_indexes: []const u32,
-};
-
-/// Imports are used to import objects from the host
-pub const Import = struct {
-    module_name: []const u8,
-    name: []const u8,
-    kind: Kind,
-
-    pub const Kind = union(ExternalKind) {
-        function: u32,
-        table: Table,
-        memory: Limits,
-        global: GlobalType,
-    };
-};
-
-/// `Type` represents a function signature type containing both
-/// a slice of parameters as well as a slice of return values.
-pub const Type = struct {
-    params: []const Valtype,
-    returns: []const Valtype,
-
-    pub fn format(self: Type, comptime fmt: []const u8, opt: std.fmt.FormatOptions, writer: anytype) !void {
-        if (fmt.len != 0) std.fmt.invalidFmtError(fmt, self);
-        _ = opt;
-        try writer.writeByte('(');
-        for (self.params, 0..) |param, i| {
-            try writer.print("{s}", .{@tagName(param)});
-            if (i + 1 != self.params.len) {
-                try writer.writeAll(", ");
-            }
-        }
-        try writer.writeAll(") -> ");
-        if (self.returns.len == 0) {
-            try writer.writeAll("nil");
-        } else {
-            for (self.returns, 0..) |return_ty, i| {
-                try writer.print("{s}", .{@tagName(return_ty)});
-                if (i + 1 != self.returns.len) {
-                    try writer.writeAll(", ");
-                }
-            }
-        }
-    }
-
-    pub fn eql(self: Type, other: Type) bool {
-        return std.mem.eql(Valtype, self.params, other.params) and
-            std.mem.eql(Valtype, self.returns, other.returns);
-    }
-
-    pub fn deinit(self: *Type, gpa: std.mem.Allocator) void {
-        gpa.free(self.params);
-        gpa.free(self.returns);
-        self.* = undefined;
-    }
-};
-
 /// Wasm module sections as per spec:
 /// https://webassembly.github.io/spec/core/binary/modules.html
 pub const Section = enum(u8) {
@@ -788,11 +624,6 @@ pub const Section = enum(u8) {
     _,
 };

-/// Returns the integer value of a given `Section`
-pub fn section(val: Section) u8 {
-    return @intFromEnum(val);
-}
-
 /// The kind of the type when importing or exporting to/from the host environment.
 /// https://webassembly.github.io/spec/core/syntax/modules.html
 pub const ExternalKind = enum(u8) {
@@ -802,11 +633,6 @@ pub const ExternalKind = enum(u8) {
     global,
 };

-/// Returns the integer value of a given `ExternalKind`
-pub fn externalKind(val: ExternalKind) u8 {
-    return @intFromEnum(val);
-}
-
 /// Defines the enum values for each subsection id for the "Names" custom section
 /// as described by:
 /// https://webassembly.github.io/spec/core/appendix/custom.html?highlight=name#name-section
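The `Limits.Flags` change in lib/std/wasm.zig replaces the bit-mask enum with a `packed struct(u8)`. The sketch below copies that struct into a standalone file to show that the packed layout keeps the old bit values (`0x1` for has-max, `0x2` for is-shared); the `encode` helper is hypothetical and exists only for this illustration.

```zig
const std = @import("std");

// Copy of the struct introduced in the diff, reproduced so the sketch is self-contained.
const Flags = packed struct(u8) {
    has_max: bool,
    is_shared: bool,
    reserved: u6 = 0,
};

// Hypothetical helper: reinterpret the flags as the byte written to the binary.
fn encode(flags: Flags) u8 {
    return @bitCast(flags);
}

test "packed flags keep the old mask values" {
    // The first field of a packed struct occupies the least significant bit.
    try std.testing.expectEqual(@as(u8, 0x1), encode(.{ .has_max = true, .is_shared = false }));
    try std.testing.expectEqual(@as(u8, 0x2), encode(.{ .has_max = false, .is_shared = true }));
    try std.testing.expectEqual(@as(u8, 0x3), encode(.{ .has_max = true, .is_shared = true }));
}
```

Field access such as `limits.flags.has_max` then takes the place of the removed `hasFlag` and `setFlag` helpers.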