NOTE: Paw is under active development and is not ready for production use. See the roadmap to get an idea of where things are going. Also see known issues for a list of known problems that will eventually be fixed.
An expressive scripting language
Paw is a high-level scripting language and runtime designed for embedding. Paw features static strong typing under a sound type system, generics, and bidirectional type checking. Paw is an interpreted language that runs on a stack-based virtual machine written in C.
Paw supports both line- and block-style comments. Nesting is not allowed in block-style comments.
// line comment
/* block
comment */
In Paw, toplevel declarations are treated as global items, meaning they can be accessed from anywhere within the module.
Such items can be marked pub
to indicate public visibility, which allows access from the outside (C or other Paw modules).
Otherwise, items are considered private to the containing module.
Items are resolved at compile-time, meaning only named functions, abstract data type (ADT) definitions, impl
blocks, and compile-time constants may appear at the toplevel.
Modules that are intended to run as scripts under the bundled Paw interpreter paw
should define an entrypoint function called main
with signature pub fn main(argv: [str]) -> int
.
paw
(the builtin Paw interpreter) will look for main
in the module's exported symbols and call it with arguments forwarded from the commandline.
When main
is finished, its return value is passed back to the process that invoked paw
.
Paw is statically-typed, meaning all types must be known at compile-time. Also note that Paw is strongly typed, meaning implicit conversions are not allowed.
The following example demonstrates creation of the basic value types. Composite types are discussed in tuple, structure, etc., below.
// initializer is validated against the type annotation
let b: bool = true;
let i: int = 123;
let f: float = 10.0e-1;
let s: str = 'abc';
let f: fn() -> int = || 123;
// type is inferred from the initializer
let b = false;
let i = 40 + 2;
let f = 1.0 * 2;
let s = 'de' + 'f';
let F = |x| x * 2;
// explicit type conversion operator
let b = 1 as bool;
let i = 2 + 3.4 as int;
The previous example showed examples of a simple kind of type inference, where the type of a variable is inferred from an initializer expression (the expression to the right of the =
).
Paw supports a more general kind of type inference for containers and closures, where the type of each 'unknown' must be inferred before the declaration in question goes out of scope.
The main caveat here is that Paw does not yet have support for 'generic bounds', so we can't handle, for example, a closure like |a, b| a + b
where the operator +
imposes a bound on both a
and b
, restricting the types they can be instantiated with.
For example:
let f = |a| a; // fn(?0) -> ?0
f(123); // infer ?0 = int
f(42);
let v = []; // [?1]
v.push('a'); // infer ?1 = str
let s: str = v[0];
Any variable referenced in the runtime must first be declared (all variables are locals, see Modules above). Otherwise, a "name error" is raised (see the section on error handling below). Local variables can be shadowed and 'rebound', and a global item may be shadowed by a local. Locals can also be captured in the body of a closure (see closures).
// initializer (' = 0') is required
let x: int = 0;
// rebind 'x' to a float (type is inferred from initializer)
let x = 6.63e-34;
Paw uses lexical scoping: variables declared in a given block can only be referenced from within that block, or one of its subblocks.
A block begins when a {
token is encountered, and ends when the matching }
is found.
Many language constructs use blocks to create their own scope, like functions, structures, for loops, etc.
Explicit scoping blocks are also supported.
{
let x = 42;
} // 'x' goes out of scope here
Functions are first-class in Paw, which means they are treated like any other Paw value. They can be stored in variables, or passed as parameters to higher-order functions. Note that named functions can only be defined at the toplevel in Paw. Closures, however, may be nested arbitrarily.
fn fib(n: int) -> int {
if n < 2 {
return n;
}
return fib(n - 2) + fib(n - 1);
}
fn make_fib(n: int) -> fn() -> int {
// captures 'n'
return || fib(n);
}
struct Object {
a: int,
b: str,
}
// all fields must be initialized
let o = Object{b: 'two', a: 1};
// unit structs are written without curly braces
struct Unit;
let u = Unit;
enum Choices {
First,
Second(int),
}
// unit variants are written without parenthesis
let c = Choices::First;
let c = Choices::Second(123);
Paw supports many common types of control flow.
// 'if-else' statement:
if i == 0 {
} else if i == 1 {
} else {
}
// Null chaining operator: return immediately (with None/Err) if the operand is None/Err
// (must appear in a function that returns Option<T>/Result<T, E>), otherwise, unwraps
// the Option/Result
fn maybe() -> Option<int> {
let i = maybe_none()?;
return fallible(i);
}
// 'break'/'continue' (must appear in a loop):
break;
continue;
// Numeric 'for' loop:
for i = 0, 10, 2 { // start, end[, step]
}
// 'while' loop:
let i = 0;
while i < 10 {
i = i + 1;
}
// 'do...while' loop:
let i = 10;
do {
i = i - 1;
} while i > 0;
let s = 'Hello, world!';
assert(s.starts_with('Hello'));
assert(s.ends_with('world!'));
assert(s[:5].ends_with('Hello'));
assert(s[-6:].starts_with('world!'));
assert(1 == s.find('ello'));
assert(-1 == s.find('Cello'));
let a = s.split(',');
assert(s == ','.join(a));
Paw supports basic parametric polymorphism. Variables with generic types must be treated generically, that is, they can only be assigned to other variables of the same type, passed to functions expecting a generic parameter, or stored in a container. This allows each template to be type checked a single time, rather than once for each unique instantiation, and makes it easier to generate meaningful error messages.
fn map<A, B>(f: fn(A) -> B, list: [A]) -> [B] {
let result = [];
for a in list {
result.push(f(a));
}
return result;
}
// infer A = float, B = int
let list = map(|f: float| f as int, [0.5, 1.5, 2.5]);
assert(list == [0, 1, 2]);
// struct template
struct Object<S, T> {
a: S,
b: T,
}
impl<A, B> Object<A, B> {
// methods on all instances of Object...
}
impl<T> Object<T> {
// methods for when '.a' and '.b' have the same type...
// For example:
pub fn swap() {
let t = self.a;
self.a = self.b;
self.b = t;
}
}
impl Object<int, str> {
// methods for this specific type of Object...
// For example:
pub fn equals(a: int, b: str) -> bool {
return a == self.a && b == self.b;
}
}
// explicit instantiation uses 'turbofish'
let o = Object::<float, float>{
a: 0.99,
b: 1.23,
};
o.swap();
// type inference is supported
let o = Object{
a: 123, // infer S = int
b: 'abc', // infer T = str
};
// field and method access using '.'
let a = o.a + 1;
let b = o.b + 'two';
let c = o.equals(a, b);
// o.swap not available: S != T
let unit = ();
let singleton = (42,);
let pair = (true, 'abc');
let triplet = (1.0, 'two', 3);
let a = singleton.0;
let b = pair.1;
let c = triplet.2;
let empty: [int] = [];
// infer T = str
let empty = [];
empty.push('a');
let list = [
[[1, 2, 3], [0]],
[[4, 5, 6], [1]],
[[7, 8, 9], [2]],
]
let list = [1, 2, 3];
assert(list[:1] == [1]);
assert(list[1:-1] == [2]);
assert(list[-1:] == [3]);
let empty: [int: str] = [:];
// infer K = int, V = str
let empty = [:];
empty[0] = 'abc';
let map = [1: 'a', 2: 'b'];
map[3] = 42;
map.erase(1);
assert(m == [2: 'b']);
// prints 'default'
print(m.get_or(1, 'default'));
fn divide_by_0(n: int) {
n = n / 0;
}
let status = try(divide_by_0, 42);
assert(status != 0);
Precedence | Operator | Description | Associativity |
---|---|---|---|
14 | () [] . ? |
Call, Subscript, Member access, Question mark | Left |
13 | ! - ~ # |
Not, Negate, Bitwise not, length | Right |
12 | as |
Cast | Left |
11 | * / % |
Multiply, Divide, Modulus | Left |
10 | + - |
Add, Subtract | Left |
9 | << >> |
Shift left, Shift right | Left |
8 | & |
Bitwise and | Left |
7 | ^ |
Bitwise xor | Left |
6 | | |
Bitwise or | Left |
5 | < <= > >= |
Relational comparisons | Left |
4 | == != |
Equality comparisons | Left |
3 | && |
And | Left |
2 | || |
Or | Left |
1 | = |
Assignment | Right |
- static, strong typing
- special-cased builtin containers (
[T]
and[K: V]
) - type inference for polymorphic
fn
andstruct
- sum types/discriminated unions (
enum
) - product types (tuple)
- custom garbage collector (using Boehm GC for now)
- methods using
impl
blocks - error handling (
try
needs to be an operator, or we need something like a 'parameter pack' for generics) - module system and
import
keyword - pattern matching (
switch
construct) - pattern matching (
if let
,let
bindings) - type inference for polymorphic
enum
- generic constraints/bounds
- compiler optimization passes
- refactor user-provided allocation interface to allow heap expansion
- The C API has pretty much 0 type safety
- Compiler will allow functions that don't return a value in all code paths
- Likely requires a CFG and some data flow analysis: it would be very difficult to get right otherwise
- Selector on nested tuple breaks the lexer (looks like a float:
t.0.1
)- Could resolve with an extra lexing pass, or use index expression (
t[x][y]
, wherex
andy
are compile-time constant expressions) - Similarly, we can't write things like
1.to_string()
: we require parenthesis around the1
- Could resolve with an extra lexing pass, or use index expression (
- Need to prevent situations where 2 methods with the same name are accessible from the same type
- Complicated by the fact that impl blocks can either target a polymorphic ADT, or an instantiation thereof
- In the first case, methods are available to all instantiations of the ADT, while in the second case they are available only to that particular instance.
- The second case allows us to specialize the body of a method for each specific type of instance.
- Could rework the code that checks if a method can be called on a given type: just check to see if a method exists already before registering/resolving it
- Probably don't want to resolve/instantiate things while searching, which is what the code currently does
- May create multiple HIR decl. nodes for builtin container instantiations
- Builtin containers (
[T]
and[K: V]
) are ADTs that support type inference on their generic parameters after creation (normally, type params are inferred from the initializer) - Need to deduplicate if we end up inferring a type that already exists
- Run dedup code when declarations go out of scope
- Might as well support inferring all ADT type parameters after creation, since we have to do it for builtin containers anyway
- Builtin containers (