Inacio Compiler Report #1
Recently I started coding a JavaScript compiler in Erlang. My idea is to run JavaScript in the BEAM ecosystem and get all of its benefits. The main goal is a multi-core, highly concurrent engine that follows the ECMAScript specification. This is my first report, where I will document everything I am testing and explain my reasons for doing it.
Architecture overview
The main components of the architecture are:
- A lexer written in Rust, which is a simple translator from JavaScript to Erlang Forms;
- An execution engine, which will read the code generated by the lexer and execute it.
Now let's look at the lexer and how it is being implemented.
Lexer
The lexer simply translates the JavaScript code to Erlang Forms. We are using Erlang Forms because they are easy to compile to an Erlang binary in a few lines (see more). To translate the code, I created a walker that traverses the JavaScript AST and converts it to Erlang Forms. Here is a key example of the implementation:
// start.rs
// Omitted parts of the code for brevity
let allocator = Allocator::default();
let ret = Parser::new(&allocator, &content, source_type)
    .with_options(ParseOptions {
        parse_regular_expression: true,
        ..ParseOptions::default()
    })
    .parse();
let result = walker::walk(&ret.program.body);
//           ^^^^^^
//           Translate the JS code to Erlang code

// walker.rs
// Omitted parts of the code for brevity
fn walk_static_member_expression(expr: &StaticMemberExpression) -> ErlangTerm {
    ErlangTerm::static_member_expression(
        // Only plain identifiers (e.g. `erlang` in `erlang.display`) are handled for now
        ErlangTerm::atom(match &expr.object {
            Expression::Identifier(v) => v.name.to_string(),
            _ => panic!("[walk_static_member_expression] Not implemented"),
        }),
        ErlangTerm::atom(expr.property.name.to_string()),
    )
}

// erlang_term_fmt.rs
ErlangTerm::StaticMemberExpression { object, property } => {
    // `{{` and `}}` are escaped braces; the output is `{remote, 1, Object, Property}`
    write!(f, "{{remote, 1, {}, {}}}", object, property)
}
The result is a formatted Erlang Form, which is written to a temporary file. Then the engine, written in Erlang, reads this temporary file and executes it. In the future I would like to implement message passing between the lexer and the engine so that the temporary file is no longer needed, but that is not the focus now.
To understand how the translation works, consider the JavaScript example below.
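The walker and formatter pairing described above can be sketched in a reduced form. This is only an illustration under assumptions: the `Form` enum and `lower_static_call` function are hypothetical names, not the project's real `ErlangTerm` API, every line number is fixed at 1 as in the generated output, and atoms are printed without quoting.

```rust
use std::fmt;

// Hypothetical, reduced stand-in for the project's ErlangTerm type.
#[derive(Debug, Clone, PartialEq)]
enum Form {
    Atom(String),
    Var(String),
    Remote { module: Box<Form>, function: Box<Form> },
    Call { target: Box<Form>, args: Vec<Form> },
}

impl fmt::Display for Form {
    // Renders each node in Erlang abstract-format syntax.
    // Assumption: atom names never need single quotes here.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Form::Atom(name) => write!(f, "{{atom,1,{}}}", name),
            Form::Var(name) => write!(f, "{{var,1,{}}}", name),
            Form::Remote { module, function } => {
                write!(f, "{{remote,1,{},{}}}", module, function)
            }
            Form::Call { target, args } => {
                let rendered: Vec<String> = args.iter().map(|a| a.to_string()).collect();
                write!(f, "{{call,1,{},[{}]}}", target, rendered.join(","))
            }
        }
    }
}

// Lowers a JavaScript call such as `erlang.display(vvv)` into a remote call form.
// Arguments are assumed to be plain variable references.
fn lower_static_call(object: &str, property: &str, args: &[&str]) -> Form {
    Form::Call {
        target: Box::new(Form::Remote {
            module: Box::new(Form::Atom(object.to_string())),
            function: Box::new(Form::Atom(property.to_string())),
        }),
        args: args.iter().map(|a| Form::Var(a.to_string())).collect(),
    }
}
```

Calling `lower_static_call("erlang", "display", &["vvv"]).to_string()` yields a string in the same `{call,1,{remote,...},[...]}` shape that the lexer writes to the temporary file.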
export function main() {
  const vvv = {
    age: 234
  }
  erlang.display(vvv)
}
When the lexer finishes, the code below is generated in the temporary file:
[{attribute,1,module,recursion},
 {attribute,1,export,[{main,0}]},
 {function,1,main,0,
  [{clause,1,[],[],
    [{match,1,
      {var,1,vvv},
      {call,1,
       {remote,1,{atom,1,'inacio:object'},{atom,1,new}},
       %% Creating an object internally
       [{cons,1,{tuple,1,...},{nil,...}}]}},
     {call,1,
      {remote,1,{atom,1,erlang},{atom,1,display}},
      [{var,1,vvv}]}]}]}]
Notice that when creating an object we internally call 'inacio:object':new(...). This is an important mechanism in the compiler: the generated code accesses the engine written in Erlang directly, so we call into the engine for internal operations every time.
With the basic parts working, we can now focus on what really matters.
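To make the object-creation call more concrete, here is a minimal sketch of how an object literal could be lowered into a call to 'inacio:object':new. The generated output above elides the tuple contents, so the `{Key, Value}` property layout used here is purely an assumption, and the helper names (`cons_list`, `property_tuple`, `lower_object_literal`) are hypothetical. Only integer property values are handled.

```rust
// Hypothetical helper: builds an Erlang list form (cons cells ending in nil)
// from elements that are already rendered as form strings.
fn cons_list(items: &[String]) -> String {
    items
        .iter()
        .rev()
        .fold("{nil,1}".to_string(), |tail, head| {
            format!("{{cons,1,{},{}}}", head, tail)
        })
}

// ASSUMPTION: each property becomes a {Key, Value} tuple form.
// The real layout is not shown in the original output.
fn property_tuple(key: &str, value: i64) -> String {
    format!("{{tuple,1,[{{atom,1,{}}},{{integer,1,{}}}]}}", key, value)
}

// Lowers an object literal such as `{ age: 234 }` into a call to
// 'inacio:object':new with a single list argument.
fn lower_object_literal(props: &[(&str, i64)]) -> String {
    let tuples: Vec<String> = props.iter().map(|(k, v)| property_tuple(k, *v)).collect();
    format!(
        "{{call,1,{{remote,1,{{atom,1,'inacio:object'}},{{atom,1,new}}}},[{}]}}",
        cons_list(&tuples)
    )
}
```

The design point is that the generated code never builds objects itself: it only emits a call, and the engine module decides how an object is represented at runtime.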
Next steps
We are following the ECMAScript specification, so for the next steps we need to think first of all about:
- global objects;
- execution contexts.
My first idea is to create a genesis, which will create the first execution context. The global object will be added to the first execution context.
- The global object will store the globalThis, which will be inherited by the next execution contexts.
You can see more about execution contexts here: https://tc39.es/ecma262/#sec-execution-contexts.
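The genesis idea can be modelled roughly as follows. This is a sketch under assumptions, written in Rust only to match the lexer code (the engine itself is in Erlang): `ExecutionContext`, `ContextStack` and their methods are hypothetical names, the global object is simplified to a string map, and the specification's realm and environment records are left out.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical, simplified global object: a shared, mutable property map.
type GlobalObject = Rc<RefCell<HashMap<String, String>>>;

struct ExecutionContext {
    name: String,
    // Every context holds a handle to the same global object (globalThis).
    global_this: GlobalObject,
}

struct ContextStack {
    contexts: Vec<ExecutionContext>,
}

impl ContextStack {
    // "genesis": creates the global object and the first execution context.
    fn genesis() -> Self {
        let global_this: GlobalObject = Rc::new(RefCell::new(HashMap::new()));
        ContextStack {
            contexts: vec![ExecutionContext {
                name: "global".to_string(),
                global_this,
            }],
        }
    }

    // Pushes a new context that inherits globalThis from the first one.
    fn push(&mut self, name: &str) {
        let global_this = Rc::clone(&self.contexts[0].global_this);
        self.contexts.push(ExecutionContext {
            name: name.to_string(),
            global_this,
        });
    }

    // The running execution context is the top of the stack.
    fn running(&self) -> &ExecutionContext {
        self.contexts
            .last()
            .expect("genesis always creates the first context")
    }
}
```

Because every context shares one handle, a property written through the running context is visible from the first context as well, which is the inheritance behaviour described above.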