Emscripten is a C/C++ -> LLVM -> JavaScript compiler. It's useful and interesting, but one of its biggest limitations is the sync/async model: JavaScript is a single-threaded, event-driven language, therefore you cannot call
`sleep()` in the hope of receiving and processing extra events.

Demo
First of all, please enjoy this demo, on which I really spent some time.
tl;dr
- Motivation: to make `sleep` and other similar functions actually work in our runtime, without lots of manual work porting the C/C++ code
- I actually made it work
- `sync->async` is doable and I think it's not too hard - there are already implementations for JS
- I think this is a feature that cannot be supported by merely an external library
- Does it fit emscripten? Or is it possible to write a plugin for emscripten?
Demo explained(?)
The demo is basically a stop-time animation:
//draw something
//wait for some while
//draw something else
//wait...
You can verify this in the source code.
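In C, the heart of such an animation is just a straight-line loop of draw-then-sleep calls. Here is a minimal, hypothetical sketch of the pattern (not the actual `xmas` source; `draw_frame` and the frame count are made up):

```
#include <curses.h>

/* Hypothetical sketch of the sync animation pattern. */
static void draw_frame(int i)
{
    mvprintw(0, 0, "frame %d", i);
    refresh();
}

int main(void)
{
    initscr();
    for (int i = 0; i < 100; ++i) {
        draw_frame(i);   /* draw something */
        napms(500);      /* wait for some while (blocks!) */
    }
    endwin();
    return 0;
}
```

This is exactly the kind of blocking `napms`/`sleep` call that has no direct equivalent in JavaScript's event loop.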
We all know that it is hard to convert such code into the async model of JavaScript. Now please take a look at my ported code; it's almost identical to the original, except for:
- A few macros are introduced, which are defined in `asyn2.h` — actually the macros are pretty much just placeholders.
- A js function `js_napms` is defined — which is a wrapper of `setTimeout`.
I fed the code to emscripten and blablabla — and the demo works. But wait! The code is almost identical to the original, which is synced!
Well, please let me explain a few more things before I reveal the secrets.
Another Demo
Here's another demo, which is... the same as above. So what's the deal?

We may imagine that, to really 'go to sleep', we need to store all the context and restore it when we come back again. Indeed, I did so in the source code: whenever you see an `ASYNC_` macro, it involves pushing and popping to maintain the async stack.
The actual functions behind those macros are defined in async.h.
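To give a flavour of what such a macro could expand to, here is a hand-wavy sketch; `ASYNC_CALL` and its exact shape are my guesses, built from the stack helpers that appear in the examples later (the real `async.h` may look different):

```
/* Hypothetical expansion of an ASYNC_ call macro: register the
   continuation, save the live locals, start the async call and
   bail out of the current function. */
#define ASYNC_CALL(ctx, call, cb, var)  \
    do {                                \
        push_callback((ctx), (cb));     \
        put_variable((ctx), (var));     \
        call;                           \
        return 0;                       \
    } while (0)

/* e.g. ASYNC_CALL(context, SDL_Delay(1000, context), work__cb1, j); */
```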
Well, I'm NOT going to propose a set of APIs or a library; instead I'm proposing a way of pre-processing the code, which I did manually here. It's doable and there are patterns: you may see how a for-loop is broken down by following the comments. I'll put the technical details at the end.
The porting experience may not be as smooth as it looks: `xmas` is actually rather straightforward, with few nested for-loops or branches. But if you take a look at other demos, it is a nightmare to define the callbacks and maintain the stack by hand; just imagine that there were no `call` and `ret` ASM instructions, and you had to do `push`, `pop` and `jump` manually.
My point is that the sync->async transformation can, and should, be done by the pre-processor/compiler.
The Secrets of the 1st Demo
You didn't skip the previous section, did you?

Actually I made the second demo first, before I knew of the secret weapon — streamlinejs, and here is an intuitive demo.
It's not a library but a parser/compiler. I didn't dig too deep into its mechanism, but judging from the results it generates, it should be similar to what I'll describe below. You may read this article for more details.
To build the first demo, all the placeholders are replaced with underscores, which are recognized by streamlinejs as callback placeholders. Fortunately the un-optimized JS generated by emscripten can be parsed without any problem — at least for my demo.
Technical stuff
Imagine that there is a stack dedicated to async function calls; it differs from the traditional stack in that it is not cleared when a function exits.

Async function calls are different from (normal) sync function calls: an async call pushes its context onto the async stack, including the callback (similar to the return address in the synced case), and returns. The central event dispatcher (the JS engine in our case) will call the callback eventually.
So the central idea is to identify all the async function calls, which usually arise for two reasons:
- Calling an async function
- `jump` over an async call
The first one should be easy: some functions are natively async, e.g. `SDL_Delay`. And if a function calls any other async function inside, it is async itself.
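In compiler terms, 'async-ness' propagates along the call graph: start from a seed set such as `SDL_Delay`, and keep marking callers until a fixed point is reached. A rough sketch of that marking pass, with made-up data structures, just to illustrate the idea:

```
#include <stdbool.h>

/* Hypothetical call-graph node for the marking pass. */
typedef struct {
    bool is_async;        /* seeded to true for SDL_Delay and friends */
    int  callees[16];     /* indices of called functions */
    int  n_callees;
} func_t;

/* A function is async if any of its callees is async;
   iterate until nothing changes any more. */
static void mark_async(func_t *funcs, int n)
{
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < n; ++i) {
            if (funcs[i].is_async)
                continue;
            for (int j = 0; j < funcs[i].n_callees; ++j) {
                if (funcs[funcs[i].callees[j]].is_async) {
                    funcs[i].is_async = true;
                    changed = true;
                    break;
                }
            }
        }
    }
}
```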
The second one usually originates from loops and branches, which will be explained later.
I think these can be identified by the compiler in one of the following stages:
- Pre-processing C/C++ — I did that manually myself
- LLVM bitcode — which I'm not so sure about
- JavaScript — streamline itself is an example
There are advantages and disadvantages at each stage: for example, it might be easier to optimize the code when parsing the C source, while it may be more lightweight to store the local variables using JavaScript closures.
Identify and transform async functions
Here's an example:
```
// sync version
void work()
{
    int j = 99;
    SDL_Delay(1000);
    printf("result %d\n", j);
}
```
Since `SDL_Delay` is natively async, we have to transform `work` into its async counterpart, as follows:
```
// async version
// context: stack for async calls
int work(context_t * context)
{
    int j = 99;
    push_callback(context, work__cb1); // set up callback
    put_variable(context, j);          // save local variables
    SDL_Delay(1000, context);          // async version of SDL_Delay
    return 0;                          // get out and wait for SDL_Delay
}

int work__cb1(context_t * context)
{
    int j;
    get_variable(context, j);
    pop(context);                      // prepare to return to the previously chained callback
    printf("result %d\n", j);
    context->callback(context);
    return 0;
}
```
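For reference, here is one possible shape for the helpers used above. This is only my guess for illustration; the real `async.h` in the demo may be organized differently:

```
/* A rough sketch of the async-stack helpers assumed by the snippets above. */
typedef struct context context_t;
typedef int (*async_cb_t)(context_t * context);

struct context {
    async_cb_t callback;  /* the continuation to "return" to (topmost frame) */
    /* ... storage for saved locals and the chain of outer frames ... */
};

void push_callback(context_t * context, async_cb_t cb); /* register a continuation (the moral return address) */
void pop(context_t * context);                          /* drop the current frame, exposing the caller's callback */
void put_variable(context_t * context, int value);      /* save a live local across the async call */
int  take_variable(context_t * context);                /* read a saved local back */

/* get_variable(ctx, v) above would then just be sugar: */
#define get_variable(ctx, v) ((v) = take_variable(ctx))
```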
For-loops make the situation more complicated and cause the second type of async calls:
```
int f()
{
    for(int i = 0; i < 10; ++i)
    {
        printf("hi ");
        SDL_Delay(10);
        printf("%d\n", i);
    }
}
```
`f()` can be flattened as:
```
int f()
{
    int i = 0;
start:
    if(i >= 10) goto end;
    printf("hi ");
    SDL_Delay(10);
    printf("%d\n", i);
    ++i;
    goto start;
end:
    ; // nothing
}
```
Now it is clear how we can split the function and turn these jumps into async calls:
```
int f(context_t * context)
{
    int i = 0;
    // save i to the stack
    // async call f_start(context);
    return 0;
}

int f_start(context_t * context)
{
    // restore i
    // pop stack
    if(i >= 10)
    {
        // async call f_end(context);
        return 0;
    }
    printf("hi ");
    // save i
    // push f_start2 onto the stack
    SDL_Delay(10, context);
    return 0;
}

int f_start2(context_t * context)
{
    // restore i
    // pop stack
    printf("%d\n", i);
    ++i;
    // push i
    // async call f_start(context) to continue the loop
    return 0;
}

int f_end(context_t * context)
{
    // pop stack
    // async call the callback of f()
    return 0;
}
```
Branches (if, switch, etc.) are similar, as long as we consider them as `goto`'s.
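For instance, an `if` whose taken arm contains an async call can be split in exactly the same way. A made-up sketch in the same comment style as above (`g` and `g__join` are not from the demo):

```
int g(int cond, context_t * context)
{
    if(cond)
    {
        // push g__join onto the stack, save any live locals
        SDL_Delay(10, context); // async call; bail out and wait
        return 0;
    }
    // the arm without an async call just jumps to the join point
    // async call g__join(context);
    return 0;
}

int g__join(context_t * context)
{
    // restore locals, pop stack
    printf("done\n");
    // async call the callback of g()
    return 0;
}
```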
Local variables and return values
Local variables may be stored and retrieved when we push/pop the async stack, and so may return values.
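For example, a sync fragment like `int r = compute(); printf("%d\n", r);` could be sketched as follows, using the same comment conventions as above (`compute` is a made-up async function that leaves its result in the context):

```
int caller(context_t * context)
{
    // push caller__cb1, save live locals
    compute(context); // async version of compute()
    return 0;
}

int caller__cb1(context_t * context)
{
    // restore locals, pop stack
    // int r = <result slot of the context>; // compute() stored its "return value" there
    printf("%d\n", r);
    // async call the callback of caller()
    return 0;
}
```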
Compiler/Preprocessor Integration: Step 1
It should be clear now that this feature is a kind of transformation, which cannot be supported by merely linking to an external library. Of course, the precondition is that the transformation should be (almost) transparent: it should not be necessary for developers to maintain the stack manually.

The first step, I'd imagine, is that async functions are explicitly marked through some mechanism. In my example, a placeholder is used.
Developers may still write programs in the sync fashion, for two reasons: the convenience of writing new programs, and the ease of porting existing ones.
The compiler should detect, split and set up async functions automatically; the async stack should be managed by the standard library, while some API might be exposed.
There are two ways of managing the local variables; let me call them the C style and the JavaScript style:
The C style: local variables of async functions are stored in a dedicated area of memory (the heap, or a special stack for async functions) instead of the normal stack. To avoid lots of memcpy's, the variables may be allocated there directly. Some push/pop operations may be optimized away if the caller/callee is known (e.g. loops/branches).
The JavaScript style: streamlinejs is a good example. Async functions are broken into a series of recursive functions, and local variables are stored in the closures.
The JavaScript style is easy and intuitive, but the hidden overhead might not be negligible. It may be too late to optimize once the LLVM bitcode has been transformed into JavaScript.
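To make the C style more concrete, here is a rough sketch: each async function gets a frame struct that lives directly in the dedicated async area, so its locals are read and written in place instead of being copied back and forth. The `async_push_frame`/`async_top_frame` helpers are hypothetical:

```
/* Hypothetical C-style storage: f()'s locals live in a frame allocated
   on the dedicated async stack, not on the normal C stack. */
typedef struct {
    int i; /* the loop counter from the f() example */
} f_frame_t;

void * async_push_frame(context_t * context, int size); /* reserve a frame (hypothetical) */
void * async_top_frame(context_t * context);            /* current frame (hypothetical) */

int f(context_t * context)
{
    f_frame_t * frame = async_push_frame(context, sizeof *frame);
    frame->i = 0; /* the local is born in the async area: no memcpy */
    // async call f_start(context);
    return 0;
}

int f_start(context_t * context)
{
    f_frame_t * frame = async_top_frame(context);
    if(frame->i >= 10) { /* async call f_end(context); */ return 0; }
    printf("hi ");
    // push f_start2, then fire the async delay
    SDL_Delay(10, context);
    return 0;
}
```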
Compiler/Preprocessor Integration: Step 2
It might be possible to further reduce the work of writing/porting, as even marking async functions and defining the placeholders for every async function declaration and every async function call is boring and error-prone.

My (naive & wild) imagination is that by defining a few essential async functions (such as `SDL_Delay`), the compiler would automatically recognize async functions and set up the hidden parameter. It's not perfect, especially when we need to link a number of libraries, but at least I think a C/C++ transformer would be possible and nice, perhaps based on LLVM?

Limitations
- It might not work for multi-threading. Indeed, I've only been thinking about single-threaded programs, especially terminal ones — but this should not affect the importance of the issue, I think.
- Lots of overhead might be introduced this way — but I guess the performance should not suffer much if well optimized.
- C++: constructors/copies/destructors of objects might be a problem, or maybe not, since they can be `flattened` into C style?
- C++: try-catch might not work, since the control flow is already diverted.
- There are a few limitations in streamlinejs, but I think many of them can be addressed if we do the processing at the C stage.