The Basics of Procedures
What is a “procedure”?
In UNIT, a procedure refers to the UNIT_Procedure type, which
holds instructions that will execute at runtime. In other words, a procedure
is a function that we’ll be generating code for.
Hint
We use the term “procedure” instead of “function” to distinguish code that UNIT compiles from code that it doesn’t. A procedure is created by you and contains UNIT’s instructions, whereas a function could be something in the C standard library.
Creating the context
Before we can emit any instructions, we need to initialize a procedure.
This can be done through UNIT_Procedure_Init() or UNIT_Procedure_New().
But if you look at the signatures of those functions, they both take a UNIT_Context *.
So, before we can create a procedure, we have to create a new context.
Assuming everything has been installed correctly,
UNIT’s primary header file should exist at $INCLUDE_PATH/unit/unit.h. If we
#include that, then everything in UNIT’s public C API will become available
to our program.
So, using our prior knowledge about how we initialize structures
(Init and New functions), let’s create a new context:
#include <unit/unit.h>

int main(void)
{
    UNIT_Context context;
    if (UNIT_FAILED(UNIT_Context_Init(&context))) {
        return 1;
    }

    /* Release everything the context owns before exiting. */
    UNIT_Context_Clear(&context);
    return 0;
}
When to use Init instead of New?
The primary difference between these two functions is where the structure
is stored. New explicitly allocates on the heap, so only use
it when you actually need the heap; in other words, if you know that a structure
will outlive the current function, use New. Otherwise, use Init.
In this case, we use Init because we don’t need the context to outlive the
main function.
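To make the distinction concrete, here is a minimal sketch of the Init/New pattern in plain C. The Counter type and its functions are hypothetical stand-ins written for this illustration; they are not part of UNIT’s API, and UNIT’s own functions may differ in their details.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a library structure; not a UNIT type. */
typedef struct {
    int value;
} Counter;

/* Init style: the caller owns the storage (often a local variable). */
int Counter_Init(Counter *counter)
{
    counter->value = 0;
    return 0; /* 0 means success in this sketch */
}

/* New style: the function allocates on the heap, so the result can
 * outlive the caller's stack frame. Returns NULL on failure. */
Counter *Counter_New(void)
{
    Counter *counter = malloc(sizeof(*counter));
    if (counter == NULL) {
        return NULL;
    }
    if (Counter_Init(counter) != 0) {
        free(counter);
        return NULL;
    }
    return counter;
}

/* Releases a counter created by Counter_New. */
void Counter_Free(Counter *counter)
{
    free(counter);
}
```

A caller that only needs the structure for the duration of one function would declare a local Counter and pass its address to Counter_Init. A caller that hands the structure to code that runs later would use Counter_New and eventually Counter_Free.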
Before proceeding, let’s make sure this works. For this tutorial, we’ll be using GCC, but you can use whatever compiler you want, as long as it can find UNIT’s header files and link against the library.
So, let’s save the above as main.c, then compile and run it:
$ gcc main.c -o out -lunit
$ ./out
Creating a procedure
At this point, we have a context that we can use, so we can now create our first procedure!
Following UNIT’s naming convention, procedures can be created through
UNIT_Procedure_Init() and UNIT_Procedure_New(). Since we’re
running everything inside the main function, we’ll use Init with stack
memory:
#include <unit/unit.h>

int main(void)
{
    UNIT_Context context;
    if (UNIT_FAILED(UNIT_Context_Init(&context))) {
        return 1;
    }

    UNIT_Procedure procedure;
    if (UNIT_FAILED(UNIT_Procedure_Init(&procedure, &context, "main"))) {
        return 1;
    }

    UNIT_Procedure_Clear(&procedure);
    UNIT_Context_Clear(&context);
    return 0;
}
But wait, there’s a bug here! When UNIT_Procedure_Init fails, we return while the
context is still alive. Remember, we always need to call UNIT_Context_Clear
after a successful UNIT_Context_Init, otherwise our program will leak memory.
In addition, UNIT_Procedure_Init sets an error message. For our own sanity
as developers, let’s call UNIT_PrintError before returning so we have
some idea of what went wrong if something were to fail:
#include <stdio.h> /* stderr */
#include <unit/unit.h>

int main(void)
{
    UNIT_Context context;
    if (UNIT_FAILED(UNIT_Context_Init(&context))) {
        return 1;
    }

    UNIT_Procedure procedure;
    if (UNIT_FAILED(UNIT_Procedure_Init(&procedure, &context, "main"))) {
        /* Report the failure, then release the context we already own. */
        UNIT_PrintError(&context, stderr);
        UNIT_Context_Clear(&context);
        return 1;
    }

    UNIT_Procedure_Clear(&procedure);
    UNIT_Context_Clear(&context);
    return 0;
}
Emitting instructions
Now that we have all the boilerplate code out of the way, we can start emitting instructions!
UNIT uses a stack-based intermediate representation (IR): you write code for a stack machine, and UNIT translates it to machine code.
Caution
Be careful not to confuse UNIT’s operand stack with the call stack that the CPU maintains. The term “stack” in “stack-allocated variable” does not mean the same thing as “stack” in “operand stack”.
All instructions in UNIT have the two components common to stack-based instruction sets:
The operation ID, often shortened to “opcode”.
The operation argument, often shortened to “oparg”.
Let’s start with the operation ID and ignore the argument for now. In UNIT, all
instructions are available in an enum called UNIT_Instruction.
The values of this enum are prefixed with UNIT_OP_. But, how do we actually
add instructions to the procedure?
Instructions can be added using a few functions under the UNIT_Procedure
namespace, but for now, let’s focus on UNIT_Procedure_AddOperation(),
which takes an opcode and an integer oparg. If we wanted to make a program
that simply did return 0, we would need two instructions:
UNIT_OP_LOAD_INTEGER, which pushes a constant integer onto the operand stack.
UNIT_OP_RETURN_VALUE, which pops a value off the operand stack and returns it to the caller.
For LOAD_INTEGER, we need an oparg. This is the value that will be pushed
onto the stack. In our case, this will be 0. For RETURN_VALUE, we don’t
need an oparg, so we can pass any value we want. We’ll keep it
simple and pass 0.
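To see what these two instructions do to the operand stack, here is a minimal sketch of a toy stack machine. It interprets the instructions directly, whereas UNIT translates them to machine code, so treat it only as a model of the stack effects. Every name in it (SketchOpcode, SketchInstruction, sketch_run) is hypothetical and not part of UNIT.

```c
#include <stddef.h>

/* Hypothetical opcodes for this sketch; UNIT's real values live in
 * its UNIT_Instruction enum and may differ. */
typedef enum {
    SKETCH_OP_LOAD_INTEGER,
    SKETCH_OP_RETURN_VALUE
} SketchOpcode;

/* One instruction: an opcode paired with its integer oparg. */
typedef struct {
    SketchOpcode opcode;
    int oparg;
} SketchInstruction;

#define SKETCH_STACK_MAX 16

/* Interprets `count` instructions and returns the value produced by
 * RETURN_VALUE, or -1 if the program is malformed. */
int sketch_run(const SketchInstruction *code, size_t count)
{
    int stack[SKETCH_STACK_MAX]; /* the operand stack */
    size_t depth = 0;

    for (size_t i = 0; i < count; i++) {
        switch (code[i].opcode) {
        case SKETCH_OP_LOAD_INTEGER:
            if (depth == SKETCH_STACK_MAX) {
                return -1; /* operand stack overflow */
            }
            stack[depth++] = code[i].oparg; /* push the constant */
            break;
        case SKETCH_OP_RETURN_VALUE:
            if (depth == 0) {
                return -1; /* nothing on the stack to return */
            }
            return stack[--depth]; /* pop; the oparg is ignored */
        }
    }
    return -1; /* fell off the end without returning */
}
```

Running the two-instruction program { LOAD_INTEGER 0, RETURN_VALUE 0 } through sketch_run pushes 0 and then pops and returns it. The oparg given to RETURN_VALUE has no effect, which is why any value can be passed there. Using -1 as the error result is a simplification: a real design would report errors separately from the returned value.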
Now, if we apply this to our code:
#include <stdio.h> /* stderr */
#include <unit/unit.h>

int main(void)
{
    UNIT_Context context;
    if (UNIT_FAILED(UNIT_Context_Init(&context))) {
        return 1;
    }

    UNIT_Procedure procedure;
    if (UNIT_FAILED(UNIT_Procedure_Init(&procedure, &context, "main"))) {
        UNIT_PrintError(&context, stderr);
        UNIT_Context_Clear(&context);
        return 1;
    }

    /* Push the constant 0 onto the operand stack. */
    if (UNIT_FAILED(UNIT_Procedure_AddOperation(&procedure, UNIT_OP_LOAD_INTEGER, 0))) {
        UNIT_PrintError(&context, stderr);
        UNIT_Procedure_Clear(&procedure);
        UNIT_Context_Clear(&context);
        return 1;
    }

    /* Pop that value and return it; the oparg is unused here. */
    if (UNIT_FAILED(UNIT_Procedure_AddOperation(&procedure, UNIT_OP_RETURN_VALUE, 0))) {
        UNIT_PrintError(&context, stderr);
        UNIT_Procedure_Clear(&procedure);
        UNIT_Context_Clear(&context);
        return 1;
    }

    UNIT_Procedure_Clear(&procedure);
    UNIT_Context_Clear(&context);
    return 0;
}
But, this is really ugly. The error handling gets out of control very quickly, and this will only get worse as we add more instructions. We can clean this up by adding some macros tailored to our function, like so:
#include <stdio.h> /* stderr */
#include <unit/unit.h>

int main(void)
{
    UNIT_Context context;
    if (UNIT_FAILED(UNIT_Context_Init(&context))) {
        return 1;
    }

    UNIT_Procedure procedure;
    if (UNIT_FAILED(UNIT_Procedure_Init(&procedure, &context, "main"))) {
        UNIT_PrintError(&context, stderr);
        UNIT_Context_Clear(&context);
        return 1;
    }

    /* Add one instruction; on failure, report, clean up, and bail out. */
    #define ADDOP_INT(op, value) \
        if (UNIT_FAILED(UNIT_Procedure_AddOperation(&procedure, op, value))) { \
            UNIT_PrintError(&context, stderr); \
            UNIT_Procedure_Clear(&procedure); \
            UNIT_Context_Clear(&context); \
            return 1; \
        }

    /* Shorthand for instructions that ignore their oparg. */
    #define ADDOP(op) ADDOP_INT(op, 0)

    ADDOP_INT(UNIT_OP_LOAD_INTEGER, 0);
    ADDOP(UNIT_OP_RETURN_VALUE);

    #undef ADDOP_INT
    #undef ADDOP

    UNIT_Procedure_Clear(&procedure);
    UNIT_Context_Clear(&context);
    return 0;
}
Warning
Control flow in macros is a common source of bugs. Handle this with care.
The macro gymnastics aren’t super pretty, but this will help us a lot as we add more instructions.
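One well-known hazard with statement-like macros is that a macro expanding to a bare if statement stops behaving like a single statement once it is used inside an unbraced if/else. A common C idiom is to wrap the body in do { ... } while (0). The sketch below shows that idiom with hypothetical names (sketch_step, TRY_STEP, sketch_build) so that it compiles without UNIT; it is an illustration of the idiom, not UNIT’s recommended style.

```c
#include <stdio.h>

/* Hypothetical fallible step: fails for negative input. */
int sketch_step(int value)
{
    return value < 0 ? -1 : 0;
}

/* do { ... } while (0) makes the macro expand to a single statement,
 * so `TRY_STEP(x);` behaves correctly inside an unbraced if/else. */
#define TRY_STEP(value)                          \
    do {                                         \
        if (sketch_step(value) != 0) {           \
            fprintf(stderr, "step failed\n");    \
            return 1;                            \
        }                                        \
    } while (0)

/* Returns 0 when every requested step succeeds, 1 otherwise. */
int sketch_build(int first, int second, int use_second)
{
    TRY_STEP(first);
    if (use_second)
        TRY_STEP(second);
    else
        fprintf(stderr, "skipping second step\n");
    return 0;
}
```

If TRY_STEP expanded to a bare if block instead, the semicolon after TRY_STEP(second) would end the outer if statement and the following else would fail to compile. The hidden return remains, so the macro should still only be used where returning early is safe.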
Tip
Another way to consolidate error handling code is to add a label above
the error handling code, and then jump to it with goto upon failure.
For example:
#include <unit/unit.h>

int main(void)
{
    /* ... */
    if (UNIT_FAILED(UNIT_Procedure_AddOperation(&procedure, UNIT_OP_LOAD_INTEGER, 42))) {
        goto error;
    }
    /* ... */
    UNIT_Procedure_Clear(&procedure);
    UNIT_Context_Clear(&context);
    return 0;

error:
    /* Shared cleanup path for every failed operation. */
    UNIT_Procedure_Clear(&procedure);
    UNIT_Context_Clear(&context);
    return 1;
}
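A single error label that clears everything is only safe once every resource has been initialized. When failures can happen between initialization steps, a common variation is to use one label per resource and jump to the label that matches what exists so far. The sketch below shows this with two heap buffers standing in for the context and the procedure; sketch_setup and its fail_at parameter are hypothetical and exist only so that each path can be exercised.

```c
#include <stdlib.h>

/* Hypothetical two-stage setup. `fail_at` forces a failure at stage 1,
 * 2, or 3 so each path can be exercised; 0 means no failure. The
 * number of buffers still allocated on exit is written to *live. */
int sketch_setup(int fail_at, int *live)
{
    int status = 1;
    char *first = NULL;
    char *second = NULL;

    *live = 0;

    first = (fail_at == 1) ? NULL : malloc(16);
    if (first == NULL) {
        goto done; /* nothing to release yet */
    }
    (*live)++;

    second = (fail_at == 2) ? NULL : malloc(16);
    if (second == NULL) {
        goto free_first; /* only the first buffer exists */
    }
    (*live)++;

    if (fail_at != 3) {
        status = 0; /* the "work" succeeded */
    }

    /* Success and late failure share the same unwinding path. */
    free(second);
    (*live)--;
free_first:
    free(first);
    (*live)--;
done:
    return status;
}
```

The labels appear in the reverse order of acquisition, so jumping to any of them releases exactly the resources acquired before the failure. Whatever value is passed for fail_at, *live ends at 0, meaning nothing is leaked.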
Okay, let’s now try to compile and run our program.
$ gcc main.c -o out -lunit
$ ./out
$ echo $?
0
Return code 0 – that makes sense. Let’s now try to return 1, just to confirm our code is running:
int main(void) {
    /* ... */

    #define ADDOP_INT(op, value) \
        if (UNIT_FAILED(UNIT_Procedure_AddOperation(&procedure, op, value))) { \
            UNIT_PrintError(&context, stderr); \
            UNIT_Procedure_Clear(&procedure); \
            UNIT_Context_Clear(&context); \
            return 1; \
        }

    #define ADDOP(op) ADDOP_INT(op, 0)

    /* The procedure should now return 1 instead of 0. */
    ADDOP_INT(UNIT_OP_LOAD_INTEGER, 1);
    ADDOP(UNIT_OP_RETURN_VALUE);

    #undef ADDOP_INT
    #undef ADDOP

    /* ... */
}
$ gcc main.c -o out -lunit
$ ./out
$ echo $?
0
Huh? 0 again? What’s going on?
In the above code, all we did was create the procedure; we never actually compiled or executed it. The exit status we saw came from the return 0 at the end of our own main function. Forgetting to actually run the generated code is a common mistake when developing a code generator.
So, how do we actually execute our instructions? We’ll talk about that next.