Assert Programming Language
by Denis Dedkov
This version of the text assumes you’re using the stable build of the Assert language.
Introduction
"At least someone will teach them how to write the words assert and dump in their code..."
Who Assert Is For
Assert is ideal for many people for a variety of reasons. The community consisting of me is very hospitable and happy to answer any questions. Hundreds of companies, large and small, don't use Assert in production.
Source Code
The source files from which this book is generated can be found on GitHub.
Getting Started
In this chapter, we’ll discuss:
- Installing Assert compilers on Linux
- Writing the Hello, world! example
- Linking the compiled binary with ld
Installation
The following steps install the latest stable version of the Assert compiler. I can't ensure that all the examples in the book that compile will continue to compile with newer compiler versions. The output might differ strongly between versions, because I can change everything at any moment.
Command Line Notation
In this chapter, we’ll show some commands used in the terminal. Lines that you should enter in a terminal all start with $. You don’t need to type in the $ character; it indicates the start of each command. Lines that don’t start with $ typically show the output of the previous command.
Installing compilers on Linux
Clone the official Assert repository:
$ git clone --recurse-submodules https://github.com/d3phys/assert-lang.git
Cloning into 'assert-lang'...
$ cd assert-lang
Check out the stable branch:
$ git checkout origin/stable
Build sources:
$ make
...
Assert language is compiled now!
Read: https://d3phys.github.io/assert-book/
Now you have four applications in your current folder:
- ./tr - front-end code to AST compiler.
- ./cum - back-end AST to AMD x86-64 compiler.
- ./cum-llvm - back-end AST to LLVM IR compiler.
- ./rev - front-end AST to code decompiler.
You also have asslib.o and asslib-llvm.o, the relocatable object files of the standard library.
See the Hello, World! example to learn how to use them.
Updating and Uninstalling
After you’ve installed Assert via git, updating to the latest version is easy. Pull the updates and run make again.
$ git pull
$ make
To uninstall Assert, remove the local assert-lang repository from your machine:
$ ls
... ... assert-lang ... ...
$ rm -R assert-lang
Hello, World!
Now that you’ve installed Assert compilers, let’s write your first Assert program.
It’s traditional when learning a new language to write a little program that prints the text Hello, world! to the screen, but we can't write it because the Assert language does not support string literals :D
That's why we will write a program that prints 448378203247, or 0x68656c6c6f, which spells hello in ASCII. To be more careful, we would have to print 0x6f6c6c6568 (or olleh), because that's how the string would be stored in little-endian mode.
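The arithmetic above can be checked with a short sketch. Python is used here only as a calculator, since Assert itself has no string literals; the variable names are illustrative.

```python
# Pack the ASCII bytes of "hello" into a single integer, as the text describes.
word = b"hello"

# Bytes read left to right as one number (most significant byte first).
as_written = int.from_bytes(word, "big")
print(as_written)        # 448378203247
print(hex(as_written))   # 0x68656c6c6f

# The integer whose little-endian in-memory bytes spell "hello".
in_memory = int.from_bytes(word, "little")
print(hex(in_memory))    # 0x6f6c6c6568
```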
Create and edit the hello.ass file:
$ touch hello.ass
$ vi hello.ass
Write the following code:
dump main()
{
    assert(out(448378203247));
    return 0;
}
Now let's compile the hello.ass file.
First we have to build the AST:
$ ./tr hello.ass hello.tree
Now you have to choose the appropriate back-end compiler.
Legacy Compiler
To use the legacy compiler, compile the binary with ./cum:
$ ./cum hello.tree hello.o
hello.o is an ordinary relocatable object file.
We can analyze it with readelf or objdump:
$ readelf -W -a hello.o
Now let's link the hello.o relocatable object file.
Note that the ld arguments and the dynamic linker path may differ on your system:
$ ld -o hello hello.o asslib.o /lib64/libc.so.6 -I/lib64/ld-linux-x86-64.so.2
Note! You can link the standard Assert library dynamically:
$ ld -o hello hello.o asslib.so /lib64/libc.so.6 -I/lib64/ld-linux-x86-64.so.2
But don't forget to add the directory containing asslib.so to the dynamic loader's search path (for example, via LD_LIBRARY_PATH).
LLVM IR compiler
To use the LLVM compiler, run the following command:
$ ./cum-llvm hello.tree hello.ir
hello.ir is the generated LLVM IR. Possible output is listed below:
$ cat hello.ir
; ModuleID = 'hello.ir'
source_filename = "hello.ir"

define void @__ass_globals_init() {
.entry:
  ret void
}

declare i64 @__ass_print(i64)

declare i64 @__ass_scan()

define i64 @main() {
.entry:
  %0 = call i64 @__ass_print(i64 448378203247)
  ret i64 0
}
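Then you can compile it with llc. Note that llc emits assembly by default, so -filetype=obj is needed to get an object file:
$ llc hello.ir -O2 -filetype=obj -o hello.o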
Next, link hello.o with the standard Assert library:
$ gcc hello.o asslib-llvm.o -o hello
Running the program
That's it! We can run the ./hello program:
$ ./hello
448378203247
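The legacy build steps from this chapter can be collected into a small helper. This is only a sketch: the function name legacy_build_commands is hypothetical, it merely assembles the command lines shown above without running them, and the libc and dynamic linker paths are the ones from the example and may differ on your system.

```python
def legacy_build_commands(name):
    """Return the legacy-backend commands for NAME.ass, in the order used above."""
    return [
        ["./tr", f"{name}.ass", f"{name}.tree"],      # source code -> AST
        ["./cum", f"{name}.tree", f"{name}.o"],       # AST -> x86-64 object file
        ["ld", "-o", name, f"{name}.o", "asslib.o",   # link with the standard library
         "/lib64/libc.so.6", "-I/lib64/ld-linux-x86-64.so.2"],
    ]

# Print the commands in the same "$ ..." notation as the rest of the chapter.
for command in legacy_build_commands("hello"):
    print("$", " ".join(command))
```

To actually run the steps, each list could be passed to subprocess.run(command, check=True) from inside the assert-lang folder.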
Common Programming Concepts
Specifically, you’ll learn about variables, basic types, functions, comments, and control flow. These basics will give you a strong starting point.
Assertion Failed!
However, the Assert language has a distinctive feature: the keyword assert. Read about the assert keyword.
Assert
The assert keyword is the best feature of the language.
Each statement line must be wrapped in assert().
Otherwise, the line will simply not get into the AST as live code, and the compiler will not give any notice.
dump main()
{
    assert(x = 2);
    assert(x = x * 20);
    return x;
}
Let's study the example. Check the code below:
dump main()
{
    x = 2;
    return 0;
}
After compilation it will look like this (again without any notice):
dump main()
{
    while (0) {
        x = 2;
    }
    return 0;
}
Why not just remove this code?
The answer is simple: it's not so convenient. Study the code below:
dump example() { if (x > 0) x = 10; }
If we just remove x = 10; it will result in a compilation error:
dump example() { if (x > 0) }
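The wrapping behaviour described above can be imitated with a toy model. This is a sketch in Python, not the real ./tr front end: the function name guard_statements is hypothetical, and it only handles simple one-line statements (assert(...) and return lines are kept, everything else is wrapped in a loop that never runs).

```python
def guard_statements(lines):
    """Toy model: keep assert(...) and return lines, wrap the rest in while (0)."""
    result = []
    for line in lines:
        stmt = line.strip()
        if stmt.startswith("assert(") or stmt.startswith("return"):
            result.append(stmt)                       # live code, kept as-is
        else:
            result.append("while (0) { " + stmt + " }")  # dead, but still valid syntax
    return result

print(guard_statements(["x = 2;", "return 0;"]))
```

Because the statement is replaced by a dead loop rather than deleted, a body such as the one after if (x > 0) never ends up empty.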
Data Types
Each value in Assert is a signed 64-bit integer.
Formally, we can say that Assert is a statically typed language.
But the language also supports working with numbers as boolean values, i.e. it normalizes the result of logical operations to 0 or 1.
For example:
dump main()
{
    assert(boolean = 100 > 10 && 1337 == 1337);
    assert(out(boolean));
    assert(boolean = !boolean);
    assert(out(boolean));
    return 0;
}
$ ./boolean
1
0
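The normalization can be modelled in a few lines. This is a sketch in Python with hypothetical helper names (logical_and, logical_not); it assumes that any non-zero operand counts as true, which the text does not state explicitly.

```python
def logical_and(a, b):
    """Model of &&: the result is always exactly 0 or 1."""
    return int(bool(a) and bool(b))

def logical_not(a):
    """Model of !: the result is always exactly 0 or 1."""
    return int(not a)

boolean = logical_and(100 > 10, 1337 == 1337)
print(boolean)               # 1
boolean = logical_not(boolean)
print(boolean)               # 0
print(logical_and(100, 7))   # 1, not 100 or 7
```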
Arrays
Assert supports arrays. It uses a simple and clear syntax. You can't declare an array, but you can access any element of it like this:
assert(arr[10] = 101);
If you are accessing an element for the first time, the compiler will allocate the minimum possible memory in the stack frame or in the bss section (read Scoping Rules). Study the example below:
assert(GLOBAL[12] = 0);
dump main()
{
    assert(local[3] = 1);
    return 0;
}
assert(GLOBAL[12] = 0); creates a 13-element array and initializes the element at index 12 with 0.
assert(local[3] = 1); creates a 4-element array and initializes the element at index 3 with 1.
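One way to picture the "minimum possible memory" rule is to take the highest index used for each array and add one. The sketch below is a toy model in Python, not the compiler's actual algorithm: the name min_array_sizes is hypothetical, and only constant indices are recognized.

```python
import re

# Matches accesses such as GLOBAL[12] or local[3] with a constant index.
ACCESS = re.compile(r"([A-Za-z_]\w*)\[(\d+)\]")

def min_array_sizes(source):
    """Return {array name: smallest element count that covers every access}."""
    sizes = {}
    for name, index in ACCESS.findall(source):
        sizes[name] = max(sizes.get(name, 0), int(index) + 1)
    return sizes

program = "assert(GLOBAL[12] = 0); dump main() { assert(local[3] = 1); return 0; }"
print(min_array_sizes(program))   # {'GLOBAL': 13, 'local': 4}
```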
Overflow control
The language does not control overflow in any way.
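In practice, unchecked arithmetic on signed 64-bit integers usually means silent wraparound. The sketch below shows what that looks like, assuming two's-complement wrapping as performed by x86-64 integer instructions; the text does not specify what Assert guarantees, and wrap_i64 is a hypothetical helper written in Python.

```python
def wrap_i64(value):
    """Reduce an arbitrary integer to the signed 64-bit range with wraparound."""
    value &= (1 << 64) - 1                     # keep the low 64 bits
    return value - (1 << 64) if value >= (1 << 63) else value

largest = 2**63 - 1
print(wrap_i64(largest + 1))    # -9223372036854775808
print(wrap_i64(-2**63 - 1))     # 9223372036854775807
```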
Functions
You’ve already seen one of the most important functions in the language: the main function, which is the entry point of all programs.
You’ve also seen the dump keyword, which allows you to declare new functions.
Note! All functions must have a return statement. It is not necessary to wrap the return in the assert keyword. Read about the assert keyword.
dump main()
{
    assert(function());
    assert(return 0);
}
dump function()
{
    assert(out(101));
    assert(return 10);
}
$ ./bin
101
Parameters
We can define functions to have parameters, which are special variables that are part of a function’s signature.
dump main()
{
    return print(101, 4);
}
dump print(val, times)
{
    while (times) {
        assert(out(val));
        assert(times = times - 1);
    }
    return 0;
}
$ ./bin
101
101
101
101
Recursion
Recursive calls are supported. See the Factorial example.
Control Flow
The most common constructs that let you control the flow of execution of Assert code are if expressions and while loops.
if Expression
An if expression allows you to branch your code depending on conditions.
Study the example:
dump main()
{
    assert(out(compare(101)));
    assert(out(compare(99)));
    return 0;
}
dump compare(val)
{
    if (val > 100) {
        return 1;
    } else {
        return 0;
    }
}
$ ./bin
1
0
Note! You can skip braces in a single-statement if or while block. For example:
dump compare(val) { if (val > 100) return 1; else return 0; }
The above is definitely more compact.
Let's consider other (more beautiful) approaches to writing the compare function.
dump compare(val)
{
    if (val > 100)
        return 1;
    return 0;
}
dump compare(val)
{
    return val > 100;
}
And finally:
dump compare(val)
    return val > 100;
Handling Multiple Conditions with else if
You can't chain multiple conditions with else if. But you can still write your code smarter.
Conditional Loops with while
A program will often need to evaluate a condition within a loop. While the condition is true, the loop runs. Study the example:
dump main()
{
    assert(i = 0);
    while (i < 4) {
        assert(out(i));
        assert(i = i + 1);
    }
    return 0;
}
$ ./bin
0
1
2
3
Scoping Rules
Assert scoping rules are pretty straightforward:
Global overrides local.
The main reason is that the Assert AST does not support variable declarations. Let's study the example below:
assert(GLOBAL = 228);
dump main()
{
    assert(out(GLOBAL));
    assert(GLOBAL = 1337);
    assert(out(GLOBAL));
    return 0;
}
$ ./bin
228
1337
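The rule can be pictured as a lookup that always tries the global table first. The following is a toy model in Python of one possible reading of "global overrides local", not the compiler's real symbol table; the names assign, lookup, global_vars and local_vars are hypothetical.

```python
def assign(name, value, global_vars, local_vars):
    """Store VALUE in the global table if NAME is a known global, else in the locals."""
    target = global_vars if name in global_vars else local_vars
    target[name] = value

def lookup(name, global_vars, local_vars):
    """Globals win over locals when both tables contain NAME."""
    return global_vars[name] if name in global_vars else local_vars[name]

global_vars = {"GLOBAL": 228}   # assert(GLOBAL = 228); at file level
local_vars = {}                 # the frame of main()

print(lookup("GLOBAL", global_vars, local_vars))   # 228
assign("GLOBAL", 1337, global_vars, local_vars)    # assert(GLOBAL = 1337); inside main
print(lookup("GLOBAL", global_vars, local_vars))   # 1337
assign("i", 0, global_vars, local_vars)            # an unknown name becomes a local
```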
Memory allocation
The Assert compiler ./cum allocates:
- Local variables in stack frame.
- Global variables in bss segment.
Examples
This chapter is a collection of runnable examples that illustrate various Assert concepts and standard library keywords. You can find the complete code in assert-lang/examples.
Factorial
dump factorial(x) {
    if (x <= 1)
        return 1;
    return factorial(x - 1) * x;
}
dump main() {
    assert(out(factorial(in())));
    return 0;
}
$ ./bin
10
3628800
Quadratic Equation
You can find the complete code in the Assert examples.
Here I will comment on some of the tricks that I used when writing the code.
Because the language does not support real (floating-point) data types, I had to write an integer square root function, sqrt.
dump sqrt(n)
{
    if (n <= 0)
        return 0;
    assert(sol = -404);
    assert(x = n/2 || 1);
    while (x != sol && x != sol + 1) {
        assert(sol = x);
        assert(x = (x + n/x) / 2);
    }
    return sol;
}
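For comparison, here is the same idea (Newton's iteration on integers) as a Python sketch. It is not a line-by-line translation of the Assert code: it starts from n and uses the textbook stopping rule "stop when the next guess is no smaller", and the name isqrt_newton is hypothetical. The result is checked against Python's built-in math.isqrt.

```python
import math

def isqrt_newton(n):
    """Floor of the square root of n using integer Newton iteration."""
    if n <= 0:
        return 0
    x = n                          # any start >= sqrt(n) works
    while True:
        y = (x + n // x) // 2      # same update step as in the Assert code
        if y >= x:                 # the guesses stopped decreasing: x is the floor
            return x
        x = y

# Cross-check against the standard library for a range of inputs.
print(all(isqrt_newton(n) == math.isqrt(n) for n in range(10_000)))
```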
Appendix
The following sections contain reference material you may find useful.
A - Keywords
The following list contains keywords that are reserved for current use by the Assert language. As such, they cannot be used as identifiers, including names of functions, variables, parameters:
- assert - assert
- dump - define a function or the function pointer type
- if - branch based on the result of a conditional expression
- else - fallback for if
- while - loop conditionally based on the result of an expression
- in - standard input (asslib)
- out - standard output (asslib)
- inv - define constant
B - Grammar
Assert uses a context-free grammar (CFG). The notation is a mixture of EBNF and my own preferences and is described below. You can find the full Assert language grammar here: assert-lang/grammar.
Usage | Notation |
---|---|
definition | -> |
concatenation | , |
termination | ; |
alternation | | |
optional | [ ... ] |
repetition | { ... } |
grouping | ( ... ) |
terminal string | " ... " |
C - Abstract Syntax Tree
The AST was created for cross-compilation with other stupid languages. That's why there are so many strange decisions. You can study the AST standard here.
Other cross compilation language projects are listed below:
- futherus/Language - Belarusian programming language
- kefirRzevo/Language - Lukashenko programming language
- k-kashapov/lang - The language of jokes
D - Performance
LLVM IR Compiler
It generates LLVM IR, so you can run all the optimizations that LLVM supports.
Legacy Compiler
Of course, there are some performance issues. Let's compare the performance of the C compiler and the Assert compiler using the factorial as an example. Here is the C code, followed by the compiler optimization flags:
#include <stdio.h>
#include <stdint.h>
int64_t factorial(int64_t x)
{
    if (x <= 1)
        return 1;
    return x * factorial(x - 1);
}
int main()
{
    printf("%ld", factorial(20));
    for (size_t i = 100000000; i > 0; i--)
        factorial(20);
    return 0;
}
$ g++ -o factorial_O0 -O0 factorial.cpp
$ g++ -o factorial_O1 -O1 factorial.cpp
Study the Assert code here:
dump factorial(x) {
    if (x <= 0)
        return 1;
    return x * factorial(x - 1);
}
dump main() {
    assert(x = 100000000);
    assert(out(factorial(20)));
    while (x > 0) {
        assert(factorial(20));
        assert(x = x - 1);
    }
    return 0;
}
Build the Assert program with the legacy compiler:
$ ./tr factorial.ass factorial.tree
$ ./cum factorial.tree factorial.o
$ ld -o factorial factorial.o asslib.o /lib64/libc.so.6 -I/lib64/ld-linux-x86-64.so.2
The Linux perf utility gives the following results:
Performance counter stats for './factorial':
5 843,50 msec task-clock:u # 0,999 CPUs utilized
0 context-switches:u # 0,000 /sec
0 cpu-migrations:u # 0,000 /sec
55 page-faults:u # 9,412 /sec
26 042 804 818 cycles:u # 4,457 GHz
55 400 158 492 instructions:u # 2,13 insn per cycle
6 500 036 070 branches:u # 1,112 G/sec
184 308 574 branch-misses:u # 2,84% of all branches
5,848005098 seconds time elapsed
5,841837000 seconds user
0,000000000 seconds sys
Performance counter stats for './factorial_O0':
5 839,94 msec task-clock:u # 0,999 CPUs utilized
0 context-switches:u # 0,000 /sec
0 cpu-migrations:u # 0,000 /sec
114 page-faults:u # 19,521 /sec
26 115 713 798 cycles:u # 4,472 GHz
26 203 004 347 instructions:u # 1,00 insn per cycle
6 200 471 267 branches:u # 1,062 G/sec
220 171 781 branch-misses:u # 3,55% of all branches
5,846114561 seconds time elapsed
5,835352000 seconds user
0,003325000 seconds sys
Performance counter stats for './factorial_O1':
4 124,80 msec task-clock:u # 1,000 CPUs utilized
0 context-switches:u # 0,000 /sec
0 cpu-migrations:u # 0,000 /sec
113 page-faults:u # 27,395 /sec
18 127 811 023 cycles:u # 4,395 GHz
19 803 003 158 instructions:u # 1,09 insn per cycle
6 100 470 116 branches:u # 1,479 G/sec
400 124 522 branch-misses:u # 6,56% of all branches
4,125413963 seconds time elapsed
4,123597000 seconds user
0,000000000 seconds sys
Performance is generally a complex thing. But we can conclude that the legacy compiler generates code comparable to g++ with the -O0 option.
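The comparison can be made explicit by recomputing two ratios from the perf counters above. This is a small Python sketch; the numbers are copied from the output listed in this section, and the dictionary layout and function name are illustrative.

```python
# Counters copied from the perf output above: (cycles, instructions, seconds elapsed).
runs = {
    "factorial":    (26_042_804_818, 55_400_158_492, 5.848005098),
    "factorial_O0": (26_115_713_798, 26_203_004_347, 5.846114561),
    "factorial_O1": (18_127_811_023, 19_803_003_158, 4.125413963),
}

def instructions_per_cycle(cycles, instructions):
    return instructions / cycles

baseline = runs["factorial"][2]   # elapsed time of the Assert binary
for name, (cycles, instructions, seconds) in runs.items():
    ipc = instructions_per_cycle(cycles, instructions)
    print(f"{name:13} ipc={ipc:.2f} time_vs_assert={seconds / baseline:.2f}")
```

The Assert binary retires roughly twice as many instructions as the -O0 build but needs about the same number of cycles, which is why the elapsed times are nearly identical.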
E - Supported Architectures
LLVM IR Compiler
LLVM IR is a platform-independent intermediate representation that can be used to represent code for any target architecture that is supported by LLVM.
Legacy Compiler
The following list contains supported architectures:
- amd64 - also known as EM64T or AMD/Intel x86-64.
F - Standard Library
The Assert Standard Library (or asslib) is an interface between the abstract Assert language and the operating system. It implements the keywords that need operating system support, such as in and out.
If you dump a compiled file, you can see the standard library names in the ELF .symtab section (entries 6 and 7 in the output below):
$ readelf -s compiled.o
Symbol table '.symtab' contains 9 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS hello.o
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text
3: 0000000000000000 0 SECTION LOCAL DEFAULT 2 .rodata
4: 0000000000000000 0 SECTION LOCAL DEFAULT 3 .data
5: 0000000000000000 0 SECTION LOCAL DEFAULT 4 .bss
6: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __ass_print
7: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __ass_scan
8: 0000000000000038 0 NOTYPE GLOBAL DEFAULT 1 _start
Sources
You can find the source code in the official Assert repository:
- asslib.s - standard library implementation for legacy backend
- asslib-llvm.c - standard library implementation for llvm backend
- STDLIB - ELF64 configuration file.