Leet (programming language)

Leet (or L33t) is an esoteric programming language based loosely on Brainfuck and named for the resemblance of its source code to the symbolic language "L33t 5p34k". L33t was designed by Stephen McGreal[1] and Alex Mole to be as confusing as possible. It is Turing-complete and supports self-modifying code. Software written in the language can make network connections and may therefore be used to write malware.[citation needed]

Language specification

The basic data unit of L33t is the unsigned byte (big-endian), which can represent ASCII values and numbers in the range 0-255.

The source code is in "l33t 5p34k" and words are separated by spaces or carriage returns. The language uses 10 opcodes and each word in the source code is translated into an opcode by adding all the digits in the word together, e.g. l33t = 3 + 3 = 6. It is not necessary to use anything but digits in the code.

The language uses a 64K block of memory, and 2 pointers - a memory pointer and an instruction pointer. The l33t interpreter tokenizes all the words in the source to create a sequence of numerical opcodes, and places them in order into the memory block, starting at byte 0. The instruction pointer will keep incrementing until it encounters an END. The memory pointer starts at the first byte after the instructions. Memory "wraps": incrementing either the memory pointer or the instruction pointer past 64K will cause it to wrap around to byte 0, and decrementing it below byte 0 will wrap it around to the top of memory.
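The memory layout described above might be set up roughly as follows. This is a sketch under stated assumptions: the names `load` and `wrap` are hypothetical, opcode values above 255 are reduced modulo 256 to fit in a byte, and a program is rejected unless at least one byte of memory is left over after it.

```python
MEMORY_SIZE = 64 * 1024  # the 64K block described in the text


def load(opcodes: list[int]) -> tuple[bytearray, int, int]:
    """Return (memory, instruction_pointer, memory_pointer) for a program."""
    if len(opcodes) >= MEMORY_SIZE:
        # Assumption: at least one byte must remain free after the program.
        raise ValueError("program does not fit in memory")
    memory = bytearray(MEMORY_SIZE)
    for address, value in enumerate(opcodes):
        memory[address] = value % 256  # assumption: large digit sums wrap
    # Instructions start at byte 0; data starts right after them.
    return memory, 0, len(opcodes)


def wrap(pointer: int) -> int:
    """Wrap a pointer into the valid address range, in both directions."""
    return pointer % MEMORY_SIZE


memory, ip, mp = load([7, 64, 1, 10])
print(ip, mp)                  # 0 4
print(wrap(MEMORY_SIZE + 2))   # 2
print(wrap(-1))                # 65535
```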

The memory pointer can also be moved into the area of memory occupied by the instructions, so code can modify itself at runtime. Similarly, the instruction pointer will continue incrementing or jumping until it encounters an END, so code can be generated at runtime and subsequently executed.

Opcodes

VALUE OPCODE DESCRIPTION
0 NOP No Operation, except to increment the instruction pointer.
1 WRT Writes the ASCII value of the byte under the memory pointer to the current connection (see CON). Increments the instruction pointer.
2 RD Reads a character from the current connection (see CON) and writes it to the byte currently under the memory pointer. Increments the instruction pointer.
3 IF Moves the instruction pointer forward to the command following the matching EIF, if the byte under the memory pointer is equal to zero.
If the byte under the memory pointer does not equal zero, IF simply increments the instruction pointer.
4 EIF Moves the instruction pointer backwards to the command following the matching IF, if the byte under the memory pointer is not equal to zero.
If the byte under the memory pointer does equal zero, EIF simply increments the instruction pointer.
5 FWD Move memory pointer forward by (next word+1) bytes. Adds 2 to the instruction pointer.
6 BAK Move memory pointer backward by (next word+1) bytes. Adds 2 to the instruction pointer.
7 INC Increment value of the byte under memory pointer by (next word+1). Adds 2 to the instruction pointer.
8 DEC Decrement value of the byte under memory pointer by (next word+1). Adds 2 to the instruction pointer.
9 CON Reads the 6 bytes starting with the memory pointer (the first 4 bytes specifying an IP in the format 127.0.0.1, and the last 2 bytes combining to make a 16-bit port number*), and opens a connection if possible. If a connection can't be opened, l33t will return the error message "h0s7 5uXz0r5! c4N'7 c0Nn3<7 l0l0l0l0l l4m3R !!!" and reset the current connection to the last successful one (stdin/stdout if there were no previous successful connections).
If all 6 bytes read 0, l33t reverts to the local machine's stdin and stdout (this is the default setting upon starting a l33t program). Increments the instruction pointer.
Regardless of whether the connection was successful or not, the memory pointer will be left in the same place as it was. Only FWD and BAK move the memory pointer.
* The port number can be calculated by something along the lines of: portNumber = (byte5 << 8) + byte6
10 END Closes all open connections and ends the program. The value 10 won't end the program if it is used as data for opcodes FWD, BAK, INC or DEC.
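The opcode table can be read as a small fetch-and-execute loop. The Python sketch below is one possible reading of it, not a reproduction of any of the interpreters listed later. Its assumptions: all names (`run`, `after_matching_eif`, `decode_address`, `input_text`) are hypothetical; CON is treated as a no-op that only increments the instruction pointer; RD stores 0 when the input is exhausted; values above 10 raise an error; EIF finds its IF through a stack of IF addresses recorded as they are entered, which is a simplification that does not cover every case of self-modifying code; and `decode_address` takes the sixth byte to be the low byte of the port.

```python
MEMORY_SIZE = 64 * 1024
NOP, WRT, RD, IF, EIF, FWD, BAK, INC, DEC, CON, END = range(11)
TWO_WORD = (FWD, BAK, INC, DEC)  # opcodes that consume the next word as data


def after_matching_eif(memory, ip):
    """Return the address just past the EIF matching the IF at ip."""
    depth = 0
    for _ in range(MEMORY_SIZE):
        op = memory[ip]
        if op == IF:
            depth += 1
        elif op == EIF:
            depth -= 1
            if depth == 0:
                return (ip + 1) % MEMORY_SIZE
        # Assumption: operands are data, so they are skipped while scanning.
        ip = (ip + (2 if op in TWO_WORD else 1)) % MEMORY_SIZE
    raise ValueError("IF without matching EIF")


def decode_address(six):
    """Interpret the 6 bytes CON reads as (dotted IP string, port number)."""
    host = ".".join(str(b) for b in six[:4])
    port = (six[4] << 8) + six[5]
    return host, port


def run(opcodes, input_text=""):
    """Execute a list of opcode values and return everything WRT produced."""
    if len(opcodes) >= MEMORY_SIZE:
        raise ValueError("program does not fit in memory")
    memory = bytearray(MEMORY_SIZE)
    memory[:len(opcodes)] = bytes(v % 256 for v in opcodes)
    ip, mp = 0, len(opcodes)
    pending, output, if_stack = list(input_text), [], []
    while True:
        op = memory[ip]
        amount = memory[(ip + 1) % MEMORY_SIZE] + 1  # (next word + 1)
        if op == END:
            break
        elif op in (NOP, CON):  # CON is not implemented in this sketch
            ip += 1
        elif op == WRT:
            output.append(chr(memory[mp]))
            ip += 1
        elif op == RD:
            memory[mp] = ord(pending.pop(0)) % 256 if pending else 0
            ip += 1
        elif op == IF:
            if memory[mp] != 0:
                if_stack.append(ip)
                ip += 1
            else:
                ip = after_matching_eif(memory, ip)
        elif op == EIF:
            if not if_stack:
                raise ValueError("EIF without matching IF")
            if memory[mp] != 0:
                ip = if_stack[-1] + 1  # back to the command after the IF
            else:
                if_stack.pop()
                ip += 1
        elif op in (FWD, BAK):
            mp = (mp + (amount if op == FWD else -amount)) % MEMORY_SIZE
            ip += 2
        elif op in (INC, DEC):
            delta = amount if op == INC else -amount
            memory[mp] = (memory[mp] + delta) % 256
            ip += 2
        else:
            raise ValueError(f"unknown opcode {op} at address {ip}")
        ip %= MEMORY_SIZE
    return "".join(output)


# Set a counter to 3 and the next byte to 65 ("A"), then loop three times.
program = [7, 2, 5, 0, 7, 64, 6, 0, 3, 5, 0, 1, 6, 0, 8, 0, 4, 10]
print(run(program))                                   # AAA
print(decode_address(bytes([127, 0, 0, 1, 31, 144])))  # ('127.0.0.1', 8080)
```

Passing opcode values directly keeps the sketch independent of tokenization; in a real program each value would be written as a word whose digits sum to it.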

Bugs

F00l! teh c0d3 1s b1g3R th4n teh m3m0ry!!1!

You tried to load a program that is too big to fit in the memory. Note that at compile time, one byte is reserved for the memory buffer, so the program's size must be less than the memory size minus one byte.

Byt3 s1z3 must be at l34st 11, n00b!

The byte_size argument of new() was less than 11. The byte size of an interpreter must be at least 11 (to accommodate the opcodes).

L0L!!1!1!! n0 l33t pr0gr4m l04d3d, sUxX0r!

run() called before any program was loaded.

Interpreters

Python

Written by Alex Mole. Does not support the CON opcode, but otherwise considered the "definitive" interpreter.[citation needed]

Ruby

Written by Eric Redmond. This one contains an implementation of CON.

JavaScript

Written by Phil McCarthy. It is based on the Python interpreter but is somewhat more interactive.

C

Interpreters written in C have been produced by Kuisma Salonen (for use on Linux) and by Alecs King.

Perl 6

Written by Gaal Yahas. This interpreter is notable for being the first to come with a debugger.

References
