EvilVM Documentation Site


OK, I’ll be honest, I don’t receive questions frequently. So this document is really to answer the sorts of questions I’d expect ought to be frequent. It’s really a slush space for notes on odd cases, or describing EvilVM’s error messages (which are often terse to save space / complexity).

Compiler Errors

How do I understand exception errors?

EvilVM uses Windows structured exception handling (SEH) to catch exceptions and recover. But since it’s not possible to properly instrument SEH when the code is compiled dynamically, exception handling is somewhat simplistic. When an exception happens, you’ll see something like this:

Input State:
Line Number:      83 
Last word:        SCREENSHOT

Exception RIP:    4025a2 
In word:          UNKNOWN
Last call:        UNKNOWN
Exception RSP:    60fcd0 
Context PTR:      60f5c0 

Exception Record:
Exception Code:   80000003 
Exception Flags:  0 

The “Input State” gives you some clue about where the interpreter or the compiler was when the exception happened. The line number is reset to 1 every time the server sends code to the compiler, so if you just sent a file, this number should correspond to where in the input file the error occurred. The “Last word” field tells you the last word that was consumed and parsed by either the interpreter or the compiler. A very common error is to mistype a name in a source file, so these two fields together can help you quickly determine what went wrong if that happens.

In the “context” area, EvilVM tells you where rip was at the time of the exception. If this address falls within a defined dictionary word’s extent, you should see that word’s name in the “In word” field. The “Last call” field comes from the value of rsp at the point of the exception. This may not always be valid (for example, if you’re using >r and related words), but if it resolves to a dictionary word, that name will be shown too. Another possibility is that the error occurred in ASM code in the original shellcode (e.g., the transport, the outer interpreter, etc.); in this case, the field will show “shellcode”. If your exception comes from calls into external DLLs, or the values are otherwise not intelligible, the fields will read “UNKNOWN”, as there’s not much EvilVM can do here.
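The address-to-word lookup described above can be pictured with a small host-side sketch in Python. This is only a model of the idea, not EvilVM’s actual code: the word table, its addresses, lengths, and the name word_at are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical dictionary extents: (start address, code length, name).
# These entries are invented for illustration only.
WORDS = sorted([
    (0x402000, 0x40, "DUP-TWICE"),
    (0x402100, 0x80, "SCREENSHOT"),
])

def word_at(addr, words=WORDS):
    """Return the name of the word whose extent contains addr, else "UNKNOWN"."""
    starts = [start for start, _, _ in words]
    # Find the last word that starts at or before addr.
    i = bisect_right(starts, addr) - 1
    if i >= 0:
        start, length, name = words[i]
        if start <= addr < start + length:
            return name
    return "UNKNOWN"
```

With this made-up table, an address inside SCREENSHOT’s extent resolves to that name, while an address such as 4025a2 that falls outside every extent comes back as “UNKNOWN”, which is the same outcome as in the sample report.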

When an exception occurs, the entire CONTEXT struct is retained, and can be accessed at the indicated address. Refer to Microsoft’s docs for info about what’s in there.

There is also an exception code (80000003 in the example above). These are NTSTATUS codes, which usually have some useful meaning. I don’t want to include a full dictionary of status codes and error messages, as that would become quite large. But you can look these up in various places, such as Microsoft’s published list of NTSTATUS values. These will often provide extra flavor, though sometimes a code won’t appear on Microsoft’s list because some libraries define their own. Check the relevant docs if you’re off the beaten path.
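If you want a quick first read of a code before looking it up, the NTSTATUS bit layout can be unpacked by hand. The following is a small Python sketch for use on your own machine, not part of EvilVM; the function name decode_ntstatus and the short lookup table are my own, and the table is deliberately incomplete.

```python
# Severity occupies the top two bits of an NTSTATUS value.
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}

# A few well-known codes; far from a complete list.
KNOWN = {
    0x80000003: "STATUS_BREAKPOINT",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}

def decode_ntstatus(text):
    """Split a hex exception code, as printed in the report, into its fields."""
    value = int(text.strip(), 16)
    return {
        "severity": SEVERITY[(value >> 30) & 0x3],  # bits 31-30
        "customer": bool((value >> 29) & 0x1),      # bit 29: set for non-Microsoft codes
        "facility": (value >> 16) & 0xFFF,          # bits 27-16
        "code": value & 0xFFFF,                     # bits 15-0
        "name": KNOWN.get(value, "unknown"),
    }
```

For example, decode_ntstatus("80000003") reports severity “warning” and the name STATUS_BREAKPOINT, while decode_ntstatus("c0000005") reports severity “error” and STATUS_ACCESS_VIOLATION. A set customer bit is a hint that the code came from a library rather than from Windows itself.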

Note that when EvilVM goes to throw an exception on purpose, it will usually do so with an int3 instruction (hex value cc), which will show up as an exception code 0x80000003. That also means it’ll function as a breakpoint in a debugger, so that can sometimes be a handy way to figure out what’s going on.

I see Fixing 'here'; dictionary may be inconsistent now! in an error, what’s that mean?

This one was an annoying bug that came up every so often while I was fleshing out some of the compiler’s core features, especially when defining data structures in the dictionary. It can happen that code somehow corrupts the here pointer, which is pivotal to correctly processing the dictionary. Here’s a quick example of corrupting the here pointer:

\ constant has 1 f
1024 value BUFSIZE
\ but is used with 2 f's
create buffer BUFFSIZE allot

Note that there is a typo, so when BUFFSIZE is run by the interpreter, it can’t be found, and no proper size is put on the stack. So when allot goes to make space, it will take whatever is on the stack, which might be a really bad value sometimes. The result can be an invalid here pointer, and thus anything that ever touches it (including the exception pretty printer, annoyingly, due to its dependence on pad for printing numbers) will generate exceptions.

This is avoided by adding guards to the allot word, so at least the basic case can’t happen. But other situations can corrupt the here pointer, so there’s a failsafe. The .exception pretty printer performs a sanity check to ensure that here falls somewhere within the allocated dictionary space. If not, it will estimate a “valid” pointer by going to the last defined word, and using its length field to derive a new, safe value.

This can still produce an unstable condition, though, because not all defined words have a sensible length field. The length field is actually just the length of the behavior, and so if you have used create to make spaces in the dictionary in concert with allot, then you could have some inconsistent data in your future.
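To make the failsafe and its limitation concrete, here is a rough model of the check in Python. It is a sketch under assumptions, not the real .exception code: the function name check_here, its parameters, the inclusive bounds, and the “start plus length” estimate are all my guesses at how such a repair could work.

```python
def check_here(here, dict_base, dict_size, last_start, last_length):
    """Return (pointer, repaired).

    If here lies inside the dictionary allocation it is returned unchanged.
    Otherwise a replacement is estimated from the last defined word, assuming
    (hypothetically) that its behavior ends at last_start + last_length.
    """
    if dict_base <= here <= dict_base + dict_size:
        return here, False
    return last_start + last_length, True

# Made-up numbers: the last word starts at 0x800380 with a 0x20-byte behavior.
# Suppose it was followed by 0x60 bytes of allot, so the true here was 0x800400.
DICT_BASE, DICT_SIZE = 0x800000, 0x10000
healthy = check_here(0x800400, DICT_BASE, DICT_SIZE, 0x800380, 0x20)
repaired = check_here(0xDEADBEEF, DICT_BASE, DICT_SIZE, 0x800380, 0x20)
```

In this made-up case the repaired pointer is 0x8003a0, which lies inside the dictionary but below the true value of 0x800400. The estimate has landed in the middle of the allotted data, which is exactly the kind of inconsistency described above.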

Nevertheless, the system tries to do its best to stay interactive. If you encounter this error, and are concerned about stability, you might consider issuing a forget to rewind to the last mark point, and reloading code from there.

Compiler Conventions

How is the Forth environment state mapped to registers / global variables?

This will be of interest to you if you’re trying to debug broken code, if you’re writing syntax extensions in immediate words, or trying to make sense of exceptions or code disassembly. The following table highlights the important parts of the current runtime state in the EvilVM environment:

Object      Meaning                                       Notes
rdi         Top item on data stack
r12         Pointer to second item on stack
r15         Pointer to base of global variable table
here        Pointer to next available byte in dictionary
last        Most recently defined word in dictionary      set at ;
this        Current word in dictionary                    set at : or create
dict        Pointer to base of dictionary
entrypoint  Start address of EvilVM shellcode             spawn a thread here for fun
base        Current numerical base for numbers            reading and printing

The global variables listed above are accessed at offsets into the global variable table (pointed to by r15). You can find these offsets (and lots of other interesting global variables) in the file at agent/table.asm. IO layers or other optional components may add to this table, so its layout may or may not be consistent from one assembly to another. All global variables are of QWORD size. You can inspect them at runtime as follows:

0 glob 512 dump
clamp to 512 

60fce0  00 00 00 00 00 00 00 00 00 00 75 49 fa 7f 00 00 
60fcf0  2c 1b 7e 49 fa 7f 00 00 b8 01 7e 49 fa 7f 00 00 
60fd00  a0 34 7e 49 fa 7f 00 00 60 9d 76 49 fa 7f 00 00 
60fd10  00 ff 02 00 00 00 00 00 e9 1b 40 00 00 00 00 00 
60fd20  00 00 00 00 00 00 00 00 89 2a 40 00 00 00 00 00 
60fd30  ab 25 40 00 00 00 00 00 7b 32 80 00 00 00 00 00 
. . .
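Since every global is a QWORD, it can be easier to read the dump as a list of 64-bit little-endian values than as raw bytes. The following Python sketch does that on the host side; it assumes the dump format shown above (an address followed by up to 16 hex bytes per line), and the function name qwords_from_dump is my own.

```python
import struct

# The first two lines of the dump above, plus a non-dump line to show skipping.
DUMP = """\
clamp to 512
60fce0  00 00 00 00 00 00 00 00 00 00 75 49 fa 7f 00 00
60fcf0  2c 1b 7e 49 fa 7f 00 00 b8 01 7e 49 fa 7f 00 00
"""

def qwords_from_dump(text):
    """Return a list of (address, value) pairs, one per little-endian QWORD."""
    out = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            base = int(fields[0], 16)
            data = bytes(int(b, 16) for b in fields[1:])
        except ValueError:
            continue  # not a dump line (e.g. "clamp to 512" or ". . .")
        # Decode each complete 8-byte group on the line.
        for off in range(0, len(data) - len(data) % 8, 8):
            (value,) = struct.unpack_from("<Q", data, off)
            out.append((base + off, value))
    return out

for addr, value in qwords_from_dump(DUMP):
    print(f"{addr:x}  {value:016x}")
```

If the first dumped address is the base of the table, subtracting it from each address gives the byte offset to compare against agent/table.asm. Which variable lives at which offset depends on your build, so I won’t guess at labels here.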

All other registers are fair game for custom assembly. Bear in mind that this also means that they’re all volatile, and should generally be considered caller-saved registers. The core API uses rax, rsi, rcx, and rbx quite frequently. Either save in the caller for maximum safety, or conduct thorough testing before relying on any special use of CPU registers.

Last updated on 3 May 2019