How it works
libinteractive reads an .idl file, which is a textual description of the functions implemented by the problemsetter's and contestants' programs that can be called from each other's code. From that file, all compilation rules and adapters can be generated so that both programs can be compiled and communicate with each other as if they were both written in the same language and compiled as a single program.
An advantage of libinteractive is that the problemsetter's and contestants' programs run in different processes (and potentially in different sandbox environments), so it is not necessary to protect the memory or the standard input to avoid cheating. It is also possible to write both implementations in different programming languages. This means that the problemsetter only needs to write one implementation and libinteractive handles the heavy lifting.
Implementation details
After reading the .idl file, libinteractive generates a Makefile (or a .bat file for Windows) and all the needed code for the problemsetter's and contestants' code to be able to invoke each other's functions. The autogenerated function shims serialize and deserialize the function parameters, send a message to the appropriate process and blocks until a reply message is received, at which point it deserializes the return value (if any) and returns it itself. This makes calling a function totally transparent from the point of view of both coders.
libinteractive uses named pipes to achieve inter-process communication and synchronization. The messages sent through the pipes consist of a function identifier (autogenerated for each function in the .idl), the serialized function parameters, and a validation cookie that must be returned as-is to validate that the message was sent and processed correctly. The binary format used is as follows:
message = function-id *field cookie
function-id = int ; a random 32-bits integer that identifies the called function
cookie = int ; a random 32-bits integer used as sentinel
field = byte | short | int | float | long | double | array
byte = UNSIGNED_CHAR
short = SHORT
int = INT
float = INT ; IEEE-754 binary32
double = LONG ; IEEE-754 binary64
long = LONG
array = *byte ; as many bytes as needed. explained below
All integers use the little-endian encoding (the native in x86 architectures),
so the C implementation simply needs to invoke the write system call with a
pointer to the parameter using sizeof(parameter) as the length.
All arrays in the .idl file must comply with the C rules for array sizes. This
means that all array dimensions (except maybe the first) must be integer
constants. The first dimension may be a variable, but it needs to be passed in
as an int parameter to the function.
Since the order and type of the parameters is determined by the .idl file (so it is known at compile time), it is not necessary to write the message length in the message, because the decoder is generated in a way that it will only read the required number of bytes from the pipe.
Finally, libinteractive will also generate a Makefile/.bat so that contestants
can compile all the code without worrying about details, in order to test their
implementations with sample input files provided by the problemsetter. One of
the programs built by the Makefile is a tiny C program called run, which
generates the named pipes, runs the programs, redirects stdin to the
problemsetter's program (if needed), and finally prints all program's
stdout/stderr with a tag to identify from which process it came from. Finally,
once all programs exit, it prints a summary of the memory used (maximum
resident size) and the total time used by the contestant's code (user time).
Performance
Since the problemsetter and contestants' code are not compiled together in a single binary, what was once a simple function call (0-10 CPU cycles of overhead in C, depending if it was possible to perform inlining of the function or not) now requires at least two system calls and two process switches, which adds 7-10 microseconds to the total execution time (wall time) and around 2-3 microseconds to the contestant time (user time). This means that if a problem requires more than ~400,000 function calls between processes, programs generated by libinteractive will most likely exceed time limits.
Part of this overhead is copying data between buffers and is not possible to remove without changing the programming model and thus breaking compatibility with standalone linked programs for easy experimentation. The rest of the overhead is caused by the operating system when making system calls and switching processes, so there is not much we can do. Experimentation showed that even when using alternative ways of performing IPC (shared memory + semaphores to indicate readiness), overall runtime improved by less than 5%, and user time increased by up to 50%.
If the program does not require a large volume of function calls, libinteractive is an excellent option to write your interactive problems.
Supported scenarios
Some of the scenarios that will work great with libinteractive are:
- Passing giant arrays with millions of elements between the problemsetter and the contestants' code. The serialized binary format is much more efficient to read (just one system call if the data fits on a 4096-byte buffer) than reading them from standard input (which requires parsing the numbers), so this will reduce contestant time significantly.
- Problems with several phases (like Parrots in IOI 2011) was almost impossible to support before and is now a supported scenario.
- Autogenerating the contestant's input. For instance, a technique used commonly in Facebook Hacker Cup is to ask the contestant to generate their own input using a linear congruential generator to avoid having to read a multi-megabyte input file.
- Since the processes run in different sandbox, it is now not needed to hide/obfuscate the program memory. This allows the problemsetter program to also fulfill the role of the validator, so it can write the final score obtained by the contestant.