Main BIL module.
This module specifies the Binary Instruction Language (BIL), a language for defining the semantics of instructions. The semantics of BIL itself is defined in [1].
The language is defined using algebraic types. For each BIL constructor, a smart constructor is defined with the same name (where the syntax allows). This makes it possible to use BIL as a DSL embedded into OCaml:
Bil.([
    v := src lsr i32 1;
    r := src;
    s := i32 31;
    while_ (var v <> i32 0) [
      r := var r lsl i32 1;
      r := var r lor (var v land i32 1);
      v := var v lsr i32 1;
      s := var s - i32 1;
    ];
    dst := var r lsl var s;
  ])
where i32 is defined as let i32 x = Bil.int (Word.of_int ~width:32 x); v, r, and s are variables of type var; and src and dst are expressions of type exp.
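To make the control flow of the snippet easier to follow, here is a plain OCaml transliteration that works on Int32 values instead of BIL expressions. It is only a sketch: the name reverse_bits is hypothetical and not part of the library, and the code assumes 32-bit words with logical shifts, as the i32 helper suggests. Read this way, the loop appears to be the classic bit-by-bit reversal of a word.

```ocaml
(* Plain OCaml transliteration of the BIL snippet above, on 32-bit words.
   [reverse_bits] is a hypothetical name used for illustration only. *)
let reverse_bits (src : int32) : int32 =
  let v = ref (Int32.shift_right_logical src 1) in
  let r = ref src in
  let s = ref 31 in
  while !v <> 0l do
    r := Int32.shift_left !r 1;
    r := Int32.logor !r (Int32.logand !v 1l);
    v := Int32.shift_right_logical !v 1;
    s := !s - 1
  done;
  (* Shift the collected bits into place when the high bits of src are zero. *)
  Int32.shift_left !r !s
```

For instance, reverse_bits 1l moves the lowest bit to the highest position.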
@see <https://github.com/BinaryAnalysisPlatform/bil/releases/download/v0.1/bil.pdf> [1]: BIL Semantics.
str () t is a formatted-output function that matches the "%a" conversion specifier in functions that print to a string, e.g., sprintf, failwithf, errorf, and, perhaps surprisingly, all Lwt printing functions, including Lwt_io.printf and the logging functions (or any other function that takes a format of type ('a, unit, string, ...) formatN). Example:
Or_error.errorf "type %a is not valid for %a"
Type.str ty Exp.str exp
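The same convention can be tried with the standard library alone. The sketch below assumes nothing from this library: str_point is a hypothetical printer of the shape unit -> 'a -> string, which is what "%a" expects in Printf.sprintf.

```ocaml
(* [str_point] is a hypothetical printer, shaped like the [str] functions:
   it takes a unit, then the value, and returns a string. *)
let str_point () (x, y) = Printf.sprintf "(%d, %d)" x y

(* "%a" consumes two arguments: the printer and the value to print. *)
let message = Printf.sprintf "point %a is out of range" str_point (3, 4)
```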
type info = string * [ `Ver of string ] * string option
(name, `Ver v, desc) is the information attached to a particular reader or writer.
val version : string
Data representation version. After any change in data representation the version should be increased.
Serializers that are derived from a data representation must have the same version as the data structure from which they are derived. Such serializers can only read and write data of that version.
Other serializers can read and write data independently of its representation version. A serializer that cannot store data of the current version simply should not be added to the set of serializers.
It is assumed that if a reader and a writer have the same name and version, then whatever the writer wrote is readable by the reader. Round-trip equality is not required, so it is acceptable if some information is lost.
It is also possible that a reader and a writer that have the same name are compatible. In that case it is recommended to use semantic versioning.
val size_in_bytes : ?ver:string -> ?fmt:string -> t -> int
size_in_bytes ?ver ?fmt datum returns the number of bytes needed to represent datum in the given format and version.
default_reader returns information about the default reader.
val set_default_reader : ?ver:string -> string -> unit
set_default_reader ?ver name sets a new default reader. If the version is not specified, the latest available version is used. Raises an exception if a reader with the given name doesn't exist.
val with_reader : ?ver:string -> string -> (unit -> 'a) -> 'a
with_reader ?ver name operation temporarily sets the default reader to the reader with the specified name and version. The default reader is restored after operation has finished.
default_writer returns information about the default writer.
val set_default_writer : ?ver:string -> string -> unit
set_default_writer ?ver name sets a new default writer. If the version is not specified, the latest available version is used. Raises an exception if a writer with the given name doesn't exist.
val with_writer : ?ver:string -> string -> (unit -> 'a) -> 'a
with_writer ?ver name operation temporarily sets the default writer to the writer with the specified name and version. The default writer is restored after operation has finished.
default_printer optionally returns information about the default printer.
val set_default_printer : ?ver:string -> string -> unit
set_default_printer ?ver name sets a new default printer. If the version is not specified, the latest available version is used. Raises an exception if a printer with the given name doesn't exist.
val with_printer : ?ver:string -> string -> (unit -> 'a) -> 'a
with_printer ?ver name operation temporarily sets the default printer to the printer with the specified name and version. The default printer is restored after operation has finished.
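The set-then-restore behaviour shared by with_reader, with_writer, and with_printer can be pictured with a small stand-alone sketch. Everything here is hypothetical: current stands in for the library's internal default, with_default mirrors the shape of the with_ functions without the ?ver argument, and restoring the default even when operation raises is an assumption, not something the text above states.

```ocaml
(* Hypothetical stand-in for the library's notion of a default serializer. *)
let current = ref "bin"

(* Temporarily switch the default, run [operation], then restore it.
   [Fun.protect] also restores the default if [operation] raises
   (an assumption about the desired behaviour). *)
let with_default name operation =
  let saved = !current in
  current := name;
  Fun.protect ~finally:(fun () -> current := saved) operation

(* Inside the callback the temporary default is visible. *)
let seen_inside = with_default "sexp" (fun () -> !current)
```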
is_referenced x p is true if x is referenced in some expression or statement in program p, before it is assigned.
val is_assigned : ?strict:bool -> var -> stmt list -> bool
is_assigned x p is true if there exists a Move statement in p with x on its left-hand side. If strict is true, only unconditional assignments are counted. By default, strict is false.
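The effect of strict can be illustrated on a toy statement type. This is a sketch under stated assumptions, not the library's implementation: the stmt type below is a simplified stand-in for BIL statements, variables are plain strings, and "conditional" is taken to mean "nested inside an If or While body".

```ocaml
(* A toy statement type; conditions and right-hand sides are just ints. *)
type stmt =
  | Move of string * int
  | If of int * stmt list * stmt list
  | While of int * stmt list

(* True if [x] occurs on the left of some Move. With [~strict:true],
   assignments nested in If or While bodies are ignored. *)
let is_assigned ?(strict = false) x stmts =
  let rec go stmts =
    List.exists
      (function
        | Move (y, _) -> String.equal x y
        | If (_, yes, no) -> not strict && (go yes || go no)
        | While (_, body) -> not strict && go body)
      stmts
  in
  go stmts

(* "a" is assigned unconditionally, "b" only under a condition. *)
let prog = [ Move ("a", 1); If (0, [ Move ("b", 2) ], []) ]
```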
val prune_unreferenced :
?such_that:(var -> bool) -> ?physicals:bool -> ?virtuals:bool -> stmt list -> stmt list
prune_unreferenced ?physicals ?virtuals ?such_that p removes all assignments to variables that are not used in the program p. This is a local optimization. The variable is unreferenced if it is not referenced in its lexical scope, or if it is referenced after the assignment. A variable is pruned only if it matches one of the user-specified kinds described below (no variable matches the default values, so by default nothing is pruned):
such_that matches a variable v for which such_that v is true;
physicals matches all physical variables (i.e., registers and memory locations). See Var.is_physical for more information. Note: passing true to this option is in general unsound, unless you are absolutely sure that physical variables will not live out of program p;
virtuals matches all virtual variables (i.e., variables that were added to a program artificially and are not represented physically in the program). See Var.is_virtual for more information on virtual variables.
substitute x y p substitutes each occurrence of expression x with expression y in program p. To remember the argument order, recall sed's s/in/out syntax.
substitute_var x y p substitutes all free occurrences of variable x in program p with expression y. A variable is free if it is not bound in a preceding statement and is not bound by a let expression.
free_vars bil returns the set of free variables in program bil. A variable is considered free if it is not bound in a preceding statement and is not bound by a let expression.
fixpoint f applies transformation f until a fixpoint is reached. If the transformation orbit contains non-trivial cycles, then the transformation will stop at an arbitrary point of a cycle.
propagate_consts bil propagates constants from their reaching definitions. The implementation computes reaching definitions using an inference-style analysis, over-approximates while loops (it doesn't compute the meet-over-paths solution), and ignores memory locations.
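The idea of iterating to a fixpoint while guarding against cycles can be sketched in a few lines. This is an illustration only, not the library's code: it works on any value comparable with structural equality, takes the starting value as an explicit second argument, and remembers every visited value in a list, which is simple but not efficient.

```ocaml
(* Apply [f] repeatedly until the result stops changing, or until a
   previously seen value shows up again (a non-trivial cycle). *)
let fixpoint f x =
  let rec loop seen x =
    let y = f x in
    if y = x || List.mem y seen then y
    else loop (x :: seen) y
  in
  loop [] x

(* Halving converges to 0; [swap] cycles between 1 and 2 forever. *)
let halved = fixpoint (fun n -> n / 2) 100
let swap = function 1 -> 2 | 2 -> 1 | n -> n
let cycled = fixpoint swap 1
```

With a cyclic transformation such as swap, the sketch still terminates and returns some point of the cycle.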
prune_dead_virtuals bil removes definitions of virtual variables that are not live in the provided bil program. We assume that virtual variables are used to represent temporaries, thus their removal is safe. The analysis over-approximates the while loops, and won't remove any definition that occurs in a while loop body, or which depends on it. The analysis doesn't track memory locations.
since 1.5
BIL Special values
The Special statement makes it possible to encode arbitrary semantics: encode attr values stores the values, and decode attr gets them back. The meaning of attr and values is specific to the user's domain.
For example, encode call "malloc", where call is the BIL attribute that denotes a call to a function. See call for more information.
Value of a result. We slightly diverge from the operational semantics by allowing users to provide their own storage implementation.
In operational semantics a storage is represented syntactically as
v1 with [v2,ed] : nat <- v3,
where v1 may be either a Bot value, representing an empty memory (or an absence of knowledge), or another storage. So a well typed memory object is defined inductively as:
This is equivalent to an assoc list. Although we provide an assoc list as a storage variant (see Storage.linear), the default storage is implemented more efficiently: it uses linear space and provides $O(\log N)$ lookup and update methods. Users are encouraged to provide more efficient storage implementations for interpreters that rely heavily on memory throughput.
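The trade-off between an assoc-list storage and a logarithmic one can be sketched with the standard library. The modules below are hypothetical and do not follow the library's storage interface: addresses and stored values are plain ints, and a balanced map from Map.Make is used as a stand-in for "some structure with logarithmic lookup and update"; the text above does not say how the default storage is actually built.

```ocaml
(* Assoc-list storage: O(1) update, O(N) lookup; newest binding wins. *)
module Linear = struct
  type t = (int * int) list
  let empty : t = []
  let save (t : t) ~key ~data : t = (key, data) :: t
  let load (t : t) ~key = List.assoc_opt key t
end

(* Balanced-tree storage: O(log N) lookup and update, linear space. *)
module Int_map = Map.Make (Int)

module Tree = struct
  type t = int Int_map.t
  let empty : t = Int_map.empty
  let save (t : t) ~key ~data : t = Int_map.add key data t
  let load (t : t) ~key = Int_map.find_opt key t
end

(* Both variants agree on the observable behaviour. *)
let linear = Linear.(save (save empty ~key:0 ~data:1) ~key:0 ~data:2)
let tree = Tree.(save (save empty ~key:0 ~data:1) ~key:0 ~data:2)
```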
val register_pass : ?desc:string -> string -> (t -> t) -> pass
register_pass ~desc name pass provides a pass to the BIL transformation pipeline. The BIL transformation pipeline is applied after the lifting procedure, i.e., it is embedded into each lift function of all Target modules. (You can selectively register passes based on architecture by subscribing to the Project.Info.arch variable.) All passes in the selection provided to select_passes are applied in the order of the selection until a fixed point is reached or a loop is detected. By default, no passes are selected. The bil plugin provides a user interface for pass selection, as well as some useful passes.
select_passes passes selects the passes for the BIL transformation pipeline. See register_pass for more information about the BIL transformation pipeline.