Real-Time Monitoring of Ocaml programs

Transcription

Real-Time Monitoring of Ocaml programs
Real-Time Monitoring of Ocaml programs
S. Conchon1,2,3 — J.-C. Filliâtre2,1,3 — F. Le Fessant3 —
J. Robert1 — G. Von Tokarski1
1: Université Paris Sud F-91405 Orsay
2: CNRS / LRI UMR 8623 F-91405 Orsay
3: INRIA Saclay – Île-de-France F-91893 Orsay
{conchon,filliatr,jrobert,gvt}@lri.fr, [email protected]
Abstract. In order to tune or debug a program, it is useful to be able to observe its execution. For instance, one may observe how the memory is used, how much time is spent in some
parts of the code, or even observe some values computed by the program. Several tools allow
such observations (system monitors, specific or generic profilers or debuggers, users logs, etc.).
However, these tools only provide limited observations which are typically obtained at the end
of program execution. This paper presents Ocamlviz, a library to instrument OCaml code and
tools for real-time observation of its execution.
Keywords: Monitoring, Real Time, Ocaml
Résumé long
Il existe de nombreux moyens pour mettre au point un programme
écrit en Objective Caml [3]. S’il s’agit de sa correction, la solution la
plus simple consiste souvent à insérer quelques affichages sur la sortie
standard, par exemple à l’aide de la fonction printf. Si cette trace ne
suffit pas, on peut se tourner vers le debugger d’OCaml [4], qui permet notamment d’inspecter toute valeur calculée et même de revenir en
arrière dans l’exécution du programme. Il est également possible d’utiliser un debugger plus générique tel que gdb [1]. S’il s’agit d’analyser
les performances du programme, il existe des solutions simples et imJFLA 2010
160
JFLA 2010
médiates, comme les outils Unix time et top pour mesurer respectivement le temps d’exécution et la quantité de mémoire utilisée. Pour
une analyse plus fine, on peut utiliser des profilers tels que celui fourni
avec OCaml, ocamlprof [5], ou encore des outils génériques comme
gprof [2] et OProfile [6].
Même s’ils sont complémentaires, ces divers outils possèdent un
grand nombre de limitations. La première concerne les possibilités
d’observation. Les profilers génériques gprof et OProfile sont ainsi
limités à l’observation du nombre d’appels de chaque fonction et du
temps passé dans celles-ci. Inversement, ocamlprof a une granularité
plus fine, mais ne donne que des décomptes de points de passage sans
indication de temps de calcul. Les debuggers non plus ne permettent pas
de mesurer le temps de calcul. D’autre part, aucun de ces outils ne permet de mesurer la quantité de mémoire occupée par une valeur OCaml.
Concernant l’aspect temps-réel, seul OProfile permet d’observer un
programme en cours d’exécution. Des outils comme ocamlprof et
gprof n’offrent qu’une analyse post-mortem. Enfin, aucun effort particulier n’est fait dans ces outils pour présenter graphiquement les résultats des observations. Leur architecture ne permet généralement pas
de leur ajouter facilement une telle fonctionnalité.
Le projet Ocamlviz propose une alternative à ces différents outils,
sous la forme d’une bibliothèque pour instrumenter du code OCaml
et des outils pour visualiser ensuite son exécution, en temps-réel et de
manière distante. Les fonctions proposées dans cette bibliothèque permettent au programmeur d’indiquer précisément les observations qu’il
souhaite faire. En particulier, la granularité de ces observations n’est
pas limitée aux fonctions. L’utilisateur peut ainsi observer des points de
passage et des temps d’exécution pour des portions de code arbitraires.
D’autre part, il est possible de mesurer la quantité de mémoire utilisée
par une ou plusieurs valeurs et même de déterminer si elles sont encore
vivantes pour le ramasse-miettes.
Une particularité d’Ocamlviz est de dissocier le calcul des valeurs
observées et leur exploitation. En effet, le programme instrumenté à
l’aide de la bibliothèque se comporte, pendant son exécution, comme
un serveur, auquel des clients peuvent se connecter pour récupérer les
résultats de l’observation. Ocamlviz fournit un client particulier, sous la
Ocamlviz
161
forme d’une interface graphique GTK-2. Le serveur et les clients communiquent à l’aide d’un protocole réseau indépendant d’OCaml, ce qui
permet d’observer l’exécution depuis des machines distantes et d’écrire
des clients dans d’autres langages.
Principe. Ocamlviz est à la fois une bibliothèque pour instrumenter
du code OCaml à observer et des outils pour transmettre et visualiser
les résultats. En premier lieu, l’utilisateur instrumente son code en indiquant les observations qu’il souhaite réaliser. Il utilise pour cela les
briques de base fournies par la bibliothèque Ocamlviz. Il lie ensuite
son programme avec cette bibliothèque. Le code ainsi obtenu peut être
exécuté normalement mais il peut également être observé. En effet, en
parallèle de l’exécution normale, le programme se comporte maintenant
comme un serveur qui attend des connections et transmet le résultat des
observations à ses clients. En particulier, Ocamlviz fournit une interface graphique évoluée permettant une visualisation agréable et synthétique des résultats.
Le caractère client/serveur de l’architecture offre de nombreux avantages. Premièrement, il permet de décider d’observer le programme à
n’importe quel moment de son exécution. En particulier, un client peut
se connecter longtemps après le début de l’exécution et se déconnecter
à tout instant. Deuxièmement, un nombre arbitraire de clients peuvent
se connecter à un même programme en cours d’exécution. Troisièmement, les clients peuvent se connecter depuis des machines distantes,
qu’elles aient ou non la même architecture que celle sur laquelle s’exécute le code instrumenté. Enfin, cette architecture dissocie l’instrumentation de l’observation des résultats. En particulier, le protocole de communication est indépendant du langage OCaml et permet par exemple
l’utilisation de clients écrits dans d’autres langages.
Observations. La bibliothèque Ocamlviz permet plusieurs types
d’observations, que l’on présente rapidement dans cette section.
– Mesure du temps. Ocamlviz permet de mesurer le temps passé
entre certains points du programme, cumulativement. Pour cela, l’utilisateur crée un ou plusieurs chronomètres, puis les déclenche et les
stoppe en divers endroits du programme, non nécessairement au sein de
162
JFLA 2010
la même fonction. Pour chaque chronomètre, le temps cumulé de son
activation est alors rapporté. L’interface graphique montre cette valeur,
le pourcentage qu’elle représente par rapport au temps total d’exécution
ainsi que le dernier instant où elle a été modifiée.
– Points de passage. Ocamlviz permet de mesurer le nombre de fois
que l’exécution passe par un ensemble de points de programme. Comme
pour la mesure du temps, l’utilisateur commence par créer un ou plusieurs marqueurs, qu’il place aux endroits adéquats. Pour chaque marqueur, Ocamlviz décompte alors le nombre de fois que l’exécution a
emprunté un point de programme désigné par ce marqueur. Là encore,
la notion de marqueur n’est pas liée à celle de fonction, un même marqueur pouvant être placé n’importe où dans le code, y compris plusieurs
fois. L’interface graphique montre la valeur de chaque marqueur, ainsi
que sa date de dernière modification. Un marqueur peut être utilisé de
manière immédiate pour compter le nombre d’appels à une fonction, à
la manière de gprof, ou encore pour compter le nombre de fois qu’une
branche else est empruntée, à la manière d’ocamlprof, mais aussi pour
mesurer cumulativement le nombre de fois que l’on emprunte tel ou tel
partie du code, ce que gprof et ocamlprof ne permettent pas de faire.
– Observation de valeurs. Ocamlviz permet d’observer des valeurs
arbitraires calculées par le programme. Pour les types de base (entiers,
flottants, chaînes de caractères), il suffit d’indiquer une fonction calculant la valeur à observer et la fréquence à laquelle cette fonction doit
être évaluée. La valeur est alors transmise à tous les clients, à cette fréquence. Lorsque la valeur est numérique, l’interface graphique permet
même de tracer un graphe de l’évolution de la valeur dans le temps. La
bibliothèque Ocamlviz fournit des facilités pour observer directement
le contenu d’une référence ou celui d’une table de hachage provenant
de la bibliothèque standard d’OCaml.
Pour observer des valeurs de types plus complexes, une solution
simple consiste à les transformer en chaînes de caractères. Néanmoins,
Ocamlviz fournit également une solution élégante pour observer des
structures s’apparentant à des arbres ou à des graphes. Pour cela, l’utilisateur commence par traduire la valeur à observer dans le type suivant
type t = { node : string; mutable children : t list }
Ocamlviz
163
puis celle-ci est transmise via le protocole client/server jusqu’à l’interface graphique où elle est affichée sous forme d’arbre ou de graphe,
selon le cas. Le partage ou les cycles éventuels sont préservés, à condition qu’ils l’aient été par l’utilisateur dans sa traduction vers le type t
ci-dessus, ce que permet le caractère mutable du champ children.
– Marquage de valeurs. Ocamlviz permet d’analyser l’utilisation de
la mémoire par le programme. Pour cela, l’utilisateur peut marquer un
ensemble de valeurs et connaître à intervalle régulier le nombre d’entre
elles qui sont encore vivantes pour le ramasse-miettes et, éventuellement, la taille qu’elles occupent sur le tas. Là encore, l’utilisateur peut
créer plusieurs marqueurs. Un même marqueur peut être utilisé pour
marquer une ou plusieurs valeurs, possiblement de types différents. Si
plusieurs valeurs sont désignées par le même marqueur, le partage est
pris en compte dans le calcul de l’occupation mémoire.
– Journal. Une dernière technique d’observation fournie par Ocamlviz consiste en une commande similaire à printf qui permet d’envoyer
des messages arbitraires par le protocole client/serveur. Ceux-ci apparaissent alors côté client sous la forme d’un journal, où chaque message
est accompagné de son temps d’exécution.
Lorsque l’utilisateur souhaite faire certaines de ces observations de manière systématique sur l’ensemble du programme, comme observer le
contenu de toutes les références globales ou compter le temps passé
dans chaque fonction, il peut utiliser pour cela un petit module camlp4
fourni avec Ocamlviz.
Réalisation. L’architecture d’Ocamlviz repose sur trois éléments
principaux : un protocole de communication, une bibliothèque serveur
et une bibliothèque client.
Le protocole client/serveur utilisé par Ocamlviz est indépendant du
langage OCaml et de l’architecture sur laquelle tourne le programme en
cours d’observation ou les différents clients. Cela permet entre autres
d’observer un programme depuis une autre machine ou encore d’écrire
un client dans un autre langage. La représentation des données reste
compacte, pour une utilisation optimale de la bande passante, les données transmises pouvant être très volumineuses.
164
JFLA 2010
La bibliothèque serveur, liée avec le programme observé, calcule à
intervalle régulier l’ensemble des données d’observation et les transmet aux différents clients. Les valeurs marquées sont stockées dans des
tables de pointeurs faibles, afin de ne pas s’opposer à leur récupération
par le ramasse-miettes. Ocamlviz fournit deux réalisations différentes
de cette bibliothèque serveur : l’une utilisation des alarmes et l’autre
des processus légers (threads). Il est ainsi possible de s’adapter aux
contraintes du programme observé.
Enfin, Ocamlviz fournit également une bibliothèque facilitant l’écriture de clients. Cette bibliothèque permet notamment de conserver localement au client une partie des données envoyées par le serveur, dans
une fenêtre de temps paramétrable. Il est alors relativement facile de
fournir dans le client un mécanisme de pause ou même de retour en arrière dans le temps. L’interface graphique fournie avec Ocamlviz utilise
cette bibliothèque et fournit notamment ces deux fonctionnalités.
Crédits. Ocamlviz est un logiciel libre, disponible à l’adresse http:
//ocamlviz.forge.ocamlcore.org/. Son développement a été financé par la société Jane Street Capital et réalisé par deux étudiants
de l’Université Paris Sud, Julien Robert et Guillaume Von Tokarski, au
sein de l’INRIA Saclay–Île-de-France.
!!!
Ocamlviz
165
1. Introduction
There exist several ways to tune Objective Caml [3] programs. Concerning correctness, the simplest solution usually consists in inserting
debugging messages, for instance using the printf function. When
this trace is not enough, one can turn to the OCaml debugger [4]. It
allows value inspection and even to rollback in program execution. It
is also possible to use a generic debugger such as gdb [1]. Regarding
performances, there exist simple and immediate solutions, such as Unix
time and top commands, to measure time and space, respectively. For
finer analyses, on can use either the OCaml specific tool ocamlprof [5]
or generic purpose profilers such as gprof [2] or OProfile [6].
Even if useful when used in collaboration, these tools have severe
limitations. The first one is related to observation capabilities. Generic
profilers gprof and OProfile are limited to the number of calls and
time spent for each function. Conversely, ocamlprof only reports on
program point counts without any time measure. Debuggers do not
measure any time either. Beside, none of these tools allows to measure space used by an OCaml value. Regarding real-time analysis, only
OProfile allows to monitor a program under execution. Tools such
as ocamlprof and gprof only offer post-mortem analysis. Finally, no
effort is made is these tools to provide graphical display of observation
results, and their architecture do not allow to easily add such a feature.
The Ocamlviz project is an alternative to all these tools. It is both
a library to instrument some OCaml code and tools to visualize its execution in real-time, possibly remotely. The library offers the programmer to precisely specify the observations to perform. In particular, the
granularity is not limited to functions; program points can be inserted
anywhere in the code and time spent between two such points can be
measured. Moreover, space used by an OCaml value can monitored. It
is even possible to observe whether a value is still alive for the garbagecollector.
One of the main features of Ocamlviz is to dissociate the computation of observed values and their display. Indeed, the program instrumented using the library behaves as a server, to which clients can connect to collect observation results. In particular, Ocamlviz provides a
166
JFLA 2010
GTK-2 graphical client. The architecture of Ocamlviz can be depicted
as follows:
program
instrumented
library
Ocamlviz
network
GUI
Ocamlviz
Figure 1: Ocamlviz’s architecture.
Server and clients communicate through a protocol independent of
OCaml, which allows remote observation from distant machines and to
write clients in other languages.
This paper is organized as follows. Section 2 describes the general
architecture of Ocamlviz and shows the main features for instrumenting programs. Section 3 gives specific implementation details about the
library and the protocol. Finally, we sketch some perspectives in Section 5.
2. Principle
Ocamlviz is both a library to instrument OCaml code and tools to
communicate and visualize the results. The concept is illustrated Figure 1. First, the user instruments the code to indicate which observations
to perform. To do so, the library provides different kinds of observation
functions. Then, the code is linked to the library. When executed, the
resulted program runs as if it was not instrumented, but can also be
observed. Indeed, in parallel to normal execution, the program now behaves as a server ready to connect to new clients and to transmit them
observation data. In particular, Ocamlviz provides an enhanced graphical interface which allows a nice and clear display of results (see Figure 2).
The client/server architecture offers many advantages. First, it allows observation at any time during program execution. In particular,
a client may connect late after the program startup and may disconnect whenever it likes. Second, several clients may connect to a single
Ocamlviz
167
Figure 2: Ocamlviz’s graphical interface.
program, even simultaneously. Third, clients can connect from remote
machines, possibly with a different architecture from that of the running
program. Finally, Ocamlviz’s architecture dissociates the production of
data from their observation. In particular, the communication protocol
is independent of OCaml and thus allows clients to be written in other
languages.
The minimal instrumentation simply consists in linking a program to
the Ocamlviz library. It has the immediate effect of sending to clients
data related to the OCaml garbage-collector (total size of the heap and
of its living part). The graphical interface then displays this information
as a graph, as shown in Figure 2. Beyond this feature, the Ocamlviz
library provides the following features:
– measuring the time spent between two program points;
– counting the number of times one or several program points are
reached;
– observing values computed by the program;
– measuring number and total size of a set of values marked by the
user;
168
JFLA 2010
– displaying printf-like messages, which are also logged.
The remaining of this section details both these different kinds of instrumentation and the graphical interface features.
2.1. Instrumentation
The Ocamlviz library provides modules to observe computation
times, program points, or values computed by the program.
2.1.1. Time Measure.
Ocamlviz defines a notion of stopwatch, which can be started or
stopped anywhere in the program. Such program points do not necessarily coincide with function entries or exits, as it is usually the case in
profiling tools. Moreover, one can measure the cumulative time spent
in several parts of the code. A new stopwatch is created as follows:
let chrono = Ocamlviz.Time.create "t"
The string "t" is only used for display in the graphical interface. The
stopwatch is used as follows:
let f x =
...
Ocamlviz.Time.start chrono;
let z = ... in
Ocamlviz.Time.stop chrono
...
In this example, we only measure the time spent for computing z but not
in the remaining of function f. It is worth noticing that a stopwatch can
be started in some function and stopped in another one. Figure 3 shows
how the graphical interface displays the value of stopwatch "t". Column Time displays the current value of the stopwatch (namely 40.9 seconds, representing 100% of total execution time) and the next column
displays the execution time at which this value was received (namely
40.9 seconds as well, since both values coincide).
Ocamlviz
169
Figure 3: Stopwatches and program points.
2.1.2. Program Points.
Ocamlviz provides a way to mark one or several program points and
then to count the number of times execution reaches these marks. A
new mark is created as follows:
let point = Ocamlviz.Point.create "p"
Then program points to be observed are marked as follows:
...
Ocamlviz.Point.observe point;
...
A given mark can be used at several places inside the code and the count
is global for each mark. For instance, Figure 3 shows that execution
reached 803,800,206 times program points marked using point.
2.1.3. Observing Values.
Ocamlviz allows to observe arbitrary values computed by the program. For simple types (integers, floating-points or strings), it suffices
to provide a function computing the value to observe. Then this function
is called repeatedly, at a frequency specified by the user. The following
code
let () = Ocamlviz.Value.observe_float_fct
"my value" ~period:200 (fun () → sin !v)
declares an observation function computing the floating-point value sin
!v every 200 milliseconds. Figure 4 shows the visualization of this
value in the graphical interface.
170
JFLA 2010
Figure 4: Observing a scalar
value.
Figure 5: Observing a structured value.
For more complex types, a simple solution is to turn values to observe into strings. However, Ocamlviz provides a more elegant way to
observe values such as trees or graphs. To do so, the user first turns his
data structure into the following type:
type t = { node : string; mutable children : t list }
Generally speaking, it is a datatype for graphs where nodes are labeled
with strings. The mutable nature of field children allows to build
cyclic values. If the user data contains sharing, it will be preserved by
the communication protocol and displayed in the graphical interface.
Figure 5 shows the visualization of a structured data where the node
named "shared" is shared.
2.1.4. Marking Values.
Ocamlviz provides a way to precisely analyze how the memory is
used. Indeed, one can mark values and then to monitor, at any time, the
number of such values still alive and the amount of space they use.
As for program points, one first create a new marker, called a “tag”
in this context:
let t =
Ocamlviz.Tag.create ~size:true ~period:300 "foo"
Ocamlviz
171
Here we create a tag with name "foo". We indicate that memory size
must be computed (option size), everywhere 300 milliseconds (option
period). Using such a tag, one can mark values when they are created,
for instance, as follows:
let cons x =
let l = Random.float 10. :: x in
Ocamlviz.Tag.mark t l;
l
The graphical interface displays the number of values and total size for
this tag:
On this screenshot, one reads that 6,032 values still alive are marked
with tag "foo" and use slightly more than 289 kb.
A given tag can be used to mark one or several values, possibly of
different types. If several values are marked with the same tag and
share sub-values, then sharing is taken into account in the computation
of memory use. Thus in the following code
let l2 = [Random.int 3; Random.int 4] in
Ocamlviz.Tag.mark t l2;
let l3 = Random.int 5 :: l2 in
Ocamlviz.Tag.mark t l3;
...
the number of values tagged with t is 2 (the lists l2 and l3) and the
memory used for this tag is 36 bytes (3 cons blocks, of 3 words each
including the header word).
2.1.5. Hash Tables.
Observing a value from an abstract type from the standard library is
not easy: one has to know the type definition and to resort to module
172
JFLA 2010
Obj. For instance, the user cannot easily monitor the contents of a hash
table of type Hashtbl.t, in particular to determine if the distribution
among buckets is uniform. To help in this process, Ocamlviz provides
a module to observe a hash table from the standard library.
The following line declares a global hash table to be observed under
the name "h":
let h =
Ocamlviz.Hashtable.observe "h" (Hashtbl.create 17)
Then the hraphical interface displays several elements regarding this
table:
It includes the number of empty buckets, the filling rate (that is the
proportion of non-empty buckets), and the length of the longest bucket.
2.1.6. Journal.
Finally, Ocamlviz provides a printf-like facility to transmit text
messages up to the clients. These messages are sent together with the
program execution time. For instance, one can write
...
Ocamlviz.log ">>> test %d >>>" i;
and observe the sequence of such messages in the graphical interface:
Ocamlviz
173
2.1.7. Automatic Instrumentation using Camlp4.
It can be rather tedious to instrument the code in a systematic way,
for instance to observe the time spent in each function in a way similar
to a traditional profiler. Ocamlviz provides a small camlp4 module to
do that, which automatically inserts the following observations:
– values for global references of type int, float, bool or string
(initialized with a constant);
– contents of all global hash tables ;
– number of calls and time spent in each global function.
The main purpose of this camlp4 module is to show that such an automatic instrumentation is possible; it is likely that each user will modify
it as needed.
2.2. Advanced Features
2.2.1. Graphical Interface
The graphical interface provided with Ocamlviz proposes other features beyond the ones already presented so far. In particular, the user
may stop the display update at any time, to have a detailed look at the
values transmitted by the server. The program under execution is not interrupted and keeps sending new data. These data are not lost but stored
in a database local to the client (to be described later in Section 3.4). At
any time, the user may resume the real-time observation of the program.
It resumes where it was stopped, thus with a time shifting. More generally, the user may navigate in time arbitrarily using a cursor, within
the limits of a predefined and customizable time window. This feature
appears in the upper part of the graphical interface, as follows:
Another feature of the interface is the ability to display one or several
numerical values within a new graph, as done for garbage-collector data
for instance. To do so, the user selects the values to display (checking
174
JFLA 2010
them) and then asks for the creation of a new graph or the addition to
an existing graph. Such graphs appear in new tabs, but can also be
detached from the main window to be observed in parallel.
2.2.2. Dump Profiling
To profile a program using Ocamlviz, the user needs to start the
graphical interface after starting the program. This can be particularly
disappointing for short life programs, where the graphical interface does
not have time to connect to the program to extract interesting information.
We found that a good solution to that problem is to ask Ocamlviz
to dump all the observations in a file during the execution of such a
program, and then to ask the graphical interface to display the content
of that dump file.
For that, the user just needs to execute the instrumented program with
an additional environment variable, called OCAMLVIZ_DUMP, that must
contain the name of the file that should be created during the execution,
and filled with the observations.
Then, we implemented two new options for the graphical interface:
– Option -i filename tells the graphical interface that it should
read observations from the specified filename.
– Option -tscale factor tells the graphical interface that it should
scale the time using the specified factor.
The second option is particularly useful for very short life programs,
where it is interesting to slow down the execution speed of the program, to be able to observe the program behavior in the graphical interface. The user can also use the OCAMLVIZ_PERIOD environment variable
while running the instrumented program to decrease the period between
observations, to get a very fine grain monitoring of the program.
Using the navigation panel described in the previous section, the user
can then inspect the observed values during the execution of the program, with the possibility to go forward and backward in the execution
as needed.
Ocamlviz
serveur
programme
175
clients
protocole binaire
Ocamlviz
Net
Db
affichage
...
Net
Db
affichage
Figure 6: Client and server parts of Ocamlviz library.
3. Implementation
As shown in Figure 6, the Ocamlviz library can be split into two
parts. First, a server part, linked to the instrumented program, which
communicates with one or several clients via the network. Second, a
client part which provides tools to ease the programming of clients.
In this section, we show how these two parts are implemented in the
current version of Ocamlviz. More precisely, we first describe the communication protocol and its implementation, then we give several details
underlying the computation of observations and finally we present the
storage module on the client side.
3.1. Communication Protocol
An important feature of Ocamlviz is the possibility to remotely observe the execution of a program. In order to be independent of the
architectures of the machines used to observe and to run the program,
we designed a portable binary protocol and implemented it in the client
and server parts of the library. The benefit of such an approach is to
allow the communication between different machines (32 and 64-bits
for instance, little-endian and big-endian) but also to write clients and
servers in any programming language. Thus, it is possible to write a new
library to instrument programs written in a language other than OCaml
and to observe programs using the current graphical interface of Ocamlviz. Conversely, it is possible to write new clients in any programming
language.
176
JFLA 2010
3.1.1. Messages
The current protocol only contains messages from server to clients.
It is defined with 3 kinds of messages:
type msg =
| Declare of uid × kind × string
| Send of uid × value
| Bind of uid list
Messages refer to observed values which are identified by unique integers of type uid. The message Declare (uid, kind, name) declares a new observed value, giving its unique id, its nature of type
kind and the name under which is must be displayed. At connection
time, the client receives from the server a bunch a Declare messages
corresponding to all values observed in the instrumented program so far.
The message Send (uid, v) gives an update v for the value identified by uid. Such a message is repeatedly sent for each observed value,
even when there is no change.
Last, the message Bind uid_list tells the client that several unique
identifiers are linked together; it is used in particular to bind the number
and the size values for a single tag of type Ocamlviz.Tag.t.
3.1.2. Binary Protocol
Each message is a string of characters. It contains a first 4-byte
header indicating its length and a second header indicating its type.
Then other values follow, whose type and number differ according to
the message type.
This representation allows an efficient processing of each message.
In particular, a message is not analyzed as long as it is not entirely
received. Messages are stored within a global buffer, which is only
extended when necessary. This avoids allocating many small strings
for messages, which would fragment the heap and slower the garbagecollector.
To simplify the implementation and possible extension of the protocol, functions are provided to transmit values of each OCaml base type
Ocamlviz
177
and then combined to transmit more complex data. For each base type
ttype (8, 16, 32 or 64-bits integers, floating-points, etc.) we introduce
a pair of functions as follows:
val get_ttype : string → int → ttype × int
val buf_ttype : Buffer.t → ttype → unit
Function get_ttype s pos extracts of value of type ttype from
string s starting at position pos and returns a pair containing this values
together with the next position in the string. Function buf_ttype buf
v appends value v of type ttype to the buffer buf.
To insert the message length, four null bytes are placed in front of
the buffer before encoding the message. The string corresponding to
the message is then extracted and, its length being now known, its four
first bytes are modified accordingly.
Last, in order to allow compatibility between 32 and 64-bits architectures, integer values come together with a tag indicating their size
(31 or 63-bits int, int32, or int64).
3.2. The Server Library
The server library computes every 100 milliseconds all the currently
monitored values, and sends them to all connected clients. The delay between these computations can be adjusted using the OCAMLVIZ_PERIOD
environment variable.
Two different mechanisms have been implemented to deal with these
operations, depending on the constraints on the monitored program:
alarms (signals) and threads.
3.2.1. Using Alarms
Alarms can be used to trigger the execution of a function periodically. This technique works well in most cases, because it stops the
execution of the program while the function is executing, and thus, it
does not suffer from synchronization problems.
There are still cases where alarms could raise issues:
178
JFLA 2010
– In OCaml, alarms are not allowed to interrupt a program at any
instant. Instead, OCaml only allows the execution of the alarm code at
particular points, to avoid bad interactions with the garbage collector.
These points are mostly allocations and input-output operations. Thus,
a program in a computation loop, without allocations nor input-output,
will never trigger the execution of the alarm code, and the Ocamlviz
server will never send any data. To solve this issue, it is possible to add
in the code of the loop a call to the Ocamlviz.yield () function to
allow OCaml to deal with alarm code.
– When a program already makes use of alarms, it becomes impossible for Ocamlviz to use them, because there can be only one handler
for alarms. It is then required to use threads to avoid modifying the
semantics of the original program.
3.2.2. Using Threads
In this second solution, a thread is started when the program is
launched. The only task of this thread is to periodically compute the
new values of monitored variables, and send them to the connected
clients. This solution also suffers from some drawbacks :
– The use of threads is somewhat limited in OCaml, in particular
because the garbage collector is not concurrent. As a consequence, the
runtime does not allow the execution of OCaml code in more than one
thread concurrently.
– As for alarms, rescheduling of threads cannot happen at any time,
but only within allocations or input-output operations. Here again, an
explicit call to Ocamlviz.yield () should be added within computation loops to allow Ocamlviz thread to be scheduled and executed.
3.3. Monitoring Techniques in the Server
Monitoring techniques implemented in Ocamlviz try to modify the
behavior of the monitored program as little as possible. In particular, to
avoid delaying garbage collection of dead values, data that have been
marked by the user with Ocamlviz.Tag.mark are stored in a table of
weak pointers.
Ocamlviz
179
For that, an immediate solution would be to use the Weak module
provided by the standard library in the following way:
module WeakHash = Weak.Make(struct
type t = Obj.t
let hash = Hashtbl.hash
let equal = (==)
end)
However, this solution is both incorrect and inefficient. Indeed, it is
inefficient because the generic hash function Hashtbl.hash can have a
prohibitive cost when applied on big structured values. For example, if
the user decides to mark all the elements of a list, the cost of inserting
the whole list in the table would be quadratic in the length of the list.
In Ocamlviz, this issue is solved by providing our own hash function,
that limits the depth of recursion in big structured values. Although the
Hashtbl.hash_param function looks like a good candidate, its notion
of significant block — basic types — is exactly the opposite of what is
needed here — pointers.
The solution above is also incorrect. Indeed, the physical equality
operator (==) cannot be used to compare values stored in weak tables.
This can be illustrated by the following sequence of instructions:
# let t = WeakHash.create 17;;
val t : WeakHash.t = <abstr>
# let v = Obj.repr [1];;
val v : Obj.t = <abstr>
# WeakHash.add t v;;
- : unit = ()
# WeakHash.mem t v;;
_ : bool = false
The reason of this problem is that WeakHash.mem t v does not actually compare v to every value w pointed to by a weak pointer in the
table t, but to “copies” of these values. Indeed, a direct comparison of
v with w would create a new temporary root that could delay the collection of w. To be more precise, WeakHash.mem creates a copy of w
180
JFLA 2010
by duplicating the first block and then sharing pointers to internal values. Consequently, to get a correct solution, we had to provide our own
physical equality function, which compares the first block with structural equality and internal values by physical equality between v and
w:
let copy_equal x y =
if Obj.is_block x && Obj.is_block y &&
Obj.size x = Obj.size y then
let len = Obj.size x in
let rec loop i x y len =
(i = len) ||
(Obj.field x i == Obj.field y i
&& loop (i+1) x y len)
in
loop 0 x y len
else false
The size of marked values is computed in a depth first search traversal, taking care of correctly dealing with shared and cyclic values: for
that, we use a temporary hash table of the same kind as our weak hash
tables.
Although correct and relatively efficient, our solution is nevertheless still expensive for big structured values, or when the frequency of
monitoring is high. Finally, it is important to know that the OCaml optimizing compiler allocates some static values outside of the heap (in the
data segment). Such values cannot be stored in weak tables, and will
thus result in surprising statistics in Ocamlviz.
3.4. Storage Techniques in the Client
To ease the development of clients, Ocamlviz provides several
OCaml modules. Figure 6 on page 175 shows two modules Net and
Db. The former one is in charge of connecting to the server, reading
incoming messages and decoding them. The latter one records received
data in a database, for a given time window. It also provides an interface
Ocamlviz
181
to query the content of the database for any given time in the execution
of the program within the given time window. Consequently, the time
machine functionality of page 173 is provided freely to any client built
on the Db module.
The database is built on top of a dictionary structure indexed by time.
The signature of the structure module is as follows:
type α t
val create : ?size:int → α → α t
val add : α t → float → α → unit
val find : α t → float → float × α
val remove_before : α t → float → unit
Type α t is the type of a mutable structure associating floating point
values (for execution time) to any kind of data of type α. Function
create allocates a new dictionary containing the data associated with
time 0. It is possible to specify an initial capacity. The instruction add
d t x adds to the dictionary d the association between data x and time
t, with the assumption that t is greater or equal to any time already in
d. The instruction find d t returns the most recent association (t! , x),
such that t! ≤ t. Finally, function remove_before clears from the
dictionary all the associations older than a given time.
It is clear that such a data structure can be directly used to implement
the Db module. Every monitored value is stored in such a dictionary, that
is filled while data is received from the server. The assumption done by
the add function is not a problem, since data are sent with increasing
execution times. The remove_before is immediately used to get rid
of the values that fall outside of the monitored time window to avoid
saturating the client with too much data.
Implementing efficiently such a data structure is not immediate.
Most of data structures provided by the OCaml standard library cannot be used immediately for that purpose. A binary tree would provide
an efficient function to return the greatest value smaller than a given key,
but would not provide an efficient implementation for remove_before,
and insertion would always be logarithmic, even with the simplifying
assumption that data is received in increasing order of time.
182
JFLA 2010
Finally, we implemented the following solution: the associations are
stored linearly in a circular buffer (actually using two arrays, one for
values and one for times). Thus, insertion is done in constant time.
When the buffer is full, the size of the arrays is doubled (insertion keeps
a constant amortized cost). The find function is implemented by a binary search, of logarithmic cost, and the only difficulty is to take care of
the circular nature of the buffer. remove_before calls the binary search
function, and simply moves the position of the first element of the circular buffer to that position, removing all previous entries in logarithmic
time.
4. Experience Report
We experimented the use of Ocamlviz for profiling the Alt-Ergo [?]
theorem prover.
Alt-Ergo is a little engine of proof dedicated to program verification.
It is used to prove validity of first order formulas involving built-in theories such as linear arithmetics, AC symbols etc. Alt-Ergo is currently
used to prove correctness of C, Java or Ada programs.
The architecture of the prover is higlhy modular. Its kernel consists of three main modules: a home made sat-solver which handles the
propositional structure of formulas, a matching module which is used
to instantiate quantified formulas (i.e. lemmas), and a (combination of)
decision procedures which decide atomic formulas involving built-in
theory symbols.
Due to the undecidability and complexity of the problem to be addressed, monitoring and profiling are the main activities in the development of a theorem prover. Many reasons lead us to this very tedious
task:
– the understanding of the reasons why a theorem prover loops on
some inputs;
– the measure of the effect of optimizations;
– the inspection of crucial parts of the system.
We used several Ocamlviz’s features for profiling Alt-Ergo.
Ocamlviz
183
– Counters have been used for counting the number of calls to several
functions in the kernel.
– We used tags to mark the new terms and formulas created by the
matching module; combining counters and tags information gives for
instance a good measure for the impact of hashconsing in the prover.
– The ability to start and stop stopwatch anywhere in the program allowed us to precisely compute the cumulative time spent in some crucial
parts of the sat-solver (for instance in the boolean constraint propagation part which is dispatched in several functions).
– Observation functions such as observe_float_fun have been
used to measure very valuable information about the average size of
arguments of some functions.
Linking Alt-Ergo with the Ocamlviz’s library only required to add
libocamlviz.cmxa at one place in the Makefile. Furthermore, the
monitoring cost was very low (mainly because we instrumented very
few but crucial functions in the prover). Finally, profiling Alt-Ergo with
Ocamlviz has been a very positive experience since it allows us to precisely understand the reasons why the prover failed to prove some goals.
5. Conclusion and Prospects
The Ocamlviz project is free software 1 funded by Jane Street Capital within the Jane Street Summer Project. The idea has been proposed
by the three advisors and implemented by Julien Robert and Guillaume
Von Tokarski, two master students of Université Paris Sud. As a summer project, Ocamlviz seems to have fulfill his goal :
« From my point of view, the single most useful project is
unquestionably ocamlviz. Ocamlviz is a realtime profiling
tool for OCaml, and I was really impressed with the system’s polish. The design is carefully thought out; it seems
to be quite well implemented; the front-end has a surprisingly usable UI; there’s a nice looking website for it, and
good documentation to boot. It’s really a fantastic effort,
1. Ocamlviz is available from http://ocamlviz.forge.ocamlcore.org/.
184
JFLA 2010
and I expect we’ll be taking it for a spin on some of our
own OCaml projects. »
Yaron Minsky, Jane Street Capital [8]
Moreover, first experiments using Ocamlviz have been very successful. In particular, Ocamlviz has been used to analyze a realistic OCaml
program, the Alt-Ergo automatic solver [7]. It allowed us, with only a
lightweight and well thought instrumentation, to identify precisely the
reasons for the non-termination of the solver on some examples. This
would have clearly been much more complicated to do with other tools,
such as debuggers or profilers, and extremely painful with execution
traces, i.e. “à la printf”.
It is clear that Ocamlviz is naturally adapted to such well-aimed
monitoring. Its library has been designed to allow the user to specifically instrument some parts of his code, and so disturb the execution as
little as possible. In particular, the parameters of some functions in the
library can be used to finely tune the cost of the monitoring. However,
it is less obvious if a global instrumentation of a program can really be
useful and provide useful results. In particular, the execution is likely to
be perturbated the computations of too many monitored values within
the library.
Several aspects of Ocamlviz can be improved. On one hand, some
information are still missing. For example, the time spent within the
garbage collector is not available. Although this information could be
computed indirectly from the output of a profiler like gprof, it would
be interesting to have it available within the information received from
the Ocamlviz server. Unfortunately, this would require to modify the
OCaml runtime, which is prohibited if we want to guarantee the perenniality of Ocamlviz.
Regarding the graphical interface, the ability to pause the display
has shown to be very practical, to study in details the information received from the program. However, Ocamlviz still lacks the ability to
put breakpoints in the program, that would automatically pause the interface at a given point of program execution. This would be different
from a debugger that would pause the execution of the program itself.
Ocamlviz
185
Technically, it would be a simple extension, that could be implemented
by adding a new message to pause the display in the graphical interface.
More generally, the protocol could be extended to allow messages to
flow from the client to the server. This would allow more interactivity
between the user and the program, for example to modify some parameters of the program, to stop monitoring an expensive value, or to change
the frequency of monitoring, etc.
Acknowledgments.
The authors, students as advisors, thank the Jane Street Capital company for funding the Ocamlviz project and thank INRIA Saclay–Îlede-France for hosting the students.
References
[1] gdb — The GNU Project Debugger.
software/gdb/.
http://www.gnu.org/
[2] gprof — The GNU Profiler.
http://sourceware.org/
binutils/docs-2.19/gprof/index.html.
[3] Le langage Objective Caml. http://caml.inria.fr/.
[4] ocamldebug — The Objective Caml Debugger. .
[5] ocamlprof — The Objective Caml Profiler. http://caml.
inria.fr/pub/docs/manual-ocaml/manual031.html.
[6] OProfile — A System Profile for Linux.
sourceforge.net.
http://oprofile.
[7] Sylvain Conchon and Evelyne Contejean. Alt-Ergo, an automatic theorem prover dedicated for program verification. http:
//alt-ergo.lri.fr/.
[8] Yaron Minsky. Jane Street Summer Project round-up. http://
janestcapital.com/?q=node/68.