TOWARDS QUANTITATIVE LANGUAGES FOR COMPLEX SYSTEMS
Mark A. Smith*, Yaneer Bar-Yamº and William Gelbart
* MIT Artificial Intelligence Laboratory
º New England Complex Systems Institute
Harvard University Department of Molecular and Cellular Biology
General Abstract
We believe that many fields of scientific study are hampered by the lack of precise language to describe the structure, behavior, and dynamics of complex systems. New quantitative languages are needed. The challenge is to develop mechanisms for description that are intuitive yet precise and systematic. Such languages will merge the strength of human language in specifying objects, attributes, relationships, and processes with the ability of computer systems to display and manipulate the numerical aspects of a systematic description. They will also serve as bridges between experimental observations and theoretical treatments. Creating a particular language involves connecting qualitative terms with quantitative representations of the system. We illustrate the development of such languages using the structure of Drosophila (fruit fly) wings. In this language, the description of a wing is composed of descriptions of its veins (the structural support elements of the wing) and of the structural constraints on these veins. The veins are described as mathematical curves that capture their shape. The constraints that these curves satisfy as parts of the wing are incorporated into the description of the wing. The nature of a "wing" as a class of objects is captured by the set of curves and their constraints. A particular wing is described by specific parameter values, which can be measured from individual real wings.
We present an ongoing exploratory investigation into the development of compact descriptions of the venation patterns in wings of wild-type Drosophila melanogaster. The long-range benefits of this line of work will include the ability to describe the regulatory mechanisms of development as well as the creation of unambiguous ways to share information within the developmental biology community. Specifically, we build a hierarchical, object-oriented representation of the wings as two-dimensional objects. Bézier curves are used to model the veins as one-dimensional objects. Geometrical constraints between the veins are explicitly represented. The Bézier curves themselves require a relatively small number of numerical parameters compared to the level of detail being represented. The constraints reduce the required number of parameters even further. The process of codifying experimental results by restricting the number of descriptive parameters in this fashion will, by necessity, be accompanied by an increased theoretical understanding of the system under study. We have developed two related applications in the Java programming language that share information using the representation described above: (1) a graphical model editor for registering data from wing images, and (2) a collation utility for model testing, comparison, and analysis. These applications can be used together to conduct high-precision, double-blind biometric studies.
We observe that the problem of incorporating constraints turns out to be a key issue in developing quantitative languages for complex systems in general. However, note that the general problem of satisfying constraints is computationally unsolvable, so care must be taken in the way they are handled. The model editor propagates constraints according to a carefully selected partial order, while any proposed dynamical model would have to include causal mechanisms to maintain the constraints. Possible topics for additional research include: (1) statistical analysis of many wings to look for subtle phenotypic variations and correlations, (2) complexity analysis to quantify representation and measurement errors, (3) development of representations for ectopic veins and plexate wings, (4) development of computer vision systems to facilitate data acquisition, (5) sensitivity analysis to find and eliminate redundant parameters through the addition of more constraints, (6) representing higher dimensional objects in higher dimensional spaces, and (7) representing object evolution over time.
Motivation and Context
One of the difficulties in bringing quantitative methods to bear on complex systems is merely having meaningful ways to talk about the systems in question. In an effort to address this problem as well as to promote the study of complex systems as a systematic discipline, we seek to develop novel ways of describing various systems quantitatively while sharing tools and techniques between disciplines. The resulting languages should provide compact descriptions and be flexible in the level of detail they are able to convey.
We approach this problem in the context of a specific example from the field of bioinformatics: representing the venation patterns in the wings of Drosophila, arguably the premier organism for the study of development and genetics. The wing is a good subject for this exploratory investigation because it is essentially flat and therefore can be represented well with a two-dimensional model. Moreover, it is a good subject for subsequent analysis since the network of genes that regulates its growth is relatively well understood.
The wing representation, data acquisition, and analysis tools described below were all implemented using the Java programming language.
Standard Wing Model
The following figure shows our standard wing model. Individual vein segments are labeled following conventional terminology and are modeled quantitatively as Bézier curves with a parameter t. Segment endpoints and control points are shown as open circles and colored boxes, respectively. Any gaps caused by mutations are demarcated by colored diamonds. Constraints are modeled as objects that contain references to the veins and control points that they constrain, and these objects must be interpreted in a very specific fashion.
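As a minimal sketch of how a point on a vein segment could be computed from its control points, the following uses de Casteljau's algorithm, a standard way to evaluate a Bézier curve of any degree. The class name, the method name, and the storage of control points as {x, y} pairs are assumptions made for illustration; they are not taken from the authors' implementation.

```java
/** Illustrative sketch: evaluates a Bezier curve of any degree at parameter t. */
final class BezierSketch {

    /**
     * De Casteljau's algorithm: repeatedly interpolate between neighboring
     * control points until a single point remains.
     *
     * @param pts control points as {x, y} pairs (hypothetical layout)
     * @param t   curve parameter, 0 at the first point and 1 at the last
     * @return the point on the curve as {x, y}
     */
    static double[] pointAt(double[][] pts, double t) {
        int n = pts.length;
        double[] x = new double[n];
        double[] y = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = pts[i][0];
            y[i] = pts[i][1];
        }
        // Each level has one point fewer than the level before it.
        for (int level = 1; level < n; level++) {
            for (int i = 0; i < n - level; i++) {
                x[i] = (1 - t) * x[i] + t * x[i + 1];
                y[i] = (1 - t) * y[i] + t * y[i + 1];
            }
        }
        return new double[] { x[0], y[0] };
    }

    public static void main(String[] args) {
        // A made-up cubic segment, used only to exercise the method.
        double[][] segment = { {0, 0}, {1, 2}, {3, 2}, {4, 0} };
        double[] mid = pointAt(segment, 0.5);
        System.out.println(mid[0] + ", " + mid[1]); // prints 2.0, 1.5
    }
}
```

A cubic segment needs only four control points (eight numbers), which is the kind of compactness the representation relies on.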

Wing Model Data Structure
The following outline exemplifies the nested hierarchy of objects contained in a Wing object. The hierarchy mirrors the hierarchy of the Java inner classes defined in the Wing class.
- Wing
  - primitive descriptive fields
  - Vein1
    - ControlPoint1
      - x
      - y
      - reference to outer Vein object
    - ControlPoint2
    - ...
    - ControlPointN
  - Vein2
  - ...
  - PointConstraint1
    - reference to ControlPoint1
    - ...
    - reference to ControlPointP
  - PointConstraint2
  - ...
  - TangentConstraint1
    - reference to left ControlPoint
    - references to two center ControlPoints
    - reference to right ControlPoint
  - ...
  - TetherConstraint1
    - reference to constraining Vein
    - t-value of ControlPoint1
    - ...
    - t-value of ControlPointQ
  - ...
  - SubsplineConstraint1
    - gap ControlPoint1
    - ...
    - gap ControlPointN (same N as the Vein)
    - t-value of gap ControlPoint1 (beginning)
    - t-value of gap ControlPointN (end)
  - ...
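The outline above could map onto Java inner classes roughly as follows. This is a sketch under stated assumptions: only the class names Wing, Vein, ControlPoint, and PointConstraint come from the outline, while every field name, constructor, and the `vein()` accessor are hypothetical. The other constraint types are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of the nested hierarchy; fields and constructors are assumptions. */
class Wing {
    final List<Vein> veins = new ArrayList<>();
    final List<PointConstraint> pointConstraints = new ArrayList<>();

    /** A vein segment owns an ordered list of control points. */
    class Vein {
        final String name;
        final List<ControlPoint> points = new ArrayList<>();

        Vein(String name) {
            this.name = name;
            veins.add(this); // register with the enclosing Wing
        }

        /** As an inner class, each control point implicitly references its outer Vein. */
        class ControlPoint {
            double x;
            double y;

            ControlPoint(double x, double y) {
                this.x = x;
                this.y = y;
                points.add(this); // register with the enclosing Vein
            }

            /** Returns the outer Vein object this point belongs to. */
            Vein vein() {
                return Vein.this;
            }
        }
    }

    /** Holds references to control points that are tied to the same location. */
    class PointConstraint {
        final List<Vein.ControlPoint> tied = new ArrayList<>();

        PointConstraint(List<Vein.ControlPoint> tied) {
            this.tied.addAll(tied);
            pointConstraints.add(this);
        }
    }
}
```

With this layout, a vein is created with `wing.new Vein("...")` and a control point with `vein.new ControlPoint(x, y)`, so the "reference to outer Vein object" in the outline comes for free from the inner-class mechanism.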
Data Registration/Model Analyzer
Data can be entered from wing images using a graphical model editor. The model starts out shaped like the standard model shown above and is edited by dragging the control points around with the mouse. The resulting model files can be loaded into a separate model analysis program, and then means and cross correlations can be computed for selected quantities of interest. The following figure shows the graphical user interface of the analysis program. The editor interface is very similar. The display region shows a partially edited model overlaying the corresponding wing image.

This figure shows the final fit:

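The means and cross correlations mentioned above could be computed along the following lines. This sketch assumes that "cross correlation" refers to the Pearson correlation coefficient between two quantities measured across a set of wings; the source does not specify the statistic, and the class and method names are hypothetical.

```java
/** Illustrative sketch: summary statistics for quantities measured on many wings. */
final class WingStats {

    /** Arithmetic mean of one measured quantity. */
    static double mean(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /**
     * Pearson correlation coefficient between two quantities, where a[i] and
     * b[i] are measured on the same wing. The result is NaN if either
     * quantity has zero variance.
     */
    static double correlation(double[] a, double[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        double meanA = mean(a);
        double meanB = mean(b);
        double sumAB = 0, sumAA = 0, sumBB = 0;
        for (int i = 0; i < a.length; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            sumAB += da * db;
            sumAA += da * da;
            sumBB += db * db;
        }
        return sumAB / Math.sqrt(sumAA * sumBB);
    }
}
```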
Constraints
The following diagrams illustrate the constraints and show how they are either satisfied or propagated when one of the constrained objects (a control point or a vein) moves in response to the mouse.
Point Constraint
Control points tied to the same location must move in parallel as a group.

Tangent Constraint
Control points in a line must move in parallel or rotate about the center.
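One plausible reading of the tangent constraint is sketched below: when the center moves, all points translate together, and when one outer point moves, the opposite outer point rotates about the center so that the three locations stay collinear. The choice to preserve the rotated point's distance from the center is an assumption, as are all names and the {x, y} array layout.

```java
/** Illustrative sketch of keeping left, center, and right control points collinear. */
final class TangentSketch {

    /**
     * After the left point has moved, rotates the right point about the
     * center so that left, center, and right are collinear again. The right
     * point keeps its distance from the center (an assumption).
     *
     * @return the new location of the right point as {x, y}
     */
    static double[] realignRight(double[] left, double[] center, double[] right) {
        double dx = center[0] - left[0];
        double dy = center[1] - left[1];
        double len = Math.hypot(dx, dy);
        if (len == 0) {
            return right.clone(); // direction undefined; leave the point alone
        }
        double radius = Math.hypot(right[0] - center[0], right[1] - center[1]);
        return new double[] {
            center[0] + radius * dx / len,
            center[1] + radius * dy / len
        };
    }

    /** Moving the center translates every point in the group by the same offset. */
    static void translate(double dx, double dy, double[]... points) {
        for (double[] p : points) {
            p[0] += dx;
            p[1] += dy;
        }
    }
}
```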

Tether Constraint
A control point must slide along a vein segment or be pulled along when that segment moves.

Subspline Constraint
Gap endpoints must slide along a vein segment or the gap must follow that segment when it moves.
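Since a gap has the same number of control points as its vein and is bounded by two t-values, one way to make the gap follow the vein is to recompute the gap's control points as the sub-curve of the vein between those t-values. The sketch below does this with two de Casteljau subdivisions. It is an assumption about how such a constraint could be maintained, not the authors' code, and all names are hypothetical.

```java
/** Illustrative sketch: control points of the part of a Bezier curve between t0 and t1. */
final class SubsplineSketch {

    /**
     * Splits a curve at t with de Casteljau's algorithm and returns the
     * control points of one half. Points are {x, y} pairs.
     */
    static double[][] split(double[][] pts, double t, boolean keepLeft) {
        int n = pts.length;
        double[][] work = new double[n][];
        for (int i = 0; i < n; i++) {
            work[i] = pts[i].clone();
        }
        double[][] left = new double[n][];
        double[][] right = new double[n][];
        for (int level = 0; level < n; level++) {
            // The first and last points of each level become control points
            // of the left and right halves, respectively.
            left[level] = work[0].clone();
            right[n - 1 - level] = work[n - 1 - level].clone();
            for (int i = 0; i < n - 1 - level; i++) {
                work[i][0] = (1 - t) * work[i][0] + t * work[i + 1][0];
                work[i][1] = (1 - t) * work[i][1] + t * work[i + 1][1];
            }
        }
        return keepLeft ? left : right;
    }

    /** Returns the control points of the sub-curve over [t0, t1], with 0 <= t0 < t1 <= 1. */
    static double[][] subcurve(double[][] pts, double t0, double t1) {
        if (t0 < 0 || t1 > 1 || t0 >= t1) {
            throw new IllegalArgumentException("require 0 <= t0 < t1 <= 1");
        }
        double[][] head = split(pts, t1, true); // covers [0, t1] of the original
        return split(head, t0 / t1, false);     // covers [t0, t1] of the original
    }
}
```

The result has as many control points as the input, which matches the "same N as the Vein" note in the data structure outline.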

Constraint Satisfaction and Propagation
If the control point selected by the mouse is tethered or is an endpoint of a gap, the corresponding constraint must first be satisfied by solving for the t-value closest to the mouse.
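The source does not say how the closest t-value is found. As one possible approach, the sketch below samples the curve coarsely and then refines the best sample with a ternary search. The sample count, the iteration count, and all names are assumptions, and the refinement presumes the distance has a single minimum inside the small bracket around the best sample.

```java
/** Illustrative sketch: finds the curve parameter t whose point is nearest the mouse. */
final class ClosestPointSketch {

    /** Evaluates the Bezier curve at t with de Casteljau's algorithm. */
    static double[] pointAt(double[][] pts, double t) {
        int n = pts.length;
        double[][] work = new double[n][];
        for (int i = 0; i < n; i++) {
            work[i] = pts[i].clone();
        }
        for (int level = 1; level < n; level++) {
            for (int i = 0; i < n - level; i++) {
                work[i][0] = (1 - t) * work[i][0] + t * work[i + 1][0];
                work[i][1] = (1 - t) * work[i][1] + t * work[i + 1][1];
            }
        }
        return work[0];
    }

    /** Squared distance from the curve point at t to the mouse location. */
    static double distSq(double[][] pts, double t, double mx, double my) {
        double[] p = pointAt(pts, t);
        double dx = p[0] - mx;
        double dy = p[1] - my;
        return dx * dx + dy * dy;
    }

    static double closestT(double[][] pts, double mx, double my) {
        final int samples = 100; // coarse resolution (assumed value)
        double bestT = 0;
        double best = Double.MAX_VALUE;
        for (int i = 0; i <= samples; i++) {
            double t = (double) i / samples;
            double d = distSq(pts, t, mx, my);
            if (d < best) {
                best = d;
                bestT = t;
            }
        }
        // Refine within one sample step on either side of the best sample.
        double lo = Math.max(0.0, bestT - 1.0 / samples);
        double hi = Math.min(1.0, bestT + 1.0 / samples);
        for (int iter = 0; iter < 60; iter++) {
            double a = lo + (hi - lo) / 3;
            double b = hi - (hi - lo) / 3;
            if (distSq(pts, a, mx, my) < distSq(pts, b, mx, my)) {
                hi = b;
            } else {
                lo = a;
            }
        }
        return (lo + hi) / 2;
    }
}
```

The returned t-value is clamped to the segment, so a mouse position beyond either end of the vein resolves to the nearest endpoint.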
Once the new location for that control point is determined, the constraints must be propagated to actually move the control points according to a causal partial ordering specified as follows:
1. Move any other points as needed to maintain tangency. To avoid a cascade, this step is not repeated (i.e., it is not called recursively).
2. Move all control points tied to any previously affected control points.
3. Move all control points tethered to any affected veins. Repeat, assuming no cycles.
4. Move the subsplines defining any gaps back onto the affected veins.
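Step 3 is the only step that repeats, and it depends on the tether relationships containing no cycles. As a sketch of that step alone, the code below orders the affected veins so that each one is updated after every vein it is tethered to. It uses a depth-first topological sort, which is a substitute technique chosen for illustration and not necessarily what the model editor does. Veins are identified by hypothetical names, and the `tetheredTo` map (constraining vein to the veins tethered to it) is an assumed data layout.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch: order in which tethered veins could be updated. */
final class PropagationSketch {

    /**
     * Returns the moved veins and everything tethered to them, directly or
     * indirectly, ordered so that a vein appears after all affected veins
     * that constrain it. Assumes the tether graph has no cycles.
     */
    static List<String> updateOrder(Collection<String> moved,
                                    Map<String, List<String>> tetheredTo) {
        Set<String> visited = new HashSet<>();
        Deque<String> order = new ArrayDeque<>();
        for (String vein : moved) {
            visit(vein, tetheredTo, visited, order);
        }
        return new ArrayList<>(order);
    }

    private static void visit(String vein, Map<String, List<String>> tetheredTo,
                              Set<String> visited, Deque<String> order) {
        if (!visited.add(vein)) {
            return; // already handled
        }
        List<String> dependents = tetheredTo.getOrDefault(vein, Collections.emptyList());
        for (String dependent : dependents) {
            visit(dependent, tetheredTo, visited, order);
        }
        // A vein is placed in front of everything that depends on it.
        order.addFirst(vein);
    }
}
```

In a full editor, the tangent and point steps would run before this ordering is applied, and the subspline step would run after it, following the numbered list above.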
Conclusions
- New ways to represent complex systems are needed if we are to model them successfully and if we are to realize the full benefits of interdisciplinary research on difficult topics.
- Bézier splines provide a suitable, compact representation of the venation patterns in Drosophila wings.
- Constraints between system components constitute important parts of a complex system, and care must be taken to ensure that they are captured and maintained properly.
- The Java programming language provides the necessary infrastructure for creating hierarchical, object-oriented descriptions of system components as well as graphical tools for editing and analysis.