Symbiosis Tutorial

Introduction

Definition

symbiosis, n.

- : a supporting infrastructure for building software projects from smaller dissimilar components

— dVide

or

1 : the living together in more or less intimate association or close union of two dissimilar organisms

2 : a cooperative relationship (as between two persons or groups)

— Merriam-Webster, online dictionary

Motivation

Symbiosis is a kind of high-level build engine and package manager.

A software project normally has source code and build scripts, but no automatic means of describing how to resolve dependencies on other components.

Problems arise when one piece of software needs another piece. Often they do not use a compatible build process, and even when they do, the result becomes difficult to maintain once enough pieces are stacked on top of each other.

Some solve this by keeping all source code in one tree and using one standard build approach. Others modularize the software but still use one overall build process, for example SCons, which is more modular than make. Others, such as platform package managers, patch the individual projects and also rely on standard install locations.

Any such approach requires a fair amount of manual intervention and has to deal with keeping software versions up to date, unless all software is developed within the given framework.

Another approach uses source control tool support for external modules. This can work well, but does require all source to be managed by the same system (and often we do want to import third party software anyway). It does not address dependency and build configuration issues beyond providing modularity.

Symbiosis cannot solve all problems, but it can make it simpler to manage non-trivial source trees, partly because we try to allow as much freedom for the individual components as possible, including how they build, where they store their build products, and how the source code is organized and retrieved.

Ideally components can opt-in to be part of Symbiosis and publish the associated configuration files, yet still maintain their independence, or others could provide such files and make them publicly available for open source projects.

This has been done in package managers, but they are generally limited by programming language or operating system. Package managers work well for final releases, but less well for dynamically evolving software projects that span multiple programming languages, platforms, and branches.

Many software projects (including Symbiosis) necessarily limit their dependencies on other projects by reimplementing functionality because of the above issues, but it need not always be so.

Objectives

A list of design objectives is found in the last chapter, but they are not essential for following this tutorial.

Overview

This tutorial explains how to create a small symbiosis project that can fetch specific versions of a few dependent source components from the internet and then build and run the code.

Our goal is to have developers download a symbiosis executable and a system configuration file to local disk, then type, for example:


$ symbiosis libpng

And have Symbiosis check out libpng automatically together with its dependency zlib, build it, and run the test program pngtest.

More generally our goal is to create a supporting infrastructure for building a large number of source components, blending in-house source with third-party software.

Symbiosis should make it easier to manage this complexity.

There is admittedly a bit of up-front work to get to the point where everything just runs automatically, but this cost is not exposed to every participating developer, tester or build server.

Here we will stick to a few small projects. After that, new components can be added gradually to create a healthy fauna of software that can co-exist in a symbiotic relationship.

Examples

We start by slowly building up a small zlib example. We first access the zlib source via http from gzip.org using the wget tool.

Then we create our own version of zlib using git for source control, and see how we can deal with multiple versions of the same component in the same build.

After that we focus on the complexity of managing our build configuration files in a shared environment. We use git for source control and access a fictitious server example.org via ssh. We use this server to share both a new branch of zlib and to share our symbiosis configuration data.

Finally we create a libpng example that fetches source from sourceforge.net and builds upon the zlib example.

It is recommended to read along and build up the examples, but source code is available in the examples directory of the symbiosis source distribution:


$ ls examples
examples/zlib
examples/libpng

However, these examples will not work with full version control. You will have to follow the tutorial and set up your own repositories to get the full benefit. If you do this, you also have your own setup to start working from.

More Information

The README file in the source distribution contains a fairly detailed explanation of both the available configuration arguments and the inner workings.

It may be helpful to add new agents to allow for new source control tools. This is done in the myocamlbuild.ml file and is also covered by the README, at least to some extent. Good knowledge of ocamlbuild will help.

http://git.dvide.com/pub/symbiosis/plain/README

But even with the current setup it is possible to get far by calling external tools during the build process.

Compiling Symbiosis

In order to compile Symbiosis, we need OCaml, but after compilation, we have an independent executable that can be distributed.

Windows users will still need to have the bash shell installed (but not Cygwin).

Obtaining Symbiosis

First download the symbiosis source via http:

http://git.dvide.com/pub/symbiosis/

http://git.dvide.com/pub/symbiosis/snapshot/symbiosis-master.tar.gz

Or via git:


git clone git://git.dvide.com/pub/symbiosis.git

This should result in two files in the current directory:


$ ls
myocamlbuild.ml
myocamlbuild_config.ml

Compiling under Unix(ish) Systems

You need OCaml 3.10.1+ to compile Symbiosis, available from:

http://caml.inria.fr/

Verify the version:


$ ocaml -version
The Objective Caml toplevel, version 3.10.1

Enter the symbiosis source directory:


$ ls
myocamlbuild.ml
myocamlbuild_config.ml

Compile with ocamlbuild:


$ ocamlbuild

Extract and rename the binary:


$ mkdir bin
$ cp _build/myocamlbuild bin/symbiosis
$ rm -rf _build

Install the symbiosis tool in the executable path:


$ sudo cp bin/symbiosis /usr/local/bin/symbiosis

After compilation, you no longer need OCaml.

You can distribute the binary to other compatible systems.

In this tutorial you will also need:

git, tar, wget, ssh, scp, sshd.

Compiling Under Windows

Note: Windows operation has not been tested, and feedback is welcome.

You need OCaml 3.10.1+ to compile Symbiosis, available from:

http://caml.inria.fr/

ocamlbuild requires bash for Windows:

http://sourceforge.net/projects/win-bash

Enter the symbiosis source directory:


$ ls
myocamlbuild.ml
myocamlbuild_config.ml

Compile with ocamlbuild:


$ ocamlbuild

Install symbiosis in the executable path:


mkdir "%ProgramFiles%\Symbiosis"
copy _build\myocamlbuild.exe "%ProgramFiles%\Symbiosis\symbiosis.exe"
set PATH=%PATH%;%ProgramFiles%\Symbiosis\

You may want to enter the My Computer / Environment dialog to permanently add Symbiosis to the executable path.

Once compiled, you can distribute the binary to other Windows systems. The binary requires the bash shell, but not OCaml, or anything else needed by OCaml.

In this tutorial you will also need:

git for Windows:

http://code.google.com/p/msysgit/

wget, tar for Windows:

http://unxutils.sourceforge.net/

ssh, scp, sshd for Windows:

http://www.putty.org/

Running Symbiosis from Source

Instead of installing symbiosis as a binary, it is also possible to keep the files:


myocamlbuild.ml
myocamlbuild_config.ml

in the working directory and call:


ocamlbuild <target>

Instead of:


symbiosis <target>

This is useful when developing new agents located in myocamlbuild.ml. However, normally it is more practical to install Symbiosis as a tool and this also doesn't require OCaml to be installed on the local machine.

Creating a Component

A component is always identified by a .proxy file. The same physical source tree may contain multiple components.

Working Directory

First we need a clean working directory for our symbiosis setup:


$ mkdir my-symbiosis
$ cd my-symbiosis

In the rest of this tutorial we will assume this is our current working directory unless otherwise stated.

Proxies

We must tell Symbiosis where to find source code components to build. We do this by creating a .proxy file for each component.

First we create the root proxies directory. (Later we will see how to automate this):


$ mkdir proxies

This directory contains an inventory of all available components.

Now we need some source code to build. As an example, let us use the zlib library. Create the following file with a favorite text editor:


proxies/zlib.proxy

Once done, the file should look like this (excluding the lines starting with $):


$ $EDITOR proxies/zlib.proxy
$ cat proxies/zlib.proxy
{
  "volume": "zlib",
  "branch": "zlib-1.2.3",
  "path": "<branch>"
}

Note: The file must be valid JSON syntax. Be careful not to have a comma before '}'. Keep a space after ':' so the file is also valid YAML syntax. This will help third party tools access the proxy directory.
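A trailing comma is easy to miss by eye. As a rough sketch (not part of Symbiosis), the following shell function flags a comma placed directly before a closing '}' or ']'. The name has_trailing_comma is made up for this illustration, and the check is deliberately naive: it strips all whitespace first, so it would also trigger on a string value that happens to contain ",}".

```shell
# Hypothetical helper, not part of Symbiosis.
# Succeeds (exit status 0) if the file has a comma directly before '}' or ']'.
has_trailing_comma() {
  # Remove all whitespace so that ",<newline>}" becomes ",}", then search for it.
  tr -d ' \t\r\n' < "$1" | grep -q ',[]}]'
}

# Write the proxy from this section and check it.
mkdir -p proxies
cat > proxies/zlib.proxy <<'EOF'
{
  "volume": "zlib",
  "branch": "zlib-1.2.3",
  "path": "<branch>"
}
EOF

if has_trailing_comma proxies/zlib.proxy; then
  echo "proxies/zlib.proxy: comma before a closing bracket" >&2
else
  echo "proxies/zlib.proxy: no trailing comma found"
fi
```

This only catches the one mistake mentioned above. A real JSON parser is the better choice if one is available on your system.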

Our component has the name "zlib" derived from the file name, and a specific version: "zlib-1.2.3" identified by the branch.

When checked out, the source is located in a subdirectory with the same name as the branch. We use the "path" argument to specify this.

Volumes

A volume is a logical container of source trees.

All source trees within a volume are accessed in a similar manner. Each volume has an associated symbiosis agent to handle this abstraction.

Create a new directory:


$ mkdir volumes

This directory contains an inventory of all available volumes.

We add a volume by creating the file zlib.volume so it has the content:


$ cat volumes/zlib.volume
{
  "agent": "wget",
  "arguments": { "site": "http://www.gzip.org/", "module": "zlib" } 
}

Please observe the required trailing "/" in the "site" argument.

Note: it is not important that our volume is named "zlib"; the name just reflects the fact that it contains only one source tree, namely zlib. If we desire, we can easily point the zlib volume to another mirror site.

Because we use the built-in wget agent to fetch, or check out, the source, we must install wget and tar on our system.

Checking Out

If we have installed Symbiosis, wget and tar in the executable path we can now run:


$ symbiosis zlib.proxy.volume.checkout.resp

Don't be scared of the long build target. Later we will create short names for important targets.

You should now see an output starting with:


+ cd ../_work/zlib && wget -O - http://www.gzip.org/zlib/zlib-1.2.3 
  | tar xz
...

The ../ in the path output is because Symbiosis is working from the activities/_oversight directory. Paths are automatically translated to handle this.

Now have a look in the workspace:


$ ls activities/_work
zlib
$ ls activities/_work/zlib
zlib-1.2.3
$ ls activities/_work/zlib/zlib-1.2.3/
... lots of zlib files ...

Note: there is a technical reason for the underscore in _work. The build engine driving Symbiosis will not look for files in directories starting with an underscore. We only want project local builds to operate inside _work.

If you want to know how Symbiosis figured out how to get the source, you can look at the myocamlbuild.ml source file we compiled earlier. You will see the "wget" agent there along with some path and eval magic. You do not need to care about this unless you want to create or modify agents.

Manual Build

Until now we have only fetched, or checked out, the zlib source. We must add information on how to start a build. But first it is best to have a successful manual build:


$ cd activities/_work/zlib/zlib-1.2.3
$ make
cc -O   -c -o example.o example.c
...
$ cd ../../../..

Now we can inspect the build products:


$ ls activities/_work/zlib/zlib-1.2.3/*.a
activities/_work/zlib/zlib-1.2.3/libz.a

Creating a Build Synopsis

To tell Symbiosis how to build a component, we create a synopsis file.

Create the file proxies/zlib.synopsis such that we get:


$ cat proxies/zlib.synopsis
{
  "actions": [
   {
     "agent": "build",
     "action": "build",
     "arguments": { "tool": "make" },
     "exports": { "zlib-libdir": "<project>",
                  "zlib-include": "<project>" }
   }
  ]
}

In case you wonder - the synopsis file is separate from the proxy file because it can also be placed together with the source code, and because multiple proxy files may use the same synopsis file to create different versions.

A synopsis is a set of actions that we can perform on our component after it has been checked out. In this case we are interested in supporting the "build" action using the "build" agent. The build agent enters the source tree and executes the tool it has been informed about: in this case "make". We can also add build targets if necessary, but this is not needed for zlib.

The path to the source root where we build is called "<project>". Often this is the same as the path to the workspace directory activities/_work/zlib, but here it includes the subdirectory zlib-1.2.3. This subdirectory is identified by the optional "<path>" variable in the zlib.proxy file. Here "<path>" is required or the build will fail.

The "exports" variable is not required for the build action, but when we add dependencies later on, this will make it easier to locate the libz.a and zlib.h files.

Now we can try to build the project automatically, but to make it more interesting, let us first clean the source tree, since we built it manually earlier:


$ cd activities/_work/zlib/zlib-1.2.3
$ make clean
$ cd ../../../..


$ ls activities/_work/zlib/zlib-1.2.3/libz.a
ls: activities/_work/zlib/zlib-1.2.3/libz.a: No such file or directory

Now, let's try to build zlib automatically:


$ symbiosis zlib.proxy.synopsis.build.resp
+ cd ../_work/zlib/zlib-1.2.3 && make
cc -O   -c -o example.o example.c
...


$ ls activities/_work/zlib/zlib-1.2.3/libz.a
activities/_work/zlib/zlib-1.2.3/libz.a

OK, so let's try this over again, this time from scratch so we also fetch the source. First clean everything and see what we have:


$ rm -rf activities
$ ls *
proxies:
zlib.proxy	zlib.synopsis

volumes:
zlib.volume

Now we have a clean directory ready to build:


$ symbiosis zlib.proxy.synopsis.build.resp
+ cd ../_work/zlib && 
    wget -O - http://www.gzip.org/zlib/zlib-1.2.3 | tar xz
...
+ cd ../_work/zlib/zlib-1.2.3 && make
cc -O   -c -o example.o example.c
...

Source Code for zlib Example

The source code can be found in the symbiosis source distribution under:


examples/zlib

It corresponds to what we have done so far. There are also a few shell scripts with some of the command line actions we have done.

Cleaning

Broken Checkouts

For the time being, Symbiosis does not have any direct support for cleaning up, so some manual intervention is sometimes needed.

If something goes wrong, Symbiosis might get stuck during checkout, because it will not check out a component if its work directory already exists, even if the previous checkout failed in the middle.

You can delete a specific project under _work. Here we delete the workspace for the zlib component:


$ rm -rf activities/_work/zlib

Repeating a Successful Checkout

If a build step such as 'checkout' succeeds, a .resp file is created to mark this step completed. To force a rebuild, we must delete this file. For checkouts, we must also manually delete the work directory for the given component (as a safety precaution).

To force a new checkout of zlib after a successful checkout, we have to delete the zlib workspace manually and then delete the .resp file:


$ rm -rf activities/_work/zlib
$ rm activities/_oversight/proxies/zlib.proxy.volume.checkout.resp

You will recognize the .resp file name from the original checkout command we gave to Symbiosis.

Note: The exact path to the .resp file may be different when the proxies directory is located elsewhere, as we shall see later, but it will always be somewhere under the activities/_oversight directory.
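The two manual steps above can be wrapped in a small shell function. This is only a sketch: forget_checkout is a hypothetical name, not a Symbiosis command, and the default .resp directory is the one used so far in this tutorial, so pass a second argument if your proxies directory lives elsewhere.

```shell
# Hypothetical helper, not part of Symbiosis.
# Forget a checkout so that the next checkout target fetches the source again.
#   $1 = component name, e.g. zlib
#   $2 = directory holding the .resp files (optional, assumed default below)
forget_checkout() {
  component=$1
  resp_dir=${2:-activities/_oversight/proxies}

  # Refuse an empty name: "rm -rf activities/_work/" would wipe every workspace.
  if [ -z "$component" ]; then
    echo "usage: forget_checkout <component> [resp-dir]" >&2
    return 1
  fi

  # Remove the workspace first, then the marker for the completed checkout.
  rm -rf "activities/_work/$component"
  rm -f "$resp_dir/$component.proxy.volume.checkout.resp"
}
```

After sourcing this in the working directory, "forget_checkout zlib" followed by "symbiosis zlib.proxy.volume.checkout.resp" repeats the checkout. As with the manual commands, any uncommitted changes in the workspace are lost.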

Cleaning Everything

If we have made changes to source code or configuration files, we must back these up before cleaning. If we checked out with a source control tool, we can normally enter the relevant directory and commit and push the changes.

We clean by removing the activities directory:


$ ls
activities	proxies		volumes
$ rm -rf activities
$ ls
proxies	volumes

If we keep the proxies and volumes directories, we can start a new clean build using a relevant symbiosis build target.

Later we will see that we can also check out the proxies and volumes directories automatically. In this case they will be found in the activities/meta directory:


$ ls activities/meta
proxies
system
volumes

If we remove the activities directory we must also make sure the files in activities/meta directory are safe, usually by committing and pushing changes back to a source repository.

Version Controlled Component

Until now, we have used the "wget" agent to access source code.

Let us try to use the "git" source control agent instead.

You will need a server with ssh access for full benefit; otherwise you may use a local directory path instead.

Create a New Repository

First get some clean source code to work from:


$ rm -rf activities
$ symbiosis zlib.proxy.volume.checkout.resp
$ cd activities/_work/zlib/zlib-1.2.3

Then create a new repository and check in the source:


$ git init
$ git add .
$ git commit -m "imported zlib-1.2.3"
$ git tag v1.2.3
$ git branch zlib-1.2.3

And upload the repository via ssh:


$ cd ..
$ ssh user@example.org mkdir -p repositories/third_party
$ scp -r zlib-1.2.3 user@example.org:repositories/third_party/zlib
$ cd ../../..
$ ls
activities	proxies		volumes

Add a Server Volume

Next, we need a new volume pointing to our server. Because git likes to have many small repositories, we organize the repositories in a tree (though we only have the zlib repository for now). We only need one volume for all the components in this tree.

Create a new volumes/myorg.volume file, but keep the old volumes/zlib.volume file:


$ cat volumes/myorg.volume
{
  "agent": "git",
  "arguments":
  {
    "site": "user@example.org:repositories/"
  }
}
$ ls volumes/
myorg.volume	zlib.volume

If we later decide to test code from another server with a compatible repository layout, we can easily update the site url.

If you are not using an ssh server as above, but cloned your repository to local disk elsewhere, the "site" could be:


...
  "site" : "/home/myaccount/repositories/"
...

Or


...
  "site" : "<env:HOME>/repositories/"
...

The trailing "/" or trailing ":" is important in the "site" argument.
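The reason for the trailing character is presumably that the agent appends the module name directly to the site. That is an assumption on our part, although the wget command line printed earlier is consistent with it. A minimal sketch of the rule, using a made-up helper name site_ok:

```shell
# Hypothetical helper, not part of Symbiosis.
# Succeeds if a "site" value ends in "/" or ":" as this tutorial requires.
site_ok() {
  case $1 in
    */|*:) return 0 ;;
    *)     return 1 ;;
  esac
}

site="user@example.org:repositories/"
module="third_party/zlib"

if site_ok "$site"; then
  # Assumption: the module is appended to the site without any separator.
  echo "${site}${module}"
else
  echo "site must end in '/' or ':': $site" >&2
fi
```

With the values above this prints user@example.org:repositories/third_party/zlib, which is the location we uploaded the repository to with scp. Without the trailing "/", the same concatenation would yield the wrong path.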

Create a New Proxy

First rename the old proxy and copy the synopsis:


$ mv proxies/zlib.proxy proxies/zlib-orig.proxy
$ cp proxies/zlib.synopsis proxies/zlib-orig.synopsis

Note: for technical reasons it is not possible to have '.' in the proxy name, so "zlib-1.2.3.proxy" is not valid.
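When a proxies directory grows, a quick scan for offending names can save some confusion. The sketch below is not part of Symbiosis, and the function names valid_proxy_name and check_proxy_dir are invented for this example. It only applies the rule from the note above: the part of the file name before .proxy must not contain a '.'.

```shell
# Hypothetical helpers, not part of Symbiosis.

# Succeeds if the name is non-empty and contains no '.'.
valid_proxy_name() {
  case $1 in
    ''|*.*) return 1 ;;
    *)      return 0 ;;
  esac
}

# Report every *.proxy file in a directory (default: proxies) whose
# name breaks the rule. Fails if at least one such file is found.
check_proxy_dir() {
  dir=${1:-proxies}
  status=0
  for f in "$dir"/*.proxy; do
    [ -e "$f" ] || continue            # the glob matched nothing
    name=$(basename "$f" .proxy)       # zlib-orig.proxy -> zlib-orig
    if ! valid_proxy_name "$name"; then
      echo "invalid proxy name (contains '.'): $f" >&2
      status=1
    fi
  done
  return $status
}
```

For example, "check_proxy_dir proxies" stays silent for zlib.proxy and zlib-orig.proxy, but would complain about zlib-1.2.3.proxy.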

Create a new proxies/zlib.proxy file. We reuse the synopsis:


$ cat proxies/zlib.proxy
{
  "volume": "myorg",
  "module": "third_party/zlib",
  "branch": "zlib-1.2.3"
}

So now we have:


$ ls proxies/
zlib-orig.proxy		zlib.proxy
zlib-orig.synopsis	zlib.synopsis

Notice that we no longer have a "path" argument in the proxy. This is because we checked in the source at root level instead of keeping the zlib-1.2.3 subdirectory. Also notice that the original zlib proxy is now named zlib-orig.proxy and we can still use it if we like.

Testing Checkout

Remove the old zlib repository:


$ rm -rf activities/_work/zlib

Now let's see if we can still get our old wget source code checked out:


$ symbiosis zlib-orig.proxy.volume.checkout.resp
...
$ ls activities/_work/zlib-orig/zlib-1.2.3/
...

OK, now have a look at our new repository:

Note: Symbiosis needs non-interactive access to the server. You can arrange this with ssh keys and ssh-add (or with keys that have no passphrase):


$ ssh-add
... password ...
$ symbiosis zlib.proxy.volume.checkout.resp
...
$ ls activities/_work/zlib
...

When we deleted the zlib directory before, we should also have deleted the .resp file from the successful checkout so we can repeat the checkout operation. However, Symbiosis detects that the arguments are different and therefore invalidates the old operation.

If all went well, we now have two versions of our zlib project checked out in different places. Of course, we only need one of them, but it illustrates that we can support multiple versions simultaneously.

Finally make sure we can still compile with the old synopsis file in place:


$ symbiosis zlib.proxy.synopsis.build.resp
+ cd ../_work/zlib && make
...

Versioning the Synopsis

Because we now have our own branch of zlib, we can move the synopsis into the source tree. This may not be the best example, but if our component later modifies the way it builds, it becomes easier and more reliable to maintain the synopsis.

Move the synopsis and commit and push the changes:


$ mv proxies/zlib.synopsis activities/_work/zlib/
$ cd activities/_work/zlib
$ git add zlib.synopsis
$ git commit -m "added zlib.synopsis"
$ git push
...
$ cd ../../..

If you get an "Everything up-to-date" message from the push, you are not on a branch that tracks the remote repository. It should have been set up to work by the "git" agent, but you risk losing work if you do not make sure things have been pushed correctly.
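Since a later "rm -rf activities" destroys anything that was not pushed, it can be worth checking explicitly. The following sketch uses only standard git commands; the function name pushed_ok is made up, and it is not something the "git" agent does for you. It compares against git's local record of the remote branch, so it reflects the state as of the last push or fetch.

```shell
# Hypothetical helper, not part of Symbiosis.
# Succeeds only if the current branch in directory $1 (default: .) tracks
# a remote branch and has no commits that are missing from it.
pushed_ok() {
  dir=${1:-.}

  # '@{u}' names the upstream of the current branch; rev-parse fails without one.
  if ! git -C "$dir" rev-parse --abbrev-ref '@{u}' >/dev/null 2>&1; then
    echo "$dir: current branch does not track a remote branch" >&2
    return 1
  fi

  # Count the commits reachable from HEAD but not from the upstream branch.
  ahead=$(git -C "$dir" rev-list --count '@{u}..HEAD') || return 1
  if [ "$ahead" -ne 0 ]; then
    echo "$dir: $ahead unpushed commit(s)" >&2
    return 1
  fi
}
```

Running "pushed_ok activities/_work/zlib" before wiping the activities directory gives a simple yes or no answer. It does not look at uncommitted changes; "git status" covers those.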

So we modified our source tree and pushed the changes back to our repository server. Thus, it is safe to delete our activities directory for a clean start.

Building with Synopsis in Source Tree

A synopsis file next to a proxy file takes precedence over a synopsis file in the source tree. Now we will build with the synopsis file in the source tree, so we make sure that we do not have any old one around:


$ ls proxies/
zlib-orig.proxy		zlib-orig.synopsis	zlib.proxy
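The precedence rule can be pictured as a simple lookup. The sketch below is our reading of the rule stated above, not the actual Symbiosis logic. The function name find_synopsis is hypothetical, and it assumes that the synopsis is named after the component and sits at the root of the workspace; it ignores the "path" argument and any synopsis named explicitly in the proxy.

```shell
# Hypothetical helper, not part of Symbiosis.
# Print the synopsis file that would win for a component, or fail if none exists.
#   $1 = component name
#   $2 = proxies directory   (default: proxies)
#   $3 = workspace directory (default: activities/_work)
find_synopsis() {
  name=$1
  proxies=${2:-proxies}
  work=${3:-activities/_work}

  if [ -f "$proxies/$name.synopsis" ]; then
    # A synopsis next to the proxy file takes precedence.
    echo "$proxies/$name.synopsis"
  elif [ -f "$work/$name/$name.synopsis" ]; then
    # Otherwise fall back to the one inside the checked out source tree.
    echo "$work/$name/$name.synopsis"
  else
    return 1
  fi
}
```

At this point in the tutorial, "find_synopsis zlib" would print activities/_work/zlib/zlib.synopsis once the source is checked out, because proxies/zlib.synopsis no longer exists.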

A clean build:


$ rm -rf activities
$ symbiosis zlib.proxy.synopsis.build.resp
...
$ symbiosis zlib-orig.proxy.synopsis.build.resp
...

Multiple Dependent Actions

Automatic Checkout

The "build" action automatically depends on the "checkout" action.

What happens is that we ask Symbiosis to perform the action "build" specified in the synopsis file that our zlib.proxy file points to.

(We can explicitly name a synopsis file in the proxy, but in our example we rely on matching names).

Whenever Symbiosis needs to perform a synopsis action, it makes sure that the source tree is available. The tree is made available by locating the checkout agent of the volume also listed in the proxy file.

You should now see why we require proxy and volume files initially, while synopsis files can be obtained later.

In reality, there is nothing special about the "build" action. You can add any number of actions to the synopsis file, and they will all trigger a checkout if it hasn't already happened.

Add an Action

Try adding an action by editing the synopsis file:


$ cat activities/_work/zlib/zlib.synopsis
{
  "actions": [
   {
     "agent": "build", "action": "build",
     "arguments": { "tool": "make" },
     "exports": { "zlib-libdir": "<project>",
                  "zlib-include": "<project>" }
   },
   {
     "agent": "shell", "action": "dist",
     "arguments": { "script":
        [ ["mkdir", "-p", "<artefacts>/lib"],
          ["mkdir", "-p", "<artefacts>/include"],
          ["cp", "<project>/libz.a", "<artefacts>/lib/"],
          ["cp", "<project>/zlib.h", "<artefacts>/include/"] ] }
   }
  ]
}

$ symbiosis zlib.proxy.synopsis.build.resp
$ symbiosis zlib.proxy.synopsis.dist.resp
$ ls activities/artefacts/*
activities/artefacts/include:
zlib.h

activities/artefacts/lib:
libz.a

The "build" agent only supports a limited set of actions such as "build", but the "shell" agent allows you to create any number of scriptable actions. Here we chose the action name "dist". You can look at the symbiosis source file myocamlbuild.ml to see how actions are defined.

Dependencies Between Actions

The "dist" action will only work when we have built what we want to distribute, so let's add the following line:


...
  "dependencies": ["zlib:build"],
...

Or in full:


$ cat activities/_work/zlib/zlib.synopsis
{
  "actions": [
   {
     "agent": "build", "action": "build",
     "arguments": { "tool": "make" },
     "exports": { "zlib-libdir": "<project>",
                  "zlib-include": "<project>" }
   },
   {
     "agent": "shell", "action": "dist",
     "dependencies": ["zlib:build"],
     "arguments": { "script":
        [ ["mkdir", "-p", "<artefacts>/lib"],
          ["mkdir", "-p", "<artefacts>/include"],
          ["cp", "<project>/libz.a", "<artefacts>/lib/"],
          ["cp", "<project>/zlib.h", "<artefacts>/include/"] ] }
   }
  ]
}

Clean and test:


$ rm -rf activities/_oversight/proxies/zlib.proxy.synopsis.*
$ rm -rf activities/artefacts/*
$ rm -f activities/_work/zlib/libz.a
$ symbiosis zlib.proxy.synopsis.dist.resp
$ ls activities/artefacts/lib/
libz.a

Because we depend on a local action, we could also write:


...
  "dependencies": [":build"],
...

If we later change the name of the proxy, this will still work.

Updating Synopsis in Source Control

You may want to remove the entire activities directory and retest, but remember to check in the updated .synopsis file:


$ cd activities/_work/zlib
$ git commit -am "updated synopsis with dist action"
$ git push
$ cd ../../..

Wipe and retest:


$ rm -rf activities
$ symbiosis zlib.proxy.synopsis.dist.resp
...
$ ls activities/artefacts/lib
libz.a

Depending on Other Components

So far we only have two components, and they are very similar. But just to illustrate, here is how we make the build of one depend on the build of the other:


... "action": "build",
    "dependencies": [ "zlib-orig:build"],
...

You should now have a synopsis file that looks like:


$ cat activities/_work/zlib/zlib.synopsis
{
  "actions": [
   {
     "agent": "build", "action": "build",
     "dependencies": ["zlib-orig:build"],
     "arguments": { "tool": "make" },
     "exports": { "zlib-libdir": "<project>",
                  "zlib-include": "<project>" }
   },
   {
     "agent": "shell", "action": "dist",
     "dependencies": [":build"],
     "arguments": { "script":
        [ ["mkdir", "-p", "<artefacts>/lib"],
          ["mkdir", "-p", "<artefacts>/include"],
          ["cp", "<project>/libz.a", "<artefacts>/lib/"],
          ["cp", "<project>/zlib.h", "<artefacts>/include/"] ] }
   }
  ]
}

To test this, we clean up a bit:


$ rm -rf activities/_work/zlib-orig/
$ rm -rf activities/_oversight/proxies/zlib-orig.*
$ rm -rf activities/_oversight/proxies/zlib.proxy.synopsis.*

and run:


$ symbiosis zlib.proxy.synopsis.dist.resp
...
$ ls activities/_work/zlib-orig/zlib-1.2.3/libz.a
activities/_work/zlib-orig/zlib-1.2.3/libz.a

We don't have to say that we need the build action in zlib-orig because the dependent action is also a build action, so we simplify from:


... "dependencies": ["zlib-orig:build"] ...

to:


... "dependencies": ["zlib-orig"] ...

This was for illustration only. In reality this dependency does not make sense, so we quickly go back to the last version we checked in:


$ cd activities/_work/zlib
$ git checkout zlib.synopsis
$ cd ../../..
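The three ways of writing a dependency that we have seen in this chapter can be summarised in a few lines of shell. This is only an illustration of the naming rules as the tutorial describes them, not the parser Symbiosis uses, and the function name expand_dep is invented.

```shell
# Hypothetical helper, not part of Symbiosis.
# Expand a dependency entry to its full "component:action" form.
#   $1 = component that owns the synopsis, e.g. zlib
#   $2 = action that declares the dependency, e.g. dist
#   $3 = dependency as written in the synopsis
expand_dep() {
  component=$1
  action=$2
  dep=$3
  case $dep in
    :*)  printf '%s%s\n' "$component" "$dep" ;;   # ":build" is a local action
    *:*) printf '%s\n' "$dep" ;;                  # already in full form
    *)   printf '%s:%s\n' "$dep" "$action" ;;     # same action in another component
  esac
}

expand_dep zlib dist  ":build"           # prints zlib:build
expand_dep zlib build "zlib-orig:build"  # prints zlib-orig:build
expand_dep zlib build "zlib-orig"        # prints zlib-orig:build
```

The local form ":build" is the one that keeps working if the proxy is renamed, since the component name is filled in from context.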

System Configuration

To make life simpler, and to allow developers to share configurations, we can create a system configuration.

Short Build Targets

Create a system configuration in the current directory, such that we get:


$ cat symbiosis.system
{
  "targets" :
  {
    "zlib": "zlib.proxy.synopsis.build.resp",
    "get-zlib": "zlib.proxy.volume.checkout.resp"
  }
}

Then clean up and see what we have:


$ rm -rf activities
$ ls *
symbiosis.system

proxies:
zlib-orig.proxy		zlib-orig.synopsis	zlib.proxy

volumes:
myorg.volume	zlib.volume

We can now build simply by typing:


$ symbiosis zlib
...

And if we only wanted to checkout the source, we could instead type:


$ rm -rf activities
$ symbiosis get-zlib
...

We can add any number of targets this way. When we have a gazillion different components, we don't want to add a target for each because it becomes a burden to maintain.

Targets are only intended for the few targets we actually build all the time. Remember, you can always create a new configuration for a different project.

A small script could also help:


$ $EDITOR build
$ cat build
#!/bin/sh
symbiosis "$1.proxy.synopsis.build.resp"
$ chmod +x build
$ ./build zlib-orig
...

Source Controlled Meta-Data

If our project is somewhat larger than zlib, it helps to have shared access to meta-data such as proxies and volumes.

We keep proxies and volumes separate because proxies is an organization-wide directory while volumes may be different for different groups of developers, testers, and build servers. We consider each such group a configuration "family".

First we create the repositories. Later we update the system configuration to locate those repositories so Symbiosis can automatically check them out.

Here we choose to use the git source control system once again, although we could also just upload the files to an http server and access them with wget.

First create a new repository on our server example.org:


$ mkdir myorg.meta.git
$ cd myorg.meta.git
$ git init --bare
$ cd ..
$ ssh user@example.org mkdir -p repositories/meta
$ scp -r myorg.meta.git user@example.org:repositories/meta/

Add proxies to the organization-wide branch myorg.proxies:


$ cd proxies
$ git init
$ git add .
$ git commit -m "adding original zlib and local repository proxies for zlib-1.2.3"
$ git branch -m master myorg.proxies
$ git remote add origin \
      user@example.org:repositories/meta/myorg.meta.git
$ git push origin myorg.proxies
$ cd ..

Add volumes to the special configuration family "test" using the branch myorg.test.volumes:


$ cd volumes
$ git init
$ git add .
$ git commit -m "adding zlib and myorg volumes at gzip.org"
$ git branch -m master myorg.test.volumes
$ git remote add origin \
      user@example.org:repositories/meta/myorg.meta.git
$ git push origin myorg.test.volumes
$ cd ..

We also create a repository for our system configuration. We are going to update it, but it helps if we already have a repository ready. We use the branch myorg.test.system:


$ git init
$ git add symbiosis.system
$ git commit -am "added preliminary system configuration"
$ git branch -m master myorg.test.system
$ git remote add origin \
      user@example.org:repositories/meta/myorg.meta.git
$ git push origin myorg.test.system


Meta-Volumes

Checking out the volumes directory is a chicken and egg problem because Symbiosis needs volumes for checking out source trees.

The problem is solved by adding meta-volumes to the system configuration. A meta-volume contains information we would normally store in proxy and volume files.

We can now update our system configuration:


$ cat symbiosis.system
{
  "configuration":
  {
    "family"       : "test",
    "repositories" : "user@example.org:repositories/",
    "proxies"      : "<metaspaces>/proxies",
    "volumes"      : "<metaspaces>/volumes"
  },
  "meta-volumes" :
  {
    "proxies" :
    {
      "agent": "git",
      "arguments":
      {
        "workspace"  : "proxies",
        "module"     : "meta/myorg.meta",
        "branch"     : "myorg.proxies"
      }
    },
    "volumes" :
    {
      "agent": "git",
      "arguments":
      {
        "workspace"  : "volumes",
        "module"     : "meta/myorg.meta",
        "branch"     : "myorg.<family>.volumes"
      }
    },
    "system" :
    {
      "agent": "git",
      "arguments":
      {
        "workspace"  : "system",
        "module"     : "meta/myorg.meta",
        "branch"     : "myorg.<family>.system"
      }
    }
  },
  "targets" :
  {
    "zlib": "zlib.proxy.synopsis.build.resp",
    "get-zlib": "zlib.proxy.volume.checkout.resp"
  }
}

Several things happen here. First of all, we make Symbiosis look for the proxies and volumes in a new place named "<metaspaces>":


"proxies"      : "<metaspaces>/proxies",
"volumes"      : "<metaspaces>/volumes"

"<metaspaces>" corresponds to the directory activities/meta. This is similar to the variable "<workspaces>", which refers to activities/_work.

Next we have to populate the "<metaspaces>" directory. The "meta-volumes" configuration points to three different branches. A meta-volume is a mixture of both proxy and volume information, so we give the exact branch details here.

The "workspace" argument decides where a meta-volume is checked out under the "<metaspaces>" directory.

We normally would have given a "site" argument in each volume, but "site" defaults to "<repositories>" which we have defined in the configuration section to avoid repeating the site name.
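
As a rough illustration of the "<family>" substitution, the sketch below expands the branch templates from the "meta-volumes" section by hand. The expand_var helper is hypothetical and only imitates the effect described above; it is not how Symbiosis itself is implemented.

```shell
# Hypothetical helper: substitute one "<name>" placeholder in a string.
# It only imitates the expansion described in the text.
expand_var() {
  # $1 = template, $2 = variable name without brackets, $3 = value
  # (the value must not contain "|", "&" or backslashes in this sketch)
  printf '%s\n' "$1" | sed "s|<$2>|$3|g"
}

family="test"   # the "family" value from the configuration section

expand_var "myorg.<family>.volumes" family "$family"
expand_var "myorg.<family>.system"  family "$family"
```

With the family "test" this prints myorg.test.volumes and myorg.test.system, the branches we pushed earlier.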

Initializing Symbiosis

Due to chicken and egg issues, we cannot automate the checkout of the meta-volumes. Instead we must call:


$ symbiosis init
...

If all goes well, and no server path was misspelled, we should now see three populated directories:


$ ls activities/meta/*
activities/meta/proxies:
zlib-orig.proxy		zlib-orig.synopsis	zlib.proxy

activities/meta/system:
symbiosis.system

activities/meta/volumes:
myorg.volume	zlib.volume

Actually, at this point Symbiosis does not care about what it has checked out. You may add your own repositories under meta-volumes, and you may collect all directories in a single repository if you prefer.

All that matters is that we now have our meta-data ready where they are supposed to be.

From this point forward, there is (almost) no difference from what we have done before. Symbiosis is simply looking for proxies and volumes in a new place. We could have put the content there manually instead of calling symbiosis init.
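
To make the manual alternative concrete, the sketch below prints the git commands one might run by hand to fill activities/meta. It assumes the server address used in the git remote commands earlier and the branch names from symbiosis.system. Whether symbiosis init performs exactly these steps is an assumption. The commands are only printed, not executed.

```shell
# Assumed values, taken from the examples earlier in this tutorial.
site="user@example.org:repositories/"
module="meta/myorg.meta"
family="test"

# Print (do not run) the clone command for one meta-volume.
print_clone() {
  # $1 = workspace directory under activities/meta, $2 = branch
  printf 'git clone -b %s %s%s.git activities/meta/%s\n' \
    "$2" "$site" "$module" "$1"
}

print_clone proxies "myorg.proxies"
print_clone volumes "myorg.$family.volumes"
print_clone system  "myorg.$family.system"
```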

Now we can build zlib yet again:


$ symbiosis zlib
...

Remember to check in the updated symbiosis.system file:


$ git commit -am "proxies and volumes now under Symbiosis control"
$ git push origin myorg.test.system

Cleaning with New Proxies Location

With proxies stored in activities/meta/proxies we have one major difference from the earlier setup.

Look at the activities/_oversight directory after having built zlib:


$ ls activities/_oversight/activities/meta/proxies/
zlib.proxy			zlib.proxy.volume.checkout.req
zlib.proxy.synopsis.build.req	zlib.proxy.volume.checkout.resp
zlib.proxy.synopsis.build.resp

Earlier these were located in:


activities/_oversight/proxies

This is due to the way the underlying ocamlbuild engine works. Therefore, when we need to force rebuilding a specific action, we must remove the .resp files in the new location.
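
A small sketch of forcing a rebuild in the new location follows. The force_rebuild helper is hypothetical; it simply deletes the named .resp file under the directory listed above, so that the next run has to perform the action again.

```shell
# Hypothetical helper: remove the .resp file of one action.
force_rebuild() {
  # $1 = target name, for example zlib.proxy.synopsis.build.resp
  rm -f "activities/_oversight/activities/meta/proxies/$1"
}

# Run from the directory that contains activities/.
force_rebuild zlib.proxy.synopsis.build.resp
# Afterwards, build again with: symbiosis zlib
```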

The symbiosis targets remain the same. We can still build the zlib-orig as we used to:


$ symbiosis zlib-orig.proxy.synopsis.build.resp
...

The System Meta Volume

We added the activities/meta/system meta-volume in our configuration.

But actually this repository has absolutely no effect. If we can check it out via symbiosis init, we already have a copy available.

activities/meta/system is purely a convenience so we can update changes to our configuration there and easily push them back.

On a new machine with a compatible configuration, here called "test", we can check out the myorg.test.system branch:


$ git clone -n user@example.org:repositories/meta/myorg.meta.git myorg.test
$ cd myorg.test
$ git checkout -b myorg.test.system origin/myorg.test.system

Make sure Symbiosis is installed. We could do this by adding Symbiosis to the myorg.test.system branch, either as source or as a binary. Then:


$ symbiosis init
...
$ symbiosis zlib
...

Libpng Example

Now we will use some of what we have learned to create a libpng example.

libpng only depends on zlib, which makes it easy. But we still need to tell libpng where to find libz.a and zlib.h.

Here we show the example with proxies and volumes in the current directory, but you can also add them under activities/meta if you use the source controlled approach discussed above.

We build upon the zlib example earlier on and add libpng specific files without modifying any aspects of the zlib build.

The Libpng Configuration Files

First create the sourceforge.volume:


$ cat volumes/sourceforge.volume
{
  "agent": "wget",
  "arguments":
  {
    "site": "http://prdownloads.sourceforge.net/",
    "extension": ".tar.gz"
  }
}

Note: this volume may also be useful for other projects in the future.

Add the libpng.proxy:


$ cat proxies/libpng.proxy
{
  "volume": "sourceforge",
  "module": "libpng",
  "path"  : "libpng-1.2.32",
  "branch": "libpng-1.2.32-no-config"
}

And finally add the libpng.synopsis, where we also add the zlib dependency:


$ cat proxies/libpng.synopsis
{
  "actions":
  [
    { "agent": "build", "action": "build",
      "dependencies": ["zlib"],
      "arguments": { "tool": "make",
        "targets": ["-f", "scripts/makefile.std",
                    "ZLIBLIB=<base>/<zlib:zlib-libdir>",
                    "ZLIBINC=<base>/<zlib:zlib-include>",
                    "all"]
      },
      "exports": { "libpng-path": "<project>",
                   "libpng-include": "<project>"}
    },
    { "agent": "shell", "action": "test",
      "dependencies": [":build"],
      "arguments": {"dir": "<project>", "script": [["./pngtest"]]}
    }   
  ]
}

Testing Libpng

We should now be able to run the test program pngtest:


$ symbiosis libpng.proxy.synopsis.test.resp

Edit the symbiosis.system file to add "libpng" as a target:


$ cat symbiosis.system
...
  "targets":
  { ..., "libpng": "libpng.proxy.synopsis.test.resp" }
...
$ symbiosis libpng
...

Of course, building libpng once more will not do anything, not even run the pngtest program again. This is because Symbiosis remembers the above successful build.

Explaining the Libpng Synopsis

Notice the zlib: namespace which we use to access the values exported from the zlib build.

Also notice the use of "<base>". This is the reverse path from "<project>" to the current directory, for example "../../../..". Since most paths are calculated relative to the current directory, "<base>" is a useful path prefix.
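
The reverse path can be pictured with a small sketch. The reverse_path helper below is hypothetical, and the project directory passed to it is an assumed example rather than a documented location. It prints one ".." per component of a relative path and does not handle "." or ".." components in the input.

```shell
# Hypothetical helper: print the path leading back from a relative
# directory to the starting directory, one ".." per path component.
reverse_path() {
  # strip trailing slashes, then replace every component with ".."
  printf '%s\n' "$1" | sed -e 's|/*$||' -e 's|[^/][^/]*|..|g'
}

# Assumed example location of a checked out project.
reverse_path "activities/_work/zlib/zlib-1.2.3"
```

For this four-component path the result is ../../../.., which is the kind of value "<base>" stands for.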

We chose the version of libpng without a configure script and opted to compile with the standard makefile scripts/makefile.std in the libpng distribution.

The "targets" argument is intended to specify build targets like "all" or "doc", but also works to pass makefile variables. In this way we get to specify the path to our version of libz.a and zlib.h.

In summary, there was some need for tweaking to get libpng to build, but nothing we couldn't handle without modifying the source tree.

If we wanted to add our own makefile, or use another build tool, we could create a proxy for these files and add an action to copy files as a dependency to the build action. This technique is also useful when many projects share a similar build file that we do not wish to maintain in many source trees.

Example Source Code

The symbiosis source code contains the libpng example in:


examples/libpng

This example stores proxies and volumes directly in the current directory and does not assume that we have set up any source control for proxies and volumes using the "meta-volumes" feature of the symbiosis.system file.

All zlib and libpng source will be fetched from gzip.org and sourceforge.net respectively using wget.

Updating Meta Volumes

If we did create meta repositories earlier on, we should now commit and push the changes to proxies and volumes.

We assume we have made a clean checkout and init of our project since we added the source repositories. This ensures we have all the git remote tracking branches set up correctly and simplifies the operation:


$ cd activities/meta/volumes/
$ git add .
$ git commit -m "added sourceforge volume"
$ git push
$ cd ../proxies
$ git add .
$ git commit -m "added libpng proxy and synopsis"
$ git push
$ cd ../../..
$ git commit -am "add libpng target"
$ git push

We might want to keep our system tree up to date in the meta directory, just so we don't get confused:


$ cd activities/meta/system
$ git pull
$ cd ../../..

Now other developers can update their configuration by pulling from the project server and also build libpng.

Additional Features

Here we discuss some of the features not covered by the tutorial without going into any great level of detail. The README file contains more information on these topics.

Exact Revisions

A proxy can sometimes accept an exact revision identifier or tag instead of a branch. Depending on circumstances we may want one or the other. This feature is not fully developed, but sometimes this information can simply be used where a branch name would otherwise be used.

Variables

When creating synopsis files, there is a fairly elaborate namespace hierarchy available to access information. We will not cover the details here.

The README document lists the property names and their default values.

It may also help to inspect the .req files inside the _oversight directory. A .req file is always generated before calling the action that results in the corresponding .resp file.

Dependencies can pass on information. Symbiosis will copy exported values into the .req file of a dependent action, and this makes the information available.

The "build" action of the "build" agent can create simple name value pairs using the "echo" argument. These pairs are written to a file that can be imported into make files and scripts such as Ruby. This is sometimes useful for communicating with local build systems.

Artefacts and Tools

Symbiosis automatically creates activities/artefacts and activities/tools directories. These are convenient places to copy build results.

The activities/tools directory is automatically added to the executable path before executing an action. We can install previously built tools into this directory as a build dependency.

One example is the ragel state-machine compiler. The ragel compiler can itself be compiled and installed in tools. And the output of ragel can also be used as a tool, for example to preprocess documentation or to scan for dependencies.

It makes sense to add the activities/tools directory to the executable path manually in the shell environment. When doing so, it is possible to enter a source tree and manually build source that depends on tools created in an earlier step. Otherwise the tools are only on the executable path when Symbiosis drives the build.
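
In a POSIX shell, adding the tools directory by hand might look like the lines below, assuming the shell is started in the directory that contains activities/:

```shell
# Put the locally built tools first on the executable path.
# Assumes the current directory is the one that contains activities/.
PATH="$PWD/activities/tools:$PATH"
export PATH
```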

Named Synopsis Files

A synopsis file may be named explicitly, relative to the "<project>" root in the source tree. This makes it possible to reuse a synopsis file between proxies that represent different versions of the same source. See README.

A component is always named by the proxy file, even when the synopsis has a different name.

Named Workspace

A proxy file may choose an alternative checkout location under "<workspaces>" by setting the "workspace" argument. See README.

Rebuilding

Symbiosis will rebuild a target if the input .req file has changed since the last build, or when the .resp file is missing.

Symbiosis will rebuild a dependent action if the output .resp file has changed.

It does not matter whether dependent actions belong to the same component or another component.

Creation of .resp files is controlled by agents, and when an agent fails to produce a .resp file, the build fails.

Source control agents try to put a unique revision signature in the .resp file such that a new checkout will trigger a rebuild, but this is (as of this writing) not supported for wget.

When using build systems with hard signatures such as SCons, OMake and ocamlbuild, we can create a build agent that copies such a signature to the .resp file. The generic "build" agent does not do this.

When this is configured, a component will rebuild if the input .req file changes. The build signature will only change if the underlying build system detects any effective changes, and only then will the rebuild ripple up and rebuild dependent components.

This allows for very fast and precise rebuild of large trees, but does require manually deleting a .resp file when a source tree has been modified.

Because of the rebuild conditions, new component builds should try to avoid reading environment variables directly. Instead, they should have the values passed from Symbiosis, which logs the content in a .req file. This can be done by including simple name value lists, which both Symbiosis and other tools can easily create.

For comparison, it can be observed that the SCons tool can take complete control of the environment and log any variables that it writes to the environment before starting a build.

Symbiosis might learn how to do this, but it makes it more difficult to start builds manually, outside of Symbiosis control, which is what we mostly do while developing a single component.

Adding Agents and Actions

Each agent is defined in the myocamlbuild.ml file. There is a fixed set of actions registered, which can be enhanced. If an action is not defined for a given "agent", Symbiosis will complain.

When building dependent actions, it does not matter which agent provides the action. So one component may replace the "build" agent with, say, a "build-specially" agent and still look the same to dependent actions.

Some agents may support a missing action. The "shell" agent does this to allow arbitrary action names performed by external commands.

New actions must, at a minimum, ensure that they create a .resp file. The paths to both the existing .req file and the invalid or missing .resp file are given. A simple approach is to copy .req to .resp, since this will at least trigger a rebuild if the input conditions change.
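
A minimal sketch of the copy approach, written as a shell function, is shown below. How the two paths actually reach an action is not covered in this tutorial, so receiving them as the first and second argument is an assumption, and finish_action is a hypothetical name.

```shell
# Hypothetical action body.
# Assumption: $1 is the path to the .req file, $2 the path to the .resp file.
finish_action() {
  req=$1
  resp=$2
  # ... the real work of the action would go here ...
  # Copy the request to the response, so that the response changes
  # whenever the input conditions change.
  cp "$req" "$resp"
}
```

An action that fails should return before the copy, so that no .resp file is produced and the build fails.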

See README for more.

Future Ideas

Checkout Verification Identifiers

For security it would be nice to support hash verification of files downloaded from external repositories.

We can already get this from several distributed version control systems by giving a specific revision number. But not for "wget" access to tarballed archives.

However, often we want unverified access to source so we can track the latest development. Therefore this should be an optional argument to proxies.

We may also have an option in a volume that requires proxies to give verification identifiers. This is useful when we point a volume to an untrusted mirror.
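
Since this is only an idea, the following is a sketch of what such a check could do for a downloaded archive, not an existing Symbiosis feature. It assumes the GNU coreutils sha256sum command is installed; verify_sha256 is a hypothetical name.

```shell
# Hypothetical check: succeed only if the SHA-256 digest of a file
# equals the expected value. Assumes GNU coreutils sha256sum.
verify_sha256() {
  # $1 = file, $2 = expected digest in lowercase hex
  actual=$(sha256sum "$1" | cut -d ' ' -f 1)
  [ "$actual" = "$2" ]
}
```

A checkout could then call verify_sha256 with the archive and the digest named in the proxy, and refuse to unpack the archive when the function fails.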

Tags

A nice to have in Symbiosis would be the ability to specify tags along with a build target and have components adapt accordingly, including replacing or removing components.

As it is today, it is possible to create variants by adding multiple proxy files, and also to some extent by communicating through variables.

Reproducible Builds and Change Detection

Source code may have changed between two checkouts of the same component branch. Often we desire this, and when not, we can use tags or revision identifiers instead of branches.

But in any case, it would be nice to track exactly what source revision we actually did check out, at least for source control tools that do support this.

It would be even better if we can also detect if the source code has changed since the checkout. This can be done with support from a source control agent, either through built-in tool support, or by running a hash on the source tree.
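
Running a hash on the source tree could be sketched as below. This is an illustration of the idea rather than an existing agent feature. It assumes GNU coreutils sha256sum and file names without whitespace; tree_digest is a hypothetical name.

```shell
# Hypothetical helper: print one digest that covers the names and
# contents of all regular files below a directory.
# Assumes sha256sum is available and file names contain no whitespace.
tree_digest() {
  # $1 = root directory of the source tree
  ( cd "$1" &&
    find . -type f | LC_ALL=C sort | xargs sha256sum |
    sha256sum | cut -d ' ' -f 1 )
}
```

Recording the digest right after a checkout and comparing it before a build would reveal whether any file has been added, removed or modified in between.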

We can then flag the build as dirty and not reproducible, and we can help generate means to automatically commit changes.

When we have a reproducible build, we want Symbiosis to be able to use a build log to replace branch information with exact revision information, and thereby check out the exact same source we did when the log was created for all participating components.

We can use such a log to track down bugs more easily when a user reports a problem with a 3 year old release.

For this to work optimally, system installed libraries should be avoided to the extent possible, and where not possible, there should be a way to log such dependencies informally. On a system such as Debian, a report of all installed packages could be generated and added to the log.

External and Remote Agents

All information needed by an agent, except the source tree content, is delivered in the .req file for the corresponding action.

For some types of actions this makes it possible to call agents over a network. Some agents may even be able to obtain their own local copy of a source tree.

A fair amount of work would be needed to complete this, but it is, or should be, possible to do this around the existing agent framework.

A shorter term goal would be to support externally scripted agents. The main problem is to make it simple to deal with relative path issues and in evaluating variable expressions.

The "export" action of the "build" agent does expand the exported variables and thus provides one path towards this goal. The "shell" agent does something similar by expanding arguments in the command line strings. But it does not give the full power to easily do all that internal agents can do.

Distributed Cached Builds

As an extension to remote agents, it would be possible to fetch products already built on other servers. Hash signatures mentioned above under rebuilds may be used for this purpose.

Ideally this approach would avoid fetching a source tree at all, if there is a complete build product available remotely. A proxy may be configured to ask a remote server if it can handle synopsis requests and if not, a volume checkout action will be engaged for the component, as today.

A build action could be driven by an http post request. For example by using curl to post the request file to a remote http server and store the http response as the response file upon success.
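
The curl call described above might be sketched as follows. No such server exists in this tutorial, so the URL is entirely up to the caller, and post_request is a hypothetical name. The reply is first written to a temporary file, so that a failed request never leaves a .resp file behind.

```shell
# Hypothetical remote build call: post the request file and keep the
# reply as the response file only when the server reports success.
post_request() {
  # $1 = .req file, $2 = .resp file, $3 = server URL (an assumption)
  if curl --fail --silent --show-error \
          --data-binary "@$1" --output "$2.tmp" "$3"
  then
    mv "$2.tmp" "$2"
  else
    rm -f "$2.tmp"
    return 1
  fi
}
```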

The actual build products may then be fetched by the usual checkout methods, possibly targeting the artefacts directory, or alternatively do a workspace checkout similar to source checkouts as we know, except the resulting directory would only contain build products.

The concept of configuration families becomes important because we want the correct binaries.

Build Tools

Build tools such as SCons, CMake, OMake, and now ocamlbuild focus on automating build configurations, platform abstractions, and on providing better dependency handling than is readily available in make.

Such tools are recommended for individual component builds, but it is ultimately a matter of what fits any particular project and component.

ocamlbuild-ctools

ocamlbuild-ctools is an additional build tool developed in parallel with Symbiosis. Like Symbiosis, it is created as a plugin for ocamlbuild, and it can be installed as a binary tool with a complete build engine.

We only mention it here for reference, and as a potential companion tool for symbiosis projects.

The ocamlbuild-ctools project:

http://git.dvide.com/pub/ocamlbuild-ctools/

attempts to add cross platform C/C++ support for the ocamlbuild tool, which otherwise mostly addresses OCaml related builds. This tool enables multi-variant builds within a single source tree and automatic dependency scanning.

Build configurations can be made as simple as creating a text file with a list of object files, where the extension of this file decides whether it is a program or a library. Builds of dependent libraries are easily handled within a single source tree.

The tool also handles integration of mixed OCaml and C code, but it remains a general purpose build tool.

OCaml 3.10.1 must be installed, and a small OCaml configuration file is required to set up the project since it doesn't parse JSON files like Symbiosis does. It would be possible to remove this dependency, but OCaml does provide a powerful framework to manage complex build configurations.

On internal projects, ocamlbuild-ctools has been integrated as a build dependency for some components by copying the myocamlbuild.ml and myocamlbuild_config.ml files into the root of projects that use myocamlbuild and then using ocamlbuild to build the component.

The required include file scanner tool cppinclude:

http://git.dvide.com/pub/cppinclude/

has also been added as a dependency and installed in the tools directory.

In this way we only have to update the build rules in one place for multiple components, and we only have to install OCaml, Symbiosis, a source control tool and a C/C++ tool chain to start building.

We have not covered exactly how to do that in this tutorial, but all the techniques necessary have been shown.

Background

Why is the above as it is?

Some answers may be found in the design objectives below.

Others may be found in the possibilities and limitations of interacting as a plug-in to a build engine.

Objectives

We want a complete index of all software components that we can access and build. It should be cheap to access this index, and we do not want to have to fetch all source for all components before we can be operational.

We want support for remembering how to build a specific component and help with identifying the active software branches of said component.

In this way we can activate code that might otherwise be hovering in some stale and long forgotten repository.

And we dare start building projects that include non-invented-here parts without fear of losing control or overview.

We want to be able to use the same component in different ways for different projects, up to the point of using a component multiple times in the same project. For this we need a precise way to identify branches, repositories, and build parameters.

We want to support a single source tree holding multiple independent components, although we prefer many small source trees.

We want to avoid having to modify a third party component's build system whenever possible. But we also want to provide build configuration variables to build systems that are willing to listen and cooperate.

We want components to declare information about themselves instead of relying on ad-hoc assumptions on path conventions and other settings.

We want to make it easy to change source code location or version of a particular source component. Either permanently, or temporarily to test something, or to allow different groups of testers and developers to access different levels of maturity.

We want all of this information to be managed in source controlled text files so that we manage the overall project in the same environment as source code. This includes multiple live configuration branches. We also want small independent files to reduce merge conflicts.

We want to reduce the amount of manual configuration and installation work needed before a build can run.

We seek to have the configuration text files in a format that is easily accessed and parsed by third party tools, and possibly even support networked communication.

We seek to support existing build and source control tools for individual components without preference, and to make it easy to add support for new tools.

We do not want to imply any specific order of operations such as a depend step before a build step, yet we still want it to be simple to effectuate a standard build operation.

We want to minimize rebuild effort by maintaining strict control of input arguments and by observing build signatures provided by some build tools.

We want to see an aggregate software project as just another component that can be used by yet other components.

We generally try to avoid relying on system installed libraries, and we also want to avoid OS-specific package managers, programming language specific package managers and source control specific module management. This gives us a lot of freedom in including new source, operating with multiple versions, testing on different platforms and going back to older versions.

Finally we seek to make the supporting tool a means to an end. That is, the tool should operate on the data files and produce predictable results that can be reproduced by other tools.

Therefore, Symbiosis is more a supporting infrastructure for building software projects from smaller dissimilar components, than a tool.