The Balaka system

Author: Lalo Martins
Date: 2006-01-08

This document is what we call “science-fiction design”: it's written as if it were documentation for a working system, but it's actually a design document. The system itself doesn't exist yet. Until it does, and works well enough to generate its own website, this document will serve as a temporary webpage.

Balaka is a web development framework. Like most good frameworks, it's based on the concept of publishing objects; like the best frameworks, it's loosely based on the Model-View-Controller (MVC) paradigm. However, it relies on VOS for its object system and persistence. This also gives you the choice of scaling your app by running different parts on different machines: you can have a number of fast computers running your controllers, a large number of cheap computers serving your views, and one heavy-duty server with the actual models.

Balaka itself is composed of five main parts:

A set of built-in widgets for editing content through the web may or may not be added later, depending on the progress of VOS' own “Metalurgy” system.

Notably, Balaka does not include a template system. Since tastes in template languages differ wildly, I consider templating out of scope. Conceptually, it's the responsibility of your views to render html, and they may do whatever is necessary to achieve that; if you want to use templates, there are enough libraries out there you can use. I would, of course, recommend OpenTAL, but hey, that's just me.

Balaka is also not in the business of helping you with model logic. You have the full power of VOS to use in your models. As of this writing, you can implement a VOS site (that's the VOS terminology for a collection of objects, what other systems could call a “server” or a “database”) in C++ or Python; a Java implementation is in the works.

Rationale

The major difference between Balaka and most frameworks is that its views are “dumped”, rather than generated on demand. On a website where accesses are much more common than modifications, this will result in very high performance, with minimal use of resources; it also makes it easier for you to scale up, by mirroring the generated output. On the other hand, if the ratio of accesses to modifications is not that high on your website (for example, a very busy online BBS), it may not be the system for you. (Then again, you can still use it, by skipping the views and using just controllers for everything.)

Short walk-through

So here is how you go about making your Balaka website:

  1. Create a VOS application that will serve your models. Pick any site extensions you want - you will certainly need some kind of persistence, at a minimum.

  2. If you need any model logic, create some metaobjects to implement this logic. Bear in mind, though, that most websites won't need metaobjects; if you just need to store data, VOS does that well enough. When you need to add logic, think carefully whether it is model logic or view/controller logic.

  3. Create a directory to hold your views, one for the view output, and one for your controllers.

  4. Create a Balaka configuration file. It's actually a VOS storage file; it may be in any format supported by the VOS import system (currently COD or XOD). You probably want to start from the example files in the distribution. In this file, tell Balaka where to find the model site, your views tree, your controllers tree, and your output tree.

  5. Populate your view tree with view definitions.

  6. Optionally, populate your controller tree with controller definitions.

  7. Set up your webserver to serve the output tree. If you want to serve the views and controllers from the same IP and port, you can also set up your webserver to proxy the controller calls to Balaka.

    You can either put “media” files (css, javascript, images, etc) in your output tree, using static views, or serve them from a different tree using your webserver configuration (different virtual host, or different location, or whatever - you can even serve them from a different machine).

When you first run Balaka with your configuration file, it will go through your models and generate all output as defined by your views. It will then start an HTTP server for your controllers.

View definitions

A Balaka view is composed of one or more listeners, which are functions (or something similar to that) that will be called to render an object or a group of objects.

Your views are stored in a directory tree in your filesystem. Balaka traverses this tree, looking for view definitions, and instantiates them as Vobjects. Alternatively, you may implement simple views directly in the configuration file.

There's no reason why you can't have more than one view tree; just list them in the configuration file, and they will be traversed.

Views are created in this order: first the ones in the configuration file, then those in the trees, in the order the trees are listed in the configuration file. This is important if you have views with the same output name; the ones created later will override the ones created earlier.
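The override rule can be pictured with a small sketch. It is not Balaka code: the function name resolve_views and the (output name, definition) pairs are made up for illustration, and the only thing taken from the text above is the creation order and the "later overrides earlier" rule.

```python
def resolve_views(config_views, trees):
    """Return the winning view definition for each output name.

    config_views: (output_name, definition) pairs from the configuration file.
    trees: one list of (output_name, definition) pairs per view tree, in the
           order the trees are listed in the configuration file.
    """
    winners = {}
    # Views from the configuration file are created first...
    for name, definition in config_views:
        winners[name] = definition
    # ...then the views from each tree, in listing order.
    for tree in trees:
        for name, definition in tree:
            # A view created later replaces an earlier one with the same name.
            winners[name] = definition
    return winners


views = resolve_views(
    config_views=[("index.html", "config:index")],
    trees=[
        [("index.html", "tree1:index"), ("about.html", "tree1:about")],
        [("about.html", "tree2:about")],
    ],
)
```

Here index.html ends up served by the first tree's view (it overrides the one from the configuration file), and about.html by the second tree's view.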

Each view definition can be in one of a few formats, which will be explained in detail in the following sections.

The type of a view is defined by the extension of the filename. The exception, of course, is with directories; Balaka knows it's a directory, and no extension is used.
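As a rough sketch of that dispatch, assuming the extensions described in the sections below (.static, .py, .so, and executables as the fallback): the function classify_view and the returned labels are hypothetical, and the caller is assumed to have already checked whether the entry is a directory or has the executable bit set.

```python
import os.path

# Extension-to-type table, taken from the view formats this document describes.
KNOWN_EXTENSIONS = {".static": "static", ".py": "python", ".so": "compiled"}


def classify_view(name, is_dir=False, is_executable=False):
    """Return a label for the kind of view definition, or None if it is none."""
    if is_dir:
        # Directories carry no extension; they are simply traversed further.
        return "directory"
    extension = os.path.splitext(name)[1]
    if extension in KNOWN_EXTENSIONS:
        return KNOWN_EXTENSIONS[extension]
    if is_executable:
        # No recognized extension, but executable: run it as a program.
        return "executable"
    return None
```

For example, classify_view("index.py") gives "python", while classify_view("make-feed", is_executable=True) gives "executable".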

Static views

A static view is a file with the suffix .static. The contents of the file are ignored; you can use it to store comments, or you can leave it empty. This just tells Balaka that, if it finds a file or directory with that name in the output tree, it shouldn't be deleted.

View Directories

A directory will simply be traversed further for view definitions. This allows you to structure your output; so if you have a directory foo inside your view tree, with a view definition bar inside it, this view will generate a file foo/bar in the output tree.

Directory views can have one special file named .ignore; this is a list of filenames (actually, regular expressions) that aren't to be traversed. If this is missing, it will be inherited from the parent directory. You can use it to ignore artifacts from your revision control system, such as a .bzr or .svn or CVS directory.

The top-level view directory is considered a directory view, so you can store your .ignore file there. If you don't have one, it will inherit from Balaka's default: ^\. (ignore all files and directories with names starting with a period).

Ignored files will also be ignored on the output side - meaning, they won't be deleted if present. This means, if you use Apache, you can have a .htaccess file in your output tree, to set up permissions, redirects, and whatnot.
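The ignore rules might work roughly as follows. This is a sketch under assumptions: the default pattern is taken to be the regular expression for "starts with a period", a directory's own .ignore is assumed to replace (not extend) the inherited list, and patterns are assumed to be applied with re.search. The names effective_ignore and visible_entries are invented for the example.

```python
import re

# Assumed spelling of the default: names starting with a period.
DEFAULT_IGNORE = [r"^\."]


def effective_ignore(own_patterns, parent_patterns):
    """Use the directory's own .ignore list if it has one, else inherit."""
    return own_patterns if own_patterns is not None else parent_patterns


def visible_entries(names, patterns):
    """Return the names that match none of the ignore patterns."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [n for n in names if not any(rx.search(n) for rx in compiled)]


# Top-level directory without a .ignore file: inherits the default.
top = effective_ignore(None, DEFAULT_IGNORE)
top_visible = visible_entries([".svn", "index.py", ".htaccess", "foo"], top)

# Subdirectory with its own .ignore file listing two patterns.
sub = effective_ignore([r"^CVS$", r"~$"], top)
sub_visible = visible_entries(["CVS", "draft.py~", "entry.py"], sub)
```

With these inputs, top_visible is ["index.py", "foo"] and sub_visible is ["entry.py"].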

Python views

Files with the extension .py are imported as Python modules. Each module should define functions that implement the listeners. A special function listen is inserted in the global namespace, to register functions as views. You may call it like this:

listen(function, path)

Called this way, listen turns the function into a “single-object listener”; it will be called with the object at that path as its single argument, whenever the object changes.

Or you may call it like this:

listen(function, path, vtype [,vtype...])

It turns the function into a “children listener”; it will be called whenever a child object of that object is added or modified, if the child object has one of the types in the vtype list. It will get the parent-child relation as its single argument.

If the object being listened to is removed, the view is unregistered. (Although generally, I don't recommend ever deleting objects; instead, use some kind of workflow system to deactivate it - in its simple form, a boolean property named workflow:is_active.)

Listeners can also be registered while running a view.

You may combine these two facts to watch deep structures. Let's say you have a model structure like this: a top-level generic object named blog, with child objects of type blog:user, each with child objects of type blog:entry. You can register a children listener like listen(users, '/blog', 'blog:user'), and then inside the body of this function, for each user, call listen(user_home, '/blog/%s' % username) and listen(entry, '/blog/%s' % username, 'blog:entry').
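Put together, a view module for the blog example could look something like the sketch below. Since Balaka doesn't exist yet, listen is replaced here by a stand-in that only records registrations, and Relation is a stand-in for the parent-child relation object; taking the username from the relation's name is an assumption.

```python
from collections import namedtuple

# Stand-in for the parent-child relation a children listener receives.
Relation = namedtuple("Relation", "parent child name position")

# Stand-in for the `listen` function Balaka would insert into the module's
# global namespace; this version only records what was registered.
registrations = []


def listen(function, path, *vtypes):
    registrations.append((function.__name__, path, vtypes))


def user_home(user):
    """Single-object listener: would render one user's home page."""


def entry(relation):
    """Children listener: would render one blog entry."""


def users(relation):
    """Children listener on /blog: sets up the per-user listeners."""
    username = relation.name  # assumption: the relation name is the username
    listen(user_home, '/blog/%s' % username)
    listen(entry, '/blog/%s' % username, 'blog:entry')


# Registered when the module is imported.
listen(users, '/blog', 'blog:user')

# Simulate Balaka reporting a new blog:user child named "lalo".
users(Relation(parent=None, child=None, name="lalo", position=0))
```

After the simulated call, registrations holds three entries: the children listener on /blog, plus the single-object listener and the children listener on /blog/lalo.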

Python view implementation

Listeners operate on Vobject objects. Vobject is a simple wrapper type, implementing a subset of the VOS python API. Namely, it has these methods and attributes:

getChildren():
 returns a list of parent-child relations
getParents():
 returns a list of parent-child relations
findChild(name):
 returns a parent-child relation
findParent(name):
 returns a parent-child relation
iterChildren(vtype):
 iterates over children of that vtype
iterChildren():
 iterates over all children
send(messagetype, method, **args):
 sends a message
types:
 list of strings (the vtypes)

Note

Those methods use camelCase names rather than PEP 8 style, because they're mimicking similar methods in the VOS Python API.

Trying to convert an object to a string will check if the object is a Property; if so, it will return the result of read()ing the Property. Otherwise, it will raise NotImplementedError.

Additionally, trying to access any other attribute will look for a child with that name, and if it exists, return the child (not the parent-child relation). Trying to get an item (e.g. obj[5]) will get the child object in that position. Iterating will iterate over the children in order.
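To make the string conversion, attribute, item, and iteration behaviour concrete, here is a toy class that behaves that way. It is not the real Vobject: SketchVobject, its constructor arguments, and the is_property flag are invented, and it uses Python's NotImplementedError exception for the non-Property case.

```python
class SketchVobject:
    """Toy stand-in for the Vobject wrapper, for illustration only."""

    def __init__(self, types=(), children=(), is_property=False, value=None):
        self.types = list(types)
        self._children = list(children)  # ordered (name, SketchVobject) pairs
        self._is_property = is_property  # hypothetical flag, not a VOS concept
        self._value = value              # what read() would return

    def __str__(self):
        if self._is_property:
            return self._value
        raise NotImplementedError("only Property objects convert to strings")

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        for child_name, child in self._children:
            if child_name == name:
                return child  # the child itself, not the relation
        raise AttributeError(name)

    def __getitem__(self, position):
        return self._children[position][1]

    def __iter__(self):
        return (child for _, child in self._children)


title = SketchVobject(is_property=True, value="Hello")
body = SketchVobject(is_property=True, value="First post")
post = SketchVobject(types=["blog:entry"],
                     children=[("title", title), ("body", body)])
```

With these objects, str(post.title) is "Hello", post[1] is the body property, and iterating over post yields title and then body.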

Parent-child relation objects have four attributes: parent, child, name, and position.

Quite intentionally, there are no methods to modify the models. However, if you're crazy enough, nothing keeps you from just doing import vos, connecting to the model site, and doing whatever you like.

To actually generate output, a listener calls the function output(name), which is inserted in the global namespace. It will return a file-like object, which you can .write() to (if there is a single output tree defined, it will be optimized to return the actual file object). The name is a string, used to generate the filename; it may be relative, in which case parent view directories will be taken into consideration, or absolute, in which case it will be relative to the top of the output tree. You don't need to .close() this file (but you can). This design is intended to allow a single listener to generate multiple output files, which is a common requirement.
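A listener producing two files from one call might look like the sketch below. The output function here is a stand-in that writes into in-memory buffers instead of the output tree; make_output, the "blog" view directory, and the file names are all hypothetical, and the way relative and absolute names are resolved follows the description above.

```python
import io
import posixpath

written = {}  # output path (relative to the output tree) -> buffer


def make_output(view_dir):
    """Build a stand-in for the `output` function Balaka would provide."""
    def output(name):
        if name.startswith("/"):
            # Absolute name: relative to the top of the output tree.
            relative = name.lstrip("/")
        else:
            # Relative name: parent view directories are taken into account.
            relative = posixpath.join(view_dir, name)
        relative = posixpath.normpath(relative)
        buffer = io.StringIO()
        written[relative] = buffer
        return buffer
    return output


# Pretend this view definition lives in the view directory "blog".
output = make_output("blog")


def entry(name, title):
    """One listener call, two output files: a page and a feed item."""
    output("%s.html" % name).write("<h1>%s</h1>" % title)
    output("/feeds/%s.xml" % name).write("<title>%s</title>" % title)


entry("first-post", "Hello")
```

After the call, written contains blog/first-post.html and feeds/first-post.xml, neither of which was explicitly closed.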

Anything that the listener sends to standard output will be logged.

Compiled views

Files with the extension .so are loaded as shared libraries; then Balaka looks for a symbol named initialize, and calls it to set up the actual listener(s).

Detailed API to be documented.

Executable views

Files with no recognized extension that have the executable filesystem permission set are simply executed. This is the “ultimate alternative”, if you really need to write views in some odd way not (yet) supported by Balaka.

When run with no arguments, such a file should output a stream of commands, setting up its listeners.

When actually generating output, Balaka will call it with a single argument, the listener name. It will get four open file descriptors, which will be pipes into Balaka; the three standard ones (standard input, output, and error output), plus a command output. It's expected to communicate with Balaka through the command output and standard input, and generate output on the standard output.

Detailed API to be documented.

Error handling

To be documented: what happens when there are errors (Python views can raise exceptions, compiled views can return error conditions, executables can write to standard error and return nonzero). Basically, Balaka skips that object and either logs it or mails you.

Controller definitions

Controllers are similar to views in the way they are defined, but they are “truly dynamic”; instead of generating static output, they are intended to handle an actual HTTP request. Therefore, their locations in the filesystem mirror their actual URLs.

They are intended mostly for highly-dynamic interaction - such as searches, user account data, or modifying data, probably returning XML or JSON - or nothing. You can, of course, do this with a simple CGI on your webserver; the point of controllers is that they give you easier access to your models.

Python Controllers

Like Python views, Python controllers are implemented as modules. Python files in the controller tree will be imported, and if they export a symbol named controller, it will be made available. If a module doesn't export such a symbol, it will be ignored by Balaka, which allows you to keep “utility” modules in the tree.

The controller is called with two arguments, the request and response.

The request object has two dictionary attributes, GET and POST (one of them will be None); a dictionary attribute headers with the raw HTTP headers; and the attributes path, method, and protocol, which map the parts of the request line (e.g. GET /foo HTTP/1.1).

The response is a file-like object, with two methods:

set_status(code, message=None):
 sets the response status, and optionally the message; if the message is None, a default is used (based on the text from the HTTP 1.1 spec).
send_header(name, value):
 sends a header.

These methods can't be called after the response has been written to.

Additionally, a Vobject object named site is available in the global namespace when a controller is called. See “Python view implementation” for details on Vobject objects.

Instead of setting the status for an error, a controller may (if it hasn't yet written to the response) raise one of a few predefined exceptions:

ControllerDataNotFound:
 returns a 404 (Not Found).
ControllerError:
 returns a 500 (Server Error).
ControllerUnauthorized:
 returns a 401 (Unauthorized).
ControllerForbidden:
 returns a 403 (Forbidden).
ControllerInvalid:
 returns a 400 (Bad Request).
ControllerRedirect:
 returns a 302 (Found) - a redirect. Useful for controllers that modify data and don't return anything.
ControllerException:
 the base class for all these. While the specialised classes take an optional string as their constructor argument (for the message), this takes both the response code and the optional message, like the response method set_status().
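The following sketch shows how a controller and these exceptions could fit together. Everything except the names controller, ControllerException, ControllerDataNotFound, set_status, send_header, GET and POST is invented: the request and response classes are toys, dispatch stands in for whatever Balaka would do around the call, and the USERS dictionary replaces data that a real controller would read through site.

```python
import io


class ControllerException(Exception):
    """Stand-in base class: takes the status code and an optional message."""

    def __init__(self, code, message=None):
        super().__init__(code, message)
        self.code = code
        self.message = message


class ControllerDataNotFound(ControllerException):
    """Stand-in for the 404 exception: takes only the optional message."""

    def __init__(self, message=None):
        super().__init__(404, message)


class SketchResponse(io.StringIO):
    """Toy file-like response with the two documented methods."""

    def __init__(self):
        super().__init__()
        self.status = (200, None)
        self.headers = []

    def _require_unwritten(self):
        # Status and headers can't change once the body has been written to.
        if self.tell() > 0:
            raise RuntimeError("response body already written")

    def set_status(self, code, message=None):
        self._require_unwritten()
        self.status = (code, message)

    def send_header(self, name, value):
        self._require_unwritten()
        self.headers.append((name, value))


class SketchRequest:
    """Toy request carrying only the attributes this example reads."""

    def __init__(self, path, GET=None, POST=None):
        self.method = "GET" if POST is None else "POST"
        self.path = path
        self.GET = GET
        self.POST = POST


USERS = {"lalo": "Lalo Martins"}  # hypothetical data, instead of using `site`


def controller(request, response):
    name = (request.GET or {}).get("name")
    if name not in USERS:
        raise ControllerDataNotFound("no such user")
    response.send_header("Content-Type", "text/plain")
    response.write(USERS[name])


def dispatch(controller_func, request):
    """Hypothetical glue: call the controller, map exceptions to a status."""
    response = SketchResponse()
    try:
        controller_func(request, response)
    except ControllerException as exc:
        response = SketchResponse()
        response.set_status(exc.code, exc.message)
    return response


found = dispatch(controller, SketchRequest("/user", GET={"name": "lalo"}))
missing = dispatch(controller, SketchRequest("/user", GET={"name": "nobody"}))
```

In this sketch, found carries the default 200 status and the user's name as its body, while missing carries the 404 status with the message "no such user" and an empty body.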

The JSON helper

A controller that returns JSON can optionally use the JSON helper. You can use it as a decorator:

@json_controller
def controller(request):
    ...

In this case, the function doesn't get the response argument. Instead, it's expected to either raise one of the built-in exceptions, or

Other features

Output listeners

The configuration file may register one or more output listeners. These are normal executables, which are run after each change to the output tree. Specifically, if many models change within a short (configurable) period of time, all the views will be called first, and only after that will the output listener run.

You can use this feature to commit your output to version control, or to rsync it to another location, for example.
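One way to read the batching rule is sketched below: changes that follow each other within the configured period form one burst, and the output listener runs once per burst. This is only an interpretation of the paragraph above; the function name listener_runs, the window parameter, and the use of plain timestamps are assumptions.

```python
def listener_runs(change_times, window):
    """Group change timestamps into bursts; one listener run per burst.

    change_times: times (in seconds) at which models changed.
    window: the configurable period; a change arriving within `window`
            seconds of the previous one joins the same burst.
    """
    bursts = []
    for moment in sorted(change_times):
        if bursts and moment - bursts[-1][-1] <= window:
            bursts[-1].append(moment)
        else:
            bursts.append([moment])
    return bursts


bursts = listener_runs([0.0, 0.5, 1.2, 10.0], window=1.0)
```

Here the first three changes fall into one burst and the fourth into another, so the output listener (say, an rsync script) would run twice rather than four times.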