distributed moo

mooix will be a distributed moo with high degrees of locality. Each computer running mooix will have a set of objects, which are stored on that computer, and run there. This will typically be used to create a set of rooms, and the objects in them. Everything that goes on in those rooms will be managed and run by the computer that hosts them.

So there will be two types of objects. Most everyday objects will be concrete objects; these have a single location and a single instance. There will also be abstract objects that do not have a location, and that can have any number of instances, which can even vary in arbitrary ways (they are not required to be kept in sync).

Concrete objects will be used for rooms, avatars, things you can pick up, etc. Abstract objects will be used for object classes, genders, parsers, data container objects, logging objects, etc.

When a concrete object moves from one room to another, it is literally moved from one computer to another; and so are all of the object's contents. And if that computer doesn't have instances of the abstract objects associated with the moved objects, then the abstract objects are copied to it, but remain behind on the source computer too.
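A rough sketch of that move rule might look like the following. The Host class, the move_object function and the object names are all made up for illustration; nothing here is settled mooix API, and real hosts would of course talk over the network rather than share memory.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """One computer running mooix (hypothetical model, not a real API)."""
    name: str
    concrete: dict = field(default_factory=dict)  # object id -> object data
    abstract: dict = field(default_factory=dict)  # abstract id -> local instance


def move_object(obj_id, contents, needed_abstract, src, dst):
    """Move obj_id and its contents from src to dst.

    needed_abstract lists the abstract objects the moved objects rely on.
    """
    # Abstract objects are copied only where the destination lacks them,
    # and they always remain behind on the source too.
    for abs_id in needed_abstract:
        if abs_id not in dst.abstract:
            dst.abstract[abs_id] = dict(src.abstract[abs_id])
    # Concrete objects have a single instance, so they leave the source.
    for oid in [obj_id, *contents]:
        dst.concrete[oid] = src.concrete.pop(oid)


a = Host("a", concrete={"bob": {}, "hat": {}}, abstract={"avatar": {"version": 1}})
b = Host("b")
move_object("bob", ["hat"], ["avatar"], a, b)
```

After the call, "bob" and "hat" exist only on host b, while the "avatar" abstract object exists on both hosts, which is exactly the situation the garbage collection question below worries about.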

(What about abstract object garbage collection? If they're copied around all the time, they will end up sitting around anywhere an object has been.)

object references

There are basically two use cases for references.

The first is a room that needs to point to the room to the north, which might be on a different server. Note that rooms don't move between servers much. A URL, possibly with some kind of redirect system in case the room does move, should be enough.
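As a sketch of the redirect idea, assume each server keeps a table of forwarding entries for rooms that have moved away. The mooix:// URL scheme, the host names and the resolve function are hypothetical; the table is a plain dict here, where a real system would ask each server in turn.

```python
def resolve(url, redirects, max_hops=5):
    """Follow forwarding entries until reaching a URL that has none."""
    seen = set()
    for _ in range(max_hops + 1):
        if url in seen:
            raise ValueError("redirect loop at " + url)
        seen.add(url)
        if url not in redirects:
            return url  # no forwarding entry: the room lives here
        url = redirects[url]
    raise ValueError("too many redirects")


# Hypothetical forwarding table: the hall moved from server a to server b.
redirects = {"mooix://a.example/rooms/hall": "mooix://b.example/rooms/hall"}
north = resolve("mooix://a.example/rooms/hall", redirects)
```

The hop limit and the loop check are there because forwarding entries on independent servers can easily end up stale or circular.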

The second use case is actually identity. You want to know if this "Bob" is your good friend Bob, or someone else. Similarly, an object needs to have a way to state the identity of its abstract parent object. This use case is actually more common.

It's also helpful to be able to query for where Bob is now, so there needs to be a way to discover an object's location. This needs to be done in a decentralised way, though, and Bob must be able to opt out of letting you know. One way, if Bob had a cell phone, would be for him to let you add a tracker to it.

rpc

The technology chosen for the network layer will be very important. Some choice criteria:

  • needs only one tcp port
  • firewall issues
  • standards based
  • reasonably lightweight
  • extensible
  • able to represent any data that might need to be thrown at it
  • able to transport objects

It would be good to make the rpc layer encapsulated so that it can be switched out fairly painlessly if a bad initial choice is made.
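One way to get that encapsulation is to have all moo code talk to a small transport interface and never to the wire protocol directly. This is only a sketch; Transport, LoopbackTransport, look_north and the host and object names are invented for illustration, and no particular rpc technology is implied.

```python
from abc import ABC, abstractmethod


class Transport(ABC):
    """The only surface the rest of the moo sees (hypothetical interface)."""

    @abstractmethod
    def call(self, host, obj, method, args):
        """Run method on obj at host with keyword arguments args."""


class LoopbackTransport(Transport):
    """In-process stand-in; a real one would speak some network protocol."""

    def __init__(self):
        self.handlers = {}

    def register(self, host, obj, method, func):
        self.handlers[(host, obj, method)] = func

    def call(self, host, obj, method, args):
        return self.handlers[(host, obj, method)](**args)


def look_north(transport):
    # Caller code depends only on Transport, so the backend can be swapped.
    return transport.call("b.example", "hall", "describe", {"verbose": False})


t = LoopbackTransport()
t.register("b.example", "hall", "describe", lambda verbose: "A long hall.")
description = look_north(t)
```

If the initial protocol choice turns out badly, only a new Transport subclass has to be written; look_north and everything like it stays as it is.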

inheritance

Inheritance is difficult in a distributed moo, because a parent object has to be copied to wherever its children go, and yet at the same time we want to let the parent object be modified, and these mods should affect the children too. Except the children may be running on a thoroughly disconnected system.

I think that requiring abstract objects as parents (ie, class based inheritance) is the best approach. It doesn't deal with what to do if the parent object is modified, but we can probably live without that in many cases. It might be worthwhile to set up some way to sync abstract objects from their source, when that source is available and has a new version. gpg signatures and some kind of pinging?
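The sync idea could be as simple as comparing version numbers whenever the source happens to be reachable. The sketch below assumes each abstract object copy carries a "version" field and that fetching from the source raises ConnectionError when it is unreachable; both are assumptions, and signature checking is left out entirely.

```python
def sync_abstract(local, fetch_source):
    """Return the copy to use: the source's if it is newer, else the local one.

    fetch_source is a hypothetical callable that returns the source's
    current copy, or raises ConnectionError if the source is unreachable.
    """
    try:
        remote = fetch_source()
    except ConnectionError:
        return local  # disconnected: keep running with the old copy
    if remote["version"] > local["version"]:
        return dict(remote)
    return local


def unreachable():
    raise ConnectionError("source is offline")


local = {"name": "avatar", "version": 1}
updated = sync_abstract(local, lambda: {"name": "avatar", "version": 2})
offline = sync_abstract(local, unreachable)
```

Children on a disconnected system simply keep the parent they have, which matches the "we can probably live without that" position above.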

method portability

mooix 1.0 has shown that it's very handy to be able to write methods in C, to mess around in arbitrary bits of the OS, etc. But for a distributed moo, that will be problematic.

The solution is that every method of an object will have multiple forks. It must have a portable fork, although this fork can be a stub that does, essentially, nothing. It may also have any number of nonportable forks, that are run if the system the object is running on matches them, instead of the portable fork.

The portable fork will be written in one language, or a very limited set of languages (TBD), using only a limited subset of the language that does not include OS operations, only method calls, logic, etc. Methods written in it should run anywhere.

There can be a hierarchy of nonportable forks. For example, python users will be able to define an OS independent python fork, and also a fork specific to python on Debian linux with modules X, Y and Z, for the more demanding methods. There will be an i386 linux C fork, which will include precompiled binaries (and source), but there could also be a generic C fork, which runs K&R C code on any processor and OS.

The best way to represent the environment needed by a fork is probably as an object class. The class might even provide most of the needed environment in some cases. A mooix system can load the different environment classes, and they can run tests to make sure the environment will work; if it will, then the method forks tagged as needing that environment can be run.

To choose which fork to use if multiple ones are available, a method can have a simple preference list. For example, i386 linux C, then linux perl, then portable.
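Fork selection from such a preference list is easy to sketch. Here forks maps fork names to something runnable, and available_envs is the set of environments whose tests passed on this system; choose_fork and both of those names are hypothetical.

```python
PORTABLE = "portable"


def choose_fork(forks, preference, available_envs):
    """Return the name of the first preferred fork that can run here."""
    if PORTABLE not in forks:
        raise ValueError("every method must have a portable fork")
    for name in preference:
        # The portable fork runs anywhere; others need a matching environment.
        if name in forks and (name == PORTABLE or name in available_envs):
            return name
    return PORTABLE  # nothing preferred is runnable, fall back


forks = {"i386 linux C": "greet.bin", "linux perl": "greet.pl", PORTABLE: "greet"}
preference = ["i386 linux C", "linux perl", PORTABLE]

on_perl_box = choose_fork(forks, preference, {"linux perl"})
on_bare_box = choose_fork(forks, preference, set())
```

On a system offering only the linux perl environment this picks the perl fork, and on a system with no extra environments it falls through to the portable fork.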

sandboxing

The basic security model is very simple. Either a method is trusted or it is not; if it is not then it runs in a sandbox. Methods in the sandbox can only access their object's data and run other methods, but can't do general network access. They might be CPU and disk throttled.

But most sandboxes are still a full unix (or whatever) system, so methods that need to use libraries, etc. can. Probably the sandbox will be implemented with UML (User-mode Linux), Xen or similar technologies, although for security we don't want a monoculture of sandbox setups.

So any given system running mooix will run two parallel systems; one will be for trusted methods and runs as whatever user is running mooix, using all the capabilities of the system; one will be a restricted sandbox. For the purposes of mooix, these two systems together comprise one system, although they might well be implemented using more than one computer.

security

Sandboxing is the first line of defense in mooix, but it is not enough. It's one thing to prevent untrusted code from taking over a computer running mooix; sandboxing accomplishes this. It's another thing to prevent one object from doing something to another object that the latter object, or the underlying rules of the world, do not allow.

And as this is a distributed moo, we also need to worry about things like viruses, methods that DOS the system doing something (and maybe send their results home) etc.

From a security point of view, it's better to think of mooix as a related set of systems. A set of these might make up a world, with its own consistent rules. Objects can come in from outside, with different behavior, but that different behavior can be disallowed, so that the object is forced to behave according to the rules of the world it's in.

This can be accomplished by signed methods and a chain of trust. Methods can be reviewed by and signed by someone, and systems comprising one world can each trust the other's signatures, and optionally trust systems trusted by their peers. If a method is not trusted, the moo can refuse to run it. It might fall back to an alternate, trusted fork of the method.

Part of the process of moving an object from one system to another should consist of looking at these signatures, and letting the object decide if it's even viable for it to run on the other system.

It should also be possible to revoke a signature on a method, and/or to revoke all methods signed by someone you've stopped trusting. This could even be allowed to upend the security model in some systems, which might choose to sign and trust methods by default until they prove to be malicious. Of course there would be risk in doing so, but it would be somewhat balanced out by the wider range of methods that would be available.
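To make the trust and revocation rules concrete, here is a sketch that leaves out the cryptography altogether: a "signature" is just a signer's name attached to the sha256 digest of the method source. The System class and its fields are hypothetical, and a real implementation would verify actual signatures (gpg or similar) instead of trusting names.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class System:
    """One mooix system's trust settings (hypothetical model)."""
    name: str
    trusted_signers: set = field(default_factory=set)
    peers: list = field(default_factory=list)  # other System objects in the world
    trust_peers_of_peers: bool = False  # the optional extra hop
    revoked_signers: set = field(default_factory=set)
    revoked_methods: set = field(default_factory=set)  # method digests

    def accepted_signers(self):
        signers = set(self.trusted_signers)
        for peer in self.peers:
            signers |= peer.trusted_signers
            if self.trust_peers_of_peers:
                for far in peer.peers:
                    signers |= far.trusted_signers
        return signers - self.revoked_signers

    def is_trusted(self, digest, signers):
        if digest in self.revoked_methods:
            return False
        return bool(self.accepted_signers() & set(signers))


digest = hashlib.sha256(b"return 'hello'").hexdigest()
a = System("a", trusted_signers={"alice"})
b = System("b", peers=[a])

before = b.is_trusted(digest, {"alice"})  # trusted via peer a
b.revoked_signers.add("alice")  # b stops trusting alice
after = b.is_trusted(digest, {"alice"})
```

Revoking a signer removes every method they signed in one step, while revoked_methods handles the case of a single bad method from an otherwise trusted author.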

Of course most systems will probably choose to trust a common collection of core methods, and there will be well-known authors who everyone will choose to trust. Many abstract methods will be in this category.

Also, it will probably be a common idiom for an object to be split into a remote and a local part, the remote part being a well-known and trusted object, that is sent out to other systems but is controlled by the local part, which can run locally determined code. Avatars are the obvious but far from the only example.

clusters

If you have a set of computers that you are quite sure will always be able to contact each other, then it makes sense to be able to group these into one mooix server. Objects should be able to be spread among the computers in the cluster at will, even if their location object is running on another computer in the cluster. Objects should be able to request to move to a different computer in the cluster (though that computer might reject the move). And computers should be able to move objects to another computer in the cluster.

Sandboxing is just a special case of clustering?

Probably objects in a cluster can be replicated to N machines in it for redundancy.

Probably there's a single frontend machine for the cluster that talks with the larger world.

what to trust

As noted, either a method is trusted, or it is not and runs sandboxed. There's not a lot of need to trust many methods; some that do need to be trusted include the methods that are used to implement the mooix UI, and any that need to access the internet (web page downloads, etc).

Probably most such methods will be in the core; there will need to be a way to bless other methods as trusted as desired; probably by signing them in a special way. Then you can read the code and decide to trust something, so sign it, and then anyone who trusts methods you've signed will also trust the method.

object homes

While objects move around between computers all the time in mooix, some objects also need to have a well-defined home. If the computer serving their home object gets disconnected from where they are, they warp home (NB: their contents do not!). This is especially needed for avatars.

To implement this, we need three things. First, there needs to be a way to replicate any changes to an object's data back to its home. Note that replicating atomically will be important to avoid bugs; if a method is running remotely and changes one field and the object is forced home, the method isn't running anymore, and can't update the other field. Probably this is best fixed in the very core of the moo, so that methods can make multiple changes atomically (make-changes-and-commit?), and always do.
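A make-changes-and-commit primitive could look something like this sketch, where a method edits a staged copy of the fields and nothing is applied or replicated unless the whole batch completes. MooObject, changes and the replicated list are invented names; the list stands in for whatever actually ships changes home.

```python
import copy
from contextlib import contextmanager


class MooObject:
    """Object whose fields only change in whole batches (hypothetical)."""

    def __init__(self, **fields):
        self.fields = dict(fields)
        self.replicated = []  # batches that would be sent to the home

    @contextmanager
    def changes(self):
        staged = copy.deepcopy(self.fields)
        yield staged  # the method edits the staged copy
        # Reached only if the body finished without raising.
        self.fields = staged
        self.replicated.append(dict(staged))


purse = MooObject(gold=10, owner="bob")
with purse.changes() as f:
    f["gold"] = 0
    f["owner"] = "alice"

try:
    with purse.changes() as f:
        f["gold"] = 99
        raise RuntimeError("object forced home mid-method")
except RuntimeError:
    pass  # the half-finished batch is simply dropped
```

After the interrupted second batch the purse still holds the fields from the first one, and only one batch was ever replicated, so the home never sees one field updated without the other.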

Second, there needs to be a way for references to the object to continue to work when it has gone home. They need to redirect back to its home computer.

Third, there needs to be a way to quickly detect a disconnection. Probably some kind of heartbeat between the home and away objects.

So objects with homes are constantly replicated back to their homes, and if the home isn't accessible anymore, they are removed from their location, the replica becomes the object, and references to the object are set to point to the new one back at home.
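Putting the heartbeat and the warp together, a toy version might be as follows. It collapses the home and away sides into one process, which a real system obviously cannot do, and HomeLink, check_home, the timeout value and the mooix:// URLs are all made up for illustration. Time is passed in explicitly so the logic is easy to follow.

```python
class HomeLink:
    """Tracks heartbeats between home and away copies (hypothetical helper)."""

    def __init__(self, timeout, now):
        self.timeout = timeout
        self.last_beat = now

    def beat(self, now):
        self.last_beat = now

    def disconnected(self, now):
        return now - self.last_beat > self.timeout


def check_home(obj_id, link, now, away_room, home, references):
    """Warp obj_id home if the heartbeat has lapsed; return True if it warped."""
    if not link.disconnected(now):
        return False
    away_room.discard(obj_id)  # removed from its location
    home["objects"][obj_id] = home["replicas"].pop(obj_id)  # replica becomes the object
    references[obj_id] = home["url"] + "/" + obj_id  # references point home
    return True


link = HomeLink(timeout=30, now=0)
away_room = {"bob", "hat"}
home = {"url": "mooix://home.example", "replicas": {"bob": {"hp": 7}}, "objects": {}}
references = {"bob": "mooix://away.example/bob"}

link.beat(now=20)
still_away = check_home("bob", link, 40, away_room, home, references)  # 20 since last beat
warped = check_home("bob", link, 60, away_room, home, references)  # 40 since last beat
```

Note that the hat stays behind in the away room when bob warps, since an object's contents do not go home with it.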

location discovery

There needs to be a way for one location to find out about another location. Probably this implies central servers that keep track of what other mooix servers are out there.

observation

There needs to be a way for an object to register its interest in certain types of events in its location. The simplest case is that the avatar needs to notice things going on around it and communicate them back to its user. Another case is the parser, which needs to be kept informed of the names and other salient characteristics of the objects in close proximity to the avatar, so that it can parse sentences about them. Other cases include stuff like remote communication systems, cameras, etc.

This calls for some way for an object to register its interest, and some well-defined system of events etc that can be filtered on the server side so that an object only gets the events it's interested in and so that hidden events can be filtered out.
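A minimal sketch of such a registration and filtering scheme is below. The Event and Location classes, the event kinds and the can_see_hidden flag are assumptions made for the example; how hidden events are really decided is an open question.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str
    text: str
    hidden: bool = False


class Location:
    """Filters events on the server side before delivery (hypothetical)."""

    def __init__(self):
        self.observers = []  # (callback, kinds, can_see_hidden)

    def register(self, callback, kinds, can_see_hidden=False):
        self.observers.append((callback, set(kinds), can_see_hidden))

    def emit(self, event):
        for callback, kinds, can_see_hidden in self.observers:
            if event.kind not in kinds:
                continue  # observer did not ask for this kind
            if event.hidden and not can_see_hidden:
                continue  # hidden events are filtered out
            callback(event)


room = Location()
avatar_saw = []
parser_saw = []
room.register(avatar_saw.append, kinds={"speech", "movement"})
room.register(parser_saw.append, kinds={"movement"})

room.emit(Event("speech", "Bob says hello."))
room.emit(Event("movement", "A hat arrives."))
room.emit(Event("movement", "A thief sneaks in.", hidden=True))
```

Here the avatar receives the speech and the visible movement, the parser receives only the visible movement, and nobody receives the sneaking thief.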