VRML Behaviors - an API

Mitra mitra@mitra.biz
Yasuaki Honda <honda@csl.sony.co.jp>, Sony Computer Science Laboratory
Kouichi Matsuda <matsuda@csl.sony.co.jp>, Sony Computer Science Laboratory

This is a living document; please fetch the latest version from http://www.mitra.biz/vrml/vrml2/vrml-api.html.
This version was last updated by Mitra 9 Dec 95 at 17:42.
This proposal should be read in conjunction with the rest of the behaviors proposal, which specifies the nodes inside VRML that support this. It is available from http://www.mitra.biz/vrml/vrml2/behaviors/behaviors.html



Overview and Goals

Note: SGI agrees with the goals and overall architecture presented in this document. However, due to time limitations, SGI has not yet commented on the details of this proposal.

After an event has arrived at a ScriptNode it is forwarded to an Applet via either an embedded or plug-in interpreter. The goal of this document is to present a straightforward API for the interfaces between the Browser, the Interpreter and the Applet. It must:

  1. Be implementable - to ensure support across a variety of browsers.
  2. Be efficient - so that complex behaviors can run fast.
  3. Work across multiple platforms.
  4. Support a number of languages.
  5. Avoid an M platforms by N languages implementation burden.
  6. Work with new nodes, without changing the interpreters or browsers.
  7. Work well with networking, while not being dependent upon the network protocol.
  8. Allow Applets written in any particular language to run on any platform that supports the API.

This document presents a detailed API that Applet authors can work to.

This API also specifies an interface between the Browser and an Interpreter, so that plug-ins can be built that will work with a variety of different browsers. The Interpreter API is not required where the Browser supports the language internally. Whether the Interpreter API is supported, and whether the Applet is running in an embedded or plug-in (e.g. DLL) interpreter, should not change the Applet code at all.

This proposal presents a simple "thin" API consisting of the minimum set of calls that enable a language interpreter and browser to interact, while allowing for appropriate extensibility.

We believe that this API should be treated as experimental, to give us a basis for interoperability. It remains to be seen what other functionality will be required.

API - Object Model

Overview - three parts

The model taken here has three distinct elements.
  1. Browser: Reads behavior files via HTTP or whatever, loads interpreters, and passes behavior files to them. It handles callbacks from running Applets, including those that manipulate the scene graph, and passes events to them. It also passes messages between Applets.
  2. Interpreter: An interpreter for the language which can manage one or more running Applets passed to it by the browser. It may be embedded in the browser, or be available via a plug-in model, for example a DLL or shared library. The interpreter is responsible for forwarding calls from the browser to specific Applets, and calls from the Applets to the Browser.
  3. Applets: Applications running in the interpreter, executing code supplied in the behavior files. Think of these as being equivalent to C++ objects or Java classes.

What is NOT specified

In order to give language writers as much flexibility as possible this spec does not specify:

Few simple calls

The API breaks down into a very few calls. Each language will have to define its own binding to these calls, but these should take a one-to-one form so that the glue layers can be extremely thin. The calls take two forms: one form for calling from the browser to the library, DLL or whatever implements the interpreter, and a second form for the call from the interpreter to the Applet. In most cases we are specifying calls to the Applet; calls to the Interpreter are specified in a (slightly) platform-dependent manner.

Explanation of Terms

Cookie
An opaque value that can be thought of semantically as a pointer. However, the receiver of a cookie cannot assume anything about its syntax, for example in order to dereference it. All that can be done with a cookie is to pass it back to the place it came from in another call.

Cookies are represented as Foo* meaning a cookie that represents an object of type Foo.
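
One way a browser might honour this rule is to hand out table indices rather than raw pointers, so that a receiver has nothing it could dereference. The following is a minimal sketch in C under that assumption; the names BrowserNode, NodeCookie, cookie_for and node_for are hypothetical and are not part of this proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical browser-private node; receivers of a cookie never see this. */
typedef struct { const char* name; } BrowserNode;

/* The cookie is an opaque integer handle, not a dereferenceable pointer. */
typedef unsigned long NodeCookie;

#define MAX_NODES 16
static BrowserNode* node_table[MAX_NODES];
static size_t node_count = 0;

/* Browser side: register a node and hand out a cookie (0 means failure). */
NodeCookie cookie_for(BrowserNode* node) {
    assert(node != NULL);
    if (node_count == MAX_NODES) return 0;
    node_table[node_count] = node;
    node_count++;
    return (NodeCookie)node_count;      /* index + 1, so 0 stays invalid */
}

/* Browser side: turn a cookie passed back by an Applet into the node. */
BrowserNode* node_for(NodeCookie cookie) {
    if (cookie == 0 || cookie > node_count) return NULL;
    return node_table[cookie - 1];
}
```

Only the side that issued the cookie can map it back to a node; an unknown or stale value simply resolves to NULL.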

aaa::bbbb(cccc, dddd)
This syntax means placing a call to the object represented by "cookie" aaa, calling function "bbbb" with parameters "cccc" and "dddd".

Starting the Applet

Starting the Applet is separated into three stages: locating the interpreter, loading the program, and running the program. Issue: Should Init be a call to the Applet, or the first event? These are equivalent in the synchronous case, but if the Applet wants to run in a separate thread, then the Init call may be useful to create that thread. This may be a language-dependent issue.

Sending methods to the Applet

Once the Applet is loaded, all calls to it are sent in the same way.

The browser creates an event data structure, which looks like this:

struct Event {
    struct Event* nextEvent;  // Pointer to the next event
    char* eventName;          // The ASCII name of the function being called
    double* eventTime;        // Time event occurred (seconds since 1 Jan 1970 GMT)
    enum EventType eventType; // Specifies the type of the data following
    int   dataSize;           // The number of bytes following
    bytes eventData;          // The data that follows depends on the eventType
}
Note that the eventTime is the time the input was first detected by the system, not necessarily the time it was processed by a sender, or the time this data structure was created.

The browser then calls the Interpreter:
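
To make the layout concrete, here is a minimal sketch in C of how a browser might allocate one such event with its payload placed directly after the header. It is an illustration only: the helper names event_new and event_list_free, the three-member EventType enum, and the choice to store eventTime by value are assumptions of this sketch, not part of the proposal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { SFBOOL, SFCOLOR, SFFLOAT } EventType;  /* hypothetical subset */

typedef struct Event {
    struct Event* nextEvent;   /* NULL at the end of the list */
    const char*   eventName;
    double        eventTime;   /* stored by value in this sketch */
    EventType     eventType;
    int           dataSize;
    unsigned char eventData[]; /* payload follows the header */
} Event;

/* Allocate an event and copy the payload in after the header. */
Event* event_new(const char* name, double when, EventType type,
                 const void* data, int size) {
    assert(name != NULL && size >= 0);
    Event* e = malloc(sizeof(Event) + (size_t)size);
    if (e == NULL) return NULL;
    e->nextEvent = NULL;
    e->eventName = name;
    e->eventTime = when;
    e->eventType = type;
    e->dataSize = size;
    if (size > 0) memcpy(e->eventData, data, (size_t)size);
    return e;
}

/* Free a whole list of events. */
void event_list_free(Event* list) {
    while (list != NULL) {
        Event* next = list->nextEvent;
        free(list);
        list = next;
    }
}
```

Keeping header and payload in one allocation means that whoever owns the event can release it with a single free once the receiver has finished with, or copied, the data.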

int Interpreter::ProcessEvents(Applet*, Node*, Event*)
If the interpreter is internal to the browser, it is free to skip the step above, and call directly to the Applet.

The interpreter converts this data structure into a corresponding one that makes sense for the language being handled, for example a list of Java classes. It then turns the Applet* cookie into a real address for the Applet, and calls the Applet.

int Applet::ProcessEvents(Node*, Event*);
The Applet will typically define ProcessEvents as a dispatcher; for example, in C this might be:
int ProcessEvents(Node* node, Event* eventList) {
    Event* event;
    for (event = eventList; event; event = event->nextEvent) {
        /* strcmp returns 0 when the names match */
        if (strcmp(event->eventName, "Init") == 0) {
            init(event->eventType, event->eventData);
        } else if (strcmp(event->eventName, "SetColor") == 0) {
            SetColor(event->eventType, event->eventData);
        } else {
            return VRMLAPI_NOSUCHEVENT;
        }
    }
    return 0;  /* assumed success value; this proposal does not define one */
}
Because the above is tedious to program, it is likely that a trivial macro will be provided to automate the process.
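
As an illustration of what such a macro might look like, the sketch below builds a name-to-function table with the C preprocessor's stringizing operator and has ProcessEvents walk it. This is only one possible shape. The HANDLER macro, the DispatchEntry type, the simplified Event layout, the VRMLAPI_OK code and the numeric value given to VRMLAPI_NOSUCHEVENT are all assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define VRMLAPI_OK 0            /* hypothetical success code */
#define VRMLAPI_NOSUCHEVENT 1   /* hypothetical numeric value */

typedef struct Event {
    struct Event* nextEvent;
    const char*   eventName;
    int           eventType;
    const void*   eventData;
} Event;

typedef void (*Handler)(int eventType, const void* eventData);
typedef struct { const char* name; Handler fn; } DispatchEntry;

/* The "trivial macro": pairs an event name with the function of that name. */
#define HANDLER(fn) { #fn, fn }

static int init_calls = 0, color_calls = 0;
static void Init(int type, const void* data) { (void)type; (void)data; init_calls++; }
static void SetColor(int type, const void* data) { (void)type; (void)data; color_calls++; }

static const DispatchEntry dispatch_table[] = { HANDLER(Init), HANDLER(SetColor) };

int ProcessEvents(void* node, Event* eventList) {
    (void)node;
    for (Event* ev = eventList; ev != NULL; ev = ev->nextEvent) {
        size_t i, n = sizeof dispatch_table / sizeof dispatch_table[0];
        assert(ev->eventName != NULL);
        for (i = 0; i < n; i++) {
            if (strcmp(ev->eventName, dispatch_table[i].name) == 0) {
                dispatch_table[i].fn(ev->eventType, ev->eventData);
                break;
            }
        }
        if (i == n) return VRMLAPI_NOSUCHEVENT;  /* no entry matched */
    }
    return VRMLAPI_OK;
}
```

With this arrangement, adding a new event means writing the handler and adding one HANDLER(...) entry; the dispatcher itself does not change.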

Issue: Three other approaches were considered and discarded.

  1. Direct calls from the interpreter to the function. Unfortunately this requires the interpreter to know a lot about the internal calls of the Applet.
  2. Direct calls from the browser to the function. Unfortunately this will not work with most libraries (e.g. DLLs) where the calls into the library have to be known at compile-time.
  3. Sending just one event at a time. However, this leads to inefficiencies in the case where several inputs change at the same time, and outputs need only be computed once.

By the time the Applet::ProcessEvents call has returned, the Applet must have finished with, or copied, the data, and the Interpreter is free to destroy it. By the time the call to Interpreter::ProcessEvents returns, the Interpreter must have copied or be finished with the data, so that the Browser can now destroy it.

Note that there is no requirement that the events have actually been processed by the time the calls return. In some languages and platforms the event processing will be scheduled asynchronously.

Events from the Applet to the Browser

The Applet communicates back to the Browser by sending events via the Interpreter. It does this through a single call:
SendEvent(Node* destination, Applet* self, Event* eventList);
Normally the Node* is that passed to the Applet with the Init call. Self is a pointer to the Applet itself, and eventList is a data structure of the same format as those it receives in calls to ProcessEvents.

The interpreter will convert the Event* data structure into a C data structure, convert the Applet* pointer into an Applet* cookie and place an identical call to the Browser.

It should be noted that if the Applet is to send events to anything other than the Node* passed at Init time, then it must have the SIDEEFFECTS flag set in the ScriptNode, so that the browser knows not to optimise this node.

Sometimes it is useful to be able to group a set of events, so that they all appear in the same frame. For example, when deleting an object in one place and adding it somewhere else, you would rather not have a frame where the object disappears. This is accomplished by sending the events as a single EventList, in one call to SendEvent.
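
The grouping can be sketched in C as follows. This is an illustration under stated assumptions: SendEvent here is a stand-in that only counts what it receives, the Event struct is reduced to the two members needed, and replace_child is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Event {
    struct Event* nextEvent;
    const char*   eventName;
} Event;

/* Stand-in for the real call: records how many calls and events arrived. */
static int calls_seen = 0, events_seen = 0;
int SendEvent(void* destination, void* self, Event* eventList) {
    (void)destination; (void)self;
    calls_seen++;
    for (Event* ev = eventList; ev != NULL; ev = ev->nextEvent) events_seen++;
    return 0;
}

/* Link two events so that they travel in one call and land in one frame. */
int replace_child(void* frame, void* self, Event* removal, Event* addition) {
    assert(removal != NULL && addition != NULL);
    removal->nextEvent = addition;
    addition->nextEvent = NULL;
    return SendEvent(frame, self, removal);
}
```

The point of the sketch is that both events cross the API in a single SendEvent call, so the browser never sees the removal without the matching addition.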

Stopping the Applet

Resolution

The resolution calls are used by the Applet to acquire a pointer to a node in the Browser, in order to send it events.

It is an open issue as to what an Applet should be able to resolve, opinions varied between "anything" and "only-fields" in the Script node.

Field* ResolveNode(Node* relativenode, char* name);
This call is used to turn a textual name for a node into a cookie for that node. The relativenode parameter is used so that an applet can refer to a node within a particular context. This is an area for further discussion.

Ref Counters

It is assumed that some browser-dependent mechanism exists for keeping a count of references to a data structure. Because of the asynchronous nature of the browser and Applet, it is essential that the browser not destroy a data structure which the Applet still has a pointer to. For this reason, any time the Browser resolves a string into a cookie for a data structure, or otherwise passes a cookie to the Applet, it increments the RefCount on that data structure, allowing the Applet to be sure that it will stay in existence. This means that the Applet must explicitly free any cookie it has, so that the Browser can delete the data structure if nothing else points to it. This is done via a call to FreeNode(Node*).

Note that this cannot be done by a garbage collection mechanism, because the browser has no access to the data structures of the Applet to see what pointers exist. 
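
The counting rule can be sketched in C as below. The sketch assumes a tiny fixed node table, has ResolveNode return a Node* for simplicity, and marks a node as destroyed instead of freeing it so that the effect stays observable. The inScene and destroyed members and the RemoveFromScene helper are hypothetical browser internals, not part of this API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Node {
    const char* name;
    int refCount;      /* outstanding cookies held by Applets */
    int inScene;       /* nonzero while the scene graph still uses it */
    int destroyed;     /* set instead of freeing, to keep the sketch observable */
} Node;

#define NODE_COUNT 2
static Node nodes[NODE_COUNT] = { { "Lamp", 0, 1, 0 }, { "Door", 0, 1, 0 } };

/* A node may only go away when neither the scene nor any Applet refers to it. */
static void maybe_destroy(Node* n) {
    if (n->refCount == 0 && !n->inScene) n->destroyed = 1;
}

/* Resolving a name hands out a cookie and increments the count. */
Node* ResolveNode(Node* relativenode, const char* name) {
    (void)relativenode;
    for (size_t i = 0; i < NODE_COUNT; i++) {
        if (!nodes[i].destroyed && strcmp(nodes[i].name, name) == 0) {
            nodes[i].refCount++;
            return &nodes[i];
        }
    }
    return NULL;
}

/* The Applet gives the cookie back; the count drops. */
void FreeNode(Node* n) {
    assert(n != NULL && n->refCount > 0);
    n->refCount--;
    maybe_destroy(n);
}

void RemoveFromScene(Node* n) {   /* hypothetical browser-internal helper */
    n->inScene = 0;
    maybe_destroy(n);
}
```

A node removed from the scene while an Applet still holds its cookie survives until the matching FreeNode call.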

Events

Events communicated between the Browser and the Applet fall into the following groups:

Generic Events

The ScriptNode specifies a set of events that the Script can send to the ScriptNode, which will then be sent on by the Browser depending on the ROUTEs set up. The Applet can only send events to a Node whose pointer it has obtained in some way, for example by using the Resolution call above. 

Changing Fields

Changing fields is done identically to sending generic events, except that the names of the events to do so are defined in the VRML spec, or by means of a Prototype declaration, for example by building an event{ eventType=SFColor SFName="setDiffusiveColor" SFData=1 0 0}.

In order to make this painless, a set of Event constructors are defined, so that, for example:

    sendEvent(Node*, this, t1 = eventSFColor("setDiffusiveColor",1,0,0)); destroy(t1);
can be used to send an event. Issue: Is it worth defining a slightly higher level call of the form:
    sendEventSFColor(Node*, this, "setDiffusiveColor", 1,0,0);
to create an event, send the event, and destroy the event?
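
A minimal C sketch of how the constructor and the higher-level wrapper could fit together is shown here. It is not a definitive binding: the Event layout is simplified to hold only an SFColor payload, sendEvent is a stand-in that records what it was given, and the numeric value of SFCOLOR and the -1 failure code are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { SFCOLOR = 2 };   /* hypothetical numeric tag */

typedef struct Event {
    struct Event* nextEvent;
    const char*   eventName;
    int           eventType;
    int           dataSize;
    float         color[3];   /* payload, fixed size for SFColor only */
} Event;

/* Constructor: build an SFColor event carrying three floats. */
Event* eventSFColor(const char* name, float r, float g, float b) {
    Event* e = malloc(sizeof *e);
    if (e == NULL) return NULL;
    e->nextEvent = NULL;
    e->eventName = name;
    e->eventType = SFCOLOR;
    e->dataSize = (int)sizeof e->color;
    e->color[0] = r; e->color[1] = g; e->color[2] = b;
    return e;
}

void destroy(Event* e) { free(e); }

/* Stand-in for the real sendEvent: remembers the last colour it was given. */
static float last_sent[3];
int sendEvent(void* node, void* self, Event* e) {
    (void)node; (void)self;
    assert(e != NULL && e->eventType == SFCOLOR);
    memcpy(last_sent, e->color, sizeof last_sent);
    return 0;
}

/* The higher-level form: create, send and destroy in one call. */
int sendEventSFColor(void* node, void* self, const char* name,
                     float r, float g, float b) {
    Event* e = eventSFColor(name, r, g, b);
    if (e == NULL) return -1;
    int status = sendEvent(node, self, e);
    destroy(e);
    return status;
}
```

The wrapper removes the need for the caller to hold a temporary and remember to destroy it, at the cost of one extra entry point per field type.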

Sending MF events is only a little harder, with similar constructors being called, e.g.

    sendEvent(Node*, this, t1=eventMFFloat("doIt",1.0,3.1415,2E10));
Sometimes the Applet will need the value of a field, in which case we have for each SF* type.

Open issue: Getting fields needs a little more thought, since the Applet may be running asynchronously, and so be unable to wait for a return from the browser.

Needs have currently been identified for functions of the form

Adding and Deleting Nodes

To edit the scene graph we have to be able to obtain pointers to parts of it, turn VRML into a browser-dependent data structure, and add and delete those structures from the scene. Passing entire chunks of the Scene Hierarchy across the API should be avoided wherever possible.

Typically the first thing that will be done is to compile the VRML. This is done by a call via the interpreter.

Node* Browser::CompileVRML(char* VRML)
This compiles some valid VRML into a browser-dependent snippet of Scene Graph. It returns NULL if the VRML is invalid. At this point the VRML data structure is owned by the Browser, but is not rendered. It can be manipulated using any of the calls described here before being placed in the scene, typically by sending an AppendNode(Node*) event to a Frame.

This is not done by sending an event because it needs to return before the Applet can proceed.

We are assuming that the Frames proposal is accepted, but this approach would also work, with minor changes, with the existing Separators.

The Frame node will be changed to support the events needed to allow editing of the scene graph.

Frame {
    .... transform properties and events
    # eventOut SFLong GetNoChildren # Find the number of children.
    # eventOut SFNode GetChildN     # Return the n-th child.
    #     Issue: How to specify the parameter N
    # eventIn  SFNode setChildren   # Replaces all children
    # eventIn  SFNode appendChild   # Appends child
    # eventIn  SFNode DeleteChild   # Removes a child from Frame
}
The events above allow for arbitrary changes to a node once the Behavior has a cookie for it. Note that the order of children in Frames is not important, so there is no need for an event to insert a child in a particular order.

To discover exactly what VRML to pass to a browser, the Applet can call:

    bool Interpreter::NodeSupported("NodeType");
This returns true if the node is supported, and false otherwise.

Issue: It is an open issue as to whether a Behavior should be able to create arbitrary geometry, however if access is restricted to things we have cookies to, then adding a behavior that is similarly restricted, to a node we have access to should not violate security. 

Culling events

Gavin: Does not believe these are necessary, and would use VisibilitySensors instead.

In order for the script to optimise itself, for example to reduce Geometry updates when not visible, it needs to be able to receive events related to the system. This set of predefined events is sent by the browser in response to various culling and administration situations. They are controlled by flags on the ScriptNode.

Dynamic Routing

The Applet needs to be able to respond to things that happen in the browser. In a static world, this is accomplished by the path:

In a dynamic world, the behavior needs to be able to set up this path. To do this it needs to be able to do three things:
  1. Create a Sensor
  2. Add a new Sensor
  3. Route an existing, or new Sensor to an existing ScriptSensor.

and the opposites:
  1. Delete a Sensor
  2. Remove a route

Creating a Sensor could be done by compiling the VRML for it, but this could be required sufficiently frequently that calls to the Browser (via the Interpreter) should be provided to do this efficiently i.e.:
    Node* CreateClickSensor();
    Node* CreateLineSensor(...);
and so on for all the predefined Sensor nodes.

Adding a new Sensor can be done by a new event on the Frame or a Leaf node.

    #eventIn SFNode AddEvent
which will add a Sensor (Click or Drag) to an existing node, and a corresponding
    #eventIn SFNode DeleteEvent
to remove the sensor.

Adding and deleting routes is a little more complicated, because it requires multiple parameters (not allowed in the architecture proposed). It should probably be done by a call to the Browser (via the Interpreter), which has the required parameters, and can be converted by the Interpreter into a number of browser-specific events needed to set up the required data structures, dependency graphs etc.

    AddRoute(Node* sourceNode, SFString sourceEvent, 
             Node* destinationNode, SFString destinationEvent,
             SFString userData);
To remove a route we have the complementary call:
    DeleteRoute(Node* sourceNode, SFString sourceEvent, 
             Node* destinationNode, SFString destinationEvent,
             SFString userData);
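
A minimal sketch in C of the bookkeeping a browser might keep behind these two calls is given below. It assumes a small fixed-size route table, plain C strings in place of SFString, and int return values (1 for success, 0 for failure); none of these details are specified by the proposal.

```c
#include <assert.h>
#include <string.h>

typedef struct Node Node;   /* opaque: only the cookie value is compared */

typedef struct {
    Node* sourceNode;       const char* sourceEvent;
    Node* destinationNode;  const char* destinationEvent;
    const char* userData;   /* strings are borrowed, not copied, in this sketch */
    int inUse;
} Route;

#define MAX_ROUTES 8
static Route routes[MAX_ROUTES];

static int same_route(const Route* r, Node* s, const char* se,
                      Node* d, const char* de, const char* ud) {
    return r->inUse && r->sourceNode == s && r->destinationNode == d &&
           strcmp(r->sourceEvent, se) == 0 &&
           strcmp(r->destinationEvent, de) == 0 &&
           strcmp(r->userData, ud) == 0;
}

/* Returns 1 on success, 0 if the table is full. */
int AddRoute(Node* s, const char* se, Node* d, const char* de, const char* ud) {
    assert(se != NULL && de != NULL && ud != NULL);
    for (int i = 0; i < MAX_ROUTES; i++) {
        if (!routes[i].inUse) {
            Route r = { s, se, d, de, ud, 1 };
            routes[i] = r;
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if a matching route was removed, 0 if none matched. */
int DeleteRoute(Node* s, const char* se, Node* d, const char* de, const char* ud) {
    for (int i = 0; i < MAX_ROUTES; i++) {
        if (same_route(&routes[i], s, se, d, de, ud)) {
            routes[i].inUse = 0;
            return 1;
        }
    }
    return 0;
}
```

Because DeleteRoute takes exactly the same parameters as AddRoute, the Applet does not need to keep a separate handle for each route it creates.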

Bindings

SFEnum

In order for the Applet to be able to know what values etc to return it is essential that the Applet have values bound to some things which have been symbolic in VRML1.0. In particular this applies to Enums.

There are four ways the binding could be done, ranging from most flexible at the top to most efficient at the bottom.

  1. All calls are passed as strings - e.g. SendEventSFBool(Node*, this, "setOn", "TRUE");
  2. A call to determine the value of bindings for that platform, e.g. QueryEnum("ShapeHints.vertexOrdering"), resulting in some data structure that can be reused.
  3. Loading a platform-dependent header file from within the application.
  4. Defining the binding across platforms, at the time the Enum is defined.

We believe that the fourth option is the best, since there is little to be gained by leaving enum assignment to the browsers. The process of defining these values is a fairly minor administrative task. In the meantime it should be assumed that enums are allocated in the order they are mentioned in the VRML1.1 specification, starting with 0, e.g. for vertex ordering in the ShapeHints node: UNKNOWN_ORDERING=0 CLOCKWISE=1 COUNTERCLOCKWISE=2. Note, FALSE=0 TRUE=1.
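
Under the interim rule, a C binding for the ShapeHints example could look like the sketch below. The enum values follow the text above; the VertexOrdering type name and the vertex_ordering_from_name helper (a string lookup in the spirit of the first option) are hypothetical additions for illustration.

```c
#include <assert.h>
#include <string.h>

/* ShapeHints.vertexOrdering, numbered in order of mention, starting at 0. */
typedef enum {
    UNKNOWN_ORDERING = 0,
    CLOCKWISE        = 1,
    COUNTERCLOCKWISE = 2
} VertexOrdering;

/* A string-based lookup can be layered on top of the fixed binding. */
int vertex_ordering_from_name(const char* name) {
    static const char* const names[] = {
        "UNKNOWN_ORDERING", "CLOCKWISE", "COUNTERCLOCKWISE"
    };
    assert(name != NULL);
    for (int i = 0; i < 3; i++) {
        if (strcmp(name, names[i]) == 0) return i;
    }
    return -1;   /* not a known value */
}
```

Because the numbers are fixed across platforms, an Applet compiled once can pass CLOCKWISE to any browser without a query step.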

SFBitMask

The same logic applies to SFBitMask. Allocate 1 to the first mentioned, 2 to the second, 4 to the third, and so on, up to a maximum of a 32 bit word. An extension will be needed to allow definitions of Enum values when declaring new node types with the PROTO keyword.
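
The allocation rule amounts to one left shift per position, as in this small C sketch. The helper names bitmask_value and has_flag are hypothetical, and the SIDES, TOP and BOTTOM flags are used purely as illustrative names numbered in order of mention.

```c
#include <assert.h>
#include <stdint.h>

/* Value for the n-th mentioned flag (n starts at 0); 0 if it will not fit. */
uint32_t bitmask_value(unsigned position) {
    if (position >= 32) return 0;      /* only a 32 bit word is allowed */
    return (uint32_t)1 << position;
}

/* Illustrative flags, numbered in order of mention. */
enum { SIDES = 1, TOP = 2, BOTTOM = 4 };

/* Nonzero when every bit of flag is set in mask. */
int has_flag(uint32_t mask, uint32_t flag) {
    assert(flag != 0);
    return (mask & flag) == flag;
}
```

A field that mentions more than 32 flags cannot be represented under this rule, which is why the helper reports 0 beyond position 31.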

Error codes

The following error codes are defined.
  1. VRMLAPI_NOSUCHEVENT: When an unknown eventName is passed
  2. More to be filled in as a need is determined.

Event Types

The EventType field of an Event is defined as follows:
  1. SFBitMask: The next field is a 32 bit word containing a field-dependent bitmask.
  2. SFBool: The next 32 bit field contains 1 for TRUE and 0 for FALSE.
  3. SFColor: The next 3 32 bit fields contain IEEE floating point values for the colors.
  4. SFEnum: The next 32 bit word contains a field-dependent enum.
  5. SFFloat: The next 32 bit word contains an IEEE floating point value.
  6. SFImage: The next 32 bit fields contain an SFImage. The number of bytes following can be determined by multiplying the first 3 fields together.
  7. SFLong: The next 32 bit field is an integer.
  8. SFMatrix: The next 16 32 bit fields represent a matrix.
  9. SFNode: The next 32 bit field is a cookie for the SFNode.
  10. SFRotation: The next four 32 bit fields contain IEEE floating point values.
  11. SFString: The next 32 bit field contains a length, followed by that number of bytes, followed by a NULL.
  12. SFVec2f: The next 2 32 bit fields contain IEEE floating point values.
  13. SFVec3f: The next 3 32 bit fields contain IEEE floating point values.
  14. SFTime: The next 64 bit field is a double precision floating point number - the number of seconds since Jan 1, 1970 GMT.
  15. SFPick: The next 10 32 bit fields are SFBOOL isover, SFBOOL isActive, SFVEC3F hitPoint, SFVEC3F hitNormal, SFVEC2F hitTexture.
    Gavin: Would rather see these as separate events.
    Issue: We might prefer to send a cookie and allow callbacks, especially for calculating the hitNormal and hitTexture.
  16. SFProximity: The next 11 32 bit fields are SFTIME enter, SFTIME exit, SFVEC3F position, SFROTATION orientation.
  17. SFCollision: The next 9 32 bit fields are SFTIME collisionTime, SFVEC3F position, SFROTATION orientation.
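
The fixed payload sizes implied by these definitions can be tabulated in C as follows. This is a sketch only: the EventType enum, its member names and their numeric order are assumptions (the proposal does not assign values), and fixed_payload_size is a hypothetical helper. Sizes are derived by counting 4 bytes per 32 bit field and 8 bytes per SFTime.

```c
#include <assert.h>

typedef enum {
    SFBITMASK, SFBOOL, SFCOLOR, SFENUM, SFFLOAT, SFLONG, SFMATRIX, SFNODE,
    SFROTATION, SFVEC2F, SFVEC3F, SFTIME, SFPICK, SFPROXIMITY, SFCOLLISION,
    SFIMAGE, SFSTRING
} EventType;   /* numeric values are not defined by the proposal */

/* Size in bytes of the payload, or -1 for variable-length types. */
int fixed_payload_size(EventType type) {
    assert((int)type <= (int)SFSTRING);
    switch (type) {
    case SFBITMASK: case SFBOOL: case SFENUM:
    case SFFLOAT: case SFLONG: case SFNODE:  return 4;
    case SFVEC2F: case SFTIME:               return 8;
    case SFCOLOR: case SFVEC3F:              return 12;
    case SFROTATION:                         return 16;
    case SFCOLLISION:                        return 36;  /* 9 words */
    case SFPICK:                             return 40;  /* 10 words */
    case SFPROXIMITY:                        return 44;  /* 11 words */
    case SFMATRIX:                           return 64;  /* 16 words */
    case SFIMAGE: case SFSTRING:             return -1;  /* length is in the data */
    }
    return -1;
}
```

A receiver could compare this value with the dataSize member of an incoming event as a basic sanity check before decoding the payload.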

Platform specific parts

This section needs filling out for each platform. It is probably going to be very similar between different applications on the same platform.

Apple

Apple Events are an obvious, but probably wrong, choice - an Apple guru is needed here.

Windows (3.1, 95 and NT)

OLE or DLL are obvious choices, but a Windows guru is needed here!

Unix

In Unix, shared dynamically linked libraries are probably the easiest way to go.

Sockets

SDSC has a way to extend the API directly over sockets; this needs writing up.

Language specific parts

Different languages will have specific implementation concerns. Here are some comments, but this section needs replacing by people with expertise in those languages.

Requirements

For each language we need to define:

Java

Java requires that the interpreter runs as a separate thread, and wants to poll for events. The threading model is not specified in this API; the suggestion is that the API is implemented as a small library which just queues events for interpretation by the regular Java interpreter running in a separate thread. In many cases Java is going to be included inside a browser, in which case the Interpreter calls mentioned above are all internal, and outside the scope of the API.

Extensibility to new languages

One goal is to be able to add languages. With any of the above approaches, the interpreter can be a separate piece of code. This enables dynamic, or at least user-initiated, downloading. As a simple (but not necessarily sufficient) solution, it is suggested to create a known location (or one that can be set up in a config file) from which an interpreter can be downloaded. For example, if running on Windows, and needing Java, then the interpreter could be obtained from: http://vrml.org/windows/java.dll.

Obviously this brings up security concerns, which is why the site should be specified in a config file, rather than in the VRML. Behind a firewall the MIS people can then specify a local site where only trusted interpreters are made available.

Open issues / Further Work