RelaxNGCC Manual

$Id: manual.htm,v 1.5 2002/06/26 13:08:48 kkawa Exp $
1 Annotation Syntax

All attributes and elements defined by RelaxNGCC use http://www.xml.gr.jp/xmlns/relaxngcc as their namespace URI. This manual assumes that the prefix "c" is bound to this namespace URI.

1.1 c:alias attribute

Adding this attribute to a RELAX NG pattern causes the compiler to declare a field in the generated class. At run time, when a string in the XML document matches the pattern, the generated code stores that value in the field, which makes it possible for your code to access it. This attribute is applicable to the data, text, ref, value, and list patterns.

<data type="nonNegativeInteger" c:alias="count"/>

Adding a c:alias attribute to a ref or parentRef pattern causes the specified variable to receive the return value of the referenced scope. By default, this is the instance of the generated class that was used to parse the referenced scope.

A c:alias attribute on a list pattern causes the entire string that matches the whole list pattern to be captured.

1.2 c:java element

You can write a Java code fragment in a c:java element. Within this fragment, you can refer to data from the XML document through the fields declared by c:alias attributes. The content of a c:java element is executed when the input reaches the position where the c:java element is written.

<element name="name">
  <text c:alias="name"/>
  <c:java>System.out.println(name);</c:java>
</element>

The code fragment can throw SAXException (or unchecked exceptions, as usual).

A c:java element can be written inside any pattern that can take child patterns, except choice and interleave (that is, as a child of start, define, group, optional, zeroOrMore, oneOrMore, mixed, or list).

The element is named "java" because we plan to support other languages (such as C#) in the future.

1.3 c:java-body element

The body of this element is copied into the body of the generated class. It is therefore typically used to declare additional fields or helper methods.

<define name="x">
  <c:java-body>
  private void echo(String msg) {
      System.out.println(msg);
  }
  </c:java-body>
  ...
</define>

With the above declaration, the echo method becomes available to all the c:java elements in this define block.

A c:java-body element can be written only as a child of the following patterns:

a. grammar pattern
The contents of c:java-body are copied to all the classes generated by RelaxNGCC.
b. start or define pattern
The contents of c:java-body are copied only to the class that corresponds to that start or define element.

1.4 c:java-import element

Works similarly to the c:java-body element. The only difference is that the contents of a c:java-import element are copied before the class definition. Hence it is usually used to write import declarations.

<c:java-import>
    import java.util.Set;
    import java.util.Iterator;
</c:java-import>

1.5 c:class attribute

The c:class attribute can appear on a start pattern or a define pattern, and specifies the name of the generated Java class. The value must be a valid Java class name. If it is omitted, RelaxNGCC generates a name for the class.

<start c:class="Root">
...
</start>

1.6 c:package attribute

The c:package attribute can appear only on the root element of the RELAX NG grammar. Adding this attribute causes the compiler to add the package declaration to all the files it generates.

<grammar ... c:package="com.example.project1">
...

1.7 c:access attribute

The c:access attribute causes the compiler to add the specified access modifiers (such as "public final") to the generated Java class. Only define and start patterns can carry this attribute.

<start c:class="sample1" c:access="public final">
...
</start>

1.8 c:runtime-type attribute

This attribute causes the compiler to use a user-defined runtime class instead of the default NGCCRuntime class. The value of the attribute must be the name of a Java class that is derived from NGCCRuntime.

Only the root element (usually a <grammar> pattern) in the source schema can carry this attribute.

<?xml version="1.0"?>
<grammar c:runtime-type="org.acme.foo.MyNGCCRuntime" ...>
  ....
</grammar>

1.9 c:return-type/c:return-value attribute

These attributes can be specified on <define> and <start> patterns to specify the return value of a handler class.

c:return-value specifies the expression that is evaluated to produce the handler's return value, and c:return-type specifies its type. The return value is assigned to the alias specified on the corresponding <ref> element. c:return-value defaults to "this", so by default the handler object itself is returned and assigned to the alias.

In the following example, the makeResult method is called and its return value is returned from the handler.

<define name="foo" c:return-type="String" c:return-value="makeResult()">
  <c:java-body>
    private String makeResult() {
      ....
    }
  </c:java-body>
  ...
</define>

...


    <ref name="foo" c:alias="someStringVariable"/>

1.10 c:params/c:with-params attribute

These two annotations are used together to allow a parent handler to pass parameters to a child handler.

The c:params attribute can be specified on <define> and <start> elements to declare parameters.

The value of the c:params attribute must be a comma-separated list of type and variable name pairs, just like the parameter list of a method declaration. The value of the c:with-params attribute must be a comma-separated list of Java expressions, just like the argument list of a method call.

Once you specify a c:params attribute on a block, you need to have c:with-params attributes on all the <ref/> patterns that refer to it.

The compiler generates fields with the same names and assigns the passed parameters to those fields, so you can access them from <c:java-body> or <c:java>.

<define name="foo" c:params="String a,boolean b,Object c">
  ...
  <c:java>
    System.out.println(a);
  </c:java>
</define>

...

   <ref name="foo" c:with-params='"xyz",true,null' />

...

   <ref name="foo" c:with-params='"test",false,System.out' />
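
The pairing of c:params and c:with-params can be pictured in plain Java. The sketch below is only an analogy and is not the code that RelaxNGCC generates: the class Foo and its describe method are hypothetical stand-ins, with c:params playing the role of a parameter list and c:with-params the role of an argument list.

```java
class ParamsSketch {

    // Hypothetical stand-in for the handler of
    // <define name="foo" c:params="String a,boolean b,Object c">.
    static class Foo {
        // One field per declared parameter, carrying the same name.
        final String a;
        final boolean b;
        final Object c;

        Foo(String a, boolean b, Object c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }

        // Plays the role of a <c:java> fragment that reads the fields.
        String describe() {
            return a + "," + b + "," + c;
        }
    }

    public static void main(String[] args) {
        // Corresponds to <ref name="foo" c:with-params='"xyz",true,null' />
        System.out.println(new Foo("xyz", true, null).describe());
    }
}
```

How the real generated class receives these values is an implementation detail of RelaxNGCC; the sketch only shows why the two attribute values must agree in number and type.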

2 Usage of the Generated Code

(This section assumes you are familiar with JAXP.)

To compile and run the code generated by RelaxNGCC, a JAXP-compliant XML parser is necessary.

The generated code works with any component that produces SAX2 events. For details, refer to the main() method of the class that corresponds to the start pattern.
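
As a rough illustration of feeding SAX2 events to a handler through JAXP, consider the following minimal sketch. RecordingHandler is a hypothetical stand-in for a generated class, and the way a real generated handler is constructed and attached may differ, so treat the generated main() method as the authoritative reference.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class SaxDriverSketch {

    // Hypothetical stand-in for a RelaxNGCC-generated handler:
    // it only records the local names of the elements it sees.
    static class RecordingHandler extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            names.add(localName);
        }
    }

    // Parses the XML text and returns element local names in document order.
    static List<String> elementNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // report namespace URIs and local names
        XMLReader reader = factory.newSAXParser().getXMLReader();

        RecordingHandler handler = new RecordingHandler();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<root><name>RelaxNGCC</name></root>"));
    }
}
```

Any other SAX2 event source could replace the XMLReader obtained here, as long as it delivers events to the same ContentHandler.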

Customizing constructors

RelaxNGCC uses the constructors of the generated classes for its own purposes, so they cannot be customized through c:java-body elements. To add code that runs when an object is instantiated, use an instance initializer:

<define name="foo">
  <c:java-body>
    {
        // add your code here
        System.out.println("initializer");
    }
  </c:java-body>
  ...
</define>
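
The reason an instance initializer is a workable substitute is that Java runs it on every instantiation, before the body of whichever constructor is invoked. The sketch below demonstrates this ordering with a hypothetical Handler class standing in for a generated class; it is not RelaxNGCC output.

```java
class InitializerOrderSketch {

    static final StringBuilder LOG = new StringBuilder();

    // Hypothetical stand-in for a generated class whose constructor
    // is owned by the tool and cannot be edited.
    static class Handler {
        // Instance initializer: the kind of block a c:java-body can contribute.
        {
            LOG.append("initializer;");
        }

        Handler() {
            LOG.append("constructor;");
        }
    }

    // Creates one Handler and returns the order in which the blocks ran.
    static String run() {
        LOG.setLength(0);
        new Handler();
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "initializer;constructor;"
    }
}
```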

3 Restrictions

There are grammars that RelaxNGCC cannot handle. Specifically, it cannot handle a grammar that requires look-ahead, such as the following:

 <choice>
   <group>
     <element name="a"><text/></element>
     <c:java>System.out.println("a0");</c:java>
     <element name="x"><text/></element>
   </group>
   <group>
     <element name="a"><text/></element>
     <c:java>System.out.println("a1");</c:java>
     <element name="y"><text/></element>
   </group>
 </choice>

In the above sample, the correct branch of the choice pattern cannot be selected until the next element is read, because both children of the choice start with the a element. To avoid this, the grammar needs to be rewritten as follows:

<group>
 <element name="a"><text/></element>
 <choice>
   <group>
     <element name="x"><text/></element>
     <c:java>System.out.println("a0");</c:java>
   </group>
   <group>
     <element name="y"><text/></element>
     <c:java>System.out.println("a1");</c:java>
   </group>
 </choice>
</group>

If a grammar violates this restriction, RelaxNGCC issues a warning message. Note that some grammars cannot be rewritten in an unambiguous way.

In formal-language terms, if the given RELAX NG grammar is interpreted as a context-free grammar in which every SAX event is a terminal symbol, RelaxNGCC can handle it only if that grammar is LL(1).
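
To make the restriction concrete, the following sketch checks whether two alternatives of a choice begin with the same event. It is a deliberately simplified illustration, not the analysis RelaxNGCC performs: each alternative is modelled as a flat list of hypothetical event labels such as "start:a", text events are left out, and a full LL(1) test would also have to account for optional and repeated patterns.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LookaheadCheckSketch {

    // Returns true when two alternatives of a choice begin with the same
    // event, so that a single event of input cannot select the branch.
    static boolean hasFirstEventConflict(List<List<String>> alternatives) {
        Set<String> seen = new HashSet<>();
        for (List<String> alternative : alternatives) {
            if (alternative.isEmpty()) {
                continue; // empty alternatives are out of scope for this sketch
            }
            if (!seen.add(alternative.get(0))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The two groups of the problematic choice: both start with element a.
        List<List<String>> original = Arrays.asList(
            Arrays.asList("start:a", "end:a", "start:x", "end:x"),
            Arrays.asList("start:a", "end:a", "start:y", "end:y"));
        // The choice after rewriting: element a has been factored out.
        List<List<String>> rewritten = Arrays.asList(
            Arrays.asList("start:x", "end:x"),
            Arrays.asList("start:y", "end:y"));

        System.out.println(hasFirstEventConflict(original));  // true
        System.out.println(hasFirstEventConflict(rewritten)); // false
    }
}
```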

4 Unsupported Features

RelaxNGCC does not support the following features of RELAX NG. Hopefully they will be implemented in a future version.

