Presentation Syntax for OWL-S

Author:
Drew McDermott, Yale University
** DRAFT August 7, 2006 **
[[ The grammar described here is so far for human consumption. It has not been used to actually parse anything, and no doubt contains many bugs. Translation to RDF/XML has not been dealt with at all, and, again, many complexities are surely out there lying in wait. ]]

Abstract

This document presents a grammar for the OWL-S "presentation syntax." The grammar is represented as a set of operators with precedences, rewrite rules, and pattern checkers.

1 Goals

The original syntax for processes in OWL-S is somewhat unwieldy, to put it mildly. The main design constraint was that it be RDF; but the job it is doing, describing processes in all their intricate detail, is not really suitable for RDF, which is basically a metadata formalism. Therefore, we are moving toward a more readable syntax, which can nonetheless be translated into the RDF notation when required. Our syntax includes N3 [<>] as a subset, then adds what we hope is a natural algebraic syntax for expressing variable declarations and control structures.

[[ Big internalization issue: Uniquifying variable names in a file. ]]

Here is an example of a process definition, drawn from the Congo domain [[ ]], which involves a book-selling service. Like the RDF/XML files it will be translated into, an OWL-S presentation-syntax file begins with namespace declarations, and ours is no exception:

<<Define congo-namespaces [congo, line 3]
with-namespaces
       (uri"http://www.daml.org/services/owl-s/1.2/CongoProcess.owl",

        profileHierarchy: "http://www.daml.org/services/owl-s/1.2/ProfileHierarchy.owl") 
>>

It is not necessary to include any ontological definitions in an OWL-S file, but it's often quite convenient. We use a superset of the N3 notation:

<<Define congo-ontology [congo, line 22]
  {

   ontology
    {
      ontology_version
          "1.88 2006/01/30 07:52:44" .

      imports time, service, process, profileHierarchy,
              uri"http://www.daml.org/rules/proposal/swrl.owl" .
      /**

      A B2C bookbuying example of OWL-S (Web Ontology Language
      for Services; see http://www.daml.org/services/owl-s) usage,

      */    



     shippedTo - owl:ObjectProperty;
               rdfs:domain Shipment;
               rdfs:range :AcctID .

     shippedBook - owl:ObjectProperty;
               rdfs:domain Shipment;
               rdfs:range profileHierarchy:Book .

     deliveryType - owl:ObjectProperty;
               rdfs:domain Shipment;
               rdfs:range DeliveryType .
     
     packagingType - owl:ObjectProperty;
               rdfs:domain Shipment;
               rdfs:range PackagingType .

     Shipment - owl:Class;
              rdfs:subClassOf [ a owl:Restriction;
                                owl:onProperty :shippedTo;
                                owl:cardinality 1 ],
                              [ a owl:Restriction;
                                owl:onProperty :shippedBook;
                                owl:cardinality 1 ],
                              [ a owl:Restriction;
                                owl:onProperty :deliveryType;
                                owl:cardinality 1 ],
                              [ a owl:Restriction;
                                owl:onProperty :packagingType;
                                owl:cardinality 1 ] .

     cardNumber - owl:DatatypeProperty;
            rdfs:domain CreditCard;
            rdfs:range xsd:decimal .

     cardType - owl:ObjectProperty;
            rdfs:domain CreditCard;
            rdfs:range CreditCardType .

     cardExpiration - owl:DatatypeProperty;
            rdfs:domain CreditCard;
            rdfs:range xsd:gYearMonth . 

     validity - owl:ObjectProperty;
            rdfs:domain CreditCard;
            rdfs:range ValidityType .

     CreditCard rdfs:subClassOf [ a owl:Restriction;
                                  owl:onProperty cardNumber;
                                  owl:cardinality 1 ],
                                [ a owl:Restriction;
                                  owl:onProperty cardType;
                                  owl:cardinality 1 ],
                                [ a owl:Restriction;
                                  owl:onProperty cardExpiration;
                                  owl:cardinality 1 ],
                                [ a owl:Restriction;
                                  owl:onProperty validity;
                                  owl:cardinality 1 ] .

     FailureNotification enum { NotifyBookNotFound NotifyBookOutOfStock } .

     OrderShippedAcknowledgment - owl:Class .

     ExpressCongoBuyOutputType - owl:Class;
            owl:unionOf ( OrderShippedAcknowledgment 
                          FailureNotification ) .

     FullCongoBuyOutputType - owl:Class;
            owl:unionOf ( OrderShippedAcknowledgment 
                          FailureNotification ) .

     LocateBookOutputType - owl:Class;
       /** If successful, LocateBook returns an ISBN;
           otherwise, it returns a FailureNotification. */
        owl:unionOf ( profileHierarchy:ISBN FailureNotification ) .

     InStockBook - owl:Class;
        /** The set of books that are found in stock of Congo. */
         rdfs:subClassOf profileHierarchy:Book .

     OutOfStockBook - owl:Class;
       /** The set of books that are NOT found in stock of Congo. */
         rdfs:subClassOf profileHierarchy:Book;
         owl:disjointWith InStockBook .

     hasBook - owl:ObjectProperty;
      /** Inverse of hasISBN property, i.e. relate an ISBN number
          to a book. */
         owl:inverseOf profileHierarchy:hasISBN .

     SignInData - owl:Class . 

     acctName - owl:DatatypeProperty;
       rdfs:domain SignInData;
       rdfs:range xsd:string .

     password - owl:DatatypeProperty;
       rdfs:domain SignInData;
      rdfs:range xsd:string .

     UserProfileInfo - owl:Class .

     AcctInfo - owl:Class;
          owl:unionOf ( SignInData UserProfileInfo ) .

     AcctID - owl:Class .

     hasAcctID - owl:FunctionalProperty;
            /** An account exists if there is an account ID associated
               with it. */
          rdfs:domain AcctInfo;
          rdfs:range AcctID .

     NonExistingAcct - owl:Class;
          /** If there is no account ID for an account then that
             account simply does not exist */
      owl:intersectionOf ( AcctInfo 
                           [ a owl:Restriction;
                             owl:onProperty hasAcctID;
                             owl:maxCardinality 0 ] ) .



     }
>>
[[ Need to omit more; start with stuff for full CongoBuy only. ]] The principal enhancement to the notation is to allow constants to be declared using a hyphen between the name of the constant and its "type." Obviously, in OWL objects can belong to more than one class, but type checking is quite useful when verifying that a variable is being used properly. If there is no reason to single a particular class out as "the" type of an object, one can of course leave it undeclared or declare it to be of type owl:Thing. An undeclared constant must be referred to using the usual N3 notation, with a prefixed colon. In N3 the name :foo is conventionally an abbreviation for <#foo>; in OWL-S it is not, because the angle-bracket notation (<#foo>) is not supported, so the prefix colon is the only way to refer to an undeclared global constant. This is consistent with the general approach to namespaces described in section 2.2.

Now we come to an actual process definition:

<<Define ExpressCongoBuy [congo, line 283]
    define atomic process ExpressCongoBuy
      /**
         This is an express "one shot" service for buying a book
         with Congo.  It takes as input a Book ISBN number and
         the customer's sign-in information, and has the effect of
         ordering the book if the book is in stock.  In this case,
         the output of the service is a message saying the book 
         is ordered.  If the book is not in stock, the output says 
         that the book is out of stock.
      */
          (inputs: (ExpressCongoBuyISBN - profileHierarchy:ISBN,
                    ExpressCongoBuySignInInfo - SignInData,
                    ExpressCongoBuyCreditCardNumber - xsd:decimal,
                    ExpressCongoBuyCreditCardType - CreditCardType,
                    ExpressCongoBuyCreditCardExpirationDate - xsd:gYearMonth),

           exists: (ExpressCongoBuyAcctID - CustomerAcctID,
                    /** This is a local variable that represents the associated 
                        account ID for the given sign-in info. The existence of 
                        this value indicates that the sign-in info is valid and
                        an account exists. */
                    ExpressCongoBuyCreditCard - CreditCard
                    /** This is a local variable that represents the associated 
                      credit card for the given card number. */
                    ),
           precondition: (hasAcctID(ExpressCongoBuySignInInfo, ExpressCongoBuyAcctID)
                          & credit-card-number(ExpressCongoBuyCreditCard,
                                               ExpressCongoBuyCreditCardNumber)
                          & validity(ExpressCongoBuyCreditCard, Valid)),
                /** Typically this condition should also say that
                   given credit card type and expiration date is also
                   correct for the credit card. Those details are left
                   out for this example. */

           outputs: (ExpressCongoBuyOutput - ExpressCongoBuyOutputType),

           result: (forall (ExpressCongoBuyBook - profileHierarchy:Book)
                      /** book identified by the given ISBN input */
                      (hasISBN(ExpressCongoBuyBook, ExpressCongoBuyISBN)
                       & inStockBook(ExpressCongoBuyBook)
                       |->
                        /** If the book is in stock, then the result is that the
                            order was shipped and an appropriate acknowledgment
                            is output. */
                          N3$${ ExpressCongoBuyOutput 
                                    a orderShippedAcknowledgment }
                           & exists(ExpressCongoBuyShipment - Shipment)
                                    /** shipment that gets generated on
                                        a successful purchase */
                              (shippedTo(ExpressCongoBuyShipment, ExpressCongoBuyAcctID)
                               & shippedBook(ExpressCongoBuyShipment,
                                             ExpressCongoBuyBook)))
                      &
                      (hasISBN(ExpressCongoBuyBook, ExpressCongoBuyISBN)
                       & N3$${ ExpressCongoBuyISBN
                                a [ owl:Restriction;
                                    owl:onProperty hasBook;
                                    owl:allValuesFrom outOfStockBook ] }
                       /** If the book is out of stock, then the result is simply
                          that an appropriate acknowledgment is output, indicating 
                          that the book is out of stock. */
                       |-> output(ExpressCongoBuyOutput
                                     <= notifyOutOfStock))))
>>
An atomic process is one that is defined entirely in terms of the values it returns and the effects that it has, without any mention of its subevents. (At the time scale of a typical web-service interaction, any such subevents can be neglected.) Modeling a service as atomic in this way is closely analogous to the way web services are modeled in WSDL [<>], and in fact that's why many web services can be described in OWL-S and grounded in WSDL.

The inputs and precondition fields of a definition describe what goes into a process, and the outputs and results fields describe what comes out. (The exists field is used to bind variables [[ and will be described later ]].) What the process definition says is that if the inputs satisfy the type descriptions and the preconditions are true, then the process will produce outputs of the prescribed types and the result will "occur." The prototypical result field is a proposition that will become true (e.g., shippedTo(AcctID)), but these effects can be gated by conditions (e.g., inStockBook(book)), and some result fields specify output values instead of propositions (e.g., output(Status <= orderShippedAcknowledgement)).

In addition to atomic processes, we also have composite processes, which have bodies describing what other processes must be carried out in order for the process to be consummated.

In what follows I will specify the grammar for the ontology and process definitions (sect. 2) and display some sample syntactic analysis provided by this grammar (sect. 3). The entire grammar is found in this document, laid out in what I hope is a convenient order for human consumption. What isn't covered in the body of the document is covered in Appendices A, B, and C.

This document consists of pieces of code intermixed with text, in a "literate programming" style [Knu84]. Some of the code comes from our CongoBuy example, but in addition we present the entire grammar in this format. [[ See [McD04b], appendix 1, for a brief explanation of this incarnation of literate programming. ]]

2 Syntax

The OWL-S grammar is expressed in a somewhat nontraditional form, and to explain it will require a digression on the topic of syntax. [[ The syntax notation is for the Lexiparse parser [McD04b]. ]] Although human languages are more complex, the formal languages used for programming, logic, and OWL-S process definition are fundamentally quite simple. Every expression can be thought of as a tree structure whose leaves are the lexemes of the language. (We leave aside for the moment how to think about the structure of those lexemes.) It is convenient to assume that every nonleaf tree structure has an "operator" lexeme that sits at the top. The most obvious examples are arithmetic expressions such as "a * b + c," which can be diagramed as the tree shown in the Figure. We will draw our tree structures using variations of the parenthesis notation apparently first invented by Noam Chomsky [<>]. This one can be drawn (+ (* a b) c). The operator of the tree is "+"; its left subtree is itself a tree, with operator "*".

Figure: Classical Syntax Tree
+
  *
    a
    b
  c

We will be using two sorts of parenthesized tree expression: the one just exemplified, which is of course the notation used by the programming language Lisp [<>]; and a variant in which the symbol ":^+" comes between the open paren and the operator. The reason is that we will be using Chomsky-tree-bracket notation for two purposes: to express the rules of the grammar, and to describe the trees of the language defined by the grammar. For the first purpose we will use traditional Lisp notation, and for the second we would write our example tree as (:^+ + (:^+ * a b) c). (The "Tinkertoy" symbol is meant to resemble a treelet with two subnodes.)

The Lexiparse formalism is different from most in that it is cast in terms of precedences and checking rules rather than phrase-structure rules[]. Explicit phrase-structure rules are unnecessary because they fall into a few stereotyped classes that are captured more succinctly using other information. Every operator token can appear in infix position or prefix position or both, depending on whether an operand is allowed to appear to its left. For each of the two positions, its behavior is defined in terms of at most four features:

  1. Numargs: How many arguments it expects to its right (an interval [min, max]).
  2. Left precedence: How tightly it binds an operand to its left (unnecessary in prefix position).
  3. Right precedences: How tightly it binds operands to its right (unnecessary if max numargs = 0).
  4. Bracket: Whether it is an open or close bracket, and which brackets it matches.
Postfix operators are just a special case of infix operators whose numargs = [0,0]. Postfix operators that take more than one argument to their left are impossible. The notation for specifying these things, and a couple of other items of less import, will become clear as we go.
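
To make these features concrete, here is a hedged sketch of how ordinary infix arithmetic operators might be declared. The actual declarations live in the standard-arith-syntax parent grammar and are not shown in this document, so the token names (plus, times) and the precedence values are illustrative only; the declaration form itself, def-syntactic, is introduced below:

   (def-syntactic plus :lex "+" :infix (:precedence 180))
   (def-syntactic times :lex "*" :infix (:precedence 190))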

These features suffice to define a deterministic grammar that can be parsed rapidly, transforming any string of lexemes into a parsetree without further ado. From that point on, we express our grammar in terms of transformations that clean up and filter parsetrees, producing error messages for the "defective" ones that don't pass the filters. The grammar rules are written in a Lisp-style notation, and the parsetrees they refer to are written using "Tinkertoy" notation.

The higher the precedence of a token, the "tighter" it binds to the expressions adjacent to it, which means that high-precedence tokens tend to head small(er) subtrees of the large(r) parsetrees headed by low-precedence tokens. The tokens near the root of the tree are those with the lower precedences. (This might be one context where computer scientists should draw their trees with the leaves up!) All grouping decisions can be made purely locally. Suppose the parser has the sequence A op B in hand, where A and B are trees and op is an operator. The question is whether to build the tree (:^+ op A B), or to keep looking for material to the right of B that could be used to augment the second operand of op. If there is an operator to the right of B, and its left precedence is higher than op's right precedence, the parser augments B; otherwise, it doesn't. Example: Having seen x + y, if the next lexeme is *, then the parser will augment y, so that it may well wind up with a tree that looks like (:^+ + x (:^+ * y ...)). If the next lexeme is absent (we're at the end of the stream), or it's a close bracket (whose left precedence is 0, and therefore much lower than any normal operator), or it's "-" (whose left precedence is the same as +'s right precedence, which, let's say, is 180), then the parser will build (:^+ + x y). Of course, the parser must then decide whether to incorporate that into a larger tree (such as (:^+ - (:^+ + x y) ...)), depending on exactly the same considerations as before, but now the sequence A op B has A = (:^+ + x y), op = -, and B = whatever's to the right of -.
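
To make the outcome concrete (assuming, as in the paragraph above, that * binds tighter than + and -, and that + and - share the same precedence), the two cases just discussed end up as the following trees:

   x + y * z   yields   (:^+ + x (:^+ * y z))
   x + y - z   yields   (:^+ - (:^+ + x y) z)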

This is all well and good, but there are many expressions that don't consist of an application of an operator to one or two arguments. In OWL-S, how do we handle something as complex as define, which expects a bunch of keywords to the right, followed by a complex structure of inputs, results, and such? Each case must be handled in its own way, but the idea is to define keywords such as process or atomic as operators themselves, so that the parsetree headed by define contains subtrees headed by these "sub-operators." The syntax of define then consists of its basic precedence information plus operations for tidying and checking a tree headed by an occurrence of it. Checking a tree means looking for illegal patterns; tidying occurs first, in order to clean up the structure produced by the basic precedence grammar. These are embodied in rule sets called the tidiers and checkers for define.

What this means is that to read a Lexiparse grammar you look for the patterns that are expected to occur under an operator. These are not found on the right-hand sides of grammar rules, but inside the tidiers and checkers for each operator. The language of checkers in particular contains many features that go far beyond what can be expressed with context-free grammars.

A useful example is the classic if-then-else, which we'll present here even though it belongs much later, in the section on control constructs in composite processes. Intuitively, <if> is an operator with two or three arguments: a test, something to do if the test is true, and, optionally, something to do if the test is false. We capture that by making <else> into an infix operator with a higher precedence than <then>, so that the three-argument case has a subtree of the form (:^+ then (:^+ else iftrue iffalse)).

<<Define then-else-syntax

   (def-syntactic then :reserved true
                       :prefix (:precedence 87 :numargs 1))

   (def-syntactic else :reserved true
                       :infix (:precedence 88)) >>
The field ":reserved true" means that this token is derived from occurrences of the symbol itself, and not from some character sequence (as <left-paren> is derived from '('). The lexical level of the language is dealt with in appendix  B

Now we define if thus:

<<Define if-syntax

   (def-syntactic if :reserved true
                     :prefix (:numargs 2 :precedence 85)
      :tidier (( (:^+ if ?test (:^+ then (:^+ else ?iftrue ?iffalse)))
                !~>>- (:^+ if ,test ,iftrue ,iffalse))
               ( (:^+ if ?test (:^+ then ?iftrue))
                !~>>- (:^+ if ,test ,iftrue))
               (:else
                (defect "'if' has no 'then' part")))) >>
The first clause establishes that the precedence of the token if is 85, which places it above else in any parsetree. (It doesn't compete with prefix operators such as then.) The :tidier section is a list of rules, each a pair of patterns, possibly followed by a final :else clause:
      ( ( (:^+ ...) !~>>- (:^+ ...) )
        ( (:^+ ...) !~>>- (:^+ ...) )
        (:else ...)
       )
(but we normally use fewer spaces and more newlines + indentation). On the left side of a "!~>>-" rule, question marks indicate chunks of a parsetree to extract and name. On the right side, commas indicate items to plug into a parsetree being built. We use standard notations for pattern matching and expression construction. The former are derived from logic programming and the latter from Lisp "quasi-quote" or "backquote." These are not regular expressions, mainly because we are not matching strings. The precedence grammar has given us crisply organized structures to match against, not saggy strings. A match variable of the form ?var matches a single item in a parsetree and sets var to it. These are also called qvars. A qvaroid is an expression where the question mark is followed by a parenthesized list, beginning with a symbol. What it matches is governed by that symbol. The qvaroid ?(:^+ op –subs–) matches a parsetree with operator op and subtrees subs. (The "subtrees" are not always parsetrees themselves; they can be occurrences of symbols, strings, and other lexical primitives.) The pattern ?(:^+ if ?test (:^+ then (:^+ else ?iftrue ?iffalse))) matches any parsetree of the form (:^+ if __ (:^+ then (:^+ else __ __))) and binds the variables test, iftrue, and iffalse to the pieces filling in the blanks. (If the left-hand side is a parsetree, as it usually is, the question mark in front of it may be omitted, and question marks may be omitted for parsetree patterns occurring inside other parsetree patterns, but in other contexts they are mandatory.)

On the right side of a "!~>>-" rule, commas indicate where the pieces of a newly constructed parsetree are to come from. So (:^+ if ,test ,iftrue ,iffalse) creates a new parsetree with a "flatter" structure of three subtrees, respectively the values of the variables test, iftrue, and iffalse. Commas may also be used in conjunction with question marks. In a match pattern, ?,exp matches the value of exp. Note that repeated occurrences of a match variable do not refer to the same value unless there is a comma. (foo ?x ?,x) matches a foo-headed expression with three subexpressions, the second and third of which must be the same; but (foo ?x ?x) matches any three-element foo-headed expression, setting x first to the second subexpression, then to the third.
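
Putting the two sides together: the (hypothetical) input if c then a else b is first parsed, using the precedences above, into the nested tree

   (:^+ if c (:^+ then (:^+ else a b)))

which the first tidier rule then rewrites as

   (:^+ if c a b)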

After either a question mark or a comma, an atsign indicates that a series of elements is to be matched or substituted. ?@v matches a series of parsetrees and stores them (as a list) in v, and ,@exp substitutes the value of the list exp (as a series) into the parsetree being built. Only one ?@ may occur in a single list. That doesn't mean in the entire pattern, but just within one list in the pattern. The ,@ construct has no analogous limitation. We will see examples of the atsign constructs below.

If neither transformation rule in the tidier applies, then the :else clause kicks in. The expression (defect ...) is obviously not a transformation rule. Its arguments consist of strings and expressions. The values of the expressions are concatenated with the strings to make a description of a defect, or "flaw," in the syntax tree. These defects are accumulated, and in the end produce error messages about an unsuccessful parse.

The <if> construct is simple enough that it needs no separate checker. If it had needed one, it would have been defined with a separate :checkers clause in def-syntactic. We will see examples of this shortly.

Here's another key example, the definition of left-parenthesis:

<<Define left-paren

   ;; Eliminate commas from parenthesized expressions
   (def-syntactic left-paren :lex "("
                             :prefix (:right-precedence 0
                                      :bracket (:open right-paren))
                             :infix (:left-precedence 200 :right-precedence 0
                                     :bracket (:open right-paren))
      :tidier
         (( (:^+ left-paren [*_] (:^+ comma ?@subs))
           !~>>- (:^+ group [*_] ,@subs))
          ( (:^+ left-paren [*_] ?e)
           !~>>- e)
          ( (:^+ left-paren ?f [_*_] (:^+ comma ?@args))
           !~>>- (:^+ fun-app ,f [_*_] ,@args))
          ( (:^+ left-paren ?f [_*_] ?arg)
           !~>>- (:^+ fun-app ,f [_*_] ,arg)))) >>
Left parentheses can occur as infix or prefix operators. In either case a left paren is (big surprise) a :bracket closed by a right paren; its right precedence of 0 causes all other operators to bind tighter, until a right bracket is seen, with left precedence of 0. The resulting parsetrees are then tidied using four rules. The first two contain the symbol [*_], indicating that they apply only to prefix occurrences of <left-paren>; the second two contain [_*_], which indicates that they match with infix parsetrees only. What the rules do depends on whether the main operator inside the parentheses is a <comma> or not. Because <comma> is a flattening operator, it can dominate a parsetree with an indefinite number of subtrees. Hence we use "?@" as a prefix on subs and args in our match patterns, and ",@" as prefix on the same variables in our substitution patterns. In those cases, the <comma>s are removed.
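
For instance (with a, b, and f standing for arbitrary subexpressions), a prefix occurrence (a, b) and an infix occurrence f(a, b) are tidied as follows:

   (:^+ left-paren [*_] (:^+ comma a b))      becomes  (:^+ group [*_] a b)
   (:^+ left-paren f [_*_] (:^+ comma a b))   becomes  (:^+ fun-app f [_*_] a b)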

What makes <comma> a flattening operator is that its syntax is declared thus:

<<Define comma [arithgram, line 55]
(def-syntactic comma :lex "," :infix (:precedence 100 :flatten true))
>>
This declaration ensures that any parsetree of the form
   (:^+ comma (:^+ comma a b) c)
will be converted to
   (:^+ comma a b c)

A prefix paren not followed by a comma is redundant (having served its purpose of overriding normal precedence relationships) and it is removed (by the second tidier clause). In all other cases the <left-paren> token is replaced; prefix <left-paren> becomes the token <group> and infix <left-paren> becomes <fun-app> (short for "function application"). In what follows, tidiers will often look for occurrences of <group>, and it's useful to bear in mind where they came from.

One thing that may not be clear is that inside a pair of brackets there must always be a single expression. Typically some flattening connective, such as <comma> in the case of ordinary parentheses, is used to bundle together multiple expressions. As we will see later, if there is no explicit connective, Lexiparse will insert one.

The discussion in the rest of this section roughly follows the order shown in the tables "Right Precedence" and "Left Precedence", from high-level, low-precedence tokens to low-level, high-precedence tokens. Before continuing with the discussion, it might be useful to stop and scan these tables, which show the overall precedence structure of OWL-S. An infix operator (such as <left-paren> in some contexts) has both a left and right precedence. The former determines how well it competes with tokens to its left, the latter, how well with those to its right.

Table 2: OWL-S operators in order of increasing ...

   Right precedence:
      Op    Left    Right
      **    1000    -1000

   Left precedence:
      Op    Left    Right
      **    1000    -1000

Later, in Table Precedence Array, we show the "cross product" of these two tables, indicating more clearly which tokens tend to occur below which other tokens.

2.1 Upper-Level Syntax: Process Name Declarations

We start by declaring that we're constructing a grammar, called owl-s:

<<Define grammar-def

(def-grammar owl-s
    :top-tokens (ontology define with_namespaces)
    :lex-parent standard-arith-syntax
    :syn-parent standard-arith-syntax
    :sym-case #-allegro :up #+allegro (allegro-case-choice)) >>
The :top-tokens declaration indicates that a legal process definition in OWL-S must be headed by one of the tokens ontology, define, or with_namespaces. (This last one is actually the most likely; it's explained in Section 2.2; ontology is discussed in section 2.6.)

The rest of the grammar consists of definitions of the tokens of the language, starting with those that appear higher (closer to the root) in parsetrees, which are (more or less) the same as the :top-tokens. We start with <define>.

A process-model definition consists of a sequence of process definitions, each atomic, simple, or composite. All three stem from the <define> operator.

<<Define process-definition

   (def-syntactic define :reserved true
                         :prefix (:precedence 5 :numargs 1)
      :tidier
      (( (:^+ define (:^+ composite 
                          ?(:^+ process ?name ?@iopr-spec)
                          ?@body-spec))
        !~>>- (:^+ define composite ,name (:^+ iopr ,@iopr-spec) ,@body-spec))

       ( (:^+ define (:^+ atomic (:^+ process ?name ?@iopr-spec)))
        !~>>- (:^+ define atomic ,name (:^+ iopr ,@iopr-spec) nil))

       ( (:^+ define (:^+ simple (:^+ process ?name ?@iopr-spec)))
        !~>>- (:^+ define simple ,name (:^+ iopr ,@iopr-spec) nil))

       (:else (defect "Unintelligible 'define' expression")))
      <<Insert: process-definition-checkers>>>>

The tidier for <define>, like <if>'s, flattens the structure produced by the operators <atomic>, <composite>, etc. (which are defined later). Now we add in a :checkers field, which specifies various validity tests for the parsetrees headed by <define>:

<<Define process-definition-checkers

      :checkers
       (
        ((:~ ?(:^+ define ?(:|| atomic composite simple)
                          ?(:+ ?name is-name)
                          ?@_))
         (defect "Process has illegal name: " name))

        (:up *
          ((:~ ?(:^+ ?(:|| with-namespaces left-brace) ?@_))
           (defect "'define' can appear below only 'with-namespaces'"
                   " or 'left-brace'")))
       )) >>

Note that checkers is a list of (in this case, two) items. Each specifies a test to be run, followed by defect generators or further checkers. That is, each checker is of the form
    (pattern –defects-or-checkers–)
If the pattern matches the current parsetree, then the defects-or-checkers specify what happens. Expressions of the form (defect ...) create defects, and anything else is interpreted as another checker to be run. If a checker has a pattern followed by nothing, a standard defect of the form "Doesn't match this pattern" is created. If the checker is instead supposed to signal "no defect," one just writes :okay after the pattern.

Some new features of the pattern language appear at this point. They are summarized in table 2-2.

We adopt the convention that the first checker gives the overall shape of the construct headed by the syntactic operator being defined. It thus plays the role of the traditional phrase-generation rules. (If there are no checkers, then the tidier provides the overall shape.) The meaning of the first checker may be glossed as, "If the parsetree does not start with atomic, composite, or simple, followed by a name and possibly some other stuff, then report the defect [Process has illegal name: n], where n is the offending second child." (The reason we can be sure that this is the proper defect is that the tidiers have already checked for the presence of atomic, composite, or simple. The only reason to repeat them is to fulfill our convention.)

Both tidiers and checkers are run bottom-up, but tidiers are run as soon as each sub-parsetree is first constructed. Checkers are run only after all grouping and tidying is finished for the entire parsetree; in addition, for a given sub-parsetree, the checkers are run only if the tidying phase found no defects. A consequence of this design is that the checkers can look up the tree as well as down.

We don't have to wait long for an example; the second checker for define has a pattern of the form (:up * –checkers–), which indicates that all the ancestors of the current parsetree are to be checked using the given checkers. (If "1" had appeared instead of the asterisk, then only its parent would have been checked.) In this case the checker limits define to appearing either at the top level or below a with_namespaces, possibly with some enclosing curly braces.

The syntactic rules above introduce the acronym IOPR, which stands for "Inputs, Outputs, Preconditions, and Results" — all the types of entities that define the external view of the process. For atomic and simple processes, the external view is all there is; for composites, the body-spec provides a view of the inner works of the process.

We analyze <atomic>, <simple>, and <composite> as simple prefix operators, followed by a process-spec. See appendix B.

Table 2-2: Match codes used by Lexiparse
?_ Matches anything
?(:~ p) Matches only if p does not
?(:& p1 ...pK) Matches only if each pI does
?(:|| p1 ...pK [:& p0]) Matches only if one of the pI and, if present, p0 match
?(:+ p –tests–) Matches if p matches and the item in question satisfies the tests.
?(:* p) Matches a list of zero or more items that match p

2.2 Namespaces

Before presenting the syntax of processes, I digress to talk about namespaces, on the grounds that processes and ontologies almost always occur inside namespace declarations. In the RDF/XML representation, namespaces are treated using the standard XML namespace notations, that is, as attributes whose names start with xmlns. In the presentation syntax, we must supply a special notation:

<<Define namespace-spec

   (def-syntactic with_namespaces :reserved true
                  :prefix (:precedence 15 :numargs 2)
      :tidier
         (( (:^+ with_namespaces ?spec ?e)
           !~>>- (:^+ with_namespaces
                        ,e ,@(tidy spec namespaces-flatten)))))

   <<Insert: namespaces-flatten>>>>
A named tidier is a packaged tidier that can be invoked from multiple contexts. The construct (tidy x tidier) denotes the result of applying the tidying rules in tidier to x. In this case, instead of a rule set we have a named tidier, namespaces-flatten. We'll get to its definition shortly. Its purpose is to "flatten" the namespace list that occurs right after <with_namespaces>, i.e., to move it up a level and put it after the body of the <with_namespaces> expression in the tidied parsetree. This format is slightly easier for internalizers to handle. Before presenting it, we discuss the syntax of colons, which are obviously relevant to namespaces.

When <colon> occurs as an infix operator, it's part of a qualified name, like owl:Class. Infix occurrences are replaced with the more mnemonic <name-wrt-space>. It also occurs as a prefix operator. These occurrences are not renamed, but deleted, as explained in section 2.3. Either way, there should be no <colon>s in a parsetree after it is tidied, which explains the checker in the following syntax definition:

<<Define colon

   ;; Note that colon is not a reserved word, but comes from
     ;;  occurrences of ":"
   (def-syntactic colon :lex ":" :infix (:numargs 1 :precedence 210)

                         ;; -- namespace construct, with high precedence
                         :prefix (:numargs 1 :precedence 120)
                         ;; -- Occurs only after inputs, outputs,
                         ;; preconditions, effects
                         ;; Low precedence, but higher than comma
      :tidier
         (( (:^+ colon ?namespace [_*_] ?name)
           !~>>- (:^+ name-wrt-space ,namespace ,name)))

      :checkers
         (( (:^+ colon ?@_)
           (defect "Stray colon"))))

;;; Because infix <colon> is replaced by the "artificial" token
;;; <name-wrt-space>, we have to use 'def-checkers' to define its
;;; checkers --

   (def-checkers name-wrt-space
      ( ?(:^+ name-wrt-space ?namespace ?name))

      ( ?(:~ ?(:^+ name-wrt-space ?(:+ ?namespace is-Symbol) ?_))
       (defect "Namespace must be a non-nil symbol: " namespace))
     
      ( ?(:~ ?(:^+ name-wrt-space ?_ ?(:+ ?name is-Symbol)))
       (defect "Name must be a symbol: " name))) >>
The checkers for (:^+ name-wrt-space space name) verify that space and name are symbols. The usual expectation is that space is bound by an outer with_namespaces expression, but bindings are not checked until expressions are internalized. And, of course, in the with_namespaces expression itself, the meanings of the name and space are completely different.
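
For instance, the qualified name owl:Class parses as the infix tree

   (:^+ colon owl [_*_] Class)

which the tidier rewrites as

   (:^+ name-wrt-space owl Class)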

We can now turn to the named tidier used to tidy with_namespaces expressions:

<<Define namespaces-flatten

   (def-named-tidier namespaces-flatten
      ( (:^+  uri ?s)
       !~>>- ((:^+ namespace+name ,false  (:^+ uri ,s))))
      ( (:^+ ?(:|| equals name-wrt-space)
             ?(:+ ?sym is-Symbol)
             (:^+ uri ?s))
       !~>>- ((:^+ namespace+name ,sym (:^+ uri ,s))))
      ( (:^+ group ?@specs)
       !~>>- (all-tidy specs 
                (use-named-tidier namespaces-flatten)))
      (:else
        (defect "Illegal namespace spec " spec))) >>
It has the same syntax as a regular tidier, although those not intimately familiar with Lisp notation may think there is a missing open parenthesis. It differs from a regular tidier in that it can be used in more than one place, so you may want to go back to the definition of with_namespaces to remind yourself that in that context it is applied to the list of namespace prefixes that occurs as its first argument. The key purpose of namespaces-flatten is revealed by its third clause, which matches against a parsetree of the form (:^+ group ...), which, you'll recall, is what a parenthesized list gets tidied into. If this is the form taken by the expression following with_namespaces, then namespaces-flatten is applied to the subtrees of the group recursively. The construct (all-tidy l tidier) means the result obtained by tidying each element of l using tidier, then appending all the results together. (If a tidier produces a single parsetree, it is treated as a singleton list containing that parsetree.)

The base cases are the first two clauses of namespaces-flatten: either a single URI, representing the default namespace, or expressions of the form prefix:URI. (We also allow prefix=URI, which explains the ?(:|| ...) expression.)
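
Applied to the Congo namespace declaration shown in section 1 (and writing body for whatever expression the declaration governs), the two tidiers together produce, roughly,

   (:^+ with_namespaces body
        (:^+ namespace+name false
             (:^+ uri "http://www.daml.org/services/owl-s/1.2/CongoProcess.owl"))
        (:^+ namespace+name profileHierarchy
             (:^+ uri "http://www.daml.org/services/owl-s/1.2/ProfileHierarchy.owl")))

where the false in the prefix slot marks the default namespace.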

In XML, URIs are represented as strings, and are recognized as URIs by their context. We take a slightly different tack, and supply a syntactic operator uri"..." that indicates a string is to be taken as a URI.

<<Define uri-spec

   (def-syntactic uri :reserved true
                  :prefix (:precedence 210 :numargs 1)
      :checkers
         (( (:~ ?(:^+ uri ?(:+ ?s is-String)))
           (defect "uri followed by non-string " s)))) >>

2.3 Process Syntax

We now return to the syntax of processes, and in particular process itself:

<<Define process-spec

   
   ;; __::= process <name>(-IOPRs-)
   (def-syntactic process :reserved true
                  :prefix (:precedence 25 :numargs 1)  
      :tidier
         ( ( (:^+ process (:^+ fun-app ?name [_*_] ?@subs))
           !~>>- (:^+ process ,name ,@subs))
          <<Insert: anon-process>>
          (:else (defect "'process' must be followed by <name> "
                         "(if not 'as process'),"
                         " then IOPRs in parens")))

      :checkers

         (( (:^+ process ?name ?@(:* ?(:|| inputs outputs exists participants
                                           precondition result))))

          ( (:^+ process ?(:+ ?_ (not is-Symbol)) ?@_)
           (defect "Process with non-symbol as name"))

          ( (:^+ process ?_ ?@ioprs)
           (check ioprs
              (occ-range 0 1 ?(:^+ inputs ?@_))
              (occ-range 0 1 ?(:^+ outputs ?@_))
              (occ-range 0 1 ?(:^+ exists ?@_))
              (occ-range 0 1 ?(:^+ precondition ?@_)))))
) >>
The only remark to make about the tidier for process, besides the fact that it instantiates the common flatten-or-produce-defect pattern, is that it refers to the token <fun-app>, which is what an infix <left-paren> turns into.

The notation (check x –checkers–) means to check x using all the given checkers and to collect all the defects that result. The checker (occ-range lo hi pattern) verifies that the expression being checked is a list, and counts the elements that match pattern. If the number falls outside the range [lo,hi], a defect is issued, of the form [Too few (or too many) subexpressions match pattern].
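
As a schematic example (P and the elided fields are placeholders), a declaration of the form process P(inputs: ..., result: ...) parses with <fun-app> at the top of the argument list,

   (:^+ process (:^+ fun-app P [_*_] (:^+ inputs ...) (:^+ result ...)))

and the tidier flattens it to

   (:^+ process P (:^+ inputs ...) (:^+ result ...))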

There are two kinds of IOPRs, the IOs (parameters) and the PRs (preconditions and results). One key feature of the former is that they are followed by a parameter-declaration list with a distinctive syntax, which we handle by introducing a local grammar. We will look at the grammar first, then reveal how it's activated in the appropriate context.

The new grammar must parse typed variable declarations such as (x,y - String n - Integer), in which the hyphens must have precedence lower than that of the commas.

<<Define typed-var-grammar

(def-grammar typed-var-grammar
    :lex-parent owl-s
    :syn-parent owl-s
    :replace ((minus hyphen))
    :sym-case #-allegro :up #+allegro (allegro-case-choice)

 :definitions
 (
   ;; Hyphens can be generated only by replacement of <minus>
   ;; Precedence puts hyphen above comma in parsetree --
   (def-syntactic hyphen :lex "-" :infix (:precedence 17 :numargs 1)
       :tidier
          ;; Eliminate commas --
          (( (:^+ hyphen (:^+ comma ?@vars) ?type)
            !~>>- (:^+ hyphen ,@vars ,type)))
       :checkers
          (hyphen-checks))
          
   ;; We interpose a named checker because we're going to use it
   ;; again later --
   (def-named-checkers hyphen-checks
       ( ?(:~ ?(:^+ hyphen ?@(:* ?(:+ ?_ is-Symbol))
                           ?(:+ ?_ is-name)))
        (defect "Ill-formed variables and types"))) >>

The grammar typed-var-grammar is declared to be a subgrammar (lexically and syntactically) of the main grammar, owl-s. The clause :replace ((minus hyphen)) means to replace the token <minus> generated by the lexer with <hyphen> when typed-var-grammar is in control. That means that a token derived from "-" while this subgrammar is in control remains a <hyphen> even after control returns to the normal owl-s grammar; it does not start looking like a minus again. The precedence of <comma> is inherited from the OWL-S grammar and is well above <hyphen>'s 17, so <hyphen>s appear above <comma>s in parsetrees, as desired.

The only other connective that requires changing in typed-var-grammar is <left-paren>:

<<Define typed-var-paren

   (def-syntactic left-paren :lex "("
                             :prefix (:left-precedence 200 :right-precedence 0
                                      :bracket (:open || right-paren))))) >>
Here the bracket declaration has an extra parameter, the token with name "||". When an :open declaration is followed by the names of two tokens, the first is the separator and the second is the closer. If the main connective of the expression between the <left-paren> and the <right-paren> is the separator, then the expression is replaced by its arguments. If the separator were, say, <comma> then this would be easier to explain; the idea would simply be that a parsetree of the form (:^+ left-paren (:^+ comma x y z)) would be transformed into (:^+ left-paren x y z). Declaring the separator would save us from writing a tidier of a very common sort.

What's slightly odd about <||> is that it is, shall we say, "lexically unrealized," i.e., not actually there. It's the "virtual" operator, called contiguity, that appears between String and n in (x,y - String n - Integer). As mentioned in the beginning of the Syntax section, inside every pair of brackets must come a single expression. This need to assimilate everything under one operator drives the system to insert <||> whenever it comes to the end of an expression and what follows is not an infix operator. This event occurs after the occurrence of String in the example expression. However, the precedence and other properties of this operator are entirely up to the grammar creator. In this case, by making the precedence of <||> low we ensure that it becomes the main operator for type-variable lists with multiple type declarations, and then gets tidied away by the implicit flattening transformation applied to bracket separators.

<<Define typed-var-contiguity

   (def-syntactic || :lex "" :infix (:precedence 16 :flatten true)
      :checkers
         (( ?(:^+ || ?@_)
           (defect "Stray occurrence of contiguity op")))) >>
(Don't confuse <||> with :||, the qvaroid code meaning "disjunctive match.")
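
Putting the pieces together for the example (x,y - String n - Integer): the sub-declaration x,y - String parses as

   (:^+ hyphen (:^+ comma x y) String)

which the hyphen tidier rewrites as

   (:^+ hyphen x y String)

The <||> inserted between String and n is then flattened away as the bracket separator, leaving the two <hyphen> trees side by side inside the brackets.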

Now we return to the main OWL-S grammar, ready to define the syntax of IOPRs using the local grammar we have just defined. In the :prefix and :infix declarations for operators, we can use a :local-grammar parameter to provide a special grammar for the right-hand argument of an operator. (Actually, we can do this on an argument-by-argument basis, allowing us to provide a different syntax for each right-hand argument. The qualification "right-hand" is due to Lexiparse's left-to-right asymmetry; by the time it sees an infix operator, its left-hand argument has already been parsed. But the possibilities are wide open for the arguments to the right of an operator, whether infix or prefix.)

<<Define param-decls

   
   (def-syntactic inputs :reserved true
                         :prefix (:numargs 1 :precedence 120
                                   :local-grammar typed-var-grammar)
      :tidier
         (use-named-tidier typed-vars-tidier inputs)

      :checkers
         (use-named-checkers typed-vars-checker inputs))

   (def-syntactic outputs :reserved true
                          :prefix (:numargs 1 :precedence 120
                                   :local-grammar typed-var-grammar)
      :tidier
         (use-named-tidier typed-vars-tidier outputs)

      :checkers
         (use-named-checkers typed-vars-checker outputs))

   (def-syntactic exists :reserved true
                        :prefix (:numargs 1 :precedence 120
                                 :local-grammar typed-var-grammar)
      :tidier
         (use-named-tidier typed-vars-tidier exists)

      :checkers
         (use-named-checkers typed-vars-checker exists))

   (def-syntactic participants :reserved true
                        :prefix (:numargs 1 :precedence 120
                                 :local-grammar typed-var-grammar)
      :tidier
         (use-named-tidier typed-vars-tidier participants)

      :checkers
         (use-named-checkers typed-vars-checker participants)) >>
The tidiers for each of these constructs are all essentially the same, so we make use of a single named tidier for all of them, whose name is typed-vars-tidier:
<<Define typed-vars-tidier

(def-named-tidier typed-vars-tidier (sort) :grammar owl-s
   ( (:^+ ?,sort (:^+ colon (:^+ group ?@decls)))
    !~>>- (:^+ ,sort ,@decls))
   ( (:^+ ?,sort (:^+ colon ?decl))
    !~>>- (:^+ ,sort ,decl))
   (:else (defect "'" sort "' doesn't occur in the form "
                  sort ": (...)"))) >>
This named tidier differs from the previous one because it takes an argument sort, in addition to the implicit argument, the parsetree being tidied. The argument takes on the value inputs when typed-vars-tidier is called to tidy a parsetree of the form (:^+ inputs ...), and similarly for the other types of variable binding. The checker for these constructs takes the same kind of argument:
<<Define typed-vars-checker

(def-named-checkers typed-vars-checker (sort)
   ( (:up 1 (:process ?name ?_))
    (defect sort " doesn't occur as process argument"))) >>

The other two things that can appear in an IOPR list are preconditions and results:

<<Define preconds

   ;; <IOPR> ::= precondition : ...
   (def-syntactic precondition :reserved true
                               :prefix (:numargs 1 :precedence 120)
      :tidier
         (( (:^+ precondition (:^+ colon ?exp))
           !~>>- (:^+ precondition ,exp))
          (:else (defect "'precondition' not followed by ': <expression>'")))

      :checkers
         ((:up 1 (?(:^+ iopr ?@_))))) >>
The formula in the precondition field of a process definition has no special syntax, but is just part of the general-purpose logical-expression system that generates all those XML literals at the leaves of OWL-S. This system is discussed in section 2.7.

2.4 The Syntax of results

The result field(s) of a process definition are more complex than the inputs, outputs, precondition, and other fields.

<<Define results

   ;; <IOPR> ::= result : ...
   (def-syntactic result :reserved true
                          :prefix (:numargs 1 :precedence 120)
      :tidier
         (( (:^+ result (:^+ colon ?exp))
           !~>>- (:^+ result ,exp))
          (:else (defect "'result' not followed by ': <expression>'")))

      :checkers
         ((:up 1 (?(:^+ iopr ?@_)))
          <<Insert: result-tree-checker>>)) >>
Results can include several constructs that are not meaningful, or mean something different, in other contexts. Even though a result may look like a predicate-calculus formula, it is not really an entity with a truth value, but a recipe for changing the truth values of various propositions. For this reason, I prefer using the verb "impose" to describe a result. A result R does not become "true"; it is imposed following an action or event. Here is a rough definition of what that means:

  1. To impose an atomic formula p is to make it true.
  2. To impose a negated atomic formula not p is to make p false. For many implementations, this translates into deleting p from a representation of the current situation, so that the use of a closed-world assumption [Rei78] will allow an implementation to conclude that p is false.
  3. To impose p |-> q is to impose q if p was true before the action or event.
  4. To impose forall (x) p[x] is to impose p[a] for all objects a. In practice, most implementations only handle the special case forall (x) p[x] |-> q[x], which means "For every set of values v for the variables x such that p[v] is true before the action or event, impose q[v]." OWL-S is typical in singling this case out.
  5. To impose output(p1 <= v1, ..., pk <= vk), where each pi is an output of the current process, is to make each vi the value of pi. (This notation corresponds to the withOutput construct of OWL-S.)

Various syntactic constraints are implied by this definition of "impose." There are others as well. For example, we don't allow disjunctive effects in OWL-S.
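
As a schematic illustration (the names SomeClass, P, Q, Out, and someAck are placeholders, not part of the Congo example), a typical result field combines a quantified condition, a gated effect, and an output binding:

   result: (forall (x - SomeClass)
              (P(x) |-> Q(x) & output(Out <= someAck)))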

To complete the syntax spec, we first define the special operators (<when> and <output>), which occur only in results.

<<Define syn-when

   (def-syntactic when :lex "|->"
                       :infix (:left-precedence 111 :right-precedence 110)) >>
<<Define output-syntax

   (def-syntactic output :reserved true
                  :prefix (:precedence 200 :numargs 1)
      :tidier
         (( (:^+ output (:^+ group ?@bindings))
           !~>>- (:^+ output ,@bindings)))

      :checkers
         (( ?(:~ ?(:^+ output ?@(:* ?(:^+ bind-param ?output-var ?value))))
           (defect "Outputs must all be of form 'param <= exp'")))) >>
Plus "<=" (<bind-param>), which occurs in several other contexts:
<<Define bind-param-syn

   (def-syntactic bind-param :lex "<="
                             :infix (:precedence 110)
      :checkers
         (( (:^+ bind-param ?(:~ ?(:+ ?p is-Symbol)) ?_)
           (defect "Non-symbol to left of '<=': " p))
          <<Insert: bind-param-context>>)) >>

The checker for results calls the recursive checker result-check:

<<Define result-tree-checker

:. (def-syntactic result ... 
      ...
      :checkers
         (...  .:
          ( ?(:^+ result ?exp)
           (check exp result-check)) >>
<<Define result-syntax-checker

(def-named-checkers result-check
   (select 
      ( (:^+ forall ?vars ?r)
       (check vars
          (select
             ( (:^+ left-paren ?@vars)
              (forall-check vars decl-check))
             (:else
              (defect "Ill-formed variable bindings: " vars))))
       (check r forall-body-must-be-whens))
      ( (:^+ exists ?vars ?r)
       (defect "Existential quantifiers are illegal in results"))
      ( (:^+ when ?_ ?effect)
       (check effect result-check))
      ( (:^+ and ?@conjuncts)
       (forall-check conjuncts
          result-check))
      ( (:^+ group ?@_)  ;; Expression of the form (a,b,c)
       (defect "Can't have <comma> as a connective"))
      ( (:^+ not ?elt)
       (check elt must-be-fun-app))
      ( (:^+ output ?@_)
       :okay)
      (:else
       must-be-fun-app)))

<<Insert: result-piece-checkers>>>>

def-named-checkers is so called because (as you may have noticed) it is used to give a name to a set of checkers. In this case the set is a singleton, a checker of the form (select –clauses–), which operates by running the checkers in the first clause whose pattern matches the current parsetree. Note that to apply a named checker set to the current parsetree, all you have to do is give the checker set's name. To apply one to something else you use check.

The notation (forall-check l –checkers–) indicates that every element of the list l is to be checked using the checkers. The resulting defects are accumulated.

The checker sets used by result-check are defined below. To understand the first one, which applies to the bindings of a forall, it's necessary to recall that declarations are parsed with a local grammar that ultimately can leave either <hyphen> or <comma> as the main connective under <left-paren>, or, in the case of (forall (v) ...), leave a solitary variable as the only contents of the paren pair. The checker set decl-check works its way through these possibilities. The other two checker sets are fairly straightforward.

<<Define result-piece-checkers

(def-named-checkers decl-check
   (select 
      ( (:^+ hyphen ?@vars ?type)
       hyphen-checks)
      ( (:^+ comma ?@vars)
       (forall-check vars
          ( ?(:+ ?v (not is-Symbol))
           (defect "Variable is not symbol: " v))))
      ( ?(:+ ?v (not is-Symbol))
       (defect "Variable is not symbol: " v))))

(def-named-checkers forall-body-must-be-whens
   (select
      ( (:^+ when ?_ ?e)
       (check e result-check))
      ( (:^+ and ?@conjuncts)
       (forall-check conjuncts
          forall-body-must-be-whens))
      (:else
       (defect "Body of 'forall' must be a '|->' or a conjunction"
               " of '|->' expressions")))) >>
<<Define must-be-atomic

(def-named-checkers must-be-fun-app
   ( (:~ (:^+ fun-app ?_ ?@_))
    (defect "Context requires atomic formula"))) >>
One thing these checkers do not check is whether each occurrence of a variable in a result is bound by a dominating forall. This constraint, like all long-range dependencies, can't be checked until the internalization phase, which is beyond the scope of this report.

2.5 The Bodies of Composite Processes

Unlike atomic and simple processes, composite processes have bodies, the stuff between braces after the IOPR specs.

<<Define syn-braces

   (def-syntactic left-brace :lex "{"
                             :prefix (:precedence 0
                                      :bracket (:open right-brace)))

   (def-syntactic right-brace :lex "}"
                              :postfix (:left-precedence 0
                                        :bracket (:close left-brace))) >>
(See the syntactic definition for composite.) The body is built from control constructs; some are defined using reserved words, others with special character sequences.
<<Define control-constructs

   (def-syntactic anyord :lex "||;"
                         :infix (:precedence 60 :flatten true)
      :tidier elim-left-braces

      :checkers (control-construct-checks))


   (def-syntactic split-join :lex "||>"
                             :infix (:precedence 60 :flatten true)
       :tidier elim-left-braces

       :checkers (control-construct-checks))

   (def-syntactic split :lex "||<"
                        :infix (:precedence 60 :flatten true)
       :tidier elim-left-braces

       :checkers (control-construct-checks))

   (def-syntactic choice :lex ";?"
                         :infix (:precedence 60 :flatten true)
       :tidier elim-left-braces

       :checkers (control-construct-checks))

   (def-syntactic semicolon :lex ";"
                            :infix (:precedence 65 :flatten true)
       :tidier elim-left-braces

       :checkers (( (:^+ semicolon ?@elements)
                   (forall-check elements
                      check-down-to-control-constructs))))

   ;;; All control constructs must pass these two checks ...
   (def-named-checkers control-construct-checks
      ( (:^+ ?_ ?@elements)
       (forall-check elements
          check-down-to-control-constructs))

      (:up 1 control-context-check)) >>

Before getting to the details of the checkers for these operators, we must of course describe the tidier, elim-left-braces. Braces are used for grouping at the control-construct level the way parentheses are used elsewhere, but once they have served that purpose they are in the way. Hence we want to remove all but the outermost pair, which holds the entire composite-process body together.

<<Define control-construct-tidier

  (def-named-tidier elim-left-braces
    ( (:^+ ?op [_*_] ?@statements)
     !~>>- (:^+ ?op [_*_]
                ,@(all-tidy statements
                     elim-left-brace))))

  (def-named-tidier elim-left-brace
     ( (:^+ left-brace ?s)
      !~>>- (tidy s elim-left-brace))
     (:else :okay)) >>
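
To see the effect (schematically, with a, b, and c standing for leaf steps), a body fragment a ; { b ||> c } hands <semicolon> the tree

   (:^+ semicolon a (:^+ left-brace (:^+ split-join b c)))

and elim-left-braces reduces it to

   (:^+ semicolon a (:^+ split-join b c))

the inner braces having served their grouping purpose.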

Now for the checkers used for <semicolon>, <anyord>, et al.:

<<Define control-construct-checkers


<<Insert: control-constants>>

   ;;; A control construct can contain only certain things
   (def-named-checkers check-down-to-control-constructs 
      ( ?(:~ ?(:^+ ?(:|| tag
                         with-namespaces
                         ?(:member +owl-s-control-operators+)
                         ?(:member +owl-s-taggables+))
                   ?@_))
       (defect "Illegal as element of composed process: " this-ptree))
      ( (:^+ with-namespaces ?e ?@_)
       (apply-named-checkers check-down-to-control-constructs e)))

   ;;; And it can appear only in certain places
   (def-named-checkers control-context-check 
      ( ?(:|| ?(:^+ if ?_ ?,start-ptree ?_)
              ?(:^+ if ?_ ?_ ?,start-ptree)
              ?(:^+ define composite ?_ ?_ ?,start-ptree)
              ?(:^+ ?(:member +owl-s-control-operators+) ?@_))
       (defect "Control construct in illegal context"))) >>
In a checker or tidier, this-ptree is bound to the parsetree being checked or tidied. Checkers may also refer to the variable start-ptree, which is the parsetree where the checking began. In an :up, start-ptree is not necessarily the same as this-ptree. So the expression (:^+ if ?_ ?,start-ptree ?_) matches an "if" whose "iftrue" part is the tree we started at, one level below the current one.

The qvaroid ?(:member l) matches any datum in the list l. The list is typically a named constant. Here the two constant lists are

<<Define control-constants

(defconstant +owl-s-control-operators+
    '(anyord choice split-join split semicolon))

(defconstant +owl-s-taggables+
    '(perform produce as-process)) >>
The first list, +owl-s-control-operators+, is the set of composite-process classes that have components comprising bags of steps. The second, +owl-s-taggables+, is the set of "leaf" classes. The name derives from the fact that in the presentation syntax these are the constructs whose instances can be named using tags.
<<Define composite-process-leaves


   (def-syntactic perform :reserved true
                          :prefix (:numargs 1 :precedence 90)
      :tidier
         (( (:^+ perform (:^+ fun-app ?name [_*_] ?@pbs))
           !~>>- (:^+ perform ,name ,@pbs))
          (:else (defect "Unintelligible")))

      :checkers
         (( (:^+ perform ?process-name
                         ?(:* ?(:^+ bind-param ?process-param
                                               ?value))) )

          ( (:^+ perform ?(:+ ?process-name (not is-Symbol)) ?@_)
           (defect "Perform of process with illegal name " process-name))))

   ;; 'produce (p1 <= v1 , ...)' indicates bindings to outputs of this
   ;; process
   (def-syntactic produce :reserved true
                          :prefix (:numargs 1 :precedence 90)
      :tidier
         (( (:^+ produce (:^+ group ?@bindings))
           !~>>- (:^+ produce ,@bindings)))

      :checkers
         (( (:^+ produce ?@(:* ?(:^+ bind-param ?output-var ?value))) ))) >>
To refer to the outputs of a performed process, you must tag it and then refer to the outputs as "slots" of the tag.

It happens that the syntax of the parenthesized items after perform and produce is identical to that of the items after output, although they all serve different purposes. In each case, the items must be a sequence of parameter bindings, p <= v. (The characters "<=" are converted to the token <bind-param>.) For output and produce, they describe how the results of the current process are set; the former is for atomic and simple processes, and occurs in a result construct, while the latter occurs in the body of a composite process.
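
For example (parameter and variable names invented), produce (confirmation <= c, receipt <= r) reaches the tidier for produce as

   (:^+ produce (:^+ group (:^+ bind-param confirmation c)
                           (:^+ bind-param receipt r)))

the parentheses and comma having already become a <group>, and the tidier flattens it to

   (:^+ produce (:^+ bind-param confirmation c)
                (:^+ bind-param receipt r))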

<<Define bind-param-context

:. (def-syntactic bind-param :lex "<="
                             ...
      :checkers
         (...      .:
          (:up 1
             (:? ?(:^+ ?(:|| output perform produce) ?@_))) :.)).: >>

Any leaf component of a control construct can be tagged. We indicate a tag using a double colon; the syntax is simple:

<<Define tag-syn

   (def-syntactic tag :lex "::" :infix (:precedence 70)
      :checkers
         (( (:~ (:^+ tag ?label (:^+ ?(:& ?op ?(:member +owl-s-taggables+))
                                     ?@_)))
           (defect "Tag applied to illegal entity"
                   "[" op "]"))

          ( (:^+ tag ?(:+ ?label (not is-Symbol)) ?_)
           (defect "Illegal tag: " label))

          (:up 1 control-context-check))) >>
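
For example (tag and process names invented), a tagged step such as

   bkFound :: perform LocateBook(...)

parses as (:^+ tag bkFound (:^+ perform LocateBook ...)), after which the outputs of that step can be referred to as slots of bkFound via the <dot> operator of appendix B.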

The final control construct is as process(...){...}, which allows us to reuse the process syntax for pieces of a process, so that we can declare local effects, preconditions, and outputs:

<<Define as-syntax

(def-syntactic as :reserved true
                  :prefix (:precedence 75 :numargs 2)
   :tidier
      (( (:^+ as ?(:^+ process :local ?@ioprs)
                 ?body)
        !~>>- (:^+ as (:^+ process :local ,@ioprs)
                      ,body))
       (:else (defect "Ill-formed 'as process' construct")))) >>
The process spec following the as token has the same syntax as a composite-process definition, except that the process has no name. So we have to broaden the syntax of process to include the anonymous case:
<<Define anon-process

 :.(def-syntactic process ...
                  ...
      :tidier .:
          (:? ?(:^+ process (:^+ group ?@ioprs))
             !~(:^+ process :local ,@ioprs))  >>
After this transformation, all processes have names; the anonymous ones have the name :local.

2.6 Ontologies

In case you've forgotten, we allow pieces of ontologies to be included in Owl-S files, written in the N3 notation. Hence we must provide a local grammar for it.

<<Define n3-grammar-def

(def-grammar n3-grammar :parent owl-s
   :definitions
   (
<<Insert: n3-lexical>>

   <<Insert: ontology-def>>

   <<Insert: dot-syntax>>
    ))
>>
The declaration of grammar owl-s as the parent for n3-grammar is a bit dicey. We do it so that we can inherit the lexical and syntactic definitions of <colon>. But we have to make sure to avoid inheriting many of the infix operators of the algebraic language used by Owl-S, which are foreign to N3 (and RDF). (See appendix A.) We do so by declaring them illegal at the lexical level.
<<Define illegal-n3-ops

   (def-lexical #\~ (illegal))
   (def-lexical #\< (illegal))
   (def-lexical #\> (illegal))
   (def-lexical #\& (illegal))
   (def-lexical #\| (illegal))
   (def-lexical #\= (illegal)) >>

At the top of the precedence heap, so to speak, we find ontologies introduced by the operator ontology:

<<Define ontology-def

   (def-syntactic ontology :reserved true
                  :prefix (:precedence 5 :local-grammar n3-grammar)
      :checkers
          (( (:~ (:^+ ontology ?@(:& ?subs
                                    (:* ?(:|| ?(:^+ imports ?_)
                                              ?(:^+ ontology_version ?_)
                                              ?(:^+ triple ?@_))))))
           (forall-check subs
              ( ?(:& ?(:~ ?(:|| ?(:^+ imports ?_)
                                ?(:^+ ontology_version ?_)
                                ?(:^+ triple ?@_)))
                     ?x)
               (defect "Illegal element of ontology: " x))))))
 
   (def-syntactic imports :reserved true
                  :prefix (:precedence 6)
      :tidier
         (( (:^+ imports (:^+ comma ?@subs))
           !~>>- (:^+ imports ,@subs)))

      :checkers
         (( (:^+ imports ?@subs)
           (forall-check subs
              (select
                 ( ?(:+ ?_ is-name)
                  :okay)
                 ( ?(:^+ uri ?_)
                  :okay)
                 ( ?sub
                  (defect "Illegal import (not a name or URI): " sub)))))

          ( (:up ?(:~ ?(:^+ ontology ?@_)))
           (defect "'imports' can occur only at top level of ontology"))))

   (def-syntactic ontology_version :reserved true
                  :prefix (:precedence 6)
      :checkers
          (( ?(:^+ ontology_version ?(:+ ?_ (not is-String)))
           (defect "Ontology version must be string"))

          ( (:up ?(:~ ?(:^+ ontology ?@_)))
           (defect "'ontology_version' can occur"
                   " only at top level of ontology")))) >>
The versioning information is there for purely documentary purposes. The imports statement is more substantial, but at the syntactic phase the grammar can check only its most superficial properties.

The important elements of the ontology are, of course, the triples and type declarations, separated by dots:

<<Define dot-syntax

   (def-syntactic left-brace :lex "{"
                  :prefix (:right-precedence 0 :bracket (:open right-brace))
       <<Insert: brace-tidier>>
                   )

   (def-syntactic right-brace :lex "}"
                  :postfix (:left-precedence 0 :bracket (:close left-brace)))

   (def-syntactic dot :lex "."
                  :infix (:precedence 5
                          :numargs (0 1)
                          :flatten true)
       <<Insert: dot-tidier>>
       <<Insert: dot-checkers>>

   (def-syntactic semicolon :lex ";"
                  :infix (:precedence 10 :flatten true)
       <<Insert: semicolon-tidier>>
<<Insert: semicolon-checkers>>
   <<Insert: n3-is-and-of>>
  <<Insert: triple-tidy>>
  <<Insert: n3-lists>>
    <<Insert: n3-blank-nodes>>
      <<Insert: path-expressions>>
    )

   (def-syntactic hyphen :lex "-"
                  :infix (:precedence 18 :flatten false))

   (def-syntactic || :lex ""
                  :infix (:precedence 20 :flatten true))

  <<Insert: contig-checkers>>

   (def-syntactic comma :lex ","
                  :infix (:precedence 30 :flatten true))
    
   <<Insert: n3-term-checkers>>
     >>
These precedences put <dot> highest in an N3 parse, not surprisingly. It may seem natural to think of "." as a three-argument postfix operator, but Lexiparse doesn't allow postfix operators. Besides, it's really not the dot that makes the terms stick together; as shown by the way semicolons work, it's simple contiguity that means "property attribution" in N3. So we recruit the contiguity operator <||>, discussed in section 2.3. Its precedence is set below that of <comma>, so that "x p a,b,c" gets parsed as
 
   (:^+ || x p (:^+ comma a b c))
Then <semicolon> is given lower precedence than <||>, so that "x p1 a1; p2 a2; p3 a3" gets parsed as
   (:^+ semicolon (:^+ || x p1 a1)
                  (:^+ || p2 a2)
                  (:^+ || p3 a3))
Here syntax doesn't follow semantics exactly. But we can fix that easily with a tidying rule.

We use a cluster of tidiers here to translate all occurrences of contiguity below braces, dots, and semicolons into something more revealing. In particular, we want (:^+ || x y z) to become (:^+ triple x y z) and (:^+ || x y) to become (:^+ prop-attrib x y). [These rules could be simpler if contiguity didn't play a second role as list separator in N3, discussed below.]

<<Define brace-tidier

       :tidier
          (( (:^+ left-brace ?(:& ?top-level-trip (:^+ || ?@subtrees)))
            !~>>- (:^+ left-brace ,(tidy top-level-trip triple-tidy)))
           ( (:^+ left-brace (:^+ dot ?@subtrees))
            !~>>- (:^+ left-brace (:^+ dot ,@(all-tidy subtrees triple-tidy))))) >>
<<Define dot-tidier

:.  (def-syntactic dot .:
       :tidier
          (( (:^+ dot ?@subtrees)
             !~>>- (:^+ dot ,@(all-tidy subtrees triple-tidy)))) >>
<<Define semicolon-tidier

:.  (def-syntactic semicolon .:
       :tidier
          (( (:^+ semicolon (:^+ || ?subj ?prop ?obj)
                            ?@doubles)
            !~>>- (:^+ multi-attrib
                       ,subj
                       (:^+ prop-attrib ,prop ,obj)
                       ,@(all-tidy doubles double-tidy)))
           (:else (defect "Semicolon not preceded by triple")))) >>
<<Define triple-tidy

  (def-named-tidier triple-tidy
     ( (:^+ || ?@subtrees)
      !~>>- (tidy this-ptree
               ( (:^+ || ?x ?y ?z)
                !~>>- (:^+ triple ,x ,y ,z))
               ( (:^+ || ?_ ?_ ?_ ?@_)
                (defect "Triple with more than three elements"))
               (:else
                (defect "Triple with fewer than three elements"))))
     ( (:^+ hyphen ?subj ?type)
      :okay)
     ( (:^+ multi-attrib ?subj ?@doubs)
      :okay)
     (:else
        (defect "Illegal dot component " this-ptree)))

  (def-named-tidier double-tidy
     ( (:^+ || ?prop ?obj)
      !~>>- (:^+ prop-attrib ,prop ,obj))
     (:else (defect "Semicolon followed by non-double"))) >>
These tidier rules introduce the operator prop-attrib to govern the fragments introduced by semicolons, which are themselves renamed multi-attribs.
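
Thus the earlier example x p1 a1; p2 a2; p3 a3 (with its invented names) ends up as

   (:^+ multi-attrib x (:^+ prop-attrib p1 a1)
                       (:^+ prop-attrib p2 a2)
                       (:^+ prop-attrib p3 a3))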

N3 allows the usual triple order to be reversed, so you can write "obj is prop of subj" instead of "subj prop obj." The best way to think of this is that "is p of" means "inverse of p," because "is p of" has the same syntactic role as p by itself would (with reversed semantics, of course). The grammar is pretty simple:

<<Define n3-is-and-of

   (def-syntactic is :reserved true
                  :prefix (:precedence 40)
      :tidier
         (( (:^+ is (:^+ of [_*] ?prop))
           !~>>- (:^+ inverse ,prop))
          (:else (defect "'is' not followed by 'of' in " this-ptree))))

   (def-syntactic of :reserved true
                  :postfix (:precedence 45)
      :checkers
          ( ( (:^+ of ?x)
             (defect "Stray occurrence of 'of'")))) >>
One thing to be aware of is that the introduction of inverse properties means that our use of labels such as ?subj for the first element of a triple is misleading; if the property is inverted, then the first element is actually the object, and its syntax is slightly different. Similar remarks apply to the ?obj slot.
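
For instance (p being an invented property), x p y and y is p of x ought to mean the same thing; the latter tidies to

   (:^+ triple y (:^+ inverse p) x)

whose first element is really the object of the underlying relation.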

Now for the syntax of subjects, predicates, and objects. We need a few more operator definitions and then some checkers. N3 allows lists of objects separated by spaces to appear as the object element of a triple.

<<Define n3-lists

  (def-syntactic left-paren :lex "("
                 :prefix (:precedence 0 :bracket (:open || right-paren))
     :tidier
        (( (:^+ left-paren ?@elements)
           !~>>- (:^+ list ,@elements))))

  (def-syntactic right-paren :lex ")"
                 :postfix (:precedence 0 :bracket (:close left-paren)))

>>
When we say "spaces," we mean of course "contiguity," which is slightly problematic because this gives the contiguity operator two fundamentally different roles in N3. In triples it plays a basic semantic role; in lists it is a mere separator. We make checking simpler by replacing all occurrences of || with the appropriate context-dependent substitute. The rules for introducing triple and prop-attrib were given above. In the list context, the contiguity operator is made the separator for left-paren, causing it to be "flattened" away. Then, after tidying, (:^+ left-paren a b c ...) becomes (:^+ list a b c ...).
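
For instance (names invented again), x p (a b c) yields a triple whose object slot is

   (:^+ list a b c)

once the contiguity inside the parentheses has been flattened away and the <left-paren> node renamed.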

Another class of terms in N3 is the square-bracketed terms, which stand for RDF blank nodes. The term [ prop y ] means "some object x such that x has value y for property prop," unless the prop is inverted, in which case it means "some object x such that y has value x for property prop."

<<Define n3-blank-nodes

    (def-syntactic left-bracket :lex "["
                   :prefix (:precedence 0 :bracket (:open right-bracket))
       :tidier
          (( (:^+ left-bracket ?rel ?thing)
            !~>>- (:^+ something-with ,rel ,thing))
           ( (:^+ left-bracket ?_ ?_ ?_ ?@_)
            (defect "Square brackets around more than two expressions"))
           (:else
            (defect "Square brackets around less than two expressions"))))
            
    (def-syntactic right-bracket :lex "]"
                   :postfix (:precedence 0 :bracket (:close left-bracket))) >>

Finally, we augment the class of terms with path expressions of the form term!prop and term^prop, which are short for [ is prop of term] and [prop term] respectively.

<<Define path-expressions

      (def-lexical-tokens ((#\! object-of)
                           (#\^ subject-of)))

      (def-syntactic object-of :lex "!"
                     :infix (:precedence 60)
         :tidier
            (( (:^+ object-of ?term ?prop)
              !~>>- (:^+ some ,prop ,term))))

      (def-syntactic subject-of :lex "^"
                     :infix (:precedence 60)) >>
There is no particular reason to tidy "!" expressions so that (:^+ object-of lakshmi mother) becomes (:^+ some mother lakshmi), except that it's easier to read quickly.

We now bring order to this set of operators by supplying checker rules. Once the tidying phase is complete, there should be no remaining occurrences of ||:

<<Define contig-checkers

  (def-checkers ||
     ( (:^+ || ?@_)
      (defect "Illegal occurrence of " ||))) >>
Similarly, dots should occur only below left braces:
<<Define dot-checkers

:. (def-syntactic dot .:
       :checkers
          ( ( (:up 1 ?(:~ ?(:^+ left-brace ?_)))
             (defect "Dot in illegal context")))) >>

Now we get to more substantive checkers, for the operators that <semicolon> and its subtrees were transformed into. Recall that the <semicolon> tree itself became a multi-attrib.

<<Define semicolon-checkers


   (def-checkers multi-attrib

           ( (:^+ multi-attrib ?subj ?@subtrees)
            (check subj legal-subject)
            (forall-check subtrees
                double-check))
           
           ( (:up 1 ?(:~ ?(:^+ dot ?@_)))
            (defect "Semicolon in illegal context")))

   (def-checkers triple

       (select
          ( (:^+ triple ?x (:^+ inverse ?prop) ?y)
           (check x legal-object)
           (check prop legal-property)
           (check y legal-subject))
          ( (:^+ triple ?x ?prop ?y)
           (check x legal-subject)
           (check prop legal-property)
           (check y legal-multiple-object)))

       ( (:up 1 ?(:~ ?(:| ?(:^+ dot ?@_)
                           ?(:^+ left-brace ?_))))
        (defect "Triple in illegal context")))

   (def-checkers prop-attrib

       (select
          ( (:^+ prop-attrib (:^+ inverse ?prop) ?y)
           (check prop legal-property)
           (check y legal-subject))
          ( (:^+ prop-attrib ?prop ?y)
           (check prop legal-property)
           (check y legal-multiple-object))))

   (def-checkers comma

      ( (:up 1 ?(:~ ?(:| ?(:^+ prop-attrib ?@_)
                          ?(:^+ triple ?@_))))
       (defect "Comma in illegal context")))

        >>
Not surprisingly, prop-attrib is checked exactly like a triple with its first element missing.

When we finally get down to terms, we need a different checker for each of several different possible syntactic contexts.

<<Define n3-term-checkers

   (def-named-checkers legal-subject

      ( ?(:~ ?(:| ?(:+ ?_ is-name)
                   ?(:^+ subject-of ?@_)
                   ?(:^+ some ?@_)))
       (defect "Illegal subject of triple")))


   (def-named-checkers legal-multiple-object

      ( ?(:~ ?(:| ?(:+ ?_ (or is-name is-String is-Number))
                   ?(:^+ subject-of ?@_)
                   ?(:^+ comma ?@_)
                   ?(:^+ some ?@_)))
       (defect "Illegal object of triple"))

     ( (:^+ comma ?@objects)
      (forall-check objects legal-object)))

   (def-named-checkers legal-object
      ( ?(:~ ?(:| ?(:+ ?_ (or is-name is-String is-Number))
                   ?(:^+ something-with ?@_)
                   ?(:^+ some ?@_)))
       (defect "Illegal object of triple")))

   (def-named-checkers legal-property
      ( ?(:~ ?(:+ ?_ is-name))
       (defect "Illegal property in triple"))) >>

2.7 Formulas

A key problem in the development and implementation of Owl-S is the inclusion of logical formulas at various places. Because there is no official declarative logical language for use in Owl, we want to allow for a large, and indeed open-ended, set of possibilities. A related issue is the provision for an open-ended set of literals. We can treat strings, numbers, and perhaps a few other kinds of literals as built in, but Owl, being built on RDF, allows any datatype that has a URI as its conventional name.

We introduce two new operators. A literal is written URI $ "lit", and a proposition is written name $$ { formula }. In the former case the lit is delimited by double quotes, like a string. In the latter case the formula is delimited by curly braces, but it's still essentially a string. What this means is that dealing with foreign notations is mainly a lexical problem. All we do is get the notation into the system as a string; it must presumably be parsed by some pluggable machinery acting as ambassador from the foreign notation, but that's someone else's problem.

The normal string-parsing rules are suspended for literals and formulas, so we need two little local grammars. The reason is that we want very "lightweight" escape sequences. In a literal, we have to be able to escape occurrences of double quote, and we will use backslash for that purpose as usual, but we don't want to impose any other rules on the string; in particular, we don't want to have to escape most occurrences of backslash. So we need just two escape sequences: \" puts a double quote into the string, and \\" puts a backslash followed by a double quote into the string. We have analogous requirements for formulas, at least for the standard case.
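
For instance (the notation names here are invented; only $, $$, and the escape conventions are real), a literal might be written

   isbnType $ "0-13-103805-2"

and a formula in some foreign rule notation might be written

   myRuleLang $$ { ...rule text... }

with the rule text kept as a string. A literal double quote inside the $ form is written \", and a literal right brace inside the $$ form is written \}.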

<<Define formula-lexers

   (def-lexical #\$
       (dispatch
           (#\$ (token double-dollar))
           (:else (token dollar)))) 

   (def-syntactic dollar :lex "$"
      :infix (:precedence 90
              :local-grammar literal-gram)

      :checkers
         ( ( ?(:^+ dollar ?(:~ ?(:|| ?(:+ ?notation is-name)
                                     ?(:+ ?notation is-uri)))
                          ?_)
            (defect "Illegal notation"))

           ( ?(:^+ dollar ?_ ?(:+ ?formula (not is-String)))
            (defect "Formula must be expressed as a string"))))

   (def-grammar literal-gram
      :definitions
         (
          (def-lexical #\"
             (lex-string (("\\\"" "\"")
                          ("\\\\\"" "\\\"")
                          ("\"" :term))))
         ))   >>

The behavior of $$ is slightly subtler. Not all formula languages are really "foreign." In particular, there are three languages we would like to "support" in the sense that instead of treating their expressions as strings we actually parse them; and whatever internalization system we develop for Owl-S must also correctly internalize formulas in these notations. They are:

  1. N3: Because we are already supporting the inclusion of ontologies written in N3, it makes sense to allow it to be used to express preconditions, results, and such.
  2. SWRL: SWRL is essentially an encoding of rules in RDF using a predicate-calculus-like notation for Owl constructs.
  3. DRS, SWRL-FOL, RuleML, ...: Okay, exactly how many of these we "support" is not clear. But it's pretty clear what it would mean to support them.
For a language in this privileged list, we introduce the language name as a prefix operator, so that $$ must be interpreted as a prefix operator as well. Now N3 $$ P is parsed as (:^+ N3 (:^+ $$ [*_] P$)), where P$ is the parsetree for P in N3. For an "unprivileged" language L, L $$ P is parsed as (:^+ $$ [_*_] L P), with P being preserved as a string.

Here is how we encode all that:

<<Define formula-parsers

   (def-syntactic double-dollar :lex "$$"
      :infix (:precedence 90 :local-grammar formula-gram)
      :prefix (:precedence 90)
      :checkers
         <<Insert: dollar-checkers>>
    )

   (def-grammar formula-gram :parent owl-s
      :definitions
       (
         (def-lexical #\{ (lex-string (("\\}" "}")
                                       ("\\\\}" "\\}")
                                       ("}" :term))))
        ))
   
   (def-syntactic n3 :reserved true
      :prefix (:precedence 80 :local-grammar n3-grammar)

      :tidier
         (( (:^+ n3 (:^+ double-dollar [*_] ?formula))
           !~>>- (:^+ n3 ,formula))
          (:else
           (defect "n3 must be followed by $$ and formula in braces"))))
   >>
Any prefix occurrence of $$ must be tidied away by the specific language operator to its left, which accounts for the following checker:
<<Define dollar-checkers

:.    :checkers .:
         ( ( (:^+ double-dollar [*_] ?@_)
            (defect "$$ must have a notation name to its left"))) >>

After all that, what happens if someone simply includes a formula in, say, an Owl-S precondition, without benefit of any $$ or the like? Nothing goes wrong. One nice thing about a Lexiparse language is that anything that isn't forbidden is permitted. All the arithmetic expressions from the standard arithmetic grammar of appendix A are parsed correctly (using operators with precedences in the range 100 to 200). To that grammar we add a few facilities, including the usual quantifiers:
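
For example (the identifiers are invented), a precondition could be written directly as

   price + shippingCost < creditLimit

or, with a quantifier, as

   forall (b - Book) inStock(b) | onOrder(b)

using nothing beyond the arithmetic operators and the declaration syntax already described.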

<<Define quantifiers

   (def-syntactic forall :reserved true
                         :prefix (:numargs 2 :precedence 90
                                   :local-grammar (typed-var-grammar nil)))

   (def-syntactic exists :reserved true
                         :prefix (:numargs 2 :precedence 90
                                  :local-grammar (typed-var-grammar nil))) >>

Finally, in Table 2-3 we show which operators a given operator "dominates." The row for operator i shows how it behaves when competing with other operators to its right. The first column shows how it behaves when matched against itself. The second column shows all the operators whose left precedence is greater than operator i's. If operator j appears in the list, it will tend to appear in trees headed by operator i. (A bullet to the left of an operator's name in the row labels indicates that it is an infix operator, to minimize confusion about operators with infix and prefix versions.) Of course, this table gives only a crude indication of how each operator interacts with the entities below it in the parsetree; the details are spelled out in the syntactic spec for the operators in question.

Table 2-3: Operator precedence comparisons
(Columns: operator on left; associativity, i.e., behavior against itself; dominated tokens, i.e., operators on its right.)

2.8 Comments

Comments in Owl-S are similar to Java comments, with a couple of differences. Multiline comments are bounded by /* and */. Single-line comments start with //. In Java, as in C/C++, multiline comments cannot be nested; Owl-S pays attention to nesting. In Java, a doubled asterisk in the opening of a multiline comment means that it should be used by Javadoc. In Owl-S, it means that the comment should be transformed into an rdfs:comment declaration. [[ This is harder than it sounds ]]
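
A few examples of the forms just described:

   /* A comment /* with a nested comment */ inside it. */
   // A single-line comment.
   /** An "important" comment, destined to become an rdfs:comment. */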

Lexiparse handles comments mainly at the lexical level. Comments "float above" other syntactic material, and get swept up into whatever parsetrees they are closest to. They are invisible to tidiers and checkers. (They should be visible to parsetree pretty-printers and other debugging tools.) However, internalizers are free to do what they want with them, including turning them into rdfs:comments, which is what should happen with "important" comments.

Here are the definitions of the comment characters:

<<Define comments

   (def-lexical #\/
      (dispatch
         (#\* (dispatch
                 (#\* (comment
                         :style (:multiline "*/" 0 :important ("/*" "*/"))))
                 (:else
                    (comment
                       :style (:multiline "*/" 0 ("/*" "*/"))))))
         (#\/ (dispatch
                 (#\/ (comment :style (:one-line 2)))
                 (:else (comment :style (:one-line 1)))))
         (:else (token divide)))) >>
Hopefully these declarations are not too mystifying.

3 Syntactic Analyses

[[ This section will eventually include pretty-printed parsetrees generated by the Owl-S grammar when applied to the Congo example. ]]

Appendix A: Standard Arithmetic Syntax

This is the grammar that most of the Owl-S grammar is based on.
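
For example, since the definitions below give <times> precedence 190 and <plus> precedence 180, a + b * c parses as

   (:^+ plus a (:^+ times b c))

and, <plus> being declared :flatten, a + b + c parses as (:^+ plus a b c).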

<<Define arithgram [arithgram, line 2]
(in-package :lexiparse)
;; $Id: arithgram.lisp,v 1.9 2006/07/13 17:04:42 dvm Exp $

(depends-on %lexiparse/ types)

(depends-on (:at :run-time) %lexiparse/ syntax-def)

(eval-when (:compile-toplevel :load-toplevel :execute :slurp-toplevel)
   (export '()))

(def-grammar standard-arith-syntax)

(with-grammar standard-arith-syntax

(def-lexical-tokens  ((#\( left-paren)
		      (#\, comma)
		      (#\) right-paren)
		      (#\~ not)
		      (#\- minus)
		      (#\+ plus)
		      (#\* times)
		      (#\/ divide)
		      (#\< less)
		      (#\& and)
		      (#\| or)))

(def-lexical #\= (dispatch
                    (#\< (token leq))
                    (:else (token equals))))

(def-lexical #\> (dispatch
                    (#\= (token geq))
                    (:else (token greater))))

(def-lexical (#\0 - #\9) #'lex-number
    :name numeric)

(def-lexical (#\a - #\z #\A - #\Z) #'one-char-sym
    :name alphabetic)

(def-lexical #\" (lex-string (("\\" :escape)
                              ("\"" :term))))

)

;;; Except for parens' internal precedences, the entire standard-
;;; arith system has precedences >= 100.  This leaves plenty of room
;;; for "superstructure."


(with-grammar standard-arith-syntax

<<Insert: comma>>

(def-syntactic right-paren :lex ")"
               :postfix (:left-precedence 0 :bracket (:close left-paren)))

;; The left-prec of left-paren (as infix) is 200, which is higher than
;; the right-prec 190 for times and divides.  It's rare to see
;; a op b(...) get parsed as (a op b)(...), but it can be arranged by
;; giving 'op' a right-precedence >= 200.  -- 

(def-syntactic left-paren
        :lex "("
	:prefix (:precedence 0 :bracket (:open right-paren))
	:infix (:left-precedence 200 :right-precedence 0
				 :bracket (:open right-paren)))

(def-syntactic or :lex "|" :infix (:precedence 120 :flatten true))

(def-syntactic and :lex "&" :infix (:precedence 130 :flatten true))

(def-syntactic not :lex "~" :prefix (:precedence 140))


(def-syntactic greater :lex ">" :infix (:precedence 160))

(def-syntactic less :lex "<" :infix (:precedence 160)) 

(def-syntactic geq :lex ">=" :infix (:precedence 160)) 

(def-syntactic leq :lex "=<" :infix (:precedence 160)) 

(def-syntactic equals :lex "=" :infix (:precedence 160))

(def-syntactic plus :lex "+"
		    :prefix (:precedence 180 :numargs 1)
                    :infix (:precedence 180 :flatten true))

(def-syntactic minus :lex "-"
		     :prefix (:precedence 180 :numargs 1)
                     :infix (:precedence 180))

(def-syntactic times :lex "*" :infix (:precedence 190 :flatten true))

(def-syntactic divide :lex "/" :infix (:precedence 190 :flatten true))

)
>>

Appendix B: OWL-S Syntax: Top-Level

For the complete, "(un)tangled," listing for this file, see owl-s-syn.lisp

<<Define File owl-s-syn-new.lisp
;-*- Mode: Common-lisp; Package: lexiparse; Readtable: ytools; -*-
(in-package :lexiparse)

(depends-on %module/ ytools lexiparse)

(depends-on %lexiparse/ arithgram)

#+allegro
(defun allegro-case-choice ()
   (cond ((eq excl:*current-case-mode* ':case-sensitive-lower)
          ':preserve)
         (t ':up))) 

<<Insert: grammar-def>>

<<Insert: result-syntax-checker>>

<<Insert: must-be-atomic>>

(with-grammar owl-s

;;; LEXICAL

   (def-lexical-tokens ((#\{ left-brace)
                        (#\} right-brace)
                        (#\. dot)))

   ;; ;? = choice, ; = sequence
   (def-lexical #\;
         (dispatch (#\? (token choice))
                   (:else (token semicolon))))

   (def-lexical #\| 
   ;; ||; = any-order, ||> = split+join, ||< = split.
   ;; '|->' becomes the token <when> --
                    (dispatch     
                       (#\| (dispatch 
                               (#\; (token anyord))
                               (#\> (token split-join))
                               (#\< (token split))))
                       (#\- (dispatch (#\> (token when))))
                       (:else (token or))))

   ;; '->' is ordinary if-then --
   (def-lexical #\- (dispatch
                       (#\> (token imply))
                       (:else (token minus))))

   ;; '=>' becomes the token <when>, although '|->' is now the preferred form
   (def-lexical #\= (dispatch
                       (#\> (token when))
                       (:else (token equals))))

   ;; '<=' becomes the token <bind-param>
   (def-lexical #\< (dispatch
                       (#\= (token bind-param))
                       (:else (token less))))

   ;; '::' is for step tags --
   (def-lexical #\: (dispatch
                        (#\: (token tag))
                        (:else (token colon))))

   (def-char-set owl-s-lexical-chars* (#\_))

   (def-lexical (#\_ #\a - #\z #\A - #\Z)
       (lex-sym (owl-s-lexical-chars))
       :name alphabetic)

   <<Insert: formula-lexers>>

   <<Insert: comments>>

;;; SYNTACTIC

   <<Insert: namespace-spec>>

   <<Insert: uri-spec>>

   <<Insert: process-definition>>

   ;; __::= atomic <process-spec>
   (def-syntactic atomic
                  :reserved true
                  :prefix (:precedence 15 :numargs 1)
      :checkers
         (only-below-define))

   ;; __::= simple <process-spec> 
   (def-syntactic simple :reserved true
                  :prefix (:precedence 15 :numargs 1)
      :checkers
         (only-below-define))

   ;; __::=  composite <process-spec>... { ...
   (def-syntactic composite :reserved true
                  :prefix (:precedence 15 :numargs 2)
      :tidier
        (( (:^+ composite ?proc (:^+ left-brace ?body))
          !~>>- (:^+ composite ,proc ,body))
         ( (:^+ composite ?proc ?b)
          (defect "Ill-formed body for composite process " proc
                  :% " (must be surrounded by braces): " b)))
      :checkers
         (only-below-define))
      
   (def-named-checkers only-below-define
       ( (:up 1 (:~ (:^+ define ?@_)))
        (defect "Should appear only immediately inside 'define'")))

   <<Insert: process-spec>>

   <<Insert: param-decls>>

   <<Insert: preconds>>

   <<Insert: results>>

   <<Insert: output-syntax>>

<<Insert: colon>>

<<Insert: typed-vars-tidier>>

<<Insert: typed-vars-checker>>

   <<Insert: quantifiers>>

   <<Insert: syn-when>>

   <<Insert: syn-braces>>

   <<Insert: bind-param-syn>>

   <<Insert: control-constructs>>

  <<Insert: control-construct-tidier>>

<<Insert: control-construct-checkers>>

   <<Insert: if-syntax>>

   <<Insert: then-else-syntax>>

<<Insert: composite-process-leaves>>

<<Insert: as-syntax>>

   <<Insert: tag-syn>>

   <<Insert: left-paren>>

   <<Insert: formula-parsers>>



   (def-syntactic || :lex "" :infix (:precedence 3 :flatten true))

   (def-syntactic dot :lex "." :infix (:precedence 205)
      :checkers
      ((:? ?(:^+ dot ?step ?output)
          (nconc (cond ((is-Symbol step) !())
                       (t (list (defect "Illegal name to left of dot in "
                                        step "." output))))
                 (cond ((is-Symbol output) !())
                       (t (list (defect "Illegal name to right of dot in "
                                        step "." output))))))))
)


;;; LOCAL GRAMMAR for type declarations

<<Insert: typed-var-grammar>>

   <<Insert: typed-var-contiguity>>

   <<Insert: typed-var-paren>>

(defun is-name (x)
   (or (is-Symbol x)
       (matchq ?(:^+ name-wrt-space ?_ ?_)
               x))) >>

Appendix C: N3 Syntax: Top-Level

This is the outline of the N3 syntax discussed in section 2.6, plus a few key pieces not covered there.

<<Define File n3-syn-new.lisp
;-*- Mode: Common-lisp; Package: lexiparse; Readtable: ytools; -*-
(in-package :lexiparse)

(depends-on %module/ ytools lexiparse)

<<Insert: n3-grammar-def>>

>>
<<Define n3-lexical


   <<Insert: illegal-n3-ops>>

   (def-lexical #\. (token dot))

   (def-lexical #\"
      (dispatch
         (#\" (dispatch
                 (#\" ;; Pythonish strings
                  (lex-string (("\\" :escape)
                               ("\"\"\"" :term))))
                 (:else ""))
         (:else (lex-string (("\\" escape)
                             ("\"" :term)))))))
            
   (def-lexical-tokens ((#\, comma)
                        (#\! object-of)
                        (#\^ subject-of)
                        (#\[ left-bracket)
                        (#\] right-bracket)
                        (#\( left-paren)
                        (#\) right-paren)))

     >>
