Proposal: Invoking XMLP Services Directly from Web Forms

John J. Barton John_Barton@hpl.hp.com
Gaurav Misra
Hewlett Packard Labs

Copyright 2001 Hewlett Packard

Introduction

The new XML Protocol effort supports the interaction of two computer programs via XML messages. Web fill-in forms have been used extensively to allow humans to interact with computer programs. This proposal explores the potential for web fill-in forms to use the new XML Protocol.

Allowing web clients to submit forms directly in XMLP format reduces the complexity of some kinds of applications. It allows XMLP services to be developed for both human and machine inputs without pointless reformatting front ends. It allows debugging of XMLP services directly with web browsers. Furthermore, a system of XMLP-based services could arise that users could compose by applying ad-hoc human intelligence rather than the planned intelligence of a developer.

The basis for this proposal is the XML-based Web fill-in form approach called XFORMs. The XFORMs Working Group documents have solicited an XML Protocol-based submission solution. Thus in part this proposal addresses that request.

The basic mechanism for integrating XML Protocol and XFORMs can be quite easily described and seems likely to be quite powerful in its application. An XFORM server simply places a complete XML Protocol message in the instance element of the xform descriptor in the web page header. The Web browser then has a local copy of the XMLP message. Since the body of the XML Protocol message is serialized XML data, the values in that data can be addressed by XFORMs operators and presented to human users for modification. When the submit button is pressed, the now-modified XMLP message is sent to the server (or rather to the destination address supplied with the form).
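The edit-then-submit cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name fill_in, the caption element, and the use of Python's ElementTree are not part of the proposal, and a real forms client would bind values through XFORMs operators rather than by child element name.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# Prototype message as a server might place it in the instance element.
# The caption element is a hypothetical field added for illustration.
PROTOTYPE = f"""<xmlp:Envelope xmlns:xmlp="{SOAP_NS}">
  <xmlp:Body>
    <album_id>John's Great Adventure</album_id>
    <caption></caption>
  </xmlp:Body>
</xmlp:Envelope>"""


def fill_in(prototype, values):
    """Return the prototype message with the named Body children replaced."""
    ET.register_namespace("xmlp", SOAP_NS)  # keep the xmlp prefix on output
    envelope = ET.fromstring(prototype)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    if body is None:
        raise ValueError("prototype has no SOAP Body")
    for name, value in values.items():
        slot = body.find(name)
        if slot is None:
            raise KeyError(name)
        slot.text = value  # the user's edit replaces the initial value
    return ET.tostring(envelope, encoding="unicode")


# The string returned here is what the client would post on submit.
message = fill_in(PROTOTYPE, {"caption": "Sunset at the pass"})
```

Values the user leaves untouched, such as album_id here, travel back to the server exactly as they were downloaded.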

The mechanism proposed here might be adaptable to XHTML web browsers that do not support the complete XFORMs solution. This proposal discusses one approach to XHTML.

The submission of an XMLP message using the XFORMs postXML method will not be quite adequate unless the XFORMs Working Group changes postXML. Therefore this proposal is written as a prototype specification for a form submission method called postXMLP to parallel the postXML method in XFORMs. The primary additional specification we add here concerns submission of binary data.

(I will switch between XMLP and SOAP, using XMLP when I talk about abstractions or W3C and SOAP when I talk about specific protocols. This is an expression of the obvious: SOAP is a concrete prototype for a potential W3C standard called XMLP.)

 

Relation to XFORM proposals

The XForms project [2] defines a layered, XML based approach for defining interactive forms. The existing documents refer to a postXML submission method and the submission of XML-encoded data. We propose an additional submission method based on XMLP.

Form submission is a component in an overall messaging model. A messaging model defines how messages are packaged and transmitted for a given service. For XForms, this primarily relates to how forms are transmitted to the user and submitted by the user. The model defines what data to send, in what format, and what results to expect; for the user, it also defines the actions taken to submit a form.

The recently proposed XML Protocol (XMLP) [1] is uniquely suited for XForms transmission because it is also XML based and it is specifically designed to address the complete range of issues associated with XML communications. As we will describe below, the integration of XMLP into the current XForms proposal would require only minor changes to the current proposal and the addition of some specifications relating to the XMLP protocol itself.

XMLP is the current W3C proposal for a standardized message format, designed for data encapsulation and data serialization, and based on non-exclusive transport mechanisms [1]. The W3C working group has taken the Simple Object Access Protocol [5] as its starting point for discussions. These attributes of XMLP recommend it for form submission:

XMLP is independent of transport mechanisms.
Devices would not be required to implement any specific transport mechanism, which could be restrictive to device design. Devices will also not be required to implement multiple transport mechanisms, which may be beyond the capabilities of smaller devices.
XMLP uses asynchronous messaging.
Since there is no connection layer specification for XMLP, an XMLP-enabled device does not need to engage in any sort of communications session with its target server. So, devices that are not always connected to a network can receive and transmit XForms at the user's discretion.
XMLP is written in XML.
Since XForms is also written in XML, any XForms enabled device will already have an XML parser. This parser can be adapted for reuse in processing the XMLP messages. This reduces the additional software and processing requirements, which are significant factors for smaller devices.
XMLP provides data serialization.
XMLP provides a general mechanism for serializing data into XML before transmission and de-serializing it upon reception. XFORMs content types can be handled easily, and more complex data could be transmitted as well if XFORMs extensions were explored.
XMLP provides a framework for binary data.
The XForms Data Model provides an outline for the inclusion of MIME-typed binary data (Section 4.9) [3] but does not yet provide a solution as to how that data should be packaged with an XForm. However, there is already a proposal for integrating MIME attachments with SOAP/XMLP messages; it defines how binary data can be co-packaged with SOAP/XMLP messages using MIME attachments [SOAP Attachments].
XMLP is a W3C endeavor.
Since XMLP is being developed by the W3C, significant industry support can be expected. There already exists good support for SOAP-enabled products.

Integrating XMLP and XForms would only require the addition of a submission method using XMLP. Existing submission methods need not be affected by this proposal.

Motivating Examples

To the extent that web services become formulated in terms of XMLP messages, the ability to edit these messages within a web browser allows human interaction into a system designed for machine-to-machine operation. Unless we expect web services to be significantly more reliable and adaptable than other kinds of software, this direct interaction capability seems valuable and important all by itself.

Beyond desktop browsers, XFORMs seeks to address digital "appliances", small handheld special purpose devices. Therefore consider a user intent on adding photos to their online album, a wireless digital camera with a TCP/IP stack and a photo album service. An application can be built to achieve this that would have parts in the camera and parts in the service. If the camera could interpret simple XFORMs and allow images to be attached to INPUT elements with type=binary on the form, then form submission would accomplish image upload. The service could be programmed to accept these forms and add the image to the album.

If the form client submitted XMLP messages and the service were programmed to accept XMLP messages, then that service could also be used by desktop computer file upload applications, whether controlled by a program or by a human operating through a web browser. Other clients like scanners can upload to the service using the same form submission mechanism. A scanner that created an incompatible image format for the photo album could be accommodated by simply inserting a transformation service in the pipeline between the client and the photo album service. In these cases we can switch between a human filling in a form and a program sending a message, since the form client sends messages in the same format as a program would.

This mechanism only applies when the service is designed within the limitations of the content types that a form-fill-in client can offer. The spectrum of applications on the World-Wide Web testifies to the efficacy of working within such limitations. In addition we envision a greater breadth of content types arising from web browsers extended with sensors.

Method=postXMLP

(This section is intended to give you something to complain about...)

All aspects of the form specification for method=postXML apply to method=postXMLP except for the format of the data posted by the client. Specifically the user of the form should not be able to detect a difference except by reading the source for the page.

Within the header of a page with a form to be sent via method=postXMLP there must be an xform element with an instance child. The child of this instance element must have the format of a valid SOAP 1.1 message. The values of elements in this message represent the initial state for form fill in. The form fill in operation may change the values of elements. When the form is submitted, the content within the instance element, with modified values, is submitted. In other words, the form filling operation edits the original SOAP 1.1 proto-message.

Consequently the XFORMs schema may have to be adapted to allow the SOAP overhead elements (Envelope, Header, Body and others) to appear without XFORMs failure.

The data stream sent under method=postXMLP will be a SOAP 1.1 message. The SOAP body of this message will contain well-formed XML consisting of the form instance data serialized according to SOAP 1.1 rules. Since every form instance datatype is derived from an XML Schema datatype [XSchema-2] and every XML Schema datatype can be serialized using SOAP 1.1 rules, then every instance datatype has a SOAP 1.1 serialization. Generally the SOAP serialization is a natural one for XML. Informally one can say that the SOAP 1.1 serialization is character equivalent to the XML instance data representation discussed in the XFORMs documentation. The formal connection is given by these rules:

XFORMs datatype                        SOAP serialization
atomic (5.3.1)                         simple value
enumerations (5.3.2)                   enumerations (section 5.2.2 in SOAP 1.1)
groups (5.3.3) and switches (5.3.6)    compound values, with the XFORMs name attribute used for the SOAP accessor
arrays (5.3.5)                         arrays
unions (5.3.4)                         polymorphic accessors (section 5.3 in SOAP 1.1)

Note that all of the XFORMs datatypes can be serialized; there may be SOAP serializations that cannot be put into forms. (like...)

The remaining XFORMs datatype is binary. SOAP 1.1 is strictly an XML protocol: it does not have a mechanism for sending non-XML data. Consequently binary data can be sent only if it is either encoded or attached to the SOAP message. SOAP Messages with Attachments [Attachments] provides an attachment mechanism quite similar to the multipart/form-data submission mechanism in XFORMs. We adapt this mechanism here.

If a form contains any binary data items, then the submitted form will have MIME structure Multipart/related. The first or root part will be a SOAP 1.1 message formatted as described above. For each binary data item in a form, the submitted SOAP message will have a MIME part whose type matches one of the mediaType values given in the form model. The Content-Location value (a URI) in the MIME header for that MIME part will be referenced by an href attribute of the corresponding form instance data element in the SOAP body.

If relative URI addressing is used as described in SOAP Messages with Attachments [Attachments], the XFORMs binding expression for the binary element can be used as the URI in both the instance data href and the Content-Location header of the corresponding MIME part. The binding expression looks like an XPATH through the form instance data.
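The packaging rule can be sketched with Python's standard email package. This is a hedged illustration, not a definitive client: package_submission and the example.invalid domain are hypothetical, and the sketch base64-encodes attachments for simplicity where the proposal's sample message uses binary transfer encoding.

```python
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import make_msgid


def package_submission(soap_xml, attachments):
    """Build a multipart/related message: SOAP root part first, then one
    part per binary item.

    attachments is a list of (binding_expression, media_type, data) tuples.
    """
    root_id = make_msgid(domain="example.invalid")  # hypothetical domain
    message = MIMEMultipart(
        "related", boundary="MIME_boundary", type="text/xml", start=root_id
    )
    root = MIMEText(soap_xml, "xml", "utf-8")
    root["Content-ID"] = root_id
    message.attach(root)
    for binding, media_type, data in attachments:
        maintype, subtype = media_type.split("/", 1)
        part = MIMEBase(maintype, subtype)
        part.set_payload(data)
        encoders.encode_base64(part)  # assumption: base64 instead of binary
        part["Content-Location"] = binding  # same string as the href in the body
        message.attach(part)
    return message


SOAP_XML = (
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    "<SOAP-ENV:Body>"
    "<album_id>John's Great Adventure</album_id>"
    '<photo href="photo_upload/photo"/>'
    "</SOAP-ENV:Body></SOAP-ENV:Envelope>"
)
submission = package_submission(
    SOAP_XML, [("photo_upload/photo", "image/jpeg", b"\xff\xd8\xff\xe0")]
)
```

Using the binding expression for both the href and the Content-Location means the receiver needs no lookup table beyond the message itself.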

Here is an example of a form with a string and a binary element:

<head>
<xform  xmlns="http://www.w3.org/2000/xforms"
  id="photo_upload"
  action="http://myphotoservice.com/addphoto.asp"
  method="postXMLP">
<model>
  <string name="album_id" />
  <binary name="photo">
    <mediaType>image/jpeg</mediaType>
    <mediaType>image/png</mediaType>
  </binary>
</model>
<instance>
  <xmlp:Envelope
   xmlns:xmlp="http://schemas.xmlsoap.org/soap/envelope/">
    <xmlp:Body>
      <album_id>John's Great Adventure</album_id>
      <photo></photo>
    </xmlp:Body>
  </xmlp:Envelope>
</instance>
</xform>

</head>
<body>
  <xform:input ref="photo_upload/xmlp:Envelope/xmlp:Body/photo">Select Photo </xform:input>
  <xform:submit xform="photo_upload">
     Add to <xform:output ref="photo_upload/xmlp:Envelope/xmlp:Body/album_id" /> photo album</xform:submit>
</body>

(The current XFORMs proposal does not have the xform:input element for a binary input shown above. A separate proposal describes it.) When the submit button is pushed, the submitted data would be:

   MIME-Version: 1.0
   Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml;
           start="<b6f4ccrt@15.4.9.92/s445>"
   Content-Description: This is the optional message description.

   --MIME_boundary
   Content-Type: text/xml; charset=UTF-8
   Content-Transfer-Encoding: 8bit
   Content-ID: <b6f4ccrt@15.4.9.92/s445>
   Content-Location: addphoto.xml

   <?xml version='1.0' ?>
   <SOAP-ENV:Envelope
     xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
     <SOAP-ENV:Body>
       <album_id>John's Great Adventure</album_id>
       <photo href="photo_upload/photo"/>
     </SOAP-ENV:Body>
   </SOAP-ENV:Envelope>

   --MIME_boundary
   Content-Type: image/jpeg
   Content-Transfer-Encoding: binary
   Content-ID: <a34ccrt@15.4.9.92/s445>
   Content-Location: photo_upload/photo

   ...binary JPEG image...
   --MIME_boundary--
 

Note that the value for the album_id came out of the instance data downloaded from the server and the image was added at the client.
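On the receiving side, a service might pair each href in the SOAP body with the MIME part that carries the same Content-Location. The following Python sketch assumes a message shaped like the sample above; resolve_attachments is a hypothetical helper, the attachment is base64-encoded so the sample stays printable, and only exact string matches between href and Content-Location are handled.

```python
import email
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

RAW = b"""MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary; type="text/xml"

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-Location: addphoto.xml

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <album_id>John's Great Adventure</album_id>
    <photo href="photo_upload/photo"/>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

--MIME_boundary
Content-Type: image/jpeg
Content-Transfer-Encoding: base64
Content-Location: photo_upload/photo

/9j/4AAQ
--MIME_boundary--
"""


def resolve_attachments(raw):
    """Map each Body child carrying an href to the bytes of the MIME part
    whose Content-Location equals that href."""
    message = email.message_from_bytes(raw)
    if message.get_content_type() != "multipart/related":
        raise ValueError("expected a multipart/related submission")
    root, *others = message.get_payload()  # the root part comes first
    by_location = {
        part["Content-Location"]: part.get_payload(decode=True) for part in others
    }
    envelope = ET.fromstring(root.get_payload(decode=True))
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    return {el.tag: by_location[el.get("href")] for el in body if el.get("href")}


attachments = resolve_attachments(RAW)
```

Elements without an href, such as album_id, are ordinary values and are read straight from the body.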

Sending XMLP messages from XHTML with enctype=multipart/related

The above discussion was focused on the new XForms proposal. We could also integrate XMLP submission into the XHTML forms model. For this case we propose the addition of a new enctype=multipart/related. This acts like a third component to section 17.13.4, Form content types, of the HTML 4.01 specification. It resembles the second form content type, enctype=multipart/form-data.

For enctype=multipart/related the submitted data has MIME type multipart/related. The first MIME part will be a MIME type text/xml part that contains a SOAP 1.1 message. The body of that SOAP message consists of XML elements corresponding to each form control, in the same order they appear in the form. The elements are named according to the HTML 4.01 naming convention for multipart/form-data MIME parts. HTML 4.01 uses the following definition for a MIME part name:

a name attribute specifying the control name of the corresponding control. Control names originally encoded in non-ASCII character sets may be encoded using the method outlined in [RFC2045].

A control name is defined as:

A control's "control name" is given by its name attribute. The scope of the name attribute for a control within a FORM element is the FORM element

For enctype=multipart/related these control names are used as element names. The control name may be namespace qualified; that qualification must be maintained when the form data is submitted.

For element values we have two cases: text-entry controls and file input controls.

  1. Text entry controls give text values to their corresponding XML element. Element values that contain markup characters must be sent as CDATA sections; servers should expect that clients may send all element values this way.
  2. Each input type=file control results in an attachment. The corresponding XML element appears in the SOAP message as above, but no element value appears. Rather, an href attribute is included that references the MIME attachment as specified in SOAP Messages with Attachments. User agents may use the control name as the attachment's Content-Location value, or the file name on the local operating system encoded for use as a URI.
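The two cases above can be sketched as a small Python encoder. The names encode_text and build_envelope are hypothetical, control names are assumed to be valid XML element names already, and the sketch wraps a value in CDATA only when it contains markup characters, which is one of the behaviours the rule permits.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def encode_text(value):
    """Wrap values containing markup characters in a CDATA section."""
    if any(ch in value for ch in "<>&"):
        # A literal "]]>" cannot appear inside CDATA, so split it in two.
        return "<![CDATA[" + value.replace("]]>", "]]]]><![CDATA[>") + "]]>"
    return value


def build_envelope(controls):
    """controls: (control name, kind, value) tuples in document order,
    where kind is "text" or "file" and a file's value is its href."""
    rows = []
    for name, kind, value in controls:
        if kind == "file":
            # File controls carry no element value, only an href.
            rows.append(f"    <{name} href={quoteattr(value)} />")
        else:
            rows.append(f"    <{name}>{encode_text(value)}</{name}>")
    lines = [f'<SOAP-ENV:Envelope xmlns:SOAP-ENV="{SOAP_NS}">', "  <SOAP-ENV:Body>"]
    lines += rows
    lines += ["  </SOAP-ENV:Body>", "</SOAP-ENV:Envelope>"]
    return "\n".join(lines)


envelope = build_envelope(
    [("submit-name", "text", "Larry"), ("files", "file", "file1.txt")]
)
parsed = ET.fromstring(envelope)  # confirms the result is well-formed XML
```

The file contents themselves would travel in a separate MIME part whose Content-Location matches the href.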

The following example illustrates "multipart/related" encoding. Suppose we have the following form:

<FORM action="http://server.com/cgi/handle"
       enctype="multipart/related"
       method="post">
   <P>
   What is your name? <INPUT type="text" name="submit-name"><BR>
   What files are you sending? <INPUT type="file" name="files"><BR>
   <INPUT type="submit" value="Send"> <INPUT type="reset"> 
</FORM>

If the user enters "Larry" in the text input, and selects the text file "file1.txt", the user agent might send back the following data:

MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml;
        start="<b6f4ccrt@15.4.9.92/s445>"
Content-Description: This is the optional message description.

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-ID: <b6f4ccrt@15.4.9.92/s445>
Content-Location: s13rr5.xml

<?xml version='1.0' ?>
<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <submit-name>Larry</submit-name>
    <files href="file1.txt" />
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

--MIME_boundary
Content-Type: text/plain
Content-ID: <a34ccrt@15.4.9.92/s445>
Content-Location: file1.txt

... contents of file1.txt ...
--MIME_boundary--

Carrying XMLP Headers in Form Markup.

XMLP provides a standard mechanism for intermediation in the form of XML co-messages packaged in "headers" with the primary XML message (the "body"). For forms to be complete XMLP senders we need a mechanism to specify these headers. Essentially these headers are data from the source of the form (server) to the target of the submission. They have a role similar to the "input type=hidden" fields in HTML forms.

XMLP Headers for XFORMs.

For XFORMs, the headers can simply be placed in the instance data. Here is an example:


<head>
<xform  xmlns="http://www.w3.org/2000/xforms"
  id="photo_upload"
  action="http://myphotoservice.com/addphoto.asp"
  method="postXMLP">
<model>
  <string name="album_id" />
  <binary name="photo">
    <mediaType>image/jpeg</mediaType>
    <mediaType>image/png</mediaType>
  </binary>
</model>
<instance>
  <xmlp:Envelope
   xmlns:xmlp="http://schemas.xmlsoap.org/soap/envelope/">
    <xmlp:Header>
      <ads:bundle-id xmlns:ads="a_uri">http://example.com/cookie/53A6C2209</ads:bundle-id>
    </xmlp:Header>
    <xmlp:Body>
      <album_id>John's Great Adventure</album_id>
      <photo></photo>
    </xmlp:Body>
  </xmlp:Envelope>
</instance>
</xform>

</head>
<body>
  <xform:input ref="photo_upload/xmlp:Envelope/xmlp:Body/photo">Select Photo </xform:input>
  <xform:submit xform="photo_upload">
     Add to <xform:output ref="photo_upload/xmlp:Envelope/xmlp:Body/album_id" /> photo album</xform:submit>
</body>

Since the header elements of the eventual SOAP message appear in the instance they can be copied to submission. Moreover values in the headers can be edited (filled-in) by users or by the user agent on the user's behalf. For example, the user agent could fill in the digest value and signature value in a Digital Signature [SOAP-dsig].

XMLP Headers for XHTML forms.

Hidden XMLP header content cannot be added to XHTML forms with the input type=hidden mechanism those forms use now: under method=postXMLP, the value of such an input field is written into the form body. To parallel the proposal for XFORMs, we need to provide a prototype for the XMLP message that can be modified by the client.

For this purpose servers would place xform:instance content into the head section of web pages. Within the body of the web page, HTML 4.01 form elements would have one additional attribute named instance whose value would be an XPATH to the XML data within the XMLP body of the xform:instance. A typical value would be "./xform:instance/xmlp:Envelope/xmlp:Body". This attribute gives the browser's submission logic a means for locating the data within the prototype message. Within the form, each input element's control name would be appended to the instance attribute value to locate the slot in the message that its data is designed to fill. (Incidentally this provides an intermediate point for the XHTML to XFORMs transition).
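A browser's slot lookup might work roughly as in the Python sketch below. The helper fill_slot, the caption control, and the namespace prefix bindings are assumptions made for illustration; the path is evaluated relative to the page's head element using the limited XPath subset that ElementTree supports.

```python
import xml.etree.ElementTree as ET

# Prefix bindings assumed for this sketch.
NS = {
    "xform": "http://www.w3.org/2000/xforms",
    "xmlp": "http://schemas.xmlsoap.org/soap/envelope/",
}

HEAD = """<head xmlns:xform="http://www.w3.org/2000/xforms"
      xmlns:xmlp="http://schemas.xmlsoap.org/soap/envelope/">
  <xform:instance>
    <xmlp:Envelope>
      <xmlp:Body>
        <album_id>John's Great Adventure</album_id>
        <caption></caption>
      </xmlp:Body>
    </xmlp:Envelope>
  </xform:instance>
</head>"""


def fill_slot(head, instance_path, control_name, value):
    """Append the control name to the form's instance path and store the
    control's value in the element found there."""
    slot = head.find(f"{instance_path}/{control_name}", NS)
    if slot is None:
        raise LookupError(f"no slot for control {control_name!r}")
    slot.text = value
    return slot


head = ET.fromstring(HEAD)
# instance_path plays the role of the form element's instance attribute.
fill_slot(head, "./xform:instance/xmlp:Envelope/xmlp:Body", "caption", "Sunset")
# At submission the browser would post this element, not the whole head.
envelope = head.find("./xform:instance/xmlp:Envelope", NS)
```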

Two cases have to be satisfied in the XHTML case. First we need to show that existing browsers would not be broken by XMLP-related content. Second we need to show that new browsers would be able to perform as XMLP senders similar to a full XFORMs client.

Existing browsers, under their usual approach of ignoring content they cannot understand, may accept an XMLP message prototype in the head section of downloaded pages:

User agents do not generally render elements that appear in the HEAD as content.
http://www.w3.org/TR/1999/REC-html401-19991224/struct/global.html

Similarly, the proposed form instance attributes should be transparent:

If a user agent encounters an attribute it does not recognize, it should ignore the entire attribute specification (i.e., the attribute and its value). [HTMLExtension]

Thus it seems that existing browsers, to the extent that they follow the HTML 4.01 standard, will ignore the added text needed to support XMLP submission. Furthermore they should operate correctly as forms under HTML 4.01. They won't of course send XMLP messages on submission.

Future browsers with minor extensions could submit XMLP messages. At submission, the values of form input elements would simply be placed in the appropriate slots as identified above and the entire message posted to the address given in the form element.

In reality, since there is no open XHTML follow-on other than XFORMs, the acceptance of this aspect of our proposal depends on the interests of users and browser implementors.

References:

[1] XML Protocol Activity
http://www.w3.org/2000/xp/

[2] XForms - the next generation of Web forms
http://www.w3.org/MarkUp/Forms/

[3] XForms 1.0: Data Model
W3C Working Draft, 6 April 2000
http://www.w3.org/TR/2000/WD-xforms-datamodel-20000406/

[4] XForms Requirements
W3C Working Draft, 21 August 2000
http://www.w3.org/TR/2000/WD-xhtml-forms-req-20000821/

[5] SOAP - Simple Object Access Protocol
http://www.w3.org/TR/SOAP/

[6] SOAP Messages with Attachments
http://www.w3.org/TR/SOAP-attachments/

[XSchema-2]
Paul V. Biron and Ashok Malhotra. Candidate Recommendation: XML Schema Part 2: Datatypes. Available at: http://www.w3.org/TR/xmlschema-2. 2000.

[XHTML Forms]
Form content types of the HTML 4.01 specification.
http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4

[SOAP-dsig]
Soap Security Extensions: Digital Signature
http://www.w3.org/TR/SOAP-dsig/

[HTMLExtension]
HTML 4.01 Specification B.1 Notes on invalid documents
http://www.w3.org/TR/1999/REC-html401-19991224/appendix/notes.html#notes-invalid-docs

XMLP Headers for XFORMs: Solution in submit element.

I thought of this solution first...but the one presented above seems much better.

Within the XFORMs proposal, the XMLP headers naturally fit into the submit child element of the xform element. The XFORMs approach uses a template for the form in the header for the web page containing the form. This template specifies the form datatypes and initial values for form entries. Also specified is the submission protocol under the submit element. This element includes a submissionExtension child that can carry the XMLP headers. The simple procedure of placing the XMLP headers to be sent with the message verbatim within a submissionExtension element will suffice for headers with XML content. The XMLP headers must not be immediately enclosed in an XMLP envelope; this will ensure that the headers are not intended for the forms client. The XMLP headers, as an extracted string, must be well-formed. Under method=postXMLP the forms client will copy the characters from the submissionExtension into the XMLP message envelope just above the XMLP body.

Here is an example:

<xform xmlns="http://www.w3.org/2001/02/xforms" id="poll">
   <submit>
     <target>http://example.com/app1</target>
     <submitExtension>
         <SOAP:Header>
            <bundle-id>http://example.com/cookie/53A6C2209</bundle-id>
         </SOAP:Header>
     </submitExtension>
   </submit>
   <model>
      <simple>
         <number name="choiceCode" enum="closed">
            <value>-1</value> 
            <value>10</value>
            <value>20</value>
            <value>30</value>
         </number>
      </simple>
   </model>
</xform>