eZ Component: Document, Design
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Author:   $Author: kn $
:Revision: $Revision: 7272 $
:Date:     $Date: 2008-02-01 11:51:22 +0000 (Fri, 01 Feb 2008) $

Design description
==================

ezcDocument class (abstract)
----------------------------

This class is the base class for each implemented markup format. The extending
classes are named like 'ezcDocumentHtmlDocument'.

The extending classes implement the required actions to load a file or string
in the respective markup and return an arbitrary structure describing the
document and containing all available markup information. For each document
there will be at least a conversion to the DocBook__ format.

The document classes may implement additional interfaces for direct access to
some converted format, if there are shortcut implementations available, like
the following example shows::

	<?php
	class ezcDocumentRstDocument extends ezcDocument implements
		ezcDocumentHtmlConversion
	{
		// Required by ezcDocumentHtmlConversion interface
		public function getAsHtml()
		{
			// Use direct conversion using `rst2html.py`
		}
	}
	?>

Beside loading a document all document classes need to implement save()
method, which returns the document serialized as a string.

ezcDocumentWikiBase class
^^^^^^^^^^^^^^^^^^^^^^^^^

There may be extensions, like a basic wiki syntax parser, from the document
class, which implement a set of parse helper functions for some set of markup
languages.

ezcDocumentConversion interface
-------------------------------

Interface that defines an abstract conversion class. Real conversion classes
will implement conversion of the given document from one format to another.
The names of that classes follow the pattern:
'ezcDocumentRstToHtmlConversion'.

The main method of this abstract class, convert(), takes a document of the
first format and returns a document in the other format. Both objects extend
the ezcDocument class::

	<?php
	abstract class ezcDocumentConversion
	{
		/**
		 * @return ezcDocument
		 */
		abstract public function convert( ezcDocument $source );
	}
	?>

ezcDocumentXsltConversion
^^^^^^^^^^^^^^^^^^^^^^^^^

There are a lot of conversions which may just work on the base of an XSLT. For
all those conversions an abstract ezcDocumentXsltConversion class is provided,
which just takes a set of XSLT files to apply to the XML of the source
document and handles the selection of a proper XSLT engine.

ezcDocumentManager
------------------

The document manager knows about the document parsers to use for each format
and offers a set of static convenient methods to change the used parser for a
document type, or directly create a document from a string or file. The Syntax
for accessing the document manager could look like::

	<?php
	ezcDocumentManager::setFormat( 'rst', 'myCustomRstDocument' );
	$doc = ezcDocumentManager::loadFile( 'rst', '/path/to/my.rst' );
	?>

The document manager initiall knows about all internal formats and has them
registered.

ezcDocumentConversionManager
----------------------------

This won't be implemented in the first release.

While you may call the conversion directly, the conversion manager has a set
of registered conversion routes and offers you direct access to the document
conversions, like::

	<?php
	$result = ezcDocumentConversionManager::convert( 
		'rst', 'xhtml', '/path/to/my.rst'
	);
	?>

There will be an interface to overwrite default conversion routes and add or
change the classes used for some conversion.

Multiple conversions
^^^^^^^^^^^^^^^^^^^^

There may be multiple conversion implementations for a set of formats. A
conversion from ezp4xml to HTML may not only be done by using an XSLT, but also
using a simple templating enging, based on string replacements, or a
ezcDocumentTemplateTieIn which uses ezcTemplate to convert between the
documents.

ezcDocumentValidation interface
-------------------------------

The document classes may implement the ezcDocumentValidation, which provides
the method validate( $file ), which can be used to validate an input file
before trying to load it.

The XML based formats will usually implement this by checking against a
RelaxNG based document definition.

Algorithms
==========

Transforming XML
----------------

XML documents are transformed using XSLT stylesheets and XSL extension for
PHP.  Transformations are done with ezcDocXSLTTransformer class. Optionally
the transformations may be done by the xsltrans CLI utility.

Examples
========

Examples for the usage of the ezcDocument API.

Loading a document
------------------

::
	$doc = new ezcDocumentRstDocument();
	$doc->loadFile( '/path/to/my.rst' );

Validating a set of documents
-----------------------------

::
	$doc = new ezcDocumentXhtmlDocument();
	$result_1 = $doc->validate( '/path/to/first.htm' );
	$result_2 = $doc->validate( '/path/to/second.htm' );

	foreach ( $result_1 as $error )
	{
		echo "- {$error->msg} in {$error->line} in {$error->file}.\n";
	}

Convert between documents
-------------------------

::
	$doc = new ezcDocumentRstDocument();
	$doc->loadFile( '/path/to/my.rst' );

	$converter = new ezcDocumentRstToHtmlConversion();
	$converter->option->literalTag = 'code';
	$html = $converter->convert( $doc );

	// __toString() will of course be implemented, too.
	echo $doc->save();

Using the document manager
--------------------------

::
	ezcDocumentManager::setFormat( 'rst', 'myCustomRstHandler' );
	$doc = ezcDocumentManager::loadFile( 'rst', '/path/to/my.rst' );

	// All errors in the RST syntax should now be magically fixed.
	echo $doc;


..
   Local Variables:
   mode: rst
   fill-column: 79
   End:
   vim: et syn=rst tw=79
