Welcome to mirror list, hosted at ThFree Co, Russian Federation.

github.com/mono/mono.git - Unnamed repository; edit this file 'description' to name the repository.
summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
Diffstat (limited to 'web/xml-classes')
-rwxr-xr-xweb/xml-classes464
1 files changed, 464 insertions, 0 deletions
diff --git a/web/xml-classes b/web/xml-classes
new file mode 100755
index 00000000000..143a3af60f5
--- /dev/null
+++ b/web/xml-classes
@@ -0,0 +1,464 @@
+* XML Classes Status and Tasks
+
+** Abstract
+
+ XML library is used by several areas of Mono such as ADO.NET and XML
+ Digital Signature (xmldsig). Here I write about System.Xml.dll and
+ related tools. This page won't include any classes which are in other
+ assemblies such as XmlDataDocument.
+
+ Note that current corlib has its own XML parser class (Mono.Xml.MiniParser).
+
+ Basically System.XML.dll feature is almost finished, so I write this
+ document mainly for bugs and improvement hints.
+
+** Status
+
+*** System.Xml namespace
+
+
+**** Document Object Model (Core)
+
+ DOM implementation has finished and our DOM implementation scores better
+ than MS.NET as to the NIST DOM test results (it is ported by Mainsoft
+ hackers and in our unit tests).
+
+**** Xml Writer
+
+ Here XmlWriter almost equals to XmlTextWriter. If you want to see
+ another implementation, check XmlNodeWriter.cs and DTMXPathDocumentWriter.cs
+ in System.XML sources.
+
+ XmlTextWriter is completed, though it looks a bit slower than MS.NET (I
+ tried 1.1).
+
+**** XmlResolver
+
+ XmlUrlResolver is implemented.
+
+ XmlSecureResolver, which is introduced in MS .NET Framework 1.1 is basically
+ implemented, but it requires CAS (code access security) feature. We need to
+ fixup this class after ongoing CAS effort works.
+
+ You might also be interested in an improved <a href="http://codeblogs.ximian.com/blogs/benm/archives/000039.html">XmlCachingResolver</a> by Ben Maurer.
+ If even one time download is not acceptable, you can use <a href="http://primates.ximian.com/~atsushi/XmlStoredResolver.cs">this one</a>.
+
+ [2.0] XmlDataSourceResolver is not implemented as yet.
+
+**** XmlNameTable
+
+ NameTable is implemented, but also needs performance improvement.
+ It affects on the whole XML processing performance so much.
+ Optimization hackings are welcome. There is also a <a
+ href="http://bugzilla.ximian.com/show_bug.cgi?id=59537">bugzilla entry</a>
+ for this matter.
+
+**** XML Reader
+
+ XmlTextReader, XmlNodeReader and XmlValidatingReader are almost finished.
+
+ <ul>
+ * All OASIS conformance test passes as Microsoft does. Some
+ W3C tests fail, but it looks better.
+ * Entity expansion and its well-formedness check is incomplete.
+ It incorrectly allows divided content models. It incorrectly
+ treats its Base URI, so some dtd parse fails.
+ * I won't add any XDR support on XmlValidatingReader. (I haven't
+ ever seen XDR used other than Microsoft's BizTalk Server 2000,
+ and Now they have 2002 with XML Schema support). If anyone
+ contributes an implementation, it would be still nice.
+ </ul>
+
+ XmlTextReader and XmlValidatingReader should be faster than now. Currently
+ XmlTextReader looks nearly twice as slow as MS.NET, and XmlValidatingReader
+ (which uses this slow XmlTextReader) looks nearly three times slower. (Note
+ that XmlValidatingReader wouldn't be so slow as itself. It uses schema
+ validating reader and dtd validating reader.)
+
+
+**** Some Advantages
+
+ The design of Mono's XmlValidatingReader is radically different from
+ that of Microsoft's implementation. Under MS.NET, DTD content validation
+ engine is in fact simple replacement of XML Schema validation engine.
+ Mono's DTD validation is designed fully separate and does validation
+ as normal XML parser does. For example, Mono allows non-deterministic DTD.
+
+ Another advantage of this XmlValidatingReader is support for *any* XmlReader.
+ Microsoft supports only XmlTextReader (this bug is fixed in .NET 2.0 beta,
+ taking shape of XmlReader.Create()).
+
+ <del>I added extra support interface named "IHasXmlParserContext", which is
+ considered in XmlValidatingReader.ResolveEntity(). </del><ins>This is now
+ made as internal interface.</ins> Microsoft failed to design XmlReader
+ so that XmlReader cannot be subtree-pluggable (i.e. wrapping use of other
+ XmlReader) since XmlParserContext shoud be supplied for DTD information
+ support (e.g. entity references cannot be expanded) and namespace manager.
+ (In .NET 2.0, Microsoft also supported similar to IHasXmlParserContext,
+ named IXmlNamespaceResolver, but it still does not provide DTD information.)
+
+ We also have RELAX NG validating reader (described later).
+
+
+*** System.Xml.Schema
+
+**** Summary
+
+ Basically it is completed. You can test how current schema validation engine
+ is complete (incomplete) by using standalone test module (see
+ mcs/class/System.XML/Test/System.Xml.Schema/standalone_tests).
+ At least in my box, msxsdtest fails only 30 cases with bugfixed catalog -
+ this score is better than that of Microsoft implementation. But instead,
+ we need performance boost. There should be many points to improve
+ schema compilation and validation.
+
+**** Schema Object Model
+
+ Completed, except for some things to be fixed:
+
+ <ul>
+ * Complete facet support. Currently some of them is missing.
+ Recently David Sheldon is doing several fixes on them.
+ * ContentTypeParticle for pointless xs:choice is incomplete
+ (fixing this arose another bugs in compilation.
+ Interestingly, MS.NET also fails around here, so it might
+ be nature of ContentTypeParticle design)
+ * Some derivation by restriction (DBR) handling is incorrect.
+ </ul>
+
+**** Validating Reader
+
+ Basically this is implemented and actually its feature is complete,
+ but I have only did validation feature testing. So we have to write more
+ tests on properties, methods, and events (validation errors).
+
+
+*** System.Xml.Serialization
+
+ Lluis rules ;-)
+
+ Well, in fact XmlSerializer is almost finished and is on bugfix phase.
+
+ However, we appliciate more tests. Please try
+
+ <ul>
+ * System.Web.Services to invoke SOAP services.
+ * xsd.exe and wsdl.exe to create classes.
+ </ul>
+
+ And if any problems were found, please file it to bugzilla.
+
+ Lluis also built interesting standalone test system placed under
+ mcs/class/System.Web.Services/Test/standalone.
+
+ You might also interested in "genxs", which enables you to create custom
+ XML serializer. This is not included in Microsoft.NET.
+ See <a
+ href="http://primates.ximian.com/~lluis/blog/archives/000120.html">here</a>
+ and manpages for details. Code files are in mcs/tools/genxs.
+
+ Lluis also created "sgen", that based on XmlSerializer.GenerateSerializer().
+ Code files are in mcs/tools/sgen.
+
+*** System.Xml.XPath and System.Xml.Xsl
+
+ There are two XSLT implementations. One and historical implementation is
+ based on libxslt (aka Unmanaged XSLT). Now we uses fully implemented and
+ managed XSLT by default. To use Unmanaged XSLT, set MONO_UNMANAGED_XSLT
+ environment value (any value is acceptable).
+
+ As for Managed XSLT, we support msxsl:script.
+
+ It would be nice if we can support <a href="http://www.exslt.org/">EXSLT</a>.
+ <a href="http://msdn.microsoft.com/WebServices/default.aspx?pull=/library/en-us/dnexxml/html/xml05192003.asp">Microsoft has tried to do some of them</a>,
+ but it is not successful because of System.Xml.Xsl design problem:
+
+ <ul>
+ * In general, .NET's "extension objects" (including
+ msxsl:script) is not useful to return node-sets (MS XSLT
+ implementation rejects just overriden XPathNodeIterator,
+ but accepts only their hidden classes. And are the same
+ in Mono though classes are different)
+
+ * In .NET's extension object design, extension function name
+ is a valid method name that cannot contain some characters
+ such as '-'. That is, implementing EXSLT in C# is impossible.
+ </ul>
+
+ So if we support EXSLT, it has to be done inside our System.XML.dll.
+ Microsoft developers are also aware of this problem and some of them wish
+ to have EXSLT support in WinFX (not whidbey). If anyone is interested
+ in it, it would be nice.
+
+ Our managed XSLT implementation is slower than MS XSLT for some kind of
+ stylesheets, and faster for some.
+
+
+*** RELAX NG
+
+ I implemented an experimental RelaxngValidatingReader. It is still not
+ complete, for example some simplification stuff (see RELAX NG spec
+ chapter 4; especially 4.17-19) and some constraints (especially 7.3).
+ See mcs/class/Commons.Xml.Relaxng/README for details.
+
+ Currently we have
+
+ <ul>
+ * Custom datatype support. Right now, you can use XML schema
+ datatypes ( http://www.w3.org/2001/XMLSchema-datatypes )
+ as well as RELAX NG default datatypes (as used in relaxng.rng).
+
+ * RELAX NG Compact Syntax support, though not yet stable.
+ See Commons.Xml.Relaxng.Rnc.RncParser class.
+</ul>
+
+
+** System.Xml v2.0
+
+ Microsoft released the first public beta version of .NET Framework 2.0,
+ available from <a href="http://www.microsoft.com/downloads/details.aspx?familyid=916EC067-8BDC-4737-9430-6CEC9667655C&displaylang=en">MSDN</a>.
+ It contains several new classes.
+
+ There are two assemblies related to System.Xml v2.0; System.Xml.dll and
+ System.Data.SqlXml.dll. Now that System.Data.SqlXml.dll is little important.
+ It just contains only XQueryCommand class inside System.Xml.* namespace.
+ Most of the important part are in System.Xml.dll.
+
+ Note that .NET Framework is pre-release version, so they are subject
+ to change. Actually many of the pre-released classes vanished.
+
+ System.Xml 2.0 contains several features such as:
+
+ <ul>
+ * new XPathNavigator <del>and XPathDocument</del><ins>XPathDocument is <a href="http://blogs.msdn.com/dareobasanjo/archive/2004/08/25/220251.aspx">being reverted</a></ins>
+ * XmlReaderSettings, XmlWriterSettings and factory methods
+ * Strongly typed XmlReader and XmlWriter.
+ * XML Schema design changes
+ * XSD Inference
+ * Well-documented and improved XmlSerializer.
+ * XQuery execution engine
+ * XQuery and XSLT per-stylesheet assembly generator
+ </ul>
+
+*** System.Xml 2.0
+
+**** XmlReader/XmlWrier Factory methods
+
+ In .NET 2.0, XmlTextReader, XmlNodeReader, XmlValidatingReader are
+ obsolete and XmlReader.Create() is recommended (there is however no
+ alternative way to create XmlNodeReader). Similarly, there are
+ XmlWriter.Create() overloads.
+
+ Currently, Microsoft's XmlWriter.Create() is unreliable and maybe there
+ will be changes. So basically XmlWriter.Create() is supposed to be done
+ after the next beta version of .NET 2.0.
+
+ Some of XmlReader.Create() overloads are implemented, with limited
+ XmlReaderSettings support.
+
+
+**** Typed XmlReader/XmlWriter
+
+ In .NET 2.0, XmlReader is supposed to support strongly-typed data reading.
+ They are based on W3C "XML Schema Datatypes" Recommendation and "XQuery 1.0
+ and XPath 2.0 Data Model" Working Draft.
+
+ Some of XmlReader.ReadValueAsXxx() and XmlWriter.WriteValue() overloads are
+ implemented, though incompletely. They are based on internal XQueryConvert.
+
+
+**** Sub-tree handling in XmlReader/XmlWriter/XPathNavigator
+
+ Currently XmlReader.ReadSubtree(), XmlWriter.WriteSubtree() and
+ XPathNavigator.ReadSubtree() are implemented, though not well-tested.
+ They are based on Mono.Xml.SubtreeXmlReader and
+ Mono.Xml.XPath.XPathNavigatorReader classes.
+
+
+*** System.Xml.Schema 2.0
+
+ Since .NET 1.x is not so compliant with W3C XML Schema specification,
+ Microsoft had to redesign System.Xml.Schema classes. We also have to
+ change many things.
+
+ 1) It does not expose XmlSchemaDatatype anymore (except for obsolete
+ members). Primitive types are represented as XmlSchemaSimpleType
+ instances (thus there are ElementSchemaType, AttributeSchemaType,
+ BaseXmlSchemaType that replace some existing properties).
+
+ 2) "XQuery 1.0 and XPath 2.0 Data Model" datatypes (such as
+ xdt:dayTimeDuration) are newly supported. They are partially implemented
+ yet. This task is partly done.
+
+ 3) schema structures are now bound in parent-child relationship. It is
+ not yet implemented. With related to it, there seems bunch of schema
+ compilation bugfixes.
+
+ 4) XmlSchemaCollection is not used anymore to represent effective set of
+ schemas. Instead, new XmlSchemaSet class is used. It should affect on
+ schema compilation design. In fact, I've implemented XmlSchemaCollection
+ as more conformant to W3C specification, but there are still many changes
+ required. This task is partly done.
+
+
+**** XSD Inference
+
+ In .NET 2.0, there is an XML Schema inference implementation. Now that
+ XmlSchemaSet is basically implemented, it can be separately done by anyone.
+ Volunteer efforts are welcome here.
+
+
+*** System.Xml.XPath 2.0
+
+**** Editable XPathDocument
+
+ <del>
+ in .NET 2.0 XPathDocument is supposed to be editable. Currently we provide
+ fast document table model based implementation (DTMXPathNavigator), but
+ by that design change, we (and they) cannot provide fast read only
+ XPathNavigator from XPathDocument anymore.
+ </del><ins>
+ It is being reverted to the original (.NET 1.x) XPathDocument. We still have
+ them, but we'll revert them too in the future. So our XPathDocument will be still faster one.
+ </ins>
+
+ Currently, new XPathDocument implementation is provided. The actual
+ implementation is Mono.Xml.XPath.XPathDocument2, that is simple dom-like
+ tree model. XPathDocument2 implements the same interfaces as XPathDocument
+ does. And XPathDocument delegates most of the methods to that class (for
+ example, XPathDocument.CreateEditor() calls XPathDocument2.CreateEditor()).
+
+ Currently Mono.Xml.XPath.XPathDocument2 is unstable (it does not pass
+ the standalone XSLT tests unlike existing DTMXPathDocument does). So
+ it did not replace existing XPathDocument implementation, but you can use
+ new implementation by explicitly setting environment value
+ USE_XPATH_DOCUMENT_2 = yes. Currently it supports (well, is supposed
+ to support) basic editor feature such as AppendChild(). Other members
+ are untested (such as RejectChanges()).
+
+**** extra stuff - XPathEditableDocument
+
+ Currently we provide another IXPathEditable; XPathEditableDocument. That is
+ based on the idea that handles XmlDocument as editor target. It is
+ implemented as Mono.Xml.XPath.XPathEditableDocument. We might provide this
+ class as extra set (might be different mono-specific XML assembly).
+
+
+**** System.Xml.XQuery
+
+ In this namespace, there are two significant classes. XsltCommand and
+ XQueryCommand.
+
+ XsltCommand implements XSLT transformation. It is almost the same as
+ System.Xml.Xsl.XslTransform, but this class transforms documents twice
+ to four times as fast as XslTransform. Instead, stylesheet compilation
+ is much slower, because it generates compiled stylesheet assembly.
+
+ XQueryCommand implements XQuery. XQuery is a new face XML document
+ manipulation language (at least new face in .NET world). It is similar
+ to XSLT, but extended to support XML Schema based datatypes (and it is
+ not XML based langauge). It is similar to XPath, but it can construct
+ XML nodes. It has no complicated template resolution, but works like
+ functional languages.
+
+ Under MS.NET, XQuery implementation is mainly in System.Xml.Query and
+ MS.Internal.Xml.* namespaces. The implementation is mostly
+ in System.Xml.dll. It is also true to our System.Xml.dll. Our XQueryCommand
+ in System.Data.SqlXml.dll just invokes the actual XQuery processor
+ (Mono.Xml.XPath2.XQueryCommandImpl) which resides in System.Xml.dll via
+ reflection.
+
+ Currently we are not implementing MS.Internal.Xml.* classes. MS
+ implementation is based on an old version of the W3C specification, and
+ our implementation is currently based on
+ <a href="http://www.w3.org/TR/2004/WD-xquery-20040723/">23 July 2004
+ version</a> (latest as of now) of the working draft.
+
+ XQuery implementation tasks are:
+
+ <ul>
+ * XQuery syntax parser that parses xquery string to AST
+ (abstract syntax tree). -> partly not done.
+
+ * XQuery AST compiler into static context -> partly not done.
+
+ * XQuery (dynamic context) runtime = XQuery expression evaluator
+ + sequence iterator. -> partly not done.
+
+ * XPathItem data model and (mainly) conversion support.
+ -> partly done.
+
+ * Applied expression classes for XQuery/XPath 2.0 functions and
+ operators. -> partly done.
+
+ * Optimization, and design per-query assembly code generator (later)
+ </ul>
+
+ It already handles some queries, while major part implementation is missing
+ or buggy (like FLWOR, expressions for sequence type handling, built-in
+ function support etc.).
+
+
+*** Relax NG and DSDL in Mono 1.2
+
+ Currently we support only RELAX NG as one part of ISO DSDL effort. There
+ is existing Schematron implementation (NMatrix Project: <a
+ href="http://sourceforge.net/projects/dotnetopensrc/">
+ http://sourceforge.net/projects/dotnetopensrc/</a>). With a few changes,
+ it can be used with mono.
+
+ We also don't have multi-language based validation support, namely
+ Namespace-based Validation Dispatch Language (NVDL). To support unwrapping,
+ one special XmlReader implementation is required (other schema validation
+ support can be done by ReadSubtree()). Note that we had seen RELAX
+ Namespace, Modular Namespace (MNS) and Namespace Routing Language (NRL)
+ - that is, standardization effort is still ongoing (though NVDL looks
+ mostly the same as NRL).
+
+ In Mono 1.2, there might be improvements on Commons.Xml.Relaxng.
+
+ <ul>
+ * Currently RelaxngPattern.Compile() provides cheap compilation
+ error information. At least it can provide error location.
+ Also, the type of error should be kind of
+ RelaxngGrammarException.
+
+ * Right now there is no ambiguity detection implementation that
+ would be useful for RelaxngPattern based xml serialization (if
+ there is need).
+
+ * Because of lack of ambiguity detection, there is no way to
+ provide XmlMapping (XmlTypeMapping/XmlMemberMapping). But
+ If anyone is interested in such effort, integration with
+ XmlSerializer would be interesting task.
+ </ul>
+
+
+** Tools
+
+*** xsd.exe
+
+ See <a href="ado-net.html">ADO.NET page</a>.
+
+
+** Miscellaneous
+
+*** Mutual assembly dependency
+
+ Sometimes I hear complain about System.dll and System.Xml.dll mutual
+ dependency: System.dll references to System.Xml.dll (e.g.
+ System.Configuration.ConfigXmlDocument extended from XmlDocument), while
+ System.Xml.dll vice versa (e.g. XmlUrlResolver.ResolveUri takes System.Uri).
+ Since they are in public method signatures, so at least we cannot get rid
+ of these mutual references.
+
+ Nowadays System.Xml.dll is built using incomplete System.dll (lacking
+ System.Xml dependent classes such as ConfigXmlDocument). Full System.dll
+ is built after System.Xml.dll is done.
+
+ Note that you still need System.dll to run mcs.
+
+
+ Atsushi Eno <asushi@ximian.com>
+ last updated 09/02/2004
+