diff options
Diffstat (limited to 'xsd/doc/xsd-epilogue.1')
-rw-r--r-- | xsd/doc/xsd-epilogue.1 | 566 |
1 files changed, 566 insertions, 0 deletions
diff --git a/xsd/doc/xsd-epilogue.1 b/xsd/doc/xsd-epilogue.1 new file mode 100644 index 0000000..2b78ff7 --- /dev/null +++ b/xsd/doc/xsd-epilogue.1 @@ -0,0 +1,566 @@ +\" +\" NAMING CONVENTION +\" + +.SH NAMING CONVENTION +The compiler can be instructed to use a particular naming convention in +the generated code. A number of widely-used conventions can be selected +using the +.B --type-naming +and +.B --function-naming +options. A custom naming convention can be achieved using the +.BR --type-regex , +.BR --accessor-regex , +.BR --one-accessor-regex , +.BR --opt-accessor-regex , +.BR --seq-accessor-regex , +.BR --modifier-regex , +.BR --one-modifier-regex , +.BR --opt-modifier-regex , +.BR --seq-modifier-regex , +.BR --parser-regex , +.BR --serializer-regex , +.BR --const-regex , +.BR --enumerator-regex , +and +.B --element-type-regex +options. + +The +.B --type-naming +option specifies the convention that should be used for naming C++ types. +Possible values for this option are +.B knr +(default), +.BR ucc , +and +.BR java . +The +.B knr +value (stands for K&R) signifies the standard, lower-case naming convention +with the underscore used as a word delimiter, for example: foo, foo_bar. +The +.B ucc +(stands for upper-camel-case) and +.B java +values a synonyms for the same naming convention where the first letter +of each word in the name is capitalized, for example: Foo, FooBar. + +Similarly, the +.B --function-naming +option specifies the convention that should be used for naming C++ functions. +Possible values for this option are +.B knr +(default), +.BR lcc , +and +.BR java . +The +.B knr +value (stands for K&R) signifies the standard, lower-case naming convention +with the underscore used as a word delimiter, for example: foo(), foo_bar(). +The +.B lcc +value (stands for lower-camel-case) signifies a naming convention where the +first letter of each word except the first is capitalized, for example: foo(), +fooBar(). The +.B java +naming convention is similar to the lower-camel-case one except that accessor +functions are prefixed with get, modifier functions are prefixed with set, +parsing functions are prefixed with parse, and serialization functions are +prefixed with serialize, for example: getFoo(), setFooBar(), parseRoot(), +serializeRoot(). + +Note that the naming conventions specified with the +.B --type-naming +and +.B --function-naming +options perform only limited transformations on the +names that come from the schema in the form of type, attribute, and element +names. In other words, to get consistent results, your schemas should follow +a similar naming convention as the one you would like to have in the generated +code. Alternatively, you can use the +.B --*-regex +options (discussed below) to perform further transformations on the names +that come from the schema. + +The +.BR --type-regex , +.BR --accessor-regex , +.BR --one-accessor-regex , +.BR --opt-accessor-regex , +.BR --seq-accessor-regex , +.BR --modifier-regex , +.BR --one-modifier-regex , +.BR --opt-modifier-regex , +.BR --seq-modifier-regex , +.BR --parser-regex , +.BR --serializer-regex , +.BR --const-regex , +.BR --enumerator-regex , +and +.B --element-type-regex +options allow you to specify extra regular expressions for each name +category in addition to the predefined set that is added depending on +the +.B --type-naming +and +.B --function-naming +options. Expressions that are provided with the +.B --*-regex +options are evaluated prior to any predefined expressions. This allows +you to selectively override some or all of the predefined transformations. +When debugging your own expressions, it is often useful to see which +expressions match which names. The +.B --name-regex-trace +option allows you to trace the process of applying +regular expressions to names. + +The value for the +.B --*-regex +options should be a perl-like regular expression in the form +.BI / pattern / replacement /\fR. +Any character can be used as a delimiter instead of +.BR / . +Escaping of the delimiter character in +.I pattern +or +.I replacement +is not supported. All the regular expressions for each category are pushed +into a category-specific stack with the last specified expression +considered first. The first match that succeeds is used. For the +.B --one-accessor-regex +(accessors with cardinality one), +.B --opt-accessor-regex +(accessors with cardinality optional), and +.B --seq-accessor-regex +(accessors with cardinality sequence) categories the +.B --accessor-regex +expressions are used as a fallback. For the +.BR --one-modifier-regex , +.BR --opt-modifier-regex , +and +.B --seq-modifier-regex +categories the +.B --modifier-regex +expressions are used as a fallback. For the +.B --element-type-regex +category the +.B --type-regex +expressions are used as a fallback. + +The type name expressions +.RB ( --type-regex ) +are evaluated on the name string that has the following format: + +[\fInamespace \fR]\fIname\fR[\fB,\fIname\fR][\fB,\fIname\fR][\fB,\fIname\fR] + +The element type name expressions +.RB ( --element-type-regex ), +effective only when the +.B --generate-element-type +option is specified, are evaluated on the name string that has the following +format: + +.I namespace name + +In the type name format the +.I namespace +part followed by a space is only present for global type names. For global +types and elements defined in schemas without a target namespace, the +.I namespace +part is empty but the space is still present. In the type name format after +the initial +.I name +component, up to three additional +.I name +components can be present, separated by commas. For example: + +.B http://example.com/hello type + +.B foo + +.B foo,iterator + +.B foo,const,iterator + +The following set of predefined regular expressions is used to transform +type names when the upper-camel-case naming convention is selected: + +.B /(?:[^ ]* )?([^,]+)/\\\\u$1/ + +.B /(?:[^ ]* )?([^,]+),([^,]+)/\\\\u$1\\\\u$2/ + +.B /(?:[^ ]* )?([^,]+),([^,]+),([^,]+)/\\\\u$1\\\\u$2\\\\u$3/ + +.B /(?:[^ ]* )?([^,]+),([^,]+),([^,]+),([^,]+)/\\\\u$1\\\\u$2\\\\u$3\\\\u$4/ + +The accessor and modifier expressions +.RB ( --*accessor-regex +and +.BR --*modifier-regex ) +are evaluated on the name string that has the following format: + +\fIname\fR[\fB,\fIname\fR][\fB,\fIname\fR] + +After the initial +.I name +component, up to two additional +.I name +components can be present, separated by commas. For example: + +.B foo + +.B dom,document + +.B foo,default,value + +The following set of predefined regular expressions is used to transform +accessor names when the +.B java +naming convention is selected: + +.B /([^,]+)/get\\\\u$1/ + +.B /([^,]+),([^,]+)/get\\\\u$1\\\\u$2/ + +.B /([^,]+),([^,]+),([^,]+)/get\\\\u$1\\\\u$2\\\\u$3/ + +For the parser, serializer, and enumerator categories, the corresponding +regular expressions are evaluated on local names of elements and on +enumeration values, respectively. For example, the following predefined +regular expression is used to transform parsing function names when the +.B java +naming convention is selected: + +.B /(.+)/parse\\\\u$1/ + +The const category is used to create C++ constant names for the +element/wildcard/text content ids in ordered types. + +See also the REGEX AND SHELL QUOTING section below. + +\" +\" TYPE MAP +\" +.SH TYPE MAP +Type map files are used in C++/Parser to define a mapping between XML +Schema and C++ types. The compiler uses this information to determine +the return types of +.B post_* +functions in parser skeletons corresponding to XML Schema types +as well as argument types for callbacks corresponding to elements +and attributes of these types. + +The compiler has a set of predefined mapping rules that map built-in +XML Schema types to suitable C++ types (discussed below) and all +other types to +.BR void . +By providing your own type maps you can override these predefined rules. +The format of the type map file is presented below: + +.RS +.B namespace +.I schema-namespace +[ +.I cxx-namespace +] +.br +.B { +.br + ( +.B include +.IB file-name ; +)* +.br + ([ +.B type +] +.I schema-type cxx-ret-type +[ +.I cxx-arg-type +.RB ] ; +)* +.br +.B } +.br +.RE + +Both +.I schema-namespace +and +.I schema-type +are regex patterns while +.IR cxx-namespace , +.IR cxx-ret-type , +and +.I cxx-arg-type +are regex pattern substitutions. All names can be optionally enclosed +in \fR" "\fR, for example, to include white-spaces. + +.I schema-namespace +determines XML Schema namespace. Optional +.I cxx-namespace +is prefixed to every C++ type name in this namespace declaration. +.I cxx-ret-type +is a C++ type name that is used as a return type for the +.B post_* +functions. Optional +.I cxx-arg-type +is an argument type for callback functions corresponding to elements and +attributes of this type. If +.I cxx-arg-type +is not specified, it defaults to +.I cxx-ret-type +if +.I cxx-ret-type +ends with +.B * +or +.B & +(that is, it is a pointer or a reference) and +.B const +\fIcxx-ret-type\fB&\fR otherwise. +.I file-name +is a file name either in the \fR" "\fR or < > format and is added with the +.B #include +directive to the generated code. + +The \fB#\fR character starts a comment that ends with a new line or end of +file. To specify a name that contains \fB#\fR enclose it in \fR" "\fR. For +example: + +.RS +namespace http://www.example.com/xmlns/my my +.br +{ +.br + include "my.hxx"; +.br + + # Pass apples by value. + # + apple apple; +.br + + # Pass oranges as pointers. + # + orange orange_t*; +.br +} +.br +.RE + +In the example above, for the +.B http://www.example.com/xmlns/my#orange +XML Schema type, the +.B my::orange_t* +C++ type will be used as both return and argument types. + +Several namespace declarations can be specified in a single file. +The namespace declaration can also be completely omitted to map +types in a schema without a namespace. For instance: + +.RS +include "my.hxx"; +.br +apple apple; +.br + +namespace http://www.example.com/xmlns/my +.br +{ +.br + orange "const orange_t*"; +.br +} +.br +.RE + + +The compiler has a number of predefined mapping rules that can be +presented as the following map files. The string-based XML Schema +built-in types are mapped to either +.B std::string +or +.B std::wstring +depending on the character type selected with the +.B --char-type +option +.RB ( char +by default). + +.RS +namespace http://www.w3.org/2001/XMLSchema +.br +{ +.br + boolean bool bool; +.br + + byte "signed char" "signed char"; +.br + unsignedByte "unsigned char" "unsigned char"; +.br + + short short short; +.br + unsignedShort "unsigned short" "unsigned short"; +.br + + int int int; +.br + unsignedInt "unsigned int" "unsigned int"; +.br + + long "long long" "long long"; +.br + unsignedLong "unsigned long long" "unsigned long long"; +.br + + integer "long long" "long long"; +.br + + negativeInteger "long long" "long long"; +.br + nonPositiveInteger "long long" "long long"; +.br + + positiveInteger "unsigned long long" "unsigned long long"; +.br + nonNegativeInteger "unsigned long long" "unsigned long long"; +.br + + float float float; +.br + double double double; +.br + decimal double double; +.br + + string std::string; +.br + normalizedString std::string; +.br + token std::string; +.br + Name std::string; +.br + NMTOKEN std::string; +.br + NCName std::string; +.br + ID std::string; +.br + IDREF std::string; +.br + language std::string; +.br + anyURI std::string; +.br + + NMTOKENS xml_schema::string_sequence; +.br + IDREFS xml_schema::string_sequence; +.br + + QName xml_schema::qname; +.br + + base64Binary std::auto_ptr<xml_schema::buffer> +.br + std::auto_ptr<xml_schema::buffer>; +.br + hexBinary std::auto_ptr<xml_schema::buffer> +.br + std::auto_ptr<xml_schema::buffer>; +.br + + date xml_schema::date; +.br + dateTime xml_schema::date_time; +.br + duration xml_schema::duration; +.br + gDay xml_schema::gday; +.br + gMonth xml_schema::gmonth; +.br + gMonthDay xml_schema::gmonth_day; +.br + gYear xml_schema::gyear; +.br + gYearMonth xml_schema::gyear_month; +.br + time xml_schema::time; +.br +} +.br +.RE + + +The last predefined rule maps anything that wasn't mapped by previous +rules to +.BR void : + +.RS +namespace .* +.br +{ +.br + .* void void; +.br +} +.br +.RE + +When you provide your own type maps with the +.B --type-map +option, they are evaluated first. This allows you to selectively override +predefined rules. + +.\" +.\" REGEX AND SHELL QUOTING +.\" +.SH REGEX AND SHELL QUOTING +When entering a regular expression argument in the shell command line +it is often necessary to use quoting (enclosing the argument in " " +or ' ') in order to prevent the shell from interpreting certain +characters, for example, spaces as argument separators and $ as +variable expansions. + +Unfortunately it is hard to achieve this in a manner that is portable +across POSIX shells, such as those found on GNU/Linux and UNIX, and +Windows shell. For example, if you use " " for quoting you will get +a wrong result with POSIX shells if your expression contains $. The +standard way of dealing with this on POSIX systems is to use ' ' +instead. Unfortunately, Windows shell does not remove ' ' from +arguments when they are passed to applications. As a result you may +have to use ' ' for POSIX and " " for Windows ($ is not treated as +a special character on Windows). + +Alternatively, you can save regular expression options into a file, +one option per line, and use this file with the +.B --options-file +option. With this approach you don't need to worry about shell quoting. + +.\" +.\" DIAGNOSTICS +.\" +.SH DIAGNOSTICS +If the input file is not a valid W3C XML Schema definition, +.B xsd +will issue diagnostic messages to +.B STDERR +and exit with non-zero exit code. +.SH BUGS +Send bug reports to the xsd-users@codesynthesis.com mailing list. +.SH COPYRIGHT +Copyright (c) 2005-2014 Code Synthesis Tools CC. + +Permission is granted to copy, distribute and/or modify this +document under the terms of the GNU Free Documentation License, +version 1.2; with no Invariant Sections, no Front-Cover Texts and +no Back-Cover Texts. Copy of the license can be obtained from +http://codesynthesis.com/licenses/fdl-1.2.txt |