|Title of Invention||"AN ENCODING SYSTEM FOR ENCODING A STRUCTURED DOCUMENT"|
|Abstract||The invention relates to a method for encoding a structured document (DOC), particularly an XML document, during which the contents of the document (DOC) are converted into a binary representation. This binary representation is divided into encoding units (FUU), which form an encoded data flow (BDOC) and can be read out from the encoded data flow. The encoded data flow contains configuration data, with which configuration information (EC) concerning the division of the binary representation into encoding units (FUU) can be read out before the reading out of one or more encoding units (FUU).|
|Full Text||The invention relates to a method for encoding a structured document, a decoding method and a corresponding encoding and/or decoding device, in which a binary representation of a structured, in particular XML-based document (XML = Extensible Markup Language), is encoded and/or decoded with the aid of a scheme.
Encoding and decoding methods of this type are described for example in publications concerning the MPEG-7 standard, in particular in document . These methods allow the contents of the document, in particular elements and/or attributes and/or data types, to be determined with the aid of bit patterns in an encoded data flow. In this case, the encoded contents are stored in so-called FUU's (FUU - fragment update unit), in which the entire content of the element and/or attribute and/or data type need not be contained in the FUU. Parts of this element and/or attribute and/or data type can be encoded in subsequent FUU's.
The content of XML documents is frequently further processed by a recipient, and prepared for example for display. For this purpose, it is often the case that only specific elements and/or attributes and/or data types are filtered out from the XML document. The process of filtering can be specified, for instance, in an XSLT style sheet (XSLT = Extensible Stylesheet Language Transformations).
According to the prior art, it has proven disadvantageous in applications for processing an XML document that, in order to filter out contents, the whole document is decoded from the bit flow and is only then filtered. The filtering can be accelerated by means of technologies known from the prior art such that FUU's which cannot contain the content to be filtered, as a result of the information contained in the so-called context path of the FUU, are not decoded. It is however not possible to reliably determine, on the basis of the context path, which FUU's actually contain the desired content.
The object of the invention is thus to create a method for encoding a structured document, which enables a simpler and more rapid filtering of contents from the document.
This object is achieved by means of the independent claims. Developments of the invention emerge from the dependent claims.
With the method according to the invention for encoding a structured document, in particular an XML document, the contents of the document are converted into a binary representation. The binary representation is divided into encoding units, which form an encoded data flow, it being possible to read out the encoding units from the encoded data flow. The encoded data flow contains configuration data, with which configuration information concerning the division of the binary representation into encoding units can be read out before one or more encoding units are read out. Therefore, in order to filter out specific contents from the document, it is no longer necessary to decode the entire encoded data flow; instead, it is already possible to determine from the encoded data flow which contents the individual encoding units contain. The filtering of a structured document can thus be significantly accelerated.
In a preferred embodiment of the invention, the configuration information, particularly information concerning missing contents, is in predetermined encoding units. It is thus possible to determine from the encoded data flow which contents are missing in an encoding unit, so that there is no need to decode this encoding unit if searches are made during filtering for precisely this missing content.
In a further preferred embodiment, the configuration data is itself encoded in the encoded data flow, as a result of which the encoding efficiency is increased.
In one configuration of the invention, the configuration data is the configuration information, this configuration information being added to the encoded data flow. In particular, the configuration information can be textually encoded in the form of an XML document. Alternatively, the configuration information can be encoded using an MPEG encoding method.
In one embodiment, the configuration data consists of references to configuration information, with which configuration information is selected from stored configuration information. The entire configuration information need no longer be transmitted, instead this information can be stored in a storage area, which can be accessed by the decoder.
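The two alternatives described above (configuration information carried directly in the data flow, or a reference that selects stored configuration information) can be illustrated by the following minimal Python sketch. All names in it (EncoderConfig, STORED_CONFIGS, resolve_config) and the integer reference format are assumptions made for illustration only; they are not taken from the MPEG-7 or MPEG-21 standards.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Union


@dataclass(frozen=True)
class EncoderConfig:
    """Hypothetical configuration information: contents known to be
    missing from certain encoding units."""
    missing_contents: FrozenSet[str]


# Hypothetical storage area accessible to the decoder, for example
# filled from a profile. The keys are illustrative reference values.
STORED_CONFIGS: Dict[int, EncoderConfig] = {
    0: EncoderConfig(frozenset()),
    1: EncoderConfig(frozenset({"marker"})),
}


def resolve_config(config_data: Union[int, EncoderConfig]) -> EncoderConfig:
    """Return the configuration information for the given configuration data.

    If the data flow carried the configuration information itself, it is
    returned unchanged; if it carried only a reference, the reference is
    looked up in the stored configuration information.
    """
    if isinstance(config_data, EncoderConfig):
        return config_data
    return STORED_CONFIGS[config_data]
```

With a reference, only a small value has to be transmitted, at the cost of requiring that encoder and decoder share the same stored configuration information.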
The document to be encoded is preferably an MPEG description flow, in particular an MPEG-7 or MPEG-21 description flow, the encoding units being fragment update units which in turn form access units. A description of the encoding standard MPEG-21 can be found in document  for instance. The stored configuration information is preferably contained in profiles of an MPEG standard, in particular of the MPEG-7 or the MPEG-21 standard.
In a particularly preferred embodiment, the structured document is an XML document comprising elements and/or attributes and/or data types. If the configuration information is information concerning missing contents, the missing contents particularly comprise at least one element and/or one attribute and/or one data type.
In addition to the above-described method for encoding a data flow, the invention further comprises a method for decoding an encoded data flow, the method being designed such that a data flow encoded with the encoding method according to the invention is decoded. In this case, the configuration information is preferably read out from the encoded data flow.
Furthermore, the invention relates to a method for encoding and decoding a data flow comprising the above-described encoding method according to the invention and the above-mentioned decoding method according to the invention.
The invention further comprises an encoding device, which is designed such that the encoding method according to the invention can be implemented, and a decoding device, which is designed such that the decoding method according to the invention can be implemented. Furthermore, the invention relates to an encoding and decoding device comprising an inventive encoding device and an inventive decoding device.
Exemplary embodiments of the invention are described below in more detail with reference to the attached drawings, in which:
Figure 1 shows a schematic representation of an encoding and decoding system, in which the encoding and decoding method according to the invention is implemented;
Figure 2 shows a schematic representation of the structure of
Figure 3 shows an example of a syntax of an XML document, from which information is to be filtered;
Figure 4 shows an example of a filter specification for filtering out specific information from the binary representation of the XML document in Figure 3; and
Figure 5 shows an exemplary representation of an encoding configuration formatted as an XML document which can be used in the method according to the invention.
Figure 1 shows an exemplary encoding and decoding system with an encoder ENC and a decoder DEC, with which XML documents DOC are encoded and/or decoded. Both the encoder and the decoder have a so-called scheme S in which elements and types of the XML document used for communication are declared and defined. Code tables CT are generated from the scheme S by way of corresponding scheme compilations SC in the encoder and decoder. When the XML document DOC is encoded, the contents of the XML document are assigned binary codes by way of the code tables. Subsequently the codes are divided in the encoder into so-called fragment update units FUU, which are described in more detail in relation to Figure 2. The division of the codes into FUU's depends on the configuration of the encoder. The document DOC is thus converted into an encoded binary format BDOC which is subsequently transmitted to the decoder and then in turn decoded with the aid of the code table CT, thereby reproducing the original document DOC.
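The relationship between the scheme, the code table and the binary codes can be illustrated by the following Python sketch. It is a strongly simplified assumption: it assigns fixed-length codes to a sorted list of names, whereas an actual scheme compilation derives its codes from the type definitions of the scheme. The function names are hypothetical.

```python
from typing import Dict, List


def compile_code_table(schema_names: List[str]) -> Dict[str, str]:
    """Assign a fixed-length bit pattern to every name declared in the scheme.

    Encoder and decoder sort the names identically, so both sides derive
    the same table without transmitting it.
    """
    names = sorted(set(schema_names))
    if not names:
        raise ValueError("the scheme declares no names")
    # Number of bits needed to distinguish all names (at least one bit).
    width = max(1, (len(names) - 1).bit_length())
    return {name: format(index, f"0{width}b") for index, name in enumerate(names)}


def encode_contents(contents: List[str], table: Dict[str, str]) -> List[str]:
    """Replace each name in the document contents by its binary code."""
    return [table[name] for name in contents]


def decode_codes(codes: List[str], table: Dict[str, str]) -> List[str]:
    """Invert the code table to reproduce the original names."""
    inverse = {code: name for name, code in table.items()}
    return [inverse[code] for code in codes]
```

Because both sides compile the table from the same scheme, decoding the codes with the same table reproduces the original sequence of names.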
The method according to the invention is characterized in that information EC concerning the division of the contents of the XML document into FUU's carried out by the encoder is transmitted prior to or in parallel with the transmission of the binary representation of the XML document.
Figure 2 shows the components of a fragment update unit FUU, which represents the binary format of an MPEG-7 description flow. A unit of this type contains a fragment update command, which specifies the operation to be carried out at one node of the description tree of an XML document. Furthermore, the unit comprises a fragment update context, which contains among other things a so-called context path, by means of which the path in the description tree of the document is specified to the node at which the fragment update command is to be implemented. The context path determines which information can be maximally contained in an FUU. Finally, the FUU contains the fragment update payload, i.e. the encoded information to be processed in the corresponding node. For a more precise description of the structure of an FUU, reference should be made to document . An encoded data flow comprises a plurality of fragment update units of this type, these FUU's being in turn combined into so-called access units. In the embodiment of the method according to the invention described here, in addition to the FUU's, configuration information EC is also transmitted in the encoded data flow to the decoder, said configuration information specifying how an XML document is divided into FUU's.
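The structure described above can be summarized, by way of illustration only, as the following Python data model. The class and field names are assumptions, and the sketch holds the bit patterns as strings of "0" and "1" for readability; it does not reproduce the binary syntax of the standard.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass(frozen=True)
class FragmentUpdateUnit:
    command: str       # fragment update command (hypothetical label, e.g. "add")
    context_path: str  # bit pattern of the path to the target node
    payload: bytes     # still-encoded fragment update payload


@dataclass
class AccessUnit:
    """An access unit groups several fragment update units."""
    units: List[FragmentUpdateUnit] = field(default_factory=list)


@dataclass
class EncodedDataFlow:
    """Encoded data flow: configuration information EC plus access units."""
    config_info: str
    access_units: List[AccessUnit] = field(default_factory=list)

    def iter_units(self) -> Iterator[FragmentUpdateUnit]:
        """Yield every fragment update unit in transmission order."""
        for access_unit in self.access_units:
            yield from access_unit.units
```

In this model the configuration information is a field of the data flow itself, so that it can be read before any payload is decoded.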
Figure 3 reproduces an example of a content of an XML document to be encoded. The document comprises among other things four elements named "gBSDUnit", two of these elements containing a so-called marker attribute. Figure 4 shows a filter specification, according to which the document encoded using the method according to the invention is to be filtered. The filter specification determines that a context path is to be sought which contains the element gBSDUnit with the marker attribute. In the present case, this specification corresponds to the bit pattern "11010".
To filter this information from the encoded data flow with the least possible decoding effort, the configuration information of the encoder displayed in XML format in Figure 5 is transmitted to the decoder. This specifies that an access unit contains only gBSDUnits (line 4:
the decoder knows that marker attributes are not contained in FUU's containing gBSDUnits, and the gBSDUnits contained in the fragment update payloads need not be decoded for this purpose,
the decoder need only decode FUU's, the context path of which (see Figure 4) comprises the bit pattern of a context path to a marker attribute. As the comparison of bit patterns can be implemented significantly faster than the decoding of fragment update payloads, the transmission of the configuration information of the encoder can accelerate the filtering.
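The filtering step described above can be sketched in Python as follows. The sketch is an illustrative assumption, not the method of the standard: context paths are held as strings of "0" and "1", a plain substring test stands in for the bit-pattern comparison, and the names FUU, decode_candidates and separation_guaranteed are hypothetical. The flag separation_guaranteed represents what the decoder learns from the configuration information, namely that the sought content never occurs inside the payloads of other units.

```python
from typing import Callable, Iterable, List, NamedTuple


class FUU(NamedTuple):
    context_path: str  # bit pattern of the context path
    payload: bytes     # still-encoded fragment update payload


def decode_candidates(
    units: Iterable[FUU],
    wanted_pattern: str,
    decode_payload: Callable[[bytes], str],
    separation_guaranteed: bool,
) -> List[str]:
    """Decode only the units that may contain the sought content.

    If the configuration information guarantees the separation, units
    whose context path lacks the pattern are skipped without decoding.
    Without that guarantee every payload has to be decoded and the
    result filtered afterwards.
    """
    results = []
    for unit in units:
        path_matches = wanted_pattern in unit.context_path
        if separation_guaranteed and not path_matches:
            continue  # cheap bit-pattern comparison replaces a payload decode
        results.append(decode_payload(unit.payload))
    return results
```

The saving lies in the number of calls to decode_payload: with the guarantee, only units with a matching context path are decoded.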
Text of ISO/IEC FCD 15938-1 Information Technology - Multimedia Content Description Interface - Part 1, Systems
Text of ISO/IEC CD 21000-7 Information Technology - Multimedia Framework - Part 7, Digital Item Adaptation
J. Heuer, C. Thienot, M. Wollborn, "Binary Format", in "Introduction to MPEG-7", Editors: B.S. Manjunath, P. Salembier, T. Sikora, John Wiley & Sons, West Sussex, 2002, pages 61-80.
1. An encoding system for encoding a structured document (DOC), in particular an XML document, comprising an encoding device (ENC) for encoding the structured document by means of a code table (CT) into a binary representation, a scheme (S) for defining elements and types of the structured document (DOC), a scheme compilation (SC) provided for generating the code table (CT) from the scheme (S), and encoding units (FUU) generated by dividing the binary representation and forming an encoded binary format (BDOC) containing configuration information (EC), by means of which configuration information (EC) concerning the division of the binary representation into encoding units (FUU) is readable prior to reading out one or more fragment update units.
2. An encoding system for encoding a structured document as claimed in claim 1, wherein the encoding units include predetermined encoding units (FUU) having configuration information concerning missing contents.
3. An encoding system for encoding a structured document as claimed in claim 1, wherein the encoded data flow (BDOC) has references to the location at which the missing contents are located.
4. An encoding system for encoding a structured document as claimed in claim 1, wherein the encoding units (FUU) being fragment update units which in turn form,
5. An encoding system as claimed in claim 1, for encoding a structured document in particular a document substantially as hereinbefore described with reference to the accompanying drawings.
|Indian Patent Application Number||4004/DELNP/2005|
|PG Journal Number||03/2009|
|Date of Filing||06-Sep-2005|
|Name of Patentee||SIEMENS AKTIENGESELLSCHAFT|
|Applicant Address||WITTELSBACHERPLATZ 2, 80333 MUNICH, GERMANY.|
|PCT International Classification Number||G06F 17/30|
|PCT International Application Number||PCT/EP2004/001992|
|PCT International Filing date||2004-02-27|