XMLserdes — XML Serialisation and Deserialisation
Mechanisms for serializing Python objects to XML, and deserializing them from XML. The top-level object in a serialization is almost always an instance of some class having multiple properties to be de/serialized. Support is provided for declarative specification of how this is to be done.
Top-level functions
- xmlserdes.serialize(obj, tag)
Entry point function to serialize a Python object to an XML element.
- Parameters:
obj (instance of class having xml_descriptor attribute) – Python object to serialize
- Returns:
XML element, as instance of etree.Element.
- xmlserdes.deserialize(cls, elt, expected_tag)
Entry point function to deserialize a Python object from an XML element.
- Parameters:
cls – class of object to deserialize
elt (etree.Element) – XML element
- Returns:
instance of class cls.
See also xmlserdes.XMLSerializable and xmlserdes.XMLSerializableNamedTuple for an ‘intrusive’ API.
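To make the shape of these two entry points concrete, here is a minimal standard-library sketch of a descriptor-driven round trip. It does not use xmlserdes itself: sketch_serialize and sketch_deserialize are hypothetical names, the descriptor is simplified to (field name, type) pairs, and the error type is an assumption.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

# Hypothetical stand-in for a class carrying an xml_descriptor; here the
# descriptor is simply a list of (field name, type) pairs.
Rectangle = namedtuple('Rectangle', 'wd ht')
Rectangle.xml_descriptor = [('wd', int), ('ht', int)]


def sketch_serialize(obj, tag):
    """Build an element with one child per descriptor entry."""
    elt = ET.Element(tag)
    for name, _type in obj.xml_descriptor:
        ET.SubElement(elt, name).text = str(getattr(obj, name))
    return elt


def sketch_deserialize(cls, elt, expected_tag):
    """Rebuild an instance of cls, checking the element's tag first."""
    if elt.tag != expected_tag:
        raise ValueError('expected tag "%s" but got "%s"' % (expected_tag, elt.tag))
    values = [type_(elt.find(name).text) for name, type_ in cls.xml_descriptor]
    return cls(*values)


rect_xml = sketch_serialize(Rectangle(10, 20), 'rect')
rect_text = ET.tostring(rect_xml, encoding='unicode')
rect_back = sketch_deserialize(Rectangle, rect_xml, 'rect')
```

The point of the sketch is only that the class, not the caller, carries the description of how its properties map to XML.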
Classes and functions
- class xmlserdes.XMLSerializable
Base class for types which become serializable to XML via instance method as_xml, and deserializable from XML via class method from_xml. XML behaviour is specified via an xml_descriptor class attribute (of the derived class), which is a list of terse type-descriptor expressions; see xmlserdes.TypeDescriptor.from_terse() for details.
- class xmlserdes.XMLSerializableNamedTuple
Base class for types which are essentially named tuples with the field-names taken from the xml_descriptor.
>>> class Rectangle(xmlserdes.XMLSerializableNamedTuple):
...     xml_default_tag = 'rect'
...     xml_descriptor = [('wd', int), ('ht', int)]
>>> r = Rectangle(10, 20)
>>> print(r)
Rectangle(wd=10, ht=20)
>>> r.wd
10
>>> r.ht
20
>>> print(xmlserdes.utils.str_from_xml_elt(r.as_xml()))
<rect><wd>10</wd><ht>20</ht></rect>
If a field’s name specifies that its value is to be stored in an XML attribute (by starting with the '@' character), then the corresponding field name of the class omits that '@':
>>> class Ellipse(xmlserdes.XMLSerializableNamedTuple):
...     xml_default_tag = 'oval'
...     xml_descriptor = [('@major', int), ('@minor', int),
...                       ('colour', str)]
>>> e = Ellipse(8, 5, 'red')
>>> print(xmlserdes.utils.str_from_xml_elt(e.as_xml()))
<oval major="8" minor="5"><colour>red</colour></oval>
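The '@' convention can be pictured with a short standard-library sketch. This is an illustration of the idea only, not the library's code; sketch_element is a hypothetical helper, and the descriptor is again reduced to (name, type) pairs.

```python
import xml.etree.ElementTree as ET


def sketch_element(tag, descriptor, values):
    """Place '@'-prefixed fields in attributes and the rest in sub-elements."""
    elt = ET.Element(tag)
    for (name, _type), value in zip(descriptor, values):
        if name.startswith('@'):
            # Attribute: the '@' marker is not part of the attribute name.
            elt.set(name[1:], str(value))
        else:
            ET.SubElement(elt, name).text = str(value)
    return elt


ellipse_descriptor = [('@major', int), ('@minor', int), ('colour', str)]
oval = sketch_element('oval', ellipse_descriptor, (8, 5, 'red'))
oval_text = ET.tostring(oval, encoding='unicode')

# Python-side field names drop the '@' marker.
field_names = [name.lstrip('@') for name, _type in ellipse_descriptor]
```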
If a class has no xml_default_tag attribute, one is created with value equal to the class name:
>>> class Circle(xmlserdes.XMLSerializableNamedTuple):
...     xml_descriptor = [('radius', int)]
>>> c = Circle(42)
>>> print(c)
Circle(radius=42)
>>> c.radius
42
>>> print(xmlserdes.utils.str_from_xml_elt(c.as_xml()))
<Circle><radius>42</radius></Circle>
To suppress this behaviour, define an xml_default_tag attribute with value None. This is useful if you wish to force callers of as_xml() to supply the tag:
>>> class Sphere(xmlserdes.XMLSerializableNamedTuple):
...     xml_default_tag = None
...     xml_descriptor = [('radius', int)]
>>> s = Sphere(100)
>>> print(xmlserdes.utils.str_from_xml_elt(s.as_xml('round-object')))
<round-object><radius>100</radius></round-object>
>>> x = s.as_xml()
Traceback (most recent call last):
    ...
AttributeError: 'Sphere' object has no attribute 'xml_default_tag'
If you create a subclass of an XMLSerializableNamedTuple subclass, and do not explicitly specify an xml_default_tag, then the sub-subclass inherits the subclass’s xml_default_tag:
>>> class ShinyCircle(Circle):
...     pass
>>> sc = ShinyCircle(42)
>>> print(sc)
ShinyCircle(radius=42)
>>> sc.radius
42
>>> sc_xml = sc.as_xml()
>>> print(xmlserdes.utils.str_from_xml_elt(sc_xml))
<Circle><radius>42</radius></Circle>
(Note that the tag in the XML is Circle and not ShinyCircle.)
But extracting a ShinyCircle from an XML element works as expected:
>>> sc_round_trip = ShinyCircle.from_xml(sc_xml, 'Circle')
>>> print(sc_round_trip)
ShinyCircle(radius=42)
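The three default-tag rules (created from the class name, suppressed with None, inherited by sub-subclasses) can be imitated in plain Python. The sketch below is one possible way to get that behaviour and is not claimed to be how xmlserdes does it; SketchBase and resolve_tag are hypothetical names, and the sketch raises ValueError where the library reports the missing tag differently.

```python
class SketchBase:
    """Hypothetical base class imitating the default-tag rules described above."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Only invent a default when neither the class nor an ancestor has one.
        if not hasattr(cls, 'xml_default_tag'):
            cls.xml_default_tag = cls.__name__

    def resolve_tag(self, tag=None):
        """Return the explicit tag if given, else the class default."""
        if tag is not None:
            return tag
        if self.xml_default_tag is None:
            raise ValueError('no default tag; caller must supply one')
        return self.xml_default_tag


class Circle(SketchBase):
    pass                      # gets xml_default_tag = 'Circle'


class ShinyCircle(Circle):
    pass                      # inherits 'Circle' rather than 'ShinyCircle'


class Sphere(SketchBase):
    xml_default_tag = None    # forces callers to pass a tag
```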
- class xmlserdes.ElementDescriptor(*args, **kwargs)
Object which represents the mapping between an XML element and a property of a Python object, together with the native Python type of that property.
- Parameters:
tag (str) – tag for XML element to de/serialize from/to
value_from (callable) – function (or other callable) which extracts the value from the containing object
type_descr (subclass of xmlserdes.TypeDescriptor) – type-descriptor which can de/serialize the Python object from/to the contents of an XML element
A more convenient way of constructing a xmlserdes.ElementDescriptor is to use the xmlserdes.ElementDescriptor.new_from_tuple() method.
- classmethod new_from_tuple(tup)
Construct a new xmlserdes.ElementDescriptor from a two- or three-element tuple, covering the most common cases.
- Parameters:
tup (tuple) – two- or three-element tuple describing required instance
The tuple must be one of:
a pair (tag, type_descriptor), in which case the value_from field of the resulting ElementDescriptor is attrgetter(tag)
a triple (tag, field_name_or_callable, type_descriptor); if field_name_or_callable is a str, it is taken as an attribute name and the resulting value_from is attrgetter(field_name_or_callable); otherwise field_name_or_callable must be a callable.
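The pair and triple rules above can be sketched with operator.attrgetter from the standard library. SketchDescriptor and sketch_from_tuple are hypothetical stand-ins for illustration, and the type_descr slot just holds a plain type here rather than a real type-descriptor.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical stand-in for ElementDescriptor: just the three fields.
SketchDescriptor = namedtuple('SketchDescriptor', 'tag value_from type_descr')


def sketch_from_tuple(tup):
    """Expand a two- or three-element tuple into a full descriptor."""
    if len(tup) == 2:
        tag, type_descr = tup
        return SketchDescriptor(tag, attrgetter(tag), type_descr)
    if len(tup) == 3:
        tag, field, type_descr = tup
        # A string names an attribute; anything else is used as the callable.
        value_from = attrgetter(field) if isinstance(field, str) else field
        return SketchDescriptor(tag, value_from, type_descr)
    raise ValueError('expected a two- or three-element tuple')


Shape = namedtuple('Shape', 'width height')
shape = Shape(42, 10)

pair = sketch_from_tuple(('width', int))
triple = sketch_from_tuple(('shape-width', 'width', int))
computed = sketch_from_tuple(('area', lambda s: s.width * s.height, int))
```

The triple form is what allows the XML tag ('shape-width') to differ from the Python attribute name ('width'), or the value to be computed rather than stored.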
- xml_element(obj, _xpath=[])
Serialize, into an XML element, the relevant property from the given object.
>>> descr = ElementDescriptor.new_from_tuple(('width', xmlserdes.Atomic(int)))
>>> shape = collections.namedtuple('Shape', 'width')(42)
>>> print(xmlserdes.utils.str_from_xml_elt(descr.xml_element(shape)))
<width>42</width>
>>> descr_different_tag = ElementDescriptor.new_from_tuple(('shape-width',
...                                                         'width',
...                                                         xmlserdes.Atomic(int)))
>>> print(xmlserdes.utils.str_from_xml_elt(descr_different_tag.xml_element(shape)))
<shape-width>42</shape-width>
- extract_from(elt, _xpath=[])
Deserialize, from an XML element, a value of the relevant type.
>>> descr = ElementDescriptor.new_from_tuple(('width', xmlserdes.Atomic(int)))
>>> xml_elt = etree.fromstring('<width>99</width>')
>>> descr.extract_from(xml_elt)
99
- xmlserdes.SerDesDescriptor(children)
Convenience function for constructing a list of xmlserdes.ElementDescriptor instances from a list of abbreviated tuples.
- Parameters:
children (iterable of tuples) – descriptions of property/sub-element mappings; each should be a tuple suitable for passing to xmlserdes.ElementDescriptor.new_from_tuple().
- Returns:
New list of instances of xmlserdes.ElementDescriptor.
- class xmlserdes.TypeDescriptor
Instances of classes derived from xmlserdes.TypeDescriptor support two operations on objects of a particular type:
xml_element(): serialize a given object as an XML element.
extract_from(): extract an object of the correct type from a given XML element.
The following static method is also available:
tag_is_valid(): return True or False according to whether the given tag is valid for this type-descriptor. For example, lists cannot be stored in attributes, and so a tag beginning with '@' is not valid for a type-descriptor storing a list.
This base type is not useful by itself. Concrete derived types are:
xmlserdes.Atomic: fundamental type such as integer or string.
xmlserdes.List: homogeneous list of elements.
xmlserdes.Instance: instance of class, with fixed list of fields.
xmlserdes.NumpyAtomicVector: Numpy vector with atomic dtype.
xmlserdes.NumpyRecordVectorStructured: Numpy vector of record dtype.
See those classes’ individual docstrings for more details.
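As a rough illustration of the tag_is_valid() rule, the two functions below show how an atomic type-descriptor and a list type-descriptor might answer differently for the same tag. Both function names are hypothetical. That atomic values accept every tag is an assumption, based only on the fact that integer fields can be declared with '@' tags.

```python
def sketch_atomic_tag_is_valid(tag):
    """An atomic value fits in either a sub-element or an attribute (assumed)."""
    return True


def sketch_list_tag_is_valid(tag):
    """A list needs sub-elements, so an attribute tag ('@...') is rejected."""
    return not tag.startswith('@')


checks = {
    ('atomic', 'width'): sketch_atomic_tag_is_valid('width'),
    ('atomic', '@width'): sketch_atomic_tag_is_valid('@width'),
    ('list', 'widths'): sketch_list_tag_is_valid('widths'),
    ('list', '@widths'): sketch_list_tag_is_valid('@widths'),
}
```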
- extract_from(elt, expected_tag, _xpath=[])
Extract and return an object from the given XML element. The tag of elt should be the given expected tag, otherwise an XMLSerDesError is raised.
- Parameters:
elt (etree.Element) – XML element
expected_tag (str) – tag which elt must have
- Return type:
depends on concrete subclass of TypeDescriptor
- classmethod from_terse(descr)
Method to construct an instance of xmlserdes.TypeDescriptor from a terse expression. Many types for the descr argument are supported:
- atomic type object
A xmlserdes.Atomic instance is created for that type. The list of known ‘atomic’ types is stored in TypeDescriptor.atomic_types.
>>> td = TypeDescriptor.from_terse(int)
>>> print(xmlserdes.utils.str_from_xml_elt(td.xml_element(42, 'answer')))
<answer>42</answer>
- bool type object
An instance of xmlserdes.AtomicBool is created.
>>> td = TypeDescriptor.from_terse(bool)
>>> print(xmlserdes.utils.str_from_xml_elt(td.xml_element(False, 'is-blue')))
<is-blue>false</is-blue>
- Enum-derived class [Python 3.4 onwards]
An instance of xmlserdes.AtomicEnum is created.
>>> import sys
>>> if sys.version_info >= (3, 4):
...     from enum import Enum
...     Animal = Enum('Animal', 'Cat Dog Rabbit')
...     td = TypeDescriptor.from_terse(Animal)
...     pet = Animal.Cat
...     print(xmlserdes.utils.str_from_xml_elt(td.xml_element(pet, 'pet')))
... else:
...     print('<pet>Cat</pet>')
<pet>Cat</pet>
- string instance
A xmlserdes.Atomic instance is created, where the contained type is found by interpreting the given string as a Numpy dtype code.
>>> td = TypeDescriptor.from_terse('i2')
>>> print(xmlserdes.utils.str_from_xml_elt(td.xml_element(np.int16(42), 'answer')))
<answer>42</answer>
- non-atomic type object
A xmlserdes.Instance instance is created, where the contained type is the given type. The type must have an xml_descriptor attribute.
>>> class Blob(xmlserdes.XMLSerializableNamedTuple):
...     xml_descriptor = [('size', int)]
>>> td = TypeDescriptor.from_terse(Blob)
>>> print(xmlserdes.utils.str_from_xml_elt(td.xml_element(Blob(42), 'blob')))
<blob><size>42</size></blob>
(This example uses the terse type-descriptor format to specify the XML behaviour of the one field of Blob.)
- list instance
A xmlserdes.List instance is created.
The given list must have either one or two elements.
If two elements, they are taken as the contained type and contained tag:
>>> td = TypeDescriptor.from_terse([int, 'ans'])
>>> print(xmlserdes.utils.str_from_xml_elt(td.xml_element([42, 99], 'answers')))
<answers><ans>42</ans><ans>99</ans></answers>
If one element, it must be a type having an xml_default_tag attribute, which is used as the contained tag:
>>> class Blob(xmlserdes.XMLSerializableNamedTuple):
...     xml_default_tag = 'blob'
...     xml_descriptor = [('size', int)]
>>> td = TypeDescriptor.from_terse([Blob])
>>> blobs = [Blob(42), Blob(99)]
>>> print(xmlserdes.utils.str_from_xml_elt(td.xml_element(blobs, 'blobs')))
<blobs><blob><size>42</size></blob><blob><size>99</size></blob></blobs>
- tuple instance
Depending on the tuple length, either a xmlserdes.NumpyAtomicVector or a xmlserdes.NumpyRecordVectorStructured is created. In all cases, the first element of the tuple must be the numpy.ndarray type object.
- two-element tuple
The second tuple element must be an atomic Numpy dtype, and a xmlserdes.NumpyAtomicVector for that dtype is returned.
>>> td = xmlserdes.TypeDescriptor.from_terse((np.ndarray, np.int32))
>>> xs = np.array([1, 2, 3], dtype = np.int32)
>>> print(xmlserdes.utils.str_from_xml_elt(td.xml_element(xs, 'answers')))
<answers>1,2,3</answers>
- three-element tuple
The second element must be a record dtype, and the third element must be a string naming the contained elements. A xmlserdes.NumpyRecordVectorStructured is created.
This example uses very short tag names to keep the output of a reasonable length:
>>> Rect = np.dtype([('w', np.uint16), ('h', np.uint16)])
>>> td = xmlserdes.TypeDescriptor.from_terse((np.ndarray, Rect, 'r'))
>>> rects = np.array([(10, 20), (3, 4)], dtype = Rect)
>>> print(xmlserdes.utils.str_from_xml_elt(td.xml_element(rects, 'rs')))
<rs><r><w>10</w><h>20</h></r><r><w>3</w><h>4</h></r></rs>
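The cases accepted by from_terse() amount to a dispatch on the kind of expression given. The sketch below only names which type-descriptor each case would lead to. It is not the library's code: sketch_kind is a hypothetical function, the order of the checks is an assumption, ASSUMED_ATOMIC_TYPES stands in for the real TypeDescriptor.atomic_types, and the tuple branch does not check that the first element is the ndarray type.

```python
from enum import Enum

# Assumption: a small stand-in for TypeDescriptor.atomic_types.
ASSUMED_ATOMIC_TYPES = (int, float, str)


def sketch_kind(descr):
    """Name the type-descriptor class that a terse expression would select."""
    if descr is bool:
        return 'AtomicBool'
    if isinstance(descr, type) and issubclass(descr, Enum):
        return 'AtomicEnum'
    if isinstance(descr, str) or descr in ASSUMED_ATOMIC_TYPES:
        return 'Atomic'
    if isinstance(descr, list):
        if len(descr) not in (1, 2):
            raise ValueError('list must have one or two elements')
        return 'List'
    if isinstance(descr, tuple):
        if len(descr) == 2:
            return 'NumpyAtomicVector'
        if len(descr) == 3:
            return 'NumpyRecordVectorStructured'
        raise ValueError('tuple must have two or three elements')
    if isinstance(descr, type):
        return 'Instance'
    raise ValueError('unsupported terse expression')


class Blob:
    """Placeholder for a class that has an xml_descriptor."""


Animal = Enum('Animal', 'Cat Dog Rabbit')
kinds = [sketch_kind(d) for d in (int, bool, Animal, 'i2', Blob, [int, 'ans'])]
```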
- xml_element(obj, tag, _xpath=[])
Return an XML element, with the given tag, corresponding to the given object.
- Parameters:
obj – object to be serialized into an XML element
tag (str) – tag for the returned XML element
- Return type:
XML element (as etree.Element instance)
See examples under subclasses of xmlserdes.TypeDescriptor for details.
- abstract xml_node(obj, tag, _xpath=[])
Return either an XML element or an XML attribute.
- class xmlserdes.Atomic(inner_type)
A xmlserdes.TypeDescriptor for handling ‘atomic’ types. The concept of an ‘atomic’ type is not explicitly defined, but anything which can be faithfully represented as a string via str(), and can be parsed from a string using the type name, will work.
- Parameters:
inner_type – The native Python atomic type to be serialized and deserialized.
For example, an xmlserdes.Atomic type-descriptor to handle an integer:
>>> atomic_type_descriptor = xmlserdes.Atomic(int)
Serializing an integer into an XML element:
>>> print(xmlserdes.utils.str_from_xml_elt(atomic_type_descriptor.xml_element(42, 'answer')))
<answer>42</answer>
Deserializing an integer from an XML element:
>>> xml_elt = etree.fromstring('<weight>99</weight>')
>>> atomic_type_descriptor.extract_from(xml_elt, 'weight')
99
Unexpected tag:
>>> atomic_type_descriptor.extract_from(xml_elt, 'length')
Traceback (most recent call last):
    ...
xmlserdes.errors.XMLSerDesError: expected tag "length" but got "weight" at /
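The working definition of ‘atomic’ (str() to write, the type itself to parse) is small enough to sketch with the standard library. SketchAtomic is a hypothetical class written for illustration; it raises ValueError where the library raises XMLSerDesError, and it ignores attributes and the _xpath bookkeeping.

```python
import xml.etree.ElementTree as ET


class SketchAtomic:
    """Hypothetical minimal analogue of an atomic type-descriptor."""

    def __init__(self, inner_type):
        self.inner_type = inner_type

    def xml_element(self, obj, tag):
        elt = ET.Element(tag)
        elt.text = str(obj)                # faithful string form via str()
        return elt

    def extract_from(self, elt, expected_tag):
        if elt.tag != expected_tag:
            raise ValueError('expected tag "%s" but got "%s"'
                             % (expected_tag, elt.tag))
        return self.inner_type(elt.text)   # parse using the type itself


int_td = SketchAtomic(int)
answer_text = ET.tostring(int_td.xml_element(42, 'answer'), encoding='unicode')
weight = int_td.extract_from(ET.fromstring('<weight>99</weight>'), 'weight')
ratio = SketchAtomic(float).extract_from(ET.fromstring('<r>1.5</r>'), 'r')
```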
- class xmlserdes.List(contained_descriptor, contained_tag)
A xmlserdes.TypeDescriptor for handling homogeneous lists of elements.
- Parameters:
contained_descriptor (xmlserdes.TypeDescriptor) – specification of the type of each element in the list
contained_tag (str) – tag for each sub-element of the sequence element
For example, a xmlserdes.List type-descriptor to handle lists of integers, where each integer in the list will be represented by an XML element with tag answer.
>>> list_of_ints_td = List(xmlserdes.Atomic(int), 'answer')
Serializing a list of integers into an XML element:
>>> print(xmlserdes.utils.str_from_xml_elt(
...     list_of_ints_td.xml_element([42, 123, 99], 'list-of-answers')))
<list-of-answers><answer>42</answer><answer>123</answer><answer>99</answer></list-of-answers>
Deserializing a list of integers from an XML element:
>>> xml_elt = etree.fromstring('''<list-of-answers>
...     <answer>1</answer>
...     <answer>10</answer>
...     <answer>100</answer>
... </list-of-answers>''')
>>> list_of_ints_td.extract_from(xml_elt, 'list-of-answers')
[1, 10, 100]
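A homogeneous list maps to one parent element with one child per item, all children sharing the contained tag. The following standard-library sketch shows that layout in both directions; the two function names are hypothetical, and the element type is passed as a plain Python type rather than a type-descriptor.

```python
import xml.etree.ElementTree as ET


def sketch_list_element(values, tag, contained_tag):
    """One parent element; one child, tagged contained_tag, per list item."""
    elt = ET.Element(tag)
    for value in values:
        ET.SubElement(elt, contained_tag).text = str(value)
    return elt


def sketch_list_extract(elt, expected_tag, contained_tag, inner_type):
    """Check the parent and child tags, then parse each child's text."""
    if elt.tag != expected_tag:
        raise ValueError('expected tag "%s" but got "%s"' % (expected_tag, elt.tag))
    values = []
    for child in elt:          # iterates over child elements only
        if child.tag != contained_tag:
            raise ValueError('unexpected child tag "%s"' % child.tag)
        values.append(inner_type(child.text))
    return values


answers_xml = sketch_list_element([42, 123, 99], 'list-of-answers', 'answer')
answers_text = ET.tostring(answers_xml, encoding='unicode')

parsed = sketch_list_extract(
    ET.fromstring('<list-of-answers>\n  <answer>1</answer>\n'
                  '  <answer>10</answer>\n</list-of-answers>'),
    'list-of-answers', 'answer', int)
```

Whitespace between the children does not matter on input, because it is stored on the elements' text and tail rather than as child nodes.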
- class xmlserdes.Instance(cls)
A xmlserdes.TypeDescriptor for handling homogeneous instances of a ‘complex’ class having an ‘XML descriptor’.
- Parameters:
cls – class whose instances are to be de/serialized; must have attribute named xml_descriptor which is a list of instances of xmlserdes.ElementDescriptor
Note
A possible to-do is to allow the descriptor to be passed in separately, rather than requiring it to be an attribute of the to-be-serialized class.
Define class and augment it with xml_descriptor attribute:
>>> Rectangle = collections.namedtuple('Rectangle', 'wd ht')
>>> Rectangle.xml_descriptor = xmlserdes.SerDesDescriptor([('wd', xmlserdes.Atomic(int)),
...                                                        ('ht', xmlserdes.Atomic(int))])
Define type-descriptor to handle de/serialization:
>>> rectangle_td = Instance(Rectangle)
Serialize instance of the Rectangle class:
>>> r = Rectangle(210, 297)
>>> print(xmlserdes.utils.str_from_xml_elt(rectangle_td.xml_element(r, 'rect')))
<rect><wd>210</wd><ht>297</ht></rect>
Deserialize instance:
>>> xml_elt = etree.fromstring('<rect><wd>4</wd><ht>3</ht></rect>')
>>> rectangle_td.extract_from(xml_elt, 'rect')
Rectangle(wd=4, ht=3)
- class xmlserdes.DTypeScalar(dtype)
A xmlserdes.TypeDescriptor for handling Numpy scalars of custom dtype.
The XML representation has one sub-element per field of the dtype. Atomic-type fields are represented as their repr; structured-dtype fields are represented with children corresponding to their fields, and so on.
Note
Currently the XML tags for the fields of each record are the same as the field names of the dtype. Possible to-do is to allow a mapping between these two sets of names.
- Parameters:
dtype – Numpy record dtype of the vector
Define record dtype whose fields are all atomic types:
>>> import numpy as np
>>> ColourDType = np.dtype([('red', np.uint8),
...                         ('green', np.uint8),
...                         ('blue', np.uint8)])
Define type-descriptor for a scalar instance of it:
>>> colour_scalar_td = xmlserdes.DTypeScalar(ColourDType)
Serialize a scalar (the [()] construct extracts a scalar element from the 0-dimensional array):
>>> colour = np.array((20, 40, 50), dtype = ColourDType)[()]
>>> print(xmlserdes.utils.str_from_xml_elt(
...     colour_scalar_td.xml_element(colour, 'colour'),
...     pretty_print = True).rstrip())
<colour>
  <red>20</red>
  <green>40</green>
  <blue>50</blue>
</colour>
>>> xml_elt = etree.fromstring(
...     '<green><red>0</red><green>64</green><blue>0</blue></green>')
>>> extracted_colour = colour_scalar_td.extract_from(xml_elt, 'green')
>>> print(extracted_colour)
(0, 64, 0)
>>> print(extracted_colour.dtype)
[('red', 'u1'), ('green', 'u1'), ('blue', 'u1')]
Define a record dtype with nested custom field:
>>> PatternDType = np.dtype([('background', ColourDType),
...                          ('foreground', ColourDType)])
Define type-descriptor for a scalar instance of it:
>>> pattern_scalar_td = xmlserdes.DTypeScalar(PatternDType)
Serialize a scalar (the [()] construct extracts a scalar element from the 0-dimensional array):
>>> pattern = np.array(((120, 140, 150), (20, 40, 50)), dtype = PatternDType)[()]
>>> print(xmlserdes.utils.str_from_xml_elt(
...     pattern_scalar_td.xml_element(pattern, 'pattern'),
...     pretty_print = True).rstrip())
<pattern>
  <background>
    <red>120</red>
    <green>140</green>
    <blue>150</blue>
  </background>
  <foreground>
    <red>20</red>
    <green>40</green>
    <blue>50</blue>
  </foreground>
</pattern>
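The recursive layout described for nested record types (one sub-element per field, descending into fields that are themselves records) can be shown without Numpy. In the sketch below, named tuples stand in for structured dtypes; that substitution and the name sketch_record_element are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

# Named tuples stand in for Numpy record dtypes in this sketch.
Colour = namedtuple('Colour', 'red green blue')
Pattern = namedtuple('Pattern', 'background foreground')


def sketch_record_element(record, tag):
    """One sub-element per field; nested records recurse, atoms use repr()."""
    elt = ET.Element(tag)
    for name, value in zip(record._fields, record):
        if hasattr(value, '_fields'):
            # Nested record: its own fields become grandchildren.
            elt.append(sketch_record_element(value, name))
        else:
            ET.SubElement(elt, name).text = repr(value)
    return elt


pattern = Pattern(Colour(120, 140, 150), Colour(20, 40, 50))
pattern_text = ET.tostring(sketch_record_element(pattern, 'pattern'),
                           encoding='unicode')
```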
- class xmlserdes.NumpyAtomicVector(dtype)
A xmlserdes.TypeDescriptor for handling Numpy vectors (i.e., one-dimensional ndarray instances) where the dtype is an ‘atomic’ type. Serialization is done as a CSV string. Complex types are not supported.
- Parameters:
dtype – Numpy dtype of the vector
Define type-descriptor to handle de/serialization of a Numpy vector of uint16 elements:
>>> import numpy as np
>>> vector_td = NumpyAtomicVector(np.uint16)
Serialize a vector:
>>> v = np.arange(4, dtype = np.uint16)
>>> print(xmlserdes.utils.str_from_xml_elt(vector_td.xml_element(v, 'values')))
<values>0,1,2,3</values>
Deserialize a vector:
>>> xml_elt = etree.fromstring('<values>10,20,30</values>')
>>> vector_td.extract_from(xml_elt, 'values')
array([10, 20, 30], dtype=uint16)
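The CSV form is simple enough to sketch without Numpy. Here the standard-library array module plays the role of a typed vector (typecode 'H' is an unsigned 16-bit integer); the function names, and the choice to treat empty text as an empty vector, are assumptions of this sketch rather than documented library behaviour.

```python
import xml.etree.ElementTree as ET
from array import array


def sketch_vector_element(vec, tag):
    """Write all items of the vector as one comma-separated text node."""
    elt = ET.Element(tag)
    elt.text = ','.join(str(x) for x in vec)
    return elt


def sketch_vector_extract(elt, expected_tag, typecode):
    """Split the text on commas and rebuild a typed array."""
    if elt.tag != expected_tag:
        raise ValueError('expected tag "%s" but got "%s"' % (expected_tag, elt.tag))
    text = elt.text or ''
    items = text.split(',') if text else []   # assumed handling of empty text
    return array(typecode, (int(item) for item in items))


values_xml = sketch_vector_element(array('H', range(4)), 'values')
values_text = ET.tostring(values_xml, encoding='unicode')
values_back = sketch_vector_extract(
    ET.fromstring('<values>10,20,30</values>'), 'values', 'H')
```

Compared with one sub-element per item, the CSV form keeps long numeric vectors compact, at the cost of only working for values whose text form contains no comma.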
- class xmlserdes.NumpyRecordVectorStructured(dtype, contained_tag)
A xmlserdes.TypeDescriptor for handling Numpy vectors (i.e., one-dimensional ndarray instances) where the dtype is a Numpy record type. The record-type’s fields can be of scalar atomic type or custom dtype in turn.
The XML representation has one sub-element per element of the vector. Each of those sub-elements has sub-sub-elements corresponding to the fields of the record type.
Note
Currently the XML tags for the fields of each record are the same as the field names of the dtype. Possible to-do is to allow a mapping between these two sets of names.
- Parameters:
dtype – Numpy record dtype of the vector
contained_tag (str) – tag to use for the element representing each element of the vector
Define record dtype:
>>> import numpy as np
>>> ColourDType = np.dtype([('red', np.uint8),
...                         ('green', np.uint8),
...                         ('blue', np.uint8)])
Define type-descriptor for it:
>>> colour_vector_td = xmlserdes.NumpyRecordVectorStructured(ColourDType, 'colour')
Serialize a vector:
>>> colours = np.array([(20, 40, 50),
...                     (128, 128, 128),
...                     (255, 0, 255)],
...                    dtype = ColourDType)
>>> print(xmlserdes.utils.str_from_xml_elt(
...     colour_vector_td.xml_element(colours, 'colours'),
...     pretty_print = True).rstrip())
<colours>
  <colour>
    <red>20</red>
    <green>40</green>
    <blue>50</blue>
  </colour>
  <colour>
    <red>128</red>
    <green>128</green>
    <blue>128</blue>
  </colour>
  <colour>
    <red>255</red>
    <green>0</green>
    <blue>255</blue>
  </colour>
</colours>
>>> xml_elt = etree.fromstring(
...     '''<greens>
...          <colour><red>0</red><green>64</green><blue>0</blue></colour>
...          <colour><red>0</red><green>192</green><blue>0</blue></colour>
...        </greens>''')
>>> extracted_colours = colour_vector_td.extract_from(xml_elt, 'greens')
>>> print(extracted_colours)
[(0, 64, 0) (0, 192, 0)]
>>> print(extracted_colours.dtype)
[('red', 'u1'), ('green', 'u1'), ('blue', 'u1')]
Custom dtype one of whose fields is non-atomic:
>>> StripeDType = np.dtype([('colour', ColourDType), ('width', np.uint16)])
Type-descriptor for vector of such elements:
>>> stripe_vector_td = xmlserdes.NumpyRecordVectorStructured(StripeDType, 'stripe')
Serialize a vector:
>>> stripes = np.array([((20, 30, 40), 100), ((120, 130, 140), 200)],
...                    dtype = StripeDType)
>>> print(xmlserdes.utils.str_from_xml_elt(
...     stripe_vector_td.xml_element(stripes, 'stripes'),
...     pretty_print = True).rstrip())
<stripes>
  <stripe>
    <colour>
      <red>20</red>
      <green>30</green>
      <blue>40</blue>
    </colour>
    <width>100</width>
  </stripe>
  <stripe>
    <colour>
      <red>120</red>
      <green>130</green>
      <blue>140</blue>
    </colour>
    <width>200</width>
  </stripe>
</stripes>
- xmlserdes.NumpyVector(dtype, contained_tag=None)
Convenience function to instantiate an instance of the appropriate xmlserdes.TypeDescriptor subclass.
If a contained_tag is given, a xmlserdes.NumpyRecordVectorStructured is created. If not, a xmlserdes.NumpyAtomicVector is created.
>>> import numpy as np
>>> int_vector_td = xmlserdes.NumpyVector(np.int32)
>>> ColourDType = np.dtype([('red', np.uint8),
...                         ('green', np.uint8),
...                         ('blue', np.uint8)])
>>> colour_vector_td = xmlserdes.NumpyVector(ColourDType, 'colour')