Simple XML serialization and deserialization in C#

19 Aug 2020 - tsp
Last update 19 Aug 2020

TL;DR Use public classes wrapping public properties, instantiate XmlSerializer and call its Serialize and Deserialize methods as shown in the Serializing and Deserializing sections of this article.

Serializing and deserializing data is one of the most basic operations in application development. As anyone who learned programming back in the 80s and 90s knows, there are countless hacked-together ways to store data. Especially in science there was a time when huge amounts of data were stored in comma separated value (CSV) style file formats that changed frequently or were extended in different ways across different branches of program versions, or even in ASCII files with some assumed structure. As time passed, more structured and standardized formats were developed, such as SGML (one of its most famous applications is HTML) and, for more sophisticated data, databases like SQLite, HDF files or XML.

Databases such as SQLite or even more advanced databases, be they relational or NoSQL, and HDF files might be considered overkill for many applications, and they are only partially portable and accessible to applications outside of one’s own ecosystem. This also applies to file formats such as CERN’s ROOT file format commonly used in high energy physics. All of these methods do have major advantages over any text based file storage, such as providing indices, versioning, carrying metadata for all objects, allowing fast access and handling the split onto multiple storage backends, and they should definitely be used when applicable. On the other hand they might be considered too heavyweight and too dependency-laden for a given application, and they might not be well suited for long term data storage.

The XML markup language on the other hand allows easy and human readable structured storage, either according to a specified schema or without any schema. It looks easy at first, but parsing it without a library is really complicated: there are more than 80 EBNF rules specified for the basic file format, not to mention namespaces, schema definitions, etc. Nevertheless it’s a popular file format and fortunately a large number of libraries are available to process XML files. If one wants to do some processing one should rely on such libraries instead of writing the routines oneself, even though that’s a nice exercise (I did, and even wrote my own EBNF compiler to be able to parse XML using a recursive descent parser, which took quite some time). Even less should one try to process XML files using regular expressions or similar methods. That approach is doomed to fail right from the start, even if it looks like it works in the beginning.
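To illustrate the point about relying on a library, here is a minimal sketch that reads a small XML string with XDocument from System.Xml.Linq. The document content, the element names and the ReadTitles helper are made up for this illustration. Note how the library resolves the &amp; entity, something a regular expression would silently get wrong.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class XmlParsingSketch {
    // Hypothetical helper: collects "year: title" strings from a <films> document
    public static List<string> ReadTitles(string xml) {
        // XDocument.Parse takes care of entities, quoting and nesting
        XDocument doc = XDocument.Parse(xml);
        List<string> result = new List<string>();

        foreach (XElement film in doc.Root.Elements("film")) {
            // The explicit conversions read the text content and the attribute value
            string title = (string)film.Element("title");
            int year = (int)film.Attribute("year");
            result.Add(year + ": " + title);
        }
        return result;
    }

    public static void Main() {
        // Made-up sample data, not taken from any real file
        string xml =
            "<films>" +
            "<film year=\"2000\"><title>Example Film A &amp; B</title></film>" +
            "<film year=\"2001\"><title>Example Film C</title></film>" +
            "</films>";

        foreach (string line in ReadTitles(xml)) {
            Console.WriteLine(line);
        }
    }
}
```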

A student of mine wanted to store some data in a structured way for a really simple experimental C# application in which he was trying out LINQ on in-memory lists. The idea of mapping his data hierarchically into XML files emerged, but even after a few hours of web search he had not found a simple example of how to do this. Don’t get me wrong, all of this information exists on the Internet and is easy to locate if you know what you’re searching for, but he was missing a single working base from which he could start working and reading the documentation. So I decided to write this short blog post containing some really basic samples that I supplied to him during some lectures.

Note that these samples:

Are not complete.
Do not provide error handling. Error handling is required when using any of this anywhere near production.
Only show a single method of using the XmlSerializer.

The data objects

First one needs some objects that store the data. This is familiar to anyone doing object oriented programming, for example from bean style objects in Java: classes that resemble plain data structures but allow finer grained control over property access through getter and setter methods.

One should define a class for each and every type of data. These wrapper classes should be public, and all properties that are to be stored inside the XML file also have to be public. The serializer also needs a parameterless constructor on each class, which all classes below have. One can still control access using getters and setters. In the following example all setters and getters will be publicly accessible.

The following example models a simple collection of movies (Film) that carry some (incomplete) metadata as well as a reference to a director and the roles played by actors. As one can see it’s rather simple and incomplete, but it is only meant to serve as a starting point:

namespace XmlSample {
    public class Person {
        public string name { get; set; }
        public int yearOfBirth { get; set; }
    }
}

namespace XmlSample {
    public class Actor : Person {
    }
}

namespace XmlSample {
    public class Director : Person {
    }
}

namespace XmlSample {
    public class ActingRole : Actor {
        public string roleName { get; set; }
    }
}

using System.Collections.Generic;

namespace XmlSample {
    public class Film {
        public string title { get; set; }
        public int releaseYear { get; set; }

        public Director director { get; set; }
        public List<ActingRole> actors { get; set; }

        // Start with an empty list so callers can add roles right away
        public Film() {
            actors = new List<ActingRole>();
        }
    }
}

using System.Collections.Generic;

namespace XmlSample {
    // Root object that is passed to the serializer
    public class FilmCollection {
        public List<Film> filmCollection { get; set; }

        public FilmCollection() {
            filmCollection = new List<Film>();
        }
    }
}
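As a variation on the classes above, a setter can carry a check instead of being auto-implemented. The following is only a sketch: the class name CheckedPerson and the rule that a year of birth must not be negative are assumptions made up for this illustration. Since XmlSerializer assigns values through the public setter, such a check also runs when a file is loaded.

```csharp
using System;

namespace XmlSample {
    // Hypothetical variant of Person with a validating setter
    public class CheckedPerson {
        private int _yearOfBirth;

        public string name { get; set; }

        public int yearOfBirth {
            get { return _yearOfBirth; }
            set {
                // Illustrative rule only: reject negative years
                if (value < 0) {
                    throw new ArgumentOutOfRangeException(
                        nameof(value), "yearOfBirth must not be negative");
                }
                _yearOfBirth = value;
            }
        }
    }
}
```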

Serializing

If one now wants to serialize a FilmCollection object into an XML file one can simply use the XmlSerializer class from the System.Xml.Serialization namespace of .NET. One instantiates the serializer and provides the datatype that should be serialized. The serializer uses this information to determine the schema automatically. Then one calls Serialize, passing a StreamWriter as well as the object that should be serialized.

    // Requires: using System.IO; using System.Xml.Serialization;
    public static void serializeFile(string filename, FilmCollection col) {
        XmlSerializer xSerializer = new XmlSerializer(typeof(FilmCollection));
        // The using block closes the writer even if Serialize throws
        using (StreamWriter fileWriter = new StreamWriter(filename)) {
            xSerializer.Serialize(fileWriter, col);
        }
    }

That’s it (without error handling of course, i.e. it should only serve as a starting point).
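To look at the generated XML without touching the file system, one can serialize into a StringWriter instead of a StreamWriter. The sketch below uses trimmed copies of the data classes so that it compiles on its own. The ToXml helper and the sample values are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

namespace XmlSample {
    // Trimmed copies of the data classes so the sketch is self-contained
    public class Person {
        public string name { get; set; }
        public int yearOfBirth { get; set; }
    }
    public class Director : Person { }
    public class Film {
        public string title { get; set; }
        public int releaseYear { get; set; }
        public Director director { get; set; }
    }
    public class FilmCollection {
        public List<Film> filmCollection { get; set; }
        public FilmCollection() { filmCollection = new List<Film>(); }
    }

    public static class SerializeToStringSketch {
        // Hypothetical helper: returns the XML text instead of writing a file
        public static string ToXml(FilmCollection col) {
            XmlSerializer xSerializer = new XmlSerializer(typeof(FilmCollection));
            using (StringWriter writer = new StringWriter()) {
                xSerializer.Serialize(writer, col);
                return writer.ToString();
            }
        }

        public static void Main() {
            FilmCollection col = new FilmCollection();
            col.filmCollection.Add(new Film {
                title = "Example Film",
                releaseYear = 2000,
                director = new Director { name = "Jane Doe", yearOfBirth = 1960 }
            });

            // Prints the XML document; element names follow the class and property names
            Console.WriteLine(ToXml(col));
        }
    }
}
```

Printing the result is a quick way to check which element names the serializer derived from the classes before committing to a file format.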

Deserializing

The deserialization process is equally simple. Create an XmlSerializer, specify the datatype, provide a stream and call Deserialize. Since Deserialize returns an object, one has to cast the result to the expected type:

    // Requires: using System.IO; using System.Xml.Serialization;
    public static FilmCollection deserializeFile(string filename) {
        XmlSerializer xSerializer = new XmlSerializer(typeof(FilmCollection));

        // The using block closes the stream even if Deserialize throws
        using (FileStream fs = new FileStream(filename, FileMode.Open)) {
            // Deserialize returns object, so the result has to be cast
            return (FilmCollection)xSerializer.Deserialize(fs);
        }
    }

Just as easy as serialization.
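Since the samples leave out error handling, here is a hedged sketch of what a first step could look like. It writes a collection to a temporary file, reads it back, and then shows that a truncated file is rejected. The helper name tryDeserializeFile and the choice to return null on failure are assumptions for this illustration, and a real application would probably want to report errors differently. The data classes are trimmed copies so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

namespace XmlSample {
    // Trimmed copies of the data classes so the sketch is self-contained
    public class Film {
        public string title { get; set; }
        public int releaseYear { get; set; }
    }
    public class FilmCollection {
        public List<Film> filmCollection { get; set; }
        public FilmCollection() { filmCollection = new List<Film>(); }
    }

    public static class SafeLoadSketch {
        // Hypothetical helper: returns null instead of throwing when loading fails
        public static FilmCollection tryDeserializeFile(string filename) {
            XmlSerializer xSerializer = new XmlSerializer(typeof(FilmCollection));
            try {
                using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read)) {
                    return (FilmCollection)xSerializer.Deserialize(fs);
                }
            } catch (IOException e) {
                // Covers missing files and other I/O problems
                Console.Error.WriteLine("Cannot read " + filename + ": " + e.Message);
            } catch (InvalidOperationException e) {
                // Deserialize reports malformed or unexpected XML this way
                Exception cause = e.InnerException ?? e;
                Console.Error.WriteLine("Invalid XML in " + filename + ": " + cause.Message);
            }
            return null;
        }

        public static void Main() {
            string path = Path.GetTempFileName();
            try {
                FilmCollection col = new FilmCollection();
                col.filmCollection.Add(new Film { title = "Example Film", releaseYear = 2000 });

                XmlSerializer xSerializer = new XmlSerializer(typeof(FilmCollection));
                using (StreamWriter writer = new StreamWriter(path)) {
                    xSerializer.Serialize(writer, col);
                }

                FilmCollection loaded = tryDeserializeFile(path);
                Console.WriteLine("Loaded films: " + loaded.filmCollection.Count);

                // Overwrite the file with truncated XML to exercise the error path
                File.WriteAllText(path, "<FilmCollection><filmCollection><Film>");
                Console.WriteLine(tryDeserializeFile(path) == null
                    ? "Broken file rejected" : "Unexpected success");
            } finally {
                File.Delete(path);
            }
        }
    }
}
```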

Modifying the serialization process

Of course that’s not the end of the story. One can control the serialization and deserialization process using a number of attributes. For example XmlRootAttribute allows one to specify the namespace used, an alternative name for the root element instead of the one derived automatically from the class name, whether the element is nullable, etc. XmlArrayAttribute can be used to change the name of the element that wraps a collection such as a list, and XmlArrayItemAttribute changes the name of the individual items. XmlAttributeAttribute (written [XmlAttribute] in code) serializes a property into an XML attribute instead of an element.
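A short sketch of how these attributes can be applied to trimmed versions of the Film and FilmCollection classes follows. The XML names chosen here (films, items, film, year) and the ToXml helper are arbitrary picks for this illustration, not something the classes above prescribe.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

namespace XmlSample {
    public class Film {
        // Written as an attribute (year="...") on the film element
        [XmlAttribute("year")]
        public int releaseYear { get; set; }

        // Still written as a child element
        public string title { get; set; }
    }

    // Root element is named films instead of FilmCollection
    [XmlRoot("films")]
    public class FilmCollection {
        // The list is wrapped in <items>, each entry becomes a <film> element
        [XmlArray("items")]
        [XmlArrayItem("film")]
        public List<Film> filmCollection { get; set; }

        public FilmCollection() { filmCollection = new List<Film>(); }
    }

    public static class AttributeSketch {
        // Hypothetical helper: serializes into a string for inspection
        public static string ToXml(FilmCollection col) {
            XmlSerializer xSerializer = new XmlSerializer(typeof(FilmCollection));
            using (StringWriter writer = new StringWriter()) {
                xSerializer.Serialize(writer, col);
                return writer.ToString();
            }
        }

        public static void Main() {
            FilmCollection col = new FilmCollection();
            col.filmCollection.Add(new Film { title = "Example Film", releaseYear = 2000 });
            Console.WriteLine(ToXml(col));
        }
    }
}
```

The same attributed classes are used for deserialization, so files written this way are read back with the renamed elements.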

Since there are a great many attributes, one should really consider reading the excellent documentation for XmlSerializer.