How XML Works

So, this will obviously not be an explanation of how XML works. Instead, this will be a short explanation of “what I think about when I think about XML.” And, I want to begin by saying that I am happy that this page here exists, as that’s where I went to double-check my vocabulary and, in the process, learned quite a bit.

XML basically is a standard by which your computer (or a Python library, to be specific) knows how to put information into text format to save to a file. If you’ve ever done much by way of HTML encoding, the basic formats will look familiar.

How to write XML.

Remember in the last bit where I said that I think of XML data as a pile of folders, some containing other folders? Well, basically, each folder is an Element. It has one of two formats:

<ELEMENT />

<ELEMENT></ELEMENT>

They are both elements named ‘ELEMENT’. (I don’t know why I do so much in all caps in my own XML; I guess it’s so that I know I’m writing for a computer and not a person.)

Of the two, the first is a self-closing tag. (That’s just a way of saying that no second tag is required to ‘close’ it, as there is in the second example.)

In the second example, the first tag ‘opens’ the element, the second ‘closes’ it.
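If you’d like to check that the two forms really mean the same thing, here is a minimal sketch. It assumes Python’s built-in xml.etree.ElementTree module, and the variable names are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# The same empty element, written both ways.
self_closing = ET.fromstring("<ELEMENT />")
open_and_close = ET.fromstring("<ELEMENT></ELEMENT>")

# Both have the same name and contain no other elements.
print(self_closing.tag, len(self_closing))
print(open_and_close.tag, len(open_and_close))

# Written back out as text, the two come out identical.
same = ET.tostring(self_closing) == ET.tostring(open_and_close)
print(same)
```

The open-and-close form only starts to matter once there is something to put between the two tags.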

The reason for opening and closing the element is pretty simple. Remember my example of a bundle of nested folders? Well, if we have an element meant to represent a user, it might also contain things like grammar this person has worked on, or a list of vocabulary that they’re using.

To include another element in an element, you open the element, create the other elements, and then close the element again. Like this:

<USER NAME="Toby the Amazing">
<VOCAB_LIST WORDS="XML, pizza, whining" />
<GRAMMAR NAME="Simple Past" />
<GRAMMAR NAME="Simple Present" />
</USER>

(I’m sorry for the formatting, but I’m not willing to do all the weird stuff required to make it look nice with WordPress.)

So, if you look at what I have above, there is a user, and inside the user element (the first and last lines open and close the element) there is an element containing a vocabulary list, as well as another element for practiced grammar which contains more elements, each for a specific grammar that has been practiced.
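For the curious, this is roughly how the same nesting could be built in code rather than typed by hand. It is only a sketch using Python’s built-in xml.etree.ElementTree module. The element that holds the practiced grammar needs a name of its own, and GRAMMAR_LIST below is a made-up name for this sketch. ET.indent needs Python 3.9 or newer.

```python
import xml.etree.ElementTree as ET

# The outer folder: one user.
user = ET.Element("USER", {"NAME": "Toby the Amazing"})

# A single sheet of paper inside the user folder.
ET.SubElement(user, "VOCAB_LIST", {"WORDS": "XML, pizza, whining"})

# A folder inside the folder. GRAMMAR_LIST is a hypothetical name.
grammar_list = ET.SubElement(user, "GRAMMAR_LIST")
ET.SubElement(grammar_list, "GRAMMAR", {"NAME": "Simple Past"})
ET.SubElement(grammar_list, "GRAMMAR", {"NAME": "Simple Present"})

# Indent the tree (Python 3.9+) so the nesting is visible when printed.
ET.indent(user)
xml_text = ET.tostring(user, encoding="unicode")
print(xml_text)
```

Printing xml_text shows the opening USER tag first and the closing USER tag last, with everything else tucked in between.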

If you look at the different elements in my example, inside the opening (or self-closing) tags you’ll see where I wrote NAME="Toby" or WORDS="XML…". These are attributes of the element.

Going back to my paper metaphor, I think of opening and closing tags as the folder that contains something, and self-closing tags as individual pieces of paper in that folder. So far, so good. The thing is, the attributes are what you’ve written on each folder/paper. So, in the folder you might have a hundred copies of a form that gives information on a grammar practiced (I could not name a hundred forms of grammar, that’s a bad example.)

But, just as it makes no sense to have a folder on which you have “Name: Toby” and “Name: Liam” written (which name is it?!), you can’t have two attributes with the same name on one element in XML. If you write XML that way by hand, a parser will refuse to read the file, and in the Python library I’m going to write about next, setting an attribute that already exists just replaces the old value.
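Both cases are easy to try out. The sketch below assumes Python’s built-in xml.etree.ElementTree module, and the names parsed_ok and user are just for the illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written XML with the same attribute twice is rejected by the parser.
try:
    ET.fromstring('<USER NAME="Toby" NAME="Liam" />')
    parsed_ok = True
except ET.ParseError as error:
    parsed_ok = False
    print("Could not read it:", error)

# Setting an attribute that already exists replaces the old value.
user = ET.Element("USER")
user.set("NAME", "Toby")
user.set("NAME", "Liam")
print(user.attrib)
```

After the second call to set, the element has a single NAME attribute, and its value is Liam.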

Remember this: If you want many of something — grammar forms, words, users — what you want are elements. If you want to define your elements with unique properties, name, height, level of charm, then you want attributes.
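In code, that difference shows up in how you ask for things. This is a small sketch with Python’s built-in xml.etree.ElementTree module: you ask for all the elements with a given name and get a list back, but you ask for an attribute and get exactly one value.

```python
import xml.etree.ElementTree as ET

user = ET.fromstring(
    '<USER NAME="Toby the Amazing">'
    '<GRAMMAR NAME="Simple Past" />'
    '<GRAMMAR NAME="Simple Present" />'
    '</USER>'
)

# Many of something: ask for all the elements with that name.
grammar_names = [grammar.get("NAME") for grammar in user.findall("GRAMMAR")]
print(grammar_names)

# One property of one thing: ask for the attribute.
user_name = user.get("NAME")
print(user_name)
```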

Don’t worry, you won’t have to write XML

I went through all of that, and you’ll probably never write XML in your life (though I do write the initial tags in the XML files that I use, mostly because I’m too lazy to do it with code.)

In the next little bit that I write, I’m going to explain how I do XML in Python, and you’ll need to know the words element and attribute, but you’ll never have to worry about how to write it. After all, the goal is to have your computer do the work, not you.

Still, one of the great things about XML is that you can manually edit it, and I think it’s good to know how it’s put together.

This is something I’m writing mostly for my nephew in support of the Typing Tutor challenge that I’ve issued to him. I imagine all of this information has been explained better elsewhere on the Internet.

So why are you reading this?

Why I like XML

So, I haven’t been programming long enough — and certainly not well enough — to be allowed an opinion on what other people should do. But, I do think I can share why I recommend using XML to other new programmers.

It’s fair to point out that I’m talking about programming in Python with the Element Tree library.

It behaves like you’d think.

I didn’t fall in love with programming until I discovered object-oriented programming. It’s just easier for me to abstract out the idea of the different parts of a program as ‘objects’ that interact. (I can write a bit on object-oriented programming, if you’d like, too!)

A lot of my programs are just collections of objects. I have a dictionary object that manages so many word objects, for example, in my ESL worksheet application. And, when I need to save all this as a file, it’s nice to have a format that makes sense.

The way that it makes sense to organize all this data, to me, in the physical world, would be a great big folder or binder with lots of other folders and binders in it. But also, with thousands of loose pages.

There would be a binder that is labeled “Dictionary” and that binder would probably have a name and, because I record dates unnecessarily, the date when it was created. Inside, each word would be a folder that contained information on the word, its German translation, space, maybe, for other languages, as well as things like what part of speech it is and if there are other words that have the same text. (A real example from my teaching: I teach in one company where ‘forks’ normally means the parts of your bicycle that hold the front wheel, but I also teach a lot of people who think that a fork is something to eat with. Obviously, it’s going to make a difference in the worksheets.)

And these word entries should have a unique label of some kind (so the different kinds of ‘forks’ can be referenced later) and, if the other binder full of sentences contains a good example sentence for this word, I might want to store the unique label of that sentence.

That’s all the data I have in this binder, right? (In your typing trainer, the binder might be for a single user, with one record on the letters they’re learning, another with a list of keys they often miss…) And what’s great about XML is that it gives me a way to save data in this format, if I only think about it the right way.

Even more than just saving it in this way, the Element Tree library gives me objects that behave this way. (They’re called Elements and SubElements; I’ll write more about them when I write about using Element Tree.) That means I can assign an entire Dictionary to a single variable and pass it to a function or a method.

In fact, a lot of my programming is basically writing objects that take a single XML object and ‘bring it to life’ by adding methods to work with the data in it.
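To give a feel for what that can look like, here is a minimal sketch using Python’s built-in xml.etree.ElementTree module. It is not the actual worksheet application: the class name, the method names, the tag names (DICTIONARY, WORD) and the attribute names (ID, TEXT, GERMAN) are all made up for the illustration.

```python
import xml.etree.ElementTree as ET


class Dictionary:
    """Wraps one DICTIONARY element and adds methods for working with it."""

    def __init__(self, element):
        self.element = element  # all the data lives in the XML element

    def add_word(self, label, text, german):
        """Store a new word as a WORD element with a unique label."""
        return ET.SubElement(
            self.element, "WORD", {"ID": label, "TEXT": text, "GERMAN": german}
        )

    def find_by_text(self, text):
        """Return every WORD element spelled this way (there may be several)."""
        return [w for w in self.element.findall("WORD") if w.get("TEXT") == text]


dictionary = Dictionary(ET.Element("DICTIONARY", {"NAME": "ESL worksheets"}))
dictionary.add_word("fork-bike", "fork", "Fahrradgabel")
dictionary.add_word("fork-table", "fork", "Gabel")

for word in dictionary.find_by_text("fork"):
    print(word.get("ID"), word.get("GERMAN"))
```

Because the object only holds on to the element, saving is just a matter of writing that element back to a file.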

You can read it.

Here’s the other thing about XML: I can use it to store data and then, when I realize I’ve made a mistake (there’s a typo in a word, or students say a definition I wrote is unclear), I don’t need to write up a bunch of code to access that data and change it — though I should and eventually will — but instead I can open the file in any editor and edit the file.

I don’t think it’s possible for me to overstate how thankful I am that I can read and directly edit the information I have stored in XML. Even more than being convenient, it’s reassuring to me that I can fix my mistakes without too much effort.
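For comparison, the “bunch of code” route does not have to be long either. The sketch below assumes Python’s built-in xml.etree.ElementTree module and a made-up file layout (a DICTIONARY element holding WORD elements with ID and TEXT attributes). It writes a tiny file with a typo in it, then loads it, fixes the typo, and saves it again.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# A tiny stand-in for a saved dictionary file, with a typo in one word.
path = os.path.join(tempfile.mkdtemp(), "dictionary.xml")
with open(path, "w", encoding="utf-8") as handle:
    handle.write('<DICTIONARY><WORD ID="w1" TEXT="piza" /></DICTIONARY>')

# Load the file, fix the one attribute, and save it again.
tree = ET.parse(path)
word = tree.getroot().find("WORD[@ID='w1']")
word.set("TEXT", "pizza")
tree.write(path, encoding="utf-8", xml_declaration=True)

# Reading it back shows the corrected spelling.
fixed = ET.parse(path).getroot().find("WORD").get("TEXT")
print(fixed)
```

Opening the file in an editor and typing the missing letter gets the same result, which is rather the point.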

This is the first in a small series I’m doing on XML for my nephew who is also a new coder. He recently wrote me that he’d rather just set up a text file for the Typing Challenge I set him. Of course, he can do what he wants, but I think it’s the prerogative of an uncle to send long, rambling dissertations on his opinions.