#Document Structure
Every XML document is a tree. If you've worked with the DOM in JavaScript or XDocument in .NET, you already understand the shape — XML just has more node types than JSON.
#The Node Types
JSON has objects, arrays, strings, numbers, booleans, and null. XML has seven node types:
|
Node Type |
Example |
JSON Equivalent |
|---|---|---|
|
Document |
The root container |
The top-level |
|
Element |
|
An object key + value |
|
Attribute |
|
No direct equivalent |
|
Text |
Content inside elements |
A string value |
|
Comment |
|
Not supported |
|
Processing Instruction |
|
Not supported |
|
Namespace |
|
Not supported |
For most practical work, you'll deal with elements, attributes, and text.
#Elements vs Attributes
This is the first design decision XML gives you that JSON doesn't:
<!-- Data as an element -->
<customer>
<name>Acme Corp</name>
</customer>
<!-- Data as an attribute -->
<customer name="Acme Corp"/>Both are valid. The rule of thumb:
-
Attributes for simple metadata: IDs, dates, types, flags
-
Elements for content: text, nested structures, things that might repeat
Attributes cannot contain child elements or mixed content. Elements can contain anything.
#Mixed Content
This is something JSON fundamentally cannot represent:
<paragraph>
For more details, see <link href="/api">the API reference</link>
or contact <emphasis>support</emphasis>.
</paragraph>Text and elements interleaved. This is why every document format (HTML, DocBook, DITA, OOXML) uses XML — JSON has no way to express "text with inline markup."
#Well-Formedness
Every XML document must be well-formed:
-
Exactly one root element
-
Every opening tag has a closing tag (or is self-closing:
<br/>) -
Tags are properly nested (no
<b><i></b></i>) -
Attribute values are quoted
-
Element and attribute names are case-sensitive
This is stricter than JSON (which allows trailing commas in some parsers) but means every XML parser produces the same tree — no ambiguity.
#In .NET
using System.Xml.Linq;
var doc = XDocument.Parse("""
<order id="12345">
<customer>Acme Corp</customer>
</order>
""");
// Navigate the tree
var orderId = doc.Root?.Attribute("id")?.Value; // "12345"
var customer = doc.Root?.Element("customer")?.Value; // "Acme Corp"This is LINQ to XML — and it maps directly to the tree structure we've described. XPath and XQuery provide a more powerful way to navigate these same trees, which we'll cover next.