<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>C&#43;&#43; on Manuel Herrmann</title>
    <link>https://blog.0x17.de/tags/c&#43;&#43;/</link>
    <description>Recent content in C&#43;&#43; on Manuel Herrmann</description>
    <image>
      <title>Manuel Herrmann</title>
      <url>https://blog.0x17.de/images/mh.jpg</url>
      <link>https://blog.0x17.de/images/mh.jpg</link>
    </image>
    <generator>Hugo -- 0.159.0</generator>
    <language>en</language>
    <lastBuildDate>Thu, 01 May 2025 00:00:00 +0200</lastBuildDate>
    <atom:link href="https://blog.0x17.de/tags/c++/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Generic XML Parsing in C&#43;&#43;: A Template Metaprogramming Adventure</title>
      <link>https://blog.0x17.de/post/generic-xml-parsing-cpp/</link>
      <pubDate>Thu, 01 May 2025 00:00:00 +0200</pubDate>
      <guid>https://blog.0x17.de/post/generic-xml-parsing-cpp/</guid>
      <description>&lt;p&gt;Some years ago, I found myself experimenting with template metaprogramming in C++. One of the most intriguing projects that came out of this tinkering was a generic XML parser that combined several powerful features: declarative syntax, automatic validation, and bidirectional serialization/deserialization—all without requiring duplicate code.&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;intro&#34; loading=&#34;lazy&#34; src=&#34;https://blog.0x17.de/post/generic-xml-parsing-cpp/intro.jpg&#34;&gt;&lt;/p&gt;
&lt;p&gt;Today I&amp;rsquo;m dusting off this experiment to share how it works and reflect on how modern C++ features could simplify this approach even further.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>Some years ago, I found myself experimenting with template metaprogramming in C++. One of the most intriguing projects that came out of this tinkering was a generic XML parser that combined several powerful features: declarative syntax, automatic validation, and bidirectional serialization/deserialization—all without requiring duplicate code.</p>
<p><img alt="intro" loading="lazy" src="/post/generic-xml-parsing-cpp/intro.jpg"></p>
<p>Today I&rsquo;m dusting off this experiment to share how it works and reflect on how modern C++ features could simplify this approach even further.</p>
<h2 id="the-goal-declarative-xml-handling">The Goal: Declarative XML Handling</h2>
<p>The primary goal of this project was to create a system where:</p>
<ol>
<li>You could define an XML structure once, using a declarative syntax</li>
<li>The same definition would handle both parsing and serialization</li>
<li>Validation would be built-in (required vs. optional elements)</li>
<li>The code would be type-safe and leverage compile-time checking</li>
</ol>
<p>What I wanted to avoid was the typical approach where you write separate code for parsing, validation, and serialization—an approach that often leads to duplicated logic and inconsistencies.</p>
<p>For example, consider this XML document:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-xml" data-lang="xml"><span style="display:flex;"><span><span style="color:#f92672">&lt;root</span> <span style="color:#a6e22e">key=</span><span style="color:#e6db74">&#34;mykey&#34;</span><span style="color:#f92672">&gt;</span>
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&lt;data</span> <span style="color:#a6e22e">id=</span><span style="color:#e6db74">&#34;1&#34;</span><span style="color:#f92672">&gt;</span>D1<span style="color:#f92672">&lt;/data&gt;</span>
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&lt;data</span> <span style="color:#a6e22e">id=</span><span style="color:#e6db74">&#34;2&#34;</span><span style="color:#f92672">&gt;</span>D2<span style="color:#f92672">&lt;/data&gt;</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">&lt;/root&gt;</span>
</span></span></code></pre></div><p>In a traditional approach, you might write separate parsing code, validation logic, and serialization functions for this structure. But with this template-based approach, you define the structure once and get all these capabilities automatically.</p>
<p>Here&rsquo;s a glimpse of what the final API looked like:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-cpp" data-lang="cpp"><span style="display:flex;"><span><span style="color:#66d9ef">auto</span> xml <span style="color:#f92672">=</span>
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;root&#34;</span>_node(
</span></span><span style="display:flex;"><span>        Required(),
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;key&#34;</span>_attr(Required()),
</span></span><span style="display:flex;"><span>        <span style="color:#e6db74">&#34;client_id&#34;</span>_attr(),
</span></span><span style="display:flex;"><span>        NodeList(
</span></span><span style="display:flex;"><span>            <span style="color:#e6db74">&#34;data&#34;</span>_node(
</span></span><span style="display:flex;"><span>                <span style="color:#e6db74">&#34;id&#34;</span>_attr(Required()),
</span></span><span style="display:flex;"><span>                Text(Required()))));
</span></span></code></pre></div><p>With this single definition, you could:</p>
<ul>
<li>Parse XML strings into a structured <code>NodeData</code> object</li>
<li>Validate that all required elements exist</li>
<li>Serialize the data structure back to XML</li>
</ul>
<h2 id="how-it-works-template-metaprogramming-magic">How It Works: Template Metaprogramming Magic</h2>
<p>The implementation relies heavily on several C++ template metaprogramming techniques:</p>
<h3 id="1-the-nodedata-structure">1. The NodeData Structure</h3>
<p>The core of the system is a generic <code>NodeData</code> structure that acts as an intermediate representation:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-cpp" data-lang="cpp"><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">NodeData</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    std<span style="color:#f92672">::</span>string name;
</span></span><span style="display:flex;"><span>    std<span style="color:#f92672">::</span>string text;
</span></span><span style="display:flex;"><span>    std<span style="color:#f92672">::</span>map<span style="color:#f92672">&lt;</span>std<span style="color:#f92672">::</span>string, std<span style="color:#f92672">::</span>vector<span style="color:#f92672">&lt;</span>NodeData<span style="color:#f92672">&gt;&gt;</span> subnodes;
</span></span><span style="display:flex;"><span>    std<span style="color:#f92672">::</span>map<span style="color:#f92672">&lt;</span>std<span style="color:#f92672">::</span>string, std<span style="color:#f92672">::</span>string<span style="color:#f92672">&gt;</span> attributes;
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>This structure stores everything we need about an XML node: its name, text content, child nodes, and attributes.</p>
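<p>As a rough illustration, the example document from the beginning could be represented like this. The helpers <code>make_data</code> and <code>make_example_root</code> are hypothetical names introduced for this sketch, and it assumes that <code>subnodes</code> is keyed by the child element name:</p>

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Same layout as the NodeData struct shown above.
struct NodeData
{
    std::string name;
    std::string text;
    std::map<std::string, std::vector<NodeData>> subnodes;
    std::map<std::string, std::string> attributes;
};

// Hypothetical helper: builds one <data id="...">text</data> entry.
NodeData make_data(const std::string& id, const std::string& text)
{
    NodeData data;
    data.name = "data";
    data.text = text;
    data.attributes["id"] = id;
    return data;
}

// Hypothetical helper: hand-builds the tree that parsing the example
// document is assumed to produce.
NodeData make_example_root()
{
    NodeData root;
    root.name = "root";
    root.attributes["key"] = "mykey";
    // Assumption: children are grouped by element name.
    root.subnodes["data"].push_back(make_data("1", "D1"));
    root.subnodes["data"].push_back(make_data("2", "D2"));
    return root;
}
```

<p>Because the children are grouped in a vector per name, repeated elements such as the two <code>data</code> nodes keep their document order.</p>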
<h3 id="2-node-types-and-traits">2. Node Types and Traits</h3>
<p>The system defines several node types that know how to interact with both XML and the <code>NodeData</code> structure:</p>
<ul>
<li><code>Node&lt;name, Args...&gt;</code>: Represents an XML element</li>
<li><code>Attribute&lt;name, Args...&gt;</code>: Represents an XML attribute</li>
<li><code>Text&lt;Args...&gt;</code>: Represents text content within an element</li>
<li><code>NodeList&lt;SubNodeType, Args...&gt;</code>: Represents a list of similar child nodes</li>
</ul>
<p>Each node type implements several key methods:</p>
<ul>
<li><code>subnode()</code>: Retrieves the relevant part of an XML document</li>
<li><code>validate()</code>: Checks if the node meets requirements</li>
<li><code>parse()</code>: Extracts data from XML into <code>NodeData</code></li>
<li><code>serialize()</code>: Writes data from <code>NodeData</code> back to XML</li>
</ul>
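<p>To make the four methods more concrete, here is a minimal sketch of what an attribute type might look like. It is not the original implementation: <code>XmlElement</code> is a hypothetical stand-in for whatever XML library sits underneath, <code>AttributeSketch</code> takes a plain <code>bool</code> instead of the <code>Args...</code> pack, and <code>NodeData</code> is reduced to the fields needed here:</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an element type from a real XML library.
struct XmlElement
{
    std::string name;
    std::map<std::string, std::string> attributes;
};

// Reduced NodeData: only the fields this sketch touches.
struct NodeData
{
    std::string name;
    std::map<std::string, std::string> attributes;
};

// Sketch of an attribute type. The name and the "required" flag are
// template parameters, so each instantiation describes one attribute.
template<const char* name, bool required>
struct AttributeSketch
{
    // Retrieves the part of the element this type is responsible for
    // (nullptr when the attribute is absent).
    static const std::string* subnode(const XmlElement& xml)
    {
        auto it = xml.attributes.find(name);
        return it == xml.attributes.end() ? nullptr : &it->second;
    }

    // A required attribute must be present; an optional one always passes.
    static bool validate(const XmlElement& xml)
    {
        return !required || subnode(xml) != nullptr;
    }

    // Copies the attribute from the XML element into NodeData, if present.
    static void parse(const XmlElement& xml, NodeData& data)
    {
        if (const std::string* value = subnode(xml))
            data.attributes[name] = *value;
    }

    // Writes the attribute from NodeData back into the XML element, if present.
    static void serialize(const NodeData& data, XmlElement& xml)
    {
        auto it = data.attributes.find(name);
        if (it != data.attributes.end())
            xml.attributes[name] = it->second;
    }
};

// A namespace-scope array can be used as a const char* template argument.
constexpr char key_name[] = "key";
using KeyAttribute = AttributeSketch<key_name, true>;
```

<p>Since <code>parse()</code> and <code>serialize()</code> live on the same type and share the same name, the two directions cannot drift apart.</p>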
<h3 id="3-type-deduction-and-user-defined-literals">3. Type Deduction and User-Defined Literals</h3>
<p>One of the key challenges in C++14 was that class templates couldn&rsquo;t deduce their template parameters from constructor arguments. This limitation required a creative workaround using user-defined literals and builder classes:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-cpp" data-lang="cpp"><span style="display:flex;"><span><span style="color:#66d9ef">template</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">CharT</span>, CharT... chars<span style="color:#f92672">&gt;</span> <span style="color:#66d9ef">auto</span> <span style="color:#66d9ef">operator</span><span style="color:#e6db74">&#34;&#34;</span>_node()
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">static</span> <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">char</span> name[] <span style="color:#f92672">=</span> {chars..., <span style="color:#ae81ff">0</span>};
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> NodeBuilder<span style="color:#f92672">&lt;</span>name<span style="color:#f92672">&gt;</span>();
</span></span><span style="display:flex;"><span>}
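</span></span></code></pre></div><p>This trick allows us to write <code>&quot;root&quot;_node</code>, which yields a <code>NodeBuilder&lt;&quot;root&quot;&gt;</code>. Calling that builder, as in <code>&quot;root&quot;_node(...)</code>, then builds a <code>Node&lt;&quot;root&quot;, Args...&gt;</code> with the argument types deduced from the call. Note that this string form of the literal operator template is a compiler extension supported by GCC and Clang rather than standard C++.</p>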
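<p>The builder side can be sketched in standard C++ without the literal operator. The following is an assumption about how such a builder might look, not the original code: the name is fixed as a template parameter, and the call operator deduces <code>Args...</code>. <code>root_name</code> and <code>root_node</code> are hypothetical stand-ins for what <code>&quot;root&quot;_node</code> would produce:</p>

```cpp
#include <cstddef>
#include <type_traits>

struct Required {};  // marker type, used as in the definition above

// Sketch of the element type: the name and all argument types
// are part of the type itself.
template<const char* name, class... Args>
struct Node
{
    static constexpr const char* element_name = name;
    static constexpr std::size_t argument_count = sizeof...(Args);
};

// Hypothetical builder: the name is already fixed, and the call
// operator deduces Args... from whatever is passed in.
template<const char* name>
struct NodeBuilder
{
    template<class... Args>
    Node<name, Args...> operator()(Args...) const
    {
        return {};
    }
};

// Standard C++ stand-in for the result of "root"_node.
constexpr char root_name[] = "root";
constexpr NodeBuilder<root_name> root_node{};

// The call deduces Args = {Required} without spelling out any template arguments.
static_assert(
    std::is_same<decltype(root_node(Required())), Node<root_name, Required>>::value,
    "the builder deduces the argument types");
```

<p>Splitting the work this way means the user never has to write the argument types by hand, which is what class template argument deduction was missing at the time.</p>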
<h3 id="4-variadic-templates-for-complex-structures">4. Variadic Templates for Complex Structures</h3>
<p>Variadic templates allow us to handle arbitrary nesting and combinations of nodes:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-cpp" data-lang="cpp"><span style="display:flex;"><span><span style="color:#66d9ef">template</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">class</span><span style="color:#960050;background-color:#1e0010">... </span><span style="color:#a6e22e">Args</span><span style="color:#f92672">&gt;</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">is_required</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">template</span><span style="color:#f92672">&lt;&gt;</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">is_required</span><span style="color:#f92672">&lt;&gt;</span> <span style="color:#f92672">:</span> std<span style="color:#f92672">::</span>false_type { };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">template</span><span style="color:#f92672">&lt;</span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Arg</span>, <span style="color:#66d9ef">class</span><span style="color:#960050;background-color:#1e0010">... </span><span style="color:#a6e22e">Args</span><span style="color:#f92672">&gt;</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">is_required</span><span style="color:#f92672">&lt;</span>Arg, Args...<span style="color:#f92672">&gt;</span> <span style="color:#f92672">:</span> 
</span></span><span style="display:flex;"><span>    std<span style="color:#f92672">::</span>conditional_t<span style="color:#f92672">&lt;</span>std<span style="color:#f92672">::</span>is_same_v<span style="color:#f92672">&lt;</span>Arg, Required<span style="color:#f92672">&gt;</span>, 
</span></span><span style="display:flex;"><span>                      std<span style="color:#f92672">::</span>true_type, 
</span></span><span style="display:flex;"><span>                      is_required<span style="color:#f92672">&lt;</span>Args...<span style="color:#f92672">&gt;&gt;</span> { };
</span></span></code></pre></div><p>This recursive template specialization checks if <code>Required</code> is among the arguments passed to a node, enabling us to handle validation elegantly.</p>
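<p>A few compile-time checks show how the trait behaves. This sketch restates it with <code>std::is_same&lt;...&gt;::value</code> so it also builds as C++14; <code>Other</code> is a hypothetical placeholder for any argument that is not <code>Required</code>, and the <code>is_required_v</code> helper is assumed to be a plain variable template:</p>

```cpp
#include <type_traits>

struct Required {};
struct Other {};  // hypothetical placeholder for any non-Required argument

template<class... Args>
struct is_required;

// Base case: an empty argument list means "not required".
template<>
struct is_required<> : std::false_type { };

// Recursive case: stop at the first Required, otherwise inspect the tail.
template<class Arg, class... Args>
struct is_required<Arg, Args...> :
    std::conditional_t<std::is_same<Arg, Required>::value,
                       std::true_type,
                       is_required<Args...>> { };

// Assumed convenience helper in the style of the standard *_v traits.
template<class... Args>
constexpr bool is_required_v = is_required<Args...>::value;

static_assert(!is_required_v<>, "an empty list is optional");
static_assert(is_required_v<Required>, "Required as the only argument");
static_assert(is_required_v<Other, Required>, "Required found in the tail");
static_assert(!is_required_v<Other, Other>, "no Required anywhere");
```

<p>Because the result is a compile-time constant, a node type can select its validation behaviour without any runtime flag.</p>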
<h2 id="example-in-action">Example in Action</h2>
<p>Let&rsquo;s see how the system handles various XML inputs:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-cpp" data-lang="cpp"><span style="display:flex;"><span><span style="color:#66d9ef">auto</span> examples <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&lt;wrong /&gt;&#34;</span>s,                                                            <span style="color:#75715e">// Wrong root element
</span></span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&lt;root /&gt;&#34;</span>s,                                                             <span style="color:#75715e">// Missing required attribute
</span></span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&lt;root key=</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74">mykey</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74"> /&gt;&#34;</span>s,                                               <span style="color:#75715e">// Valid minimal example
</span></span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&lt;root key=</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74">mykey</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74">&gt;&lt;data id=</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74">1</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74"> /&gt;&lt;/root&gt;&#34;</span>s,                         <span style="color:#75715e">// Missing required text
</span></span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;&lt;root key=</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74">mykey</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74">&gt;&lt;data id=</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74">1</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74">&gt;D1&lt;/data&gt;&lt;data id=</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74">2</span><span style="color:#ae81ff">\&#34;</span><span style="color:#e6db74">&gt;D2&lt;/data&gt;&lt;/root&gt;&#34;</span>s  <span style="color:#75715e">// Fully valid
</span></span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>The system will:</p>
<ol>
<li>Reject <code>&lt;wrong /&gt;</code> because it has the wrong root element name</li>
<li>Reject <code>&lt;root /&gt;</code> because the required &ldquo;key&rdquo; attribute is missing</li>
<li>Accept <code>&lt;root key=&quot;mykey&quot; /&gt;</code> as a minimal valid document</li>
<li>Reject <code>&lt;root key=&quot;mykey&quot;&gt;&lt;data id=&quot;1&quot; /&gt;&lt;/root&gt;</code> because the data node is missing required text</li>
<li>Accept the full example with two data nodes, each with their required attributes and text</li>
</ol>
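<p>Spelled out by hand, the rules behind these five outcomes look roughly like the following. This is only a sketch of the checks that the declarative definition implies, operating on an already parsed <code>NodeData</code>; the function names are hypothetical, and it assumes that a required text means non-empty text:</p>

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct NodeData
{
    std::string name;
    std::string text;
    std::map<std::string, std::vector<NodeData>> subnodes;
    std::map<std::string, std::string> attributes;
};

// Hypothetical hand-written check for one <data> entry.
bool is_valid_data(const NodeData& data)
{
    return data.name == "data"
        && data.attributes.count("id") > 0   // "id"_attr(Required())
        && !data.text.empty();               // Text(Required())
}

// Hypothetical hand-written check for the whole document.
bool is_valid_root(const NodeData& root)
{
    if (root.name != "root")
        return false;                        // wrong root element
    if (root.attributes.count("key") == 0)
        return false;                        // "key"_attr(Required())

    auto it = root.subnodes.find("data");
    if (it == root.subnodes.end())
        return true;                         // an empty NodeList is allowed

    for (const NodeData& data : it->second)
        if (!is_valid_data(data))
            return false;
    return true;
}
```

<p>The point of the template-based approach is that none of this has to be written manually: the same checks follow from the definition.</p>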
<h2 id="modern-c-improvements">Modern C++ Improvements</h2>
<p>This code was written in the early days of C++17. If I were to revisit it today with C++20, several improvements would be possible:</p>
<h3 id="1-class-template-argument-deduction-ctad">1. Class Template Argument Deduction (CTAD)</h3>
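<p>C++17 introduced CTAD, which could remove much of the need for the <code>NodeBuilder</code> approach. One caveat: CTAD deduces either all template arguments or none, so the syntax below would need <code>Node</code> to be a function template or a builder object rather than relying on CTAD alone. With C++20 we could then write something close to:</p>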
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-cpp" data-lang="cpp"><span style="display:flex;"><span><span style="color:#75715e">// Instead of &#34;root&#34;_node(Required())
</span></span></span><span style="display:flex;"><span>Node<span style="color:#f92672">&lt;</span><span style="color:#e6db74">&#34;root&#34;</span><span style="color:#f92672">&gt;</span>(Required())
</span></span></code></pre></div><h3 id="2-string-literals-as-template-parameters">2. String Literals as Template Parameters</h3>
<p>C++20 allows class types as non-type template parameters. A string literal still cannot be bound directly to a <code>template&lt;auto&gt;</code> parameter, but a small user-written wrapper around a character array (called <code>FixedString</code> here, a hypothetical name) makes the intended syntax possible and would greatly simplify the implementation:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-cpp" data-lang="cpp"><span style="display:flex;"><span><span style="color:#66d9ef">template</span><span style="color:#f92672">&lt;</span>FixedString name<span style="color:#f92672">&gt;</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Node</span> { <span style="color:#75715e">/* ... */</span> };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Usage:
</span></span></span><span style="display:flex;"><span>Node<span style="color:#f92672">&lt;</span><span style="color:#e6db74">&#34;root&#34;</span><span style="color:#f92672">&gt;</span>
</span></span></code></pre></div><h3 id="3-concepts-and-constraints">3. Concepts and Constraints</h3>
<p>C++20 concepts would allow for more precise constraints on template parameters, making error messages clearer and, in some cases, reducing compile times compared to hand-written trait checks:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-cpp" data-lang="cpp"><span style="display:flex;"><span><span style="color:#66d9ef">template</span><span style="color:#f92672">&lt;</span>NodeConcept... Children<span style="color:#f92672">&gt;</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Node</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Implementation
</span></span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><h3 id="4-stdvisit-and-variant">4. <code>std::visit</code> and Variant</h3>
<p>Using <code>std::variant</code> together with <code>std::visit</code>, both from C++17, could provide a more type-safe approach to handling different node types:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-cpp" data-lang="cpp"><span style="display:flex;"><span><span style="color:#66d9ef">using</span> NodeVariant <span style="color:#f92672">=</span> std<span style="color:#f92672">::</span>variant<span style="color:#f92672">&lt;</span>Element, Attribute, Text<span style="color:#f92672">&gt;</span>;
</span></span></code></pre></div><h3 id="5-if-constexpr">5. <code>if constexpr</code></h3>
<p>C++17&rsquo;s <code>if constexpr</code> could eliminate the need for some of the template specializations:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-cpp" data-lang="cpp"><span style="display:flex;"><span><span style="color:#66d9ef">if</span> <span style="color:#a6e22e">constexpr</span> (is_required_v<span style="color:#f92672">&lt;</span>Args...<span style="color:#f92672">&gt;</span>) {
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Handle required case
</span></span></span><span style="display:flex;"><span>} <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// Handle optional case
</span></span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h2 id="conclusion">Conclusion</h2>
<p>This XML parsing experiment demonstrates the power of template metaprogramming in C++. By combining user-defined literals, variadic templates, and SFINAE, we can create a declarative API for XML handling that provides type safety, validation, and bidirectional conversion.</p>
<p>While the implementation might seem complex, it delivers significant benefits:</p>
<ol>
<li>Write once, use for both parsing and serialization</li>
<li>Built-in validation enforced at runtime</li>
<li>Declarative syntax that mirrors the structure of XML</li>
<li>Type safety that catches errors at compile time</li>
</ol>
<p>Modern C++ could make this implementation even more elegant and concise, but the core ideas remain valuable. Template metaprogramming allows us to create domain-specific languages within C++ that can dramatically reduce boilerplate and improve code reliability.</p>
<p>The next time you find yourself writing separate code for parsing, validating, and serializing a data format, consider whether a template-based approach might let you define the structure once and get all those operations for free.</p>
<hr>
<p><em>See the full code example <a href="https://www.0x17.de/tools/generic_xml_parser.html">here</a></em></p>
]]></content:encoded>
    </item>
  </channel>
</rss>
