[SOLVED] Speed efficiency in parsing XML with XMLStarlet in zsh or bash

Issue

We have a fairly complex zsh application that uses XML files for storing its configuration and data. The current approach to reading from and writing to those files is using xmlstarlet.

When updating a file, we pipe the whole XML document through xmlstarlet multiple times, once for each attribute or element we touch, as follows:

cat "$config" \
| xml_addSubnode              "/a/b/c"                 "foo" \
| xml_createOrUpdateAttribute "/a/b/c/foo[last()]"     "attr1"  "zzzz" \
| xml_createOrUpdateAttribute "/a/b/c/foo[last()]"     "attr2"  "wwww" \
\
| xml_addSubnode              "/a/b/c/foo[last()]"     "bar" \
| xml_createOrUpdateAttribute "/a/b/c/foo[last()]/bar" "attr4"  "zzzz" \
| xml_createOrUpdateAttribute "/a/b/c/foo[last()]/bar" "attr5"  "kkkk" \
\
| xml_update "$config"

The attributes are read into shell variables by calling xml separately for each value:

local foo="$(xml_value "$xpath" "$config")"
local bar="$(xml_value "$xpath" "$config")"
...

The utility functions boil down to the following:

xml_addSubnode() {
    ...
    cat | xml ed -s "$elementXPath" -t elem -n "$element"
}

xml_createOrUpdateAttribute()
{
    ...
    cat | xml ed --update ... --insert ...
}

xml_value()
{
    ...
    xml sel -t -v "$xPath" "$xmlFile"
}

xml_update()
{
    ...
    cat > "$file"
}

This code works, but the performance is horrible: every pipeline stage and every xml_value call starts a new xml process that parses the whole document again.

How can this code be made efficient? What other ways are there to parse XML with zsh or bash that would yield a faster execution?

Using another format is also an option although it would require some migration effort. I know about the jq JSON parser but the usage would be similar to xmlstarlet and I would not gain much if I follow the same approach, right?

The program runs on FreeBSD.

Solution

xmlstarlet ed accepts several edit operations in one invocation, so you can do all the updating in a single pass, which will be much faster than calling it six times:

#!/usr/bin/env zsh

cat test.xml
print -- --------
xmlstarlet ed \
           -s '/a/b/c' -t elem -n foo \
           -s '/a/b/c/foo[last()]' -t attr -n attr1 -v zzzz \
           -s '/a/b/c/foo[last()]' -t attr -n attr2 -v wwww \
           -s '/a/b/c/foo[last()]' -t elem -n bar \
           -s '/a/b/c/foo[last()]/bar' -t attr -n attr3 -v zzzz \
           -s '/a/b/c/foo[last()]/bar' -t attr -n attr4 -v kkkk \
           test.xml

Example:

$ ./test.sh
<?xml version="1.0"?>
<a><b><c/></b></a>
--------
<?xml version="1.0"?>
<a>
  <b>
    <c>
      <foo attr1="zzzz" attr2="wwww">
        <bar attr3="zzzz" attr4="kkkk"/>
      </foo>
    </c>
  </b>
</a>
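
The script above prints the result to stdout, and the question also asks about reads. The following is a minimal sketch of how the same single-process idea might cover both, assuming xmlstarlet's -L (--inplace) option for writing back and one sel call with several -v/-n pairs for reading. The temporary file, the variable names and the chosen XPaths are illustrative only and are not taken from the original application. It also assumes that no value contains a newline.

```shell
#!/bin/sh
# Sketch only: file contents, variable names and XPaths are illustrative.
# The binary is installed as "xmlstarlet" on some systems and as "xml" on others.
XML=$(command -v xmlstarlet || command -v xml)

# Hypothetical config file with the same skeleton as test.xml above.
config=$(mktemp)
printf '%s\n' '<?xml version="1.0"?>' '<a><b><c/></b></a>' > "$config"

# One process for all edits; -L (--inplace) writes the result back to the file.
"$XML" ed -L \
    -s '/a/b/c' -t elem -n foo \
    -s '/a/b/c/foo[last()]' -t attr -n attr1 -v zzzz \
    -s '/a/b/c/foo[last()]' -t elem -n bar \
    -s '/a/b/c/foo[last()]/bar' -t attr -n attr4 -v kkkk \
    "$config"

# One process for all reads: each -v prints a value, each -n a newline.
values=$("$XML" sel -t \
    -v '/a/b/c/foo[last()]/@attr1' -n \
    -v '/a/b/c/foo[last()]/bar/@attr4' -n \
    "$config")

# Split the output lines into variables (assumes no value contains a newline).
{
    IFS= read -r attr1
    IFS= read -r attr4
} <<EOF
$values
EOF

printf 'attr1=%s attr4=%s\n' "$attr1" "$attr4"
rm -f "$config"
```

The constructs used are plain POSIX shell, so the sketch should behave the same under zsh and bash. Inside a function, the two variables can be declared with local before the read group. Editing in place also avoids reading from and writing to "$config" in the same pipeline.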

Answered By – Shawn

Answer Checked By – Senaida (BugsFixing Volunteer)
