
XPath with RegEx search field expression type plugin

Use this plugin to apply RegEx expressions on extracted values from large XML messages

The Nodinite XPath with RegEx Search Field Expression Type plugin can be used to match and/or style (using RegEx) one or more unique values from elements and attributes (using XPath) in your logged XML documents.

A logged message comes from a Log Event, which is part of the Logging feature of Nodinite. Search Fields are then used in self-service enabled Log Views for your business.

Quick example

The following simple example shows how this Search Field plugin works. For more advanced examples, see the Examples section further down this page.

Input (XML data):

<ns0:Orders xmlns:ns0="Common.Schemas/Nodinite/1.0">
    <Header>
    <FileName>\\nodinitesrv01\ftp\public\INT001\Order\OrderFile_123.xml</FileName>
    </Header>
    <Order>
        <Id>101</Id>
        <Amount>1000</Amount>
        <City>Karlstad</City>
    </Order>
</ns0:Orders>

1st Expression (XPath expression):

Orders/Header/FileName

2nd Expression (RegEx configuration):

([^\\]+$)

Result (unique values):

OrderFile_123.xml

Features

This plugin uses a high-performance, read-only, forward-only stream reader based on Microsoft's XPathReader.

  • Extract single or multiple unique values from XML messages (payload)
  • Supports many standard XPath expressions as defined by the W3C
  • The XPathReader provides the ability to perform XPath over XML documents in a streaming manner (large messages are thereby supported)
    • The XPathReader provides the ability to filter and process large XML documents in an efficient manner using an XPath-aware XmlReader. With the XPathReader, one can sequentially process a large document and extract an identified sub-tree matched by the user configured XPath expression
  • RegEx expressions can be applied to further match and style the output from the XPath expression

Because the plugin uses a forward-only stream reader, not all types of XPath expressions can be evaluated.
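The forward-only behaviour can be illustrated with a small Python sketch. It does not use XPathReader; the standard library's `xml.etree.ElementTree.iterparse` serves as a stand-in streaming parser, and the helper names `local_name` and `stream_values` as well as the second order in the sample data are made up for illustration. The sketch tracks which elements are currently open and yields the text of elements whose path matches a simple child-step path. An expression that needs to look back at nodes already passed cannot be answered this way, which is the kind of limitation described above.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to qualified tags."""
    return tag.rsplit("}", 1)[-1]


def stream_values(source, path):
    """Yield the text of every element whose path from the root equals `path`.

    `path` is a simple slash-separated list of element names, for example
    'Orders/Order/Id'. Only forward, child-step paths are supported.
    """
    wanted = path.split("/")
    open_elements = []  # names of the elements currently open, root first
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            open_elements.append(local_name(elem.tag))
        else:
            if open_elements == wanted:
                yield (elem.text or "").strip()
            open_elements.pop()
            elem.clear()  # drop the content of the element we just left


# Illustrative data only: the second order is not part of the original sample.
xml_bytes = (
    b'<ns0:Orders xmlns:ns0="Common.Schemas/Nodinite/1.0">'
    b"<Order><Id>101</Id><City>Karlstad</City></Order>"
    b"<Order><Id>102</Id><City>Stockholm</City></Order>"
    b"</ns0:Orders>"
)
order_ids = list(stream_values(io.BytesIO(xml_bytes), "Orders/Order/Id"))
print(order_ids)
```

Since each element is cleared once it has been passed, the reader never needs the whole document in memory at once, which is why this style of processing suits large messages.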

How to use

To extract values from XML messages, you must first configure the Search Field; in this example, the Order Id.

Once a Search Field is configured, values are extracted either during normal processing or by user-initiated re-index operations. Extracted values are stored for the number of days set in the 'days to keep events' property of the Message Type.

Test Expression

You can test an expression when configuring a Search Field in the 'Test Expression' tab

  1. Enter an appropriate payload in the 'Message Body' tab
  2. Select the 'XPath with RegEx' expression type plugin
  3. Enter a valid XPath expression (you can also click on elements/attributes to get a suggestion)
  4. Optionally check the 'Treat sub XML as a string' checkbox (the result from XPath is either an XML node or a string depending on this checkbox)
  5. Enter a RegEx expression to further match/style the output from step 3
  6. Enter the number or the name of the RegEx Group(s) to return (leave empty for all matches)
  7. Check the 'Global' checkbox to continue past the first match and return all matches
  8. Review the result/output and adjust the settings in steps 3-7 until you get the data you seek
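The RegEx group and Global options in the steps above can be approximated outside Nodinite with a short Python sketch. This is not the plugin's implementation: the function name `apply_regex`, its parameters, and the sample string are hypothetical, Python's `re` module stands in for the plugin's RegEx engine, and an empty group setting is assumed to mean "return the whole match". Note that Python writes named groups as `(?P<name>...)`, which may differ from the syntax the plugin expects.

```python
import re


def apply_regex(text, pattern, group=None, global_match=False):
    """Return unique values matched by `pattern` in `text`, in first-seen order.

    group: number or name of the group to return; None returns the whole match
           (an assumption about what an empty group setting means).
    global_match: False stops after the first match, True collects all matches.
    """
    values = []
    for match in re.finditer(pattern, text):
        value = match.group(0 if group is None else group)
        if value and value not in values:
            values.append(value)
        if not global_match:
            break
    return values


sample = "Id=101;Id=102;Id=101"  # made-up input for illustration
first_only = apply_regex(sample, r"Id=(\d+)", group=1)
all_unique = apply_regex(sample, r"Id=(\d+)", group=1, global_match=True)
by_name = apply_regex(sample, r"Id=(?P<id>\d+)", group="id", global_match=True)
whole_match = apply_regex(sample, r"Id=(\d+)")
print(first_only, all_unique, by_name, whole_match)
```

Trying an expression this way before entering it in the 'Test Expression' tab can shorten the trial-and-error loop in step 8, as long as differences between RegEx engines are kept in mind.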

The actual values are extracted by the Logging Service and then presented together with the evaluated processing state and the number of unique matches.

Figure: Test Expression tab showing a valid expression with state output, unique values and total count

If the expression is either invalid or does not match any data, then the following output is presented:
Invalid expression yields no result

Examples

Basic example (style output)

To extract the clean file name from the FileName element in the XML Header for the Message Type 'Common.Schemas/Nodinite/1.0#Orders', use the XPath expression 'Orders/Header/FileName'. This expression yields the value '\\nodinitesrv01\ftp\public\INT001\Order\OrderFile_123.xml'. Applying the RegEx expression '([^\\]+$)' with the options detailed below leaves only the file name ('OrderFile_123.xml').

graph LR
    subgraph "Search Fields"
        sf(fal:fa-search-plus File name)
    end
    subgraph "Search Field Expressions"
        sfe(fal:fa-flask XPath with RegEx plugin)
    end
    subgraph "MessageTypes"
        mt1(fal:fa-file Orders)
    end
    sf --- sfe
    sfe ---|Expression configuration| mt1

Message Body

<ns0:Orders xmlns:ns0="Common.Schemas/Nodinite/1.0">
    <Header>
    <FileName>\\nodinitesrv01\ftp\public\INT001\Order\OrderFile_123.xml</FileName>
    </Header>
    <Order>
        <Id>101</Id>
        <Amount>1000</Amount>
        <City>Karlstad</City>        
    </Order>    
</ns0:Orders>

1st Expression (XPath)

Orders/Header/FileName

2nd Expression configuration (RegEx)

Next you will configure the RegEx expression with its options.

  1. Leave the 'Treat sub XML as a string' checkbox unchecked since the content returned by the XPath expression is not XML
  2. Enter the following RegEx expression
([^\\]+$)
  3. Set the 'RegEx groups' to either empty or 1 (the expression has only one group)
  4. Leave the 'Global' checkbox unchecked since we do not want to continue after the 1st match
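The two expressions of this example can be reproduced with the following Python sketch. It is only an approximation for experimenting outside Nodinite: `xml.etree.ElementTree` and `re` from the standard library stand in for the plugin's XPath and RegEx engines, and the variable names are made up. ElementTree paths are relative to the root element, so the leading 'Orders' step is dropped in the code.

```python
import re
import xml.etree.ElementTree as ET

# Raw string so the backslashes in the UNC path are kept literally.
message_body = r"""<ns0:Orders xmlns:ns0="Common.Schemas/Nodinite/1.0">
    <Header>
    <FileName>\\nodinitesrv01\ftp\public\INT001\Order\OrderFile_123.xml</FileName>
    </Header>
    <Order>
        <Id>101</Id>
        <Amount>1000</Amount>
        <City>Karlstad</City>
    </Order>
</ns0:Orders>"""

root = ET.fromstring(message_body)

# 1st expression: 'Orders/Header/FileName', written relative to the root (Orders).
xpath_value = root.findtext("Header/FileName")

# 2nd expression: group 1 captures everything after the last backslash.
# Only the first match is used, which mirrors leaving 'Global' unchecked.
match = re.search(r"([^\\]+$)", xpath_value)
file_name = match.group(1) if match else None

print(xpath_value)
print(file_name)
```

The character class `[^\\]` excludes the backslash, and the `$` anchor forces the match to reach the end of the value, so only the last path segment can match.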

Basic example (Apply RegEx on returned XML)

In this next example we will apply a RegEx configuration on XML content returned from the initiating XPath expression.

To extract the node names from an XML element within the Message Type 'Envelope', use the XPath expression 'Envelope/Any/Nodes'. This expression yields the XML element 'Nodes' with its child nodes. Applying the RegEx expression '<([a-z0-9]{0,})' with the options detailed below leaves only the unique element names ('node1' and 'node2').

Valid expression with state output, unique values and total count

graph LR
    subgraph "Search Fields"
        sf(fal:fa-search-plus File name)
    end
    subgraph "Search Field Expressions"
        sfe(fal:fa-flask XPath with RegEx plugin)
    end
    subgraph "MessageTypes"
        mt1(fal:fa-file Orders)
    end
    sf --- sfe
    sfe ---|Expression configuration| mt1

Message Body

<Envelope>
<Any>
<Nodes>
<node1 id="1">
<node2/>
</node1>
</Nodes>
</Any>
</Envelope>

1st Expression (XPath)

Envelope/Any/Nodes

2nd Expression configuration (RegEx)

Next you will configure the RegEx expression with its options.

  1. Check the 'Treat sub XML as a string' checkbox since the content returned by the XPath expression is XML
  2. Enter the following RegEx expression
<([a-z0-9]{0,})
  3. Set the 'RegEx groups' to 1 (only return the 1st group)
  4. Check the 'Global' checkbox since we want to continue after the 1st match (get all node names)
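This example can also be approximated in Python. Again this is a sketch under assumptions rather than the plugin's actual behaviour: `xml.etree.ElementTree` and `re` stand in for the plugin's engines, the sub XML is serialized to a string before the RegEx is applied, and empty captures (which this expression produces for '<Nodes' and for closing tags) are assumed to be discarded.

```python
import re
import xml.etree.ElementTree as ET

message_body = """<Envelope>
<Any>
<Nodes>
<node1 id="1">
<node2/>
</node1>
</Nodes>
</Any>
</Envelope>"""

root = ET.fromstring(message_body)

# 1st expression: 'Envelope/Any/Nodes', written relative to the root (Envelope).
nodes = root.find("Any/Nodes")

# Serialize the sub XML to a string so a regular expression can be applied.
nodes_xml = ET.tostring(nodes, encoding="unicode")

# 2nd expression, group 1, all matches: keep unique, non-empty values
# in first-seen order.
names = []
for match in re.finditer(r"<([a-z0-9]{0,})", nodes_xml):
    name = match.group(1)
    if name and name not in names:
        names.append(name)

print(names)
```

The character class only contains lowercase letters and digits, so the capitalized 'Nodes' element itself yields an empty capture and does not appear in the result.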

Valid XPath examples

You can find more examples of allowed XPath expressions here.

Next Step

How to Add or manage Search Fields
How to Add or manage Log Views

Expression Type Plugins are used in Search Fields
What are Search Fields?
What are Search Field Expressions?
What are Message Types?
What are Log Views?

Flat File Fixed Width
Message Context Key
RegEx
RegEx On Message Context
XPath
XPath on wrapped XPath
JSON Path