Navigating a document using XPath queries

The XPathEvaluator interface defines methods for navigating from one node to another in the document tree by evaluating XPath expressions. Every Document instance also implements the XPathEvaluator interface, so you can call its Evaluate() method to query the HTML page. For example:

    // Get the first div in document order.
    var xpathResult = Document.Evaluate("//div", Document, null, XPathResultType.FirstOrderedNode, null);
    // The result may be null when the page contains no div, so guard the access.
    var firstDiv = xpathResult.SingleNodeValue as HTMLDivElement;
    Console.WriteLine($"First div ID is {firstDiv?.Id}");

This example uses the SingleNodeValue property of the IXPathResult interface to read the single node returned by the query.
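
The snippet above relies on a `Document` member whose setup is not shown. As a rough sketch, a complete program might look like the following. It assumes the types come from Aspose.HTML for .NET (the text here does not name the library) and that an `HTMLDocument(string content, string baseUri)` constructor is available; the class and method names `FirstDivExample` and `FirstDivId` are made up for illustration.

```csharp
using System;
using Aspose.Html;            // Assumption: HTMLDocument and HTMLDivElement live here.
using Aspose.Html.Dom.XPath;  // Assumption: XPathResultType and IXPathResult live here.

public static class FirstDivExample
{
    // Hypothetical helper: returns the id of the first <div> in the markup,
    // or null when the markup contains no <div>.
    public static string FirstDivId(string html)
    {
        // Assumed constructor: HTMLDocument(string content, string baseUri).
        using (var document = new HTMLDocument(html, "."))
        {
            var xpathResult = document.Evaluate("//div", document, null,
                XPathResultType.FirstOrderedNode, null);

            // "as" yields null instead of throwing when there is no matching node.
            var firstDiv = xpathResult.SingleNodeValue as HTMLDivElement;
            return firstDiv?.Id;
        }
    }

    public static void Main()
    {
        Console.WriteLine(FirstDivId("<div id='header'></div><div id='body'></div>"));
        Console.WriteLine(FirstDivId("<p>no divs here</p>") ?? "(no div found)");
    }
}
```

Returning null for "no match" keeps the null check in one place, so callers can decide how to report a missing element.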

There are two ways to return multiple nodes: as an iterator or as a snapshot. Iterator results remain tied to the document, so changes made to the document are reflected in the result set. Snapshot results, on the other hand, capture the matching nodes at the moment of evaluation and are not affected by later document modifications. Both result types require you to iterate over the results. For iterator results, call the IterateNext() method, which returns the next node, or null when no nodes remain (this works for both ordered and unordered iterator results):


    // Get all divs, iterator style.
    var result = Document.Evaluate("//div", Document.DocumentElement, null,
        XPathResultType.OrderedNodeIterator, null);
    if (result != null)
    {
        // IterateNext() returns null once the iterator is exhausted.
        var node = (Element)result.IterateNext();
        while (node != null)
        {
            Console.WriteLine($"div ID is {node.Id}");
            node = (Element)result.IterateNext();
        }
    }

For snapshot results, use the SnapshotLength property to determine how many nodes were returned and the SnapshotItem() method to retrieve the node at a specific position. For example (this works for both ordered and unordered snapshot results):

            
    // Get all divs, snapshot style.
    var result = Document.Evaluate("//div", Document.DocumentElement, null,
        XPathResultType.OrderedNodeSnapshot, null);
    if (result != null)
    {
        // Read the length once, then index into the snapshot.
        for (int i = 0, len = result.SnapshotLength; i < len; i++)
        {
            var element = (Element)result.SnapshotItem(i);
            Console.WriteLine($"div ID is {element.Id}");
        }
    }
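
Because a snapshot is read by index, it is straightforward to copy it into an ordinary collection. The extension method below is a hypothetical helper, not part of the library; it uses only SnapshotLength and SnapshotItem() as shown above, and assumes the `Aspose.Html.Dom` and `Aspose.Html.Dom.XPath` namespaces for `Element` and `IXPathResult`.

```csharp
using System.Collections.Generic;
using Aspose.Html.Dom;        // Assumption: Element lives here.
using Aspose.Html.Dom.XPath;  // Assumption: IXPathResult lives here.

public static class XPathSnapshotExtensions
{
    // Hypothetical helper: copies the element nodes of a snapshot result into a list.
    // The caller must have requested a snapshot result type.
    public static List<Element> ToElementList(this IXPathResult snapshot)
    {
        var elements = new List<Element>();
        if (snapshot == null)
        {
            return elements;
        }

        for (int i = 0, len = snapshot.SnapshotLength; i < len; i++)
        {
            // Skip text, attribute, and other non-element nodes instead of
            // failing on an invalid cast.
            if (snapshot.SnapshotItem(i) is Element element)
            {
                elements.Add(element);
            }
        }
        return elements;
    }
}
```

With this in place, the loop above could be written as `foreach (var div in result.ToElementList()) Console.WriteLine(div.Id);`.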

In most cases, a snapshot result is preferable to an iterator result because its connection with the document has been severed; every call to IterateNext() re-executes the XPath query on the document and so is much slower. In short, iterator results have the same performance implications as HTMLCollection objects, which also query the document repeatedly. If you plan to use the same expression several times, you can use CreateExpression. It compiles the expression string into a more efficient internal form and preresolves all the namespace prefixes that occur inside the expression.


    private static readonly IXPathExpression XPathExpression = Document.CreateExpression("//p[contains(@style,'color: violet')]", null);
    ...
    // Get all violet paragraphs with the precompiled expression, snapshot style.
    var result = XPathExpression.Evaluate(Document.DocumentElement, XPathResultType.OrderedNodeSnapshot, null);
    if (result != null)
    {
        for (int i = 0, len = result.SnapshotLength; i < len; i++)
        {
            var element = (Element)result.SnapshotItem(i);
            Console.WriteLine($"Paragraph ID is {element.Id}");
        }
    }
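
A compiled expression is most useful when it is evaluated against several context nodes. The following is a minimal sketch under the same assumptions as the snippets above: Aspose.HTML for .NET types and an `HTMLDocument(string content, string baseUri)` constructor. `CompiledExpressionExample` and `CountParagraphsPerDiv` are illustrative names, and `./p` is just a sample relative expression.

```csharp
using Aspose.Html;            // Assumption: HTMLDocument lives here.
using Aspose.Html.Dom.XPath;  // Assumption: IXPathExpression and XPathResultType live here.

public static class CompiledExpressionExample
{
    // Hypothetical helper: for each <div>, counts its direct <p> children,
    // reusing one compiled expression for every context node.
    public static int[] CountParagraphsPerDiv(string html)
    {
        using (var document = new HTMLDocument(html, "."))
        {
            // Compile once. "./p" is relative, so each evaluation depends on
            // the context node passed to Evaluate().
            IXPathExpression childParagraphs = document.CreateExpression("./p", null);

            var divs = document.Evaluate("//div", document.DocumentElement, null,
                XPathResultType.OrderedNodeSnapshot, null);

            var counts = new int[divs.SnapshotLength];
            for (int i = 0; i < counts.Length; i++)
            {
                // Same compiled expression, different context node each time.
                var paragraphs = childParagraphs.Evaluate(divs.SnapshotItem(i),
                    XPathResultType.OrderedNodeSnapshot, null);
                counts[i] = paragraphs.SnapshotLength;
            }
            return counts;
        }
    }
}
```

The expression string is parsed only once here, while Evaluate() runs once per div; whether that yields a measurable speedup depends on the expression and the document size.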