XPath Selectors Cheatsheet

An XPath selectors cheatsheet using the concepts laid out in Thinking in Dimensions: A Unified Approach to Filter Grammars.

Logic operators

XPath offers some of its logic operators both at the expression level and inside predicates (sub-filters).


Equal:
//a[@id = "xyz"]

Not equal:
//a[@id != "xyz"]

Greater than:
//a[@price > 25]

AND
Inside expression:
//div[@class="head"][@id="top"]
Inside predicate:
//div[@id="head" and position()=2]

OR
Inside expression:
//a | //div
Inside predicate:
//div[(x and y) or not(z)]

NOT
Inside predicate:
//div[(x and y) or not(z)]
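
As a rough illustration of the equality and chained-predicate (AND) forms above, here is a minimal sketch using Python's standard xml.etree.ElementTree. ElementTree implements only a subset of XPath (paths are relative, hence the leading dot, and operators such as `|`, `and`, `or` and `not()` are not available), and the sample document is invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring(
    "<body>"
    '<div class="head" id="top">first</div>'
    '<div class="head" id="other">second</div>'
    '<a id="xyz" href="/home">home</a>'
    '<a id="abc" href="/about">about</a>'
    "</body>"
)

# Equal: a comparison inside a predicate.
equal = doc.findall(".//a[@id='xyz']")

# AND: chaining two predicates keeps only nodes that satisfy both.
both = doc.findall(".//div[@class='head'][@id='top']")

print([a.get("href") for a in equal])  # ['/home']
print([d.text for d in both])          # ['first']
```

For the forms ElementTree does not cover, a full XPath 1.0 engine would be needed.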

Filtering vs selection

XPath provides a more flexible set of selection options than CSS.

XPath expressions allow filtering nodes based on their ancestors or children without changing the selection target.

# Select descendant
//ul/li[1]

# Select parent based on descendant
//ul[li[position()=1]]
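
The difference between selecting a descendant and filtering a parent on its descendants can be tried out with a small sketch. It assumes Python's standard xml.etree.ElementTree, which supports only a subset of XPath, so `.//ul[li]` stands in for the nested predicate above. The sample document is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring(
    "<body>"
    '<ul id="menu"><li>Home</li><li>About</li></ul>'
    '<ul id="empty"></ul>'
    "</body>"
)

# Selection: the path ends at li, so li elements come back.
first_items = doc.findall(".//ul/li[1]")

# Filtering: the predicate [li] only tests for an li child;
# the path still ends at ul, so ul elements come back.
lists_with_items = doc.findall(".//ul[li]")

print([li.text for li in first_items])            # ['Home']
print([ul.get("id") for ul in lists_with_items])  # ['menu']
```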

XPath also allows matching on or returning an element's text content or attribute values (as opposed to the whole element):

# Filter on text content
//button[text()="Submit"]

# Match or return the text content
//button/text()
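
A minimal sketch of the same idea in Python's standard xml.etree.ElementTree, under the assumption that its XPath subset is enough for the point being made: ElementTree has no `text()` or `@attr` result steps, so `[.='Submit']` plays the role of the text filter and the values are read from the matched elements. The sample form is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring(
    "<form>"
    '<button type="submit">Submit</button>'
    '<button type="reset">Clear</button>'
    '<a href="/help">Help</a>'
    "</form>"
)

# Filter on text content: stands in for //button[text()="Submit"].
submit = doc.findall(".//button[.='Submit']")

# Return text content: the counterpart of //button/text().
labels = [b.text for b in doc.findall(".//button")]

# Return an attribute value: the counterpart of //a/@href.
hrefs = [a.get("href") for a in doc.findall(".//a")]

print([b.get("type") for b in submit])  # ['submit']
print(labels)                           # ['Submit', 'Clear']
print(hrefs)                            # ['/help']
```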

Semantic dimensions

XPath can be used for XML or HTML documents.


Element hierarchy and content (Permutation)
Element position (Permutation)
Element type (Nominal)
Attribute (Permutation)

Note: class and id attributes are fields separated by spaces.
Non anchored match
Any / Presence
Anywhere:

//hr

Self:
.

Short for: self::node()

*

CSS Equivalent:
*
Attribute:

//a[@rel]

CSS Equivalent:
a[rel]
Exact match
Match Content:

//button[text()="Submit"]

Return Content:
//span/text()

//h1

CSS Equivalent:
h1
//input[@type="submit"]

CSS Equivalent:
input[type="submit"]

Return attribute value:
//a/@href

Match on language:
lang(str)

Matching on id and class

No fields operator, so need workaround:

//div[contains(concat(' ',normalize-space(@class),' '),' foobar ')]

CSS Equivalent:
.foobar
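
To see why the workaround pads the class list with spaces, the logic of the expression can be mirrored step by step in plain Python. This is only a sketch of the reasoning, not an XPath engine, and `has_class` is a hypothetical helper name.

```python
def has_class(class_attr: str, name: str) -> bool:
    """Mirror contains(concat(' ', normalize-space(@class), ' '), ' name ')."""
    normalized = " ".join(class_attr.split())  # normalize-space(@class)
    padded = f" {normalized} "                 # concat(' ', ..., ' ')
    return f" {name} " in padded               # contains(..., ' name ')


print(has_class("nav  foobar\n", "foobar"))  # True
print(has_class("foobar-wide", "foobar"))    # False
print("foobar" in "foobar-wide")             # True: the naive substring test
```

The last line shows the false positive that a bare contains(@class, 'foobar') would produce.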
Top / Left anchored match
Absolute match
Root:
/
/html

CSS Equivalent:
:root

Match content:

//*[starts-with(text(), 'h')]

First child:

//ul/li[1]

//ul/li[position()=1]

CSS Equivalent:
ul > li:first-child

Element type starts with:

//*[starts-with(name(), 'h')]

Attribute starts with:

//a[starts-with(@href, '/')]

CSS Equivalent:
a[href^='/']
Immediate Relative match
Match on child type:
//ul/li

Short for: //ul/child::li

CSS Equivalent:
ul > li

Content:

//button[contains(text(), "Submit")]

Adjacent sibling (immediately followed by):

//h1/following-sibling::*[1][self::ul]

CSS Equivalent:
h1 + ul

Element type contains:

//*[contains(name(), 'h')]

Attribute contains (thinking of each character in the substring as being the next immediate match of the one before):

//a[contains(@href, '://')]

//font[contains(@class, "head")]

CSS Equivalent:
a[href*='://']
(a[href~='://'] matches a whole space-separated word, not a substring)
Any relative match
Descendant or self:

//div//p

// is short for:

/descendant-or-self::node()/

Any descendant:

//div/descendant::p

CSS Equivalent:

div p
Any following sibling:

//h1/following-sibling::ul

CSS Equivalent:
h1 ~ ul

Combines position and hierarchy:

Everything in the document after the closing tag of the current node:

following::*

Matches tag types with '1' after 'h':

//*[contains(substring-after(name(), "h"), "1")]

Matches elements whose id has 'bar' after 'foo':

//*[contains(substring-after(@id, "foo"), "bar")]
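
The two sibling patterns in this group (`h1 + ul` and `h1 ~ ul`) differ only in how far they look along the following-sibling axis. Python's standard xml.etree.ElementTree has no sibling axes, so the sketch below walks the axis by hand; `following_siblings` is a hypothetical helper and the sample document is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring(
    "<body>"
    "<section><h1>One</h1><ul id='a'/><p/><ul id='b'/></section>"
    "<section><h1>Two</h1><p/><ul id='c'/></section>"
    "</body>"
)


def following_siblings(parent, node):
    """Children of parent that come after node (the following-sibling axis)."""
    children = list(parent)
    return children[children.index(node) + 1:]


adjacent, any_following = [], []
for section in doc.findall("section"):
    h1 = section.find("h1")
    siblings = following_siblings(section, h1)
    # h1 + ul: the very next sibling, and only if it is a ul.
    if siblings and siblings[0].tag == "ul":
        adjacent.append(siblings[0].get("id"))
    # h1 ~ ul: every later ul sibling.
    any_following.extend(s.get("id") for s in siblings if s.tag == "ul")

print(adjacent)       # ['a']
print(any_following)  # ['a', 'b', 'c']
```

In the second section the first ul after the h1 is not adjacent to it, so it shows up only in the "any following sibling" result.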
Bottom / Right anchored match
Absolute match
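Content:

//*[ends-with(text(), 'h')]

Last child:

//ul/li[last()]

//ul/li[position() = last()]

CSS Equivalent:
ul > li:last-child

Element type ends with:

//*[ends-with(name(), 'h')]

Attribute ends with:

//a[ends-with(@href, '.pdf')]

CSS Equivalent:
a[href$='.pdf']

Note: ends-with() requires XPath 2.0. An XPath 1.0 equivalent for the attribute case:

//a[substring(@href, string-length(@href) - 3) = '.pdf']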
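
The positional anchors `[1]` and `[last()]` are both part of the XPath subset in Python's standard xml.etree.ElementTree, so they can be compared directly. This is a minimal sketch on an invented document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring(
    "<body>"
    "<ul><li>a1</li><li>a2</li><li>a3</li></ul>"
    "<ul><li>b1</li></ul>"
    "</body>"
)

# Top anchored: the first li of each ul.
first = [li.text for li in doc.findall(".//ul/li[1]")]

# Bottom anchored: the last li of each ul.
last = [li.text for li in doc.findall(".//ul/li[last()]")]

print(first)  # ['a1', 'b1']
print(last)   # ['a3', 'b1']
```

Position is counted per parent, which is why the single-item list contributes the same li to both results.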
Immediate Relative match
Parent:

..

Short for: parent::node()

Has children:

//ul[*]
//ul[li]

Via the parent step (same result, since XPath node-sets hold no duplicates):
//ul/li/..

jQuery Equivalent:
$('ul > li').parent()

Has certain number of children:

//table[count(tr)=1]

Adjacent sibling (immediately preceding):

//h1/preceding-sibling::*[1][self::ul]
Same as Left/Top anchor match, reversed
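
The two ways of reaching a parent from its children, a predicate on the parent or a `..` step from the child, can be compared with Python's standard xml.etree.ElementTree, which supports both forms. In this sketch (invented document) a parent reached from several children is returned once, as it would be in an XPath node-set.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring(
    "<body>"
    '<ul id="full"><li>one</li><li>two</li></ul>'
    '<ul id="empty"></ul>'
    "</body>"
)

# Predicate form: keep ul elements that have an li child.
by_predicate = [ul.get("id") for ul in doc.findall(".//ul[li]")]

# Parent step: walk down to each li, then back up with "..".
by_parent_step = [ul.get("id") for ul in doc.findall(".//ul/li/..")]

print(by_predicate)    # ['full']
print(by_parent_step)  # ['full']
```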
Any relative match
Closest ancestor (or self) matching selector:

//li/ancestor-or-self::section[1]

jQuery Equivalent:
$('li').closest('section')

All preceding siblings:

//h1/preceding-sibling::ul

Combines position and hierarchy:

Everything in the document before the opening tag of the current node:

preceding::*

Matches tag types with 'h' before '1':

//*[contains(substring-before(name(), "1"), "h")]

Matches elements whose id has 'foo' before 'bar':

//*[contains(substring-before(@id, "bar"), "foo")]
Projections
Size or count
Element with a single tr child:

//table[count(tr) = 1]

Element with more than one tr child:

//table[count(tr) > 1]

Match all tags with a type that is 2 characters long:

//*[string-length(name()) = 2]

Match all tags with an id that is 2 characters long:

//*[string-length(@id) = 2]
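
Projections turn a node into a number that is then compared. Python's standard xml.etree.ElementTree has no count() or string-length(), so the sketch below applies the same projections in plain Python on an invented document; it mirrors the expressions above rather than evaluating them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring(
    "<body>"
    '<table id="t1"><tr/></table>'
    '<table id="t2"><tr/><tr/></table>'
    "<h1>Title</h1><hr/>"
    "</body>"
)

# Mirrors //table[count(tr) = 1]: tables with exactly one tr child.
single_row = [t.get("id") for t in doc.iter("table") if len(t.findall("tr")) == 1]

# Mirrors //table[count(tr) > 1]: tables with more than one tr child.
multi_row = [t.get("id") for t in doc.iter("table") if len(t.findall("tr")) > 1]

# Mirrors //*[string-length(name()) = 2]: tag names that are 2 characters long.
two_char_tags = sorted({e.tag for e in doc.iter() if len(e.tag) == 2})

print(single_row)     # ['t1']
print(multi_row)      # ['t2']
print(two_char_tags)  # ['h1', 'hr', 'tr']
```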

Other Projections

Some other projection functions that may not be covered above:

Accessors
lang(str)
namespace-uri()
concat(x,y)
String projections
substring(str, start, len)
substring-before("01/02", "/")
substring-after("01/02", "/")
translate()
normalize-space()
string-length()
Type conversion
string()
number()
boolean()
