Talk:XPath 1.0

Expanded syntax

In the full, unabbreviated syntax, the two examples above would be written

...

* child::A/descendant-or-self::node()/child::B[1] respectively.

Isn't that supposed to be child::A/descendant-or-self::B/child::node()[1]?

>> No, it's correct as written! Mhkay 20:27, 12 May 2006 (UTC)

XPATH 2.0

Either the XPATH 2.0 section moves to the XPATH 2.0 article, or the opposite.

At the moment this section is way better than the article.

Another possibility is to say a sentence about XPATH 2.0 and point to the XPATH 2.0 article (which would then hold the XPATH 2.0 section from this page).

The current situation is bad. --Nkour 14:05, 15 April 2006 (UTC)

It's still bad Mhkay (talk) 22:46, 24 November 2007 (UTC)

I think it's pretty clear that these articles need to be merged, we just need someone to do it. --sweecoo (talk) 18:41, 16 February 2009 (UTC)

Predicates

Surely //a[@href='help.php'][../../div/@class='header']/@target means an <a> with href of help.php - not with a parent that's a div (as suggested in the article), but rather with a grandparent that contains a div (with class 'header') - even if that div is not an ancestor of the <a> node? —The preceding unsigned comment was added by Tim.spears (talk • contribs) .

Well spotted, Tim! That example has always been a pain. It was Hwiechers who first took me to task over it back in May, and by the time I'd fixed his point, while still trying to keep its original purpose, I opened up this problem without noticing.

So, rather than rushing at it again, today I took some time and, I hope, came up with something quite realistic (and tested!) that makes the point I was after. The point is something I've found quite useful in designing complex XPaths: sometimes from a predicate in the middle of an XPath you can, as I think of it, stick up a periscope (maybe by using ../../../) and peer into some far place in the document to check some detail over there. When I discovered I could do this at any time, it made some hard XPaths much easier for me. (I do this in XSLT, usually.)

I hope I'm now making that point OK without labouring the issue. The reason I wasn't happy with Hwiechers' or AutumnSnow's otherwise fine suggestions was that, while they technically fixed the old example XPath, they shifted the emphasis away from that idea. Sorry, guys, if it seemed I was just being obdurate. --Nigelj 19:18, 29 July 2006 (UTC)
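For anyone who wants to see concretely what the ../../div/@class='header' test looks at, here is a minimal sketch in plain Python. It uses xml.etree.ElementTree only for parsing and emulates the predicate by hand, since ElementTree does not evaluate paths inside predicates. The sample document and the helper name periscope_targets are invented for illustration; they are not taken from the article.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the header div is a sibling of <p>, not an ancestor of <a>.
DOC = """
<body>
  <div class="header">Site title</div>
  <p><a href="help.php" target="_blank">Help</a></p>
  <section><p><a href="help.php" target="_self">Help again</a></p></section>
</body>
"""

def periscope_targets(root):
    """Emulate //a[@href='help.php'][../../div/@class='header']/@target by hand."""
    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}
    targets = []
    for a in root.iter("a"):
        if a.get("href") != "help.php":
            continue
        grandparent = parent.get(parent.get(a))      # ../..
        if grandparent is None:
            continue
        # ../../div/@class='header': true if ANY div child of the grandparent matches.
        if any(d.get("class") == "header" for d in grandparent.findall("div")):
            if a.get("target") is not None:
                targets.append(a.get("target"))
    return targets

targets = periscope_targets(ET.fromstring(DOC))
print(targets)
```

In this sample only the first link qualifies: its grandparent is body, which has a div child with class 'header', even though that div is not an ancestor of the link. The second link's grandparent is section, which has no div children.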

"Predicate order is significant, however. Each predicate 'filters' a location step's selected node-set in turn. //a[1][/html/@lang='en'][@href='help.php']/@target will find a match only if the first a element in a @lang='en' document also meets @href='help.php'"... Well, this is not true. /html/@lang='en' evaluates to true if an attribute @lang with value "en" exists within /html. So there is no connection to 'a'. --Ntropy
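A rough sketch of the "each predicate filters in turn" idea, in plain Python, might help here. It assumes a made-up document in which all the a elements share one parent, so a single list can stand in for the node-set of the location step; the helper names first, href_is_help and lang_is_en are hypothetical. The absolute predicate gives the same answer for every candidate, which is Ntropy's point, while the positional predicate depends on what the earlier predicates left behind.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the first <a> is not the help link.
DOC = """
<html lang="en">
  <body>
    <a href="index.php" target="_top">Home</a>
    <a href="help.php" target="_blank">Help</a>
  </body>
</html>
"""

root = ET.fromstring(DOC)
candidates = root.findall("body/a")  # node-set of the step, in document order

def first(nodes):
    """[1]: keep only the node at position 1 of the current list."""
    return nodes[:1]

def href_is_help(nodes):
    """[@href='help.php']: relative test, evaluated against each candidate."""
    return [n for n in nodes if n.get("href") == "help.php"]

def lang_is_en(nodes):
    """[/html/@lang='en']: absolute test, same answer whatever the candidate is."""
    ok = root.tag == "html" and root.get("lang") == "en"
    return list(nodes) if ok else []

# a[1][/html/@lang='en'][@href='help.php']
position_first = href_is_help(lang_is_en(first(candidates)))
# a[/html/@lang='en'][@href='help.php'][1]
position_last = first(href_is_help(lang_is_en(candidates)))

print([a.get("target") for a in position_first])
print([a.get("target") for a in position_last])
```

With this sample the first ordering selects nothing, because the first a is the Home link, while the second ordering selects the Help link. Moving the /html/@lang='en' predicate to another position would not change either result.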

A real-world example would be useful

For the uninitiated ...

Reply:

I think that I've tidied things up under Examples so the explanation of what the code does shows up better. I'm still not entirely happy with it, though. I did try putting the plain English in italics or bold but that didn't look right either. But at least the lines of code being explained aren't boxed off. Does this help?

Meanwhile, /wikimedia/projects/project/@name is a piece of code used in the explanation but I'm not finding it in the full example. Could someone help with this?

--Kovar (talk) 21:21, 12 December 2008 (UTC)

Textual representation of example xpath is wrong

The text is ambiguous. It says:

A//B/*[1]

selects the first element ('[1]'), whatever its name ('*'), that is a child ('/') of a B element that itself is a child or other, deeper descendant ('//') of an A element that is a child of the current context node

The expression "first element" doesn't specify "first among which?". It makes one think that it first lists *all* the children of *any* such B, and *then* picks the first, returning only one element. When in fact what happens is that it returns a set consisting of the first child element of each such B. Helder Ribeiro 20:48, 5 July 2007 (UTC)

Absolutely right, Helder! Well spotted! I've added a new sentence into the article that, I hope, makes that point. Feel free to clarify the text if you can think of a better wording. You can't get a better encyclopedia than one with this many proof readers and copy editors!!! --Nigelj 21:59, 5 July 2007 (UTC)
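To make Helder's reading concrete, here is a small sketch that emulates A//B/*[1] step by step in plain Python. ElementTree is used only to parse; the positional step is done by hand because ElementTree's own path subset documents positional predicates only after a tag name. The document and the function name first_child_of_each_b are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with three B elements at different depths below A.
DOC = """
<root>
  <A>
    <B><x id="1"/><y id="2"/></B>
    <C><B><z id="3"/><x id="4"/></B></C>
    <B/>
  </A>
</root>
"""

def first_child_of_each_b(context):
    """Emulate A//B/*[1] relative to the given context element."""
    selected = []
    for a in context.findall("A"):            # child::A
        for b in a.iter("B"):                 # every B at any depth below A
            children = list(b)                # child::*
            if children:
                selected.append(children[0])  # [1] is applied separately for each B
    return selected

ids = [e.get("id") for e in first_child_of_each_b(ET.fromstring(DOC))]
print(ids)
```

The result holds one element per non-empty B (ids 1 and 3 here), not a single "first element" overall, and the empty B contributes nothing.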

Please check this xpath expression

The expanded form of A//B/*[1] is given as child::A/descendant-or-self::node()/child::B/child::*[position()=1] . Could someone sanity-check me? Is there a reason why this does not read child::A/descendant-or-self::B/child::*[position()=1] ? The response to the earlier post about this states that the original given example is correct. Which it is. But is the more concise alternative correct?

The abbreviated syntax A//B/*[1] is by definition short for child::A/descendant-or-self::node()/child::B/child::*[position()=1]. Now, there are many circumstances in which A//B actually gives the same result as A/descendant::B, or for that matter A/descendant-or-self::B, notably when there is no positional predicate and when A and B are disjoint tests. But we're not trying to find alternative ways of replacing this XPath expression by different expressions that give the same answer; we're saying how the spec defines the semantics of the abbreviated syntax. Mhkay 22:59, 15 July 2007 (UTC)

Why don't we say something like this:

By definition, the abbreviated syntax is equivalent to

child::A/descendant-or-self::node()/child::B/child::*[position()=1]

but it could also be written as

child::A/descendant::B/child::*[position()=1]

As it is now, it looks like we're purposely trying to make the expanded syntax look bloated and ugly. Herorev 02:46, 24 August 2007 (UTC)

But if you want a compact representation, then the original one is fine. What's the point of offering an intermediate representation? Mhkay (talk) 22:49, 24 November 2007 (UTC)
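The caveat about positional predicates in this thread can be checked with a small sketch. It assumes a made-up document with a single A element; A//B and A//B[1] are evaluated with ElementTree's path subset, while the descendant axis, which ElementTree does not support, is emulated with iter(). The helper name ids is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one B directly under A, two more under C.
DOC = """
<root>
  <A>
    <B id="b1"/>
    <C><B id="b2"/><B id="b3"/></C>
  </A>
</root>
"""

root = ET.fromstring(DOC)

def ids(nodes):
    return [n.get("id") for n in nodes]

a = root.find("A")
# A/descendant::B, emulated: all B elements below A in document order.
descendant_b = [e for e in a.iter("B") if e is not a]

# Without a positional predicate the two forms select the same nodes here.
same_without_predicate = ids(root.findall("A//B")) == ids(descendant_b)

# With [1] they differ.
abbreviated_first = ids(root.findall("A//B[1]"))  # first B child of each parent
descendant_first = ids(descendant_b[:1])          # A/descendant::B[1]

print(same_without_predicate, abbreviated_first, descendant_first)
```

In this sample A//B[1] picks b1 and b2 (each is the first B child of its own parent), whereas the descendant form with [1] picks only b1, which is why the expansion is defined in terms of descendant-or-self::node()/child::B.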

Abbreviated syntax

W3C states that "// is short for /descendant-or-self::node()/". First, in the article, // is placed under descendant, not descendant-or-self. Second, to my understanding, // is not an abbreviated axis specifier (that would yield ///node), but merely syntactic sugar on the expression level.

Similarly, the axis specifiers self and parent don't correspond to the shortcuts . and .., respectively, as . and .. are abbreviated steps.

I call for a clarification in the article. Knut Vidar Siem 08:18, 16 October 2007 (UTC)

Agreed, changing // to descendant-or-self and some clarification would be great. I'm changing it to descendant-or-self now and leaving the other changes to someone better at expressing themselves. Fredrikc 10:11, 2 November 2007 (UTC)

Incorrect example

The example is as follows: For example, h3[.='See also'] selects an element called h3 in the current context, whose text content is See also. The example is only correct if the matched nodes have no child elements. h3[text()='See also'] matches better, and if unwanted whitespace hasn't already been accounted for, h3[normalize-space(text())='See also'] is needed. I'll change it to h3[text()='See also'], but some text explaining whitespace handling might be appropriate. Fredrikc 11:09, 5 November 2007 (UTC)

No, this was a bad change. [.='See also'] is usually better practice than [text()='See also'] and corresponds better to the English description "whose text content is...". Examples where the results are different are:

and in both cases [.='See also'] gives the "better" answer.

I'm going to revert it.

Mhkay (talk) 22:55, 24 November 2007 (UTC)

<h3>See also<sup><a href="...">[1]</a></sup></h3> and the like give a worse result with [. = 'See also'], but that might be less common; using dot instead of text() is easier to read. Fredrikc 16:44, 1 December 2007 (UTC)
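The difference between the two tests can be sketched in plain Python. The three h3 samples below are assumptions chosen for illustration (the last one is the footnote case mentioned above); they are not necessarily the examples Mhkay had in mind. The helpers string_value and text_nodes are hypothetical names: the first emulates what . compares (the concatenated text of the element and all its descendants), the second what text() selects (the element's own text-node children).

```python
import xml.etree.ElementTree as ET

# Hypothetical samples.
SAMPLES = {
    "plain": "<h3>See also</h3>",
    "inline markup": "<h3>See <em>also</em></h3>",
    "footnote": "<h3>See also<sup><a href='...'>[1]</a></sup></h3>",
}

def string_value(elem):
    """String-value of an element: all descendant text, concatenated."""
    return "".join(elem.itertext())

def text_nodes(elem):
    """Text-node children only: leading text plus the text after each child."""
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p]

def matches_dot(elem, wanted):
    # h3[.='See also']
    return string_value(elem) == wanted

def matches_text(elem, wanted):
    # h3[text()='See also']: a node-set equals a string if any node in it does.
    return any(t == wanted for t in text_nodes(elem))

results = {}
for name, xml in SAMPLES.items():
    h3 = ET.fromstring(xml)
    results[name] = (matches_dot(h3, "See also"), matches_text(h3, "See also"))

for name, (dot, text) in results.items():
    print(f"{name}: .={dot} text()={text}")
```

In these samples both tests accept the plain heading, only the dot form accepts the heading with inline markup, and only the text() form accepts the heading with a footnote marker. Neither form is right for every document, which matches the disagreement in this thread.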

XPath vs. URL

I'll admit, I get terminology wrong a lot, particularly since I know few people I can use the terminology with without them giving me a funny (and dirty) look. I have seen references to XPaths also being called URLs. In my head, which is almost always incorrect in some way, I think of URLs as having protocols like http:, ftp:, afp:, etc. Am I stupid for thinking an XPath is not related to a URL? If I am, should we mention that many refer to it as a URL? Dprust (talk) 19:44, 12 December 2007 (UTC)

There is no relationship between path expressions and URLs, other than the psychological one that they both have hierarchic components separated by slashes. Mhkay (talk) 22:14, 13 February 2008 (UTC)

Actually, the similarity is with file-system paths, not URLs. Maybe an analogy between directories and node elements (as containers of objects) could be outlined. --190.20.241.225 (talk) 05:18, 20 June 2008 (UTC)

Content node

The text twice mentions 'content node'. Shouldn't that read 'context node' instead, according to [1]? Muffat (talk) 14:43, 11 February 2008 (UTC)

Yes. Fixed. Mhkay (talk) 22:16, 13 February 2008 (UTC)

Nesting node sets in parentheses

I don't know if it is part of the standard, but I'm using PHP's XPath implementation and you can do things like this:
(//ul)[1] => gives the first UL element in the document, wherever it is.
On the other hand:
//ul[1] => gives all UL elements that are the first UL child of their parent.

If it is part of the standard then it should be included in the article.
--190.20.241.225 (talk) 05:27, 20 June 2008 (UTC)

This behaviour is correct according to the standard.

However, this is an encyclopedia article, not a tutorial or programming guide. It's a place to get an overview of what the language does, where it comes from, and what it's useful for. It shouldn't try to cover every detail of the specification.

Mhkay (talk) 18:56, 28 June 2008 (UTC)

It's not a tutorial, of course, and I quote you: "It's a place to get an overview of what the language does". The possibility of nesting expressions in parentheses is something that the language does. Just a little example won't take more than a couple of lines.
--190.20.234.42 (talk) 17:54, 29 June 2008 (UTC)

Hype and Disinformation

There isn't a shred of evidence offered that developers have rapidly adopted XPath. For instance ElementTree has minimal support for the abbreviated syntax. It is just too much work to implement anything more than the basics, and when you get done, you have a tool that few people will have the patience to use anyway. So this standard looks like it will go the way of SGML. If you disagree, post a link. A statement like "developers are rapidly adopting technology X" needs to be backed up. 169.229.200.176 (talk) 22:49, 26 June 2008 (UTC)

Firstly, I don't see any claims in the article about the level of adoption. That's sensible, because it's very hard to get such data. And it's not necessary, unless you are really claiming that XPath isn't significant enough to justify an article at all. What kind of evidence do you want? Is it enough that Saxon gets 500 downloads a day (and has done so for over 5 years) and is often in the top 100 Sourceforge projects? Is it enough to sell 50,000 books? (Both those figures relate to XSLT rather than specifically to XPath, but every XSLT user is an XPath user).

"For instance ElementTree has minimal support for the abbreviated syntax." - I don't follow your logic. There are dozens of good implementations of XPath, why does one poor implementation matter?

"It is just too much work to implement anything more than the basics" - I think you're talking here about the challenge of writing an implementation of XPath - sure, that's a challenging thing to do well. But the article is talking about users of XPath implementations, who exist in their millions.

"So this standard looks like it will go the way of SGML. If you disagree, post a link." Well, I could post links to dozens of people who have the opposite opinion, but I think Wikipedia is supposed to cover facts, not speculation about the future.

But perhaps you're just a troll and I shouldn't feed you.

Mhkay (talk) 18:53, 28 June 2008 (UTC)

//a[1]

The section on Predicates repeatedly uses the expression //a[1] (perhaps with intervening predicates) and explains it as meaning "the first <a> element in the document". This is incorrect. //a[1] selects every <a> element that is the first <a> child of its parent. The correct way to select the first <a> element in the document is (//a)[1]. I corrected this and someone reverted it - God save us from self-styled experts!

Mhkay (talk) 21:39, 26 August 2008 (UTC)

You should go ahead and revert the revert, because you are correct. The person that disagrees with you should just try it out... it's clear that you're right :) Jrockway (talk) 23:54, 26 August 2008 (UTC)

//a[first()] would work right? --69.125.25.190 (talk) 05:02, 3 November 2008 (UTC)

No. There is no first() function. Mhkay (talk) 11:38, 3 November 2008 (UTC)

How about //a[position()=1] ? —Preceding unsigned comment added by 69.125.25.190 (talk) 16:29, 6 November 2008 (UTC)

Look, if you aren't an expert in XPath, why are you writing here? //a[position()=1] means /descendant-or-self::node()/child::a[position()=1], which selects every a element that is the first a child of its parent. Mhkay (talk) 23:59, 6 November 2008 (UTC)
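Anyone who wants to "just try it out" can do so with a few lines of Python. This is a sketch on a made-up document: //a[1] is evaluated with ElementTree's path subset (written .//a[1] relative to the root element), while (//a)[1], which that subset does not support, is emulated by collecting all a elements in document order and taking the first.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with links under two different parents.
DOC = """
<html>
  <body>
    <p><a id="a1"/><a id="a2"/></p>
    <p><a id="a3"/></p>
  </body>
</html>
"""

root = ET.fromstring(DOC)

# //a[1]: every <a> that is the first <a> child of its parent.
first_in_each_parent = [a.get("id") for a in root.findall(".//a[1]")]

# (//a)[1]: gather all <a> elements first, then apply the position to that whole list.
all_links = root.findall(".//a")
first_in_document = [a.get("id") for a in all_links[:1]]

print(first_in_each_parent)
print(first_in_document)
```

On this sample the first expression selects a1 and a3 (one per paragraph) and only the parenthesized form selects a1 alone, which is the behaviour described at the top of this section.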

MERGE

I didn't even realize there were XPath 1.0 and 2.0 articles. It would be better if all three articles were merged, with the version histories and whatnot still intact. —Preceding unsigned comment added by 69.125.25.190 (talk) 15:50, 14 September 2008 (UTC)

Until recently the "XPath" article was largely about XPath 1.0. It was difficult to see how to give XPath 2.0 proper coverage within that article, given that there are so many differences in the fundamental concepts of the two versions; the simplest approach seemed to be to create a new XPath 2.0 article, rename the existing article as XPath 1.0, and create a new XPath article to point to the two. If someone wants to merge them back I won't have any fundamental objections, but I think the merged article should essentially be the concatenation of the two. I don't think it's possible to write a single article about XPath that simply mentions the version differences "in passing", because the starting points are so different.

Mhkay (talk) 21:29, 15 September 2008 (UTC)

Here's a vote against the merge. As a newcomer I found the XPath article to be succinct and clear, and I'm happy to follow the links to the 1.0 and 2.0 articles for more info on those, with clear delineation between them. BenWilliamson (talk) 03:44, 25 November 2008 (UTC)

I've looked over these discussions, carefully gone through the articles, applied the 'idiot test' -- e.g., having someone who doesn't know much and wants to know more start with Google, the first link there being to the main XPath article -- etc. I think that these suggestions cover the concerns and what a user would need:

   Keep the main article succinct and clear; merging the articles would result in an overwhelming amount of information.

   Make it very clear that the links on the main article to the versions take you to deeper information.  A casual user wouldn't think to look at version information.  So perhaps something such as See also for further details instead of See also.

   Go through all three articles at once, keeping each but also doing the edits such that they form a cohesive whole.

A couple of examples: a brief glance shows that those pages don't reference each other clearly nor do they reference the basic XPath article, much less the very important fact that it's an overview. Another is that the XPath 1.0 article says that that version is still more widely used than 2.0; is that still true? If so, could it be changed to 'as of [date] XPath 1.0 is still more widely used than 2.0'? Should there be something in the Version 2.0 about this? The See also for XPath 1 is a longish list of articles, mostly internal, but doesn't reference XPath 2 or the main article. On the other hand it has an external link to a tutorial albeit in German; there must be one in English out there somewhere. Put that one there and a tutorial for XPath 2.0 in that article? XPath 2.0 has a number of clear internal links embedded in the text and no See also although it does have internal links. The main XPath article has the most basic external links; the other two use them as well.

There's probably a lot more that I'm missing: I haven't done anything serious with code in over a decade and am now mainly a researcher and editor.

Kovar (talk) 20:21, 12 December 2008 (UTC)

I vote to merge. There are hundreds of technical articles which discuss differences among versions within one article, and they all look great. Don't think that XPath should be different. --Pinnecco (talk) 13:39, 4 February 2009 (UTC)

XPath error I introduced?

What was the issue with the correction I made? Is the forward slash actually required? I never use the forward slash after an open bracket in XPath; it seems to break the path, at least when I use the JDOM and Dom4J libraries. - Poobslag (talk) 20:37, 6 November 2008 (UTC)

Hi Poobslag. I moved your question from my talk page to here as you seem to be discussing this article rather than me personally. The reason I reverted your edit was because in the two XPaths you altered the point was that, in the context of an 'a' element, one predicate was going out of context and looking at a top-level 'html' element ("[/html..."), whereas another was staying in context and referring to a child 'href' attribute ("[@href..."). As these 'look-arounds' were contained in predicates, they "affect[ed] neither the context of other predicates nor that of the location step itself", as the article says. A leading slash is not a matter of habit, but has a particular meaning. --Nigelj (talk) 22:24, 6 November 2008 (UTC)

Alright thanks, I didn't catch that. Rereading the example within the context you've provided here, everything makes sense. - Poobslag (talk) 22:54, 10 November 2008 (UTC)

I don't know the history of this. The current expression that we have in the Predicates section a[/html/@lang='en'][@href='help.php'][1]/@target is correct, but far more complex in my view than is appropriate for a brief XPath overview article: it's not a good choice of example for explaining predicates, because it distracts attention from the essentials. Mhkay (talk) 23:56, 6 November 2008 (UTC)

I agree, these examples are more advanced than I would expect from an XPath overview. - Poobslag (talk) 22:54, 10 November 2008 (UTC)
