Talk:Graph homomorphism/GA1
GA Review
Reviewer: David Eppstein (talk · contribs) 06:57, 3 August 2017 (UTC)
Reviewing. But there are multiple whole paragraphs and some even larger sections of the article lacking inline citations: most of the "Definitions" section, most of the "Connection to colorings" section, most of the "Examples" section, etc. Unless these can be fixed, this falls far short of the verifiability requirement in good article criterion #2 and is likely to be a quick fail. —David Eppstein (talk) 06:57, 3 August 2017 (UTC)
- Thanks, I'll add some citations later this week (for now: almost all of it is in Hell & Nešetřil's Graphs and Homomorphisms, mostly in the publicly available Chapter 1). Tokenzero (talk) 08:29, 3 August 2017 (UTC)
- I tried to add citations for everything. I don't have anything for the part-of-speech tagging example, but in my opinion it is a nice and simple example of how CSPs can express something like "G fits the model H", instead of the usual resource-allocation examples. Directed graphs are also more natural here. In the literature I could only find much more involved applications and nothing in secondary sources; the most similar (but still much more general, and too much of a primary source) is Padró, Lluís (1996), A Constraint Satisfaction Alternative for POS Tagging (PDF). Tokenzero (talk) 13:26, 6 August 2017 (UTC)
Second reading
Lead:
- This is significantly less technical than the rest of the article (a good thing) but I think it's mostly because the rest of the article is too technical. More effort should be made to make it readable by non-experts in graph theory. (Let's say, at the level of someone with an undergraduate mathematics degree who hasn't studied any graph theory.) Right now, the lead is at that level but the rest is not. In particular, there is far too much reliance on notation when words would be clearer, and the sentences are on average far too long and complex (GAN criterion 1a).
- Throughout, many technical terms are used with the assumption that the reader either knows what they mean or will follow a link to see what they mean. I think it would be helpful to provide a short gloss here for each such word. A good example: "arc (directed edge)" in definitions. A bad example: "injection" (in the same section).
- I tried to split some sentences into two, reduce notation, simplify some wording and add glosses as much as I could. Tokenzero (talk) 23:35, 20 August 2017 (UTC)
- The lead summarizes the first three sections of the article (definitions, coloring, and applications) but not the rest (structure, incomparability, and complexity) (GAN criteria 1b).
Definitions:
- Because it only maps vertices to vertices, the definition of homomorphism here seems to be unsuitable for multigraphs, and that is reinforced by the implicit assumption that edges are the same thing as sets of two vertices. Both should be stated more explicitly. This is especially important because "sets of two vertices" is not actually correct: later, we see examples where loops are allowed. So the graphs considered here are not simple graphs, but graphs with loops allowed but with multiple adjacencies disallowed.
- "H-colorable, we shall": comma splice. Also, Wikipedia's manual of style says to avoid first person.
- There is a little attempt to distinguish homomorphisms from other types of maps (the description of subgraphs, isomorphisms and covering maps as special types of homomorphisms) but it might be helpful to also discuss homeomorphisms (more than just in the hatnote) and minors at least to the point where someone can distinguish them from homomorphisms.
- "If the homomorphism f : G → H is an injection, then G is simply a subgraph of H.": is this a description of all subgraph relations, or does the if-then only go one way? Same for the covering maps in the next sentence.
- More examples would go a long way towards making this understandable. In particular, bipartite double cover provides useful and nontrivial examples of covering maps.
- "homomorphically equivalent": this paragraph has no footnote.
- Can you please provide an example of a core that is not a complete graph?
- "Any finite graph G is..." this is the first time that finiteness or non-finiteness has been mentioned in this article. Some later material seems to assume finiteness (talking about unique cores) without stating it. I think it would be good to be more explicit about this, either stating that graphs are allowed to be infinite unless explicitly stated as finite (and then fixing the later parts that implicitly assume finiteness) or vice versa.
Colorings:
- "Each k-coloring corresponds to a homomorphism from G to the complete graph": and vice versa?
- "two colors are different if and only if they are adjacent as vertices of Kk (since Kk has edges between all possible different vertices and no edge from a vertex to itself)": while not wrong mathematically, this is very confusingly worded. This whole paragraph has the feel of someone repeating over and over an obvious mathematical statement, with different but equivalent wording, because the statement is so obvious that they have no idea how to go about actually proving it.
- This whole section is seriously incomplete (GAN criterion 3) without any discussion of the Gallai–Hasse–Roy–Vitaver theorem, and especially of the formulation of this theorem that a directed graph G has a homomorphism to a k-vertex transitive tournament iff it has no homomorphism from a (k+1)-vertex path.
- "General homomorphisms can also be thought of": this paragraph has no footnote.
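As a minimal illustration of the correspondence discussed in the coloring comments above (a sketch, not taken from the article): a proper k-coloring of an undirected graph G is the same thing as a map sending every edge of G to an edge of the complete graph K_k. The function names `is_homomorphism` and `complete_graph_edges`, and the edge-list representation, are assumptions made only for this example.

```python
from itertools import combinations


def is_homomorphism(g_edges, h_edges, f):
    """Return True if f sends every (undirected) edge of G to an edge of H."""
    h_adj = {frozenset(e) for e in h_edges}
    # An edge {u, v} of G must land on an edge {f(u), f(v)} of H.
    # If f(u) == f(v), the image is a loop, which exists only if H has it.
    return all(frozenset((f[u], f[v])) in h_adj for u, v in g_edges)


def complete_graph_edges(k):
    """Edges of K_k on vertices 0..k-1 (all distinct pairs, no loops)."""
    return list(combinations(range(k), 2))


# The 5-cycle with a proper 3-coloring and an improper one.
c5 = [(i, (i + 1) % 5) for i in range(5)]
proper = {0: 0, 1: 1, 2: 0, 3: 1, 4: 2}
improper = {0: 0, 1: 0, 2: 1, 3: 0, 4: 1}  # vertices 0 and 1 share a color

proper_is_hom = is_homomorphism(c5, complete_graph_edges(3), proper)
improper_is_hom = is_homomorphism(c5, complete_graph_edges(3), improper)
```

Because K_k has no loops, two adjacent vertices can never receive the same color under a homomorphism, which is exactly the properness condition; conversely, any homomorphism to K_k can be read off as a coloring.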
Constraint satisfaction:
- The first two paragraphs are essentially about distance coloring and its formulation as a homomorphism problem. Maybe this should be at least mentioned?
- You mean coloring where vertices at distance at most p get different colors? Or L(h, k)-coloring? Neither is really the same thing. You could define metric spaces and a kind of distance-preserving colorings, but that's far fetched. Anyway this is meant as examples that are as simple as possible, where you can change H to change constraints. Tokenzero (talk)
- "part-of-speech tagging": unsourced, and I'm skeptical that homomorphism is an effective way of approaching this. The reason is that one needs huge amounts of data to get anywhere with unsupervised learning of language, but huge data is intrinsically noisy, and you have no principled way of distinguishing the pairs that are coincidentally next to each other from the ones that you are trying to find. Also the connection between HMMs and homomorphisms is tenuous and again needs sourcing.
- I removed it. But my intention was to give this obviously simplified example, which would nevertheless show a very different way of thinking about homomorphisms. In practice of course you'd want a kind of fractional weighted homomorphism, and constraints between more than just two words. Then it's exactly what they do in Padró, Lluís (1996), A Constraint Satisfaction Alternative for POS Tagging (PDF). Also I did not connect HMMs and homomorphisms, I meant only that HMMs are the theoretical concept more commonly used for part-of-speech tagging. Tokenzero (talk)
- "Most algorithmic methods": the methods listed are only naive ones, and for coloring at least other methods including dynamic programming and inclusion-exclusion have also been effective. So is "most" really accurate?
Structure:
- This whole section does not carefully distinguish the directed versus the undirected case. Really there are two dense posets and two categories, one for directed graphs and one for undirected graphs. Which of these two are the ones whose properties are described here?
- I made 'undirected' default, mentioned 'directed' in Cores, added final paragraphs here and in Incomparability on directed graphs, and added '(undirected)' when stating density to warn the reader this one place is not true for digraphs (they are mooostly dense, but it's complicated). Tokenzero (talk)
- "equivalence classes, it defines": comma splice.
- This appears to be the first time we've seen the slashed-arrow notation. What does it mean? (My guess would be the nonexistence of a homomorphism but this needs to be clearer.)
Incomparability:
- Have we seen a definition for the notation CG used here? Again, are we considering directed or undirected graphs?
Complexity:
- "can be solved by brute-force": this time bound needs a source.
- Why the scare quotes (and why single quotes) on left and right?
- "Hell-Nešetřil", "Feder-Vardi", etc., should use en-dashes, not hyphens.
- Why is graph isomorphism mentioned, but not subgraph isomorphism? And this is another paragraph without footnotes.
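For concreteness, the naive brute-force approach mentioned in the Complexity comments can be sketched as follows. This is only an illustration of enumerating all |V(H)|^|V(G)| vertex maps, not a sourced algorithm or the bound stated in the article; the name `find_homomorphisms` and the vertex-list/edge-list representation are assumptions for this example.

```python
from itertools import product


def find_homomorphisms(g_vertices, g_edges, h_vertices, h_edges):
    """Yield every map V(G) -> V(H) that sends each edge of G to an edge of H."""
    h_adj = {frozenset(e) for e in h_edges}
    # Try all |V(H)| ** |V(G)| assignments of images to the vertices of G.
    for images in product(h_vertices, repeat=len(g_vertices)):
        f = dict(zip(g_vertices, images))
        if all(frozenset((f[u], f[v])) in h_adj for u, v in g_edges):
            yield f


c5_vertices = list(range(5))
c5_edges = [(i, (i + 1) % 5) for i in range(5)]
k2 = ([0, 1], [(0, 1)])
k3 = ([0, 1, 2], [(0, 1), (0, 2), (1, 2)])

# The 5-cycle is an odd cycle, so it has no homomorphism to K2 (no 2-coloring).
has_hom_to_k2 = any(True for _ in find_homomorphisms(c5_vertices, c5_edges, *k2))
# Homomorphisms to K3 are exactly the proper 3-colorings of the 5-cycle.
num_homs_to_k3 = sum(1 for _ in find_homomorphisms(c5_vertices, c5_edges, *k3))
```

Here `num_homs_to_k3` counts the proper 3-colorings of the 5-cycle, which agrees with the value of the cycle's chromatic polynomial, (k − 1)^5 − (k − 1) at k = 3.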
—David Eppstein (talk) 01:47, 14 August 2017 (UTC)
Many thanks for the suggestions, I've added specific answers above. Tokenzero (talk) 23:35, 20 August 2017 (UTC)
Status query
David Eppstein, where does this review stand now? Are there any issues remaining at this point? Thanks. BlueMoonset (talk) 14:06, 1 October 2017 (UTC)
Third reading
@Tokenzero: Sorry for the delay. @BlueMoonset: still some issues remaining, but it looks like it is converging towards GA.
Here are my comments on the current version. Overall, it looks well-balanced, with good coverage that has been made significantly less technical. At least, I think it should now be readable by someone who already has a little familiarity with graph theory, while before it was a little more advanced than that. I only have some minor issues of writing quality (GA criterion 1a) to address:
Lead: Now at a good balance between avoiding enough jargon to be accessible and, on the other hand, covering the subject. Issues with inadequate summarization of some parts of the article have been addressed.
Definitions: Minor inaccuracies and differentiation from other related notions now addressed. One minor quibble: The article contains 17 instances of the word "any", which often functions as a quantifier, but an ambiguous one: sometimes it means the same thing as "every", and sometimes it means "a single arbitrarily chosen thing". So in "no homomorphism to any proper subgraph", there is no reasonable substitute for "any" (in the second meaning) but in "Any graph G is homomorphically equivalent", I think it would be crisper to replace "Any" by "Every".
Colorings: The first k is not italicized. Re "It is not hard to show that a graph G is k-colorable": I think the clearer statement is that whenever G → H as undirected graphs and H is given an orientation, you can pull the orientation back to G and get a homomorphism of directed graphs. Re "one of the two possible orientation": missing the "s" at the end of "orientations". Re "A folklore theorem states": the scope of the "for all k" tacked at the end of this sentence is unclear. Also the math is badly formatted: the plus signs, equal sign, and minus sign (currently and incorrectly a hyphen) should have spaces around them.
Connection to CSP: "called relational structure" needs either an article or a plural.
Structure: "defined as the disjoint union [G ∪ H]": this is a little confusingly stated, because it skips a step. It's actually the equivalence class of the disjoint union of representative graphs, and it takes an (easy) argument to observe that it doesn't matter which representative graphs G and H you use to construct the disjoint union. The disjoint union of equivalence classes would be [G] ∪ [H] and is not itself an equivalence class unless [G] = [H]. The same issue applies also to the meet, but it's less confusing there because there's no obvious way to define a tensor product of equivalence classes. Re "same definitions apply, in particular": another comma splice. Re "can be though of": should be "can be thought of".
Incomparable graphs: I'm confused about something. Supposedly by considering only the two parameters odd girth and chromatic number (both monotone for homomorphisms) we can generate an infinite antichain. Doesn't that contradict Dickson's lemma, according to which every antichain among pairs of non-negative integers is finite?
Homomorphisms from a fixed family of graphs: "for any class of graphs G, ... [statement that does not refer to G]." What is the class of graphs doing in this sentence?
References: Brown et al 2008 appears not to be used. Should it be moved to a separate "Additional reading" section, or removed, since it isn't actually a reference? Godsil and Royle is only used once; other references that are only used once are detailed in the Notes section rather than given a short reference there and a long reference in the references section. Also, it's a whole book, so it would be helpful if the footnote also included a page number where its claim could be found. Gray 2014 appears not to have been published and while its author seems to have led an interesting life [1] she currently appears to be a student, i.e. not established enough for the "recognized expert" clause of WP:SPS to apply. So I'm skeptical that this is a reliable source, by our standards. And although it doesn't say so, I suspect it may also have been somewhat influenced by our article, raising WP:CIRCULAR issues. Can we find the same claims elsewhere?
—David Eppstein (talk) 05:08, 2 October 2017 (UTC)
- @David Eppstein: Ok, I've tried to fix it all, see the diff to check if the new phrasings are clear enough now.
- Incomparable graphs: G → H implies G has at most the chromatic number of H, but at least the odd girth of H, so the integer parameters are compared in opposite directions. In terms of comparing integer tuples, the infinite antichain is of the form (k,−k), while Dickson's lemma applies to non-negative integer tuples only.
- References:
- In general, my intention was to put very focused references (in particular original journal articles) into Notes, while putting surveys, books and such into References. In particular, Godsil & Royle is a book with a whole chapter devoted to homomorphisms, I believe the largest besides H&N, with some results (admittedly obscure) not covered by H&N, but of course mostly redundant. Brown et al. is the most comprehensive treatment of the categories of graphs I could find: not just a 'by the way, this is a category', but seriously going into some details. (It's now also a reference in the technical sense, since I needed to reference one statement from Gray to Brown et al.) I've tried to make that more consistent and ordered now, but I'm open to suggestions (rename Notes to References and References to Further reading?).
- Gray: It's a useful source in the sense that it is a freely available, concise, elementary exposition with complete proofs for quite a few fundamental statements that appear in the article, but I agree "recognized expert" does not apply. It's apparently a survey done as part of a Vacation Research Scholarship (under 2013/2014, La Trobe), under the supervision of two persons that appear to be recognized experts, which is kind of the best we could hope for. Anyway, I've added other references to the two statements where this was the only one, but kept the references to Gray (the others are more general and require going through their much more technical definitions) and added 'student research report' to Gray. Given the state of the article before 2016 I doubt it's WP:CIRCULAR (Gray's is 2014). (I've seen the Sydney Morning Herald article before, but never associated it with this reference; you have a very good eye for persons behind articles!)
- Tokenzero (talk) 16:03, 14 October 2017 (UTC)
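The antichain argument in the reply above can be written out as follows. This is a sketch under stated assumptions: og denotes odd girth, χ the chromatic number, and the graphs G_1, G_2, … are assumed to be non-bipartite with both parameters strictly increasing (such graphs exist, for instance by Erdős's theorem on graphs of large girth and large chromatic number). The notation is introduced here for illustration and is not taken from the article.

```latex
\begin{align*}
  G \to H \;&\Longrightarrow\; \chi(G) \le \chi(H)
     \quad\text{and}\quad \operatorname{og}(G) \ge \operatorname{og}(H), \\
  \intertext{so if $\chi(G_i) < \chi(G_j)$ and
     $\operatorname{og}(G_i) < \operatorname{og}(G_j)$ for all $i < j$, then}
  G_j \not\to G_i \quad &\text{because } \chi(G_j) > \chi(G_i), \\
  G_i \not\to G_j \quad &\text{because } \operatorname{og}(G_i) < \operatorname{og}(G_j).
\end{align*}
```

In the componentwise order the pairs (χ(G_k), −og(G_k)) are therefore pairwise incomparable, and since the second coordinate is negative, Dickson's lemma (which concerns tuples of non-negative integers) does not apply.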
- All remaining issues addressed; passing. —David Eppstein (talk) 07:13, 16 October 2017 (UTC)