Talk:Regular grammar
This article is rated Start-class on Wikipedia's content assessment scale. It is of interest to the following WikiProjects:
I have taken the liberty of distinguishing between right regular grammars and left regular grammars more clearly -- rp, mar 09 2004
It would be nice to give a sense of what (N, Σ, P, S) are without sending readers immediately to Formal_grammar. Is there a way to do this that would make sense? Jodi.a.schneider 04:01, 10 October 2006 (UTC)
Redundancy in definition of left/right regular grammar
If you allow the empty rhs, i.e. A → ε, you do not need the form A → a, since you can add a non-terminal E with the rule E → ε and then use the rule A → aE.
So, having the form A → a alongside A → aB only makes sense if you restrict the empty rhs to the start symbol, meaning S → ε.
I suggest refining the definitions (or, if that is not wished, explaining their redundancy clearly). Andreasabel (talk) 18:49, 22 February 2023 (UTC)
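To make the redundancy concrete, here is a minimal sketch in Python (my own illustration; the encoding of rules as (terminal string, successor) pairs is an assumption, not from any source cited here). It enumerates all strings up to a length bound from a grammar with an A → a rule and from an equivalent one using only A → aB plus an ε-rule, and checks that they coincide:

# A right-regular grammar is encoded as: nonterminal -> list of rules,
# where (w, B) means "emit terminal string w, continue with B",
# and (w, None) means "emit w and stop" (w == "" encodes an eps-rule).
def generate(rules, start, max_len):
    """Enumerate all strings of length <= max_len derivable from start."""
    results, stack = set(), [("", start)]
    while stack:
        prefix, nt = stack.pop()
        for w, nxt in rules[nt]:
            s = prefix + w
            if len(s) > max_len:
                continue
            if nxt is None:
                results.add(s)
            else:
                stack.append((s, nxt))
    return results

# With the form A -> a:                       S -> aS | b
g1 = {"S": [("a", "S"), ("b", None)]}
# Without it, using A -> aB plus an eps-rule: S -> aS | bE,  E -> eps
g2 = {"S": [("a", "S"), ("b", "E")], "E": [("", None)]}

assert generate(g1, "S", 5) == generate(g2, "S", 5)  # {b, ab, aab, ...}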
- I think you will find that definitions vary. Some authors don't allow empty productions at all. Rp (talk) 07:42, 23 February 2023 (UTC)
- I think we should mention the most typical definition variants of regular grammars, so if you can provide a citation (textbook preferred) about your variant, please provide it here, or insert an appropriate remark directly into the article. Currently, only the "extended" version by Hopcroft+Ullman 1979 is sourced. - Jochen Burghardt (talk) 11:47, 23 February 2023 (UTC)
Regular Grammar
In some books, left regular grammar is defined as below:
A → a* — where A is a non-terminal in N and a is a terminal in Σ
A → Ba* — where A and B are in N and a is in Σ
However, in Wikipedia the definition excludes the possibility of having more than one terminal on the right-hand side of a production. Why this difference?
As far as I know there is no real difference: the grammars shown on the page are "strongly" (or "strictly") linear, while the ones you report are just linear (even if the way you wrote them is not so clear to me, because if you write a* it could mean "0 or more instances of the same symbol (the one which 'a' refers to)", and that wouldn't mean "regular" because it could produce an infinite sequence: in BNF you would use recursion to mean that, and you'd get something like A → B A for the second rule, which makes the grammar absolutely not regular). From a linear grammar you can always get its strictly linear equivalent by substituting the non-strict rules with sets of rules, each of which produces at most one terminal symbol (i.e., each rule does only one step). Sorry for my English, anyway... and if I said something wrong let me know! :-) --Ma82 12:12, 2 September 2006 (UTC)
if you write a* it could mean 0 or more instances of the same symbol and that wouldn't mean "regular" because it could produce an infinite sequence
No. You can have more than one terminal on the right side. Every Formal Languages and Automata book states this.
However, you cannot write the production as:
A → a*
You can have multiple terminals on the right side of the production. And in the case of a language consisting of a*, you could use the following productions to generate the language:
A → aA
A → ε
which is completely legal and produces valid strings of arbitrary length for the regular language L = a* over the alphabet Σ = {a}.
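As a quick illustration (a Python sketch of my own, not part of the original comment; the derive helper is hypothetical), every string these two rules derive matches the regular expression a*:

import re

def derive(n):
    """Apply A -> aA exactly n times, then finish with A -> eps."""
    return "a" * n

for n in range(6):
    assert re.fullmatch(r"a*", derive(n))  # a^n is in a* for every n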
The issue here is that you're not writing the production, you're giving the form for the production. And productions can take both the forms shown in the current article and those seen below:
right-linear:
S → abS | aba
left-linear:
S → Aaba
A → Aab | B
B → ε
Here each production contains at most one variable from N and a string of zero or more terminals from Σ. Both grammars produce the regular language ab(ab)*a and are valid forms of right-linear and left-linear regular grammars.
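A quick sanity check (a Python sketch of my own, not from the discussion above): both grammars derive the same strings, and every derived string matches ab(ab)*a:

import re

def right_linear(n):
    """S -> abS applied n times, then S -> aba."""
    return "ab" * n + "aba"

def left_linear(n):
    """S -> Aaba, then A -> Aab applied n times, then A -> B -> eps."""
    return "ab" * n + "aba"  # A yields (ab)^n, so S yields (ab)^n aba

for n in range(5):
    assert right_linear(n) == left_linear(n)
    assert re.fullmatch(r"ab(ab)*a", right_linear(n))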
The way this page is set up right now, it looks like you cannot produce a right- or left-linear grammar for a regular language containing an infinite number of strings, such as a*, b*, (ab)*, a*b*, (a + b)*, etc.
I just tripped over the article prefix grammar, which is nearly a left regular grammar, but not quite. I think that a prefix grammar can be described so that it's isomorphic to a left regular grammar, but I'm not expert enough to guess how off the top of my head. Clarification in both articles would be appreciated. linas 21:07, 21 April 2007 (UTC)
Question on claim in article
- Every context-free grammar can be easily rewritten into a form in which only a combination of left-regular and right-regular rules is used. Therefore, such grammars can express all context-free languages. Regular grammars, which use either left-regular or right-regular rules but not both, can only express a smaller set of languages, called the regular languages. In that sense they are equivalent to finite state automata and regular expressions. (For illustration: the paradigmatic context-free language with strings of the form a^i b^i is generated by the grammar G with N = {S, A}, Σ = {a, b}, and P with the rules
- S → aA
- A → Sb
- S → ε
- and S being the start symbol. Note that this grammar has both left-regular and right-regular rules and is therefore not regular any more.)
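For illustration, a small Python sketch of my own (the derive helper is hypothetical) that simulates the three quoted rules and confirms the grammar generates exactly the strings a^n b^n:

def derive(n):
    """Alternate S -> aA and A -> Sb n times each, then apply S -> eps."""
    s = "S"
    for _ in range(n):
        s = s.replace("S", "aA").replace("A", "Sb")
    return s.replace("S", "")

for n in range(5):
    assert derive(n) == "a" * n + "b" * n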
How do you express the language of valid bracket pairings in this setup? Seems to me you'd need to be able to replace one nonterminal with two to do that. Ben Standeven 04:44, 18 May 2007 (UTC)
Indeed. Grammars whose rules have a single nonterminal on the right-hand side and can write terminals both to its left and to its right are called linear grammars, and are less powerful than context-free grammars. They have a nice connection to push-down automata, though: they are accepted by PDAs that make a single turn (from pushing to popping). One-sided linear grammars are then called left-linear and right-linear, respectively, depending on the side of the nonterminal. Jochgem 16:09, 31 October 2007 (UTC)
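A minimal sketch of the single-turn idea (Python, my own illustration rather than Jochgem's construction): a one-turn PDA for a^n b^n pushes while reading a's, turns exactly once, and pops while reading b's:

def one_turn_pda(s):
    """Accept a^n b^n with a stack that switches from pushing to popping once."""
    stack, phase = [], "push"
    for ch in s:
        if ch == "a" and phase == "push":
            stack.append("a")       # still before the turn: push
        elif ch == "b":
            phase = "pop"           # the single turn
            if not stack:
                return False
            stack.pop()
        else:
            return False            # an 'a' after the turn is rejected
    return not stack

assert one_turn_pda("aabb") and not one_turn_pda("abab")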
Quite easily:
- S → { A
- A → }
- A → S }
- A → } S
No? BadZen (talk) 17:28, 27 August 2015 (UTC)
- How would you derive "{{}}{{}}" from S with this grammar? - Jochen Burghardt (talk) 19:57, 27 August 2015 (UTC)
- S →1 {A →4 {}S →1 {}{A →2 {}{}
- The idea of the proof: all rules with right-hand sides of length > 2 can be replaced with shorter equivalent rules by introducing newly invented nonterminals. E.g. a production A → aBcD can be replaced with A → aX; X → BY; Y → cD. For linear grammars, this can be done such that no right-hand side has two nonterminals, e.g. A → abCde can be replaced with A → aX; X → bY; Y → Ze; Z → Cd. Rp (talk) 09:03, 28 August 2015 (UTC)
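A rough sketch of that splitting step for the linear case (Python, my own code; the fresh-name scheme and the convention that nonterminals are single uppercase letters are assumptions):

import itertools

_ids = itertools.count(1)
def fresh():
    return f"X{next(_ids)}"

def split_linear_rule(lhs, rhs):
    """Split a linear rule (at most one uppercase nonterminal in rhs)
    into equivalent rules whose right-hand sides have length <= 2."""
    rules = []
    while len(rhs) > 2:
        nt = fresh()
        if rhs[0].islower():
            # leftmost symbol is a terminal: A -> t... becomes A -> tX
            rules.append((lhs, rhs[0] + nt))
            lhs, rhs = nt, rhs[1:]
        else:
            # rhs starts with the nonterminal: peel a terminal off the right
            rules.append((lhs, nt + rhs[-1]))
            lhs, rhs = nt, rhs[:-1]
    rules.append((lhs, rhs))
    return rules

print(split_linear_rule("A", "abCde"))
# [('A', 'aX1'), ('X1', 'bX2'), ('X2', 'X3e'), ('X3', 'Cd')]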
Floating point example
This rule set requires at least one digit after the dot, should the dot occur. Not every programming language disallows "3." as a valid floating-point number (though none that I know of allows ".e0", many might allow "0.e0" or ".0e0" or both). It wouldn't hurt for the example to comment on the assumptions underlying the floating-point language recognized. — MaxEnt 03:20, 13 April 2018 (UTC)
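For concreteness, a hedged sketch (the regexes below are my own simplification, ignoring signs and exponent signs, not the article's actual rule set) contrasting the stricter language with the looser variants mentioned:

import re

strict = re.compile(r"[0-9]+\.[0-9]+(e[0-9]+)?")            # digit required after the dot
loose = re.compile(r"([0-9]+\.[0-9]*|\.[0-9]+)(e[0-9]+)?")  # also allows "3." and ".0e0"

for s in ["3.14", "3.", ".0e0", "0.e0", ".e0"]:
    print(s, bool(strict.fullmatch(s)), bool(loose.fullmatch(s)))
# "3.", "0.e0" and ".0e0" pass only the loose variant; ".e0" is rejected by both.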
In desperate need of a practical example.
The example (section) really isn't one. It's just more abstract theory and walls of non-descriptive single-letter symbols.
How about a real-world example of a simple regular grammar? (Are globbing patterns a regular grammar?)
As almost always on Wikipedia, the article is completely obfuscated and incomprehensible to everyone except those who already know the subject better than the author. ^^