Re: AST, BNF, ANTLR

Kermit Kiser
Please bear with me if my question below seems elementary but my formal computer science education is mostly limited to the undergraduate level.

Although I have written countless parsing routines, my knowledge of the formal terminology and algorithms of parsers and compilers is quite limited. That may be part of the reason that I have not yet attempted any enhancements to the NetRexx language. I think I can see the use of the AST and CST structures in the translator but I don't have a solid enough grasp of how they interact to modify them. Bill's comment indicates that the NetRexx AST is not a "full AST" but I have no idea what that means.

I have seen several requests for an ANTLR grammar or a BNF definition for NetRexx. But Wikipedia says this about ANTLR:

A language is specified using a context-free grammar which is expressed using Extended Backus–Naur Form (EBNF).

and this about EBNF:

In computer science, Extended Backus–Naur Form (EBNF) is a family of metasyntax notations used for expressing context-free grammars

Yet I have seen comments indicating that Rexx is not a context-free language, and I think that applies equally to NetRexx, since its keywords are also not generally reserved outside of their context, for example.

To confuse things further, I have seen multiple comments indicating that there IS a BNF definition of Rexx in the ANSI standard for that language which RexxLA is said to maintain a copy of, but which I cannot find as the RexxLA site seems to be inaccessible currently.

So here is my question: Is it really possible to create an ANTLR grammar or BNF definition for NetRexx?

-- Kermit


On 3/8/2013 6:37 AM, Bill Fenlason wrote:
My understanding of the translator logic and data structures isn't complete enough to be sure of what the best approach would be.

When I was looking at it, it appeared to me that a full AST construction might not be easy to add to the translator.  Ideally a secondary processor (for example to generate HTML, different output codes, formatting and pretty printing, etc.) could work just by walking the tree.  The design choice is between generating an AST which is processed by different applications or embedding code in the translator to do the various output generations.
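The "secondary processor that just walks the tree" idea can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `Node` class, the node kinds and the colour table are invented for illustration and do not reflect the data structures of the NetRexx translator or of the Eclipse plugin. It only shows that two independent outputs (coloured HTML and plain source) can come from the same tree without touching the parser.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List

# Hypothetical AST node: a kind, source text (for leaves), and children.
@dataclass
class Node:
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)

# Assumed colour scheme keyed by node kind; other kinds are left uncoloured.
COLOURS = {"keyword": "blue", "string": "green", "number": "purple"}

def to_html(node: Node) -> str:
    """Walk the tree depth-first and return coloured HTML for its leaves."""
    if not node.children:
        body = escape(node.text)
        colour = COLOURS.get(node.kind)
        if colour is None:
            return body
        return f'<span style="color:{colour}">{body}</span>'
    return "".join(to_html(child) for child in node.children)

def to_plain(node: Node) -> str:
    """A second, independent walker: reproduce the source text unchanged."""
    if not node.children:
        return node.text
    return "".join(to_plain(child) for child in node.children)

# A hand-built tree for the clause: say "a<b"
tree = Node("clause", children=[
    Node("keyword", "say"),
    Node("space", " "),
    Node("string", '"a<b"'),
])

print(to_html(tree))
print(to_plain(tree))
```

Adding another output format in this style means adding another walker function; the alternative Bill mentions would put the equivalent logic inside the translator itself.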

The AST that I currently generate in the Eclipse plugin certainly needs refinement.  It's a bit of a kludge at this point.  Eclipse builds ASTs for the Java code, and one obvious approach would be to have the translator actually generate the Java ASTs, but understanding how to do that would take a great deal of research. One upside of that approach would be that NetRexx code could be debugged using the (very sophisticated) Eclipse debugger.

The ultra-dynamic nature of NetRexx is a complication of course.

Bill

On 3/8/2013 9:21 AM, René Jansen wrote:
Yes I thought of that too. Bill has a parser for the Eclipse tool; the open sourcing was a bit late as I can imagine he would have rather used the translator itself. I was just asking around because I want to go to production any day now and perhaps someone had something around that fit the bill.

I agree that the translator itself would be the road to do the best syntax colouring possible.

best regards,

René.
On 8 Mar 2013, at 14:24, "Mike Cowlishaw" [hidden email] wrote:

Who has, or knows of a NetRexx program that translates
NetRexx source to syntax coloured source in html? It has to
be quite quick, also.

i would be very interested in this for a new netrexx website
I am working on.
Maybe a nice option for the NetRexx compiler.  Instead of emitting Java, a
version that emitted HTML.  Given that the compiler has access to semantics as
well as syntax it could do a lot more .. e.g., different colours (or minor
variations of colours) for different types.  ints a different colour than
floats.  Local variables a different font than inherited ones.  Italic or bold
used for other attributes, etc.

Lots of possibilities :-).

Mike



_______________________________________________
Ibm-netrexx mailing list
[hidden email]
Online Archive : http://ibm-netrexx.215625.n3.nabble.com/

Re: AST, BNF, ANTLR

rvjansen
Not that I can really help. I have been wondering the same.

A case in point is the Chad Slaughter ANTLR grammar for NetRexx. It is 'almost correct', and should be correctable. In the days of the 'open source parallel implementation of NetRexx' (nothing ever came of it) I tried to interest the maintainers of ANTLR with $1000 to correct and maintain it. They probably had enough money OR it is too hard. (The offer has since been rescinded, thanks).

Bill's JavaCC grammar works, otherwise the Eclipse plugin would not display such good syntax coloring and parsing. It might miss some of the things you need to build a whole compiler, though.
I read that Bill thinks it needs polishing before public consumption, but I have not often seen polished things that still work - and when they work, they'll get dirty again.

The fact that NetRexx's parser is hand-written does not in itself prove that a parser generator cannot handle it. The ooRexx parser is lex/yacc, I think, and it handles something very akin to NetRexx.

The problem with ANTLR is in the 'keyword-less' approach of NetRexx. I spoke to Terence Parr ages ago and he said that, however hard, it is no showstopper. Also years ago, I invested time in this, with a depressing outcome. Normally, I am used to things happening when I work on them. This is not the case with grammar tools. The working grammars I produced (one for an SQL tool in NetRexx called nsql, which I might publish one day, and one for a special-purpose modelling language called 'bint') I have to keep in version management and check in after every character I change: grammar files are that brittle, and even looking at them makes them stop working. Also, ANTLR has had a number of versions that changed everything.

It might be that I am just not smart enough. I have a friend with a degree in exactly this and I will forward your questions.

best regards,

René.


Re: AST, BNF, ANTLR

Mike Cowlishaw
Definitely a non-trivial question.  I have maintained for decades that language
syntax rules that rely upon 'reserved words' are not suitable for interpreted
languages (i.e., those that are generally run from source rather than from a
compiled 'binary').  This has been proven over and over again -- JavaScript
being a classic example which is trapped in its own syntax.
 
Tools such as LEX and YACC make it quite easy to implement language
parsers/compilers .. but they lead the language designer into exactly that trap.

 
I wrote the original Rexx parser 'by hand' because that was essential for
performance.  It was only later that I found that various 'packaged' parsers
were not only slow but also flawed.  I had a several-hour long argument with
Tony Hoare about this some time ago .. he felt that mathematical elegance was
more important than usability and extendability, so we never did agree :-).
 
Mike

Re: AST, BNF, ANTLR

ThSITC
May I just put in my 0.1 cents ...

1.) We should *not*, and could not, change the behaviour of *non-reserved words* in any of the Rexx languages.
2.) All Rexx dialects *do have* a *context sensitive* grammar (look, for instance, at the *abut* and *blank* operators, which are *context sensitive*, as well as the usage of *builtin* methods/functions).
3.) That is also the reason why attempts to implement Rexx and/or NetRexx using ASTs with tools like Lex, Yacc, or similar tools have failed and will fail.

I studied compiler writing techniques a very long time ago, and also the syntactic requirements that languages such as PL/I must meet to allow a BNF definition (note all the parentheses required in PL/I syntax to be able to define PL/I in Backus-Naur Form and to resolve possible syntactic ambiguities within PL/I). The IBM Vienna laboratory did a great job, decades ago, in managing to define PL/I in BNF, with all its advantages and disadvantages, of course.

Thus my (personal) opinion is that you will need a hand-written scanner and parser anyway.

My personal approach when writing Rexx2Nrx, back in the early 2000s, was nevertheless to separate the scanning, parsing, semantic analysis, and code generation steps, performing each in turn.
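As a rough illustration of that separation of phases, here is a minimal Python sketch. It is not based on Rexx2Nrx: the toy language (clauses of the form `name = term + term ...`), the function names, and the output format are all assumptions made up for this example. Each phase is a plain function, so any one of them can be replaced without touching the others.

```python
import re
from typing import List, Set, Tuple

Tree = Tuple[str, List[str]]  # (target variable, terms being added)

def scan(source: str) -> List[str]:
    """Scanning: characters -> tokens (unknown characters are simply skipped)."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[=+]", source)

def parse(tokens: List[str]) -> Tree:
    """Parsing: tokens -> tree; accepts only `name = term (+ term)*`."""
    if len(tokens) < 3 or len(tokens) % 2 == 0 or tokens[1] != "=":
        raise SyntaxError("expected: name = term + term ...")
    name, rest = tokens[0], tokens[2:]
    terms, operators = rest[0::2], rest[1::2]
    if (not name.isidentifier()
            or any(op != "+" for op in operators)
            or any(term in ("=", "+") for term in terms)):
        raise SyntaxError("expected: name = term + term ...")
    return name, terms

def analyse(tree: Tree, known: Set[str]) -> Tree:
    """Semantic analysis: every variable used must already have been assigned."""
    name, terms = tree
    for term in terms:
        if not term.isdigit() and term not in known:
            raise NameError(f"unknown variable {term}")
    known.add(name)
    return tree

def generate(tree: Tree) -> str:
    """Code generation: tree -> one output statement."""
    name, terms = tree
    return f"{name} = {' + '.join(terms)};"

def translate(lines: List[str]) -> List[str]:
    """Run the four phases in turn for each input line."""
    known: Set[str] = set()
    return [generate(analyse(parse(scan(line)), known)) for line in lines]

print(translate(["a = 1 + 2", "b = a + 3"]))
```

In this sketch the scanner and parser are hand-written, in line with the opinion above, but only `analyse` knows anything about variables and only `generate` knows the output format.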

It will *not* be possible to define a so-called LR(k) grammar for any Rexx dialect, which would be needed to provide a *context free* grammar definition.

Thomas Schneider.

PS: If somebody is able to prove the contrary, I will of course accept the proof, as it would increase my knowledge ;-)

--
Thomas Schneider, IT Consulting; http://www.thsitc.com; Vienna, Austria,
Europe

Thomas Schneider, Vienna, Austria (Europe) :-)

www.thsitc.com
www.db-123.com

Re: AST, BNF, ANTLR

billfen
In reply to this post by rvjansen
There are several points to address here.

Kermit, the AST used by the Eclipse NetRexx plugin is a "full" AST in that it describes all the syntactic aspects of a NetRexx program.  What I meant was that it is still under development.  The plugin is Alpha code (it is not "elegant" and it changes drastically from time to time), but the current AST could be considered "full".  But the details are not fixed or documented yet.

In order to correctly show operators in the AST, the blank concatenation operator must be handled, and I did that with the tokenization.  For most languages in a compiler setting, all unnecessary free space is discarded before being passed to the generated parser.  What I do is to generate tokens containing every character in the input, and the AST totally reflects the exact source input.  I see that as essential for other non-editing applications like formatters, etc., and it allows a significant blank to be marked as a concatenation operator.  It also fits nicely into the editing environment so every text change can be directly linked back to the AST.  The downside is that the grammar includes whitespace and processing is slightly less efficient. My parser does incremental scanning and parsing (i.e. only updates that part of the AST required by the text change) to be more efficient. 
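A minimal Python sketch of that kind of lossless tokenization follows. It is not the plugin's JFlex specification: the token classes, the regular expression and the rule for deciding when a blank is significant are simplified assumptions, and the sketch handles expressions only (it knows nothing about keyword instructions). The rule assumed here is that a run of blanks is a concatenation operator when it sits between something that can end a term and something that can start one.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text)

# Simplified, assumed token classes; real NetRexx lexical rules are richer.
TOKEN_RE = re.compile(r"""
    (?P<blank>[ \t]+)
  | (?P<string>"[^"]*"|'[^']*')
  | (?P<number>\d+)
  | (?P<symbol>[A-Za-z_]\w*)
  | (?P<op>\|\||[-+*/=()])
""", re.VERBOSE)

TERM_KINDS = {"string", "number", "symbol"}

def tokenize(source: str) -> List[Token]:
    """Return tokens covering every character of the input, blanks included."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

def ends_term(token: Token) -> bool:
    return token[0] in TERM_KINDS or token == ("op", ")")

def starts_term(token: Token) -> bool:
    return token[0] in TERM_KINDS or token == ("op", "(")

def mark_concatenation(tokens: List[Token]) -> List[Token]:
    """Relabel a blank as 'concat' when it separates two adjacent terms."""
    result: List[Token] = []
    for i, (kind, text) in enumerate(tokens):
        if (kind == "blank" and 0 < i < len(tokens) - 1
                and ends_term(tokens[i - 1]) and starts_term(tokens[i + 1])):
            kind = "concat"
        result.append((kind, text))
    return result

source = 'a "b" || c + 1'
tokens = mark_concatenation(tokenize(source))
for kind, text in tokens:
    print(f"{kind:7} {text!r}")
```

Because no character is discarded, joining the token texts reproduces the input exactly, which is the property that lets an editor or formatter map every text change back to the tree.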

The "keyword-less" nature of NetRexx is indeed a problem, but it can be handled.  PL/I is the classic language case of that, and there are successful (totally full) parsers for PL/I using Flex / Bison, JFlex / JavaCC and perhaps others.  The crux is that the words are tokenized as keywords and the token type is converted to "ID" as necessary by the parser, and it works (more or less) with NetRexx as well.  Unfortunately it cannot handle the opposite assumption, which NetRexx currently makes: that tokens are identifiers first and are converted to keywords depending on the execution environment.
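The keyword-first technique can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the plugin's grammar: the keyword list is partial and invented, tokens are assumed to be separated by whitespace, and the only demotion rule shown is the simplified one that a clause whose second token is `=` is an assignment, so nothing in it can be a keyword.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text)

KEYWORDS = {"if", "then", "else", "say", "do", "end"}  # assumed, partial list

def lex(clause: str) -> List[Token]:
    """Keyword-first lexing: any word spelled like a keyword is tagged KEYWORD."""
    tokens: List[Token] = []
    for word in clause.split():
        if word.lower() in KEYWORDS:
            tokens.append(("KEYWORD", word))
        elif word == "=":
            tokens.append(("OP", word))
        elif word.isdigit():
            tokens.append(("NUM", word))
        else:
            tokens.append(("ID", word))
    return tokens

def demote(tokens: List[Token]) -> List[Token]:
    """Parser-side fix-up: turn KEYWORD tokens into ID where only a name can appear."""
    result = list(tokens)
    # Simplified rule: `word = ...` is an assignment, so the target is a
    # variable name and everything in the expression is an ordinary symbol.
    if len(result) >= 2 and result[1] == ("OP", "="):
        for i, (kind, text) in enumerate(result):
            if kind == "KEYWORD":
                result[i] = ("ID", text)
    return result

print(demote(lex("say = 3")))
print(demote(lex("say x")))
```

The sketch always starts from "keyword" and falls back to "identifier" using purely syntactic context. The reverse direction, where a word starts as an identifier and becomes a keyword depending on what is in scope, needs information the lexer and grammar do not have.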

There are a number of different compiler generators for the Java environment, and ANTLR is one of the more popular.  But others include JavaCC, COCO, Sable, and other older or less well known ones including Bison.  I chose the JFlex / JavaCC combination after considerable investigation.  I'll skip the details here but would be happy to discuss them.

I did look at the existing ANTLR grammar and I found it lacking, particularly with regard to the above points (as well as some others as I recall).  Rene, I suspect the job is too hard rather than a $ issue.  I did not judge that grammar to be "almost-correct", but perhaps my criteria are substantially different.

I did not choose ANTLR for several reasons.  First, it requires the use of external run time libraries, and I'm opposed to that.  Next, it did not seem to have the necessary flexibility to handle the problems mentioned above as well as some others (including incremental parsing).  In trying to understand it and use it, I thought it contained considerable "bloat", and finally I did not like the idea that the only way to get decent documentation was to buy Parr's book (I suspect he has plenty of grad students to generate free documentation).  It was actually last on my list of alternatives.

As for the ability of a parser generator to handle NetRexx, I did mention that the ultra-dynamic nature of NetRexx is a problem.  The bottom line of that is that the NetRexx language is intended to be interpreted and the Java language is intended to be compiled.  The idea of using an interpreter to generate compilable source code (depending on the execution environment) is certainly interesting.

Classic Rexx is compilable and is not as dynamic as NetRexx, and thus Rexx was definable in BNF.

I would prefer that NetRexx could be compilable independently of the execution environment (as Rexx can be), but that is not the case.  For that reason I doubt that a generated parser can ever totally handle NetRexx without substantial work.  Fortunately in the real world it doesn't matter too much, and the Eclipse plugin can be effectively used.  As I've said before, if the NetRexx source had been released earlier, perhaps I would not have used a parser generator at all.

Mike and I disagree about when and how the recognition of keywords should happen, but NetRexx is his language!! :) :)

Bill

PS. Thomas, Regina Rexx, as I recall, also uses Lex and Yacc.  I don't recall if it actually generates an AST, but any language that uses Lex and Yacc certainly can (with perhaps some post-parsing adjustment).  Yacc does not generate the AST automatically, but some other parser generators (like JavaCC / JJTree which I used) do.  I believe that a Rexx interpreter using an AST could be (or perhaps has been) implemented, so I think we disagree on that point.  NetRexx is a different matter.

PPS. Thomas, I do agree about PL/I and the Vienna IBM efforts.  Remember, that was just about 50 years ago!!  That was some time before Unix, C, Lex and Yacc, and not long after ALGOL and the invention of BNF. 


Re: AST, BNF, ANTLR

rvjansen

On 11 Mar 2013, at 18:28, Bill Fenlason <[hidden email]> wrote:

> I did look at the existing ANTLR grammar and I found it lacking, particularly with regard to the above points (as well as some others as I recall).  Rene, I suspect the job is too hard rather than a $ issue.  I did not judge that grammar to be "almost-correct", but perhaps my criteria are substantially different.

I was starting to suspect that, because as we say in Amsterdam, I would not spit on $1000. A colleague of mine ran some NetRexx through it and reported back that it was not bad; in that sense I equate almost-correct with not-working but correctable. I see now that you judged it to be beyond salvaging (in the same sense that a crashed car can be fixed, but is written off by insurance if the fixing is too expensive), in addition to some more fundamental issues.

I am going to have a look at your grammar, if I can find it somewhere. Is the source included in the plugin downloadable?

I am a bit puzzled by your remark that NetRexx is not compilable independently of the execution environment. Nothing really is, I think. I am thankful that its source is NetRexx and not C++ or assembler. Everything we build always needs to link with its execution environment. But I am not sure what you mean here.

The other thing I do not understand is this: you say "the AST used by the Eclipse NetRexx plugin is a "full" AST in that it describes all the syntactic aspects of a NetRexx program" but in the post script for Thomas you state: " I believe that a Rexx interpreter using an AST could be (or perhaps has been) implemented, so I think we disagree on that point.  NetRexx is a different matter."

Isn't this contradictory? If true only for a subset, what would that subset be?

best regards,

René. 



Re: AST, BNF, ANTLR

billfen
Rene,

On 3/11/2013 1:56 PM, René Jansen wrote:

On 11 Mar 2013, at 18:28, Bill Fenlason <[hidden email]> wrote:

I did look at the existing ANTLR grammar and I found it lacking, particularly with regard to the above points (as well as some others as I recall).  Rene, I suspect the job is too hard rather than a $ issue.  I did not judge that grammar to be "almost-correct", but perhaps my criteria are substantially different.

I was starting to suspect that, because as we say in Amsterdam, I would not spit on $1000. A colleague of mine ran some NetRexx through it and reported back that it was not bad; in that sense I equate almost-correct with not-working but correctable. I see now that you judged it to be beyond salvaging (in the same sense that a crashed car can be fixed, but is written off by insurance if the fixing is too expensive), in addition to some more fundamental issues.

Exactly.


I am going to have a look at your grammar, if I can find it somewhere. Is the source included in the plugin downloadable?

The source for the plugin grammar (Nrx.jjt)  is on sourceforge:

http://eclipsenetrexx.svn.sourceforge.net/viewvc/eclipsenetrexx/src/nrxParser/Nrx.jjt?revision=169&view=markup

The actual grammar def starts at line 539 or so.  This version is a bit old - I have a later version (prep for the next release, and it is somewhat different - bug fixes, etc) but I haven't had time to work on it or upload it.  My code is still in the debugging phase, so don't judge it too harshly :)

The JFlex token definitions are in the "scanner" directory (Nrx.lex).  The JJT source is not particularly easy to follow.  The grammar spec is not like BNF or EBNF - it is more like methods.  I doubt that it would be easy to migrate this to another environment, particularly since I had to make modifications to JavaCC itself to generate special code to support incremental parsing.


I am a bit puzzled by your remark that NetRexx is not compilable independently of the execution environment. Nothing really is, I think. I am thankful that its source is NetRexx and not C++ or assembler. Everything we build always needs to link with its execution environment. But I am not sure what you mean here.

What I mean is that the same NetRexx statement is either valid or an error depending on the dynamic execution environment - that is, can depend on exactly what is available at runtime and the current state of affairs within the program execution.  See page 79 of the language reference manual ("Keyword Instructions").  Also, because you can overload a keyword with a method, you can not determine if the statement is in error until the runtime environment is available.  (Consider a program which accesses a Java method named "end" :).

I think keywords should have priority over variable names IN SITUATIONS WHERE THE KEYWORD IS VALID, but NetRexx does not work that way.  I think that in the example on page 79, the fact that the second  "say 'Hello' " is in error rates high on the astonishment scale.
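To make the rule under discussion concrete, here is a minimal sketch in Python (not NetRexx, and not the translator's actual logic). It assumes the page 79 example has the shape "say 'Hello'", then an assignment to a variable named say, then "say 'Hello'" again, which is what the thread implies. The keyword set and the function name classify_clauses are invented for illustration.

```python
# Illustrative subset of keywords; the real NetRexx list is longer.
KEYWORDS = {"say", "loop", "leave", "exit", "end", "if", "do"}

def classify_clauses(clauses):
    """Classify each clause as 'assignment', 'keyword' or 'expression'.

    Assumed rule: a name stops being recognised as a keyword once it
    has been used as a variable earlier in the program.
    """
    variables = set()
    result = []
    for clause in clauses:
        tokens = clause.split()
        first = tokens[0].lower()
        if len(tokens) > 1 and tokens[1] == "=":
            variables.add(first)          # the name is now a variable
            result.append("assignment")
        elif first in KEYWORDS and first not in variables:
            result.append("keyword")
        else:
            result.append("expression")   # e.g. effectively 3 'Hello'
    return result

program = ["say 'Hello'", "say = 3", "say 'Hello'"]
print(classify_clauses(program))
```

Under this assumed rule the third clause is no longer a SAY instruction, which is the behaviour described above as astonishing.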


The other thing I do not understand is this: you say "the AST used by the Eclipse NetRexx plugin is a "full" AST in that it describes all the syntactic aspects of a NetRexx program" but in the post script for Thomas you state: " I believe that a Rexx interpreter using an AST could be (or perhaps has been) implemented, so I think we disagree on that point.  NetRexx is a different matter."

Isn't this contradictory? If true only for a subset, what would that subset be?

As above.  Any place where NetRexx (at runtime) determines that a keyword is to be overridden is problematic.  I believe that the reason that was done was to allow keyword extensions to the language.  I suggested some time ago that another way to get around the keyword-set extension problem is to tag each source file with a language version (as HTML, etc. do).  I think that would have been preferable (but it is too late now even if I could convince Mike of its merit :).


best regards,

René.

Bill


Re: AST, BNF, ANTLR

Mike Cowlishaw
Bill wrote: 
 
 What I mean is that the same NetRexx statement is either valid or an error depending on the dynamic execution environment - that is, can depend on exactly what is available at runtime and the current state of affairs within the program execution.  See page 79 of the language reference manual ("Keyword Instructions").  Also, because you can overload a keyword with a method, you can not determine if the statement is in error until the runtime environment is available.  (Consider a program which accesses a Java method named "end" :).
 
Is this really true?   I thought it was only variables (including arguments & properties).  Since those are statically determined there shouldn't be any dependence on runtime environment.  Or maybe I'm forgetting something again.
 
I think keywords should have priority over variable names IN SITUATIONS WHERE THE KEYWORD IS VALID, but NetRexx does not work that way.  I think that in the example on page 79, the fact that the second  "say 'Hello' " is in error rates high on the astonishment scale.  
 
This means the programmer has to learn all the keywords in the language ... but I agree a more complicated rule might be possible.  However it wasn't clear whether the syntax (effectively "3 'Hello'") might _not_ be an error in the future, especially as an expression like that is valid in Classic Rexx, e.g., the second line of:
 
  options='open binary stream'
  options args
 
would issue a command in Rexx but be a keyword instruction in (modified) NetRexx.
 
 
Mike 



Re: AST, BNF, ANTLR

billfen
Mike,

On 3/12/2013 3:55 AM, Mike Cowlishaw wrote:
Bill wrote: 
 
 What I mean is that the same NetRexx statement is either valid or an error depending on the dynamic execution environment - that is, can depend on exactly what is available at runtime and the current state of affairs within the program execution.  See page 79 of the language reference manual ("Keyword Instructions").  Also, because you can overload a keyword with a method, you can not determine if the statement is in error until the runtime environment is available.  (Consider a program which accesses a Java method named "end" :).
 
Is this really true?   I thought it was only variables (including arguments & properties).  Since those are statically determined there shouldn't be any dependence on runtime environment.  Or maybe I'm forgetting something again.


Mike, I may be wrong on that.  I must admit that I have not tried it, but it is based on my understanding of how keywords are determined.  I may have it confused in my memory - my current medications seem to encourage that :).  I will test this later today when I have time.

I think keywords should have priority over variable names IN SITUATIONS WHERE THE KEYWORD IS VALID, but NetRexx does not work that way.  I think that in the example on page 79, the fact that the second  "say 'Hello' " is in error rates high on the astonishment scale.  
 
This means the programmer has to learn all the keywords in the language ... but I agree a more complicated rule might be possible.  However it wasn't clear whether the syntax (effectively "3 'Hello'") might _not_ be an error in the future, especially as an expression like that is valid in Classic Rexx, e.g., the second line of:
 
  options='open binary stream'
  options args
 
would issue a command in Rexx but be a keyword instruction in (modified) NetRexx.   


 Considering how few keywords Rexx has, I doubt that learning the keywords would be a problem. 

I think that the statement "options args" that issues a command also ranks pretty high on the astonishment scale.

I agree that assuming that all versions of NetRexx (now and in the future) should be parsable without language version information will certainly allow breakage.  That is why I suggest that the language version (if not the original) be explicit in the heading.  I see no difficulty in insisting that the first comment token be "NetRexx" and the next non-whitespace token sequence be a language level indication (e.g. /*  NetRexx  3.2  */ ).

 
Mike 


Bill
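The version-header proposal above can be sketched as follows. This is only an illustration in Python of how a tool might read such a header; the function name language_level and the exact header pattern are assumptions, and nothing like this is part of NetRexx today.

```python
import re

# Assumed header shape: a leading comment whose first token is "NetRexx",
# followed by a dotted version number, e.g. /*  NetRexx  3.2  */
HEADER = re.compile(r"^\s*/\*\s*NetRexx\s+(\d+(?:\.\d+)*)\s*\*/", re.IGNORECASE)

def language_level(source):
    """Return the declared level as a tuple of ints, or None if absent."""
    match = HEADER.match(source)
    if match is None:
        return None  # could be treated as "original language level"
    return tuple(int(part) for part in match.group(1).split("."))

print(language_level("/*  NetRexx  3.2  */\nsay 'Hello'"))
print(language_level("say 'Hello'"))
```

A parser could then pick its keyword set from the returned level, so that a keyword added later never changes the meaning of a file tagged with an older level.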


Re: AST, BNF, ANTLR

ThSITC
In reply to this post by billfen
Hello Bill,
    many thanks for providing the link!

From a short look at your source, this solution looks very compact and *elegant* to me!

Whilst I do not understand all the details, I do like *the style* in which it is done!

My congratulations!
Thomas.

Thomas Schneider, Vienna, Austria (Europe) :-)

www.thsitc.com
www.db-123.com

Re: AST, BNF, ANTLR

ThSITC
In reply to this post by billfen
On 12.03.2013 13:30, Bill Fenlason wrote:
*snip*:
I think keywords should have priority over variable names IN SITUATIONS WHERE THE KEYWORD IS VALID, but NetRexx does not work that way.  I think that in the example on page 79, the fact that the second  "say 'Hello' " is in error rates high on the astonishment scale.  


A very long time ago, during my first trials with NetRexx, I reported a quite trivial problem to Mike, as follows:

method mymethod(from=int, to=int)

...
loop i = from to to
   ... -- some statements depending on i
end

As far as I remember, this sample code *failed to compile*, as the first *to* in the
loop statement is no longer recognized as a keyword!

Mike's answer was that this was the first sample he had seen of this
kind of potential problem ...

I simply accepted this restriction in those days ...

Hence, I would (personally) *underline* Bill's comment above with a +1!
Thomas.
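The two competing rules can be compared on this loop clause with a small Python sketch. It is an illustration under stated assumptions only: the sub-keyword list is a subset, the function mark_loop_tokens is invented, and the "keyword priority" branch is one possible reading of Bill's suggestion, not how NetRexx works.

```python
def mark_loop_tokens(clause, variables, keyword_priority):
    """Label each word of a 'loop' clause as keyword, variable or other."""
    sub_keywords = {"to", "by", "for", "while", "until"}  # illustrative subset
    labels = []
    expecting_operand = False  # True right after '=' or a sub-keyword
    for word in clause.split():
        lower = word.lower()
        if lower == "loop" and not labels:
            labels.append((word, "keyword"))
        elif word == "=":
            labels.append((word, "other"))
            expecting_operand = True
        elif lower in sub_keywords and lower in variables:
            # Ambiguous: the name is both a sub-keyword and a variable.
            as_keyword = keyword_priority and not expecting_operand
            labels.append((word, "keyword" if as_keyword else "variable"))
            expecting_operand = as_keyword
        elif lower in sub_keywords:
            labels.append((word, "keyword"))
            expecting_operand = True
        elif lower in variables:
            labels.append((word, "variable"))
            expecting_operand = False
        else:
            labels.append((word, "other"))
            expecting_operand = False
    return labels

names = {"i", "from", "to"}
print(mark_loop_tokens("loop i = from to to", names, keyword_priority=False))
print(mark_loop_tokens("loop i = from to to", names, keyword_priority=True))
```

With keyword_priority=False both occurrences of to are treated as the variable, matching the failure reported above; with keyword_priority=True the first to is taken as the sub-keyword because a keyword is valid at that position.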



 


Re: AST, BNF, ANTLR

billfen
In reply to this post by billfen
Sorry for the delay with this.  A simple test shows that keywords can be overloaded with methods.  Here is a snip from a program named "keyword_test.nrx" :

/* NetRexx 3 */

begin

loop  i = 1 to 5
   say "\n Loop iteration" i
  
   if i = 3 then do
      say "   Leave in iteration 3"
      leave
      end
  
   if i > 3 then say "   Why am I still in this loop?"
   end

say "\n Now exit"
exit
say "   Why was the exit was not done?\n"


/* Remainder of the program */
method begin static
say "\n Begin Test"

------------------- snip (what has been omitted here?) ----------------------
...

Here is the somewhat surprising output from interpreting this program:

===== Exec: keyword_test =====

 Begin Test

 Loop iteration 1

 Loop iteration 2

 Loop iteration 3
   Leave in iteration 3

 Loop iteration 4
   Why am I still in this loop?

 Loop iteration 5
   Why am I still in this loop?

 Now exit
   Why was the exit was not done?

Processing of 'keyword_test.nrx' complete


And here is the rest of the program (not shown above):
...
-------------------- snip (what has been omitted here?) ----------------------

/* Method with the same name as leave keyword */
method leave static
--say "Leave method entered"

/* Method with the same name as exit keyword */
method exit static
--say "Exit method entered"

------------------------------------------------------------------------------------

I was wrong about a method named "end" since the mismatch with "loop" etc. will be caught.  But I think this example shows the astonishment factor with regard to overloading keywords.

I have not tested having the keyword overloads in an imported package, but I assume it would produce the same results.

Of course some may feel that keyword overloading is an advantage, but I think it is not, particularly for the beginning programmer.

Bill



Re: AST, BNF, ANTLR

Mike Cowlishaw
Bill, very many thanks for putting this example together.
 
What is important here is that -- yes -- local names act as the writer expected, regardless of new 'keywords' added to the language later.   Suppose that NetRexx did not have the LEAVE instruction (or the programmer did not know about it).  A programmer might have written the code as you show.  And it would have worked as you demonstrate. 
 
Then .. ten years later .. a NetRexx developer thinks .. hey, it would be nice to add LEAVE to the language and this is now a reserved keyword.   At that point Bill's program is broken -- long after the original programmer has left the company/stopped programming.  However, with the current rules the working program would be unaffected.
 
Or, suppose the method were called 'break' .. used by the programmer because NetRexx used the different 'keyword' "leave" -- and then later NetRexx developers decided to add 'break' as a synonym to 'leave' to help C and Java programmers?
 
Perhaps we are misunderstanding each other.  The design intent of NetRexx (in this respect) was that later additions to the language would not affect/break programs already written.  And, as far as possible, external changes to classes and superclasses should equally not invalidate a correct NetRexx program.  Kermit's changes may help strengthen the latter, for example.
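The compatibility argument can be illustrated with a short Python sketch. It is a toy model only: the keyword sets are invented subsets, 'break' is the hypothetical later addition mentioned above, and is_keyword_instruction is not a function of the NetRexx translator.

```python
def is_keyword_instruction(first_word, local_names, keywords, names_win):
    """Decide whether a clause starting with first_word is a keyword instruction."""
    word = first_word.lower()
    if word not in keywords:
        return False
    if names_win and word in local_names:
        return False  # a name defined by the program takes precedence
    return True

old_keywords = {"say", "loop", "leave", "exit"}   # illustrative subset
new_keywords = old_keywords | {"break"}           # hypothetical later addition
local_names = {"break"}                           # the program has its own 'break' method

for names_win in (True, False):
    before = is_keyword_instruction("break", local_names, old_keywords, names_win)
    after = is_keyword_instruction("break", local_names, new_keywords, names_win)
    print(names_win, before, after)
```

When local names win, the clause is a method call both before and after the keyword is added, so the old program keeps its meaning. With reserved keywords the same clause silently changes from a method call to a keyword instruction.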
 
Mike



Re: AST, BNF, ANTLR

George Hovey-2
Mike,
Not breaking old programs seems a worthy feature indeed.  Can or should NetRexx warn that a program uses current keywords in the way you describe?  This could be useful to someone given the task of bringing ancient code up to date.
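A warning of that kind could be sketched roughly as below. This Python sketch is only an illustration: it scans for method declarations with a simple regular expression rather than a real parser, the keyword set is a small subset, and keyword_name_warnings is an invented name.

```python
import re

KEYWORDS = {"leave", "exit", "loop", "say", "end"}  # illustrative subset

METHOD_DECL = re.compile(r"^\s*method\s+([A-Za-z_]\w*)", re.IGNORECASE)

def keyword_name_warnings(source):
    """Return (line_number, name) pairs for methods named like a keyword."""
    warnings = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = METHOD_DECL.match(line)
        if match and match.group(1).lower() in KEYWORDS:
            warnings.append((number, match.group(1)))
    return warnings

sample = "method begin static\nmethod leave static\nmethod exit static"
for number, name in keyword_name_warnings(sample):
    print(f"line {number}: method '{name}' hides a keyword")
```

A real implementation would also need to consider variables, arguments and properties, since the thread notes those can hide keywords as well.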

On Sat, Mar 16, 2013 at 3:36 PM, Mike Cowlishaw <[hidden email]> wrote:
Bill, very many thanks for putting this example together.
 
What is important here is that -- yes -- local names act as the writer expected, regardless of new 'keywords' added to the language later.   Suppose that NetRexx did not have the LEAVE instruction (or the programmer did not know about it).  A programmer might have written the code as you show.  And it would have worked as you demonstrate. 
 
Then .. ten years later .. a NetRexx developer thinks .. hey, it would be nice to add LEAVE to the language and this is now a reserved keyword.   At that point the program below is broken -- long after the original programmer has left the company/stopped programming.  However, with the current rules the working program would be unaffected.
 
Or, suppose the method were called 'break' .. used by the programmer because NetRexx used the different 'keyword' "leave" -- and then later NetRexx developers decided to add 'break' as a synonym to 'leave' to help C and Java programmers?
 
Perhaps we are misunderstanding each other.  The design intent of NetRexx (in this respect) was that later additions to the language would not affect/break programs already written.  And, as far as possible, external changes to classes and superclasses should equally not invalidate a correct NetRexx program.  Kermit's changes may help strengthen the latter, for example.
 
Mike


From: [hidden email] [mailto:[hidden email]] On Behalf Of Bill Fenlason
Sent: 16 March 2013 18:46
To: IBM Netrexx
Subject: Re: [Ibm-netrexx] AST, BNF, ANTLR

Sorry for the delay with this.  A simple test shows that keywords can be overloaded with methods.  Here is a snip from a program named "keyword_test.nrx" :

/* NetRexx 3 */

begin

loop  i = 1 to 5
   say "\n Loop iteration" i
  
   if i = 3 then do
      say "   Leave in iteration 3"
      leave
      end
  
   if i > 3 then say "   Why am I still in this loop?"
   end

say "\n Now exit"
exit
say "   Why was the exit was not done?\n"


/* Remainder of the program */
method begin static
say "\n Begin Test"

------------------- snip (what has been omitted here?) ----------------------
...

Here is the somewhat surprising output from interpreting this program:

===== Exec: keyword_test =====

 Begin Test

 Loop iteration 1

 Loop iteration 2

 Loop iteration 3
   Leave in iteration 3

 Loop iteration 4
   Why am I still in this loop?

 Loop iteration 5
   Why am I still in this loop?

 Now exit
   Why was the exit was not done?

Processing of 'keyword_test.nrx' complete


And here is the rest of the program (not shown above):
...
-------------------- snip (what has been omitted here?) ----------------------

/* Method with the same name as leave keyword */
method leave static
--say "Leave method entered"

/* Method with the same name as exit keyword */
method exit static
--say "Exit method entered"

------------------------------------------------------------------------------------

I was wrong about a method named "end" since the mismatch with "loop" etc. will be caught.  But I think this example shows the astonishment factor with regard to overloading keywords.

I have not tested having the keyword overloads in an imported package, but I assume it would produce the same results.

Of course some may feel that keyword overloading is an advantage, but I think it is not, particularly for the beginning programmer.

Bill


On 3/12/2013 8:30 AM, Bill Fenlason wrote:
Mike,

On 3/12/2013 3:55 AM, Mike Cowlishaw wrote:
Bill wrote: 
 
 What I mean is that the same NetRexx statement is either valid or an error depending on the dynamic execution environment - that is, can depend on exactly what is available at runtime and the current state of affairs within the program execution.  See page 79 of the language reference manual ("Keyword Instructions").  Also, because you can overload a keyword with a method, you can not determine if the statement is in error until the runtime environment is available.  (Consider a program which accesses a Java method named "end" :).
 
Is this really true?   I thought it was only variables (including arguments & properties).  Since those are statically determined there shouldn't be any dependence on runtime environment.  Or maybe I'm forgetting something again.


Mike, I may be wrong on that.  I must admit that I have not tried it, but it is based on my understand of the determination of keywords.  I may have it confused in my memory - my current medications seem to encourage that :).  I will test this later today when I have time.

I think keywords should have priority over variable names IN SITUATIONS WHERE THE KEYWORD IS VALID, but NetRexx does not work that way.  I think that in the example on page 79, the fact that the second  "say 'Hello' " is in error rates high on the astonishment scale.  
 
This means the programmer has to learn all the keywords in the language ... but I agree a more complicated rule might be possible.  However it wasn't clear whether the syntax (effectively "3 'Hello') might _not_ be an error in the future, especially as an expression like that is valid in Classic Rexx, e.g., the second line of:
 
  options='open binary stream'
  options args
 
would issue a command in Rexx but be a keyword instruction in (modified) NetRexx.   


 Considering how few keywords Rexx has, I doubt that learning the keywords would be a problem. 

I think that the statement "options args" that issues a command also ranks pretty high on the astonishment scale.

I agree that assuming that all versions of NetRexx (now and in the future) should be parsable without language version information will certainly allow breakage.  That is why I suggest that the language version (if not the original) be explicit in the heading.  I see no difficulty in insisting that the first comment token be "NetRexx" and the next non-whitespace token sequence be a language level indication (e.g. /*  NetRexx  3.2  */).

 
Mike 


Bill


_______________________________________________
Ibm-netrexx mailing list
[hidden email]
Online Archive : http://ibm-netrexx.215625.n3.nabble.com/





--
"One can live magnificently in this world if one knows how to work and how to love."  --  Leo Tolstoy


Re: AST, BNF, ANTLR

ThSITC
In reply to this post by billfen
Hi Bill,
   as far as I know, *begin* is *not a keyword* in NetRexx!
... *or* did I miss something?
Happy Sunday, all, anyway!
Thomas.
==================================================================================


On 16.03.2013 19:46, Bill Fenlason wrote:
Sorry for the delay with this.  A simple test shows that keywords can be overloaded with methods.  Here is a snip from a program named "keyword_test.nrx" :

/* NetRexx 3 */

begin

loop  i = 1 to 5
   say "\n Loop iteration" i
  
   if i = 3 then do
      say "   Leave in iteration 3"
      leave
      end
  
   if i > 3 then say "   Why am I still in this loop?"
   end

say "\n Now exit"
exit
say "   Why was the exit was not done?\n"


/* Remainder of the program */
method begin static
say "\n Begin Test"

------------------- snip (what has been omitted here?) ----------------------
...

Here is the somewhat surprising output from interpreting this program:

===== Exec: keyword_test =====

 Begin Test

 Loop iteration 1

 Loop iteration 2

 Loop iteration 3
   Leave in iteration 3

 Loop iteration 4
   Why am I still in this loop?

 Loop iteration 5
   Why am I still in this loop?

 Now exit
   Why was the exit was not done?

Processing of 'keyword_test.nrx' complete


And here is the rest of the program (not shown above):
...
-------------------- snip (what has been omitted here?) ----------------------

/* Method with the same name as leave keyword */
method leave static
--say "Leave method entered"

/* Method with the same name as exit keyword */
method exit static
--say "Exit method entered"

------------------------------------------------------------------------------------

I was wrong about a method named "end" since the mismatch with "loop" etc. will be caught.  But I think this example shows the astonishment factor with regard to overloading keywords.

I have not tested having the keyword overloads in an imported package, but I assume it would produce the same results.

Of course some may feel that keyword overloading is an advantage, but I think it is not, particularly for the beginning programmer.

Bill





--
Thomas Schneider, IT Consulting; http://www.thsitc.com; Vienna, Austria, Europe


Thomas Schneider, Vienna, Austria (Europe) :-)

www.thsitc.com
www.db-123.com

Re: AST, BNF, ANTLR

Mike Cowlishaw
In reply to this post by George Hovey-2
George wrote: 
 
Not breaking old programs seems a worthy feature indeed.  Can or should NetRexx warn that a program uses current keywords in the way you describe?  This could be useful to someone given the task of bringing ancient code up to date. 
 
George, it doesn't need to.  Adding new keywords to NetRexx won't break old programs, whether they be instruction keywords or the host of sub-keywords.
 
But yes one could add an option to the compiler that warned whenever a variable name 'conflicted' with a keyword ... but that would need a 'central registry' of such keywords (at present, a new or experimental instruction can be added to NetRexx without the rest of the compiler needing to know anything about its internal syntax, I think).
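A minimal sketch of what such a warning option could look like, written in Python rather than NetRexx for brevity. The registry contents, the function name, and the case-insensitive comparison are all assumptions made for illustration; the translator has no such central list today, as noted above.

```python
# Hypothetical central registry of keywords. Illustrative subset only;
# no such registry exists in the real translator.
KEYWORD_REGISTRY = {"say", "loop", "leave", "exit", "if", "then",
                    "else", "do", "end", "options"}


def keyword_conflicts(declared_names, registry=KEYWORD_REGISTRY):
    """Return the declared names that collide with a registered keyword.

    Comparison is case-insensitive (an assumption of this sketch).
    """
    return sorted({name for name in declared_names
                   if name.lower() in registry})


if __name__ == "__main__":
    # Names a program might declare as variables or methods.
    names = ["count", "Leave", "exit", "total"]
    for name in keyword_conflicts(names):
        print(f"warning: name '{name}' conflicts with keyword "
              f"'{name.lower()}'")
```

The check itself is trivial; the cost is in maintaining the registry, which is exactly the coupling the current design avoids.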
 
Mike


Re: AST, BNF, ANTLR

billfen
In reply to this post by Mike Cowlishaw
Mike,

I don't believe that I've misunderstood you, but I do believe that we have a disagreement about how best to handle the language (keyword) extensibility problem.

You have described the problem quite clearly - in many programming languages breakage can occur when new keywords are added.  In other words, old programs do not work as they originally did.

My point is that the NetRexx approach to distinguishing keywords from variable names has significant downsides. 

Of course NetRexx 3 is not about to change - as the saying goes "It is what it is".  I'm not advocating any change, although if a new Rexx dialect is developed I am advocating that  the breakage problem be handled differently.  This is a philosophical discussion, not a practical one.

I think it is important to point out that with careful design, a programming language can totally avoid the breakage problem.  The best example is PL/I.  In its 50-year history, PL/I has added many dozens of keywords to the language, but as far as I know, there has never been an instance of breakage.  Why? Because keywords are never identified by examining them!  In other words, tokens are identified as keywords by syntax context rather than content, and that is why keywords are never confused with variables of the same name.  The downside is that the language has lots of parens and commas, and sometimes an unnatural feel.  I know that you are well aware of this, Mike, but some other readers may not be.  PL/I was famously ridiculed for its acceptance of the perfectly valid statement:
     "if if = then then then = else; else else = if;". 
The mind sees "if", "then" and "else" as keywords and not variable names.  The fact that the situation was a byproduct of avoiding the breakage problem was generally not acknowledged.

The crucial point is that in any language that avoids the breakage problem, the separation of keywords and variable names comes first.  Then the keyword token in question is compared with the list of known keywords.  If a token which is known to be a keyword is not within the list of known keywords (for that version of the language), it is an "invalid keyword" situation.  It is not presumed to be a variable name.

Most other languages use the "reserved keyword" approach.  Keywords are identified by comparing tokens with a list of words, and anything that matches is a keyword, anything that doesn't match is a variable name, and "never the twain shall meet".  In that case, breakage will always occur in places where keywords and variables can occur in the same location. 

As you know, NetRexx takes the opposite approach.  It compares tokens with a dynamically computed list of variable and method names (i.e. everything that is not a keyword).  If the token is not within that list, the token is judged to be a keyword.  Then if the keyword is not within the list of known keywords, it is an "invalid keyword" situation.  Thus NetRexx, like PL/I, avoids the breakage problem.
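The difference between the reserved-word approach and the names-first approach can be sketched in a few lines. This is Python, not NetRexx, and it is only a toy model under stated assumptions: the keyword sets are small illustrative subsets, "v4" is a hypothetical later language level, and the real translator's lookup is far more involved.

```python
# Illustrative subset of keywords for a hypothetical "level 3" language.
KEYWORDS_V3 = {"say", "loop", "leave", "exit"}


def classify_reserved(token, keywords):
    """Reserved-word approach: the keyword list is consulted first."""
    return "keyword" if token in keywords else "name"


def classify_names_first(token, known_names, keywords):
    """Names-first approach: a known variable or method name wins."""
    if token in known_names:
        return "name"
    if token in keywords:
        return "keyword"
    return "invalid keyword"


if __name__ == "__main__":
    # An old program defines a method called "break"; a hypothetical
    # later language level then adds "break" as a keyword.
    known_names = {"break"}
    keywords_v4 = KEYWORDS_V3 | {"break"}

    # Reserved words: the meaning of the old program changes (breakage).
    print(classify_reserved("break", KEYWORDS_V3))
    print(classify_reserved("break", keywords_v4))

    # Names first: the old program keeps its meaning at both levels.
    print(classify_names_first("break", known_names, KEYWORDS_V3))
    print(classify_names_first("break", known_names, keywords_v4))

    # The price: an existing keyword can be overloaded by a name.
    print(classify_names_first("leave", {"leave"}, KEYWORDS_V3))
```

Note that `classify_names_first` needs `known_names` as an input, so any tool using it must first compute the set of visible names.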

In my opinion, here are some downsides of the NetRexx approach.

First, as I tried to point out, by giving variable names priority over keywords, keywords may be overloaded.  In my view, that is a bad idea for a language which strives for simplicity and low "astonishment" levels.

It is a natural tendency for programmers (particularly those with experience in other languages) to recognize keywords by content.  In other words, when reading "options args", options is assumed to be a keyword.  Allowing any other interpretation is simply confusing. 

Second, using a dynamic list of available variable names locks the program into its execution environment.  The example on page 79 of TRL contains two occurrences of "say 'hello' " in the same short program.  The first is valid, and the second is an error.  In my opinion, that is confusing and a bad idea.  It is, of course, a byproduct of the way that NetRexx avoids the breakage problem.

Consider the following program:

/* NetRexx 4 */
import some.package.
please
explain
this
program

What does a person familiar with only NetRexx 3 make of this?  Each of the words might be a method or a new keyword added in version 4 of the language. 

Third, using a dynamic list of available variable names not only locks the program into its execution environment, it also locks any other program which attempts to correctly process a NetRexx source file into the execution environment.  That means that any formatter, pretty printer, statistics gatherer, intelligent editor etc. for NetRexx must also include the same logic that the translator uses.  It must dynamically determine everything which is not a keyword in order to identify keywords.  I think that is unfortunate since it makes the development of peripheral NetRexx processors more difficult or impossible.

Finally, it makes the language difficult, if not impossible, to define in a formal manner with BNF or another formal definition method.  While some may feel this is actually an advantage (!), the truth is that it makes standardization difficult.  Essentially all compilable programming languages have formal definitions.

The overall problem of language versions is a complex one, since every change to a language in effect defines a new language.  The assumption that any future NetRexx processor should be able to correctly process every NetRexx program, without knowing which version of the language it was written in, is laudable.  I believe it is not worth it, however, if it requires the current NetRexx method of identifying keywords.

As I have suggested, I believe in adopting the approach that NetRexx programs should identify themselves.  In HTML web pages, the very first thing is a DOCTYPE declaration of exactly what language the page is written in.  I think the same approach could be adopted for NetRexx so that if at some later point the method of identifying keywords is changed, it could be accommodated.  I suggest that the language level should be included in the initial comment or in an option which must be specified before the remainder of the program.  Existing NetRexx programs would, of course, default to the current language version.
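A rough sketch of how a tool might read such a language-level declaration from the initial comment, in Python. The header convention is the one suggested in this thread (for example /*  NetRexx  3.2  */); the regular expression, the function name, and the default level of "3" are assumptions of the sketch, not an existing NetRexx feature.

```python
import re

# Hypothetical convention: the program starts with a comment whose first
# token is "NetRexx", followed by a language level such as 3 or 3.2.
HEADER = re.compile(r"^\s*/\*\s*NetRexx\s+(\d+(?:\.\d+)*)\s*\*/",
                    re.IGNORECASE)


def language_level(source, default="3"):
    """Return the declared language level, or the default if none is given."""
    match = HEADER.match(source)
    return match.group(1) if match else default


if __name__ == "__main__":
    print(language_level("/*  NetRexx  3.2  */\nsay 'Hello'"))
    print(language_level("/* NetRexx 4 */\nimport some.package."))
    print(language_level("say 'Hello'"))  # no header: falls back to default
```

A processor could then pick its keyword rules from the returned level before reading the rest of the source.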

As I said, you and I just disagree on this, Mike.  I don't expect you to change anything, but I do hope you will give it some (more) serious thought.

Bill


On 3/16/2013 3:36 PM, Mike Cowlishaw wrote:
Bill, very many thanks for putting this example together.
 
What is important here is that -- yes -- local names act as the writer expected, regardless of new 'keywords' added to the language later.   Suppose that NetRexx did not have the LEAVE instruction (or the programmer did not know about it).  A programmer might have written the code as you show.  And it would have worked as you demonstrate. 
 
Then .. ten years later .. a NetRexx developer thinks .. hey, it would be nice to add LEAVE to the language and this is now a reserved keyword.   At that point the program below is broken -- long after the original programmer has left the company/stopped programming.  However, with the current rules the working program would be unaffected.
 
Or, suppose the method were called 'break' .. used by the programmer because NetRexx used the different 'keyword' "leave" -- and then later NetRexx developers decided to add 'break' as a synonym to 'leave' to help C and Java programmers?
 
Perhaps we are misunderstanding each other.  The design intent of NetRexx (in this respect) was that later additions to the language would not affect/break programs already written.  And, as far as possible, external changes to classes and superclasses should equally not invalidate a correct NetRexx program.  Kermit's changes may help strengthen the latter, for example.
 
Mike



Re: AST, BNF, ANTLR

Kermit Kiser
In reply to this post by rvjansen
René and all --

I have followed this discussion with interest and although I cannot say that I understand it all, I have drawn some conclusions and I do have a recommendation.

It seems that some people believe that a formal grammar definition for NetRexx is possible and that is somewhat supported by the ANSI standard for Rexx which RexxLA provides here:

http://www.rexxla.org/rexxlang/standards/j18pub.pdf

The above document contains a BNF definition for classic Rexx. However I have no idea if it could be converted into a NetRexx definition or if such an item would be of any use in automated systems.

Therefore, given the currently limited development resources of the NetRexx community, I suggest that we define an API for the NetRexx translator which passes its AST-equivalent data structures to programming tools that want to walk the tree and perform highlighting or other special processing of source code. This would allow different tools to use the same standard NetRexx parsing system without having to develop an independent parsing approach (a project which we have all agreed is difficult). I grant that this would probably not provide the piecewise recompilation that Bill desires, but it would still enable many advanced tools to be developed for NetRexx programmers, such as refactoring editors etc.
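To make the idea concrete, here is a small Python sketch of a tool consuming such a tree. Everything in it is hypothetical: the `Node` shape, the node kinds, and the `highlight` function are invented for illustration and do not reflect the translator's actual data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node as a translator API might hand it to a tool."""
    kind: str                      # e.g. "clause", "keyword", "name", "string"
    text: str = ""                 # source text for leaf tokens
    children: list = field(default_factory=list)


def walk(node):
    """Yield every node depth-first, in source order."""
    yield node
    for child in node.children:
        yield from walk(child)


def highlight(root, styles):
    """Render leaf tokens, wrapping kinds that have a style in a tag.

    No HTML escaping is done; this is a sketch, not a real highlighter.
    """
    parts = []
    for node in walk(root):
        if node.children or not node.text:
            continue  # only leaf tokens carry source text
        tag = styles.get(node.kind)
        parts.append(f"<{tag}>{node.text}</{tag}>" if tag else node.text)
    return " ".join(parts)


if __name__ == "__main__":
    tree = Node("clause", children=[Node("keyword", "say"),
                                    Node("string", "'Hello'")])
    print(highlight(tree, {"keyword": "b"}))
```

The tool never decides what is a keyword; it relies on the `kind` the translator already assigned, which is the point of sharing one parsing system.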

Does that approach seem feasible?

-- Kermit


On 3/11/2013 2:34 AM, René Jansen wrote:
Not that I can really help. I have been wondering the same.

A case in point is the Chad Slaughter ANTLR grammar for NetRexx. It is 'almost correct', and should be correctable. In the days of the 'open source parallel implementation of NetRexx' (nothing ever came of it) I tried to interest the maintainers of ANTLR with $1000 to correct and maintain it. They probably had enough money OR it is too hard. (The offer has since been rescinded, thanks).

Bill's JavaCC grammar works, otherwise the Eclipse plugin would not display such good syntax coloring and parsing. It might miss some of the things you need to build a whole compiler, though.
I read that Bill thinks it needs polishing before public consumption, but I have not often seen polished things that still work - and when they work, they'll get dirty again.

The fact that NetRexx's parser is hand-written does not, in itself, prove that a parser generator cannot handle it. The ooRexx parser is lex/yacc, I think, and it handles something very akin to NetRexx.

The problem with ANTLR is in the 'keyword-less' approach of NetRexx. I spoke to Terence Parr ages ago and he said that, however hard, it is no showstopper. Also years ago, I invested time in this, with a depressing outcome. Normally, I am used to things happening when I work on them. This is not the case with grammar tools. The working grammars I produced (one for an SQL tool in NetRexx called nsql, which I might publish one day, and one for a special-purpose modelling language called 'bint') I have to keep in version management and check in after every character I change - grammar files are that brittle, and even looking at them makes them stop working. Also, ANTLR has had a number of versions that changed everything.

It might be that I am just not smart enough. I have a friend with a degree in exactly this and I will forward your questions.

best regards,

René.



Re: AST, BNF, ANTLR

Tom Maynard
On 18/3/13 0:35, Kermit Kiser wrote:
I suggest that we define an API for the NetRexx translator which passes its AST-equivalent data structures to programming tools that want to walk the tree and perform highlighting or other special processing of source code.

As an only mildly interested bystander -- since in all likelihood I'll never refer to the AST -- this seems like an extremely suitable compromise.  Let 'the next guy' take up the task of applying this resource to future tools.  At least it will exist -- more than can be said today.

I would add one caveat, however: This project should have a significantly lower priority than normal NetRexx development work (but I doubt that I needed to stipulate that).

Excellent idea, Kermit.
Tom.



Re: AST, BNF, ANTLR

billfen
In reply to this post by Kermit Kiser
There were several different implementations of Classic Rexx, including Regina which is perhaps the most used.  It seems to me that without the standard definition for Rexx and the ability to use automated scanner and parser generators (as well as hand written ones), a number of the Rexx implementations might not have occurred.  In my opinion those implementations were of significant benefit to the Rexx language and its usage.

I do not believe an effective and workable standard definition for NetRexx (using BNF or another grammar definition method) is possible.  Keywords can not be defined without adding caveats such as: "The word "exit" shall not be recognized in this context if there exists any active variable or method named "exit" when the word "exit" is encountered in the source input."

To clarify, a processor built with automated tools assumes that keywords take priority over variable and method names.  It can be effective only because keywords are seldom used as variable or method names.

Adding an AST generation capability to the NetRexx translator (along with an API) is certainly possible, although that in effect means that the whole translator will be included in any additional NetRexx processing tool.  I doubt that method would be suitable for dynamic tools such as intelligent editors, etc.

As I mentioned in my prior append, this is a philosophical discussion.  I think that in the long run, NetRexx would be a better language if the priority of keywords and names was definable (perhaps by option) and a convention for the specification of language level were added.  I believe that the easier it is to implement NetRexx tools (including compilers, etc) the better.  Clearly this is just my opinion, others may disagree, and of course I might be wrong :)

Bill

On 3/18/2013 1:35 AM, Kermit Kiser wrote:
René and all --

I have followed this discussion with interest and although I cannot say that I understand it all, I have drawn some conclusions and I do have a recommendation.

It seems that some people believe that a formal grammar definition for NetRexx is possible and that is somewhat supported by the ANSI standard for Rexx which RexxLA provides here:

http://www.rexxla.org/rexxlang/standards/j18pub.pdf

The above document contains a BNF definition for classic Rexx. However I have no idea if it could be converted into a NetRexx definition or if such an item would be of any use in automated systems.

Therefore, given the currently limited development resources of the NetRexx community, I suggest that we define an API for the NetRexx translator which passes its AST-equivalent data structures to programming tools that want to walk the tree and perform highlighting or other special processing of source code. This would allow different tools to use the same standard NetRexx parsing system without having to develop an independent parsing approach (a project which we have all agreed is difficult). I grant that this would probably not provide the piecewise recompilation that Bill desires, but it would still enable many advanced tools to be developed for NetRexx programmers, such as refactoring editors etc.

Does that approach seem feasible?

-- Kermit


On 3/11/2013 2:34 AM, René Jansen wrote:
Not that I can really help. I have been wondering the same.

A case in point here is the Chad Slaughter ANTLR grammar for NetRexx. It is 'almost correct', and should be correctable. In the days of the 'open source parallel implementation of NetRexx' (nothing ever came from it) I tried to interest the maintainers of ANTLR with $1000 to correct and maintain it. They probably had enough money OR it is too hard. (The offer has since been rescinded, thanks).

Bill's javacc grammar works, otherwise the Eclipse plugin would not display such good syntax coloring and parsing. It might miss some of the things you need to build a whole compiler, though.
I read Bill thinks it needs polishing before public consumption, but I have not often seen polished things that still work - and when they work, they'll get dirty again.

The fact that NetRexx's parser is hand-written does not, in itself, prove that a parser generator cannot handle it. The ooRexx parser is lex/yacc I think and it handles something very akin to NetRexx.

The problem with ANTLR is in the 'keyword-less' approach of NetRexx. I spoke to Terence Parr ages ago and he said that, however hard, it is no showstopper. Also years ago, I invested time in this, with a depressing outcome. Normally, I am used to things happening when I work on them. This is not the case with grammar tools. The working grammars I produced (one for an SQL tool in NetRexx called nsql, which I might publish one day, and one for a special-purpose modelling language called 'bint') I have to keep in version management and check in after every character I change: grammar files are that brittle, and even looking at them makes them stop working. Also, ANTLR has had a number of versions that changed everything.

It might be that I am just not smart enough. I have a friend with a degree in exactly this and I will forward your questions.

best regards,

René.


On 11 Mar. 2013, at 13:09, Kermit Kiser <[hidden email]> wrote:

Please bear with me if my question below seems elementary but my formal computer science education is mostly limited to the undergraduate level.

Although I have written uncountable parsing routines, my knowledge of the formal terminology and algorithms of parsers and compilers is quite limited. That may be part of the reason that I have not yet attempted any enhancements to the NetRexx language. I think I can see the use of the AST and CST structures in the translator but I don't have a solid enough grasp on how they interact to modify them. Bill's comment indicates that the NetRexx AST is not a "full AST" but I have no idea what that means.

I have seen several requests for an ANTLR grammar or a BNF definition for NetRexx. But Wikipedia says this about ANTLR:

A language is specified using a context-free grammar which is expressed using Extended Backus–Naur Form (EBNF).

and this about EBNF:

In computer science, Extended Backus–Naur Form (EBNF) is a family of metasyntax notations used for expressing context-free grammars

Yet I have seen comments indicating that Rexx is not a context-free language and I think that applies equally to NetRexx, as its keywords are also not generally reserved outside of their context, for example.
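
To make the "keywords are not reserved" point concrete, here is a small Java sketch of a context-dependent classifier. The rule it applies is a deliberate simplification assumed for illustration (the first symbol of a clause is a keyword unless it is being assigned to or is already a known variable); it is not taken from the NetRexx language definition, and `KeywordContextSketch`, `classifyFirst` and the four-word keyword list are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class KeywordContextSketch {
    // Small illustrative subset; not the full NetRexx keyword list.
    static final Set<String> KEYWORDS =
        new HashSet<>(Arrays.asList("if", "loop", "say", "return"));

    // Classify the first token of a clause under the simplified rule above.
    // knownVariables is assumed to hold lower-cased variable names.
    static String classifyFirst(List<String> clause, Set<String> knownVariables) {
        String first = clause.get(0).toLowerCase();
        boolean assignment = clause.size() > 1 && clause.get(1).startsWith("=");
        if (assignment) {
            return "variable";   // "say = 1" assigns to a variable called say
        }
        if (knownVariables.contains(first)) {
            return "variable";   // an existing variable hides the keyword
        }
        if (KEYWORDS.contains(first)) {
            return "keyword";
        }
        return "name";
    }

    public static void main(String[] args) {
        Set<String> none = new HashSet<>();
        Set<String> vars = new HashSet<>(Arrays.asList("say"));
        System.out.println(classifyFirst(Arrays.asList("say", "hello"), none));
        System.out.println(classifyFirst(Arrays.asList("say", "=", "1"), none));
        System.out.println(classifyFirst(Arrays.asList("say", "hello"), vars));
    }
}
```

The same two tokens, `say hello`, are classified differently depending on the set of known variables. That dependence on state outside the token stream is the kind of thing a plain grammar file cannot express on its own, and it is why grammar tools usually need hand-written predicates or lexer feedback for languages like this.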

To confuse things further, I have seen multiple comments indicating that there IS a BNF definition of Rexx in the ANSI standard for that language which RexxLA is said to maintain a copy of, but which I cannot find as the RexxLA site seems to be inaccessible currently.

So here is my question: Is it really possible to create an ANTLR grammar or BNF definition for NetRexx?

-- Kermit


On 3/8/2013 6:37 AM, Bill Fenlason wrote:
My understanding of the translator logic and data structures isn't complete enough to be sure of what the best approach would be.

When I was looking at it, it appeared to me that a full AST construction might not be easy to add to the translator.  Ideally a secondary processor (like to generate HTML, different output codes, formatting and pretty printing, etc.) could work just by walking the tree.  The design choice is between generating an AST which is processed by different applications or imbedding code in the translator to do the various output generations.

The AST that I currently generate in the Eclipse plugin certainly needs refinement.  It's a bit of a kludge at this point.  Eclipse builds ASTs for the Java code, and one obvious approach would be to have the translator actually generate the Java ASTs, but understanding how to do that would take a great deal of research. One up side of that approach would be that NetRexx code could be debugged using the (very sophisticated) Eclipse debugger.

The ultra-dynamic nature of NetRexx is a complication of course.

Bill

On 3/8/2013 9:21 AM, René Jansen wrote:
Yes I thought of that too. Bill has a parser for the Eclipse tool; the open sourcing was a bit late as I can imagine he would have rather used the translator itself. I was just asking around because I want to go to production any day now and perhaps someone had something around that fit the bill.

I agree that the translator itself would be the road to do the best syntax colouring possible.

best regards,

René.
On 8 Mar. 2013, at 14:24, "Mike Cowlishaw" [hidden email] wrote:

Who has, or knows of, a NetRexx program that translates NetRexx source to syntax coloured source in HTML? It has to be quite quick, also.

I would be very interested in this for a new NetRexx website I am working on.

Maybe a nice option for the NetRexx compiler.  Instead of emitting Java, a version that emitted HTML.  Given that the compiler has access to semantics as well as syntax it could do a lot more .. e.g., different colours (or minor variations of colours) for different types.  ints a different colour than floats.  Local variables a different font than inherited ones.  Italic or bold used for other attributes, etc.

Lots of possibilities :-).
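
A rough Java sketch of the HTML-emitting idea: each token arrives already tagged with a kind (which in the real case would come from the compiler's syntax and semantic analysis) and is wrapped in a span whose CSS class names that kind. The class `HtmlColourSketch`, the `nrx-` class prefix and the kind names are all made up for illustration; how the translator would actually expose token kinds is not specified anywhere in this thread.

```java
import java.util.Arrays;
import java.util.List;

final class HtmlColourSketch {
    // Escape the characters that are special in HTML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Wrap one token in a span whose CSS class names its kind.
    // Tokens of kind "plain" (whitespace, punctuation) are emitted without a span.
    static String span(String kind, String text) {
        if (kind.equals("plain")) {
            return escape(text);
        }
        return "<span class=\"nrx-" + kind + "\">" + escape(text) + "</span>";
    }

    // tokens: each element is a two-element array {kind, text}, in source order.
    static String render(List<String[]> tokens) {
        StringBuilder html = new StringBuilder("<pre>");
        for (String[] token : tokens) {
            html.append(span(token[0], token[1]));
        }
        return html.append("</pre>").toString();
    }

    public static void main(String[] args) {
        // Hypothetical kinds: a keyword, a local variable and an int literal.
        List<String[]> tokens = Arrays.asList(
            new String[] {"keyword", "say"},
            new String[] {"plain", " "},
            new String[] {"local", "count"},
            new String[] {"plain", " + "},
            new String[] {"int", "1"});
        System.out.println(render(tokens));
    }
}
```

Keeping the colours in a stylesheet keyed on the class names, rather than in the emitter, would let a website restyle ints, floats, locals and inherited variables without rerunning the translator.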

Mike



_______________________________________________
Ibm-netrexx mailing list
[hidden email]
Online Archive : http://ibm-netrexx.215625.n3.nabble.com/






