ibm-netrexx

Nested List Support?

billfen

Nested List Support?

This question is primarily for Mike, but I'm sure others of you will
have comments or suggestions.

In some other programming languages (like lisp and python), a "list" is
a fundamental concept.

Since the Rexx family of languages is string oriented, it is not
uncommon to encode a list within a string with a separator character,
and to process the list with something like:

parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a
list may be encoded with specific matched start and end characters and a
separator character (e.g. '(' and ')' and ',' ). If the lists may be
nested, there does not appear to be any easy way to parse it while still
retaining the list structure.

One example would be the parsing of source containing expressions or
argument lists.

e.g. "fun( arg1(arg1a, arg1b), arg2(((nested)), z) ) ...."

Another example might be an encoding approach which uses '{', '}' and
'`' for generic list encoding.
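
To make the difficulty concrete, here is a minimal sketch that applies the parse idiom above first to a flat list and then to a nested one. The method name splitFlat and both sample strings are made up for illustration; only the parse line itself comes from the post.

```netrexx
-- Sketch only: splitFlat and the sample strings are hypothetical.
splitFlat('alpha, beta, gamma')
splitFlat('a, (b, c), d')

method splitFlat(list_contents) static
  say 'Input:' list_contents
  loop while list_contents \= ''
    -- peel off everything up to the first comma, keep the rest for the next pass
    parse list_contents list_item ',' list_contents
    say '  item: [' || list_item.strip() || ']'
  end
```

The first call reports the three items as expected. The second reports four items, with "(b" and "c)" among them, because the comma inside the parentheses is treated like any other separator and the nested list is torn apart.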

The question is, if NetRexx or Rexx were to be extended to allow
convenient parsing of nested lists, how should it be approached?

1) Provide a new statement like "parselist", similar to the parse
statement but which allows the specification of beginning and ending
characters. The scanning would check for matched pairs and process
appropriately.

2) Extend the parse statement by providing a new type of pattern,
perhaps a function notation which calls a function to scan ahead,
skipping the contents of the data within matched beginning and ending
characters.

3) Provide a built in function which perhaps includes the parse
statement arguments.

4) Provide a new type of list object, and have the parse statement
understand its structure.

5) Some other approach?

6) Ignore this problem.

I assume this topic has been discussed before, but I must have missed it
or I don't remember it. It seems to me that this is a general weakness
in NetRexx.

How have other people handled this problem in the past?

The topic came up for me when I tried to convert a python program to
NetRexx.

_______________________________________________
Ibm-netrexx mailing list
[hidden email]
Online Archive : http://ibm-netrexx.215625.n3.nabble.com/

rvjansen

Re: Nested List Support?

Bill,

5) just my 2 cents here:

Something that I use often and which seems relevant for this, is the 'recursive parse' stolen from lisp car and cdr:

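loop while cdr.words() > 0   -- the while condition must evaluate to 0 or 1
  parse cdr car cdr          -- car = first word, cdr = the remaining words
  car.something()            -- placeholder: process the element here
end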

or some variations thereof

The trick here is to take off the first element and leave the rest for the next iteration; did this already in classic Rexx years ago.

best regards,

René.


billfen

Re: Nested List Support?

Rene,

I can see how that might be used if the input consists of blank
delimited words, but I'm not sure I understand how you would parse the
example I provided - one character at a time?

If you don't know if the next special character is a begin list, end
list or separator, how would you specify the template? Of course you
could parse the remainder 3 different times and compare lengths etc.,
but that seems a bit of a kludge, as would parsing 1 character at a time.

What I'm looking for is an outer loop that parses the input into
elements or lists (which may contain additional lists).

Bill


Kermit Kiser

Re: Nested List Support?

Hi Bill --

Interesting problem you propose. I wonder if it is related to the
RegRexx project and mailing list that RexxLA started a couple years ago
and which Rony has been trying to resurrect this year:

http://rice.safedataisp.net/mailman/listinfo/regrexx

Given that language parsing is one of the largest areas of computer programming problems, I don't see that it makes any sense to try and boil it all down to one instruction or method that could be added to a language. On the other hand, some helpful aids like RegRexx might be useful for this kind of thing. If I get time, I may look into that.

Meanwhile, your request is not well defined in terms of input format (Can list items span embedded sublists? Can the list item separator be omitted after a sublist? Can multiple list separators be used? etc.) and you have not specified any output data structure at all. Looking at your example syntax, I made a few assumptions about input format and created a simple output format in order to devise a sample recursive code approach. Handling this type of syntax is way beyond what a parse instruction can do, and I think this example shows that the general case is not trivial. (This program is also interesting because it will not interpret correctly without the trace instruction I inserted. The compiled version does not care. I tested that all the way back to NetRexx 2.05, by the way!)

-- Kermit

-------------------------------------------- Sample parsing code for an interesting syntax example ----------------------------------------------
trace var x

in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"

parseout=parselist(in)

dump(parseout)

method dump(data=Rexx,key="",offset="",link="") static
    say offset key link "token="data["token"]
    loop i=1 to data["items"]
        dump(data[i],i,offset "--*","==>")
    end

method parselist(input=Rexx,delims=Rexx "(),",start="1") static
    in=Rexx(input)
    in["count"]=0 -- in order to work recursively, we will need to count how many characters are consumed at each step

    loop label scanloop itemno=1 while start<in.length -- need to loop in case multiple items
        deloc=in.substr(start).verify(delims,"match") -- locate next delimiter
        in[itemno]=" "
        if deloc=0 then do -- did not find a delimiter
            in[itemno,"token"]=in.substr(start)
            in["count"]=in["count"]+in.substr(start).length
            leave scanloop
        end
        else do -- found a delimiter
            if delims.pos(in.substr(start+deloc-1,1))=1 then do -- found a sublist - handle it recursively
                in[itemno]=parselist(in.substr(start+deloc),delims)
                in[itemno,"token"]=in.substr(start,deloc-1)
                deloc=deloc+in[itemno,"count"]
                in["count"]=in["count"]+deloc
                if start+deloc<in.length then do -- syntax rules are not clear but do allow for a second delimiter after a sublist
                    secdel=in.substr(start+deloc).verify(delims.substr(3,1),"match")
                    if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
                    deloc=deloc+secdel
                    in["count"]=in["count"]+secdel
                end
            end
            else do -- found end of item
                in[itemno,"token"]=in.substr(start,deloc-1)
                in["count"]=in["count"]+deloc
                if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop -- found end of list indicator
            end
        end

        start=start+deloc -- get next area to scan
    end scanloop
    in["items"]=itemno
    return in
-----------------------------------------------------------------------------------------------------------------------------
Program output:
-----------------------------------------------------------------------------------------------------------------------------

token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
--* 1 ==> token=fun
--* --* 1 ==> token= arg1
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* --* 1 ==> token=
--* --* --* --* 1 ==> token=
--* --* --* --* --* 1 ==> token=nested
--* --* --* --* 2 ==> token= z

-----------------------------------------------------------------------------------------------------------------------------


Mike Cowlishaw

Re: Nested List Support?

In reply to this post by billfen

> This question is primarily for Mike, but I'm sure others of
> you will have comments or suggestions.

OK, I'll try ... :-)

> In some other programming languages (like lisp and python), a
> "list" is a fundamental concept.
>
> Since the Rexx family of languages is string oriented, it is
> not uncommon to encode a list within a string with a
> separator character, and to process the list with something like:
>
> parse list_contents list_item "," list_contents

Yes .. a very useful 'pattern'.

The general issue of lists in [Net]Rexx is one I've struggled with on many
occasions. There is implicitly a strong notion of a 'list of strings' in Rexx
including argument lists (and the parsing thereof) and the lists in IF and WHEN
clauses. Yet there has never really been an obvious elegant way to merge the
concept with Indexed Strings (stem variables), because lists and indexed strings
are both collections yet lists have no 'clean' indices. It would be tempting to
allow something like

foo=10, 20, 30

which would set foo[1] to 10, foo[2] to 20, etc. But somehow that never felt
right. So lists remain 'poor cousins' of indexed strings.

> The problem arises when a list may contain other lists. For
> example, a list may be encoded with specific matched start
> and end characters and a separator character (e.g. '(' and
> ')' and ',' ). If the lists may be nested, there does not
> appear to be any easy way to parse it while still retaining
> the list structure.
>
> One example would be the parsing of source containing
> expressions or argument lists.
>
> eg. "fun( arg1(arg1a, arg1b), arg2(((nested)), z) ) ...."
>
> Another example might be an encoding approach which uses '{',
> '}' and '`' for generic list encoding.
>
> The question is, if NetRexx or Rexx were to be extended to
> allow convenient parsing of nested lists, how should it be approached?

This is perhaps just a different facet of parsing in general. Leaving lists
aside, parsing strings for matching pairs of delimiters is hard enough
(quote-delimited parameters in commands, for example) .. because there are so
many different rules for allowing delimiters and other characters in strings.
Doubling-up is common, as are escape characters -- but every language has subtle
differences here.

'Regular expressions' are not really the answer; when I was ECMAScript Editor I
was responsible for the first standardized Regular Expression syntax (close to
that of Perl). That exercise convinced me that regular expressions were too
complex for almost all programmers (let alone the 'average' programmer). We had
the top JavaScript implementers from Microsoft and Netscape (and others) in the
room -- and they often disagreed on how RegExp should work, even for quite
"straightforward" cases.

In short, the parsing in Rexx has gone just about as far as (I think) it should
go. There are already edge cases that confuse some programmers.

> 1) Provide a new statement like "parselist", similar to the
> parse statement but which allows the specification of
> beginning and ending characters. The scanning would check
> for matched pairs and process appropriately.
>
> 2) Extend the parse statement by providing a new type of
> pattern, perhaps a function notation which calls a function
> to scan ahead, skipping the contents of the data within
> matched beginning and ending characters.
>
> 3) Provide a built in function which perhaps includes the
> parse statement arguments.
>
> 4) Provide a new type of list object, and have the parse
> statement understand its structure.
>
> 5) Some other approach?
>
> 6) Ignore this problem.

I did make a proposal for Rexx that allowed 'intelligent parsers'. I forget the
details, but it was something on the lines of allowing a function call wherever
a pattern was allowed. The function would be called with the "string remaining"
and the arguments supplied in the template, and (as one possibility) returning a
split point. So something like:

parse var fred start pos('s') rest

would give the same result as

parse var fred start 's' +1 rest

With the right functions/rules this could be shown to allow nested strings of
various kinds, and might work for lists, too. But it was quite ugly, even
though it allowed a lot of flexibility from a relatively minor enhancement. But
how that could/should coexist with the existing (comma-based) list notation in
Rexx is tricky...
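
The details of that proposal are not given here, so the following is only a sketch of the kind of function such a template might call: it receives the string remaining and returns a split point. The name matchEnd, its arguments and the sample text are all assumptions, and since no such template syntax exists, the function is called by hand and the split is done with left and substr.

```netrexx
-- Sketch only: matchEnd is a hypothetical helper, not part of Rexx or NetRexx.
text = 'arg2(((nested)), z) tail'
p = matchEnd(text, '(', ')')
if p > 0 then do
  say 'matched:' text.left(p)
  say 'rest   :' text.substr(p + 1)
end

-- Return the position of the closer that balances the first opener in
-- "remaining", or 0 if there is no opener or the delimiters never balance.
method matchEnd(remaining, opener, closer) static returns Rexx
  depth = 0
  loop i = 1 to remaining.length()
    c = remaining.substr(i, 1)
    if c == opener then depth = depth + 1
    else if c == closer then do
      if depth = 0 then return 0     -- a closer before any opener
      depth = depth - 1
      if depth = 0 then return i     -- the outermost pair is now closed
    end
  end
  return 0
```

For the sample text the split point falls just after the closing parenthesis that follows "z", so the whole of arg2(((nested)), z) is kept together and only " tail" is left over. Quoting and escape rules, which the post notes differ from language to language, are deliberately ignored.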

Mike


Tom Maynard

Re: Nested List Support?

On 5/12/12 9:34, Mike Cowlishaw wrote:
> regular expressions were too
> complex for almost all programmers (let alone the 'average' programmer)

Some people, when confronted with a problem, think "I know,
I'll use regular expressions." Now they have two problems.

Jamie Zawinski, 12 Aug 1997


billfen

Re: Nested List Support?

In reply to this post by Kermit Kiser

Kermit,

The problem I was trying to address is that of encoding and parsing nested lists within a string. The example was just that - parsing a function call which happens to contain lists. Perhaps I could have been more clear and used a different example.

I do not want to parse the string in order as your code does - I want to encode the string such that the string contains either elements or lists, and then parse it into a sequence of either elements or lists.

As I said:

"In some other programming languages (like lisp and python), a "list" is a fundamental concept.

Since the Rexx family of languages is string oriented, it is not uncommon to encode a list within a string with a separator character, and to process the list with something like:

    parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a list may be encoded with specific matched start and end characters and a separator character (e.g. '(' and ')' and ',' ). If the lists may be nested, there does not appear to be any easy way to parse it while still retaining the list structure."

Possibly you skipped that last phrase and jumped to the example?

To reiterate, the problem is how to manipulate lists and nested lists in NetRexx. As I pointed out, the most natural way would be to somehow encode the lists into a string since Rexx is essentially string based. As I also pointed out in option 4, perhaps there should be some kind of list object instead.

It should be noted that list processing is a very powerful and widely used mechanism. Lisp was defined shortly after Fortran, and is still used as the initial language in many CS curricula (e.g. MIT). The ability to easily manipulate lists would make NetRexx richer. As I mentioned, I started to convert a python program to NetRexx and discovered that there was no easy way.

Here is another example:

Define an element to be a sequence of zero or more blank separated digits.

Define a list to be a sequence of zero or more elements or lists.

How does one encode a list into a string such that it can be easily parsed into its constituent elements and lists?

For example, suppose that we encode a list as a sequence of elements or lists surrounded by '{' and '}' and separated by ',' . Note this is just one way to encode a list - others could be used. Note that other element definitions can exist.

The encoded string " { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } " represents a list whose contents are elements 1 , 2 3 , an imbedded list, a null element and element 4 5.

When parsed and processed, the decoded string would produce the following:
    element 1
    element 2 3
    list { 9 8 , { 7 , 6 } }
    null element
    element 4 5

This is what I meant by retaining the list structure. Subsequent processing could decode the nested list.

The crux of the problem is how to parse content which contains matching start and end characters. In the example above, the first level list contains another list, and therefore another matched '{' '}' pair. There has to be some mechanism which matches the beginning and ending characters while the content may contain additional (nested) pairs.
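
The decoding described above can be pinned down with a short sketch. It is a plain character scan with a depth counter, which is exactly the one-character-at-a-time approach this thread hopes to avoid, so it is offered only to fix the expected result, not as a proposed solution. The names splitTop and describe are hypothetical, and the input is assumed to be well formed (balanced braces, one outer pair).

```netrexx
-- Sketch only: splitTop and describe are hypothetical names, not built-ins.
encoded = ' { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } '
parts = splitTop(encoded)
loop i = 1 to parts[0]
  say describe(parts[i])
end

-- Split the outermost list into its direct members; nested lists stay encoded.
method splitTop(encoded) static returns Rexx
  body = encoded.strip()
  body = body.substr(2, body.length() - 2)   -- drop the outer '{' and '}'
  parts = ''
  n = 0
  depth = 0
  item = ''
  loop i = 1 to body.length()
    c = body.substr(i, 1)
    if c == '{' then depth = depth + 1
    if c == '}' then depth = depth - 1
    if c == ',' & depth = 0 then do          -- a separator at the top level
      n = n + 1
      parts[n] = item.strip()
      item = ''
    end
    else item = item || c                    -- anything else belongs to the member
  end
  n = n + 1                                  -- the last member has no trailing ','
  parts[n] = item.strip()
  parts[0] = n                               -- member count kept under index 0
  return parts

-- Classify one member as a null element, a nested list or a plain element.
method describe(part) static returns Rexx
  if part == '' then return 'null element'
  if part.left(1) == '{' then return 'list' part
  return 'element' part
```

Run against the encoded string above, this lists element 1, element 2 3, list { 9 8 , { 7 , 6 } }, a null element and element 4 5, in that order. Calling splitTop again on the nested list decodes the next level. One limitation of the sketch: an empty list "{ }" is reported as a single null element.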

I'm not familiar with the RegRexx project, but from the name I assume it might be related to regular expressions? If that is the case it may not help, since standard regular expressions cannot by themselves be used to solve the nesting problem (i.e. ensuring that the delimiters are correctly matched).

Thanks for taking the time to write and test your code. I see that it has to scan the input one character at a time and the parse instruction is not used. My question was also asking if there was some way to solve this problem by extending the parse instruction.

Bill

On 12/5/2012 6:20 AM, Kermit Kiser wrote:

Hi Bill --

Interesting problem you propose. I wonder if it is related to the RegRexx project and mailing list that RexxLA started a couple years ago and which Rony has been trying to resurrect this year:

http://rice.safedataisp.net/mailman/listinfo/regrexx

Given that language parsing is one of the largest areas of computer programming problems, I don't see that it makes any sense to try and boil it all down to one instruction or method that could be added to a language. On the other hand, some helpful aids like RegRexx might be useful for this kind of thing. If I get time, I may look into that.

Meanwhile, your request is not well defined in terms of input format (can list items span embedded sublists? Can the list item separator be omitted after a sublist?, Can multiple list separators be used?, etc.) and you have not specified any output data structure at all. Looking at your example syntax, I made a few assumptions about input format and created a simple output format in order to devise a sample recursive code approach. Handling this type of syntax is way beyond what a parse instruction can do and I think this example shows that the general case is not trivial. (This program is also interesting because it will not interpret correctly without the trace instruction I inserted. The compiled version does not care. I tested that all the way back to NetRexx 2.05 by the way!)

-- Kermit

-------------------------------------------- Sample parsing code for an interesting syntax example ----------------------------------------------
trace var x

in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"

parseout=parselist(in)

dump(parseout)

method dump(data=Rexx,key="",offset="",link="") static
    say offset key link "token="data["token"]
    loop i=1 to data["items"]
        dump(data[i],i,offset "--*","==>")
        end

method parselist(input=Rexx,delims=Rexx "(),",start="1") static
    in=Rexx(input)
    in["count"]=0    --    in order to work recursively, we will need to count how many characters are consumed at each step

    loop label scanloop itemno=1 while start<in.length        --        need to loop in case multiple items
        deloc=in.substr(start).verify(delims,"match")        --    locate next delimiter
        in[itemno]=" "
        if deloc=0 then do --did not find a delimiter
             in[itemno,"token"]=in.substr(start)
             in["count"]=in["count"]+in.substr(start).length
             leave scanloop
            end
        else do --found a delimiter
            if delims.pos(in.substr(start+deloc-1,1))=1 then do        --        found a sublist - handle it recursively
                    in[itemno]=parselist(in.substr(start+deloc),delims)
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    deloc=deloc+in[itemno,"count"]
                    in["count"]=in["count"]+deloc
                    if start+deloc<in.length then do    --    syntax rules are not clear but do allow for a second delimiter after a sublist
                        secdel=in.substr(start+deloc).verify(delims.substr(3,1),"match")
                        if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
                        deloc=deloc+secdel
                        in["count"]=in["count"]+secdel
                        end
                    end
            else do    -- found end of item
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    in["count"]=in["count"]+deloc
                    if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    found end of list indicator
                    end
            end

        start=start+deloc        --        get next area to scan
        end scanloop
        in["items"]=itemno
        return in
-----------------------------------------------------------------------------------------------------------------------------
Program output:
-----------------------------------------------------------------------------------------------------------------------------

   token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
--* 1 ==> token=fun
--* --* 1 ==> token= arg1
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* --* 1 ==> token=
--* --* --* --* 1 ==> token=
--* --* --* --* --* 1 ==> token=nested
--* --* --* --* 2 ==> token= z

-----------------------------------------------------------------------------------------------------------------------------

On 12/4/2012 1:58 PM, Bill Fenlason wrote:

Rene,

I can see how that might be used if the input consists of blank delimited words, but I'm not sure I understand how you would parse the example I provided - one character at a time?

If you don't know if the next special character is a begin list, end list or separator, how would you specify the template? Of course you could parse the remainder 3 different times and compare lengths etc., but that seems a bit of a kludge, as would parsing 1 character at a time.

What I'm looking for is an outer loop that parses the input into elements or lists (which may contain additional lists).

Bill

On 12/4/2012 6:32 PM, René Jansen wrote:

Bill,

5) just my 2 cents here:

Something that I use often and which seems relevant for this, is the 'recursive parse' stolen from lisp car and cdr:

loop while cdr.words() > 0
   parse cdr car cdr
   car.something()    -- placeholder for whatever is done with each element
end

or some variations thereof

The trick here is to take off the first element and leave the rest for the next iteration; did this already in classic Rexx years ago.
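
For comparison, a minimal Python sketch of the same "take the first element, keep the rest" loop. The function name `process_words` is made up for illustration, and it assumes the elements are separated by blanks only, as in the NetRexx version.

```python
def process_words(text):
    """Peel off the first blank-delimited word (car) and keep the rest (cdr)."""
    seen = []
    cdr = text
    while cdr.split():                        # same test as cdr.words() > 0
        car, _, cdr = cdr.strip().partition(" ")
        seen.append(car)                      # stand-in for car.something()
    return seen


print(process_words("alpha beta  gamma"))     # ['alpha', 'beta', 'gamma']
```

As in the Rexx idiom, each pass consumes one element, so the loop ends when nothing but blanks remains.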

best regards,

René.

On 4 dec. 2012, at 19:09, Bill Fenlason [hidden email] wrote:

This question is primarily for Mike, but I'm sure others of you will have comments or suggestions.

In some other programming languages (like lisp and python), a "list" is a fundamental concept.

Since the Rexx family of languages is string oriented, it is not uncommon to encode a list within a string with a separator character, and to process the list with something like:

    parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a list may be encoded with specific matched start and end characters and a separator character (e.g. '(' and ')' and ',' ). If the lists may be nested, there does not appear to be any easy way to parse it while still retaining the list structure.

One example would be the parsing of source containing expressions or argument lists.

eg. "fun( arg1(arg1a, arg1b), arg2(((nested)), z) ) ...."

Another example might be an encoding approach which uses '{', '}' and '`' for generic list encoding.

The question is, if NetRexx or Rexx were to be extended to allow convenient parsing of nested lists, how should it be approached?

1) Provide a new statement like "parselist", similar to the parse statement but which allows the specification of beginning and ending characters. The scanning would check for matched pairs and process appropriately.

2) Extend the parse statement by providing a new type of pattern, perhaps a function notation which calls a function to scan ahead, skipping the contents of the data within matched beginning and ending characters.

3) Provide a built in function which perhaps includes the parse statement arguments.

4) Provide a new type of list object, and have the parse statement understand its structure.

5) Some other approach?

6) Ignore this problem.

I assume this topic has been discussed before, but I must have missed it or I don't remember it. It seems to me that this is a general weakness in NetRexx.

How have other people handled this problem in the past?

The topic came up for me when I tried to convert a python program to NetRexx.

_______________________________________________
Ibm-netrexx mailing list
[hidden email]
Online Archive : http://ibm-netrexx.215625.n3.nabble.com/

billfen

Re: Nested List Support?

In reply to this post by Mike Cowlishaw

Mike,

On 12/5/2012 10:34 AM, Mike Cowlishaw wrote:

>> In some other programming languages (like lisp and python), a
>> "list" is a fundamental concept.
>>
>> Since the Rexx family of languages is string oriented, it is
>> not uncommon to encode a list within a string with a
>> separator character, and to process the list with something like:
>>
>> parse list_contents list_item "," list_contents
> Yes .. a very useful 'pattern'.
>
> The general issue of lists in [Net]Rexx is one I've struggled with on many
> occasions. There is implicitly a strong notion of a 'list of strings' in Rexx
> including argument lists (and the parsing thereof) and the lists in IF and WHEN
> clauses. Yet there has never really been an obvious elegant way to merge the
> concept with Indexed Strings (stem variables), because lists and indexed strings
> are both collections yet lists have no 'clean' indices. It would be tempting to
> allow something like
>
> foo=10, 20, 30
>
> which would set foo[1] to 10, foo[2] to 20, etc. But somehow that never felt
> right. So lists remain 'poor cousins' of indexed strings.

I agree that encoding or converting a list of strings to an indexed
string is not the answer, since it does not handle the nesting problem.
The essence of list processing is that lists contain elements or other
lists.

>> The problem arises when a list may contain other lists. For
>> example, a list may be encoded with specific matched start
>> and end characters and a separator character (e.g. '(' and
>> ')' and ',' ). If the lists may be nested, there does not
>> appear to be any easy way to parse it while still retaining
>> the list structure.
>>
>> One example would be the parsing of source containing
>> expressions or argument lists.
>>
>> eg. "fun( arg1(arg1a, arg1b), arg2(((nested)), z) ) ...."
>>
>> Another example might be an encoding approach which uses '{',
>> '}' and '`' for generic list encoding.
>>
>> The question is, if NetRexx or Rexx were to be extended to
>> allow convenient parsing of nested lists, how should it be approached?
> This is perhaps just a different facet of parsing in general. Leaving lists
> aside, parsing strings for matching pairs of delimiters is hard enough
> (quote-delimited parameters in commands, for example) .. because there are so
> many different rules for allowing delimiters and other characters in strings.
> Doubling-up is common, as are escape characters -- but every language has subtle
> differences here.
>
> 'Regular expressions' are not really the answer; when I was ECMAScript Editor I
> was responsible for the first standardized Regular Expression syntax (close to
> that of Perl). That exercise convinced me that regular expressions were too
> complex for almost all programmers (let alone the 'average' programmer). We had
> the top JavaScript implementers from Microsoft and Netscape (and others) in the
> room -- and they often disagreed on how RegExp should work, even for quite
> "straightforward" cases.
>
> In short, the parsing in Rexx has gone just about as far as (I think) it should
> go. There are already edge cases that confuse some programmers.

I agree that the parse instruction, if extended, might be too
complicated (or too ugly).

>> 1) Provide a new statement like "parselist", similar to the
>> parse statement but which allows the specification of
>> beginning and ending characters. The scanning would check
>> for matched pairs and process appropriately.
>>
>> 2) Extend the parse statement by providing a new type of
>> pattern, perhaps a function notation which calls a function
>> to scan ahead, skipping the contents of the data within
>> matched beginning and ending characters.
>>
>> 3) Provide a built in function which perhaps includes the
>> parse statement arguments.
>>
>> 4) Provide a new type of list object, and have the parse
>> statement understand its structure.
>>
>> 5) Some other approach?
>>
>> 6) Ignore this problem.
> I did make a proposal for Rexx that allowed 'intelligent parsers'. I forget the
> details, but it was something on the lines of allowing a function call wherever
> a pattern was allowed. The function would be called with the "string remaining"
> and the arguments supplied in the template, and (as one possibility) returning a
> split point. So something like:
>
> parse var fred start pos('s') rest
>
> would give the same result as
>
> parse var fred start 's' +1 rest
>
> With the right functions/rules this could be shown to allow nested strings of
> various kinds, and might work for lists, too. But it was quite ugly, even
> though it allowed a lot of flexibility from a relatively minor enhancement. But
> how that could/should coexist with the existing (comma-based) list notation in
> Rexx is tricky...

That is what I had in mind with point 2.
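
To make the idea concrete, here is a rough Python sketch of a "function as pattern": a finder function receives the remaining string and returns a split point, and the caller splits there. The names `matching_close` and `split_with` are invented for this illustration, and the convention of returning a 1-based position (0 for no match) is an assumption, not a description of Mike's original proposal.

```python
def matching_close(text, open_ch="(", close_ch=")"):
    """Return the 1-based position of the close_ch that balances the
    first open_ch in text, or 0 if the delimiters never balance."""
    depth = 0
    for i, ch in enumerate(text, start=1):
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth == 0:
                return i
    return 0


def split_with(text, finder, *args):
    """Split text just after the point the finder reports; 0 means no split."""
    point = finder(text, *args)
    return (text[:point], text[point:]) if point else (text, "")


head, rest = split_with("arg1(arg1a, arg1b), arg2(((nested)), z)", matching_close)
print(repr(head))   # 'arg1(arg1a, arg1b)'
print(repr(rest))   # ', arg2(((nested)), z)'
```

The finder skips over the whole balanced group, which is the "scan ahead" behaviour described in option 2.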

Possibly another approach would be to provide a pair of new statements,
such as makelist and parselist. The list encoding characters might be
global and settable, but default reasonably.

    makelist newstringlist string1 string2 stringlist3 ...

would perform the encoding using the global list encoding characters, and

    parselist newstringlist listpart1 listpart2 listpart3 ...

would parse the encoded string into elements.

A builtin function isstringlist(listpart) would return True if listpart
is a list, etc. The elements and lists would still be standard strings.

This solution may not handle the general parsing problem of source with
nested parens, etc., but perhaps by setting the encoding characters to
'(', ')' and ',' it might come close.
> Mike

In any event it is encouraging that you have thought about this, but
discouraging that you haven't come up with a solution :).

Bill

ThSITC

Re: Nested List Support?

In reply to this post by Tom Maynard

Just my 0,001 (0 Komma 001) Cents:

What is needed for PARSING a language, is defining a *Precedence
Grammar* for the Nesting.
At least, that is the way I did go to resolve those issues.

One of the prerequisites, however, is that the PARSE verb should
honour the meaning of a
quoted string, possibly with a way to define which quotes are
allowed (only ", or "" and ')

I did propose that already, long time ago...

Full Stop,
Thomas.

PS: Of course, PARENTHESIS of any kind (Parenthesis, Brackets, Braces)
must then be also DEFINABLE,
which will introduce then a so called *token_level* (as I do call it, at
least).
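
As a rough illustration of this point (not Thomas's implementation, which is not shown in the thread), the Python sketch below splits on a separator only when the scanner is outside any quoted string and outside any bracket pair. The function name `top_level_split`, its parameters, and the default sets of brackets and quotes are all assumptions; doubled-quote escapes are not handled.

```python
def top_level_split(text, sep=",", pairs=("()", "[]", "{}"), quotes="\"'"):
    """Split text on sep, ignoring separators inside quotes or brackets."""
    closers = {p[0]: p[1] for p in pairs}    # opener -> expected closer
    stack = []                               # len(stack) is the nesting level
    parts, current, quote = [], [], None
    for ch in text:
        if quote:                            # inside a quoted string
            if ch == quote:
                quote = None
        elif ch in quotes:
            quote = ch
        elif ch in closers:
            stack.append(closers[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
        elif ch == sep and not stack:
            parts.append("".join(current))
            current = []
            continue                         # the separator itself is dropped
        current.append(ch)
    parts.append("".join(current))
    return parts


print(top_level_split('a, "b, c", f(d, [e, g]), h'))
# ['a', ' "b, c"', ' f(d, [e, g])', ' h']
```

Here the length of `stack` plays the role of the nesting level at each character, and both the quote characters and the bracket pairs are definable through the arguments.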

PPS: Still busy deploying my ideas and my implementation to
the NetRexx Repository on KENAI...
===================================================================================

Am 05.12.2012 16:51, schrieb Tom Maynard:

> On 5/12/12 9:34, Mike Cowlishaw wrote:
>> regular expressions were too
>> complex for almost all programmers (let alone the 'average' programmer)
>
> Some people, when confronted with a problem, think "I know,
> I'll use regular expressions." Now they have two problems.
>
> Jamie Zawinski, 12 Aug 1997

--
Thomas Schneider, IT Consulting; http://www.thsitc.com; Vienna, Austria,
Europe

Thomas Schneider, Vienna, Austria (Europe) :-)

www.thsitc.com
www.db-123.com

kenner

JProgressBar

In reply to this post by billfen

Does anyone have a simple example of using JProgressBar in NetRexx?

Kenneth Klein

Kermit Kiser

Re: JProgressBar

Not sure how simple it is but here is one that I have. It does some funny thread looping stuff that you would not want to do in a real app, but that is because it was designed as a special purpose demo for NetRexxScript in jEdit.

-- Kermit

------------------------------------------------------------------------------------------------------------------------------------------
import javax.swing.
import java.text.

class progressbardemo implements ActionListener        --    ActionListener interface lets the GUI objects talk to the program code

    properties static

    frame=JFrame                                --    holder for a GUI window

    textfield=JTextField                        --    holder for some text to edit

    method main(sa=String[]) static

        frame=JFrame("Sample GUI window")        --    create a GUI window frame

        frame.setSize(400,100)                    --    give the window some space on the screen

        panel=JPanel()                            --    create a panel to hold some GUI objects

        frame.add(panel)                        --    put the panel in the window frame

        parse Date().toString a b c d e f
        textfield=JTextField(b c f)    --    create a spot for some text

        panel.add(textfield)                    --    add the text field to the panel

        button=JButton("OK")                    --    create a button to click

        button.addActionListener(progressbardemo())    --    attach some code (an instance of this class) to watch the button

        panel.add(button)                        --    put the button in the panel

        pb=JProgressBar(0,100)                --        make a progress bar

        panel.add(pb)                                --        add it to the panel

        frame.show                                --    display the GUI window on the screen

        loop i=1 to 100                            -- loop for 100 increments as we specified in the progress bar
            Thread.sleep(100)                --        wait 1/10 second
            pb.setValue(i)                        --        increment the progress bar value
            end

        loop while frame\=null;Thread.sleep(100);end                --    wait for the GUI window to do something

    method actionPerformed(e=ActionEvent)         --    this is the code that listens to the button

        say textfield.getText                    --    show the text field contents

        frame.dispose                            --    clear the GUI window frame from screen and memory

        frame=null                                --    erase the pointer to stop the main program

------------------------------------------------------------------------------------------------------------------------------------------

On 12/5/2012 9:29 AM, [hidden email] wrote:

Does anyone have a simple example of using JProgressBar in NetRexx?

Kenneth Klein

Kermit Kiser

Re: Nested List Support?

In reply to this post by billfen

Bill --

Sometimes I wonder if we live on the same planet. ;-)

I don't understand how you think you can parse a string but not parse it in order. Nor do I understand why you would want a halfway solution that does not fully parse a string.

I am sorry if you did not understand my code example, but I am not sure it can be simplified further. As I said in my post, "Handling this type of syntax is way beyond what a parse instruction can do and I think this example shows that the general case is not trivial."

Possibly you skipped that last phrase?

I try things out with code because I don't think in abstract logic like you and Mike seem to do. So I provide working code examples to show my thoughts here. But then you say that my code has to scan strings one character at a time which is no more true or false than saying the PARSE instruction has to scan strings one character at a time. (Or do you really think that PARSE does not look at all of the characters?)

You also seem to feel that the lowly NetRexx data type could not possibly maintain the structure of a list but I think that the Rexx object is the most powerful data structure ever invented. It can not only hold strings and numbers, it can hold lists and maps and do amazing things with them and each one is a complete associative database! (And even more features are in the advanced after3.01 NetRexx version!)

Since I think that way, I will try again to explain what I mean with a code example. I modified my original sample program and added a method to reconstruct a parsed list, showing at each stage of reconstruction what list structure data can be extracted from the parsed string object. I even showed how you can transform one list syntax to another with the example parsed list Rexx object. (Your new example is basically the same structure with different delimiters, so the same code handles both examples fine.) Just ignore it if you still don't believe it can be done.
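
For readers who want the idea without the NetRexx details, the following Python sketch parses a delimited string into a nested structure and rebuilds it with any three delimiter characters. It is a simplified analogue, not a translation of the program below: the names `parse_nested` and `rebuild` are invented, each item is stored as a `(token, children)` pair, and text that follows a sublist before the next separator is silently dropped (an assumption similar to the "ignore extra junk" rule in the NetRexx code).

```python
def parse_nested(text, delims="(),"):
    """Parse text into a list of (token, children) pairs.
    children is None for a plain item, else a nested list of pairs."""
    open_ch, close_ch, sep = delims

    def parse_items(pos):
        items, token, children = [], [], None
        while pos < len(text):
            ch = text[pos]
            pos += 1
            if ch == open_ch:                # sublist: handle it recursively
                children, pos = parse_items(pos)
            elif ch == sep:                  # end of this item
                items.append(("".join(token), children))
                token, children = [], None
            elif ch == close_ch:             # end of this list
                break
            elif children is None:           # ordinary token character
                token.append(ch)
            # characters after a sublist are dropped (assumption, see lead-in)
        items.append(("".join(token), children))
        return items, pos

    return parse_items(0)[0]


def rebuild(items, delims="(),"):
    """Re-encode a parsed structure using the given delimiter characters."""
    open_ch, close_ch, sep = delims
    out = []
    for token, children in items:
        if children is None:
            out.append(token)
        else:
            out.append(token + open_ch + rebuild(children, delims) + close_ch)
    return sep.join(out)


tree = parse_nested("fun( arg1(arg1a, arg1b), arg2(((nested)), z) )")
print(rebuild(tree))          # fun( arg1(arg1a, arg1b), arg2(((nested)), z))
print(rebuild(tree, "<>/"))   # fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>
```

Because the structure is kept separately from the delimiters, switching syntax is only a matter of passing a different three-character string to `rebuild`.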

BTW: PARSE is intended for very simple parsing problems. That is why RexxLA started the RegRexx project to provide a more sophisticated pattern matching and parsing facility with a simpler syntax and more flexibility than regex has. (It remains to be seen if that can be done.) I think that is also why Mike included the verify and translate, etc, mechanisms to handle more complex parsing needs.

-- Kermit

----------------------------------------------------------------------------- Program output: ---------------------------------------------------------------------------------------------------------------------------------------------------

parsing this list:
fun( arg1(arg1a, arg1b), arg2(((nested)), z) )

display parsed list structure
   token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
items=1
--* 1 ==> token=fun
--* items=2
--* --* 1 ==> token= arg1
--* --* items=2
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* items=2
--* --* --* 1 ==> token=
--* --* --* items=1
--* --* --* --* 1 ==> token=
--* --* --* --* items=1
--* --* --* --* --* 1 ==> token=nested
--* --* --* 2 ==> token= z

now reconstruct original input list
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list=(arg1a, arg1b)
element= arg2
element=null
element=null
element=nested
list=(nested)
list=((nested))
element= z
list=(((nested)), z)
list=( arg1(arg1a, arg1b), arg2(((nested)), z))
list=(fun( arg1(arg1a, arg1b), arg2(((nested)), z)))

reconstructed list== fun( arg1(arg1a, arg1b), arg2(((nested)), z))

parsing this list:
{ 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }

display parsed list structure
   token= { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }
items=1
--* 1 ==> token=
--* items=5
--* --* 1 ==> token= 1
--* --* 2 ==> token= 2 3
--* --* 3 ==> token=
--* --* items=2
--* --* --* 1 ==> token= 9 8
--* --* --* 2 ==> token=
--* --* --* items=2
--* --* --* --* 1 ==> token= 7
--* --* --* --* 2 ==> token= 6
--* --* 4 ==> token=
--* --* 5 ==> token= 4 5

now reconstruct original input list
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list={ 7 , 6 }
list={ 9 8 , { 7 , 6 }}
element=null
element= 4 5
list={ 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }
list={ { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }}

reconstructed list== { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }

now for more fun lets reconstruct string 1 with string 2 syntax
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list={arg1a, arg1b}
element= arg2
element=null
element=null
element=nested
list={nested}
list={{nested}}
element= z
list={{{nested}}, z}
list={ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}
list={fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}}

reconstructed list== fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}

and then lets reconstruct string 2 with string 1 syntax
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list=( 7 , 6 )
list=( 9 8 , ( 7 , 6 ))
element=null
element= 4 5
list=( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )
list=( ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 ))

reconstructed list== ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )

and lets also try a new syntax for string 1
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list=<arg1a/ arg1b>
element= arg2
element=null
element=null
element=nested
list=<nested>
list=<<nested>>
element= z
list=<<<nested>>/ z>
list=< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>
list=<fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>>

reconstructed list== fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>

and likewise for string 2
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list=< 7 / 6 >
list=< 9 8 / < 7 / 6 >>
element=null
element= 4 5
list=< 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >
list=< < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >>

reconstructed list== < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------- Program code: ---------------------------------------------------------------------------------------------------------------------------------------
trace var x
in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"
delims="(),"
say "\n parsing this list:";say in
parseout=parselist(in)
say "\n display parsed list structure"
dump(parseout)
say "\n now reconstruct original input list"
rl=reconstructlist(parseout)
say "\n reconstructed list==" rl.substr(2,rl.length-2)        --    items are stored as an implicit list

in2=" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }"
say "\n parsing this list:";say in2
delims2="{},"
parseout2=parselist(in2,delims2)
say "\n display parsed list structure"
dump(parseout2)
say "\n now reconstruct original input list"
rl2=reconstructlist(parseout2,delims2)
say "\n reconstructed list==" rl2.substr(2,rl2.length-2)        --        --    items are stored as an implicit list

say "\n now for more fun lets reconstruct string 1 with string 2 syntax"
rl1a=reconstructlist(parseout,delims2)
say "\n reconstructed list==" rl1a.substr(2,rl1a.length-2)        --        --    items are stored as an implicit list

say "\n and then lets reconstruct string 2 with string 1 syntax"
rl2a=reconstructlist(parseout2,delims)
say "\n reconstructed list==" rl2a.substr(2,rl2a.length-2)        --        --    items are stored as an implicit list

say "\n and lets also try a new syntax for string 1"
rl1aa=reconstructlist(parseout,"<>/")
say "\n reconstructed list==" rl1aa.substr(2,rl1aa.length-2)        --        --    items are stored as an implicit list

say "\n and likewise for string 2"
rl2aa=reconstructlist(parseout2,"<>/")
say "\n reconstructed list==" rl2aa.substr(2,rl2aa.length-2)        --        --    items are stored as an implicit list

method reconstructlist(input=Rexx,delims=Rexx "(),") static

    if input.exists("token") then
        segment1=input["token"]
    else segment1=""
    if segment1\="" then
        say "element="segment1
        else say "element=null"
    segment=""
    if input.exists("items") then do
        segment=delims.substr(1,1)
        loop i=1 to input["items"]
            segment=segment||reconstructlist(input[i],delims)||delims.substr(3,1)
            end
        segment=segment.strip("t",delims.substr(3,1))||delims.substr(2,1)
        say "list="segment
        end
    return segment1||segment

method dump(data=Rexx,key="",offset="",link="") static
        say offset key link "token="data["token"]
    if data.exists("items") then do
        say offset "items="data["items"]
        loop i=1 to data["items"]
                dump(data[i],i,offset "--*","==>")
                end
        end

method parselist(input=Rexx,delims=Rexx "(),",start="1") static
    in=Rexx(input)
    in["count"]=0    --    in order to work recursively, we will need to count how many characters are consumed at each step

    loop label scanloop itemno=1 while start<=in.length        --        need to loop in case multiple items
        deloc=in.substr(start).verify(delims,"match")        --    locate next delimiter
        in[itemno]=""
        if deloc=0 then do --did not find a delimiter
            in[itemno,"token"]=in.substr(start)
            in["count"]=in["count"]+in.substr(start).length
            leave scanloop
            end
        else do --found a delimiter
            if delims.pos(in.substr(start+deloc-1,1))=1 then do        --        found a sublist - handle it recursively
                    in[itemno]=parselist(in.substr(start+deloc),delims)
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    deloc=deloc+in[itemno,"count"]
                    in["count"]=in["count"]+deloc
                    if start+deloc<in.length then do    --    syntax rules are not clear but do allow for a second delimiter after a sublist
                        secdel=in.substr(start+deloc).verify(delims.substr(2,2),"match")
                        if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
                        deloc=deloc+secdel
                        in["count"]=in["count"]+secdel
                        if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    list can be terminated by list ender with no more tokens
                        end
                    end
            else do    -- found end of item
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    in["count"]=in["count"]+deloc
                    if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    found end of list indicator
                    end
            end

        start=start+deloc        --        get next area to scan
        if start>in.length then leave scanloop    --    don't increment items count if done
        if start=in.length then
            if in.substr(start).verify(delims)=0 then leave scanloop        --    ignore delim at end of line
        end scanloop
        in["items"]=itemno
        return in

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

On 12/5/2012 5:55 AM, Bill Fenlason wrote:

Kermit,

The problem I was trying to address is that of encoding and parsing nested lists within a string. The example was just that - parsing a function call which happens to contain lists. Perhaps I could have been more clear and used a different example.

I do not want to parse the string in order as your code does - I want to encode the string such that the string contains either elements or lists, and then parse it into a sequence of either elements or lists.

As I said:

"In some other programming languages (like lisp and python), a "list" is a fundamental concept.

Since the Rexx family of languages is string oriented, it is not uncommon to encode a list within a string with a separator character, and to process the list with something like:

    parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a list may be encoded with specific matched start and end characters and a separator character (e.g. '(' and ')' and ',' ). If the lists may be nested, there does not appear to be any easy way to parse it while still retaining the list structure."

Possibly you skipped that last phrase and jumped to the example?

To reiterate, the problem is how to manipulate lists and nested lists in NetRexx. As I pointed out, the most natural way would be to somehow encode the lists into a string since Rexx is essentially string based. As I also pointed out in option 4, perhaps there should be some kind of list object instead.

It should be noted that list processing is a very powerful and widely used mechanism. Lisp was defined shortly after Fortran, and is still used as the initial language in many CS curricula (e.g. MIT). The ability to easily manipulate lists would make NetRexx richer. As I mentioned, I started to convert a python program to NetRexx and discovered that there was no easy way.

Here is another example:

Define an element to be a sequence of zero or more blank separated digits.

Define a list to be a sequence of zero or more elements or lists.

How does one encode a list into a string such that it can be easily parsed into its constituent elements and lists?

For example, suppose that we encode a list as a sequence of elements or lists surrounded by '{' and '}' and separated by ',' . Note this is just one way to encode a list - others could be used. Note that other element definitions can exist.

The encoded string " { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } " represents a list whose contents are elements 1 , 2 3 , an imbedded list, a null element and element 4 5.

When parsed and processed, the decoded string would produce the following:
    element 1
    element 2 3
    list { 9 8 , { 7 , 6 } }
    null element
    element 4 5

This is what I meant by retaining the list structure. Subsequent processing could decode the nested list.

The crux of the problem is how to parse content which contains matching start and end characters. In the example above, the first level list contains another list, and therefore another matched '{' '}' pair. There has to be some mechanism which matches the beginning and ending characters while the content may contain additional (nested) pairs.
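
A depth counter is one common way to do this matching. The Python sketch below splits one encoded list into its first-level parts and leaves any nested list as an undecoded string, which is the behaviour described above. The function names `split_top_level` and `describe` are made up for the example, and the '{', '}', ',' encoding is the one assumed in this message.

```python
def split_top_level(encoded, delims="{},"):
    """Split one encoded list into its top-level parts, keeping sublists intact."""
    open_ch, close_ch, sep = delims
    body = encoded.strip()
    if not (body.startswith(open_ch) and body.endswith(close_ch)):
        raise ValueError("not an encoded list")
    parts, current, depth = [], [], 0
    for ch in body[1:-1]:                    # contents between the outer pair
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced delimiters")
        if ch == sep and depth == 0:         # separator at the first level only
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError("unbalanced delimiters")
    parts.append("".join(current).strip())
    return parts


def describe(part, delims="{},"):
    """Label a part as a null element, a list, or a plain element."""
    if part == "":
        return "null element"
    kind = "list" if part.startswith(delims[0]) else "element"
    return f"{kind} {part}"


for part in split_top_level(" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } "):
    print(describe(part))
# element 1
# element 2 3
# list { 9 8 , { 7 , 6 } }
# null element
# element 4 5
```

Calling `split_top_level` again on a part labelled as a list decodes the next level. One limitation of this sketch is that an empty list `{}` comes back as a single null element rather than as zero elements.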

I'm not familiar with the RegRexx project, but from the name I assume it might be related to regular expressions? If that is the case it may not help, since standard regular expressions cannot by themselves be used to solve the nesting problem (i.e. ensuring that the delimiters are correctly matched).

Thanks for taking the time to write and test your code. I see that it has to scan the input one character at a time and the parse instruction is not used. My question was also asking if there was some way to solve this problem by extending the parse instruction.

Bill

On 12/5/2012 6:20 AM, Kermit Kiser wrote:

Hi Bill --

Interesting problem you propose. I wonder if it is related to the RegRexx project and mailing list that RexxLA started a couple years ago and which Rony has been trying to resurrect this year:

http://rice.safedataisp.net/mailman/listinfo/regrexx

Given that language parsing is one of the largest areas of computer programming problems, I don't see that it makes any sense to try and boil it all down to one instruction or method that could be added to a language. On the other hand, some helpful aids like RegRexx might be useful for this kind of thing. If I get time, I may look into that.

Meanwhile, your request is not well defined in terms of input format (Can list items span embedded sublists? Can the list item separator be omitted after a sublist? Can multiple list separators be used? etc.) and you have not specified any output data structure at all. Looking at your example syntax, I made a few assumptions about input format and created a simple output format in order to devise a sample recursive code approach. Handling this type of syntax is way beyond what a parse instruction can do and I think this example shows that the general case is not trivial. (This program is also interesting because it will not interpret correctly without the trace instruction I inserted. The compiled version does not care. I tested that all the way back to NetRexx 2.05 by the way!)

-- Kermit

-------------------------------------------- Sample parsing code for an interesting syntax example ----------------------------------------------
trace var x

in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"

parseout=parselist(in)

dump(parseout)

method dump(data=Rexx,key="",offset="",link="") static
    say offset key link "token="data["token"]
    loop i=1 to data["items"]
        dump(data[i],i,offset "--*","==>")
        end

method parselist(input=Rexx,delims=Rexx "(),",start="1") static
    in=Rexx(input)
    in["count"]=0    --    in order to work recursively, we will need to count how many characters are consumed at each step

    loop label scanloop itemno=1 while start<in.length        --        need to loop in case multiple items
        deloc=in.substr(start).verify(delims,"match")        --    locate next delimiter
        in[itemno]=" "
        if deloc=0 then do --did not find a delimiter
             in[itemno,"token"]=in.substr(start)
             in["count"]=in["count"]+in.substr(start).length
             leave scanloop
            end
        else do --found a delimiter
            if delims.pos(in.substr(start+deloc-1,1))=1 then do        --        found a sublist - handle it recursively
                    in[itemno]=parselist(in.substr(start+deloc),delims)
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    deloc=deloc+in[itemno,"count"]
                    in["count"]=in["count"]+deloc
                    if start+deloc<in.length then do    --    syntax rules are not clear but do allow for a second delimiter after a sublist
                        secdel=in.substr(start+deloc).verify(delims.substr(3,1),"match")
                        if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
                        deloc=deloc+secdel
                        in["count"]=in["count"]+secdel
                        end
                    end
            else do    -- found end of item
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    in["count"]=in["count"]+deloc
                    if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    found end of list indicator
                    end
            end

        start=start+deloc        --        get next area to scan
        end scanloop
        in["items"]=itemno
        return in
-----------------------------------------------------------------------------------------------------------------------------
Program output:
-----------------------------------------------------------------------------------------------------------------------------

   token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
--* 1 ==> token=fun
--* --* 1 ==> token= arg1
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* --* 1 ==> token=
--* --* --* --* 1 ==> token=
--* --* --* --* --* 1 ==> token=nested
--* --* --* --* 2 ==> token= z

-----------------------------------------------------------------------------------------------------------------------------

On 12/4/2012 1:58 PM, Bill Fenlason wrote:

Rene,

I can see how that might be used if the input consists of blank delimited words, but I'm not sure I understand how you would parse the example I provided - one character at a time?

If you don't know if the next special character is a begin list, end list or separator, how would you specify the template? Of course you could parse the remainder 3 different times and compare lengths etc., but that seems a bit of a kludge, as would parsing 1 character at a time.

What I'm looking for is an outer loop that parses the input into elements or lists (which may contain additional lists).

Bill

On 12/4/2012 6:32 PM, René Jansen wrote:

Bill,

5) just my 2 cents here:

Something that I use often and which seems relevant for this, is the 'recursive parse' stolen from lisp car and cdr:

loop while cdr.words() > 0
   parse cdr car cdr    -- take off the first word, keep the rest in cdr
   car.something()
end

or some variations thereof

The trick here is to take off the first element and leave the rest for the next iteration; did this already in classic Rexx years ago.
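For readers coming from Python (the language Bill mentions porting from), the same "take the head, keep the tail" loop can be sketched as below. This is only an illustration of the idea, not code from the thread; the function name `items` is made up here. It also shows why the trick alone does not answer Bill's question: a separator inside a sublist is treated like any other.

```python
def items(encoded, sep=","):
    """Yield the items of a flat, separator-encoded list one at a time."""
    rest = encoded
    while rest:
        # Same effect as the Rexx template:  parse rest head "," rest
        head, _, rest = rest.partition(sep)
        yield head


print(list(items("a,b,c")))        # ['a', 'b', 'c']
print(list(items("1,(2,3),4")))    # ['1', '(2', '3)', '4']  (nesting is lost)
```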

best regards,

René.

On 4 dec. 2012, at 19:09, Bill Fenlason [hidden email] wrote:

This question is primarily for Mike, but I'm sure others of you will have comments or suggestions.

In some other programming languages (like lisp and python), a "list" is a fundamental concept.

Since the Rexx family of languages is string oriented, it is not uncommon to encode a list within a string with a separator character, and to process the list with something like:

    parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a list may be encoded with specific matched start and end characters and a separator character (e.g. '(' and ')' and ',' ). If the lists may be nested, there does not appear to be any easy way to parse it while still retaining the list structure.

One example would be the parsing of source containing expressions or argument lists.

eg. "fun( arg1(arg1a, arg1b), arg2(((nested)), z) ) ...."

Another example might be an encoding approach which uses '{', '}' and '`' for generic list encoding.

The question is, if NetRexx or Rexx were to be extended to allow convenient parsing of nested lists, how should it be approached?

1) Provide a new statement like "parselist", similar to the parse statement but which allows the specification of beginning and ending characters. The scanning would check for matched pairs and process appropriately.

2) Extend the parse statement by providing a new type of pattern, perhaps a function notation which calls a function to scan ahead, skipping the contents of the data within matched beginning and ending characters.

3) Provide a built in function which perhaps includes the parse statement arguments.

4) Provide a new type of list object, and have the parse statement understand its structure.

5) Some other approach?

6) Ignore this problem.

I assume this topic has been discussed before, but I must have missed it or I don't remember it. It seems to me that this is a general weakness in NetRexx.

How have other people handled this problem in the past?

The topic came up for me when I tried to convert a python program to NetRexx.

_______________________________________________
Ibm-netrexx mailing list
[hidden email]
Online Archive : http://ibm-netrexx.215625.n3.nabble.com/

billfen

Re: Nested List Support?

Kermit,

On 12/5/2012 7:02 PM, Kermit Kiser wrote:

Bill --

Sometimes I wonder if we live on the same planet. ;-)

You are not alone :)

I don't understand how you think you can parse a string but not parse it in order. Nor do I understand why you would want a halfway solution that does not fully parse a string.

There is a difference between "parsing a string" and dividing up a list which has been encoded as a string. With a list, it is natural to request the items in the list one at a time, and the items in a list may be either a string or another list. Possibly you are not familiar with list processing (ala Lisp) and find this confusing, but I assume that is not the case.

Historical note: the original Lisp (written in assembler for the IBM 704) used a tree structure, and the names car ("contents of the address part of the register") and cdr ("contents of the decrement part of the register") come from the operations that extract the first (next) item in a list and the remainder of the list. In Rene's example, he shows how to strip the first or next list item using the parse instruction, but keeps the ancient names, which are still in common use after all these years.

The essential point here is that one of the ways that would make sense in NetRexx would be to encode a list as a string. In other words, a "super" string which is a "list of strings" or a list of: "strings" and "lists of strings".

In no way did I mean to imply that the parsing was "half way", but just that the decomposition of the lists comes before the parsing of strings. First a list is processed, and then sublists are processed as necessary.

I am sorry if you did not understand my code example, but I am not sure it can be simplified further. As I said in my post, "Handling this type of syntax is way beyond what a parse instruction can do and I think this example shows that the general case is not trivial."

I didn't say that I didn't understand your code - it was well written and clear.

Possibly you skipped that last phrase?

The phrase I meant was "while still retaining the list structure." Parsing a single string is not the same as breaking up a list (which happens to be encoded as a single string).

I try things out with code because I don't think in abstract logic like you and Mike seem to do. So I provide working code examples to show my thoughts here. But then you say that my code has to scan strings one character at a time which is no more true or false than saying the PARSE instruction has to scan strings one character at a time. (Or do you really think that PARSE does not look at all of the characters?)

My point there was that I was asking if an approach that could be used would be to extend the parse command. I am looking for the best general purpose approach to handle the nested list problem. I agree that my question is more abstract than specific.

You also seem to feel that the lowly NetRexx data type could not possibly maintain the structure of a list but I think that the Rexx object is the most powerful data structure ever invented. It can not only hold strings and numbers, it can hold lists and maps and do amazing things with them and each one is a complete associative database! (And even more features are in the advanced after 3.01 NetRexx version!)

I don't know how you came to that conclusion - what did I say that gave you that idea? All I was asking was how to make the Rexx string object hold a list of strings and other (nested) lists of strings. Certainly the Rexx object can hold a simple list of strings. But it can not inherently hold a list containing strings and other lists of strings. External conventions for list delimiters must be provided. Possibly as an extension they could be added as fields in the Rexx object.

Since I think that way, I will try again to explain what I mean with a code example. I modified my original sample program and added a method to reconstruct a parsed list, showing at each stage of reconstruction what list structure data can be extracted from the parsed string object. I even showed how you can transform one list syntax to another with the example parsed list Rexx object. (Your new example is basically the same structure with different delimiters, so the same code handles both examples fine.) Just ignore it if you still don't believe it can be done.

I certainly understand that it can be done, Kermit, and your code obviously demonstrates it.

But the code itself does not provide an answer to the original question I asked, which was "If NetRexx or Rexx were to be extended to allow convenient parsing of nested lists, how should it be approached?"

In retrospect, perhaps I should have replaced the word "parsing" with "deconstruction".

I provided 6 possibilities, and perhaps your code could be the basis of possibility 3 (built in functions), although there doesn't seem to be a clear API. It certainly demonstrates an example, but obviously I'm trying to avoid that level of user coding for the general case.
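To make "a clear API" concrete, here is one shape a built-in pair of functions might take, sketched in Python rather than NetRexx. Everything in it is an assumption for illustration: the names `decode` and `encode`, the three-character `delims` argument (open, close, separator, following the convention in Kermit's code), and the restricted grammar in which an item is either a plain element or a sublist. It does not handle a token in front of a sublist, as in the fun( ... ) example, and an empty pair decodes as a list holding one null element.

```python
def decode(text, delims="{},"):
    """Decode an encoded list into nested Python lists of strings."""
    open_ch, close_ch, sep = delims

    def parse_list(pos):
        # pos is just past an opener; returns (items, index just past the closer)
        items, start, last = [], pos, None
        while pos < len(text):
            ch = text[pos]
            if ch == open_ch:
                last, pos = parse_list(pos + 1)   # recurse into the sublist
                continue
            if ch == sep or ch == close_ch:
                # an item ends here: either the sublist just parsed or plain text
                # (any stray text around a sublist is ignored in this sketch)
                items.append(last if last is not None else text[start:pos].strip())
                last, start = None, pos + 1
                if ch == close_ch:
                    return items, pos + 1
            pos += 1
        raise ValueError("missing " + close_ch)

    begin = text.find(open_ch)
    if begin < 0:
        raise ValueError("missing " + open_ch)
    items, end = parse_list(begin + 1)
    if text[end:].strip():
        raise ValueError("unexpected text after list")
    return items


def encode(items, delims="{},"):
    """Encode nested lists of strings back into a single string."""
    open_ch, close_ch, sep = delims
    parts = [encode(i, delims) if isinstance(i, list) else i for i in items]
    return open_ch + sep.join(parts) + close_ch


nested = decode(" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } ")
print(nested)                 # ['1', '2 3', ['9 8', ['7', '6']], '', '4 5']
print(encode(nested, "<>/"))  # <1/2 3/<9 8/<7/6>>//4 5>
```

Keeping the delimiters as a parameter of both functions is what allows a list to be decoded with one syntax and re-encoded with another, as the program output below does.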

I was asking "what is the best approach?", not "can it be done?" or "is there a code snippet that can be used?".

I thought my original post asked a single question and was reasonably clear, but apparently I was wrong about that.

BTW: PARSE is intended for very simple parsing problems. That is why RexxLA started the RegRexx project to provide a more sophisticated pattern matching and parsing facility with a simpler syntax and more flexibility than regex has. (It remains to be seen if that can be done.) I think that is also why Mike included the verify and translate, etc, mechanisms to handle more complex parsing needs.

Yes, that is what Mike said as well, and I agree in general. I suggested the possibility of extending the parse statement by adding a functional notation in the template, but Mike said he considered and rejected it some time ago.

-- Kermit

Bill

----------------------------------------------------------------------------- Program output: ---------------------------------------------------------------------------------------------------------------------------------------------------

parsing this list:
fun( arg1(arg1a, arg1b), arg2(((nested)), z) )

display parsed list structure
   token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
items=1
--* 1 ==> token=fun
--* items=2
--* --* 1 ==> token= arg1
--* --* items=2
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* items=2
--* --* --* 1 ==> token=
--* --* --* items=1
--* --* --* --* 1 ==> token=
--* --* --* --* items=1
--* --* --* --* --* 1 ==> token=nested
--* --* --* 2 ==> token= z

now reconstruct original input list
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list=(arg1a, arg1b)
element= arg2
element=null
element=null
element=nested
list=(nested)
list=((nested))
element= z
list=(((nested)), z)
list=( arg1(arg1a, arg1b), arg2(((nested)), z))
list=(fun( arg1(arg1a, arg1b), arg2(((nested)), z)))

reconstructed list== fun( arg1(arg1a, arg1b), arg2(((nested)), z))

parsing this list:
{ 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }

display parsed list structure
   token= { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }
items=1
--* 1 ==> token=
--* items=5
--* --* 1 ==> token= 1
--* --* 2 ==> token= 2 3
--* --* 3 ==> token=
--* --* items=2
--* --* --* 1 ==> token= 9 8
--* --* --* 2 ==> token=
--* --* --* items=2
--* --* --* --* 1 ==> token= 7
--* --* --* --* 2 ==> token= 6
--* --* 4 ==> token=
--* --* 5 ==> token= 4 5

now reconstruct original input list
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list={ 7 , 6 }
list={ 9 8 , { 7 , 6 }}
element=null
element= 4 5
list={ 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }
list={ { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }}

reconstructed list== { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }

now for more fun lets reconstruct string 1 with string 2 syntax
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list={arg1a, arg1b}
element= arg2
element=null
element=null
element=nested
list={nested}
list={{nested}}
element= z
list={{{nested}}, z}
list={ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}
list={fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}}

reconstructed list== fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}

and then lets reconstruct string 2 with string 1 syntax
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list=( 7 , 6 )
list=( 9 8 , ( 7 , 6 ))
element=null
element= 4 5
list=( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )
list=( ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 ))

reconstructed list== ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )

and lets also try a new syntax for string 1
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list=<arg1a/ arg1b>
element= arg2
element=null
element=null
element=nested
list=<nested>
list=<<nested>>
element= z
list=<<<nested>>/ z>
list=< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>
list=<fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>>

reconstructed list== fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>

and likewise for string 2
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list=< 7 / 6 >
list=< 9 8 / < 7 / 6 >>
element=null
element= 4 5
list=< 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >
list=< < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >>

reconstructed list== < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------- Program code: ---------------------------------------------------------------------------------------------------------------------------------------
trace var x
in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"
delims="(),"
say "\n parsing this list:";say in
parseout=parselist(in)
say "\n display parsed list structure"
dump(parseout)
say "\n now reconstruct original input list"
rl=reconstructlist(parseout)
say "\n reconstructed list==" rl.substr(2,rl.length-2)        --    items are stored as an implicit list

in2=" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }"
say "\n parsing this list:";say in2
delims2="{},"
parseout2=parselist(in2,delims2)
say "\n display parsed list structure"
dump(parseout2)
say "\n now reconstruct original input list"
rl2=reconstructlist(parseout2,delims2)
say "\n reconstructed list==" rl2.substr(2,rl2.length-2)        --        --    items are stored as an implicit list

say "\n now for more fun lets reconstruct string 1 with string 2 syntax"
rl1a=reconstructlist(parseout,delims2)
say "\n reconstructed list==" rl1a.substr(2,rl1a.length-2)        --        --    items are stored as an implicit list

say "\n and then lets reconstruct string 2 with string 1 syntax"
rl2a=reconstructlist(parseout2,delims)
say "\n reconstructed list==" rl2a.substr(2,rl2a.length-2)        --        --    items are stored as an implicit list

say "\n and lets also try a new syntax for string 1"
rl1aa=reconstructlist(parseout,"<>/")
say "\n reconstructed list==" rl1aa.substr(2,rl1aa.length-2)        --        --    items are stored as an implicit list

say "\n and likewise for string 2"
rl2aa=reconstructlist(parseout2,"<>/")
say "\n reconstructed list==" rl2aa.substr(2,rl2aa.length-2)        --        --    items are stored as an implicit list

method reconstructlist(input=Rexx,delims=Rexx "(),") static

    if input.exists("token") then
        segment1=input["token"]
    else segment1=""
    if segment1\="" then
        say "element="segment1
        else say "element=null"
    segment=""
    if input.exists("items") then do
        segment=delims.substr(1,1)
        loop i=1 to input["items"]
            segment=segment||reconstructlist(input[i],delims)||delims.substr(3,1)
            end
        segment=segment.strip("t",delims.substr(3,1))||delims.substr(2,1)
        say "list="segment
        end
    return segment1||segment

method dump(data=Rexx,key="",offset="",link="") static
        say offset key link "token="data["token"]
    if data.exists("items") then do
        say offset "items="data["items"]
        loop i=1 to data["items"]
                dump(data[i],i,offset "--*","==>")
                end
        end

method parselist(input=Rexx,delims=Rexx "(),",start="1") static
    in=Rexx(input)
    in["count"]=0    --    in order to work recursively, we will need to count how many characters are consumed at each step

    loop label scanloop itemno=1 while start<=in.length        --        need to loop in case multiple items
        deloc=in.substr(start).verify(delims,"match")        --    locate next delimiter
        in[itemno]=""
        if deloc=0 then do --did not find a delimiter
            in[itemno,"token"]=in.substr(start)
            in["count"]=in["count"]+in.substr(start).length
            leave scanloop
            end
        else do --found a delimiter
            if delims.pos(in.substr(start+deloc-1,1))=1 then do        --        found a sublist - handle it recursively
                    in[itemno]=parselist(in.substr(start+deloc),delims)
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    deloc=deloc+in[itemno,"count"]
                    in["count"]=in["count"]+deloc
                    if start+deloc<in.length then do    --    syntax rules are not clear but do allow for a second delimiter after a sublist
                        secdel=in.substr(start+deloc).verify(delims.substr(2,2),"match")
                        if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
                        deloc=deloc+secdel
                        in["count"]=in["count"]+secdel
                        if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    list can be terminated by list ender with no more tokens
                        end
                    end
            else do    -- found end of item
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    in["count"]=in["count"]+deloc
                    if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    found end of list indicator
                    end
            end

        start=start+deloc        --        get next area to scan
        if start>in.length then leave scanloop    --    don't increment items count if done
        if start=in.length then
            if in.substr(start).verify(delims)=0 then leave scanloop        --    ignore delim at end of line
        end scanloop
        in["items"]=itemno
        return in

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

On 12/5/2012 5:55 AM, Bill Fenlason wrote:

Kermit,

The problem I was trying to address is that of encoding and parsing nested lists within a string. The example was just that - parsing a function call which happens to contain lists. Perhaps I could have been more clear and used a different example.

I do not want to parse the string in order as your code does - I want to encode the string such that the string contains either elements or lists, and then parse it into a sequence of either elements or lists.

As I said:

"In some other programming languages (like lisp and python), a "list" is a fundamental concept.

Since the Rexx family of languages is string oriented, it is not uncommon to encode a list within a string with a separator character, and to process the list with something like:

    parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a list may be encoded with specific matched start and end characters and a separator character (e.g. '(' and ')' and ',' ). If the lists may be nested, there does not appear to be any easy way to parse it while still retaining the list structure."

Possibly you skipped that last phrase and jumped to the example?

To reiterate, the problem is how to manipulate lists and nested lists in NetRexx. As I pointed out, the most natural way would be to somehow encode the lists into a string since Rexx is essentially string based. As I also pointed out in option 4, perhaps there should be some kind of list object instead.

It should be noted that list processing is a very powerful and widely used mechanism. Lisp was defined shortly after Fortran, and is still used as the initial language in many CS curricula (e.g. MIT). The ability to easily manipulate lists would make NetRexx richer. As I mentioned, I started to convert a python program to NetRexx and discovered that there was no easy way.

Here is another example:

Define an element to be a sequence of zero or more blank separated digits.

Define a list to be a sequence of zero or more elements or lists.

How does one encode a list into a string such that it can be easily parsed into its constituent elements and lists?

For example, suppose that we encode a list as a sequence of elements or lists surrounded by '{' and '}' and separated by ',' . Note this is just one way to encode a list - others could be used. Note that other element definitions can exist.

The encoded string " { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } " represents a list whose contents are elements 1 , 2 3 , an imbedded list, a null element and element 4 5.

When parsed and processed, the decoded string would produce the following:
    element 1
    element 2 3
    list { 9 8 , { 7 , 6 } }
    null element
    element 4 5

This is what I meant by retaining the list structure. Subsequent processing could decode the nested list.

The crux of the problem is how to parse content which contains matching start and end characters. In the example above, the first level list contains another list, and therefore another matched '{' '}' pair. There has to be some mechanism which matches the beginning and ending characters while the content may contain additional (nested) pairs.
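The matching mechanism described above can be as small as a depth counter: split on the separator only while the counter is zero. The following Python sketch (Python being the language of the program that prompted the question) is a hypothetical helper, not an existing Rexx or NetRexx facility; the name `split_top_level` and the three-character `delims` argument (open, close, separator) are assumptions. Sublists are returned still encoded, so they can be decomposed later.

```python
def split_top_level(encoded, delims="{},"):
    """Split one encoded list into its top-level items.

    Nested lists come back as still-encoded strings.
    """
    open_ch, close_ch, sep = delims
    text = encoded.strip()
    if not (text.startswith(open_ch) and text.endswith(close_ch)):
        raise ValueError("not an encoded list: %r" % encoded)
    items, depth, start = [], 0, 1
    for pos in range(1, len(text) - 1):       # skip the outer pair
        ch = text[pos]
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched %r at offset %d" % (close_ch, pos))
        elif ch == sep and depth == 0:        # split only outside sublists
            items.append(text[start:pos].strip())
            start = pos + 1
    if depth != 0:
        raise ValueError("unmatched %r" % open_ch)
    items.append(text[start:len(text) - 1].strip())
    return items


for item in split_top_level(" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } "):
    if item.startswith("{"):
        print("list", item)
    elif item == "":
        print("null element")
    else:
        print("element", item)
# element 1
# element 2 3
# list { 9 8 , { 7 , 6 } }
# null element
# element 4 5
```

Calling the function again on the item that starts with '{' decodes the next level, which is the "subsequent processing" step.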

I'm not familiar with the RegRexx project, but from the name I assume it might be related to regular expressions? If that is the case it may not help, since standard regular expressions cannot by themselves solve the nesting problem (i.e. ensuring that the delimiters are correctly matched).

Thanks for taking the time to write and test your code. I see that it has to scan the input one character at a time and the parse instruction is not used. My question was also asking if there was some way to solve this problem by extending the parse instruction.

Bill

On 12/5/2012 6:20 AM, Kermit Kiser wrote:

Hi Bill --

Interesting problem you propose. I wonder if it is related to the RegRexx project and mailing list that RexxLA started a couple years ago and which Rony has been trying to resurrect this year:

http://rice.safedataisp.net/mailman/listinfo/regrexx

Given that language parsing is one of the largest areas of computer programming problems, I don't see that it makes any sense to try and boil it all down to one instruction or method that could be added to a language. On the other hand, some helpful aids like RegRexx might be useful for this kind of thing. If I get time, I may look into that.

Meanwhile, your request is not well defined in terms of input format (can list items span embedded sublists? Can the list item separator be omitted after a sublist?, Can multiple list separators be used?, etc.) and you have not specified any output data structure at all. Looking at your example syntax, I made a few assumptions about input format and created a simple output format in order to devise a sample recursive code approach. Handling this type of syntax is way beyond what a parse instruction can do and I think this example shows that the general case is not trivial. (This program is also interesting because it will not interpret correctly without the trace instruction I inserted. The compiled version does not care. I tested that all the way back to NetRexx 2.05 by the way!)

-- Kermit

-------------------------------------------- Sample parsing code for an interesting syntax example ----------------------------------------------
trace var x

in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"

parseout=parselist(in)

dump(parseout)

method dump(data=Rexx,key="",offset="",link="") static
    say offset key link "token="data["token"]
    loop i=1 to data["items"]
        dump(data[i],i,offset "--*","==>")
        end

method parselist(input=Rexx,delims=Rexx "(),",start="1") static
    in=Rexx(input)
    in["count"]=0    --    in order to work recursively, we will need to count how many characters are consumed at each step

    loop label scanloop itemno=1 while start<in.length        --        need to loop in case multiple items
        deloc=in.substr(start).verify(delims,"match")        --    locate next delimiter
        in[itemno]=" "
        if deloc=0 then do --did not find a delimiter
             in[itemno,"token"]=in.substr(start)
             in["count"]=in["count"]+in.substr(start).length
             leave scanloop
            end
        else do --found a delimiter
            if delims.pos(in.substr(start+deloc-1,1))=1 then do        --        found a sublist - handle it recursively
                    in[itemno]=parselist(in.substr(start+deloc),delims)
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    deloc=deloc+in[itemno,"count"]
                    in["count"]=in["count"]+deloc
                    if start+deloc<in.length then do    --    syntax rules are not clear but do allow for a second delimiter after a sublist
                        secdel=in.substr(start+deloc).verify(delims.substr(3,1),"match")
                        if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
                        deloc=deloc+secdel
                        in["count"]=in["count"]+secdel
                        end
                    end
            else do    -- found end of item
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    in["count"]=in["count"]+deloc
                    if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    found end of list indicator
                    end
            end

        start=start+deloc        --        get next area to scan
        end scanloop
        in["items"]=itemno
        return in
-----------------------------------------------------------------------------------------------------------------------------
Program output:
-----------------------------------------------------------------------------------------------------------------------------

   token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
--* 1 ==> token=fun
--* --* 1 ==> token= arg1
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* --* 1 ==> token=
--* --* --* --* 1 ==> token=
--* --* --* --* --* 1 ==> token=nested
--* --* --* --* 2 ==> token= z

-----------------------------------------------------------------------------------------------------------------------------

On 12/4/2012 1:58 PM, Bill Fenlason wrote:

Rene,

I can see how that might be used if the input consists of blank delimited words, but I'm not sure I understand how you would parse the example I provided - one character at a time?

If you don't know if the next special character is a begin list, end list or separator, how would you specify the template? Of course you could parse the remainder 3 different times and compare lengths etc., but that seems a bit of a kludge, as would parsing 1 character at a time.

What I'm looking for is an outer loop that parses the input into elements or lists (which may contain additional lists).

Bill

On 12/4/2012 6:32 PM, René Jansen wrote:

Bill,

5) just my 2 cents here:

Something that I use often and which seems relevant for this, is the 'recursive parse' stolen from lisp car and cdr:

loop while cdr.words()
   parse cdr car cdr
do
     car.something()
end
end

or some variations thereof

The trick here is to take off the first element and leave the rest for the next iteration; did this already in classic Rexx years ago.

best regards,

René.

On 4 dec. 2012, at 19:09, Bill Fenlason [hidden email] wrote:

This question is primarily for Mike, but I'm sure others of you will have comments or suggestions.

In some other programming languages (like lisp and python), a "list" is a fundamental concept.

Since the Rexx family of languages is string oriented, it is not uncommon to encode a list within a string with a separator character, and to process the list with something like:

    parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a list may be encoded with specific matched start and end characters and a separator character (e.g. '(' and ')' and ',' ). If the lists may be nested, there does not appear to be any easy way to parse it while still retaining the list structure.

One example would be the parsing of source containing expressions or argument lists.

eg. "fun( arg1(arg1a, arg1b), arg2(((nested)), z) ) ...."

Another example might be an encoding approach which uses '{', '}' and '`' for generic list encoding.

The question is, if NetRexx or Rexx were to be extended to allow convenient parsing of nested lists, how should it be approached?

1) Provide a new statement like "parselist", similar to the parse statement but which allows the specification of beginning and ending characters. The scanning would check for matched pairs and process appropriately.

2) Extend the parse statement by providing a new type of pattern, perhaps a function notation which calls a function to scan ahead, skipping the contents of the data within matched beginning and ending characters.

3) Provide a built in function which perhaps includes the parse statement arguments.

4) Provide a new type of list object, and have the parse statement understand its structure.

5) Some other approach?

6) Ignore this problem.

I assume this topic has been discussed before, but I must have missed it or I don't remember it. It seems to me that this is a general weakness in NetRexx.

How have other people handled this problem in the past?

The topic came up for me when I tried to convert a python program to NetRexx.

_______________________________________________
Ibm-netrexx mailing list
[hidden email]
Online Archive : http://ibm-netrexx.215625.n3.nabble.com/

measel

Re: Nested List Support?

Bill, I’m also from another planet. On my planet we have advanced lists called objects. You should check them out. They’re really fun to use in NetRexx.

From: [hidden email] [mailto:[hidden email]] On Behalf Of Bill Fenlason
Sent: Wednesday, December 05, 2012 9:17 PM
To: IBM Netrexx
Subject: Re: [Ibm-netrexx] Nested List Support?

Kermit,

On 12/5/2012 7:02 PM, Kermit Kiser wrote:

Bill --

Sometimes I wonder if we live on the same planet. ;-)

You are not alone :)

I don't understand how you think you can parse a string but not parse it in order. Nor do I understand why you would want a halfway solution that does not fully parse a string.

I am sorry if you did not understand my code example, but I am not sure it can be simplified further. As I said in my post, "Handling this type of syntax is way beyond what a parse instruction can do and I think this example shows that the general case is not trivial."

I didn't say that I didn't understand your code - it was well written and clear.

Possibly you skipped that last phrase?

The phrase I meant was "while still retaining the list structure." Parsing a single string is not the same as breaking up a list (which happens to be encoded as a single string).

I try things out with code because I don't think in abstract logic like you and Mike seem to do. So I provide working code examples to show my thoughts here. But then you say that my code has to scan strings one character at a time which is no more true or false than saying the PARSE instruction has to scan strings one character at a time. (Or do you really think that PARSE does not look at all of the characters?)

You also seem to feel that the lowly NetRexx data type could not possibly maintain the structure of a list but I think that the Rexx object is the most powerful data structure ever invented. It can not only hold strings and numbers, it can hold lists and maps and do amazing things with them and each one is a complete associative database! (And even more features are in the advanced after3.01 NetRexx version!)

Since I think that way, I will try again to explain what I mean with a code example. I modified my original sample program and added a method to reconstruct a parsed list, showing at each stage of reconstruction what list structure data can be extracted from the parsed string object. I even showed how you can transform one list syntax to another with the example parsed list Rexx object. (Your new example is basically the same structure with different delimiters, so the same code handles both examples fine.) Just ignore it if you still don't believe it can be done.

BTW: PARSE is intended for very simple parsing problems. That is why RexxLA started the RegRexx project to provide a more sophisticated pattern matching and parsing facility with a simpler syntax and more flexibility than regex has. (It remains to be seen if that can be done.) I think that is also why Mike included the verify and translate, etc, mechanisms to handle more complex parsing needs.

-- Kermit

Bill

----------------------------------------------------------------------------- Program output: ---------------------------------------------------------------------------------------------------------------------------------------------------

parsing this list:
fun( arg1(arg1a, arg1b), arg2(((nested)), z) )

display parsed list structure
   token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
items=1
--* 1 ==> token=fun
--* items=2
--* --* 1 ==> token= arg1
--* --* items=2
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* items=2
--* --* --* 1 ==> token=
--* --* --* items=1
--* --* --* --* 1 ==> token=
--* --* --* --* items=1
--* --* --* --* --* 1 ==> token=nested
--* --* --* 2 ==> token= z

now reconstruct original input list
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list=(arg1a, arg1b)
element= arg2
element=null
element=null
element=nested
list=(nested)
list=((nested))
element= z
list=(((nested)), z)
list=( arg1(arg1a, arg1b), arg2(((nested)), z))
list=(fun( arg1(arg1a, arg1b), arg2(((nested)), z)))

reconstructed list== fun( arg1(arg1a, arg1b), arg2(((nested)), z))

parsing this list:
{ 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }

display parsed list structure
   token= { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }
items=1
--* 1 ==> token=
--* items=5
--* --* 1 ==> token= 1
--* --* 2 ==> token= 2 3
--* --* 3 ==> token=
--* --* items=2
--* --* --* 1 ==> token= 9 8
--* --* --* 2 ==> token=
--* --* --* items=2
--* --* --* --* 1 ==> token= 7
--* --* --* --* 2 ==> token= 6
--* --* 4 ==> token=
--* --* 5 ==> token= 4 5

now reconstruct original input list
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list={ 7 , 6 }
list={ 9 8 , { 7 , 6 }}
element=null
element= 4 5
list={ 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }
list={ { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }}

reconstructed list== { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }

now for more fun lets reconstruct string 1 with string 2 syntax
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list={arg1a, arg1b}
element= arg2
element=null
element=null
element=nested
list={nested}
list={{nested}}
element= z
list={{{nested}}, z}
list={ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}
list={fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}}

reconstructed list== fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}

and then lets reconstruct string 2 with string 1 syntax
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list=( 7 , 6 )
list=( 9 8 , ( 7 , 6 ))
element=null
element= 4 5
list=( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )
list=( ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 ))

reconstructed list== ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )

and lets also try a new syntax for string 1
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list=<arg1a/ arg1b>
element= arg2
element=null
element=null
element=nested
list=<nested>
list=<<nested>>
element= z
list=<<<nested>>/ z>
list=< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>
list=<fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>>

reconstructed list== fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>

and likewise for string 2
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list=< 7 / 6 >
list=< 9 8 / < 7 / 6 >>
element=null
element= 4 5
list=< 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >
list=< < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >>

reconstructed list== < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------- Program code: ---------------------------------------------------------------------------------------------------------------------------------------
trace var x
in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"
delims="(),"
say "\n parsing this list:";say in
parseout=parselist(in)
say "\n display parsed list structure"
dump(parseout)
say "\n now reconstruct original input list"
rl=reconstructlist(parseout)
say "\n reconstructed list==" rl.substr(2,rl.length-2)        --    items are stored as an implicit list

in2=" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }"
say "\n parsing this list:";say in2
delims2="{},"
parseout2=parselist(in2,delims2)
say "\n display parsed list structure"
dump(parseout2)
say "\n now reconstruct original input list"
rl2=reconstructlist(parseout2,delims2)
say "\n reconstructed list==" rl2.substr(2,rl2.length-2)        --    items are stored as an implicit list

say "\n now for more fun lets reconstruct string 1 with string 2 syntax"
rl1a=reconstructlist(parseout,delims2)
say "\n reconstructed list==" rl1a.substr(2,rl1a.length-2)        --    items are stored as an implicit list

say "\n and then lets reconstruct string 2 with string 1 syntax"
rl2a=reconstructlist(parseout2,delims)
say "\n reconstructed list==" rl2a.substr(2,rl2a.length-2)        --    items are stored as an implicit list

say "\n and lets also try a new syntax for string 1"
rl1aa=reconstructlist(parseout,"<>/")
say "\n reconstructed list==" rl1aa.substr(2,rl1aa.length-2)        --    items are stored as an implicit list

say "\n and likewise for string 2"
rl2aa=reconstructlist(parseout2,"<>/")
say "\n reconstructed list==" rl2aa.substr(2,rl2aa.length-2)        --    items are stored as an implicit list

method reconstructlist(input=Rexx,delims=Rexx "(),") static

    if input.exists("token") then
        segment1=input["token"]
    else segment1=""
    if segment1\="" then
        say "element="segment1
        else say "element=null"
    segment=""
    if input.exists("items") then do
        segment=delims.substr(1,1)
        loop i=1 to input["items"]
            segment=segment||reconstructlist(input[i],delims)||delims.substr(3,1)
            end
        segment=segment.strip("t",delims.substr(3,1))||delims.substr(2,1)
        say "list="segment
        end
    return segment1||segment

method dump(data=Rexx,key="",offset="",link="") static
        say offset key link "token="data["token"]
    if data.exists("items") then do
        say offset "items="data["items"]
        loop i=1 to data["items"]
                dump(data[i],i,offset "--*","==>")
                end
        end

method parselist(input=Rexx,delims=Rexx "(),",start="1") static
    in=Rexx(input)
    in["count"]=0    --    in order to work recursively, we will need to count how many characters are consumed at each step

    loop label scanloop itemno=1 while start<=in.length        --        need to loop in case multiple items
        deloc=in.substr(start).verify(delims,"match")        --    locate next delimiter
        in[itemno]=""
        if deloc=0 then do --did not find a delimiter
            in[itemno,"token"]=in.substr(start)
            in["count"]=in["count"]+in.substr(start).length
            leave scanloop
            end
        else do --found a delimiter
            if delims.pos(in.substr(start+deloc-1,1))=1 then do        --        found a sublist - handle it recursively
                    in[itemno]=parselist(in.substr(start+deloc),delims)
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    deloc=deloc+in[itemno,"count"]
                    in["count"]=in["count"]+deloc
                    if start+deloc<in.length then do    --    syntax rules are not clear but do allow for a second delimiter after a sublist
                        secdel=in.substr(start+deloc).verify(delims.substr(2,2),"match")
                        if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
                        deloc=deloc+secdel
                        in["count"]=in["count"]+secdel
                        if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    list can be terminated by list ender with no more tokens
                        end
                    end
            else do    -- found end of item
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    in["count"]=in["count"]+deloc
                    if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    found end of list indicator
                    end
            end

        start=start+deloc        --        get next area to scan
        if start>in.length then leave scanloop    --    don't increment items count if done
        if start=in.length then
            if in.substr(start).verify(delims)=0 then leave scanloop        --    ignore delim at end of line
        end scanloop
        in["items"]=itemno
        return in

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

On 12/5/2012 5:55 AM, Bill Fenlason wrote:

Kermit,

The problem I was trying to address is that of encoding and parsing nested lists within a string. The example was just that - parsing a function call which happens to contain lists. Perhaps I could have been more clear and used a different example.

I do not want to parse the string in order as your code does - I want to encode the string such that the string contains either elements or lists, and then parse it into a sequence of either elements or lists.

As I said:

"In some other programming languages (like lisp and python), a "list" is a fundamental concept.

Since the Rexx family of languages is string oriented, it is not uncommon to encode a list within a string with a separator character, and to process the list with something like:

    parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a list may be encoded with specific matched start and end characters and a separator character (e.g. '(' and ')' and ',' ). If the lists may be nested, there does not appear to be any easy way to parse it while still retaining the list structure."

Possibly you skipped that last phrase and jumped to the example?

To reiterate, the problem is how to manipulate lists and nested lists in NetRexx. As I pointed out, the most natural way would be to somehow encode the lists into a string since Rexx is essentially string based. As I also pointed out in option 4, perhaps there should be some kind of list object instead.

It should be noted that list processing is a very powerful and widely used mechanism. Lisp was defined shortly after Fortran, and is still used as the initial language in many CS curricula (e.g. MIT). The ability to easily manipulate lists would make NetRexx richer. As I mentioned, I started to convert a python program to NetRexx and discovered that there was no easy way.

Here is another example:

Define an element to be a sequence of zero or more blank separated digits.

Define a list to be a sequence of zero or more elements or lists.

How does one encode a list into a string such that it can be easily parsed into its constituent elements and lists?

For example, suppose that we encode a list as a sequence of elements or lists surrounded by '{' and '}' and separated by ',' . Note this is just one way to encode a list - others could be used. Note that other element definitions can exist.

The encoded string " { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } " represents a list whose contents are elements 1 , 2 3 , an imbedded list, a null element and element 4 5.

When parsed and processed, the decoded string would produce the following:
    element 1
    element 2 3
    list { 9 8 , { 7 , 6 } }
    null element
    element 4 5

This is what I meant by retaining the list structure. Subsequent processing could decode the nested list.

The crux of the problem is how to parse content which contains matching start and end characters. In the example above, the first level list contains another list, and therefore another matched '{' '}' pair. There has to be some mechanism which matches the beginning and ending characters while the content may contain additional (nested) pairs.
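
One way to picture such a mechanism is a scan that tracks the nesting depth and treats a separator as significant only at depth zero. The following is a minimal sketch under stated assumptions, not an existing NetRexx facility: the method name splitTopLevel and the convention of returning an indexed Rexx string with the item count at index 0 are both hypothetical. It does not check that the delimiters are balanced.

```netrexx
-- Sketch only: splitTopLevel is a hypothetical helper, not part of NetRexx.
list = "{ 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }"
items = splitTopLevel(list)
loop i = 1 to items[0]
  say i":" items[i]
end

method splitTopLevel(encoded=Rexx, delims=Rexx "{},") static returns Rexx
  opener    = delims.substr(1, 1)
  closer    = delims.substr(2, 1)
  separator = delims.substr(3, 1)
  body = encoded.strip()
  -- drop the outermost pair of delimiters, if present
  if body.left(1) == opener & body.right(1) == closer then
    body = body.substr(2, body.length() - 2)
  result = ""
  result[0] = 0                         -- index 0 holds the item count
  depth = 0
  item = ""
  loop i = 1 to body.length()
    c = body.substr(i, 1)
    select
      when c == opener then depth = depth + 1
      when c == closer then depth = depth - 1
      otherwise nop
    end
    if c == separator & depth = 0 then do
      n = result[0] + 1                 -- separator at top level ends an item
      result[n] = item.strip()
      result[0] = n
      item = ""
    end
    else item = item || c               -- nested content is kept verbatim
  end
  n = result[0] + 1                     -- store the final item
  result[n] = item.strip()
  result[0] = n
  return result
```

For the encoded string above, this should yield five items: `1`, `2 3`, the sublist `{ 9 8 , { 7 , 6 } }` still in encoded form, a null item, and `4 5`. Calling the same method on item 3 would decode the next level.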

I'm not familiar with the RegRexx project, but from the name I assume it might be related to regular expressions? If that is the case it may not help, since standard regular expressions cannot by themselves be used to solve the nesting problem (i.e. ensuring that the delimiters are correctly matched).

Thanks for taking the time to write and test your code. I see that it has to scan the input one character at a time and the parse instruction is not used. My question was also asking if there was some way to solve this problem by extending the parse instruction.

Bill

On 12/5/2012 6:20 AM, Kermit Kiser wrote:

Hi Bill --

Interesting problem you propose. I wonder if it is related to the RegRexx project and mailing list that RexxLA started a couple years ago and which Rony has been trying to resurrect this year:

http://rice.safedataisp.net/mailman/listinfo/regrexx

Given that language parsing is one of the largest areas of computer programming problems, I don't see that it makes any sense to try and boil it all down to one instruction or method that could be added to a language. On the other hand, some helpful aids like RegRexx might be useful for this kind of thing. If I get time, I may look into that.

Meanwhile, your request is not well defined in terms of input format (Can list items span embedded sublists? Can the list item separator be omitted after a sublist? Can multiple list separators be used? And so on.) and you have not specified any output data structure at all. Looking at your example syntax, I made a few assumptions about input format and created a simple output format in order to devise a sample recursive code approach. Handling this type of syntax is way beyond what a parse instruction can do and I think this example shows that the general case is not trivial. (This program is also interesting because it will not interpret correctly without the trace instruction I inserted. The compiled version does not care. I tested that all the way back to NetRexx 2.05 by the way!)

-- Kermit

-------------------------------------------- Sample parsing code for an interesting syntax example ----------------------------------------------
trace var x

in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"

parseout=parselist(in)

dump(parseout)

method dump(data=Rexx,key="",offset="",link="") static
    say offset key link "token="data["token"]
    loop i=1 to data["items"]
        dump(data[i],i,offset "--*","==>")
        end

method parselist(input=Rexx,delims=Rexx "(),",start="1") static
    in=Rexx(input)
    in["count"]=0    --    in order to work recursively, we will need to count how many characters are consumed at each step

    loop label scanloop itemno=1 while start<in.length        --        need to loop in case multiple items
        deloc=in.substr(start).verify(delims,"match")        --    locate next delimiter
        in[itemno]=" "
        if deloc=0 then do --did not find a delimiter
             in[itemno,"token"]=in.substr(start)
             in["count"]=in["count"]+in.substr(start).length
             leave scanloop
            end
        else do --found a delimiter
            if delims.pos(in.substr(start+deloc-1,1))=1 then do        --        found a sublist - handle it recursively
                    in[itemno]=parselist(in.substr(start+deloc),delims)
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    deloc=deloc+in[itemno,"count"]
                    in["count"]=in["count"]+deloc
                    if start+deloc<in.length then do    --    syntax rules are not clear but do allow for a second delimiter after a sublist
                        secdel=in.substr(start+deloc).verify(delims.substr(3,1),"match")
                        if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
                        deloc=deloc+secdel
                        in["count"]=in["count"]+secdel
                        end
                    end
            else do    -- found end of item
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    in["count"]=in["count"]+deloc
                    if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    found end of list indicator
                    end
            end

        start=start+deloc        --        get next area to scan
        end scanloop
        in["items"]=itemno
        return in
-----------------------------------------------------------------------------------------------------------------------------
Program output:
-----------------------------------------------------------------------------------------------------------------------------

   token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
--* 1 ==> token=fun
--* --* 1 ==> token= arg1
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* --* 1 ==> token=
--* --* --* --* 1 ==> token=
--* --* --* --* --* 1 ==> token=nested
--* --* --* --* 2 ==> token= z

-----------------------------------------------------------------------------------------------------------------------------


Aviatrexx

Re: Nested List Support?

Raising the obvious question: Is there a method that takes an
arbitrary list of items and sublists such as Bill describes, and
creates an object containing the items and their hierarchical
relationships?

For problems such as this, I tend to fall back on the gospel according
to Knuth. :-)

-Chip-

On 12/6/2012 08:27 Measel, Mike said:

> Bill, I’m also from another planet. On my planet we have advanced
> lists called objects. You should check them out. They’re really fun
> to use in NetRexx.
>
> *From:* Bill Fenlason
> *Sent:* Wednesday, December 05, 2012 9:17 PM
>
> Kermit,
>
> On 12/5/2012 7:02 PM, Kermit Kiser wrote:
>
> Bill --
>
> Sometimes I wonder if we live on the same planet. ;-)
>
>
> You are not alone :)
>
>
> I don't understand how you think you can parse a string but not parse
> it in order. Nor do I understand why you would want a halfway solution
> that does not fully parse a string.
>
>
> There is a difference between "parsing a string" and dividing up a
> list which has been encoded as a string. With a list, it is natural
> to request the items in the list one at a time, and the items in a
> list may be either a string or another list. Possibly you are not
> familiar with list processing (ala Lisp) and find this confusing, but
> I assume that is not the case.
>
> Historical note: Original Lisp (for the IBM 704 in assembler) used a
> tree structure and used the contents of car (address register) and cdr
> (decrement register) to extract the first (next) item in a list and
> the remainder of the list. In Rene's example, he shows how to strip
> the first or next list item using the parse instruction but used the
> ancient names which are still in common use after all these years.
>
> The essential point here is that one of the ways that would make sense
> in NetRexx would be to encode a list as a string. In other words, a
> "super" string which is a "list of strings" or a list of: "strings"
> and "lists of strings".
>
> In no way did I mean to imply that the parsing was "half way", but
> just that the decomposition of the lists comes before the parsing of
> strings. First a list is processed, and then sublists are processed
> as necessary.
>
>
> I am sorry if you did not understand my code example, but I am not
> sure it can be simplified further. As I said in my post, "Handling
> this type of syntax is way beyond what a parse instruction can do and
> I think this example shows that the general case is not trivial."
>
>
> I didn't say that I didn't understand your code - it was well written
> and clear.
>
>
> Possibly you skipped that last phrase?
>
>
> The phrase I meant was /_"while still retaining the list structure."_/
> Parsing a single string is not the same as breaking up a list (which
> happens to be encoded as a single string).
>
>
> I try things out with code because I don't think in abstract logic
> like you and Mike seem to do. So I provide working code examples to
> show my thoughts here. But then you say that my code has to scan
> strings one character at a time which is no more true or false than
> saying the PARSE instruction has to scan strings one character at a
> time. (Or do you really think that PARSE does not look at all of the
> characters?)
>
>
> My point there was that I was asking if an approach that could be used
> would be to extend the parse command. I am looking for the best
> general purpose approach to handle the nested list problem. I agree
> that my question is more abstract than specific.
>
>
> You also seem to feel that the lowly NetRexx data type could not
> possibly maintain the structure of a list but I think that the Rexx
> object is the most powerful data structure ever invented. It can not
> only hold strings and numbers, it can hold lists and maps and do
> amazing things with them and each one is a complete associative
> database! (And even more features are in the advanced after3.01
> NetRexx version!)
>
>
> I don't know how you came to that conclusion - what did I say that
> gave you that idea? All I was asking was how to make the Rexx string
> object hold a list of strings and other (nested) lists of strings.
> Certainly the Rexx object can hold a simple list of strings. But it
> can not inherently hold a list containing strings and other lists of
> strings. External conventions for list delimiters must be provided.
> Possibly as an extension they could be added as fields in the Rexx object.
>
>
> Since I think that way, I will try again to explain what I mean with a
> code example. I modified my original sample program and added a method
> to reconstruct a parsed list, showing at each stage of reconstruction
> what list structure data can be extracted from the parsed string
> object. I even showed how you can transform one list syntax to another
> with the example parsed list Rexx object. (Your new example is
> basically the same structure with different delimiters, so the same
> code handles both examples fine.) Just ignore it if you still don't
> believe it can be done.
>
>
> I certainly understand that it can be done, Kermit, and your code
> obviously demonstrates it.
>
> But the code itself does not provide an answer to the original
> question I asked, which was "If NetRexx or Rexx were to be extended to
> allow convenient parsing of nested lists, how should it be approached?"
>
> In retrospect, perhaps I should have replaced the word "parsing" with
> "deconstruction".
>
> I provided 6 possibilities, and perhaps your code could be the basis
> of possibility 3 (built in functions), although there doesn't seem to
> be a clear API. It certainly demonstrates an example, but obviously
> I'm trying to avoid that level of user coding for the general case.
>
> I was asking "what is the best approach?", not "can it be done?" or
> "is there a code snippet that can be used?".
>
> I thought my original post asked a single question and was reasonably
> clear, but apparently I was wrong about that.
>
>
> BTW: PARSE is intended for very simple parsing problems. That is why
> RexxLA started the RegRexx project to provide a more sophisticated
> pattern matching and parsing facility with a simpler syntax and more
> flexibility than regex has. (It remains to be seen if that can be
> done.) I think that is also why Mike included the verify and
> translate, etc, mechanisms to handle more complex parsing needs.
>
>
> Yes, that is what Mike said as well, and I agree in general. I
> suggested the possibility of extending the parse statement by adding a
> functional notation in the template, but Mike said he considered and
> rejected it some time ago.
>
>
> -- Kermit
>
>
> Bill
>
>
> -----------------------------------------------------------------------------
> Program output:
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
> parsing this list:
> fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
>
> display parsed list structure
> token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
> items=1
> --* 1 ==> token=fun
> --* items=2
> --* --* 1 ==> token= arg1
> --* --* items=2
> --* --* --* 1 ==> token=arg1a
> --* --* --* 2 ==> token= arg1b
> --* --* 2 ==> token= arg2
> --* --* items=2
> --* --* --* 1 ==> token=
> --* --* --* items=1
> --* --* --* --* 1 ==> token=
> --* --* --* --* items=1
> --* --* --* --* --* 1 ==> token=nested
> --* --* --* 2 ==> token= z
>
> now reconstruct original input list
> element=null
> element=fun
> element= arg1
> element=arg1a
> element= arg1b
> list=(arg1a, arg1b)
> element= arg2
> element=null
> element=null
> element=nested
> list=(nested)
> list=((nested))
> element= z
> list=(((nested)), z)
> list=( arg1(arg1a, arg1b), arg2(((nested)), z))
> list=(fun( arg1(arg1a, arg1b), arg2(((nested)), z)))
>
> reconstructed list== fun( arg1(arg1a, arg1b), arg2(((nested)), z))
>
> parsing this list:
> { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }
>
> display parsed list structure
> token= { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }
> items=1
> --* 1 ==> token=
> --* items=5
> --* --* 1 ==> token= 1
> --* --* 2 ==> token= 2 3
> --* --* 3 ==> token=
> --* --* items=2
> --* --* --* 1 ==> token= 9 8
> --* --* --* 2 ==> token=
> --* --* --* items=2
> --* --* --* --* 1 ==> token= 7
> --* --* --* --* 2 ==> token= 6
> --* --* 4 ==> token=
> --* --* 5 ==> token= 4 5
>
> now reconstruct original input list
> element=null
> element=null
> element= 1
> element= 2 3
> element=null
> element= 9 8
> element=null
> element= 7
> element= 6
> list={ 7 , 6 }
> list={ 9 8 , { 7 , 6 }}
> element=null
> element= 4 5
> list={ 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }
> list={ { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }}
>
> reconstructed list== { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }
>
> now for more fun lets reconstruct string 1 with string 2 syntax
> element=null
> element=fun
> element= arg1
> element=arg1a
> element= arg1b
> list={arg1a, arg1b}
> element= arg2
> element=null
> element=null
> element=nested
> list={nested}
> list={{nested}}
> element= z
> list={{{nested}}, z}
> list={ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}
> list={fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}}
>
> reconstructed list== fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}
>
> and then lets reconstruct string 2 with string 1 syntax
> element=null
> element=null
> element= 1
> element= 2 3
> element=null
> element= 9 8
> element=null
> element= 7
> element= 6
> list=( 7 , 6 )
> list=( 9 8 , ( 7 , 6 ))
> element=null
> element= 4 5
> list=( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )
> list=( ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 ))
>
> reconstructed list== ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )
>
> and lets also try a new syntax for string 1
> element=null
> element=fun
> element= arg1
> element=arg1a
> element= arg1b
> list=<arg1a/ arg1b>
> element= arg2
> element=null
> element=null
> element=nested
> list=<nested>
> list=<<nested>>
> element= z
> list=<<<nested>>/ z>
> list=< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>
> list=<fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>>
>
> reconstructed list== fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>
>
> and likewise for string 2
> element=null
> element=null
> element= 1
> element= 2 3
> element=null
> element= 9 8
> element=null
> element= 7
> element= 6
> list=< 7 / 6 >
> list=< 9 8 / < 7 / 6 >>
> element=null
> element= 4 5
> list=< 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >
> list=< < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >>
>
> reconstructed list== < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> -----------------------------------------------------------------------------------------
> Program code:
> ---------------------------------------------------------------------------------------------------------------------------------------
> trace var x
> in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"
> delims="(),"
> say "\n parsing this list:";say in
> parseout=parselist(in)
> say "\n display parsed list structure"
> dump(parseout)
> say "\n now reconstruct original input list"
> rl=reconstructlist(parseout)
> say "\n reconstructed list==" rl.substr(2,rl.length-2)
> -- items are stored as an implicit list
>
> in2=" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }"
> say "\n parsing this list:";say in2
> delims2="{},"
> parseout2=parselist(in2,delims2)
> say "\n display parsed list structure"
> dump(parseout2)
> say "\n now reconstruct original input list"
> rl2=reconstructlist(parseout2,delims2)
> say "\n reconstructed list==" rl2.substr(2,rl2.length-2)
> -- items are stored as an implicit list
>
> say "\n now for more fun lets reconstruct string 1 with string 2 syntax"
> rl1a=reconstructlist(parseout,delims2)
> say "\n reconstructed list==" rl1a.substr(2,rl1a.length-2)
> -- -- items are stored as an implicit list
>
> say "\n and then lets reconstruct string 2 with string 1 syntax"
> rl2a=reconstructlist(parseout2,delims)
> say "\n reconstructed list==" rl2a.substr(2,rl2a.length-2)
> -- -- items are stored as an implicit list
>
> say "\n and lets also try a new syntax for string 1"
> rl1aa=reconstructlist(parseout,"<>/")
> say "\n reconstructed list==" rl1aa.substr(2,rl1aa.length-2)
> -- -- items are stored as an implicit list
>
> say "\n and likewise for string 2"
> rl2aa=reconstructlist(parseout2,"<>/")
> say "\n reconstructed list==" rl2aa.substr(2,rl2aa.length-2)
> -- -- items are stored as an implicit list
>
> method reconstructlist(input=Rexx,delims=Rexx "(),") static
>
> if input.exists("token") then
> segment1=input["token"]
> else segment1=""
> if segment1\="" then
> say "element="segment1
> else say "element=null"
> segment=""
> if input.exists("items") then do
> segment=delims.substr(1,1)
> loop i=1 to input["items"]
>
> segment=segment||reconstructlist(input[i],delims)||delims.substr(3,1)
> end
> segment=segment.strip("t",delims.substr(3,1))||delims.substr(2,1)
> say "list="segment
> end
> return segment1||segment
>
> method dump(data=Rexx,key="",offset="",link="") static
> say offset key link "token="data["token"]
> if data.exists("items") then do
> say offset "items="data["items"]
> loop i=1 to data["items"]
> dump(data[i],i,offset "--*","==>")
> end
> end
>
> method parselist(input=Rexx,delims=Rexx "(),",start="1") static
> in=Rexx(input)
> -- in order to work recursively, we will need to count how many
> -- characters are consumed at each step
> in["count"]=0
>
> -- need to loop in case multiple items
> loop label scanloop itemno=1 while start<=in.length
> -- locate next delimiter
> deloc=in.substr(start).verify(delims,"match")
> in[itemno]=""
> if deloc=0 then do --did not find a delimiter
> in[itemno,"token"]=in.substr(start)
> in["count"]=in["count"]+in.substr(start).length
> leave scanloop
> end
> else do --found a delimiter
> if delims.pos(in.substr(start+deloc-1,1))=1 then do
> -- found a sublist - handle it recursively
> in[itemno]=parselist(in.substr(start+deloc),delims)
> in[itemno,"token"]=in.substr(start,deloc-1)
> deloc=deloc+in[itemno,"count"]
> in["count"]=in["count"]+deloc
> -- syntax rules are not clear but do allow for a second delimiter after a sublist
> if start+deloc<in.length then do
> secdel=in.substr(start+deloc).verify(delims.substr(2,2),"match")
> -- ignore extra junk before end of line unless valid list separator
> if secdel=0 then leave scanloop
> deloc=deloc+secdel
> in["count"]=in["count"]+secdel
> -- list can be terminated by list ender with no more tokens
> if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop
> end
> end
> else do -- found end of item
> in[itemno,"token"]=in.substr(start,deloc-1)
> in["count"]=in["count"]+deloc
> -- found end of list indicator
> if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop
> end
> end
>
> start=start+deloc -- get next area to scan
> -- don't increment items count if done
> if start>in.length then leave scanloop
> -- ignore delim at end of line
> if start=in.length then
> if in.substr(start).verify(delims)=0 then leave scanloop
> end scanloop
> in["items"]=itemno
> return in
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> On 12/5/2012 5:55 AM, Bill Fenlason wrote:
>
> Kermit,
>
> The problem I was trying to address is that of encoding and
> parsing nested lists within a string. The example was just that -
> parsing a function call which happens to contain lists. Perhaps I
> could have been more clear and used a different example.
>
> I do not want to parse the string in order as your code does - I
> want to encode the string such that the string contains either
> elements or lists, and then parse it into a sequence of either
> elements or lists.
>
> As I said:
>
> "In some other programming languages (like lisp and python), a
> "list" is a fundamental concept.
>
> Since the Rexx family of languages is string oriented, it is not
> uncommon to encode a list within a string with a separator
> character, and to process the list with something like:
>
> parse list_contents list_item "," list_contents ;
>
> The problem arises when a list may contain other lists. For
> example, a list may be encoded with specific matched start and end
> characters and a separator character (e.g. '(' and ')' and ',' ).
> If the lists may be nested, there does not appear to be any easy
> way to parse it /_while still retaining the list structure._/"
>
> Possibly you skipped that last phrase and jumped to the example?
>
> To reiterate, the problem is how to manipulate lists and nested
> lists in NetRexx. As I pointed out, the most natural way would be
> to somehow encode the lists into a string since Rexx is
> essentially string based. As I also pointed out in option 4,
> perhaps there should be some kind of list object instead.
>
> It should be noted that list processing is a very powerful and
> widely used mechanism. Lisp was defined shortly after Fortran,
> and is still used as the initial language in many CS curricula
> (e.g. MIT). The ability to easily manipulate lists would make
> NetRexx richer. As I mentioned, I started to convert a python
> program to NetRexx and discovered that there was no easy way.
>
> Here is another example:
>
> Define an element to be a sequence of zero or more blank separated
> digits.
>
> Define a list to be a sequence of zero or more elements or lists.
>
> How does one encode a list into a string such that it can be
> easily parsed into its constituent elements and lists?
>
> For example, suppose that we encode a list as a sequence of
> elements or lists surrounded by '{' and '}' and separated by ','
> . Note this is just one way to encode a list - others could be
> used. Note that other element definitions can exist.
>
> The encoded string " { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } "
> represents a list whose contents are elements 1 , 2 3 , an
> imbedded list, a null element and element 4 5.
>
> When parsed and processed, the decoded string would produce the
> following:
> element 1
> element 2 3
> list { 9 8 , { 7 , 6 } }
> null element
> element 4 5
>
> This is what I meant by retaining the list structure. Subsequent
> processing could decode the nested list.
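
The decomposition described above, one level at a time with nested lists left encoded, can be sketched in a few lines of Python. This is only an illustration under the '{', '}' and ',' encoding from the example; the name `split_top_level` is hypothetical, and elements are returned with surrounding blanks stripped.

```python
def split_top_level(encoded, open_ch="{", close_ch="}", sep=","):
    """Split one encoded list into its immediate items.

    Nested lists come back as still-encoded strings, so the caller can
    decode them later with another call.
    """
    text = encoded.strip()
    if len(text) < 2 or text[0] != open_ch or text[-1] != close_ch:
        raise ValueError("not an encoded list: %r" % encoded)
    body = text[1:-1]

    items = []
    depth = 0      # how many nested lists we are currently inside
    current = []   # characters of the item being collected
    for ch in body:
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced delimiters")
        if ch == sep and depth == 0:
            # a separator outside any nested list ends the current item
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError("unbalanced delimiters")
    items.append("".join(current).strip())
    return items


items = split_top_level(" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } ")
for item in items:
    if item.startswith("{"):
        print("list", item)
    elif item == "":
        print("null element")
    else:
        print("element", item)
```

One design consequence worth noting: with this encoding an empty body cannot be told apart from a list holding a single null element, so `{}` decodes to one null element here.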
>
> The crux of the problem is how to parse content which contains
> _matching_ start and end characters. In the example above, the
> first level list contains another list, and therefore another
> matched '{' '}' pair. There has to be some mechanism which
> matches the beginning and ending characters while the content may
> contain additional (nested) pairs.
>
> I'm not familiar with the RegRexx project, but from the name I
> assume it might be related to regular expressions? If that is
> the case it may not help, since standard regular expressions
> cannot by themselves be used to solve the nesting problem (i.e.
> ensuring that the delimiters are correctly matched).
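
The matching mechanism mentioned above amounts to keeping a nesting counter while scanning ahead. A minimal Python sketch follows, with the function name `find_matching_close` invented for illustration; it is the kind of "scan ahead" helper that option 2 imagines a parse template could call.

```python
def find_matching_close(text, start, open_ch="{", close_ch="}"):
    """Return the index of the delimiter that closes text[start]."""
    if text[start] != open_ch:
        raise ValueError("text[start] is not an opening delimiter")
    depth = 0
    for index in range(start, len(text)):
        if text[index] == open_ch:
            depth += 1          # entering a (possibly nested) list
        elif text[index] == close_ch:
            depth -= 1          # leaving a list
            if depth == 0:
                return index    # this one closes the list we started in
    raise ValueError("no matching closing delimiter")


sample = "{ 9 8 , { 7 , 6 } } , , 4 5"
end = find_matching_close(sample, 0)
print(sample[:end + 1])   # the whole nested list, inner pair included
```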
>
> Thanks for taking the time to write and test your code. I see
> that it has to scan the input one character at a time and the
> parse instruction is not used. My question was also asking if
> there was some way to solve this problem by extending the parse
> instruction.
>
> Bill
>
> On 12/5/2012 6:20 AM, Kermit Kiser wrote:
>
> Hi Bill --
>
> Interesting problem you propose. I wonder if it is related to
> the RegRexx project and mailing list that RexxLA started a
> couple years ago and which Rony has been trying to resurrect
> this year:
>
> http://rice.safedataisp.net/mailman/listinfo/regrexx
>
> Given that language parsing is one of the largest areas of
> computer programming problems, I don't see that it makes any
> sense to try and boil it all down to one instruction or method
> that could be added to a language. On the other hand, some
> helpful aids like RegRexx might be useful for this kind of
> thing. If I get time, I may look into that.
>
> Meanwhile, your request is not well defined in terms of input
> format (can list items span embedded sublists? Can the list
> item separator be omitted after a sublist? Can multiple list
> separators be used? etc.) and you have not specified any
> output data structure at all. Looking at your example syntax,
> I made a few assumptions about input format and created a
> simple output format in order to devise a sample recursive
> code approach. Handling this type of syntax is way beyond what
> a parse instruction can do and I think this example shows that
> the general case is not trivial. (This program is also
> interesting because it will not interpret correctly without
> the trace instruction I inserted. The compiled version does
> not care. I tested that all the way back to NetRexx 2.05 by
> the way!)
>
> -- Kermit
>
> ---------- Sample parsing code for an interesting syntax example ----------
> trace var x
>
> in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"
>
> parseout=parselist(in)
>
> dump(parseout)
>
> method dump(data=Rexx,key="",offset="",link="") static
> say offset key link "token="data["token"]
> loop i=1 to data["items"]
> dump(data[i],i,offset "--*","==>")
> end
>
> method parselist(input=Rexx,delims=Rexx "(),",start="1") static
> in=Rexx(input)
> -- in order to work recursively, we will need to count how many
> -- characters are consumed at each step
> in["count"]=0
>
> loop label scanloop itemno=1 while start<in.length
> -- need to loop in case multiple items
> deloc=in.substr(start).verify(delims,"match")
> -- locate next delimiter
> in[itemno]=" "
> if deloc=0 then do --did not find a delimiter
> in[itemno,"token"]=in.substr(start)
> in["count"]=in["count"]+in.substr(start).length
> leave scanloop
> end
> else do --found a delimiter
> if delims.pos(in.substr(start+deloc-1,1))=1 then
> do -- found a sublist - handle it recursively
>
> in[itemno]=parselist(in.substr(start+deloc),delims)
> in[itemno,"token"]=in.substr(start,deloc-1)
> deloc=deloc+in[itemno,"count"]
> in["count"]=in["count"]+deloc
> -- syntax rules are not clear but do allow for a second delimiter after a sublist
> if start+deloc<in.length then do
> secdel=in.substr(start+deloc).verify(delims.substr(3,1),"match")
> -- ignore extra junk before end of line unless valid list separator
> if secdel=0 then leave scanloop
> deloc=deloc+secdel
> in["count"]=in["count"]+secdel
> end
> end
> else do -- found end of item
> in[itemno,"token"]=in.substr(start,deloc-1)
> in["count"]=in["count"]+deloc
> -- found end of list indicator
> if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop
> end
> end
>
> start=start+deloc -- get next area to scan
> end scanloop
> in["items"]=itemno
> return in
> -----------------------------------------------------------------------------------------------------------------------------
>
> Program output:
> -----------------------------------------------------------------------------------------------------------------------------
>
>
> token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
> --* 1 ==> token=fun
> --* --* 1 ==> token= arg1
> --* --* --* 1 ==> token=arg1a
> --* --* --* 2 ==> token= arg1b
> --* --* 2 ==> token= arg2
> --* --* --* 1 ==> token=
> --* --* --* --* 1 ==> token=
> --* --* --* --* --* 1 ==> token=nested
> --* --* --* --* 2 ==> token= z
>
> -----------------------------------------------------------------------------------------------------------------------------
>
>
>
> On 12/4/2012 1:58 PM, Bill Fenlason wrote:
>
> Rene,
>
> I can see how that might be used if the input consists of
> blank delimited words, but I'm not sure I understand how you
> would parse the example I provided - one character at a time?
>
> If you don't know if the next special character is a begin
> list, end list or separator, how would you specify the
> template? Of course you could parse the remainder 3 different
> times and compare lengths etc., but that seems a bit of a
> kludge, as would parsing 1 character at a time.
>
> What I'm looking for is an outer loop that parses the input
> into elements or lists (which may contain additional lists).
>
> Bill
>
> On 12/4/2012 6:32 PM, René Jansen wrote:
>
> Bill,
>
> 5) just my 2 cents here:
>
> Something that I use often and which seems relevant for this,
> is the 'recursive parse' stolen from lisp car and cdr:
>
> loop while cdr.words() > 0
>   parse cdr car cdr
>   do
>     car.something()
>   end
> end
>
> or some variations thereof
>
> The trick here is to take off the first element and leave the
> rest for the next iteration; did this already in classic Rexx
> years ago.
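
For comparison, the same take-the-first-word, keep-the-rest loop in Python might look like the sketch below. The generator name `each_word` is made up, and like the parse template it only splits on blanks, so it knows nothing about nested delimiters.

```python
def each_word(text):
    """Yield the words of text one at a time, car/cdr style."""
    cdr = text
    while cdr.split():                 # loop while any words remain
        parts = cdr.split(None, 1)     # first word, and the rest
        car = parts[0]
        cdr = parts[1] if len(parts) > 1 else ""
        yield car


words = list(each_word("alpha  beta gamma"))
print(words)
```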
>
> best regards,
>
> René.
>
>
>
> On 4 Dec. 2012, at 19:09, Bill Fenlason <[hidden email]> wrote:
>
>
> This question is primarily for Mike, but I'm sure others of
> you will have comments or suggestions.
>
> In some other programming languages (like lisp and python), a
> "list" is a fundamental concept.
>
> Since the Rexx family of languages is string oriented, it is
> not uncommon to encode a list within a string with a separator
> character, and to process the list with something like:
>
> parse list_contents list_item "," list_contents ;
>
> The problem arises when a list may contain other lists. For
> example, a list may be encoded with specific matched start and
> end characters and a separator character (e.g. '(' and ')' and
> ',' ). If the lists may be nested, there does not appear to be
> any easy way to parse it while still retaining the list
> structure.
>
> One example would be the parsing of source containing
> expressions or argument lists.
>
> eg. "fun( arg1(arg1a, arg1b), arg2(((nested)), z) ) ...."
>
> Another example might be an encoding approach which uses '{', '}'
> and '`' for generic list encoding.
>
> The question is, if NetRexx or Rexx were to be
> extended to allow convenient parsing of nested lists,
> how should it be approached?
>
> 1) Provide a new statement like "parselist", similar
> to the parse statement but which allows the
> specification of beginning and ending characters. The
> scanning would check for matched pairs and process
> appropriately.
>
> 2) Extend the parse statement by providing a new type
> of pattern, perhaps a function notation which calls a
> function to scan ahead, skipping the contents of the
> data within matched beginning and ending characters.
>
> 3) Provide a built in function which perhaps includes
> the parse statement arguments.
>
> 4) Provide a new type of list object, and have the
> parse statement understand its structure.
>
> 5) Some other approach?
>
> 6) Ignore this problem.
>
> I assume this topic has been discussed before, but I
> must have missed it or I don't remember it. It seems
> to me that this is a general weakness in NetRexx.
>
> How have other people handled this problem in the past?
>
> The topic came up for me when I tried to convert a
> python program to NetRexx.
>

_______________________________________________
Ibm-netrexx mailing list
[hidden email]
Online Archive : http://ibm-netrexx.215625.n3.nabble.com/

ThSITC

Re: Nested List Support?

Sorry to be here again:

I *actually do have* so-called 'List-items' in my implementation.

I do, however, currently maintain those by 'Item-Numbers', not as real
Objects, for historical reasons.
(As classic Rexx didn't have any objects at all).

A *List-Item* thus does have:

Item-Type='List'
Item-Name= ' 3 5 7 12' -- where this is the WORD-List of the
item-numbers of the Components

(the NetRexx names do actually use UNDERLINES, not Hyphens, of course)
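
To make the item-number scheme concrete, here is a small Python sketch of how such a flat table could be expanded into a nested list. Everything in it is an assumption made for illustration: the table contents, the 'String' type name and the function `expand` are invented, and only the idea of a 'List' item whose name is a word list of component item numbers comes from the description above.

```python
# Hypothetical flat item table: item number -> (item type, item name).
# For a 'List' item the name is a blank-separated word list of the
# item numbers of its components, as in ' 3 5 7 12'.
ITEMS = {
    1: ("List", " 3 5 7 12"),
    3: ("String", "alpha"),
    5: ("String", "beta"),
    7: ("List", " 8 9"),
    8: ("String", "gamma"),
    9: ("String", "delta"),
    12: ("String", "epsilon"),
}


def expand(item_number, items=ITEMS):
    """Turn an item number into a nested Python list of strings."""
    item_type, item_name = items[item_number]
    if item_type != "List":
        return item_name
    # A list item: follow each component number recursively.
    return [expand(int(word), items) for word in item_name.split()]


nested = expand(1)
print(nested)
```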

-- end of current interruption --
Thomas.

(still busy delivering all of those things to the KENAI NetRexx repository)
=============================================================================
On 06.12.2012 15:18, Chip Davis wrote:

> Raising the obvious question: Is there a method that takes an
> arbitrary list of items and sublists such as Bill describes, and
> creates an object containing the items and their hierarchical
> relationships?
>
> For problems such as this, I tend to fall back on the gospel according
> to Knuth. :-)
>
> -Chip-
>
> On 12/6/2012 08:27 Measel, Mike said:
>> Bill, I’m also from another planet. On my planet we have advanced
>> lists called objects. You should check them out. They’re really fun
>> to use in NetRexx.
>>
>> *From:* Bill Fenlason
>> *Sent:* Wednesday, December 05, 2012 9:17 PM
>>
>> Kermit,
>>
>> On 12/5/2012 7:02 PM, Kermit Kiser wrote:
>>
>> Bill --
>>
>> Sometimes I wonder if we live on the same planet. ;-)
>>
>>
>> You are not alone :)
>>
>>
>> I don't understand how you think you can parse a string but not parse
>> it in order. Nor do I understand why you would want a halfway solution
>> that does not fully parse a string.
>>
>>
>> There is a difference between "parsing a string" and dividing up a
>> list which has been encoded as a string. With a list, it is natural
>> to request the items in the list one at a time, and the items in a
>> list may be either a string or another list. Possibly you are not
>> familiar with list processing (à la Lisp) and find this confusing, but
>> I assume that is not the case.
>>
>> Historical note: Original Lisp (for the IBM 704 in assembler) used a
>> tree structure and used the contents of car (address register) and cdr
>> (decrement register) to extract the first (next) item in a list and
>> the remainder of the list. In Rene's example, he shows how to strip
>> the first or next list item using the parse instruction but used the
>> ancient names which are still in common use after all these years.
>>
>> The essential point here is that one of the ways that would make sense
>> in NetRexx would be to encode a list as a string. In other words, a
>> "super" string which is a "list of strings" or a list of: "strings"
>> and "lists of strings".
>>
>> In no way did I mean to imply that the parsing was "half way", but
>> just that the decomposition of the lists comes before the parsing of
>> strings. First a list is processed, and then sublists are processed
>> as necessary.
>>
>>
>> I am sorry if you did not understand my code example, but I am not
>> sure it can be simplified further. As I said in my post, "Handling
>> this type of syntax is way beyond what a parse instruction can do and
>> I think this example shows that the general case is not trivial."
>>
>>
>> I didn't say that I didn't understand your code - it was well written
>> and clear.
>>
>>
>> Possibly you skipped that last phrase?
>>
>>
>> The phrase I meant was /_"while still retaining the list structure."_/
>> Parsing a single string is not the same as breaking up a list (which
>> happens to be encoded as a single string).
>>
>>
>> I try things out with code because I don't think in abstract logic
>> like you and Mike seem to do. So I provide working code examples to
>> show my thoughts here. But then you say that my code has to scan
>> strings one character at a time which is no more true or false than
>> saying the PARSE instruction has to scan strings one character at a
>> time. (Or do you really think that PARSE does not look at all of the
>> characters?)
>>
>>
>> My point there was that I was asking if an approach that could be used
>> would be to extend the parse command. I am looking for the best
>> general purpose approach to handle the nested list problem. I agree
>> that my question is more abstract than specific.
>>
>>
>> You also seem to feel that the lowly NetRexx data type could not
>> possibly maintain the structure of a list but I think that the Rexx
>> object is the most powerful data structure ever invented. It can not
>> only hold strings and numbers, it can hold lists and maps and do
>> amazing things with them and each one is a complete associative
>> database! (And even more features are in the advanced after 3.01
>> NetRexx version!)
>>
>>
>> I don't know how you came to that conclusion - what did I say that
>> gave you that idea? All I was asking was how to make the Rexx string
>> object hold a list of strings and other (nested) lists of strings.
>> Certainly the Rexx object can hold a simple list of strings. But it
>> can not inherently hold a list containing strings and other lists of
>> strings. External conventions for list delimiters must be provided.
>> Possibly as an extension they could be added as fields in the Rexx
>> object.
>>
>>
>> Since I think that way, I will try again to explain what I mean with a
>> code example. I modified my original sample program and added a method
>> to reconstruct a parsed list, showing at each stage of reconstruction
>> what list structure data can be extracted from the parsed string
>> object. I even showed how you can transform one list syntax to another
>> with the example parsed list Rexx object. (Your new example is
>> basically the same structure with different delimiters, so the same
>> code handles both examples fine.) Just ignore it if you still don't
>> believe it can be done.
>>
>>
>> I certainly understand that it can be done, Kermit, and your code
>> obviously demonstrates it.
>>
>> But the code itself does not provide an answer to the original
>> question I asked, which was "If NetRexx or Rexx were to be extended to
>> allow convenient parsing of nested lists, how should it be approached?"
>>
>> In retrospect, perhaps I should have replaced the word "parsing" with
>> "deconstruction".
>>
>> I provided 6 possibilities, and perhaps your code could be the basis
>> of possibility 3 (built in functions), although there doesn't seem to
>> be a clear API. It certainly demonstrates an example, but obviously
>> I'm trying to avoid that level of user coding for the general case.
>>
>> I was asking "what is the best approach?", not "can it be done?" or
>> "is there a code snippet that can be used?".
>>
>> I thought my original post asked a single question and was reasonably
>> clear, but apparently I was wrong about that.
>>
>>
>> BTW: PARSE is intended for very simple parsing problems. That is why
>> RexxLA started the RegRexx project to provide a more sophisticated
>> pattern matching and parsing facility with a simpler syntax and more
>> flexibility than regex has. (It remains to be seen if that can be
>> done.) I think that is also why Mike included the verify and
>> translate, etc, mechanisms to handle more complex parsing needs.
>>
>>
>> Yes, that is what Mike said as well, and I agree in general. I
>> suggested the possibility of extending the parse statement by adding a
>> functional notation in the template, but Mike said he considered and
>> rejected it some time ago.
>>
>>
>> -- Kermit
>>
>>
>> Bill
>>
>>
>> -----------------------------------------------------------------------------
>>
>> Program output:
>> ---------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>
>> parsing this list:
>> fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
>>
>> display parsed list structure
>> token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
>> items=1
>> --* 1 ==> token=fun
>> --* items=2
>> --* --* 1 ==> token= arg1
>> --* --* items=2
>> --* --* --* 1 ==> token=arg1a
>> --* --* --* 2 ==> token= arg1b
>> --* --* 2 ==> token= arg2
>> --* --* items=2
>> --* --* --* 1 ==> token=
>> --* --* --* items=1
>> --* --* --* --* 1 ==> token=
>> --* --* --* --* items=1
>> --* --* --* --* --* 1 ==> token=nested
>> --* --* --* 2 ==> token= z
>>
>> now reconstruct original input list
>> element=null
>> element=fun
>> element= arg1
>> element=arg1a
>> element= arg1b
>> list=(arg1a, arg1b)
>> element= arg2
>> element=null
>> element=null
>> element=nested
>> list=(nested)
>> list=((nested))
>> element= z
>> list=(((nested)), z)
>> list=( arg1(arg1a, arg1b), arg2(((nested)), z))
>> list=(fun( arg1(arg1a, arg1b), arg2(((nested)), z)))
>>
>> reconstructed list== fun( arg1(arg1a, arg1b), arg2(((nested)), z))
>>
>> parsing this list:
>> { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }
>>
>> display parsed list structure
>> token= { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }
>> items=1
>> --* 1 ==> token=
>> --* items=5
>> --* --* 1 ==> token= 1
>> --* --* 2 ==> token= 2 3
>> --* --* 3 ==> token=
>> --* --* items=2
>> --* --* --* 1 ==> token= 9 8
>> --* --* --* 2 ==> token=
>> --* --* --* items=2
>> --* --* --* --* 1 ==> token= 7
>> --* --* --* --* 2 ==> token= 6
>> --* --* 4 ==> token=
>> --* --* 5 ==> token= 4 5
>>
>> now reconstruct original input list
>> element=null
>> element=null
>> element= 1
>> element= 2 3
>> element=null
>> element= 9 8
>> element=null
>> element= 7
>> element= 6
>> list={ 7 , 6 }
>> list={ 9 8 , { 7 , 6 }}
>> element=null
>> element= 4 5
>> list={ 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }
>> list={ { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }}
>>
>> reconstructed list== { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }
>>
>> now for more fun lets reconstruct string 1 with string 2 syntax
>> element=null
>> element=fun
>> element= arg1
>> element=arg1a
>> element= arg1b
>> list={arg1a, arg1b}
>> element= arg2
>> element=null
>> element=null
>> element=nested
>> list={nested}
>> list={{nested}}
>> element= z
>> list={{{nested}}, z}
>> list={ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}
>> list={fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}}
>>
>> reconstructed list== fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}
>>
>> and then lets reconstruct string 2 with string 1 syntax
>> element=null
>> element=null
>> element= 1
>> element= 2 3
>> element=null
>> element= 9 8
>> element=null
>> element= 7
>> element= 6
>> list=( 7 , 6 )
>> list=( 9 8 , ( 7 , 6 ))
>> element=null
>> element= 4 5
>> list=( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )
>> list=( ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 ))
>>
>> reconstructed list== ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )
>>
>> and lets also try a new syntax for string 1
>> element=null
>> element=fun
>> element= arg1
>> element=arg1a
>> element= arg1b
>> list=<arg1a/ arg1b>
>> element= arg2
>> element=null
>> element=null
>> element=nested
>> list=<nested>
>> list=<<nested>>
>> element= z
>> list=<<<nested>>/ z>
>> list=< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>
>> list=<fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>>
>>
>> reconstructed list== fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>
>>
>> and likewise for string 2
>> element=null
>> element=null
>> element= 1
>> element= 2 3
>> element=null
>> element= 9 8
>> element=null
>> element= 7
>> element= 6
>> list=< 7 / 6 >
>> list=< 9 8 / < 7 / 6 >>
>> element=null
>> element= 4 5
>> list=< 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >
>> list=< < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >>
>>
>> reconstructed list== < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >
>>
>> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> -----------------------------------------------------------------------------------------
>>
>> Program code:
>> ---------------------------------------------------------------------------------------------------------------------------------------
>>
>> trace var x
>> in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"
>> delims="(),"
>> say "\n parsing this list:";say in
>> parseout=parselist(in)
>> say "\n display parsed list structure"
>> dump(parseout)
>> say "\n now reconstruct original input list"
>> rl=reconstructlist(parseout)
>> say "\n reconstructed list==" rl.substr(2,rl.length-2) -- items are stored as an implicit list
>>
>> in2=" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }"
>> say "\n parsing this list:";say in2
>> delims2="{},"
>> parseout2=parselist(in2,delims2)
>> say "\n display parsed list structure"
>> dump(parseout2)
>> say "\n now reconstruct original input list"
>> rl2=reconstructlist(parseout2,delims2)
>> say "\n reconstructed list==" rl2.substr(2,rl2.length-2) --
>> -- items are stored as an implicit list
>>
>> say "\n now for more fun lets reconstruct string 1 with string 2 syntax"
>> rl1a=reconstructlist(parseout,delims2)
>> say "\n reconstructed list==" rl1a.substr(2,rl1a.length-2)
>> -- -- items are stored as an implicit list
>>
>> say "\n and then lets reconstruct string 2 with string 1 syntax"
>> rl2a=reconstructlist(parseout2,delims)
>> say "\n reconstructed list==" rl2a.substr(2,rl2a.length-2)
>> -- -- items are stored as an implicit list
>>
>> say "\n and lets also try a new syntax for string 1"
>> rl1aa=reconstructlist(parseout,"<>/")
>> say "\n reconstructed list==" rl1aa.substr(2,rl1aa.length-2)
>> -- -- items are stored as an implicit list
>>
>> say "\n and likewise for string 2"
>> rl2aa=reconstructlist(parseout2,"<>/")
>> say "\n reconstructed list==" rl2aa.substr(2,rl2aa.length-2)
>> -- -- items are stored as an implicit list
>>
>> method reconstructlist(input=Rexx,delims=Rexx "(),") static
>>
>> if input.exists("token") then
>> segment1=input["token"]
>> else segment1=""
>> if segment1\="" then
>> say "element="segment1
>> else say "element=null"
>> segment=""
>> if input.exists("items") then do
>> segment=delims.substr(1,1)
>> loop i=1 to input["items"]
>>
>> segment=segment||reconstructlist(input[i],delims)||delims.substr(3,1)
>> end
>> segment=segment.strip("t",delims.substr(3,1))||delims.substr(2,1)
>> say "list="segment
>> end
>> return segment1||segment
>>
>> method dump(data=Rexx,key="",offset="",link="") static
>> say offset key link "token="data["token"]
>> if data.exists("items") then do
>> say offset "items="data["items"]
>> loop i=1 to data["items"]
>> dump(data[i],i,offset "--*","==>")
>> end
>> end
>>
>> method parselist(input=Rexx,delims=Rexx "(),",start="1") static
>>     in=Rexx(input)
>>     in["count"]=0 -- in order to work recursively, we will need to count how many characters are consumed at each step
>>
>>     loop label scanloop itemno=1 while start<=in.length -- need to loop in case multiple items
>>         deloc=in.substr(start).verify(delims,"match") -- locate next delimiter
>>         in[itemno]=""
>>         if deloc=0 then do --did not find a delimiter
>>             in[itemno,"token"]=in.substr(start)
>>             in["count"]=in["count"]+in.substr(start).length
>>             leave scanloop
>>         end
>>         else do --found a delimiter
>>             if delims.pos(in.substr(start+deloc-1,1))=1 then do -- found a sublist - handle it recursively
>>                 in[itemno]=parselist(in.substr(start+deloc),delims)
>>                 in[itemno,"token"]=in.substr(start,deloc-1)
>>                 deloc=deloc+in[itemno,"count"]
>>                 in["count"]=in["count"]+deloc
>>                 if start+deloc<in.length then do -- syntax rules are not clear but do allow for a second delimiter after a sublist
>>                     secdel=in.substr(start+deloc).verify(delims.substr(2,2),"match")
>>                     if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
>>                     deloc=deloc+secdel
>>                     in["count"]=in["count"]+secdel
>>                     if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop -- list can be terminated by list ender with no more tokens
>>                 end
>>             end
>>             else do -- found end of item
>>                 in[itemno,"token"]=in.substr(start,deloc-1)
>>                 in["count"]=in["count"]+deloc
>>                 if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop -- found end of list indicator
>>             end
>>         end
>>
>>         start=start+deloc -- get next area to scan
>>         if start>in.length then leave scanloop -- don't increment items count if done
>>         if start=in.length then
>>             if in.substr(start).verify(delims)=0 then leave scanloop -- ignore delim at end of line
>>     end scanloop
>>     in["items"]=itemno
>>     return in
>>
>> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>
>> On 12/5/2012 5:55 AM, Bill Fenlason wrote:
>>
>> Kermit,
>>
>> The problem I was trying to address is that of encoding and
>> parsing nested lists within a string. The example was just that -
>> parsing a function call which happens to contain lists. Perhaps I
>> could have been more clear and used a different example.
>>
>> I do not want to parse the string in order as your code does - I
>> want to encode the string such that the string contains either
>> elements or lists, and then parse it into a sequence of either
>> elements or lists.
>>
>> As I said:
>>
>> "In some other programming languages (like lisp and python), a
>> "list" is a fundamental concept.
>>
>> Since the Rexx family of languages is string oriented, it is not
>> uncommon to encode a list within a string with a separator
>> character, and to process the list with something like:
>>
>> parse list_contents list_item "," list_contents ;
>>
>> The problem arises when a list may contain other lists. For
>> example, a list may be encoded with specific matched start and end
>> characters and a separator character (e.g. '(' and ')' and ',' ).
>> If the lists may be nested, there does not appear to be any easy
>> way to parse it _while still retaining the list structure._"
>>
>> Possibly you skipped that last phrase and jumped to the example?
>>
>> To reiterate, the problem is how to manipulate lists and nested
>> lists in NetRexx. As I pointed out, the most natural way would be
>> to somehow encode the lists into a string since Rexx is
>> essentially string based. As I also pointed out in option 4,
>> perhaps there should be some kind of list object instead.
>>
>> It should be noted that list processing is a very powerful and
>> widely used mechanism. Lisp was defined shortly after Fortran,
>> and is still used as the initial language in many CS curricula
>> (e.g. MIT). The ability to easily manipulate lists would make
>> NetRexx richer. As I mentioned, I started to convert a python
>> program to NetRexx and discovered that there was no easy way.
>>
>> Here is another example:
>>
>> Define an element to be a sequence of zero or more blank separated
>> digits.
>>
>> Define a list to be a sequence of zero or more elements or lists.
>>
>> How does one encode a list into a string such that it can be
>> easily parsed into its constituent elements and lists?
>>
>> For example, suppose that we encode a list as a sequence of
>> elements or lists surrounded by '{' and '}' and separated by ','
>> . Note this is just one way to encode a list - others could be
>> used. Note that other element definitions can exist.
>>
>> The encoded string " { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } "
>> represents a list whose contents are elements 1 , 2 3 , an
>> imbedded list, a null element and element 4 5.
>>
>> When parsed and processed, the decoded string would produce the
>> following:
>> element 1
>> element 2 3
>> list { 9 8 , { 7 , 6 } }
>> null element
>> element 4 5
>>
>> This is what I meant by retaining the list structure. Subsequent
>> processing could decode the nested list.
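
The decomposition described above can be sketched in a few lines. This is only an illustration in Python (one of the languages the thread cites as having native lists), not NetRexx and not a proposed API; the name `split_top_level` and the default `"{},"` delimiter string are assumptions made for this example. It splits one encoded list into its top-level items and leaves any sublist encoded, so the list structure is retained for later processing.

```python
def split_top_level(encoded, delims="{},"):
    """Split one encoded list into top-level items; sublists stay encoded.

    delims is a hypothetical 3-character convention: open, close, separator.
    """
    open_ch, close_ch, sep = delims
    text = encoded.strip()
    if len(text) < 2 or not (text.startswith(open_ch) and text.endswith(close_ch)):
        raise ValueError("not an encoded list: %r" % encoded)
    body = text[1:-1]
    items, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == open_ch:
            depth += 1                      # entering a nested list
        elif ch == close_ch:
            depth -= 1                      # leaving a nested list
            if depth < 0:
                raise ValueError("unmatched closing delimiter")
        elif ch == sep and depth == 0:
            items.append(body[start:i].strip())   # separator at the top level
            start = i + 1
    if depth != 0:
        raise ValueError("unmatched opening delimiter")
    items.append(body[start:].strip())
    return items


for item in split_top_level(" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } "):
    if item.startswith("{"):
        print("list", item)
    elif item == "":
        print("null element")
    else:
        print("element", item)
```

Calling `split_top_level` again on the item `{ 9 8 , { 7 , 6 } }` decodes the next level. One open point in this sketch is that `{}` yields a single null element, so an empty list and a list holding one null element are not distinguished.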
>>
>> The crux of the problem is how to parse content which contains
>> _matching_ start and end characters. In the example above, the
>> first level list contains another list, and therefore another
>> matched '{' '}' pair. There has to be some mechanism which
>> matches the beginning and ending characters while the content may
>> contain additional (nested) pairs.
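
A minimal sketch of such a matching mechanism, again in Python for illustration only: a depth counter goes up on every opening character and down on every closing one, and the match is the closing character that brings the counter back to zero. The function name `find_matching_close` is hypothetical. The counter is the part that a classical regular expression has no equivalent for.

```python
def find_matching_close(text, open_pos, open_ch="{", close_ch="}"):
    """Return the index of the closer that matches the opener at open_pos."""
    if text[open_pos] != open_ch:
        raise ValueError("no opening delimiter at position %d" % open_pos)
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == open_ch:
            depth += 1
        elif text[i] == close_ch:
            depth -= 1
            if depth == 0:          # back at the level where we started
                return i
    raise ValueError("opening delimiter is never closed")


encoded = "{ 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }"
start = encoded.index("{", 1)               # first nested list
end = find_matching_close(encoded, start)
print(encoded[start:end + 1])               # prints { 9 8 , { 7 , 6 } }
```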
>>
>> I'm not familiar with the RegRexx project, but from the name I
>> assume it might be related to regular expressions? If that is
>> the case it may not help, since standard regular expressions
>> cannot by themselves solve the nesting problem (i.e. ensuring
>> that the delimiters are correctly matched).
>>
>> Thanks for taking the time to write and test your code. I see
>> that it has to scan the input one character at a time and the
>> parse instruction is not used. My question was also asking if
>> there was some way to solve this problem by extending the parse
>> instruction.
>>
>> Bill
>>
>> On 12/5/2012 6:20 AM, Kermit Kiser wrote:
>>
>> Hi Bill --
>>
>> Interesting problem you propose. I wonder if it is related to
>> the RegRexx project and mailing list that RexxLA started a
>> couple years ago and which Rony has been trying to resurrect
>> this year:
>>
>> http://rice.safedataisp.net/mailman/listinfo/regrexx
>>
>> Given that language parsing is one of the largest areas of
>> computer programming problems, I don't see that it makes any
>> sense to try and boil it all down to one instruction or method
>> that could be added to a language. On the other hand, some
>> helpful aids like RegRexx might be useful for this kind of
>> thing. If I get time, I may look into that.
>>
>> Meanwhile, your request is not well defined in terms of input
>> format (Can list items span embedded sublists? Can the list
>> item separator be omitted after a sublist? Can multiple list
>> separators be used? etc.) and you have not specified any
>> output data structure at all. Looking at your example syntax,
>> I made a few assumptions about input format and created a
>> simple output format in order to devise a sample recursive
>> code approach. Handling this type of syntax is way beyond what
>> a parse instruction can do and I think this example shows that
>> the general case is not trivial. (This program is also
>> interesting because it will not interpret correctly without
>> the trace instruction I inserted. The compiled version does
>> not care. I tested that all the way back to NetRexx 2.05 by
>> the way!)
>>
>> -- Kermit
>>
>> ----------------------------------------------
>> Sample parsing code for an interesting syntax example
>> ----------------------------------------------
>> trace var x
>>
>> in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"
>>
>> parseout=parselist(in)
>>
>> dump(parseout)
>>
>> method dump(data=Rexx,key="",offset="",link="") static
>> say offset key link "token="data["token"]
>> loop i=1 to data["items"]
>> dump(data[i],i,offset "--*","==>")
>> end
>>
>> method parselist(input=Rexx,delims=Rexx "(),",start="1") static
>>     in=Rexx(input)
>>     in["count"]=0 -- in order to work recursively, we will need to count how many characters are consumed at each step
>>
>>     loop label scanloop itemno=1 while start<in.length -- need to loop in case multiple items
>>         deloc=in.substr(start).verify(delims,"match") -- locate next delimiter
>>         in[itemno]=" "
>>         if deloc=0 then do --did not find a delimiter
>>             in[itemno,"token"]=in.substr(start)
>>             in["count"]=in["count"]+in.substr(start).length
>>             leave scanloop
>>         end
>>         else do --found a delimiter
>>             if delims.pos(in.substr(start+deloc-1,1))=1 then do -- found a sublist - handle it recursively
>>                 in[itemno]=parselist(in.substr(start+deloc),delims)
>>                 in[itemno,"token"]=in.substr(start,deloc-1)
>>                 deloc=deloc+in[itemno,"count"]
>>                 in["count"]=in["count"]+deloc
>>                 if start+deloc<in.length then do -- syntax rules are not clear but do allow for a second delimiter after a sublist
>>                     secdel=in.substr(start+deloc).verify(delims.substr(3,1),"match")
>>                     if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
>>                     deloc=deloc+secdel
>>                     in["count"]=in["count"]+secdel
>>                 end
>>             end
>>             else do -- found end of item
>>                 in[itemno,"token"]=in.substr(start,deloc-1)
>>                 in["count"]=in["count"]+deloc
>>                 if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop -- found end of list indicator
>>             end
>>         end
>>
>>         start=start+deloc -- get next area to scan
>>     end scanloop
>>     in["items"]=itemno
>>     return in
>> -----------------------------------------------------------------------------------------------------------------------------
>>
>> Program output:
>> -----------------------------------------------------------------------------------------------------------------------------
>>
>>
>> token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
>> --* 1 ==> token=fun
>> --* --* 1 ==> token= arg1
>> --* --* --* 1 ==> token=arg1a
>> --* --* --* 2 ==> token= arg1b
>> --* --* 2 ==> token= arg2
>> --* --* --* 1 ==> token=
>> --* --* --* --* 1 ==> token=
>> --* --* --* --* --* 1 ==> token=nested
>> --* --* --* --* 2 ==> token= z
>>
>> -----------------------------------------------------------------------------------------------------------------------------
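
For comparison, the same recursive idea can be sketched as a small recursive-descent parser. This is an independent Python illustration, not a port of the NetRexx program above: the grammar (an item is some text optionally followed by a parenthesised, comma-separated list of items) is an assumption read off the example input, and the names `parse_items`, `parse_list` and `dump` are made up for this sketch.

```python
def parse_items(text, pos=0, delims="(),"):
    """Parse separated items starting at pos; return (items, stop position)."""
    open_ch, close_ch, sep = delims
    items = []
    while True:
        start = pos
        while pos < len(text) and text[pos] not in delims:
            pos += 1                                   # token text up to the next delimiter
        node = {"token": text[start:pos], "items": []}
        if pos < len(text) and text[pos] == open_ch:
            node["items"], pos = parse_items(text, pos + 1, delims)
            if pos >= len(text) or text[pos] != close_ch:
                raise ValueError("missing %r" % close_ch)
            pos += 1
            while pos < len(text) and text[pos] not in delims:
                pos += 1                               # ignore text between a sublist and the next delimiter
        items.append(node)
        if pos < len(text) and text[pos] == sep:
            pos += 1                                   # another item follows
        else:
            return items, pos                          # closer or end of input


def parse_list(text, delims="(),"):
    items, pos = parse_items(text, 0, delims)
    if pos != len(text):
        raise ValueError("unexpected %r at position %d" % (text[pos], pos))
    return items


def dump(items, depth=1):
    for number, node in enumerate(items, 1):
        print("--* " * depth + "%d ==> token=%s" % (number, node["token"]))
        dump(node["items"], depth + 1)


dump(parse_list("fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"))
```

Each node keeps the text before an opening delimiter as its `token` and the parsed sublist as `items`, which is one possible answer to the "output data structure" question raised above.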
>>
>>
>>
>> On 12/4/2012 1:58 PM, Bill Fenlason wrote:
>>
>> Rene,
>>
>> I can see how that might be used if the input consists of
>> blank delimited words, but I'm not sure I understand how you
>> would parse the example I provided - one character at a time?
>>
>> If you don't know if the next special character is a begin
>> list, end list or separator, how would you specify the
>> template? Of course you could parse the remainder 3 different
>> times and compare lengths etc., but that seems a bit of a
>> kludge, as would parsing 1 character at a time.
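
One way around the "three templates" problem is a single scan that stops at whichever delimiter comes first and reports which one it was. The sketch below shows that idea in Python purely for illustration; `next_delimiter` and `scan` are hypothetical names. NetRexx's `verify` method with the "match" option can be used for the same kind of search.

```python
def next_delimiter(text, start=0, delims="(),"):
    """Return (index, character) of the first delimiter at or after start.

    Returns (-1, "") when no delimiter remains.
    """
    for index in range(start, len(text)):
        if text[index] in delims:
            return index, text[index]
    return -1, ""


def scan(text, delims="(),"):
    """Cut text into (token, delimiter-that-ended-it) pairs in one pass."""
    pieces, pos = [], 0
    while True:
        index, found = next_delimiter(text, pos, delims)
        if index < 0:
            pieces.append((text[pos:], ""))    # trailing text with no delimiter
            return pieces
        pieces.append((text[pos:index], found))
        pos = index + 1


for token, delimiter in scan("fun( arg1(arg1a, arg1b), z)"):
    print(repr(token), repr(delimiter))
```

Knowing which delimiter ended each token is what lets an outer loop decide whether it has just seen the start of a sublist, the end of one, or a plain separator.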
>>
>> What I'm looking for is an outer loop that parses the input
>> into elements or lists (which may contain additional lists).
>>
>> Bill
>>
>> On 12/4/2012 6:32 PM, René Jansen wrote:
>>
>> Bill,
>>
>> 5) just my 2 cents here:
>>
>> Something that I use often and which seems relevant for this,
>> is the 'recursive parse' stolen from lisp car and cdr:
>>
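>> loop while cdr.words() > 0   -- a while condition must evaluate to 0 or 1
>>   parse cdr car cdr          -- first word into car, remainder back into cdr
>>   do
>>     car.something()          -- placeholder for the per-item processing
>>   end
>> end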
>>
>> or some variations thereof
>>
>> The trick here is to take off the first element and leave the
>> rest for the next iteration; did this already in classic Rexx
>> years ago.
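
The same take-the-head, keep-the-rest loop, sketched in Python for readers who do not know the parse instruction. The helper name `car_cdr` is made up for this illustration. As written it handles a flat list of blank-delimited words only; it has no notion of nested lists.

```python
def car_cdr(text):
    """Split off the first blank-delimited word; return (car, cdr)."""
    parts = text.split(None, 1)
    if not parts:
        return "", ""
    car = parts[0]
    cdr = parts[1] if len(parts) > 1 else ""
    return car, cdr


cdr = "alpha beta  gamma"
words = []
while cdr.split():                 # loop while any words remain
    car, cdr = car_cdr(cdr)        # take the first word, keep the rest
    words.append(car)
print(words)                       # prints ['alpha', 'beta', 'gamma']
```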
>>
>> best regards,
>>
>> René.
>>
>>
>>
>> On 4 dec. 2012, at 19:09, Bill Fenlason <[hidden email]> wrote:
>>
>>
>> This question is primarily for Mike, but I'm sure others of
>> you will have comments or suggestions.
>>
>> In some other programming languages (like lisp and python), a
>> "list" is a fundamental concept.
>>
>> Since the Rexx family of languages is string oriented, it is
>> not uncommon to encode a list within a string with a separator
>> character, and to process the list with something like:
>>
>> parse list_contents list_item "," list_contents ;
>>
>> The problem arises when a list may contain other lists. For
>> example, a list may be encoded with specific matched start and
>> end characters and a separator character (e.g. '(' and ')' and
>> ',' ). If the lists may be nested, there does not appear to be
>> any easy way to parse it while still retaining the list
>> structure.
>>
>> One example would be the parsing of source containing
>> expressions or argument lists.
>>
>> eg. "fun( arg1(arg1a, arg1b), arg2(((nested)), z) ) ...."
>>
>> Another example might be an encoding approach which uses '{',
>> '}' and '`' for generic list encoding.
>>
>> The question is, if NetRexx or Rexx were to be
>> extended to allow convenient parsing of nested lists,
>> how should it be approached?
>>
>> 1) Provide a new statement like "parselist", similar
>> to the parse statement but which allows the
>> specification of beginning and ending characters. The
>> scanning would check for matched pairs and process
>> appropriately.
>>
>> 2) Extend the parse statement by providing a new type
>> of pattern, perhaps a function notation which calls a
>> function to scan ahead, skipping the contents of the
>> data within matched beginning and ending characters.
>>
>> 3) Provide a built in function which perhaps includes
>> the parse statement arguments.
>>
>> 4) Provide a new type of list object, and have the
>> parse statement understand its structure.
>>
>> 5) Some other approach?
>>
>> 6) Ignore this problem.
>>
>> I assume this topic has been discussed before, but I
>> must have missed it or I don't remember it. It seems
>> to me that this is a general weakness in NetRexx.
>>
>> How have other people handled this problem in the past?
>>
>> The topic came up for me when I tried to convert a
>> python program to NetRexx.
>>
>>
>>
>> _______________________________________________
>> Ibm-netrexx mailing list
>> [hidden email]
>> Online Archive : http://ibm-netrexx.215625.n3.nabble.com/
>>

Thomas Schneider, Vienna, Austria (Europe) :-)

www.thsitc.com
www.db-123.com

billfen

Re: Nested List Support?

In reply to this post by measel

Mike M,

My response to Kermit was imbedded in his message. Your response seems to have lost the distinction. Are you using an email program which does not support that kind of processing? Were you able to tell who said what?

I'm not sure I understand what is behind your apparent sarcasm. I've been programming for close to 50 years, including object oriented programming from its inception. The thousands of lines of code in the Eclipse plugin are all Java.

What happened was that I started to convert a python program to NetRexx and was frustrated that there was no native list processing support in NetRexx. I asked MFC "If list processing were to be added to NetRexx by encoding lists within a string, what approach would be best?" and he provided a reasoned answer.

Kermit put some code together implying "do it like this", but when I tried to explain my question in more detail to him, he recoded his program and suggested we live on different planets. I had thanked him for his effort and was explaining that I wasn't after a code snippet.

I didn't think my original question was all that difficult to understand, and I don't know what the fuss is about.

Bill

On 12/6/2012 8:27 AM, Measel, Mike wrote:

Bill, I’m also from another planet. On my planet we have advanced lists called objects. You should check them out. They’re really fun to use in NetRexx.

From: [hidden email] [[hidden email]] On Behalf Of Bill Fenlason
Sent: Wednesday, December 05, 2012 9:17 PM
To: IBM Netrexx
Subject: Re: [Ibm-netrexx] Nested List Support?

Kermit,

On 12/5/2012 7:02 PM, Kermit Kiser wrote:

Bill --

Sometimes I wonder if we live on the same planet. ;-)

You are not alone :)

I don't understand how you think you can parse a string but not parse it in order. Nor do I understand why you would want a halfway solution that does not fully parse a string.

There is a difference between "parsing a string" and dividing up a list which has been encoded as a string. With a list, it is natural to request the items in the list one at a time, and the items in a list may be either a string or another list. Possibly you are not familiar with list processing (à la Lisp) and find this confusing, but I assume that is not the case.

Historical note: The original Lisp (written in assembler for the IBM 704) used a tree structure, with car (contents of the address part of the register) and cdr (contents of the decrement part of the register) extracting the first (next) item in a list and the remainder of the list. In René's example, he shows how to strip the first or next list item using the parse instruction, but keeps the ancient names, which are still in common use after all these years.

The essential point here is that one of the ways that would make sense in NetRexx would be to encode a list as a string. In other words, a "super" string which is a "list of strings" or a list of: "strings" and "lists of strings".

In no way did I mean to imply that the parsing was "half way", but just that the decomposition of the lists comes before the parsing of strings. First a list is processed, and then sublists are processed as necessary.

I am sorry if you did not understand my code example, but I am not sure it can be simplified further. As I said in my post, "Handling this type of syntax is way beyond what a parse instruction can do and I think this example shows that the general case is not trivial."

I didn't say that I didn't understand your code - it was well written and clear.

Possibly you skipped that last phrase?

The phrase I meant was "while still retaining the list structure." Parsing a single string is not the same as breaking up a list (which happens to be encoded as a single string).

I try things out with code because I don't think in abstract logic like you and Mike seem to do. So I provide working code examples to show my thoughts here. But then you say that my code has to scan strings one character at a time which is no more true or false than saying the PARSE instruction has to scan strings one character at a time. (Or do you really think that PARSE does not look at all of the characters?)

My point there was that I was asking whether extending the parse instruction would be a workable approach. I am looking for the best general purpose approach to handle the nested list problem. I agree that my question is more abstract than specific.

You also seem to feel that the lowly NetRexx data type could not possibly maintain the structure of a list but I think that the Rexx object is the most powerful data structure ever invented. It can not only hold strings and numbers, it can hold lists and maps and do amazing things with them and each one is a complete associative database! (And even more features are in the advanced after 3.01 NetRexx version!)

I don't know how you came to that conclusion - what did I say that gave you that idea? All I was asking was how to make the Rexx string object hold a list of strings and other (nested) lists of strings. Certainly the Rexx object can hold a simple list of strings. But it cannot inherently hold a list containing strings and other lists of strings. External conventions for list delimiters must be provided. Possibly as an extension they could be added as fields in the Rexx object.
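
To make "external conventions for list delimiters" concrete, here is a minimal sketch of the encoding direction. It is written in Python, where nested lists are native, only to show what the convention has to carry; the function name `encode` and the three-character `delims` argument are assumptions of this example, not an existing NetRexx facility.

```python
def encode(item, delims="{},"):
    """Encode a string or a (possibly nested) list of strings as one string.

    delims is a hypothetical convention: open, close and separator characters.
    """
    open_ch, close_ch, sep = delims
    if isinstance(item, list):
        return open_ch + sep.join(encode(sub, delims) for sub in item) + close_ch
    text = str(item)
    if any(ch in text for ch in delims):
        # without an escape rule, an element may not contain a delimiter
        raise ValueError("element %r contains a delimiter" % text)
    return text


nested = ["1", "2 3", ["9 8", ["7", "6"]], "", "4 5"]
print(encode(nested))           # prints {1,2 3,{9 8,{7,6}},,4 5}
print(encode(nested, "<>/"))    # prints <1/2 3/<9 8/<7/6>>//4 5>
```

The check inside `encode` shows the cost of the convention: once three characters are reserved, elements either avoid them or some escaping rule has to be added.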

Since I think that way, I will try again to explain what I mean with a code example. I modified my original sample program and added a method to reconstruct a parsed list, showing at each stage of reconstruction what list structure data can be extracted from the parsed string object. I even showed how you can transform one list syntax to another with the example parsed list Rexx object. (Your new example is basically the same structure with different delimiters, so the same code handles both examples fine.) Just ignore it if you still don't believe it can be done.
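
The syntax-to-syntax transformation described here comes down to decoding with one set of delimiters and rendering with another. The sketch below shows that round trip in Python as an illustration only; it is not the NetRexx program, the names `decode`, `render` and `convert` are made up, and unlike the program output below it trims the blanks around each element and drops any text that directly precedes a sublist.

```python
def decode(text, delims="{},"):
    """Decode an encoded list into nested Python lists of stripped strings."""
    open_ch, close_ch, sep = delims
    text = text.strip()

    def parse(pos):
        # pos points just past an opening delimiter
        items, start, sublist = [], pos, None
        while pos < len(text):
            ch = text[pos]
            if ch == open_ch:
                sublist, pos = parse(pos + 1)
                continue
            if ch == sep or ch == close_ch:
                # an item ends here: the sublist just read, or plain text
                items.append(sublist if sublist is not None else text[start:pos].strip())
                sublist, start = None, pos + 1
                if ch == close_ch:
                    return items, pos + 1
            pos += 1
        raise ValueError("missing closing delimiter")

    if not text.startswith(open_ch):
        raise ValueError("expected opening delimiter")
    items, end = parse(1)
    if end != len(text):
        raise ValueError("unexpected text after the list")
    return items


def render(item, delims):
    open_ch, close_ch, sep = delims
    if isinstance(item, list):
        return open_ch + sep.join(render(sub, delims) for sub in item) + close_ch
    return item


def convert(text, old, new):
    """Re-express an encoded list using a different delimiter convention."""
    return render(decode(text, old), new)


source = " { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } "
print(convert(source, "{},", "(),"))    # prints (1,2 3,(9 8,(7,6)),,4 5)
print(convert(source, "{},", "<>/"))    # prints <1/2 3/<9 8/<7/6>>//4 5>
```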

I certainly understand that it can be done, Kermit, and your code obviously demonstrates it.

But the code itself does not provide an answer to the original question I asked, which was "If NetRexx or Rexx were to be extended to allow convenient parsing of nested lists, how should it be approached?"

In retrospect, perhaps I should have replaced the word "parsing" with "deconstruction".

I provided 6 possibilities, and perhaps your code could be the basis of possibility 3 (built in functions), although there doesn't seem to be a clear API. It certainly demonstrates an example, but obviously I'm trying to avoid that level of user coding for the general case.

I was asking "what is the best approach?", not "can it be done?" or "is there a code snippet that can be used?".

I thought my original post asked a single question and was reasonably clear, but apparently I was wrong about that.

BTW: PARSE is intended for very simple parsing problems. That is why RexxLA started the RegRexx project to provide a more sophisticated pattern matching and parsing facility with a simpler syntax and more flexibility than regex has. (It remains to be seen if that can be done.) I think that is also why Mike included the verify and translate, etc, mechanisms to handle more complex parsing needs.

Yes, that is what Mike said as well, and I agree in general. I suggested the possibility of extending the parse statement by adding a functional notation in the template, but Mike said he considered and rejected it some time ago.

-- Kermit

Bill

-----------------------------------------------------------------------------
Program output:
-----------------------------------------------------------------------------

parsing this list:
fun( arg1(arg1a, arg1b), arg2(((nested)), z) )

display parsed list structure
   token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
items=1
--* 1 ==> token=fun
--* items=2
--* --* 1 ==> token= arg1
--* --* items=2
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* items=2
--* --* --* 1 ==> token=
--* --* --* items=1
--* --* --* --* 1 ==> token=
--* --* --* --* items=1
--* --* --* --* --* 1 ==> token=nested
--* --* --* 2 ==> token= z

now reconstruct original input list
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list=(arg1a, arg1b)
element= arg2
element=null
element=null
element=nested
list=(nested)
list=((nested))
element= z
list=(((nested)), z)
list=( arg1(arg1a, arg1b), arg2(((nested)), z))
list=(fun( arg1(arg1a, arg1b), arg2(((nested)), z)))

reconstructed list== fun( arg1(arg1a, arg1b), arg2(((nested)), z))

parsing this list:
{ 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }

display parsed list structure
   token= { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }
items=1
--* 1 ==> token=
--* items=5
--* --* 1 ==> token= 1
--* --* 2 ==> token= 2 3
--* --* 3 ==> token=
--* --* items=2
--* --* --* 1 ==> token= 9 8
--* --* --* 2 ==> token=
--* --* --* items=2
--* --* --* --* 1 ==> token= 7
--* --* --* --* 2 ==> token= 6
--* --* 4 ==> token=
--* --* 5 ==> token= 4 5

now reconstruct original input list
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list={ 7 , 6 }
list={ 9 8 , { 7 , 6 }}
element=null
element= 4 5
list={ 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }
list={ { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }}

reconstructed list== { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }

now for more fun lets reconstruct string 1 with string 2 syntax
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list={arg1a, arg1b}
element= arg2
element=null
element=null
element=nested
list={nested}
list={{nested}}
element= z
list={{{nested}}, z}
list={ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}
list={fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}}

reconstructed list== fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}

and then lets reconstruct string 2 with string 1 syntax
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list=( 7 , 6 )
list=( 9 8 , ( 7 , 6 ))
element=null
element= 4 5
list=( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )
list=( ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 ))

reconstructed list== ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )

and lets also try a new syntax for string 1
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list=<arg1a/ arg1b>
element= arg2
element=null
element=null
element=nested
list=<nested>
list=<<nested>>
element= z
list=<<<nested>>/ z>
list=< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>
list=<fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>>

reconstructed list== fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>

and likewise for string 2
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list=< 7 / 6 >
list=< 9 8 / < 7 / 6 >>
element=null
element= 4 5
list=< 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >
list=< < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >>

reconstructed list== < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >

-----------------------------------------------------------------------------
Program code:
-----------------------------------------------------------------------------
trace var x
in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"
delims="(),"
say "\n parsing this list:";say in
parseout=parselist(in)
say "\n display parsed list structure"
dump(parseout)
say "\n now reconstruct original input list"
rl=reconstructlist(parseout)
say "\n reconstructed list==" rl.substr(2,rl.length-2)        --    items are stored as an implicit list

in2=" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }"
say "\n parsing this list:";say in2
delims2="{},"
parseout2=parselist(in2,delims2)
say "\n display parsed list structure"
dump(parseout2)
say "\n now reconstruct original input list"
rl2=reconstructlist(parseout2,delims2)
say "\n reconstructed list==" rl2.substr(2,rl2.length-2)        --    items are stored as an implicit list

say "\n now for more fun lets reconstruct string 1 with string 2 syntax"
rl1a=reconstructlist(parseout,delims2)
say "\n reconstructed list==" rl1a.substr(2,rl1a.length-2)        --    items are stored as an implicit list

say "\n and then lets reconstruct string 2 with string 1 syntax"
rl2a=reconstructlist(parseout2,delims)
say "\n reconstructed list==" rl2a.substr(2,rl2a.length-2)        --    items are stored as an implicit list

say "\n and lets also try a new syntax for string 1"
rl1aa=reconstructlist(parseout,"<>/")
say "\n reconstructed list==" rl1aa.substr(2,rl1aa.length-2)        --    items are stored as an implicit list

say "\n and likewise for string 2"
rl2aa=reconstructlist(parseout2,"<>/")
say "\n reconstructed list==" rl2aa.substr(2,rl2aa.length-2)        --    items are stored as an implicit list

method reconstructlist(input=Rexx,delims=Rexx "(),") static

    if input.exists("token") then
        segment1=input["token"]
    else segment1=""
    if segment1\="" then
        say "element="segment1
        else say "element=null"
    segment=""
    if input.exists("items") then do
        segment=delims.substr(1,1)
        loop i=1 to input["items"]
            segment=segment||reconstructlist(input[i],delims)||delims.substr(3,1)
            end
        segment=segment.strip("t",delims.substr(3,1))||delims.substr(2,1)
        say "list="segment
        end
    return segment1||segment

method dump(data=Rexx,key="",offset="",link="") static
        say offset key link "token="data["token"]
    if data.exists("items") then do
        say offset "items="data["items"]
        loop i=1 to data["items"]
                dump(data[i],i,offset "--*","==>")
                end
        end

method parselist(input=Rexx,delims=Rexx "(),",start="1") static
    in=Rexx(input)
    in["count"]=0    --    in order to work recursively, we will need to count how many characters are consumed at each step

    loop label scanloop itemno=1 while start<=in.length        --        need to loop in case multiple items
        deloc=in.substr(start).verify(delims,"match")        --    locate next delimiter
        in[itemno]=""
        if deloc=0 then do --did not find a delimiter
            in[itemno,"token"]=in.substr(start)
            in["count"]=in["count"]+in.substr(start).length
            leave scanloop
            end
        else do --found a delimiter
            if delims.pos(in.substr(start+deloc-1,1))=1 then do        --        found a sublist - handle it recursively
                    in[itemno]=parselist(in.substr(start+deloc),delims)
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    deloc=deloc+in[itemno,"count"]
                    in["count"]=in["count"]+deloc
                    if start+deloc<in.length then do    --    syntax rules are not clear but do allow for a second delimiter after a sublist
                        secdel=in.substr(start+deloc).verify(delims.substr(2,2),"match")
                        if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
                        deloc=deloc+secdel
                        in["count"]=in["count"]+secdel
                        if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    list can be terminated by list ender with no more tokens
                        end
                    end
            else do    -- found end of item
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    in["count"]=in["count"]+deloc
                    if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    found end of list indicator
                    end
            end

        start=start+deloc        --        get next area to scan
        if start>in.length then leave scanloop    --    don't increment items count if done
        if start=in.length then
            if in.substr(start).verify(delims)=0 then leave scanloop        --    ignore delim at end of line
        end scanloop
        in["items"]=itemno
        return in

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

On 12/5/2012 5:55 AM, Bill Fenlason wrote:

Kermit,

The problem I was trying to address is that of encoding and parsing nested lists within a string. The example was just that - parsing a function call which happens to contain lists. Perhaps I could have been more clear and used a different example.

I do not want to parse the string in order as your code does - I want to encode the string such that the string contains either elements or lists, and then parse it into a sequence of either elements or lists.

As I said:

"In some other programming languages (like lisp and python), a "list" is a fundamental concept.

Since the Rexx family of languages is string oriented, it is not uncommon to encode a list within a string with a separator character, and to process the list with something like:

    parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a list may be encoded with specific matched start and end characters and a separator character (e.g. '(' and ')' and ',' ). If the lists may be nested, there does not appear to be any easy way to parse it while still retaining the list structure."

Possibly you skipped that last phrase and jumped to the example?

To reiterate, the problem is how to manipulate lists and nested lists in NetRexx. As I pointed out, the most natural way would be to somehow encode the lists into a string since Rexx is essentially string based. As I also pointed out in option 4, perhaps there should be some kind of list object instead.

It should be noted that list processing is a very powerful and widely used mechanism. Lisp was defined shortly after Fortran, and is still used as the initial language in many CS curricula (e.g. MIT). The ability to easily manipulate lists would make NetRexx richer. As I mentioned, I started to convert a python program to NetRexx and discovered that there was no easy way.

Here is another example:

Define an element to be a sequence of zero or more blank separated digits.

Define a list to be a sequence of zero or more elements or lists.

How does one encode a list into a string such that it can be easily parsed into its constituent elements and lists?

For example, suppose that we encode a list as a sequence of elements or lists surrounded by '{' and '}' and separated by ',' . Note this is just one way to encode a list - others could be used. Note that other element definitions can exist.

The encoded string " { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } " represents a list whose contents are elements 1 , 2 3 , an imbedded list, a null element and element 4 5.

When parsed and processed, the decoded string would produce the following:
    element 1
    element 2 3
    list { 9 8 , { 7 , 6 } }
    null element
    element 4 5

This is what I meant by retaining the list structure. Subsequent processing could decode the nested list.

The crux of the problem is how to parse content which contains matching start and end characters. In the example above, the first level list contains another list, and therefore another matched '{' '}' pair. There has to be some mechanism which matches the beginning and ending characters while the content may contain additional (nested) pairs.
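The matching requirement described above can be sketched with a simple nesting-depth counter. This is only an illustration, not code from the thread: the method name toplevel, the three-character delims convention (opener, closer, separator, borrowed from Kermit's parselist) and the variable names are all assumptions, and the sketch does not check that the delimiters are balanced.

```netrexx
-- Sketch (hypothetical helper, not from the thread): split an encoded list
-- into its top-level items only, leaving any nested lists intact.
list = " { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } "
items = toplevel(list, "{},")
loop i = 1 to items[0]
  say i "[" || items[i] || "]"
end

-- delims holds three characters: opener, closer, separator (an assumption)
method toplevel(list=Rexx, delims=Rexx "(),") static returns Rexx
  opener = delims.substr(1, 1)
  closer = delims.substr(2, 1)
  sep    = delims.substr(3, 1)
  body = list.strip()
  body = body.substr(2, body.length - 2)   -- drop the outermost opener and closer
  items = ''
  count = 0
  depth = 0
  item  = ''
  loop i = 1 to body.length
    c = body.substr(i, 1)
    if c == opener then depth = depth + 1  -- entering a nested list
    if c == closer then depth = depth - 1  -- leaving a nested list
    if c == sep & depth = 0 then do        -- only top-level separators end an item
      count = count + 1
      items[count] = item.strip()
      item = ''
    end
    else item = item || c
  end
  count = count + 1                        -- the last item has no trailing separator
  items[count] = item.strip()
  items[0] = count                         -- item count is kept in index 0
  return items
```

For the sample string this is intended to yield the five items listed earlier: two elements, the nested list as a single item, a null element and a final element. Each nested list could then be passed through the same helper when its contents are needed.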

I'm not familiar with the RegRexx project, but from the name I assume it might be related to regular expressions? If that is the case it may not help, since standard regular expressions cannot by themselves be used to solve the nesting problem (i.e. ensuring that the delimiters are correctly matched).

Thanks for taking the time to write and test your code. I see that it has to scan the input one character at a time and the parse instruction is not used. My question was also asking if there was some way to solve this problem by extending the parse instruction.

Bill

On 12/5/2012 6:20 AM, Kermit Kiser wrote:

Hi Bill --

Interesting problem you propose. I wonder if it is related to the RegRexx project and mailing list that RexxLA started a couple years ago and which Rony has been trying to resurrect this year:

http://rice.safedataisp.net/mailman/listinfo/regrexx

Given that language parsing is one of the largest areas of computer programming problems, I don't see that it makes any sense to try and boil it all down to one instruction or method that could be added to a language. On the other hand, some helpful aids like RegRexx might be useful for this kind of thing. If I get time, I may look into that.

Meanwhile, your request is not well defined in terms of input format (Can list items span embedded sublists? Can the list item separator be omitted after a sublist? Can multiple list separators be used? etc.) and you have not specified any output data structure at all. Looking at your example syntax, I made a few assumptions about input format and created a simple output format in order to devise a sample recursive code approach. Handling this type of syntax is way beyond what a parse instruction can do and I think this example shows that the general case is not trivial. (This program is also interesting because it will not interpret correctly without the trace instruction I inserted. The compiled version does not care. I tested that all the way back to NetRexx 2.05 by the way!)

-- Kermit

-------------------------------------------- Sample parsing code for an interesting syntax example ----------------------------------------------
trace var x

in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"

parseout=parselist(in)

dump(parseout)

method dump(data=Rexx,key="",offset="",link="") static
    say offset key link "token="data["token"]
    loop i=1 to data["items"]
        dump(data[i],i,offset "--*","==>")
        end

method parselist(input=Rexx,delims=Rexx "(),",start="1") static
    in=Rexx(input)
    in["count"]=0    --    in order to work recursively, we will need to count how many characters are consumed at each step

    loop label scanloop itemno=1 while start<in.length        --        need to loop in case multiple items
        deloc=in.substr(start).verify(delims,"match")        --    locate next delimiter
        in[itemno]=" "
        if deloc=0 then do --did not find a delimiter
             in[itemno,"token"]=in.substr(start)
             in["count"]=in["count"]+in.substr(start).length
             leave scanloop
            end
        else do --found a delimiter
            if delims.pos(in.substr(start+deloc-1,1))=1 then do        --        found a sublist - handle it recursively
                    in[itemno]=parselist(in.substr(start+deloc),delims)
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    deloc=deloc+in[itemno,"count"]
                    in["count"]=in["count"]+deloc
                    if start+deloc<in.length then do    --    syntax rules are not clear but do allow for a second delimiter after a sublist
                        secdel=in.substr(start+deloc).verify(delims.substr(3,1),"match")
                        if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
                        deloc=deloc+secdel
                        in["count"]=in["count"]+secdel
                        end
                    end
            else do    -- found end of item
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    in["count"]=in["count"]+deloc
                    if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    found end of list indicator
                    end
            end

        start=start+deloc        --        get next area to scan
        end scanloop
        in["items"]=itemno
        return in
-----------------------------------------------------------------------------------------------------------------------------
Program output:
-----------------------------------------------------------------------------------------------------------------------------

   token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
--* 1 ==> token=fun
--* --* 1 ==> token= arg1
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* --* 1 ==> token=
--* --* --* --* 1 ==> token=
--* --* --* --* --* 1 ==> token=nested
--* --* --* --* 2 ==> token= z

-----------------------------------------------------------------------------------------------------------------------------

On 12/4/2012 1:58 PM, Bill Fenlason wrote:

Rene,

I can see how that might be used if the input consists of blank delimited words, but I'm not sure I understand how you would parse the example I provided - one character at a time?

If you don't know if the next special character is a begin list, end list or separator, how would you specify the template? Of course you could parse the remainder 3 different times and compare lengths etc., but that seems a bit of a kludge, as would parsing 1 character at a time.

What I'm looking for is an outer loop that parses the input into elements or lists (which may contain additional lists).

Bill

On 12/4/2012 6:32 PM, René Jansen wrote:

Bill,

5) just my 2 cents here:

Something that I use often and which seems relevant for this, is the 'recursive parse' stolen from lisp car and cdr:

loop while cdr.words() > 0
   parse cdr car cdr      -- car = first word, cdr = the rest
   car.something()        -- placeholder: process the first element here
end

or some variations thereof

The trick here is to take off the first element and leave the rest for the next iteration; did this already in classic Rexx years ago.
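As a rough illustration of the same idiom with a "," separator instead of blanks, the sketch below peels one item per iteration. The sample data and variable names are assumptions chosen for illustration, not taken from René's own programs.

```netrexx
-- Sketch (not from the thread): "take the first, keep the rest" with commas.
rest = "1, 2 3, {9 8, {7, 6}}, , 4 5"
loop while rest \== ''
  parse rest item ',' rest            -- item = text before the first comma
  say "item=[" || item.strip() || "]"
end
```

On flat data this works well, but the parse template knows nothing about matched delimiters, so the nested list is cut at every comma and comes out as the fragments "{9 8", "{7" and "6}}" rather than as one item. That is the limitation the original question is about.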

best regards,

René.

On 4 dec. 2012, at 19:09, Bill Fenlason [hidden email] wrote:

This question is primarily for Mike, but I'm sure others of you will have comments or suggestions.

In some other programming languages (like lisp and python), a "list" is a fundamental concept.

Since the Rexx family of languages is string oriented, it is not uncommon to encode a list within a string with a separator character, and to process the list with something like:

    parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a list may be encoded with specific matched start and end characters and a separator character (e.g. '(' and ')' and ',' ). If the lists may be nested, there does not appear to be any easy way to parse it while still retaining the list structure.

One example would be the parsing of source containing expressions or argument lists.

eg. "fun( arg1(arg1a, arg1b), arg2(((nested)), z) ) ...."

Another example might be an encoding approach which uses '{', '}' and '`' for generic list encoding.

The question is, if NetRexx or Rexx were to be extended to allow convenient parsing of nested lists, how should it be approached?

1) Provide a new statement like "parselist", similar to the parse statement but which allows the specification of beginning and ending characters. The scanning would check for matched pairs and process appropriately.

2) Extend the parse statement by providing a new type of pattern, perhaps a function notation which calls a function to scan ahead, skipping the contents of the data within matched beginning and ending characters.

3) Provide a built in function which perhaps includes the parse statement arguments.

4) Provide a new type of list object, and have the parse statement understand its structure.

5) Some other approach?

6) Ignore this problem.

I assume this topic has been discussed before, but I must have missed it or I don't remember it. It seems to me that this is a general weakness in NetRexx.

How have other people handled this problem in the past?

The topic came up for me when I tried to convert a python program to NetRexx.

_______________________________________________
Ibm-netrexx mailing list
[hidden email]
Online Archive : http://ibm-netrexx.215625.n3.nabble.com/


George Hovey-2

Re: Nested List Support?

Yes, Bill's query has produced some odd responses ;).

On Thu, Dec 6, 2012 at 11:40 AM, Bill Fenlason <[hidden email]> wrote:

Mike M,

My response to Kermit was imbedded in his message. Your response seems to have lost the distinction. Are you using an email program which does not support that kind of processing? Were you able to tell who said what?

I'm not sure I understand what is behind your apparent sarcasm. I've been programming for close to 50 years, including object oriented programming from its inception. The thousands of lines of code in the Eclipse plugin are all Java.

What happened was that I started to convert a python program to NetRexx and was frustrated that there was no native list processing support in NetRexx. I asked MFC "If list processing were to be added to NetRexx by encoding lists within a string, what approach would be best?" and he provided a reasoned answer.

Kermit put some code together implying "do it like this", but when I tried to explain my question in more detail to him, he recoded his program and suggested we live on different planets. I had thanked him for his effort and was explaining that I wasn't after a code snippet.

I didn't think my original question was all that difficult to understand, and I don't know what the fuss is about.

Bill

On 12/6/2012 8:27 AM, Measel, Mike wrote:
Bill, I’m also from another planet. On my planet we have advanced lists called objects. You should check them out. They’re really fun to use in NetRexx.

From: [hidden email] [[hidden email]] On Behalf Of Bill Fenlason
Sent: Wednesday, December 05, 2012 9:17 PM
To: IBM Netrexx
Subject: Re: [Ibm-netrexx] Nested List Support?

Kermit,

On 12/5/2012 7:02 PM, Kermit Kiser wrote:

Bill --

Sometimes I wonder if we live on the same planet. ;-)

You are not alone :)

I don't understand how you think you can parse a string but not parse it in order. Nor do I understand why you would want a halfway solution that does not fully parse a string.

There is a difference between "parsing a string" and dividing up a list which has been encoded as a string. With a list, it is natural to request the items in the list one at a time, and the items in a list may be either a string or another list. Possibly you are not familiar with list processing (à la Lisp) and find this confusing, but I assume that is not the case.

Historical note: Original Lisp (for the IBM 704 in assembler) used a tree structure and used car (contents of the address part of the register) and cdr (contents of the decrement part of the register) to extract the first (next) item in a list and the remainder of the list. In René's example, he shows how to strip the first or next list item using the parse instruction, but uses the ancient names, which are still in common use after all these years.

The essential point here is that one of the ways that would make sense in NetRexx would be to encode a list as a string. In other words, a "super" string which is a "list of strings" or a list of: "strings" and "lists of strings".

In no way did I mean to imply that the parsing was "half way", but just that the decomposition of the lists comes before the parsing of strings. First a list is processed, and then sublists are processed as necessary.

I am sorry if you did not understand my code example, but I am not sure it can be simplified further. As I said in my post, "Handling this type of syntax is way beyond what a parse instruction can do and I think this example shows that the general case is not trivial."

I didn't say that I didn't understand your code - it was well written and clear.

Possibly you skipped that last phrase?

The phrase I meant was "while still retaining the list structure." Parsing a single string is not the same as breaking up a list (which happens to be encoded as a single string).

I try things out with code because I don't think in abstract logic like you and Mike seem to do. So I provide working code examples to show my thoughts here. But then you say that my code has to scan strings one character at a time which is no more true or false than saying the PARSE instruction has to scan strings one character at a time. (Or do you really think that PARSE does not look at all of the characters?)

My point there was that I was asking if an approach that could be used would be to extend the parse command. I am looking for the best general purpose approach to handle the nested list problem. I agree that my question is more abstract than specific.

You also seem to feel that the lowly NetRexx data type could not possibly maintain the structure of a list, but I think that the Rexx object is the most powerful data structure ever invented. It can not only hold strings and numbers, it can hold lists and maps and do amazing things with them, and each one is a complete associative database! (And even more features are in the advanced after 3.01 NetRexx version!)

I don't know how you came to that conclusion - what did I say that gave you that idea? All I was asking was how to make the Rexx string object hold a list of strings and other (nested) lists of strings. Certainly the Rexx object can hold a simple list of strings. But it can not inherently hold a list containing strings and other lists of strings. External conventions for list delimiters must be provided. Possibly as an extension they could be added as fields in the Rexx object.

Since I think that way, I will try again to explain what I mean with a code example. I modified my original sample program and added a method to reconstruct a parsed list, showing at each stage of reconstruction what list structure data can be extracted from the parsed string object. I even showed how you can transform one list syntax to another with the example parsed list Rexx object. (Your new example is basically the same structure with different delimiters, so the same code handles both examples fine.) Just ignore it if you still don't believe it can be done.

I certainly understand that it can be done, Kermit, and your code obviously demonstrates it.

But the code itself does not provide an answer to the original question I asked, which was "If NetRexx or Rexx were to be extended to allow convenient parsing of nested lists, how should it be approached?"

In retrospect, perhaps I should have replaced the word "parsing" with "deconstruction".

I provided 6 possibilities, and perhaps your code could be the basis of possibility 3 (built in functions), although there doesn't seem to be a clear API. It certainly demonstrates an example, but obviously I'm trying to avoid that level of user coding for the general case.

I was asking "what is the best approach?", not "can it be done?" or "is there a code snippet that can be used?".

I thought my original post asked a single question and was reasonably clear, but apparently I was wrong about that.

BTW: PARSE is intended for very simple parsing problems. That is why RexxLA started the RegRexx project to provide a more sophisticated pattern matching and parsing facility with a simpler syntax and more flexibility than regex has. (It remains to be seen if that can be done.) I think that is also why Mike included the verify and translate, etc, mechanisms to handle more complex parsing needs.

Yes, that is what Mike said as well, and I agree in general. I suggested the possibility of extending the parse statement by adding a functional notation in the template, but Mike said he considered and rejected it some time ago.

-- Kermit

Bill

----------------------------------------------------------------------------- Program output: ---------------------------------------------------------------------------------------------------------------------------------------------------

parsing this list:
fun( arg1(arg1a, arg1b), arg2(((nested)), z) )

display parsed list structure
   token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
items=1
--* 1 ==> token=fun
--* items=2
--* --* 1 ==> token= arg1
--* --* items=2
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* items=2
--* --* --* 1 ==> token=
--* --* --* items=1
--* --* --* --* 1 ==> token=
--* --* --* --* items=1
--* --* --* --* --* 1 ==> token=nested
--* --* --* 2 ==> token= z

now reconstruct original input list
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list=(arg1a, arg1b)
element= arg2
element=null
element=null
element=nested
list=(nested)
list=((nested))
element= z
list=(((nested)), z)
list=( arg1(arg1a, arg1b), arg2(((nested)), z))
list=(fun( arg1(arg1a, arg1b), arg2(((nested)), z)))

reconstructed list== fun( arg1(arg1a, arg1b), arg2(((nested)), z))

parsing this list:
{ 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }

display parsed list structure
   token= { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 }
items=1
--* 1 ==> token=
--* items=5
--* --* 1 ==> token= 1
--* --* 2 ==> token= 2 3
--* --* 3 ==> token=
--* --* items=2
--* --* --* 1 ==> token= 9 8
--* --* --* 2 ==> token=
--* --* --* items=2
--* --* --* --* 1 ==> token= 7
--* --* --* --* 2 ==> token= 6
--* --* 4 ==> token=
--* --* 5 ==> token= 4 5

now reconstruct original input list
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list={ 7 , 6 }
list={ 9 8 , { 7 , 6 }}
element=null
element= 4 5
list={ 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }
list={ { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }}

reconstructed list== { 1 , 2 3 , { 9 8 , { 7 , 6 }}, , 4 5 }

now for more fun lets reconstruct string 1 with string 2 syntax
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list={arg1a, arg1b}
element= arg2
element=null
element=null
element=nested
list={nested}
list={{nested}}
element= z
list={{{nested}}, z}
list={ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}
list={fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}}

reconstructed list== fun{ arg1{arg1a, arg1b}, arg2{{{nested}}, z}}

and then lets reconstruct string 2 with string 1 syntax
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list=( 7 , 6 )
list=( 9 8 , ( 7 , 6 ))
element=null
element= 4 5
list=( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )
list=( ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 ))

reconstructed list== ( 1 , 2 3 , ( 9 8 , ( 7 , 6 )), , 4 5 )

and lets also try a new syntax for string 1
element=null
element=fun
element= arg1
element=arg1a
element= arg1b
list=<arg1a/ arg1b>
element= arg2
element=null
element=null
element=nested
list=<nested>
list=<<nested>>
element= z
list=<<<nested>>/ z>
list=< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>
list=<fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>>

reconstructed list== fun< arg1<arg1a/ arg1b>/ arg2<<<nested>>/ z>>

and likewise for string 2
element=null
element=null
element= 1
element= 2 3
element=null
element= 9 8
element=null
element= 7
element= 6
list=< 7 / 6 >
list=< 9 8 / < 7 / 6 >>
element=null
element= 4 5
list=< 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >
list=< < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >>

reconstructed list== < 1 / 2 3 / < 9 8 / < 7 / 6 >>/ / 4 5 >


On 12/5/2012 5:55 AM, Bill Fenlason wrote:

Kermit,

The problem I was trying to address is that of encoding and parsing nested lists within a string. The example was just that - parsing a function call which happens to contain lists. Perhaps I could have been more clear and used a different example.

I do not want to parse the string in order as your code does - I want to encode the string such that the string contains either elements or lists, and then parse it into a sequence of either elements or lists.

As I said:

"In some other programming languages (like lisp and python), a "list" is a fundamental concept.

Since the Rexx family of languages is string oriented, it is not uncommon to encode a list within a string with a separator character, and to process the list with something like:

    parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a list may be encoded with specific matched start and end characters and a separator character (e.g. '(' and ')' and ',' ). If the lists may be nested, there does not appear to be any easy way to parse it while still retaining the list structure."

Possibly you skipped that last phrase and jumped to the example?

To reiterate, the problem is how to manipulate lists and nested lists in NetRexx. As I pointed out, the most natural way would be to somehow encode the lists into a string since Rexx is essentially string based. As I also pointed out in option 4, perhaps there should be some kind of list object instead.

It should be noted that list processing is a very powerful and widely used mechanism. Lisp was defined shortly after Fortran, and is still used as the initial language in many CS curricula (e.g. MIT). The ability to easily manipulate lists would make NetRexx richer. As I mentioned, I started to convert a python program to NetRexx and discovered that there was no easy way.

Here is another example:

Define an element to be a sequence of zero or more blank separated digits.

Define a list to be a sequence of zero or more elements or lists.

How does one encode a list into a string such that it can be easily parsed into its constituent elements and lists?

For example, suppose that we encode a list as a sequence of elements or lists surrounded by '{' and '}' and separated by ',' . Note this is just one way to encode a list - others could be used. Note that other element definitions can exist.

The encoded string " { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } " represents a list whose contents are elements 1 , 2 3 , an imbedded list, a null element and element 4 5.

When parsed and processed, the decoded string would produce the following:
    element 1
    element 2 3
    list { 9 8 , { 7 , 6 } }
    null element
    element 4 5

This is what I meant by retaining the list structure. Subsequent processing could decode the nested list.

The crux of the problem is how to parse content which contains matching start and end characters. In the example above, the first level list contains another list, and therefore another matched '{' '}' pair. There has to be some mechanism which matches the beginning and ending characters while the content may contain additional (nested) pairs.
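
A minimal sketch of such a matching mechanism, written in Python rather than NetRexx purely for illustration: a depth counter records how many '{' are currently open, and a ',' is treated as a separator only while the depth is zero. The name `split_top_level` and its parameters are assumptions made for this sketch; they are not part of any existing Rexx or NetRexx facility.

```python
def split_top_level(encoded, open_ch="{", close_ch="}", sep=","):
    """Split one encoded list into its top-level items.

    Nested lists are returned as still-encoded strings, so the list
    structure is retained for later processing.
    (Hypothetical helper, for illustration only.)
    """
    text = encoded.strip()
    if not (text.startswith(open_ch) and text.endswith(close_ch)):
        raise ValueError("not an encoded list: " + repr(encoded))

    items = []      # completed top-level items
    current = []    # characters of the item being collected
    depth = 0       # number of nested lists currently open

    for ch in text[1:-1]:           # skip the outermost pair
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched " + close_ch)
        if ch == sep and depth == 0:
            # A separator at depth zero ends the current item.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)

    if depth != 0:
        raise ValueError("unmatched " + open_ch)
    items.append("".join(current).strip())
    return items


for item in split_top_level(" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } "):
    print(repr(item))
```

For the encoded string above this yields the five items '1', '2 3', '{ 9 8 , { 7 , 6 } }', '' and '4 5'. Note that under this encoding "{}" comes back as a single null element, so an empty list and a list holding one null element cannot be told apart without an extra rule.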

I'm not familiar with the RegRexx project, but from the name I assume it might be related to regular expressions? If that is the case it may not help, since standard regular expressions cannot by themselves be used to solve the nesting problem (i.e. ensuring that the delimiters are correctly matched).
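
By contrast, a small recursive routine handles the matching naturally, because each nested '{' starts a new call and each '}' returns from it. The following is a sketch only, in Python rather than NetRexx; `parse_list` and `decode` are hypothetical names, and the '{', '}' and ',' encoding is the one assumed in the example above.

```python
def parse_list(text, pos=0):
    """Parse the encoded list that starts at text[pos].

    Returns (items, next_pos), where items holds element strings and
    nested Python lists. (Hypothetical helper, for illustration only.)
    """
    if text[pos:pos + 1] != "{":
        raise ValueError("expected '{' at position " + str(pos))
    pos += 1
    items = []
    sublist = None   # nested list parsed for the current slot, if any
    chars = []       # element characters for the current slot

    while pos < len(text):
        ch = text[pos]
        if ch == "{":
            # Recurse; the call returns just past the matching '}'.
            sublist, pos = parse_list(text, pos)
            continue
        if ch == "," or ch == "}":
            # Close the current slot. Any stray text next to a
            # sublist is ignored in this sketch.
            if sublist is not None:
                items.append(sublist)
            else:
                items.append("".join(chars).strip())
            sublist, chars = None, []
            if ch == "}":
                return items, pos + 1
        else:
            chars.append(ch)
        pos += 1
    raise ValueError("missing closing '}'")


def decode(encoded):
    """Decode a whole encoded string into nested Python lists."""
    text = encoded.strip()
    items, end = parse_list(text, 0)
    if end != len(text):
        raise ValueError("unexpected text after the closing '}'")
    return items


print(decode(" { 1 , 2 3 , { 9 8 , { 7 , 6 } } , , 4 5 } "))
```

The call stack does the bookkeeping that a regular expression has no way to express, which is why the same approach would need an explicit recursive method (or a stack) in NetRexx as well.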

Thanks for taking the time to write and test your code. I see that it has to scan the input one character at a time and the parse instruction is not used. My question was also asking if there was some way to solve this problem by extending the parse instruction.

Bill

On 12/5/2012 6:20 AM, Kermit Kiser wrote:

Hi Bill --

Interesting problem you propose. I wonder if it is related to the RegRexx project and mailing list that RexxLA started a couple years ago and which Rony has been trying to resurrect this year:

http://rice.safedataisp.net/mailman/listinfo/regrexx

Given that language parsing is one of the largest areas of computer programming problems, I don't see that it makes any sense to try and boil it all down to one instruction or method that could be added to a language. On the other hand, some helpful aids like RegRexx might be useful for this kind of thing. If I get time, I may look into that.

Meanwhile, your request is not well defined in terms of input format (Can list items span embedded sublists? Can the list item separator be omitted after a sublist? Can multiple list separators be used? And so on.) and you have not specified any output data structure at all. Looking at your example syntax, I made a few assumptions about input format and created a simple output format in order to devise a sample recursive code approach. Handling this type of syntax is way beyond what a parse instruction can do and I think this example shows that the general case is not trivial. (This program is also interesting because it will not interpret correctly without the trace instruction I inserted. The compiled version does not care. I tested that all the way back to NetRexx 2.05 by the way!)

-- Kermit

-------------------------------------------- Sample parsing code for an interesting syntax example ----------------------------------------------
trace var x

in="fun( arg1(arg1a, arg1b), arg2(((nested)), z) )"

parseout=parselist(in)

dump(parseout)

method dump(data=Rexx,key="",offset="",link="") static
    say offset key link "token="data["token"]
    loop i=1 to data["items"]
        dump(data[i],i,offset "--*","==>")
        end

method parselist(input=Rexx,delims=Rexx "(),",start="1") static
    in=Rexx(input)
    in["count"]=0    --    in order to work recursively, we will need to count how many characters are consumed at each step

    loop label scanloop itemno=1 while start<in.length        --        need to loop in case multiple items
        deloc=in.substr(start).verify(delims,"match")        --    locate next delimiter
        in[itemno]=" "
        if deloc=0 then do --did not find a delimiter
             in[itemno,"token"]=in.substr(start)
             in["count"]=in["count"]+in.substr(start).length
             leave scanloop
            end
        else do --found a delimiter
            if delims.pos(in.substr(start+deloc-1,1))=1 then do        --        found a sublist - handle it recursively
                    in[itemno]=parselist(in.substr(start+deloc),delims)
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    deloc=deloc+in[itemno,"count"]
                    in["count"]=in["count"]+deloc
                    if start+deloc<in.length then do    --    syntax rules are not clear but do allow for a second delimiter after a sublist
                        secdel=in.substr(start+deloc).verify(delims.substr(3,1),"match")
                        if secdel=0 then leave scanloop -- ignore extra junk before end of line unless valid list separator
                        deloc=deloc+secdel
                        in["count"]=in["count"]+secdel
                        end
                    end
            else do    -- found end of item
                    in[itemno,"token"]=in.substr(start,deloc-1)
                    in["count"]=in["count"]+deloc
                    if in.substr(start+deloc-1,1)=delims.substr(2,1) then leave scanloop    --    found end of list indicator
                    end
            end

        start=start+deloc        --        get next area to scan
        end scanloop
        in["items"]=itemno
        return in
-----------------------------------------------------------------------------------------------------------------------------
Program output:
-----------------------------------------------------------------------------------------------------------------------------

   token=fun( arg1(arg1a, arg1b), arg2(((nested)), z) )
--* 1 ==> token=fun
--* --* 1 ==> token= arg1
--* --* --* 1 ==> token=arg1a
--* --* --* 2 ==> token= arg1b
--* --* 2 ==> token= arg2
--* --* --* 1 ==> token=
--* --* --* --* 1 ==> token=
--* --* --* --* --* 1 ==> token=nested
--* --* --* --* 2 ==> token= z

-----------------------------------------------------------------------------------------------------------------------------

On 12/4/2012 1:58 PM, Bill Fenlason wrote:

Rene,

I can see how that might be used if the input consists of blank delimited words, but I'm not sure I understand how you would parse the example I provided - one character at a time?

If you don't know if the next special character is a begin list, end list or separator, how would you specify the template? Of course you could parse the remainder 3 different times and compare lengths etc., but that seems a bit of a kludge, as would parsing 1 character at a time.

What I'm looking for is an outer loop that parses the input into elements or lists (which may contain additional lists).

Bill

On 12/4/2012 6:32 PM, René Jansen wrote:

Bill,

5) just my 2 cents here:

Something that I use often and which seems relevant for this, is the 'recursive parse' stolen from lisp car and cdr:

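loop while cdr.words() > 0    -- the condition must evaluate to 0 or 1
   parse cdr car cdr          -- first word into car, remainder stays in cdr
   do
     car.something()          -- placeholder: process the current element
   end
end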

or some variations thereof

The trick here is to take off the first element and leave the rest for the next iteration; did this already in classic Rexx years ago.
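
For comparison, the same head-and-rest loop can be sketched in Python (the language the original program came from). The names `each_word` and `each_item` are made up for this illustration. The first mirrors the blank-delimited `parse cdr car cdr` form, the second the comma-separated `parse list_contents list_item "," list_contents` form. Neither copes with nested lists.

```python
def each_word(text):
    """Yield blank-delimited words by peeling the first word off each time."""
    cdr = text
    while cdr.split():                    # like: loop while cdr.words() > 0
        parts = cdr.split(None, 1)        # like: parse cdr car cdr
        car = parts[0]
        cdr = parts[1] if len(parts) > 1 else ""
        yield car


def each_item(text, sep=","):
    """Yield separator-delimited items by peeling the first item off each time."""
    rest = text
    while rest:
        item, _, rest = rest.partition(sep)   # head, separator, remainder
        yield item.strip()


print(list(each_word("fun arg1  arg2 ")))
print(list(each_item("1 , 2 3 , 4 5")))
```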

best regards,

René.

On 4 dec. 2012, at 19:09, Bill Fenlason [hidden email] wrote:

This question is primarily for Mike, but I'm sure others of you will have comments or suggestions.

In some other programming languages (like lisp and python), a "list" is a fundamental concept.

Since the Rexx family of languages is string oriented, it is not uncommon to encode a list within a string with a separator character, and to process the list with something like:

    parse list_contents list_item "," list_contents ;

The problem arises when a list may contain other lists. For example, a list may be encoded with specific matched start and end characters and a separator character (e.g. '(' and ')' and ',' ). If the lists may be nested, there does not appear to be any easy way to parse it while still retaining the list structure.

One example would be the parsing of source containing expressions or argument lists.

eg. "fun( arg1(arg1a, arg1b), arg2(((nested)), z) ) ...."

Another example might be an encoding approach which uses '{', '}' and '`' for generic list encoding.

The question is, if NetRexx or Rexx were to be extended to allow convenient parsing of nested lists, how should it be approached?

1) Provide a new statement like "parselist", similar to the parse statement but which allows the specification of beginning and ending characters. The scanning would check for matched pairs and process appropriately.

2) Extend the parse statement by providing a new type of pattern, perhaps a function notation which calls a function to scan ahead, skipping the contents of the data within matched beginning and ending characters.

3) Provide a built in function which perhaps includes the parse statement arguments.

4) Provide a new type of list object, and have the parse statement understand its structure.

5) Some other approach?

6) Ignore this problem.

I assume this topic has been discussed before, but I must have missed it or I don't remember it. It seems to me that this is a general weakness in NetRexx.

How have other people handled this problem in the past?

The topic came up for me when I tried to convert a python program to NetRexx.

_______________________________________________
Ibm-netrexx mailing list
[hidden email]
Online Archive : http://ibm-netrexx.215625.n3.nabble.com/

--
"One can live magnificently in this world if one knows how to work and how to love." -- Leo Tolstoy


kenner

Re: JProgressBar

In reply to this post by Kermit Kiser

Thanks loads, Kermit. That was all I needed. There are now 2 progress bars in my program.

options verbose4 trace1 logo
import javax.swing.
import java.awt.
import java.awt.Graphics
import java.awt.GridBagConstraints
import java.awt.event.ActionListener
import java.awt.event.ActionEvent
import java.awt.event.KeyListener;
import java.awt.event.KeyEvent;
import java.text. -- Needed for the SimpleDateFormat class
import javax.swing.text.
import java.lang.String
import netrexx.lang.Rexx
import javax.swing.

class fcgui extends KeyAdapter uses GridBagConstraints implements ActionListener, Keylistener
Properties constant
keySet = Rexx ' &#\'/.,-+()0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

properties inheritable static
question1 = Rexx[500]
answer1 = Rexx[500]
totalQCount = int 500
search_cnt = int 0
someRandomNum = int -1
response1 = Rexx
retrySwitch = boolean 0
Rand1 = boolean 0
fileName = Rexx
textline = Rexx
DaScore = Rexx
Start_Time = long
try_cnt = 0
rightKeys = int 0
wrongKeys = int 0
rightAnswers = int 0
wrongAnswers = int 0
DrawTheCard_Frame = JFrame
Ananswer = JTextField
mytrue = boolean 1
myfalse = boolean 0
keystyped = Rexx 1
keysaccepted = Rexx 1
keysreleased = Rexx 1
length1 = Rexx 1
A_question = Rexx
A_answer = Rexx
Submit_Check = JButton("This text will never be seen.")
SaveExit = JButton("Save score and exit now.")
advanceThruDeck = JButton("Save score and exit now.")
Append = boolean 1
delta = long
Curr_Time = long
Line_Bld = Rexx
text3 = JTextarea()
text4 = JTextarea()
QBox_1 = JTextarea()
InputText = JTextarea(" <-- Type your answer here.")
tittleInstruction = String 'This is where the instructions should go.'
textForSetting = String 'this was put in the textforsetting string under properties with all this size.....'
this_ans = String
Debug_Level = int 4
ans_done = int[500]
ans_wrong = int[500]
ans_done_counter = int 0
ans_wrong_counter = int 0
ansDoneTableFile = rexx

pb=JProgressBar(0,100)
pb2=JProgressBar(0,100)

trace var keystyped key A_answer keysaccepted rightAnswers wrongAnswers elapsedTime ans_done -
retrySwitch try_cnt totalQCount textForSetting Rand1

method main(s=String[]) static
IsFile("txt_files") -- checks for a hardcoded directory called txt_files, creates it if needed
-- Normally executed from FrontEnd2; if run standalone, these text items are listed
if question1[0] == null then do
loop label PickFile Until response1 = "C" | response1 = "F"| -
response1 = "G" | response1 = "Y"| -
response1 = "T" | response1 = "S"| -
response1 = "M" | response1 = "D"
say
say 'Enter a "C" for the Commercial Examples exercises.'
say
say 'Enter a "F" for the Fermentation Fill-in-the-Blank quiz.'
say
say 'Enter a "M" for the Miscellaneous Fill-in-the-Blank quiz.'
say
say 'Enter a "Y" for the Yeast name for each substyle.'
say
say 'Enter a "T" for the Troubleshooting quiz, in which styles are these off flavors allowed.'
say
say 'Enter a "S" for the Substyle quiz, matching subcategory numbers with names.'
say
say 'Enter a "G" for the Groups quiz, matching subcategory with group number (1-9).'
say
say 'Enter a "D" for the True/False quiz about DMS and Diacetyl.'
say
response1=ask
end PickFile
fs = string
fs = System.getProperty("file.separator");
say "This is the file name separator: " fs
if response1.upper = "C" then fileName= 'Commercial_examples.txt'
if response1.upper = "M" then fileName= 'flashcards.txt'
if response1.upper = "F" then fileName= 'fermentation.txt'
if response1.upper = "Y" then fileName= 'Yeast for substyle.txt'
if response1.upper = "T" then fileName= 'TroubleShooting.txt'
if response1.upper = "S" then fileName= 'Categories.txt'
if response1.upper = "G" then fileName= 'groups.txt'
if response1.upper = "D" then fileName= 'DMS_Diacetyl.txt'

say 'Enter a "Y" to be prompted in random order. Otherwise the input will be read sequentially.'
say
response1=ask

if response1.upper = "Y" then Rand1 = 1
else Rand1 = 0
say 'Enter a "Y" to be limited to entering only the correct keystroke (very hard). Otherwise you can guess the whole answer and if incorrect will have to guess again.'
say
response1=ask

if response1.upper = "Y" then retrySwitch = 0
else retrySwitch = 1

LoadTable(fileName)
end

fcgui(totalQCount, Rand1, retrySwitch)

/* Open and check the files */
-- this would be a good place to instantiate an object of type "cardDeck"
-- cardDeck could have properties like aQuestion[] and anAnswer[] and cardRetired = Boolean
Method LoadTable(aFileName = Rexx) static SIGNALS IOException --, FileNotFoundException
trace results
if Debug_Level > 15 then trace var rightAnswer wrongAnswer
reset()
fs = string
fs = System.getProperty("file.separator");
fileName = aFileName
-- rightAnswers = 0
-- wrongAnswers = 0
do
bufferIn=BufferedReader(FileReader("txt_files" || fs || afileName))
say 'Processing infile.'
catch Z=IOException --, X=FileNotFoundException
say '# error opening file' Z.getMessage
-- say '# File not found.' X.getMessage
-- return inhandle
end
/* The processing loop to load our table from the txt file. ***/
loop totalQCount = 0
line = bufferIn.readLine -- get next line as Rexx string
if line == null then leave totalQCount -- normal end of file
parse line '"' myanswer '"' myquestion
question1[totalQCount] = myquestion.strip.strip('B','\t')
answer1[totalQCount] = myanswer.strip.strip('B','\t')
ans_done[totalQCount] = 0
end totalQCount
say 'Total questions to be asked =' totalQCount "-->Press enter to proceed..."
-- response1=ask
-- if response1 == "Q" | response1 == "q" | response1 == "X" | response1 == "x" then exit
-- This is where we read in the file with the status of each "card": right(done) or wrong-count
-- fs = string
fs = System.getProperty("file.separator");
IsFile("donesofar")
do
bufferIn=BufferedReader(FileReader("donesofar" || fs || aFileName))
say 'Processing infile.'
catch Z=IOException -- X=FileNotFoundException
say '# error opening file' Z.getMessage
-- catch X=FileNotFoundException
-- say '# File not found.' X.getMessage
-- return inhandle
end
rightAnswers = 0
wrongAnswers = 0
loop mySubscript = 0 by 1
lineIn = bufferIn.readLine -- get next line as Rexx string
if lineIn = null then leave mySubscript -- normal end of file
parse lineIn ans_done_counter "<>" ans_wrong_counter
ans_done[mySubscript] = ans_done_counter
ans_wrong[mySubscript] = ans_wrong_counter
if ans_done[mySubscript] == 1 then rightAnswers = rightAnswers + 1
if ans_wrong[mySubscript] >= 1 then wrongAnswers = wrongAnswers + ans_wrong[mySubscript]
end mySubscript
-- these progress bars have not been created yet.
-- pb.setValue(((rightAnswers / totalQCount) * 100 ) %1) -- increment the progress bar value
-- pb2.setValue(((rightAnswers / (rightAnswers + wrongAnswers)) * 100 ) %1) -- increment the progress bar value
if rightAnswers >= totalQCount then PbtRevertToZeros
/* The main processing loop. ***/
method fcgui(TotalLines = int, Randomize = boolean, retryOnOff = boolean)
if Debug_Level > 3 then trace results
Rand1 = Randomize
retrySwitch = retryOnOff
say "Are we running in full word answer mode or checking every keystroke?" retrySwitch
say "Are we presenting the prompts in sequential order or randomizing?" Rand1
DrawTheCard()
ShowNextCard(Rand1)

method DrawTheCard()
if Debug_Level > 3 then trace results

-- Create a frame DrawTheCard_Frame

select
when fileName.pos('DMS_D') > 0 then tittleInstruction = 'True or false in which categories DMS, diacetyl or acetaldehyde may be present.'
when fileName.pos('Comme') > 0 then tittleInstruction = 'Name of a commercial example for each of every substyle.'
when fileName.pos('flash') > 0 then tittleInstruction = 'Fill in the blanks on the general brewing process.'
when fileName.pos('ferme') > 0 then tittleInstruction = 'Fill in the blanks on the topic of fermentation.'
when fileName.pos('Yeast') > 0 then tittleInstruction = 'Name of a yeast for this beer style. (Hint: all WLP)'
when fileName.pos('Troub') > 0 then tittleInstruction = 'Fill in the blanks concerning troubleshooting faults in finished beer.'
when fileName.pos('Categ') > 0 then tittleInstruction = 'Name the number of the category and letter of the subcategory.'
when fileName.pos('group') > 0 then tittleInstruction = 'Reply with the number of the group for the substyle (per Techam).'
otherwise say "This should not have happened, darn." -- we don't expect anything else
end
-- place all the components on the frame
DrawTheCard_Frame = JFrame(tittleInstruction)
gbl = GridBagLayout()
DrawTheCard_Frame.setLayout(gbl)
myConstraints = GridBagConstraints()
myConstraints.anchor=GridBagConstraints.PAGE_START
myConstraints.fill=GridBagConstraints.BOTH
myConstraints.insets.top = 5
myConstraints.insets.left = 4
myConstraints.insets.bottom = 5
myConstraints.insets.right = 4
PhysScreenDimensions = DrawTheCard_Frame.getToolkit().getScreenSize()
-- Set the size of the DrawTheCard_Frame
DrawTheCard_Frame.setSize(PhysScreenDimensions.width/2, PhysScreenDimensions.height/3)
thisFrameSize = DrawTheCard_Frame.getSize()
say "PhysScreenDimensions" PhysScreenDimensions.width PhysScreenDimensions.height
say "thisFrameSize" thisFrameSize.width thisFrameSize.height
-- Set the DrawTheCard_Frame position to the middle of the screen
DrawTheCard_Frame.setLocation((PhysScreenDimensions.width - thisFrameSize.width) % 2,(PhysScreenDimensions.height - thisFrameSize.height)%3)
--
-- Create a label with the question.
QBox_1 = JTextarea()
QBox_1.setForeground( Color.blue )
QBox_1.setFont(Font("Arial", Font.BOLD, 14))
QBox_1.setEditable(boolean 0)
QBox_1.setLineWrap(boolean 1);
QBox_1.setWrapStyleWord(boolean 1)
-- myConstraints.gridwidth = 3
-- myConstraints.gridheight = 3
myConstraints.fill=GridBagConstraints.BOTH
myConstraints.weighty = 0.5
myConstraints.gridx=0
myConstraints.gridy=0
gbl.setConstraints(QBox_1,myConstraints)
DrawTheCard_Frame.add( QBox_1, myConstraints )

InputText = JTextarea(123456789012345678901234567890)
InputText.setLineWrap(mytrue)
InputText.setForeground( Color.red )
InputText.setFont( Font( "Arial", Font.BOLD, 14) )
InputText.addKeyListener(this)
myConstraints.fill=GridBagConstraints.BOTH
myConstraints.weighty = 0.5
myConstraints.gridx=1
myConstraints.gridy=0
myConstraints.weightx = 2
-- myConstraints.ipadx=1
-- myConstraints.REMAINDER = 2 DrawTheCard_Frame.add( InputText, myConstraints )
-- myConstraints.gridwidth=GridBagConstraints.REMAINDER
-- myConstraints.gridwidth=3
gbl.setConstraints(InputText,myConstraints)
DrawTheCard_Frame.add(InputText, myConstraints)

text3 = JTextarea()
-- build the text3 text on the frame
-- f = SimpleDateFormat("H:mm:ss" ) -- Formats hours:minutes:seconds
if (rightAnswers + wrongAnswers) == 0 then DaScore = 0
else DaScore = (rightAnswers / (rightAnswers + wrongAnswers) * 100 %1)
textForSetting = "Starting at: 00:00:00 Elapsed time:" 000000 "minutes."
text3.setText(textForSetting)
myConstraints.fill=GridBagConstraints.BOTH
myConstraints.weighty = 0.3
myConstraints.gridx=0
myConstraints.gridy=1
myConstraints.ipadx=0
gbl.setConstraints(text3,myConstraints)
DrawTheCard_Frame.add(text3, myConstraints)

-- build the text4 text on the frame
text4 = JTextarea()
if rightAnswers + totalQCount > 0 then perCentRight = ( 100 * ( rightAnswers / ( rightAnswers + totalQCount ) ) %1)
if rightAnswers + totalQCount == 0 then perCentRight = 0
textForSetting = rightAnswers 'of' fcgui.totalQCount 'complete.' perCentRight '% \n' "Overall percent of correct responses: " DaScore '% \n'
text4.setText(textForSetting)
myConstraints.weighty = 0.3
myConstraints.gridx=1
myConstraints.gridy=1
myConstraints.fill=GridBagConstraints.BOTH
gbl.setConstraints(text4,myConstraints)
DrawTheCard_Frame.add(text4, myConstraints)

-- Create a submit and check the answer button, add it, and set up its event handler.
Submit_Check = JButton("Submit and check your answer now.")
Submit_Check.setMnemonic('C')
Submit_Check.setActionCommand("Submit_Check")
Submit_Check.addActionListener(this)
Submit_Check.setToolTipText("Click this button to check to see if your answer is correct.")
Submit_Check.addKeyListener(this)
myConstraints.weighty = 0.2
myConstraints.gridx=0
myConstraints.gridy=2
myConstraints.fill=GridBagConstraints.BOTH
gbl.setConstraints(Submit_Check,myConstraints)
DrawTheCard_Frame.add(Submit_Check, myConstraints)

-- Create a save and exit button, add it, and set up its event handler.
SaveExit = JButton("Save score and exit now.")
SaveExit.setMnemonic('S')
SaveExit.setActionCommand("WriteItOut")
SaveExit.addActionListener(this)
SaveExit.setToolTipText("Click this button to save your score and exit the program.")
myConstraints.fill=GridBagConstraints.BOTH
myConstraints.weighty = 0.2
myConstraints.gridx=1
myConstraints.gridy=2
gbl.setConstraints(SaveExit,myConstraints)
DrawTheCard_Frame.add(SaveExit, myConstraints)

-- Create a button to pick the next card
advanceThruDeck = JButton("Display the next card.")
advanceThruDeck.setMnemonic('D')
advanceThruDeck.setActionCommand("Advance")
advanceThruDeck.addActionListener(this)
advanceThruDeck.setToolTipText("Click this button to advance to the next card.")
myConstraints.fill=GridBagConstraints.BOTH
myConstraints.weighty = 0.2
myConstraints.gridx=0
myConstraints.gridy=3
myConstraints.anchor=GridBagConstraints.PAGE_END
gbl.setConstraints(advanceThruDeck,myConstraints)
DrawTheCard_Frame.add(advanceThruDeck, myConstraints)
-- create a button to reset the status of all the 'cards" to zeros
PbtRevertToZeros = JButton('Reset all scores') -- define a push button
PbtRevertToZeros.setMnemonic('R')
PbtRevertToZeros.setActionCommand("Reset")
PbtRevertToZeros.addActionListener(this)
PbtRevertToZeros.setToolTipText("Click this button to change status of all cards to undone.")
myConstraints.fill=GridBagConstraints.BOTH
myConstraints.weighty = 0.2
myConstraints.gridx=1
myConstraints.gridy=3
myConstraints.anchor=GridBagConstraints.PAGE_END
gbl.setConstraints(PbtRevertToZeros,myConstraints)
DrawTheCard_Frame.add(PbtRevertToZeros, myConstraints)

myConstraints.gridx=0
myConstraints.gridy=4
pb=JProgressBar(0,100) -- make a progress bar
pb.setStringPainted(1)
pb.setString("% completed")
pb.setValue(((rightAnswers / totalQCount) * 100 ) %1)
gbl.setConstraints(pb,myConstraints)
DrawTheCard_Frame.add(pb, myConstraints) -- add it to the panel

myConstraints.gridx=1
myConstraints.gridy=4
pb2=JProgressBar(0,100) -- make a progress bar
pb2.setStringPainted(1)
pb2.setString("% correct of all answers")
if (rightAnswers + wrongAnswers) == 0 then DaScore = 0
else DaScore = (rightAnswers / (rightAnswers + wrongAnswers) * 100 %1)
pb2.setValue(DaScore)
gbl.setConstraints(pb2,myConstraints)
DrawTheCard_Frame.add(pb2, myConstraints) -- add it to the panel

-- show the DrawTheCard_Frame
DrawTheCard_Frame.setVisible(1)
-- InputText.requestFocusinWindow()
-- add the window event listener to the window for close window events
DrawTheCard_Frame.addWindowListener( CloseWindowAdapter())
return 1

method ShowNextCard(Randomize)
QBox_1.setBackground( Color.white )
QBox_1.setForeground( Color.blue )
if Debug_Level > 3 then trace results
Rand1 = Randomize
if javax.swing.SwingUtilities.isEventDispatchThread then say "We are running in the EDT, event dispatching thread."
keystyped = Rexx 0
keysaccepted = Rexx 0
keysreleased = Rexx 0
someRandomNum = getNextCardNum(totalQCount, Rand1)
A_question=question1[someRandomNum]
A_answer=answer1[someRandomNum]
-- ans_done[someRandomNum] = 1
f = SimpleDateFormat("H:mm:ss" ) -- Formats hours:minutes:seconds
textForSetting = "Currently: " f.format(Date()) '\n'-
"Elapsed time:" elapsed() "minutes."
text3.setText(textForSetting)
if (rightAnswers + wrongAnswers) == 0 then DaScore = 0
else DaScore = (rightAnswers / (rightAnswers + wrongAnswers) * 100 %1)
textForSetting = rightAnswers 'of' fcgui.totalQCount 'complete.' (100 * (rightAnswers / fcgui.totalQCount) %1) '% Incorrect attempts: ' wrongAnswers '\n' -
"Overall percent of correct responses: " DaScore '% \n'
text4.setText(textForSetting)
if javax.swing.SwingUtilities.isEventDispatchThread then say "We are running in the EDT, event dispatching thread."
QBox_1.setText( A_question)
if A_answer = "2-3" then QBox_1.setBackground( Color(255,255,138) )
if A_answer = "3-4" then QBox_1.setBackground( Color.YELLOW )
if A_answer = "5-6" then QBox_1.setBackground( Color(255,215,008) )
if A_answer = "6-9" then QBox_1.setBackground( Color(255,181,008) )
if A_answer = "10-14" then QBox_1.setBackground( Color(255,156,008) )
if A_answer = "14-17" then QBox_1.setBackground( Color(200,125,008) )
if A_answer = "17-18" then QBox_1.setBackground( Color(200,100,008) )
if A_answer = "19-22" then QBox_1.setBackground( Color(200,082,008) )
if A_answer = "22-30" then QBox_1.setBackground( Color(139,037,001) )
if A_answer = "30-35" then QBox_1.setBackground( Color(94,39,19) )
if A_answer = "30+" then QBox_1.setBackground( Color.BLACK )
if A_answer = "40+" then QBox_1.setBackground( Color.BLACK )
InputText.setText("")
-- InputText.setCaretPosition(InputText.GetText.length + 1);
InputText.requestFocusinWindow()
DrawTheCard_Frame.setVisible(1)
DrawTheCard_Frame.validate()

-- this is where we check the answer that was entered for being correct or not
method Submit_Check(Answer_In = String)
if Debug_Level > 3 then trace results
say "InputText " InputText.getText
this_ans = InputText.getText
this_ans = this_ans.trim()
this_ans = this_ans.toUpperCase()
if this_ans.length > Answer_In.length then do
this_ans = this_ans.replaceAll("[\\r\\n]", "") -- method will not work on type Rexx
this_ans = this_ans.substring(int 0,Answer_In.length)
say "this_ans " this_ans
end
Answer_In = Answer_In.toUpperCase
say "Answer_In " Answer_In
-- check the answer supplied by the user with the one saved that corresponds to the current prompt
if Answer_In.equals(this_ans) then do
say "That is correct."
JOptionPane.showMessageDialog(DrawTheCard_Frame, "Correct! ---> " Answer_In)
rightAnswers = rightAnswers + 1
ans_done[someRandomNum] = 1
say "Number of times thru so far: " rightAnswers wrongAnswers "<--- any values in there??? "
end
else do
wrongAnswers = wrongAnswers + 1
ans_wrong[someRandomNum] = 1
JOptionPane.showMessageDialog(DrawTheCard_Frame, "Wrong! It should have been ---> " Answer_In)
end
if (rightAnswers + wrongAnswers) = 0 then DaScore = 0
else DaScore = ((rightAnswers / (rightAnswers + wrongAnswers)) * 100 %1)
pb.setValue(((rightAnswers / totalQCount) * 100 ) %1) -- increment the progress bar value
pb2.setValue(((rightAnswers / (rightAnswers + wrongAnswers)) * 100 ) %1) -- increment the progress bar value
if rightAnswers > totalQCount then do
JOptionPane.showMessageDialog(DrawTheCard_Frame, "OK. That is the lot of them. Congrats!!")
DisplayScore(DaScore)
WriteAndLeave(DaScore)
DrawTheCard_Frame.dispose()
GuiClose()
return
end
DisplayScore(DaScore)
ShowNextCard(Rand1)
-- DrawTheCard_Frame.dispose()
-- fcgui(totalQCount, bool, retrySwitch)

method keyTyped(e=Keyevent)
if Debug_Level > 3 then trace results
-- if Debug_Level = 5 then say " keystyped key A_answer keysaccepted rightAnswers wrongKeys"
key = Rexx e.getKeyChar() -- make key of type Rexx for further use
if key.c2d() == KeyEvent.VK_ENTER then Submit_Check(A_answer)
key = key.upper() -- keep the upper-cased key for the checks and for setKeyChar below
keystyped = keystyped + 1
if retrySwitch then return
if key.c2d() == KeyEvent.VK_BACK_SPACE then return
say A_answer "<-- this is the answer."
if keyset.pos(key) == 0 then
e.consume
else
if key \= A_answer.substr(keysaccepted+1,length1) then
e.consume
else do
e.setKeyChar(key)
keysaccepted = keysaccepted + 1
end
method keyReleased(e=Keyevent)
if Debug_Level > 3 then trace results
if Debug_Level > 9 then trace var rightKeys
keysreleased = keysreleased + 1
-- if keysreleased > A_answer.length then Submit_Check(A_answer) -- This makes things faster, but confusing.
if retrySwitch then return
numeric digits 4
if keysaccepted >= A_answer.length then do
try_cnt = try_cnt + 1
rightAnswers = rightAnswers + 1
ans_done[someRandomNum] = 1
-- f = SimpleDateFormat("H:mm:ss" ) -- Formats hours:minutes:seconds
JOptionPane.showMessageDialog(DrawTheCard_Frame, "Correct! \""A_answer"\" \n" -
"Total keys typed: " keystyped "\n" -
"Wrong keys typed: " keystyped - A_answer.length "\n" -
"Percent right : " A_answer.length / keystyped * 100 -
)
-- DrawTheCard_Frame.dispose()
rightKeys = rightKeys + A_answer.length
wrongKeys = wrongKeys + keystyped - A_answer.length
-- say rightKeys wrongKeys "<--- any values in there??? "
-- say "Number of times thru so far: " try_cnt
if (rightKeys + wrongKeys) = 0 then DaScore = 0
else DaScore = (rightKeys / (rightKeys + wrongKeys) * 100 %1)

if try_cnt > fcgui.totalQCount then do
DisplayScore(DaScore)
WriteAndLeave(DaScore)
GuiClose()
end
keysaccepted = 0
DisplayScore(DaScore)
ShowNextCard(Rand1)
end

method getNextCardNum(numberIn = int, Randomize = boolean) returns int
if Debug_Level > 3 then trace results

-- first check to see that there is an unanswered question in the table
search_cnt = 0
loop label scanTable tableEntry = 0 to numberIn until ans_done[tableEntry] == 0
search_cnt = search_cnt + 1
if search_cnt >= numberIn then do
JOptionPane.showMessageDialog(DrawTheCard_Frame, "OK. That is the lot of them. Congrats!!")
DisplayScore(DaScore)
WriteAndLeave(DaScore)
DrawTheCard_Frame.dispose()
GuiClose()
exit 0
end
end scanTable
If Randomize then do
loop label randomSearch until ans_done[someRandomNum] == 0
someRandomNum = GetRand(numberIn)
say "someRandomNum = " someRandomNum
end randomSearch
End
else do
someRandomNum = someRandomNum + 1
if someRandomNum > (numberIn - 1) then someRandomNum = 0
loop label findundone while ans_done[someRandomNum] == 1
someRandomNum = someRandomNum + 1
if someRandomNum > (numberIn - 1) then someRandomNum = 0
end findundone
End
return someRandomNum

method actionPerformed(evt=ActionEvent)
if Debug_Level > 3 then trace results
select
when evt.getSource() == advanceThruDeck then do
ShowNextCard(Rand1)
end
when evt.getSource() == SaveExit then do
WriteAndLeave(DaScore)
GuiClose()
return
end
when evt.getSource() == Submit_Check then do
Submit_Check(A_answer)
end
when evt.getSource() == PbtRevertToZeros then do
PbtRevertToZeros()
DisplayScore(DaScore)
end
otherwise say "This should not have happened, darn." -- we don't expect anything else
finally DisplayScore(DaScore)
end

method DisplayScore(textout2 = int)
if Debug_Level > 3 then trace results
if (rightAnswers + wrongAnswers) == 0 then DaScore = 0
else DaScore = (rightAnswers / (rightAnswers + wrongAnswers) * 100 %1)
JOptionPane.showMessageDialog(DrawTheCard_Frame, -
"Elapsed time:" elapsed() "minutes. \n" -
rightAnswers 'of' fcgui.totalQCount 'complete.' (100 * (rightAnswers / fcgui.totalQCount) %1) '% \n' -
"Overall percent of correct answers: " DaScore '% \n')
f = SimpleDateFormat("H:mm:ss" ) -- Formats hours:minutes:seconds
textForSetting = "Currently: " f.format(Date()) '\n'-
"Elapsed time:" elapsed() "minutes."
text3.setText(textForSetting)
textForSetting = rightAnswers 'of' fcgui.totalQCount 'complete.' (100 * (rightAnswers / fcgui.totalQCount) %1) '% Incorrect attempts: ' wrongAnswers '\n' -
"Overall percent of correct responses: " DaScore '% \n'
text4.setText(textForSetting)
-- pb.setValue(DaScore) -- increment the progress bar value
--
method reset() public static returns long
Start_Time = System.currenttimemillis
say date()
return Start_Time

method elapsed() public static returns Rexx
if Debug_Level > 3 then trace results
Curr_Time=System.currenttimemillis
-- numeric digits 16
elapsedTime = (Curr_Time - fcgui.Start_Time) / 60000 %1
elapsedTime = elapsedTime.format(NULL,0)
say elapsedTime date()
return elapsedTime

method GetRand(sub_l = int) static returns int
I = (sub_l * Math.random()) % 1 -- %1 (integer divide by 1) truncates the result to an integer
RETURN I

method WriteAndLeave(OutPut1) static
if Debug_Level > 3 then trace results
do
output = 'BJCPquiz_scores.txt'
outFile = FileWriter(output, Append) -- output file; the boolean flag selects append mode
dest = PrintWriter(outFile) -- wrap the file in a PrintWriter so println can be used
f = SimpleDateFormat("yy/MM/dd HH:mm" ) -- Formats year/month/day hours:minutes
Sortable_Date = f.format(Date())
Line_Bld = fcgui.fileName Sortable_Date elapsed() "Minutes. Completed:" rightAnswers " " OutPut1 || "% correct."
dest.println(Line_Bld)
dest.close() -- close files
catch IOException
say 'I/O Exception.'
end
do
IsFile("donesofar")
fs = System.getProperty("file.separator");
output = "donesofar" || fs || FileName
outFile = FileWriter(output, 0) -- output file; 0 = do not append, overwrite the file
dest = PrintWriter(outFile) -- wrap the file in a PrintWriter so println can be used
-- copy the table of correctly answered question numbers out to disk
loop label writeAns_doneTable qNumber = 0 to totalQCount
dest.println(ans_done[qNumber] "<>" ans_wrong[qNumber])
end writeAns_DoneTable
dest.close() -- close files
catch IOException
say 'I/O Exception with the' output '.'
end
JOptionPane.showMessageDialog(DrawTheCard_Frame, "Files updated, exiting...")
GuiClose()
return 1

method GuiClose() static
DrawTheCard_Frame.dispose()

method PbtRevertToZeros static
if Debug_Level > 3 then trace results
rightAnswers = 0
wrongAnswers = 0
pb.setValue(0) -- reset the progress bar to zero
pb2.setValue(0) -- reset the second progress bar to zero
-- SetTextOfScoreLine()
do
fs = System.getProperty("file.separator");
output = "donesofar" || fs || FileName
outFile = FileWriter(output, 0) -- output file; 0 = do not append, overwrite the file
dest = PrintWriter(outFile) -- wrap the file in a PrintWriter so println can be used
-- set all the flags in the file back to 0 = question has not been asked yet.
loop label writeAns_doneTable qNumber = 0 to totalQCount
dest.println("0 <> 0")
ans_done[qNumber] = 0
ans_wrong[qNumber] = 0
end writeAns_DoneTable
dest.close() -- close files
catch IOException
say 'I/O Exception with the' output '.'
end
return 1
class CloseWindowAdapter extends WindowAdapter
-- /*-------------------------------------------------------------------------------
-- The CloseWindowAdapter exits the application when the window is closed.
-- WindowAdapter is an abstract class which implements a WindowListener interface.
-- The windowClosing() method is called when the window is closed.
-- -----------------------------------------------------------------------------*/
method windowClosing( e=WindowEvent )
return

Kenneth Klein

Kermit Kiser <[hidden email]>
Sent by: [hidden email]

12/05/2012 04:36 PM

Please respond to
IBM Netrexx <[hidden email]>

To	IBM Netrexx <[hidden email]>
cc
Subject	Re: [Ibm-netrexx] JProgressBar

Not sure how simple it is but here is one that I have. It does some funny thread looping stuff that you would not want to do in a real app, but that is because it was designed as a special purpose demo for NetRexxScript in jEdit.

-- Kermit

------------------------------------------------------------------------------------------------------------------------------------------
import javax.swing.
-- ActionListener and ActionEvent are in java.awt.event
import java.awt.event.
import java.text.

class progressbardemo implements ActionListener -- ActionListener interface lets the GUI objects talk to the program code

  properties static
    frame=JFrame -- holder for a GUI window
    textfield=JTextField -- holder for some text to edit

  method main(sa=String[]) static
    frame=JFrame("Sample GUI window") -- create a GUI window frame
    frame.setSize(400,100) -- give the window some space on the screen
    panel=JPanel() -- create a panel to hold some GUI objects
    frame.add(panel) -- put the panel in the window frame

    parse Date().toString a b c d e f
    textfield=JTextField(b c f) -- create a spot for some text
    panel.add(textfield) -- add the text field to the panel

    button=JButton("OK") -- create a button to click
    button.addActionListener(progressbardemo()) -- attach some code (an instance of this class) to watch the button
    panel.add(button) -- put the button in the panel

    pb=JProgressBar(0,100) -- make a progress bar
    panel.add(pb) -- add it to the panel

    frame.show -- display the GUI window on the screen

    loop i=1 to 100 -- loop for 100 increments as we specified in the progress bar
      Thread.sleep(100) -- wait 1/10 second
      pb.setValue(i) -- increment the progress bar value
    end

    loop while frame\=null;Thread.sleep(100);end -- wait for the GUI window to do something

  method actionPerformed(e=ActionEvent) -- this is the code that listens to the button
    say textfield.getText -- show the text field contents
    frame.dispose -- clear the GUI window frame from screen and memory
    frame=null -- erase the pointer to stop the main program

------------------------------------------------------------------------------------------------------------------------------------------
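For an app that should not sleep in a loop on the main thread, one common alternative is to let a javax.swing.Timer drive the bar. The sketch below is in plain Java rather than NetRexx, since the Swing classes are the same from either language. It is not taken from the demo above: the names TimerProgressSketch, nextValue and progressTimer are made up for illustration, and the 100 ms period simply mirrors the sleep used in the demo.

```java
import javax.swing.JFrame;
import javax.swing.JProgressBar;
import javax.swing.SwingUtilities;
import javax.swing.Timer;

// Illustrative sketch only; class and method names are hypothetical.
final class TimerProgressSketch {

    // Next bar value: advance by one step, but never past max.
    static int nextValue(int current, int max) {
        return Math.min(current + 1, max);
    }

    // Builds a timer that advances the bar once per period and
    // stops itself when the bar reaches its maximum.
    static Timer progressTimer(JProgressBar bar, int periodMs) {
        Timer timer = new Timer(periodMs, null);
        timer.addActionListener(e -> {
            bar.setValue(nextValue(bar.getValue(), bar.getMaximum()));
            if (bar.getValue() >= bar.getMaximum()) {
                timer.stop();
            }
        });
        return timer;
    }

    public static void main(String[] args) {
        // Build and show the GUI on the event dispatch thread.
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("Timer-driven progress bar");
            frame.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
            JProgressBar bar = new JProgressBar(0, 100);
            frame.add(bar);
            frame.setSize(400, 100);
            frame.setVisible(true);
            progressTimer(bar, 100).start(); // one step every 1/10 second
        });
    }
}
```

The timer's listener runs on the event dispatch thread, so setValue is called from the same thread that paints the bar and no Thread.sleep loop is needed.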

On 12/5/2012 9:29 AM, kenneth.klein@... wrote:

Does anyone have a simple example of using JProgressBar in NetRexx?

Kenneth Klein


_______________________________________________
Ibm-netrexx mailing list
[hidden email]
Online Archive : http://ibm-netrexx.215625.n3.nabble.com/

Fernando Cassia-2

Re: JProgressBar

In reply to this post by Kermit Kiser

On Wed, Dec 5, 2012 at 6:36 PM, Kermit Kiser <[hidden email]> wrote:

------------------------------------------------------------------------------------------------------------------------------------------
import javax.swing.
import java.text.

Pastebin is your friend. :)

And thanks for the code snippet, btw :)
FC

--
During times of Universal Deceit, telling the truth becomes a revolutionary act
Durante épocas de Engaño Universal, decir la verdad se convierte en un Acto Revolucionario
- George Orwell

