Hi NetRexxers! Producing an HTML scanner & editor program using NetRexx I had to choose between three opportunities for storing Web-names as: 1. a simple non-ordered list of a names; => wordpos method as an access tool; 2. an ordered list of a names; => BinSearch method as an access tool; 3. an ordered indexed list of a names; => ABinSearch method as an access tool; Very simple and short variants of a BinSearch are given below. I have tested three methods mentioned above using a list of 129 Web-names which were stored in my MS IE 4.0 cache. The number of requests varied from 100 to 4000 and consisted of an equal parts of a words from the Web-names list and from the words absent in the Web-names list. The results are: WORDPOS is always the best, BinSearch for an indexed strings is always the second and BinSearch for nonindexed strings - the last. All three methods asymptotically tends to give the same timing. It would be natural to assume according to these results that a Rexx string class is an ordered collection of the words and wordpos is an efficiently realized binary search. On the other hand absence of a SORT method in the Rexx string class contradicts to this assumption. So would anybody say where an inefficiency of my ABinSearch comparing wordpos lies? ( As for BinSearch - subword is time-wasting for sure ). ( I suppose also that s[i] operation is TIMES FASTER then wordpos ). -- Duplicates are absent in the argument lists -- w - input word which presense in the list is checked. -- s - input ordered list of words -- wp - current word of the list -- nl - number of the words on the left including wp -- nr - number of the words on the right of wp -- non-strict compare is used to encrease speed. class test binary public method BinSearch( w = Rexx, s = Rexx ) public static returns boolean If ( s = "" ) | ( w = "" ) Then return boolean 0 n = int s.words() nl = int ( n + 1 ) % 2 nr = int n - nl wp = s.word( nl ) Loop forever if ( w = wp ) Then return boolean 1 if ( w < wp ) Then Do if ( nl <= 1 ) Then return boolean 0 s = s.subword( 1, nl - 1 ) n = nl - 1 nl = nl % 2 End Else Do if ( nr = n ) - | ( nr = 0 ) Then return boolean 0 s = s.subword( nl + 1 ) n = nr nl = ( nr + 1 ) % 2 End nr = n - nl wp = s.word( nl ) End method ABinSearch( w = Rexx, s = Rexx ) public static returns boolean If ( w = "" ) Then return boolean 0 n = s[0] nl = ( n + 1 ) % 2 nr = n - nl wp = s[nl] Loop forever if ( w = wp ) Then return boolean 1 if ( w < wp ) Then Do if ( nl <= 1 ) Then return boolean 0 n = nl - 1 nl = ( n + 1 ) % 2 End Else Do if ( nr = n ) - | ( nr = 0 ) Then return boolean 0 nl = nl + ( nr + 1 ) % 2 End nr = n - nl wp = s[nl] End Sorry for the poor English and too long novice question, Oleg Kulikov, PIE SI, [hidden email], [hidden email] . ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To unsubscribe from this mailing list ( ibm-netrexx ), please send a note to [hidden email] with the following message in the body of the note unsubscribe ibm-netrexx <e-mail address> |
> Oleg A. Kulikov[SMTP:[hidden email]] wrote
> [...] Producing an HTML scanner & editor program using NetRexx I had to choose between three opportunities for storing Web-names as: 1. a simple non-ordered list of a names; => wordpos method as an access tool; 2. an ordered list of a names; => BinSearch method as an access tool; 3. an ordered indexed list of a names; => ABinSearch method as an access tool; [...] The results are: WORDPOS is always the best, BinSearch for an indexed strings is always the second and BinSearch for nonindexed strings - the last. [...] It would be natural to assume according to these results that a Rexx string class is an ordered collection of the words and wordpos is an efficiently realized binary search. On the other hand absence of a SORT method in the Rexx string class contradicts to this assumption. It never pays to assume to much. word() will be relatively fast for positions close to the beginning of the string, but it will be slow for words near the end of the string, since it has to do a linear search through the string for word delimiters. The trouble with ABinSearch is that it uses the Rexx associative array mechanism the way you would use a normal array, and it does arithmetic with variables of type Rexx, which is relatively slow. If you put options binary at the top of your .nrx file, change the line that assigns s[0] to n: n = int s[0] and make the s argument of ABinSearch to be of type rexx[], then ABinSearch will be faster than using word(). However, NetRexx does provide a better way of doing this. You can use the associative array mechanism of class Rexx like an associative array: values = 0 values["potato"] = 1 values["tomato"] = 1 if values["potato"] = 1 | values["tomato"] = 1 then letscallthewholethingoff() This should be the fastest way to get the result you're after. -- Patrick TJ McPhee DataMirror Corporation [hidden email] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To unsubscribe from this mailing list ( ibm-netrexx ), please send a note to [hidden email] with the following message in the body of the note unsubscribe ibm-netrexx <e-mail address> |
In reply to this post by Oleg A. Kulikov
---- From: Patrick McPhee <[hidden email]> To: [hidden email]; 'Oleg A. Kulikov' <[hidden email]> Date: 16 сентября 1997 г. 19:05 Subject: RE: Why wordpos is better then binary search? >> Oleg A. Kulikov[SMTP:[hidden email]] wrote >> > >It never pays to assume to much. word() will be relatively fast for >positions close to the beginning of the string, but it will be slow for But nevertheless it can't be slower then wordpos for the words absent in the list! >words near the end of the string, since it has to do a linear search >through the string for word delimiters. The trouble with ABinSearch is >that it uses the Rexx associative array mechanism the way you would use >a normal array, and it does arithmetic with variables of type Rexx, I've tested this program with int variables in ABinSearch: it results in a WORSE timing due to conversions when accessing s[i]. >which is relatively slow. If you put > options binary I've placed "binary" into class definition and I guess it is the same as "options binary". >at the top of your .nrx file, change the line that assigns s[0] to n: > n = int s[0] >and make the s argument of ABinSearch to be of type rexx[], then It is impossible to use "rexx[]" because of name list is growing in the caller method. >ABinSearch will be faster than using word(). > >However, NetRexx does provide a better way of doing this. You can use >the associative array mechanism of class Rexx like an associative array: > values = 0 > values["potato"] = 1 > values["tomato"] = 1 > > if values["potato"] = 1 | values["tomato"] = 1 then The list I use in ABinSearch contains an indexes which I use further in the same manner. But it is said in NetRexx document that "loop i over" provide an unpredictable order of an index values and I need full control. >letscallthewholethingoff() > >This should be the fastest way to get the result you're after. > >-- >Patrick TJ McPhee >DataMirror Corporation >[hidden email] > Thanks for remarks , O'Kulikov. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To unsubscribe from this mailing list ( ibm-netrexx ), please send a note to [hidden email] with the following message in the body of the note unsubscribe ibm-netrexx <e-mail address> |
In reply to this post by Oleg A. Kulikov
|
Free forum by Nabble | Edit this page |