Fog Creek Software
Discussion Board




Help - translating a few lines of C into RealBASIC

Ive written programs for most of the statistical tests but incorporated programs to, for example, calculate significance of t and F tests etc. This is the only algorithm I can find for significance testing of this test, if someone could help translate it into RealBASIC for me. Thanks.

An Algorithm to calculate the exact level of significance of the Wilcoxon Matched-Pairs Signed-Ranks Test.

The core of the algorithm is like this (W is the Sum of Ranks found in the test, N is the sample size):

  my $NumberOfPossibilities = 2**$N;
  my $CountLarger = 0;
  for($i=0; $i < $NumberOfPossibilities; ++$i)
  {
    my $RankSum = 0;
    for($j=0; $j < $N; ++$j)
    {
      $RankSum += $j + 1 if(($i >> $j) & 1);
    };
    ++$CountLarger if($RankSum >= $W);
  };
  my $p = 2*$CountLarger / $NumberOfPossibilities;

All 2**N possible distributions of of signs over ranks are generated as bit-patterns (i.e., just the integers from $i = 0 to $i = 2**N-1).
The rank sum of the "1"-bits of each of these patters is determined by shifting out the bits of these patterns one by one (for $j = 0 to $j = N, $i >> $j) and summing them, $RankSum += $j + 1 if(($i >> $j) & 1);.
All bit-patterns with a rank sum larger than or equal to W are counted and the total count is divided by 2**N to get the one-tailed level of significance. The program prints the two-tailed level of significance.



These programs can be used and distributed freely. (copyright 1996, Rob van Son)

terry coulton
Thursday, January 02, 2003

Was the rate that you're offering or are you soliciting quotes?

X. J. Scott
Thursday, January 02, 2003

And that's not C.  Perhaps it's in RealBASIC already?

Michael Chansky
Friday, January 03, 2003

It's pseudocode from his textbook.

What class did you say this was for?

Ed the Millwright
Friday, January 03, 2003

I don't get it, anyone should be able to convert some lines of Perl into something more usefull ... 8-)

rob
Friday, January 03, 2003

I your in the UK, I can drive over and tie your shoe laces for you as well.

Ged Byrne
Friday, January 03, 2003


Pseudocode? C? Blah.

rob won the guess game. Give the man a beer.

Leonardo Herrera
Friday, January 03, 2003

I feeling guilty now for being so harsh.  Heres some tips.  I don't know basic, but these should be true for any basic.

1)  Get rid of the Mys and $s.

2) 'for($j=0; $j < $N; ++$j)' becomes FOR j = 1 TO n

3)  Rather than shifting the binary numbers, you can divide by two and take the remainder.

Take a look at the full source.  It has more comments:

#! /usr/bin/perl
#
#
# Calculate the exact level of significance for a
# Wilcoxon Matched-Pair Signed-Ranks Test using the sample's
# Sum of Ranks W and the sample size (i.e., number of pairs) N.
# This whole routine can be run as a stand-alone program.
#
# Use:
# WX-MP-SR.pl W N
#
# Copyright 1996, Rob van Son
# This program may be freely used and distributed
# -------------------------------------------------------
#                Rob van Son
# Institute of Phonetic Sciences & IFOTT
# University of Amsterdam, Herengracht 338
# NL-1016CG Amsterdam, The Netherlands
# Tel.: (+31) 205252183    Fax.: (+31) 205252197
# Email: rob@fon.let.uva.nl
# WWW page: http://fonsg3.let.uva.nl
# -------------------------------------------------------
#
#
# This is the actual routine that calculates the exact (two-tailed)
# level of significance for the Wilcoxon Matched-Pairs Signed-Ranks
# test. The inputs are the Sum of Ranks of either the positive of
# negative samples (W) and the sample size (N).
# The Level of significance is calculated by checking for each
# possible outcome (2**N possibilities) whether the sum of ranks
# is larger than or equal to the observed Sum of Ranks (W).
#
# NOTE: The execution-time ~ N*2**N, i.e., more than exponential.
# Adding a single pair to the sample (i.e., increase N by 1) will
# more than double the time needed to complete the calculations
# (apart from an additive constant).
# The execution-time of this program can easily outrun your
# patience.
#
sub LevelOfSignificanceWXMPSR  # ($W, $N)
{
  my $W = shift;
  my $N = shift;
  my $i, $j;
  #
  # Determine Wmax, i.e., work with the largest Rank Sum
  my $MaximalW = $N*($N+1)/2;
  $W = $MaximalW - $W if($W < $MaximalW/2);
  #
  # The total number of possible outcomes is 2**N
  my $NumberOfPossibilities = 2**$N;
  #
  # Initialize and loop. The loop-interior will be run 2**N times.
  my $CountLarger = 0;
  # Generate all distributions of sign over ranks as bit-patterns ($i).
  for($i=0; $i < $NumberOfPossibilities; ++$i)
  {
    my $RankSum = 0;
    # Shift "sign" bits out of $i to determine the Sum of Ranks ($j).
    for($j=0; $j < $N; ++$j)
    {
      $RankSum += $j + 1 if(($i >> $j) & 1);
    };
    # Count the number of "samples" that have a Sum of Ranks larger or
    # equal to the one found (i.e., >= W).
    ++$CountLarger if($RankSum >= $W);
  };
  #
  # The level of significance is the number of outcomes with a
  # sum of ranks equal to or larger than the one found (W)
  # divided by the total number of possible outcomes.
  # The level is doubled to get the two-tailed result.
  my $p = 2*$CountLarger / $NumberOfPossibilities;
  return $p;
};
#
#
$W = shift;
$N = shift;
print LevelOfSignificanceWXMPSR($W, $N), "\n";

Ged Byrne
Friday, January 03, 2003


O((2n)^n)

I miss seeing that old friend.

X. J. Scott
Friday, January 03, 2003

>>> 2) 'for($j=0; $j < $N; ++$j)' becomes FOR j = 1 TO n <<<

Actually, it becomes FOR j = 0 to N - 1

Otherwise you end up with off by one errors elsewhere in the algorithm.

Also, watch out for starting array indexes. Most C style languages start arrays at index 0; many BASIC's start arrays at index 1. I don't know what REALBasic does though.

Chris Tavares
Saturday, January 04, 2003

Thanks for the replies. So it's shaping up as:
CODE AS GIVEN:
my $NumberOfPossibilities = 2**$N;
  my $CountLarger = 0;
  for($i=0; $i < $NumberOfPossibilities; ++$i)
  {
    my $RankSum = 0;
    for($j=0; $j < $N; ++$j)
    {
      $RankSum += $j + 1 if(($i >> $j) & 1);
    };
    ++$CountLarger if($RankSum >= $W);
  };
  my $p = 2*$CountLarger / $NumberOfPossibilities;

MY ATTEMPT INTO BASIC
NumberOfPossibilities = 2^N
CountLarger = 0

REM  for($i=0; $i < $NumberOfPossibilities; ++$i)
becomes
For I = 0 to NumberOfPossibilities
REM thanks Chris (Travers) but what actually is the third part, ++$i,  in Perl signifying?

RankSum = 0

    For J=0 to N-1

REM Then I Translate $RankSum += $j + 1 if(($i >> $j) & 1);

REM as
   
If I > J AND I >1
    Then RankSum = J + 1
REM SIMILARLY I TRANSLATE
If RankSum >= W
Then CountLarger =CountLarger +1
REM BUT ARE THEY CORRECT??
Next J
Next I
  p = 2*CountLarger / NumberOfPossibilities
Anyone add a little further light on this for me?
thanks, Terry

terry coulton
Saturday, January 04, 2003

++$i means "Add 1 to i, and return the result after the increment".

In this context, the for loop is already doing it (adding 1 to i each time through the loop) so it just dissapears.

The ++ operator also returns a value, which can be used like this (I'll switch to C, since I hate the $):

a = ++i; // i incremented by 1, a = i

or

a = i++; // i incremented by 1, a = i - 1

Yes, it's confusing. :-)

-Chris

Chris Tavares
Tuesday, January 07, 2003

Terry,

Sorry for misjudging you for another "Can't be bother to do my own homework" type.

Your nearly their.

------------------------------------------------------------
REM Then I Translate $RankSum += $j + 1 if(($i >> $j) & 1);

REM as
 
If I > J AND I >1
    Then RankSum = J + 1

-------------------------------------------------------------

Not quite.  Your mistaking >> for the greater that sign.  Thats just on >.  Thie operator is binary shift right.  From the perlop:

-------------------------------------------------------------
Binary ``>>'' returns the value of its left argument shifted right by the number of bits specified by the right argument. Arguments should be integers. (See also Integer Arithmetic.)
--------------------------------------------------------------

In other words, if you have a binary number such as 8 it looks like this:
    
8 4 2 1
1 0 0 0

Shifting it on to the right makes it 4.

8 4 2 1
0 1 0 0

Shifting it another one to the right makes it 2.

8 4 2 1
0 0 1 0

This is the same as dividing by 2.  8/2 = 4.  4/2 = 2.

Its the same in decimal.  Shifting 1 o the right is the same as dividing by 10 (1000/10 = 100)

The ampersand (&) means AND.  & 1 will return true if the 1 bit above is set.  So for any even number, it will return false and for any odd number true:
      8 4 2 1
7 = 0 1 1 1
            &1 = true

6 = 8 4 2 1
      0 1 1 0
            &1 = false

So $i >> $j is the euquivalent to int(i/(2^j)).  The int is necessary because you want to ignore any decimals.

Check to see if the binary AND operator is supported, other wise use the MOD function to test if its an odd or even number.

Ged Byrne
Tuesday, January 07, 2003

*  Recent Topics

*  Fog Creek Home