csc 591-024, (8290)
csc 791-024, (8291)
fall 2024, special topics in computer science
Tim Menzies, timm@ieee.org, com sci, nc state


home :: timetable :: syllabus :: groups :: moodle :: license

HW3 : Testing an Research Hypotheses

(IMPORTANT NOTE: the experimental runs for this one can take a while– especially if you find a find a mistake and have to start again. Do not make this a last minute rush job!!!).

Three students from this class in the spring claim (Jacob, Joshua, and Rohan) claim that:

Use the extension from hw1 to find which data sets have less than 6 independent values

Run the following twice (once for the low dimensional data sets and once for the other). See what conclusions are found.

Now that the following is quickly written pseudo code. May have mistakes. You fix them. Have fun!

Experiments, part1: comission for one data set

First you must write an experiment function

That experiment file needs to loop through some options and write to a list of SOME instances (one per option). Note that one treatment must be asIs that runs over the data and collects all the distances to heaven. This is the baseline result against which everything else will be compared.

d = DATA().adds(csv(the.train))
b4 = [d.chebyshev(row) for row in d.rows]
somes = [stats.SOME(b4,f"asIs,{len(d.rows)}")]

Then you need to loop through some options to collect some numbers into a list. This gets added to SOME with a name that identiges the treatment. In the following ,see some +=:

rnd = lambda z: z
scoring_policies = [
  ('exploit', lambda B, R,: B - R),
  ('explore', lambda B, R :  (exp(B) + exp(R))/ (1E-30 + abs(exp(B) - exp(R))))]

for what,how in scoring_policies:
  for the.Last in [0,20, 30, 40]:
    for the.branch in [False, True]:
      start = time()
      result = []
      runs = 0
      for _ in range(repeats):
         tmp=d.shuffle().activeLearning(score=how)
         runs += len(tmp)
         result += [rnd(d.chebyshev(tmp[0]))]

      pre=f"{what}/b={the.branch}" if the.Last >0 else "rrp"
      tag = f"{pre},{int(runs/repeats)}"
      print(tag, f": {(time() - start) /repeats:.2f} secs")
      somes +=   [stats.SOME(result,    tag)]
pre=f"{what}/b={the.branch}" if the.Last >0 else "rrp"
tag = f"{pre},{int(runs/repeats)}"
somes +=   [stats.SOME(result,    tag)]

When all the looping is done, you have to print the result:

stats.report(somes, 0.01)

(In the above, “0.01” controls the size of the smallest difference we can print in the output.)

Experimental Scripts Must be “Commissioned”

The scripts you write for these experiments are always quirky and complex. It is very easy to make mistakes and have to throw out days of compute. So test experimental scripts have to be commissioned.

Also: add in tests to check that the expected stuff is actually happening. e.g.

Experiments, part2: run it over many datasets

Makefile has a tool for generating a todo file for running multiple experiments

Lets say your experument can be called from the command line -e branch.

For example:

make Act=branch actb4 # this outputs

mkdir -p .../tmp/branch
rm .../tmp/branch/*
python3 .../ezr.py  -D -t .../Apache_AllMeasurements.csv -e branch | tee .../tmp/branch/Apache_AllMeasurements.csv &
python3 .../ezr.py  -D -t .../HSMGP_num.csv -e branch              | tee .../tmp/branch/HSMGP_num.csv &
python3 .../ezr.py  -D -t ../SQL_AllMeasurements.csv -e branch     | tee .../tmp/branch/SQL_AllMeasurements.csv &
...

You can catch the output of actb4 into a todo file:

make Act=branch actb4 > ~/tmp/branch.sh

See here for a full example branch.sh.

You can now run all this to generate lots of output files. See here for a sample.

All those outputs can be summarizes with the rq.sh script:

cd ~/tmp/branch ; bash ~/gits/timm/ezr/etc/rq.sh 

RANK                0            1            2             3            
exploi/b=True      92            4            4                         
explore/b=True     80           16            2             2          
exploi/b=False     71           24            4                       
explore/b=False    59           27           10             4        
rrp                10           16           14            37       
asIs                2            8           12            12      
#
#EVALS
RANK                      0            1            2             3 
exploi/b=True      29 (  8)     35 (  0)     20 (  0)       0 (  0)
explore/b=True     29 (  8)     29 (  0)     20 (  0)      30 (  0)   
exploi/b=False     28 (  8)     26 (  4)     20 (  0)       0 (  0)  
explore/b=False    28 (  4)     31 (  8)     30 (  0)      30 (  0) 
rrp                 4 (  0)      4 (  0)      4 (  0)       5 (  0)
asIs              3840 (  0)   6581 (  0)   12835 (  0)   16307 (  0)  
#
#DELTAS
RANK                      0            1            2             3   
exploi/b=True      73 ( 23)     48 (  0)     41 (  0)       0 (  0)  
explore/b=True     74 ( 21)     61 (  0)     24 (  0)      24 (  0)   
exploi/b=False     73 ( 26)     59 ( 19)     46 (  0)       0 (  0)  
explore/b=False    71 ( 22)     58 ( 15)     52 (  0)      54 (  0) 
rrp                61 (  0)     50 (  0)     22 (  0)      22 ( 11) 

RANKS: how often treatments are in rank 0,1,2,…

EVALS: is the budgets used to achieve those ranks.

DELTAS: are the 100*(asIs - now)/asIs change.

If rq.sh fails:

What to hand in

Submit a url link to moodle with a repo link that has a /hw3 subdirectory