Review 1
- What is NFL? What are its implications for selecting the "best" learner?
- List 5 baseline criteria for an AI tool. For each criterion, argue:
- Why is it important to meet that criterion?
- Why, pragmatically, might it be necessary to ignore (or at least relax) that criterion?
- Distinguish supervised (S) from unsupervised (U) learning
- When would you use S or U?
- Is FFT an S or U learner? Why? (Hint: your answer should say something about FFT).
- Is the Fastmap clusterer an S or U learner? Why? (Hint: your answer should say something about the Fastmap clusterer).
- Instance models refer to specific values (e.g. X=6,Y=10,Z=100) while other models refer to ranges of some attributes (e.g. X>6 and Y < 5).
- Which of these models refer to points, and which to volumes?
- Which of these offer more generalizations of the past?
- Which of these are shorter to share?
- How to generalize from point models to models that cover volumes?
- FFTs:
- What are the attributes in a node of an FFT tree?
- What is the structure of an FFT tree? (hint: use the node attributes to make that description)
- Fastmap Clustering
- Given 2 points Y,Z at distance c, if a new point X has a=dist(X,Y) and b=dist(X,Z), derive an expression for the distance x at which X falls along the line between Y and Z.
- What are the attributes in a node of a Fastmap cluster? (hint: parts of it are recursive)
- What is the structure of a Fastmap clustering tree? (hint: use the node attributes to make that description)
- How to use a Fastmap cluster node for:
- classification (hint: there are many ways)
- anomaly detection, sharing, privacy (hint: see III.D.4)
- incremental model updates over an infinite stream of data? (hint
- How to use any clustering method for optimization, anomaly detection, classification, sharing, privacy.
- How to use a Fastmap cluster tree for:
- very fast optimization (hint: see Algorithm1)
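For the Fastmap questions above, a minimal Python sketch may help when checking a derivation. It assumes Euclidean distance and the cosine rule, which gives x = (a^2 + c^2 - b^2) / (2c). The names `project` and `cluster`, the dict-based node layout, and the median split are illustrative assumptions, not a reference implementation.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two numeric points (an assumed metric)."""
    return math.dist(p, q)

def project(a, b, c):
    """Cosine rule: how far X falls along the line from Y to Z.
    a = dist(X, Y), b = dist(X, Z), c = dist(Y, Z), with c > 0."""
    return (a**2 + c**2 - b**2) / (2 * c)

def cluster(rows, min_size=2, rng=None):
    """Hypothetical recursive bisection; each node is a dict whose
    'left' and 'right' entries are themselves nodes (or None at a leaf)."""
    rng = rng or random.Random(1)
    node = {"rows": rows, "left": None, "right": None}
    if len(rows) < 2 * min_size:
        return node                                    # too small to split
    anywhere = rng.choice(rows)
    east = max(rows, key=lambda r: dist(r, anywhere))  # far from a random row
    west = max(rows, key=lambda r: dist(r, east))      # far from east
    c = dist(east, west)
    if c == 0:
        return node                                    # all rows identical
    # Order rows by their position along the east-west line, cut at the median.
    xs = sorted(rows, key=lambda r: project(dist(r, east), dist(r, west), c))
    mid = len(xs) // 2
    node.update(east=east, west=west, c=c,
                left=cluster(xs[:mid], min_size, rng),
                right=cluster(xs[mid:], min_size, rng))
    return node

# Usage: X=(3,4) projected onto the line from Y=(0,0) to Z=(10,0).
Y, Z, X = (0, 0), (10, 0), (3, 4)
x = project(dist(X, Y), dist(X, Z), dist(Y, Z))       # about 3.0
```

Only two distance calculations per row are needed at each level, which is one reason this style of clustering is cheap to run.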
- What are diversity measures for numeric and symbolic values? Offer a formula for each.
- What is the diversity of "y,n,y,y,y,n,y,y,n"? If you don't have a calculator, show all working and stop before the final calculation.
- What is the diversity of "10,89,32,11,9,90,30,31,91"? If you don't have a calculator, show all working and stop before the final calculation.
- Consider the following data.
- Just considering the `age` column, where to divide it such that `age` diversity is minimized? Assume a minimum bin size of 3.
- Now consider the `age,alive` relationship. Where to divide `age` in order to reduce the diversity of `alive`? Assume a minimum bin size of 3.
Data:
age,alive
10, y
89, n
32, y
11, y
9, y
90, n
30, y
31, y
91, n
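The diversity questions above can be checked with a short Python sketch. It assumes the intended measures are (sample) standard deviation for numeric values and entropy for symbolic values, and that a division is a single binary cut scored by the size-weighted diversity of the two sides. `best_split` and its parameters are hypothetical names.

```python
import math
from collections import Counter

def entropy(symbols):
    """Symbolic diversity: -sum(p * log2(p)) over the symbol frequencies."""
    n = len(symbols)
    return -sum(k / n * math.log2(k / n) for k in Counter(symbols).values())

def stdev(nums):
    """Numeric diversity: sample standard deviation, sqrt(sum((x-mu)^2)/(n-1))."""
    n = len(nums)
    mu = sum(nums) / n
    return math.sqrt(sum((x - mu) ** 2 for x in nums) / (n - 1))

def best_split(pairs, diversity, pick, min_bin=3):
    """Try every cut of the age-sorted pairs that leaves at least min_bin
    rows on each side; return (score, first age of the right-hand side)
    for the cut with the lowest size-weighted diversity of pick(pair)."""
    pairs = sorted(pairs)
    n = len(pairs)
    best = None
    for i in range(min_bin, n - min_bin + 1):
        left = [pick(p) for p in pairs[:i]]
        right = [pick(p) for p in pairs[i:]]
        score = (len(left) * diversity(left) + len(right) * diversity(right)) / n
        if best is None or score < best[0]:
            best = (score, pairs[i][0])
    return best

# The age,alive data from the question.
data = [(10, "y"), (89, "n"), (32, "y"), (11, "y"), (9, "y"),
        (90, "n"), (30, "y"), (31, "y"), (91, "n")]

age_cut = best_split(data, stdev, lambda p: p[0])      # diversity of age
alive_cut = best_split(data, entropy, lambda p: p[1])  # diversity of alive
```

Printing `age_cut` and `alive_cut` after working the answer by hand gives a quick comparison. Applying `best_split` again to each side would yield further cuts.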
- Iterative dichotomization (ID) algorithms divide attributes into ranges, then recurse on each range.
- How do ID algorithms decide which attribute to split the data on?
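One common choice (used by ID3, for example) is to split on the attribute whose ranges give the lowest expected class entropy. The sketch below illustrates that idea under stated assumptions: `age` is binned at an assumed cut of 89, the `pet` column is invented purely for contrast, and `expected_entropy` and `best_attribute` are hypothetical helper names.

```python
import math
from collections import Counter, defaultdict

def entropy(symbols):
    """-sum(p * log2(p)) over the symbol frequencies."""
    n = len(symbols)
    return -sum(k / n * math.log2(k / n) for k in Counter(symbols).values())

def expected_entropy(rows, attr, klass):
    """Size-weighted class entropy after grouping rows by attr's
    (already discretized) values."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row[klass])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

def best_attribute(rows, attrs, klass):
    """Pick the attribute whose split leaves the least class diversity."""
    return min(attrs, key=lambda a: expected_entropy(rows, a, klass))

# age,alive values from the data above; the pet values are hypothetical.
data = [(10, "y"), (89, "n"), (32, "y"), (11, "y"), (9, "y"),
        (90, "n"), (30, "y"), (31, "y"), (91, "n")]
pets = ["cat", "dog", "cat", "dog", "cat", "dog", "cat", "dog", "cat"]
rows = [{"age": "<89" if age < 89 else ">=89", "pet": pet, "alive": alive}
        for (age, alive), pet in zip(data, pets)]

choice = best_attribute(rows, ["age", "pet"], "alive")
```

After choosing an attribute, an ID-style learner would partition the rows by that attribute's ranges and repeat the same selection inside each partition.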