Apriori
Lift ratio is symmetric
Properties
- If is large, any subset of is large.
- If is not large, any superset of is not large.
Steps
Join
Combine items with the same prefix
Prune
- Enumerate through all sized itemsets in
- Exclude the itemset if any of its sized itemsets is not in
Count
Delete itemsets if their frequency is less than the threshold frequency
Proof
- is a set of all association rule with and
- is a set of all large itemsets with
- is a set of rules with
Claim 1
For each rule in , it is in
- Since is in ,
- We know that we have to calculate in step (*)
- So {B, C} and {B} are in
- Which means and
- Since
Claim 2
For each rule in , it is in
- Since
- Thus, is in
- Since ,
- Thus, is in
- Since and are in
- Step 2 must consider and together and must generate (since )
- So, is in
FP Growth
Steps
Setup
- Frequency table
- Frequency table of frequent items
- Frequent table of frequent items (descending)
- Ordered frequent items table
Tree
- Draw the tree according to ordered frequent items table
- Create conditional FP-tree from the bottom
- Write down all conditional itemsets
- Draw frequency table
- Draw frequency table of frequent items (descending)
- Write down the ordered frequent items on the right of itemsets
- Draw the tree (ignoring the condition item)
- Generate large itemsets if the tree has a single path, otherwise repeat the whole process.
- Size-1 is the condition itself
- Enumerate all possible itemsets for tree nodes
- Append the condition to each possible itemsets