Apriori

Lift ratio is symmetric

Properties

  • If is large, any subset of is large.
  • If is not large, any superset of is not large.

Steps

Join

Combine items with the same prefix

Prune

  1. Enumerate through all sized itemsets in
  2. Exclude the itemset if any of its sized itemsets is not in

Count

Delete itemsets if their frequency is less than the threshold frequency

Proof

  • is a set of all association rule with and
  • is a set of all large itemsets with
  • is a set of rules with

Claim 1

For each rule in , it is in

  • Since is in ,
    • We know that we have to calculate in step (*)
  • So {B, C} and {B} are in
  • Which means and
  • Since

Claim 2

For each rule in , it is in

  • Since
    • Thus, is in
  • Since ,
    • Thus, is in
  • Since and are in
    • Step 2 must consider and together and must generate (since )
  • So, is in

FP Growth

Steps

Setup

  1. Frequency table
  2. Frequency table of frequent items
  3. Frequent table of frequent items (descending)
  4. Ordered frequent items table

Tree

  1. Draw the tree according to ordered frequent items table
  2. Create conditional FP-tree from the bottom
    1. Write down all conditional itemsets
    2. Draw frequency table
    3. Draw frequency table of frequent items (descending)
    4. Write down the ordered frequent items on the right of itemsets
    5. Draw the tree (ignoring the condition item)
    6. Generate large itemsets if the tree has a single path, otherwise repeat the whole process.
      1. Size-1 is the condition itself
      2. Enumerate all possible itemsets for tree nodes
      3. Append the condition to each possible itemsets