Diffstat (limited to 'SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md')
| -rw-r--r-- | SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md | 27 |
1 files changed, 27 insertions, 0 deletions
diff --git a/SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md b/SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md
new file mode 100644
index 0000000..f88fe80
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md
@@ -0,0 +1,27 @@
+---
+id: pattern discovery
+aliases:
+  - What is Pattern Discovery?
+tags: []
+---
+
+## What is Pattern Discovery?
+
+- ==What are patterns?==
+  - ==Patterns==: A set of items, subsequences, or substructures that occur
+    frequently together (or strongly correlated) in a data set
+  - Patterns represent ==intrinsic== and ==important properties== of datasets
+- ==Pattern discovery==: Uncovering patterns from massive data sets
+- Motivation examples:
+  - What products were often purchased together?
+  - What are the subsequent purchases after buying an iPad?
+  - What code segments likely contain copy-and-paste bugs?
+  - What word sequences likely form phrases in this corpus? ![[CleanShot
+2023-10-26 at 01.53.56@2x.png]] ![[CleanShot 2023-10-26 at 01.54.32@2x.png]]
+    ![[CleanShot 2023-10-26 at 01.54.44@2x.png]] ![[CleanShot 2023-10-26 at
+01.55.00@2x.png]]
+
+## Efficient Pattern Mining Methods
+
+- The [[Apriori]] Algorithm
+- [[FP-Growth]]: A Frequent Pattern-Growth Approach
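
The note defines a pattern as a set of items that occur frequently together, and asks "What products were often purchased together?". A minimal sketch of that idea follows. It is a brute-force support count, not the Apriori or FP-Growth method the note links to, and the function name `frequent_itemsets`, the `min_support` and `max_size` parameters, and the sample transactions are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count every itemset of up to max_size items and keep the frequent ones.

    Support here is an absolute count: the number of transactions that
    contain all items of the itemset. This enumerates candidates naively,
    so it is only suitable for tiny examples.
    """
    counts = Counter()
    for transaction in transactions:
        # Deduplicate and sort so each itemset has one canonical tuple form.
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical market-basket data (assumption, not from the note).
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]

result = frequent_itemsets(transactions, min_support=3)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

With a threshold of 3, the pair `("beer", "diapers")` is kept because it appears in three baskets, while `("beer", "bread")` appears in only two and is dropped. Methods such as Apriori avoid enumerating every candidate the way this sketch does.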
