From 4d53fa14ee0cd615444aca6f6ba176e0ccc1b5be Mon Sep 17 00:00:00 2001
From: TheSiahxyz <164138827+TheSiahxyz@users.noreply.github.com>
Date: Mon, 29 Apr 2024 22:06:12 -0400
Subject: init

---
 .../Content/pattern discovery.md | 27 ++++++++++++++++++++++
 1 file changed, 27 insertions(+)
 create mode 100644 SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md

(limited to 'SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md')

diff --git a/SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md b/SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md
new file mode 100644
index 0000000..f88fe80
--- /dev/null
+++ b/SI/Resource/Fundamentals of Data Mining/Content/pattern discovery.md
@@ -0,0 +1,27 @@
+---
+id: pattern discovery
+aliases:
+  - What is Pattern Discovery?
+tags: []
+---
+
+## What is Pattern Discovery?
+
+- ==What are patterns?==
+  - ==Patterns==: A set of items, subsequences, or substructures that occur
+    frequently together (or strongly correlated) in a data set
+  - Patterns represent ==intrinsic== and ==important properties== of datasets
+- ==Pattern discovery==: Uncovering patterns from massive data sets
+- Motivation examples:
+  - What products were often purchased together?
+  - What are the subsequent purchases after buying an iPad?
+  - What code segments likely contain copy-and-paste bugs?
+  - What word sequences likely form phrases in this corpus? ![[CleanShot
+2023-10-26 at 01.53.56@2x.png]] ![[CleanShot 2023-10-26 at 01.54.32@2x.png]]
+    ![[CleanShot 2023-10-26 at 01.54.44@2x.png]] ![[CleanShot 2023-10-26 at
+01.55.00@2x.png]]
+
+## Efficient Pattern Mining Methods
+
+- The [[Apriori]] Algorithm
+- [[FP-Growth]]: A Frequent Pattern-Growth Approach
-- 
cgit v1.2.3