Class Project
In this project, you are expected to conduct a comprehensive literature search and survey, select and study a specific topic in one subject area of data mining and its applications in business intelligence and analytics (BIA), and write a research paper on the selected topic by yourself. The research paper can be either a detailed, comprehensive study of a specific topic or original research that you have carried out yourself.

Requirements and Instructions for the Research Paper:

1. The objective of the paper should clearly state its subject, scope, domain, and the goals to be achieved.

2. The paper should address important, advanced, and critical issues in a specific area of data mining and its applications in business intelligence and analytics. Your research paper should emphasize not only breadth of coverage but also depth of coverage in that area.

3. The research paper should present measurable conclusions and future research directions (this is your contribution).

4. It may be beneficial to review or browse through about 20 to 30 relevant technical articles before you decide on the topic of the research project.

5. The research paper can be:

a. A literature review of data mining techniques and their applications in business intelligence and analytics.

b. An in-depth study and examination of data mining techniques, with technical details.

c. Applied research that uses a data mining method to solve a real-world problem in the BIA domain.

6. The research paper should reflect the quality expected of academic research.

7. The paper should be at least 3,000 to 3,500 words long.

8. The paper should include an adequate abstract or introduction, and a reference list.

9. Please write the paper in your own words, and cite the references and source materials for any statements you take from other articles.

10. For a systematic study, you may want to read technical papers from relevant magazines, journals, conference proceedings, and theses in the area of the topic you choose.

11. For the format and style of your research paper, please refer to the CEC Dissertation Guide (http://cec.nova.edu/documents/diss_guide.pdf), the Publication Manual of the APA, or the format of ACM and IEEE journal publications.

Suggested and Possible Topics for the Written Report (Not Limited to These)

Supervised Learning Methods:

Classification Methods:

Regression Methods

Multiple Linear Regression

Logistic Regression

Ordered Logistic and Ordered Probit Regression Models

Multinomial Logistic Regression Model

Poisson and Negative Binomial Regression Models

Bayesian Classification

Naïve Bayes Method

k Nearest Neighbors

Decision Trees

ID3 (Iterative Dichotomiser 3)

C4.5 and C5.0

CART (Classification and Regression Trees)

Scalable Decision Tree Techniques

Neural Network-Based Methods

Back Propagation

Neural Network Supervised Learning

Bayesian Belief Network

Rule-Based Methods

Generating Rules from a Decision Tree

Generating Rules from a Neural Net

Generating Rules without Decision Tree or Neural Net

Support Vector Machine

Fuzzy Set and Rough Set Methods

Unsupervised Learning Methods:

Clustering Methods:

Partition Based Methods

Squared Error Clustering

K-Means Clustering (Centroid-Based Technique)

K-Medoids Method (Partition Around Medoids, Representative Object-Based Technique)

Bond Energy

Hierarchical Methods

AGNES (Agglomerative vs. Divisive Hierarchical Clustering)

BIRCH (Balanced Iterative Reducing and Clustering Using Hierarchies)

Chameleon (Hierarchical Clustering Using Dynamic Modeling)

CLARANS (Clustering Large Applications Based Upon Randomized Search)

CURE (Clustering Using REpresentatives)

Density Based Methods

DBSCAN (Density Based Spatial Clustering of Applications with Noise, Density Based Clustering Based on Connected Regions with High Density)

OPTICS (Ordering Points to Identify the Clustering Structure)

DENCLUE (DENsity Based CLUstEring, Clustering Based on Density Distribution Functions)

Grid-Based Methods

STING (Statistical Information Grid)

CLIQUE (Clustering In QUEst, An Apriori-like Subspace Clustering Method)

Probabilistic Model Based Clustering

Clustering Graph and Network Data (For Example, Social Networks)

Self-Organizing Map Technique

Assessment and Performance Measurement of Clustering Methods

Assessing Clustering Technology

Determining the Number of Clusters

Measuring Clustering Quality

Association Rule Mining

Evolution Based Methods:

Genetic Algorithms

Applications:

Data Mining Applications for Business Intelligence and Analytics

Text Mining

Spatial Mining

Temporal Mining

Web Mining

Others:

Overfitting and Underfitting Issues

Outliers

Performance Assessment and Measurement

Confusion Matrix

ROC (Receiver Operating Characteristic)

AUC (Area Under the Curve)

Data Mining Tools

XLMiner

RapidMiner

Weka

NodeXL
