Using any suitable open-source research dataset

June 15th, 2022 Essays

Task
Using any suitable open-source research dataset of your choosing (e.g., from Google Dataset Search or the UCI Machine Learning Repository), carry out the following tasks:
1. Build and train classification and/or regression models from the dataset in any suitable programming environment of your choosing (e.g., MATLAB) using three machine learning techniques of your choice.
2. Justify the rationale behind the choice of your dataset, machine learning techniques, and programming environment.
3. Compare and contrast the performance of the three machine learning techniques in terms of prediction or validation accuracy, training time, prediction speed, R-squared values, MSE values, and transparency (as may be applicable).
4. Analyse the error matrices, the ROCs (and AUCs) for all three methods (as may be applicable).
5. Comment on how the hyperparameters (if any) are tuned or optimized (if applicable) to enhance the built/trained models.
6. Submit a report showing the work carried out.
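As a rough illustration of tasks 1 and 3, the sketch below trains three very simple classifiers and records accuracy, training time, and prediction time for each. It is only a minimal sketch in plain Python: the toy data, the class names (`MajorityClass`, `NearestNeighbour`, `NearestCentroid`), and the `evaluate` helper are assumptions made for illustration and are not prescribed by the brief, which allows any environment (e.g., MATLAB) and any three techniques.

```python
import time
from collections import Counter

# Hypothetical toy data (two numeric features, binary target); not from a real dataset.
X_train = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (0.9, 0.8),
           (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
y_train = [0, 0, 0, 0, 1, 1, 1]
X_test = [(1.0, 1.0), (3.0, 3.0), (0.9, 1.1), (3.2, 3.1)]
y_test = [0, 1, 0, 1]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


class MajorityClass:
    """Baseline: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label for _ in X]


class NearestNeighbour:
    """1-nearest-neighbour: copies the label of the closest training point."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        return [self.y[min(range(len(self.X)),
                           key=lambda i: sq_dist(x, self.X[i]))] for x in X]


class NearestCentroid:
    """Predicts the class whose mean feature vector is closest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = tuple(sum(col) / len(rows) for col in zip(*rows))
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
                for x in X]


def evaluate(model, X_tr, y_tr, X_te, y_te):
    """Fit the model, then report test accuracy and wall-clock timings."""
    start = time.perf_counter()
    model.fit(X_tr, y_tr)
    train_s = time.perf_counter() - start
    start = time.perf_counter()
    preds = model.predict(X_te)
    predict_s = time.perf_counter() - start
    accuracy = sum(p == t for p, t in zip(preds, y_te)) / len(y_te)
    return {"accuracy": accuracy, "train_s": train_s, "predict_s": predict_s}


results = {type(m).__name__: evaluate(m, X_train, y_train, X_test, y_test)
           for m in (MajorityClass(), NearestNeighbour(), NearestCentroid())}
for name, r in results.items():
    print(f"{name}: accuracy={r['accuracy']:.2f}, "
          f"train={r['train_s']:.6f}s, predict={r['predict_s']:.6f}s")
```

Keeping every technique behind the same `fit`/`predict` interface means all three are measured by identical code, which makes the comparison in the report easier to defend. Timings on a dataset this small are dominated by noise, so a real comparison would repeat the measurement on the full dataset.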
A sample dataset from Google Dataset Search for Credit Card Scoring with Targets is also provided on Canvas.
Note that for any dataset that you choose, you must be able to demonstrate a clear interpretation of the dataset and how it fits the problem you are trying to solve and acknowledge the source via a detailed citation.
As stated above, you may use any dataset of your choice, preferably an open-source dataset that is freely available from an online repository. Should you decide to “stray” by using a dataset from your workplace or one based on your own experiments, please be certain that this does not infringe upon any data privacy or protection policies. In other words, the dataset must not be restricted in any way; that is, it can be viewed, edited, and modified by anybody.
For datasets without labels, classes, or categories, you can generate suitable ones using appropriate conventional methods. For example, in a credit card scoring dataset, a 24-year-old male who rents and has a large unpaid credit amount on his car, with little money in his checking and savings accounts, may be considered to have a “high risk” of defaulting on any additional credit.
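One way to turn the credit scoring example into labels is a simple rule-based scorer, sketched below. The field names (`age`, `housing`, `credit_amount`, `saving_accounts`, `checking_account`), the age and credit thresholds, and the three-point cut-off are all hypothetical choices made for illustration; they are not taken from the brief or from any particular dataset, and would need to be justified against the data actually used.

```python
def risk_label(row, credit_threshold=5000, min_points=3):
    """Assign 'high risk' or 'low risk' from simple, hand-written rules.

    Every threshold here is an illustrative assumption, not a value
    taken from the assignment brief or from a real dataset.
    """
    points = 0
    if row["age"] < 25:                         # young applicant
        points += 1
    if row["housing"] == "rent":                # does not own a home
        points += 1
    if row["credit_amount"] > credit_threshold:  # large outstanding credit
        points += 1
    if row["saving_accounts"] == "little":      # low savings balance
        points += 1
    if row["checking_account"] == "little":     # low checking balance
        points += 1
    return "high risk" if points >= min_points else "low risk"


# Hypothetical applicants mirroring the example in the text.
applicant_a = {"age": 24, "housing": "rent", "credit_amount": 8000,
               "saving_accounts": "little", "checking_account": "little"}
applicant_b = {"age": 45, "housing": "own", "credit_amount": 1500,
               "saving_accounts": "rich", "checking_account": "moderate"}

print(risk_label(applicant_a))
print(risk_label(applicant_b))
```

Whatever rule is chosen, the report should state it explicitly, since the generated labels become the target that all three models are trained to reproduce.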
Guidelines
Your submission should be 2,000 words in length (+/- 10%).
As stated in the task, you can use any suitable open-source research dataset of your choice, though here is a dataset (source: Kaggle (Leonardo Ferreira, 2018)) that you may wish to use.
Again, you may use any dataset of your choice, preferably an open-source dataset that is freely available from an online repository such as the ones mentioned above. Should you decide to “stray” by using a dataset from your workplace or one based on your own experiments, please be certain that this does not infringe upon any data privacy or protection policies. In other words, the dataset must not be restricted in any way; that is, it can be viewed, edited, and modified by anybody.
For datasets without labels or classes or categories, you can generate suitable labels or classes or categories using conventional methods. For example, in a credit card scoring dataset, a 24-year-old male who rents and has a large unpaid credit amount on his car, with little money in his checking and savings accounts may be considered to have a “high risk” of defaulting on any additional credit.
Please make sure that you correctly cite all secondary sources you use, and include a reference list. The reference list will not be included in your final word count.
Hint
Ensure that your submission fulfils the marking criteria detailed below.
Please note that for this assignment, you have the “laxity” of using any programming or development environment of your choosing. For students who have little or no background in programming, it is strongly recommended that you use the nearly “plug-and-play” approach available via built-in applications in the MATLAB environment, as carried out in the demonstration videos (Week 7).
Please refer to this published paper to gain an analytical and methodical understanding of how machine learning techniques or algorithms (classification methods ONLY) can be evaluated and compared for a given machine learning problem or task:
• A. O. Sangodoyin, M. O. Akinsolu, P. Pillai and V. Grout, “Detection and Classification of DDoS Flooding Attacks on Software-Defined Networks: A Case Study for the Application of Machine Learning,” in IEEE Access, vol. 9, pp. 122495-122508, 2021, doi: 10.1109/ACCESS.2021.3109490.
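For classification problems, the error (confusion) matrix and the AUC named in the task can be computed directly from predictions and scores. The sketch below is a minimal plain-Python illustration for the binary case; the helper names `confusion_counts` and `roc_auc`, the 0.5 decision threshold, and the four example scores are assumptions for illustration, and it is not the evaluation procedure used in the cited paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(t == positive and p == positive for t, p in pairs),
        "fp": sum(t != positive and p == positive for t, p in pairs),
        "tn": sum(t != positive and p != positive for t, p in pairs),
        "fn": sum(t == positive and p != positive for t, p in pairs),
    }


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive example is scored
    above a random negative one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical true labels and model scores (e.g., predicted probabilities).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

counts = confusion_counts(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(counts, auc)
```

The confusion counts depend on the chosen threshold, whereas the AUC summarises ranking quality across all thresholds, which is why the two are usually reported together.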
Please refer to this published paper to gain an analytical and methodical understanding of how linear regression can be applied for a given machine learning problem or task:
• M. O. Akinsolu, A. O. Sangodoyin, and U. E. Uyoata, “Behavioral Study of Software-Defined Network Parameters Using Exploratory Data Analysis and Regression-Based Sensitivity Analysis,” in Mathematics, vol. 10, no. 14, p. 2536, 2022, doi: 10.3390/math10142536.
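For regression problems, the R-squared and MSE values named in the task can be computed as in the sketch below, which fits a one-predictor least-squares line in plain Python. The helper names (`fit_line`, `mse`, `r_squared`) and the four data points are hypothetical and chosen only for illustration; this is not the analysis carried out in the cited paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical data: one predictor and one continuous target.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

slope, intercept = fit_line(xs, ys)
preds = [slope * x + intercept for x in xs]
print(f"slope={slope:.3f}, intercept={intercept:.3f}, "
      f"MSE={mse(ys, preds):.4f}, R^2={r_squared(ys, preds):.4f}")
```

Here the metrics are computed on the same points used for fitting; a report would normally compute them on held-out validation or test data as well.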
Note that you will need to include as much information as you can in your submission to sufficiently show that you have carried out your work independently. Consequently, the onus is on you to independently decide what inputs to use to fulfil the marking criteria (see below). For example, it is for you to decide whether to present your code as full scripts, “excerpts”, “narratives”, clear screenshots, or in any other form.
Grading:

| Description | Marks |
| --- | --- |
| Dataset (including citation of source and acknowledgment) | 2.5 |
| Definition and Justification of Problem According to the Dataset: Classification Problem and/or Regression Problem | 2.5 |
| Data Pre-processing and Feature Extraction: Identification of Predictors, Categories and Targets, Handling Noisy Data and Missing Data, Others | 10.0 |
| Rationale Informing the Selection or Choice of the Three Machine Learning Techniques or Methods to Build and Train Models to Address the Problem | 10.0 |
| Model Performance Assessment Using the 1st Machine Learning Technique or Method | 10.0 |
| Model Performance Assessment Using the 2nd Machine Learning Technique or Method | 10.0 |
| Model Performance Assessment Using the 3rd Machine Learning Technique or Method | 10.0 |
| Comparisons between all Machine Learning Techniques or Methods | 10.0 |
| Recommendations and Conclusions | 5.0 |
| References | 2.5 |
| Organisation of Report | 2.5 |
| TOTAL | 75 |
