Please read the following:

According to Donoho (Donoho, D., 2015, “50 Years of Data Science,” based on a presentation at the Tukey Centennial Workshop), the data science workflow consists of the following steps:
* Data exploration and preparation
* Data representation and transformation
* Computing
* Modeling
* Visualization and presentation
* Insights
Which of the above steps do you encounter as part of your daily work? Based on what you know so far about Python, where would it be ideally suited, and for which steps would you rather use other tools such as Excel or Tableau?

Here are some points you need to know:

– I only know some basic information about Python.
– Primary post should be a minimum of 275 words with at least one reference (you can use 2-3 sources).
– All references used for support/evidence/information in this course must be SCHOLARLY resources from Google Scholar.
– Make sure to use the following website to look for sources for the paper https://scholar.google.com
– THE INSTRUCTOR WILL RUN A PLAGIARISM TEST. Please use your own words. This is a very important paper, please do your best!

Using Python for Data Science
Your Name
Your Institution

Using Python for Data Science
Data science is the art of making sense of messy data. It requires studying data sets, analyzing them, and presenting the findings in ways that other stakeholders can understand without deep knowledge of statistics or programming. In my daily work, the steps I take part in most are visualization, modeling, and insights. Visualization comes first because it opens the door to the other two: once a dataset is understood, plotting it gives statisticians the ability to predict trends and analyze the information.
Data modeling encompasses both predictive and generative designs that support the analysis of datasets (Donoho, 2015). In a business setting, this helps with market analysis of sales patterns. One can use Python to perform traditional conjoint analysis, which shows how product attributes may affect consumers' decisions when selecting products (Miller, 2015). This is achievable by importing NumPy, which provides multidimensional arrays and matrix operations. However, R offers more power if one intends to perform advanced data mining. Better yet, R is free and can handle intensive statistical computing.
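As a rough illustration of the NumPy approach described above, the sketch below estimates how much each product attribute contributes to a consumer's rating by fitting a least-squares model to dummy-coded product profiles. The attributes (brand and price level) and the ratings are invented for illustration and are not taken from Miller (2015); a real conjoint study would use many more profiles and respondents.

```python
import numpy as np

# Hypothetical product profiles, dummy-coded.
# Columns: intercept, brand_B (1 = brand B, 0 = brand A),
#          high_price (1 = high price, 0 = low price)
profiles = np.array([
    [1, 0, 0],  # brand A, low price
    [1, 0, 1],  # brand A, high price
    [1, 1, 0],  # brand B, low price
    [1, 1, 1],  # brand B, high price
], dtype=float)

# Made-up ratings one respondent gave to each profile (1-10 scale).
ratings = np.array([7.0, 4.0, 9.0, 6.0])

# Ordinary least squares: each coefficient is the estimated "part-worth"
# of the corresponding attribute level relative to the baseline.
coef, *_ = np.linalg.lstsq(profiles, ratings, rcond=None)
baseline, brand_b_effect, high_price_effect = coef

print("Baseline rating (brand A, low price):", round(baseline, 2))
print("Effect of brand B:", round(brand_b_effect, 2))
print("Effect of high price:", round(high_price_effect, 2))
```

In this toy data, a positive coefficient means the attribute level raises the rating relative to the baseline profile, and a negative one means it lowers it.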
For insights, pandas is a reliable choice because it enables users to read datasets, calculate statistics, and run basic dataset operations (Kumar, 2017). In addition, Spyder is a development environment that one can use with Python for better analysis of insights. What makes it helpful is its “Variable Explorer,” which lets one browse the data interactively to explore patterns, a graphical feature that pandas does not provide on its own.
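To make the pandas workflow mentioned above concrete, here is a minimal sketch that reads a small dataset, derives a column, and computes summary statistics. The sales figures and column names are made up for illustration, and the data is read from an in-memory string so the example runs without an external file; with a real file one would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Made-up sales data standing in for a CSV file on disk.
csv_text = """region,units,price
North,10,2.5
South,4,3.0
North,6,2.0
South,8,3.5
"""

# Read the dataset into a DataFrame.
sales = pd.read_csv(io.StringIO(csv_text))

# Basic dataset operation: derive revenue per row.
sales["revenue"] = sales["units"] * sales["price"]

# Calculate statistics: overall summary and total revenue per region.
summary = sales["revenue"].describe()
by_region = sales.groupby("region")["revenue"].sum()

print(summary)
print(by_region)
```

In Spyder, the `sales` DataFrame created here would also appear in the Variable Explorer, where it can be opened and browsed as a table.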
In conclusion, Python appears to be well supported on most software platforms that offer more advanced techniques for data science. Because of this support, modeling and insights are easy to achieve with Python as a programming language.

References
Donoho, D. (2015). 50 years of data science. Presentation at the Tukey Centennial Workshop, Princeton, NJ.
Kumar, A. (2017). Learning predictive analytics with Python (p. 16). Packt Publishing.
Miller, T. (2015). Marketing data science. Old Tappan, NJ: Pearson Education.
