Data Science Exploration and Visualization
According to Donoho (Donoho, D., 2015, “50 Years of Data Science,” based on a presentation at the Tukey Centennial Workshop), the data science workflow consists of the following steps:

Data exploration and preparation
Data representation and transformation
Computing
Modeling
Visualization and presentation
Insights

Which of the above steps do you encounter as part of your daily work? Based on what you know so far about Python, where would it be ideally suited, and for which steps would you instead use other tools such as Excel or Tableau?

A few critical points to get a full grade in discussions:
1- The discussion post should be well thought out and written in your own words.
2- DO NOT copy and paste from your own previous class or external sources.
3- The discussion should have at least one reference listed.
4- All references used for support/evidence/information in this course must be SCHOLARLY resources. I suggest that you use Google Scholar to search for resources.
5- You may NOT, under any circumstances, use: Wikipedia, eHow, Ask.com, or ANY OTHER such non-scholarly website as a source.

Data Science Exploration and Visualization
As part of their daily work, most data scientists encounter exploration and visualization. Donoho (2015) notes that exploration is one of the most important steps, one that can take up almost 80% of the effort, because this is where unusual patterns are detected and the sanity of messy data is checked. Visualization is closely related to exploration; the difference is that visualization looks more deeply into the patterns in the data sets. It gives the data scientist a way to understand the relationships in the data properly before creating models. Python is a popular programming language for data science because it offers a wide array of libraries that are easy to learn and integrate.
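The kind of sanity check described above can be sketched in a few lines of Python. This is only an illustration using the standard library: the `raw_orders` column, the `explore` helper, and the 1.5 × IQR outlier rule are assumptions chosen for the example, not something taken from Donoho (2015).

```python
import statistics

# Hypothetical messy column: order totals read as text, with one
# missing value and one suspiciously large entry.
raw_orders = ["120.5", "98.0", "", "101.2", "110.0", "9999.0", "95.5", "105.3"]


def explore(values):
    """Return basic sanity-check facts about a column of numeric strings."""
    # Count blank entries, then convert the rest to floats.
    missing = sum(1 for v in values if v.strip() == "")
    numbers = [float(v) for v in values if v.strip() != ""]

    # Flag values outside 1.5 * IQR of the quartiles (a common rule of thumb).
    q1, _, q3 = statistics.quantiles(numbers, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in numbers if x < low or x > high]

    return {
        "count": len(numbers),
        "missing": missing,
        "median": statistics.median(numbers),
        "outliers": outliers,
    }


summary = explore(raw_orders)
print(summary)
```

On this made-up column the check reports one missing value and flags 9999.0 as unusual, which is the sort of finding that would be investigated before any modeling.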
For exploratory data analysis (EDA) as well as visualization, Python offers Matplotlib and Seaborn, which deliver pictorial representations of the data in the form of graphs, bar charts, or histograms. Both are among the most popular libraries for understanding the behavior of datasets. Plots made in Python can reveal patterns, for example in customer behavior, that later inform predictive analysis. All of that is possible once EDA and visualization have been carried out properly. However, Plotly offers a better way to share the data online. It also offers a shortcut to presenting the data more attractively, since many chart types come predefined. Its advantages include less working time and output that is easy to understand, and one does not need to be a programming guru to use it. Even though it is fairly easy to perform visualization, making a good one is the challenge (Grus, 2019). Plotly makes it easier both to explore the data well and to produce a good visualization.
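As a minimal sketch of the kind of plot discussed above, the following draws a histogram with Matplotlib. The purchase amounts are simulated and the output file name is hypothetical; both are assumptions made only for illustration.

```python
import random

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical data: 500 simulated purchase amounts.
random.seed(0)
amounts = [random.gauss(100, 15) for _ in range(500)]

fig, ax = plt.subplots(figsize=(6, 4))

# ax.hist returns the per-bin counts, the bin edges, and the drawn bars.
counts, bin_edges, _ = ax.hist(amounts, bins=20, edgecolor="black")

ax.set_title("Distribution of purchase amounts (simulated)")
ax.set_xlabel("Purchase amount")
ax.set_ylabel("Number of purchases")

fig.tight_layout()
fig.savefig("purchase_histogram.png")  # hypothetical output file name
plt.close(fig)
```

The same histogram could also be drawn with Seaborn or Plotly; Matplotlib is used here only because it is the lowest-level of the three libraries mentioned.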

References
Donoho, D. (2015). 50 years of data science (p. 1). Based on a presentation at the Tukey Centennial Workshop.
Grus, J. (2019). Data science from scratch (p. 41). O’Reilly Media, Inc.
