Chengwei LEI, Ph.D., Associate Professor

Department of Computer and Electrical Engineering and Computer Science
California State University, Bakersfield

 Data Science


 

What is Data Science?

Data science is the study of data to extract meaningful insights for business. It is a multidisciplinary approach that combines principles and practices from the fields of mathematics, statistics, artificial intelligence, and computer engineering to analyze large amounts of data. This analysis helps data scientists to ask and answer questions like what happened, why it happened, what will happen, and what can be done with the results. (Definition by Amazon)

 

 Under the big umbrella of the "Science of Intelligence", Data Science usually refers to Data Mining and Machine Learning.

 

What is the difference between Machine Learning and Data Mining?

Machine learning and data mining often employ the same methods and overlap significantly, but their goals differ. Machine learning focuses on prediction, based on known properties learned from the training data. Data mining focuses on the discovery of previously unknown properties in the data; it is the analysis step of knowledge discovery in databases. Data mining uses many machine learning methods, but with different goals. Conversely, machine learning employs data mining methods as "unsupervised learning" or as a preprocessing step to improve learner accuracy.
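The contrast can be made concrete with a small sketch. It is only an illustration under stated assumptions: the toy data (hours studied and pass/fail labels) is invented, and the function names `fit_centroids`, `predict`, and `two_means` are hypothetical helpers written for this example, not part of any library. The first two functions learn from labeled training data and predict a label for a new value (the machine learning goal). The third looks for structure in the same values without using any labels (the data mining goal).

```python
from statistics import mean

# Hypothetical toy data, invented for illustration: hours studied per student.
hours = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
labels = ["fail", "fail", "fail", "pass", "pass", "pass"]


def fit_centroids(xs, ys):
    """Machine learning view: learn one centroid per known label."""
    return {
        label: mean(x for x, y in zip(xs, ys) if y == label)
        for label in set(ys)
    }


def predict(centroids, x):
    """Predict the label whose learned centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def two_means(xs, iterations=10):
    """Data mining view: discover two groups without labels (1-D k-means, k=2).

    Assumes xs contains at least two distinct values.
    """
    low, high = min(xs), max(xs)
    for _ in range(iterations):
        # Assign each value to the nearer of the two current centers.
        group_low = [x for x in xs if abs(x - low) <= abs(x - high)]
        group_high = [x for x in xs if abs(x - low) > abs(x - high)]
        # Move each center to the mean of its group.
        low, high = mean(group_low), mean(group_high)
    return sorted(group_low), sorted(group_high)


centroids = fit_centroids(hours, labels)
print(predict(centroids, 7.0))   # prediction for an unseen value
print(two_means(hours))          # groups discovered without labels
```

Both parts rely on the same idea (distance to a group center), which reflects the overlap in methods described above; what differs is whether known labels drive the result.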

 




Introduction to Data Science


Data Mining


Machine Learning


Data Visualization




Useful Math Skills

Mathematics for Machine Learning (local download)

Statistical Thinking for the 21st Century (GitHub)

 


Data Visualization

Python Data Visualization Cookbook