This series starts from the very beginning of learning data science and works toward solving the Titanic problem on Kaggle.
- scikit-learn: learn on Kaggle
- Learn skills on Kaggle (Python, pandas, machine learning, data visualisation, SQL, R, deep learning)
- My Journey to Data Science (for beginners with zero coding experience)
- Big Data University, IBM Cloud (expired April 6, 2019)
- The Data Science Handbook (free): This book is not for self-learning data science; it is about the world of data instead. It contains “in-depth interviews with 25 remarkable data scientists, where they share their insights, stories, and advice.”
- Data Scientist path on Dataquest (see my notes for this path).
- Data Scientist path on OpenClassrooms (in French).
- CS109 Data Science by Harvard.
- Statistical Data Analysis in Python by Christopher Fonnesbeck.
- Installing Anaconda (to use pandas and other necessary packages)
- Add Anaconda to PATH:
- If something strange happens, check this note on Python first.
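
As a quick sanity check after installing Anaconda, a small script along these lines can show whether `conda` is on the PATH and whether the main packages can be found. This is only a sketch: the helper names `check_packages` and `conda_on_path` and the package list (`pandas`, `numpy`, `sklearn`) are assumptions made for illustration, not part of the original notes.

```python
import importlib.util
import shutil
import sys


def check_packages(names):
    """Return {package_name: True/False}, True if the package can be located."""
    # find_spec looks a top-level package up without actually importing it.
    return {name: importlib.util.find_spec(name) is not None for name in names}


def conda_on_path():
    """Return the full path of the conda executable, or None if it is not on PATH."""
    return shutil.which("conda")


if __name__ == "__main__":
    # Shows which Python is running (it should live inside the Anaconda folder).
    print("Python interpreter:", sys.executable)
    print("conda on PATH:", conda_on_path())
    # Hypothetical package list; adjust it to whatever you plan to use.
    for name, found in check_packages(["pandas", "numpy", "sklearn"]).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If `conda on PATH` prints `None` or a package shows as missing, the PATH step above is the first thing to recheck.
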
- Loan prediction: an example of a finished project.
- The first step in creating a project is to decide on a topic. You want the topic to be something you’re interested in and motivated to explore. It’s very obvious when people are making projects just to make them, rather than out of a genuine interest in the topic.
- Think about what sectors or angles you’re really interested in, then find data sets relating to those sectors.
- Review several data sets, and find one that seems interesting enough to explore.
- Some resources
Data analyst, data scientist and data engineer
I found many paths on the internet for entering the world of data science. Most of them fall into 3 types: analyst, scientist, and engineer. I want to find out the difference between them and which one is the best fit for me. I am not an expert who can answer this question for you; I noted this answer only for myself.
- Explanation on Dataquest.
- Data analyst: the bridge and the driver; uses past data to show the present. Common tasks done by data analysts include data cleaning, performing analysis, and creating data visualizations.
- Data scientist: works behind the scenes; uses past data to show the future. They apply their expertise in statistics and in building machine learning models to make predictions and answer key business questions. Jobs: clean, analyze, and visualize data (like a data analyst), with more depth and expertise in these skills, plus the ability to train and optimize machine learning models.
- Data engineer: the workers who build and optimize the systems that allow data scientists and analysts to perform their work. The data engineer ensures that any data is properly received, transformed, stored, and made accessible to other users.
Data science, data mining, big data
I have heard a lot about these terms.
- DataCamp vs Dataquest on Reddit