A collection of notebooks showcasing tools used in data science. The notebooks come from exercises in the IBM Data Science Professional Certificate on Coursera.
Predicting whether the Falcon 9 first stage will land successfully. SpaceX advertises Falcon 9 rocket launches on its website at a cost of 62 million dollars; other providers charge upward of 165 million dollars each. Much of the savings comes from SpaceX's ability to reuse the first stage. Therefore, if we can determine whether the first stage will land, we can estimate the cost of a launch.
This project serves as a demonstration of different data science tools for acquiring and manipulating data.
API GET request, HTML parsing, and data wrangling.
Libraries:
- Requests
- Pandas
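A minimal sketch of this step, assuming the data arrives as a JSON list of launch records. The endpoint URL, the helper names `fetch_launches` and `launches_to_frame`, and the field names in the sample records are assumptions for illustration and are not taken from the notebook.

```python
import pandas as pd
import requests

# Assumed endpoint for illustration; the notebook may use a different URL.
API_URL = "https://api.spacexdata.com/v4/launches/past"


def fetch_launches(url=API_URL, timeout=10):
    """Send a GET request and return the decoded JSON payload."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail early on HTTP errors
    return response.json()


def launches_to_frame(records):
    """Flatten a list of (possibly nested) JSON records into a DataFrame."""
    frame = pd.json_normalize(records)
    # Parse the timestamp column when it is present.
    if "date_utc" in frame.columns:
        frame["date_utc"] = pd.to_datetime(frame["date_utc"])
    return frame


# Hand-written sample records (hypothetical field names) so the sketch runs offline.
sample = [
    {"flight_number": 1, "date_utc": "2010-06-04T18:45:00.000Z", "rocket": {"name": "Falcon 9"}},
    {"flight_number": 2, "date_utc": "2010-12-08T15:43:00.000Z", "rocket": {"name": "Falcon 9"}},
]
launches = launches_to_frame(sample)
print(launches[["flight_number", "rocket.name"]])
```

With network access, `launches_to_frame(fetch_launches())` would apply the same flattening to a live response, provided the endpoint returns a list of records.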
API GET request, HTML parsing, and search.
Libraries:
- BeautifulSoup
- Requests
- Pandas
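As a rough illustration of HTML parsing and search, the sketch below extracts a table from an HTML fragment with BeautifulSoup and loads it into a DataFrame. The HTML snippet, the `launches` class name, and the column names are hypothetical stand-ins for a downloaded page.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical HTML fragment standing in for a page fetched with requests.
html = """
<table class="launches">
  <tr><th>Flight</th><th>Site</th><th>Outcome</th></tr>
  <tr><td>1</td><td>Site A</td><td>Success</td></tr>
  <tr><td>2</td><td>Site B</td><td>Failure</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", class_="launches")  # search by tag and CSS class

# The first row holds the column names; the remaining rows hold the data.
rows = table.find_all("tr")
columns = [th.get_text(strip=True) for th in rows[0].find_all("th")]
data = [[td.get_text(strip=True) for td in row.find_all("td")] for row in rows[1:]]

launch_table = pd.DataFrame(data, columns=columns)
launch_table["Flight"] = launch_table["Flight"].astype(int)
print(launch_table)
```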
Exploratory Data Analysis and label creation to train a classification algorithm.
Libraries:
- Numpy
- Pandas
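Label creation can be sketched as mapping a textual landing outcome to a binary class. The column name `Outcome`, the outcome strings, and the set of outcomes counted as failures are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical outcome strings; the real column name and values may differ.
df = pd.DataFrame(
    {"Outcome": ["True ASDS", "False Ocean", "True RTLS", "None None", "True Ocean"]}
)

# Outcomes treated as unsuccessful landings (an assumption for this sketch).
bad_outcomes = {"False ASDS", "False Ocean", "False RTLS", "None ASDS", "None None"}

# Class is 1 for a successful landing and 0 otherwise.
df["Class"] = np.where(df["Outcome"].isin(bad_outcomes), 0, 1)

print(df["Class"].tolist())  # [1, 0, 1, 0, 1]
print(df["Class"].mean())    # fraction of successful landings in the sample
```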
Exploratory Data Analysis using SQL.
Libraries:
- ipython-sql
- sqlalchemy
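The kind of query used in SQL-based exploration can be sketched outside a notebook as well. This example swaps the ipython-sql magic for Python's built-in `sqlite3` module with an in-memory database; the table name, column names, and values are hypothetical.

```python
import sqlite3

import pandas as pd

# Hypothetical launch records; names and values are placeholders.
launches = pd.DataFrame(
    {
        "launch_site": ["Site A", "Site A", "Site B", "Site B", "Site B"],
        "payload_mass_kg": [500, 1500, 2000, 3000, 4000],
        "landing_success": [0, 1, 1, 0, 1],
    }
)

# Load the table into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
launches.to_sql("launches", conn, index=False)

# Aggregate launches, mean payload mass, and successful landings per site.
query = """
SELECT launch_site,
       COUNT(*)             AS n_launches,
       AVG(payload_mass_kg) AS avg_payload_kg,
       SUM(landing_success) AS n_landed
FROM launches
GROUP BY launch_site
ORDER BY launch_site
"""
summary = pd.read_sql_query(query, conn)
conn.close()
print(summary)
```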
Exploratory Data Analysis and visualization.
Libraries:
- Numpy
- Pandas
- Matplotlib
- Seaborn
Using Folium to explore satellite data.
Libraries:
- Pandas
- Folium
Using Plotly Dash to create a dashboard.
Libraries:
- Pandas
- Dash
- Plotly.express
Use of SVM, Classification Trees and Logistic Regression to predict if a rocket launch will be successful.
Libraries:
- Pandas
- Numpy
- Matplotlib
- Seaborn
- Sklearn
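A possible way to compare the three classifiers is sketched below. It runs on synthetic data from `make_classification` rather than the launch dataset, and the use of `GridSearchCV`, the feature scaling, and the hyperparameter grids are assumptions rather than the notebook's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for the launch features and the landing label.
X, y = make_classification(n_samples=200, n_features=8, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Each entry maps a name to (estimator, hyperparameter grid). Grids are illustrative.
candidates = {
    "logistic_regression": (
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.01, 0.1, 1.0]},
    ),
    "svm": (
        make_pipeline(StandardScaler(), SVC()),
        {"svc__kernel": ["linear", "rbf"], "svc__C": [0.1, 1.0, 10.0]},
    ),
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [2, 4, 8], "criterion": ["gini", "entropy"]},
    ),
}

test_scores = {}
for name, (estimator, grid) in candidates.items():
    # 5-fold cross-validated search over the grid, then score on held-out data.
    search = GridSearchCV(estimator, grid, cv=5)
    search.fit(X_train, y_train)
    test_scores[name] = search.score(X_test, y_test)
    print(f"{name}: best params {search.best_params_}, test accuracy {test_scores[name]:.2f}")
```

Scaling is placed inside the pipeline so that each cross-validation fold fits the scaler on its own training portion only.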
Example use of basic machine learning classifiers to predict whether there will be rain the following day, using a rainfall dataset from the Australian Government's Bureau of Meteorology.
After cleaning the data, several classification algorithms are compared on how well they classify the data, using a range of evaluation metrics.
Contained in the notebook "rainfallPrediction".
Libraries:
- Sklearn
- Numpy
- Pandas
Algorithms used to build the models:
- Linear Regression
- KNN
- Decision Trees
- Logistic Regression
- SVM
The performance of each model is reported using the following metrics, where applicable:
- Accuracy Score
- Jaccard Index
- F1-Score
- LogLoss
- Mean Absolute Error
- Mean Squared Error
- R2-Score
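The metrics listed above can be computed with `sklearn.metrics`, as in this small sketch. The labels and probabilities are hand-made for illustration, not results from the notebook, and applying the regression-style metrics to the 0/1 predictions is an assumption about how they are used.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    jaccard_score,
    log_loss,
    mean_absolute_error,
    mean_squared_error,
    r2_score,
)

# Hand-made true labels and predicted probabilities for the positive class.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3])
y_pred = (y_prob >= 0.5).astype(int)  # threshold at 0.5 to get hard labels

scores = {
    "accuracy": accuracy_score(y_true, y_pred),   # 0.75
    "jaccard": jaccard_score(y_true, y_pred),     # 0.6
    "f1": f1_score(y_true, y_pred),               # 0.75
    "log_loss": log_loss(y_true, y_prob),         # needs probabilities, not labels
    "mae": mean_absolute_error(y_true, y_pred),   # 0.25
    "mse": mean_squared_error(y_true, y_pred),    # 0.25
    "r2": r2_score(y_true, y_pred),               # 0.0
}

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Log loss is the only metric here that uses the predicted probabilities; the others are computed from the thresholded labels.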