Willem de Beijer and Daan Kolkman
This tutorial will take you through the steps of using Google Colab for data science. It is part of our Cloud Computing for Data Science series.
1. About Google Colab
Google Colaboratory (Colab for short) is a service that allows you to run Jupyter Notebooks in the cloud for free. While it is more limited than a virtual machine, it’s much easier to set up and get going. Additionally, you can use your existing Google account to log in to the service. A good introduction to Colab can be found at https://colab.research.google.com/notebooks/welcome.ipynb#

2. Getting started
To get started, go to “File” in the top menu and choose either “New Python 3 notebook” or “Upload notebook…” to start with one of your existing notebooks.
Getting data into Colab can be a bit of a hassle. Colab can be synchronized with Google Drive, but the connection is not always seamless. The easiest way to upload a dataset is to run the following in a notebook cell:
from google.colab import files
uploaded = files.upload()
This will prompt you to select and upload a file.
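Once the upload finishes, uploaded is a dictionary that maps each file name to the file's contents as bytes. The sketch below shows one way to turn such an entry into rows of a CSV file. The helper name parse_uploaded_csv and the sample file scores.csv are made up for illustration, and the stand-in dictionary is only there so the example also runs outside Colab.

```python
import csv
import io


def parse_uploaded_csv(uploaded, filename):
    """Decode one uploaded file and parse it as CSV into a list of dicts."""
    # files.upload() stores raw bytes, so decode them before parsing.
    text = uploaded[filename].decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


# Stand-in for the dictionary returned by files.upload(); in Colab you would
# pass the real `uploaded` object instead of this hypothetical sample.
sample_upload = {"scores.csv": b"name,score\nAda,9\nLinus,7\n"}

rows = parse_uploaded_csv(sample_upload, "scores.csv")
print(rows[0]["name"], rows[0]["score"])
```

If you prefer pandas, passing io.BytesIO(uploaded[filename]) to pd.read_csv should give you a DataFrame directly.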

For other ways to upload data to Google Colab, I would recommend the following blog post: https://towardsdatascience.com/3-ways-to-load-csv-files-into-colab-7c14fcbdcb92
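The Google Drive route mentioned above usually comes down to mounting your Drive inside the notebook. The following is a minimal sketch, not the only way to do it: mount_drive is a hypothetical helper name, /content/drive is the mount point commonly used in Colab, and the import guard is there so the cell does nothing harmful when run outside Colab.

```python
import os


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running in Colab; return the mount point or None."""
    try:
        # The google.colab package is only available inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        return None
    # In Colab this asks you to authorize access to your Drive.
    drive.mount(mount_point)
    return mount_point


root = mount_drive()
if root is None:
    print("Not running in Colab; skipping the Drive mount.")
else:
    # After mounting, Drive files appear as ordinary paths under the mount point.
    print(os.listdir(root))
```

After a successful mount you can read files with regular Python file paths, which avoids uploading the same dataset again in every session.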
3. What you get
Packages
Most packages you will need for data science are pre-installed on Google Colab. This is especially true for Google-made packages such as TensorFlow. Recently, Google introduced Swift for TensorFlow, which allows you to use the Swift programming language with TensorFlow directly in a Colab notebook. At the time of writing, the project is still in beta, but it may be worth a look if you are curious.
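If you want to check whether a particular package is already available before relying on it, something like the sketch below can help. It only uses the Python standard library; installed_versions is a hypothetical helper name, and the package list is just an example.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Example package names; swap in whatever your project needs.
report = installed_versions(["numpy", "pandas", "tensorflow", "not-a-real-package-xyz"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

For anything that turns out to be missing, running !pip install followed by the package name in a notebook cell should install it for the current session.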
Computing resources
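Just like Kaggle, Google Colab provides you with free computing resources, including GPUs. Colab also offers TPU support. A TPU is an accelerator that Google designed specifically for deep learning, and for those workloads it can be faster than a GPU. Keep in mind, though, that while TensorFlow supports TPUs out of the box, PyTorch needs the separate PyTorch/XLA package to use them.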
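To see whether your notebook actually has a GPU attached, you can ask TensorFlow which devices it sees. This is a small sketch under a few assumptions: available_gpus is a hypothetical helper name, the import is guarded so the cell still runs where TensorFlow is not installed, and it only looks for GPUs because TPU detection depends on the runtime setup.

```python
def available_gpus():
    """Return the names of GPUs TensorFlow can see, or [] if there are none."""
    try:
        import tensorflow as tf
    except ImportError:
        # Without TensorFlow there is nothing to query, so report no GPUs.
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = available_gpus()
print(f"GPUs visible to TensorFlow: {gpus or 'none'}")
```

If the list is empty in Colab, the notebook is probably running on a CPU-only runtime; the accelerator is normally chosen under "Runtime" and then "Change runtime type" in the top menu.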
4. When to use
Collaboration
Google Colab is especially useful for group projects, since Colab notebooks can easily be shared through Google Drive.
Personal
Just like Kaggle, Google Colab can also be used to extend the computing resources of your own device. Whether you use Google Colab or Kaggle ultimately comes down to personal preference.
For a good comparison between Google Colab and Kaggle, I would suggest:
https://towardsdatascience.com/kaggle-vs-colab-faceoff-which-free-gpu-provider-is-tops-d4f0cd625029