How to use gsutil and Python to deal with files in Google Cloud Storage

Learn to interact with GCP Storage programmatically

Lynn Kwong
6 min read · Apr 24, 2021


Google Cloud Storage is a convenient place to store data in the cloud, especially if you already use the Google Cloud Platform. It can hold any kind of data from your work, such as plain text files, images, and videos. As a beginner, you may prefer to manage your files in the Google Cloud Console, which is very straightforward to use. As a developer, however, you will usually need the command-line tool or a client library so that you can work with Google Cloud Storage from scripts and application code.

Photo by Memed_Nurrohmad on Pixabay.

Before we introduce the command-line tool and the client library, we need to know two basic terms.

  • Bucket. A bucket is the container that holds your data in Google Cloud Storage. Its name must be globally unique across the whole Google Cloud Storage system. Buckets cannot be nested, but you can create folders inside a bucket to organize your data. A bucket is comparable to a folder directly under the root folder (/) of a Linux system, such as home, usr, or bin.
  • Object. An object is a piece of data stored in a bucket. As mentioned above, it can be any kind of data; "object" is simply the Google Cloud Storage name for a file. In the Python client library, an object is called a blob. The two terms mean basically the same thing but are used in different contexts.
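To make the two terms concrete, here is a minimal sketch of how a gs:// address splits into a bucket name and an object name. It uses only the Python standard library and makes no network calls. The helper names split_gcs_uri and folder_of, and the bucket name my-example-bucket, are hypothetical and chosen only for illustration. The sketch also assumes the common case where a "folder" is just the part of the object name before the last slash.

```python
from typing import NamedTuple


class GcsPath(NamedTuple):
    bucket: str       # globally unique bucket name
    object_name: str  # full object (blob) name, may contain "/"


def split_gcs_uri(uri: str) -> GcsPath:
    """Split a gs://bucket/object URI into bucket and object name."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a gs:// URI: {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the object name.
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in {uri!r}")
    return GcsPath(bucket, object_name)


def folder_of(object_name: str) -> str:
    """Return the "folder" part of an object name (up to the last "/")."""
    head, sep, _ = object_name.rpartition("/")
    return head + sep


# Hypothetical example address, used only for illustration.
path = split_gcs_uri("gs://my-example-bucket/reports/2021/summary.csv")
print(path.bucket)
print(path.object_name)
print(folder_of(path.object_name))
```

Note that the bucket is always the first component of the address, which mirrors the point above that buckets cannot be nested, while the object name may carry as many slashes as you like.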

Now that we know the basic terminology, let’s use the command-line tool gsutil and the Python client library google-cloud-storage to work with buckets and objects/blobs. The commands are shown side by side for easier comparison. Before we can use gsutil and the google-cloud-storage library, we need to install and configure them.

  1. Install gsutil.

gsutil is part of the Google Cloud SDK. The installation procedure for the Google Cloud SDK varies by operating system, but it should be fairly easy. After installation, you need to authorize the Cloud SDK, which normally means running the following commands:

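# Log in with your Google user account
gcloud auth login
# Set the default project for subsequent commands
gcloud config set project YOUR-PROJECT-ID
# Create application default credentials for client libraries
gcloud auth application-default login
# List the credentialed accounts to confirm the login
gcloud auth list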
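Before running the commands above, it can help to confirm that the Cloud SDK tools are actually on your PATH. The following is a small sketch that uses only the Python standard library; the function name find_cloud_sdk_tools is hypothetical, and the check only tells you whether the executables can be found, not whether you are authorized.

```python
import shutil


def find_cloud_sdk_tools(tools=("gcloud", "gsutil")):
    """Map each tool name to its location on PATH, or None if it is not found."""
    return {tool: shutil.which(tool) for tool in tools}


for tool, location in find_cloud_sdk_tools().items():
    status = location if location else "not found on PATH"
    print(f"{tool}: {status}")
```

If either tool is reported as not found, revisit the Cloud SDK installation for your operating system before continuing with the authorization commands.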


Lynn Kwong

I’m a Software Developer (https://superdataminer.com) keen on sharing thoughts, tutorials, and solutions for the best practice of software development.