How to use gsutil and Python to deal with files in Google Cloud Storage
Learn to interact with GCP Storage programmatically
Google Cloud Storage is a convenient option for storing data in the cloud, especially if you already use the Google Cloud Platform. It can hold any kind of data, such as plain text files, images, and videos. As a beginner, you may prefer to manage your files in the Google Cloud Console, which is straightforward to use. As a developer, however, you will usually need the command-line tool or a client library to work with Google Cloud Storage programmatically.
Before we introduce the command-line tool and the client library, we need to know two basic terms.
- Bucket. A bucket is a container that holds your data in Google Cloud Storage. Its name must be globally unique across the whole Google Cloud Storage system. Buckets cannot be nested, but you can create folders inside a bucket to organize your data. A bucket acts like a folder directly under the root folder (/) of a Linux system, such as home, usr, or bin.
- Object. An object is a piece of data stored in a bucket. As mentioned above, it can be any kind of data; it is simply the Google Cloud Storage name for a file. In the client libraries, an object is called a blob. Blob and object mean essentially the same thing but are used in different contexts.
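To make the two terms concrete, here is a minimal sketch in plain Python (no Google library needed) that splits a gs:// path, the form gsutil uses, into its bucket and object name. The helper names parse_gcs_uri and folder_of and the bucket name my-demo-bucket are made up for illustration. The sketch also assumes the default flat layout, where a "folder" is just the part of the object name before the last slash.

```python
from typing import NamedTuple


class GcsPath(NamedTuple):
    bucket: str       # globally unique bucket name
    object_name: str  # full object (blob) name, including any folder-like prefix


def parse_gcs_uri(uri: str) -> GcsPath:
    """Split a gs://bucket/object URI into its bucket and object name."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a gs:// URI: {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the object name.
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in {uri!r}")
    return GcsPath(bucket, object_name)


def folder_of(object_name: str) -> str:
    """Return the folder-like prefix of an object name ('' at the top level)."""
    head, sep, _ = object_name.rpartition("/")
    return head + sep


# Hypothetical example path; my-demo-bucket is not a real bucket.
path = parse_gcs_uri("gs://my-demo-bucket/reports/2023/summary.txt")
print(path.bucket)                   # bucket part of the URI
print(path.object_name)              # object (blob) name inside the bucket
print(folder_of(path.object_name))   # folder-like prefix of the object
```

Note that the bucket appears only once, at the front of the path, which matches the rule that buckets cannot be nested while folders inside a bucket can be.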
Now that we know the basic terminology, let’s begin to use the command-line tool gsutil and the client library google-cloud-storage in Python to deal with buckets and objects/blobs. It’s better to show the commands in parallel for easier comparison. Before we can use gsutil and the google-cloud-storage library, we need to install and configure them.
- Install gsutil. gsutil is part of the Google Cloud SDK. The installation procedure varies by operating system but should be fairly easy. After installation, you need to authorize the Cloud SDK, normally by running the following commands:
gcloud auth login
gcloud config set project YOUR-PROJECT-ID
gcloud auth application-default login
gcloud auth list
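Once the commands above have run, one way to check the setup from Python is to list the buckets in your project. The following is only a sketch: it assumes the google-cloud-storage package is installed (for example with pip install google-cloud-storage) and that the credentials created by gcloud auth application-default login are picked up automatically. The helper names list_bucket_names and make_default_client are made up for illustration.

```python
def list_bucket_names(client):
    """Return the names of all buckets visible to the given client."""
    return [bucket.name for bucket in client.list_buckets()]


def make_default_client():
    """Create a Cloud Storage client from Application Default Credentials.

    Assumes google-cloud-storage is installed and the gcloud
    authorization commands have already been run.
    """
    # Imported here so that list_bucket_names can be used and tested
    # without the library being installed.
    from google.cloud import storage

    return storage.Client()
```

Calling print(list_bucket_names(make_default_client())) should print a list of bucket names, which is empty if the project has no buckets yet. If the client cannot work out which project to use, passing it explicitly with storage.Client(project="YOUR-PROJECT-ID") is an alternative.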