I am going to explain 2 methods for uploading large files to Colab. Google Colab has a problem, or maybe it is a restriction, that does not allow uploading files larger than 1MB through its graphical interface.
This is very useful for those who are going to work with Whisper, since practically any audio file is larger than 1MB.
When we upload a file, it starts to load, takes a long time, and in the end either the upload disappears or only 1MB of our file is uploaded, leaving it incomplete.
Here is a video that shows the problem.
To solve this, I am going to explain 2 methods:
- Importing the files from Google Drive
- Uploading them with the files library
I have also prepared a Colab with the code so you can see it and try it live.
Importing files to Colab from Google Drive
One option for working with large files in Colab is to upload them to our Google Drive and connect Colab to Drive, so that we can use any file we have there.
This is a very interesting option, especially when we have to use a notebook on a recurring basis. We must remember that every time we start the notebook again, all the information on the virtual hard drive is lost. Having the notebook connected to Drive therefore saves us from uploading the files again each time.
IMPORTANT: The Colab account and the Google Drive account must use the same email. When I tried to change it, using a Colab account with a different Drive account, it gave me problems, although in theory it should work fine.
For this we will use the following code
from google.colab import drive
drive.mount('/content/drive')
The tutorial images are in Spanish because I have taken them from the Spanish version of my website. If you need them in English, let me know and I will correct them.
Drive will ask us to grant permissions for the account.
Once we accept, we will see that it mounts the drive and we can see the files.
After that, the files will be in a folder called drive, inside content as we indicated in the mount path. Our own Drive files appear there under MyDrive.
You can refresh the file list in the left sidebar, under the folder icon.
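Once Drive is mounted, its files can be read like any other local path. Here is a minimal sketch for listing the audio files found there, assuming the /content/drive mount point used above and that your own files sit under a MyDrive subfolder. The helper name list_files_by_size and the list of extensions are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def list_files_by_size(root, extensions=(".mp3", ".wav", ".m4a")):
    """Return (relative path, size in MB) pairs for matching files under root, largest first."""
    root_path = Path(root)
    if not root_path.is_dir():
        # Drive is not mounted (or the folder does not exist): nothing to list.
        return []
    found = []
    for path in root_path.rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            size_mb = path.stat().st_size / (1024 * 1024)
            found.append((str(path.relative_to(root_path)), round(size_mb, 2)))
    return sorted(found, key=lambda item: item[1], reverse=True)


# Assumed location of your Drive files after drive.mount('/content/drive').
for name, size_mb in list_files_by_size("/content/drive/MyDrive"):
    print(f"{name}: {size_mb} MB")
```

If Drive is not mounted, the function simply returns an empty list, so the cell can be run safely. On a Drive with many folders the recursive search may take a while, so it can be worth pointing it at a specific subfolder instead.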
How to upload files to Colab with files
It is very simple: just add 2 cells with the following code. It could all be done in one cell, but I like to keep the line that lets us select our file in its own cell.
So at the beginning of our Colab we will use
from google.colab import files
to import that Python library.
Then, at the step where we want to upload our file, we will put
files.upload()
This will upload it to Colab's current working directory, which is /content by default.
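As far as I know, files.upload() also returns a dictionary that maps each uploaded file name to its content in bytes, so we can check the sizes to confirm that nothing was cut off. This is a minimal sketch under that assumption. The helper name report_uploads is hypothetical, and the guarded import is only there so the code does not fail outside Colab.

```python
def report_uploads(uploaded):
    """Map each uploaded file name to its size in MB, rounded to 2 decimals."""
    return {name: round(len(data) / (1024 * 1024), 2) for name, data in uploaded.items()}


try:
    from google.colab import files  # only available inside Colab
except ImportError:
    files = None

if files is not None:
    # Opens the file picker and waits until the upload finishes.
    uploaded = files.upload()
    for name, size_mb in report_uploads(uploaded).items():
        print(f"{name}: {size_mb} MB")
```

If the size printed here matches the size of the file on your computer, the upload was complete.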
If you know of any other way, leave a comment.