
Google Drive direct download of big files

Files hosted on Google Drive can be shared publicly. However, if a file is too big, Google shows a warning page saying that “Google Drive can’t scan this file for viruses” and asks the user for confirmation before proceeding with the download.

If for any reason you’re trying to download the file unattended, or from a resource-limited environment, this prevents direct use of the link.

It seems that Google exposes a confirmation key on the warning page, which must match a value returned with the next request. So the process passes through four main URLs, starting from the (supposed) direct public link of the file:

  1. https:// docs.google.com/open?id=…
  2. https:// docs.google.com/uc?id=…&export=download&revid=…
  3. https:// docs.google.com/uc?export=download&confirm=SqLb&id=…&revid=…/…
  4. https:// doc-0s-ak-docs.googleusercontent.com/docs/securesc/…

The first URL redirects, after several steps, to a page from which the next URL of the process can be captured with:

/href="(\/uc\?export=download[^"]+)/
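As a rough illustration, the same pattern can be tried out in Python. This is only a sketch, not the script mentioned in this post: the function name `find_download_link`, the sample HTML, the `FILEID` placeholder and the link text are all made up for the example, and the real page may look different.

```python
import html
import re

# The pattern from the text, written as a Python raw string.
LINK_RE = re.compile(r'href="(/uc\?export=download[^"]+)')


def find_download_link(page_html):
    """Return the relative download link found in the page, or None."""
    match = LINK_RE.search(page_html)
    if match is None:
        return None
    # Inside an HTML attribute, "&" is normally escaped as "&amp;".
    return html.unescape(match.group(1))


# Hypothetical fragment of the warning page (FILEID is a placeholder).
sample = (
    '<a id="x" href="/uc?export=download&amp;confirm=SqLb&amp;id=FILEID">'
    "Download anyway</a>"
)
print(find_download_link(sample))
```

The unescaping step is an assumption about how the link is written in the page; if the page already contains plain `&` characters, it does no harm.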

This gives the second URL of the process, the one that ends up setting a random “confirm” value. That value seems to have a corresponding but different one in the link inside the response page, and it is the value inside the page that must be captured to form a new URL like the one in step 3. This time the cookie previously set by Google will match the value in the URL, and the request will eventually (after some redirects) end at a URL like the one in step 4, which is the one that finally downloads the file directly.
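Forming the step 3 URL from a captured link could look roughly like this in Python. It is a sketch under assumptions: the host is taken from the URL list above, the helper names `confirm_url` and `confirm_value` are invented for the example, and `FILEID` is a placeholder.

```python
from urllib.parse import parse_qs, urljoin, urlsplit

# Host taken from the URL list above.
BASE = "https://docs.google.com/"


def confirm_url(relative_link):
    """Turn the relative link captured from the page into an absolute URL."""
    return urljoin(BASE, relative_link)


def confirm_value(url):
    """Return the value of the "confirm" parameter, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("confirm")
    return values[0] if values else None


# Hypothetical captured link (FILEID is a placeholder).
url = confirm_url("/uc?export=download&confirm=SqLb&id=FILEID")
print(url)
print(confirm_value(url))
```

The request for this URL has to carry the cookie that Google set earlier, otherwise the confirm value will presumably not match.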

Too much hassle for a single clean command line…

So I’ve written a Perl script that does all the work. With the help of wget, of course, which can follow HTTP “302” redirects and load and store cookies. Neat tool!
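To give an idea of the wget side, here is a small Python sketch that builds a command line which keeps cookies between requests. It is not taken from the Perl script: the helper name `wget_command`, the file names and the `FILEID` placeholder are assumptions for the example. The options used (`--load-cookies`, `--save-cookies`, `--keep-session-cookies`, `-O`) are standard wget options, and wget follows redirects by default.

```python
import shlex


def wget_command(url, cookie_file, output_file):
    """Build a wget command line that keeps cookies between requests."""
    return [
        "wget",
        "--load-cookies", cookie_file,   # send cookies saved by earlier requests
        "--save-cookies", cookie_file,   # store cookies for the next request
        "--keep-session-cookies",        # also keep cookies without an expiry date
        "-O", output_file,               # write the response to this file
        url,
    ]


# Hypothetical values (FILEID is a placeholder).
cmd = wget_command(
    "https://docs.google.com/uc?export=download&id=FILEID",
    "cookies.txt",
    "page.html",
)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` to actually run wget; the sketch only prints it so nothing is downloaded.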

It is also available as a git repository.

Certificates can pose a problem on your system, because Google uses https all the time:

ERROR: Certificate verification error for drive.google.com: unable to get local issuer certificate
To connect to drive.google.com insecurely, use `--no-check-certificate'.
Unable to establish SSL connection.

If this is the case, install the “ca-certificates” package on your system, or replace “wget” in the one line at the end of the script with “wget --no-check-certificate”.
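If the wget call is built as a list of arguments (as one might do in Python), the same substitution can be sketched like this. The helper name `without_certificate_check` is made up for the example; only the `--no-check-certificate` option itself comes from wget.

```python
def without_certificate_check(command):
    """Return a copy of a wget command list with certificate checks disabled."""
    if not command or command[0] != "wget":
        raise ValueError("expected a command list starting with 'wget'")
    if "--no-check-certificate" in command:
        return list(command)  # already present, nothing to add
    # Put the option right after the program name, as in the text.
    return [command[0], "--no-check-certificate", *command[1:]]


print(without_certificate_check(["wget", "-O", "page.html", "https://drive.google.com/"]))
```

Keep in mind that this turns off certificate verification entirely, so installing “ca-certificates” is the safer fix when it is possible.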

The script runs on *nix, but changing the $TMP variable to ‘.’ and applying the “wget --no-check-certificate” change described above should make it runnable on Windows with Perl and wget in the PATH.

I’d have liked to post the code here, but WordPress templates are a complete mess… I’d prefer plain text. Really.


15 thoughts on “Google Drive direct download of big files”

  1. Thank you very much; It helped me and saved my time.
    I wonder what its license is.
    I’m working on a bash script that downloads an entire google drive public folder recursively, and it works fine. Soon I’ll upload it to github; but it will be better if your script is included.
    Thanks :)

