Get Level0 files size from curl command line

by madjidhadjal » Mon Jul 08, 2024 1:43 pm America/New_York

Hello,

I would like to check whether a file I downloaded (say https://oceandata.sci.gsfc.nasa.gov/getfile/A2002311192500.L0_LAC.bz2 ) was fully downloaded, so that I can avoid downloading it again during batch processing. For that, I think I need to know the size of the file on the server, which is usually reported in the "Content-Length" header. This header is not present in the response when I run "curl -sI link".
If I follow the location in the header, I get:

###################
curl -sI https://urs.earthdata.nasa.gov//oauth/authorize?client_id=pDPu0awH156XLrK6VV0Y0w&response_type=code&redirect_uri=https://oceandata.sci.gsfc.nasa.gov/getfile/urs/
[1] 81559
[2] 81560
[2]+ Done response_type=code
[data]$ HTTP/1.1 302 Found
Server: nginx/1.22.1
Date: Mon, 08 Jul 2024 17:38:48 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Permitted-Cross-Domain-Policies: none
Referrer-Policy: strict-origin-when-cross-origin
Cache-Control: no-store
Pragma: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Location: https://urs.earthdata.nasa.gov/home
Set-Cookie: _urs-gui_session=xxxxx; path=/; expires=Tue, 09 Jul 2024 17:38:48 GMT; HttpOnly
X-Request-Id: xxxxx
X-Runtime: 0.011409
Strict-Transport-Security: max-age=31536000
################

Still no Content-Length (maybe due to the nosniff?). Any solution?
Thanks
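
One detail in the output above: the "[1] 81559" and "[2]+ Done response_type=code" job lines show that the shell split the unquoted URL at each "&" and ran the pieces as background jobs, so curl only received the part of the URL before the first "&". A minimal sketch of the size check with the URL quoted and redirects followed could look like the following. The helper names (last_content_length, remote_size, has_size) are hypothetical, the ~/.netrc and ~/.urs_cookies files are assumed to hold working Earthdata Login credentials, and it is an assumption, not something confirmed in this thread, that the server reports Content-Length on the final response.

```bash
#!/bin/bash
# Sketch only: helper names are hypothetical, and it is an assumption that
# the server sends Content-Length on the final (post-redirect) response.

# Read HTTP response headers on stdin and print the last Content-Length value.
# With -L, curl prints one header block per redirect hop, so the last wins.
last_content_length() {
    tr -d '\r' | awk -F': *' '
        tolower($1) == "content-length" { len = $2 }
        END { if (len != "") print len }'
}

# Print the remote size of URL "$1". The caller must quote the URL.
# -n reads credentials from ~/.netrc, -b/-c reuse and store login cookies,
# -L follows redirects, -I requests headers only.
remote_size() {
    curl -sIL -n -b ~/.urs_cookies -c ~/.urs_cookies "$1" | last_content_length
}

# Succeed if local file "$1" exists and is exactly "$2" bytes long.
has_size() {
    [ -f "$1" ] && [ -n "$2" ] && [ "$(wc -c < "$1" | tr -d ' ')" = "$2" ]
}

# Example use (not run here):
#   url="https://oceandata.sci.gsfc.nasa.gov/getfile/A2002311192500.L0_LAC.bz2"
#   size=$(remote_size "$url")
#   has_size "A2002311192500.L0_LAC.bz2" "$size" || echo "download needed"
```

If remote_size prints nothing, the header was not sent and has_size will report the file as not verified, so a different check is needed in that case.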

OB ODPS - towens
Subject Matter Expert

by OB ODPS - towens » Tue Jul 09, 2024 11:48 am America/New_York

Since the files are bz2 compressed, you can run the "bunzip2 -t <file>" test command. If the download is incomplete, the file will not have a valid bz2 structure and the test will return a non-zero exit status.
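
As a rough sketch of how that test could be scripted over a directory of downloads (the function name list_broken_bz2 and the directory argument are hypothetical, not part of any OB.DAAC tooling):

```bash
#!/bin/bash
# Sketch only: list_broken_bz2 is a hypothetical helper name.
# Print the name of every *.bz2 file in directory "$1" that fails the
# bzip2 integrity test, for example because the download was truncated.
list_broken_bz2() {
    for f in "$1"/*.bz2; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        # -t tests integrity without writing output, -q hides diagnostics.
        if ! bunzip2 -tq "$f" 2> /dev/null; then
            echo "$f"
        fi
    done
}

# Example use (not run here):
#   list_broken_bz2 /path/to/downloads > redo_list.txt
```

The test still has to read and decompress each file in full, so its run time grows with file size.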

Tommy

by madjidhadjal » Wed Jul 10, 2024 4:44 pm America/New_York

That would be a lovely solution if it did not take so much time to process. We are talking about 10-20 seconds to check a single file (700 MB each), which is not realistic when batch processing thousands of files or more.

My first attempt was to do something similar with the built-in Linux sha1sum command, but many files (take a random MODIS Aqua L0 file) do not have a valid hash key, so I cannot use this approach. I am open to suggestions.
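
For the files that do have a usable checksum, the comparison could look roughly like this. The helper name sha1_matches is hypothetical, and where the expected hash comes from is left open, since this thread does not say.

```bash
#!/bin/bash
# Sketch only: sha1_matches is a hypothetical helper, and the expected hash
# must come from a source you trust (not specified in this thread).
# Succeed if the SHA-1 of file "$1" equals the hex digest "$2".
sha1_matches() {
    # sha1sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha1sum "$1" 2> /dev/null | awk '{ print $1 }')
    # Accept upper-case digests by normalising to lower case.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    # An empty digest on either side (missing file, missing hash) is a failure.
    [ -n "$actual" ] && [ -n "$expected" ] && [ "$actual" = "$expected" ]
}

# Example use (not run here):
#   sha1_matches A2002311192500.L0_LAC.bz2 "$expected_hash" || echo "mismatch"
```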

by OB ODPS - towens » Sat Jul 13, 2024 10:57 pm America/New_York

Use wget and check the return code on the transfer?

Code:

#!/bin/bash
# Download one file and report the result based on wget's exit status.
url=https://oceandata.sci.gsfc.nasa.gov/getfile
file=MOD00.P2002165.0000_1.PDS.bz2
# wget exits with a non-zero status if the transfer fails.
if wget "${url}/${file}" &> /dev/null; then
    echo "Success"
else
    echo "Error downloading $file"
fi

Either way, there is no chance of you processing a partial file, because bunzip2 will fail on it.
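
Building on the return-code check, one possible way to avoid re-downloading files in a batch run is to download to a temporary name and rename only when the transfer succeeds, so that a file under its final name is always a completed one. This is a sketch under assumptions: the names download_to and fetch_once and the ".part" suffix are hypothetical, and authentication options for wget are left out.

```bash
#!/bin/bash
# Sketch only: download_to, fetch_once and the ".part" suffix are
# hypothetical choices; add whatever login options your wget setup needs.

# Thin wrapper around the transfer: fetch URL "$1" into path "$2".
# wget exits with a non-zero status if the transfer fails.
download_to() {
    wget -q -O "$2" "$1"
}

# Download URL "$1" into directory "$2" unless the final file already exists.
fetch_once() {
    name="${1##*/}"                  # file name = last component of the URL
    final="$2/$name"
    part="$2/$name.part"

    # A file under its final name was renamed after a successful transfer.
    if [ -f "$final" ]; then
        return 0
    fi

    if download_to "$1" "$part"; then
        mv -- "$part" "$final"
    else
        echo "Error downloading $name" >&2
        return 1
    fi
}

# Example use (not run here):
#   url=https://oceandata.sci.gsfc.nasa.gov/getfile
#   while read -r file; do fetch_once "${url}/${file}" ./L0; done < file_list.txt
```

With this layout an interrupted transfer leaves only a ".part" file behind, and the next run fetches that file again from the start.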

Tommy
