
Search and download data files

Posted: Fri Oct 15, 2021 1:59 pm America/New_York
by arlindo.arriaga
Dear Madam/Sir,

Several months ago I left some software running around the clock at the Portuguese Institute for the Ocean and Atmosphere in Lisbon, Portugal, to work with ocean color binned data files from the Aqua, Terra and SNPP satellites.


Recently, a colleague asked me to find out why the control script could no longer find and download data files as before. That is the motivation for this message to the Forum.


My script worked fine with curl for a long time, using two local files, .netrc and .urs_cookies, as explained in your Search and Download instructions in the Direct Data section of the OBPG website.


In a first curl command I call your file_search function to check whether the given filename exists. If it exists, I issue a second curl command that uses your getfile function to download it. If the given filename does not exist (please note that your services renamed many ocean color data files about two years ago), I take the alternative name and repeat the curl command that calls file_search. If that name is still not found, I try the same name once more for the near-real-time version.
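Roughly, that part of the script does the following (a sketch only, not the actual code: the helper function and the alternative-name variables are placeholders I am using here for illustration):

API="https://oceandata.sci.gsfc.nasa.gov/api/file_search"

check_name () {
    # Ask file_search whether a single filename exists; with addurl=1 and
    # results_as_file=1 the reply is a plain list of download URLs, and an
    # empty first line is taken here to mean "not found".
    curl -s -d "sensor=${SENSOR}&sdate=${YYYY1}-${MM1}-${DD1}&edate=${YYYY2}-${MM2}-${DD2}&dtype=L3b&addurl=1&results_as_file=1&search=$1" \
         "$API" | head -1
}

found=$(check_name "$old_name")
[ -z "$found" ] && found=$(check_name "$alt_name")      # renamed product
[ -z "$found" ] && found=$(check_name "$alt_name_nrt")  # near-real-time version

if [ -n "$found" ]; then
    # Download the file reported by the search, authenticating through the
    # Earthdata cookie jar and ~/.netrc (curl options -b/-c, -L, -n).
    curl -b "$HOME/.urs_cookies" -c "$HOME/.urs_cookies" -L -n -O "$found"
fi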

This worked quite well for many months, until about last summer or so (I believe around the time the OBPG website was integrated into the Earthdata website).


Please take a look at my curl commands below, where I call your file_search (and getfile) functions. In my script, after the curl command that searches for the given filename, the connection aborts 60 seconds after CONNECT and curl issues the error message


curl: (56) Received HTTP code 503 from proxy after CONNECT


That error message is issued after the curl command that searches for the file:

result1=$(curl -d "sensor=${SENSOR}&sdate=${YYYY1}-${MM1}-${DD1}&edate=${YYYY2}-${MM2}-${DD2}&dtype=L3b&addurl=1&results_as_file=1&search=${old_name}" https://oceandata.sci.gsfc.nasa.gov/api/file_search | head -1)
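In case it is relevant, the proxy settings on my side can be inspected and the call retried as follows (a diagnostic sketch only; I have moved the query string into a variable for readability):

# Show any proxy settings that curl will pick up from the environment.
env | grep -i proxy

# Re-run the search verbosely, with an explicit connect timeout.
QUERY="sensor=${SENSOR}&sdate=${YYYY1}-${MM1}-${DD1}&edate=${YYYY2}-${MM2}-${DD2}&dtype=L3b&addurl=1&results_as_file=1&search=${old_name}"
curl -v --connect-timeout 60 -d "$QUERY" https://oceandata.sci.gsfc.nasa.gov/api/file_search

# If the proxy should not be used for this host, it can be bypassed explicitly.
curl --noproxy oceandata.sci.gsfc.nasa.gov -d "$QUERY" https://oceandata.sci.gsfc.nasa.gov/api/file_search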


My questions are as follows:
(1) Is the NASA server name correct?
(2) May I still use your "file_search" function as before, or is some permission now required?
(3) If "file_search" is no longer available, could you please tell me what I should do to check whether a given filename exists before trying to download the file?
(4) May I still use your "getfile" function to download a file with a curl command like the following?

$HOME/local/bin/curl -b $HOME/.urs_cookies -c $HOME/.urs_cookies -L -n -O https://oceandata.sci.gsfc.nasa.gov/cgi ... /$old_name



Please note the locations of these functions: api for file_search and cgi for getfile. Are those still correct?


I thank you for the kindness of your attention and help.
Kind regards and a good weekend.
Arlindo Arriaga
PhD Meteorology UW-Madison 1991

Re: Search and download data files

Posted: Fri Oct 15, 2021 3:31 pm America/New_York
by gnwiii
There have been changes; see <https://oceancolor.gsfc.nasa.gov/data/download_methods/>. For me, the "obdaac_download" python script mentioned in that document has been more reliable than my existing scripts using curl or wget. For bulk downloads I prefer to use the search mechanism to get a list with checksums, so I can verify that the downloaded files are correct.
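For what it is worth, a rough sketch of that bulk approach with curl (the cksum=1 parameter and the sha1sum-compatible "<checksum>  <filename>" layout of its output are my assumptions about the file_search API; check the API description before relying on them):

API="https://oceandata.sci.gsfc.nasa.gov/api/file_search"
QUERY="sensor=${SENSOR}&sdate=${YYYY1}-${MM1}-${DD1}&edate=${YYYY2}-${MM2}-${DD2}&dtype=L3b&results_as_file=1"

# 1. Get the list of download URLs for the period.
curl -s -d "${QUERY}&addurl=1" "$API" > urls.txt

# 2. Get the matching checksum list (format assumed sha1sum-compatible).
curl -s -d "${QUERY}&cksum=1" "$API" > checksums.txt

# 3. Download everything, then verify each file against the checksum list.
while read -r url; do
    curl -b ~/.urs_cookies -c ~/.urs_cookies -L -n -O "$url"
done < urls.txt
sha1sum -c checksums.txt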

Re: Search and download data files

Posted: Fri Oct 15, 2021 4:57 pm America/New_York
by arlindo.arriaga
Dear gnwiii

Thanks for your help. Unfortunately I have to stick with cURL for now: I am retired, my colleague needs time to develop new scripts in Python, and the present scripts (in bash) are really quite long. The best way forward is to fix what actually needs fixing in the present cURL-based script, so that the acquisition of data files can resume as soon as possible. Other colleagues can then plan their own work on developing new scripts in Python (the present scripts generate data images with GMT...).


I would therefore greatly appreciate it if you could specify what needs to be changed in my cURL command that calls the API function file_search (I compared it with the instructions in the Search and Download section and my command seems to be OK). As for the cURL command that runs the getfile function, I understand the path is now different from the one in my bash command.
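For reference, this is how I read the two endpoint locations from the current instructions (please confirm):

SEARCH_URL="https://oceandata.sci.gsfc.nasa.gov/api/file_search"   # same as in my script
GETFILE_URL="https://oceandata.sci.gsfc.nasa.gov/ob/getfile"       # my script still uses a /cgi/... path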

I thank you for the kindness of your attention and help.
Kind regards
Arlindo Arriaga

Re: Search and download data files

Posted: Mon Oct 18, 2021 8:13 am America/New_York
by OB.DAAC - amscott
My questions are as follows:
(1) Is the NASA server name correct?
(2) May I still use your "file_search" function as before, or is some permission now required?
(3) If "file_search" is no longer available, could you please tell me what I should do to check whether a given filename exists before trying to download the file?
(4) May I still use your "getfile" function to download a file with a curl command like the following?
1) Yes.
2) Yes, but you now need an Earthdata account and a .netrc file set up (see the sketch after this list).
3) file_search is still available; see the cURL example on the Data > Search and Download Methods page.
4) getfile still works.
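In practice the curl setup amounts to something like the following, where USERNAME, PASSWD and FILENAME are placeholders (the exact steps and the getfile path should be taken from the Download Methods page itself):

# One-time Earthdata authentication setup used by curl's -n and -b/-c options.
echo "machine urs.earthdata.nasa.gov login USERNAME password PASSWD" >> ~/.netrc
chmod 600 ~/.netrc
touch ~/.urs_cookies

# Example download through getfile using that setup.
curl -b ~/.urs_cookies -c ~/.urs_cookies -L -n -O https://oceandata.sci.gsfc.nasa.gov/ob/getfile/FILENAME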

Re: Search and download data files

Posted: Mon Oct 18, 2021 11:43 am America/New_York
by gnwiii
arlindo.arriaga wrote: Fri Oct 15, 2021 4:57 pm America/New_York Thanks for your help. Unfortunately I have to stick with cURL for now: I am retired, my colleague needs time to develop new scripts in Python, and the present scripts (in bash) are really quite long. The best way forward is to fix what actually needs fixing in the present cURL-based script, so that the acquisition of data files can resume as soon as possible. Other colleagues can then plan their own work on developing new scripts in Python (the present scripts generate data images with GMT...).
The python script can be run from a bash script, and should allow you to simplify scripts that use curl, but curl does work for me most of the time. When everything is working, python, wget (1 or 2), or curl are good, but when the internet glitches, you see differences (which may be just default timeout and retry parameters). You do need to follow the current configuration details for wget or curl given in the NASA document, while the python script needs no configuration.
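For example, curl can be made more forgiving of glitches with its timeout and retry options; the values below are only illustrative, and $url stands for one of the download URLs:

curl --connect-timeout 30 --retry 5 --retry-delay 10 --max-time 900 \
     -b ~/.urs_cookies -c ~/.urs_cookies -L -n -O "$url"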

Re: Search and download data files

Posted: Mon Oct 18, 2021 1:17 pm America/New_York
by arlindo.arriaga
Dear Alicia

Thanks for your information on the Python script.

Meanwhile I was able to solve the problem as follows:
(1) I added to my profile the Reverb application and all applications related to Ocean Biology.

(2) I added the -k option to the two curl commands (for the file_search and getfile functions).

Data files can now be downloaded again.

I do not know exactly which of these solved the problem, the added applications or the -k option on the curl command line... but all is working fine again.
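For the record, the two commands now look roughly like this with -k added (downloading "$result1" directly is a shorthand that relies on addurl=1 returning a full URL; the actual script builds the getfile URL from the filename, as shown earlier). As I understand it, -k only tells curl to skip TLS certificate verification, so if that option is what fixed things, the underlying issue was probably a certificate or proxy on our local network rather than on the NASA server:

result1=$(curl -k -d "sensor=${SENSOR}&sdate=${YYYY1}-${MM1}-${DD1}&edate=${YYYY2}-${MM2}-${DD2}&dtype=L3b&addurl=1&results_as_file=1&search=${old_name}" https://oceandata.sci.gsfc.nasa.gov/api/file_search | head -1)

# Download whatever the search reported (-k skips certificate verification).
curl -k -b "$HOME/.urs_cookies" -c "$HOME/.urs_cookies" -L -n -O "$result1"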

Thanks for your attention.
Kind regards
Arlindo Arriaga