Troubleshooting Data Retrievals

Use this Forum to find information on, or ask a question about, NASA Earth Science data.
mshatley
Posts: 10
Joined: Thu Sep 30, 2010 10:27 am America/New_York

Troubleshooting Data Retrievals

by mshatley » Fri Mar 08, 2019 7:58 am America/New_York

I have been downloading Aqua L1A granules over a study area and am running into connection issues I could use some help with. I'm using the list of granules from the Level 1&2 browser tool as input to wget to retrieve the granules. The first few transfers work perfectly, but then I start seeing noticeable delays during the connection process: wget hangs for 10-20 seconds while connecting to the site for each granule.

After I get the granules downloaded, I'm also seeing sporadic failures (some granules fail while others succeed) when retrieving the ancillary files required for geolocation and processing. If I try to download these files on a different machine, the download succeeds. Output from getanc.py for both machines is below (getanc.py on both machines reports version 2.1).

In the past (a few months ago), I've been able to do these types of transfers/processing without issue. I'm trying to narrow down the cause of the problem and was wondering if you could confirm whether there are limits on the frequency of data retrieval that would lead to the above issues?

Thanks for your help.

Machine 1:
getanc.py -v A2019066150500.L1A_LAC
Searching database: /home/user/anc.db
Determining pass start and end times...
Aqua

Input file: A2019066150500.L1A_LAC
Start time: 2019066150500
End time: 2019066150959

Error! could not establish a network connection. Check your network connection.
If you do not find a problem, please try again later.


Machine 2:
getanc.py A2019066150500.L1A_LAC -v -r
Searching database: /opt/ocssw/var/ancillary_data.db
Determining pass start and end times...
Aqua

Input file: A2019066150500.L1A_LAC
Sensor    : aqua
Start time: 2019066150500
End time  : 2019066150959

  Found: /opt/ocssw/var/anc/2019/066/N201906618_MET_NCEP_6h.hdf
  Found: /opt/ocssw/var/anc/2019/066/N201906618_MET_NCEP_6h.hdf
  Found: /opt/ocssw/var/anc/2019/066/N201906612_MET_NCEP_6h.hdf
  Found: /opt/ocssw/var/anc/2019/065/N201906500_SEAICE_NSIDC_24h.hdf
  Found: /opt/ocssw/var/anc/2019/064/N2019064_SST_OIV2AV_24h.nc

Created 'A2019066150500.L1A_LAC.anc' l2gen parameter text file:

icefile=/opt/ocssw/var/anc/2019/065/N201906500_SEAICE_NSIDC_24h.hdf
met1=/opt/ocssw/var/anc/2019/066/N201906612_MET_NCEP_6h.hdf
met2=/opt/ocssw/var/anc/2019/066/N201906618_MET_NCEP_6h.hdf
met3=/opt/ocssw/var/anc/2019/066/N201906618_MET_NCEP_6h.hdf
sstfile=/opt/ocssw/var/anc/2019/064/N2019064_SST_OIV2AV_24h.nc

*** WARNING: The following ancillary data types were missing or are not optimal:  OZONE
*** Beware that certain MET and OZONE files just chosen by this program are not optimal.
*** For near real-time processing the remaining files may become available soon.


OB.DAAC - SeanBailey
User Services
Posts: 1324
Joined: Wed Sep 18, 2019 6:15 pm America/New_York
Answers: 1
Been thanked: 1 time

Troubleshooting Data Retrievals

by OB.DAAC - SeanBailey » Fri Mar 08, 2019 4:00 pm America/New_York

There may be an issue with the way you're using wget. The pause you are seeing is likely due to a firewall limit we have. We allow a given IP only so many states on the firewall - once the IP reaches the limit, it has to wait until previous states expire. That can look like a pause. You want to make sure that you reuse your connection whenever possible. Serially calling wget may be initiating a new connection with each call... that'd be bad for you.
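As an illustration, the per-file pattern that hits this limit looks like the loop below; each iteration spawns a fresh wget process and therefore a fresh connection, consuming one firewall state per file. The list file name and its contents are hypothetical (the granule name is taken from earlier in this thread), and the commands are echoed rather than executed so the sketch runs without network access:

```shell
# Hypothetical granule list: one bare getfile URL per line.
printf '%s\n' \
  'https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2019066150500.L1A_LAC' \
  'https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2019066151000.L1A_LAC' \
  > granules.txt

# Anti-pattern: one wget invocation per file. Every call opens a new
# connection, so every file consumes a fresh firewall state; once the
# per-IP limit is reached, new connects stall (the 10-20 s hangs above).
# Echoed here instead of run so the sketch is network-free.
while IFS= read -r url; do
    echo wget "$url"
done < granules.txt
```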

Sean

tintinabula
Posts: 11
Joined: Mon Jan 03, 2005 11:08 am America/New_York

Troubleshooting Data Retrievals

by tintinabula » Fri Mar 08, 2019 4:29 pm America/New_York

If wget is being called once per file, then you are going to reach the upper limit on the number of concurrent connections we allow. A suggestion would be to create a list of URLs in a single text file and then pass that text file to wget.

With the list of URLs to download in a text file, wget can import that list using "wget --input-file=file". The input-file method allows wget to connect a single time and download the list serially, avoiding the concurrent connection limit.

I would also suggest adding the "--no-clobber" option so the client can check whether you already have a copy of a file and not ask for it again. Also, avoid wget's "--continue" option, as it can corrupt your local files if the client believes the local file is a different size than the HTTP header from the server reports. From the wget man page: "...if the file is bigger on the server because it's been changed, as opposed to just appended to, you'll end up with a garbled file. Wget has no way of verifying that the local file is really a valid prefix of the remote file."
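A minimal sketch of that approach follows. The first granule name is taken from this thread; the second granule name and the list file name are illustrative. The actual transfer is gated behind an environment variable here, since it needs network access to oceandata.sci.gsfc.nasa.gov:

```shell
# One URL per line; a single wget call fetches them serially
# over one reused connection.
cat > granule_list.txt <<'EOF'
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2019066150500.L1A_LAC
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2019066151000.L1A_LAC
EOF

# --input-file : one wget process, one connection, serial downloads
# --no-clobber : skip files already present locally
# (avoid --continue: a changed remote file can leave a garbled local copy)
if [ "${RUN_DOWNLOAD:-0}" = 1 ]; then
    wget --no-clobber --content-disposition \
         --input-file=granule_list.txt
fi
```

Set RUN_DOWNLOAD=1 to perform the transfer; otherwise the snippet only builds the list.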

mshatley
Posts: 10
Joined: Thu Sep 30, 2010 10:27 am America/New_York

Troubleshooting Data Retrievals

by mshatley » Mon Mar 11, 2019 12:18 pm America/New_York

Thanks for the update and suggestions.

Currently, I'm grabbing the list of granules from the file on the search results page of the Level 1&2 browser (see https://oceancolor.gsfc.nasa.gov/cgi/browse.pl/filelist.1552320866.txt?sub=filenamelist&id=1552320866.32730&prm=CHL). I pipe this into wget (wget --limit-rate=10M -B https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/ --content-disposition -i -). In the output, I see wget reporting that it is reusing the existing connection. This is all based on a script that was posted on the forums a few years ago and was working until recently. I'll look into alternative retrieval methods and see if I find any improvement.

OB.DAAC - SeanBailey
User Services
Posts: 1324
Joined: Wed Sep 18, 2019 6:15 pm America/New_York
Answers: 1
Been thanked: 1 time

Troubleshooting Data Retrievals

by OB.DAAC - SeanBailey » Wed Mar 13, 2019 12:06 pm America/New_York

You may want to update your ocssw/scripts... in debugging another issue, it was discovered that the code that selects and retrieves the ancillary files (specifically scripts/modules/anc_utils.py) was no longer making the list unique, so files that appear duplicated in the return (e.g. MET) would be downloaded twice. Firewall rules are in place that make that a bad idea...

The code has been fixed.  Looking back at this thread makes me think you might have been bitten by this.

Sean
