Welcome to the Earthdata Forum! Here, the scientific user community and subject matter experts from NASA Distributed Active Archive Centers (DAACs), and other contributors, discuss research needs, data, and data applications.
by s_barzin » Sat Oct 31, 2020 4:58 pm America/New_York
I am trying to download a relatively large amount of data (granules of the Terra & Aqua MAIAC data). I run the bash script on a high-performance computing cluster to load the data directly onto the system. However, most scripts stop partway through because of a connection issue. Since the cluster is very reliable, I am certain the connection issue is not on this side but on the NASA server's side. How can I download the data without having to constantly check whether a script has timed out (and then restart it)?
All ideas are much appreciated.
by lien » Thu Nov 05, 2020 3:36 pm America/New_York
In some instances, while your system is working through the files and downloading them, there may be a split second between when the last file finishes downloading and the next one begins. When traffic is really busy, near or at the capacity of about 2000 connections, another user may connect in that gap. Also, the system limits a single IP address to 20 connections at a time; could your script be trying to go over that? If neither of these is the cause, can you send your script to email@example.com?
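To rule out the per-IP limit, one option is to cap how many downloads run at once. The following is only a minimal sketch: `run_capped` is a hypothetical helper name, `urls.txt` is an assumed file with one granule URL per line, and the curl call in the comment assumes credentials are stored in `~/.netrc`. None of this comes from the generated script.

```bash
#!/usr/bin/env bash
# Sketch: keep the number of simultaneous downloads below a fixed cap.

# run_capped MAX_PARALLEL COMMAND [ARGS...]
# Reads whitespace-separated items (e.g. one URL per line) on stdin and runs
# COMMAND once per item, with at most MAX_PARALLEL copies running at a time.
# COMMAND must be an executable, not a shell function, because xargs runs it.
# (hypothetical helper name)
run_capped() {
  local max_parallel=$1
  shift
  xargs -n 1 -P "$max_parallel" "$@"
}

# Example (urls.txt is an assumed list of granule URLs):
#   run_capped 5 curl --fail --location --netrc --remote-name < urls.txt
```

Keeping the cap well under 20 leaves room for any other jobs that share the same outgoing IP address.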
by s_barzin » Thu Nov 05, 2020 4:00 pm America/New_York
I have noticed that there is a split-second delay between file downloads, but that is not a problem. Additionally, I generally have 5 download scripts running at the same time, so the connection limit shouldn't be the issue either. I have also noticed that the scripts are more likely to stop at certain times of the day than at others. Essentially, a script runs through the downloads, but then it hits some file and the speed drops to 0 (this can happen in the middle of downloading that file or at the beginning). It stays stuck at 0 download speed for about 15 minutes and then terminates. I then have to restart the whole script, which is quite frustrating and causes a huge delay.
If the issue is that the whole server is busy, is there a way to amend the script so that, instead of terminating, it just waits for a long time? (I have tested the standard curl options, but they don't seem to make any difference, so I assume the connection is terminated from the server's side rather than mine and is therefore not within my control, unfortunately.) I can send my script over, but these are just the scripts that the Earthdata catalogue generates when I specify a spatial rectangle and date range (so they are generated by the NASA Earthdata system rather than written by me). Should I still send them?
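One way to get "wait and try again" behaviour instead of a terminated run is to wrap each file download in a retry loop with a growing pause. The following is a minimal sketch under stated assumptions, not a tested fix for this server. The helper names `retry_with_backoff` and `fetch_one`, the file `urls.txt`, and all thresholds are hypothetical. It also assumes credentials are in `~/.netrc` and that the server accepts resumed (range) requests. The curl options `--speed-limit` and `--speed-time` make curl give up on a stalled transfer after a set time, and `--continue-at -` resumes a partial file on the next attempt.

```bash
#!/usr/bin/env bash
# Sketch: retry each download with exponential backoff instead of aborting the run.

# retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, doubling the pause after every failure.
# Returns 1 once MAX_ATTEMPTS attempts have failed. (hypothetical helper name)
retry_with_backoff() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed; sleeping ${delay}s" >&2
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}

# fetch_one URL OUTPUT_FILE
# Downloads to OUTPUT_FILE.part and renames it only after curl succeeds, so a
# finished file is never mistaken for a partial one. (hypothetical helper name)
fetch_one() {
  local url=$1 out=$2
  [[ -s "$out" ]] && return 0        # already completed in an earlier run
  # Abort if the speed stays below 1024 bytes/s for 60 s (assumed thresholds),
  # and resume from the existing .part file if there is one.
  curl --fail --location --netrc \
       --connect-timeout 30 \
       --speed-limit 1024 --speed-time 60 \
       --continue-at - \
       --output "${out}.part" "$url" \
    && mv "${out}.part" "$out"
}

# Example (urls.txt is an assumed list, one granule URL per line):
#   while IFS= read -r url; do
#     retry_with_backoff 8 30 fetch_one "$url" "$(basename "$url")" \
#       || echo "$url" >> failed.txt
#   done < urls.txt
```

With 8 attempts and a 30-second initial delay, the pauses grow from 30 seconds to 32 minutes, so a busy period on the server is waited out rather than ending the run. Any URL that still fails is logged to `failed.txt` and the loop moves on to the next file.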
by lien » Fri Nov 06, 2020 10:24 am America/New_York
If you can send your IP address to the LP DAAC contact at firstname.lastname@example.org, we can track your requests to our server. First, though, I just want to make sure we are talking about the same server: https://e4ftl01.cr.usgs.gov/MOTA
by s_barzin » Fri Nov 06, 2020 11:58 am America/New_York
Many thanks. I will send an email with the IP address. Is there a specific subject I should put in the email?
The server is the following: https://e4ftl01.cr.usgs.gov//MODV6_Dal_H/MOTA/
so I assume this would be the same as the one you mentioned (?)
Thanks again for the help in solving this issue, much appreciated!