Data access stopped due to IP address blocked

Posted: Thu Aug 23, 2018 8:34 am America/New_York
by jstum
Good evening,

We were unable to access your server on two consecutive days:
from 21 August 20h UTC to 22 August 10h UTC
from 22 August 14h30 UTC to 23 August 10h UTC

It looks like our IP address (xx.xxx.xx.xx) is blocked.

We had similar issues in the past (see https://oceancolor.gsfc.nasa.gov/forum/oceancolor/topic_show.pl?tid=8432 and https://oceancolor.gsfc.nasa.gov/forum/oceancolor/topic_show.pl?tid=8930), caused by bad or excessive wget requests leading to ERROR 404.

But this time, we did not find any ERROR 404 on our side. We also checked with our IR support team, and everything is fine.

Could you tell us what happened and remove the block?

Thank you for your help,

Jacques

Posted: Thu Aug 23, 2018 10:25 am America/New_York
by dana.r.wilson
Jacques,
You should have access now. You were blocked because we kept getting too many errors from your IP address.

Posted: Fri Aug 24, 2018 6:21 am America/New_York
by jstum
Hi Dana,
Unfortunately, access was still denied last night, so we would appreciate it if you could check again.
Is it possible to get more detail on the "too many errors" from our IP address? We need to find out which process on our side is responsible.
Thanks

Jacques

Posted: Fri Aug 24, 2018 7:55 am America/New_York
by dana.r.wilson
Jacques, this was the response I received from the network admin. I will ask the admins to check again.
The IP address, xx.xxx.xx.xx, was automatically blocked on 21
August and 22 August for generating too many errors on the web
servers. Errors can be generated by anything from requesting files that
do not exist to downloading too many files concurrently, which leads to
spurious 503 errors. An error is any status code other than 200, 301, or 304.

The automated block list is cleared every day at 5am EST5EDT.

As of today, xx.xxx.xx.xx is not currently blocked.
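
The error definition in the admin's reply can be expressed as a small check. This is a minimal Python sketch, assuming only what the reply states (any status code other than 200, 301, or 304 counts as an error). The function names are hypothetical, and the number of errors that triggers a block is not given in this thread, so no threshold is modeled.

```python
# Status codes the admin's reply lists as non-errors; everything else counts.
OK_STATUSES = {200, 301, 304}


def is_error(status: int) -> bool:
    """Return True if the status would count as an error under that rule."""
    return status not in OK_STATUSES


def count_errors(statuses) -> int:
    """Count how many responses in a sequence would be treated as errors."""
    return sum(1 for status in statuses if is_error(status))


if __name__ == "__main__":
    # Illustrative sequence of response codes, not taken from any real log.
    print(count_errors([200, 206, 404, 304, 503, 206]))
```

Under this rule a 206 (Partial Content) response counts as an error even though HTTP itself treats it as a success code.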

Posted: Fri Aug 24, 2018 9:00 am America/New_York
by dana.r.wilson
The system admin team have manually unblocked xx.xxx.xx.xx this morning.

xx.xxx.xx.xx has been automatically blocked every day for the past four (4) days.

It looks like the client is opening 20 connections to download random parts of every single file it requests. These multipart downloads generate many "206" status codes, which count as errors and trigger the temporary blocks. Multipart client downloads put extra load on our system and are considered unnecessary. We do not limit the download throughput of the client. I would suggest not using any download accelerators and downloading the files serially, one file at a time, using a single connection.

# For example, download a file using a single wget request
wget -c https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.P2002165.0030_1.PDS.bz2

If the client script is still causing errors, our system will automatically block the IP again.
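
For scripted downloads, the same advice (one file at a time, one connection per file) might look like the following Python 3 sketch. The function name `download_serially`, the `opener` parameter, and the `.part` temporary suffix are illustrative assumptions, not part of any OB.DAAC tool. The sketch never sends a Range header, so the server should see plain full-file requests rather than partial ones.

```python
import shutil
import urllib.request
from pathlib import Path


def download_serially(urls, dest_dir, opener=urllib.request.urlopen, timeout=300):
    """Download each URL in turn over a single connection, skipping existing files.

    `opener` is injectable (an assumption made for testing); by default it is
    urllib.request.urlopen, which issues one plain GET per call.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        target = dest / url.rstrip("/").rsplit("/", 1)[-1]
        if target.exists():
            continue  # already downloaded: do not contact the server again
        tmp = target.with_name(target.name + ".part")
        # One request, streamed to a temporary file, then moved into place.
        with opener(url, timeout=timeout) as response, open(tmp, "wb") as out:
            shutil.copyfileobj(response, out)
        tmp.replace(target)
        saved.append(target)
    return saved
```

Writing to a temporary name first means an interrupted transfer never leaves a truncated file under the final name, so a later run simply fetches the whole file again instead of resuming it.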

Posted: Fri Aug 24, 2018 9:41 am America/New_York
by jstum
Thank you Dana for this information.

We have just changed the way we download the ancillary files needed by our data processing (they are now downloaded once for all, instead of each processing run downloading them itself). This should drastically reduce the number of wget requests. Fingers crossed...

Regards,

Jacques

Posted: Tue Aug 28, 2018 6:23 am America/New_York
by jstum
Hello,

Sorry for this new post, but our IP address was blocked again from yesterday 15h UTC to today 10h UTC, after three days of successful connections.
As we believe our wget requests are now correct, we would like more specific information from your logs about any unauthorized wget commands we sent to your server (the files we attempted to download and the timestamps of the wget commands would help a lot).
On our side, we are going to further limit the number of wget commands we send (adding a timeout in wget, and adding sleeps in our scripts to avoid repeating wget commands within a short time interval...).

Your help is much appreciated,

Jacques

Posted: Wed Aug 29, 2018 4:06 pm America/New_York
by OB.DAAC - SeanBailey
Jacques,

The issue is that your IP is making an inordinate number of partial data connections and transferring zero bytes of data on each of them.
For a single VIIRS geolocation file (V2018239015400.GEO-M_SNPP.nc), your IP downloaded the full file 11 times but made 225 connections; 214 of them look like this:

xx.xxx.xx.xx oceandata.sci.gsfc.nasa.gov - [27/Aug/2018:10:29:53 -0400] "GET /cgi/getfile/V2018239015400.GEO-M_SNPP.nc HTTP/1.1" 206 0 "-" "-"

This is occurring for every file you access.  I hope this helps you sort out what your script is doing that is odd.
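
To look for the same pattern in a log, a line in the format quoted above can be parsed and tallied per file. This is a minimal Python sketch, assuming every line follows the layout of the sample entry; the regular expression and the `summarize` helper are hypothetical and would need adjusting for any other log format.

```python
import re
from collections import defaultdict

# Matches the layout of the sample entry: ip, host, user, [time], "request", status, bytes.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) (?P<host>\S+) \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarize(lines):
    """Tally, per requested path, full downloads versus zero-byte 206 responses."""
    stats = defaultdict(lambda: {"full": 0, "empty_partial": 0, "other": 0})
    for line in lines:
        match = LOG_RE.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        status = int(match["status"])
        size = 0 if match["size"] == "-" else int(match["size"])
        entry = stats[match["path"]]
        if status == 200:
            entry["full"] += 1
        elif status == 206 and size == 0:
            entry["empty_partial"] += 1
        else:
            entry["other"] += 1
    return dict(stats)


if __name__ == "__main__":
    sample = (
        'xx.xxx.xx.xx oceandata.sci.gsfc.nasa.gov - [27/Aug/2018:10:29:53 -0400] '
        '"GET /cgi/getfile/V2018239015400.GEO-M_SNPP.nc HTTP/1.1" 206 0 "-" "-"'
    )
    print(summarize([sample]))
```

A file whose `empty_partial` count is far higher than its `full` count shows the behaviour described in this post.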

Sean

Posted: Thu Aug 30, 2018 9:20 am America/New_York
by jstum
Hello Sean,

Thank you for this very useful information. We identified the problem in our scripts, which was due to insufficient wget options.
Before, our wget options were limited to: --continue
Now we have added: --timeout 300 --tries 2
The timeout option sets wget's network timeouts (DNS, connect, and read) to 5 minutes, and the tries option limits the number of attempts to download a file. The default is 20 attempts; it is now 2.
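
If wget is launched from a Python wrapper, the revised option set could be assembled as below. This is a sketch only: `wget_command` and `fetch` are hypothetical helper names, and the default values simply mirror the options quoted above.

```python
import subprocess


def wget_command(url, timeout_s=300, tries=2):
    """Build the wget argument list with the options described in this post."""
    return [
        "wget",
        "--continue",              # resume a partially downloaded local file
        f"--timeout={timeout_s}",  # network timeout in seconds
        f"--tries={tries}",        # cap on retries (wget's default is 20)
        url,
    ]


def fetch(url):
    """Run wget once for a single URL and return its exit status."""
    return subprocess.run(wget_command(url), check=False).returncode
```

Passing the arguments as a list avoids shell quoting problems when URLs contain characters such as `&` or `?`.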

This modification has been installed in all our scripts.

We hope that the block on our IP will now be removed soon.

Regards,

Jacques

Posted: Tue Sep 18, 2018 2:05 pm America/New_York
by adybbroe
Hi,

We have had access problems for quite some days now. We are unsure exactly how long, but probably a couple of weeks.
It is hitting operational production quite hard now! :-(
We have problems getting both MODIS ocean color data and the ancillary data needed to run SeaDAS (on locally received Terra/Aqua data).

We fetch the MODIS level-1 data like this:

wget --post-data="subID=1527&subType=2&addurl=1&results_as_file=1&sdate=`date --date="yesterday" +"%Y-%m-%d"`" -O - https://oceandata.sci.gsfc.nasa.gov/api/file_search | wget -c -i -

And we fetch ancillary data from here:

https://oceandata.sci.gsfc.nasa.gov/Ancillary/LUTs/modis/

We fetch the files from Python when the local file is 2 weeks old, like this:
        try:
            usock = urllib2.urlopen(URL + filename)
        except urllib2.HTTPError:
            LOG.warning("Failed opening file " + filename)
            continue

        data = usock.read()
        usock.close()
        LOG.info("Data retrieved from url...")
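
The two-week age check mentioned above is not shown in the snippet. A minimal Python 3 sketch of such a check follows; the helper name `needs_refresh` and the use of the file's modification time are assumptions, since the post does not say how the age is measured.

```python
import os
import time

TWO_WEEKS_S = 14 * 24 * 3600


def needs_refresh(path, max_age_s=TWO_WEEKS_S, now=None):
    """Return True if the local file is missing or at least max_age_s old."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True  # no local copy yet, so a download is needed
    return now - mtime >= max_age_s
```

Checking the local copy first means the server is contacted only when a file actually needs replacing, which keeps the request count low.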

The MODIS level-1 files are fetched from a cron-type job, which checks whether new data need to be downloaded. So we make regular attempts. I can't remember how often we try, but I think it is at least every hour. We can check if this is important. We suspect that we are being blocked at the moment. This has happened before, unfortunately.

The IP address I tried just now, and which fails, is this:
xx.xxx.xx.xx

We have tried from various in-house servers, and it fails (almost) everywhere. So we suspect you are blocking us on a set of IPs?

Doing it from this IP address is okay:
xx.xxx.xx.xx

Adding to the strange behaviour, and making it difficult for us to work out whether the problems are only due to blocking, https://downforeveryoneorjustme.com/ says that both https://oceandata.sci.gsfc.nasa.gov/ and https://oceancolor.gsfc.nasa.gov are down!???

Grateful for any help!
Adam