Data access stopped due to IP address blocked

by jstum » Thu Aug 23, 2018 8:34 am America/New_York

Good evening,

We were unable to access your server on two consecutive days:
from 21 August 20h UTC to 22 August 10h UTC
from 22 August 14h30 UTC to 23 August 10h UTC

It looks like our IP address (xx.xxx.xx.xx) is blocked.

We had similar issues in the past (see https://oceancolor.gsfc.nasa.gov/forum/oceancolor/topic_show.pl?tid=8432 and https://oceancolor.gsfc.nasa.gov/forum/oceancolor/topic_show.pl?tid=8930), caused by bad or excessive wget requests leading to ERROR 404 responses.

This time, however, we no longer find any ERROR 404 on our side. We also checked with our IR support team, and everything is fine.

Could you tell us what happened and remove the block?

Thank you for your help,

Jacques

by dana.r.wilson » Thu Aug 23, 2018 10:25 am America/New_York

Jacques,
You should have access now. You were blocked because we kept getting too many errors from your IP address.

by jstum » Fri Aug 24, 2018 6:21 am America/New_York

Hi Dana,
Unfortunately, access was still not possible last night, so we would appreciate it if you could check again.
Could you give us more detail on the errors coming from our IP address? We need to find out which process on our side is responsible for them.
Thanks

Jacques

by dana.r.wilson » Fri Aug 24, 2018 7:55 am America/New_York

Jacques, this was the response I received from the network admin. I will ask the admins to check again.
The IP address, xx.xxx.xx.xx, was automatically blocked on 21
August and 22 August for generating too many errors on the web
servers. Errors can be generated by anything from requesting files that
do not exist to running too many concurrent downloads, which leads to
spurious 503 errors. An error is any status code other than 200, 301, or 304.

The automated block list is cleared every day at 5am EST5EDT.

As of today, xx.xxx.xx.xx is not currently blocked.
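
The rule quoted above can be sketched in a few lines of Python. This is only an illustration, assuming the quoted definition (anything other than 200, 301, or 304 is an error) is the whole policy; the function names are hypothetical and this is not the server's actual code. Note that under this definition a 206 (Partial Content) response also counts as an error.

```python
# Status codes that do NOT count as errors, per the admin's description.
# Assumption: this set is the complete rule; the real block logic is not public.
NON_ERROR_STATUSES = frozenset({200, 301, 304})


def counts_as_error(status):
    """Return True if a response status would count toward the block."""
    return status not in NON_ERROR_STATUSES


def count_errors(statuses):
    """Count how many statuses in an iterable would be treated as errors."""
    return sum(1 for status in statuses if counts_as_error(status))
```

Running `count_errors` over the status codes a client script has collected gives a rough idea of how many of its requests the server would treat as errors.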

by dana.r.wilson » Fri Aug 24, 2018 9:00 am America/New_York

The system admin team have manually unblocked xx.xxx.xx.xx this morning.

xx.xxx.xx.xx has been automatically blocked every day for the past four (4) days.

It looks like the client is opening up 20 connections to download random parts of every single file it requests. This multipart
download is causing many "206" error codes, which are triggering the temporary blocks. Multipart client downloads put extra load on our
system and are considered unnecessary. We do not limit the download throughput of the client. I would suggest not using any download
accelerators and downloading the files serially, one file at a time, using a single connection.

# For example, download a file using a single wget request
wget -c https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.P2002165.0030_1.PDS.bz2

If the client script is still causing errors, our system will automatically block the IP again.
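
For scripts that drive wget from Python, the advice above (one file at a time, one connection, no accelerators) could look roughly like the sketch below. The function names and the pause length are assumptions for illustration, not something the OB.DAAC prescribes; `fetch_with_wget` simply runs the same `wget -c` command shown in the example.

```python
import subprocess
import time


def fetch_with_wget(url):
    """Download one URL with a single wget process; True on success."""
    return subprocess.run(["wget", "-c", url]).returncode == 0


def download_serially(urls, fetch=fetch_with_wget, pause_seconds=1.0,
                      sleep=time.sleep):
    """Fetch URLs strictly one after another and return those that failed.

    A short pause (an assumed, tunable value) separates consecutive
    requests so the script never hammers the server.
    """
    failed = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(pause_seconds)
        if not fetch(url):
            failed.append(url)
    return failed
```

Failed URLs are returned rather than retried immediately, so the caller can decide whether and when to try again instead of generating a burst of repeated requests.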

by jstum » Fri Aug 24, 2018 9:41 am America/New_York

Thank you Dana for this information.

We have just changed the way we download the ancillary files needed by our data processing: they are now downloaded once for all jobs, instead of each processing job downloading them itself. This should drastically reduce the number of wget requests. Fingers crossed...

Regards,

Jacques

by jstum » Tue Aug 28, 2018 6:23 am America/New_York

Hello,

Sorry for this new post, but our IP address was blocked again from yesterday 15h UTC to today 10h UTC, after three days of successful connections.
As we believe our wget requests are now correct, we would like to obtain more specific information from your logs about the offending wget commands sent by us to your server (the files we attempted to download and the timestamps of the wget commands would help a lot).
On our side, we are going to further limit the number of wget commands we send (adding a timeout to wget, adding sleeps in our scripts to avoid repeating a wget command within a short time interval...).

Your help is much appreciated,

Jacques

by OB.DAAC - SeanBailey » Wed Aug 29, 2018 4:06 pm America/New_York

Jacques,

The issue is that your IP is making an inordinate number of partial data connections, and accessing zero bytes of data for each connection.
For a single VIIRS geolocation file (V2018239015400.GEO-M_SNPP.nc), your IP downloaded the full file 11 times but made 225 connections; 214 of them look like this:

xx.xxx.xx.xx oceandata.sci.gsfc.nasa.gov - [27/Aug/2018:10:29:53 -0400] "GET /cgi/getfile/V2018239015400.GEO-M_SNPP.nc HTTP/1.1" 206 0 "-" "-"

This is occurring for every file you access.  I hope this helps you sort out what your script is doing that is odd.

Sean
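
To look for the same pattern in a log on the client or proxy side, a small Python sketch along these lines may help. It assumes every line has the field layout of the single sample above (client IP, host, dash, timestamp, request, status, byte count); the regular expression and function name are hypothetical and based only on that one line.

```python
import re
from collections import Counter

# Pattern derived from the one sample log line quoted above (an assumption
# about the format, not a documented specification).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) (?P<host>\S+) \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def count_empty_partials(lines):
    """Count, per requested path, the 206 responses that carried zero bytes."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        size = 0 if match["size"] == "-" else int(match["size"])
        if match["status"] == "206" and size == 0:
            counts[match["path"]] += 1
    return counts
```

A path with a high count here is one that the client keeps reopening with range requests without transferring any data.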

by jstum » Thu Aug 30, 2018 9:20 am America/New_York

Hello Sean,

Thank you for this very useful information. We identified the problem in our scripts, which was due to insufficient wget options.
Before, our wget options were limited to: --continue
Now we have added: --timeout 300 --tries 2
The timeout option will limit the access duration to 5 minutes, and the tries option will limit the number of attempts to download a file. By default it's 20 times; it will now be 2 times.

This modification has been installed in all our scripts.

We hope that the block on our IP will now be removed soon.

Regards,

Jacques
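
For scripts that launch wget from Python, the option set described above can be assembled as in the sketch below. The helper name is hypothetical; the flags themselves (`--continue`, `--timeout`, `--tries`) are standard wget options. One caveat: wget's `--timeout` sets the DNS, connect, and read (idle) timeouts, so it bounds how long wget waits on a stalled connection rather than the total duration of a download.

```python
import shlex


def build_wget_command(url, timeout_seconds=300, tries=2):
    """Return the wget argument list with the options described above."""
    return [
        "wget",
        "--continue",                    # resume a partially downloaded file
        f"--timeout={timeout_seconds}",  # network (DNS/connect/read) timeout
        f"--tries={tries}",              # wget's default is 20 attempts
        url,
    ]


def as_shell_string(command):
    """Render an argument list as a copy-pasteable shell command."""
    return shlex.join(command)
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting problems with unusual file names.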

by adybbroe » Tue Sep 18, 2018 2:05 pm America/New_York

Hi,

We have had access problems for quite some days now. Unsure exactly how long, but probably a couple of weeks.
It is hitting operational production quite hard now! :-(
We have problems both getting MODIS ocean color data as well as ancillary data needed to run SeaDAS (on locally received Terra/Aqua data).

We fetch the MODIS level-1 data like this:

wget --post-data="subID=1527&subType=2&addurl=1&results_as_file=1&sdate=`date --date="yesterday" +"%Y-%m-%d"`" -O - https://oceandata.sci.gsfc.nasa.gov/api/file_search | wget -c -i -

And we fetch ancillary data from here:

https://oceandata.sci.gsfc.nasa.gov/Ancillary/LUTs/modis/

We fetch the files from Python when the local file is more than 2 weeks old, like this:
        try:
            usock = urllib2.urlopen(URL + filename)
        except urllib2.HTTPError:
            LOG.warning("Failed opening file " + filename)
            continue

        data = usock.read()
        usock.close()
        LOG.info("Data retrieved from url...")
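
For reference, a Python 3 sketch of the same idea (refresh a file only when the local copy is missing or older than two weeks) might look like this. The helper names and the 300-second timeout are assumptions for illustration; `urllib.request` replaces the Python 2 `urllib2` module used in the snippet above.

```python
import os
import time
import urllib.request

TWO_WEEKS_SECONDS = 14 * 24 * 3600


def needs_refresh(path, max_age_seconds=TWO_WEEKS_SECONDS, now=None):
    """Return True if the local file is missing or older than the limit."""
    if not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) > max_age_seconds


def fetch_if_stale(url, path, timeout_seconds=300):
    """Download url to path with one request, only if the copy is stale.

    Returns True when a new copy was written, False otherwise.
    """
    if not needs_refresh(path):
        return False
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            data = response.read()
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
    with open(path, "wb") as handle:
        handle.write(data)
    return True
```

Because a failed download leaves the old file untouched, the next scheduled run simply tries again once, which keeps the number of error responses per file low.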

The MODIS level-1 files are fetched from a cron-type job, which checks whether new data needs to be downloaded. So we make regular attempts. I can't remember how often we try, but I think it is at least every hour. We can check if this is important. We suspect that we are being blocked at the moment. This has happened before, unfortunately.

The IP address I tried just now, and which fails, is this:
xx.xxx.xx.xx

We have tried from various servers in-house, and it fails (almost) everywhere. So we suspect you are blocking us on a set of IPs?

Doing it from this IP address is okay:
xx.xxx.xx.xx

Adding to the strange behaviour, and what made it difficult for us to realise that the problems are only due to blocking, is that https://downforeveryoneorjustme.com/ says both https://oceandata.sci.gsfc.nasa.gov/ and https://oceancolor.gsfc.nasa.gov are down!?

Grateful for any help!
Adam
