Recommendations to avoid Blacklisting

dem1
Posts: 82
Joined: Mon Nov 28, 2005 4:49 am America/New_York
Answers: 0

Recommendations to avoid Blacklisting

by dem1 » Fri Jan 21, 2022 3:45 am America/New_York

Hi,

We have had a download chain in place for a long time to download your L2 products, and it seems that we were blacklisted from roughly 2022-01-19 17:00 UTC to 2022-01-20 12:00 UTC. I think this is related to the recent reprocessing for the ozone issue, because our chain downloaded more files than usual. However, our chain applies the following logic to avoid blacklisting, and we are trying to understand why it is not enough: it enforces a minimum delay of 5 seconds between each HTTP request to your website. Is that enough? We ask because we sometimes get this kind of error:
HTTPError: 429 Client Error: Too Many Requests for url: https://oceandata.sci.gsfc.nasa.gov/ob/ ... SNPP_OC.nc
and I think we did not have this error in the past.
I've just tried increasing the delay to 10 seconds and it seems better for the moment.

Maybe we could improve our download approach to avoid being blacklisted? What delay do you recommend between two HTTP requests? Are there other recommendations to avoid blacklisting?

Note that it is also difficult on our side to ensure that nobody else in our company accesses your website manually, because the same IP address is also used for general browsing (Firefox, etc.). Do you have recommendations about this problem?

Many thanks in advance,
Julien

Re: Recommendations to avoid Blacklisting

by dem1 » Fri Jan 21, 2022 3:55 am America/New_York

I have just had the same error again several times, even with the 10-second delay, so it seems that is not enough.

Re: Recommendations to avoid Blacklisting

by dem1 » Fri Jan 21, 2022 8:37 am America/New_York

After analysing our global HTTP request logs, we finally found that another service was also wrongly generating a lot of requests... so it seems that the issue is on our side, sorry!
Anyway, we are still interested in any recommendations to avoid blacklisting.

christophermoellers
Posts: 4
Joined: Fri Jul 10, 2020 10:58 am America/New_York
Answers: 0

Re: Recommendations to avoid Blacklisting

by christophermoellers » Fri Jan 21, 2022 10:17 am America/New_York

NASA Ocean Color servers use "rate limiting" to protect the servers and allow fair access from all clients.

The 429 "Too Many Requests" response status code indicates that the user has sent too many requests in a given amount of time. Note that server-side rate limiting is dynamic and depends on the server load and the number of clients currently being serviced. So, at one time the rate limit might be sixty (60) requests per minute, while at another time it might be twenty (20) requests per minute.

Depending on your download client, you should be able to either limit requests so that the downloads avoid the 429 error code, or retry the download when a 429 is received.

For example, wget has an option to add a wait time if needed. "-w 5" would have wget sleep for five (5) seconds between download requests.
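For a download chain written in Python (the error quoted earlier in this thread appears to come from a Python HTTP client), the same fixed pause between requests could be sketched roughly as below. This is only an illustration, not something the Ocean Color servers require: the class name MinIntervalThrottle and its parameters are made up for this example, and the clock and sleep functions are injectable only so the logic can be checked without really waiting.

```python
import time


class MinIntervalThrottle:
    """Keep consecutive calls to wait() at least `interval` seconds apart.

    Hypothetical helper for illustration; it mirrors the idea of `wget -w 5`.
    """

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request; None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                # Too soon since the previous request: sleep for the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling throttle.wait() immediately before each HTTP request, with throttle = MinIntervalThrottle(5), spaces the requests of that one process by at least five seconds. It cannot account for other programs or people sharing the same IP address, so a 429 can still occur and should still be handled.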

Another option is to let wget watch for 429 "Too Many Requests" errors and sleep for a few seconds to get back below the server's dynamic rate limit. The following command instructs wget to retry the download when the server sends back a 429 "Too Many Requests" error, waiting up to ten (10) seconds between retries (wget backs off linearly, starting at one second).

wget --waitretry=10 --retry-on-http-error=429 <URL>

Check the wget man page for more details.
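If the downloads are driven from Python rather than wget, a comparable retry-on-429 loop might look like the sketch below. It uses only the standard library; the function name fetch_with_retry and the parameters max_retries and wait_cap are assumptions made for this example. It also assumes the server may or may not send a Retry-After header, so the header is used when it holds a whole number of seconds and a wget-style linear back-off (1 s, 2 s, ... up to wait_cap) is used otherwise.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, max_retries=5, wait_cap=10,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Return the body of `url`, retrying when the server answers 429.

    Illustrative sketch only: `opener` and `sleep` are injectable so the
    retry logic can be exercised without network access or real delays.
    """
    for attempt in range(1, max_retries + 2):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Give up on any other HTTP error, or once the retries are used up.
            if err.code != 429 or attempt > max_retries:
                raise
            retry_after = err.headers.get("Retry-After") if err.headers else None
            if retry_after is not None and retry_after.isdigit():
                delay = int(retry_after)        # server-suggested pause, in seconds
            else:
                delay = min(attempt, wait_cap)  # linear back-off: 1 s, 2 s, ...
            sleep(delay)
```

With the defaults, a file is attempted at most six times before the 429 is re-raised, so a persistent block surfaces as an error instead of an endless loop.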

Hope this helps.

Re: Recommendations to avoid Blacklisting

by dem1 » Tue Jan 25, 2022 10:30 am America/New_York

Thanks a lot, Christopher, for all this very instructive information!
