OCSSW ancillary data server unresponsive
-
- Posts: 25
- Joined: Tue Aug 09, 2005 12:58 pm America/New_York
OCSSW ancillary data server unresponsive
The OCSSW ancillary data server has been unresponsive for more than 48 hours.
Example 1: The "Direct Data Access" link at https://oceancolor.gsfc.nasa.gov/data/find-data/ is unresponsive, i.e.
https://oceandata.sci.gsfc.nasa.gov/directdataaccess/
Example 2: Scripted ancillary data download is unresponsive (it times out) as shown below
[oper@leodbp1 ~]$ time getanc -s 2024200000000 --verbose
ancillary_data.db
Searching database: /home/oper/dbvm/apps/ocssw/var/log/ancillary_data.db
Input file: None
Sensor : None
Start time: 2024-07-18T00:00:00
End time : None
OBPG session started
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 385, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 381, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib64/python3.6/http/client.py", line 1365, in getresponse
response.begin()
File "/usr/lib64/python3.6/http/client.py", line 320, in begin
version, status, reason = self._read_status()
File "/usr/lib64/python3.6/http/client.py", line 281, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib64/python3.6/socket.py", line 586, in readinto
return self._sock.recv_into(b)
File "/usr/lib64/python3.6/ssl.py", line 1005, in recv_into
return self.read(nbytes, buffer)
File "/usr/lib64/python3.6/ssl.py", line 867, in read
return self._sslobj.read(len, buffer)
File "/usr/lib64/python3.6/ssl.py", line 590, in read
v = self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 387, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 307, in _raise_timeout
raise ReadTimeoutError(self, url, "Read timed out. (read timeout=%s)" % timeout_value)
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='oceandata.sci.gsfc.nasa.gov', port=443): Read timed out. (read timeout=10.0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 668, in urlopen
**response_kw)
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 668, in urlopen
**response_kw)
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 668, in urlopen
**response_kw)
[Previous line repeated 2 more times]
File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 399, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='oceandata.sci.gsfc.nasa.gov', port=443): Max retries exceeded with url: /api/anc_data_api/?&m=0&s=2024-07-18T00:00:00&e=2024-07-18T00:05:00&missing_tags=1 (Caused by ReadTimeoutError("HTTPSConnectionPool(host='oceandata.sci.gsfc.nasa.gov', port=443): Read timed out. (read timeout=10.0)",))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/oper/dbvm/apps/ocssw/bin/getanc", line 149, in <module>
exit(main())
File "/home/oper/dbvm/apps/ocssw/bin/getanc", line 142, in main
g.findweb()
File "/data1/oper/dbvm/apps/ocssw/bin/seadasutils/anc_utils.py", line 511, in findweb
verbose=self.verbose
File "/data1/oper/dbvm/apps/ocssw/bin/seadasutils/ProcUtils.py", line 105, in httpdl
with obpgSession.get(urlStr, stream=True, timeout=timeout, headers=headers) as req:
File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 548, in get
return self.request('GET', url, **kwargs)
File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 535, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 648, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='oceandata.sci.gsfc.nasa.gov', port=443): Max retries exceeded with url: /api/anc_data_api/?&m=0&s=2024-07-18T00:00:00&e=2024-07-18T00:05:00&missing_tags=1 (Caused by ReadTimeoutError("HTTPSConnectionPool(host='oceandata.sci.gsfc.nasa.gov', port=443): Read timed out. (read timeout=10.0)",))
real 1m0.497s
user 0m0.131s
sys 0m0.013s
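A rough consistency check on the timing above: the traceback reports read timeout=10.0 and shows five nested urlopen retry frames, so six attempts at about 10 seconds each would account for the roughly one-minute wall time. Below is a minimal sketch of that arithmetic. The function name is hypothetical, and the retry count is inferred from the traceback rather than taken from getanc's configuration.

```python
def worst_case_seconds(read_timeout, retries):
    """Upper bound on wall time when every attempt hits the read timeout.

    Ignores connect time and any backoff sleep between attempts.
    """
    attempts = retries + 1  # the initial request plus each retry
    return attempts * read_timeout


# Values read off the traceback above: read timeout=10.0, and five
# recursive urlopen frames, which suggests five retries (an inference).
print(worst_case_seconds(10.0, 5))
```

If that reading is right, raising the per-request timeout would lengthen the total wait in proportion while the server stays unresponsive.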
-
- Posts: 338
- Joined: Wed Apr 06, 2005 12:11 pm America/New_York
- Has thanked: 10 times
- Been thanked: 3 times
Re: OCSSW ancillary data server unresponsive
Me, too.
Using --timeout=60 didn't solve the issue.
The getanc and modis_atteph commands took ten to a hundred times longer than the timeout setting to finish or return an error message. In the last 24 hours, I made only about 210 requests.
I tried to post earlier but got a "Web Page Blocked!" error.
Yuyuan
-
- Posts: 396
- Joined: Mon Jun 22, 2020 5:24 pm America/New_York
- Has thanked: 8 times
- Been thanked: 8 times
Re: OCSSW ancillary data server unresponsive
Thanks for reporting your findings! The team has begun working on a solution to improve the response times.
-
- Posts: 338
- Joined: Wed Apr 06, 2005 12:11 pm America/New_York
- Has thanked: 10 times
- Been thanked: 3 times
Re: OCSSW ancillary data server unresponsive
I also found connections to the subscription API freezing. The command looked like this:
Code: Select all
curl --silent -L --connect-timeout 5 --retry 5 --retry-max-time 40 -d "subID=1067&sdate=2024-08-26 00:00:00&edate=2024-08-28 23:59:59&results_as_file=1" https://oceandata.sci.gsfc.nasa.gov/api/file_search
The command could freeze when my program called it. When I ran it again in a console, the results came back, but my program stayed frozen. I had to kill the curl process by pid before my program could move on.
I wonder whether this and the anc query issue are related. While I am asking our IT guys to check whether it's a network issue on our side, please let me know what you find on your side.
Thanks
Yuyuan
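For a calling program that hangs on a stuck child process like this, one option is to have the caller enforce a hard deadline instead of killing the pid by hand. Here is a minimal sketch, assuming the caller is written in Python. The helper name run_with_deadline is hypothetical, and a sleeping child stands in for the frozen request.

```python
import subprocess
import sys


def run_with_deadline(cmd, seconds):
    """Run cmd, returning its stdout, or None if it exceeds the deadline.

    subprocess.run kills the child process when the timeout expires,
    so the caller never has to look up and kill a pid by hand.
    """
    try:
        done = subprocess.run(cmd, stdout=subprocess.PIPE, timeout=seconds)
    except subprocess.TimeoutExpired:
        return None
    return done.stdout


# Stand-in for a frozen request: a child that sleeps past the deadline.
stuck = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_with_deadline(stuck, 0.5))
```

A None result can then be treated as "skip this request and try again later".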
-
- Posts: 25
- Joined: Tue Aug 09, 2005 12:58 pm America/New_York
Re: OCSSW ancillary data server unresponsive
To be clear, the problem is not that the ancillary data server is slow. The problem is that the ancillary data server is unresponsive; it does not return any data. It just times out.
-
- Posts: 1519
- Joined: Wed Sep 18, 2019 6:15 pm America/New_York
- Been thanked: 9 times
Re: OCSSW ancillary data server unresponsive
Liam,
Yes, there is an issue that we've not yet rooted out. While not an ideal solution, you can increase the timeout period with getanc (e.g. --timeout=90; the default is 30). It should never take 90 seconds, or even 30, or even 1, but it has been. We'll keep roto-rootering until we find and clear the clog, but I have no idea how long it will take...it's a frustrating one...
Oh and to respond to Yuyuan, it's not just the ancillary service...
Sean
-
- Posts: 338
- Joined: Wed Apr 06, 2005 12:11 pm America/New_York
- Has thanked: 10 times
- Been thanked: 3 times
Re: OCSSW ancillary data server unresponsive
oo_processing wrote: ↑Wed Aug 28, 2024 9:28 am America/New_York
I also found connections to the subscription API freezing. The command looked like this:
Code: Select all
curl --silent -L --connect-timeout 5 --retry 5 --retry-max-time 40 -d "subID=1067&sdate=2024-08-26 00:00:00&edate=2024-08-28 23:59:59&results_as_file=1" https://oceandata.sci.gsfc.nasa.gov/api/file_search
The command could freeze when my program called it. When I ran it again in a console, the results came back, but my program stayed frozen. I had to kill the curl process by pid before my program could move on.
I wonder whether this and the anc query issue are related. While I am asking our IT guys to check whether it's a network issue on our side, please let me know what you find on your side.
Thanks
Yuyuan
For subscription, my temporary solution is to force curl to stop at 60 seconds:
Code: Select all
curl --max-time 60 ...
Because the freeze is random, if I don't get this subID now, I will probably get it in the next hour.
Yuyuan
-
- Posts: 338
- Joined: Wed Apr 06, 2005 12:11 pm America/New_York
- Has thanked: 10 times
- Been thanked: 3 times
Re: OCSSW ancillary data server unresponsive
Looks like this has been fixed. Just out of curiosity, did you find out the cause?
Yuyuan
-
- Posts: 1519
- Joined: Wed Sep 18, 2019 6:15 pm America/New_York
- Been thanked: 9 times
Re: OCSSW ancillary data server unresponsive
Yes, it has been fixed.
We've made a lot of under-the-hood changes. Not all of them were related to this problem, but together they allowed us to get past it. We believe the issue boiled down to a bad index on a database table.
Sean
