database queries killed?

by oo_processing » Mon Apr 01, 2019 11:11 am America/New_York

To avoid processing problems in a clustered computing environment, I am trying to download all the anc files in advance. I first run modis_atteph.py and getanc.py with the no-download option. When each one returns, I parse its output and check whether the listed files exist. If they do not, I run the command again without the no-download option. Here is some logging from the program. Note that even the no-download call produces an "Error! could not establish a network connection. Check your network connection." response, so I guess your firewall is killing these too? What is the best way to handle this? Note that I have a 5-second sleep before each call, and I still get errors. Thanks, Brock

$PDS:MOD00.A2004003.1530_1.PDS.bz2                                                                    
All required modis_atteph.py files are present                                                        
All required getanc.py files are present
                                                             
$PDS:MOD00.A2004003.1535_1.PDS.bz2                                                                    
All required modis_atteph.py files are present                                                        
All required getanc.py files are present
                                                              
$PDS:MOD00.A2004003.1540_1.PDS.bz2                                                                    
All required modis_atteph.py files are present                                                        
All required getanc.py files are present
                                                              
$PDS:MOD00.A2004003.1545_1.PDS.bz2                                                                    
Error! could not establish a network connection. Check your network connection.      (This is a first attempt with the no-download option)                  
If you do not find a problem, please try again later.                                                 
All required modis_atteph.py files are __NOT__ present                                                
Issuing this command: modis_atteph.py -m terra -s 2004003154500 -e 2004003154500 --timeout=30 (This is a second attempt, without the no-download option; it fails)
Error! could not establish a network connection. Check your network connection.                       
If you do not find a problem, please try again later.                                                 
All required getanc.py files are present

$PDS:MOD00.A2004003.1710_1.PDS.bz2    
All required modis_atteph.py files are present
All required getanc.py files are present    

$PDS:MOD00.A2004003.1715_1.PDS.bz2          
All required modis_atteph.py files are present
All required getanc.py files are present
    
$PDS:MOD00.A2004003.1720_1.PDS.bz2          
Error! could not establish a network connection. Check your network connection.     (This is a first attempt with the no-download option)   
If you do not find a problem, please try again later.                        
All required modis_atteph.py files are present                               
All required getanc.py files are __NOT__ present                             
Issuing this command: getanc.py -m terra -s 2004003172000 -e 2004003172000 --timeout=30 (This is a second attempt, without the no-download option; it works)
icefile=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/003/N200400300_SEAICE_NSIDC_24h.hdf
met1=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/003/N200400312_MET_NCEPR2_6h.hdf   
met2=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/003/N200400318_MET_NCEPR2_6h.hdf
met3=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/003/N200400318_MET_NCEPR2_6h.hdf
ozone1=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/003/N200400300_O3_EPTOMS_24h.hdf
ozone2=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/004/N200400400_O3_EPTOMS_24h.hdf
ozone3=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/004/N200400400_O3_EPTOMS_24h.hdf
sstfile=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/003/N2004003_SST_OIV2AVAM_24h.nc
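The parse-and-check step described above could look roughly like the following. This is only a sketch: it assumes the scripts print one `key=/path` pair per line, as in the log, and the helper names `parse_anc_output` and `missing_files` are made up for illustration, not part of OCSSW.

```python
import os


def parse_anc_output(text):
    """Collect 'key=/path' pairs from getanc.py / modis_atteph.py style output.

    Assumption: file entries look like 'met1=/some/path', as in the log above.
    Error messages and 'Issuing this command: ...' lines are skipped.
    """
    files = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        # Keys such as 'icefile' or 'ozone2' are plain alphanumeric tokens;
        # anything with spaces (e.g. a logged command line) is not an entry.
        if sep and key.isalnum() and value.startswith("/"):
            files[key] = value
    return files


def missing_files(files, exists=os.path.exists):
    """Return the sorted, de-duplicated paths that are not on disk."""
    return sorted({path for path in files.values() if not exists(path)})
```

Several keys can point at the same file (met2 and met3 in the log above), so the missing list is de-duplicated before it is returned.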

by OB.DAAC - SeanBailey » Wed Apr 03, 2019 8:29 am America/New_York

Brock,
First, I'll remind you that the scripts delivered with SeaDAS are intended for individual users, NOT a distributed cluster.

It is not the size of the files or the type of request, but rather the number of requests you're making in a given span of time.
It is a waste of time to use the no-download option, only to then immediately call the script again with downloading enabled. All you're doing
is increasing the number of connections you have to make to our system. If you also have the --refresh-db option set,
then you'll be forcing the script to query our database and ignore the local cache. Again, this unnecessarily increases the
number of connections you'll be making.
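One way to act on this is to call each script once per granule, with downloading enabled, and to retry only on failure with a growing pause. The sketch below is an illustration, not an OB.DAAC recommendation: the backoff policy, the delay values, and the `run_with_backoff` helper are assumptions, and failure is detected by looking for the error text shown in the log, since the scripts' exit codes are not documented in this thread.

```python
import subprocess
import time

NETWORK_ERROR = "could not establish a network connection"


def run_with_backoff(cmd, max_attempts=4, base_delay=30.0,
                     run=subprocess.run, sleep=time.sleep):
    """Run cmd; on failure wait base_delay, 2*base_delay, ... and retry.

    Assumption: a failed call either exits non-zero or prints the
    'could not establish a network connection' message seen in the log.
    """
    for attempt in range(max_attempts):
        result = run(cmd, capture_output=True, text=True)
        output = result.stdout + result.stderr
        if result.returncode == 0 and NETWORK_ERROR not in output:
            return result.stdout
        if attempt < max_attempts - 1:
            # Double the pause after every failed attempt.
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("giving up after %d attempts: %s"
                       % (max_attempts, " ".join(cmd)))
```

For example, `run_with_backoff(["getanc.py", "-m", "terra", "-s", "2004003172000", "-e", "2004003172000"])` makes one connection when the call succeeds, instead of the two made by a no-download check followed by a download.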

You may want to look into using the ancDBmysql.py module in place of the ancDB.py module (look under $OCSSWROOT/scripts/modules).
It is an example of how to use a MySQL database for storing the ancillary metadata instead of the SQLite default (which doesn't play nice with
multiple connections). Getting it to work is left as an exercise for the user. If you don't want to use MySQL, the script should be easy to rewrite to
use another DB option.

Sean

by oo_processing » Wed Apr 03, 2019 12:16 pm America/New_York

Sean,

I am an individual running on the cluster. No one else is currently using it for OCSSW processing.

I had changed my script (yesterday) to only make two calls per PDS with 3 seconds between PDS files in the loop.

So for example PDS file MOD00.P2004086.1640_1.PDS.bz2  results in these 2 calls at the bottom of the post, and everything seems to be working fine in terms of downloading the missing anc files. I guess my question is this:

If I have downloaded all the L0 PDS files, plus the anc data for both modis_atteph.py and getanc.py, is there a way for me to process that doesn't require me to touch your system? Can I process without internet access? In the past we set a local anc database like this, because of permissions and file locking. Does that affect or trigger a NASA query?

modis_GEO.py $extracted_l1a_file_w_vdir_path -o $extracted_geo_file_w_vdir_path --threshold=95 --ancdb=./ancillary_data.db --enable-dem

Example from above:

modis_atteph.py -m aqua -s 2004086164000 -e 2004086164000
returns:
att1=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/086/PM1ATTNR.P2004086.1600.003
eph1=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/086/PM1EPHND.P2004086.1200.001

getanc.py -m aqua -s 2004086164000 -e 2004086164000
returns:
icefile=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/086/N200408600_SEAICE_NSIDC_24h.hdf
met1=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/086/N200408612_MET_NCEPR2_6h.hdf
met2=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/086/N200408618_MET_NCEPR2_6h.hdf
met3=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/086/N200408618_MET_NCEPR2_6h.hdf
ozone1=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/086/N200408600_O3_EPTOMS_24h.hdf
ozone2=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/087/N200408700_O3_EPTOMS_24h.hdf
ozone3=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/087/N200408700_O3_EPTOMS_24h.hdf
sstfile=/shares/cms_optics/apps/seadas/seadas-7.5/ocssw/var/anc/2004/086/N2004086_SST_OIV2AVAM_24h.nc

by OB.DAAC - SeanBailey » Wed Apr 03, 2019 8:02 pm America/New_York

Brock, as an individual user you have at least 5 IPs that are the TOP 5 in terms of number of connections to our system, by a wide margin.
So, whatever you are doing is NOT typical of our average user.

If you do not have a local ancDB populated for the granules you are processing, then running getanc.py WILL connect to our system, at least to identify the granule-specific ancillary data.
Any subsequent calls will not connect to our servers - this is the whole point behind the ancDB. If you are using more than one ancDB (like having a local one per process stream) then, yes, every time it runs it will connect to our servers. This is why I suggested you use a thread-capable approach, like ancDBmysql.py.
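Since each granule only needs one lookup per ancDB, a single serial pre-population pass against one shared ancDB could be driven by a de-duplicated command list built from the PDS file names. The sketch below is an illustration under assumptions: the `A` to terra and `P` to aqua mapping and the start-time format are inferred from the examples in this thread, and `granule_key` and `anc_commands` are hypothetical helper names.

```python
import re

# Matches names such as MOD00.A2004003.1545_1.PDS.bz2 (YYYYDDD, then HHMM).
PDS_NAME = re.compile(r"MOD00\.([AP])(\d{7})\.(\d{4})_\d+\.PDS")

# Assumption inferred from the examples in this thread.
MISSION = {"A": "terra", "P": "aqua"}


def granule_key(pds_name):
    """Return (mission, start_time) for a PDS file name."""
    match = PDS_NAME.search(pds_name)
    if match is None:
        raise ValueError("unrecognized PDS name: %r" % pds_name)
    platform, date, hhmm = match.groups()
    return MISSION[platform], date + hhmm + "00"


def anc_commands(pds_names):
    """Build one modis_atteph.py and one getanc.py call per unique granule."""
    seen = set()
    commands = []
    for name in pds_names:
        key = granule_key(name)
        if key in seen:
            continue  # already queued; avoid a second connection
        seen.add(key)
        mission, start = key
        for script in ("modis_atteph.py", "getanc.py"):
            commands.append([script, "-m", mission, "-s", start, "-e", start])
    return commands
```

Running the resulting commands one after another from a single process, before the cluster jobs start, keeps all lookups in one ancDB so that later processing calls can be served from the local cache.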

Sean
