Downloading specific time data from MERRA-2
-
- Posts: 8
- Joined: Tue Jul 16, 2024 1:45 pm America/New_York
- Been thanked: 1 time
I am researching dust storm events in Arizona and am trying to download temperature, wind speed, and humidity for a specific time from M2T1NXFLX. The product link is below:
https://disc.gsfc.nasa.gov/datasets/M2T1NXFLX_5.12.4/summary
I created the attached Python script (run from a virtual environment on Ubuntu). I followed the instructions for "Example #2" on https://disc.gsfc.nasa.gov/information/howto?keywords=level%203&title=How%20to%20Use%20the%20Web%20Services%20API%20for%20Subsetting%20MERRA-2%20Data
The coordinates are all in Arizona. The MERRA-2 product time is in UTC. In the attached example, we are looking for information at 12:30 PM Arizona time on January 17, 1996, which translates to 19:30 UTC. Thus the "begHour" and "endHour" variables are set to 19:30.
I also commented the "diurnalAggregation" setting in and out to see if I could get the specific-time data we were looking for, but that did not help.
We need to download about 650 date/time instances, and it is important for us to have information for the specific time of day.
Could someone explain why I cannot get the specific-time-of-day information in the .nc file?
Thank you,
Soheil
The Python script below returns results for the whole day of January 17 instead of the specific time mentioned above.
# STEP 1
import sys
import json
import urllib3
import certifi
import requests
from time import sleep
from http.cookiejar import CookieJar
import urllib.request
from urllib.parse import urlencode
import getpass
# STEP 2
# Create a urllib PoolManager instance to make requests.
http = urllib3.PoolManager(cert_reqs='CERT_REQUIRED',ca_certs=certifi.where())
# Set the URL for the GES DISC subset service endpoint
url = 'https://disc.gsfc.nasa.gov/service/subset/jsonwsp'
# STEP 3
# This method POSTs formatted JSON WSP requests to the GES DISC endpoint URL
# It is created for convenience since this task will be repeated more than once
def get_http_data(request):
hdrs = {'Content-Type': 'application/json',
'Accept' : 'application/json'}
data = json.dumps(request)
r = http.request('POST', url, body=data, headers=hdrs)
response = json.loads(r.data)
# Check for errors
if response['type'] == 'jsonwsp/fault' :
print('API Error: faulty %s request' % response['methodname'])
sys.exit(1)
return response
# STEP 4
# Define the parameters for the data subset
# SPEED = surface wind speed
# SPEEDMAX = maximum surface wind speed
# TLML = surface air temperature
# QLML = surface specific humidity
# QSH = effective surface specific humidity
# HLML = surface layer height
product = 'M2T1NXFLX_V5.12.4'
varNames =['SPEED', 'SPEEDMAX', 'TLML', 'QLML', 'QSH', 'HLML']
minlon = -109.608682
maxlon = -108.983682
minlat = 32.8008966
maxlat = 33.3008966
begTime = '1996-01-17'
endTime = '1996-01-17'
begHour = '19:30'
endHour = '19:30'
# Toggling diurnalAggregation to see if we receive hourly values per day
diurnalAggregation = '1'
interp = 'remapbil'
destGrid = 'cfsr0.5a'
# STEP 5
# Construct JSON WSP request for API method: subset
subset_request = {
'methodname': 'subset',
'type': 'jsonwsp/request',
'version': '1.0',
'args': {
'role' : 'subset',
'start' : begTime + ' ' + begHour,
'end' : endTime + ' ' + endHour,
'box' : [minlon, minlat, maxlon, maxlat],
'crop' : True,
'diurnalAggregation': diurnalAggregation,
'mapping': interp,
'grid' : destGrid,
'data': [{'datasetId': product,
'variable' : varNames[0]
},
{'datasetId': product,
'variable' : varNames[1]
},
{'datasetId': product,
'variable' : varNames[2]
},
{'datasetId': product,
'variable' : varNames[3]
},
{'datasetId': product,
'variable' : varNames[4]
},
{'datasetId': product,
'variable' : varNames[5]
}]
}
}
# STEP 6
# Submit the subset request to the GES DISC Server
response = get_http_data(subset_request)
# Report the JobID and initial status
myJobId = response['result']['jobId']
print('Job ID: '+myJobId)
print('Job status: '+response['result']['Status'])
# STEP 7
# Construct JSON WSP request for API method: GetStatus
status_request = {
'methodname': 'GetStatus',
'version': '1.0',
'type': 'jsonwsp/request',
'args': {'jobId': myJobId}
}
# Check on the job status after a brief nap
while response['result']['Status'] in ['Accepted', 'Running']:
sleep(5)
response = get_http_data(status_request)
status = response['result']['Status']
percent = response['result']['PercentCompleted']
print ('Job status: %s (%d%c complete)' % (status,percent,'%'))
if response['result']['Status'] == 'Succeeded' :
print ('Job Finished: %s' % response['result']['message'])
else :
print('Job Failed: %s' % response['fault']['code'])
sys.exit(1)
# STEP 8 (Plan A - preferred)
# Construct JSON WSP request for API method: GetResult
batchsize = 20
results_request = {
'methodname': 'GetResult',
'version': '1.0',
'type': 'jsonwsp/request',
'args': {
'jobId': myJobId,
'count': batchsize,
'startIndex': 0
}
}
# Retrieve the results in JSON in multiple batches
# Initialize variables, then submit the first GetResults request
# Add the results from this batch to the list and increment the count
results = []
count = 0
response = get_http_data(results_request)
count = count + response['result']['itemsPerPage']
results.extend(response['result']['items'])
# Increment the startIndex and keep asking for more results until we have them all
total = response['result']['totalResults']
while count < total :
results_request['args']['startIndex'] += batchsize
response = get_http_data(results_request)
count = count + response['result']['itemsPerPage']
results.extend(response['result']['items'])
# Check on the bookkeeping
print('Retrieved %d out of %d expected items' % (len(results), total))
# Sort the results into documents and URLs
docs = []
urls = []
for item in results :
try:
if item['start'] and item['end'] : urls.append(item)
except:
docs.append(item)
# Print out the documentation links, but do not download them
# print('\nDocumentation:')
# for item in docs : print(item['label']+': '+item['link'])
# STEP 10
# Use the requests library to submit the HTTP_Services URLs and write out the results.
print('\nHTTP_services output:')
for item in urls :
URL = item['link']
result = requests.get(URL)
try:
result.raise_for_status()
outfn = item['label']
f = open(outfn,'wb')
f.write(result.content)
f.close()
print(outfn, "is downloaded")
except:
print('Error! Status code is %d for this URL:\n%s' % (result.status.code,URL))
print('Help for downloading data is at https://disc.gsfc.nasa.gov/data-access')
print('Downloading is done; find the downloaded files in your current working directory')
-
- Subject Matter Expert
- Posts: 22
- Joined: Wed Feb 16, 2022 4:38 pm America/New_York
- Has thanked: 1 time
Re: Downloading specific time data from MERRA-2
Hello,
In order to subset on an hourly basis, a few parameters need to be included in the subset request, and certain parameters like the start and end times must be formatted correctly. In addition to the 'start' and 'end' parameters, the 'diurnalFrom' and 'diurnalTo' parameters must be passed, which contain the HH:MM of the hours to be subsetted. Finally, the 'diurnalAggregation' parameter must be set to 'none'. Please use these parameters in your request and try again:
product = 'M2T1NXFLX_V5.12.4'
varNames =['SPEED', 'SPEEDMAX', 'TLML', 'QLML', 'QSH', 'HLML']
minlon = -109.608682
maxlon = -108.983682
minlat = 32.8008966
maxlat = 33.3008966
begTime = '1996-01-17T00:00:00Z'
endTime = '1996-01-17T23:00:00Z'
begHour = '00:30'
endHour = '00:30'
interp = 'remapbil'
destGrid = 'cfsr0.5a'
# STEP 5
# Construct JSON WSP request for API method: subset
subset_request = {
'methodname': 'subset',
'type': 'jsonwsp/request',
'version': '1.0',
'args': {
'role' : 'subset',
'start' : begTime,
'end' : endTime,
'diurnalFrom' : begHour,
'diurnalTo' : endHour,
"diurnalAggregation": "none",
'box' : [minlon, minlat, maxlon, maxlat],
'crop' : True,
'mapping': interp,
'grid' : destGrid,
'data': [{'datasetId': product,
'variable' : varNames[0]
},
{'datasetId': product,
'variable' : varNames[1]
},
{'datasetId': product,
'variable' : varNames[2]
},
{'datasetId': product,
'variable' : varNames[3]
},
{'datasetId': product,
'variable' : varNames[4]
},
{'datasetId': product,
'variable' : varNames[5]
}]
}
}
These parameters are not included in the existing tutorial that you referenced for sub-daily requests; our apologies for not including them. We will add them to the existing documentation. If you wish to use any other parameters, please consult the API reference: https://disc.gsfc.nasa.gov/service/subset
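Since about 650 date/time instances need to be downloaded, the diurnal parameters above can be generated from each event's local time in a small loop. A minimal sketch, assuming each event is given as a naive Arizona-local datetime; the build_diurnal_args helper name is hypothetical:

```python
from datetime import datetime, timedelta, timezone

# Arizona (outside the Navajo Nation) stays on MST (UTC-7) year-round,
# so a fixed offset is sufficient for this use case.
MST = timezone(timedelta(hours=-7))

def build_diurnal_args(local_dt):
    """Hypothetical helper: turn a local Arizona datetime into the
    start/end/diurnalFrom/diurnalTo values used by the subset request."""
    utc_dt = local_dt.replace(tzinfo=MST).astimezone(timezone.utc)
    day = utc_dt.strftime('%Y-%m-%d')
    hour = utc_dt.strftime('%H:%M')
    return {
        'start': day + 'T00:00:00Z',
        'end': day + 'T23:00:00Z',
        'diurnalFrom': hour,
        'diurnalTo': hour,
    }

# Example: 12:30 PM Arizona time on 1996-01-17 -> 19:30 UTC
args = build_diurnal_args(datetime(1996, 1, 17, 12, 30))
print(args)
```

The returned dictionary can then be merged into the 'args' of each subset_request before submitting the job.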
-
- Posts: 8
- Joined: Tue Jul 16, 2024 1:45 pm America/New_York
- Been thanked: 1 time
Re: Downloading specific time data from MERRA-2
Thank you, it worked perfectly!
One last question, please:
Since the MERRA-2 information is based on UTC, if I want the information for 12:30 PM Arizona time, the begHour and endHour should be 19:30. Is that a correct statement? That is, begHour and endHour should always be the UTC conversion of the Arizona time (UTC = Arizona time + 7 hours).
GES DISC - cbattisto wrote:
> Hello,
>
> In order to subset on an hourly basis, there are a few parameters that need
> to be included in the subset request, as well as insuring that certain
> parameters like start and end times are formatted correctly. In addition to
> the 'start' and 'end' parameters, the 'diurnalFrom' and 'diurnalTo'
> parameters must be passed, which contain the HH:MM of the hours to be
> subsetted. Finally, the 'diurnalAggregation' parameter must be set to
> 'none'. Please use these parameters in your request and try again:
>
> product = 'M2T1NXFLX_V5.12.4'
> varNames =['SPEED', 'SPEEDMAX', 'TLML', 'QLML', 'QSH', 'HLML']
> minlon = -109.608682
> maxlon = -108.983682
> minlat = 32.8008966
> maxlat = 33.3008966
> begTime = '1996-01-17T00:00:00Z'
> endTime = '1996-01-17T23:00:00Z'
> begHour = '00:30'
> endHour = '00:30'
>
> interp = 'remapbil'
> destGrid = 'cfsr0.5a'
>
>
> # STEP 5
> # Construct JSON WSP request for API method: subset
> subset_request = {
> 'methodname': 'subset',
> 'type': 'jsonwsp/request',
> 'version': '1.0',
> 'args': {
> 'role' : 'subset',
> 'start' : begTime,
> 'end' : endTime,
> 'diurnalFrom' : begHour,
> 'diurnalTo' : endHour,
> "diurnalAggregation": "none",
> 'box' : [minlon, minlat, maxlon, maxlat],
> 'crop' : True,
> 'mapping': interp,
> 'grid' : destGrid,
> 'data': [{'datasetId': product,
> 'variable' : varNames[0]
> },
> {'datasetId': product,
> 'variable' : varNames[1]
> },
> {'datasetId': product,
> 'variable' : varNames[2]
> },
> {'datasetId': product,
> 'variable' : varNames[3]
> },
> {'datasetId': product,
> 'variable' : varNames[4]
> },
> {'datasetId': product,
> 'variable' : varNames[5]
> }]
> }
> }
>
> These very specific parameters are not included in the existing tutorial
> that you referenced for sub-daily requests; our apologies for not including
> those. We will add those to the existing documentation. If you wish to use
> any other parameters, please consult the API reference:
> https://disc.gsfc.nasa.gov/service/subset
-
- Subject Matter Expert
- Posts: 22
- Joined: Wed Feb 16, 2022 4:38 pm America/New_York
- Has thanked: 1 time
Re: Downloading specific time data from MERRA-2
Great, I'm glad to hear!
Yes, in your use case you will need to convert to UTC by adding 7 hours.
You can also use the Python "pytz" and "datetime" libraries, which will perform this conversion automatically. Here's an example of getting the UTC hour for 12:30 PM local time on 1996-01-17:
from datetime import datetime
import pytz
# Define Arizona timezone (MST year-round)
arizona_tz = pytz.timezone('America/Phoenix')
# Define local time
local_time = datetime(1996, 1, 17, 12, 30) # Jan 17, 1996 at 12:30 PM Arizona time
# Localize and convert to UTC
local_time = arizona_tz.localize(local_time)
utc_time = local_time.astimezone(pytz.utc)
# Print UTC time in HH:MM format
begHour = utc_time.strftime('%H:%M')
print("UTC Time:", begHour)
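On Python 3.9+, the same conversion can also be done with the standard-library zoneinfo module, avoiding the pytz dependency. A sketch of an equivalent approach:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

# Jan 17, 1996 at 12:30 PM Arizona time (America/Phoenix is on MST year-round)
local_time = datetime(1996, 1, 17, 12, 30, tzinfo=ZoneInfo('America/Phoenix'))
utc_time = local_time.astimezone(timezone.utc)

# Print UTC time in HH:MM format
begHour = utc_time.strftime('%H:%M')
print("UTC Time:", begHour)  # prints "UTC Time: 19:30"
```

Note that zoneinfo timezones are attached directly via tzinfo=, with no separate localize() step as in pytz.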
-
- Posts: 8
- Joined: Tue Jul 16, 2024 1:45 pm America/New_York
- Been thanked: 1 time
Re: Downloading specific time data from MERRA-2
Great! Thank you so much for the information and for responding to this post so quickly!
Regards
-
- Posts: 8
- Joined: Tue Jul 16, 2024 1:45 pm America/New_York
- Been thanked: 1 time
Re: Downloading specific time data from MERRA-2
I have one more issue, please.
Could you please have a look at this? I ran the code below (the city I am interested in is Picacho, Arizona):
# STEP 1
import sys
import json
import urllib3
import certifi
import requests
from time import sleep
from http.cookiejar import CookieJar
import urllib.request
from urllib.parse import urlencode
import getpass
# STEP 2
# Create a urllib PoolManager instance to make requests.
http = urllib3.PoolManager(cert_reqs='CERT_REQUIRED',ca_certs=certifi.where())
# Set the URL for the GES DISC subset service endpoint
url = 'https://disc.gsfc.nasa.gov/service/subset/jsonwsp'
# STEP 3
# This method POSTs formatted JSON WSP requests to the GES DISC endpoint URL
# It is created for convenience since this task will be repeated more than once
def get_http_data(request):
hdrs = {'Content-Type': 'application/json',
'Accept' : 'application/json'}
data = json.dumps(request)
r = http.request('POST', url, body=data, headers=hdrs)
response = json.loads(r.data)
# Check for errors
if response['type'] == 'jsonwsp/fault' :
print('API Error: faulty %s request' % response['methodname'])
sys.exit(1)
return response
# STEP 4
# Define the parameters for the data subset
# SPEED = surface wind speed
# SPEEDMAX = maximum surface wind speed
# TLML = surface air temperature
# QLML = surface specific humidity
# QSH = effective surface specific humidity
# HLML = surface layer height
product = 'M2T1NXFLX_V5.12.4'
varNames =['SPEED', 'SPEEDMAX', 'TLML', 'QLML', 'QSH', 'HLML']
minlon = -111.8078971
maxlon = -111.1828971
minlat = 32.466176
maxlat = 32.966176
begTime = '1997-08-02T00:00:00Z'
endTime = '1997-08-02T23:00:00Z'
begHour = '23:30'
endHour = '23:30'
# Disabling diurnalAggregation to see if we receive hourly values per day
#diurnalAggregation = '1'
interp = 'remapbil'
destGrid = 'cfsr0.5a'
# STEP 5
# Construct JSON WSP request for API method: subset
subset_request = {
'methodname': 'subset',
'type': 'jsonwsp/request',
'version': '1.0',
'args': {
'role' : 'subset',
'start' : begTime,
'end' : endTime,
'diurnalFrom' : begHour,
'diurnalTo' : endHour,
"diurnalAggregation": "none",
'box' : [minlon, minlat, maxlon, maxlat],
'crop' : True,
'mapping': interp,
'grid' : destGrid,
'data': [{'datasetId': product,
'variable' : varNames[0]
},
{'datasetId': product,
'variable' : varNames[1]
},
{'datasetId': product,
'variable' : varNames[2]
},
{'datasetId': product,
'variable' : varNames[3]
},
{'datasetId': product,
'variable' : varNames[4]
},
{'datasetId': product,
'variable' : varNames[5]
}]
}
}
# STEP 6
# Submit the subset request to the GES DISC Server
response = get_http_data(subset_request)
# Report the JobID and initial status
myJobId = response['result']['jobId']
print('Job ID: '+myJobId)
print('Job status: '+response['result']['Status'])
# STEP 7
# Construct JSON WSP request for API method: GetStatus
status_request = {
'methodname': 'GetStatus',
'version': '1.0',
'type': 'jsonwsp/request',
'args': {'jobId': myJobId}
}
# Check on the job status after a brief nap
while response['result']['Status'] in ['Accepted', 'Running']:
sleep(5)
response = get_http_data(status_request)
status = response['result']['Status']
percent = response['result']['PercentCompleted']
print ('Job status: %s (%d%c complete)' % (status,percent,'%'))
if response['result']['Status'] == 'Succeeded' :
print ('Job Finished: %s' % response['result']['message'])
else :
print('Job Failed: %s' % response['fault']['code'])
sys.exit(1)
# STEP 8 (Plan A - preferred)
# Construct JSON WSP request for API method: GetResult
batchsize = 20
results_request = {
'methodname': 'GetResult',
'version': '1.0',
'type': 'jsonwsp/request',
'args': {
'jobId': myJobId,
'count': batchsize,
'startIndex': 0
}
}
# Retrieve the results in JSON in multiple batches
# Initialize variables, then submit the first GetResults request
# Add the results from this batch to the list and increment the count
results = []
count = 0
response = get_http_data(results_request)
count = count + response['result']['itemsPerPage']
results.extend(response['result']['items'])
# Increment the startIndex and keep asking for more results until we have them all
total = response['result']['totalResults']
while count < total :
results_request['args']['startIndex'] += batchsize
response = get_http_data(results_request)
count = count + response['result']['itemsPerPage']
results.extend(response['result']['items'])
# Check on the bookkeeping
print('Retrieved %d out of %d expected items' % (len(results), total))
# Sort the results into documents and URLs
docs = []
urls = []
for item in results :
try:
if item['start'] and item['end'] : urls.append(item)
except:
docs.append(item)
# Print out the documentation links, but do not download them
# print('\nDocumentation:')
# for item in docs : print(item['label']+': '+item['link'])
# STEP 10
# Use the requests library to submit the HTTP_Services URLs and write out the results.
print('\nHTTP_services output:')
for item in urls :
URL = item['link']
result = requests.get(URL)
try:
result.raise_for_status()
outfn = item['label']
f = open(outfn,'wb')
f.write(result.content)
f.close()
print(outfn, "is downloaded")
except:
print('Error! Status code is %d for this URL:\n%s' % (result.status.code,URL))
print('Help for downloading data is at https://disc.gsfc.nasa.gov/data-access')
print('Downloading is done; find the downloaded files in your current working directory')
I got the error message below:
python3 Subsetting_MERRA-2_Data.py
Job ID: 67c5d11a080acf99d30be39c
Job status: Accepted
Job status: Succeeded (100% complete)
Job Finished: Complete (M2T1NXFLX_5.12.4)
Retrieved 2 out of 2 expected items
HTTP_services output:
Traceback (most recent call last):
File "/mnt/c/DevNet/NASA/Subsetting_MERRA-2_Data.py", line 190, in <module>
result.raise_for_status()
File "/mnt/c/DevNet/NASA/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: MERRA2_200.tavg1_2d_flx_Nx.19970802.SUB.nc not available for download for url: https://goldsmr4.gesdisc.eosdis.nasa.gov/daac-bin/OTF/HTTP_services.cgi?FILENAME=%2Fdata%2FMERRA2%2FM2T1NXFLX.5.12.4%2F1997%2F08%2FMERRA2_200.tavg1_2d_flx_Nx.19970802.nc4&VERSION=1.02&TIME=1997-08-02T23%3A30%3A00%2F1997-08-02T23%3A00%3A00&FORMAT=bmM0Lw&BBOX=32.466176%2C-111.8078971%2C32.966176%2C-111.1828971&LABEL=MERRA2_200.tavg1_2d_flx_Nx.19970802.SUB.nc&VARIABLES=SPEED%2CSPEEDMAX%2CTLML%2CQLML%2CQSH%2CHLML&DATASET_VERSION=5.12.4&SHORTNAME=M2T1NXFLX&SERVICE=L34RS_MERRA2&FLAGS=remapbil%2Ccfsr0.5a
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mnt/c/DevNet/NASA/Subsetting_MERRA-2_Data.py", line 197, in <module>
print('Error! Status code is %d for this URL:\n%s' % (result.status.code,URL))
^^^^^^^^^^^^^
AttributeError: 'Response' object has no attribute 'status'
Could you please tell me why I am getting this error?
Thank you very much.
Soheil
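A note on the second traceback: the requests Response object exposes the HTTP status as status_code, not status.code, so the error-handling print in the except branch itself raises the AttributeError. Separately, the 404 may stem from the requested TIME range being inverted (the failing URL shows 23:30:00 through 23:00:00, since endTime stops at 23:00:00Z while the diurnal hour is 23:30). A sketch of the download loop with corrected error reporting, assuming the same urls list built earlier in the script:

```python
import requests

def download_items(urls):
    """Download each HTTP_services result item; report failures without crashing."""
    for item in urls:
        URL = item['link']
        result = requests.get(URL)
        try:
            result.raise_for_status()
        except requests.HTTPError:
            # Response stores the HTTP status in .status_code (not .status.code,
            # which is what raised the AttributeError in the traceback above)
            print('Error! Status code is %d for this URL:\n%s' % (result.status_code, URL))
            print('Help for downloading data is at https://disc.gsfc.nasa.gov/data-access')
            continue
        # Write the subset file under the label provided by the service
        with open(item['label'], 'wb') as f:
            f.write(result.content)
        print(item['label'], 'is downloaded')
```

Catching requests.HTTPError specifically (instead of a bare except) also keeps unrelated errors, such as filesystem problems, from being misreported as download failures.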
Could you please have a look at this? I ran the below code (the city I am interested in is Picacho, ARIZONA)
# STEP 1
import sys
import json
import urllib3
import certifi
import requests
from time import sleep
from http.cookiejar import CookieJar
import urllib.request
from urllib.parse import urlencode
import getpass
# STEP 2
# Create a urllib PoolManager instance to make requests.
http = urllib3.PoolManager(cert_reqs='CERT_REQUIRED',ca_certs=certifi.where())
# Set the URL for the GES DISC subset service endpoint
url = 'https://disc.gsfc.nasa.gov/service/subset/jsonwsp'
# STEP 3
# This method POSTs formatted JSON WSP requests to the GES DISC endpoint URL
# It is created for convenience since this task will be repeated more than once
def get_http_data(request):
hdrs = {'Content-Type': 'application/json',
'Accept' : 'application/json'}
data = json.dumps(request)
r = http.request('POST', url, body=data, headers=hdrs)
response = json.loads(r.data)
# Check for errors
if response['type'] == 'jsonwsp/fault' :
print('API Error: faulty %s request' % response['methodname'])
sys.exit(1)
return response
# STEP 4
# Define the parameters for the data subset
# SPEED = surface wind speed
# SPEEDMAX = surface wind speed
# TLML = surface air temperature
# QLML = surface specific humidity
# QSH = effective surface specific humidity
# HLML = surface layer height
product = 'M2T1NXFLX_V5.12.4'
varNames =['SPEED', 'SPEEDMAX', 'TLML', 'QLML', 'QSH', 'HLML']
minlon = -111.8078971
maxlon = -111.1828971
minlat = 32.466176
maxlat = 32.966176
begTime = '1997-08-02T00:00:00Z'
endTime = '1997-08-02T23:00:00Z'
begHour = '23:30'
endHour = '23:30'
# Disabling diurnalAggregation to see if we receive hourly value per day
#diurnalAggregation = '1'
interp = 'remapbil'
destGrid = 'cfsr0.5a'
# STEP 5
# Construct JSON WSP request for API method: subset
subset_request = {
    'methodname': 'subset',
    'type': 'jsonwsp/request',
    'version': '1.0',
    'args': {
        'role': 'subset',
        'start': begTime,
        'end': endTime,
        'diurnalFrom': begHour,
        'diurnalTo': endHour,
        'diurnalAggregation': 'none',
        'box': [minlon, minlat, maxlon, maxlat],
        'crop': True,
        'mapping': interp,
        'grid': destGrid,
        # One entry per requested variable
        'data': [{'datasetId': product, 'variable': v} for v in varNames]
    }
}
# STEP 6
# Submit the subset request to the GES DISC Server
response = get_http_data(subset_request)
# Report the JobID and initial status
myJobId = response['result']['jobId']
print('Job ID: '+myJobId)
print('Job status: '+response['result']['Status'])
# STEP 7
# Construct JSON WSP request for API method: GetStatus
status_request = {
    'methodname': 'GetStatus',
    'version': '1.0',
    'type': 'jsonwsp/request',
    'args': {'jobId': myJobId}
}
# Check on the job status after a brief nap
while response['result']['Status'] in ['Accepted', 'Running']:
    sleep(5)
    response = get_http_data(status_request)
    status = response['result']['Status']
    percent = response['result']['PercentCompleted']
    print('Job status: %s (%d%% complete)' % (status, percent))
if response['result']['Status'] == 'Succeeded':
    print('Job Finished: %s' % response['result']['message'])
else:
    print('Job Failed: %s' % response['fault']['code'])
    sys.exit(1)
# STEP 8 (Plan A - preferred)
# Construct JSON WSP request for API method: GetResult
batchsize = 20
results_request = {
    'methodname': 'GetResult',
    'version': '1.0',
    'type': 'jsonwsp/request',
    'args': {
        'jobId': myJobId,
        'count': batchsize,
        'startIndex': 0
    }
}
# Retrieve the results in JSON in multiple batches
# Initialize variables, then submit the first GetResults request
# Add the results from this batch to the list and increment the count
results = []
count = 0
response = get_http_data(results_request)
count = count + response['result']['itemsPerPage']
results.extend(response['result']['items'])
# Increment the startIndex and keep asking for more results until we have them all
total = response['result']['totalResults']
while count < total:
    results_request['args']['startIndex'] += batchsize
    response = get_http_data(results_request)
    count = count + response['result']['itemsPerPage']
    results.extend(response['result']['items'])
# Check on the bookkeeping
print('Retrieved %d out of %d expected items' % (len(results), total))
# Sort the results into documents and URLs
docs = []
urls = []
for item in results:
    try:
        if item['start'] and item['end']:
            urls.append(item)
    except:
        docs.append(item)
# Print out the documentation links, but do not download them
# print('\nDocumentation:')
# for item in docs : print(item['label']+': '+item['link'])
# STEP 10
# Use the requests library to submit the HTTP_Services URLs and write out the results.
print('\nHTTP_services output:')
for item in urls:
    URL = item['link']
    result = requests.get(URL)
    try:
        result.raise_for_status()
        outfn = item['label']
        f = open(outfn, 'wb')
        f.write(result.content)
        f.close()
        print(outfn, "is downloaded")
    except:
        print('Error! Status code is %d for this URL:\n%s' % (result.status.code, URL))
        print('Help for downloading data is at https://disc.gsfc.nasa.gov/data-access')
print('Downloading is done; find the downloaded files in your current working directory')
And I got the below error message:
python3 Subsetting_MERRA-2_Data.py
Job ID: 67c5d11a080acf99d30be39c
Job status: Accepted
Job status: Succeeded (100% complete)
Job Finished: Complete (M2T1NXFLX_5.12.4)
Retrieved 2 out of 2 expected items
HTTP_services output:
Traceback (most recent call last):
File "/mnt/c/DevNet/NASA/Subsetting_MERRA-2_Data.py", line 190, in <module>
result.raise_for_status()
File "/mnt/c/DevNet/NASA/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: MERRA2_200.tavg1_2d_flx_Nx.19970802.SUB.nc not available for download for url: https://goldsmr4.gesdisc.eosdis.nasa.gov/daac-bin/OTF/HTTP_services.cgi?FILENAME=%2Fdata%2FMERRA2%2FM2T1NXFLX.5.12.4%2F1997%2F08%2FMERRA2_200.tavg1_2d_flx_Nx.19970802.nc4&VERSION=1.02&TIME=1997-08-02T23%3A30%3A00%2F1997-08-02T23%3A00%3A00&FORMAT=bmM0Lw&BBOX=32.466176%2C-111.8078971%2C32.966176%2C-111.1828971&LABEL=MERRA2_200.tavg1_2d_flx_Nx.19970802.SUB.nc&VARIABLES=SPEED%2CSPEEDMAX%2CTLML%2CQLML%2CQSH%2CHLML&DATASET_VERSION=5.12.4&SHORTNAME=M2T1NXFLX&SERVICE=L34RS_MERRA2&FLAGS=remapbil%2Ccfsr0.5a
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mnt/c/DevNet/NASA/Subsetting_MERRA-2_Data.py", line 197, in <module>
print('Error! Status code is %d for this URL:\n%s' % (result.status.code,URL))
^^^^^^^^^^^^^
AttributeError: 'Response' object has no attribute 'status'
Could you please tell me why I am getting this error?
Thank you very much.
Soheil
-
- Subject Matter Expert
- Posts: 22
- Joined: Wed Feb 16, 2022 4:38 pm America/New_York
- Has thanked: 1 time
Re: Downloading specific time data from MERRA-2
Hello,
We are looking into this error. I am also receiving the same 404 error when doing the previous subset example. Fortunately, it does work when using the subsetter tool on the website, if you wish to use that method for now.
Chris
-
- Subject Matter Expert
- Posts: 22
- Joined: Wed Feb 16, 2022 4:38 pm America/New_York
- Has thanked: 1 time
Re: Downloading specific time data from MERRA-2
Hello,
For your time parameters, please use the following:
begTime = '1997-08-02T23:30:00Z'
endTime = '1997-08-02T23:30:00Z'
begHour = '23:30'
endHour = '23:30'
Originally, the endTime (23:00) fell before the endHour (23:30), which caused the error.
Chris
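For the several hundred instances mentioned earlier, a small helper can derive all four parameters from a local Arizona timestamp so that begTime/endTime always agree with begHour/endHour. This is only a sketch under my own assumptions (the function name is hypothetical, not part of the GES DISC API); it converts the local time to UTC and snaps it to the half-hour centers (00:30 through 23:30 UTC) at which the MERRA-2 tavg1 collections are time-stamped:

```python
from datetime import datetime, timedelta

def merra2_time_params(local_dt, utc_offset_hours):
    """Hypothetical helper: build consistent subset-API time parameters
    from a local timestamp and a UTC offset (Arizona is UTC-7 year-round)."""
    # Convert local time to UTC (offset of -7 means local = UTC - 7 hours).
    utc = local_dt - timedelta(hours=utc_offset_hours)
    # Snap to the nearest tavg1 half-hour center (00:30 .. 23:30 UTC).
    minutes = utc.hour * 60 + utc.minute
    k = min(max(round((minutes - 30) / 60), 0), 23)
    center = utc.replace(hour=0, minute=0, second=0, microsecond=0) \
        + timedelta(minutes=30 + 60 * k)
    stamp = center.strftime('%Y-%m-%dT%H:%M:%SZ')
    hour = center.strftime('%H:%M')
    # begTime/endTime and begHour/endHour must agree, per the fix above.
    return {'begTime': stamp, 'endTime': stamp,
            'begHour': hour, 'endHour': hour}
```

For example, 4:30 PM Arizona time on 1997-08-02 maps to 23:30 UTC, reproducing the corrected parameters above, and 12:30 PM on 1996-01-17 maps to 19:30 UTC as in the original question.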
-
- Posts: 8
- Joined: Tue Jul 16, 2024 1:45 pm America/New_York
- Been thanked: 1 time
Re: Downloading specific time data from MERRA-2
Thank you, Chris!
I have 657 instances, and this issue only occurred for the ones with 23:30 as their time.
I reran the script, and the results are now successful.
Thanks
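One loose end from the traceback above: the final AttributeError is unrelated to the time parameters. `requests.Response` exposes the HTTP code as `status_code`, not `status.code`, so the script's own error message crashed while reporting the 404. A corrected version of the download step might look like this sketch (the function name is my own, not from the original script):

```python
import requests

def download_item(url, outfn):
    """Fetch one HTTP_services URL and write the subset file to disk.
    Returns True on success, False on an HTTP error."""
    result = requests.get(url)
    try:
        result.raise_for_status()
    except requests.exceptions.HTTPError:
        # Response.status_code (not .status.code) holds the HTTP code.
        print('Error! Status code is %d for this URL:\n%s' % (result.status_code, url))
        print('Help for downloading data is at https://disc.gsfc.nasa.gov/data-access')
        return False
    with open(outfn, 'wb') as f:
        f.write(result.content)
    print(outfn, 'is downloaded')
    return True
```

Catching only `HTTPError` (rather than a bare `except:`) also keeps genuine bugs like this attribute typo visible instead of silently rerouting them into the error branch.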