Harmony and DayMet
Posted: Mon Aug 04, 2025 3:07 pm America/New_York
I develop a Python-based workflow tool that needs to download DayMet data. My users typically want to download all variables on a relatively small subdomain (e.g. 10x10 to 200x200 pixels), but over a long time period (roughly 1 to 4 decades of daily data).
So far I have tried the following APIs to do so:
1. ORNL DAAC's THREDDS works great, but I have been told that it is going away
2. pydap works for short time periods, but "times out" when processing more than a few months of data (a rough sketch of this approach follows the list)
3. harmony-py seems like the "right" tool for this job?
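For item 2, the pydap pattern is roughly the sketch below. The OPeNDAP URL and the x/y index window are placeholders (the real Daymet endpoint and indices differ); it is the final load of a long time slice that times out.

import xarray as xr

# placeholder OPeNDAP endpoint -- substitute the real Daymet URL here
url = "https://example.org/opendap/daymet_v4_daily_na_prcp_2010.nc"
ds = xr.open_dataset(url, engine="pydap")

# Daymet daily grids are laid out as (time, y, x); take a small spatial
# window by index and a time slice, then pull the values
prcp = ds["prcp"].isel(x=slice(1000, 1010), y=slice(2000, 2010))
prcp = prcp.sel(time=slice("2010-01-01", "2010-03-31"))
prcp.load()  # loading months-to-years of daily data is where it times out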
Below I have included the script I am using for harmony-py. I have tried three different end-dates:
1. 1 month of data -- everything works as expected; although I get the message "job finished with errors", the downloaded files look correct
2. 12 months of data -- the job "completes", again with the message "job finished with errors", but only 2 variables ('prcp' and 'dayl') are downloaded even though I asked for all variables (DayMet has 7)
3. > 1 year of data -- the job fails with the message below (a per-year workaround sketch follows this list):
ProcessingFailedException: 75 percent maximum errors exceeded. See the errors fields for more details
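One workaround I am considering is splitting the time range into one Harmony job per year and submitting them sequentially -- a rough, untested sketch using the same collection and bounding box as the full script below:

import datetime as dt
import harmony

harmony_client = harmony.Client()
collection = harmony.Collection(id="C2532426483-ORNL_CLOUD")
bbox = harmony.BBox(-83.47845, 35.027341, -83.421658, 35.073819)  # W, S, E, N

downloaded = []
for year in range(2010, 2014):
    request = harmony.Request(
        collection=collection,
        spatial=bbox,
        temporal={
            'start': dt.datetime(year, 1, 1),
            'stop': dt.datetime(year, 12, 31),
        },
        format='application/x-netcdf4',
        ignore_errors=True,
        skip_preview=True,
    )
    job_id = harmony_client.submit(request)
    harmony_client.wait_for_processing(job_id, show_progress=True)
    # collect whatever download_all yields, as in the full script below
    downloaded.extend(harmony_client.download_all(job_id))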
I have also tried downloading just one variable at a time, but despite the capabilities listing 'variables' as a valid subset method, the script (uncomment the 'variables' line in the Request) fails with:
Exception: ('Bad Request', 'Error: Coverages were not found for the provided variables: srad')
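To figure out what variable names Harmony actually expects for this collection (maybe 'srad' is not the spelling it wants), I considered querying the capabilities endpoint -- something like the sketch below, assuming the query parameter is 'collectionId' and that the response carries a 'variables' list (I have not confirmed either):

import requests

# may require Earthdata Login credentials (e.g. a ~/.netrc entry)
resp = requests.get(
    "https://harmony.earthdata.nasa.gov/capabilities",
    params={"collectionId": "C2532426483-ORNL_CLOUD"},
)
resp.raise_for_status()
for v in resp.json().get("variables", []):
    print(v)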
I am unable to access the "errors fields" -- request.error_messages() is always empty. I suspect part of the problem may be that DayMet has 3 regions ('na' = North America, 'pr' = Puerto Rico, 'hi' = Hawaii) and that Harmony is trying to match my bounding box (which is in North America) against all three, so many of the files fail because the bounds fall outside them.
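One way to test that suspicion might be to ask CMR directly which granules intersect the bounding box and see which regions show up in the granule names -- a sketch:

import requests

params = {
    "collection_concept_id": "C2532426483-ORNL_CLOUD",
    "bounding_box": "-83.47845,35.027341,-83.421658,35.073819",  # W,S,E,N
    "temporal": "2010-01-01T00:00:00Z,2011-01-01T00:00:00Z",
    "page_size": 50,
}
resp = requests.get("https://cmr.earthdata.nasa.gov/search/granules.json", params=params)
resp.raise_for_status()
# the granule titles should show which region ('na', 'pr', 'hi') each file covers
for entry in resp.json()["feed"]["entry"]:
    print(entry["title"])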
Any help on (1) whether this is the right tool for the job and (2) getting Harmony to work for longer time periods would be appreciated.
Thanks,
Ethan
SCRIPT BELOW:
# define the slice that we want, in lat/lon
lon = (-83.47845, -83.421658)
lat = (35.027341, 35.073819)
# variables
# NOTE -- there is no way to subset by variable or region? The
# 'variables' option to Request fails at request.is_valid() with the
# error:
#
# Exception: ('Bad Request', 'Error: Coverages were not found for the provided variables: srad')
#
var = 'srad'
# NOTE: there is no way to provide a region, so the job tries Puerto
# Rico ('pr') and Hawaii ('hi') as well as North America ('na'). The
# lat/lon is only valid for 'na', so the others fail.
region = 'na'
# time
start = '2010-01-01'
# single month
# NOTE: this works as expected
#end = '2010-02-01'
# single-year
# NOTE: This completes, but takes a long time (~30 minutes) and it
# only successfully downloaded prcp and dayl -- it failed to do the
# other variables.
#end = '2010-12-31'
# multiple-year
# NOTE: this fails with the error:
#
# ProcessingFailedException: 75 percent maximum errors exceeded. See the errors fields for more details
#
end = '2011-12-31'
import datetime as dt
import harmony
import xarray as xr  # not actually used in this snippet
harmony_client = harmony.Client()
collection = harmony.Collection(id="C2532426483-ORNL_CLOUD")
bbox = harmony.BBox(lon[0], lat[0], lon[1], lat[1])  # BBox takes (west, south, east, north)
fileformat = 'application/x-netcdf4'
#fileformat = 'application/netcdf'
request = harmony.Request(
    collection=collection,
    spatial=bbox,
    temporal={
        'start': dt.datetime.strptime(start, '%Y-%m-%d'),
        'stop': dt.datetime.strptime(end, '%Y-%m-%d'),
    },
    #variables=[var],
    format=fileformat,
    ignore_errors=True,
    skip_preview=True,
)
assert request.is_valid()
print('REQUEST:')
print(harmony_client.request_as_url(request))
job_id = harmony_client.submit(request)
harmony_client.wait_for_processing(job_id, show_progress=True)
print('')
print('ERRORS:')
# NOTE: error_messages() reports request-validation problems only, which is
# presumably why it comes back empty even when the job "finished with errors"
print(request.error_messages())
print('')
print('FILES:')
for i in harmony_client.download_all(job_id):
    print(i)
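Since request.error_messages() only reports request-validation problems, something like the snippet below (appended after wait_for_processing) might expose the per-granule job errors -- assuming result_json returns the job document with an 'errors' list, which I have not been able to confirm:

# pull the job document and print any per-granule errors it carries
job_json = harmony_client.result_json(job_id)
for err in job_json.get('errors', []):
    print(err)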