Harmony API prohibitively slow for GEDI L2A data

naiara
Posts: 2
Joined: Mon Jun 02, 2025 3:31 pm America/New_York
Answers: 0

Harmony API prohibitively slow for GEDI L2A data

by naiara » Thu Oct 16, 2025 7:15 pm America/New_York

Hi, I've requested a spatial subset of GEDI L2A data using the Harmony API. I'm using a single rectangular area of 25 × 13 km, which yields 44 H5 files. The request can take anywhere from 10 minutes to 1 hour. I don't think I'm waiting in a queue, because the progress bar starts immediately after I submit the request. Why is this so slow? Is the subsetter routine open source? One possibility would be to run it on my own machine in AWS so I have more control over the computational resources.
Here's the main part of my code:

from harmony import Client, Collection, Request

harmony_client = Client()  # Earthdata Login credentials from ~/.netrc

request = Request(
    collection=Collection(id=concept_id),
    shape=my_site,            # GeoJSON polygon for the 25 x 13 km site
    temporal=temporal_range,  # {'start': ..., 'stop': ...}
)
task = harmony_client.submit(request)  # returns a job id to poll and download


dauty
Posts: 1
Joined: Thu Feb 27, 2020 11:19 am America/New_York
Answers: 0

Re: Harmony API prohibitively slow for GEDI L2A data

by dauty » Fri Nov 14, 2025 8:02 am America/New_York

This may well be much faster now, thanks to several recent and significant Harmony updates. I suspect it now sits closer to the 10-minute end of your range, and a Harmony backlog, while it can't be ruled out, should be hit much less often than before.

At 10 minutes (~15 s per file), I'm not sure I would characterize it as too slow; that's on par with our previous on-premises deployment. At 1 hour (>1 min 20 s per file), it is a bit more concerning.
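For reference, those per-file figures are just the reported totals divided across the 44 output files:

# Back-of-the-envelope per-file times for 44 subset files.
n_files = 44
fast = 10 * 60 / n_files   # 10-minute run: ~13.6 s per file
slow = 60 * 60 / n_files   # 1-hour run:   ~81.8 s per file
print(f"{fast:.1f} s/file (fast run), {slow:.1f} s/file (slow run)")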

The subsetter source is here: https://github.com/nasa/harmony-trajectory-subsetter. One thing to note is that the entire granule has to be downloaded from the archive to a worker machine prior to subsetting. Because your region is so small relative to the full granules, a substantial fraction of the total time (perhaps ~20%) is likely that whole-file download. We have not seen specific requirements to optimize this for the cloud, but it is something we have considered.
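If you want to see where your wall time goes on the client side, you can split it into a processing phase and a result-download phase. A minimal sketch, assuming harmony-py's documented Client.submit, Client.wait_for_processing, and Client.download_all methods (this won't isolate the server-side archive download, which happens inside the processing phase, but it does separate processing from result transfer):

import time

def timed_run(client, request, out_dir="./subsets"):
    """Submit a Harmony request and report time spent in each phase."""
    t0 = time.monotonic()
    job_id = client.submit(request)
    client.wait_for_processing(job_id, show_progress=True)
    t1 = time.monotonic()  # server-side processing (incl. archive download) done
    # download_all returns futures; .result() blocks until each file lands
    paths = [f.result() for f in client.download_all(job_id, directory=out_dir)]
    t2 = time.monotonic()  # results transferred to local disk
    print(f"processing: {t1 - t0:.0f} s, result download: {t2 - t1:.0f} s")
    return paths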
