Slow Access to HLS (HLSL30.v2.0, HLSS30.v2.0)

Posted: Tue Apr 11, 2023 7:08 pm America/New_York
by mitchbon
Hello,

I have a workflow that uses HLS (HLSL30.v2.0, HLSS30.v2.0). I access HLS through 'https://cmr.earthdata.nasa.gov/stac/LPCLOUD' using pystac_client and stackstac.

It works, but I have noticed it is slow to process these STACs/COGs into memory (e.g., with compute()) when they need to be accessed locally, such as for plotting. Accessing a temporally long stack can take from 3 minutes (for 1 year) to 30 minutes (for 10 years).

By comparison, I can use the same workflow with S2 L2A from 'https://earth-search.aws.element84.com/v0' and retrieve the same outputs an order of magnitude faster (30 seconds for 1 year to 2 minutes for the full temporal depth).

This difference in access time holds true even for a basic workflow (e.g., https://stackstac.readthedocs.io/en/v0.2.0/basic.html).
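
For anyone wanting to reproduce the comparison, here is a minimal sketch of how the load times could be measured. The helper names `timed` and `speedup` are hypothetical and not part of stackstac or pystac_client. In practice the callable would wrap something like a `compute()` call on the stacked array. The ratios at the end only restate the approximate figures quoted above.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[[], Any]) -> Tuple[Any, float]:
    """Run fn() once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def speedup(slow_seconds: float, fast_seconds: float) -> float:
    """Return how many times faster the fast run was than the slow run."""
    return slow_seconds / fast_seconds


# Hypothetical usage, assuming `stack` is a lazy array built elsewhere:
#   data, seconds = timed(lambda: stack.compute())

# Ratios implied by the approximate timings reported in this post.
one_year_ratio = speedup(3 * 60, 30)        # HLS 3 min vs. S2 L2A 30 s
long_stack_ratio = speedup(30 * 60, 2 * 60)  # HLS 30 min vs. S2 L2A 2 min
```

With these numbers the ratios come out to roughly 6x for one year and 15x for the longest stacks, although the longest stacks may not cover exactly the same time span in both catalogs.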

Are there faster ways to access HLS that I should make use of?

Thanks,
Mitchell B.

Re: Slow Access to HLS (HLSL30.v2.0, HLSS30.v2.0)

Posted: Wed Apr 12, 2023 4:46 pm America/New_York
by LP DAAC - dgolon
Hi @mitchbon, thanks for reporting this. We'll have our science team take a look.

Re: Slow Access to HLS (HLSL30.v2.0, HLSS30.v2.0)

Posted: Thu May 04, 2023 9:41 am America/New_York
by LP DAAC - afriesz
@mitchbon ,

Would you be able to share your script? If you have a GitHub repository for your work, please post the link. Otherwise, you can attach the script or notebook here in the forum, or you can send it to LP DAAC User Services (LPDAAC@usgs.gov).

Re: Slow Access to HLS (HLSL30.v2.0, HLSS30.v2.0)

Posted: Thu May 04, 2023 10:13 am America/New_York
by mitchbon
Hello,

I cannot share the full notebook for my workflow, but I have attached a simple example based on the basic stackstac tutorial that shows the difference in speed between accessing S2 L2A through AWS E84 and HLS S30 from LPCLOUD.

I am not sure how to attach the notebook here on the forum since the file type is not accepted, so I have emailed it to LP DAAC User Services.

That being said, I have seen a moderate speedup in accessing HLS over the last week or two. I am not sure whether any changes have been made! It is still slower than S2 L2A, but not by as much as before.

Re: Slow Access to HLS (HLSL30.v2.0, HLSS30.v2.0)

Posted: Tue May 30, 2023 9:41 am America/New_York
by LP DAAC - jwilson
@mitchbon
We have forwarded your email to the LP DAAC Subject Matter Expert.

Re: Slow Access to HLS (HLSL30.v2.0, HLSS30.v2.0)

Posted: Thu Jun 08, 2023 10:27 am America/New_York
by LP DAAC - afriesz
@mitchbon ,

I was able to run the notebook you shared. As you mentioned, it works but it’s relatively slow. Unfortunately, we don’t know why there’s such a difference in speed, but we will look into it and see if there is something we can do on our side to improve performance. In the meantime, there are alternatives to using stackstac. I’ve pulled together an example that uses Dask here: https://github.com/nasa/HLS-Data-Resources/blob/main/python/how-tos/Data_Access__Create_HLS_Timeseries_Dask.ipynb.
The example assumes that you have already obtained a list of HLS URLs and assigned it to a variable. I know it doesn’t have the smooth coupling between STAC search results and the commands that read in the data, but it is a working alternative.
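
As a rough illustration of the "list of HLS URLs" step, the sketch below collects asset links per band from STAC search results. It is not taken from the linked notebook. The function name `collect_asset_urls` is hypothetical, and it assumes each item is a plain STAC Item dictionary with an `assets` mapping whose entries carry an `href` (pystac `Item` objects can be converted with `to_dict()`).

```python
from typing import Dict, Iterable, List, Sequence


def collect_asset_urls(
    items: Iterable[dict], bands: Sequence[str]
) -> Dict[str, List[str]]:
    """Group asset hrefs by band name from STAC Item dictionaries.

    Items that lack a requested band are skipped for that band only.
    """
    urls: Dict[str, List[str]] = {band: [] for band in bands}
    for item in items:
        assets = item.get("assets", {})
        for band in bands:
            asset = assets.get(band)
            if asset is not None:
                urls[band].append(asset["href"])
    return urls


# Hypothetical usage, assuming `item_dicts` came from a STAC search:
#   urls = collect_asset_urls(item_dicts, ["B04", "Fmask"])
#   red_urls = urls["B04"]  # a plain list, one URL per scene
```

Because the function only iterates over the items once, it should behave the same for a handful of scenes or for several thousand, as long as the search results can be iterated.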

Re: Slow Access to HLS (HLSL30.v2.0, HLSS30.v2.0)

Posted: Thu Jun 08, 2023 12:13 pm America/New_York
by mitchbon
@LP DAAC - afriesz

Thanks for the response and notebook! We access thousands of HLS images at a time, so I am not sure about getting a list like that, but maybe there are other ways we can leverage dask on our end.

Looking forward to hearing whether you find anything on your end regarding the speed difference, too.