Unable to access HLSS30.020 on aws s3

Posted: Mon Mar 06, 2023 9:07 am America/New_York
by psarka_hydrosat
I'm no longer able to access HLS data in the `lp-prod-protected` bucket. It was working a month ago, but it fails today. I obtain S3 credentials without a problem, and I also tried creating a new account; same error. Here is a code snippet for reproduction:

```
import boto3
import requests

user = CENSORED
password = CENSORED

# Exchange Earthdata Login credentials for temporary S3 credentials
url = 'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials'
url = requests.get(url, allow_redirects=False).headers['Location']
creds = requests.get(url, auth=(user, password)).json()

session = boto3.Session(
    aws_access_key_id=creds['accessKeyId'],
    aws_secret_access_key=creds['secretAccessKey'],
    aws_session_token=creds['sessionToken'],
    region_name='us-west-2',
)

client = session.client('s3')
bucket = "lp-prod-protected"
key = "HLSS30.020/HLS.S30.T56QPM.2023001T002959.v2.0/HLS.S30.T56QPM.2023001T002959.v2.0.B03.tif"

client.download_file(Bucket=bucket, Key=key, Filename='temp.tif')
# ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

client.list_objects_v2(Bucket=bucket)
# ClientError: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
```

Re: Unable to access HLSS30.020 on aws s3

Posted: Tue Mar 07, 2023 9:24 am America/New_York
by LP DAAC - dgolon
Hi @psarka_hydrosat Thank you for reporting this issue. Our science team will take a look at what you are seeing. We'll report back when we have more information.

Re: Unable to access HLSS30.020 on aws s3

Posted: Wed Mar 08, 2023 2:08 pm America/New_York
by oscillation
I encountered the same 'Forbidden' error when I attempted to use my Earthdata AWS session info to download just a few minutes ago. This was my first attempt, so I can't comment on whether anything has changed recently.

Re: Unable to access HLSS30.020 on aws s3

Posted: Thu Mar 09, 2023 10:27 am America/New_York
by LP DAAC - dgolon
Hi @oscillation Thank you for reporting that you are seeing the same error. I have passed your comments along to the science team members who are looking into this.

Re: Unable to access HLSS30.020 on aws s3

Posted: Thu Mar 09, 2023 3:37 pm America/New_York
by LP DAAC - afriesz
I'm still looking into why the code you posted does not work. It seems like it should, but I'm getting an access denied error as well. In the meantime, the code below may help. It uses Python's s3fs library instead and shows how to list the contents of a bucket and how to download a file.

```
import s3fs
import requests

user = xxxx
password = xxxx

# Exchange Earthdata Login credentials for temporary S3 credentials
url = 'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials'
url = requests.get(url, allow_redirects=False).headers['Location']
creds = requests.get(url, auth=(user, password)).json()

bucket = "lp-prod-protected"
s3_fs = s3fs.S3FileSystem(
    key=creds["accessKeyId"],
    secret=creds["secretAccessKey"],
    token=creds["sessionToken"],
)

# List collections in lp-prod-protected
s3_fs.listdir(f'{bucket}')

# List files in HLS.S30.T60VUP.2023066T234621.v2.0
s3_fs.listdir(f'{bucket}/HLSS30.020/HLS.S30.T60VUP.2023066T234621.v2.0')

# Download a file
s3_fs.download('lp-prod-protected/HLSS30.020/HLS.S30.T60VUP.2023066T234621.v2.0/HLS.S30.T60VUP.2023066T234621.v2.0.VZA.tif', 'VZA.tif')
```
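One more detail worth noting: as far as I can tell, the JSON returned by the s3credentials endpoint also carries an `expiration` timestamp, and the temporary credentials stop working after that point, so long-running jobs need to re-request them. A minimal sketch of checking the remaining lifetime, using a placeholder value in place of a real response:

```python
from datetime import datetime, timezone

# Placeholder for the s3credentials response; only the (assumed)
# 'expiration' field is shown here.
creds = {'expiration': '2023-03-09 21:30:00+00:00'}

# Parse the expiration and compare against a reference time to decide
# whether the credentials need refreshing.
expires = datetime.fromisoformat(creds['expiration'])
now = datetime(2023, 3, 9, 20, 30, tzinfo=timezone.utc)  # placeholder "now"
remaining = expires - now
print(remaining)  # 1:00:00
```

In a real script you would use `datetime.now(timezone.utc)` for the reference time and re-request credentials when `remaining` gets small.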

Re: Unable to access HLSS30.020 on aws s3

Posted: Thu Mar 09, 2023 3:42 pm America/New_York
by LP DAAC - afriesz
One thing to note: when interacting with these S3 buckets or accessing the data directly in S3, you must be running your code (i.e., making your requests) from the AWS us-west-2 region.
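If you are unsure which region your code is actually running in, one way to check from inside EC2 is the instance metadata service. This is a sketch under the assumption that IMDSv1 is enabled on the instance; outside AWS the request simply times out, which is itself a useful signal:

```python
import requests


def check_aws_region(timeout=2):
    """Return the AWS region this code is running in, or None if it
    cannot be determined (e.g. not running on EC2 at all)."""
    # The instance metadata endpoint is only reachable from inside EC2;
    # a timeout or connection error means you are not on an instance.
    try:
        return requests.get(
            'http://169.254.169.254/latest/meta-data/placement/region',
            timeout=timeout,
        ).text
    except requests.exceptions.RequestException:
        return None


print(check_aws_region())
```

If this prints anything other than `us-west-2` (or `None`), the 403/AccessDenied responses are expected.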

Re: Unable to access HLSS30.020 on aws s3

Posted: Fri Mar 10, 2023 8:43 am America/New_York
by psarka_hydrosat
Thank you @afriesz!

I did indeed forget that one needs to be in that region, which resolves half of the issue. The other half was the missing `Prefix=''` argument in the list call; with that added, everything works.

Re: Unable to access HLSS30.020 on aws s3

Posted: Wed May 10, 2023 3:30 am America/New_York
by indrek.sunter
Even while specifying the prefix and region, I'm getting
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

I tried boto3 with code similar to what psarka_hydrosat posted, as well as the example code that LP DAAC - afriesz supplied. Both failed with the same error when listing objects. When attempting to download an object, I get a 403:
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

I tried running the script from within an AWS VPC, as well as over a VPN into AWS; neither made a difference.

For login and credentials I also tried the guide at
https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME

The credentials seemed reasonable. I also tried obtaining credentials via https://search.earthdata.nasa.gov/search/granules and copying them into the scripts; this resulted in the same errors.

Might there be some subscription or account setting that I'm missing?

Re: Unable to access HLSS30.020 on aws s3

Posted: Thu May 25, 2023 9:52 am America/New_York
by LP DAAC - dgolon
Hi @indrek.sunter Thank you for writing in. We are still looking into this. We will report back once we have more information.

Re: Unable to access HLSS30.020 on aws s3

Posted: Thu Jun 01, 2023 7:44 am America/New_York
by lesimppa
Hi everyone!

Thanks so much for investigating the error already. I've been trying to access S3 directly, as the HTTP API is a bit slow in our distributed environment. I'm running into a similar error (403 Forbidden) with this minimal code:

```
import boto3
import requests

S3_LPDAAC_CREDENTIALS_ENDPOINT = 'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials'


def get_s3_lpdaac_credentials():
    # Relies on requests picking up Earthdata Login credentials from ~/.netrc
    return requests.get(S3_LPDAAC_CREDENTIALS_ENDPOINT).json()


s3_lpdaac_credentials = get_s3_lpdaac_credentials()

session = boto3.Session(
    aws_access_key_id=s3_lpdaac_credentials['accessKeyId'],
    aws_secret_access_key=s3_lpdaac_credentials['secretAccessKey'],
    aws_session_token=s3_lpdaac_credentials['sessionToken'],
    region_name='us-west-2',
)

client = session.client('s3')
bucket = "lp-prod-protected"
key = "HLSS30.020/HLS.S30.T56QPM.2023001T002959.v2.0/HLS.S30.T56QPM.2023001T002959.v2.0.B03.tif"

client.download_file(Bucket=bucket, Key=key, Filename='temp.tif')
```

The credentials look healthy, and they (via the .netrc file) work when curling the HTTP API. I've followed the guide here: https://nasa-openscapes.github.io/2021-Cloud-Hackathon/tutorials/05_Data_Access_Direct_S3.html. I'm running it from my local machine, but I believe that's not the issue.
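In case it helps with debugging: when no `auth=` is supplied, requests looks up credentials in .netrc keyed by machine name, much like curl does. A minimal sketch of that lookup with Python's netrc module, using placeholder values written to a temporary file (the real entry would live in ~/.netrc under the Earthdata Login host):

```python
import netrc
import os
import tempfile

# Placeholder .netrc content with dummy values; the real file holds
# your actual Earthdata Login for urs.earthdata.nasa.gov.
sample = "machine urs.earthdata.nasa.gov login myuser password mypass\n"

with tempfile.NamedTemporaryFile('w', suffix='.netrc', delete=False) as f:
    f.write(sample)
    path = f.name

# Look up the entry the same way a client library would.
login, _, password = netrc.netrc(path).authenticators('urs.earthdata.nasa.gov')
print(login, password)  # myuser mypass
os.unlink(path)
```

If this kind of lookup against your real ~/.netrc returns the expected values, the credentials side is fine and the remaining suspect is the compute region.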

Thanks for investigating,
Simon