S3 PODAAC Access Denied
- Posts: 1
- Joined: Fri Jun 23, 2023 11:07 am America/New_York
S3 PODAAC Access Denied
Trying to access the GRACE Product found here: https://podaac.jpl.nasa.gov/dataset/TELLUS_GRAC_L3_CSR_RL06_LND_v04
Following along with the S3 documentation (https://archive.podaac.earthdata.nasa.gov/s3credentialsREADME), I built a `get_credentials` method that passes in my credentials, works through the redirects, and ultimately yields the AWS credentials from the credentials endpoint (https://archive.podaac.earthdata.nasa.gov/s3credentials).
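For comparison, here is a minimal, hedged sketch of what such a `get_credentials` helper can look like when you let `requests` handle the EDL auth via `~/.netrc` (the helper and validation names here are mine, not from the PO.DAAC docs):

```python
import requests

CREDS_URL = "https://archive.podaac.earthdata.nasa.gov/s3credentials"
REQUIRED_KEYS = {"accessKeyId", "secretAccessKey", "sessionToken"}

def validate_creds(creds):
    """Sanity-check the credential payload before handing it to boto3."""
    missing = REQUIRED_KEYS - creds.keys()
    if missing:
        raise KeyError(f"credentials response is missing {sorted(missing)}")
    return creds

def get_credentials(url=CREDS_URL):
    """Fetch temporary S3 credentials; requests follows the EDL redirect
    chain and picks up the login from ~/.netrc automatically."""
    resp = requests.get(url)
    resp.raise_for_status()
    return validate_creds(resp.json())
```

The validation step makes a half-failed login (an HTML page or partial JSON) fail loudly instead of producing a confusing boto3 error later.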
After grabbing the credentials, I set up a Boto3 S3 client with the following code:
```
creds = get_credentials()
client = boto3.client(
    's3',
    aws_access_key_id=creds["accessKeyId"],
    aws_secret_access_key=creds["secretAccessKey"],
    aws_session_token=creds["sessionToken"],
    region_name="us-west-2"
)
```
The session is established, but when trying to list the objects in S3 I get this error: `ClientError: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied`
Command: `client.list_objects_v2(Bucket='podaac-ops-cumulus-protected', Prefix="TELLUS_GRAC_L3_CSR_RL06_LND_v04/")`
I had a few coworkers try the same approach/configuration with the same result, so it does not appear to be an issue with my specific account. I am able to download the files from the Earthdata Search UI, but I need programmatic access for Lambda function calls.
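One thing worth checking once listing does start working: `list_objects_v2` returns at most 1000 keys per call, so a paginator is safer for Lambda use, and a missing trailing slash on the prefix is a common cause of empty results. A hedged sketch (the helper names are mine, and the client is assumed to be built from the temporary credentials as above):

```python
def normalize_prefix(prefix):
    """Collection prefixes in the PO.DAAC buckets end with '/'; add it if absent."""
    return prefix if prefix.endswith("/") else prefix + "/"

def list_collection_keys(client, bucket, collection, max_keys=25):
    """Page through list_objects_v2 and return up to max_keys object keys."""
    paginator = client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=normalize_prefix(collection)):
        # Pages with no matches have no "Contents" key at all
        for item in page.get("Contents", []):
            keys.append(item["Key"])
            if len(keys) >= max_keys:
                return keys
    return keys
```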
- Subject Matter Expert
- Posts: 34
- Joined: Tue May 11, 2021 12:58 pm America/New_York
Re: S3 PODAAC Access Denied
Hi,
PO.DAAC provides help for accessing its collections within the Earthdata Cloud (AWS). You can find useful information in the PO.DAAC Cookbook at https://podaac.github.io/tutorials/. Please take a look at the "How to Access Data Directly in Cloud (netCDF)" section (https://podaac.github.io/tutorials/external/Direct_S3_Access_NetCDF.html) and see if it helps.
Thanks,
- Posts: 1
- Joined: Fri Aug 25, 2023 5:34 am America/New_York
Re: S3 PODAAC Access Denied
I'm having the exact same issue unfortunately. Did you find any solution?
When I follow the instructions here (https://podaac.github.io/tutorials/external/Direct_S3_Access_NetCDF.html) I can get the credentials successfully (from the https://archive.podaac.earthdata.nasa.gov/s3credentials endpoint); however, in the next part, when using s3fs, the returned list `ssh_files` is empty (no error, though...).
When I use boto3 directly, like the OP, I get an Access Denied error.
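For the empty s3fs listing, one hedged debugging step (the function name here is mine): s3fs caches directory listings, and fsspec generally surfaces an S3 AccessDenied as a `PermissionError`, so clearing the cache and listing the prefix directly can distinguish "denied" from "genuinely empty":

```python
def debug_empty_glob(fs, bucket, short_name):
    """When fs.glob() returns [], check whether the prefix itself is visible."""
    prefix = f"{bucket}/{short_name}"
    fs.invalidate_cache()        # drop any cached (possibly pre-auth) listings
    try:
        entries = fs.ls(prefix)  # fsspec maps AccessDenied to PermissionError
    except PermissionError as err:
        return f"listing denied: {err}"
    return f"{len(entries)} entries under {prefix}"
```

Called with `debug_empty_glob(fs, "podaac-ops-cumulus-protected", short_name)` after building `fs` as in the tutorial, this should say explicitly whether the listing is denied or empty.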
Last edited by broodj3ham on Fri Aug 25, 2023 6:09 am America/New_York, edited 1 time in total.
- Subject Matter Expert
- Posts: 16
- Joined: Tue Mar 14, 2023 1:41 pm America/New_York
Re: S3 PODAAC Access Denied
Thanks again, broodj3ham and trevorskaggs. I can't offer any explanation for why access using boto3 is not working from your EC2 instances running in us-west-2.
We are looking into the issue. In the meantime, can you please try using this exact code snippet? This assumes you have the .netrc file set up appropriately on the same host:
```
import os
import s3fs
import requests
import xarray as xr

def begin_s3_direct_access(url: str = "https://archive.podaac.earthdata.nasa.gov/s3credentials"):
    response = requests.get(url).json()
    return s3fs.S3FileSystem(key=response['accessKeyId'],
                             secret=response['secretAccessKey'],
                             token=response['sessionToken'],
                             client_kwargs={'region_name': 'us-west-2'})

fs = begin_s3_direct_access()
short_name = "TELLUS_GRAC_L3_CSR_RL06_LND_v04"
files = sorted(fs.glob(os.path.join("podaac-ops-cumulus-protected/", short_name, "*.nc")))
```
Thanks again to you both for bringing the issue to our attention.
Jack
Re: S3 PODAAC Access Denied
I have exactly the same issue. I can use the s3fs interface to look at the podaac-ops-cumulus-protected bucket; however, I cannot use the AWS CLI. I also have access to other (non-public, SWOT) PO.DAAC buckets that do not have this issue. This seems like a configuration issue...
- Posts: 1
- Joined: Wed Jan 31, 2024 5:59 pm America/New_York
Re: S3 PODAAC Access Denied
I'm also getting the same error, trying to access the OSTIA-UKMO-L4-GLOB-v2.0 prefix. I'm manually logging in, copying the resulting AWS credentials JSON string into a script, parsing it, and attempting to access the bucket/prefix that way.
Re: S3 PODAAC Access Denied
Has anyone found a solution to this? I copied the code that Jack posted and have not had any luck getting it to work yet.
When I use Python 3.12, I get "An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied" from the `files = sorted(fs.glob(os.path.join("podaac-ops-cumulus-protected/", short_name, "*.nc")))` line.
When I use Python 3.6, the `files` variable comes back empty.
I believe my Earthdata account is set up correctly in my .netrc, as it works with the podaac-data-downloader. If I move the .netrc aside, I get a `json.decoder.JSONDecodeError` when I run `response = requests.get(url).json()`, so I believe my .netrc is being read by Python.
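That `JSONDecodeError` is actually a useful signal: with no `.netrc`, the credentials endpoint typically returns an EDL login page (HTML) rather than JSON. A small hedged helper (the name is mine) can make the two failure modes explicit instead of crashing inside `.json()`:

```python
import json

def parse_creds_response(text):
    """Distinguish a real credentials payload from an EDL login page.

    Returns (creds, message); creds is None when the response is unusable.
    """
    try:
        creds = json.loads(text)
    except json.JSONDecodeError:
        return None, "not JSON -- likely an EDL login page; check ~/.netrc"
    if "accessKeyId" not in creds:
        return None, f"JSON but no accessKeyId; keys were {sorted(creds)}"
    return creds, "ok"
```

Feeding it `requests.get(url).text` should tell you whether the endpoint really returned credentials or a login form.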
Re: S3 PODAAC Access Denied
Hello, I'm running into the same issue as well. I started from the "Documentation" link (https://archive.podaac.earthdata.nasa.gov/s3credentialsREADME) under the Direct S3-Access option for my data product of interest (https://podaac.jpl.nasa.gov/dataset/MITgcm_LLC4320_Pre-SWOT_JPL_L4_ACC_SMST_v1.0), and wrote the following script:
```
import argparse
import base64
import json
import os
from getpass import getpass

import boto3
import requests
from botocore.exceptions import NoCredentialsError

# Default bucket and region
DEFAULT_BUCKET = "podaac-ops-cumulus-public"
DEFAULT_OBJ_PREFIX = "MITgcm_LLC4320_Pre-SWOT_JPL_L4_ACC_SMST_v1.0"
DEFAULT_REGION = "us-west-2"
S3_CREDENTIALS_ENDPOINT = "https://archive.podaac.earthdata.nasa.gov/s3credentials"


def main(bucket_name, object_prefix, local_download_path):
    creds = retrieve_credentials()
    # Create an S3 client with the temporary credentials and default region
    s3_client = boto3.client(
        "s3",
        region_name=DEFAULT_REGION,
        aws_access_key_id=creds["accessKeyId"],
        aws_secret_access_key=creds["secretAccessKey"],
        aws_session_token=creds["sessionToken"],
    )
    try:
        # List objects within a specified bucket and prefix
        response = s3_client.list_objects(Bucket=bucket_name, Prefix=object_prefix)
        if "Contents" in response:
            for item in response["Contents"]:
                key = item["Key"]
                download_file(
                    s3_client,
                    bucket_name,
                    key,
                    f"{local_download_path}/{key.split('/')[-1]}",
                )
        else:
            print("No objects found.")
    except Exception as e:
        print(f"Error listing objects in bucket {bucket_name}: {e}")


def retrieve_credentials():
    """Authenticate with EarthData and return a set of temporary S3 credentials."""
    username = os.environ.get("EARTHDATA_USERNAME", input("Enter EDL username: "))
    password = os.environ.get("EARTHDATA_PASSWORD", getpass("Enter EDL password: "))
    edl_login_url = get_edl_login_url()
    s3_cred_url = authenticate_with_edl(
        encode_edl_creds(username, password), edl_login_url
    )
    return get_temporary_aws_credentials(s3_cred_url)


def get_edl_login_url():
    """Make initial GET request to S3 credentials service and get EDL login form URL."""
    response = requests.get(S3_CREDENTIALS_ENDPOINT, allow_redirects=False)
    response.raise_for_status()
    return response.headers["location"]


def encode_edl_creds(username, password):
    """Encode EDL credentials for authorization."""
    return base64.b64encode(f"{username}:{password}".encode("ascii")).decode("ascii")


def authenticate_with_edl(encoded_credentials, edl_login_url):
    """Make POST request to EDL login form URL and return the S3 credentials URL."""
    response = requests.post(
        edl_login_url,
        data={"credentials": encoded_credentials},
        headers={"Origin": S3_CREDENTIALS_ENDPOINT},
        allow_redirects=False,
    )
    response.raise_for_status()
    return response.headers["location"]


def get_temporary_aws_credentials(s3_cred_url):
    """Make authenticated GET request and return temporary AWS credentials."""
    print(f"Obtaining S3 access token from {s3_cred_url}")
    response = requests.get(s3_cred_url, allow_redirects=False)
    final_response = requests.get(
        S3_CREDENTIALS_ENDPOINT,
        cookies={"accessToken": response.cookies["accessToken"]},
    )
    final_response.raise_for_status()
    return json.loads(final_response.content)


def download_file(s3_client, bucket, key, download_path):
    """Download a file from S3."""
    try:
        s3_client.download_file(bucket, key, download_path)
        print(f"File {key} downloaded to {download_path}")
    except NoCredentialsError:
        print("Credentials are not available or invalid")
    except Exception as e:
        print(f"Failed to download {key}: {e}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Download files from AWS S3 with temporary credentials"
    )
    parser.add_argument(
        "--bucket",
        default=DEFAULT_BUCKET,
        help="S3 bucket name, defaults to a specific PODAAC dataset",
    )
    parser.add_argument(
        "--prefix", help="Object prefix for files to download", default=DEFAULT_OBJ_PREFIX
    )
    parser.add_argument(
        "--download-path", required=True, help="Local path to download files"
    )
    args = parser.parse_args()
    main(args.bucket, args.prefix, args.download_path)
```
The following is a sample of my Conda environment:
```
python 3.10.12 hd12c33a_0_cpython conda-forge
boto3 1.29.1 py310h06a4308_0 anaconda
aws-c-auth 0.6.26 h987a71b_2 conda-forge
aws-c-cal 0.5.21 h48707d8_2 conda-forge
aws-c-common 0.8.14 h0b41bf4_0 conda-forge
aws-c-compression 0.2.16 h03acc5a_5 conda-forge
aws-c-event-stream 0.2.20 h00877a2_4 conda-forge
aws-c-http 0.7.6 hf342b9f_0 conda-forge
aws-c-io 0.13.19 h5b20300_3 conda-forge
aws-c-mqtt 0.8.6 hc4349f7_12 conda-forge
aws-c-s3 0.2.7 h909e904_1 conda-forge
aws-c-sdkutils 0.1.9 h03acc5a_0 conda-forge
aws-checksums 0.1.14 h03acc5a_5 conda-forge
aws-crt-cpp 0.19.8 hf7fbfca_12 conda-forge
aws-sdk-cpp 1.10.57 h17c43bd_8 conda-forge
requests 2.31.0 py310h06a4308_0 anaconda
requests-oauthlib 1.3.0 py_0 anaconda
s3transfer 0.7.0 py310h06a4308_0 anaconda
```
Re: S3 PODAAC Access Denied
I wasn't able to get direct S3 access working, so I ended up looking at https://github.com/podaac/data-subscriber/tree/main/subscriber and built a custom implementation of the PO.DAAC Data Subscriber to solve my issue. Maybe you could do something similar?
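For anyone taking the same route: at its core the subscriber is a CMR granule search plus authenticated HTTPS downloads. A stripped-down, hedged sketch of that idea (the function names are mine; `requests` again picks up EDL credentials from `~/.netrc`):

```python
import requests

CMR_GRANULES = "https://cmr.earthdata.nasa.gov/search/granules.json"

def data_links(entry):
    """Pull the HTTPS data links out of one CMR granule entry."""
    return [link["href"] for link in entry.get("links", [])
            if link.get("rel", "").endswith("/data#") and "href" in link]

def find_download_urls(short_name, temporal, page_size=10):
    """Search CMR for granules in a collection and return download URLs."""
    resp = requests.get(CMR_GRANULES, params={
        "short_name": short_name,   # e.g. "TELLUS_GRAC_L3_CSR_RL06_LND_v04"
        "temporal": temporal,       # "start,end" in ISO-8601
        "page_size": page_size,
    })
    resp.raise_for_status()
    urls = []
    for entry in resp.json()["feed"]["entry"]:
        urls.extend(data_links(entry))
    return urls

def download(url, dest):
    """Stream one granule to disk over HTTPS."""
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(dest, "wb") as f:
            for chunk in r.iter_content(chunk_size=1 << 20):
                f.write(chunk)
```

This sidesteps the S3 credential issue entirely, at the cost of downloading over HTTPS instead of in-region S3 access.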