Download ASDC Files with Wget

by njester » Mon May 24, 2021 11:44 am America/New_York

This guide shows how to download data from https://asdc.larc.nasa.gov/data/ using bash on Linux or macOS, or Cygwin on Windows.
  1. Identify the file URL or top-level URL for the file(s) you want to download. To find a URL, browse the data directory at the address above and locate the data you want to download.
  2. You'll need an Earthdata Login token. To generate a new token or retrieve an existing one, use the following steps:
    1. Head to the Earthdata Login.
    2. Select the “Generate Token” option in the top blue bar.
    3. If you do not have a token, select the green “Generate Token” button.
    4. Copy the token using the "Copy to Clipboard" button.
  3. Download your data using the script below.

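    # Top-level URL of the data you want to download (not your own server).
    URL="<enter the top level URL here>"
    # Earthdata Login token copied in step 2.
    TOKEN="<paste your token here>"
    wget --header "Authorization: Bearer $TOKEN" --recursive --no-parent --reject "index.html*" --execute robots=off "$URL"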
    
Wget has many options that let you customize how it finds and downloads files. If you need something that the script above doesn't do, check the official documentation:
https://www.gnu.org/software/wget/
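
The wget call above can also be wrapped in a small function so that a missing URL or token is caught before anything is requested. This is only a sketch: the function name `asdc_wget` and the `DRY_RUN` variable are made up for illustration and are not part of the guide. The wget options are the same ones used in the script above.

```shell
# Hypothetical wrapper around the wget call from the guide.
# Usage: asdc_wget <top-level URL> <Earthdata Login token>
asdc_wget() {
    local url="$1" token="$2"
    if [ -z "$url" ] || [ -z "$token" ]; then
        echo "usage: asdc_wget <url> <token>" >&2
        return 2
    fi
    # If DRY_RUN is set to a non-empty value (an assumption of this sketch),
    # the command is printed instead of being run.
    ${DRY_RUN:+echo} wget --header "Authorization: Bearer $token" \
        --recursive --no-parent --reject "index.html*" \
        --execute robots=off "$url"
}
```

Running `DRY_RUN=1 asdc_wget "$URL" "$TOKEN"` first prints the command that would be executed, which makes it easy to confirm that URL points at the data directory and not at some other host.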
Last edited by njester on Fri Jun 17, 2022 1:50 pm America/New_York, edited 5 times in total.

Re: Download Multiple ASDC Files with Wget

by lefsky » Fri Aug 13, 2021 2:35 pm America/New_York

I am trying to adapt this script to allow me to download all of the file locations within a particular directory, specifically

https://asdc.larc.nasa.gov/data/DSCOVR/EPIC/L2_CLOUD_03

using the --spider option but it only downloads the index.html.* files rather than rejecting them. Can you assist?

ASDC - cheyenne.e.land
Subject Matter Expert

Re: Download Multiple ASDC Files with Wget

by ASDC - cheyenne.e.land » Mon Aug 16, 2021 11:22 am America/New_York

Hello,

Thank you for your question. A Subject Matter Expert has been notified and will answer your question shortly.

Re: Download Multiple ASDC Files with Wget

by njester » Thu Aug 19, 2021 4:35 pm America/New_York

lefsky wrote: Fri Aug 13, 2021 2:35 pm America/New_York
> I am trying to adapt this script to allow me to download all of the file locations within a particular directory, specifically
>
> https://asdc.larc.nasa.gov/data/DSCOVR/EPIC/L2_CLOUD_03
>
> using the --spider option but it only downloads the index.html.* files rather than rejecting them. Can you assist?

The script provided should download all the files within a directory. Did you use the first code block to create the .netrc and .urs_cookies files? If not, or if the credentials you entered were wrong, it will download an HTML page saying that the login failed. I don't think you'll need --spider for this application unless I've misunderstood your question.
-Nathan
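
If the goal is a list of file locations and not the files themselves, one possible approach is to run wget with --spider and pull the URLs out of its log. The sketch below assumes the usual wget log layout, where each request line starts with `--<date> <time>--` followed by the URL. The helper name `extract_urls` is made up for illustration. In recursive mode wget still has to fetch the index pages to find links, so seeing index.html files appear during the run is expected.

```shell
# Hypothetical helper: reads a wget log on stdin and prints the file URLs.
extract_urls() {
    # Request lines look like: --2021-08-13 14:35:00--  https://host/path
    grep '^--' \
        | awk '{ print $3 }' \
        | grep -v '/$' \
        | grep -v 'index\.html' \
        | sort -u
}

# Example use (contacts the server, so it is left commented out):
# wget --spider --header "Authorization: Bearer $TOKEN" --recursive \
#      --no-parent --execute robots=off "$URL" 2>&1 | extract_urls > file_urls.txt
```

Directory URLs (ending in `/`) and index pages are filtered out so that only file locations remain.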

Re: Download Multiple ASDC Files with Wget

by roux » Tue Nov 02, 2021 7:24 am America/New_York

Note that there is a gotcha in this recipe. This line

"""
echo "machine urs.earthdata.nasa.gov login $USERNAME password $PASSWORD" > .netrc
"""

will overwrite whatever you had in your ~/.netrc.

Back up often and stay safe :)
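
For anyone still keeping a .netrc, a safer variant might append the entry instead of overwriting the file, and keep a backup first. This is only a sketch: the function name `add_netrc_entry` and the `NETRC_FILE` override are made up for illustration; the entry format is the one from the line quoted above.

```shell
# Hypothetical helper: adds the Earthdata entry without clobbering ~/.netrc.
# NETRC_FILE is an assumed override so the function can be tried on a copy.
add_netrc_entry() {
    local user="$1" pass="$2"
    local netrc="${NETRC_FILE:-$HOME/.netrc}"
    local host="urs.earthdata.nasa.gov"
    # Create the file if needed and keep it private.
    touch "$netrc" && chmod 600 "$netrc"
    # Do nothing if an entry for this host is already there.
    if grep -qF "machine $host" "$netrc"; then
        echo "entry for $host already present in $netrc" >&2
        return 0
    fi
    cp "$netrc" "$netrc.bak"    # backup of the previous contents
    echo "machine $host login $user password $pass" >> "$netrc"    # >> appends
}
```

The key difference from the original line is `>>` in place of `>`: the first appends to the file, the second truncates it.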

Re: Download Multiple ASDC Files with Wget

by njester » Tue Nov 02, 2021 12:12 pm America/New_York

roux wrote: Tue Nov 02, 2021 7:24 am America/New_York
> Note that there is a gotcha in this recipe. This line
>
> """
> echo "machine urs.earthdata.nasa.gov login $USERNAME password $PASSWORD" > .netrc
> """
>
> will overwrite whatever you had in your ~/.netrc.
>
> Back up often and stay safe :)

That's a great point! I'll update the code example!

Re: Download Multiple ASDC Files with Wget

by mingquan » Fri Jan 14, 2022 6:20 pm America/New_York

njester wrote:
> lefsky wrote:
> > I am trying to adapt this script to allow me to download all of the file
> > locations within a particular directory, specifically
> >
> > https://asdc.larc.nasa.gov/data/DSCOVR/EPIC/L2_CLOUD_03
> >
> > using the --spider option but it only downloads the index.html.* files
> > rather than rejecting them. Can you assist?
>
> The script provided should download all the files within a directory. Did
> you use the first code block to create the .netrc and .urs_cookies files?
> If not, or if the credentials you entered were wrong, it will download an
> HTML page saying that the login failed. I don't think you'll need --spider
> for this application unless I've misunderstood your question.
> -Nathan

The first code block works well; the issue is with the second one. How should I set URL in the second code block? I have a server, redwood.ess.uci.edu. After I define URL = redwood.ess.uci.edu, my server does not seem to respond at all when I run wget, the last line of the second code block.

Re: Download Multiple ASDC Files with Wget

by njester » Thu Jan 20, 2022 9:40 am America/New_York

I think I may have used poor wording in the example; thank you for pointing this out. I can see where "URL=<your url here>" could be confusing, because it's not your personal URL but the URL of the files you want to download. I'll update the wording later today. Thank you!

Here's an example of how to set the URL for a download:

Let's say you wanted to download the contents of

    https://asdc.larc.nasa.gov/data/ACEPOL/AircraftRemoteSensing_AirHARP_Data_1/

You would change

    URL=<your url here>

to

    URL=https://asdc.larc.nasa.gov/data/ACEPOL/AircraftRemoteSensing_AirHARP_Data_1/
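
A quick sanity check before running wget can catch a URL that points somewhere other than the ASDC data directory. The sketch below is an illustration only; the function name `is_asdc_data_url` is made up. Note also that a shell assignment must not have spaces around `=`: `URL = value` is parsed as a command named `URL`, not as an assignment.

```shell
# Hypothetical check: succeeds only for URLs under the ASDC data directory.
is_asdc_data_url() {
    case "$1" in
        https://asdc.larc.nasa.gov/data/?*) return 0 ;;
        *) return 1 ;;
    esac
}

# No spaces around "=" in the assignment.
URL=https://asdc.larc.nasa.gov/data/ACEPOL/AircraftRemoteSensing_AirHARP_Data_1/

if is_asdc_data_url "$URL"; then
    echo "OK: $URL"
else
    echo "Not an ASDC data URL: $URL" >&2
fi
```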

Re: Download ASDC Files with Wget

by joeyhotz » Wed Aug 31, 2022 2:48 pm America/New_York

Nathan wrote in a previous post:
> The script provided should download all the files within a directory. Did you use the first code block to create the .netrc and .urs_cookies files? If not, or if the credentials you entered were wrong, it will download an HTML page saying that the login failed. I don't think you'll need --spider for this application unless I've misunderstood your question.

Has the first code block which creates the .netrc and .urs_cookies files been deleted from the forum post above?

I am having the issue which you mentioned where I am downloading an HTML page which says that my login failed, but I am unsure how to set up these two files to circumvent this issue.

Re: Download ASDC Files with Wget

by njester » Thu Sep 01, 2022 9:25 am America/New_York

@joeyhotz

Yes, I had to update the old guide. Earthdata Login was updated and now requires a token instead of a username and password. Have you tried the new token-based script listed above?
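
To tell whether a download is real data or an HTML page saved in its place, such as a login-failed page, one option is to look at the start of each file. This is a rough sketch: the helper name `looks_like_html` is made up, and the check only looks for an HTML doctype or `<html` tag in the first 512 bytes.

```shell
# Hypothetical helper: succeeds if the file starts like an HTML document.
looks_like_html() {
    head -c 512 "$1" | grep -qiE '<!doctype html|<html'
}

# Example use, assuming wget saved files under a directory named after the host:
# find asdc.larc.nasa.gov -type f | while read -r f; do
#     looks_like_html "$f" && echo "possible login page: $f"
# done
```

If files are flagged, the token is a likely first thing to check, for example whether it was pasted completely or has expired.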
