wget download of order manifest not working

imaginaryfish
Posts: 10
Joined: Fri Jun 04, 2021 4:12 pm America/New_York
Answers: 0

by imaginaryfish » Fri Sep 16, 2022 6:17 pm America/New_York

Hello,
I placed a small data order to test the download step of a processing pipeline, but downloading it with wget does not return what I expect.

My code:
wget --load-cookies cookie_path --save-cookies cookie_path --keep-session-cookies --auth-no-challenge=on --no-check-certificate --content-disposition -i https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/requested_files_1.tar?h=ocdist302&p=/data1/d090a3c740273d67/requested_files

The resulting message is:
--2022-09-16 16:40:20-- https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/requested_files_1.tar?h=ocdist302
Resolving oceandata.sci.gsfc.nasa.gov (oceandata.sci.gsfc.nasa.gov)... 2001:4d0:2418:128::84, 169.154.128.84
Connecting to oceandata.sci.gsfc.nasa.gov (oceandata.sci.gsfc.nasa.gov)|2001:4d0:2418:128::84|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: /ob/getfile/requested_files_1.tar?h=ocdist302 [following]
--2022-09-16 16:40:20-- https://oceandata.sci.gsfc.nasa.gov/ob/getfile/requested_files_1.tar?h=ocdist302
Reusing existing connection to [oceandata.sci.gsfc.nasa.gov]:443.
HTTP request sent, awaiting response... 302 Found
Location: https://urs.earthdata.nasa.gov/oauth/authorize?client_id=Z0u-MdLNypXBjiDREZ3roA&response_type=code&redirect_uri=https%3A%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict [following]
--2022-09-16 16:40:21-- https://urs.earthdata.nasa.gov/oauth/authorize?client_id=Z0u-MdLNypXBjiDREZ3roA&response_type=code&redirect_uri=https%3A%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict
Resolving urs.earthdata.nasa.gov (urs.earthdata.nasa.gov)... 2001:4d0:241a:4081::89, 198.118.243.33
Connecting to urs.earthdata.nasa.gov (urs.earthdata.nasa.gov)|2001:4d0:241a:4081::89|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘authorize?client_id=Z0u-MdLNypXBjiDREZ3roA&response_type=code&redirect_uri=https:%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict’

0K .......... .. 24.4M=0s

2022-09-16 16:40:21 (24.4 MB/s) - ‘authorize?client_id=Z0u-MdLNypXBjiDREZ3roA&response_type=code&redirect_uri=https:%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict’ saved [12389]

--2022-09-16 16:40:21-- https://cdn.earthdata.nasa.gov/eui/1.1.3/stylesheets/application.css
Resolving cdn.earthdata.nasa.gov (cdn.earthdata.nasa.gov)... 2001:4d0:241a:4081::87, 198.118.243.36
Connecting to cdn.earthdata.nasa.gov (cdn.earthdata.nasa.gov)|2001:4d0:241a:4081::87|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 77025 (75K) [text/css]
Saving to: ‘application.css’

0K .......... .......... .......... .......... .......... 66% 757K 0s
50K .......... .......... ..... 100% 47.3M=0.07s

2022-09-16 16:40:21 (1.10 MB/s) - ‘application.css’ saved [77025/77025]

--2022-09-16 16:40:21-- https://oceandata.sci.gsfc.nasa.gov/assets/application-432b3917d4a41042c0fd963eba859548ef2993f5ed7a0dca4bdb446fdf807556.css
Connecting to oceandata.sci.gsfc.nasa.gov (oceandata.sci.gsfc.nasa.gov)|2001:4d0:2418:128::84|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2022-09-16 16:40:21 ERROR 404: Not Found.

--2022-09-16 16:40:21-- https://netdna.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css
Resolving netdna.bootstrapcdn.com (netdna.bootstrapcdn.com)... 2606:4700::6812:bcf, 2606:4700::6812:acf, 104.18.10.207, ...
Connecting to netdna.bootstrapcdn.com (netdna.bootstrapcdn.com)|2606:4700::6812:bcf|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/css]
Saving to: ‘font-awesome.min.css’

0K .......... .......... ... 5.18M=0.004s

2022-09-16 16:40:21 (5.18 MB/s) - ‘font-awesome.min.css’ saved [23739]

--2022-09-16 16:40:21-- https://fonts.googleapis.com/css?family=Source+Sans+Pro:300,700
Resolving fonts.googleapis.com (fonts.googleapis.com)... 2607:f8b0:4009:80a::200a, 142.250.190.106
Connecting to fonts.googleapis.com (fonts.googleapis.com)|2607:f8b0:4009:80a::200a|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/css]
Saving to: ‘css?family=Source+Sans+Pro:300,700’

0K 134M=0s

2022-09-16 16:40:22 (134 MB/s) - ‘css?family=Source+Sans+Pro:300,700’ saved [420]

--2022-09-16 16:40:22-- https://www.googletagmanager.com/ns.html?id=GTM-WNP7MLF
Resolving www.googletagmanager.com (www.googletagmanager.com)... 2607:f8b0:4009:804::2008, 172.217.2.40
Connecting to www.googletagmanager.com (www.googletagmanager.com)|2607:f8b0:4009:804::2008|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘ns.html?id=GTM-WNP7MLF’

0K 54.8M=0s

2022-09-16 16:40:22 (54.8 MB/s) - ‘ns.html?id=GTM-WNP7MLF’ saved [460]

--2022-09-16 16:40:22-- https://oceandata.sci.gsfc.nasa.gov/
Connecting to oceandata.sci.gsfc.nasa.gov (oceandata.sci.gsfc.nasa.gov)|2001:4d0:2418:128::84|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 20703 (20K) [text/html]
Saving to: ‘index.html’

0K .......... .......... 100% 472K=0.04s

2022-09-16 16:40:22 (472 KB/s) - ‘index.html’ saved [20703/20703]

--2022-09-16 16:40:22-- https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/requested_files_1.tar?h=ocdist302
Reusing existing connection to [oceandata.sci.gsfc.nasa.gov]:443.
HTTP request sent, awaiting response... 302 Found
Location: /ob/getfile/requested_files_1.tar?h=ocdist302 [following]
--2022-09-16 16:40:22-- https://oceandata.sci.gsfc.nasa.gov/ob/getfile/requested_files_1.tar?h=ocdist302
Reusing existing connection to [oceandata.sci.gsfc.nasa.gov]:443.
HTTP request sent, awaiting response... 302 Found
Location: https://urs.earthdata.nasa.gov/oauth/authorize?client_id=Z0u-MdLNypXBjiDREZ3roA&response_type=code&redirect_uri=https%3A%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict [following]
--2022-09-16 16:40:22-- https://urs.earthdata.nasa.gov/oauth/authorize?client_id=Z0u-MdLNypXBjiDREZ3roA&response_type=code&redirect_uri=https%3A%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict
Connecting to urs.earthdata.nasa.gov (urs.earthdata.nasa.gov)|2001:4d0:241a:4081::89|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘authorize?client_id=Z0u-MdLNypXBjiDREZ3roA&response_type=code&redirect_uri=https:%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict.1’

0K .......... .. 19.5M=0.001s

2022-09-16 16:40:22 (19.5 MB/s) - ‘authorize?client_id=Z0u-MdLNypXBjiDREZ3roA&response_type=code&redirect_uri=https:%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict.1’ saved [12389]

--2022-09-16 16:40:22-- https://oceandata.sci.gsfc.nasa.gov/assets/hamburger-68c8505066427f3e3f6ee40b24cfd3c9f7c0fe93ee298b9046564637262115fa.png
Connecting to oceandata.sci.gsfc.nasa.gov (oceandata.sci.gsfc.nasa.gov)|2001:4d0:2418:128::84|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2022-09-16 16:40:23 ERROR 404: Not Found.

--2022-09-16 16:40:23-- https://oceandata.sci.gsfc.nasa.gov/users/new?client_id=Z0u-MdLNypXBjiDREZ3roA&redirect_uri=https%3A%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict&response_type=code
Reusing existing connection to [oceandata.sci.gsfc.nasa.gov]:443.
HTTP request sent, awaiting response... 404 Not Found
2022-09-16 16:40:23 ERROR 404: Not Found.

--2022-09-16 16:40:23-- https://oceandata.sci.gsfc.nasa.gov/retrieve_info
Reusing existing connection to [oceandata.sci.gsfc.nasa.gov]:443.
HTTP request sent, awaiting response... 404 Not Found
2022-09-16 16:40:23 ERROR 404: Not Found.

--2022-09-16 16:40:23-- https://oceandata.sci.gsfc.nasa.gov/reset_passwords/new
Reusing existing connection to [oceandata.sci.gsfc.nasa.gov]:443.
HTTP request sent, awaiting response... 404 Not Found
2022-09-16 16:40:23 ERROR 404: Not Found.

--2022-09-16 16:40:23-- https://oceandata.sci.gsfc.nasa.gov/users/new?client_id=Z0u-MdLNypXBjiDREZ3roA&redirect_uri=https%3A%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict&response_type=code
Reusing existing connection to [oceandata.sci.gsfc.nasa.gov]:443.
HTTP request sent, awaiting response... 404 Not Found
2022-09-16 16:40:23 ERROR 404: Not Found.

--2022-09-16 16:40:23-- https://nodis3.gsfc.nasa.gov/displayDir.cfm?t=NPD&c=2810&s=1E
Resolving nodis3.gsfc.nasa.gov (nodis3.gsfc.nasa.gov)... 2001:4d0:2310:153::24, 129.164.181.207
Connecting to nodis3.gsfc.nasa.gov (nodis3.gsfc.nasa.gov)|2001:4d0:2310:153::24|:443... failed: Operation timed out.
Connecting to nodis3.gsfc.nasa.gov (nodis3.gsfc.nasa.gov)|129.164.181.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 25854 (25K) [text/html]
Saving to: ‘displayDir.cfm?t=NPD&c=2810&s=1E’

0K .......... .......... ..... 100% 291K=0.09s

2022-09-16 16:41:39 (291 KB/s) - ‘displayDir.cfm?t=NPD&c=2810&s=1E’ saved [25854/25854]

--2022-09-16 16:41:39-- https://oceandata.sci.gsfc.nasa.gov/
Connecting to oceandata.sci.gsfc.nasa.gov (oceandata.sci.gsfc.nasa.gov)|2001:4d0:2418:128::84|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 20703 (20K) [text/html]
Saving to: ‘index.html.1’

0K .......... .......... 100% 478K=0.04s

2022-09-16 16:41:39 (478 KB/s) - ‘index.html.1’ saved [20703/20703]

--2022-09-16 16:41:39-- https://oceandata.sci.gsfc.nasa.gov/users/new
Reusing existing connection to [oceandata.sci.gsfc.nasa.gov]:443.
HTTP request sent, awaiting response... 404 Not Found
2022-09-16 16:41:39 ERROR 404: Not Found.

--2022-09-16 16:41:39-- https://oceandata.sci.gsfc.nasa.gov/documentation
Reusing existing connection to [oceandata.sci.gsfc.nasa.gov]:443.
HTTP request sent, awaiting response... 404 Not Found
2022-09-16 16:41:39 ERROR 404: Not Found.

URL transformed to HTTPS due to an HSTS policy
--2022-09-16 16:41:39-- https://www.nasa.gov/
Resolving www.nasa.gov (www.nasa.gov)... 2600:9000:212f:e200:12:80e9:d700:93a1, 2600:9000:212f:2600:12:80e9:d700:93a1, 2600:9000:212f:1e00:12:80e9:d700:93a1, ...
Connecting to www.nasa.gov (www.nasa.gov)|2600:9000:212f:e200:12:80e9:d700:93a1|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11314 (11K) [text/html]
Saving to: ‘index.html.2’

0K .......... . 100% 55.6M=0s

2022-09-16 16:41:39 (55.6 MB/s) - ‘index.html.2’ saved [11314/11314]

--2022-09-16 16:41:39-- https://oceandata.sci.gsfc.nasa.gov/assets/application-15d4faf28e91715dccccf34f3e808b7a348298fb623e29d09281b30cc1f87492.js
Connecting to oceandata.sci.gsfc.nasa.gov (oceandata.sci.gsfc.nasa.gov)|2001:4d0:2418:128::84|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2022-09-16 16:41:40 ERROR 404: Not Found.

--2022-09-16 16:41:40-- https://cdn.earthdata.nasa.gov/tophat2/tophat2.js
Connecting to cdn.earthdata.nasa.gov (cdn.earthdata.nasa.gov)|2001:4d0:241a:4081::87|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 15591 (15K) [application/javascript]
Saving to: ‘tophat2.js’

0K .......... ..... 100% 24.7M=0.001s

2022-09-16 16:41:40 (24.7 MB/s) - ‘tophat2.js’ saved [15591/15591]

--2022-09-16 16:41:40-- https://fbm.earthdata.nasa.gov/for/URS4/feedback.js
Resolving fbm.earthdata.nasa.gov (fbm.earthdata.nasa.gov)... 2001:4d0:241a:4081::91, 198.118.243.39
Connecting to fbm.earthdata.nasa.gov (fbm.earthdata.nasa.gov)|2001:4d0:241a:4081::91|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/javascript]
Saving to: ‘feedback.js’

0K ....... 23.2M=0s

2022-09-16 16:41:40 (23.2 MB/s) - ‘feedback.js’ saved [7351]

--2022-09-16 16:41:40-- https://dap.digitalgov.gov/Universal-Federated-Analytics-Min.js?agency=NASA&subagency=GSFC&dclink=true
Resolving dap.digitalgov.gov (dap.digitalgov.gov)... 2600:9000:204d:3000:5:83ea:ba80:93a1, 2600:9000:204d:f400:5:83ea:ba80:93a1, 2600:9000:204d:e400:5:83ea:ba80:93a1, ...
Connecting to dap.digitalgov.gov (dap.digitalgov.gov)|2600:9000:204d:3000:5:83ea:ba80:93a1|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 18764 (18K) [application/javascript]
Saving to: ‘Universal-Federated-Analytics-Min.js?agency=NASA&subagency=GSFC&dclink=true’

0K .......... ........ 100% 60.7M=0s

2022-09-16 16:41:40 (60.7 MB/s) - ‘Universal-Federated-Analytics-Min.js?agency=NASA&subagency=GSFC&dclink=true’ saved [18764/18764]

FINISHED --2022-09-16 16:41:40--
Total wall clock time: 1m 20s
Downloaded: 13 files, 241K in 0.2s (982 KB/s)

I can download the data by just pasting the URL into a browser, but that is not the point. I am using a MacBook running macOS Big Sur 11.6.8 (20G730). I can download individual files fine, but the order is not working.

OB General Science - guoqingw
Subject Matter Expert
Posts: 78
Joined: Fri Jun 03, 2022 10:54 am America/New_York
Answers: 0
Location: NASA GSFC
Been thanked: 1 time

by OB General Science - guoqingw » Tue Sep 20, 2022 2:32 pm America/New_York

You may need to configure your username and password for authentication in a .netrc file, following the instructions on this page:
https://oceancolor.gsfc.nasa.gov/data/download_methods/#netrc
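
For orientation, the setup amounts to roughly the following. This is a sketch rather than a copy of the official instructions: the helper name write_netrc and the placeholder credentials are hypothetical, and the target directory is passed as an argument so the helper can be tried in a scratch directory before it touches a real home directory.

```shell
#!/bin/sh
# write_netrc DIR USER PASS  (hypothetical helper, not part of any NASA tool)
# Writes an Earthdata Login entry to DIR/.netrc and makes sure DIR/.urs_cookies
# exists for wget's --load-cookies/--save-cookies options.
# Note: this overwrites any existing DIR/.netrc.
write_netrc() {
    dir=$1 user=$2 pass=$3
    # urs.earthdata.nasa.gov is the login host the download redirects to.
    printf 'machine urs.earthdata.nasa.gov login %s password %s\n' \
        "$user" "$pass" > "$dir/.netrc"
    # The file holds a password, so restrict it to the owner.
    chmod 600 "$dir/.netrc"
    # Create the cookie file if missing; an existing one is left untouched.
    touch "$dir/.urs_cookies"
}

# Example (placeholder credentials):
#   write_netrc "$HOME" your_username 'your_password'
```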

by imaginaryfish » Tue Sep 20, 2022 3:57 pm America/New_York

Thank you for the instructions. I regenerated the .netrc and .urs_cookies files in my home directory. I am able to download other files from the direct data access with no problem; however, when I try to download the manifest with the following command:

wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --no-check-certificate --content-disposition -i https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/requested_files_1.tar?h=ocdist302&p=/data1/d090a3c740273d67/requested_files

the response is:

"HTTP request sent, awaiting response... 409 Conflict
2022-09-20 14:46:54 ERROR 409: Conflict.

No URLs found in https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/requested_files_1.tar?h=ocdist302."

My data order manager says this order is valid until 9/23/2022 and I can still paste the url into a browser and download it directly (https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/requested_files_1.tar?h=ocdist302&p=/data1/d090a3c740273d67/requested_files). I am confused why this is happening.
Last edited by imaginaryfish on Tue Sep 20, 2022 3:59 pm America/New_York, edited 1 time in total.

by imaginaryfish » Tue Sep 20, 2022 3:58 pm America/New_York

Sorry, correction, the data order is valid until 9/23/2022

OB.DAAC - SeanBailey
User Services
Posts: 1464
Joined: Wed Sep 18, 2019 6:15 pm America/New_York
Answers: 1
Been thanked: 5 times

by OB.DAAC - SeanBailey » Wed Sep 21, 2022 10:35 am America/New_York

Your wget command is incorrect. You do not need (or want) the "-i" option here.

Try this instead:

wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --no-check-certificate --content-disposition 'https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/requested_files_1.tar?h=ocdist302&p=/data1/d090a3c740273d67/requested_files'
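
One detail that matters for this URL: it contains "?" and "&", which most shells treat specially unless the URL is quoted. An unquoted "&" ends the command and runs it in the background, so wget would only receive the URL up to "h=ocdist302", which matches the shortened URL that appears in the wget logs earlier in this thread. The sketch below illustrates the difference; the variable names are illustrative, and the wget call is left commented out so the snippet runs without network access.

```shell
#!/bin/sh
# Single quotes keep '?' and '&' literal, so the whole URL stays one argument.
url='https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/requested_files_1.tar?h=ocdist302&p=/data1/d090a3c740273d67/requested_files'

# For comparison: left unquoted, the shell would cut the command at the
# first '&', so wget would receive only this part.
truncated=${url%%&*}

printf 'quoted:   %s\n' "$url"
printf 'unquoted: %s\n' "$truncated"

# Same options as in this thread, with the URL passed as one quoted argument:
# wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies \
#      --auth-no-challenge=on --no-check-certificate --content-disposition "$url"
```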

But, since you downloaded it via a browser, you don't need the wget call. You have the file. It's a tar file that contains the following files:
❯ tar -tf requested_files_1.tar
requested_files/AQUA_MODIS.20030104T182500.L2.OC.x.nc
requested_files/AQUA_MODIS.20030101T193001.L2.OC.x.nc
requested_files/AQUA_MODIS.20030103T192001.L2.OC.x.nc
requested_files/AQUA_MODIS.20030108T193501.L2.OC.x.nc
requested_files/AQUA_MODIS.20030101T175001.L2.OC.x.nc
requested_files/AQUA_MODIS.20030101T175501.L2.OC.x.nc
requested_files/AQUA_MODIS.20030107T185500.L2.OC.x.nc
requested_files/AQUA_MODIS.20030102T183500.L2.OC.x.nc
requested_files/AQUA_MODIS.20030106T181001.L2.OC.x.nc
requested_files/AQUA_MODIS.20030105T190500.L2.OC.x.nc
requested_files/AQUA_MODIS.20030108T180001.L2.OC.x.nc
requested_files/AQUA_MODIS.20030108T194001.L2.OC.x.nc
requested_files/AQUA_MODIS.20030106T181501.L2.OC.x.nc
requested_files/AQUA_MODIS.20030103T174001.L2.OC.x.nc
requested_files/AQUA_MODIS.20030105T191001.L2.OC.x.nc
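
To check and unpack an archive like this from a script, something along these lines should work with both GNU tar and the BSD tar shipped with macOS. The helper name unpack_order and the destination directory are hypothetical.

```shell
#!/bin/sh
# unpack_order ARCHIVE DEST  (hypothetical helper)
# Lists the archive first, so a bad download (for example a saved HTML login
# page) is rejected before anything is extracted, then extracts into DEST.
unpack_order() {
    archive=$1 dest=$2
    tar -tf "$archive" > /dev/null || {
        echo "not a readable tar file: $archive" >&2
        return 1
    }
    mkdir -p "$dest" && tar -xf "$archive" -C "$dest"
}

# Example: unpack_order requested_files_1.tar ./order_data
```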

Sean

by imaginaryfish » Wed Sep 21, 2022 6:23 pm America/New_York

Thank you for the suggestion Sean.

I am working on a pipeline to automate the download and processing of a large chunk of data, so this is a test run on a small subset.

I did get this download to work. I did need the "-i" option in wget. The problem was that I was supplying wget with the URL from the manifest file instead of the manifest file itself. It's the simple things that derail larger plans.
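
Put differently, "-i" expects a file that lists URLs. When it is given a remote URL instead, wget downloads that document and scans it for links, which would explain the stray CSS, JavaScript and HTML pages in the first log. A small guard can catch the mix-up early. This is only a sketch: the helper name, the manifest file name and the WGET override variable are hypothetical, and the wget options are the ones already used in this thread.

```shell
#!/bin/sh
# download_from_manifest FILE  (hypothetical helper)
# Refuses anything that is not a readable local file, then hands the file
# to wget -i so that every URL listed in it is fetched.
download_from_manifest() {
    manifest=$1
    case $manifest in
        http://*|https://*)
            echo "expected a local manifest file, got a URL: $manifest" >&2
            return 2 ;;
    esac
    [ -r "$manifest" ] || { echo "cannot read $manifest" >&2; return 1; }
    # WGET is an assumption made for this sketch: it lets a dry run
    # substitute another command, e.g. WGET=echo.
    ${WGET:-wget} --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies \
        --auth-no-challenge=on --no-check-certificate --content-disposition \
        -i "$manifest"
}

# Example: download_from_manifest manifest.txt
```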

Thanks again to guoqingw also.

by OB General Science - guoqingw » Thu Sep 22, 2022 9:51 am America/New_York

I think the problem could come from the bulk download you are trying to do. When using wget to download more than one file, you can put all the URLs (e.g. https://oceandata.sci.gsfc.nasa.gov/ob/getfile/T2017004001500.L1A_LAC.bz2) into a text file (such as required_files.txt), then use the following command to download them:
wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies --auth-no-challenge=on --no-check-certificate --content-disposition -i required_files.txt
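
If you start from a list of file names rather than full URLs, the text file can be generated. The sketch below assumes every file is served under the same /ob/getfile/ path as the example URL above; the helper name make_url_list and the names.txt input file are hypothetical.

```shell
#!/bin/sh
# make_url_list NAMES_FILE OUT_FILE  (hypothetical helper)
# Turns a file of file names, one per line, into a file of download URLs
# by prefixing each name with the /ob/getfile/ base URL.
make_url_list() {
    names=$1 out=$2
    # Drop blank lines, then prefix every remaining line with the base URL.
    sed -e '/^[[:space:]]*$/d' \
        -e 's|^|https://oceandata.sci.gsfc.nasa.gov/ob/getfile/|' \
        "$names" > "$out"
}

# Example:
#   printf '%s\n' T2017004001500.L1A_LAC.bz2 > names.txt
#   make_url_list names.txt required_files.txt
#   then pass required_files.txt to wget with -i as in the command above
```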

The information on this page, under the "Retrieving Orders" tab, should be helpful to you:
https://oceancolor.gsfc.nasa.gov/data/download_methods/

Please let us know whether this helps.