I'm having the same problems as above - wget only works --auth-no-challenge=on, but since this method sends plaintext password, it's not great. LPDAAC downloads (eg, https://e4ftl01.cr.usgs.gov/VIIRS/VNP13A1.001/2016.08.28/VNP13A1.A2016241.h24v06.001.2018162005736.h5) also require authentication, but work with just "wget --user --password". Is it possible to configure this site in a similar way?
--auth-no-challenge=on is not ideal, but it is going over an HTTPS connection, so it isn't as bad as it could be :eek:
You probably should use the .netrc/urs_cookie approach described on https://oceancolor.gsfc.nasa.gov/data/download_methods/ instead of the command line username/password approach.
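For reference, the .netrc/urs_cookies recipe from that page looks roughly like this; the URS hostname and wget flags follow the Earthdata/OBPG download instructions, but double-check them against the page above before relying on them (the example URL is the LP DAAC file from the first post):

```shell
# One-time setup: store Earthdata Login credentials in ~/.netrc
# (replace USERNAME/PASSWORD with your own Earthdata credentials).
echo "machine urs.earthdata.nasa.gov login USERNAME password PASSWORD" >> ~/.netrc
chmod 0600 ~/.netrc    # keep the credentials private
touch ~/.urs_cookies   # cookie jar reused across downloads

# Download: wget reads the credentials from ~/.netrc and keeps the URS
# session cookie, so the password never appears on the command line or
# in the process list, even though it is still sent pre-emptively
# (over HTTPS) because of --auth-no-challenge.
wget --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies \
     --auth-no-challenge=on --content-disposition \
     "https://e4ftl01.cr.usgs.gov/VIIRS/VNP13A1.001/2016.08.28/VNP13A1.A2016241.h24v06.001.2018162005736.h5"
```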
BTW, I ran your example file through wget with the --verbose option set, and it seems that the login fails (yes, I did pass it proper credentials :wink:), but the download proceeds anyway - which suggests to me that they're not verifying the URS response. This would explain why it *works* for them but not us (we verify).
Thank you, but the problem is I actually can't use either wget or curl, as my binaries do not have direct access to the Internet. Instead, we have an internal system that proxies HTTP requests, and I don't think I'd be able to use plaintext auth there. I was hoping to use this system for HEAD requests to get file sizes - any chance you could turn off auth for HEAD, maybe? (The actual downloads go through another system, but it's too cumbersome to use for HEAD.)
The .netrc approach also might not be trivial with the internal system.
What auth failure do you see with the LP DAAC file? What about https://n5eil01u.ecs.nsidc.org/MOST/MOD10A1.006/2000.02.24/MOD10A1.A2000055.h34v10.006.2016061160522.hdf?
(They definitely require the correct credentials.)
Yes, upon closer inspection, it does indeed seem to require a valid login.
It also spits out a 401 amid a flurry of 302s, which is odd... I've asked folks to dig deeper to see if we can get wget to be happy without the
Perhaps there is something in the way we're making the authentication request to URS...
You do not need a HEAD request to get a file size. In fact, it's bad form to do so (in my opinion :grin:)
The file_search API does not require authentication and can be used to retrieve information about a file.
If you are less specific in the search parameters, you'll get a JSON output with the information for all the files that match your search.
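A sketch of what an unauthenticated file_search call could look like from the proxied internal system. The endpoint URL and the `search`/`format` parameter names here are assumptions based on the OBPG file_search documentation, not something stated in this thread, so verify them against the API docs; the wildcard pattern is purely illustrative:

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; confirm against the OBPG file_search documentation.
FILE_SEARCH_URL = "https://oceandata.sci.gsfc.nasa.gov/api/file_search"

def build_query(search_pattern, fmt="json"):
    """Encode a file_search POST body for a filename pattern.

    Dict order is preserved, so the encoded string is deterministic.
    """
    return urllib.parse.urlencode({
        "search": search_pattern,  # e.g. a wildcard matching several files
        "format": fmt,             # ask for machine-readable JSON output
    })

def file_info(search_pattern):
    """POST the query and return the parsed JSON (needs network access)."""
    data = build_query(search_pattern).encode("ascii")
    with urllib.request.urlopen(FILE_SEARCH_URL, data=data) as resp:
        return json.loads(resp.read().decode("utf-8"))

# The body is plain form encoding - no token or login is involved:
print(build_query("V2016241*L1A*"))
```

Since no credentials are involved, the proxying system only needs to forward an ordinary HTTPS POST.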
> .:. ERROR .:.
> OceanColor Biology Processing Group (OBPG)
> Sorry, an error has occurred. Use the back button to return to the previous page or go to the Ocean Color Home Page (https://oceancolor.gsfc.nasa.gov).
Do you happen to block some IP ranges, maybe?
BTW, is the information about file size/date exported to NASA CMR?
Would your internal system be seen as crawl-????.googlebot.com? If so, then yes, you're being denied access to our search API.
Yes, the information returned by the API for filesize, etc. should match the corresponding information we provide to CMR.
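So the same size/date metadata should also be reachable through CMR's public search API, which likewise needs no authentication. A sketch of building such a lookup for the MOD10A1 granule mentioned above; the endpoint and parameter names (`short_name`, `version`, `readable_granule_name`, `page_size`) are taken from the public CMR search API documentation and should be verified there:

```python
import urllib.parse

# Assumed CMR granule-search endpoint; confirm against the CMR API docs.
CMR_GRANULES = "https://cmr.earthdata.nasa.gov/search/granules.json"

def granule_query_url(short_name, version, granule_name):
    """Build an unauthenticated CMR granule-search URL."""
    params = {
        "short_name": short_name,              # collection short name
        "version": version,                    # collection version
        "readable_granule_name": granule_name, # exact granule filename
        "page_size": "1",                      # we only want one match
    }
    return CMR_GRANULES + "?" + urllib.parse.urlencode(params)

# The JSON response exposes fields such as granule_size and
# time_start/time_end, i.e. the size/date information, without any login.
print(granule_query_url(
    "MOD10A1", "6", "MOD10A1.A2000055.h34v10.006.2016061160522.hdf"))
```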
--auth-no-challenge=on, arguing that there is no real security advantage, that the extra request has low cost/benefit, and that curl already defaults to wget's --auth-no-challenge=on behaviour. I expect such a change is more likely to appear in wget2.