I am wondering if there are any general guides for Windows users on working with the large number of individual files that come with a data request.
NCO offers great solutions, but some of the functionality is buggy or missing on Windows. The Python bindings often have similar issues. Panoply can open the individual files but offers no solution to connecting datasets. The guide for using wget was basic but clear--I would love to see something like that for dealing with the files after.
We have discussed several suggestions that may work for you. If you can provide specifics on the data you are attempting to use, we can tailor our response better to match your needs. Here are some specifications that would be useful to know: Level 2 or Level 3? Temporal resolution? File format? For the latter, we would assume netCDF, but that is not the only format we provide.
I found a solution by installing Cygwin to combine files with NCO commands, and then individually exported .csv files for the variables I needed using Panoply. I don't think this is the most efficient method...
I am using GLDAS L4 data from Noah and the netCDF .nc4 files.
If others are facing this problem I would be happy to share my workaround here if appropriate!
Regarding general advice, we saw that you can use Cygwin.
CDO has a lot of functionality: https://code.mpimet.mpg.de/projects/cdo/. Some of our staff members find it more useful than NCO.
GDAL has some good tools as well: https://gdal.org/programs/
Hello. My fellow staff members would like to know what you meant by "connecting datasets" in your original question; what is your goal? They may be able to offer more specific advice.
By connecting datasets I mean combining the individual files that wget returned, with time as the record dimension. Before I used Earthdata I had never really worked extensively with the Windows command prompt or large amounts of data, and I found the task a little intimidating--problems such as having a space in my Windows username or limits of the Windows NCO binary kept coming up and sending me back to Stack Overflow or SourceForge forums. Maybe it's best if I just post the step by step and we can point out where there is an easier way. Full disclosure, most of this was new to me.
wget: install to C:\ and have a folder with the data there as well. This was related to having a space in my Windows username between my first and last name and the install automatically choosing a folder with that username in the path.
install Cygwin: install Cygwin and add cygwin64\bin to Path in the environment variables menu. I then moved all NCO .exe files I downloaded in a Windows tarball from https://eternallybored.org/misc/wget/ into cygwin64\bin.
combine .nc4 files: I first combined the 500 or so .nc4 files I had using the record dimension of 'time', but when I opened Panoply and tried to use functions or plot data I received an error that NaN inputs were present. I then decided to combine all files with the command 'ncecat *.nc4 outputfile.nc4'. The combined file with no record dimension specified didn't return the NaN error.
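For anyone who would rather drive the concatenation from Python than from a Cygwin shell, here is a minimal sketch. It assumes the NCO operator ncrcat is on the Path, that the downloaded file names sort chronologically, and that 'time' is already the record dimension in each file (if it is not, NCO's 'ncks --mk_rec_dmn time' can convert it first). The function name build_ncrcat_command and the folder C:\data\gldas are made up for illustration. Passing the arguments as a list and running inside the data folder sidesteps the quoting problems caused by spaces in a path and keeps the command line short.

```python
import subprocess
from pathlib import Path


def build_ncrcat_command(data_dir, output_name="combined.nc4", pattern="*.nc4"):
    """Return an ncrcat command (as an argument list) for the files in data_dir.

    Only bare file names are used, so the command is meant to be run with
    data_dir as the working directory.
    """
    data_dir = Path(data_dir)
    # Sort by name; this assumes the date embedded in each file name sorts chronologically.
    inputs = sorted(
        p.name for p in data_dir.glob(pattern) if p.name != output_name
    )
    if not inputs:
        raise FileNotFoundError(f"no files matching {pattern} in {data_dir}")
    # -O overwrites an existing output file without prompting.
    return ["ncrcat", "-O", *inputs, output_name]


def combine(data_dir):
    """Run ncrcat inside data_dir; raises CalledProcessError if NCO reports a failure."""
    cmd = build_ncrcat_command(data_dir)
    subprocess.run(cmd, cwd=data_dir, check=True)


# Hypothetical usage (folder name is an assumption):
# combine(r"C:\data\gldas")
```

Unlike ncecat, which glues the files together under a new record dimension, ncrcat concatenates along the existing one, so the time coordinate from each file should be preserved in the output.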
Creating my desired data frame: I ultimately wanted a handful of variables for one location to combine with a .csv file of agricultural data. I used Panoply to export the variables one by one to .csv, opened them with Excel, and pasted them into my main spreadsheet. For variables like soil moisture, where the file>export csv option exported thousands more values than there were records, I created a line plot with record as the horizontal axis and copied and pasted the data from the array tab. In Excel, I converted the record numbers to the correct dates (record 1 = January 1980, record 2 = February 1980, etc.).
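The record-to-date conversion in that last step can also be scripted rather than done by hand in Excel. The sketch below uses only the Python standard library and assumes monthly records starting in January 1980, as described above. The function names record_to_month and write_dated_csv, and the 'value' column label, are invented for illustration.

```python
import csv
from datetime import date


def record_to_month(record, start_year=1980, start_month=1):
    """Map a 1-based monthly record number to the first day of its month."""
    if record < 1:
        raise ValueError("record numbers start at 1")
    # Count months from the start month, then split into years and months.
    offset = (start_month - 1) + (record - 1)
    return date(start_year + offset // 12, offset % 12 + 1, 1)


def write_dated_csv(values, out_path, column="value"):
    """Write one row per record, pairing each value with its month."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["date", column])
        for record, value in enumerate(values, start=1):
            writer.writerow([record_to_month(record).isoformat(), value])
```

Here the list of values would be the column copied out of Panoply for one variable at one location; the resulting file has a proper date column that can be joined to the agricultural .csv on date.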
I hope this gives a clearer idea of a novice getting lost after the files finished downloading!