Aqua and SST NC dataset questions

Use this Forum to find information on, or ask a question about, NASA Earth Science data.
kierandonnellan
Posts: 3
Joined: Tue Apr 06, 2021 4:50 am America/New_York
Answers: 0

Aqua and SST NC dataset questions

by kierandonnellan » Tue Apr 06, 2021 5:25 am America/New_York

Hello. I am working with the aqua mapped NC files in order to measure the levels of chlor_a in the ocean. I am also working with SST NC files. I have made some observations that I require feedback on please:


1 - LAT AND LON VARIATION. I compared the annual aqua chlor_a NC files from 2019 and 2020, and they both have the same increment variations between lat and lon index positions (where the difference between adjacent positions is not the same). While most of these increments can be truncated to .041666, there are outliers. Is this expected? Here is a printout from my console:

Number of different GPS increments in Chlor 2019 : 14
[0.041656494, 0.04166412353515625, 0.04166603088378906, 0.041666507720947266, 0.041666626930236816, 0.041666656732559204, 0.0416666641831398, 0.0416666679084301, 0.0416666716337204, 0.04166668653488159, 0.04166674613952637, 0.04166698455810547, 0.041667938232421875, 0.0416717529296875]

Number of different GPS increments in Chlor 2020 : 14
[0.041656494, 0.04166412353515625, 0.04166603088378906, 0.041666507720947266, 0.041666626930236816, 0.041666656732559204, 0.0416666641831398, 0.0416666679084301, 0.0416666716337204, 0.04166668653488159, 0.04166674613952637, 0.04166698455810547, 0.041667938232421875, 0.0416717529296875]

The increments from both data sets are equal.


2 - SST SPREAD. In the annual SST file for 2020, I noticed that the areas of inland water such as lakes and rivers seem larger than they are in reality. Has any 'spread' occurred during the creation of these files that would make values leak out of the real world boundaries of these bodies of water?


3 - SPIKE CHLOR ANOMALY. In the monthly aqua NC files for 2020, I noticed a spike in the chlor_a value for a particular point during the month of February. The spike was approx 98.8. However, the mode for the remainder of the year was only 3.3 approx. I investigated further by analyzing the daily data for this point for February. There was only 1 valid data point - the 16th of February. To verify, I ran the same check for all points adjoining the original point, and they had the same singular high value on the 16th. Has data corruption occurred at this point? The point is just off the coast of Brisbane, Australia. I also noticed that the annual value for this point was listed as 98.8, which is not the average over the course of the year. What type of algorithm is being used to determine which values to show on the annual data sets? Is the scale of 0 to 100 for chlor_a the true chlor value in mg, or has some other factor been applied to it?


4 - I am processing many data files as part of my study. Does NASA have a public facing REST API, or similar, that allows me to read directly from the Aqua chlor and SST databases in a more flexible way, rather than having to download and process NC data?


gnwiii
Posts: 713
Joined: Fri Jan 29, 2021 5:51 pm America/New_York
Answers: 2
Has thanked: 1 time

Re: Aqua and SST NC dataset questions

by gnwiii » Tue Apr 06, 2021 8:56 am America/New_York

1 - LAT AND LON VARIATION.

For the cylindrical equidistant projection, the exact positions are spaced by 180/N, where N is the number of latitudes. The actual values are computed using floating point math, and stored in single-precision. For N=4320,
180/N = 1/24 which gives 0.04166... where "..." is infinitely repeated 6's.
The differences you see are actually computed as a difference of two floating point numbers: a-b, where a=(j+1)*(1/24) and b=j*(1/24). For single precision, you can only expect agreement to +/-1 in the 5th decimal place:
```
> options(digits=4)
> (delta <- c(0.041656494, 0.04166412353515625, 0.04166603088378906, 0.041666507720947266, 0.041666626930236816, 0.041666656732559204, 0.0416666641831398, 0.0416666679084301, 0.0416666716337204, 0.04166668653488159, 0.04166674613952637, 0.04166698455810547, 0.041667938232421875, 0.0416717529296875))
[1] 0.04166 0.04166 0.04167 0.04167 0.04167 0.04167 0.04167
[8] 0.04167 0.04167 0.04167 0.04167 0.04167 0.04167 0.04167
```
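To see the same effect outside R, here is a minimal Python sketch. It assumes a 8640-point longitude grid of pixel centres from -180 to 180 computed in double precision and then rounded to single precision; how the coordinates in the actual files are generated is an assumption here, and `to_float32` and `grid_increments` are hypothetical helper names.

```python
import struct


def to_float32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def grid_increments(n_lon=8640):
    """Return the sorted distinct differences between adjacent float32 longitudes."""
    step = 360.0 / n_lon  # 1/24 degree for n_lon = 8640
    # Assumed layout: pixel-centre longitudes from -180 to 180, stored as float32.
    lons = [to_float32(-180.0 + (j + 0.5) * step) for j in range(n_lon)]
    # Differences between neighbours, rounded to single precision again.
    diffs = {to_float32(b - a) for a, b in zip(lons, lons[1:])}
    return sorted(diffs)


increments = grid_increments()
print(len(increments), "distinct increments")
# Rounded to 5 decimal places, the distinct values collapse to a couple of numbers.
print(sorted({round(d, 5) for d in increments}))
```

The number of distinct increments depends on how the grid is built, so it need not match the 14 reported above, but the spread around 1/24 should be of the same size.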
2 - SST SPREAD. I'll leave that to someone who works with inland lakes.

3 - SPIKE CHLOR ANOMALY

Such anomalies are not unusual and anomalous values are often associated with cloud edges. Level-2 processing is not perfect, and has problems near clouds. For some computations we either set values outside [0.01,28] to missing or to the nearest endpoint (this range is the limit for in situ observations from ships in the N. Atlantic). When the spike could influence analysis results, we look at the level-2 images for the location in question.
For some regional studies we find it better to collect the relevant level-2 files. We inspect them for anomalies and possible exclusion from the binning and mapping process. Some anomalies are real -- a reviewer once questioned a spike in a satellite chlor_a time-series for an area near Hawaii, but careful examination of level-2 images showed a patch of high chlor_a that drifted by just when the HOTS station was occupied. This confirmed that the patch was indeed due to phytoplankton. Such patches are common in remote sensing images, but are very rarely encountered by a ship taking measurements.
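The two ways of handling out-of-range values mentioned above (set to missing, or move to the nearest endpoint) could be sketched in Python roughly as follows. The function names are hypothetical, NaN stands in for "missing", and the [0.01, 28] limits are the N. Atlantic range quoted above, not a property of the files.

```python
import math

CHL_MIN, CHL_MAX = 0.01, 28.0  # range quoted above for N. Atlantic in situ data


def mask_out_of_range(values, lo=CHL_MIN, hi=CHL_MAX):
    """Replace values outside [lo, hi] with NaN (treated as missing); NaNs stay NaN."""
    return [v if lo <= v <= hi else math.nan for v in values]


def clamp_to_range(values, lo=CHL_MIN, hi=CHL_MAX):
    """Move values outside [lo, hi] to the nearest endpoint; NaNs stay NaN."""
    return [v if math.isnan(v) else min(max(v, lo), hi) for v in values]


sample = [0.005, 3.3, 98.8]  # illustrative values only
print(mask_out_of_range(sample))
print(clamp_to_range(sample))
```

Which of the two is appropriate depends on the computation: masking drops the suspect pixel entirely, while clamping keeps it but limits its influence.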

kierandonnellan
Posts: 3
Joined: Tue Apr 06, 2021 4:50 am America/New_York
Answers: 0

Re: Aqua and SST NC dataset questions

by kierandonnellan » Tue Apr 06, 2021 9:35 am America/New_York

Thanks for the quick response gnwiii! Point 1 is clear to me now.

Regarding Point 3, why is the range from 0.01 to 28? The max acceptable value that I can see in the chlor_a section of the NC files is 100.0. Is this scale of 0.01 to 100.0, as indicated in the files, the true chlor value in mg per cubic metre? Or do I need to apply a log to get the true value? (I noticed that the color legend key in the online visualization tool only goes to 20 and that an OCI algo is mentioned.)

Perhaps if I explain what I'm currently trying to do, it will be easier to understand it in context. I want to find all positions that have a chlor_a value greater than 1 mg/m3, so I need to know if I should be working with the value of 1 on the scale of 0.01 to 100.0, or the value of 1 on some other scale that I get by applying some algo to the original scale?

gnwiii
Posts: 713
Joined: Fri Jan 29, 2021 5:51 pm America/New_York
Answers: 2
Has thanked: 1 time

Re: Aqua and SST NC dataset questions

by gnwiii » Tue Apr 06, 2021 10:43 am America/New_York

The range [0.01, 28] is appropriate for the North Atlantic open ocean, where in situ measurements have not exceeded 28 (up to early 2000's).

OB.DAAC - SeanBailey
User Services
Posts: 1464
Joined: Wed Sep 18, 2019 6:15 pm America/New_York
Answers: 1
Been thanked: 4 times

Re: Aqua and SST NC dataset questions

by OB.DAAC - SeanBailey » Tue Apr 06, 2021 1:35 pm America/New_York

The range George noted is what he uses in his research. The data as distributed fall in the valid range of 0.01 to 100 mg/m3. The data are NOT log scaled, but the color scale applied is (and is limited to 20 mg/m3, as globally very little is above that). OCI is the name of the algorithm used to derive chlorophyll concentration for the archived products we distribute.
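As a rough Python illustration of that distinction: a threshold such as 1 mg/m3 is compared against the stored values directly, and the log only enters when placing a value on a colour bar. The function names are hypothetical, and the 0.01 lower end of the colour bar is an assumption; only the upper limit of 20 is stated above.

```python
import math


def exceeds(values, threshold=1.0):
    """Flag stored chlor_a values (mg/m^3) above the threshold; no log transform."""
    return [v > threshold for v in values]


def log_colour_position(value, vmin=0.01, vmax=20.0):
    """Position of a value on a log10 colour bar from vmin to vmax, clipped to [0, 1]."""
    span = math.log10(vmax) - math.log10(vmin)
    pos = (math.log10(value) - math.log10(vmin)) / span
    return min(max(pos, 0.0), 1.0)


sample = [0.5, 1.0, 3.3, 98.8]  # illustrative values only
print(exceeds(sample))
print([round(log_colour_position(v), 3) for v in sample])
```

Anything above 20 simply saturates at the top of the colour bar, while the underlying value in the file is unchanged.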

The "spread" you see is because the L3 files you're looking at are at 4.6km resolution. MOST rivers are nowhere near that wide, but as the source L2 data is nominally 1km resolution, we can derive chlorophyll for some of the larger inland water bodies. When binned (and then mapped) that 1km resolution is spread across the full 4.6km resolution pixel.

The binning of the data uses a simple mean, but as the world is a cloudy place, it is possible for a single high value from a single day to propagate through to the longer time periods without any additional data to change the "mean". Looks like your 98.8mg/m3 off Brisbane is such a case. No, it is not likely to be a valid yearly mean :D
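A toy Python sketch of why that happens, assuming a plain mean over whatever valid values exist and NaN for cloudy days. This is only an illustration of the effect, not the actual binning code, and `mean_of_valid` is a hypothetical name.

```python
import math


def mean_of_valid(values):
    """Simple mean over the non-missing values; NaN if nothing is valid."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


nan = math.nan
# Hypothetical daily values for one pixel: a single cloud-free day in the period.
one_clear_day = [nan] * 15 + [98.8] + [nan] * 12
print(mean_of_valid(one_clear_day))  # the lone value becomes the period "mean"

# With a few more valid observations, the same spike is diluted.
more_data = [98.8, 3.3, 3.3]
print(mean_of_valid(more_data))
```

With only one valid observation there is nothing to average against, so the spike carries straight through to the longer composite.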

We do have an OPeNDAP server for the L3 products you've looked at...you may be able to use that to avoid manually retrieving and reading the netCDF files.

Regards,
Sean

kierandonnellan
Posts: 3
Joined: Tue Apr 06, 2021 4:50 am America/New_York
Answers: 0

Re: Aqua and SST NC dataset questions

by kierandonnellan » Wed Apr 07, 2021 2:31 pm America/New_York

Thanks guys! @Sean - is it normal for people to perform the kind of scientific analysis I mentioned on the L2 or L3 data? So far I have only been working with L3. If you recommend L2, where can I find this, and are there any code snippets or tutorials somewhere that explain how to extract useful data using Python?

OB.DAAC - SeanBailey
User Services
Posts: 1464
Joined: Wed Sep 18, 2019 6:15 pm America/New_York
Answers: 1
Been thanked: 4 times

Re: Aqua and SST NC dataset questions

by OB.DAAC - SeanBailey » Tue Apr 13, 2021 3:00 pm America/New_York

It all depends on your needs. L2 is (typically) higher resolution, but not mapped and not as "quality" masked (questionable outputs may have a flag set, but are not masked). The "not mapped" part makes L2 data harder to use: not prohibitively so, but definitely not as easy as the gridded data for the L3 maps. The L2 files are in netCDF4 format, and Python (with the netCDF module) can read them, but I don't know of any pre-canned Python tools for it.

Sean
