Parallel OCSSW l2gen code with OpenMP


by fengqiao » Sat Feb 19, 2022 2:43 am America/New_York

Hello,
I want to parallelize the OCSSW l2gen code with OpenMP. I built the third-party libraries and the OCSSW source code following the instructions at https://seadas.gsfc.nasa.gov/build_ocss ... g-the-code.

I added an OpenMP directive before the first for loop in the $OCSSWROOT/ocssw_src/src/l2gen/main_l2gen.c file.

```c
#pragma omp parallel for firstprivate(l1rec, l2rec) shared(l1file, ofile) num_threads(2)
for (iscan = sscan; iscan <= escan; iscan += dscan) {}
```
But I get the error: Segmentation fault (core dumped).

Debugging in VSCode reports an HDF error.

```
Begin MSl12 processing at 2022050153251000

Allocated 5628269 bytes in L1 record.
Allocated 5628269 bytes in L1 record.
Allocated 5628269 bytes in L1 record.
Loading land mask file from /home/ocssw/ocssw/share/common/landmask_GMT15ARC.nc
Loading bathymetry mask file from /home/ocssw/ocssw/share/common/watermask.dat
Loading DEM info from /home/ocssw/ocssw/share/common/ETOPO1_ocssw.nc
Loading ice mask file from /home/ocssw/ocssw/share/common/ice_climatology.hdf
Loaded monthly NSIDC ice climatology HDF file.
Loading elevation file from /home/ocssw/ocssw/share/common/ETOPO1_ocssw.nc
-E- /home/ocssw/ocssw/ocssw_src/oel_util/libnetcdfutils/nc_gridutils.c:463: NetCDF: HDF error
```

I checked the HDF5 build script, $OCSSWROOT/opt/src/hdf5/BuildIt.py, and found that `OCSSW_MPI=1` must be set to enable parallel HDF reads.
So I set `OCSSW_MPI=1` in my `~/.bashrc` file.

But the same error (segmentation fault) occurred again.

Also, serial l2gen became slower. It used to take 6 seconds to process one line of data; now it takes 212 seconds.

Now I have some questions:

What is the function of `OCSSW_MPI=1`?
Why does l2gen become slower when `OCSSW_MPI=1`?
Is it possible to parallelize l2gen using OpenMP?

I hope to get an answer.

Thanks.

feng qiao
Attachment: 屏幕截图 2022-02-18 215107.png (199.26 KiB)


by gnwiii » Tue Feb 22, 2022 6:10 pm America/New_York

fengqiao wrote: Sat Feb 19, 2022 2:43 am America/New_York
> Hello,
> I want parallel ocssw l2gen code with OpenMP.

Many OCSSW users want to do the same l2gen processing on many files. For that use case, running multiple l2gen processes (e.g., using GNU parallel) is simple and efficient. I have done exactly that many times. Note that mass storage I/O is often a bottleneck, so you may need to experiment to determine the number of parallel jobs that provides the best throughput. https://github.com/bcdev/calvalus2 came out of a major study for ESA about a decade ago. Many of the ideas are still relevant today.

OB SeaDAS - dshea
Subject Matter Expert

by OB SeaDAS - dshea » Thu Feb 24, 2022 2:39 pm America/New_York

`OCSSW_MPI=1` is only used to enable a few of the libraries in ocssw/opt/src to be compiled with Open MPI support. We have some radiative transfer code that uses MPI on a parallel processing computer we have. After the `OCSSW_MPI` environment variable is set to 1, make sure you rebuild ocssw/opt/src.

Every time I think about parallelizing l2gen, I think of all the memory conflicts that have to be sorted out. In the run you posted here you will note that l2gen has a cache of 3 L1 records to do filtering. That will make it difficult to process 2 different lines in the same memory space on different threads.
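
One concrete form of this memory-conflict problem: OpenMP's `firstprivate` clause copies a variable by value, so if a record is reached through a pointer, each thread gets its own copy of the pointer but still shares the buffer behind it. The sketch below is a minimal illustration, not l2gen code. `rec_t` and `process_lines` are hypothetical names, and it assumes a record whose pixel data is held behind a pointer. It gives every thread its own buffer by allocating inside the parallel region.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record type: the pixel data lives behind a pointer. */
typedef struct {
    int npix;
    float *data;
} rec_t;

/* Scale each scan line into a per-thread scratch record and return the
 * sum of all scaled pixels. scene holds nlines * npix values. */
double process_lines(const float *scene, int nlines, int npix)
{
    double total = 0.0;

    #pragma omp parallel reduction(+:total)
    {
        /* Allocate inside the parallel region so each thread owns its
         * buffer. firstprivate on a record allocated outside would copy
         * only the pointer, leaving all threads writing to one buffer. */
        rec_t rec;
        rec.npix = npix;
        rec.data = malloc((size_t)npix * sizeof *rec.data);
        assert(rec.data != NULL);

        #pragma omp for
        for (int iscan = 0; iscan < nlines; iscan++) {
            for (int ip = 0; ip < rec.npix; ip++)
                rec.data[ip] = 2.0f * scene[(size_t)iscan * npix + ip];
            for (int ip = 0; ip < rec.npix; ip++)
                total += rec.data[ip];
        }

        free(rec.data);
    }
    return total;
}
```

Compiled with `-fopenmp` (GCC or Clang), the loop is divided among threads; without that flag the pragmas are ignored and the function runs serially with the same result. The sketch does not address the 3-record filtering cache, where each thread would still need the neighbouring lines loaded into its own memory.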

I am not surprised that enabling parallel processing in HDF5 slows things down. There is some extra bookkeeping needed for multiple threads, but it is probably l2gen using a cached pointer that is the real problem. l2gen also reads a lot of HDF4 files. I am not sure if that API works inside of MPI. When there is a memory overrun in l2gen, the program more often than not crashes in HDF5. Somehow, HDF5 is really sensitive to anything writing on its memory.

As George mentioned we have many l2gen processes running on a single computer processing different input files, so making l2gen parallel would probably slow down the throughput of the system as a whole.

I have not spent much time thinking about this, but let us know what you find out.

don


by fengqiao » Fri Mar 04, 2022 2:30 am America/New_York

Thanks, gnwiii. Actually, what I want to do is build parallelism into the l2gen code itself, so that other users do not need any knowledge of parallel programming and can directly use a parallel l2gen program to process data.
As dshea said, to achieve parallelism, all memory conflicts must be resolved.
There are two ways to parallelize the l2gen code: 1) pixel-level parallelism (each pixel is processed separately); 2) intra-pixel parallelism.
1) Pixel-level parallelism: when different threads process different image rows, each thread first reads the auxiliary data into its own private memory and locks the HDF file while doing so. Other threads then cannot access the file, and an error occurs.
2) Intra-pixel parallelism: the default atmospheric correction algorithm in SeaDAS is a while loop with data dependencies between iterations, so it is difficult to parallelize.
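
The row-by-row approach in 1) can be sketched as follows, assuming the file reads are not thread-safe and must be serialized. This is an illustration only: `read_line` and `process_scene` are hypothetical names, and the "file" is just an array in memory standing in for the HDF/netCDF reads. Each thread owns a private line buffer, a named `critical` section lets one thread read at a time, and the per-line computation runs concurrently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a non-thread-safe file read: copies one scan
 * line out of an in-memory "file" into buf. */
static void read_line(const float *file, int iscan, int npix, float *buf)
{
    memcpy(buf, file + (size_t)iscan * npix, (size_t)npix * sizeof *buf);
}

/* Process every scan line; out[iscan] receives the mean of that line. */
void process_scene(const float *file, int nlines, int npix, float *out)
{
    #pragma omp parallel
    {
        /* Thread-private line buffer, allocated once per thread. */
        float *line = malloc((size_t)npix * sizeof *line);
        assert(line != NULL);

        #pragma omp for schedule(dynamic)
        for (int iscan = 0; iscan < nlines; iscan++) {
            /* Only one thread at a time may call into the I/O layer. */
            #pragma omp critical(file_io)
            read_line(file, iscan, npix, line);

            /* Work on private data proceeds concurrently. */
            double sum = 0.0;
            for (int ip = 0; ip < npix; ip++)
                sum += line[ip];
            out[iscan] = (float)(sum / npix);
        }

        free(line);
    }
}
```

If reading dominates the run time, the critical section serializes most of the work and the speed-up will be small. The sketch also ignores any dependence between neighbouring lines, such as the 3-record cache mentioned above.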
