MPI run errors

Bug reports, work arounds and fixes

Moderators: arango, robertson

jande023
Posts: 29
Joined: Tue Oct 16, 2012 8:55 pm
Location: Old Dominion University

MPI run errors

#1 Unread post by jande023 »

I am unable to run ROMS in MPI mode. In fact, I haven't been able to compile oceanM. I can run in serial and OpenMP. I would appreciate any guidance.

My makefile and job script (job_jra.sh) are attached. When I submit it (qsub job_jra.sh), the job fails to submit and returns several files: ICS-MPI.o12911, ICS-MPI.e12911, ICS-MPI.pe12911, ICS-MPI.po12911, ICS-MPI.po12911.impi.mpd.hosts, and ICS-MPI.po12911.log.mynmpdhosts. These files appear to be unrevealing.

When I submit the job interactively (qrsh), the job just hangs in the queue indefinitely.

The job scheduler says that I am trying to use the parallel environment "nope", which does not have any slots for the queue, and that I should be using one of the MPI parallel environments. Here is the message from the job scheduler:

cannot run in PE "nope" because it only offers 0 slots
This is put into place so that you cannot run a job through the job scheduler without a PE (parallel environment).

The script I used has been used to run other non-ROMS code in the parallel environment.
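
For readers who hit the same "nope" message, here is a minimal sketch of a Grid Engine job script that names an MPI parallel environment explicitly. The PE name ics-mpi-44 and the slot count come from the qrsh command shown later in this thread, and the job name is inferred from the output file names above. The input file name ocean.in and the launch_line helper are hypothetical, so adjust them for your own setup.

```shell
#!/bin/bash
#$ -N ICS-MPI            # job name, inferred from the ICS-MPI.o12911 file names
#$ -cwd                  # run in the directory the job was submitted from
#$ -pe ics-mpi-44 16     # request 16 slots in an MPI parallel environment

# Grid Engine exports NSLOTS with the number of slots the PE granted.
# Defaulting to 1 lets the script be dry-run outside the scheduler.
launch_line() {
    # $1: ROMS input file (hypothetical default name)
    printf 'mpiexec -n %s ./oceanM %s\n' "${NSLOTS:-1}" "${1:-ocean.in}"
}

# Print the launch command for the job log. Once oceanM has been built,
# run it instead with: eval "$(launch_line ocean.in)"
launch_line ocean.in
```

Printing the command first is a deliberate choice: it confirms that the scheduler granted the expected number of slots before any MPI processes are started.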

Thanks
Attachments
makefile.txt
text version of makefile
(94.02 KiB) Downloaded 366 times
job_jra.sh
job script
(998 Bytes) Downloaded 390 times

jwn4548
Posts: 8
Joined: Mon Jun 10, 2013 7:23 pm
Location: Rochester Institute of Technology

Re: MPI run errors

#2 Unread post by jwn4548 »

You said that you haven't been able to compile oceanM yet, so what are you running with the mpiexec command in your job file? Notice that you called ./oceanM, which you say is not built.

What error message are you getting when trying to build oceanM?

jande023
Posts: 29
Joined: Tue Oct 16, 2012 8:55 pm
Location: Old Dominion University

Re: MPI run errors

#3 Unread post by jande023 »

My script was intended to compile oceanM and then run the same. When I try to build interactively, the following happens:

[jande023@nikola trunk]$ qrsh -pe ics-mpi-44 16
Last login: Tue Aug 13 16:20:49 2013 from nikola.hpc.local
[jande023@nikola-28-13 ~]$ module load intel/ics/2012.0/impi
[jande023@nikola-28-13 ~]$ cd ROMS/trunk
[jande023@nikola-28-13 trunk]$ make
makefile:233: INCLUDING FILE Build/make_macros.mk WHICH CONTAINS APPLICATION-DEPENDENT MAKE DEFINITIONS
./ROMS/Bin/sfmakedepend --cpp --fext=f90 --file=- --objdir=Build -DROMS_HEADER="york_estuary.h" -I ROMS/Include -I ROMS/Nonlinear -I ROMS/Nonlinear/Biology -I ROMS/Nonlinear/Sediment -I ROMS/Utility -I ROMS/Drivers -I ROMS/Functionals -I Apps/YorkEstuary2 -I Master -I Compilers --silent --moddir Build ROMS/Nonlinear/bbl.F ROMS/Nonlinear/bc_2d.F ROMS/Nonlinear/bc_3d.F ROMS/Nonlinear/bc_bry2d.F ROMS/Nonlinear/bc_bry3d.F ROMS/Nonlinear/bulk_flux.F ROMS/Nonlinear/bvf_mix.F ROMS/Nonlinear/conv_2d.F ROMS/Nonlinear/conv_3d.F ROMS/Nonlinear/conv_bry2d.F ROMS/Nonlinear/conv_bry3d.F ROMS/Nonlinear/diag.F ROMS/Nonlinear/exchange_2d.F ROMS/Nonlinear/exchange_3d.F ROMS/Nonlinear/forcing.F ROMS/Nonlinear/frc_adjust.F ROMS/Nonlinear/get_data.F ROMS/Nonlinear/get_idata.F ROMS/Nonlinear/gls_corstep.F ROMS/Nonlinear/gls_prestep.F ROMS/Nonlinear/hmixing.F ROMS/Nonlinear/ini_fields.F ROMS/Nonlinear/initial.F ROMS/Nonlinear/interp_floats.F ROMS/Nonlinear/lmd_bkpp.F ROMS/Nonlinear/lmd_skpp.F ROMS/Nonlinear/lmd_swfrac.F ROMS/Nonlinear/lmd_vmix.F ROMS/Nonlinear/main2d.F ROMS/Nonlinear/main3d.F ROMS/Nonlinear/mpdata_adiff.F ROMS/Nonlinear/my25_corstep.F ROMS/Nonlinear/my25_prestep.F ROMS/Nonlinear/obc_adjust.F ROMS/Nonlinear/obc_volcons.F ROMS/Nonlinear/omega.F ROMS/Nonlinear/output.F ROMS/Nonlinear/pre_step3d.F ROMS/Nonlinear/prsgrd.F ROMS/Nonlinear/radiation_stress.F ROMS/Nonlinear/rho_eos.F ROMS/Nonlinear/rhs3d.F ROMS/Nonlinear/set_avg.F ROMS/Nonlinear/set_data.F ROMS/Nonlinear/set_depth.F ROMS/Nonlinear/set_massflux.F ROMS/Nonlinear/set_tides.F ROMS/Nonlinear/set_vbc.F ROMS/Nonlinear/set_zeta.F ROMS/Nonlinear/step2d.F ROMS/Nonlinear/step3d_t.F ROMS/Nonlinear/step3d_uv.F ROMS/Nonlinear/step_floats.F ROMS/Nonlinear/t3dbc_im.F ROMS/Nonlinear/t3dmix.F ROMS/Nonlinear/tkebc_im.F ROMS/Nonlinear/u2dbc_im.F ROMS/Nonlinear/u3dbc_im.F ROMS/Nonlinear/uv3dmix.F ROMS/Nonlinear/v2dbc_im.F ROMS/Nonlinear/v3dbc_im.F ROMS/Nonlinear/vwalk_floats.F ROMS/Nonlinear/wetdry.F ROMS/Nonlinear/wvelocity.F 
ROMS/Nonlinear/zetabc.F ROMS/Nonlinear/Biology/biology.F ROMS/Nonlinear/Biology/biology_floats.F ROMS/Nonlinear/Sediment/sed_bed.F ROMS/Nonlinear/Sediment/sed_bedload.F ROMS/Nonlinear/Sediment/sed_fluxes.F ROMS/Nonlinear/Sediment/sediment.F ROMS/Nonlinear/Sediment/sed_settling.F ROMS/Nonlinear/Sediment/sed_surface.F ROMS/Functionals/analytical.F ROMS/Utility/abort.F ROMS/Utility/array_modes.F ROMS/Utility/back_cost.F ROMS/Utility/cgradient.F ROMS/Utility/checkadj.F ROMS/Utility/checkdefs.F ROMS/Utility/checkerror.F ROMS/Utility/checkvars.F ROMS/Utility/close_io.F ROMS/Utility/congrad.F ROMS/Utility/convolve.F ROMS/Utility/cost_grad.F ROMS/Utility/def_avg.F ROMS/Utility/def_diags.F ROMS/Utility/def_dim.F ROMS/Utility/def_error.F ROMS/Utility/def_floats.F ROMS/Utility/def_gst.F ROMS/Utility/def_hessian.F ROMS/Utility/def_his.F ROMS/Utility/def_impulse.F ROMS/Utility/def_info.F ROMS/Utility/def_ini.F ROMS/Utility/def_lanczos.F ROMS/Utility/def_mod.F ROMS/Utility/def_norm.F ROMS/Utility/def_rst.F ROMS/Utility/def_station.F ROMS/Utility/def_tides.F ROMS/Utility/def_var.F ROMS/Utility/distribute.F ROMS/Utility/dotproduct.F ROMS/Utility/erf.F ROMS/Utility/extract_obs.F ROMS/Utility/extract_sta.F ROMS/Utility/frc_weak.F ROMS/Utility/gasdev.F ROMS/Utility/get_2dfld.F ROMS/Utility/get_2dfldr.F ROMS/Utility/get_3dfld.F ROMS/Utility/get_3dfldr.F ROMS/Utility/get_bounds.F ROMS/Utility/get_cycle.F ROMS/Utility/get_date.F ROMS/Utility/get_grid.F ROMS/Utility/get_gst.F ROMS/Utility/get_ngfld.F ROMS/Utility/get_ngfldr.F ROMS/Utility/get_state.F ROMS/Utility/get_varcoords.F ROMS/Utility/grid_coords.F ROMS/Utility/ini_adjust.F ROMS/Utility/ini_hmixcoef.F ROMS/Utility/ini_lanczos.F ROMS/Utility/inp_par.F ROMS/Utility/inquire.F ROMS/Utility/interpolate.F ROMS/Utility/lbc.F ROMS/Utility/lubksb.F ROMS/Utility/ludcmp.F ROMS/Utility/metrics.F ROMS/Utility/mp_exchange.F ROMS/Utility/mp_routines.F ROMS/Utility/nf_fread2d_bry.F ROMS/Utility/nf_fread2d.F ROMS/Utility/nf_fread3d_bry.F 
ROMS/Utility/nf_fread3d.F ROMS/Utility/nf_fread4d.F ROMS/Utility/nf_fwrite2d_bry.F ROMS/Utility/nf_fwrite2d.F ROMS/Utility/nf_fwrite3d_bry.F ROMS/Utility/nf_fwrite3d.F ROMS/Utility/nf_fwrite4d.F ROMS/Utility/normalization.F ROMS/Utility/nrutil.F ROMS/Utility/obs_cost.F ROMS/Utility/obs_depth.F ROMS/Utility/obs_initial.F ROMS/Utility/obs_read.F ROMS/Utility/obs_write.F ROMS/Utility/packing.F ROMS/Utility/posterior.F ROMS/Utility/posterior_var.F ROMS/Utility/ran1.F ROMS/Utility/random_ic.F ROMS/Utility/ran_state.F ROMS/Utility/read_asspar.F ROMS/Utility/read_biopar.F ROMS/Utility/read_couplepar.F ROMS/Utility/read_fltbiopar.F ROMS/Utility/read_fltpar.F ROMS/Utility/read_phypar.F ROMS/Utility/read_sedpar.F ROMS/Utility/read_stapar.F ROMS/Utility/regrid.F ROMS/Utility/rep_matrix.F ROMS/Utility/set_2dfld.F ROMS/Utility/set_2dfldr.F ROMS/Utility/set_3dfld.F ROMS/Utility/set_3dfldr.F ROMS/Utility/set_diags.F ROMS/Utility/set_masks.F ROMS/Utility/set_ngfld.F ROMS/Utility/set_ngfldr.F ROMS/Utility/set_scoord.F ROMS/Utility/set_weights.F ROMS/Utility/shapiro.F ROMS/Utility/sqlq.F ROMS/Utility/state_addition.F ROMS/Utility/state_copy.F ROMS/Utility/state_dotprod.F ROMS/Utility/state_initialize.F ROMS/Utility/state_product.F ROMS/Utility/state_scale.F ROMS/Utility/stats_modobs.F ROMS/Utility/stiffness.F ROMS/Utility/strings.F ROMS/Utility/sum_grad.F ROMS/Utility/timers.F ROMS/Utility/uv_rotate.F ROMS/Utility/vorticity.F ROMS/Utility/white_noise.F ROMS/Utility/wpoints.F ROMS/Utility/wrt_avg.F ROMS/Utility/wrt_diags.F ROMS/Utility/wrt_error.F ROMS/Utility/wrt_floats.F ROMS/Utility/wrt_gst.F ROMS/Utility/wrt_hessian.F ROMS/Utility/wrt_his.F ROMS/Utility/wrt_impulse.F ROMS/Utility/wrt_info.F ROMS/Utility/wrt_ini.F ROMS/Utility/wrt_rst.F ROMS/Utility/wrt_station.F ROMS/Utility/wrt_tides.F ROMS/Utility/zeta_balance.F ROMS/Modules/mod_arrays.F ROMS/Modules/mod_average.F ROMS/Modules/mod_bbl.F ROMS/Modules/mod_behavior.F ROMS/Modules/mod_biology.F ROMS/Modules/mod_boundary.F 
ROMS/Modules/mod_clima.F ROMS/Modules/mod_coupler.F ROMS/Modules/mod_coupling.F ROMS/Modules/mod_diags.F ROMS/Modules/mod_eclight.F ROMS/Modules/mod_eoscoef.F ROMS/Modules/mod_floats.F ROMS/Modules/mod_forces.F ROMS/Modules/mod_fourdvar.F ROMS/Modules/mod_grid.F ROMS/Modules/mod_iounits.F ROMS/Modules/mod_kinds.F ROMS/Modules/mod_mixing.F ROMS/Modules/mod_ncparam.F ROMS/Modules/mod_nesting.F ROMS/Modules/mod_netcdf.F ROMS/Modules/mod_ocean.F ROMS/Modules/mod_parallel.F ROMS/Modules/mod_param.F ROMS/Modules/mod_scalars.F ROMS/Modules/mod_sedbed.F ROMS/Modules/mod_sediment.F ROMS/Modules/mod_sources.F ROMS/Modules/mod_stepping.F ROMS/Modules/mod_storage.F ROMS/Modules/mod_strings.F ROMS/Modules/mod_tides.F Master/esmf_roms.F Master/master.F Master/ocean_control.F Master/ocean_coupler.F Master/propagator.F Master/roms_export.F Master/roms_import.F > Build/MakeDepend
cp -p /user/home/jande023/make_macros.mk Build
makefile:233: INCLUDING FILE Build/make_macros.mk WHICH CONTAINS APPLICATION-DEPENDENT MAKE DEFINITIONS
/user/opt/xeon/intel/ics/2012.0/impi/4.0.3.008/intel64/bin/mpiifort -heap-arrays -fp-model precise -ip -O2 -Vaxlib Build/esmf_roms.o Build/master.o Build/ocean_control.o Build/ocean_coupler.o Build/propagator.o Build/roms_export.o Build/roms_import.o -o oceanM Build/libNLM.a Build/libNLM_bio.a Build/libNLM_sed.a Build/libANA.a Build/libUTIL.a Build/libMODS.a -L/user/home/jande023/netcdf363_ifort/lib -lnetcdf -lfmpi-pgi -lmpi-pgi

Error: A license for FComp is not available (-5,412).

Make sure that a license file is being used that contains a license
for the requested feature. If your license requires a license server,
make sure that the server is using the right license file (usually,
this would be the same license file that is being used by this
application), and make sure that you have not changed the license
file since starting the server.

License file(s) used were (in this order):
1. Trusted Storage
** 2. /user/opt/xeon/intel/ics/2012.0/composer_xe_2011_sp1.6.233/licenses/intel.lic
** 3. /user/opt/xeon/intel/ics/2012.0.032/composer_xe_2011_sp1.6.233/Licenses
** 4. /user/home/jande023/intel/licenses
** 5. /opt/intel/licenses
** 6. /Users/Shared/Library/Application Support/Intel/Licenses
** 7. /user/opt/xeon/intel/ics/2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/*.lic

Please visit http://software.intel.com/sites/support/ if you require technical assistance.

ifort: error #10052: could not checkout FLEXlm license
make: *** [oceanM] Error 1
[jande023@nikola-28-13 trunk]$

I have attached the makefile I used in the above build attempt.

thanks
Attachments
makefile.txt
Makefile used in the interactive build
(19.78 KiB) Downloaded 343 times

arango
Site Admin
Posts: 1368
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

Re: MPI run errors

#4 Unread post by arango »

Read the log file. The error is very clear! You are having a problem with the ifort compiler license. This is not a ROMS-related error!

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: MPI run errors

#5 Unread post by kate »

Perhaps the question is what compiler did you use for the serial and OpenMP runs? Why isn't it trying to use that now for the MPI?

jande023
Posts: 29
Joined: Tue Oct 16, 2012 8:55 pm
Location: Old Dominion University

Re: MPI run errors

#6 Unread post by jande023 »

Thank you Kate, clearly worth investigating. While I am certain I loaded the same compiler module, I may very well have altered my makefile between the runs.

arango
Site Admin
Posts: 1368
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

Re: MPI run errors

#7 Unread post by arango »

Use the build script! There is no reason to edit the distributed makefile. It is quite complex, and it is easy to make a mistake.

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: MPI run errors

#8 Unread post by kate »

With MPI, you might have chosen USE_MPIF90. What do you get with "which mpif90"? I can get:

/usr/local/pkg/openmpi/openmpi-1.4.3.pgi-13.4/bin/mpif90

or

/usr/local/pkg/openmpi/openmpi-1.4.3.gnu-4.7.3/bin/mpif90

depending on my modules.
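
Beyond "which", most MPI wrappers can report the underlying compiler they invoke. The sketch below assumes the wrapper accepts -show (MPICH-derived wrappers, including Intel MPI) or --showme (Open MPI). The describe_wrapper helper is hypothetical and only illustrates the check.

```shell
# Report where an MPI compiler wrapper lives and which command it wraps.
describe_wrapper() {
    wrapper="$1"
    wrapper_path=$(command -v "$wrapper" 2>/dev/null) || {
        echo "$wrapper: not found on PATH"
        return 1
    }
    echo "$wrapper -> $wrapper_path"
    # Try the MPICH-style flag first, then the Open MPI one.
    "$wrapper" -show 2>/dev/null \
        || "$wrapper" --showme 2>/dev/null \
        || echo "($wrapper did not report its underlying command)"
}

# Check the two wrapper names discussed in this thread.
describe_wrapper mpif90 || true
describe_wrapper mpiifort || true
```

Running this after each "module load" shows whether the loaded module actually changed which compiler sits behind mpif90.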

jande023
Posts: 29
Joined: Tue Oct 16, 2012 8:55 pm
Location: Old Dominion University

Re: MPI run errors

#9 Unread post by jande023 »

Thank you both.

I have kept my changes to the distributed makefile to a minimum. Specifically, I have changed the paths of my source code (HEADER, ANALYTICAL, ROOT_DIR, etc.). Again, these all check out, based on successful serial and OpenMP runs. Beyond that, I have only changed:

USE_MPI ?= on (blank and commented out)
USE_MPIF90 ?= on (blank and commented out)
and
FORT ?= mpif90, ifort, mpiifort
and all permutations of the above.

With MPI on, I get the license error with ifort and mpiifort,
and with mpif90 I get:

[jande023@nikola trunk]$ make
makefile:237: INCLUDING FILE /user/home/jande023/make_macros.mk WHICH CONTAINS APPLICATION-DEPENDENT MAKE DEFINITIONS
makefile:368: /user/home/jande023/ROMS/trunk/Compilers/Linux-mpif90.mk: No such file or directory
cp -f /netcdf.mod Build
cp: cannot stat `/netcdf.mod': No such file or directory
make: *** No rule to make target `/user/home/jande023/ROMS/trunk/Compilers/Linux-mpif90.mk'. Stop.

Obviously, there is no Linux-mpif90.mk in my Compilers directory. Is my compiler source code incomplete?

Note: when I load module intel/ics/2010.0/impi (the compiler I have used for successful serial and OpenMP runs) and type "which mpif90", it returns:

user/opt/xeon/intel/ics/2012.0/impi/4.0.3.008/intel64/bin/mpif90

Appreciate your support and patience.

kate
Posts: 4091
Joined: Wed Jul 02, 2003 5:29 pm
Location: CFOS/UAF, USA

Re: MPI run errors

#10 Unread post by kate »

As you can see, many people use the mpif90 wrapper around whatever compiler they have. That's why the setup is to call the compiler by its unique name (like ifort), then invoke the USE_MPIF90 flag so that the Linux-ifort.mk file knows to use mpif90 as the name of the compiler. If it should instead be called mpiifort, then turn off USE_MPIF90 and make sure Linux-ifort.mk does the right thing.

Now, if your ifort license is valid for calling "ifort" but not for calling "mpiifort", well, I can't answer that.
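
As a rough illustration of the settings described above, the following sketch exports the three variables named in this thread and prints the compiler rules file the makefile would then look for. The Compilers/<OS>-<FORT>.mk naming is an assumption inferred from the "Linux-mpif90.mk: No such file or directory" message earlier in the thread, and the rules_file helper is hypothetical.

```shell
# Name the real compiler in FORT and let the rules file swap in the wrapper.
export USE_MPI=on
export USE_MPIF90=on
export FORT=ifort

# Assumption: the makefile includes Compilers/<OS>-<FORT>.mk, so FORT must
# match an existing rules file (FORT=mpif90 would look for Linux-mpif90.mk).
rules_file() {
    os="${1:-$(uname -s)}"
    echo "Compilers/${os}-${FORT}.mk"
}

# Show which rules file this configuration expects on the current system.
rules_file

# Then, from the ROMS trunk directory:
# make
```

Setting these in the environment (or in the build script) rather than editing the makefile keeps the distributed makefile untouched, which is what arango recommends above.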

jande023
Posts: 29
Joined: Tue Oct 16, 2012 8:55 pm
Location: Old Dominion University

Re: MPI run errors

#11 Unread post by jande023 »

Success! It turns out that I was trying to compile my MPI executable from an "inappropriate" node from within the cluster. I was subsequently able to compile, and then submit my job to the cluster with a successful run. Thank you again. John
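
For anyone who suspects the same cause, one way to check a node before starting a full ROMS build is to compile a trivial Fortran program there. This is only a sketch: the can_compile helper is hypothetical, and it assumes that a failed license checkout makes the compiler exit with a non-zero status, as in the ifort error shown earlier.

```shell
# Return 0 if the given Fortran compiler can build a trivial program here.
can_compile() {
    fc="${1:-ifort}"
    tmpdir=$(mktemp -d) || return 1
    printf 'program smoke\nend program smoke\n' > "$tmpdir/smoke.f90"
    status=0
    ( cd "$tmpdir" && "$fc" -o smoke smoke.f90 ) >/dev/null 2>&1 || status=$?
    rm -rf "$tmpdir"
    return "$status"
}

# Report the result together with the node name.
if can_compile "${FORT:-ifort}"; then
    echo "$(uname -n): compiler works on this node"
else
    echo "$(uname -n): compile failed here (compiler or license problem?)"
fi
```

Running it on the login node and inside a qrsh session shows quickly which nodes are able to compile.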
