Skip to content

Add nonlinear solver based on domain decomposition method#3764

Merged
bska merged 4 commits into
OPM:masterfrom
atgeirr:aspin
Jun 28, 2023
Merged

Add nonlinear solver based on domain decomposition method#3764
bska merged 4 commits into
OPM:masterfrom
atgeirr:aspin

Conversation

@atgeirr

@atgeirr atgeirr commented Jan 10, 2022

Copy link
Copy Markdown
Member

Making draft PR to provide a place to discuss aspects of the code. It still contains a lot of hacks, and when the time comes to get the functionality into the master branch it will be refactored to reduce the size: many methods have been added to support local solves that largely duplicate existing methods. I also think that there will be a series of smaller PRs instead of one large PR. There should be no change in performance or results for the existing fully implicit Newton(-like) method from the addition of new methods by themselves, but there could be some small improvements here and there. Any such will also be done in separate PRs.

@atgeirr

atgeirr commented Jan 10, 2022

Copy link
Copy Markdown
Member Author

Note: needs OPM/opm-grid#537 and OPM/opm-models#682 to compile and run correctly.

Comment thread opm/simulators/linalg/extractMatrix.hpp Outdated
bool matrixEqual(const Matrix& m1, const Matrix& m2)
{
// Compare size and nonzeroes.
if (m1.N() != m1.N()) return false;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow - that was hard to read 👓

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to mention the fact that unless m1.N() is NaN, this condition will never be true. I suspect one of the m1s should be an m2.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the commit message says: untested. With a test you would have discovered the m1 != m1 issue which @bska pointed out (I did not see it). This is new code - not deeply entranched in the existing patterns - no reason to not make a test.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And furthermore: Why not make a separate PR of these matrix utilities right away?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch! I might have made a test, and a PR, but this part is code thrown together in a debugging process, and not really intended for the end product. Still embarrassing to see...

@OPMUSER

OPMUSER commented Nov 2, 2022

Copy link
Copy Markdown
Contributor

Can this be closed?

@bska

bska commented Nov 2, 2022

Copy link
Copy Markdown
Member

Can this be closed?

I don't think so. We should rather pick this up after the 2022.10 release, because the feature is well worth having.

@atgeirr atgeirr changed the title [WIP] Nonlinear domain decomposition methods Add nonlinear solver based on domain decomposition method Jun 16, 2023
@atgeirr atgeirr marked this pull request as ready for review June 16, 2023 14:26
@atgeirr

atgeirr commented Jun 16, 2023

Copy link
Copy Markdown
Member Author

Marking this as ready for review.

@bska

bska commented Jun 16, 2023

Copy link
Copy Markdown
Member

jenkins build this please

@bska bska left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a number of new warnings, mostly concerning unused parameters or unused variables, in this version of the patch. There are also a few regression failures which must be analysed.

While I think the overall patch is desirable, I think we need to do a bit more work before we enable this in the main simulator.

@atgeirr

atgeirr commented Jun 20, 2023

Copy link
Copy Markdown
Member Author

jenkins build this please

@atgeirr

atgeirr commented Jun 21, 2023

Copy link
Copy Markdown
Member Author

jenkins build this please

@bska

bska commented Jun 21, 2023

Copy link
Copy Markdown
Member

Just a bit of a "heads up" notice: In local testing it seems that this PR breaks LinearSolver=cprw. We can still get most of the effect of CPRW by supplying an appropriate .json parameter description for the linear solver parameter, but the neat/convenient shorthand no longer works.

As an example, if I run NORNE_ATW2013 from OPM-Tests with the parameters below I get an unfriendly diagnostic message (OpmLog::error()) and a simulator crash:

Report step  0/247 at day 0/3312, date = 06-Nov-1997
Restart file written for report step   0/247, date = 06-Nov-1997 00:00:00
Using Newton nonlinear solver.

Starting time step 0, stepsize 1 days, at day 0/8, date = 06-Nov-1997

Error: [/home/bska/work/opm/src/opm/opm-simulators/opm/simulators/linalg/PreconditionerFactory_impl.hpp:485] Preconditioner type cprw is not registered in the factory. Available types are: GS ILU0 ILUn Jac ParOverILU0 SOR SSOR amg cpr cprt famg kamg 

Program threw an exception: [/home/bska/work/opm/src/opm/opm-simulators/opm/simulators/linalg/PreconditionerFactory_impl.hpp:485] Preconditioner type cprw is not registered in the factory. Available types are: GS ILU0 ILUn Jac ParOverILU0 SOR SSOR amg cpr cprt famg kamg 

[bska-ws:1138127] *** Process received signal ***
[bska-ws:1138127] Signal: Segmentation fault (11)
[bska-ws:1138127] Signal code: Address not mapped (1)
[bska-ws:1138127] Failing at address: 0x48
[bska-ws:1138127] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7fbada6fb420]
[bska-ws:1138127] [ 1] /home/bska/work/opm/build/opm.dev/normal/opt/mpi/dune-2.7.1/inst/bin/flow(+0x1e0aa76)[0x563abcdf7a76]
[bska-ws:1138127] [ 2] /home/bska/work/opm/build/opm.dev/normal/opt/mpi/dune-2.7.1/inst/bin/flow(+0x1e0ae1b)[0x563abcdf7e1b]
[bska-ws:1138127] [ 3] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xd6de4)[0x7fbada230de4]
[bska-ws:1138127] [ 4] /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609)[0x7fbada6ef609]
[bska-ws:1138127] [ 5] /lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7fbad9f1d133]
[bska-ws:1138127] *** End of error message ***
./run.sh: line 19: 1138127 Segmentation fault      (core dumped) OMP_NUM_THREADS=1 "${flow}" --ecl-deck-file-name="${cse}" --output-dir="${out}" --output-extra-convergence-info=iteration,step --relaxed-max-pv-fraction=0.0 --ecl-enable-drift-compensation=false --min-strict-cnv-iter=10 --threads-per-process=1 ${prm} "${@}"

If we want to enable this then we should probably ensure that we don't break useful options in this way.


OutputExtraConvergenceInfo="iteration,step" # default: "none"
RelaxedMaxPvFraction="0.0" # default: "0.03"
EclEnableDriftCompensation="false" # default: "1"
MinStrictCnvIter="10" # default: "0"
ThreadsPerProcess="1" # default: "-1"
ParameterFile="cprw.settings.real.param" # default: ""
CprReuseInterval="200" # default: "30"
CprReuseSetup="4" # default: "4"
EclNewtonRelaxedVolumeFraction="1e-10" # default: "0.03"
EnableStorageCache="true" # default: "1"
EnableWellOperabilityCheck="false" # default: "1"
LinearSolver="cprw" # default: "ilu0"
LinearSolverReduction="1.0e-3" # default: "0.01"
MatrixAddWellContributions="true" # default: "0"
MaxNewtonIterationsWithInnerWellIterations="0" # default: "8"
NewtonMinIterations="-1" # default: "1"
SolverMaxTimeStepInDays="30" # default: "365"
TimeStepControlTargetNewtonIterations="8" # default: "8"
ToleranceCnv="1e-2" # default: "0.01"
ToleranceMb="1e-6" # default: "1e-06"
ToleranceWells="1e-4" # default: "0.0001"

@atgeirr

atgeirr commented Jun 23, 2023

Copy link
Copy Markdown
Member Author

jenkins build this please

@atgeirr

atgeirr commented Jun 23, 2023

Copy link
Copy Markdown
Member Author

jenkins build this please

@atgeirr

atgeirr commented Jun 23, 2023

Copy link
Copy Markdown
Member Author

jenkins build this please

@bska

bska commented Jun 23, 2023

Copy link
Copy Markdown
Member

Just a bit of a "heads up" notice: In local testing it seems that this PR breaks LinearSolver=cprw. We can still get most of the effect of CPRW by supplying an appropriate .json parameter description for the linear solver parameter, but the neat/convenient shorthand no longer works.

Correction: It's the combination of LinearSolver=cprw (command line parameter setting --linear-solver=cprw) and MatrixAddWellContributions=true (command line parameter setting --matrix-add-well-contributions=true) which is broken. On the other hand that's broken in master as well so isn't really a change in this PR.

If it does not make sense to have that combination then we need to intercept it earlier so that users don't get the simulator crash.

@atgeirr

atgeirr commented Jun 23, 2023

Copy link
Copy Markdown
Member Author

If it does not make sense to have that combination then we need to intercept it earlier so that users don't get the simulator crash

Indeed it does not make sense, so yes, we need to check for that.

Also do not change getting intensive quantities in convergence
checks.
@atgeirr

atgeirr commented Jun 26, 2023

Copy link
Copy Markdown
Member Author

jenkins build this please

@bska bska left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I think this looks good now for the most part. I have just a couple of small comments concerning disabled statements and then I'm prepared to merge this.

Incidentally, while testing this on my local system, I think I may have stumbled upon a problem in GCC 9.x's -O3 optimizer. Comparing the make test output to the master branch at -O3 produces many test failures, but everything passes at -O2. Since GCC 9.x is nearing its end-of-life I'm not too concerned about this however.

#include <ebos/eclproblem.hh>
#include <opm/models/utils/start.hh>

#include <opm/simulators/timestepping/AdaptiveTimeSteppingEbos.hpp>

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit: We also include this header a few lines down. It doesn't really matter since we have include guards, but maybe we could remove one of the include statements?

Comment on lines +32 to +41
#include <opm/simulators/flow/NonlinearSolverEbos.hpp>
#include <opm/simulators/flow/BlackoilModelParametersEbos.hpp>
#include <opm/simulators/wells/BlackoilWellModel.hpp>
#include <opm/simulators/aquifers/BlackoilAquiferModel.hpp>
#include <opm/simulators/wells/WellConnectionAuxiliaryModule.hpp>
#include <opm/simulators/flow/partitionCells.hpp>
#include <opm/simulators/flow/SubDomain.hpp>
#include <opm/simulators/flow/countGlobalCells.hpp>
#include <opm/simulators/utils/DeferredLoggingErrorHelpers.hpp>
#include <opm/simulators/linalg/extractMatrix.hpp>

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for this ordering? We're mixing simulators/flow, simulators/wells, simulators/aquifers, simulators/utils, and simulators/linalg which at least superficially draws the eye.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was due to a rebasing error. I have removed all duplicates and avoid adding new headers that are not needed, and inserted all new ones in sorted order.

std::vector<int> domain_order(domains_.size());
if (param_.local_solve_approach_ == "gauss-seidel") {
// TODO: enable flexibility and choice in choosing domain ordering approach.
if (true) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this ever be anything other than true? If not, we can probably save a level of nesting.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to keep the extra code with the TODO above as a reminder to look into this, if that is ok.

for (int ii : {0, 1}) {
if (std::isnan(res[ii])) {
report.setReservoirFailed({types[ii], CR::Severity::NotANumber, compIdx});
report.setReservoirFailed({types[ii], CR::Severity::NotANumber, compIdx});//, cell[ii], res[ii]});

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will we ever need cell[ii] and/or res[ii]? If not I think we should remove those disabled quantities.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@atgeirr

atgeirr commented Jun 26, 2023

Copy link
Copy Markdown
Member Author

jenkins build this please

@bska

bska commented Jun 26, 2023

Copy link
Copy Markdown
Member

Incidentally, while testing this on my local system, I think I may have stumbled upon a problem in GCC 9.x's -O3 optimizer. Comparing the make test output to the master branch at -O3 produces many test failures, but everything passes at -O2. Since GCC 9.x is nearing its end-of-life I'm not too concerned about this however.

On second thought it's probably -march=native that's causing the instability. I ran a new test just now, removing -march=native, but re-adding -O3 and that goes through without issue and produces the same regression results in the master branch as with this PR. Using -mtune=native appears to be fine.

@bska bska left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed this now and tested it locally. With the caveat that this seems to trigger an instability when compiled with GCC 9.x and -march=native and -O3 I'm sufficiently convinced that it is not the PR itself that causes the problem. I will run a final build check just to be sure and then merge once the build check is green.

@bska

bska commented Jun 28, 2023

Copy link
Copy Markdown
Member

jenkins build this please

@bska

bska commented Jun 28, 2023

Copy link
Copy Markdown
Member

I'm sufficiently convinced that it is not the PR itself that causes the problem. I will run a final build check just to be sure and then merge once the build check is green.

Build check is green. I'll merge this into the master branch now.

@bska bska merged commit 9de5350 into OPM:master Jun 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants