-
Notifications
You must be signed in to change notification settings - Fork 121
8354404: [lworld] compiler/startup/StartupOutput.java crashes during AdapterHandlerLibrary::initialize() due to too little CodeCacheSize #1509
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back mchevalier! A progress list of the required criteria for merging this PR into |
@marc-chevalier This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been no new commits pushed to the As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@TobiHartmann) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
@TobiHartmann was wondering why it was so different, so I've dug in it. There are a lot more adapters in Valhalla than in mainline right now (700 vs 300), which causes a multiple hundred kilo additional usage (between 1 and 2 per adapter). It seems that adapters for a same signature are not cached (at least in debug). But this part of the code is widely different between Valhalla and mainline, and that's because JDK-8350209 changed a lot of things there recently, and hasn't yet been merge in Valhalla it seems. I've compared with mainline before this change, and it behaves similarly as Valhalla: a lot more adapters (very close in cardinal and set as Valhalla), same kind of weird crashes for lack of adapter space even for relatively high memory settings. Valhalla still seems to need a bit more memory, but it also needs more adapters, quite some being related to flat values in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spending time on looking into this thoroughly. The fix looks good to me for now. As we discussed offline, let's file a follow-up RFE to check again once JDK-8350209 is integrated.
@marc-chevalier this pull request can not be integrated into git checkout JDK-8354404.StartupOutput-fails
git fetch https://git.openjdk.org/valhalla.git lworld
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge lworld"
git push |
/integrate Thanks @TobiHartmann. To be continued... |
@marc-chevalier |
/sponsor |
Going to push as commit 5e55eda. |
@TobiHartmann @marc-chevalier Pushed as commit 5e55eda. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Experimentally, replacing 800 with 2000 is needed to have this test pass on all platform. Somehow, while 1300 seems ok on my Linux (while having a fair share of "CodeCache is full. Compiler has been disabled." messages), Windows boxes seems to struggle even with 1800. Not sure why: isn't the size of the things in the code cache (that is, the code), depends mostly on the encoding of instructions, that is dependent on architecture, not OS? Anyway, it happens reliably, so something had to be done!
Let's compare (with debug builds so I can use
CodeCacheMinimumUseSpace
, that is 1200k by default on debug builds), for various values ofReservedCodeCacheSize
(in k), when running-version
. For mainline:[4, 28]
coming from
report_should_not_call(char const*, int)
initialize_stubs(BlobId, int, int, char const*, char const*, char const*)
StubRoutines::initialize_continuation_stubs()
continuation_stubs_init()
init_globals()
with
3000
being20000
forReservedCodeCacheSize
between 5 and 24, and being500
forReservedCodeCacheSize
= 4 (the minimum).[29, 32]
fatal error: Initial size of CodeCache is too small
https://github.com/openjdk/jdk/blob/b787ad6f690df5c82a1efc5ccac658a9238ff201/src/hotspot/share/code/codeBlob.cpp#L579
coming from
RuntimeStub::new_runtime_stub(char const*, CodeBuffer*, short, int, OopMapSet*, bool, bool)
SharedRuntime::generate_jfr_return_lease()
SharedRuntime::generate_jfr_stubs()
init_globals()
[33, 1112]
...
report_should_not_call(char const*, int)
StubQueue::StubQueue(StubInterface*, int, Mutex*, char const*)
TemplateInterpreter::initialize_stub()
interpreter_init_stub()
init_globals()
[1113, 1116] SIGSEGV
CodeSection::emit_int8(unsigned char)
AbstractAssembler::emit_int8(int)
Assembler::push(Register)
MacroAssembler::enter()
RegisterSaver::save_live_registers(MacroAssembler*, int, int*, bool)
SharedRuntime::generate_resolve_blob(StubId, unsigned char*)
SharedRuntime::generate_stubs()
init_globals()
[1117, 1128]
fatal error: Initial size of CodeCache is too small
https://github.com/openjdk/jdk/blob/b787ad6f690df5c82a1efc5ccac658a9238ff201/src/hotspot/share/code/codeBlob.cpp#L596
coming from
SingletonBlob::operator new(unsigned long, unsigned int, bool)
DeoptimizationBlob::create(CodeBuffer*, OopMapSet*, int, int, int, int)
SharedRuntime::generate_deopt_blob()
SharedRuntime::generate_stubs()
init_globals()
[1129, 1144]
assert(_buffer != nullptr) failed: should be initialized
https://github.com/openjdk/jdk/blob/b787ad6f690df5c82a1efc5ccac658a9238ff201/src/hotspot/share/runtime/sharedRuntime.cpp#L2526
coming from
AdapterHandlerLibrary::generate_adapter_code(AdapterBlob*&, AdapterHandlerEntry*, int, BasicType*, bool)
AdapterHandlerLibrary::create_adapter(AdapterBlob*&, int, BasicType*, bool)
AdapterHandlerLibrary::initialize()
SharedRuntime::init_adapter_library()
init_globals()
[1145, 1148]
assert(no_arg_blob != nullptr && obj_arg_blob != nullptr && int_arg_blob != nullptr && obj_int_arg_blob != nullptr && obj_obj_arg_blob != nullptr) failed: Initial adapters must be properly created
https://github.com/openjdk/jdk/blob/b787ad6f690df5c82a1efc5ccac658a9238ff201/src/hotspot/share/runtime/sharedRuntime.cpp#L2605-L2609
coming from
AdapterHandlerLibrary::initialize()
SharedRuntime::init_adapter_library()
init_globals()
(just after the previous)
[1149, ~1700]
and eventually works, but occasionally crashes with
from
report_should_not_call(char const*, int)
initialize_stubs(BlobId, int, int, char const*, char const*, char const*)
StubRoutines::initialize_compiler_stubs()
compiler_stubs_init(bool)
C2Compiler::init_c2_runtime()
C2Compiler::initialize()
especially around 1180k, still happening at 1200k, which is the default minimum.
Then a various number of
C1 initialization failed
(and sometimesC2 initialization failed
) but it overall works.And now, for Valhalla:
[4, 20] finding the first behavior
CodeCache: no room for StubRoutines
[21, 24] a new kind of SIGSEGV!
CodeSection::emit_int8(unsigned char)
AbstractAssembler::emit_int8(int)
Assembler::prefix(Assembler::Prefix)
Assembler::prefixq_and_encode(int, bool)
Assembler::subq(Register, int)
MacroAssembler::subptr(Register, int)
StubGenerator::generate_return_value_stub(unsigned char*, char const*, bool)
StubGenerator::generate_initial_stubs()
StubGenerator::StubGenerator(CodeBuffer*, StubGenBlobId)
StubGenerator_generate(CodeBuffer*, StubGenBlobId)
initialize_stubs(StubGenBlobId, int, int, char const*, char const*, char const*)
StubRoutines::initialize_initial_stubs()
initial_stubs_init()
init_globals()
[25, 28] first behavior again
CodeCache: no room for StubRoutines
[29, 32]
fatal error: Initial size of CodeCache is too small
valhalla/src/hotspot/share/code/codeBlob.cpp
Line 579 in b787ad6
like the [29, 32] range of mainline
[33, 1164]
Native memory allocation (malloc) failed to allocate 1158080 bytes. Error detail: CodeCache: no room for Interpreter
like the [33, 1112] range of mainline[1165, 1172]
fatal error: Initial size of CodeCache is too small
valhalla/src/hotspot/share/code/codeBlob.cpp
Line 579 in b787ad6
RuntimeStub::new_runtime_stub(char const*, CodeBuffer*, short, int, OopMapSet*, bool, bool)
SharedRuntime::generate_resolve_blob(SharedStubId, unsigned char*)
SharedRuntime::generate_stubs()
init_globals()
[1173, 1220] SIGSEGV
CodeBlob::name() const
CodeBuffer::CodeBuffer(CodeBlob*)
AdapterHandlerLibrary::create_adapter(AdapterBlob*&, CompiledEntrySignature&, bool)
AdapterHandlerLibrary::initialize()
SharedRuntime::generate_stubs()
init_globals()
actually analogous to the range [1129, 1144] (
assert(_buffer != nullptr) failed: should be initialized
) of mainline. But Valhalla is missing this assert yet, the_buffer
isnullptr
, andCodeBlob::name() const
is called on this null pointer that is provided to the ctor ofCodeBuffer
[1221, 1224]
assert(no_arg_blob != nullptr && obj_arg_blob != nullptr && int_arg_blob != nullptr && obj_int_arg_blob != nullptr && obj_obj_arg_blob != nullptr) failed: Initial adapters must be properly created
valhalla/src/hotspot/share/runtime/sharedRuntime.cpp
Lines 2605 to 2609 in b787ad6
same as the range [1145, 1148] of mainline
[1225, 1228] SIGSEGV
CodeSection::emit_int8(unsigned char)
AbstractAssembler::emit_int8(int)
Assembler::push(Register)
SharedRuntime::generate_handler_blob(SharedStubId, unsigned char*)
SharedRuntime::generate_stubs()
init_globals()
[1229, 1232]
fatal error: Initial size of CodeCache is too small
valhalla/src/hotspot/share/code/codeBlob.cpp
Line 596 in b787ad6
same as the [1117, 1128] range of mainline
[1233, 1236]
exit code = 1
[1237, 1300]
exit code = 1
[1301, 1304]
exit code = 1
[1305, 1308]
exit code = 1
[1309, 1792]
and eventually works
[1793, 1993]
exit code = 1
with possibly additional lines among:
(yes, an empty line)
[1994, ...] mostly works, but sometimes the same error as above, witnessed until 3780k (and especially between 3640 and 3780). Interestingly, at 3700k, we also see some fine runs, with, or without compiler shutdown. Starting 3800k, I never saw a failure, and starting 3900k, no compiler shutdown.
So, how to pick
CodeCacheMinimumUseSpace
, that is the smaller value ofReservedCodeCacheSize
we allow? On mainline with the default value (1200k on debug, 400k on product), we don't see any crash anymore. On Valhalla, at 2000k, it works always as far as I can see, but around 3700, it crashes in most cases (maybe two thirds or three quarters of the runs). If we are aiming for a value that would guarantee no crashes, we should pick at least 3800k, that is quite a huge increase. If we consider the weird crashes around 3700k as acceptable (and maybe fixable), 2000k is probably fine. Since that's the debug value, that is three times scaled up from the product onehttps://github.com/openjdk/jdk/blob/b787ad6f690df5c82a1efc5ccac658a9238ff201/src/hotspot/share/code/codeCache.cpp#L208
that would mean taking
CodeCacheMinimumUseSpace
as 1267k (in the first case, maybe rounded to 1300k), or 667k (in the second case, maybe rounded to 700k).That's for the minimum. As for the defaults: they still somewhat makes sense, in the meaning that we can run quite some things without filling the code cache. It's not clear to me what expectations we have and how we can decide.
I didn't change those as part of this PR, as the JBS issue is not primarily about that, and it looks like a more... subtle problem. But I can if a consensus emerges. Let the discussion begin!
Thanks,
Marc, the text wall builder (textmason?)
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/valhalla.git pull/1509/head:pull/1509
$ git checkout pull/1509
Update a local copy of the PR:
$ git checkout pull/1509
$ git pull https://git.openjdk.org/valhalla.git pull/1509/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 1509
View PR using the GUI difftool:
$ git pr show -t 1509
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/valhalla/pull/1509.diff
Using Webrev
Link to Webrev Comment