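Overview: an analysis and verification of the design and implementation of Java Safepoint/Handshake, excerpted from the Safepoint section of my open-source book 《面向技术宅的 JVM 内幕》. This article is a consolidated excerpt. I recommend reading the original book on a computer or tablet, where the layout, figures, structure and overall reading experience are friendlier.
Outline:
- Necessary evil
- Safepoint terminology
- Safepoint process overview
- JavaThread - State
- GC oop trace
- The Safepoint cooperation process in detail
- Troubleshooting Safepoint problems
- References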
Necessary evil
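In an age where "cutting costs and raising laughs" (a pun on "cutting costs and raising efficiency") is in fashion, everything has to be called light-weight. Application frameworks must be light-weight, VMs must be swapped for light-weight containers, and even a girlfriend has to be light-weight.
So the programming-language runtime underneath should be light-weight all the more. C++, for example, has its own principle: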
What is the zero-overhead principle? - isocpp.org
The zero-overhead principle is a guiding principle for the design of C++. It states that: What you don’t use, you don’t pay for (in time or space) and further: What you do use, you couldn’t hand code any better.
In other words, no feature should be added to C++ which would make any existing code (not using the new feature) larger or slower, nor should any feature be added for which the compiler would generate code that is not as good as a programmer would create without using the feature.
Zero-overhead principle - cppreference.com
The zero-overhead principle is a C++ design principle that states:
- You don’t pay for what you don’t use.
- What you do use is just as efficient as what you could reasonably write by hand.
…
The only two features in the language that do not follow the zero-overhead principle are runtime type identification and exceptions, and are why most compilers include a switch to turn them off.
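There are also some discussions and experiments around this principle:
Zero overhead is nice, but the price is that you Repeat Yourself doing tedious chores such as memory management. So GC appeared in many languages. Before Rust came along, everyone fell in love with GC languages and considered Stop The World, or the smaller-scale "Stop some countries", an acceptable Necessary evil. Even Golang at the height of its popularity needs a GC, to say nothing of middle-aged Java, whose GC/Stop The World was "notorious" before G1 and ZGC. So this article studies that Necessary evil: Java's Stop The World.
Wikipedia explains the term Necessary evil as follows: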
A necessary evil is an evil that someone believes must be done or accepted because it is necessary to achieve a better outcome—especially because possible alternative courses of action or inaction are expected to be worse. It is the “lesser evil” in the lesser of two evils principle, which maintains that given two bad choices, the one that is least bad is the better choice.
History
The Oxford Dictionary of Word Origins asserts that “[t]he idea of a necessary evil goes back to Greek”, describing the first necessary evil as marriage, and further stating that, “The first example in English, from 1547, refers to a woman”. Thomas Fuller, in his 1642 work, The Holy State and the Profane State, made another of the earliest recorded uses of the phrase when he described the court jester as something that “…some count a necessary evil in a Court”. In Common Sense, Thomas Paine described government as at best a “necessary evil”.
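Safepoint is the mechanism that Java end users dislike most, and the one that JVM implementers both love and hate yet depend on heavily. That makes it a mechanism everyone who studies Java/JVM has to study.
Figure: Safepoint is a dependency and coordination mechanism for many JVM modules (from: HotSpot JVM Deep Dive - Safepoint)
Safepoint terminology
Let's start with the terminology around Safepoint:
Safepoint / Safepointing / Stopping-the-world
From: HotSpot JVM Deep Dive - Safepoint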
Thread-local GC root := An oop, i.e. a pointer into the Java heap, local to a JavaThread. The denoted Java object is a root of a reachability tree.
Mutable thread state := A JavaThread state in which the thread can mutate the Java heap or its thread-local GC roots. Aka an unsafe state.
A Safepoint (noun) is a global JVM state
- Intuition: At this point (state), the Java world is stopped. It is therefore safe, as in exclusive access, to inspect and process by the JVM.
- Technical: No JavaThread is executing inside or can transition into a thread state classified as mutable.
- Technical: Thread-local GC roots for all JavaThreads are accessible (published) to the JVM.
Safepointing (verb) or Stopping-the-world is a JVM process or mechanism to reach a Safepoint
- Intuition, older notion: "The process of halting or stopping all executing Java threads"
- Technical: The JVM cooperates with Java Threads using a technique called Cooperative Suspension
Cooperative Suspension is a poll-based technique
- JavaThreads check or poll thread-local state at designated locations
- On suspension, the JVM blocks JavaThreads from transitioning into thread states classified as mutable
- On suspension, the JVM triggers JavaThreads to transition from a mutable into an immutable thread state. As a consequence, thread-local GC roots are published.
For example:
- mov r10, qword ptr [r15+130h] // get thread-local poll page address
- test dword ptr [r10], eax // try to read the poll page
Traditionally, bringing the system to a Safepoint has been a necessary evil for runtimes that provide some form of automatic memory management
- A pervasive JVM/Runtime mechanism. Consequently, a lot of machinery in the JVM.
- But JVM developments, especially in the GC area, move ever closer to obviating the need for the global JVM safepoint state.
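Safepoint process overview
The text above is not very intuitive, so here is a figure:
Figure: The steps of Stop The World. Source: Async-profiler - manual by use cases
- Global safepoint request
  1.1 A thread asks a thread called VM Thread to enter a safepoint. The request carries a safepoint operation parameter, which is really the callback to run after STOP THE WORLD (STW). The trigger may be a GC, or something else.
  1.2 After receiving the safepoint request, the VM Thread sets a JVM-global safepoint flag to true (this flag can be the OS access permission of a memory page).
  1.3 The VM Thread then waits for the other application threads (App threads) to reach (enter) the safepoint.
  1.4 The application threads check this safepoint flag at high frequency; when a thread finds it true, it reaches (enters) the safepoint state.
- Global safepoint
  When the VM Thread finds that all App threads have reached the safepoint (the real start of STW), it starts executing the safepoint operation. A GC operation is one possible type of safepoint operation.
- End of safepoint operation
  When the safepoint operation finishes, the VM Thread ends the STW.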
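The request, wait, operate and release steps above can be mimicked in plain Java. The following is only a toy model under loud assumptions: the class SafepointSim and all of its members are made up for illustration, a volatile boolean plays the role of the safepoint flag, and an AtomicLong stands in for the Java heap. HotSpot itself uses polling pages and thread states, not a Java monitor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Toy model of the global safepoint protocol; every name here is hypothetical, not a HotSpot API.
final class SafepointSim {
    private final Object lock = new Object();
    private volatile boolean armed = false;            // the global "safepoint flag"
    private volatile boolean running = true;
    private int parked = 0;                            // guarded by lock
    private final AtomicLong heap = new AtomicLong();  // stands in for the mutable Java heap

    // Step 1.4: called by application threads at their poll locations.
    private void poll() throws InterruptedException {
        if (!armed) return;                            // fast path: flag not set
        synchronized (lock) {
            parked++;
            lock.notifyAll();                          // report arrival to the coordinator
            try {
                while (armed) lock.wait();             // stay parked until the flag is cleared
            } finally {
                parked--;
            }
        }
    }

    // Played by the coordinator (the "VM Thread" role).
    private boolean runAtSafepoint(int workers) throws InterruptedException {
        synchronized (lock) {
            armed = true;                              // step 1.2: set the flag
            while (parked < workers) lock.wait();      // step 1.3: wait until every worker arrived
            long before = heap.get();                  // step 2: the "safepoint operation"
            Thread.yield();
            long after = heap.get();
            armed = false;                             // step 3: end of the safepoint
            lock.notifyAll();
            return before == after;                    // true if nobody mutated the heap meanwhile
        }
    }

    static boolean demo(int workers) {
        SafepointSim sim = new SafepointSim();
        List<Thread> threads = new ArrayList<>();
        try {
            for (int i = 0; i < workers; i++) {
                Thread t = new Thread(() -> {
                    try {
                        while (sim.running) {
                            sim.heap.incrementAndGet();    // mutate the "heap"
                            sim.poll();
                        }
                    } catch (InterruptedException ignored) {
                        // not expected in this sketch
                    }
                });
                t.start();
                threads.add(t);
            }
            boolean stable = sim.runAtSafepoint(workers);
            sim.running = false;
            for (Thread t : threads) t.join();
            return stable;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("heap unchanged during safepoint: " + demo(4));
    }
}
```

Because every worker is parked inside the monitor while the coordinator holds it, the two heap reads in the operation observe the same value. That is the "exclusive access" intuition from the terminology section, reduced to a few lines.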
JavaThread - State
See the section JavaThread Polling and Reach Safepoint - JavaThread - State of the book for details. The following is an excerpt:
The Safepoint mechanism is built on JavaThread.
src/hotspot/share/runtime/javaThread.hpp
src/hotspot/share/utilities/globalDefinitions.hpp
In class JavaThread, the field JavaThreadState _thread_state records the thread's state.
Figure: The JavaThread state machine. Source: Java Thread state machine
From: HotSpot JVM Deep Dive - Safepoint
This is the state machine for the Java thread, and we can further classify it into the following categories:
- mutable thread state: a state in which the thread can mutate the Java heap or its thread-local GC roots
- immutable thread states: a state where the thread can do none of these things
- transition states: these act like bridges between the mutable and the immutable states; a transition state has a safepoint check, or a poll instruction, together with appropriate fencing
From: HotSpot JVM Deep Dive - Safepoint
Let's, for example, take a look at this situation: a new thread comes into being and starts running in the VM state.
Let's say this thread now wants to execute some Java code. In order to do that, it will need to traverse a transition into the java state, and that transition, as we said, contains a safepoint check. Some notable transitions here: Java code (java state) can transition to VM state and to native state without performing safepoint checks; instead, the safepoint check is performed when the thread returns to state java.
Another important takeaway here is that code executing in state native is considered safe. This means that during a safepoint, Java threads can actually continue running native code. It also means that, counter to the intuitive notion that a safepoint involves blocking or halting all Java threads, it only means that they do not execute in a sensitive, mutable state.
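The transition rules quoted above can be condensed into a small state machine. This is a deliberately simplified sketch: the enum constants, the class ThreadStateModel and the polling rule are my own assumptions for illustration. HotSpot has more states (including the xxxx_trans intermediate states), and the rule "poll when entering the java state, or when going from an immutable to a mutable state" is just a reading of the talk.

```java
// Simplified model of the JavaThread state machine described above.
// The enum constants and the polling rule are illustrative assumptions, not HotSpot code.
final class ThreadStateModel {
    enum State {
        IN_JAVA(true), IN_VM(true), IN_NATIVE(false), BLOCKED(false);

        final boolean mutable;   // true: the thread may mutate the heap or its GC roots
        State(boolean mutable) { this.mutable = mutable; }
    }

    private State state = State.IN_VM;   // a new thread starts out in the VM state
    private int polls = 0;

    // Performs a transition and returns true if it included a safepoint check.
    boolean transitionTo(State target) {
        boolean needsPoll = target == State.IN_JAVA || (!state.mutable && target.mutable);
        if (needsPoll) polls++;          // a real JVM would block here while a safepoint is armed
        state = target;
        return needsPoll;
    }

    // A thread in an immutable state does not have to be stopped for a safepoint.
    boolean safeDuringSafepoint() { return !state.mutable; }

    int polls() { return polls; }

    public static void main(String[] args) {
        ThreadStateModel t = new ThreadStateModel();
        System.out.println("VM -> Java polls:     " + t.transitionTo(State.IN_JAVA));
        System.out.println("Java -> native polls: " + t.transitionTo(State.IN_NATIVE));
        System.out.println("safe in native:       " + t.safeDuringSafepoint());
        System.out.println("native -> Java polls: " + t.transitionTo(State.IN_JAVA));
    }
}
```

The model makes the asymmetry visible: leaving the java state is free, while coming back is where the thread can be held while a safepoint is in progress.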
GC oop trace
src/hotspot/share/runtime/javaThread.hpp
src/hotspot/share/runtime/javaFrameAnchor.hpp
From: HotSpot JVM Deep Dive - Safepoint
The second clause of the global JVM state was that the thread-local GC roots of all Java threads are accessible, or published, to the JVM. All current garbage collectors are tracing collectors, which means they follow, or trace, the reachability trees starting out from what is called a root set, that is, a set of immediately available oops.
A proper subset of the root set is the set of roots that is local to and reachable from Java threads.
Let's take a look at what some of these thread-local GC roots are.
oop Handles
- Local JNI handles: A JavaThread has a field called JNIHandleBlock* _active_handles. A local JNI handle provides indirect access to an oop for JNI code running in state native. But allocating, deallocating and even dereferencing a JNI handle involves first performing a VM state transition, which will perform a safepoint check. Local JNI handles are auto-managed, so when the code returns from a JNI method, that is, when it transitions from state native into state java (doing a safepoint check), the local JNI handles allocated by that method are deallocated.
- HandleArea (missing in OpenJDK21): The JavaThread also has a field called handle area. The handle area and its companion, the handle, provide pretty much the same indirection functionality as a local JNI handle, but they are targeted at code running in the VM state. The important difference is that these handles are NOT auto-managed; instead they must be managed manually by the OpenJDK programmer. Handle marks are used to describe a handle scope. The handle mark destructor deallocates the handles allocated for that particular scope, and scopes can also be nested.
Last Java Frame
The thread also has an embedded struct, the JavaFrameAnchor _anchor field. It consists of three pointers:
- _last_Java_sp for the last Java stack pointer
- _last_Java_pc for the last Java program counter
- _last_Java_fp (missing in OpenJDK21, perhaps because of virtual threads?) for the last Java frame pointer
The last Java frame is the entry point for external stack walking. It is set if a thread has at least one Java activation record, or frame, on its stack and is currently not in state java. So the _last_Java_fp is set in state java before the thread transitions out, and conversely it is cleared upon thread re-entry. The anchor struct requires only that the last Java stack pointer is set, as the other fields are either not relevant for that context or can be derived by the stack walking code.
Java frames on the stack may contain ordinary oops, narrow oops or derived oops. Compared to the handles discussed previously, these are naked oops: they have no handle indirection, they are direct pointers.
- An ordinary oop is a regular oop.
- A narrow oop is a compressed version of an oop; it is a 32-bit sized oop.
- A derived oop is a pointer into an object that does not point directly to its header. For example, think of a pointer to an element in an array. A derived oop is always associated with a base oop.
For a specific code position (pc), which stack slots and registers contain oops is described by a piece of metadata generated by the compilers, called an OopMap. To pinpoint an oop in a frame, the OopMap describes a location using a relative address, either from the frame stack pointer (sp) or as an index into a RegisterMap. Not all code positions have OopMaps; mainly call sites and safepoint poll page instructions do. For stack walks, the return address of each frame is associated with an OopMap.
JavaThread CPU Context
A thread executing Java code also has a CPU context. For calling-convention and performance reasons, oops are ideally placed in registers. HotSpot widely employs something called Stubs or StubRoutines, which are special platform-specific assembly helper routines. An important feature of most Stubs is to save the CPU context when a thread leaves, or suspends, its Java execution, and to restore it when the thread re-enters and resumes execution. A RegisterMap is used to resolve a location described by an OopMap to be in a register.
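To make the "stack slot or register" idea concrete, here is a small sketch that parses the ImmutableOopMap annotation that -XX:+PrintAssembly attaches to call sites and poll instructions (one such line appears in the JIT Polling experiment later in this article). The class OopMapAnnotation and its Entry helper are hypothetical names of mine, and the parser only understands the token shapes visible in that listing, so treat it as a reading aid rather than a description of HotSpot's own data structures.

```java
import java.util.ArrayList;
import java.util.List;

// Parses an "ImmutableOopMap {...}" annotation as printed in JIT assembly logs.
// Entry is a hypothetical helper; only the token shapes seen in this article are handled.
final class OopMapAnnotation {
    static final class Entry {
        final String register;    // register name, or null when the oop lives in a stack slot
        final int stackOffset;    // offset printed inside [..], or -1 for a register
        final String kind;        // e.g. "Oop" or "Derived_oop_[0]"

        Entry(String register, int stackOffset, String kind) {
            this.register = register;
            this.stackOffset = stackOffset;
            this.kind = kind;
        }

        boolean inRegister() { return register != null; }
        boolean isDerived() { return kind.startsWith("Derived_oop_"); }
    }

    static List<Entry> parse(String annotation) {
        int open = annotation.indexOf('{');
        int close = annotation.lastIndexOf('}');
        if (open < 0 || close < open) {
            throw new IllegalArgumentException("no {...} section in: " + annotation);
        }
        List<Entry> entries = new ArrayList<>();
        for (String token : annotation.substring(open + 1, close).trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq <= 0) continue;                       // skip empty or malformed tokens
            String location = token.substring(0, eq);
            String kind = token.substring(eq + 1);
            if (location.startsWith("[") && location.endsWith("]")) {
                int offset = Integer.parseInt(location.substring(1, location.length() - 1));
                entries.add(new Entry(null, offset, kind));
            } else {
                entries.add(new Entry(location, -1, kind));
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        String line = "ImmutableOopMap {rdx=Oop [0]=Oop r8=Derived_oop_[0] [24]=Oop }";
        for (Entry e : parse(line)) {
            String where = e.inRegister() ? "register " + e.register : "stack slot " + e.stackOffset;
            System.out.println(where + " -> " + e.kind);
        }
    }
}
```

In the sample line, rdx and r8 are register locations (the kind a RegisterMap resolves), [0] and [24] are stack locations, and the r8 entry is a derived oop whose base lives at [0].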
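The Safepoint cooperation process in detail
The Safepoint cooperation process can be divided into the following steps:
- Application threads polling for a Safepoint
- Listening for Safepoint Requests
- Receiving a Safepoint Request
- Arm Safepoint - marking all threads
- Waiting for application threads to reach the Safepoint
- Application threads trapping into the Safepoint
- Global safepoint - The World Stopped
- End of the Safepoint operation
- Disarming Safepoint
Application threads polling for a Safepoint
See the section JavaThread Polling and Reach Safepoint - Polling of the book for details. The following is an excerpt:
Background
Register classes in JIT-generated code
Fixed registers
- $r12 - holds the Java heap base
- $r15 - holds the thread-local JavaThread pointer
How non-fixed (general-purpose) registers are preserved across frames
- $rbp - callee-saved
- other general-purpose registers - caller-saved
Polling
Java threads check the safepoint flag at high frequency (safepoint check/polling); when a thread finds it true (armed), it reaches (enters) the safepoint state.
JVM initialization
At startup, the JVM initializes two memory pages for safepoints. One, bad_page, is not readable: if a thread executes the x86 test instruction against it, the thread receives a signal, is suspended, and jumps to the signal handler code. The other, good_page, is readable, so the x86 test instruction executes normally: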
Stack:
libjvm.so!SafepointMechanism::default_initialize() (/jdk/src/hotspot/share/runtime/safepointMechanism.cpp:68)
libjvm.so!SafepointMechanism::pd_initialize() (/jdk/src/hotspot/share/runtime/safepointMechanism.hpp:56)
libjvm.so!SafepointMechanism::initialize() (/jdk/src/hotspot/share/runtime/safepointMechanism.cpp:171)
libjvm.so!Threads::create_vm(JavaVMInitArgs * args, bool * canTryAgain) (/jdk/src/hotspot/share/runtime/threads.cpp:492)
libjvm.so!JNI_CreateJavaVM_inner(JavaVM ** vm, void ** penv, void * args) (/jdk/src/hotspot/share/prims/jni.cpp:3577)
libjvm.so!JNI_CreateJavaVM(JavaVM ** vm, void ** penv, void * args) (/jdk/src/hotspot/share/prims/jni.cpp:3668)
libjli.so!InitializeJVM(JavaVM ** pvm, JNIEnv ** penv, InvocationFunctions * ifn) (/jdk/src/java.base/share/native/libjli/java.c:1506)
libjli.so!JavaMain(void * _args) (/jdk/src/java.base/share/native/libjli/java.c:415)
libjli.so!ThreadJavaMain(void * args) (/jdk/src/java.base/unix/native/libjli/java_md.c:650)
libc.so.6!start_thread(void * arg) (pthread_create.c:442)
libc.so.6!clone3() (clone3.S:81)
src/hotspot/share/runtime/safepointMechanism.cpp
The actual polling
First, the relevant data structures:
src/hotspot/share/runtime/javaThread.hpp
src/hotspot/share/runtime/safepointMechanism.hpp
From the code above, we can guess that SafepointMechanism._polling_page is a global variable, corresponding to the Global Safepoint, while JavaThread._poll_data._polling_page is thread-local, corresponding to Thread-Local Handshakes.
Since JEP 312: Thread-Local Handshakes (2017) in OpenJDK10, there has also been a safepoint that is not JVM-global: the Thread Safepoint. The JVM-global Safepoint also appears to have been reimplemented on top of Thread-Local Handshakes, i.e. by performing a Thread-Local Handshake on every JavaThread.
In OpenJDK10, ThreadLocalHandshakes could be disabled with -XX:-ThreadLocalHandshakes, but after the following steps that was no longer possible:
- Deprecated in JDK13
- Obsoleted in JDK14
- Expired in JDK15
The reason, of course, is that OpenJDK had come to depend heavily on this feature: Obsolete ThreadLocalHandshakes - bugs.openjdk.org.
Polling in JIT-compiled code
The following figure illustrates how the polling_page is switched:
Figure: Switching the polling_page. Source: The Inner Workings of Safepoints 2023 - mostlynerdless.de
Figure: JavaThread and the R15 register. Source: Robbin Ehn: Handshaking HotSpot - Youtube Java Channel - 2020
The figure above means: read the address that the _polling_page field of the current thread's JavaThread._poll_data (a SafepointMechanism::ThreadData) points to. The R15 register normally points to the JavaThread of the current thread.
Source: Robbin Ehn: Handshaking HotSpot - Youtube Java Channel - 2020
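The above shows that polling takes two machine instructions. If you have read material from before OpenJDK10, a single machine instruction used to be enough:
test DWORD PTR [rip+0xa2b0966],eax
The main reason is that OpenJDK10 enabled the JEP 312: Thread-Local Handshakes design by default, which requires every thread to have its own polling_page pointer, so one more machine instruction is needed for the extra level of indirection.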
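The difference between one global flag and one poll word per thread can be illustrated with a toy model. In the sketch below, the class HandshakeSim, the Worker type and the pollWord field are all hypothetical; an AtomicReference plays the role of the per-thread polling state, and the requester arms exactly one thread. It only illustrates the idea behind thread-local handshakes, not HotSpot's implementation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Toy model of a thread-local handshake; names are hypothetical, not HotSpot APIs.
final class HandshakeSim {
    static final class Worker extends Thread {
        // Per-thread "poll word": null means disarmed, non-null carries the pending operation.
        final AtomicReference<Runnable> pollWord = new AtomicReference<>();
        volatile boolean running = true;
        volatile int handshakes = 0;       // written only by this worker thread

        @Override
        public void run() {
            while (running) {
                Runnable op = pollWord.getAndSet(null);   // poll this thread's own word
                if (op != null) {
                    handshakes++;
                    op.run();                             // execute the operation, then carry on
                }
                Thread.yield();                           // stands in for real application work
            }
        }
    }

    // Arms exactly one thread and blocks until that thread has run the operation.
    static void handshake(Worker target, Runnable op) {
        CountDownLatch done = new CountDownLatch(1);
        target.pollWord.set(() -> { op.run(); done.countDown(); });
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    static boolean demo() {
        Worker a = new Worker();
        Worker b = new Worker();
        a.start();
        b.start();
        AtomicReference<Thread> ranOn = new AtomicReference<>();
        handshake(a, () -> ranOn.set(Thread.currentThread()));
        a.running = false;
        b.running = false;
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        // Only the armed thread took part; the other one was never disturbed.
        return ranOn.get() == a && a.handshakes == 1 && b.handshakes == 0;
    }

    public static void main(String[] args) {
        System.out.println("only the targeted thread handshaked: " + demo());
    }
}
```

Each worker has to go through its own object to find its poll word, which mirrors why the JIT-compiled poll needs the extra load through the thread pointer. A global safepoint can then be expressed as arming every thread in turn.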
JIT Polling experiment
Now let's fact-check this by experiment and observation. We reuse the example environment, program and output from the section Stack Memory Anatomy - Java Options of the book, and look for the polling instructions in mylogfile.log, the JIT assembly file written by java ... -XX:+PrintAssembly ... -XX:LogFile=./round3/mylogfile.log.
- Start the GDB debugger; see the section Stack Memory Anatomy - Start the debugger of the book
- Inspect the JavaThread object layout; see the section GDB JVM FAQ - Inspect Object Layout for details
(gdb) ptype /xo 'Thread'
/* offset | size */ type = class Thread : public ThreadShadow {
private:
static class Thread *_thr_current;
/* 0x0020 | 0x0008 */ uint64_t _nmethod_disarmed_guard_value;
... public:
/* 0x048c | 0x0004 */ volatile enum JavaThreadState _thread_state;
private:
/* 0x0490 | 0x0010 */ struct SafepointMechanism::ThreadData {
/* 0x0490 | 0x0008 */ volatile uintptr_t _polling_word;
/* 0x0498 | 0x0008 */ volatile uintptr_t _polling_page;
/* total size (bytes): 16 */
} _poll_data;
/* 0x04a0 | 0x0008 */ class ThreadSafepointState *_safepoint_state;
/* 0x04a8 | 0x0008 */ address _saved_exception_pc;
- Find the poll instruction
As shown, the offset of _polling_page is 0x0498, i.e. 0x498. So we search mylogfile.log for 0x498. There are several hundred matches; here is one of them:
[Entry Point]
# {method} {0x00007ffbf4249ab8} 'enqueue' '(Ljava/util/concurrent/locks/AbstractQueuedSynchronizer$ConditionNode;)V' in 'java/util/concurrent/locks/AbstractQueuedSynchronizer'
# this: rsi:rsi = 'java/util/concurrent/locks/AbstractQueuedSynchronizer'
# parm0: rdx:rdx = 'java/util/concurrent/locks/AbstractQueuedSynchronizer$ConditionNode'
# [sp+0x40] (sp of caller)
...
0x00007fffed738747: call 0x00007fffed1170a0 ; ImmutableOopMap {[0]=Oop [16]=Derived_oop_[0] [8]=Oop [24]=Oop }
;*invokevirtual setPrevRelaxed {reexecute=0 rethrow=0 return_oop=0}
; - java.util.concurrent.locks.AbstractQueuedSynchronizer::enqueue@31 (line 614)
; {optimized virtual_call}
0x00007fffed73874c: nop DWORD PTR [rax+rax*1+0x23c] ; {other}
...
0x00007fffed738786: mov BYTE PTR [r9+r10*1],0x0 ;*ifeq {reexecute=0 rethrow=0 return_oop=0}
; - java.util.concurrent.locks.AbstractQueuedSynchronizer::enqueue@40 (line 615)
;; B8: # out( B3 B9 ) <- in( B7 B6 ) Freq: 8.17349
0x00007fffed73878b: mov r10,QWORD PTR [r15+0x498] ; ImmutableOopMap {rdx=Oop [0]=Oop r8=Derived_oop_[0] [24]=Oop }
;*ifeq {reexecute=1 rethrow=0 return_oop=0}
; - (reexecute) java.util.concurrent.locks.AbstractQueuedSynchronizer::enqueue@40 (line 615)
0x00007fffed738792: test DWORD PTR [r10],eax ; {poll}
0x00007fffed738795: test r11d,r11d
0x00007fffed738798: je 0x00007fffed738722
Here,
0x00007fffed73878b: mov r10,QWORD PTR [r15+0x498]
0x00007fffed738792: test DWORD PTR [r10],eax
is the safepoint poll. As described above, the r15 register holds the thread-local pointer to the JavaThread object. Some OopMaps are also visible in the listing.
Java source:
src/java.base/share/classes/java/util/concurrent/locks/AbstractQueuedSynchronizer.java
606: final void enqueue(ConditionNode node) {
607: if (node != null) {
608: boolean unpark = false;
609: for (Node t;;) {
610: if ((t = tail) == null && (t = tryInitializeHead()) == null) {
611: unpark = true;
612: break;
613: }
614: node.setPrevRelaxed(t); // <<<< Safe point polled after call
615: if (casTail(t, node)) {
616: t.next = node;
617: if (t.status < 0)
618: unpark = true;
619: break;
620: }
621: }
622: if (unpark)
623: LockSupport.unpark(node.waiter);
624: }
625: }
- Further exploration
Now let's look at the fields of the JavaThread. To see them we first need the JavaThread's address, which jhsdb can give us:
hsdb> threads
513811 main
State: BLOCKED
Stack in use by Java: 0x00007ffff5285740 .. 0x00007ffff5285830
Base of Stack: 0x00007ffff5287000
Last_Java_SP: 0x00007ffff5285740
Last_Java_FP: null
Last_Java_PC: 0x00007fffed72c9af
Thread id: 513811
hsdb> threadcontext 513811
Thread "main" id=513811 Address=0x00007ffff002b3c0
As shown, the address of the main JavaThread is Address=0x00007ffff002b3c0. Let's inspect it in hsdb:
hsdb> inspect 0x00007ffff002b3c0
Type is JavaThread (size of 1904)
oop ThreadShadow::_pending_exception: null
char* ThreadShadow::_exception_file: char @ null
int ThreadShadow::_exception_line: 0
ThreadLocalAllocBuffer Thread::_tlab: ThreadLocalAllocBuffer @ 0x00007ffff002b580
jlong Thread::_allocated_bytes: 0
ResourceArea* Thread::_resource_area: ResourceArea @ 0x00007ffff0018ba0
LockStack JavaThread::_lock_stack: LockStack @ 0x00007ffff002bae8
OopHandle JavaThread::_threadObj: OopHandle @ 0x00007ffff002b770
OopHandle JavaThread::_vthread: OopHandle @ 0x00007ffff002b778
OopHandle JavaThread::_jvmti_vthread: OopHandle @ 0x00007ffff002b780
OopHandle JavaThread::_scopedValueCache: OopHandle @ 0x00007ffff002b788
JavaFrameAnchor JavaThread::_anchor: JavaFrameAnchor @ 0x00007ffff002b798
oop JavaThread::_vm_result: null
Metadata* JavaThread::_vm_result_2: Metadata @ null
ObjectMonitor* JavaThread::_current_pending_monitor: ObjectMonitor @ null
bool JavaThread::_current_pending_monitor_is_from_java: 1
ObjectMonitor* JavaThread::_current_waiting_monitor: ObjectMonitor @ null
uint32_t JavaThread::_suspend_flags: 0
oop JavaThread::_exception_oop: null
address JavaThread::_exception_pc: address @ 0x00007ffff002b918
int JavaThread::_is_method_handle_return: 0
address JavaThread::_saved_exception_pc: address @ 0x00007ffff002b868
JavaThreadState JavaThread::_thread_state: 10
OSThread* JavaThread::_osthread: OSThread @ 0x00007ffff002d9a0
address JavaThread::_stack_base: address @ 0x00007ffff002b718
size_t JavaThread::_stack_size: 1048576
vframeArray* JavaThread::_vframe_array_head: vframeArray @ null
vframeArray* JavaThread::_vframe_array_last: vframeArray @ 0x00007ffff031da90
JNIHandleBlock* JavaThread::_active_handles: JNIHandleBlock @ 0x00007ffff01686f0
JavaThread::TerminatedTypes JavaThread::_terminated: 57002
gdb still gives more information than hsdb:
$1 = (class JavaThread *) 0x7ffff002b3c0
(gdb) p *((JavaThread*)0x00007ffff002b3c0)
$2 = {<Thread> = {<ThreadShadow> = {<CHeapObj<(MEMFLAGS)2>> = {<No data fields>}, _vptr.ThreadShadow = 0x7ffff7b4c4c8 <vtable for JavaThread+16>, _pending_exception = 0x0, _exception_file = 0x0, _exception_line = 0}, _nmethod_disarmed_guard_value = 1, ...
(gdb) p /x ((JavaThread*)0x00007ffff002b3c0)->_poll_data._polling_page
$4 = 0x7ffff7fa1000
Then, in pmap.txt, the output file of the earlier pmap run, we find:
Address Perm Offset Device Inode Size Rss Pss Referenced Anonymous LazyFree ShmemPmdMapped FilePmdMapped Shared_Hugetlb Private_Hugetlb Swap SwapPss Locked THPeligible Mapping
7ffff7fa1000 ---p 00000000 00:00 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0
7ffff7fa2000 r--p 00000000 00:00 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0
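As shown, at the time of the core dump the thread-local _polling_page pointed at the bad page (no r in Perm), i.e. the thread was in the armed state.
Out of curiosity, let's also look at the static page pointers of SafepointMechanism.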
p/x SafepointMechanism::_poll_page_armed_value
$5 = 0x7ffff7fa1000 (Perm:---p)
p/x SafepointMechanism::_poll_page_disarmed_value
$6 = 0x7ffff7fa2000 (Perm:r--p)
p/x SafepointMechanism::_polling_page
$7 = 0x7ffff7fa1000
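The manual lookup above (take the _polling_page value, find its line in pmap.txt, read the Perm column) is easy to script. The sketch below assumes the pmap layout shown in this article, where the first column is the mapping start address in hex and the second is Perm; the class PollPageCheck and its methods are names I made up, and it only matches an address that is exactly the start of a mapping.

```java
// Looks up the Perm column for a polling page address in pmap-style output.
// Assumes column 1 is the mapping start address (hex) and column 2 is Perm.
final class PollPageCheck {
    // Returns the Perm string of the mapping starting at the address, or null if none matches.
    static String permsOf(String pmapOutput, long address) {
        for (String line : pmapOutput.split("\\R")) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 2) continue;
            try {
                if (Long.parseUnsignedLong(cols[0], 16) == address) return cols[1];
            } catch (NumberFormatException headerOrOtherLine) {
                // not a mapping line, e.g. the column header
            }
        }
        return null;
    }

    // A poll page without read permission makes the poll instruction fault: the thread is armed.
    static boolean isArmed(String perms) {
        return perms != null && !perms.startsWith("r");
    }

    public static void main(String[] args) {
        String pmap = String.join("\n",
                "Address Perm Offset Device Inode Size",
                "7ffff7fa1000 ---p 00000000 00:00 0 4",
                "7ffff7fa2000 r--p 00000000 00:00 0 4");
        long pollingPage = 0x7ffff7fa1000L;   // value of _poll_data._polling_page from gdb
        String perms = permsOf(pmap, pollingPage);
        System.out.println("perms=" + perms + " armed=" + isArmed(perms));
    }
}
```

With the two mappings from the experiment, the address read from gdb resolves to the unreadable page, which matches the conclusion that the thread was armed when the core dump was taken.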
More JIT polling implementations
The above is one important way JIT polling is implemented; the JIT may use different approaches for different kinds of scenarios. For example: //TBD .
Non-JIT Polling
//TBD
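Listening for Safepoint Requests
See the section VM Operations of the book. The following is an excerpt:
The VMThread acts as the coordinator: it loops, listening for VM_Operation requests in the safepoint request queue, and executes the operations in the queue.
src/hotspot/share/runtime/vmOperation.hpp
src/hotspot/share/runtime/vmThread.cpp
The relationship between VM Operations and Safepoints
The VMThread spends its time waiting for VM operations to appear in the VMOperationQueue and then executing them. Typically, these operations are handed to the VMThread because they require the VM to reach a Safepoint before they can run. Put simply, when the VM is at a Safepoint, all threads inside the VM are blocked, and any thread executing native code cannot return to the VM while the Safepoint is in progress. This means that while a VM operation executes, no thread is modifying the Java heap, and all threads are in a state in which their Java stacks are unchanging and can be examined, for example by GC threads.
The most familiar VM operation is GC, or more precisely the "Stop The World" phase of GC that is common to many GC algorithms. But there are many other Safepoint-based VM operations, for example biased locking revocation, thread stack dumps, thread suspension or thread stopping (i.e. the java.lang.Thread.stop() method), and many inspection/modification operations requested through JVMTI.
Many VM operations are synchronous, i.e. the requester blocks until the operation completes, but some are asynchronous or concurrent, i.e. the requester can proceed in parallel with the VMThread (provided, of course, that no Safepoint is initiated).
Safepoints are initiated through a cooperative, polling-based mechanism. Put simply, every so often a thread asks "should I block for a Safepoint?". Asking this question efficiently is not simple. One place where it is asked frequently is during thread state transitions. Once a Safepoint has been requested, the VMThread must wait until all threads are in a Safepoint-safe state before it can go on to execute the VM operation. During a Safepoint, Threads_lock is used to block any threads that were running, and the VMThread finally releases Threads_lock after the VM operation has been executed.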
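The queueing side of this description, with a single coordinator thread draining an operation queue and synchronous requesters blocking until their operation is done, can be sketched as follows. The class VmThreadSim, its Op type and the method names are hypothetical. The sketch models only the queue and the synchronous/asynchronous distinction; it does not bring any threads to a safepoint.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model of a VMThread-style operation loop; all names are hypothetical.
final class VmThreadSim {
    private static final class Op {
        final String name;
        final Runnable body;                       // null marks the shutdown request
        final CountDownLatch done = new CountDownLatch(1);
        Op(String name, Runnable body) { this.name = name; this.body = body; }
    }

    private final BlockingQueue<Op> queue = new LinkedBlockingQueue<>();
    private final List<String> executed = Collections.synchronizedList(new ArrayList<>());
    private final Thread vmThread = new Thread(this::loop, "vm-thread-sim");

    VmThreadSim() { vmThread.start(); }

    // The coordinator loop: take the next operation, run it, signal completion.
    private void loop() {
        try {
            while (true) {
                Op op = queue.take();
                if (op.body == null) return;
                op.body.run();                     // a real VMThread would first reach a safepoint
                executed.add(op.name);
                op.done.countDown();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Asynchronous request: enqueue and continue.
    void executeAsync(String name, Runnable body) {
        queue.add(new Op(name, body));
    }

    // Synchronous request: the requester blocks until the operation has completed.
    void executeSync(String name, Runnable body) {
        Op op = new Op(name, body);
        queue.add(op);
        try {
            op.done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Stops the loop and returns the names of the executed operations in order.
    List<String> shutdown() {
        queue.add(new Op("shutdown", null));
        try {
            vmThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(executed);
    }

    public static void main(String[] args) {
        VmThreadSim vm = new VmThreadSim();
        vm.executeAsync("thread-dump", () -> { });
        vm.executeSync("gc", () -> { });
        System.out.println("executed in order: " + vm.shutdown());
    }
}
```

Since a single thread drains a FIFO queue, operations run one at a time in request order. That is also why one slow operation delays every requester queued behind it.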
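Receiving a Safepoint Request
It may be a failed memory allocation triggering a GC, or some other reason: a Java thread sends the VM Thread a request to enter a safepoint (a VM_Operation). The request carries a safepoint operation parameter, which is really the callback to run after STOP THE WORLD (STW).
Arm Safepoint - marking all threads
After receiving the safepoint request, the VM Thread sets a JVM-global safepoint flag to true (this flag can be the OS access permission of a memory page).
The arm in the term Arm Safepoint literally means "to arm/equip"; here I take it to mean "set the flag".
src/hotspot/share/runtime/safepoint.cpp
As we can see, the VM Thread arms all application threads one by one.
Since JEP 312: Thread-Local Handshakes (2017) in OpenJDK10, there has also been a safepoint that is not JVM-global: the Thread Safepoint. The JVM-global Safepoint also appears to have been reimplemented on top of Thread-Local Handshakes, i.e. by performing a Thread-Local Handshake on every JavaThread.
src/hotspot/share/runtime/safepointMechanism.inline.hpp
The following figure illustrates how the polling_page is switched:
Figure: Switching the polling_page. Source: The Inner Workings of Safepoints 2023 - mostlynerdless.de
Waiting for application threads to reach the Safepoint
The VM Thread then waits for the other application threads (App threads) to reach (enter) the safepoint.
src/hotspot/share/runtime/safepoint.cpp
Application threads trapping into the Safepoint
Java threads check the safepoint flag at high frequency (safepoint check/polling); when a thread finds it true (armed), it reaches (enters) the safepoint state.
See the section JavaThread Polling and Reach Safepoint - Reach and handle of the book for details. The following is an excerpt:
After the VMThread arms the safepoint (see the section Safepoint - Arm Safepoint - marking all threads of the book), the polling application threads eventually notice the request to gather at the safepoint (the arming).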
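- For a JavaThread in a green immutable thread state: by arming the Java thread's polling page, the VM Thread effectively prevents the thread, while the safepoint is armed, from transitioning into any red, unsafe mutable thread state after it wakes up or returns from a green immutable thread state. See src/hotspot/share/utilities/globalDefinitions.hpp:
// Each state has an associated xxxx_trans state, which is an intermediate state used when a thread is in
// a transition from one state to another. These extra states makes it possible for the safepoint code to
// handle certain thread_states without having to suspend the thread - making the safepoint code faster.
Figure: How the state machine changes once a JavaThread's polling page is armed. Source: HotSpot JVM Deep Dive - Safepoint
- For a JavaThread in a red mutable thread state: by arming the Java thread's polling page, the VM Thread triggers the thread to transition from a mutable thread state into an immutable thread state. As a result of this transition, the thread-local GC roots are published to the JavaThread object. For a thread in VM state, this means waiting for the thread to complete the transition on its own. Only a few places in VM state perform an explicit safepoint check, for example when contending for a VM mutex or a VM monitor. The premise of this design is that Java threads should spend as little time as possible in VM state. For threads running in state java, the situation is different.
The figure below gives an example of how threads reach the Safepoint (GetStackTrace requires Stop The World) under several thread states and OS scheduling conditions.
Figure: Threads reaching the Safepoint under several states and scheduling conditions. Source: Safepoints: Meaning, Side Effects and Overheads - psy-lob-saw.blogspot.com
- Green arrow: java state thread and running on CPU
- Yellow arrow: java state thread and off CPU (e.g. because of insufficient CPU resources)
- Red arrow: JNI state thread
The delay from the VMThread arming the safepoint to the application threads reaching the Safepoint is called Time To Safe Point (TTSP):
Every thread enters the safepoint when it performs a safepoint check and finds the safepoint armed. But the number of machine instructions a thread has to execute before it reaches a safepoint check is not fixed. In the figure above we can see:
- J1 hits a safepoint poll straight away and is suspended. J2 and J3 are contending for the available CPU time. J3 grabs some CPU time, pushing J2 into the run queue, but J2 is not at a safepoint. J3 reaches a safepoint and suspends, freeing up the core for J2 to make enough progress to reach a safepoint poll.
- J4 and J5 are executing JNI code (JNI state), which is an Immutable thread state, so they are not affected by the Safepoint suspension. Note that J5 tries to leave JNI halfway through the Stop The World and is suspended before it resumes executing Java code.
Importantly, we observe that the time to reach the safepoint varies from thread to thread and that some threads are paused longer than others. A Java thread that takes a long time to reach the safepoint can delay the other threads.
Before OpenJDK9, -XX:+PrintGCApplicationStoppedTime could be used to print the TTSP. Since OpenJDK9, which adopted Unified Logging for GC logging, the setting has changed to:
-Xlog:safepoint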
Signal Handler
During startup initialization, the JVM installs its own Signal Handler:
Stack:
libjvm.so!PosixSignals::install_sigaction_signal_handler(sigaction * sigAct, sigaction * oldSigAct, int sig, sa_sigaction_t handler) (/jdk/src/hotspot/os/posix/signals_posix.cpp:900)
libjvm.so!set_signal_handler(int sig) (/jdk/src/hotspot/os/posix/signals_posix.cpp:1271)
libjvm.so!install_signal_handlers() (/jdk/src/hotspot/os/posix/signals_posix.cpp:1313)
libjvm.so!PosixSignals::init() (/jdk/src/hotspot/os/posix/signals_posix.cpp:1855) // <<<<----
libjvm.so!os::init_2() (/jdk/src/hotspot/os/linux/os_linux.cpp:4613)
libjvm.so!Threads::create_vm(JavaVMInitArgs * args, bool * canTryAgain) (/jdk/src/hotspot/share/runtime/threads.cpp:482)
libjvm.so!JNI_CreateJavaVM_inner(JavaVM ** vm, void ** penv, void * args) (/jdk/src/hotspot/share/prims/jni.cpp:3577)
libjvm.so!JNI_CreateJavaVM(JavaVM ** vm, void ** penv, void * args) (/jdk/src/hotspot/share/prims/jni.cpp:3668)
libjli.so!InitializeJVM(JavaVM ** pvm, JNIEnv ** penv, InvocationFunctions * ifn) (/jdk/src/java.base/share/native/libjli/java.c:1506)
libjli.so!JavaMain(void * _args) (/jdk/src/java.base/share/native/libjli/java.c:415)
libjli.so!ThreadJavaMain(void * args) (/jdk/src/java.base/unix/native/libjli/java_md.c:650)
libc.so.6!start_thread(void * arg) (pthread_create.c:442)
libc.so.6!clone3() (clone3.S:81)
src/hotspot/os/posix/signals_posix.cpp
Signal Handler implementation:
src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp
Obtaining the SafepointBlob stub machine code - poll_stub
src/hotspot/share/runtime/sharedRuntime.cpp
The SafepointBlob stub machine code is generated during JVM startup initialization:
src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
Global safepoint - The World Stopped
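When the VM Thread finds that all App threads have reached the safepoint (the real start of STW), it starts executing the safepoint operation. A GC operation is one possible type of safepoint operation.
Source code: RuntimeService::record_safepoint_synchronized()
End of the Safepoint operation
When the safepoint operation finishes, the VM Thread ends the STW.
Source code: SafepointSynchronize::end()
Disarming Safepoint
src/hotspot/share/runtime/safepointMechanism.inline.hpp
Troubleshooting Safepoint problems
Safepoint is a hot spot where JVM performance problems break out. I have previously written some articles on troubleshooting such problems:
References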
- HotSpot JVM Deep Dive - Safepoint - Youtube Java Channel
- Async-profiler - manual by use cases - krzysztofslusarski.github.io
- Safepoints: Meaning, Side Effects and Overheads - psy-lob-saw.blogspot.com
- Where is my safepoint? - psy-lob-saw.blogspot.com
- The Inner Workings of Safepoints 2023 - mostlynerdless.de
- Robbin Ehn: Handshaking HotSpot - Youtube Java Channel - 2020
- Robbin Ehn: HotSpot Handshaking - Jfokus 2020
- JVM Anatomy Quark #22: Safepoint Polls