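To understand synchronized thoroughly, we have to start with the Java object header, and to observe the memory layout directly we need a couple of tools.
Inspecting the bytecode of a compiled Java class the usual way is tedious: you compile the class to a class file and then run javap -verbose ***.class to view its bytecode.
IntelliJ IDEA, powerful as it is, naturally has a plugin for this: jclasslib Bytecode viewer. Installing the plugin is left to you to look up.
Using it is easy; see the figure below:
The bytecode view groups the constant pool, interfaces, fields and so on into categories, and provides tooltips, cross-references, and the bytecode instructions themselves (clicking an instruction jumps to the JVM instruction documentation on Oracle's site). It is very convenient once you get a feel for it.
JOL stands for Java Object Layout, and the name already says what it does. It is a small tool provided by OpenJDK (link). Usage is as follows:
Add the JOL Maven dependency: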
<!-- https://mvnrepository.com/artifact/org.openjdk.jol/jol-core -->
<dependency>
<groupId>org.openjdk.jol</groupId>
<artifactId>jol-core</artifactId>
<version>0.9</version>
</dependency>
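Then write a program that calls it:
// requires: import org.openjdk.jol.info.ClassLayout;
public static void main(String[] args) {
    Object obj = new Object();
    // Render the header, fields, and padding of this instance as text
    String layout = ClassLayout.parseInstance(obj).toPrintable();
    System.out.println(layout);
}
The output looks like this: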
java.lang.Object object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 00 00 00 (00000001 00000000 00000000 00000000) (1)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 28 0f b3 1a (00101000 00001111 10110011 00011010) (447942440)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
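This output answers a classic interview question: how many bytes does the obj created by Object obj = new Object() occupy in memory? You can also declare a class with boolean, Boolean, int, Integer, array, and object-reference fields, and read off from the output how many bytes each type really takes in Java.
More advanced usage is left for you to explore.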
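The arithmetic behind `Instance size: 16 bytes` can be written down as a short sketch. It assumes what the output above shows, a 12-byte header (8-byte Mark Word plus 4-byte compressed class pointer), and HotSpot's default 8-byte object alignment. The class and method names are made up for illustration and are not part of JOL, and the sketch ignores alignment gaps between fields.

```java
// Illustrative only: reproduces the size arithmetic shown in the JOL output.
class ObjectSizeSketch {
    // Assumption: HotSpot's default 8-byte object alignment.
    static final int ALIGNMENT = 8;

    /** Rounds size up to the next multiple of ALIGNMENT. */
    static int alignUp(int size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Total instance size in bytes for a given header size and field payload. */
    static int instanceSize(int headerBytes, int fieldBytes) {
        return alignUp(headerBytes + fieldBytes);
    }

    public static void main(String[] args) {
        int header = 12; // 8-byte Mark Word + 4-byte compressed class pointer
        int size = instanceSize(header, 0);
        // Prints: new Object(): 16 bytes, padding 4
        System.out.println("new Object(): " + size + " bytes, padding " + (size - header));
    }
}
```

The 4 bytes of padding correspond to the "loss due to the next object alignment" row in the JOL output.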
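This part gets its own section because the synchronized lock relies on it.
In the HotSpot virtual machine, the storage layout of a Java object in memory consists of three regions: the object header, the instance data, and the alignment padding.
The figure below shows the data structure of an ordinary object instance and of an array object instance; the array length is part of the object header only for array objects.
Let's look at this with the JOL tool introduced in section 1.2: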
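// requires: import org.openjdk.jol.info.ClassLayout;
public static void main(String[] args) {
    // Add fields to User and observe its layout; the instance data section becomes visible
    User obj1 = new User();
    String layout1 = ClassLayout.parseInstance(obj1).toPrintable();
    System.out.println(layout1);
    // An array of User; the array data section becomes visible
    // Array data size = array length * 4 bytes (a reference takes 4 bytes with compressed oops, the default);
    // here: 5 * 4 = 20 bytes
    User[] obj2 = new User[5];
    String layout2 = ClassLayout.parseInstance(obj2).toPrintable();
    System.out.println(layout2);
}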
The output of this program is not pasted here to save space; copy the code, run it yourself, and play with the result.
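As a rough cross-check of the `5 * 4 = 20` comment, the expected total size of the `User[5]` array can be worked out by hand. This is a sketch assuming the default flags on a 64-bit JDK 8 (8-byte Mark Word, 4-byte compressed class pointer, 4-byte array length, 4-byte compressed references, 8-byte alignment); `ArraySizeSketch` is a hypothetical helper, not a JOL API.

```java
// Illustrative only: size arithmetic for an array of references under default flags.
class ArraySizeSketch {
    /** Expected size in bytes of a reference array with the given length. */
    static int referenceArraySize(int length) {
        int header = 8 + 4 + 4;   // Mark Word + compressed class pointer + array length
        int data = length * 4;    // one compressed reference per element
        int raw = header + data;
        return (raw + 7) / 8 * 8; // round up to the 8-byte object alignment
    }

    public static void main(String[] args) {
        // 16-byte header + 20 bytes of data = 36, aligned up to 40
        System.out.println("User[5]: " + referenceArraySize(5) + " bytes");
    }
}
```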
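The Mark Word in the object header stores the object's own runtime data: the hash code, lock information, GC marks, and so on. When a lock is taken with the synchronized keyword, the whole locking process revolves around the Mark Word, which is why the memory layout is introduced here.
In the JDK source (OpenJDK), under your-download-path\openjdk\hotspot\src\share\vm\oops, there is a C++ header file named markOop.hpp that contains the following comment: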
// Bit-format of an object header (most significant first, big endian layout below):
//
// 32 bits:
// --------
// hash:25 ------------>| age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:23 epoch:2 age:4 biased_lock:1 lock:2 (biased object)
// size:32 ------------------------------------------>| (CMS free block)
// PromotedObject*:29 ---------->| promo_bits:3 ----->| (CMS promoted object)
//
// 64 bits:
// --------
// unused:25 hash:31 -->| unused:1 age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:54 epoch:2 unused:1 age:4 biased_lock:1 lock:2 (biased object)
// PromotedObject*:61 --------------------->| promo_bits:3 ----->| (CMS promoted object)
// size:64 ----------------------------------------------------->| (CMS free block)
//
// unused:25 hash:31 -->| cms_free:1 age:4 biased_lock:1 lock:2 (COOPs && normal object)
// JavaThread*:54 epoch:2 cms_free:1 age:4 biased_lock:1 lock:2 (COOPs && biased object)
// narrowOop:32 unused:24 cms_free:1 unused:4 promo_bits:3 ----->| (COOPs && CMS promoted object)
// unused:21 size:35 -->| cms_free:1 unused:7 ------------------>| (COOPs && CMS free block)
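The Mark Word is 32 bits on a 32-bit JVM and 64 bits on a 64-bit JVM, but what it stores for each lock state is the same. Taking the simpler 32-bit JVM as the example, the Mark Word is composed as shown in the figure below:
The 2-bit lock flag encodes the lock state, and the 1-bit biased-lock flag records whether the object is biased.
Unlocked: the lock flag is 01 and the biased-lock flag is 0.
Biased lock: the lock flag is still 01, but the biased-lock flag is set to 1 and the lock enters biased mode. At the same time, a CAS operation writes the ID of the acquiring thread into the lock object's Mark Word; that thread now holds the biased lock and can enter directly next time.
Lightweight lock: the biased-lock flag is reset to 0 in preparation for upgrading to a lightweight lock. First, a lock record (Lock Record) is set up in the current thread's stack frame to hold a copy of the lock object's current Mark Word. Then a CAS operation tries to update the lock object's Mark Word to a pointer to that Lock Record. If the CAS succeeds, the thread has acquired the lock and the lock flag is set to 00, entering lightweight-lock mode. If the CAS fails, the next step applies.
Heavyweight lock: the lock flag is set to 10. In this state every thread waiting for the lock must block. (A quick plug: for thread states, I recommend another article of mine: 脱掉Java线程状态的衣服.)
If these steps are not clear yet, don't worry; come back and reread them after the later sections.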
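The flag combinations above can be turned into a small decoder. This is a sketch, not HotSpot code: `MarkWordBits` is a hypothetical helper, the value 11 ("marked" for GC) comes from markOop.hpp rather than from the text above, and it assumes you feed it the low byte of the Mark Word (on a little-endian machine that is the first byte JOL prints, e.g. `01` in the output earlier).

```java
// Illustrative only: decodes the low 3 bits of a Mark Word value.
class MarkWordBits {
    static String lockState(long markWord) {
        int lockBits = (int) (markWord & 0b11);        // 2-bit lock flag
        boolean biased = ((markWord >>> 2) & 1) == 1;  // 1-bit biased-lock flag
        switch (lockBits) {
            case 0b01: return biased ? "biased" : "unlocked";
            case 0b00: return "lightweight";
            case 0b10: return "heavyweight";
            default:   return "gc-marked";              // 11, per markOop.hpp
        }
    }

    public static void main(String[] args) {
        // 0x01 is the first header byte of the new Object() printed earlier
        System.out.println(lockState(0x01L)); // prints "unlocked"
    }
}
```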
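This brings up the concept of "pointer compression" and two JVM flags you may come across, -XX:+UseCompressedClassPointers and -XX:+UseCompressedOops. Here is a short introduction, followed by an experiment that makes their meaning concrete.
**Pointer compression:** The JVM was originally 32-bit. With the rise of 64-bit systems it moved to 64-bit as well, since a 32-bit JVM can address far less memory. The 64-bit JVM brings a problem of its own, though: object pointers grow from 4 bytes to 8 bytes, and heap usage is commonly cited as growing to roughly 1.5 times that of a 32-bit JVM, which is not what we want. Pointer compression was therefore introduced in JDK 1.6.
**-XX:+UseCompressedClassPointers:** enables compression of class pointers (pointers to class metadata).
**-XX:+UseCompressedOops:** enables compression of ordinary object pointers. "Oops" is short for ordinary object pointers.
-XX:+UseCompressedClassPointers and -XX:+UseCompressedOops are enabled by default in JDK 1.8, which you can verify with the command java -XX:+PrintCommandLineFlags -version:
In +UseCompressedClassPointers and +UseCompressedOops, the + sign turns the flag on and a - sign turns it off. The examples below use - to turn the flags off. By editing the JVM options in IDEA, we can check in practice how switching these two flags on and off affects the memory layout.
We run the small demo below with four different sets of VM options: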
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+PrintCommandLineFlags
-XX:-UseCompressedClassPointers -XX:+UseCompressedOops -XX:+PrintCommandLineFlags
-XX:+UseCompressedClassPointers -XX:-UseCompressedOops -XX:+PrintCommandLineFlags
-XX:-UseCompressedClassPointers -XX:-UseCompressedOops -XX:+PrintCommandLineFlags
import org.openjdk.jol.info.ClassLayout;

public class HelloJOL {
private boolean flag1 = true;
private Boolean flag2 = true;
private int x = 0;
private Integer y = 0;
private String str = "";
private int[] arrInt = new int[10];
private String[] arrStr;
public static void main(String[] args) {
HelloJOL o = new HelloJOL();
String layout = ClassLayout.parseInstance(o).toPrintable();
System.out.println(layout);
}
}
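Looking closely at the output leads to the following conclusions:
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops
Both compressions enabled (the default).
The object header is 12 bytes: an 8-byte Mark Word plus a 4-byte class pointer.
-XX:-UseCompressedClassPointers -XX:+UseCompressedOops
Only class pointer compression is disabled.
The object header is 16 bytes: an 8-byte Mark Word plus an 8-byte class pointer.
Note: on a 64-bit machine, UseCompressedClassPointers compresses the class pointer from 8 bytes to 4 bytes.
-XX:+UseCompressedClassPointers -XX:-UseCompressedOops
Only ordinary object pointer compression is disabled.
The JVM flags printed by -XX:+PrintCommandLineFlags show that UseCompressedClassPointers is switched off by the JVM as well (even though you did not set it).
The object header is 16 bytes, because class pointer compression was switched off along with it.
Fields of primitive types such as boolean and int keep their sizes of 1 and 4 bytes, but fields of types such as Boolean, Integer, String, and arrays grow from 4 bytes to 8 bytes.
-XX:-UseCompressedClassPointers -XX:-UseCompressedOops
Both class pointer compression and ordinary object pointer compression are disabled; the result is the same as in experiment 3.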
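The four experiments can be condensed into a tiny model. This is only a restatement of the observations above for the JDK 8 setup used in this article, not a query of the running JVM; `HeaderSizeModel` and its method names are hypothetical.

```java
// Illustrative only: models the header and reference sizes observed in the four experiments.
class HeaderSizeModel {
    /** As observed in experiment 3, class pointer compression is dropped when oop compression is off. */
    static boolean effectiveClassPointerCompression(boolean compressedClassPointers, boolean compressedOops) {
        return compressedClassPointers && compressedOops;
    }

    /** Header size in bytes of an ordinary (non-array) object on a 64-bit JVM. */
    static int headerBytes(boolean compressedClassPointers, boolean compressedOops) {
        int markWord = 8;
        int classPointer = effectiveClassPointerCompression(compressedClassPointers, compressedOops) ? 4 : 8;
        return markWord + classPointer;
    }

    /** Size in bytes of a reference field such as Boolean, Integer, String, or an array. */
    static int referenceBytes(boolean compressedOops) {
        return compressedOops ? 4 : 8;
    }

    public static void main(String[] args) {
        boolean[] onOff = {true, false};
        for (boolean oops : onOff) {
            for (boolean klass : onOff) {
                System.out.println("ClassPointers=" + klass + " Oops=" + oops
                        + " -> header " + headerBytes(klass, oops)
                        + " bytes, reference " + referenceBytes(oops) + " bytes");
            }
        }
    }
}
```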
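This article assumes a basic understanding of the synchronized keyword and its usage and does not cover the basics again. If you are not familiar with it, study it first: link
First, as we all know, the synchronized keyword can modify methods (static and non-static) as well as code blocks.
Non-static method: locks the current instance.
Static method: locks the current class.
Code block: locks the specified object, which can be either a class or an instance.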
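public static synchronized void methodA() {
    // Static method: the lock of the current class must be acquired before execution
}
public synchronized void methodB() {
    // Non-static method: the lock of the current instance must be acquired before execution
}
Object lock = new Object();
public void methodC() {
    synchronized (lock) {
        // Synchronized block: the lock of the lock instance must be acquired before execution
    }
}
public void methodD() {
    synchronized (Object.class) {
        // Synchronized block: the class lock of Object must be acquired before execution
    }
}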
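The method-level variants above are marked in the class file with the ACC_SYNCHRONIZED access flag, and reflection exposes that flag through `Modifier.isSynchronized`. The following minimal sketch checks it without a bytecode viewer; `SyncFlagDemo` and `isSynchronizedMethod` are names invented for this example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Illustrative only: shows which methods carry the synchronized access flag.
class SyncFlagDemo {
    private final Object lock = new Object();

    public static synchronized void methodA() { }

    public synchronized void methodB() { }

    public void methodC() {
        synchronized (lock) { }
    }

    /** True if the named no-argument method of this class has the synchronized flag set. */
    static boolean isSynchronizedMethod(String name) {
        try {
            Method m = SyncFlagDemo.class.getDeclaredMethod(name);
            return Modifier.isSynchronized(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + name, e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"methodA", "methodB", "methodC"}) {
            System.out.println(name + " synchronized flag: " + isSynchronizedMethod(name));
        }
    }
}
```

methodC reports false because a synchronized block lives inside the method body and leaves no trace in the method's access flags.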
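Compile this code and look at the bytecode. The JVM implements synchronized differently for methods and for code blocks:
Synchronized method: the method is tagged with the ACC_SYNCHRONIZED flag, which tells the JVM that this is a synchronized method and that the corresponding lock must be acquired before the method is entered.
Synchronized block: the block is compiled into the monitorenter and monitorexit instructions; the thread enters through monitorenter and releases the lock through monitorexit.
Key point: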
// requires: import org.openjdk.jol.info.ClassLayout;
public static void main(String[] args) {
    Object lock = new Object();
    System.out.println("********** before locking **********");
    String layout0 = ClassLayout.parseInstance(lock).toPrintable();
    System.out.println(layout0);
    System.out.println("********** while locked **********");
    synchronized (lock) {
        // -XX:BiasedLockingStartupDelay=0 removes the biased-locking startup delay
        String layout1 = ClassLayout.parseInstance(lock).toPrintable();
        System.out.println(layout1);
    }
    System.out.println("********** after release **********");
    String layout2 = ClassLayout.parseInstance(lock).toPrintable();
    System.out.println(layout2);
}
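Use the JOL tool introduced in section 1.2 to see how the object header changes before locking, while locked, and after the lock is released. The results differ somewhat between JDK versions, but you will definitely see the object's Mark Word change.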
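If JOL is not at hand, the same three phases can at least be confirmed with the standard `Thread.holdsLock` method. This sketch does not show the Mark Word itself, only whether the current thread owns the monitor at each point; `HoldsLockDemo` is a made-up name.

```java
import java.util.Arrays;

// Illustrative only: checks monitor ownership before, during, and after a synchronized block.
class HoldsLockDemo {
    /** Returns {before, during, after} as reported by Thread.holdsLock. */
    static boolean[] observe(Object lock) {
        boolean before = Thread.holdsLock(lock);
        boolean during;
        synchronized (lock) {
            during = Thread.holdsLock(lock);
        }
        boolean after = Thread.holdsLock(lock);
        return new boolean[] {before, during, after};
    }

    public static void main(String[] args) {
        // Prints: [false, true, false]
        System.out.println(Arrays.toString(observe(new Object())));
    }
}
```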
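As described above, a synchronized block produces the monitorenter and monitorexit instructions. So how does the JVM implement locking with these two instructions? Let's trace, step by step, how monitorenter and monitorexit are implemented in the OpenJDK source.
I am using openjdk8; here is a Baidu cloud drive download link: https://pan.baidu.com/s/1ZFQLurrriyUzyS78_SwcXw password: aeqm
The definitions of monitorenter and monitorexit are in the file interpreterRuntime.cpp, under hotspot/src/share/vm/interpreter in the OpenJDK root directory: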
// The interpreter's synchronization code is factored out so that it can
// be shared by method invocation and synchronized blocks.
//%note synchronization_3
//%note monitor_1 (monitorenter: the lock-acquisition entry point)
IRT_ENTRY_NO_ASYNC(void, InterpreterRuntime::monitorenter(JavaThread* thread, BasicObjectLock* elem))
#ifdef ASSERT
thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
if (PrintBiasedLockingStatistics) { // collect biased-locking statistics
Atomic::inc(BiasedLocking::slow_path_entry_count_addr());
}
Handle h_obj(thread, elem->obj());
assert(Universe::heap()->is_in_reserved_or_null(h_obj()),
"must be NULL or an object");
if (UseBiasedLocking) { // biased locking is enabled
// Retry fast entry if bias is revoked to avoid unnecessary inflation
ObjectSynchronizer::fast_enter(h_obj, elem->lock(), true, CHECK);
} else {
// Biased locking is disabled: call slow_enter to take the lightweight/heavyweight path
ObjectSynchronizer::slow_enter(h_obj, elem->lock(), CHECK);
}
assert(Universe::heap()->is_in_reserved_or_null(elem->obj()),
"must be NULL or an object");
#ifdef ASSERT
thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
IRT_END
//%note monitor_1 (monitorexit: the lock-release entry point)
IRT_ENTRY_NO_ASYNC(void, InterpreterRuntime::monitorexit(JavaThread* thread, BasicObjectLock* elem))
#ifdef ASSERT
thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
Handle h_obj(thread, elem->obj());
assert(Universe::heap()->is_in_reserved_or_null(h_obj()),
"must be NULL or an object");
if (elem == NULL || h_obj()->is_unlocked()) {
THROW(vmSymbols::java_lang_IllegalMonitorStateException());
}
ObjectSynchronizer::slow_exit(h_obj(), elem->lock(), thread);
// Free entry. This must be done here, since a pending exception might be installed on
// exit. If it is not cleared, the exception handling code will try to unlock the monitor again.
elem->set_obj(NULL);
#ifdef ASSERT
thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
IRT_END
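The definitions of fast_enter and slow_enter are in the file synchronizer.cpp, at hotspot/src/share/vm/runtime/synchronizer.cpp under the OpenJDK root directory. Read them carefully together with the description of lock inflation in section 2.3 of this article, and the process of locking, lock inflation, and unlocking will become much clearer. Section 2.3 deserves to be read again and again!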
// -----------------------------------------------------------------------------
// Fast Monitor Enter/Exit
// This the fast monitor enter. The interpreter and compiler use
// some assembly copies of this code. Make sure update those code
// if the following function is changed. The implementation is
// extremely sensitive to race condition. Be careful.
void ObjectSynchronizer::fast_enter(Handle obj, BasicLock* lock, bool attempt_rebias, TRAPS) {
if (UseBiasedLocking) { // checks once more whether biased locking is enabled
if (!SafepointSynchronize::is_at_safepoint()) { // the VM is not at a safepoint
// Acquire the biased lock: revoke_and_rebias
BiasedLocking::Condition cond = BiasedLocking::revoke_and_rebias(obj, attempt_rebias, THREAD);
if (cond == BiasedLocking::BIAS_REVOKED_AND_REBIASED) {
return;
}
} else {
assert(!attempt_rebias, "can not rebias toward VM thread");
BiasedLocking::revoke_at_safepoint(obj);
}
assert(!obj->mark()->has_bias_pattern(), "biases should be revoked by now");
}
// The fast path did not succeed: fall back to slow_enter
slow_enter (obj, lock, THREAD) ;
}
void ObjectSynchronizer::fast_exit(oop object, BasicLock* lock, TRAPS) {
// The assertion below tells us that a biased lock never reaches fast_exit.
assert(!object->mark()->has_bias_pattern(), "should not see bias pattern here");
// The displaced header is the copy of the lock object's Mark Word saved while upgrading to a lightweight lock; HotSpot gives this copy the "Displaced" prefix. See 《深入理解Java虚拟机》, 3rd edition, p. 482.
// if displaced header is null, the previous enter is recursive enter, no-op
markOop dhw = lock->displaced_header();
markOop mark ;
if (dhw == NULL) {
// Recursive stack-lock.
// Diagnostics -- Could be: stack-locked, inflating, inflated (inflated = heavyweight lock).
mark = object->mark() ;
assert (!mark->is_neutral(), "invariant") ;
if (mark->has_locker() && mark != markOopDesc::INFLATING()) {
assert(THREAD->is_lock_owned((address)mark->locker()), "invariant") ;
}
if (mark->has_monitor()) {
ObjectMonitor * m = mark->monitor() ;
assert(((oop)(m->object()))->mark() == mark, "invariant") ;
assert(m->is_entered(THREAD), "invariant") ;
}
return ;
}
mark = object->mark() ; // the lock object's Mark Word
// Lightweight-lock release: unlock with CAS (cmpxchg_ptr below is the CAS operation).
// If the object is stack-locked by the current thread, try to
// swing the displaced header from the box back to the mark.
if (mark == (markOop) lock) {
assert (dhw->is_neutral(), "invariant") ;
if ((markOop) Atomic::cmpxchg_ptr (dhw, object->mark_addr(), mark) == mark) {
TEVENT (fast_exit: release stacklock) ;
return;
}
}
ObjectSynchronizer::inflate(THREAD, object)->exit (true, THREAD) ;
}
// -----------------------------------------------------------------------------
// Interpreter/Compiler Slow Case
// This routine is used to handle interpreter/compiler slow case
// We don't need to use fast path here, because it must have been
// failed in the interpreter/compiler code.
void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) {
markOop mark = obj->mark();
assert(!mark->has_bias_pattern(), "should not see bias pattern here");
if (mark->is_neutral()) {
// Try the lightweight lock (stack lock) first.
// Anticipate successful CAS -- the ST of the displaced mark must
// be visible <= the ST performed by the CAS.
lock->set_displaced_header(mark);
if (mark == (markOop) Atomic::cmpxchg_ptr(lock, obj()->mark_addr(), mark)) {
TEVENT (slow_enter: release stacklock) ;
return ;
}
// Fall through to inflate() ... (the CAS above failed, so execution continues to inflate() below)
} else
if (mark->has_locker() && THREAD->is_lock_owned((address)mark->locker())) { // the current thread already holds the lock
assert(lock != mark->locker(), "must not re-lock the same lock");
assert(lock != (BasicLock*)obj->mark(), "don't relock with same BasicLock");
lock->set_displaced_header(NULL);
return;
}
#if 0
// The following optimization isn't particularly useful.
if (mark->has_monitor() && mark->monitor()->is_entered(THREAD)) {
lock->set_displaced_header (NULL) ;
return ;
}
#endif
// Annotation: in a heavyweight lock's Mark Word on a 32-bit JVM, the lock flag is 10 and the other 30 bits hold the pointer to the heavyweight monitor.
// The object header will never be displaced to this lock,
// so it does not matter what the value is, except that it
// must be non-zero to avoid looking like a re-entrant lock,
// and must not look locked either.
lock->set_displaced_header(markOopDesc::unused_mark());
ObjectSynchronizer::inflate(THREAD, obj())->enter(THREAD);
}
// This routine is used to handle interpreter/compiler slow case
// We don't need to use fast path here, because it must have
// failed in the interpreter/compiler code. Simply use the heavy
// weight monitor should be ok, unless someone find otherwise.
void ObjectSynchronizer::slow_exit(oop object, BasicLock* lock, TRAPS) {
fast_exit (object, lock, THREAD) ;
}
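The lightweight-lock part of slow_enter and fast_exit can be sketched in Java as a reading aid. This is a simplified model under stated assumptions, not HotSpot's implementation: the Mark Word is an `AtomicReference`, a `LockRecord` object stands in for the BasicLock slot on the owner's stack, ownership is checked through an `owner` field instead of a stack-address test, and returning false stands for "would fall through to inflate()". All names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: a toy model of the stack-lock (lightweight lock) protocol.
class StackLockSketch {
    /** Stand-in for a neutral (unlocked) Mark Word. */
    static final class NeutralMark { }

    /** Stand-in for the BasicLock slot in the owner's stack frame. */
    static final class LockRecord {
        final Thread owner;
        Object displacedHeader; // null marks a recursive entry

        LockRecord(Thread owner) { this.owner = owner; }
    }

    /** The object's Mark Word: either a NeutralMark or a pointer to the owner's LockRecord. */
    final AtomicReference<Object> mark = new AtomicReference<>(new NeutralMark());

    /** Returns true if the lock was taken on the lightweight path, false if it would inflate. */
    boolean slowEnter(LockRecord lock) {
        Object m = mark.get();
        if (m instanceof NeutralMark) {
            lock.displacedHeader = m;             // save the copy before the CAS
            if (mark.compareAndSet(m, lock)) {
                return true;                      // Mark Word now points at our lock record
            }
        } else if (m instanceof LockRecord && ((LockRecord) m).owner == lock.owner) {
            lock.displacedHeader = null;          // recursive entry by the same thread
            return true;
        }
        return false;                             // HotSpot would call inflate() here
    }

    /** Returns true if the exit was handled on the lightweight path. */
    boolean fastExit(LockRecord lock) {
        Object dhw = lock.displacedHeader;
        if (dhw == null) {
            return true;                          // recursive exit: nothing to restore
        }
        return mark.compareAndSet(lock, dhw);     // swing the displaced header back
    }

    public static void main(String[] args) {
        StackLockSketch obj = new StackLockSketch();
        LockRecord outer = new LockRecord(Thread.currentThread());
        LockRecord inner = new LockRecord(Thread.currentThread());
        System.out.println("outer enter: " + obj.slowEnter(outer));
        System.out.println("inner (recursive) enter: " + obj.slowEnter(inner));
        System.out.println("inner exit: " + obj.fastExit(inner));
        System.out.println("outer exit: " + obj.fastExit(outer));
    }
}
```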
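The following method is from the same synchronizer.cpp file; the two pieces of code are not adjacent and are fairly long, so they are shown separately.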
// Note that we could encounter some performance loss through false-sharing as
// multiple locks occupy the same $ line. Padding might be appropriate.
ObjectMonitor * ATTR ObjectSynchronizer::inflate (Thread * Self, oop object) {
// Inflate mutates the heap ...
// Relaxing assertion for bug 6320749.
assert (Universe::verify_in_progress() ||
!SafepointSynchronize::is_at_safepoint(), "invariant") ;
for (;;) {
const markOop mark = object->mark() ;
assert (!mark->has_bias_pattern(), "invariant") ;
// The mark can be in one of the following states:
// * Inflated - just return
// * Stack-locked - coerce it to inflated (a lightweight lock must be forced to inflate)
// * INFLATING - busy wait for conversion to complete
// * Neutral - aggressively inflate the object.
// * BIASED - Illegal. We should never see this (a biased lock never reaches this method)
// CASE: inflated
if (mark->has_monitor()) {
ObjectMonitor * inf = mark->monitor() ;
assert (inf->header()->is_neutral(), "invariant");
assert (inf->object() == object, "invariant") ;
assert (ObjectSynchronizer::verify_objmon_isinpool(inf), "monitor is invalid");
return inf ;
}
// CASE: inflation in progress - inflating over a stack-lock (a lightweight lock).
// Some other thread is converting from stack-locked to inflated.
// Only that thread can complete inflation -- other threads must wait.
// The INFLATING value is transient.
// Currently, we spin/yield/park and poll the markword, waiting for inflation to finish.
// We could always eliminate polling by parking the thread on some auxiliary list.
if (mark == markOopDesc::INFLATING()) {
TEVENT (Inflate: spin while INFLATING) ;
ReadStableMark(object) ;
continue ;
}
// CASE: stack-locked (currently a lightweight lock; it must be inflated to a heavyweight lock)
// Could be stack-locked either by this thread or by some other thread.
//
// Note that we allocate the objectmonitor speculatively, _before_ attempting
// to install INFLATING into the mark word. We originally installed INFLATING,
// allocated the objectmonitor, and then finally STed the address of the
// objectmonitor into the mark. This was correct, but artificially lengthened
// the interval in which INFLATED appeared in the mark, thus increasing
// the odds of inflation contention.
// Annotation: the long comment above boils down to the order of three steps. The original order was:
// set INFLATING in the mark word -> allocate the objectmonitor -> store the monitor address (INFLATED).
// The code below allocates the objectmonitor first (omAlloc), then CASes INFLATING into the mark word,
// and finally publishes the monitor address, which shortens the time INFLATING is visible.
//
// We now use per-thread private objectmonitor free lists.
// These list are reprovisioned from the global free list outside the
// critical INFLATING...ST interval. A thread can transfer
// multiple objectmonitors en-mass from the global free list to its local free list.
// This reduces coherency traffic and lock contention on the global free list.
// Using such local free lists, it doesn't matter if the omAlloc() call appears
// before or after the CAS(INFLATING) operation.
// See the comments in omAlloc().
if (mark->has_locker()) {
ObjectMonitor * m = omAlloc (Self) ;
// Optimistically prepare the objectmonitor - anticipate successful CAS
// We do this before the CAS in order to minimize the length of time
// in which INFLATING appears in the mark.
m->Recycle();
m->_Responsible = NULL ;
m->OwnerIsThread = 0 ;
m->_recursions = 0 ;
m->_SpinDuration = ObjectMonitor::Knob_SpinLimit ; // Consider: maintain by type/class
markOop cmp = (markOop) Atomic::cmpxchg_ptr (markOopDesc::INFLATING(), object->mark_addr(), mark) ;
if (cmp != mark) {
omRelease (Self, m, true) ;
continue ; // Interference -- just retry
}
// We've successfully installed INFLATING (0) into the mark-word.
// This is the only case where 0 will appear in a mark-work.
// Only the singular thread that successfully swings the mark-word
// to 0 can perform (or more precisely, complete) inflation.
//
// Why do we CAS a 0 into the mark-word instead of just CASing the
// mark-word from the stack-locked value directly to the new inflated state?
// Consider what happens when a thread unlocks a stack-locked object.
// It attempts to use CAS to swing the displaced header value from the
// on-stack basiclock back into the object header. Recall also that the
// header value (hashcode, etc) can reside in (a) the object header, or
// (b) a displaced header associated with the stack-lock, or (c) a displaced
// header in an objectMonitor. The inflate() routine must copy the header
// value from the basiclock on the owner's stack to the objectMonitor, all
// the while preserving the hashCode stability invariants. If the owner
// decides to release the lock while the value is 0, the unlock will fail
// and control will eventually pass from slow_exit() to inflate. The owner
// will then spin, waiting for the 0 value to disappear. Put another way,
// the 0 causes the owner to stall if the owner happens to try to
// drop the lock (restoring the header from the basiclock to the object)
// while inflation is in-progress. This protocol avoids races that might
// would otherwise permit hashCode values to change or "flicker" for an object.
// Critically, while object->mark is 0 mark->displaced_mark_helper() is stable.
// 0 serves as a "BUSY" inflate-in-progress indicator.
// fetch the displaced mark from the owner's stack.
// The owner can't die or unwind past the lock while our INFLATING
// object is in the mark. Furthermore the owner can't complete
// an unlock on the object, either.
markOop dmw = mark->displaced_mark_helper() ;
assert (dmw->is_neutral(), "invariant") ;
// Setup monitor fields to proper values -- prepare the monitor
m->set_header(dmw) ;
// Optimization: if the mark->locker stack address is associated
// with this thread we could simply set m->_owner = Self and
// m->OwnerIsThread = 1. Note that a thread can inflate an object
// that it has stack-locked -- as might happen in wait() -- directly
// with CAS. That is, we can avoid the xchg-NULL .... ST idiom.
m->set_owner(mark->locker());
m->set_object(object);
// TODO-FIXME: assert BasicLock->dhw != 0.
// Must preserve store ordering. The monitor state must
// be stable at the time of publishing the monitor address.
guarantee (object->mark() == markOopDesc::INFLATING(), "invariant") ;
object->release_set_mark(markOopDesc::encode(m));
// Hopefully the performance counters are allocated on distinct cache lines
// to avoid false sharing on MP systems ...
if (ObjectMonitor::_sync_Inflations != NULL) ObjectMonitor::_sync_Inflations->inc() ;
TEVENT(Inflate: overwrite stacklock) ;
if (TraceMonitorInflation) {
if (object->is_instance()) {
ResourceMark rm;
tty->print_cr("Inflating object " INTPTR_FORMAT " , mark " INTPTR_FORMAT " , type %s",
(void *) object, (intptr_t) object->mark(),
object->klass()->external_name());
}
}
return m ;
}
// CASE: neutral
// TODO-FIXME: for entry we currently inflate and then try to CAS _owner.
// If we know we're inflating for entry it's better to inflate by swinging a
// pre-locked objectMonitor pointer into the object header. A successful
// CAS inflates the object *and* confers ownership to the inflating thread.
// In the current implementation we use a 2-step mechanism where we CAS()
// to inflate and then CAS() again to try to swing _owner from NULL to Self.
// An inflateTry() method that we could call from fast_enter() and slow_enter()
// would be useful.
assert (mark->is_neutral(), "invariant");
ObjectMonitor * m = omAlloc (Self) ;
// prepare m for installation - set monitor to initial state
m->Recycle();
m->set_header(mark);
m->set_owner(NULL);
m->set_object(object);
m->OwnerIsThread = 1 ;
m->_recursions = 0 ;
m->_Responsible = NULL ;
m->_SpinDuration = ObjectMonitor::Knob_SpinLimit ; // consider: keep metastats by type/class
if (Atomic::cmpxchg_ptr (markOopDesc::encode(m), object->mark_addr(), mark) != mark) {
m->set_object (NULL) ;
m->set_owner (NULL) ;
m->OwnerIsThread = 0 ;
m->Recycle() ;
omRelease (Self, m, true) ;
m = NULL ;
continue ;
// interference - the markword changed - just retry.
// The state-transitions are one-way, so there's no chance of
// live-lock -- "Inflated" is an absorbing state.
}
// Hopefully the performance counters are allocated on distinct
// cache lines to avoid false sharing on MP systems ...
if (ObjectMonitor::_sync_Inflations != NULL) ObjectMonitor::_sync_Inflations->inc() ;
TEVENT(Inflate: overwrite neutral) ;
if (TraceMonitorInflation) {
if (object->is_instance()) {
ResourceMark rm;
tty->print_cr("Inflating object " INTPTR_FORMAT " , mark " INTPTR_FORMAT " , type %s",
(void *) object, (intptr_t) object->mark(),
object->klass()->external_name());
}
}
return m ;
}
}
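The state machine inside inflate() is easier to follow in a stripped-down form. The sketch below models only the four cases handled by the loop (inflated, INFLATING, stack-locked, neutral) with an `AtomicReference` standing in for the Mark Word; monitor pooling, counters, and tracing are left out, and every class name is invented for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: a toy model of the retry loop in ObjectSynchronizer::inflate.
class InflateSketch {
    /** Transient marker, playing the role of markOopDesc::INFLATING(). */
    static final Object INFLATING = new Object();

    static final class Neutral { }

    static final class StackLock {
        final Thread owner;
        final Neutral displaced; // the saved neutral Mark Word
        StackLock(Thread owner, Neutral displaced) { this.owner = owner; this.displaced = displaced; }
    }

    static final class Monitor {
        Object header;
        Thread owner;
    }

    static Monitor inflate(AtomicReference<Object> mark) {
        for (;;) {
            Object m = mark.get();
            if (m instanceof Monitor) {
                return (Monitor) m;                  // CASE: inflated, just return
            }
            if (m == INFLATING) {
                Thread.yield();                      // CASE: another thread is inflating, wait and retry
                continue;
            }
            Monitor mon = new Monitor();             // allocate speculatively, before any CAS
            if (m instanceof StackLock) {            // CASE: stack-locked
                if (!mark.compareAndSet(m, INFLATING)) {
                    continue;                        // interference, just retry
                }
                StackLock sl = (StackLock) m;
                mon.header = sl.displaced;           // copy the displaced header into the monitor
                mon.owner = sl.owner;                // ownership carries over
                mark.set(mon);                       // publish: INFLATING -> inflated
                return mon;
            }
            mon.header = m;                          // CASE: neutral
            if (mark.compareAndSet(m, mon)) {
                return mon;                          // inflated, not yet owned by anyone
            }
        }
    }

    public static void main(String[] args) {
        AtomicReference<Object> mark =
                new AtomicReference<>(new StackLock(Thread.currentThread(), new Neutral()));
        Monitor mon = inflate(mark);
        System.out.println("owner preserved: " + (mon.owner == Thread.currentThread()));
    }
}
```

As in the C++ code, the stack-locked case goes through the transient INFLATING value while the neutral case installs the monitor with a single CAS.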
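An important point again: this is the moment to reread section 2.3 of this article!
The lock upgrade process can be summarized as: unlocked -> biased lock -> lightweight lock (spin lock, adaptive spinning) -> heavyweight lock. The lock only escalates in this forward direction; there is no downgrade.
After an object is initialized, it is in the unlocked state.
When a thread A comes to acquire the lock and the lock object is used for the first time, it enters biased mode, which is reentrant. Under certain strict conditions, if another thread B comes to acquire the lock, B can take over the biased lock with a CAS and replace the thread ID information in the Mark Word.
If the contest for the biased lock fails, the lock is upgraded to a lightweight lock (also called a spin lock or stack lock), and the upgrade also uses CAS. If the first CAS to acquire or contend for the lightweight lock fails, the thread spins and retries N times. Spinning originally used a fixed count and was gradually refined into smarter adaptive spinning. (Note that in the slow_enter code above, a failed CAS falls straight through to inflate(); the spin budget, _SpinDuration, is a field of the ObjectMonitor that inflate() sets up.)
If the lock still cannot be acquired after spinning, contention is evidently heavy, and because spinning on CAS burns CPU, the lock is inflated to a heavyweight lock.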
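A very useful summary: a heavyweight lock requests resources from the operating system directly; waiting threads are suspended and placed in the lock's queue, blocked until the operating system schedules them again. Biased and lightweight locks, by contrast, are not handed to the operating system for scheduling: they stay in user mode and still consume CPU, and they acquire the lock through lock-free CAS competition. In Java, CAS is exposed through the compareAndSwap methods of the Unsafe class, which call into C++ code in the JVM through JNI and ultimately rely on the assembly instruction below. Its lock prefix makes the compare-and-exchange atomic across cores without locking the whole bus (locking the bus would stall everything else).
The lock cmpxchg instruction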
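From application code, the usual way to reach CAS is through the `java.util.concurrent.atomic` classes rather than Unsafe directly. The sketch below shows the classic CAS retry loop on an `AtomicInteger`; on x86 HotSpot, `compareAndSet` typically ends up as a `lock cmpxchg`, though that mapping is an implementation detail. `CasCounter` is a hypothetical class written for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: a lock-free counter built on a CAS retry loop.
class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    /** Increments with an explicit compare-and-set loop and returns the new value. */
    int increment() {
        for (;;) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;   // CAS succeeded: nobody changed the value in between
            }
            // CAS failed: another thread won the race, read again and retry
        }
    }

    int get() {
        return value.get();
    }

    /** Runs several threads against one counter and returns the final count. */
    static int countWithThreads(int threads, int incrementsPerThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // Prints 40000: no increment is lost even though no lock is taken
        System.out.println(countWithThreads(4, 10_000));
    }
}
```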
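Here is a passage on lock elimination quoted from 《深入理解Java虚拟机》, 3rd edition:
Lock elimination means that the virtual machine's just-in-time compiler, at run time, removes locks on code that requests synchronization but has been detected as unable to have any contention on shared data. The main basis for lock elimination is the data provided by escape analysis: if it can be determined that none of the heap data used in a piece of code escapes to be accessed by other threads, that data can be treated as if it were on the stack and regarded as thread-private, so synchronized locking is naturally no longer necessary.
Take the following code, for example: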
public static String concatString(String str1, String str2, String str3) {
StringBuffer sb = new StringBuffer();
sb.append(str1).append(str2).append(str3);
return sb.toString();
}
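As everyone knows, StringBuffer is a thread-safe string concatenation class: every one of its methods carries the synchronized keyword and must acquire the lock before it can run, and the lock object is the StringBuffer instance itself. In the code above, the lock object is the sb instance. Escape analysis by the virtual machine finds that the scope of sb is confined to the concatString method and that it is never used or called from outside. Other threads therefore have no way to reach it, and no contention can arise. When the code is interpreted, the locks are still taken; once it has been compiled by the server JIT compiler (escape analysis is an optimization of the JIT compiler), the code skips all synchronization and runs directly.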
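One way to picture the effect of lock elimination at the source level is to compare the StringBuffer version with an unsynchronized StringBuilder version. This is only an analogy under the assumption that the JIT proves sb does not escape; it is not the code the compiler actually emits, and `LockElisionSketch` is a made-up name.

```java
// Illustrative only: the two methods return the same result; the second never locks.
class LockElisionSketch {
    /** Original form: every append() is a synchronized method on sb. */
    static String concatString(String str1, String str2, String str3) {
        StringBuffer sb = new StringBuffer();
        sb.append(str1).append(str2).append(str3);
        return sb.toString();
    }

    /** What the method behaves like once the locks on the non-escaping sb are eliminated. */
    static String concatStringUnsynchronized(String str1, String str2, String str3) {
        StringBuilder sb = new StringBuilder();
        sb.append(str1).append(str2).append(str3);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatString("a", "b", "c"));
        System.out.println(concatStringUnsynchronized("a", "b", "c"));
    }
}
```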
Show me the code:
public static String testLockCoarsening(String str) {
    StringBuffer sb = new StringBuffer();
    for (int i = 0; i < 100; i++) {
        // Each append() acquires and releases the lock on sb
        sb.append(str);
    }
    return sb.toString();
}
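In the code above, append needs to acquire the lock. Without optimization, calling it 100 times in a loop means acquiring and releasing the lock 100 times each, which is quite wasteful. The JVM detects that this whole series of operations locks the same object and coarsens the scope of the synchronization to outside the entire sequence (for example, outside the loop body), so that the series of operations only needs to take the lock once.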
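Written out by hand, a coarsened version of that loop would look roughly like the second method below. It is a source-level illustration only: the inner append() calls still re-enter the monitor (which is cheap because the lock is reentrant and already held), whereas the JIT works on the compiled code. `LockCoarseningSketch` is a hypothetical name.

```java
// Illustrative only: per-call locking versus one lock around the whole loop.
class LockCoarseningSketch {
    /** Locks and unlocks sb on each of the 100 append() calls. */
    static String perCallLocking(String str) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < 100; i++) {
            sb.append(str);
        }
        return sb.toString();
    }

    /** Takes the lock on sb once, outside the loop. */
    static String coarsened(String str) {
        StringBuffer sb = new StringBuffer();
        synchronized (sb) {
            for (int i = 0; i < 100; i++) {
                sb.append(str); // re-enters the monitor already held by this thread
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(perCallLocking("ab").equals(coarsened("ab"))); // prints true
    }
}
```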
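Further reading
Today's mainstream Java virtual machines, such as the HotSpot VM we use most, have an architecture in which an interpreter and compilers coexist. A Java program is first executed by the interpreter. When the virtual machine finds that a method or code block runs very frequently, it treats that code as hot code, compiles it to native machine code with the just-in-time compiler, and optimizes it as far as possible to improve execution efficiency.
In this architecture the interpreter and the compilers complement each other: when a program needs to start or run quickly, the interpreter steps in first and saves compilation time; once the program is running, the compilers gradually take over and compile more and more code to native code to improve efficiency.
More code: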
public class Demo {
static volatile int i = 0;
static volatile int j = 0;
public static void n() {
i++;
}
public static synchronized void m() {
j++;
}
public static void main(String[] args) throws InterruptedException {
for (int k = 0; k < 1_000_000; k++) { // one million calls make m() and n() hot
m();
n();
}
System.out.println(i);
System.out.println(j);
}
}
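When running the main method, add the following JVM options (enable diagnostic mode and print assembly): -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly to print the assembly code; or use -server -XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+PrintAssembly -XX:+LogCompilation -XX:LogFile=TestSynchronizedAssembly.log to write it to a log file and inspect the assembly with a tool such as jitwatch.
You will see the C1 Compile Level 1 (C1 compiler optimization) and C2 Compile Level 1 (C2 compiler optimization) output for the m and n methods, and both contain lock cmpxchg ..... instructions. In other words, the m and n methods, executed one million times, became hot code and were compiled and optimized by both compiler tiers, which turned the relatively expensive synchronized lock and unlock operations into a low-level CAS operation that is more suitable here, with the lock prefix providing the synchronization.
Note: not every synchronized is compiled and optimized into a lock cmpxchg ... instruction. Different code is optimized in different ways, so never assume that the underlying implementation of synchronized is the lock cmpxchg ... instruction. The code above is only an example.
If **-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly** does not work for you, the hsdis plugin is missing; look it up yourself or see section 11.2.4 of 《深入理解Java虚拟机》, 3rd edition. For downloading and installing hsdis and the powerful jitwatch, see this article: https://www.xuebuyuan.com/3192700.html
If you are interested in what the compilers do and how they work, look it up yourself or see chapters 10 and 11 of 《深入理解Java虚拟机》, 3rd edition.
My own grasp of these low-level principles is still at the "paper tiger" stage; if anything here is misunderstood or misstated, corrections and discussion are welcome.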