ThreadLocal Source Code Analysis, Demystified

What Is ThreadLocal

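Let's see how the class's authors describe it (the source file credits Josh Bloch and Doug Lea). Below is the class comment of ThreadLocal in JDK 7.x: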

This class provides thread-local variables. These variables differ from their normal counterparts in that each thread that accesses one (via its get or set method) has its own, independently initialized copy of the variable. ThreadLocal instances are typically private static fields in classes that wish to associate state with a thread (e.g., a user ID or Transaction ID).
Each thread holds an implicit reference to its copy of a thread-local variable as long as the thread is alive and the ThreadLocal instance is accessible; after a thread goes away, all of its copies of thread-local instances are subject to garbage collection (unless other references to these copies exist).

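In other words, this class gives each thread its own local copy of a variable. The thread keeps a reference to that copy for as long as the thread is alive and the ThreadLocal instance is reachable. Once the thread goes away, all of its thread-local copies become eligible for garbage collection (unless something else still references them). The Javadoc also suggests declaring ThreadLocal instances as private static fields.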
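To make the "own copy per thread" idea concrete, here is a minimal sketch. The class name ThreadLocalIsolationDemo, the field USER_ID, and the string values are made up for illustration; only ThreadLocal's set, get, and remove come from the JDK.

```java
public class ThreadLocalIsolationDemo {
    // Declared private static, as the Javadoc above suggests.
    private static final ThreadLocal<String> USER_ID = new ThreadLocal<String>();

    /**
     * Sets one value in the calling thread and a different one in a worker thread,
     * then returns {valueSeenByWorker, valueSeenByCaller}.
     */
    static String[] demo() {
        final String[] seenByWorker = new String[1];
        USER_ID.set("caller-user");

        Thread worker = new Thread(() -> {
            USER_ID.set("worker-user");      // touches only the worker's copy
            seenByWorker[0] = USER_ID.get();
        });
        worker.start();
        try {
            worker.join();                   // also makes seenByWorker[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        String seenByCaller = USER_ID.get(); // unaffected by the worker's set()
        USER_ID.remove();
        return new String[] { seenByWorker[0], seenByCaller };
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println("worker saw: " + seen[0]); // worker saw: worker-user
        System.out.println("caller saw: " + seen[1]); // caller saw: caller-user
    }
}
```

Both threads go through the same static field, yet neither sees the other's value.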

Relationship with Thread

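Suppose we had to design a variable bound to a thread ourselves. How would we do it? A common idea is to keep the Thread and the value in a Map (the key could be the thread ID or anything else that uniquely identifies a Thread). That is not how the JDK does it, though. Why does it take a different approach? Let's set that question aside for now and first look at what ThreadLocal#set does.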

/**
    * Sets the current thread's copy of this thread-local variable
    * to the specified value.  Most subclasses will have no need to
    * override this method, relying solely on the {@link #initialValue}
    * method to set the values of thread-locals.
    *
    * @param value the value to be stored in the current thread's copy of this thread-local.
    */
   public void set(T value) {
       Thread t = Thread.currentThread();
       // Get the ThreadLocalMap that belongs to the current thread
       ThreadLocalMap map = getMap(t);
       // If the map exists, store the value in it; otherwise create the map
       if (map != null)
           map.set(this, value);
       else
           createMap(t, value);
   }

Next, getMap:

/**
    * Get the map associated with a ThreadLocal. Overridden in
    * InheritableThreadLocal.
    *
    * @param  t the current thread
    * @return the map
    */
   ThreadLocalMap getMap(Thread t) {
       // Return a field of the Thread object
       return t.threadLocals;
   }

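So the ThreadLocalMap is bound to the Thread: the Thread class has a ThreadLocalMap field that is initially null. Back in ThreadLocal#set, if the map returned from the Thread is null, createMap is called, which creates a map stored on the current thread itself. Here is the field in Thread: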


/* ThreadLocal values pertaining to this thread. This map is maintained by the ThreadLocal class. */
// ThreadLocal assigns this field on the Thread's behalf
ThreadLocal.ThreadLocalMap threadLocals = null;

ThreadLocalMap

So the first time a thread uses a ThreadLocal, the ThreadLocalMap returned by getMap is necessarily null. Let's look at createMap:

/**
 * Create the map associated with a ThreadLocal. Overridden in
 * InheritableThreadLocal.
 *
 * @param t the current thread
 * @param firstValue value for the initial entry of the map
 * @param map the map to store.
 */
void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}

    /**
     * Construct a new map initially containing(firstKey,firstValue).
     * ThreadLocalMaps are constructed lazily, so we only create
     * one when we have at least one entry to put in it.
     */
    ThreadLocalMap(ThreadLocal firstKey, Object firstValue) {
        table = new Entry[INITIAL_CAPACITY];
        int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
        table[i] = new Entry(firstKey, firstValue);
        size = 1;
        setThreshold(INITIAL_CAPACITY);
    }

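createMap simply creates a new ThreadLocalMap, passing in the current ThreadLocal (this), and the constructor allocates an Entry array of length INITIAL_CAPACITY. As for why INITIAL_CAPACITY is a power of two, the same trick appears in HashMap: INITIAL_CAPACITY is 16, so 16 - 1 = 15, which is 1111 in binary. ANDing threadLocalHashCode with that mask always yields an index between 0 and INITIAL_CAPACITY - 1. For a power-of-two capacity this is the same slot as reducing threadLocalHashCode modulo INITIAL_CAPACITY (taken as a non-negative remainder), but a bitwise AND is cheaper than a modulo. With the index in hand, the constructor creates an Entry(ThreadLocal, Object), stores it in the table array, sets size to 1, and sets the threshold to 2/3 of INITIAL_CAPACITY (once size reaches the threshold, the map is rehashed and, if necessary, resized). The setThreshold code is omitted here.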
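The index and threshold arithmetic above can be checked with a small sketch. The class and method names (IndexMaskDemo, indexByMask, indexByMod, threshold) are made up for illustration; the value 16 and the 2/3 factor are the ones discussed above.

```java
public class IndexMaskDemo {
    static final int INITIAL_CAPACITY = 16; // same value as in ThreadLocalMap

    /** Index via bit mask; only valid when capacity is a power of two. */
    static int indexByMask(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    /** Index via non-negative modulo, for comparison. */
    static int indexByMod(int hash, int capacity) {
        return Math.floorMod(hash, capacity);
    }

    /** The 2/3 load factor described above, using integer arithmetic. */
    static int threshold(int capacity) {
        return capacity * 2 / 3;
    }

    public static void main(String[] args) {
        int[] hashes = { 0, 0x61c88647, -1, Integer.MIN_VALUE, 12345 };
        for (int h : hashes) {
            // Both columns agree, including for negative hash codes.
            System.out.println(h + " -> " + indexByMask(h, INITIAL_CAPACITY)
                    + " / " + indexByMod(h, INITIAL_CAPACITY));
        }
        System.out.println("threshold = " + threshold(INITIAL_CAPACITY)); // threshold = 10
    }
}
```

The comparison uses Math.floorMod rather than the % operator because % can return a negative remainder for a negative hash code, while the mask never does.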

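To sum up so far, the relationship between ThreadLocal and Thread should now be clear. Thread holds something like a Map that starts out as null. When we use a ThreadLocal, it initializes this map for the current thread and stores the value we want to bind to the thread in it, with the current ThreadLocal as the key. How does this differ from our first idea, which was to keep a map inside ThreadLocal with a Thread identifier as the key and the bound object as the value? And why do it this way? Reportedly, the JDK did use that first approach before 1.3 and switched to the current one afterwards. One advantage of the current approach is that the value lives in the thread and shares its lifecycle: when the thread dies, the value can be reclaimed. Another is performance. Imagine an application serving many requests: with the old approach, how large would that HashMap grow? Every access from every thread would go through that one map, so performance would likely suffer. With the current approach, each thread's map stays small, holding at most one entry per ThreadLocal that the thread has actually used.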
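For comparison, the map-keyed-by-thread idea could look roughly like the sketch below. This is a hypothetical illustration, not code from any JDK release: the class NaiveThreadLocal and its use of a synchronized WeakHashMap are assumptions chosen to show why one shared map grows with the number of threads and needs locking.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical naive design (not JDK code): one shared map per variable, keyed by thread. */
public class NaiveThreadLocal<T> {
    // Shared by every thread, so access must be synchronized.
    // Weak keys let the entry of a dead thread be collected eventually.
    private final Map<Thread, T> values =
            Collections.synchronizedMap(new WeakHashMap<Thread, T>());

    public void set(T value) {
        values.put(Thread.currentThread(), value);
    }

    public T get() {
        return values.get(Thread.currentThread());
    }

    public void remove() {
        values.remove(Thread.currentThread());
    }

    /** Number of threads that currently hold a value; grows with the thread count. */
    public int size() {
        return values.size();
    }

    public static void main(String[] args) {
        NaiveThreadLocal<String> local = new NaiveThreadLocal<String>();
        local.set("main-value");
        System.out.println(local.get());  // main-value
        System.out.println(local.size()); // 1
    }
}
```

In this design every get and set takes the same lock, whereas the JDK's per-thread map is only ever touched by its owning thread and needs no synchronization.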

Potential Problems

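By now it may seem that we understand the essence of ThreadLocal, but ThreadLocal can in fact run into a few problems.
A note on Entry: in the JDK, an Entry holds its key through a weak reference, which means the key can be garbage-collected as soon as no strong reference to the ThreadLocal remains.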

/**
        * The entries in this hash map extend WeakReference, using
        * its main ref field as the key (which is always a
        * ThreadLocal object).  Note that null keys (i.e. entry.get()
        * == null) mean that the key is no longer referenced, so the
        * entry can be expunged from table.  Such entries are referred to
        * as "stale entries" in the code that follows.
        */
       static class Entry extends WeakReference<ThreadLocal> {
           /** The value associated with this ThreadLocal. */
           Object value;

           Entry(ThreadLocal k, Object v) {
               super(k);
               value = v;
           }
       }

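As shown in the figure below (taken from the web), ThreadLocalMap uses a weak reference to the ThreadLocal as the key. If nothing outside holds a strong reference to a ThreadLocal, it will be reclaimed by a subsequent GC, leaving an Entry whose key is null in the ThreadLocalMap, and there is then no way to reach that Entry's value through the ThreadLocal. If the current thread stays alive for a long time, the value of such an Entry remains on a strong reference chain (Thread -> ThreadLocalMap -> Entry -> value). This is what many people consider a memory leak, though in my view it does not actually happen. Read on.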

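In principle, with a thread pool, a thread does not die when a task finishes; it goes back to the pool to be reused. At that point the value bound to the thread no longer means anything, yet it is not reclaimed, which also amounts to a memory leak. The other case is the one described above: once the weakly referenced key has been collected, the value can no longer be reached through the null key, which leads to the same problem. But is that really so? Let's hold that thought and look at a few other things first.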
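The stale-entry situation described in this section can be simulated with a short sketch. Two things are assumptions: the class StaleEntryDemo is made up, and Reference.clear() stands in for the garbage collector, because a real collection cannot be triggered deterministically. The Entry class has the same shape as the JDK one shown above (weak key, strong value).

```java
import java.lang.ref.WeakReference;

public class StaleEntryDemo {
    /** Same shape as ThreadLocalMap.Entry: weak key, strong value. */
    static class Entry extends WeakReference<Object> {
        Object value;

        Entry(Object key, Object value) {
            super(key);
            this.value = value;
        }
    }

    /** Builds an entry, then simulates the GC clearing its weak key. */
    static Entry makeStaleEntry() {
        Entry e = new Entry(new Object(), new byte[1024]);
        e.clear(); // stand-in for the collector clearing the referent
        return e;
    }

    public static void main(String[] args) {
        Entry e = makeStaleEntry();
        System.out.println("key cleared:   " + (e.get() == null));   // key cleared:   true
        System.out.println("value present: " + (e.value != null));   // value present: true
    }
}
```

The key is gone but the value is still strongly held by the Entry, which is exactly the state that the cleanup code has to deal with.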

ThreadLocal Snippets

How are ThreadLocal instances told apart? By giving each ThreadLocal an identifier? That is indeed one approach, and it is what the JDK does.

// Each ThreadLocal gets its own hash code that identifies it
 private final int threadLocalHashCode = nextHashCode();

    /**
     * The next hash code to be given out. Updated atomically. Starts at
     * zero.
     */
     // An atomic class keeps this thread-safe so each instance gets a distinct hash code; the counter is static
    private static AtomicInteger nextHashCode =
        new AtomicInteger();

    /**
     * The difference between successively generated hash codes - turns
     * implicit sequential thread-local IDs into near-optimally spread
     * multiplicative hash values for power-of-two-sized tables.
     * (Why this particular number was chosen has not been explored here.)
     */
    private static final int HASH_INCREMENT = 0x61c88647;

    /**
     * Returns the next hash code.
     * Returns the current value, then advances the counter by the increment above.
     */
    private static int nextHashCode() {
        return nextHashCode.getAndAdd(HASH_INCREMENT);
    }
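To see where successively generated hash codes land in a 16-slot table, here is a small sketch. The class HashSpreadDemo and its slots method are made up for illustration, and a local counter replaces the static one so the output is repeatable. As a side note, 0x61c88647 is 2^32 minus 0x9e3779b9, the 32-bit golden-ratio constant commonly used for this kind of multiplicative hashing.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class HashSpreadDemo {
    private static final int HASH_INCREMENT = 0x61c88647; // same constant as in ThreadLocal

    /** Slots chosen by `count` successively generated hash codes in a table of `capacity` slots. */
    static int[] slots(int count, int capacity) {
        AtomicInteger nextHashCode = new AtomicInteger(); // local counter, starts at zero
        int[] result = new int[count];
        for (int n = 0; n < count; n++) {
            int hash = nextHashCode.getAndAdd(HASH_INCREMENT);
            result[n] = hash & (capacity - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int slot : slots(16, 16)) {
            sb.append(slot).append(' ');
        }
        // 0 7 14 5 12 3 10 1 8 15 6 13 4 11 2 9
        System.out.println(sb.toString().trim());
        // Adding the 32-bit golden-ratio constant wraps around to exactly zero.
        System.out.println(HASH_INCREMENT + 0x9e3779b9 == 0); // true
    }
}
```

Sixteen consecutive hash codes fill sixteen different slots. Collisions can still occur in a real map, because a thread typically uses only some of the ThreadLocals ever created; the 1st and the 17th, for example, share slot 0 in a 16-slot table.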

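For example, the first ThreadLocal gets hash code 0, and the next one I define gets 0 plus HASH_INCREMENT, so their hash codes differ in the map. Even with different hash codes, though, the computed index i can be the same, which is a hash collision. ThreadLocalMap resolves collisions with linear probing: when slot i is occupied, it tries i + 1, and so on, wrapping around at the end of the table. Let's look at get and set: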

/**
    * Returns the value in the current thread's copy of this
    * thread-local variable.  If the variable has no value for the
    * current thread, it is first initialized to the value returned
    * by an invocation of the {@link #initialValue} method.
    *
    * @return the current thread's value of this thread-local
    */
   public T get() {
       Thread t = Thread.currentThread();
       ThreadLocalMap map = getMap(t);
       // The map is not null: look the entry up in it
       if (map != null) {
           ThreadLocalMap.Entry e = map.getEntry(this);
           if (e != null)
               return (T)e.value;
       }
       // No map, or no entry for this ThreadLocal: fall back to the initial value
       return setInitialValue();
   }

   /**
    * Variant of set() to establish initialValue. Used instead
    * of set() in case user has overridden the set() method.
    *
    * @return the initial value
    */
   private T setInitialValue() {
       // Supplies the initial value; in most cases we override this method
       T value = initialValue();
       Thread t = Thread.currentThread();
       ThreadLocalMap map = getMap(t);
       if (map != null)
           // Store the value in the existing map
           map.set(this, value);
       else
           // No map yet: create one holding this first value
           createMap(t, value);
       return value;
   }

   /**
    * Sets the current thread's copy of this thread-local variable
    * to the specified value.  Most subclasses will have no need to
    * override this method, relying solely on the {@link #initialValue}
    * method to set the values of thread-locals.
    *
    * @param value the value to be stored in the current thread's copy of
    *        this thread-local.
    */
   public void set(T value) {
       // Essentially the same logic as setInitialValue() above
       Thread t = Thread.currentThread();
       ThreadLocalMap map = getMap(t);
       if (map != null)
           map.set(this, value);
       else
           createMap(t, value);
   }
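The interplay between get(), setInitialValue(), and initialValue() shown above can be observed from the outside. In this sketch the class InitialValueDemo and its helper method are made up; it uses an anonymous subclass that overrides initialValue(), which is the style available in JDK 7.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class InitialValueDemo {
    /**
     * Calls get() `gets` times on a fresh ThreadLocal in the current thread and
     * returns how many times initialValue() ran. With removeBetween set, the
     * entry is removed after every get().
     */
    static int countInitialValueCalls(int gets, boolean removeBetween) {
        final AtomicInteger calls = new AtomicInteger();
        ThreadLocal<StringBuilder> buffer = new ThreadLocal<StringBuilder>() {
            @Override
            protected StringBuilder initialValue() {
                calls.incrementAndGet();
                return new StringBuilder();
            }
        };
        for (int n = 0; n < gets; n++) {
            buffer.get();            // the first call goes through setInitialValue()
            if (removeBetween) {
                buffer.remove();     // the next get() has to initialize again
            }
        }
        buffer.remove();             // leave nothing behind on the calling thread
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(countInitialValueCalls(3, false)); // 1
        System.out.println(countInitialValueCalls(3, true));  // 3
    }
}
```

initialValue() runs lazily, once per thread, and runs again only after remove() has dropped the entry.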

At first glance these are just simple add, get, and init operations on a map. But let's look at some details of ThreadLocalMap#set:

 private void set(ThreadLocal key, Object value) {

            // We don't use a fast path as with get() because it is at
            // least as common to use set() to create new entries as
            // it is to replace existing ones, in which case, a fast
            // path would fail more often than not.
            // Compute the slot index by ANDing the hash code with (len - 1)
            Entry[] tab = table;
            int len = tab.length;
            int i = key.threadLocalHashCode & (len-1);
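            /* Fetch the entry at that slot. If e is not null, call get() (inherited from Reference)
             * to obtain its ThreadLocal key. The index may match even though the ThreadLocal differs.
             * If it is the same object (k == key), replace the value in the Entry. If k is null, the
             * entry is stale, so replaceStaleEntry() reuses its slot. Otherwise move on to slot i + 1.
             */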
            for (Entry e = tab[i];
                 e != null;
                 e = tab[i = nextIndex(i, len)]) {
                ThreadLocal k = e.get();

                if (k == key) {
                    e.value = value;
                    return;
                }

                if (k == null) {
                    replaceStaleEntry(key, value, i);
                    return;
                }
            }
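            /* If the probe reaches an empty slot, create a new Entry and store it there, then call
             * cleanSomeSlots() to clean up the table. If no Entry was removed and size has reached the
             * threshold, rehash() is called. cleanSomeSlots() calls expungeStaleEntry() to remove stale
             * entries. rehash() calls expungeStaleEntries() to remove all stale entries, then calls
             * resize() to grow the table if size is still at least 3/4 of the threshold.
             */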
            tab[i] = new Entry(key, value);
            int sz = ++size;
            if (!cleanSomeSlots(i, sz) && sz >= threshold)
                rehash();
        }
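The probing loop above can be sketched in isolation. This is a simplified toy, not the JDK implementation: the class LinearProbingDemo is made up, keys must be non-null, there is no stale-entry handling or resizing, and the home slot is passed in explicitly so that collisions can be forced. Only nextIndex mirrors the wrap-around step used by ThreadLocalMap.

```java
public class LinearProbingDemo {
    private final Object[] keys = new Object[16];   // power-of-two length, never resized here
    private final Object[] values = new Object[16];

    /** Same wrap-around step as ThreadLocalMap.nextIndex. */
    private static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    /** Stores key/value starting at homeSlot and returns the slot actually used. */
    int put(Object key, Object value, int homeSlot) {
        int len = keys.length;
        int i = homeSlot & (len - 1);
        for (int probes = 0; probes < len; probes++) {
            if (keys[i] == null || keys[i] == key) { // empty slot, or same key: overwrite
                keys[i] = key;
                values[i] = value;
                return i;
            }
            i = nextIndex(i, len);                   // occupied by another key: probe on
        }
        throw new IllegalStateException("table is full");
    }

    /** Probes from homeSlot until the key is found or an empty slot is hit. */
    Object get(Object key, int homeSlot) {
        int len = keys.length;
        int i = homeSlot & (len - 1);
        for (int probes = 0; probes < len && keys[i] != null; probes++) {
            if (keys[i] == key) {
                return values[i];
            }
            i = nextIndex(i, len);
        }
        return null;
    }

    public static void main(String[] args) {
        LinearProbingDemo map = new LinearProbingDemo();
        Object a = new Object(), b = new Object(), c = new Object();
        System.out.println(map.put(a, "A", 15)); // 15
        System.out.println(map.put(b, "B", 15)); // 0  (slot 15 is taken, so it wraps around)
        System.out.println(map.put(c, "C", 15)); // 1
        System.out.println(map.get(b, 15));      // B
    }
}
```

Keys are compared by identity (==), as in ThreadLocalMap, so two distinct objects never count as the same key.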

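In get(), getEntry() uses the computed index to fetch an entry from the table. If that entry is non-null and its key matches, it is returned directly; otherwise getEntryAfterMiss() is called.
getEntryAfterMiss() handles the case where the slot computed from the key's hash code and the table length does not hold the key. If the entry at that slot is null, nothing was stored and it returns null. Otherwise it loops, moving to the next index and fetching the entry there, and expunging any stale entry (null key) it meets along the way. It returns the entry when the keys match (the previously stored value has been found), or exits the loop and returns null when it reaches a null entry. The expungeStaleEntries method removes every Entry whose key is null: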

  /**
 * Expunge all stale entries in the table.
 */
private void expungeStaleEntries() {
    Entry[] tab = table;
    int len = tab.length;
    for (int j = 0; j < len; j++) {
        Entry e = tab[j];
        if (e != null && e.get() == null)
            expungeStaleEntry(j);
    }
}
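The effect of such a full sweep can be sketched as follows. This is a deliberately simplified version under stated assumptions: the class ExpungeSweepDemo is made up, Reference.clear() again stands in for the GC, and the sweep only frees the slot and the value. The real expungeStaleEntry additionally rehashes the entries that follow the freed slot, which is left out here.

```java
import java.lang.ref.WeakReference;

public class ExpungeSweepDemo {
    static final Object LIVE_KEY = new Object(); // strongly reachable for the whole run

    /** Same shape as ThreadLocalMap.Entry: weak key, strong value. */
    static class Entry extends WeakReference<Object> {
        Object value;

        Entry(Object key, Object value) {
            super(key);
            this.value = value;
        }
    }

    /** Table with one live entry and one entry whose key has been cleared. */
    static Entry[] sampleTable() {
        Entry[] table = new Entry[4];
        table[0] = new Entry(LIVE_KEY, "live");
        table[1] = new Entry(new Object(), "stale");
        table[1].clear(); // stand-in for the GC clearing the weak key
        return table;
    }

    /** Simplified sweep: frees every slot whose key is gone and returns the number freed. */
    static int sweep(Entry[] table) {
        int removed = 0;
        for (int j = 0; j < table.length; j++) {
            Entry e = table[j];
            if (e != null && e.get() == null) {
                e.value = null;  // release the value
                table[j] = null; // free the slot
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Entry[] table = sampleTable();
        System.out.println(sweep(table));   // 1
        System.out.println(table[0].value); // live
        System.out.println(table[1]);       // null
    }
}
```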

Summary

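We now have rough answers to the questions left open above. When set or get is called, ThreadLocalMap opportunistically clears entries whose key is null, so such entries do not normally turn into a lasting memory leak. With a thread pool, however, we should make a habit of calling remove() once the thread has finished with the ThreadLocal, so that the Entry is cleared.
References to objects bound to an abandoned ThreadLocal are cleaned up in the following four cases: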

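When the Thread terminates.
When the size of the Thread's ThreadLocalMap reaches its threshold, which triggers rehash.
When a ThreadLocal is stored in the Thread's ThreadLocalMap and the lookup does not hit an existing Entry, so a new Entry has to be created.
When remove() is called on the ThreadLocal manually, or set(null) is used (set(null) releases the value but keeps the Entry).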
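The thread-pool advice can be demonstrated with a single-threaded executor, where two tasks are guaranteed to share one worker thread. The class PooledThreadCleanupDemo, the field CONTEXT, and the value "request-1" are made up for illustration; the sketch assumes Java 8 or later for the lambdas.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PooledThreadCleanupDemo {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<String>();

    /** Runs two tasks on the same pooled thread and returns what the second one reads. */
    static String secondTaskSees(final boolean cleanUp) {
        Runnable firstTask = () -> {
            CONTEXT.set("request-1");
            try {
                // work that relies on CONTEXT would go here
            } finally {
                if (cleanUp) {
                    CONTEXT.remove(); // drop the entry before the thread returns to the pool
                }
            }
        };
        Callable<String> secondTask = () -> CONTEXT.get();

        ExecutorService pool = Executors.newSingleThreadExecutor(); // one reused worker thread
        try {
            pool.submit(firstTask).get();
            return pool.submit(secondTask).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // request-1
        System.out.println(secondTaskSees(true));  // null
    }
}
```

Without remove(), the second task inherits the first task's value from the reused thread; calling remove() in a finally block prevents that even when the task fails.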