A Simple, In-Depth Look at MyBatis: The Caching Mechanism - 白红宇

Published: 2019-06-06

Preface

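Caching is one of the essential techniques for improving system performance, and a cache with a high hit rate can substantially raise overall throughput. Its uses range from something as small as keeping login information in an HTTP session to something as large as a standalone cache component (such as Redis or Memcached) that takes part of the query load off a database. MyBatis, a very popular ORM component in the Java world, uses caching as well. This raises two questions: how does MyBatis implement its cache, and how should an application use it sensibly? The discussion below is based on MyBatis 3.4.5.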

Preparation

Database table

create table `student` (
  `id` BIGINT(20) NOT NULL AUTO_INCREMENT COMMENT 'primary key ID',
  `name` VARCHAR(50) default '',
  `age` INT NOT NULL DEFAULT 0 COMMENT 'age',
  `sex` TINYINT NOT NULL DEFAULT 0 COMMENT 'sex, 0: male, 1: female',
  `ctime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
  `mtime` timestamp DEFAULT CURRENT_TIMESTAMP COMMENT 'modification time',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Interface mapper

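import org.apache.ibatis.annotations.Param;
import org.apache.ibatis.annotations.Select;

public interface StudentMapper {
    // Caching mainly applies to queries
    @Select("select * from student where id = #{id}")
    Student getStudentById(@Param("id") long id);
}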

XML mapper

MyBatis default cache settings

MyBatis has two settings that control caching: localCacheScope (default: SESSION) and cacheEnabled (default: true).

Where are these two settings used? Let's start by reading the relevant MyBatis source code.

First, consider how cacheEnabled is used.

org.apache.ibatis.session.defaults.DefaultSqlSessionFactory

// In the application, obtain a SqlSession from the sqlSessionFactory to perform CRUD operations
SqlSession sqlSession = sqlSessionFactory.openSession(true);

// DefaultSqlSessionFactory: obtain a SqlSession
@Override
public SqlSession openSession(boolean autoCommit) {
    return openSessionFromDataSource(configuration.getDefaultExecutorType(), null, autoCommit);
}

// Build the SqlSession from the MyBatis configuration
private SqlSession openSessionFromDataSource(ExecutorType execType, TransactionIsolationLevel level, boolean autoCommit) {
    Transaction tx = null;
    try {
        final Environment environment = configuration.getEnvironment();
        final TransactionFactory transactionFactory = getTransactionFactoryFromEnvironment(environment);
        tx = transactionFactory.newTransaction(environment.getDataSource(), level, autoCommit);
        // Assemble the concrete Executor implementation for the configured executor type
        final Executor executor = configuration.newExecutor(tx, execType);
        return new DefaultSqlSession(configuration, executor, autoCommit);
    } catch (Exception e) {
        closeTransaction(tx); // may have fetched a connection so lets call close()
        throw ExceptionFactory.wrapException("Error opening session.  Cause: " + e, e);
    } finally {
        ErrorContext.instance().reset();
    }
}

org.apache.ibatis.session.Configuration

// Configuration assembles the concrete Executor implementation according to defaultExecutorType
public Executor newExecutor(Transaction transaction, ExecutorType executorType) {
    executorType = executorType == null ? defaultExecutorType : executorType;
    executorType = executorType == null ? ExecutorType.SIMPLE : executorType;
    Executor executor;
    if (ExecutorType.BATCH == executorType) {
        // defaultExecutorType is BATCH: use BatchExecutor
        executor = new BatchExecutor(this, transaction);
    } else if (ExecutorType.REUSE == executorType) {
        // defaultExecutorType is REUSE: use ReuseExecutor
        executor = new ReuseExecutor(this, transaction);
    } else {
        // Default: use SimpleExecutor
        executor = new SimpleExecutor(this, transaction);
    }
    // If cacheEnabled is true, wrap the executor in a CachingExecutor
    if (cacheEnabled) {
        executor = new CachingExecutor(executor);
    }
    executor = (Executor) interceptorChain.pluginAll(executor);
    return executor;
}

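As the source shows, MyBatis picks an executor according to the defaultExecutorType setting: BatchExecutor, ReuseExecutor, or SimpleExecutor. In addition, when cacheEnabled is true, the chosen executor is wrapped in a special one, CachingExecutor. How do these executors differ, and how are they related? The figure below is the class diagram of the MyBatis executors.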
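The selection logic in newExecutor() can be sketched without any MyBatis dependency. The following is a minimal model, not MyBatis code: `ExecutorSelectionSketch`, `PlainExecutor`, `CachingWrapper`, and `describe()` are hypothetical names introduced for illustration. It shows the two separate decisions: first pick a base executor by type, then optionally wrap it in a caching decorator.

```java
import java.util.Objects;

/** Simplified model of executor selection; these types are illustrative, not MyBatis classes. */
final class ExecutorSelectionSketch {

    enum ExecutorType { SIMPLE, REUSE, BATCH }

    interface Executor {
        String describe();
    }

    /** Hypothetical stand-in for a concrete executor such as SimpleExecutor. */
    static final class PlainExecutor implements Executor {
        private final ExecutorType type;

        PlainExecutor(ExecutorType type) {
            this.type = Objects.requireNonNull(type);
        }

        @Override
        public String describe() {
            return type.name();
        }
    }

    /** Decorator: wraps another executor instead of extending it. */
    static final class CachingWrapper implements Executor {
        private final Executor delegate;

        CachingWrapper(Executor delegate) {
            this.delegate = Objects.requireNonNull(delegate);
        }

        @Override
        public String describe() {
            return "CACHING(" + delegate.describe() + ")";
        }
    }

    /** Same shape as newExecutor(): choose a base executor, then wrap it if caching is enabled. */
    static Executor newExecutor(ExecutorType type, boolean cacheEnabled) {
        ExecutorType effective = (type == null) ? ExecutorType.SIMPLE : type;
        Executor executor = new PlainExecutor(effective);
        return cacheEnabled ? new CachingWrapper(executor) : executor;
    }

    public static void main(String[] args) {
        System.out.println(newExecutor(ExecutorType.BATCH, true).describe()); // CACHING(BATCH)
        System.out.println(newExecutor(null, false).describe());              // SIMPLE
    }
}
```

Because the caching executor is a wrapper rather than a fourth executor type, cacheEnabled combines freely with any value of defaultExecutorType.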

Next, let's trace where localCacheScope is used.

The following is the sequence diagram of a MyBatis query:

As the sequence diagram shows, MyBatis carries out queries through an executor, so we need to trace the executor source code.

org.apache.ibatis.executor.BaseExecutor

@SuppressWarnings("unchecked")
@Override
public <E> List<E> query(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql) throws SQLException {
    ErrorContext.instance().resource(ms.getResource()).activity("executing a query").object(ms.getId());
    if (closed) {
        throw new ExecutorException("Executor was closed.");
    }
    if (queryStack == 0 && ms.isFlushCacheRequired()) {
        clearLocalCache();
    }
    List<E> list;
    try {
        queryStack++;
        // Try the cache first
        list = resultHandler == null ? (List<E>) localCache.getObject(key) : null;
        if (list != null) {
            handleLocallyCachedOutputParameters(ms, key, parameter, boundSql);
        } else {
            // Cache miss: query the database directly
            list = queryFromDatabase(ms, parameter, rowBounds, resultHandler, key, boundSql);
        }
    } finally {
        queryStack--;
    }
    if (queryStack == 0) {
        for (BaseExecutor.DeferredLoad deferredLoad : deferredLoads) {
            deferredLoad.load();
        }
        // issue #601
        deferredLoads.clear();
        // If localCacheScope is STATEMENT, clear the cache after every query
        if (configuration.getLocalCacheScope() == LocalCacheScope.STATEMENT) {
            // issue #482
            clearLocalCache();
        }
    }
    return list;
}

// Query the database directly
private <E> List<E> queryFromDatabase(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql) throws SQLException {
    List<E> list;
    localCache.putObject(key, EXECUTION_PLACEHOLDER);
    try {
        // The actual database query is done by doQuery(), implemented in the BaseExecutor subclasses
        list = doQuery(ms, parameter, rowBounds, resultHandler, boundSql);
    } finally {
        localCache.removeObject(key);
    }
    localCache.putObject(key, list);
    if (ms.getStatementType() == StatementType.CALLABLE) {
        localOutputParameterCache.putObject(key, parameter);
    }
    return list;
}

// doQuery() is abstract in BaseExecutor, so the actual database query is
// delegated to a BaseExecutor subclass: BatchExecutor, ReuseExecutor, or SimpleExecutor
protected abstract <E> List<E> doQuery(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler, BoundSql boundSql) throws SQLException;

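The query() method of BaseExecutor first tries to fetch the data from a local cache object named "localCache" and queries the database only on a miss. In addition, when localCacheScope is STATEMENT, the local cache of BaseExecutor is cleared after every query.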
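The effect of the two scope values is easy to see in a stripped-down model. The sketch below is an illustration only and assumes a plain HashMap keyed by a string. `LocalCacheSketch`, `LocalCachingExecutor`, and `databaseHits()` are hypothetical names, and the "database" is just a function. It follows the same order as query(): look in the local cache, fall back to the database, then clear the cache if the scope is STATEMENT.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Simplified model of a per-executor local cache; not MyBatis code. */
final class LocalCacheSketch {

    enum Scope { SESSION, STATEMENT }

    /** Hypothetical stand-in for one executor, and therefore for one session. */
    static final class LocalCachingExecutor {
        private final Map<String, String> localCache = new HashMap<>();
        private final Function<String, String> database;
        private final Scope scope;
        private int databaseHits;

        LocalCachingExecutor(Function<String, String> database, Scope scope) {
            this.database = database;
            this.scope = scope;
        }

        String query(String key) {
            String result = localCache.get(key);
            if (result == null) {
                // Cache miss: go to the "database" and remember the result
                databaseHits++;
                result = database.apply(key);
                localCache.put(key, result);
            }
            if (scope == Scope.STATEMENT) {
                // STATEMENT scope: nothing survives past the current query
                localCache.clear();
            }
            return result;
        }

        int databaseHits() {
            return databaseHits;
        }
    }

    public static void main(String[] args) {
        LocalCachingExecutor session = new LocalCachingExecutor(k -> "row-" + k, Scope.SESSION);
        session.query("1");
        session.query("1");
        System.out.println(session.databaseHits()); // 1

        LocalCachingExecutor statement = new LocalCachingExecutor(k -> "row-" + k, Scope.STATEMENT);
        statement.query("1");
        statement.query("1");
        System.out.println(statement.databaseHits()); // 2
    }
}
```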

At this point we can give a rough summary of the two settings that control caching in MyBatis:

(1) cacheEnabled controls the executor type MyBatis uses (whether the base executor is wrapped in a CachingExecutor).

(2) localCacheScope controls the caching policy inside BaseExecutor.

How the cache is implemented

Since MyBatis controls caching through cacheEnabled and localCacheScope, the analysis below is organized around these two settings.

Caching policy controlled by localCacheScope

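From the default cache settings we know that cacheEnabled determines the executor type, and the executor class diagram shows that when cacheEnabled is false, MyBatis uses a BaseExecutor subclass directly, without the CachingExecutor wrapper. The query() source above also shows that BaseExecutor keeps query results in a cache object named "localCache". The valid values of localCacheScope are SESSION and STATEMENT; with STATEMENT, the local cache in BaseExecutor is cleared after every query. So we need to look more closely at how this local cache is implemented.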

In fact, the local cache in BaseExecutor is an instance of org.apache.ibatis.cache.impl.PerpetualCache. Its source shows that it merely wraps a HashMap, and the cached data is stored in that HashMap.

public class PerpetualCache implements Cache {
    private final String id;
    private Map<Object, Object> cache = new HashMap<Object, Object>();

    public PerpetualCache(String id) {
        this.id = id;
    }
}

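Moreover, the query() method of BaseExecutor shows that the key used to read from the local cache is an instance of org.apache.ibatis.cache.CacheKey. From how HashMap works, we also know that HashMap treats two keys as the same only if both of the following hold: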

First, the two keys must have the same hashCode value; this is the precondition.

Second, the two keys refer to the same object, or equals() returns true when they are compared.

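In other words, for the local cache in BaseExecutor to take effect, the CacheKey passed in at query time must satisfy HashMap's key-equality conditions; otherwise the cache is never hit. So let's continue with the implementation of CacheKey.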
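The two conditions can be checked directly with a small experiment. In this sketch, `IdentityKey` and `ValueKey` are hypothetical classes written for the illustration: the first inherits equals() and hashCode() from Object, the second overrides both based on its fields. Only the second allows a freshly built key to find an entry stored under an earlier, equal key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Demonstrates why a cache key must define value-based equals() and hashCode(). */
final class MapKeySketch {

    /** Inherits identity-based equals()/hashCode() from Object. */
    static final class IdentityKey {
        private final String sql;
        private final long id;

        IdentityKey(String sql, long id) {
            this.sql = sql;
            this.id = id;
        }
    }

    /** Two instances with the same fields are equal and share a hash code. */
    static final class ValueKey {
        private final String sql;
        private final long id;

        ValueKey(String sql, long id) {
            this.sql = sql;
            this.id = id;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof ValueKey)) {
                return false;
            }
            ValueKey that = (ValueKey) other;
            return id == that.id && sql.equals(that.sql);
        }

        @Override
        public int hashCode() {
            return Objects.hash(sql, id);
        }
    }

    /** Stores a value under one key object and probes with another. */
    static <K> boolean secondLookupHits(K stored, K probe) {
        Map<K, String> cache = new HashMap<>();
        cache.put(stored, "row");
        return cache.containsKey(probe);
    }

    public static void main(String[] args) {
        String sql = "select * from student where id = ?";
        System.out.println(secondLookupHits(new IdentityKey(sql, 1L), new IdentityKey(sql, 1L))); // false
        System.out.println(secondLookupHits(new ValueKey(sql, 1L), new ValueKey(sql, 1L)));       // true
    }
}
```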

org.apache.ibatis.cache.CacheKey

public class CacheKey implements Cloneable, Serializable {
    private final int multiplier;
    private int hashcode;
    private long checksum;
    private int count;
    private transient List<Object> updateList;
    // ...
}

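CacheKey holds four key fields: hashcode, checksum, count, and updateList. Their values determine whether two CacheKey objects compare as equal. That is, for the local cache in BaseExecutor to be hit, the CacheKey passed at query time must have the same field values as the CacheKey that was used when the data was cached.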
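To make the role of these four fields concrete, here is a minimal composite key in the same spirit. It is a sketch under stated assumptions, not the MyBatis implementation: the class name `CompositeKey`, the helper `keyFor()`, the constants 17 and 37, and the exact mixing formula are all chosen for illustration. Each update() folds one component into the running hash and checksum, and equals() compares the cheap scalar fields before the full component list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Simplified composite cache key; an illustration, not the MyBatis CacheKey. */
final class CacheKeySketch {

    static final class CompositeKey {
        private static final int MULTIPLIER = 37; // illustrative constant
        private int hashcode = 17;                // illustrative seed
        private long checksum;
        private int count;
        private final List<Object> updateList = new ArrayList<>();

        /** Folds one more component into the key and returns this key for chaining. */
        CompositeKey update(Object part) {
            int base = Objects.hashCode(part); // 0 when part is null
            count++;
            checksum += base;
            hashcode = MULTIPLIER * hashcode + base * count;
            updateList.add(part);
            return this;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof CompositeKey)) {
                return false;
            }
            CompositeKey that = (CompositeKey) other;
            // Cheap scalar comparisons first, then the component list itself
            return hashcode == that.hashcode
                    && checksum == that.checksum
                    && count == that.count
                    && updateList.equals(that.updateList);
        }

        @Override
        public int hashCode() {
            return hashcode;
        }
    }

    /** Builds a key from the kinds of components a query is identified by. */
    static CompositeKey keyFor(String statementId, int offset, int limit, String sql, Object parameter) {
        return new CompositeKey()
                .update(statementId)
                .update(offset)
                .update(limit)
                .update(sql)
                .update(parameter);
    }

    public static void main(String[] args) {
        String sql = "select * from student where id = ?";
        CompositeKey first = keyFor("getStudentById", 0, Integer.MAX_VALUE, sql, 1L);
        CompositeKey second = keyFor("getStudentById", 0, Integer.MAX_VALUE, sql, 1L);
        CompositeKey other = keyFor("getStudentById", 0, Integer.MAX_VALUE, sql, 2L);
        System.out.println(first.equals(second)); // true: same components in the same order
        System.out.println(first.equals(other));  // false: the parameter differs
    }
}
```

Two keys built independently from the same components are equal and share a hash code, which is exactly what a HashMap-backed cache needs in order to find an earlier entry.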

So we need to trace how a CacheKey is constructed.

org.apache.ibatis.executor.BaseExecutor

@Override
public <E> List<E> query(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler) throws SQLException {
    BoundSql boundSql = ms.getBoundSql(parameter);
    // Build the CacheKey before querying
    CacheKey key = createCacheKey(ms, parameter, rowBounds, boundSql);
    return query(ms, parameter, rowBounds, resultHandler, key, boundSql);
}

@Override
public CacheKey createCacheKey(MappedStatement ms, Object parameterObject, RowBounds rowBounds, BoundSql boundSql) {
    if (closed) {
        throw new ExecutorException("Executor was closed.");
    }
    CacheKey cacheKey = new CacheKey();
    // Each call to update() appends an object to the CacheKey's internal updateList.
    // For the same SQL statement querying the same row, the following values are guaranteed to be equal:
    // ms.getId(), rowBounds.getOffset(), rowBounds.getLimit(), boundSql.getSql(), boundSql.getParameterMappings()
    // So querying the same row within the same SqlSession always hits the BaseExecutor local cache
    cacheKey.update(ms.getId());
    cacheKey.update(rowBounds.getOffset());
    cacheKey.update(rowBounds.getLimit());
    cacheKey.update(boundSql.getSql());
    List<ParameterMapping> parameterMappings = boundSql.getParameterMappings();
    TypeHandlerRegistry typeHandlerRegistry = ms.getConfiguration().getTypeHandlerRegistry();
    // mimic DefaultParameterHandler logic
    for (ParameterMapping parameterMapping : parameterMappings) {
        if (parameterMapping.getMode() != ParameterMode.OUT) {
            Object value;
            String propertyName = parameterMapping.getProperty();
            if (boundSql.hasAdditionalParameter(propertyName)) {
                value = boundSql.getAdditionalParameter(propertyName);
            } else if (parameterObject == null) {
                value = null;
            } else if (typeHandlerRegistry.hasTypeHandler(parameterObject.getClass())) {
                value = parameterObject;
            } else {
                MetaObject metaObject = configuration.newMetaObject(parameterObject);
                value = metaObject.getValue(propertyName);
            }
            cacheKey.update(value);
        }
    }
    if (configuration.getEnvironment() != null) {
        // issue #176
        cacheKey.update(configuration.getEnvironment().getId());
    }
    return cacheKey;
}

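From this analysis of how the CacheKey for the BaseExecutor local cache is built, we can conclude that querying the same row within the same SqlSession hits the local cache. In other words, the caching policy controlled by localCacheScope is effective only within a single SqlSession: localCache is an instance field of BaseExecutor, so each executor instance holds its own local cache, and different SqlSessions use different executor instances. The figure below illustrates this relationship: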

Is that really the case? Let's verify.

// Query the same row within the same session
SqlSession sqlSession = sqlSessionFactory.openSession(true);
Student student = sqlSession.getMapper(StudentMapper.class).getStudentById(1);
System.out.println(student);
student = sqlSession.getMapper(StudentMapper.class).getStudentById(1);
System.out.println(student);
sqlSession.close();

The corresponding MyBatis log output is:

method: query
DEBUG [main] - Opening JDBC Connection
DEBUG [main] - Created connection 1131184547.
DEBUG [main] - ==>  Preparing: select * from student where id = ?
DEBUG [main] - ==> Parameters: 1(Long)
DEBUG [main] - <==      Total: 1
Student{id=1, name='张三', age=23, sex=0}
method: query
Student{id=1, name='张三', age=23, sex=0}
DEBUG [main] - Closing JDBC Connection [com.mysql.cj.jdbc.ConnectionImpl@436c81a3]
DEBUG [main] - Returned connection 1131184547 to pool.

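The log makes it clear that when the same row is queried repeatedly within one session, only the first query actually reaches the database; later ones are served from the session's cache. The cache has no expiry of its own and lives as long as the SqlSession, unless it is cleared. As query() shows, that happens after every query when localCacheScope is STATEMENT and whenever a statement requires a flush; the local cache is also cleared when the session executes an update, commits, or rolls back.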

// Query the same row in different sessions
SqlSession sqlSession1 = sqlSessionFactory.openSession(true);
Student student = sqlSession1.getMapper(StudentMapper.class).getStudentById(1);
System.out.println(student);
sqlSession1.close();

SqlSession sqlSession2 = sqlSessionFactory.openSession(true);
student = sqlSession2.getMapper(StudentMapper.class).getStudentById(1);
System.out.println(student);
sqlSession2.close();

When the same row is queried in different sessions, each session queries the database directly, as the following log shows:

method: query
DEBUG [main] - Opening JDBC Connection
DEBUG [main] - Created connection 1131184547.
DEBUG [main] - ==>  Preparing: select * from student where id = ?
DEBUG [main] - ==> Parameters: 1(Long)
DEBUG [main] - <==      Total: 1
Student{id=1, name='张三', age=23, sex=0}
DEBUG [main] - Closing JDBC Connection [com.mysql.cj.jdbc.ConnectionImpl@436c81a3]
DEBUG [main] - Returned connection 1131184547 to pool.
method: query
DEBUG [main] - Opening JDBC Connection
DEBUG [main] - Checked out connection 1131184547 from pool.
DEBUG [main] - ==>  Preparing: select * from student where id = ?
DEBUG [main] - ==> Parameters: 1(Long)
DEBUG [main] - <==      Total: 1
Student{id=1, name='张三', age=23, sex=0}
DEBUG [main] - Closing JDBC Connection [com.mysql.cj.jdbc.ConnectionImpl@436c81a3]
DEBUG [main] - Returned connection 1131184547 to pool.

Caching policy controlled by cacheEnabled

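Having covered the caching policy controlled by localCacheScope, we turn to the one controlled by cacheEnabled. As the source analysis above showed, when cacheEnabled is true MyBatis uses the CachingExecutor. Let's read its source to see how it differs from the other Executor implementations.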

org.apache.ibatis.executor.CachingExecutor

// The wrapped executor
private final Executor delegate;
// The cache manager inside CachingExecutor
private final TransactionalCacheManager tcm = new TransactionalCacheManager();

// The constructor receives the executor to wrap
public CachingExecutor(Executor delegate) {
    this.delegate = delegate;
    delegate.setExecutorWrapper(this);
}

// The query method of CachingExecutor, where caching is implemented
@Override
public <E> List<E> query(MappedStatement ms, Object parameterObject, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql)
        throws SQLException {
    // Get the Cache instance from the MappedStatement
    Cache cache = ms.getCache();
    if (cache != null) {
        flushCacheIfRequired(ms);
        if (ms.isUseCache() && resultHandler == null) {
            ensureNoOutParams(ms, parameterObject, boundSql);
            @SuppressWarnings("unchecked")
            List<E> list = (List<E>) tcm.getObject(cache, key);
            if (list == null) {
                list = delegate.<E> query(ms, parameterObject, rowBounds, resultHandler, key, boundSql);
                tcm.putObject(cache, key, list); // issue #578 and #116
            }
            return list;
        }
    }
    return delegate.<E> query(ms, parameterObject, rowBounds, resultHandler, key, boundSql);
}

Reading the CachingExecutor source tells us the following:

(1) CachingExecutor uses a separate cache-management component, TransactionalCacheManager, implemented as follows:

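public class TransactionalCacheManager {
    // Internally a HashMap, so TransactionalCacheManager itself does not store cached data.
    // Note: both the key and the value of this HashMap are Cache types;
    // the data is actually held by the TransactionalCache instance stored as the value
    private final Map<Cache, TransactionalCache> transactionalCaches = new HashMap<Cache, TransactionalCache>();
}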

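The name of this cache manager is telling: it suggests a connection to transactions. Indeed, the manager's commit() method is called when the transaction ends (on commit, or when the session is closed normally), and it is this step that makes the cache controlled by cacheEnabled global. It is a neat design.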

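(2) The query() method of CachingExecutor first asks its cache manager for the data. On a miss, it delegates to the concrete executor to query the data and then caches the result. (The concrete executor wrapped by CachingExecutor is a BaseExecutor instance, such as BatchExecutor, ReuseExecutor, or SimpleExecutor.)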

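(3) When caching data, the cache manager of CachingExecutor uses as its key a Cache instance obtained from the MappedStatement. MappedStatement does hold a field of type Cache, and further reading of the source shows that this Cache object has to be configured in the MyBatis mapper and that the instance is global (shared by all sessions, one per mapper namespace). The MyBatis documentation on XML mapper configuration confirms this.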

Reading the query() implementation of CachingExecutor once more, its caching principle is summarized in the figure below and in the following steps.

Step 1: when data is cached, it is first held temporarily in the HashMap inside the TransactionalCache instance referenced by the value of the transactionalCaches field of TransactionalCacheManager.

Step 2: when the transaction ends, all data held temporarily in that HashMap is moved to the global Cache instance.

Step 3: reads go directly to the global Cache instance.
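The three steps can be modeled in a few lines. The following is a minimal sketch rather than the MyBatis classes: `StagedCache` and its `pending` map are hypothetical names, a ConcurrentHashMap stands in for the global Cache instance, and keys and values are plain strings. Writes are staged per session, commit() publishes them, and reads always go to the shared map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Simplified model of a transactional front for a shared cache; not MyBatis code. */
final class TransactionalCacheSketch {

    /** One instance per session, all pointing at the same shared map. */
    static final class StagedCache {
        private final Map<String, String> shared;
        private final Map<String, String> pending = new HashMap<>();

        StagedCache(Map<String, String> shared) {
            this.shared = shared;
        }

        /** Step 1: hold new entries privately until the transaction ends. */
        void put(String key, String value) {
            pending.put(key, value);
        }

        /** Step 2: publish everything staged so far to the shared cache. */
        void commit() {
            shared.putAll(pending);
            pending.clear();
        }

        /** Step 3: reads consult only the shared cache. */
        String get(String key) {
            return shared.get(key);
        }
    }

    public static void main(String[] args) {
        Map<String, String> global = new ConcurrentHashMap<>();
        StagedCache session1 = new StagedCache(global);
        StagedCache session2 = new StagedCache(global);

        session1.put("student:1", "row-1");
        System.out.println(session2.get("student:1")); // null: not committed yet

        session1.commit();
        System.out.println(session2.get("student:1")); // row-1: visible to other sessions
    }
}
```

Staging the writes means a session never exposes results to other sessions before its transaction has ended.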

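Because the data ends up in the global Cache instance, the caching policy controlled by cacheEnabled is global (application-context scope): queries for the same row in different sessions are all served from this global cache. The following example verifies this.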

Turn on the global cache switch

Set cacheEnabled to true in the MyBatis configuration.

Define the global cache instance

// Define the global cache in the XML mapper
<cache />

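Configuring the global cache in an XML mapper is simple: just add a <cache /> node to the mapper. Since the goal here is only to demonstrate the effect, no detailed parameters are configured and the defaults are used.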

// Define the global cache in the interface mapper
@CacheNamespace
public interface StudentMapper {
    // ...
}

In an interface mapper, the global cache is configured with the @CacheNamespace annotation, which has the same effect as the <cache /> node in an XML mapper.

Verify the effect of the global cache

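The cache controlled by cacheEnabled is global, so when several sessions query the same row with the same SQL statement, only the first query reaches the database; later queries read from the global cache. The following example queries through the XML mapper: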

// Query the same row in different sessions
SqlSession sqlSession1 = sqlSessionFactory.openSession(true);
Student student = sqlSession1.selectOne("org.chench.test.mybatis.mapper.getStudentById", 1);
System.out.println(student);
sqlSession1.close();

SqlSession sqlSession2 = sqlSessionFactory.openSession(true);
student = sqlSession2.selectOne("org.chench.test.mybatis.mapper.getStudentById", 1);
System.out.println(student);
sqlSession2.close();

The MyBatis log output:

method: query
DEBUG [main] - Cache Hit Ratio [org.chench.test.mybatis.mapper]: 0.0
DEBUG [main] - Opening JDBC Connection
DEBUG [main] - Created connection 1463355115.
DEBUG [main] - ==>  Preparing: select * from test where id = ?
DEBUG [main] - ==> Parameters: 1(Integer)
DEBUG [main] - <==      Total: 1
Student{id=1, name='1509690042107_haha_update_update', age=0, sex=0}
DEBUG [main] - Closing JDBC Connection [com.mysql.cj.jdbc.ConnectionImpl@573906eb]
DEBUG [main] - Returned connection 1463355115 to pool.
method: query
DEBUG [main] - Cache Hit Ratio [org.chench.test.mybatis.mapper]: 0.5
Student{id=1, name='1509690042107_haha_update_update', age=0, sex=0}

The log clearly shows a cache hit ratio of 0 for the first query and 0.5 for the second, which was served directly from the cache.
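The value 0.5 is what one would expect if the logged ratio is simply the number of hits divided by the number of lookups made so far in that namespace: one miss, then one hit, gives 1/2. The sketch below works through that arithmetic under this assumption; `HitCounter` and `record()` are hypothetical names, not part of MyBatis.

```java
/** Running hit-ratio arithmetic, assuming ratio = hits / lookups so far. */
final class HitRatioSketch {

    static final class HitCounter {
        private int requests;
        private int hits;

        /** Records one lookup and returns the ratio after it. */
        double record(boolean hit) {
            requests++;
            if (hit) {
                hits++;
            }
            return (double) hits / requests;
        }
    }

    public static void main(String[] args) {
        HitCounter counter = new HitCounter();
        System.out.println(counter.record(false)); // first session misses: 0.0
        System.out.println(counter.record(true));  // second session hits: 0.5
    }
}
```

Under the same assumption, a third query that hits would be reported as 2/3, so the ratio is cumulative rather than a per-query yes or no.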

Summary

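MyBatis caching is controlled by two runtime settings, localCacheScope and cacheEnabled. Why two settings rather than a single, simpler one? Because they control different caching policies: the cache controlled by localCacheScope is scoped to a session and is called the first-level cache, whereas the cache controlled by cacheEnabled is global and is called the second-level cache. The two serve different application needs.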