PostgreSQL relcache在长连接应用中的内存霸占"坑"-阿里云开发者社区

PostgreSQL relcache在长连接应用中的内存霸占"坑"

2016-07-09 7252

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

本文涉及的产品

RDS MySQL Serverless 基础系列，0.5-2RCU 50GB

云原生数据库 PolarDB MySQL 版，Serverless 5000PCU 100GB

云数据库 Redis 版，社区版 2GB

简介： 除了常见的执行计划缓存、数据缓存，PostgreSQL为了提高生成执行计划的效率，还提供了catalog, relation等缓存机制。PostgreSQL 9.5支持的缓存如下 ll src/backend/utils/cache/ attoptcache.c catcache.c

背景

阿里巴巴内部的某业务在使用阿里云RDS PG时，业务线细心的DBA发现，一些长连接占据了大量的内存没有释放。后来找到了复现的方法。使用场景有些极端。

有阿里巴巴内部业务这样的老湿机陪伴的RDS PG，是很靠谱的。

PostgreSQL 缓存

除了常见的执行计划缓存、数据缓存，PostgreSQL为了提高生成执行计划的效率，还提供了catalog, relation等缓存机制。

PostgreSQL 9.5支持的缓存代码如下

ll src/backend/utils/cache/

attoptcache.c  catcache.c  evtcache.c  inval.c  lsyscache.c  plancache.c  relcache.c  relfilenodemap.c  relmapper.c  spccache.c  syscache.c  ts_cache.c  typcache.c

长连接的缓存问题

这些缓存中，某些缓存是不会主动释放的，因此可能导致长连接霸占大量的内存不释放。

通常，长连接的应用，一个连接可能给多个客户端会话使用过，访问到大量catalog的可能性非常大。所以此类的内存占用比是非常高的。

有什么影响呢？
如果长连接很多，而且每个都霸占大量的内存，你的内存很快会被大量的连接耗光，出现OOM是避免不了的。
而实际上，这些内存可能大部分都是relcache的（还有一些其他的），要用到内存时，这些relcache完全可以释放出来，腾出内存空间，而没有必要被持久霸占。

例子

在数据库中存在大量的表，PostgreSQL会缓存当前会话访问过的对象的元数据，如果某个会话从启动以来，对数据库中所有的对象都有过查询的动作，那么这个会话需要将所有的对象定义都缓存起来，会占用较大的内存，占用的内存大小与一共访问了多少站该对象有关。

复现方法(截取自stackoverflow某个问题)，创建大量的对象，访问大量的对象，从而造成会话的relcache等迅速增长。
创建大量的对象
functions :
-- MTDB_destroy

CREATE OR REPLACE FUNCTION public.mtdb_destroy(schemanameprefix character varying)
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
declare
   curs1 cursor(prefix varchar) is select schema_name from information_schema.schemata where schema_name like prefix || '%';
   schemaName varchar(100);
   count integer;
begin
   count := 0;
   open curs1(schemaNamePrefix);
   loop
      fetch curs1 into schemaName;
      if not found then exit; end if;           
      count := count + 1;
      execute 'drop schema ' || schemaName || ' cascade;';
   end loop;  
   close curs1;
   return count;
end $function$;

-- MTDB_Initialize

CREATE OR REPLACE FUNCTION public.mtdb_initialize(schemanameprefix character varying, numberofschemas integer, numberoftablesperschema integer, createviewforeachtable boolean)
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
declare   
   currentSchemaId integer;
   currentTableId integer;
   currentSchemaName varchar(100);
   currentTableName varchar(100);
   currentViewName varchar(100);
   count integer;
begin
   -- clear
   perform MTDB_Destroy(schemaNamePrefix);

   count := 0;
   currentSchemaId := 1;
   loop
      currentSchemaName := schemaNamePrefix || ltrim(currentSchemaId::varchar(10));
      execute 'create schema ' || currentSchemaName;

      currentTableId := 1;
      loop
         currentTableName := currentSchemaName || '.' || 'table' || ltrim(currentTableId::varchar(10));
         execute 'create table ' || currentTableName || ' (f1 integer, f2 integer, f3 varchar(100), f4 varchar(100), f5 varchar(100), f6 varchar(100), f7 boolean, f8 boolean, f9 integer, f10 integer)';
         if (createViewForEachTable = true) then
            currentViewName := currentSchemaName || '.' || 'view' || ltrim(currentTableId::varchar(10));
            execute 'create view ' || currentViewName || ' as ' ||
                     'select t1.* from ' || currentTableName || ' t1 ' ||
             ' inner join ' || currentTableName || ' t2 on (t1.f1 = t2.f1) ' ||
             ' inner join ' || currentTableName || ' t3 on (t2.f2 = t3.f2) ' ||
             ' inner join ' || currentTableName || ' t4 on (t3.f3 = t4.f3) ' ||
             ' inner join ' || currentTableName || ' t5 on (t4.f4 = t5.f4) ' ||
             ' inner join ' || currentTableName || ' t6 on (t5.f5 = t6.f5) ' ||
             ' inner join ' || currentTableName || ' t7 on (t6.f6 = t7.f6) ' ||
             ' inner join ' || currentTableName || ' t8 on (t7.f7 = t8.f7) ' ||
             ' inner join ' || currentTableName || ' t9 on (t8.f8 = t9.f8) ' ||
             ' inner join ' || currentTableName || ' t10 on (t9.f9 = t10.f9) ';                    
         end if;
         currentTableId := currentTableId + 1;
         count := count + 1;
         if (currentTableId > numberOfTablesPerSchema) then exit; end if;
      end loop;   

      currentSchemaId := currentSchemaId + 1;
      if (currentSchemaId > numberOfSchemas) then exit; end if;     
   end loop;
   return count;
END $function$;

在一个会话中访问所有的对象
-- MTDB_RunTests

CREATE OR REPLACE FUNCTION public.mtdb_runtests(schemanameprefix character varying, rounds integer)
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
declare
   curs1 cursor(prefix varchar) is select table_schema || '.' || table_name from information_schema.tables where table_schema like prefix || '%' and table_type = 'VIEW';
   currentViewName varchar(100);
   count integer;
begin
   count := 0;
   loop
      rounds := rounds - 1;
      if (rounds < 0) then exit; end if;

      open curs1(schemaNamePrefix);
      loop
         fetch curs1 into currentViewName;
         if not found then exit; end if;
         execute 'select * from ' || currentViewName;
         count := count + 1;
      end loop;
      close curs1;
   end loop;
   return count;  
end $function$;

test SQL:
prepare :
准备对象

postgres=# select MTDB_Initialize('tenant', 100, 1000, true);

访问对象
session 1 :

postgres=# select MTDB_RunTests('tenant', 1);
 mtdb_runtests 
---------------
        100000
(1 row)

访问对象
session 2 :

postgres=# select MTDB_RunTests('tenant', 1);
 mtdb_runtests 
---------------
        100000
(1 row)

观察内存的占用
memory view :

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+   COMMAND
 2536 digoal    20   0 20.829g 0.016t 1.786g S   0.0 25.7   3:08.20 postgres: postgres postgres [local] idle
 2453 digoal    20   0 6854896 187124 142780 S   0.0  0.3   0:00.68 postgres: postgres postgres [local] idle

smem

  PID User     Command                         Swap      USS      PSS     RSS 
 2536 digoal   postgres: postgres postgres        0 15022132 15535203 16894900 
 2453 digoal   postgres: postgres postgres        0 15022256 15535405 16895100

优化建议

.1. 应用层优化建议
对于长连接，建议空闲一段时间后，自动释放连接。
这样的话，即使因为某些原因一些连接访问了大量的对象，也不至于一直占用这些缓存不释放。
我们可以看到pgpool-II的设计，也考虑到了这一点，它会对空闲的server connection设置阈值，或者设置一个连接的使用生命周期，到了就释放重建。

.2. PostgreSQL内核优化建议
优化relcache的管理，为relcache等缓存提供LRU管理机制，限制总的大小，淘汰不经常访问的对象，同时建议提供SQL语法给用户，允许用户自主的释放cache。

阿里云RDS PG正在对内核进行优化，修正目前社区版本PG存在的这个问题。

参考

https://www.postgresql.org/message-id/flat/20160708012833.1419.89062%40wrigleys.postgresql.org#20160708012833.1419.89062@wrigleys.postgresql.org

Every PostgreSQL session holds system data in own cache. Usually this cache
is pretty small (for significant numbers of users). But can be pretty big
if your catalog is untypically big and you touch almost all objects from
catalog in session. A implementation of this cache is simple - there is not
delete or limits. There is not garabage collector (and issue related to
GC), what is great, but the long sessions on big catalog can be problem.
The solution is simple - close session over some time or over some number
of operations. Then all memory in caches will be released.

Regards 

Pavel

随时欢迎来杭交流PostgreSQL相关技术，记得来之前请与我联系哦。

PostgreSQL relcache在长连接应用中的内存霸占"坑"

背景

PostgreSQL 缓存

长连接的缓存问题

例子

优化建议

参考

数据库

热门文章

最新文章

相关产品

相关课程

相关电子书

相关实验场景

推荐镜像