

Stolon is composed of 3 main components:

  • keeper: it manages a PostgreSQL instance converging to the clusterview computed by the leader sentinel.
  • sentinel: it discovers and monitors keepers and proxies and computes the optimal clusterview.
  • proxy: the client’s access point. It enforces connections to the right PostgreSQL master and forcibly closes connections to old masters.

For more details and requirements see Stolon Architecture and Requirements

Stolon architecture

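The above is the description from the Stolon project's README. As you can see, it is essentially similar to the Redis Sentinel approach: both use the sentinel pattern.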



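In addition, so that clients can access the PostgreSQL cluster transparently, Stolon provides a proxy component that handles client requests and routes them to the cluster's master node. This is an improvement over the Redis Sentinel approach, because client drivers do not need dedicated support for sentinel mode.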


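The official documentation describes how to deploy a Stolon cluster on Kubernetes. It only takes a few YAML files, one set for each of the 3 components, but that is still somewhat tedious. Fortunately, I found a corresponding Helm chart: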


git clone https://github.com/lwolf/stolon-chart
cd stolon-chart/stolon
helm install --namespace test --name stolon . --set store.backend=kubernetes --set persistence.enabled=true --set persistence.storageClassName=defaultScName


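After that, other pods inside the Kubernetes cluster can reach it by configuring the FQDN of the stolon-proxy service. For example, the Stolon cluster deployed with the command above can be accessed at the following address: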
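The address follows the usual Kubernetes Service DNS pattern, <service>.<namespace>.svc.<cluster-domain>. Below is a minimal sketch of how it is put together, assuming the default cluster domain cluster.local. The helper service_fqdn is hypothetical, and stolon-proxy is only a placeholder: the Service name the chart actually creates depends on the release name, so look it up first.

```shell
#!/bin/sh
# Build the in-cluster DNS name of a Kubernetes Service.
# Assumption: the cluster uses the default DNS domain "cluster.local".
service_fqdn() {
    # $1 = Service name, $2 = namespace
    printf '%s.%s.svc.cluster.local\n' "$1" "$2"
}

# "stolon-proxy" is a placeholder. Find the real Service name with:
#   kubectl get svc --namespace test
service_fqdn stolon-proxy test
# prints: stolon-proxy.test.svc.cluster.local
```

A client pod would then use that name as its PostgreSQL host, on whatever port the proxy Service exposes.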




In a single node setup we can kill the current master keeper pod but usually the statefulset controller will recreate a new pod before the sentinel declares it as failed. To avoid the restart we’ll first remove the statefulset without removing the pod and then kill the master keeper pod. The persistent volume will be kept so we’ll be able to recreate the statefulset and the missing pods will be recreated with the previous data.

kubectl delete statefulset stolon-keeper --cascade=false
kubectl delete pod stolon-keeper-0

If you look at the leader sentinel log, you will see that after a few seconds it declares the master keeper unhealthy and elects the other keeper as the new master:

no keeper info available db=cb96f42d keeper=keeper0
no keeper info available db=cb96f42d keeper=keeper0
master db is failed db=cb96f42d keeper=keeper0
trying to find a standby to replace failed master
electing db as the new master db=087ce88a keeper=keeper1
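On a busy cluster the sentinel log contains much more than these lines. The following is a small sketch for picking out just the failover events, matching on the two messages shown above. The function name failover_events is hypothetical, and the sentinel pod name in the usage comment is a placeholder.

```shell
#!/bin/sh
# Keep only the log lines that mark the failover itself:
# the master being declared failed and the new master being elected.
failover_events() {
    grep -E 'master db is failed|electing db as the new master'
}

# Demo on the sample log lines from this article.
failover_events <<'EOF'
no keeper info available db=cb96f42d keeper=keeper0
master db is failed db=cb96f42d keeper=keeper0
trying to find a standby to replace failed master
electing db as the new master db=087ce88a keeper=keeper1
EOF

# Against a live cluster (the pod name is a placeholder):
#   kubectl logs --namespace test <leader-sentinel-pod> | failover_events
```

Note that grep exits with a non-zero status when no line matches, which is convenient if you want a script to detect that no failover has happened yet.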

Now, inside the previous psql session you can redo the last select. On the first attempt psql reports that the connection was closed and then reconnects successfully; the second attempt returns the data:

postgres=# select * from test;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
postgres=# select * from test;
 id | value
  1 | value1
(1 row)




  1. https://github.com/sorintlab/stolon/
  2. https://github.com/sorintlab/stolon/blob/master/examples/kubernetes/README.md
  3. https://github.com/lwolf/stolon-chart/tree/master/stolon