Apache Pegasus 2.2.0 is a feature release. The change-list is summarized here: https://github.com/apache/incubator-pegasus/issues/696.

Upgrading Notes

  • 2.2.0 can only be upgraded from 2.1.0. ReplicaServers running earlier versions must first be upgraded to 2.1.0 before upgrading to 2.2.0.

New Features

Modification of configs at runtime

Previously, modifying a configuration required a rolling update of all the servers so that they loaded the new config file. This made the service particularly cumbersome to operate.

This feature now allows us to update a config item dynamically through an HTTP API, without service downtime. Please check XiaoMi/rdsn#719, XiaoMi/rdsn#704, and XiaoMi/rdsn#682 for more details.
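As a rough illustration of what such a request might look like, the sketch below only builds the request URL and does not send anything. The `/updateConfig` path, the `key=value` query form, the port, and the config key name are assumptions made for this example and should be checked against the pull requests listed above.

```python
from urllib.parse import urlencode, urlunsplit


def build_update_config_url(host: str, port: int, key: str, value: str,
                            path: str = "/updateConfig") -> str:
    """Build the URL of a runtime config update request.

    The default path and the key=value query form are assumptions
    for illustration, not a confirmed Pegasus API.
    """
    query = urlencode({key: value})  # percent-encodes the key and value
    return urlunsplit(("http", f"{host}:{port}", path, query, ""))


# Hypothetical server address and config key.
url = build_update_config_url("127.0.0.1", 34801, "example_config_key", "2000")
print(url)
```

The resulting URL could then be requested with any HTTP client against each server that should pick up the new value.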

Hot-key detection

A hotspot workload is dangerous because it continuously degrades the availability of a single partition.

In this situation, we formerly needed to search the slow-log (abnormal reads) on servers and find the hot-key, which is usually the key that appears most frequently in the logs.

Now this job can be automated. When a hot partition is confirmed, we can send a "hot-key detection RPC" to the corresponding ReplicaServer.

This RPC triggers analysis of the incoming requests to the partition.

The analysis will end when it finds the hot-key or the process times out.

The related issue: apache/incubator-pegasus#495
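The idea of analyzing incoming requests until a hot-key is found or a timeout expires can be sketched as follows. This is a simplified illustration using plain frequency counting, not the algorithm Pegasus implements. The function name, the thresholds, and the injectable clock are all hypothetical.

```python
import time
from collections import Counter
from typing import Callable, Iterable, Optional


def detect_hot_key(requests: Iterable[str],
                   min_samples: int = 100,
                   dominance: float = 0.5,
                   timeout_s: float = 5.0,
                   clock: Callable[[], float] = time.monotonic) -> Optional[str]:
    """Return a key that dominates the observed requests, or None.

    The analysis stops as soon as one key accounts for at least
    `dominance` of the samples (after `min_samples` requests), or when
    `timeout_s` seconds have elapsed.
    """
    counts: Counter = Counter()
    total = 0
    deadline = clock() + timeout_s
    for key in requests:
        if clock() >= deadline:
            return None  # timed out without finding a hot-key
        counts[key] += 1
        total += 1
        # Only the key just seen can have newly crossed the threshold.
        if total >= min_samples and counts[key] / total >= dominance:
            return key
    return None  # request stream ended without a dominant key


# Hypothetical traffic: 50 distinct keys followed by one heavily read key.
traffic = [f"k{i}" for i in range(50)] + ["hot"] * 150
print(detect_hot_key(traffic))
```

Stopping on either condition keeps the extra per-request work bounded, which matters because the analysis runs on a partition that is already under load.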

Rate-limiting of reads

Pegasus already supported throttling of writes but had no support for throttling of reads. This version introduces QPS-based throttling of reads.
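To make "QPS-based throttling" concrete, here is a minimal token-bucket sketch. It is a generic illustration under assumed names (`QpsThrottle`, `try_acquire`), not the Pegasus implementation. Whether Pegasus delays or rejects throttled reads is not stated here, so this sketch simply reports whether a request fits the budget.

```python
import time
from typing import Callable


class QpsThrottle:
    """Token bucket admitting at most `qps` requests per second on average."""

    def __init__(self, qps: float,
                 clock: Callable[[], float] = time.monotonic):
        if qps <= 0:
            raise ValueError("qps must be positive")
        self.qps = qps
        self.clock = clock
        self.tokens = float(qps)  # start with a full one-second budget
        self.last = clock()

    def try_acquire(self) -> bool:
        """Return True if one more request fits within the QPS budget."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at one second of budget.
        self.tokens = min(self.qps, self.tokens + (now - self.last) * self.qps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
throttle = QpsThrottle(3, clock=lambda: fake_now[0])
print([throttle.try_acquire() for _ in range(5)])
```

With a budget of 3 QPS and no time passing, the first three calls are admitted and the rest are refused until the clock advances.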

Optimizations and improvements

Support HDFS as a remote storage provider for Bulk-Load/Backup/Restore

In most of our bulk-load use cases, HDFS is the default remote storage for the Spark-generated files.

This version adds support for HDFS, along with rate limiting of HDFS downloads and uploads.

HTTP APIs of the Bulk-Load procedure

Bulk Load in version 2.1.0 was relatively primitive. This version makes the feature full-featured, with a list of APIs that help automate the procedure.

(At Xiaomi, we have developed a tool called BulkLoad Manager that simplifies task management. It uses the newly exposed APIs. We plan to open-source this project soon.)

Fixed Issues

  • Starting with this version, we support various C++ compilers, including:

    • GCC 5.4.0 (ubuntu1604)
    • GCC 7.5.0 (ubuntu1804)
    • GCC 9.4.0 (ubuntu2004)
    • Clang 9
    • Clang 10

    The continuous testing is here: https://github.com/pegasus-kv/pegasus-docker

Known Issues




Apache Pegasus 2.2.0 is a feature release. All changes are summarized at: https://github.com/apache/incubator-pegasus/issues/696

Upgrading Notes

  • 2.2.0 can only be upgraded from 2.1.0. ReplicaServers running earlier versions must first be upgraded to 2.1.0 before upgrading to 2.2.0.

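New Features

Dynamic configuration modification

To modify a config item, we previously had to restart every server node in the cluster so that it loaded the new config file. This caused considerable trouble for service operations.

With this feature, we can now modify config items dynamically through an HTTP API without affecting the service. See XiaoMi/rdsn#719, XiaoMi/rdsn#704, and XiaoMi/rdsn#682 for more details.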

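Hot-key detection

Hotspot traffic is a very dangerous problem because it continuously affects the availability of a partition.

Previously, when we ran into this problem, we searched the slow-query logs (abnormal reads) on the server nodes and looked for the key that appeared most frequently, which is usually the hot-key.

This process can now be automated. Once we confirm that a partition is a hot partition, we can send a "hotspot detection RPC" to the corresponding ReplicaServer.

This RPC triggers hotspot analysis of the requests to that partition. The analysis ends once the hot-key is found or the process times out.

Related issue: apache/incubator-pegasus#495

Usage documentation: http://pegasus.apache.org/administration/hotspot-detection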

Rate-limiting of reads

Pegasus supported write throttling but not read throttling. In this version we introduce QPS-based read throttling.

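Optimizations and improvements

Support HDFS for BulkLoad/Backup/Restore

In the vast majority of our BulkLoad use cases, HDFS is the default storage for Spark-generated files. This version provides support for HDFS, as well as rate limiting of HDFS file uploads and downloads.

HTTP APIs of the Bulk Load procedure

Bulk Load in version 2.1.0 was relatively rudimentary. In this version we have made the feature more complete by providing a series of APIs that can be used to automate the BulkLoad procedure.

(At Xiaomi, we have developed the BulkLoad Manager operations tool, which simplifies BulkLoad task management. It uses the new APIs introduced in this version. We plan to open-source the tool in the near future.)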

