
Apache ShardingSphere Support mainstream database metadata table query

Apache ShardingSphere

Apache ShardingSphere is positioned as a Database Plus, and aims to build a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layers, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying database fragmentation.

Page: https://shardingsphere.apache.org
Github: https://github.com/apache/shardingsphere

Background

ShardingSphere has designed its own metadata database to simulate metadata queries for various databases.

More details:

https://github.com/apache/shardingsphere/issues/21268
https://github.com/apache/shardingsphere/issues/22052

Task

  • Support PostgreSQL and openGauss `\d tableName`
  • Support PostgreSQL and openGauss `\d+`
  • Support PostgreSQL and openGauss `\d+ tableName`
  • Support PostgreSQL and openGauss `\l`
  • Support query for MySQL metadata `TABLES`
  • Support query for MySQL metadata `COLUMNS`
  • Support query for MySQL metadata `SCHEMATA`
  • Support query for MySQL metadata `ENGINES`
  • Support query for MySQL metadata `FILES`
  • Support query for MySQL metadata `VIEWS`

Note: these pull requests can serve as good examples.

https://github.com/apache/shardingsphere/pull/22053
https://github.com/apache/shardingsphere/pull/22057/
https://github.com/apache/shardingsphere/pull/22166/
https://github.com/apache/shardingsphere/pull/22182
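To illustrate the idea of simulating metadata queries, here is a minimal, hypothetical sketch (in Go for brevity; ShardingSphere itself is written in Java). A dialect table such as MySQL's information_schema.TABLES is answered from ShardingSphere's own logical metadata rather than from any physical database. All names below are illustrative, not the project's actual API.

```go
package main

import (
	"fmt"
	"strings"
)

// logicalTable is a hypothetical, simplified stand-in for ShardingSphere's
// internal table metadata.
type logicalTable struct {
	Schema string
	Name   string
}

// queryInformationSchemaTables simulates a
// `SELECT TABLE_SCHEMA, TABLE_NAME FROM information_schema.TABLES WHERE TABLE_SCHEMA = ?`
// against the federation's own metadata instead of a physical database.
func queryInformationSchemaTables(tables []logicalTable, schema string) [][]string {
	var rows [][]string
	for _, t := range tables {
		// information_schema lookups are case-insensitive on schema names here.
		if strings.EqualFold(t.Schema, schema) {
			rows = append(rows, []string{t.Schema, t.Name})
		}
	}
	return rows
}

func main() {
	meta := []logicalTable{
		{"sharding_db", "t_order"},
		{"sharding_db", "t_order_item"},
		{"other_db", "t_user"},
	}
	for _, row := range queryInformationSchemaTables(meta, "sharding_db") {
		fmt.Println(row[0], row[1])
	}
}
```

The point of the sketch is only the dispatch idea: the proxy recognizes a metadata table name and serves rows from its own catalog.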

Relevant Skills

  •  Master the Java language
  •  Have a basic understanding of ZooKeeper
  •  Be familiar with MySQL/PostgreSQL SQL


Mentor

Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

Zhengqiang Duan, PMC of Apache ShardingSphere, duanzhengqiang@apache.org

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Chuxin Chen, mail: tuichenchuxin (at) apache.org
Project Devs, mail: dev (at) shardingsphere.apache.org

SkyWalking

[GSOC] [SkyWalking] AIOps Log clustering with Flink (Algorithm Optimization)

Apache SkyWalking is an application performance monitoring tool for distributed systems, especially designed for microservices, cloud native and container-based (Kubernetes) architectures. This year we will proceed with the log clustering implementation with a revised architecture, and this task will require the student to focus on algorithm optimization for the clustering technique.

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Yihao Chen, mail: yihaochen (at) apache.org
Project Devs, mail: dev (at) skywalking.apache.org

[GSOC] [SkyWalking] Python Agent Performance Enhancement Plan

Apache SkyWalking is an application performance monitoring tool for distributed systems, especially designed for microservices, cloud native and container-based (Kubernetes) architectures. This task is about enhancing Python agent performance; the tracking issue can be seen here: https://github.com/apache/skywalking/issues/10408

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Yihao Chen, mail: yihaochen (at) apache.org
Project Devs, mail: dev (at) skywalking.apache.org

[GSOC] [SkyWalking] AIOps Log clustering with Flink (Flink Integration)

Apache SkyWalking is an application performance monitoring tool for distributed systems, especially designed for microservices, cloud native and container-based (Kubernetes) architectures. This year we will proceed with the log clustering implementation with a revised architecture, and this task will require the student to focus on Flink and its integration with the SkyWalking OAP.

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Yihao Chen, mail: yihaochen (at) apache.org
Project Devs, mail: dev (at) skywalking.apache.org

Apache ShardingSphere Write a converter to generate DistSQL

Apache ShardingSphere

Apache ShardingSphere is positioned as a Database Plus, and aims to build a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layers, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying database fragmentation.

Page: https://shardingsphere.apache.org/
Github: https://github.com/apache/shardingsphere 

Background

The shardingsphere-on-cloud operator defines Golang structs for ShardingSphere rules, such as encryption and sharding rules. To apply these rules to a ShardingSphere cluster, the rule definitions need to be converted into DistSQL statements.

Task

Write a converter that generates DistSQL from the rule structs defined in the operator:

  • [ ] Generate DistSQL according to the Golang struct `EncryptionRule`
  • [ ] Generate DistSQL according to the Golang struct `ShardingRule`
  • [ ] Generate DistSQL according to the Golang struct `ReadWriteSplittingRule`
  • [ ] Generate DistSQL according to the Golang struct `MaskRule`
  • [ ] Generate DistSQL according to the Golang struct `ShadowRule`

Relevant Skills

1. Master the Go language and the Ginkgo test framework
2. Have a basic understanding of Apache ShardingSphere concepts and DistSQL

Target files

DistSQL Converter - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/distsql/converter.go, etc.

Example

A struct defined as below:

```golang
type EncryptRule struct{}

func (t EncryptRule) ToDistSQL() string { return "" }
```

Invoking ToDistSQL() will generate a DistSQL statement for the EncryptRule, like:

```SQL
CREATE ENCRYPT RULE t_encrypt (....
```

Mentor

Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Liyao Miao, mail: miaoliyao (at) apache.org
Project Devs, mail: dev (at) shardingsphere.apache.org

[GSOC] [SkyWalking] Self-Observability of the query subsystem in BanyanDB

Background

SkyWalking BanyanDB is an observability database that aims to ingest, analyze, and store metrics, tracing, and logging data.

Objectives

  1. Support EXPLAIN[1] for both measure query and stream query
  2. Add self-observability including trace and metrics for query subsystem
  3. Support EXPLAIN in the client SDK & CLI and add query plan visualization in the UI

[1]: EXPLAIN in MySQL
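As a rough illustration of objective 1, a plan node that can both execute and explain itself might look like the following sketch. The names and interfaces are hypothetical; BanyanDB's actual planner differs.

```go
package main

import "fmt"

// plan is a hypothetical sketch of a query-plan node that can explain itself.
type plan interface {
	Execute() []string
	Explain(indent int) string
}

// scan is a leaf node standing in for a series scan.
type scan struct{ series string }

func (s scan) Execute() []string { return []string{s.series} }
func (s scan) Explain(indent int) string {
	return fmt.Sprintf("%*sSeriesScan(%s)", indent, "", s.series)
}

// limit truncates its child's output; its Explain output nests the child's,
// so the whole tree prints as an indented plan, EXPLAIN-style.
type limit struct {
	n     int
	child plan
}

func (l limit) Execute() []string {
	rows := l.child.Execute()
	if len(rows) > l.n {
		rows = rows[:l.n]
	}
	return rows
}
func (l limit) Explain(indent int) string {
	return fmt.Sprintf("%*sLimit(%d)\n%s", indent, "", l.n, l.child.Explain(indent+2))
}

func main() {
	p := limit{n: 1, child: scan{series: "service_cpm"}}
	fmt.Println(p.Explain(0))
}
```

The same Explain tree is what a client SDK or UI could render as a query-plan visualization.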

Recommended Skills

  1. Familiar with Go
  2. Have a basic understanding of database query engine
  3. Have an experience of Apache SkyWalking or other APMs

Mentor

  • Mentor: Jiajing Lu, Apache SkyWalking PMC, lujiajing@apache.org
  • Mentor: Hongtao Gao, Apache SkyWalking PMC, Apache ShardingSphere PMC, hanahmily@apache.org
  • Mailing List: dev@skywalking.apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Jiajing Lu, mail: lujiajing (at) apache.org
    Project Devs, mail: dev (at) skywalking.apache.org

    [GSOC] [SkyWalking] Unify query planner and executor in BanyanDB

    Background

    SkyWalking BanyanDB is an observability database that aims to ingest, analyze, and store metrics, tracing, and logging data.

    Objectives

    1. Fully unify/merge the query planner and executor for Measure and TopN

    Recommended Skills

    1. Familiar with Go
    2. Have a basic understanding of database query engine
    3. Have an experience of Apache SkyWalking

    Mentor

    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Jiajing Lu, mail: lujiajing (at) apache.org
    Project Devs, mail: dev (at) skywalking.apache.org

    Apache ShardingSphere Introduce new CRD as StorageNode for better usability

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims to build a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layers, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying database fragmentation.

    Page: https://shardingsphere.apache.org/
    Github: https://github.com/apache/shardingsphere 

    Background

    There is a proposal about the new CRDs Cluster and ComputeNode, as below:

    • #167
    • #166

    Currently we try to promote StorageNode as major CRD to represent a set of storage units for ShardingSphere.

    Task

    The elementary task is that the storage node controller could manage the lifecycle of a set of storage units, like PostgreSQL, in kubernetes.

    We don't hope to create another wheel like pg-operator. So consider using a predefined parameter group to generate the target CRD.

    • [ ] Create a PostgreSQL cluster when a StorageNode with PostgreSQL parameters is created
    • [ ] Update the PostgreSQL cluster when the StorageNode is updated
    • [ ] Delete the PostgreSQL cluster when the StorageNode is deleted. Notice this may need a deletion strategy.
    • [ ] Reconcile the StorageNode according to the status of the PostgreSQL cluster.
    • [ ] The status of the StorageNode would be consumed by common storage-unit-related DistSQLs
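The checklist above can be pictured as a small state machine. The following framework-free sketch (all names hypothetical; the real controller uses controller-runtime in storagenode_controller.go) maps the observed state of a StorageNode to the next action:

```go
package main

import "fmt"

// storageNode is a hypothetical, simplified stand-in for the StorageNode CRD.
type storageNode struct {
	Name     string
	Deleted  bool   // deletion timestamp is set
	Observed string // phase reported by the backing PostgreSQL cluster
}

type action string

const (
	actionCreate action = "create-postgres-cluster"
	actionUpdate action = "update-postgres-cluster"
	actionDelete action = "delete-postgres-cluster"
	actionWait   action = "requeue-until-ready"
	actionNone   action = "none"
)

// reconcile maps the observed state of a StorageNode to the next action,
// mirroring the task checklist above.
func reconcile(node storageNode, clusterExists bool) action {
	switch {
	case node.Deleted && clusterExists:
		return actionDelete // a deletion strategy would be applied here
	case node.Deleted:
		return actionNone
	case !clusterExists:
		return actionCreate
	case node.Observed != "Ready":
		return actionWait
	default:
		return actionUpdate
	}
}

func main() {
	// A freshly created StorageNode with no backing cluster yet.
	fmt.Println(reconcile(storageNode{Name: "pg-node"}, false))
}
```

In the real operator each action would translate into creating, patching, or deleting the generated PostgreSQL CRD and requeueing the request.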

    Relevant Skills

    1. Master Go language, Ginkgo test framework
    2. Have a basic understanding of Apache ShardingSphere Concepts
    3. Be familiar with Kubernetes Operator, kubebuilder framework

    Target files

    StorageNode Controller - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/controllers/storagenode_controller.go


    Mentor

    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    [GSOC][SkyWalking] Add Terraform provider for Apache SkyWalking

    Now the deployment methods for SkyWalking are limited: we only have a Helm Chart for users to deploy in Kubernetes; users that are not on Kubernetes have to do all the housekeeping work to set up SkyWalking on, for example, VMs.

    This issue aims to add a Terraform provider so that users can conveniently spin up a cluster for demonstration or testing. We should evolve the provider, allow users to customize it to their needs, and finally let them use it in their production environments.

    In this task, we will mainly focus on support for AWS. In the Terraform provider, users provide their access key / secret key, and the provider does the rest: create VMs, create the database/OpenSearch or RDS, download the SkyWalking tarballs, configure SkyWalking, start the SkyWalking components (OAP/UI), create public IPs/domain names, etc.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Zhenxu Ke, mail: kezhenxu94 (at) apache.org
    Project Devs, mail: dev (at) skywalking.apache.org

    Apache ShardingSphere Introduce JVM chaos to ShardingSphere

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims to build a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layers, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying database fragmentation.

    Page: https://shardingsphere.apache.org/
    Github: https://github.com/apache/shardingsphere

    Background

    There are proposals about the background of Chaos Engineering and a generic controller for ShardingSphereChaos, as below:

    • Introduce ChaosEngineering for ShardingSphere #32
    • [GSoC 2023] Introduce New CRD ShardingSphereChaos #272

    The ShardingSphereChaos controller is aimed at different chaos tests. This JVMChaos is an important one.

    Task

    Write several scripts to implement different JVMChaos for main features of ShardingSphere. The specific case list is as follows.

    • Add scripts injecting chaos to DataSharding
    • Add scripts injecting chaos to ReadWritingSplitting
    • Add scripts injecting chaos to DatabaseDiscovery
    • Add scripts injecting chaos to Encryption
    • Add scripts injecting chaos to Mask
    • Add scripts injecting chaos to Shadow
      Basically, these scripts will cause unexpected behaviour while executing the related DistSQL.

    Relevant Skills

    • Master the Go language and the Ginkgo test framework
    • Have a deep understanding of Apache ShardingSphere concepts and practices.
    • Be familiar with JVM bytecode instrumentation mechanisms like Byteman and ByteBuddy.

    Target files

    JVMChaos Scripts - https://github.com/apache/shardingsphere-on-cloud/chaos/jvmchaos/scripts/

    Mentor
    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org
    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    ShenYu

    Apache ShenYu Gsoc 2023 - Support for Kubernetes Service Discovery

    Background

    Apache ShenYu is a Java native API Gateway for service proxy, protocol conversion and API governance. Currently, ShenYu has good usability and performance in microservice scenarios. However, ShenYu's support for Kubernetes is still relatively weak.

    Tasks

    1. Support the registration of microservices deployed in K8s Pod to shenyu-admin and use K8s as the register center.
    2. Discuss with mentors, and complete the requirements design and technical design of Shenyu K8s Register Center.
    3. Complete the initial version of Shenyu K8s Register Center.
    4. Complete the CI test of Shenyu K8s Register Center, verify the correctness of the code.
    5. Write the necessary documentation, deployment guides, and instructions for users to connect microservices running inside the K8s Pod to ShenYu

    Relevant Skills

    1. Know the use of Apache ShenYu, especially the register center
    2. Familiar with Java and Golang
    3. Familiar with Kubernetes and can use Java or Golang to develop

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Yonglun Zhang, mail: zhangyonglun (at) apache.org
    Project Devs, mail: dev (at) shenyu.apache.org

    Apache ShardingSphere Introduce New CRD ShardingSphereChaos

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims to build a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layers, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying database fragmentation.

    Page: https://shardingsphere.apache.org/
    Github: https://github.com/apache/shardingsphere 

    Background

    There is a proposal about the background of Chaos Engineering: Introduce ChaosEngineering for ShardingSphere #32.

    The ShardingSphereChaos controller is aimed at different chaos tests.

    Task

    Propose a generic controller for ShardingSphereChaos, which reconciles the CRD ShardingSphereChaos and prepares, executes, and verifies tests.

    • [ ] Support common ShardingSphere features, prepare test rules and dataset
    • [ ] Generating chaos type according to the backend implementation
    • [ ] Verify testing result with DistSQL or other tools

    Relevant Skills

    1. Master Go language, Ginkgo test framework
    2. Have a deep understanding of Apache ShardingSphere concepts and practices.
    3. Kubernetes operator pattern, kube-builder 

    Target files

    ShardingSphereChaos Controller - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/controllers/chaos_controller.go, etc.


    Mentor

    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org


    Apache ShenYu Gsoc 2023 - Design and implement shenyu ingress-controller in k8s

    Background

    Apache ShenYu is a Java native API Gateway for service proxy, protocol conversion and API governance. Currently, ShenYu has good usability and performance in microservice scenarios. However, ShenYu's support for Kubernetes is still relatively weak.

    Tasks

    1. Discuss with mentors, and complete the requirements design and technical design of shenyu-ingress-controller.
    2. Complete the initial version of shenyu-ingress-controller, implement the reconcile of the k8s ingress api, and make ShenYu the ingress gateway of k8s.
    3. Complete the CI test of shenyu-ingress-controller, verify the correctness of the code.

    Relevant Skills

    1. Know the use of Apache ShenYu
    2. Familiar with Java and Golang
    3. Familiar with Kubernetes and can use java or golang to develop Kubernetes Controller

    Description

    Issues : https://github.com/apache/shenyu/issues/4438
    website : https://shenyu.apache.org/

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Yu Xiao, mail: xiaoyu (at) apache.org
    Project Devs, mail: dev (at) shenyu.apache.org

    Apache ShenYu Gsoc 2023 - Design license scanning function

    Background

    At present, shenyu needs to manually check whether the license is correct one by one when releasing the version.

    Tasks

    1. Discuss with the mentor to complete the requirement design and technical design of license scanning.
    2. Finish the initial version of license scanning.
    3. Complete the corresponding tests.

    Relevant Skills

    1. Familiar with Java.
    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    SiYing Zheng, mail: impactcn (at) apache.org
    Project Devs, mail: dev (at) shenyu.apache.org


    Apache ShenYu Gsoc 2023 - Shenyu-Admin Internationalization

    Background

    Shenyu is a native API gateway for service proxy, protocol translation and API governance. The API can be managed and maintained through Shenyu-admin, which supports internationalization in Chinese and English. Unfortunately, Shenyu-admin is only internationalized on the front end; the message prompts returned by the back-end interface are still in English. Therefore, we need to implement internationalization support for the back-end interface. This will lay a good foundation for ShenYu to move towards more language support.

    Relevant skills

    • Related skills spring resources
    • Spring Internationalization
    • Front-end react framework

    API reference

                java.util.Locale;
                org.springframework.context.MessageSource;
                org.springframework.context.support.ResourceBundleMessageSource; 

    Interface effect example

                ## zh request example
                POST http://localhost:9095/plugin
                Content-Type: application/json
                Location: cn-zh
                X-Access-Token: xxx
                {
                "name": "test-create-plugin",
                "role": "test-create-plugin",
                "enabled": true,
                "sort": 100
                }
                Respone
                {
                "code": 600,
                "message": "未登录"
                }
                
                ### en request example
                POST http://localhost:9095/plugin
                Content-Type: application/json
                Location: en
                X-Access-Token: xxx
                {
                "name": "test-create-plugin",
                "role": "test-create-plugin",
                "enabled": true,
                "sort": 100
                }
                Respone
                {
                "code": 600,
                "message": "token is error"
                } 


    Task List

    • Discuss with the mentor how to implement internationalization of the shenyu-admin backend
    • Translate the prompt messages
    • Integrate with the front-end internationalization: obtain the client's region information through the HTTP protocol, and return messages in the language of the corresponding region.
    • Leave extension points for internationalization in other languages, to facilitate localization by subsequent users.
    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Keguo Li, mail: likeguo (at) apache.org
    Project Devs, mail: dev (at) shenyu.apache.org

    Apache ShenYu Gsoc 2023 - ShenYu End-To-End SpringCloud plugin test case

    Background:

    Shenyu is a native API gateway for service proxy, protocol translation and API governance, but ShenYu lacks end-to-end tests.

    Relevant skills:

    1. Understand the architecture of ShenYu

    2. Understand SpringCloud micro-services and the ShenYu SpringCloud proxy plugin.

    3. Understand the ShenYu e2e framework and architecture.

    How to code

    1. Please refer to org.apache.shenyu.e2e.testcase.plugin.DividePluginCases

    How to test

    1. Start shenyu-admin in Docker

    2. Start shenyu-bootstrap in Docker

    3. Run the test case org.apache.shenyu.e2e.testcase.plugin.PluginsTest#testDivide

    Task List

    1. Develop e2e tests for the SpringCloud plugin.

    2. Write shenyu e2e SpringCloud plugin documentation on the shenyu-website.

    3. Refactor the existing plugin test cases.


    Links:

    website: https://shenyu.apache.org/

    issues: https://github.com/apache/shenyu/issues/4474


    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Fengen He, mail: hefengen (at) apache.org
    Project Devs, mail: dev (at) shenyu.apache.org

    TrafficControl

    GSOC Varnish Cache support in Apache Traffic Control

    Background
    Apache Traffic Control is a Content Delivery Network (CDN) control plane for large scale content distribution.

    Traffic Control currently requires Apache Traffic Server as the underlying cache. Help us expand the scope by integrating with the very popular Varnish Cache.

    There are multiple aspects to this project:

    • Configuration Generation: Write software to build Varnish configuration files (VCL). This code will be implemented in our Traffic Ops and cache client side utilities, both written in Go.
    • Health Monitoring: Implement monitoring of the Varnish cache health and performance. This code will run both in the Traffic Monitor component and within Varnish. Traffic Monitor is written in Go and Varnish is written in C.
    • Testing: Adding automated tests for new code

    Skills:

    • Proficiency in Go is required
    • A basic knowledge of HTTP and caching is preferred, but not required for this project.
    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Eric Friedrich, mail: friede (at) apache.org
    Project Devs, mail: dev (at) trafficcontrol.apache.org

    Add server indicator if a server is a cache

    issues: https://github.com/apache/trafficcontrol/issues/7076

    Difficulty: Trivial
    Project size: ~175 hour (medium)
    Potential mentors:
    Brennan Fieck, mail: ocket8888 (at) apache.org
    Project Devs, mail: dev (at) trafficcontrol.apache.org


    Doris

    [GSoC][Doris]Page Cache Improvement

    Apache Doris
    Apache Doris is a real-time analytical database based on MPP architecture. As a unified platform that supports multiple data processing scenarios, it ensures high performance for low-latency and high-throughput queries, allows for easy federated queries on data lakes, and supports various data ingestion methods.
    Page: https://doris.apache.org

    Github: https://github.com/apache/doris

    Background

    Apache Doris accelerates high-concurrency queries utilizing page cache, where the decompressed data is stored.
    Currently, the page cache in Apache Doris uses a simple LRU algorithm, which reveals a few problems: 

    • Hot data will be phased out in large queries
    • The page cache configuration is immutable and does not support GC.

    Task

    • Phase One: Identify the impacts on queries when the decompressed data is stored in memory and SSD, respectively, and then determine whether full page cache is required.
    • Phase Two: Improve the cache strategy for Apache Doris based on the results from Phase One.
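To see why a plain LRU cache phases out hot data during a large query, consider this minimal sketch (illustrative Java only; Doris's page cache is C++, and the tiny capacity here is hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU page cache: accessOrder=true evicts the least recently used entry.
class LruPageCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruPageCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}

public class LruDemo {
    public static void main(String[] args) {
        LruPageCache<String, byte[]> cache = new LruPageCache<>(3);
        cache.put("hot-page", new byte[8]); // frequently accessed page
        cache.get("hot-page");
        // A single large scan touches many cold pages...
        cache.put("scan-1", new byte[8]);
        cache.put("scan-2", new byte[8]);
        cache.put("scan-3", new byte[8]);
        // ...and the hot page is gone, even though it was accessed recently and often.
        System.out.println(cache.containsKey("hot-page")); // false
    }
}
```

Scan-resistant policies (e.g. segmented LRU or frequency-aware eviction) are the kind of alternative Phase Two could evaluate against this baseline.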

    Learning Material

    Page: https://doris.apache.org
    Github: https://github.com/apache/doris

    Mentor

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Zhijing Lu, mail: luzhijing (at) apache.org
    Project Devs, mail: dev (at) doris.apache.org

    ...

    [GSoC][Beam] Build out Beam Machine Learning Use Cases

    Today, you can do all sorts of Machine Learning using Apache Beam (https://beam.apache.org/documentation/ml/overview/).
     
    Many of our users, however, have a hard time getting started with ML and understanding how Beam can be applied to their day to day work. The goal of this project is to build out a series of Beam pipelines as Jupyter Notebooks demonstrating real world ML use cases, from NLP to image recognition to using large language models. As you go, there may be bugs or friction points as well which will provide opportunities to contribute back to Beam's core ML libraries.

    Mentor for this will be Danny McCormick


    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Pablo Estrada, mail: pabloem (at) apache.org
    Project Devs, mail: dev (at) beam.apache.org

    [GSoC][Beam] Advancing the Rust SDK on Beam

    Beam has an experimental, ongoing implementation for a Rust SDK.

    This project involves advancing that implementation and making sure it's compliant with Beam standards.

    Good resource materials:

    This project is large.
    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Pablo Estrada, mail: pabloem (at) apache.org
    Project Devs, mail: dev (at) beam.apache.org

    [GSoC][Beam] Advancing the Beam-on-Ray runner

    There is a community effort to build a Beam runner to run Beam pipelines on top of Ray: https://github.com/ray-project/ray_beam_runner/

    This involves pushing that project forward. It will require writing lots of Python code, and specifically going through the list of issues (https://github.com/ray-project/ray_beam_runner/issues) and solving as many of them as possible to make sure the runner is compliant.

    Good resource docs:

    This project is large.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Pablo Estrada, mail: pabloem (at) apache.org
    Project Devs, mail: dev (at) beam.apache.org


    [GSoC][Beam] An IntelliJ plugin to develop Apache Beam pipelines and the Apache Beam SDKs

    Beam library developers and Beam users would appreciate this : )

    This project involves prototyping a few different solutions, so it will be large.
    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Pablo Estrada, mail: pabloem (at) apache.org
    Project Devs, mail: dev (at) beam.apache.org

    ...


    Comdev GSOC

    [GSoC][Airflow] Automation for PMC

    This is a project to implement a tool for PMC task automation.


    This is a large project.


    Mentor will be aizhamal ,

    Difficulty: Major
    Project size: ~350 hour (large)

    [GSoC][Doris] Supports BigQuery/Apache Kudu/Apache Cassandra/Apache Druid in Federated Queries

    Apache Doris
    Apache Doris is a real-time analytical database based on MPP architecture. As a unified platform that supports multiple data processing scenarios, it ensures high performance for low-latency and high-throughput queries, allows for easy federated queries on data lakes, and supports various data ingestion methods.
    Page: https://doris.apache.org
    Github: https://github.com/apache/doris

    Background

    Apache Doris supports acceleration of queries on external data sources to meet users' needs for federated queries and analysis.
    Currently, Apache Doris supports multiple external catalogs including those from Hive, Iceberg, Hudi, and JDBC. Developers can connect more data sources to Apache Doris based on a unified framework.

    Task
    Phase One:

    • Get familiar with the Multi-Catalog structure of Apache Doris, including the metadata synchronization mechanism in FE and the data reading mechanism of BE.
    • Investigate how metadata should be acquired and how data access works regarding the picked data source(s); produce the corresponding design documentation.

    Phase Two:

    • Develop connections to the picked data source(s) and implement access to metadata and data.

    Learning Material

    Page: https://doris.apache.org
    Github: https://github.com/apache/doris

    Mentor

  • Mentor: Mingyu Chen, Apache Doris PMC Member & Committer, morningman@apache.org
  • Mentor: Calvin Kirs, Apache Geode PMC & Committer, Kirs@apache.org
  • Mailing List: dev@doris.apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Zhijing Lu, mail: luzhijing (at) apache.org
    Project Devs, mail: dev (at) doris.apache.org

    [GSoC][Teaclave (incubating)] Data Privacy Policy Definition and Function Verification

    Background

    The Apache Teaclave (incubating) is a cutting-edge solution for confidential computing, providing Function-as-a-Service (FaaS) capabilities that enable the decoupling of data and function providers. Despite its impressive functionality and security features, Teaclave currently lacks a mechanism for data providers to enforce policies on the data they upload. For example, data providers may wish to restrict access to certain columns of data for third-party function providers. Open Policy Agent (OPA) offers flexible control over service behavior and has been widely adopted by the cloud-native community. If Teaclave were to integrate OPA, data providers could apply policies to their data, enhancing Teaclave's functionality.

    Another potential security loophole in Teaclave is the absence of a means to verify the expected behavior of a function. This gap leaves the system vulnerable to exploitation by malicious actors. Fortunately, most of Teaclave's interfaces can be reused, with the exception of the function uploading phase, which may require an overhaul to address this issue.

    Overall, the integration of OPA and the addition of a function verification mechanism would make Teaclave an even more robust and secure solution for confidential computing.

    Benefits

    If this proposal moves on smoothly, new functionality will be added to the Teaclave project that enables the verification of the function behavior that it strictly conforms to a prescribed policy.

    Deliverables

    • Milestones: Basic policies (e.g., addition, subtraction) of the data can be verified by Teaclave; Complex policies can be verified.
    • Components: Verifier for the function code; Policy language adapters (adapt policy language to verifier); Policy language parser; Function source code converter (append policies to the functions).
    • Documentation: The internal working mechanism of the verification; How to write policies for the data.
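As a toy illustration of the first milestone (verifying that a function only performs operations a data policy allows), a whitelist check could look like this. This is a hypothetical Java sketch, not Teaclave code; real verification would involve a policy language such as Rego and formal tools such as an SMT solver:

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Toy policy check: a data policy whitelists the operations a function may
// perform on the data. Hypothetical sketch; names are illustrative only.
class PolicyChecker {
    static List<String> violations(Set<String> allowedOps, List<String> functionOps) {
        return functionOps.stream()
                .filter(op -> !allowedOps.contains(op))
                .collect(Collectors.toList());
    }
}

public class PolicyDemo {
    public static void main(String[] args) {
        Set<String> policy = Set.of("add", "sub"); // data provider's policy
        List<String> ops = List.of("add", "mul");  // ops the function performs
        System.out.println(PolicyChecker.violations(policy, ops)); // [mul]
    }
}
```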

    Timeline Estimation

    • 0.5 month: Policy language parser and/or policy language design (if Rego is not an ideal choice).
    • 1.5 − 2 months: Verification contracts rewriting on the function source code based on the policy parsed.
    • ∼ 1 month: The function can be properly verified formally (by, e.g., querying the Z3 SMT solver).

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Mingshen Sun, Apache Teaclave (incubating) PPMC, mssun@apache.org

    [GSoC][Doris]Dictionary Encoding Acceleration

    Apache Doris
    Apache Doris is a real-time analytical database based on MPP architecture. As a unified platform that supports multiple data processing scenarios, it ensures high performance for low-latency and high-throughput queries, allows for easy federated queries on data lakes, and supports various data ingestion methods.
    Page: https://doris.apache.org

    Github: https://github.com/apache/doris

    Background

    In Apache Doris, dictionary encoding is performed during data writing and compaction. Dictionary encoding is applied to string data types by default. The dictionary size of a column for one segment is 1M at most. Dictionary encoding accelerates queries on strings by, for example, converting them into INTs.
     

    Task

    • Phase One: Get familiar with the implementation of Apache Doris dictionary encoding; learn how Apache Doris dictionary encoding accelerates queries.
    • Phase Two: Evaluate the effectiveness of full dictionary encoding and figure out how to optimize memory in such a case.
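As a rough illustration of the idea (toy Java, not Doris's C++ implementation), dictionary encoding maps each distinct string to a small integer code so that query-time comparisons and grouping operate on ints:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy dictionary encoder: each distinct string gets an int code.
// Illustrative only; Doris caps the per-column dictionary at 1M per
// segment, as noted in the Background above.
class StringDictionary {
    private final Map<String, Integer> codes = new HashMap<>();
    private final List<String> values = new ArrayList<>();

    int encode(String s) {
        return codes.computeIfAbsent(s, k -> {
            values.add(k);
            return values.size() - 1;
        });
    }

    String decode(int code) {
        return values.get(code);
    }
}

public class DictDemo {
    public static void main(String[] args) {
        StringDictionary dict = new StringDictionary();
        int[] encoded = {
            dict.encode("beijing"), dict.encode("shanghai"), dict.encode("beijing")
        };
        // Equality checks now compare ints instead of strings:
        System.out.println(encoded[0] == encoded[2]); // true
        System.out.println(dict.decode(encoded[1]));  // shanghai
    }
}
```

The memory question in Phase Two arises because the dictionary itself (the `values` side here) must stay resident while the column is encoded.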

    Learning Material

    Page: https://doris.apache.org
    Github: https://github.com/apache/doris

    Mentor

  • Mentor: Chen Zhang, Apache Doris Committer, zhangchen@apache.org
  • Mentor: Zhijing Lu, Apache Doris Committer, luzhijing@apache.org
  • Mailing List: dev@doris.apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Zhijing Lu, mail: luzhijing (at) apache.org
    Project Devs, mail: dev (at) doris.apache.org

    CloudStack

    CloudStack GSoC 2023 - Autodetect IPs used inside the VM

    Github issue: https://github.com/apache/cloudstack/issues/7142


    Description:

    With regards to IP info reporting, Cloudstack relies entirely on its DHCP databases and so on. When this is not available (L2 networks etc.) no IP information is shown for a given VM.

    I propose we introduce a mechanism for "IP autodetection" and try to discover the IPs used inside the machines by means of querying the hypervisors. For example with KVM/libvirt we can simply do something like this:

     
    [root@fedora35 ~]# virsh domifaddr win2k22 --source agent
    Name                          MAC address        Protocol   Address
    -------------------------------------------------------------------------------
    Ethernet                      52:54:00:7b:23:6a  ipv4       192.168.0.68/24
    Loopback Pseudo-Interface 1   -                  ipv6       ::1/128
    -                             -                  ipv4       127.0.0.1/8

    The above command queries the qemu-guest-agent inside the Windows VM. The VM needs to have the qemu-guest-agent installed and running as well as the virtio serial drivers (easily done in this case with virtio-win-guest-tools.exe), as well as a guest-agent socket channel defined in libvirt.

    Once we have this information we could display it in the UI/API as "Autodetected VM IPs" or something like that.

    I imagine it's very similar for VMWare and XCP-ng.

    Thank you

    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Nicolás Vázquez, mail: nvazquez (at) apache.org
    Project Devs, mail: dev (at) cloudstack.apache.org

    ...

    Dubbo GSoC 2023 - Refactor the http layer

    Background

    Dubbo currently supports the rest protocol based on HTTP/1 and the triple protocol based on HTTP/2, but these two HTTP-based protocols are implemented independently: they cannot share or swap the underlying implementation, and their respective implementation costs are relatively high.

    Target

    To reduce maintenance costs, we hope to abstract the HTTP layer so that the underlying implementation is independent of any particular protocol and different protocols can reuse the same implementation.
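One way to picture such an abstraction is a transport interface that both the rest and triple protocols could target, with the HTTP/1-or-HTTP/2 choice hidden behind it. This is a hypothetical sketch; none of these names are Dubbo's actual API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical protocol-agnostic HTTP abstraction; illustrative only.
interface HttpMessage {
    String header(String name);
    byte[] body();
}

// A transport speaks HTTP/1 or HTTP/2 underneath; callers never see which.
interface HttpTransport {
    HttpMessage exchange(String path, Map<String, String> headers, byte[] body);
}

// Both rest and triple would build on HttpTransport instead of owning their
// own HTTP stacks. A trivial in-memory transport for illustration:
class EchoTransport implements HttpTransport {
    @Override
    public HttpMessage exchange(String path, Map<String, String> headers, byte[] body) {
        Map<String, String> responseHeaders = new HashMap<>(headers);
        responseHeaders.put(":path", path); // echo the request back
        return new HttpMessage() {
            public String header(String name) { return responseHeaders.get(name); }
            public byte[] body() { return body; }
        };
    }
}

public class HttpLayerDemo {
    public static void main(String[] args) {
        HttpTransport transport = new EchoTransport();
        Map<String, String> h = new HashMap<>();
        h.put("content-type", "application/json");
        HttpMessage resp = transport.exchange("/demo", h, "{}".getBytes());
        System.out.println(resp.header(":path")); // /demo
    }
}
```

The point of the design is that swapping `EchoTransport` for an HTTP/1 or HTTP/2 implementation would not change any protocol-level code.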

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    ...

    Dubbo GSoC 2023 - Refactor Connection

    Background

    At present, the abstraction of connections by clients of the different protocols in Dubbo is imperfect. For example, there is a big discrepancy between the client-side connection abstractions of the dubbo and triple protocols. As a result, enhancing connection-related functionality in the client is complicated and implementations cannot be reused; the client also needs to implement a lot of repetitive code when extending to a new protocol.

    Target

    Reduce the complexity of the client part when extending the protocol, and increase the reuse of connection-related modules.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - IDL management

    Background

    Dubbo currently supports protobuf as a serialization method. Protobuf relies on proto (IDL) files for code generation, but Dubbo currently lacks tools for managing these IDL files. For Java users, for example, proto files have to be processed on every compilation, which is troublesome; most users are accustomed to depending on jar packages instead.

    Target

    Implement an IDL management and control platform that supports automatically generating dependency packages in various languages from IDL files and pushing them to the relevant package repositories.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    ...