
Code Insights for Apache StreamPipes

Apache StreamPipes

Apache StreamPipes is a self-service (Industrial) IoT toolbox that enables non-technical users to connect, analyze and explore IoT data streams. StreamPipes offers several modules, including StreamPipes Connect to easily connect data from industrial IoT sources, the Pipeline Editor to quickly create processing pipelines, and several visualization modules for live and historic data exploration. Under the hood, StreamPipes utilizes an event-driven microservice paradigm of standalone, so-called analytics microservices, making the system easy to extend for individual needs.

Background

StreamPipes has grown significantly in recent years. We were able to introduce many new features and attracted both users and contributors. Putting the cherry on the cake, we graduated to an Apache top-level project in December 2022. We will of course continue developing new features and never rest in making StreamPipes even more amazing. However, as we steam ahead towards our `1.0` release, we also want the project to become more mature. Therefore, we want to address one of our Achilles' heels: our test coverage.

Don't worry, this issue is not about implementing myriads of tests for our code base. As a first step, we would like to make the status quo transparent. That means we want to measure our code coverage consistently across the whole codebase (backend, UI, Python library) and report the coverage to Codecov. Furthermore, to benchmark ourselves and motivate everyone to provide tests with every contribution, we would like to lock in the current test coverage as a lower threshold that we always want to meet (meaning CI builds fail if coverage drops below it). Over time we can then raise the required coverage level step by step.

Beyond monitoring our test coverage, we also want to invest in better and cleaner code. Therefore, we would like to adopt SonarCloud for our repository.

Tasks

  • [ ] calculate test coverage for all main parts of the repo
  • [ ] send coverage to Codecov
  • [ ] determine a coverage threshold and let CI fail if coverage is below it
  • [ ] include SonarCloud in the CI setup
  • [ ] include an automatic coverage report in PR validation (see an example here ) -> optional
  • [ ] include an automatic SonarCloud report in PR validation -> optional
  • [ ] whatever comes to your mind 💡 further ideas are always welcome


❗Important Note❗

Do not create any account on behalf of Apache StreamPipes on SonarCloud or CodeCov, and do not use the name of Apache StreamPipes for any account creation. Your mentor will take care of this.


Relevant Skills

  • basic knowledge of GitHub workflows

Learning Material


References

You can find our corresponding issue on GitHub here


Name and Contact Information

Name: Tim Bossenmaier

email:  bossenti[at]apache.org

community: dev[at]streampipes.apache.org

website: https://streampipes.apache.org/

Difficulty: Major
Project size: ~175 hour (medium)
Potential mentors:
Tim Bossenmaier, mail: bossenti (at) apache.org
Project Devs, mail: dev (at) streampipes.apache.org

SkyWalking

[GSOC] [SkyWalking] AIOps Log clustering with Flink (Algorithm Optimization)

Apache SkyWalking is an application performance monitoring tool for distributed systems, especially designed for microservices, cloud native and container-based (Kubernetes) architectures. This year we will proceed with the log clustering implementation using a revised architecture, and this task will require the student to focus on algorithm optimization for the clustering technique.

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Yihao Chen, mail: yihaochen (at) apache.org
Project Devs, mail: dev (at) skywalking.apache.org

[GSOC] [SkyWalking] Python Agent Performance Enhancement Plan

Apache SkyWalking is an application performance monitoring tool for distributed systems, especially designed for microservices, cloud native and container-based (Kubernetes) architectures. This task is about enhancing the Python agent's performance; the tracking issue can be seen here: https://github.com/apache/skywalking/issues/10408

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Yihao Chen, mail: yihaochen (at) apache.org
Project Devs, mail: dev (at) skywalking.apache.org

    [GSOC] [SkyWalking] AIOps Log clustering with Flink (Flink Integration)

    Apache SkyWalking is an application performance monitoring tool for distributed systems, especially designed for microservices, cloud native and container-based (Kubernetes) architectures. This year we will proceed with the log clustering implementation using a revised architecture, and this task will require the student to focus on Flink and its integration with SkyWalking OAP.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Yihao Chen, mail: yihaochen (at) apache.org
    Project Devs, mail: dev (at) skywalking.apache.org

    [GSOC] [SkyWalking] Self-Observability of the query subsystem in BanyanDB

    Background

    SkyWalking BanyanDB is an observability database that aims to ingest, analyze and store Metrics, Tracing and Logging data.

    Objectives

    1. Support EXPLAIN[1] for both measure query and stream query
    2. Add self-observability including trace and metrics for query subsystem
    3. Support EXPLAIN in the client SDK & CLI and add query plan visualization in the UI

    [1]: EXPLAIN in MySQL
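
    For illustration only, here is a minimal Go sketch of the kind of query-plan tree an EXPLAIN for measure/stream queries could surface; all type and field names are assumptions for this sketch, not BanyanDB's actual API.

    ```golang
    package main

    import "fmt"

    // PlanNode is a hypothetical representation of one step of a query plan,
    // annotated with the per-step latency that the self-observability
    // objective asks for.
    type PlanNode struct {
    	Op       string  // e.g. "IndexScan", "Merge", "TopN"
    	CostMs   float64 // measured execution time of this step
    	Children []PlanNode
    }

    // render prints the plan as an indented tree, roughly what an EXPLAIN
    // visualization in the CLI or UI could build on.
    func render(n PlanNode, depth int) {
    	for i := 0; i < depth; i++ {
    		fmt.Print("  ")
    	}
    	fmt.Printf("%s (%.1f ms)\n", n.Op, n.CostMs)
    	for _, c := range n.Children {
    		render(c, depth+1)
    	}
    }

    func main() {
    	plan := PlanNode{Op: "Merge", CostMs: 4.2, Children: []PlanNode{
    		{Op: "IndexScan", CostMs: 1.3},
    		{Op: "IndexScan", CostMs: 2.1},
    	}}
    	render(plan, 0)
    }
    ```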

    Recommended Skills

    1. Familiar with Go
    2. Have a basic understanding of database query engine
    3. Have an experience of Apache SkyWalking or other APMs

    Mentor

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Jiajing Lu, mail: lujiajing (at) apache.org
    Project Devs, mail: dev (at) skywalking.apache.org

    [GSOC] [SkyWalking] Unify query planner and executor in BanyanDB

    Background

    SkyWalking BanyanDB is an observability database that aims to ingest, analyze and store Metrics, Tracing and Logging data.

    Objectives

    1. Fully unify/merge the query planner and executor for Measure and TopN

    Recommended Skills

    1. Familiar with Go
    2. Have a basic understanding of database query engine
    3. Have an experience of Apache SkyWalking

    Mentor

    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Jiajing Lu, mail: lujiajing (at) apache.org
    Project Devs, mail: dev (at) skywalking.apache.org

    [GSOC][SkyWalking] Add Terraform provider for Apache SkyWalking

    Now the deployment methods for SkyWalking are limited: we only have a Helm Chart for users deploying in Kubernetes, so users that are not on Kubernetes have to do all the housekeeping themselves to set up SkyWalking on, for example, VMs.

    This issue aims to add a Terraform provider, so that users can conveniently spin up a cluster for demonstration or testing. We should evolve the provider, allow users to customize it to their needs, and finally let users run it in their production environments.

    In this task, we will mainly focus on support for AWS. In the Terraform provider, users need to provide their access key / secret key, and the provider does the rest: create VMs, create the database/OpenSearch or RDS, download the SkyWalking tars, configure SkyWalking, start the SkyWalking components (OAP/UI), create public IPs / domain names, etc.
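
    As a hedged illustration of the shape such a provider could take (not the final design), here is a minimal skeleton using HashiCorp's terraform-plugin-sdk/v2; the resource name `skywalking_cluster` and its fields are made up for this sketch.

    ```golang
    package main

    import (
    	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
    	"github.com/hashicorp/terraform-plugin-sdk/v2/plugin"
    )

    // Provider wires up the AWS credentials and one hypothetical resource
    // standing for a whole SkyWalking deployment (OAP + UI + storage).
    func Provider() *schema.Provider {
    	return &schema.Provider{
    		Schema: map[string]*schema.Schema{
    			"access_key": {Type: schema.TypeString, Required: true},
    			"secret_key": {Type: schema.TypeString, Required: true, Sensitive: true},
    		},
    		ResourcesMap: map[string]*schema.Resource{
    			"skywalking_cluster": {
    				Schema: map[string]*schema.Schema{
    					"oap_nodes": {Type: schema.TypeInt, Optional: true, Default: 1},
    					"storage":   {Type: schema.TypeString, Optional: true, Default: "opensearch"},
    				},
    				Create: func(d *schema.ResourceData, m interface{}) error {
    					// Here the provider would create VMs/RDS, install and
    					// configure SkyWalking, then record the cluster ID.
    					d.SetId("demo-cluster")
    					return nil
    				},
    				Read:   func(d *schema.ResourceData, m interface{}) error { return nil },
    				Delete: func(d *schema.ResourceData, m interface{}) error { d.SetId(""); return nil },
    			},
    		},
    	}
    }

    func main() {
    	plugin.Serve(&plugin.ServeOpts{ProviderFunc: Provider})
    }
    ```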

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Zhenxu Ke, mail: kezhenxu94 (at) apache.org
    Project Devs, mail: dev (at) skywalking.apache.org

    ShardingSphere

    Apache ShardingSphere Enhance SQLNodeConverterEngine to support more MySQL SQL statements

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

    Page: https://shardingsphere.apache.org
    Github: https://github.com/apache/shardingsphere

    Background

    The ShardingSphere SQL federation engine provides support for complex SQL statements; it can handle cross-database join queries, subqueries, aggregation queries and other statements well. An important part of the SQL federation engine is converting the SQL statement parsed by ShardingSphere into a SqlNode, so that Calcite can be used to implement SQL optimization and federated queries.

    Task

    This issue is to solve the MySQL exception that occurs during SQLNodeConverterEngine conversion. The specific case list is as follows.

    • select_char
    • select_extract
    • select_from_dual
    • select_from_with_table
    • select_group_by_with_having_and_window
    • select_not_between_with_single_table
    • select_not_in_with_single_table
    • select_substring
    • select_trim
    • select_weight_string
    • select_where_with_bit_expr_with_ampersand
    • select_where_with_bit_expr_with_caret
    • select_where_with_bit_expr_with_div
    • select_where_with_bit_expr_with_minus_interval
    • select_where_with_bit_expr_with_mod
    • select_where_with_bit_expr_with_mod_sign
    • select_where_with_bit_expr_with_plus_interval
    • select_where_with_bit_expr_with_signed_left_shift
    • select_where_with_bit_expr_with_signed_right_shift
    • select_where_with_bit_expr_with_vertical_bar
    • select_where_with_boolean_primary_with_comparison_subquery
    • select_where_with_boolean_primary_with_is
    • select_where_with_boolean_primary_with_is_not
    • select_where_with_boolean_primary_with_null_safe
    • select_where_with_expr_with_and_sign
    • select_where_with_expr_with_is
    • select_where_with_expr_with_is_not
    • select_where_with_expr_with_not
    • select_where_with_expr_with_not_sign
    • select_where_with_expr_with_or_sign
    • select_where_with_expr_with_xor
    • select_where_with_predicate_with_in_subquery
    • select_where_with_predicate_with_regexp
    • select_where_with_predicate_with_sounds_like
    • select_where_with_simple_expr_with_collate
    • select_where_with_simple_expr_with_match
    • select_where_with_simple_expr_with_not
    • select_where_with_simple_expr_with_odbc_escape_syntax
    • select_where_with_simple_expr_with_row
    • select_where_with_simple_expr_with_tilde
    • select_where_with_simple_expr_with_variable
    • select_window_function
    • select_with_assignment_operator
    • select_with_assignment_operator_and_keyword
    • select_with_case_expression
    • select_with_collate_with_marker
    • select_with_date_format_function
    • select_with_exists_sub_query_with_project
    • select_with_function_name
    • select_with_json_value_return_type
    • select_with_match_against
    • select_with_regexp
    • select_with_schema_name_in_column_projection
    • select_with_schema_name_in_shorthand_projection
    • select_with_spatial_function
    • select_with_trim_expr
    • select_with_trim_expr_from_expr

    You need to compare the difference between the actual and expected results, and then correct the logic in SQLNodeConverterEngine so that the actual output is consistent with the expected output.

    After you make changes, remember to add the case to SUPPORTED_SQL_CASE_IDS to ensure it can be tested.

    Notice, these issues can be a good example.
    https://github.com/apache/shardingsphere/pull/14492

    Relevant Skills

    1. Master JAVA language

    2. Have a basic understanding of Antlr g4 files

    3. Be familiar with MySQL and Calcite SqlNode

    Targets files

    SQLNodeConverterEngineIT

    https://github.com/apache/shardingsphere/blob/master/test/it/optimizer/src/test/java/org/apache/shardingsphere/test/it/optimize/SQLNodeConverterEngineIT.java

    Mentor

    Zhengqiang Duan, PMC of Apache ShardingSphere, duanzhengqiang@apache.org

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Trista Pan, PMC of Apache ShardingSphere, panjuan@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Zhengqiang Duan, mail: duanzhengqiang (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    Apache ShardingSphere Add the feature of switching logging framework

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

    Page: https://shardingsphere.apache.org
    Github: https://github.com/apache/shardingsphere

    Background

    ShardingSphere provides two adapters: ShardingSphere-JDBC and ShardingSphere-Proxy.

    Now, ShardingSphere uses logback for logging, but consider the following situations:

    • Users may need to switch the logging framework to meet special needs; for example, log4j2 can provide better asynchronous performance;
    • When using the JDBC adapter, the user application may not use logback, which may cause some conflicts.

    Why doesn't the log facade suffice? Because ShardingSphere provides users with clustered logging configuration (such as changing the log level online), which requires dynamic construction of loggers, something that cannot be achieved with the log facade alone.

    Task

    1. Design and implement logging SPI to support multiple logging frameworks (such as logback and log4j2)
    2. Allow users to choose which logging framework to use through the logging rule

    Relevant Skills

    1. Master JAVA language

    2. Basic knowledge of logback and log4j2

    3. Maven

    Mentor

    Longtao Jiang, Committer of Apache ShardingSphere, jianglongtao@apache.org

    Trista Pan, PMC of Apache ShardingSphere, panjuan@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Longtao Jiang, mail: jianglongtao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    Apache ShardingSphere Enhance ComputeNode reconciliation

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

    Page: https://shardingsphere.apache.org/
    Github: https://github.com/apache/shardingsphere

    Background

    There is a proposal about the new CRDs Cluster and ComputeNode as below:

    Currently we try to promote ComputeNode as the major CRD to represent a special ShardingSphere Proxy deployment, and plan to use Cluster to indicate a special ShardingSphere Proxy cluster.

    Task

    This issue is to enhance ComputeNode reconciliation availability. The specific case list is as follows.

    •  Add IT test case for Deployment spec volume
    •  Add IT test case for Deployment spec template init containers
    •  Add IT test case for Deployment spec template spec containers
    •  Add IT test case for Deployment spec volume mounts
    •  Add IT test case for Deployment spec container ports
    •  Add IT test case for Deployment spec container image tag
    •  Add IT test case for Service spec ports
    •  Add IT test case for ConfigMap data serverconfig
    •  Add IT test case for ConfigMap data logback
       
      Notice, these issues can be a good example.
    • chore: add more Ginkgo tests for ComputeNode #203
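
    To make the expected shape of these IT cases concrete, here is a minimal, self-contained Ginkgo v2 sketch; the `buildDeployment` helper is a stand-in for the operator's real rendering logic, not actual project code.

    ```golang
    package computenode_test

    import (
    	"testing"

    	. "github.com/onsi/ginkgo/v2"
    	. "github.com/onsi/gomega"
    	appsv1 "k8s.io/api/apps/v1"
    	corev1 "k8s.io/api/core/v1"
    )

    // buildDeployment mimics rendering a Deployment from a ComputeNode spec.
    func buildDeployment(port int32) *appsv1.Deployment {
    	return &appsv1.Deployment{
    		Spec: appsv1.DeploymentSpec{
    			Template: corev1.PodTemplateSpec{
    				Spec: corev1.PodSpec{
    					Containers: []corev1.Container{{
    						Name:  "shardingsphere-proxy",
    						Ports: []corev1.ContainerPort{{ContainerPort: port}},
    					}},
    				},
    			},
    		},
    	}
    }

    func TestComputeNode(t *testing.T) {
    	RegisterFailHandler(Fail)
    	RunSpecs(t, "ComputeNode Suite")
    }

    var _ = Describe("ComputeNode reconciliation", func() {
    	It("exposes the configured container port on the Deployment", func() {
    		ports := buildDeployment(3307).Spec.Template.Spec.Containers[0].Ports
    		Expect(ports).To(HaveLen(1))
    		Expect(ports[0].ContainerPort).To(Equal(int32(3307)))
    	})
    })
    ```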

    Relevant Skills

    1. Master Go language, Ginkgo test framework
    2. Have a basic understanding of Apache ShardingSphere Concepts
    3. Be familiar with Kubernetes Operator, kubebuilder framework

    Targets files

    ComputeNode IT - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/reconcile/computenode/compute_node_test.go

    Mentor

    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Chuxin Chen, mail: tuichenchuxin (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    Apache ShardingSphere Support mainstream database metadata table query

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

    Page: https://shardingsphere.apache.org
    Github: https://github.com/apache/shardingsphere

    Background

    ShardingSphere has designed its own metadata database to simulate metadata queries that support various databases.

    More details:

    https://github.com/apache/shardingsphere/issues/21268
    https://github.com/apache/shardingsphere/issues/22052

    Task

    • Support PostgreSQL And openGauss `\d tableName`
    • Support PostgreSQL And openGauss `\d+`
    • Support PostgreSQL And openGauss `\d+ tableName`
    • Support PostgreSQL And openGauss `l`
    • Support query for MySQL metadata `TABLES`
    • Support query for MySQL metadata `COLUMNS`
    • Support query for MySQL metadata `schemata`
    • Support query for MySQL metadata `ENGINES`
    • Support query for MySQL metadata `FILES`
    • Support query for MySQL metadata `VIEWS`

    Notice, these issues can be a good example.

    https://github.com/apache/shardingsphere/pull/22053
    https://github.com/apache/shardingsphere/pull/22057/
    https://github.com/apache/shardingsphere/pull/22166/
    https://github.com/apache/shardingsphere/pull/22182

    Relevant Skills

    •  Master JAVA language
    •  Have a basic understanding of Zookeeper
    •  Be familiar with MySQL/Postgres SQLs 

    Mentor

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Zhengqiang Duan, PMC of Apache ShardingSphere, duanzhengqiang@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Chuxin Chen, mail: tuichenchuxin (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    Apache ShardingSphere Write a converter to generate DistSQL

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

    Page: https://shardingsphere.apache.org
    Github: https://github.com/apache/shardingsphere

    Background

    Currently we try to promote StorageNode as the major CRD to represent a set of storage units for ShardingSphere. The operator needs to render the rules configured in Golang structs as DistSQL so that they can be applied to a running cluster.

    Task

    Write a converter that generates DistSQL from the operator's Golang structs:

    • [ ] Generate DistSQL according to the Golang struct `EncryptionRule`
    • [ ] Generate DistSQL according to the Golang struct `ShardingRule`
    • [ ] Generate DistSQL according to the Golang struct `ReadWriteSplittingRule`
    • [ ] Generate DistSQL according to the Golang struct `MaskRule`
    • [ ] Generate DistSQL according to the Golang struct `ShadowRule`

      Relevant Skills

    1. Master Go language, Ginkgo test framework
    2. Have a basic understanding of Apache ShardingSphere Concepts and DistSQL

    Targets files

    DistSQL Converter - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/distsql/converter.go, etc.

    Example

    A struct defined as below:

    ```golang
    type EncryptRule struct{ Name string }

    // ToDistSQL renders the rule as a DistSQL statement; Name is a sketch field.
    func (t EncryptRule) ToDistSQL() string { return "CREATE ENCRYPT RULE " + t.Name + " (...)" }
    ```
    While invoking ToDistSQL() it will generate a DistSQL statement for the EncryptRule, like:

    ```SQL
    CREATE ENCRYPT RULE t_encrypt (....
    ```

    References:

    https://shardingsphere.apache.org/document/current/en/user-manual/shardingsphere-proxy/distsql/syntax/rdl/rule-definition/encrypt/create-encrypt-rule/

    Mentor

    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    Apache ShardingSphere Introduce new CRD as StorageNode for better usability

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

    Page: https://shardingsphere.apache.org/
    Github: https://github.com/apache/shardingsphere

    Background

    There is a proposal about the new CRDs Cluster and ComputeNode as below:

    • #167
    • #166

    Currently we try to promote StorageNode as major CRD to represent a set of storage units for ShardingSphere.

    Task

    The elementary task is that the storage node controller could manage the lifecycle of a set of storage units, like PostgreSQL, in Kubernetes.

    We don't hope to create another wheel like pg-operator. So consider using a predefined parameter group to generate the target CRD.

    • [ ] Create a PostgreSQL cluster while a StorageNode with pg parameters is created
    • [ ] Update the PostgreSQL cluster while the StorageNode is updated
    • [ ] Delete the PostgreSQL cluster while the StorageNode is deleted. Notice this may need a deletion strategy.
    • [ ] Reconcile the StorageNode according to the status of the PostgreSQL cluster
    • [ ] The status of the StorageNode would be consumed by common storage-unit-related DistSQLs

      Relevant Skills

    1. Master Go language, Ginkgo test framework
    2. Have a basic understanding of Apache ShardingSphere Concepts
    3. Be familiar with Kubernetes Operator, kubebuilder framework

    Targets files

    StorageNode Controller - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/controllers/storagenode_controller.go
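
    A minimal sketch of the reconcile loop described above, using controller-runtime; the StorageNode type itself would be generated with kubebuilder, so only the flow is shown here as comments.

    ```golang
    package controllers

    import (
    	"context"

    	ctrl "sigs.k8s.io/controller-runtime"
    	"sigs.k8s.io/controller-runtime/pkg/client"
    )

    // StorageNodeReconciler owns the lifecycle of the storage units
    // (e.g. a PostgreSQL cluster) behind a StorageNode object.
    type StorageNodeReconciler struct {
    	client.Client
    }

    func (r *StorageNodeReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    	// 1. Fetch the StorageNode for req.NamespacedName.
    	// 2. If it is being deleted, apply the deletion strategy to the
    	//    backing PostgreSQL cluster, then remove the finalizer.
    	// 3. Otherwise create/update the PostgreSQL cluster from the
    	//    predefined parameter group.
    	// 4. Mirror the cluster's health into StorageNode.Status so that
    	//    storage-unit-related DistSQLs can consume it.
    	return ctrl.Result{}, nil
    }
    ```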

    Mentor

    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    Apache ShardingSphere Introduce JVM chaos to ShardingSphere

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

    Page: https://shardingsphere.apache.org/
    Github: https://github.com/apache/shardingsphere

    Background

    There is a proposal about the background of ChaosEngineering as below:

    Introduce ChaosEngineering for ShardingSphere #32

    And we also proposed a generic controller for ShardingSphereChaos as below:

    [GSoC 2023] Introduce New CRD ShardingSphereChaos #272

    The ShardingSphereChaos controller is aiming at different chaos tests. This JVMChaos is an important one.

    Task

    Write several scripts to implement different JVMChaos for main features of ShardingSphere. The specific case list is as follows.

    • Add scripts injecting chaos to DataSharding
    • Add scripts injecting chaos to ReadWritingSplitting
    • Add scripts injecting chaos to DatabaseDiscovery
    • Add scripts injecting chaos to Encryption
    • Add scripts injecting chaos to Mask
    • Add scripts injecting chaos to Shadow
      Basically, these scripts will cause unexpected behaviour while executing the related DistSQL.

    Relevant Skills

    1. Master Go language, Ginkgo test framework
    2. Have a deep understanding of Apache ShardingSphere concepts and practices
    3. JVM bytecode mechanisms like ByteMan, ByteBuddy

    Targets files

    JVMChaos Scripts - https://github.com/apache/shardingsphere-on-cloud/chaos/jvmchaos/scripts/


    Mentor

    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    Apache ShardingSphere Introduce New CRD ShardingSphereChaos

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

    Page: https://shardingsphere.apache.org/
    Github: https://github.com/apache/shardingsphere

    Background

    There is a proposal about the background of ChaosEngineering as below:

    Introduce ChaosEngineering for ShardingSphere · Issue #32 · apache/shardingsphere-on-cloud (github.com)

    The ShardingSphereChaos controller is aiming at different chaos tests.

    Task

    Propose a generic controller for ShardingSphereChaos, which reconciles the ShardingSphereChaos CRD and prepares, executes and verifies tests.

    • [ ] Support common ShardingSphere features, prepare test rules and datasets
    • [ ] Generate the chaos type according to the backend implementation
    • [ ] Verify testing results with DistSQL or other tools
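
    A small sketch of how the controller could step an experiment through the phases implied by this checklist; the phase names are assumptions for illustration, not the final CRD design.

    ```golang
    package controllers

    // ChaosPhase tracks where a ShardingSphereChaos experiment currently is.
    type ChaosPhase string

    const (
    	PhasePrepare ChaosPhase = "Prepare" // load test rules and datasets
    	PhaseExecute ChaosPhase = "Execute" // inject the backend-specific chaos
    	PhaseVerify  ChaosPhase = "Verify"  // check results, e.g. via DistSQL
    )

    // nextPhase advances the experiment one step per reconcile pass.
    func nextPhase(p ChaosPhase) ChaosPhase {
    	switch p {
    	case PhasePrepare:
    		return PhaseExecute
    	case PhaseExecute:
    		return PhaseVerify
    	default:
    		return PhaseVerify // terminal
    	}
    }
    ```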

    Relevant Skills

    1. Master Go language, Ginkgo test framework
    2. Have a deep understanding of Apache ShardingSphere concepts and practices
    3. Kubernetes operator pattern, kubebuilder framework

    Targets files

    ShardingSphereChaos Controller - https://github.com/apache/shardingsphere-on-cloud/shardingsphere-operator/pkg/controllers/chaos_controller.go, etc.

    Mentor

    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    Apache ShardingSphere Add ShardingSphere Kafka source connector

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

    Page: https://shardingsphere.apache.org
    Github: https://github.com/apache/shardingsphere

    Background

    The community just added the CDC (change data capture) feature recently. The change feed is published on a newly created network connection after login, and then it can be consumed.

    Since Kafka is a popular distributed event streaming platform, it's useful to import the change feed into Kafka for later processing.

    Task

    1. Get familiar with ShardingSphere CDC client usage; create a publication and subscribe to the change feed.
    2. Get familiar with Kafka connector development; develop a source connector, integrate it with ShardingSphere CDC, and persist the change feed to Kafka topics properly.
    3. Add unit tests and an E2E integration test.

    Relevant Skills

    1. Java language
    2. Basic knowledge of CDC and Kafka
    3. Maven

    References

    • https://github.com/apache/shardingsphere/issues/22500
    • https://kafka.apache.org/documentation/#connect_development
    • https://github.com/apache/kafka/tree/trunk/connect/file/src
    • https://github.com/confluentinc/kafka-connect-jdbc

    Local Test Steps

    1. Modify `conf/server.yaml`, uncomment `cdc-server-port: 33071` to enable CDC. (Refer to step 2)
    2. Configure the proxy: refer to `Prerequisites` and `Procedure` in build to configure the proxy (a newer version could be used too; the current stable version is 5.3.1).
    3. Start the proxy server; it'll start the CDC server too.
    4. Download the ShardingSphere source code from https://github.com/apache/shardingsphere , then modify and run `org.apache.shardingsphere.data.pipeline.cdc.client.example.Bootstrap`. It'll print `records:` by default in `Bootstrap`.
    5. Execute some INSERT/UPDATE/DELETE SQLs in the proxy to generate a change feed, and then check the `Bootstrap` console.

    Mentor

    Hongsheng Zhong, PMC of Apache ShardingSphere, zhonghongsheng@apache.org

    Xinze Guo, Committer of Apache ShardingSphere, azexin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Hongsheng Zhong, mail: zhonghongsheng (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    ShenYu

    Apache ShenYu Gsoc 2023 - Support for Kubernetes Service Discovery

    Background

    Apache ShenYu is a Java native API Gateway for service proxy, protocol conversion and API governance. Currently, ShenYu has good usability and performance in microservice scenarios. However, ShenYu's support for Kubernetes is still relatively weak.

    Tasks

    1. Support registering microservices deployed in K8s Pods with shenyu-admin, using K8s as the register center (see the sketch after this list).
    2. Discuss with mentors and complete the requirements design and technical design of the ShenYu K8s register center.
    3. Complete the initial version of the ShenYu K8s register center.
    4. Complete the CI tests of the ShenYu K8s register center, verifying the correctness of the code.
    5. Write the necessary documentation, deployment guides, and instructions for users to connect microservices running inside K8s Pods to ShenYu.
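
    For a feel of what task 1 involves, here is a minimal client-go sketch that lists the Endpoints a register center implementation could translate into shenyu-admin registrations; it assumes in-cluster execution and is illustrative only.

    ```golang
    package main

    import (
    	"context"
    	"fmt"

    	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    	"k8s.io/client-go/kubernetes"
    	"k8s.io/client-go/rest"
    )

    func main() {
    	// In-cluster config, assuming this runs inside a K8s Pod.
    	cfg, err := rest.InClusterConfig()
    	if err != nil {
    		panic(err)
    	}
    	client := kubernetes.NewForConfigOrDie(cfg)

    	// Each Endpoints object maps a Service to live Pod addresses; a K8s
    	// register center would watch these and sync them to shenyu-admin.
    	eps, err := client.CoreV1().Endpoints("default").List(context.TODO(), metav1.ListOptions{})
    	if err != nil {
    		panic(err)
    	}
    	for _, ep := range eps.Items {
    		fmt.Println("service:", ep.Name)
    	}
    }
    ```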

    Relevant Skills

    1. Know the use of Apache ShenYu, especially the register center
    2. Familiar with Java and Golang
    3. Familiar with Kubernetes and can use Java or Golang to develop

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Yonglun Zhang, mail: zhangyonglun (at) apache.org
    Project Devs, mail: dev (at) shenyu.apache.org

    Nemo on Google Dataproc

    Issues for making it easy to install and use Nemo on Google Dataproc.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    John Yang, mail: johnyangk (at) apache.org
    Project Devs, mail: dev (at) nemo.apache.org

    Apache Dubbo

    Dubbo GSoC 2023 - Dubbo usage scanner

    As a development framework closely related to users, Dubbo provides many functional features (such as configuring timeouts, retries, etc.). We hope to give users a tool that scans which features are used, which features are deprecated, which ones will be deprecated in the future, and so on. Based on this tool, we can provide users with a better migration solution.
    Suggestion: you can consider an implementation based on static code scanning or a javaagent.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail: dev (at) dubbo.apache.org


    Dubbo GSoC 2023 - Integration suite on Kubernetes

    As a development framework that is closely related to users, Dubbo may have a huge impact on users if any problems occur during the iteration process. Therefore, Dubbo needs a complete set of automated regression testing tools.
    At present, Dubbo already has a set of testing tools based on docker-compose, but this set of tools cannot test compatibility in the Kubernetes environment. At the same time, we also need a more reliable test case construction system to ensure that the test cases are sufficiently complete.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail: dev (at) dubbo.apache.org

    Dubbo GSoC 2023 - Remove jprotoc in compiler

    Dubbo supports the communication mode based on the gRPC protocol through Triple. For this reason, Dubbo has developed a compiling plug-in for proto files based on jprotoc. Due to the low activity of jprotoc, the current Dubbo compiler cannot run well on the latest protobuf version. Therefore, we need to consider implementing a new compiler with reference to gRPC.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail: dev (at) dubbo.apache.org

    Dubbo GSoC 2023 - Dubbo i18n log

    Dubbo is a development framework that is closely related to users, and many usages by users may cause exceptions that are handled by Dubbo. Usually, in this case, users can only judge through logs. We hope to provide an i18n localized log output tool to provide users with a friendlier log troubleshooting experience.

    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail: dev (at) dubbo.apache.org

    Dubbo GSoC 2023 - Refactor dubbo project to gradle

    As more and more projects start to develop based on Gradle and profit from Gradle, Dubbo also hopes to migrate to Gradle. This task requires you to transform the dubbo project[1] into a Gradle project.

    [1] https://github.com/apache/dubbo

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail: dev (at) dubbo.apache.org

    Dubbo GSoC 2023 - Refactor Connection

    Background

    At present, the abstraction of connections by clients of different protocols in Dubbo is not perfect. For example, there is a big discrepancy between the client connection abstractions of the dubbo and triple protocols. As a result, enhancing connection-related functions in the client is complicated, and implementations cannot be reused. At the same time, the client also needs to implement a lot of repetitive code when extending a protocol.

    Target

    Reduce the complexity of the client part when extending a protocol, and increase the reuse of connection-related modules.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail: dev (at) dubbo.apache.org

    Dubbo GSoC 2023 - IDL management

    Background

    Dubbo currently supports protobuf as a serialization method. Protobuf relies on proto (IDL) for code generation, but Dubbo currently lacks tools for managing IDL files. For example, Java users have to work with proto files for each compilation, which is rather troublesome, since everyone is used to using jar packages for dependencies.

    Target

    Implement an IDL management and control platform, support automatically generating dependency packages in various languages from IDL files, and push them to the relevant dependency repositories.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail: dev (at) dubbo.apache.org

    Dubbo GSoC 2023 - Metrics on Dubbo Admin

    Dubbo Admin is the console of Dubbo. Today, Dubbo's observability is becoming more and more powerful. We need to be able to observe key indicators of Dubbo directly on Dubbo Admin, and even offer users suggestions for fixing the problems we surface.

    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Refactor Connection

    Background

    At present, the abstraction of connections by the client is not uniform across the protocols in Dubbo. For example, there is a big discrepancy between the connection abstractions of the dubbo and triple protocols. As a result, enhancing connection-related functionality in the client is complicated, implementations cannot be reused, and the client needs a lot of repetitive code when extending a protocol.

    Target

    Reduce the complexity of the client part when extending the protocol, and increase the reuse of connection-related modules.
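
    One possible direction, sketched under the assumption of a single protocol-agnostic Connection contract that all protocol clients share. Every name below is illustrative, not an existing Dubbo API:

```java
import java.net.InetSocketAddress;
import java.util.concurrent.CompletableFuture;

// Illustrative, protocol-agnostic connection abstraction. The idea is that
// dubbo/triple/rest clients would all program against this one interface,
// so reconnect logic, idle detection, etc. can be implemented once.
interface Connection extends AutoCloseable {
    InetSocketAddress remoteAddress();
    boolean isAvailable();
    // Writes an already-encoded request and completes when the peer responds.
    CompletableFuture<byte[]> writeAndFlush(byte[] encodedRequest);
    @Override void close();
}

// Protocols would then only differ in how a Connection is created.
interface ConnectionFactory {
    // e.g. "dubbo" would return a TCP-backed Connection,
    // "tri" an HTTP/2-stream-backed one.
    Connection connect(InetSocketAddress address);
}
```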

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - IDL management

    Background

    Dubbo currently supports protobuf as a serialization method. Protobuf relies on proto (IDL) files for code generation, but Dubbo currently lacks tools for managing IDL files. For example, Java users have to work with the proto files for each compilation, which is cumbersome, since everyone is used to consuming dependencies as jar packages.

    Target

    Implement an IDL management and control platform that supports automatically generating dependency packages in various languages from IDL files and pushing them to the relevant dependency repositories.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Service Deployer

    For a large number of monolithic applications, problems such as performance bottlenecks will be encountered during large-scale deployment. For interface-oriented programming languages, Dubbo provides RPC remote-call capabilities and can help applications decouple through interfaces. Therefore, we can provide a deployer that helps users decouple and split microservices at deployment time and quickly provides performance optimization capabilities.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - API manager

    Since Dubbo runs on a distributed architecture, it naturally has the problem that API interface definitions are difficult to manage. It is often hard to know which interfaces are running in the production environment. We can therefore provide an API-definition reporting platform, or even a management platform. This platform can automatically collect all APIs of the cluster, or let users define them directly, and then carry out unified distribution management through a mechanism similar to git and maven package management.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - JSON compatibility check

    Dubbo currently supports a large number of Java language features through hessian under the Java SDK, such as generics, interfaces, etc. These capabilities will not be compatible when calling across systems. Therefore, Dubbo needs to provide the ability to detect the interface definition and determine whether the interface published by the user can be described by native json.
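
    A minimal sketch of such a detection, assuming we walk the service interface with reflection and accept only types that map naturally onto JSON. The acceptance rules and the JsonCompatibility class are illustrative assumptions:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.*;

// Illustrative checker: decides whether every method of a service
// interface uses only types that plain JSON can naturally describe.
public final class JsonCompatibility {

    private static final Set<Class<?>> SIMPLE = Set.of(
            String.class, Integer.class, Long.class, Double.class, Boolean.class,
            int.class, long.class, double.class, boolean.class, void.class);

    public static List<String> check(Class<?> service) {
        List<String> problems = new ArrayList<>();
        for (Method m : service.getMethods()) {
            if (!describable(m.getReturnType(), new HashSet<>()))
                problems.add(m.getName() + ": return type " + m.getReturnType().getName());
            for (Class<?> p : m.getParameterTypes())
                if (!describable(p, new HashSet<>()))
                    problems.add(m.getName() + ": parameter type " + p.getName());
        }
        return problems;
    }

    private static boolean describable(Class<?> type, Set<Class<?>> seen) {
        if (SIMPLE.contains(type) || type.isEnum()) return true;
        if (type.isArray()) return describable(type.getComponentType(), seen);
        if (Collection.class.isAssignableFrom(type) || Map.class.isAssignableFrom(type))
            return true; // a real tool must also inspect the generic element types
        if (type.isInterface()) return false; // e.g. a callback cannot travel as JSON
        if (!seen.add(type)) return true; // self-referencing POJO: stop the recursion
        for (Field f : type.getDeclaredFields())
            if (!Modifier.isStatic(f.getModifiers())
                    && !describable(f.getType(), seen)) return false;
        return true; // a plain POJO is fine if all of its fields are describable
    }
}
```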

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Automated Performance Testing Mechanism

    Dubbo currently provides only a very simple performance testing tool, and for a framework as complex as Dubbo its functional coverage is very low. We urgently need a testing tool that can cover multiple complex scenarios. In addition, we hope that this set of testing tools can run automatically, so that we can track Dubbo's current performance over time.
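
    One plausible building block for the micro-benchmark layer is JMH. The sketch below benchmarks a hypothetical in-process EchoService stand-in; a real suite would instead start a provider/consumer pair in the setup and have CI compare the scores across commits:

```java
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

// Illustrative JMH benchmark. EchoService stands in for any Dubbo service
// stub; a real suite would start a provider/consumer pair in @Setup
// instead of this in-process fake.
@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
public class EchoBenchmark {

    interface EchoService { String echo(String msg); }

    private EchoService client;

    @Setup
    public void setUp() {
        client = msg -> msg; // assumption: replaced by a real Dubbo reference
    }

    @Benchmark
    public String echo() {
        return client.echo("hello");
    }
}
```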

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Dubbo Client on WASM

    WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. For web client users, we can provide a Wasm client for Dubbo, so that front-end developers can simply initiate Dubbo requests in the browser and realize Dubbo's full-link unification.

    This task needs to be implemented on a browser such as Chrome to initiate requests to the Dubbo backend.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Go Web Protocol and Programming Support


    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Jun Liu, mail: liujun (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Pure Dubbo RPC API

    At present, Dubbo provides RPC capabilities together with a large number of service governance capabilities. As a result, Dubbo cannot be used well by components of Dubbo itself that only need the RPC capabilities, or by users who need an extremely lightweight framework.
    Goal: provide a pure Dubbo RPC kernel, so that users can program directly against service calls and focus on RPC.
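
    For context, a direct point-to-point call with today's Dubbo 3.x API already hints at the shape of such a kernel; the sketch assumes a provider of the illustrative GreetingService is listening on the given port, and the kernel project would shrink the remaining configuration surface further:

```java
import org.apache.dubbo.config.ApplicationConfig;
import org.apache.dubbo.config.ReferenceConfig;

// Point-to-point call with today's API, bypassing any registry.
// A pure RPC kernel would reduce the surface to little more than this.
public class DirectCall {

    interface GreetingService { String greet(String name); }

    public static void main(String[] args) {
        ReferenceConfig<GreetingService> reference = new ReferenceConfig<>();
        reference.setApplication(new ApplicationConfig("demo-consumer"));
        reference.setInterface(GreetingService.class);
        reference.setUrl("dubbo://127.0.0.1:20880"); // direct connection
        GreetingService greeting = reference.get();
        System.out.println(greeting.greet("Dubbo"));
    }
}
```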

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - HTTP/3 Rest Support

    HTTP/3 has been formalized as a standard in the last year. Dubbo, as a framework that supports publishing and invoking Web services, needs to support the HTTP/3 protocol.

    This task needs to expand the implementation of the current rest protocol to support publishing HTTP/3 services and calling HTTP/3 services.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Dubbo3 Python HTTP/2 RPC Protocol Implementation

    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Jun Liu, mail: liujun (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Integration suite on Kubernetes

    As a development framework that is closely related to users, Dubbo may have a huge impact on users if any problem occurs during the iteration process. Therefore, Dubbo needs a complete set of automated regression testing tools.
    At present, Dubbo already has a set of testing tools based on docker-compose, but this set of tools cannot test compatibility in a Kubernetes environment. At the same time, we also need a more reliable test case construction system to ensure that the test cases are sufficiently complete.

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Dubbo SPI Extensions on WASM

    WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Many capabilities of Dubbo support extensions, such as custom interceptors, routing, load balancing, etc. In order to allow user implementations to be used across Dubbo's multi-language SDKs, we can implement cross-platform extensions based on Wasm capabilities.

    The implementation of this topic needs to provide a set of mechanisms for Wasm on Dubbo, covering implementations for Java and Go, and supporting at least Filter, Router and Loadbalance.
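
    A sketch of the host-side contract such a mechanism might expose. WasmFilter and the loader are purely illustrative, and a real implementation would bind them to an embedded Wasm runtime:

```java
// Illustrative host-side view of a Wasm-backed Filter extension.
// The Wasm module would export functions with a fixed ABI
// (e.g. on_request/on_response over serialized invocations), and the
// Java and Go SDKs would each provide a thin host shim around it.
interface WasmFilter {
    // Serialized invocation in, serialized (possibly rewritten) invocation out.
    byte[] onRequest(byte[] invocation);
    byte[] onResponse(byte[] result);
}

interface WasmExtensionLoader {
    // Instantiates the module bytes inside an embedded Wasm runtime and
    // binds its exports to the WasmFilter contract above.
    WasmFilter loadFilter(byte[] wasmModuleBytes);
}
```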

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Improve usability of Dubbo-go project

    Including but not limited to programming patterns, configuration, APIs, documentation and demos.

    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Jun Liu, mail: liujun (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Go HTTP1&2 RPC Protocol Support


    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Jun Liu, mail: liujun (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Go Observability Improvement


    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Jun Liu, mail: liujun (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Go Traffic Management


    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Jun Liu, mail: liujun (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Development of Dubbo Admin Dashboard UI Pages

    In charge of the maintenance and development of the UI pages of the whole Dubbo Admin project.

    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Jun Liu, mail: liujun (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Go Security

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Jun Liu, mail: liujun (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Rust Cluster Feature Implementation and Stability Improvement

    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Jun Liu, mail: liujun (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Refactor the http layer

    Background

    Dubbo currently supports the rest protocol based on http1 and the triple protocol based on http2, but the two http-based protocols are implemented independently. They cannot swap out the underlying implementation, and their respective implementation costs are relatively high.

    Target

    In order to reduce maintenance costs, we hope to abstract the http layer so that the underlying http implementation is independent of the protocol, allowing the different protocols to reuse the related implementations.
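
    A minimal sketch of what such an abstraction could look like; all names below are illustrative, not existing Dubbo types:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Illustrative protocol-neutral http abstraction: rest (http1) and
// triple (http2) would both be expressed against these types, while
// the transport underneath stays swappable.
interface HttpRequest {
    String method();
    String path();
    Map<String, String> headers();
    byte[] body();
}

interface HttpResponse {
    int status();
    Map<String, String> headers();
    byte[] body();
}

interface HttpHandler {
    CompletableFuture<HttpResponse> handle(HttpRequest request);
}

interface HttpServerTransport {
    // The handler is shared; only the transport (http1 vs http2) differs.
    void start(int port, HttpHandler handler);
}
```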

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Albumen Kevin, mail: albumenj (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Dubbo3 Node.js HTTP/2 RPC Protocol Implementation


    Difficulty: Major
    Project size: ~175 hour (medium)
    Potential mentors:
    Jun Liu, mail: liujun (at) apache.org
    Project Devs, mail:

    Dubbo GSoC 2023 - Admin Control Plane


    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Jun Liu, mail: liujun (at) apache.org
    Project Devs, mail:

    ...