
Code Insights for Apache StreamPipes

Apache StreamPipes

Apache StreamPipes (incubating) is a self-service (Industrial) IoT toolbox to enable non-technical users to connect, analyze and explore IoT data streams. StreamPipes offers several modules including StreamPipes Connect to easily connect data from industrial IoT sources, the Pipeline Editor to quickly create processing pipelines and several visualization modules for live and historic data exploration. Under the hood, StreamPipes utilizes an event-driven microservice paradigm of standalone, so-called analytics microservices making the system easy to extend for individual needs.

Background

StreamPipes has grown significantly in recent years. We were able to introduce a lot of new features and attracted both users and contributors. As the cherry on the cake, we graduated as an Apache top-level project in December 2022. We will of course continue developing new features and never rest in making StreamPipes even more amazing. However, since we are approaching our `1.0` release at full steam, we also want the project to become more mature. Therefore, we want to address one of our Achilles' heels: our test coverage.

Don't worry, this issue is not about implementing myriads of tests for our code base. As a first step, we would like to make the status quo transparent. That means we want to measure our code coverage consistently across the whole codebase (Backend, UI, Python library) and report the coverage to Codecov. Furthermore, to benchmark ourselves and motivate us to provide tests with every contribution, we would like to lock the current test coverage as a lower threshold that we always want to achieve (meaning that CI builds fail if coverage drops below it). Over time we can then increase the required coverage level step by step.

Beyond monitoring our test coverage, we also want to invest in better and cleaner code. Therefore, we would like to adopt SonarCloud for our repository.

Tasks

  • [ ] calculate test coverage for all main parts of the repo
  • [ ] send coverage to Codecov
  • [ ] determine coverage threshold and let CI fail if below
  • [ ] include SonarCloud in CI setup
  • [ ] include automatic coverage report in PR validation (see an example here) -> optional
  • [ ] include automatic SonarCloud report in PR validation -> optional
  • [ ] whatever comes to your mind 💡 further ideas are always welcome
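
The threshold idea from the task list can be sketched as a small gate that a CI step could run after the coverage reports are produced. This is only an illustration in Go under stated assumptions: the component names and percentages are made up, and `checkCoverage` and `gate` are hypothetical helpers, not part of StreamPipes, Codecov or SonarCloud.

```go
package main

import (
	"fmt"
	"os"
)

// checkCoverage returns an error when the measured coverage (in percent)
// falls below the locked baseline for a component.
func checkCoverage(component string, measured, baseline float64) error {
	if measured < baseline {
		return fmt.Errorf("%s: coverage %.1f%% is below the locked threshold %.1f%%",
			component, measured, baseline)
	}
	return nil
}

// gate checks every component in a fixed order and collects all violations.
// A component missing from measured counts as 0% and therefore fails.
func gate(order []string, baselines, measured map[string]float64) []error {
	var violations []error
	for _, name := range order {
		if err := checkCoverage(name, measured[name], baselines[name]); err != nil {
			violations = append(violations, err)
		}
	}
	return violations
}

func main() {
	// Hypothetical numbers: the baselines would be the coverage locked today,
	// the measured values would come from the current CI run.
	order := []string{"backend", "ui", "python"}
	baselines := map[string]float64{"backend": 30.0, "ui": 12.5, "python": 80.0}
	measured := map[string]float64{"backend": 31.2, "ui": 12.5, "python": 84.0}

	violations := gate(order, baselines, measured)
	for _, v := range violations {
		fmt.Println(v)
	}
	if len(violations) > 0 {
		os.Exit(1) // a non-zero exit code makes the CI job fail
	}
	fmt.Println("coverage thresholds met")
}
```

Raising the required level step by step then only means raising the numbers in `baselines`.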


❗Important Note❗

Do not create any account on behalf of Apache StreamPipes in SonarCloud or Codecov, and do not use the name of Apache StreamPipes for any account creation. Your mentor will take care of it.


Relevant Skills

  • basic knowledge about GitHub workflows

Learning Material


References

You can find our corresponding issue on GitHub here


Name and Contact Information

Name: Tim Bossenmaier

email:  bossenti[at]apache.org

community: dev[at]streampipes.apache.org

website: https://streampipes.apache.org/

Difficulty: Major
Project size: ~175 hour (medium)
Potential mentors:
Tim Bossenmaier, mail: bossenti (at) apache.org
Project Devs, mail: dev (at) streampipes.apache.org

ShardingSphere

Apache ShardingSphere Enhance

SQLNodeConverterEngine to support more MySQL SQL statements

ComputeNode reconciliation

Apache ShardingSphere

Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

Page: https://shardingsphere.apache.org/
Github: https://github.com/apache/shardingsphere

Background

There is a proposal about the new CRDs Cluster and ComputeNode, as below:

Currently we are trying to promote ComputeNode as the major CRD to represent a special ShardingSphere Proxy deployment, and we plan to use Cluster to indicate a special ShardingSphere Proxy cluster.

The ShardingSphere SQL federation engine provides support for complex SQL statements, and it supports cross-database join queries, subqueries, aggregation queries and other statements well. An important part of the SQL federation engine is converting the SQL statement parsed by ShardingSphere into SqlNode, so that Calcite can be used to implement SQL optimization and federated queries.

Task

This issue is to solve the MySQL exceptions that occur during SQLNodeConverterEngine conversion and to enhance ComputeNode reconciliation availability. The specific case lists are as follows.

  • select_char
  • select_extract
  • select_from_dual
  • select_from_with_table
  • select_group_by_with_having_and_window
  • select_not_between_with_single_table
  • select_not_in_with_single_table
  • select_substring
  • select_trim
  • select_weight_string
  • select_where_with_bit_expr_with_ampersand
  • select_where_with_bit_expr_with_caret
  • select_where_with_bit_expr_with_div
  • select_where_with_bit_expr_with_minus_interval
  • select_where_with_bit_expr_with_mod
  • select_where_with_bit_expr_with_mod_sign
  • select_where_with_bit_expr_with_plus_interval
  • select_where_with_bit_expr_with_signed_left_shift
  • select_where_with_bit_expr_with_signed_right_shift
  • select_where_with_bit_expr_with_vertical_bar
  • select_where_with_boolean_primary_with_comparison_subquery
  • select_where_with_boolean_primary_with_is
  • select_where_with_boolean_primary_with_is_not
  • select_where_with_boolean_primary_with_null_safe
  • select_where_with_expr_with_and_sign
  • select_where_with_expr_with_is
  • select_where_with_expr_with_is_not
  • select_where_with_expr_with_not
  • select_where_with_expr_with_not_sign
  • select_where_with_expr_with_or_sign
  • select_where_with_expr_with_xor
  • select_where_with_predicate_with_in_subquery
  • select_where_with_predicate_with_regexp
  • select_where_with_predicate_with_sounds_like
  • select_where_with_simple_expr_with_collate
  • select_where_with_simple_expr_with_match
  • select_where_with_simple_expr_with_not
  • select_where_with_simple_expr_with_odbc_escape_syntax
  • select_where_with_simple_expr_with_row
  • select_where_with_simple_expr_with_tilde
  • select_where_with_simple_expr_with_variable
  • select_window_function
  • select_with_assignment_operator
  • select_with_assignment_operator_and_keyword
  • select_with_case_expression
  • select_with_collate_with_marker
  • select_with_date_format_function
  • select_with_exists_sub_query_with_project
  • select_with_function_name
  • select_with_json_value_return_type
  • select_with_match_against
  • select_with_regexp
  • select_with_schema_name_in_column_projection
  • select_with_schema_name_in_shorthand_projection
  • select_with_spatial_function
  • select_with_trim_expr
  • select_with_trim_expr_from_expr

You need to compare the difference between actual and expected, and then correct the logic in SQLNodeConverterEngine so that actual can be consistent with expected.

  •  Add IT test case for Deployment spec volume
  •  Add IT test case for Deployment spec template init containers
  •  Add IT test case for Deployment spec template spec containers
  •  Add IT test case for Deployment spec volume mounts
  •  Add IT test case for Deployment spec container ports
  •  Add IT test case for Deployment spec container image tag
  •  Add IT test case for Service spec ports
  •  Add IT test case for ConfigMap data serverconfig
  •  Add IT test case for ConfigMap data logback
     
    Notice, these issues can be a good example.
  • chore: add more Ginkgo tests for ComputeNode #203

Relevant Skills

  1. Master Go language, Ginkgo test framework
  2. Have a basic understanding of Apache ShardingSphere Concepts
  3. Be familiar with Kubernetes Operator, kubebuilder framework

Targets files

ComputeNode IT - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/reconcile/computenode/compute_node_test.go

Mentor

Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Chuxin Chen, mail: tuichenchuxin (at) apache.org
Project Devs, mail: dev (at) shardingsphere.apache.org

Apache ShardingSphere Add the feature of switching logging framework

Apache ShardingSphere

Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

Page: https://shardingsphere.apache.org
Github: https://github.com/apache/shardingsphere

Background

ShardingSphere provides two adapters: ShardingSphere-JDBC and ShardingSphere-Proxy.

Now, ShardingSphere uses logback for logging, but consider the following situations:

  • Users may need to switch the logging framework to meet special needs; for example, log4j2 can provide better asynchronous performance;
  • When using the JDBC adapter, the user application may not use logback, which may cause some conflicts.


Why doesn't the log facade suffice? Because ShardingSphere provides users with clustered logging configurations (such as changing the log level online), loggers must be constructed dynamically, which cannot be achieved with the log facade alone.

Task

1. Design and implement logging SPI to support multiple logging frameworks (such as logback and log4j2)
2. Allow users to choose which logging framework to use through the logging rule

Relevant Skills

1. Master JAVA language

2. Basic knowledge of logback and log4j2

3. Maven

Mentor

Longtao Jiang, Committer of Apache ShardingSphere, jianglongtao@apache.org

Trista Pan, PMC of Apache ShardingSphere, panjuan@apache.org

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Longtao Jiang, mail: jianglongtao (at) apache.org
Project Devs, mail: dev (at) shardingsphere.apache.org

Apache ShardingSphere Support mainstream database metadata table query

Apache ShardingSphere

Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

Page: https://shardingsphere.apache.org
Github: https://github.com/apache/shardingsphere

After you make changes, remember to add case to SUPPORTED_SQL_CASE_IDS to ensure it can be tested.

 
Notice, these issues can be a good example.
https://github.com/apache/shardingsphere/pull/14492

Relevant Skills

 1. Master JAVA language

2. Have a basic understanding of Antlr g4 file

3. Be familiar with MySQL and Calcite SqlNode

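Target files

SQLNodeConverterEngineIT - https://github.com/apache/shardingsphere/blob/master/test/it/optimizer/src/test/java/org/apache/shardingsphere/test/it/optimize/SQLNodeConverterEngineIT.java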

Background

ShardingSphere has designed its own metadata database to simulate metadata queries that support various databases.

More details:

https://github.com/apache/shardingsphere/issues/21268
https://github.com/apache/shardingsphere/issues/22052

Mentor

Zhengqiang Duan, PMC of Apache ShardingSphere, duanzhengqiang@apache.org

Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

Trista Pan, PMC of Apache ShardingSphere, panjuan@apache.org

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Zhengqiang Duan, mail: duanzhengqiang (at) apache.org
Project Devs, mail: dev (at) shardingsphere.apache.org

Task

  • Support PostgreSQL and openGauss `\d tableName`
  • Support PostgreSQL and openGauss `\d+`
  • Support PostgreSQL and openGauss `\d+ tableName`
  • Support PostgreSQL and openGauss `\l`
  • Support query for MySQL metadata `TABLES`
  • Support query for MySQL metadata `COLUMNS`
  • Support query for MySQL metadata `SCHEMATA`
  • Support query for MySQL metadata `ENGINES`
  • Support query for MySQL metadata `FILES`
  • Support query for MySQL metadata `VIEWS`

Notice, these issues can be a good example.

https://github.com/apache/shardingsphere/pull/22053
https://github.com/apache/shardingsphere/pull/22057/
https://github.com/apache/shardingsphere/pull/22166/
https://github.com/apache/shardingsphere/pull/22182

Relevant Skills

  •  Master JAVA language
  •  Have a basic understanding of Zookeeper
  •  Be familiar with MySQL/Postgres SQLs 


Mentor

Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

Zhengqiang Duan, PMC of Apache ShardingSphere, duanzhengqiang@apache.org

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Chuxin Chen, mail: tuichenchuxin (at) apache.org
Project Devs, mail: dev (at) shardingsphere.apache.org

Apache ShardingSphere Add ShardingSphere Kafka source connector

Apache ShardingSphere Enhance ComputeNode reconciliation

Apache ShardingSphere

Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

Page: https://shardingsphere.apache.org/
Github: https://github.com/apache/shardingsphere

Background

There is a proposal about the new CRDs Cluster and ComputeNode, as below:

Currently we are trying to promote ComputeNode as the major CRD to represent a special ShardingSphere Proxy deployment, and we plan to use Cluster to indicate a special ShardingSphere Proxy cluster.

Task

This issue is to enhance ComputeNode reconciliation availability. The specific case list is as follows.

  •  Add IT test case for Deployment spec volume
  •  Add IT test case for Deployment spec template init containers
  •  Add IT test case for Deployment spec template spec containers
  •  Add IT test case for Deployment spec volume mounts
  •  Add IT test case for Deployment spec container ports
  •  Add IT test case for Deployment spec container image tag
  •  Add IT test case for Service spec ports
  •  Add IT test case for ConfigMap data serverconfig
  •  Add IT test case for ConfigMap data logback
     
    Notice, these issues can be a good example.
  • chore: add more Ginkgo tests for ComputeNode #203
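
The kind of check these test cases call for can be sketched without any dependencies. The block below is a minimal illustration only: `ComputeNodeSpec`, `Container`, `Deployment`, `buildDeployment` and `verifyDeployment` are hypothetical stand-ins defined locally, not the operator's real types or the Kubernetes API types, and Ginkgo is left out to keep the example self-contained. The image name and port are sample values.

```go
package main

import "fmt"

// ComputeNodeSpec is a hypothetical, trimmed-down ComputeNode spec.
type ComputeNodeSpec struct {
	Image string // full image reference including the tag
	Port  int32  // proxy port exposed by the container
}

// Container and Deployment are local stand-ins for the Kubernetes API types.
type Container struct {
	Image string
	Ports []int32
}

type Deployment struct {
	Containers []Container
}

// buildDeployment is a hypothetical reconcile helper that derives the
// Deployment from the ComputeNode spec.
func buildDeployment(spec ComputeNodeSpec) Deployment {
	return Deployment{Containers: []Container{{Image: spec.Image, Ports: []int32{spec.Port}}}}
}

// verifyDeployment returns one message per field that does not match the spec.
func verifyDeployment(spec ComputeNodeSpec, d Deployment) []string {
	var problems []string
	if len(d.Containers) != 1 {
		return append(problems, fmt.Sprintf("expected 1 container, got %d", len(d.Containers)))
	}
	c := d.Containers[0]
	if c.Image != spec.Image {
		problems = append(problems, fmt.Sprintf("image: expected %q, got %q", spec.Image, c.Image))
	}
	if len(c.Ports) != 1 || c.Ports[0] != spec.Port {
		problems = append(problems, fmt.Sprintf("ports: expected [%d], got %v", spec.Port, c.Ports))
	}
	return problems
}

func main() {
	spec := ComputeNodeSpec{Image: "apache/shardingsphere-proxy:5.3.1", Port: 3307}
	d := buildDeployment(spec)
	fmt.Println("mismatches after build:", len(verifyDeployment(spec, d)))

	// Simulate drift in the image tag; the check should report it.
	d.Containers[0].Image = "apache/shardingsphere-proxy:latest"
	for _, p := range verifyDeployment(spec, d) {
		fmt.Println(p)
	}
}
```

In the real integration tests, each comparison in `verifyDeployment` would presumably become its own Ginkgo assertion against the Deployment, Service or ConfigMap that the reconciler produced.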

Relevant Skills

  1. Master Go language, Ginkgo test framework
  2. Have a basic understanding of Apache ShardingSphere Concepts
  3. Be familiar with Kubernetes Operator, kubebuilder framework

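Target files

ComputeNode IT - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/reconcile/computenode/compute_node_test.go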

The community recently added a CDC (change data capture) feature. After a client logs in, the change feed is published over the established network connection and can then be consumed.

Since Kafka is a popular distributed event streaming platform, it's useful to import the change feed into Kafka for later processing.

Task

  1. Get familiar with ShardingSphere CDC client usage: create a publication and subscribe to the change feed.
  2. Get familiar with Kafka connector development: develop a source connector, integrate it with ShardingSphere CDC, and persist the change feed to Kafka topics properly.
  3. Add unit tests and an E2E integration test.

Relevant Skills

  1. Java language
  2. Basic knowledge of CDC and Kafka
  3. Maven

References


Local Test Steps

  1. Modify `conf/server.yaml` and uncomment `cdc-server-port: 33071` to enable CDC. (Refer to step 2)
  2. Configure the proxy; refer to `Prerequisites` and `Procedure` in build (a newer version can be used too; the current stable version is 5.3.1).
  3. Start the proxy server; it will start the CDC server too.
  4. Download the ShardingSphere source code from https://github.com/apache/shardingsphere , then modify and run `org.apache.shardingsphere.data.pipeline.cdc.client.example.Bootstrap`. By default, `Bootstrap` prints `records:`.
  5. Execute some INSERT/UPDATE/DELETE SQL statements in the proxy to generate a change feed, and then check the `Bootstrap` console.

Mentor

Hongsheng Zhong, PMC of Apache ShardingSphere, zhonghongsheng@apache.org

Xinze Guo, Committer of Apache ShardingSphere, azexin@apache.org

Mentor

Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org


Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Chuxin Chen, mail: tuichenchuxin (at) apache.org
Hongsheng Zhong, mail: zhonghongsheng (at) apache.org
Project Devs, mail: dev (at) shardingsphere.apache.org

Apache ShardingSphere

Add the feature of switching logging framework

Write a converter to generate DistSQL

Apache ShardingSphere

Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

Page:   https://shardingsphere.apache.org/
Github:   https://github.com/apache/shardingsphere 

Background

ShardingSphere provides two adapters: ShardingSphere-JDBC and ShardingSphere-Proxy.

Now, ShardingSphere uses logback for logging, but consider the following situations:

  • Users may need to switch the logging framework to meet special needs; for example, log4j2 can provide better asynchronous performance;
  • When using the JDBC adapter, the user application may not use logback, which may cause some conflicts.

Why doesn't the log facade suffice? Because ShardingSphere provides users with clustered logging configurations (such as changing the log level online), loggers must be constructed dynamically, which cannot be achieved with the log facade alone.

Task

1. Design and implement logging SPI to support multiple logging frameworks (such as logback and log4j2)
2. Allow users to choose which logging framework to use through the logging rule

Currently we are trying to promote StorageNode as a major CRD to represent a set of storage units for ShardingSphere.

Task

The elementary task is for the storage node controller to manage the lifecycle of a set of storage units, like PostgreSQL, in Kubernetes.

We don't want to reinvent the wheel by building another pg-operator, so consider using a predefined parameter group to generate the target CRD.

  • [ ] Generate DistSQL according to the Golang struct `EncryptionRule`
  • [ ] Generate DistSQL according to the Golang struct `ShardingRule`
  • [ ] Generate DistSQL according to the Golang struct `ReadWriteSplittingRule`
  • [ ] Generate DistSQL according to the Golang struct `MaskRule`
  • [ ] Generate DistSQL according to the Golang struct `ShadowRule`

Relevant Skills

1. Master Java language
2. Basic knowledge of logback and log4j2
3. Maven

Mentor

Longtao Jiang, Committer of Apache ShardingSphere, jianglongtao@apache.org

Trista Pan, PMC of Apache ShardingSphere, panjuan@apache.org

Relevant Skills

1. Master Go language, Ginkgo test framework
2. Have a basic understanding of Apache ShardingSphere Concepts and DistSQL

Targets files

DistSQL Converter - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/distsql/converter.go, etc.

Example

A struct defined as below:

```golang
type EncryptRule struct{}

// ToDistSQL renders the rule as DistSQL; the placeholder return stands in for the real body.
func (t EncryptRule) ToDistSQL() string { return "" }
```
Invoking ToDistSQL() generates a DistSQL statement for the EncryptRule, such as:

```SQL
CREATE ENCRYPT RULE t_encrypt (....
```
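
To make the idea more concrete, here is a minimal sketch of what such a converter could look like. The fields of `EncryptRule` and `EncryptColumn` and the helper `renderProps` are assumptions made for this illustration, not the operator's actual definitions. The clause layout follows the `COLUMNS((NAME=..., CIPHER=..., ENCRYPT_ALGORITHM(TYPE(...))))` form; the exact DistSQL grammar differs between ShardingSphere versions, so it should be checked against the DistSQL documentation of the target release.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// EncryptColumn is a hypothetical, simplified column definition.
type EncryptColumn struct {
	Name      string
	Cipher    string
	Algorithm string            // algorithm type name, e.g. "AES"
	Props     map[string]string // algorithm properties, may be empty
}

// EncryptRule is a hypothetical rule: one table with its encrypted columns.
type EncryptRule struct {
	Table   string
	Columns []EncryptColumn
}

// renderProps renders the optional PROPERTIES clause with sorted keys,
// so that the generated statement is deterministic.
func renderProps(props map[string]string) string {
	if len(props) == 0 {
		return ""
	}
	keys := make([]string, 0, len(props))
	for k := range props {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf("'%s'='%s'", k, props[k]))
	}
	return ", PROPERTIES(" + strings.Join(pairs, ", ") + ")"
}

// ToDistSQL renders the rule as a CREATE ENCRYPT RULE statement.
func (r EncryptRule) ToDistSQL() string {
	cols := make([]string, 0, len(r.Columns))
	for _, c := range r.Columns {
		cols = append(cols, fmt.Sprintf(
			"(NAME=%s, CIPHER=%s, ENCRYPT_ALGORITHM(TYPE(NAME='%s'%s)))",
			c.Name, c.Cipher, c.Algorithm, renderProps(c.Props)))
	}
	return fmt.Sprintf("CREATE ENCRYPT RULE %s (COLUMNS(%s));",
		r.Table, strings.Join(cols, ", "))
}

func main() {
	rule := EncryptRule{
		Table: "t_encrypt",
		Columns: []EncryptColumn{{
			Name:      "user_id",
			Cipher:    "user_cipher",
			Algorithm: "AES",
			Props:     map[string]string{"aes-key-value": "123456abc"},
		}},
	}
	fmt.Println(rule.ToDistSQL())
}
```

Sorting the property keys is a deliberate choice: Go map iteration order is random, and a deterministic statement is much easier to assert on in Ginkgo tests. The sketch does not escape quotes in names or values, which a real converter would need to handle.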

References:

Mentor
Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Longtao Jiang, mail: jianglongtao (at) apache.org
Liyao Miao, mail: miaoliyao (at) apache.org
Project Devs, mail: dev (at) shardingsphere.apache.org

Apache ShardingSphere

Support mainstream database metadata table query

Introduce new CRD as StorageNode for better usability

Apache ShardingSphere

Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

Page: https://shardingsphere.apache.org/
Github: https://github.com/apache/shardingsphere

Background

ShardingSphere has designed its own metadata database to simulate metadata queries that support various databases.

More details:

https://github.com/apache/shardingsphere/issues/21268
https://github.com/apache/shardingsphere/issues/22052

Task

  • Support PostgreSQL and openGauss `\d tableName`
  • Support PostgreSQL and openGauss `\d+`
  • Support PostgreSQL and openGauss `\d+ tableName`
  • Support PostgreSQL and openGauss `\l`
  • Support query for MySQL metadata `TABLES`
  • Support query for MySQL metadata `COLUMNS`
  • Support query for MySQL metadata `SCHEMATA`
  • Support query for MySQL metadata `ENGINES`
  • Support query for MySQL metadata `FILES`
  • Support query for MySQL metadata `VIEWS`

Notice, these issues can be a good example.

There is a proposal about the new CRDs Cluster and ComputeNode, as below:

  • #167
  • #166

Currently we are trying to promote StorageNode as a major CRD to represent a set of storage units for ShardingSphere.

Task

The elementary task is for the storage node controller to manage the lifecycle of a set of storage units, like PostgreSQL, in Kubernetes.

We don't want to reinvent the wheel by building another pg-operator, so consider using a predefined parameter group to generate the target CRD.

  • [ ] Create a PostgreSQL cluster when a StorageNode with pg parameters is created
  • [ ] Update the PostgreSQL cluster when the StorageNode is updated
  • [ ] Delete the PostgreSQL cluster when the StorageNode is deleted. Note that this may need a deletion strategy.
  • [ ] Reconcile the StorageNode according to the status of the PostgreSQL cluster.
  • [ ] The status of the StorageNode would be consumed by common storage-unit-related DistSQL statements
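
The create, update and delete items above boil down to comparing the desired StorageNode with the observed PostgreSQL cluster and picking one action per reconcile pass. The following sketch shows that decision as plain Go. Everything in it is hypothetical: the `StorageNode` and `Cluster` fields, the `DeletionPolicy` values and the `decide` function are assumptions for illustration, not the operator's API.

```go
package main

import "fmt"

// Action is what the controller should do next for one StorageNode.
type Action string

const (
	ActionCreate Action = "create"
	ActionUpdate Action = "update"
	ActionDelete Action = "delete"
	ActionRetain Action = "retain" // StorageNode goes away, the cluster is kept
	ActionNone   Action = "none"
)

// StorageNode is a hypothetical, trimmed-down view of the CRD.
type StorageNode struct {
	Deleting       bool   // true once the object is marked for deletion
	DeletionPolicy string // hypothetical deletion strategy: "Delete" or "Retain"
	Replicas       int
	Version        string
}

// Cluster is a hypothetical summary of the observed PostgreSQL cluster.
type Cluster struct {
	Exists   bool
	Replicas int
	Version  string
}

// decide compares desired and observed state and returns a single action.
func decide(desired StorageNode, observed Cluster) Action {
	switch {
	case desired.Deleting && !observed.Exists:
		return ActionNone
	case desired.Deleting && desired.DeletionPolicy == "Retain":
		return ActionRetain
	case desired.Deleting:
		return ActionDelete
	case !observed.Exists:
		return ActionCreate
	case desired.Replicas != observed.Replicas || desired.Version != observed.Version:
		return ActionUpdate
	default:
		return ActionNone
	}
}

func main() {
	node := StorageNode{Replicas: 3, Version: "15"}
	fmt.Println(decide(node, Cluster{}))
	fmt.Println(decide(node, Cluster{Exists: true, Replicas: 1, Version: "15"}))
	fmt.Println(decide(node, Cluster{Exists: true, Replicas: 3, Version: "15"}))

	node.Deleting = true
	node.DeletionPolicy = "Retain"
	fmt.Println(decide(node, Cluster{Exists: true, Replicas: 3, Version: "15"}))
}
```

Keeping the decision in a pure function like this makes it easy to cover with table-driven or Ginkgo tests before any Kubernetes client calls are involved.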

Relevant Skills

1. Master Go language, Ginkgo test framework
2. Have a basic understanding of Apache ShardingSphere Concepts
3. Be familiar with Kubernetes Operator, kubebuilder framework

Targets files

StorageNode Controller - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/controllers/storagenode_controller.go

https://github.com/apache/shardingsphere/pull/22053
https://github.com/apache/shardingsphere/pull/22057/
https://github.com/apache/shardingsphere/pull/22166/
https://github.com/apache/shardingsphere/pull/22182

Relevant Skills

  •  Master Java language
  •  Have a basic understanding of Zookeeper
  •  Be familiar with MySQL/Postgres SQLs

Mentor

Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

Zhengqiang Duan, PMC of Apache ShardingSphere, duanzhengqiang@apache.org

Difficulty: Major
Project size: ~350 hour (large)
Potential mentors:
Chuxin Chen, mail: tuichenchuxin (at) apache.org
Liyao Miao, mail: miaoliyao (at) apache.org
Project Devs, mail: dev (at) shardingsphere.apache.org

Apache ShardingSphere

Add ShardingSphere Kafka source connector

Introduce JVM chaos to ShardingSphere

Apache ShardingSphere

Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

Page:   https://shardingsphere.apache.org/
Github:  https://github.com/apache/shardingsphere 

Background

There is a proposal about the background of ChaosEngineering, as below:

The community recently added a CDC (change data capture) feature. After a client logs in, the change feed is published over the established network connection and can then be consumed.

Since Kafka is a popular distributed event streaming platform, it's useful to import the change feed into Kafka for later processing.

Task

  1. Get familiar with ShardingSphere CDC client usage: create a publication and subscribe to the change feed.
  2. Get familiar with Kafka connector development: develop a source connector, integrate it with ShardingSphere CDC, and persist the change feed to Kafka topics properly.
  3. Add unit tests and an E2E integration test.

Relevant Skills

  1. Java language
  2. Basic knowledge of CDC and Kafka
  3. Maven

References

Introduce ChaosEngineering for ShardingSphere #32
We also proposed a generic controller for ShardingSphereChaos, as below:

[GSoC 2023] Introduce New CRD ShardingSphereChaos #272
The ShardingSphereChaos controller targets different kinds of chaos tests. JVMChaos is an important one.

Task

Write several scripts to implement different JVMChaos for main features of ShardingSphere. The specific case list is as follows.

  • Add scripts injecting chaos to DataSharding
  • Add scripts injecting chaos to ReadWriteSplitting
  • Add scripts injecting chaos to DatabaseDiscovery
  • Add scripts injecting chaos to Encryption
  • Add scripts injecting chaos to Mask
  • Add scripts injecting chaos to Shadow
    Basically, these scripts will cause unexpected behaviour while executing the related DistSQL.

Relevant Skills

  • Master Go language, Ginkgo test framework
  • Have a deep understanding of Apache ShardingSphere concepts and practices.
  • JVM bytecode manipulation mechanisms like Byteman, Byte Buddy.

Targets files

JVMChaos Scripts - https://github.com/apache/shardingsphere-on-cloud/

issues/22500
  • https://kafka.apache.org/documentation/#connect_development
  • https://github.com/apache/kafka/tree/trunk/connect/file/src
  • https://github.com/confluentinc/kafka-connect-jdbc
  • Local Test Steps

    1. Modify `conf/server.yaml` and uncomment `cdc-server-port: 33071` to enable CDC. (Refer to step 2)
    2. Configure the proxy; refer to `Prerequisites` and `Procedure` in build (a newer version can be used too; the current stable version is 5.3.1).
    3. Start the proxy server; it will start the CDC server too.
    4. Download the ShardingSphere source code from https://github.com/apache/shardingsphere , then modify and run `org.apache.shardingsphere.data.pipeline.cdc.client.example.Bootstrap`. By default, `Bootstrap` prints `records:`.
    5. Execute some INSERT/UPDATE/DELETE SQL statements in the proxy to generate a change feed, and then check the `Bootstrap` console.

    Mentor

    Hongsheng Zhong, PMC of Apache ShardingSphere, zhonghongsheng@apache.org

    Xinze Guo, Committer of Apache ShardingSphere, azexin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Hongsheng Zhong, mail: zhonghongsheng (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    chaos/jvmchaos/scripts/

    Mentor
    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org
    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    Apache ShardingSphere Introduce New CRD ShardingSphereChaos

    Apache ShardingSphere

    Apache ShardingSphere

    Apache ShardingSphere Write a converter to generate DistSQL

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

    Page: https://shardingsphere.apache.org/
    Github: https://github.com/apache/shardingsphere 

    Background

    Currently we are trying to promote StorageNode as a major CRD to represent a set of storage units for ShardingSphere.

    Task

    The elementary task is for the storage node controller to manage the lifecycle of a set of storage units, like PostgreSQL, in Kubernetes.

    We don't want to reinvent the wheel by building another pg-operator, so consider using a predefined parameter group to generate the target CRD.

  • [ ] Generate DistSQL according to the Golang struct `EncryptionRule`
  • [ ] Generate DistSQL according to the Golang struct `ShardingRule`
  • [ ] Generate DistSQL according to the Golang struct `ReadWriteSplittingRule`
  • [ ] Generate DistSQL according to the Golang struct `MaskRule`
  • There is a proposal about the background of ChaosEngineering, as below:

    The ShardingSphereChaos controller is aiming at different chaos tests. 

    Task

    Propose a generic controller for ShardingSphereChaos, which reconcile CRD ShardingSphereChaos, prepare, execute and verify test.

    • [ ] Support common ShardingSphere features, prepare test rules and dataset
    • [ ] Generating chaos type according to the backend implementation
    • [ ] Verify testing result with DistSQL or other tools
    • [ ] Generate DistSQL according to the Golang struct `ShadowRule`

    Relevant Skills

    1. Master Go language, Ginkgo test framework
    2. Have a deep understanding of Apache ShardingSphere concepts and practices, and a basic understanding of DistSQL.
    3. Kubernetes operator pattern, kube-builder 

    Targets files

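    DistSQL Converter - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/distsql/converter.go, etc.
    ShardingSphereChaos Controller - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/controllers/chaos_controller.go, etc.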

    Example

    A struct defined as below:

    ```golang
    type EncryptRule struct{}

    // ToDistSQL renders the rule as DistSQL; the placeholder return stands in for the real body.
    func (t EncryptRule) ToDistSQL() string { return "" }
    ```
    Invoking ToDistSQL() generates a DistSQL statement for the EncryptRule, such as:

    ```SQL
    CREATE ENCRYPT RULE t_encrypt (....
    ```

    References:


    Mentor

    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    Apache ShardingSphere

    Introduce new CRD as StorageNode for better usability

    Enhance SQLNodeConverterEngine to support more MySQL SQL statements

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layer, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by underlying databases fragmentation.

    Page: https://shardingsphere.apache.org/
    Github: https://github.com/apache/shardingsphere

    Background

    There is a proposal about new CRD Cluster and ComputeNode as belows:

    • #167
    • #166

    Currently we try to promote StorageNode as major CRD to represent a set of storage units for ShardingSphere.

    Task

    The basic task is for the StorageNode controller to manage the lifecycle of a set of storage units, such as PostgreSQL, in Kubernetes.

    We do not want to reinvent a wheel like pg-operator, so consider using a predefined parameter group to generate the target CRD.

    • [ ] Create a PostgreSQL cluster when a StorageNode with pg parameters is created
    • [ ] Update the PostgreSQL cluster when the StorageNode is updated
    • [ ] Delete the PostgreSQL cluster when the StorageNode is deleted. Note that this may need a deletion strategy.
    • [ ] Reconcile the StorageNode according to the status of the PostgreSQL cluster.
    • [ ] The status of the StorageNode will be consumed by the common DistSQL statements related to storage units
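
    The create, update and delete items above amount to comparing the desired StorageNode with the observed PostgreSQL cluster on each reconcile pass. The sketch below shows only that decision step in plain Go, without any Kubernetes client code. All types, fields and the "instances" parameter are hypothetical simplifications, not the operator's actual API.

    ```go
    package main

    import "fmt"

    // Action is the step a controller would take in one reconcile pass.
    type Action string

    const (
        ActionCreate Action = "create"
        ActionUpdate Action = "update"
        ActionDelete Action = "delete"
        ActionNone   Action = "none"
    )

    // StorageNode is a hypothetical, simplified view of the CRD.
    type StorageNode struct {
        Deleting   bool              // true once deletion has been requested
        Parameters map[string]string // predefined parameter group
    }

    // Cluster is a hypothetical view of the observed PostgreSQL cluster.
    type Cluster struct {
        Parameters map[string]string
    }

    // decide compares the desired state with the observed state.
    // observed is nil when no cluster exists yet.
    func decide(node StorageNode, observed *Cluster) Action {
        switch {
        case node.Deleting && observed != nil:
            return ActionDelete
        case node.Deleting:
            return ActionNone
        case observed == nil:
            return ActionCreate
        case !sameParams(node.Parameters, observed.Parameters):
            return ActionUpdate
        default:
            return ActionNone
        }
    }

    // sameParams reports whether both maps hold identical key/value pairs.
    func sameParams(a, b map[string]string) bool {
        if len(a) != len(b) {
            return false
        }
        for k, v := range a {
            if w, ok := b[k]; !ok || w != v {
                return false
            }
        }
        return true
    }

    func main() {
        node := StorageNode{Parameters: map[string]string{"instances": "3"}}
        fmt.Println(decide(node, nil))
        fmt.Println(decide(node, &Cluster{Parameters: map[string]string{"instances": "1"}}))
    }
    ```

    A real controller would additionally honour the deletion strategy mentioned above and write the observed cluster status back to the StorageNode.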

    Relevant Skills

    1. Mastery of the Go language and the Ginkgo test framework
    2. A basic understanding of Apache ShardingSphere concepts
    3. Familiarity with the Kubernetes Operator pattern and the kubebuilder framework

    Target files

    StorageNode Controller - https://github.com/apache/shardingsphere-on-cloud/blob/main/shardingsphere-operator/pkg/controllers/storagenode_controller.go

    Mentor

    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    Apache ShardingSphere Introduce JVM chaos to ShardingSphere

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layers, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by the fragmentation of underlying databases.

    Page: https://shardingsphere.apache.org/
    Github: https://github.com/apache/shardingsphere

    Background

    There is a proposal describing the background of ChaosEngineering, as follows:

    • Introduce ChaosEngineering for ShardingSphere #32

    We have also proposed a generic controller for ShardingSphereChaos, as follows:

    • [GSoC 2023] Introduce New CRD ShardingSphereChaos #272

    The ShardingSphereChaos controller targets different kinds of chaos tests. JVMChaos is an important one.

    Task

    Write several scripts to implement different JVMChaos for main features of ShardingSphere. The specific case list is as follows.

    • Add scripts injecting chaos to DataSharding
    • Add scripts injecting chaos to ReadWritingSplitting
    • Add scripts injecting chaos to DatabaseDiscovery
    • Add scripts injecting chaos to Encryption
    • Add scripts injecting chaos to Mask
    • Add scripts injecting chaos to Shadow

    Basically, these scripts will cause unexpected behaviour while executing the related DistSQL.
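
    To make the idea of a chaos script concrete, the sketch below renders a fault-injection rule in Byteman's rule-script syntax from Go. It assumes Byteman is the injection mechanism, and the project may well choose another one. The class name, method name and rule name are hypothetical placeholders, not real ShardingSphere identifiers.

    ```go
    package main

    import "fmt"

    // ChaosRule describes one Byteman rule. All values used in main are hypothetical.
    type ChaosRule struct {
        Name   string // free-text rule name
        Class  string // fully qualified target class
        Method string // target method
        Action string // Byteman action executed when the rule fires
    }

    // Render formats the rule as a Byteman rule script that fires at method entry.
    func (r ChaosRule) Render() string {
        return fmt.Sprintf(
            "RULE %s\nCLASS %s\nMETHOD %s\nAT ENTRY\nIF true\nDO %s\nENDRULE\n",
            r.Name, r.Class, r.Method, r.Action)
    }

    func main() {
        rule := ChaosRule{
            Name:   "inject exception into encryption",
            Class:  "org.example.HypotheticalEncryptor",
            Method: "encrypt",
            Action: `throw new java.lang.RuntimeException("chaos")`,
        }
        fmt.Print(rule.Render())
    }
    ```

    One such rule per feature (sharding, encryption, mask and so on) would give a small, reviewable set of scripts whose effect can then be checked by running the related DistSQL.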

    Relevant Skills

    • Mastery of the Go language and the Ginkgo test framework
    • A deep understanding of Apache ShardingSphere concepts and practices
    • Familiarity with JVM bytecode mechanisms such as ByteMan and ByteBuddy

    Target files

    JVMChaos Scripts - https://github.com/apache/shardingsphere-on-cloud/chaos/jvmchaos/scripts/

    Mentor
    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org
    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    The ShardingSphere SQL federation engine supports complex SQL statements, including cross-database join queries, subqueries and aggregation queries. An important part of the SQL federation engine converts the SQL statement parsed by ShardingSphere into a SqlNode, so that Calcite can be used for SQL optimization and federated queries.

    Task

    This issue aims to fix the exceptions that occur when SQLNodeConverterEngine converts MySQL statements. The specific cases are listed below.

    • select_char
    • select_extract
    • select_from_dual
    • select_from_with_table
    • select_group_by_with_having_and_window
    • select_not_between_with_single_table
    • select_not_in_with_single_table
    • select_substring
    • select_trim
    • select_weight_string
    • select_where_with_bit_expr_with_ampersand
    • select_where_with_bit_expr_with_caret
    • select_where_with_bit_expr_with_div
    • select_where_with_bit_expr_with_minus_interval
    • select_where_with_bit_expr_with_mod
    • select_where_with_bit_expr_with_mod_sign
    • select_where_with_bit_expr_with_plus_interval
    • select_where_with_bit_expr_with_signed_left_shift
    • select_where_with_bit_expr_with_signed_right_shift
    • select_where_with_bit_expr_with_vertical_bar
    • select_where_with_boolean_primary_with_comparison_subquery
    • select_where_with_boolean_primary_with_is
    • select_where_with_boolean_primary_with_is_not
    • select_where_with_boolean_primary_with_null_safe
    • select_where_with_expr_with_and_sign
    • select_where_with_expr_with_is
    • select_where_with_expr_with_is_not
    • select_where_with_expr_with_not
    • select_where_with_expr_with_not_sign
    • select_where_with_expr_with_or_sign
    • select_where_with_expr_with_xor
    • select_where_with_predicate_with_in_subquery
    • select_where_with_predicate_with_regexp
    • select_where_with_predicate_with_sounds_like
    • select_where_with_simple_expr_with_collate
    • select_where_with_simple_expr_with_match

    You need to compare the actual result with the expected one, and then correct the logic in SQLNodeConverterEngine so that the actual result is consistent with the expected one.

    After you make changes, remember to add the case to SUPPORTED_SQL_CASE_IDS to ensure it is tested.

    Note that the following pull request is a good example:
    https://github.com/apache/shardingsphere/pull/14492

    Relevant Skills

    1. Mastery of the Java language
    2. A basic understanding of ANTLR g4 files
    3. Familiarity with MySQL and Calcite SqlNode

    Target files

    SQLNodeConverterEngineIT

    Apache ShardingSphere Introduce New CRD ShardingSphereChaos

    Apache ShardingSphere

    Apache ShardingSphere is positioned as a Database Plus, and aims at building a standard layer and ecosystem above heterogeneous databases. It focuses on how to reuse existing databases and their respective upper layers, rather than creating a new database. The goal is to minimize or eliminate the challenges caused by the fragmentation of underlying databases.

    Page: https://shardingsphere.apache.org/
    Github: https://github.com/apache/shardingsphere 

    Background

    There is a proposal describing the background of ChaosEngineering.

    The ShardingSphereChaos controller targets different kinds of chaos tests.

    Task

    Propose a generic controller for ShardingSphereChaos that reconciles the ShardingSphereChaos CRD and prepares, executes and verifies tests.

    • [ ] Support common ShardingSphere features, prepare test rules and datasets
    • [ ] Generate the chaos type according to the backend implementation
    • [ ] Verify the test result with DistSQL or other tools
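
    One way to picture the prepare, execute and verify flow is as a small phase machine that the controller advances on each reconcile. The sketch below is only an illustration: the phase names and the next function are hypothetical and do not come from the ShardingSphereChaos CRD.

    ```go
    package main

    import "fmt"

    // Phase is a hypothetical status field for a chaos test run.
    type Phase string

    const (
        PhasePrepare Phase = "Prepare"
        PhaseExecute Phase = "Execute"
        PhaseVerify  Phase = "Verify"
        PhaseDone    Phase = "Done"
        PhaseFailed  Phase = "Failed"
    )

    // next returns the phase that follows current.
    // Terminal phases are kept as-is; a failed step moves the run to PhaseFailed.
    func next(current Phase, stepSucceeded bool) Phase {
        if current == PhaseDone || current == PhaseFailed {
            return current
        }
        if !stepSucceeded {
            return PhaseFailed
        }
        switch current {
        case PhasePrepare:
            return PhaseExecute
        case PhaseExecute:
            return PhaseVerify
        case PhaseVerify:
            return PhaseDone
        }
        // Unknown phases are treated as failures in this sketch.
        return PhaseFailed
    }

    func main() {
        p := PhasePrepare
        for p != PhaseDone && p != PhaseFailed {
            fmt.Println(p)
            p = next(p, true)
        }
        fmt.Println(p)
    }
    ```

    Storing the phase in the resource status lets each reconcile pass resume where the previous one stopped, which suits the level-triggered style of Kubernetes controllers.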

    Relevant Skills

    1. Mastery of the Go language and the Ginkgo test framework
    2. A deep understanding of Apache ShardingSphere concepts and practices
    3. Familiarity with the Kubernetes operator pattern and kubebuilder

    Target files

    ShardingSphereChaos Controller - https://github.com/apache/shardingsphere-on-cloud/shardingsphere-operator/pkg/controllers/chaos_controller.go, etc.

    /blob/master/test/it/optimizer/src/test/java/org/apache/shardingsphere/test/it/optimize/SQLNodeConverterEngineIT.java 

    Mentor

    Zhengqiang Duan, PMC of Apache ShardingSphere, duanzhengqiang@apache.org

    Liyao Miao, Committer of Apache ShardingSphere, miaoliyao@apache.org

    Chuxin Chen, Committer of Apache ShardingSphere, tuichenchuxin@apache.org

    Trista Pan, PMC of Apache ShardingSphere, panjuan@apache.org

    Difficulty: Major
    Project size: ~350 hour (large)
    Potential mentors:
    Liyao Miao, mail: miaoliyao (at) apache.org
    Zhengqiang Duan, mail: duanzhengqiang (at) apache.org
    Project Devs, mail: dev (at) shardingsphere.apache.org

    ...