Syncope

Debezium integration for live sync

Debezium provides a means of transforming changes from RDBMS, MongoDB and Cassandra into Kafka messages.

This tool can be leveraged to implement a listener-like approach to enable a "live-sync" mechanism from External Resources not requiring ConnId.
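
To make the listener idea a bit more concrete, here is a minimal sketch of how a consumer might map a Debezium change event onto a sync action. It assumes the usual Debezium envelope (`op`, `before`, `after`) delivered as JSON text; `SyncAction` and `to_sync_action` are hypothetical names, not part of Syncope or Debezium, and no Kafka client is wired in.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Debezium "op" codes: c = create, u = update, d = delete, r = snapshot read.
_OP_TO_ACTION = {"c": "CREATE", "r": "CREATE", "u": "UPDATE", "d": "DELETE"}


@dataclass
class SyncAction:
    """Hypothetical description of what a live-sync listener should do."""
    action: str
    row: dict


def to_sync_action(message: str) -> Optional[SyncAction]:
    """Translate one Debezium change event (JSON text) into a SyncAction."""
    event = json.loads(message)
    if event is None:
        # A null value (tombstone) carries no row data, so there is nothing to sync.
        return None
    # When schemas are embedded in the message, the envelope sits under "payload".
    payload = event.get("payload", event)
    action = _OP_TO_ACTION.get(payload.get("op"))
    if action is None:
        return None
    # Deletes carry the old row in "before"; other operations carry it in "after".
    row = payload["before"] if action == "DELETE" else payload["after"]
    return SyncAction(action=action, row=row or {})
```

A real listener would feed each Kafka record value into `to_sync_action` and hand the result to whatever provisioning logic the project ends up defining.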

Difficulty: Major
Potential mentors:
Francesco Chicchiriccò, mail: ilgrosso (at) apache.org
Project Devs, mail:

Allow export for individual items in XML

Provide functionality in the admin console (and via REST) to view the XML configuration of an individual item. For example, a user might want to see the XML representation of a new configuration parameter, a user object, a group, provisioning rules, and so on.

Difficulty: Minor
Potential mentors:
Andrea Patricelli, mail: andrea.patricelli (at) apache.org
Project Devs, mail:

...

Add Plugin Support for CQLSH

Currently the Cassandra drivers offer a plugin authenticator architecture for the support of different authentication methods. This has been leveraged to provide support for LDAP, Kerberos, and Sigv4 authentication. Unfortunately, cqlsh, the included CLI tool, does not offer such support. Switching to a new enhanced authentication scheme thus means being cut off from using cqlsh in normal operation.

We should have a means of using the same plugins and authentication providers as the Python Cassandra driver.
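
One way to picture this is a small loader that turns a module name and class name (for example, read from a config file) into an auth provider instance. This is only a sketch under that assumption; `load_auth_provider` is a hypothetical helper and not the design proposed in the CEP.

```python
import importlib
from typing import Any


def load_auth_provider(module_name: str, class_name: str, **options: Any) -> Any:
    """Import `module_name`, look up `class_name`, and instantiate it with `options`."""
    module = importlib.import_module(module_name)
    try:
        provider_cls = getattr(module, class_name)
    except AttributeError as exc:
        raise ValueError(f"{module_name} has no class named {class_name}") from exc
    # The keyword options are passed through untouched, so each plugin
    # decides which settings it understands.
    return provider_cls(**options)
```

With the Python driver installed, `load_auth_provider("cassandra.auth", "PlainTextAuthProvider", username="u", password="p")` would build the driver's plain-text provider; a third-party plugin would only need to be importable under its own module name.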

Here's a link to an initial draft of CEP.

Difficulty: Normal
Potential mentors:
paulo, mail: paulo (at) apache.org
Project Devs, mail: dev (at) cassandra.apache.org

Add ability to ttl snapshots

It should be possible to add a TTL to snapshots, after which they are cleaned up automatically.

This will be useful together with the auto_snapshot option, where you want to keep an emergency snapshot in case of accidental drop or truncation but automatically remove it after a specified period when it's no longer useful. So in addition to allowing a user to specify a snapshot TTL on nodetool snapshot, we should have an auto_snapshot_ttl option that allows a user to set a TTL for automatic snapshots on drop/truncate.
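
The expiry rule itself is simple enough to sketch. The snippet below is an illustration only: `Snapshot` and `expired_snapshots` are hypothetical names, and it assumes each snapshot records its creation time and an optional TTL.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Snapshot:
    """Hypothetical snapshot record with an optional time-to-live."""
    tag: str
    created_at: datetime
    ttl: Optional[timedelta] = None  # None means the snapshot never expires

    def is_expired(self, now: datetime) -> bool:
        return self.ttl is not None and now >= self.created_at + self.ttl


def expired_snapshots(snapshots: List[Snapshot], now: datetime) -> List[Snapshot]:
    """Return the snapshots a periodic cleanup task would be allowed to delete."""
    return [s for s in snapshots if s.is_expired(now)]
```

Passing `now` in explicitly keeps the rule easy to test; a background task would call it with the current time and remove whatever comes back.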

Difficulty: Normal
Potential mentors:
paulo, mail: paulo (at) apache.org
Project Devs, mail: dev (at) cassandra.apache.org

Prevent and fail-fast any attempts to incremental repair cdc/mv tables

Running incremental repairs on CDC or MV tables breaks them.

Any attempt to run incremental repair on such tables should be prevented and should fail fast with a clear error message.
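
As a rough illustration of the fail-fast behaviour, the check could run before any repair work is scheduled and reject the whole request. Everything here is a sketch: `TableInfo`, its flags, and `check_incremental_repair` are hypothetical and do not mirror Cassandra's internal classes.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical, simplified view of a table's metadata."""
    keyspace: str
    name: str
    cdc_enabled: bool = False
    has_materialized_views: bool = False


class IncrementalRepairNotSupported(Exception):
    """Raised when a requested table cannot be repaired incrementally."""


def check_incremental_repair(tables: Iterable[TableInfo]) -> None:
    """Raise before any repair work starts if a CDC or MV table is included."""
    offending: List[str] = []
    for table in tables:
        reasons = []
        if table.cdc_enabled:
            reasons.append("CDC")
        if table.has_materialized_views:
            reasons.append("MV")
        if reasons:
            offending.append(f"{table.keyspace}.{table.name} ({', '.join(reasons)})")
    if offending:
        raise IncrementalRepairNotSupported(
            "Incremental repair is not supported on CDC/MV tables: " + ", ".join(offending)
        )
```

Collecting every offending table before raising means the operator sees the full list in one error message instead of fixing tables one at a time.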

Difficulty: Normal
Potential mentors:
paulo, mail: paulo (at) apache.org
Project Devs, mail: dev (at) cassandra.apache.org

Script to autogenerate cassandra.yaml

It would be useful to have a script that asks the user a few questions and generates a recommended cassandra.yaml based on their answers. This will help solve issues like selecting num_tokens. It can also be integrated into OS-specific packaging tools such as debconf[1]. Rather than just documenting this on the website, it is best to provide a simple script that auto-generates configuration based on common use cases.

[1] https://wiki.debian.org/debconf
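
A minimal sketch of the rendering half of such a script is shown below. The profile names and the num_tokens values attached to them are placeholders for illustration, not official recommendations, and `render_cassandra_yaml` is a hypothetical function name.

```python
import json
from typing import Dict

# Illustrative placeholders only, NOT official recommendations.
_NUM_TOKENS_BY_PROFILE = {"small": 4, "default": 16}


def render_cassandra_yaml(answers: Dict[str, str]) -> str:
    """Render a small cassandra.yaml fragment from questionnaire answers."""
    profile = answers.get("profile", "default")
    if profile not in _NUM_TOKENS_BY_PROFILE:
        raise ValueError(f"unknown profile: {profile}")
    settings = {
        "cluster_name": answers.get("cluster_name", "Test Cluster"),
        "num_tokens": _NUM_TOKENS_BY_PROFILE[profile],
        "listen_address": answers.get("listen_address", "localhost"),
    }
    # json.dumps yields double-quoted strings and bare integers,
    # both of which are valid YAML scalars.
    lines = [f"{key}: {json.dumps(value)}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"
```

The answers could come from `input()` prompts or from a packaging tool; keeping the rendering separate from the questions makes either front end possible.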

Difficulty: Normal
Potential mentors:
paulo, mail: paulo (at) apache.org
Project Devs, mail: dev (at) cassandra.apache.org

...

Add nodetool command to display or export the contents of a virtual table

Several virtual tables were recently added, but they're currently only accessible via cqlsh or programmatically. While this is valuable for many use cases, operators are accustomed to the convenience of querying system metrics with a simple nodetool command.

In addition to that, a relatively common request is to provide nodetool output in different formats (JSON, YAML and even XML) (CASSANDRA-5977, CASSANDRA-12035, CASSANDRA-12486, CASSANDRA-12698, CASSANDRA-12503). However, this requires lots of manual labor as each nodetool subcommand needs to be adapted to support new output formats.

I propose adding a new nodetool command that will consistently print to the standard output the contents of a virtual table. By default the command will print the output in a human-readable tabular format similar to cqlsh, but a "--format" parameter can be specified to modify the output to some other format like JSON or YAML.

It should be possible to limit the number of rows displayed and to filter the output to rows from a specific keyspace or table. The command should be flexible and provide simple hooks for registration and customization of new virtual tables.
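
The formatting, limit and filter behaviour can be sketched independently of nodetool. The function below is an illustration only: `format_rows` is a hypothetical name, rows are assumed to arrive as plain dictionaries, and the `keyspace_name` column used for filtering is an assumption about the virtual table's schema.

```python
import json
from typing import Dict, List, Optional


def format_rows(
    rows: List[Dict[str, object]],
    fmt: str = "table",
    limit: Optional[int] = None,
    keyspace: Optional[str] = None,
) -> str:
    """Filter, truncate and render rows as a text table or as JSON."""
    if keyspace is not None:
        rows = [r for r in rows if r.get("keyspace_name") == keyspace]
    if limit is not None:
        rows = rows[:limit]
    if fmt == "json":
        return json.dumps(rows, indent=2)
    if fmt != "table":
        raise ValueError(f"unsupported format: {fmt}")
    if not rows:
        return ""
    # Column order follows the first row; each column is as wide as its longest cell.
    columns = list(rows[0].keys())
    widths = {c: max(len(c), *(len(str(r[c])) for r in rows)) for c in columns}
    header = " | ".join(c.ljust(widths[c]) for c in columns)
    separator = "-+-".join("-" * widths[c] for c in columns)
    body = [" | ".join(str(r[c]).ljust(widths[c]) for c in columns) for r in rows]
    return "\n".join([header, separator, *body])
```

Because every virtual table goes through the same renderer, adding a new output format would mean one new branch here rather than a change to each subcommand.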

I propose calling this command nodetool show <virtualtable> (naming bikeshedding welcome), for example:

    nodetool show --list
    caches
    clients
    internode_inbound
    internode_outbound
    settings
    sstable_tasks
    system_properties
    thread_pools

    nodetool show clients --format yaml
    ...
    nodetool show internode_outbound --format json
    ...
    nodetool show sstable_tasks --keyspace my_ks --table my_table
    ...

Difficulty: Normal
Potential mentors:
paulo, mail: paulo (at) apache.org
Project Devs, mail: dev (at) cassandra.apache.org

Beam

Profile and improve performance for the local FnApiRunner

The FnApiRunner is undergoing a series of changes to support streaming. These changes are altering its execution significantly, and may introduce inefficiencies.

This project has the following deliverables:

  • A report with results from profiling the execution of a pipeline, identifying hotspots and inefficiencies
  • Code improvements to speed up the execution of the FnApiRunner
  • Improvements to the FnApiRunner manual to instruct others on how to do profiling.
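
As a starting point for the profiling deliverable, the standard library's cProfile can already produce a hotspot report. The sketch below is an assumption-laden example: `profile_call` is a hypothetical helper and `run_pipeline` is a stand-in workload, not an actual Beam pipeline.

```python
import cProfile
import io
import pstats
from typing import Callable


def profile_call(func: Callable[[], object], top: int = 10) -> str:
    """Run `func` under cProfile and return a report of the costliest calls."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sorting by cumulative time surfaces the call paths where time accumulates.
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


def run_pipeline() -> int:
    """Hypothetical stand-in for running a pipeline on the FnApiRunner."""
    return sum(i * i for i in range(100_000))
```

Calling `print(profile_call(run_pipeline))` prints the report; swapping `run_pipeline` for a function that runs a real pipeline would give a first view of where time goes.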


Tools that you may need to use:

Contact Pablo at dev@beam.apache.org to ask questions about this project.

Difficulty: P2
Potential mentors:
Pablo Estrada, mail: pabloem (at) apache.org
Project Devs, mail: dev (at) beam.apache.org

...

Apache APISIX: improve the website

Apache APISIX

Apache APISIX is a dynamic, real-time, high-performance API gateway based on the Nginx library and etcd. It has a standalone website to help more people learn about Apache APISIX.

Background

The Apache APISIX website shows people what Apache APISIX is, and it will include up-to-date docs so that developers can find guides more easily.

Task

On the website[1] and in its repo[2], we are going to refactor the homepage and improve the docs, including APISIX's own docs and others such as the release guide.

Relevant Skills

  • TypeScript
  • React.js

Mentor

Zhiyuan, PMC of Apache APISIX, juzhiyuan@apache.org


[1] https://apisix.apache.org/

[2] https://github.com/apache/apisix-website

Difficulty: Major
Potential mentors:
Zhiyuan, mail: juzhiyuan (at) apache.org
Project Devs, mail: dev (at) apisix.apache.org

Record short videos about Apache APISIX

Apache APISIX

Apache APISIX is a dynamic, real-time, high-performance API gateway based on the Nginx library and etcd. It has a standalone website to help more people learn about Apache APISIX.

Background

Apache APISIX has an official website[1]. We would like to record more videos covering how Apache APISIX works, what Apache APISIX is, how to write plugins for Apache APISIX, and so on, to help more users understand what it is and how it works.

Task

  • Draft the video list outline (I will provide it);
  • Read and try out the Apache APISIX docs; we must clearly know what it is;
  • Record videos (30 videos IMO).

Relevant Skills

  • Reading docs;
  • Using Final Cut or PR;
  • Knowledge of Apache APISIX.

Mentor

Zhiyuan, PMC of Apache APISIX, juzhiyuan@apache.org

[1] https://apisix.apache.org/

Difficulty: Major
Potential mentors:
Zhiyuan, mail: juzhiyuan (at) apache.org
Project Devs, mail: dev (at) apisix.apache.org

Apache APISIX: Enhanced verification for APISIX ingress controller

Apache APISIX

Apache APISIX is a dynamic, real-time, high-performance API gateway based on the Nginx library and etcd.

APISIX provides rich traffic management features such as load balancing, dynamic upstream, canary release, circuit breaking, authentication, observability, and more.

You can use Apache APISIX to handle traditional north-south traffic, as well as east-west traffic between services. It can also be used as a k8s ingress controller.

Background

We can use APISIX as a Kubernetes ingress, using CRDs (Custom Resource Definitions) on Kubernetes to define APISIX objects such as route/service/upstream/plugins.

We have done basic structural verification of the CRDs, but more verification is still needed: for example, plugin schema verification, dependency verification between APISIX objects, and rule conflict verification. All of these checks need to complete before a CRD is applied.

Task

1. Implement a validating admission webhook.
2. Support plugins schema verification.
3. Support object dependency verification.
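
The three tasks above can be sketched as a single review function. This is an illustration in Python rather than the Go the project would use, and it rests on assumptions: the `spec` layout (`plugins`, `upstream`), the plugin schema table, and the `review` function are all hypothetical. Only the AdmissionReview response shape (`uid`, `allowed`, `status.message`) follows the Kubernetes admission API.

```python
from typing import Any, Dict, List, Set


def review(
    admission_review: Dict[str, Any],
    plugin_required_keys: Dict[str, Set[str]],
    known_upstreams: Set[str],
) -> Dict[str, Any]:
    """Validate a (hypothetical) APISIX object and build an AdmissionReview response."""
    request = admission_review["request"]
    spec = request["object"].get("spec", {})
    errors: List[str] = []

    # Plugin schema verification: every plugin must be known and fully configured.
    for plugin in spec.get("plugins", []):
        name = plugin.get("name")
        if name not in plugin_required_keys:
            errors.append(f"unknown plugin: {name}")
            continue
        missing = plugin_required_keys[name] - set(plugin.get("config", {}))
        if missing:
            errors.append(f"plugin {name} is missing: {', '.join(sorted(missing))}")

    # Dependency verification: a referenced upstream must already exist.
    upstream = spec.get("upstream")
    if upstream is not None and upstream not in known_upstreams:
        errors.append(f"upstream not found: {upstream}")

    response: Dict[str, Any] = {"uid": request["uid"], "allowed": not errors}
    if errors:
        response["status"] = {"message": "; ".join(errors)}
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview", "response": response}
```

A webhook server would decode the incoming AdmissionReview JSON, call something like `review`, and send the returned object back, so invalid objects are rejected before they are stored.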

Relevant Skills

1. Golang
2. Be familiar with Apache APISIX's admin API
3. Be familiar with kubernetes

Mentor

Wei Jin, PMC of Apache APISIX, kvn@apache.org

Difficulty: Major
Potential mentors:
Wei Jin, mail: kvn (at) apache.org
Project Devs, mail: dev (at) apisix.apache.org