Apache ShardingSphere Enterprise User Case - Energy Monster · ShardingSphere - Blog

Apache ShardingSphere Enterprise User Case - Energy Monster

Energy Monster’s application of ShardingSphere-JDBC

Energy Monster is a consumer tech company with the mission to energize everyday life. The company is the largest provider of mobile device charging services in Asia.

As the company’s business concurrency volume is getting larger, the amount of data generated (users, orders, activities, etc.) increases each day. The traditional relational database has proven to be inadequate in supporting millions or tens of millions of data volumes in a single database or table.

Performance has been unable to meet the benchmark requirements of business development. Under these circumstances, data sharding is an effective way to solve the problem.

Technology selection

Under the Database Plus concept, ShardingSphere is designed to build an ecosystem on top of heterogeneous databases. The goal is to provide globally scalable and enhanced computing capabilities while maximizing the original database computing capabilities.

The interaction between applications and databases becomes oriented towards the Database Plus standard, therefore minimizing the impact of database fragmentation on upper-layer services.

Within the ShardingSphere ecosystem, ShardingSphere-JDBC is positioned as a lightweight Java framework, providing additional services in Java’s JDBC layer.

It uses the client to directly connect to the database and provide services in the form of a jar package, without additional deployment and dependence. It can be understood as an enhanced version of the JDBC driver, which is fully compatible with JDBC and various ORM frameworks.

ShardingSphere-JDBC enables developers to focus only on the work outside the data layer by coordinating the data read and write under the data sharding, instead of using business code to manually select databases and tables.

Business case

UCS is Energy Monster’s user-centric service providing basic functionality for users on the Server side. In 2018, it was stripped from PHP Server and moved to the Java technology stack to implement microservitization.

It involves the design of new databases and tables and data cleaning and migration. The whole switchover process was expected to ensure the following functions:

Solutions to data cleansing and migration

img

Data sharding strategy

The database adopts a database shards design, divided into 16 databases. The default shard key is user_id and the default sharding strategy user_id is mod 16, such as ${user_id % 16} for the user table. For SQL that does not carry shard keys, broadcast routing is used.

img

user_id is used as the shard key because user_id can cover most business scenarios, and other fields possibly can be empty. In the local test, the query of shard key strategy (openId,mobile) took 50ms to 200ms.

Using the sharding algorithm

There are currently three sharding algorithms available.

Upgrading ShardingSphere-JDBC

ShardingSphere-JDBC is used in multiple business scenarios, such as order, inventory, and finance. By 2021, the R&D groups or teams were using different versions of ShardingSphere-JDBC, ranging from 1.X to 4.X, which is difficult to achieve unified maintenance in the later stage.

Additionally, there are some potential bugs and missing functions in the earlier version. Based on requirements for unified management and availability, we implemented a project to unify the ShardingSphere-JDBC’s versions used by the company and upgrade them to a 4.1.1 stable version in April 2021.

The following problems were encountered during the upgrade:

1. It takes a long time to start the service after the upgrade.

ShardingSphere-JDBC checks the metadata consistency of sub-tables when the service is started. The configuration item max.connections.size.per.quer (maximum number of connections that can be opened per query) is 1 by default. With a large number of tables, the loading process would be slow. You need to refer to the connection pool configuration to improve the loading speed.

2. There is no response when there is no shard key in the sub-table query.

img

Logical SQL query does not specify shard keys and it queries all the tables according to the whole database tables router in broadcasting routing.

The configuration items have 108 pieces of real tables in a database. According to the configuration of maxConnectionsizeperquery=50, ShardingSphere-JDBC uses the connection limit mode, divides the query requests into three groups, and merges the results with in-memory. As a result, 36 database connections are required for one query. But the maxActive configured by the druid thread pool is set to 20, resulting in a deadlock.

img

img

img

img

Solutions:

3. After upgrading from 1.X, an error message “Cannot update Sharding key” is displayed in SQL execution, and the actual shard key value is not updated.

To avoid data query failure caused by changing the shard key value, shard key detection is added to the SQL update in the 4.X version. The error can be rectified in the following ways:

4. A start failure is caused when using druid-spring-boot-starter, which is incompatible with Sharding-datasource.

The druid data connection pool starter will load and create a default data source. This will cause conflicts when ShardingSphere-JDBC creates data sources.

5. inline strategy reports an error in range query.

The inline strategy doesn’t support range query by default and the standard strategy is advised. Add the following configuration if the inline strategy is needed for the range query.

spring.shardingsphere.props.allow.range.query.with.inline.sharding: true

Note: Here all the inline strategy range queries will query each sub-table in broadcasting.

6. The “Cannot find owner from table” error is reported.

SQL (simplified):

select id from (select id from x) as a group by a.id

The 4.X version supports limited sub-queries. This problem is caused by the name of the intermediate table. Remove the table alias of select or group order or other fields.

https://github.com/apache/shardingsphere/issues/4810

7. The table’s primary key conflicts with the primary key generated by the SNOWFLAKE algorithm.

ShardingSphere provides flexible ways to configure distributed primary key generation strategies. In the sharding rule configuration module, you can configure the primary key generation strategy for each table.

By default, the snowflake algorithm is used to generate long integer data of 64bit. The snowflake generator needs to be configured with:

spring.shardingsphere.sharding.tables.x.key-generator.props.worker.id = ${dcc.node.id}

img

The company uses the apollo configuration center to deliver the node id of the service instance. The service uses multi-data sources. If you use the YAML file to load sharding configuration, the workId cannot be automatically loaded into sharding configuration items.

Solutions:

img

Use the custom generator type based on the built-in SnowflakeShardingKeyGenerator.

If the primary key is used as a shard key, configure max.vibration.offset based on the data sharding value to increase the vibration range.

img

8. The 3.X version reports an error when CASE WHEN statement is executed.

First, the 3.X and 4.X versions don’t support the case when statement.

The 3.X and 4.X versions have different logics when parsing the shard keys of case when’s update statement. The 4.X parserEngine.parse method will ignore the case when parsing parameters, resulting in inconsistency with the external parameter list and an error when 3.X executes the normal SQL.

The 3.X version works correctly because the first parameter of case when is intentionally set to the shard key when the SQL is written, and the case when statement comes first.

https://github.com/apache/shardingsphere/issues/13233

Solutions:

9. The logical table actualDataNodes is configured and no default value error is reported for the primary key.

img

img

The check.table.metadata.enabled=true is not configured for service, and the metadata consistency of sub-tables is not checked by default.

The first table of actualDataNodes configured by services does not exist, resulting in an empty GenerateKeyContenxt.

img

img

Solutions:

10. In version 3.0, there is a deadlock under the high concurrency of the full database and table router.

ShardingSphere-JDBC uses local transactions by default. In local transactions, the database connection is obtained asynchronously. Under high concurrency, it is possible that all database connections cannot be obtained, resulting in a deadlock.

img

img

img

img

Conclusion

As a ShardingSphere core user, Energy Monster’s upgrade process also reflects some problems that community users may encounter in the application of ShardingSphere.

Currently, Apache ShardingSphere’s stable version has been updated to 5.1.2 and has been optimized in terms of its functions, performance, testing, documentation, and examples.

You can refer to Apache ShardingSphere’s official website for more information. If you have any questions or suggestions, you are also welcome to give feedback on Github. The community will actively respond and discuss.

Project Links:

ShardingSphere Github

ShardingSphere Twitter

ShardingSphere Slack

Contributor Guide

GitHub Issues

Contributor Guide