Making ScyllaDB Simple with Charybdis

Published in ZeroFlucs · 5 min read · Oct 11, 2022

At ZeroFlucs we’ve been using open-source software throughout the business to help us grow and innovate, as we work to put simulation-based pricing technology into the hands of every bookmaker and wagering service provider on earth. Now, we’re starting to give something back.

The first package we’re releasing into the wild today is charybdis, our internal support package for working with ScyllaDB. In this post, we’ll go through what it is, why it exists, and why we hope it’s something worth sharing.

Named for the other monster that lived across the way from Scylla, Charybdis is a package we built to make our own lives easier, and we think it might make yours easier too.

Give Me Code Now!

Alright, alright! It’s over at our new GitHub org, zeroflucs-given:

GitHub - zeroflucs-given/charybdis: The charybdis package provides helpers for low-code integration…

Our Problem Domain — Selecting Scylla

For those who are already intimately familiar with Scylla, you can jump past this section — but for newcomers, buckle up!

We need a high-scale database. Every time a ball is passed, every time a price changes, our platform needs to respond quickly, executing hundreds of thousands of simulations of virtual sports matches and races to calculate the pricing we offer to customers. Having used a range of database technologies before, we know there’s typically a point past which performance falls off a cliff or scaling becomes impractical.

Time is very much money, and our line of work requires us to handle very noisy source feeds of price changes (up to 100,000 messages/minute). That volume is multiplied by the number of customers, and each message can in turn cause thousands of prices to change as a consequence.

The other challenge is that whilst we operate one homogeneous platform globally, each of our customers could be “homed” in a different geographic region, but still needs data available at local latency in multiple locations.

After much searching and prototyping, we homed in on ScyllaDB. Its brutally simple data modelling paradigms (no relations, joins all but forbidden, etc.) come with a level of performance that we’ve been unable to match readily elsewhere.

Developing using ScyllaDB with Go

Our platform is written primarily in Go, the open source language created by Google. High performance, low footprint and capable of great and evil things.

The Go developer experience today is really an extension of the Cassandra developer experience (Cassandra being the database that ScyllaDB is mostly wire-compatible with), and that’s “not great”. Finding a collection of tools, scripts and packages that let us focus on writing code, without wrangling keyspace definitions and while still exposing many of the advanced ScyllaDB features, proved quite vexing.

Our initial code for services bore out this pain: lots of boilerplate and repetition, varying only in the types and tables being used. It offended the eyes, and it also created plenty of opportunities for accidental errors whenever that code was copied around.

Enter Stage Left: Go Generics

With the advent of Go 1.18 came the generics support many Go developers have longed for. However, the gocql and gocqlx packages we use have a strong compatibility focus, meaning there was (and continues to be) no firm timeline for bringing generics into the fold.

For example, this is a cut-down summary of how to write a record of a given type, from the gocqlx examples:

p := Person{
	"John",
	"Smith",
	"foo@bar.com",
}
q := session.Query(personTable.Insert()).BindStruct(p)
if err := q.ExecRelease(); err != nil {
	t.Fatal(err)
}

Reading, similarly:

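// BindStruct reads the primary key values from p, so in real code
// set p's key fields before running the query.
var p Person
q := session.Query(personTable.Get()).BindStruct(p)
if err := q.GetRelease(&p); err != nil {
	t.Fatal(err)
}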

Thus, one of our main design goals was to reduce most operations to one-liners. Our solution was a generic “manager” type, instantiated once per table with your record type, that simplifies read, write and select operations.

Reading that record back is now a single call rather than a chain of them:

record, err := manager.GetByPartitionKey(ctx, "test-user-1")

Custom Queries, TTLs and Options

Another sore point for us was using TTLs, Lightweight Transactions or any of the other advanced features. The “single line” of code that was present before now requires you to build a custom query. There are no half measures: you’re essentially fighting with the “qb” (Query Builder) package to build a query through a chain of calls.

Our solution was to use more Go-idiomatic “WithOptionName()” parameters to mutate the query. For example, inserting a record with a TTL is very simple:

errUpsert := manager.Insert(ctx, &UserVisit{
   UserID:    "john-smith",
   FirstName: "John",
   Visits:    0,
}, tables.WithTTL(time.Minute))

This same approach is used for any query variation such as:

Record Expiry (TTL)
IF NOT EXISTS / IF EXISTS (Insert/Update vs Upsert semantics)
IF x = Y (Conditional updates)
Varying consistency levels up/down from the default level.

Dynamic Schema Management

We didn’t want everyone to be constantly worrying about ScyllaDB schema management, so part of the package is a DDL generator. This can reflect over your structures (or a static configuration object describing your tables), and provide all of the metadata for the helpers to work.

To facilitate simple schema evolution, the package will automatically create and maintain your tables if you ask it to. You enable this when creating your TableManager[T] instances:

manager, err := tables.NewTableManager[Record](ctx,
   tables.WithCluster(cluster),
   tables.WithLogger(log),
   tables.WithKeyspace("examples"),
   mapping.WithAutomaticTableSpecification[Record]("user_visits"),
   generator.WithSimpleKeyspaceManagement(log, cluster, 1), 
   generator.WithAutomaticTableManagement(log, cluster))

Those last 3 lines provide:

Creation of the relevant metadata by reflecting over a structure.
Creation of a keyspace (IF NOT EXISTS) with a basic replication model (you’d want to use a more advanced approach in the wild).
Simple automatic table creation and column extensions (ALTER TABLE … ADD).
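
As a rough illustration of the first and third points, the sketch below reflects over a struct and renders a CREATE TABLE IF NOT EXISTS statement. It is not how the package's generator is written: the `cql` struct tag, the CreateTableCQL function and the small type map are all assumptions made for this example, and it handles only a partition key and a handful of scalar types.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// cqlTypes maps a few Go kinds to CQL column types; a real generator
// would cover many more (timestamps, UUIDs, collections and so on).
var cqlTypes = map[reflect.Kind]string{
	reflect.String:  "text",
	reflect.Int:     "bigint",
	reflect.Int64:   "bigint",
	reflect.Int32:   "int",
	reflect.Bool:    "boolean",
	reflect.Float64: "double",
}

// CreateTableCQL reflects over the struct type T and renders a
// CREATE TABLE IF NOT EXISTS statement. The hypothetical `cql` tag
// has the form "column_name" or "column_name,pk" for partition keys.
func CreateTableCQL[T any](keyspace, table string) (string, error) {
	t := reflect.TypeOf((*T)(nil)).Elem()
	if t.Kind() != reflect.Struct {
		return "", fmt.Errorf("%s is not a struct", t)
	}

	var cols, keys []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag, ok := f.Tag.Lookup("cql")
		if !ok {
			continue // untagged fields are not persisted
		}
		name, flag, _ := strings.Cut(tag, ",")
		cqlType, known := cqlTypes[f.Type.Kind()]
		if !known {
			return "", fmt.Errorf("field %s: unsupported type %s", f.Name, f.Type)
		}
		cols = append(cols, name+" "+cqlType)
		if flag == "pk" {
			keys = append(keys, name)
		}
	}
	if len(keys) == 0 {
		return "", fmt.Errorf("%s has no partition key", t)
	}
	return fmt.Sprintf("CREATE TABLE IF NOT EXISTS %s.%s (%s, PRIMARY KEY ((%s)))",
		keyspace, table, strings.Join(cols, ", "), strings.Join(keys, ", ")), nil
}

// Record is an example structure describing the user_visits table.
type Record struct {
	UserID    string `cql:"user_id,pk"`
	FirstName string `cql:"first_name"`
	Visits    int    `cql:"visits"`
}

func main() {
	stmt, err := CreateTableCQL[Record]("examples", "user_visits")
	if err != nil {
		panic(err)
	}
	fmt.Println(stmt)
}
```

Column extension could follow the same approach: compare the reflected column list with what the cluster reports for the table, and issue an ALTER TABLE for each column that is missing.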

That’s It

Now you understand what Charybdis is and why we wrote it. The version on GitHub is a rework of an internal package we’d gradually refined over the last year, with some features still to be ported over. It’s also the start of two stories:

This package will take on a life of its own.
We’re also going to routinely give components back to the open-source community where it makes sense, to help others benefit from what we’ve learned over the last year.

Last but not least, for anyone wondering — we’ve used Dall-E to summon up the logo for the project.