Planet Scale SQL For The New Generation Of Applications With YugabyteDB

Hello, and welcome to the Data Engineering Podcast, the show about modern data management. When you're ready to build your next pipeline or want to test out the project you hear about on the show, you'll need somewhere to deploy it, so check out our friends over at Linode.

With 200 gigabit private networking, scalable shared block storage, and a 40 gigabit public network, you've got everything you need to run a fast, reliable, and bulletproof data platform.

If you need global distribution, they've got that covered too with worldwide data centers, including new ones in Toronto and Mumbai.

And for your machine learning workloads, they just announced dedicated CPU instances, and they've got GPU instances as well.

Go to data engineering podcast.com/linode,

that's l I n o d e, today to get a $20 credit and launch a new server in under a minute. And don't forget to thank them for their continued support of this show.

And you listen to this show to learn and stay up to date with what's happening in databases,

streaming platforms, big data, and everything else you need to know about modern data management.

For even more opportunities to meet, listen, and learn from your peers, you don't want to miss out on this year's conference season.

We have partnered with organizations such as O'Reilly Media, Corinium Global Intelligence,

ODSC,

and Data Council.

Upcoming events include the software architecture conference, the Strata Data Conference, and PyCon US.

Go to data engineering podcast dotcom/conferences

to learn more about these and other events and take advantage of our partner discounts to save money when you register today.

Your host is Tobias Macy. And today, I'm interviewing Kartik Ranganathan about YougabyteDB,

the open source high performance distributed SQL database for global Internet scale apps. So, Kartik, can you start by introducing yourself? Absolutely.

Thanks for having me on here, Tobias, first off. Hi, folks. I'm Kartik. I'm 1 of the cofounders and the CTO of, of the company, Yugabyte, which is the company behind Yugabyte DB, the open source data distributed SQL database. I've been an engineer

forever, like, you know, right from like, for many, many years and, longer than I care to remember, I guess. And, like, right before starting Gigabyte, I was at Nutanix, where I was working on distributed storage for about 3 years.

Before that, I was at Facebook for about 6 years working on distributed

databases.

The first database I worked on at Facebook was Apache Cassandra. I mean, obviously, when I started working on it, it wasn't called Apache Cassandra, and it wasn't open source. So it was the very early days. Subsequently

started working on,

Apache HBase. So me and all both my other cofounders, we're all HBase committers as well. And the team as a whole, we've have this unique experience of working on databases and running them in production and so on and so forth. Before Facebook, I was at Microsoft working on the wireless stack and so on. So yeah. And after that, it probably gets boring.

Yeah. It's interesting how many people I've spoken with who have gone on to create their own database companies, who have a background working with some of these different

open source databases that have become sort of venerable

and part of the general canon of of the overall database market. Yeah. That's right. I think,

it's it's an interesting,

phase that we're in for sure. Like, and especially, like, over the last 10 years, there's been an explosion of data, explosion of digital applications,

and, every person building a database and running it, like, definitely gets insight and especially in the context of larger companies that are often ahead in time of the of the common enterprise. Right? So there's a a lot of learning to be had from that. And, at Yugabyte, we bring a lot of that together. Right? So and that's so it's it's our unique path, but I'm sure, like you said, everybody has their own reason, and,

there's definitely a need. And before you began working on all these different databases, I'm curious how you first got involved in the area of data management.

That's,

well, to be perfectly honest, it's by accident. But, let me try to go for a slightly more sophisticated answer. I was originally working on data in the context of networking.

So it was distributed data, but not exactly data being stored. And I I remember joining Facebook in 2007. Right? Like, interesting anecdote that, like, when I was joining Facebook, there were 30, 000, 000 users on the site, give or take. And I remember thinking at that time, I mean, it's 30, 000, 000 users already. I mean, how much more is this thing gonna grow anyway? Maybe it'll double in size, but there's interesting data challenges. So let me go work on the data stuff at Facebook. Well, anecdotally, it became 2, 000, 000, 000 users plus, and that's a different story. So but at that type of scale, like when you go from 30, 000, 000 users to 2 or 3, 000, 000, 000 users, there's an enormous amount of pressure at every layer of the infrastructure.

And,

I was

starting out building

inbox search. So the the way I got involved in databases was through inbox search. The problem that was presented to me and a couple other folks, like, we were working as a team, was,

you've got all this inbox data in Facebook, which was not Facebook messages of today. It's the older,

incarnation of Facebook messages,

and people needed to be able to search it. Now the funny thing about searching your messages is that you do it very rarely. So it's read very rarely, but it's gets a lot of rights because every word in your message

to each of the recipients

is a is like a reverse index entry. Right? So so it is extremely write heavy, very rarely read. Nobody wanted to spend a lot of money on it. Nobody definitely wanted to sit around babysitting this thing because of its scale, and it had to be multi data center ready. Right? So taking all this together,

before I knew it, I guess I was involved in data and databases.

And, what we ended up building then

finally ended up getting open sourced as Apache Cassandra. That's pretty funny. That's I didn't realize that that particular background of the Cassandra project. I know it came out of Facebook, but I didn't know that it was for that specific sort of write heavy, read rarely use case. Yeah. Actually, there's a lot more had had to do with the actual use case. We actually decided to constrain the problem and

write

it, and write it, and we're done. Right? So eventual consistency in Cassandra was born from that aspect. The other funny thing was, I mean, I don't know how much of a a bragging right it is, but I I had the opportunity to name the project because everybody else was so busy building the project. So I figured, like, Cassandra was a good name because Delphi was taken, and the next most famous Oracle was Cassandra. Right? So we ended up picking that name, and who knew at that time it would actually get so popular?

It's always funny getting to get some context about the history of these projects that so many people have used and have become so widely,

adopted.

Yeah. Absolutely. And and back then, you don't realize it. Right? Like, when, like, we were putting this thing together or doing it. Like, I mean, open source wasn't that popular back then. Database is definitely not. There was no nothing called NoSQL back then. So it it was it was a lot of interesting twists and turns that the world went through, and it's it's been pretty, rapid. Right? The NoSQL is now such a state staple thing, but back then, it wasn't even a term. And the other funny part of all this is that SQL has come back because of the fact that we have figured out ways to actually make it scale beyond what was originally thought to be the natural limits of that particular approach to things.

And so that brings us to the work that you're doing with Yougabite. And I'm wondering

if you can talk a bit about what the platform

is as a product and some of the origin story of why you thought it was necessary and how it came to market.

Absolutely. Absolutely.

So after working on Cassandra, like I told you, like, I went on to work on HBase where, like, where I ended up building some of the core features in HBase.

And, having,

like, been a part of both communities, like, people would consistently ask 1 question back then, and it was a question that I personally and all the people around me would soon come to dread because back then it was not a feature. It's like, could you just give me that secondary index, please? Like, that's all I need. But, I mean, it was very difficult to explain to people that, yes, that's all you need, but making it consistent and correct and always work in every failure scenario is a lot harder than just adding a secondary index. Like so so that was 1 of the core learnings where we knew that, like, is it worth rebuilding the entire database for just a secondary index? Probably not. Right? So but fast forward a few years, what we've seen is that the world is developing

a number of applications, a new fundamental new breed of applications,

and they have some fundamental properties. Right? They need transactions,

multi node transactions.

They need low latency, so the ability to serve data with low

millisecond latency. So you can see infrastructure such as 5 g, the edge, so on and so forth become more and more popular, and all of these will impose low latency access. The other big paradigm is massive scale. So people want to be able to add nodes and handle a lot of reads and writes. Like, in fact, I would say that, like, what the big tech companies, the top 5 or 10 tech companies were doing in terms of operations per day, we're seeing, like, regular startups being able to hit that level. Like, in fact, we have a bunch of startups

all doing more than a 1000000000 operations per day on the Yugabyte database itself, right, which was considered a big thing. But people are able to approach that level of scale because cloud has made it accessible. The other thing that cloud has fundamentally changed is geographic distribution of data. Right? Whether it be for getting low latency access to users or whether it be for GDPR or whatever other purpose, people are increasingly thinking geographically distributed apps. It could also be things like Uber and Lyft having to do search pricing within 1 mile geofence or, like there's a number of different scenarios where this starts coming up. So if you put all this together, you need transactions, low latency, scale, and geographic distribution. Now you could give any API, but the other change that we've seen happen in the market with especially with the proliferation of so many NoSQL databases is that people realize they can get some apps done quickly with NoSQL. But when they really wanna build other features, NoSQL is often limiting in what it doesn't have as features. So the pendulum has swung the other way to people expecting, hey. Why don't you just give me all of SQL, and I will decide which ones I want and which ones I don't want, which ones are performant, therefore, I'll use most of the time, and which ones are not performant but I will use rarely. Right? So SQL is definitely still dominant and is on a resurgence, if anything. So put the 2 together and you see that what you really need is a

transactional database that is SQL that can be low latency, scalable, and geographically distributed. Right? And that this is the underpinning of Yugabyte. This is essentially why we are building Yugabyte. And to your point about a number of different types of applications

requiring some of this geodistribution

out of the box and at the start, it has led to a number of other databases hitting the market with that being 1 of their main selling points, most notably things like FaunaDB and CockroachDB.

And I know that a lot of the inspiration behind that is some of the work that came out of the Google Spanner paper. And I'm wondering

if you have any other thoughts in terms of

what it is about

the current state of the market in the industry and the actual application and development needs of these different users that is driving this trend of more different database products coming out with

geo distribution and,

highly scalable transactional

capabilities

as their main selling point? Yeah. Absolutely. Like,

first off, like, the way we view it is that databases like CockroachDB,

FaunaDB,

what have you that are embracing at the core geographic distribution,

transactions, scalability,

it actually validates the market. I don't think it would be too much fun if we're in a market of 1 player and that player is ourselves, and we're the leader and we're the last of the first in everything. So the existence of more such projects and companies actually validates both the need to build such applications as well as the the financial side of things, right, the the possibility to monetize. So so it is it is exciting for us for sure. Now as far as what is happening in the market as a whole, if you think about a cloud native application that starts out small right? I mean, there's different a variety of different patterns in which users,

approach app building. A lot of companies wanna start small because they're start ups, but they want to make sure they wanna have the insurance that they can grow when they need. Right? But even when they are small, they don't want to deal with failures by hand because, a, failures are always going to happen at the most awkward time, 3 AM, never 3 PM. I mean, you can take it from us being at Facebook. We always got paged at 2 or 3 AM. It was never 2 or 3 PM. Would have been a lot more convenient, but, anyways so first is And then as the dataset gets bigger, they wanna be able to add nodes and so the second will be scalability.

And then as the number of different types of access patterns keep proliferating,

invariably, other things like geographic distribution and more complex ways to access data starts coming up. Right? So so this is

the normal path for an app or a user and a company in the cloud. Right? So, therefore, it is not, and if you look at what databases are around to satisfy this I mean, if you take the the the older but well established databases like Oracle and SQL Server and even like Post and MySQL,

these are not built to handle that paradigm, that user journey. Right? So what is the database that people end up picking? I mean, let's forget all the new entrants you mentioned. Right? It's invariably going to be Postgres because that is the fastest growing database.

The 2nd fastest growing database is MongoDB. Right?

So with Postgres growing so fast and and MongoDB has a huge commercial company behind it. Postgres does not. It is completely a child of the open source. Right? The people just develop it, adopt it, use it, and it is widely popular because it is very extensible, very powerful, and feature rich. Now the reason we are in this game is because on on what we're trying to do by building a new open source project is is that Postgres is so popular, yet it doesn't satisfy the cloud user journey. Right? We wanna build a database that can offer every single feature that Postgres has to offer while satisfying this cloud journey, which is start small, make it easy to run-in production by giving you

So address your issues at 3 PM, not at 3 AM, and then grow when you need to and geographically distribute. So this is our journey. Right? And, obviously, each of these other projects you mentioned would have their own pieces and their own reason for existence and how they are going about their journey. And it's probably perfectly valid for them because the market is huge and everybody has their own mark to make. But our vision is, like, there's MySQL and Postgres, which are the open source RDBMSs. We wanna create a third 1, which is as fundamental, as open, as powerful, and ready for the cloud. And continuing on your point about Postgres

being the

growing market leader in terms of open source, and I've seen reports go either way about whether it's MySQL or Postgres that's in the lead. But Postgres definitely has a significant portion of mind share regardless of which of those is the front runner. I'm curious

what you've seen as far as the challenges

of being able to maintain

compatibility

with that as an API and as an interface. And I'm curious

to what degree of compatibility you're supporting that Postgres interface, whether it's just at the SQL layer and some of the specifics of how they implement some of their extensions to that query language, or if it's deeper in terms of being able to support some of the different plugins that are available in the broad Postgres ecosystem. Oh, yeah. This is a great question. It's very insightful 1. With Youcubyte, we so our architecture is unique in the sense we reuse the upper half for our upper half of the database, the upper half of Postgres code. So we are literally not we're not just wire compatible. We are code compatible with the upper half. Right? So what that means is that as a database and our lower half so Postgres writes to a single node's disk or disks. Right? So at the lower half, what we did was to remove Postgres from writing to the single node and made it a distributed document store that is built along the lines of Google Spanner. Right? So if you look at Aurora, what Amazon Aurora does is it gives you full posters compatibility, but it doesn't scale the queries. Right? It's still a single head node that deals with queries. What Google Spanner does is it scales the queries in the storage, but it is not Postgres compatible. It is its own flavor of SQL. What we've done is a hybrid approach where we are fully Postgres compatible, not just in the wired protocol, but also the features inside

and married it with Google's Spanner underneath for the scalability

aspects. Right? So what do we end up supporting then? We end up supporting like, we're probably the only horizontally scalable database that can support all the way up to stored procedures,

triggers,

partial indexes,

extensions. I think you asked about plug ins. We support Postgres extensions. Now extensions fall in 2 categories, extensions that touch the storage aspect and extensions that touch the language aspect. We obviously,

have to redo the extensions that touch the storage aspect because we've changed it completely. And sometimes

they don't make sense in a distributed world. Right? So we have to reinvent a new paradigm for that or even look at if those make sense in the first place. But for extensions that touch the language aspect, we are completely able to support those. I mean, as examples, like the PG crypto library is something that we already support. There's a heavy demand for post GIS, which is, like a geospatial

extension that we're working on supporting. There's a number of, PLPG SQL or language runtimes that people wanna write, including a JavaScript push language that you can write your stored procedures in. So that is something we're working on. So we have a very, very deep amount of, Postgres features that we're able to support. Another thing that we're going to work on and and release pretty soon is foreign data wrappers. So you can import foreign tables and it and but interface and query them through the Postgres API. So so I'd say we have a a very deep support for Postgres. We started out with Postgres,

10 dot 4 version and, quickly rebase to 11.2.

And, our path is that we will keep upgrading to, you know, whatever is the latest or near latest version of Postgres. Postgres just released 12, but, you know, we'd probably move up to it at some point, but that is the strategy.

The second side is we wanna put some of these changes back into the open source so that we can become an extension to Postgres as opposed to having to fork and, you know, maintain this whole thing. It's not fun for us also. Yeah. There have been a number of companies that have gone through that journey.

The 1 that comes to mind most readily is Pipeline DB, where they started as a fork and then ended up refactoring to be a plugin to post res for being able to do in memory aggregates of streaming data in post res.

And then in terms of the challenge that you're trying to overcome

in terms of the use case, the project that comes most notably to mind as trying to target that same

area is Citus DB, which was recently acquired by Microsoft. And so I'm curious what you would call out as being some of the notable differences

between Yugabyte and Citus

and some of the specific features of Yugabyte that would tip the balance in terms of somebody choosing your project for a project that they're trying to build?

Yeah. Completely. Yeah. So with Citus specifically, right, Citus is an extension to Postgres, and we are not. Right? So the the the first question to answer is what are we doing that make and and it's not because we didn't wanna be an extension. It's because we cannot be an extension. So the first question to ask, I guess, is what are we doing that prohibits us from being an extension, right, which will give you a clue into what are the unique features we support. Right? The first thing we couldn't do was the Postgres

system catalog, which is the the set of tables that track all the other tables in Postgres. So this is the repository of all

the databases,

schemas, tables, users, etcetera. That, in Citus' case, is still left after to Postgres, which is on a single node, whereas in Gigabyte, even that is distributed. Right? So we wanted a shared nothing,

no single point of failure type system. So anything fails, the data is still replicated. It automatically fails over and recovers. So that forms a fundamental difference. The second difference is the way to think about Postgres,

Citus. Citus reuses the storage layer of Postgres, and, replication between

shards

is essentially Postgres replication,

so which I believe is asynchronous. So if you lose a node, right, if you lost a node and the data on that node, you will lose a little bit of data.

So that violates

asset compliance. So there's always I mean, even though it's just a little bit of data, I mean, I'm telling you from practical experience. Right? Like, we had like, we've run big systems at places like Facebook. It's always a difficult emotional call to make if, like, for example, you lost something or there's a network partition and you have to do the failover. And you don't know if

it's bad enough to do the failover or if you just wait a little longer, it's gonna come back. And you don't have to go through this hassle and and fill up stuff and explain to people what happened and why and what the impact is even when you don't know. And so it is complicated. Right? So that's something that Yugabyte is built for. And, obviously,

it's easier said than done because it's very fundamental to the database. You have to touch replication. You have to touch storage all the way below. The second big difference is, Postgres,

Citus

uses each of the Postgres databases to store a shard of data, and you could have keys of your entire app distributed across these shards. You may perform a, transaction across these shards. Now that goes through a coordinator node which coordinates this transaction because it is a distributed transaction. It's across shards. The problem with that is the coordinator node becomes a choke point or a bottleneck in terms of your scalability.

And,

and so that often limits the number of transactions you can get from the entire system. It also kind of puts some restrictions on what type of, like, indexes you create or what type of unique indexes and so on and so on, what type of features you can exploit across shards.

Now with Youcubyte, the direction we've taken is that the entire cluster comprising of whatever number of nodes you have in the cluster, it'll access 1 logical unit. So that means you can just run as many transactions. And if you want more transactions, you just add more nodes, and it'll just scale. And if you have failures, they're seamlessly handled. And similarly, all your unique indexes are enforced across nodes. So these are fundamental differences. So to summarize the whole thing, if you want,

so absolutely no failure or downtime and no manual touchpoint, then you go by this better. If you want scalability,

like, just add nodes and get more scale, especially when doing transactions and, you know, unique constraints and so on and so forth, then you could buy this better. Right? And, obviously, we're a newer technology, so people have to I mean, I'll just play the other side of the balance too, like which is that, like, Citus reuses Postgres at the storage layer, so there's something to be said for the maturity. But that's something, you know, that it behooves us as you go by a project to grow up and show the world that we are as mature. And so that's that's where we've been focusing on things like JEPs and testing and correctness testing and, like, you know, getting into more and more use cases and and, you know, earning our stripes, so to speak. Yeah. And 1 of the questions that I had had was

whether Yugabyte requires any special considerations

in terms of data modeling given the nature of its distribution and the availability of geo replication.

I know that in some of the implementations

of these databases that

support horizontal scaling and cross region replication,

there is some caveat as to how you think about modeling your data, whereas small tables you would wanna actually have located on on each of the nodes, whereas

the data that you're sharding will actually span across nodes. And so it affects the way that you approach joining across tables and the way that you handle your query patterns. And so I'm wondering if you can talk to that and maybe some of the caveats that come out in Yougabyte as far as how you handle some of this geographical scale.

Absolutely. I think this is a a very, very astute point, and it's 1 of the hardest points that we also strive to educate,

users about in general. So, I mean, I just, like, give some

general notions because it's hard to go into all of the

details. But, fundamentally, if you start at the beginning, a write in Yugabyte has to be persisted on multiple nodes. Right? So it will attempt to I mean, let's say your replication factor is 3, you can survive 1 fault. Any write will attempt to write to 3 nodes and will wait for 2 nodes to commit the data before acknowledging the user as a success. Whereas in a traditional RDBMS, you write to a node. The node simply writes data to its disk and acknowledges the user. So we've already introduced a network hop, minimally 1 network hop, right, in,

while while

making the use while handling the user write, whereas an RDBMS simply writes to disk. Now disk is much, much faster than the network. The the latency of a network hop is, like, you know, in the order of a millisecond depending if the if the machines are close to each other and could be much more, like, could be as much as as 70 to a 100 milliseconds if they are, like, say, across the east to west coast or or even different continents. So the first thing is to really understand the placement of data and to go in with the realization that your latency of rights will go up. I mean, this may not have a bearing may or may not have a bearing on throughput because that depends on how fat your network pipe is, but your latency is definitely gonna go up. Right? So it starts there. Now on the read side, you can I mean and there's a number of building blocks that Yugabyte gives you in order to be able to make in order to make things efficient, such as moving all the leaders that can serve data into a single data center so that you can satisfy the read completely local to the data center? But you may have failures where the entire set of leaders fail over to a different data center at which point you may have an increased read latency. So the second thing to realize is that in an RDBMS,

because you built the replication and failover

very tightly, you may be able to control and you redirect the app. You may be able to control latencies better, but it's a lot more involved. Right? But in a database, you have to think carefully about where the failover will go and what will be the latency upon failover. So that's point number 2. Now point number 3, I think your point on reads, writes, table sizes, and so on, we are working on a feature called colocated tables where you can place all of the data of your small tables into a single shard or tablet and let a few of the very large tables expand out and split and live across multiple nodes. Now in this type of an in this type of a setup, if you did a join that only joined that read data from the small tables, it'll typically be pretty fast. But if you joined data from 1 of the small tables to or 2 across 2 large tables or at least involving 1 large table, you could be moving a lot of data across the network. So that would be the next consideration is to think about what your join does. Right? Now the last point that I wanted to put out is that there's scalability and there's scalability. Right? So there are workloads that need, say, a couple of terabytes of data and need to scale their queries. Right? And then there are workloads that need tens or hundreds of terabytes and that need to have insanely low latency reads. Now it is very important to reason through for any of these workloads

what fraction of data you will actually read and have to transport across

or or have to process,

in order to satisfy the query. Right? Because at some point, the the this spectrum starts going over into the OLAP side where a single query is just doing a lot of work, and you'll have to move data back and forth and seamlessly switch over to that side. So it is important to make sure you distinguish between these 2 sides and keep it in the OLTP bucket because

on a small database, it's okay to read most of the data, like on an RDBMS,

because you are inherently bound by how much it can scale. It can only scale up to a certain point. Whereas a distributed database gives you the promise of being able to add a lot of nodes,

and this also brings with it the unintentional

danger of reading a lot of data, and so your queries actually become less scalable with time as you accumulate more data. Yeah. And your point about there being different ways of defining scale is

something that I'm interested in digging a bit more into because people will throw out that term as a catchall when they might mean very different things about it, and that can lead to breakdowns and expectations

as to what people are going to be getting when they buy something that, quote, unquote, scales.

Because it might be that the system can scale vertically and take advantage of more CPU cores on a single box,

or it might scale horizontally in terms of being able to handle,

more read or write throughput because of the fact that you're splitting that across multiple network connections, or it might be that you're able to scale horizontally across for storage. And so I'm wondering what the primary focus is in terms of the scalability of Yugabyte DB along these various axes.

Yeah.

So the simplest 1 to explain is if you need

fault tolerance. I know this is not scalability, but it still requires you to spread your data across nodes. So the simplest 1 is fault tolerance. Whether you have a little bit of data or a lot of data, the notion that a failure of a node should not impact

the correctness or anything with your data, and your application should continue to function as it is. Right? So that's, like, the bottom end of the the spectrum. Right? So that's where it starts. Now from there, you can take this small workload and you can geographically distribute it across multiple regions, and now you want to be able to run-in that mode. Right? So you have

a notion of scalability when you consider RDBMSs versus the setup. I mean, you could call it scale. You could it may arguably may not be scaled. Now from there, let's move forward. Now you have more and more queries coming in, which are only read queries. Right? In the RDBMS world, you would have used read replicas to scale this out. However,

your read replicas are obviously going to serve stale data. Right? In a distributed database like Yugabyte, you can which implicitly can shard your data. The shards live on various nodes and each node that the shard lives on, like the shard leader lives on, can serve its own reads. So you get consistent reads from a larger number of cores whereas this is something you could not have achieved with your RDBMS. So that's the first scale vector. The second scale vector is when you get a lot of writes.

Now let's assume that even though you have a lot of writes, your disk is more than the disk you have on a single node is more than capable of storing that data because most of these rights are updates. Let's just assume that for a second. So you're not bottlenecked on the amount of data you store. You are bottlenecked on the the CPU, the how much how many cores you have that can possibly handle this deluge of updates coming in. So in this case, you want to split your data again automatically sharded across multiple nodes and each node handle a a portion of the updates. Right? So, again, Gigabyte is a great candidate for this case. Now let's take it to the 3rd case where you actually have a lot of inserts coming in and you have data growing in volume. Now you could put bigger disks, but at some point, you're gonna be you're gonna a user is gonna run out of the use cases, gonna run out of the number of cores the database needs in order to handle that dataset size. So at this point, you want to take the data and put it on other nodes, and you need to leverage more aggregate cores in order to be able to sustain

that data size set size. Right? At this point, also, you go by is a great option in order to

serve data. Now rewinding, what are cases when there is no perceived scale and when gigabyte is arguably not a great fit? Right? So let's take cases when you don't have too many updates. Right? And your dataset size is not expected to grow very big at any point of time and your number of queries that you're handling is not expected to grow too large. And you have a fixed dataset size. And while the dataset size may be big, the amount of working data fits in memory. So you have maybe even a terabyte of data, but queries are always coming for a small subset of data.

And you don't care about a 100%

asset, you're okay with, you know, an asynchronous replica that gets promoted, you're okay with this type of a setup, then a more mature technology probably fits the bill at this point of time. Right? I mean, obviously, as 1 of the believers and 1 of the builders of the data of the gigabyte DB project, I'd like to say gigabyte solves everything, but, you know, it'll be probably true at some point, but there are other technologies that solve that today. And going back to the storage layer as well, 1 of the other interesting points is that while we focus most of this conversation

on the Postgres compatibility,

you also have another query interface that is at least based upon the Cassandra query language and supports a different way of modeling the data. So I'm wondering if you can talk about some of the way that you've actually implemented the storage layer itself and the way that you're able to handle these 2 different methods of

storing and representing and querying the data and some of the challenge that arises

in terms of having the split and the types of access?

Absolutely. Yes. So we are a multi

API database. Our query layer is pluggable,

which means we can continue to add more and more access patterns in the future to help users build a richer variety of apps. So that's that was really the vision even from day 1. We picked Cassandra specifically because the Cassandra language also uses a very SQL like dialect. It's it also has tables. It has columns. It has insert and select queries and so on and so forth. So we use that as a building block, and it has a rich ecosystem. It is good for a certain type of use cases, like, which are massive scale, massive amounts of data reads and writes and ultra low latency, which clearly complement the SQL very relational use case. The thing that we changed from Apache Cassandra is that unlike Apache Cassandra, yCQL,

the Yokabyte cloud query language, is, completely

asset compliant. Right? So we think of YCQL as a semi relational use case. And we talked about some of the dangers of scale out at massive scale where if you issued a bad query or poor query, like, you could really ruin not only your own life but everybody's life in the cluster because all the nodes are performing a lot of work, and it'd be too late to it it'll take a while for the whole thing to settle down, and that could cause unintended

consequences. And it's okay at a couple of terabytes. It's really bad at 10 or a 100 terabytes. So the YCQL API restricts you from make doing any of those queries by not even supporting them.

So

YCQL only supports the subset of queries in SQL that hit a finite number of nodes unrelated to the total number of nodes in the cluster. So there are no, like,

scatter gather type operations that do joins across all tables. And so it's really built for scale and and performance.

Alright? So that's on the on the YCQL side. Now where do we see these 2 fit in? If you look for workloads that are 10 to 100 terabytes or more, right, and they need very low latency access directly as the serving tier. You have,

use cases such as time to live that you have to implement automatic data expiry with the features called time to live. YCQL

perfectly fits the bill. It also supports compound data types such as lists maps and sets and so on inside, like, a single column. On the other end, y c q Y SQL, the Postgres compatible API, does

foreign keys,

constraints,

triggers, like, the the whole 9 yards, right, on on the completely relational side. Now you asked about how we designed the layer below. Right? Like, it was actually,

an interesting challenge for us. Like, it is document oriented, like, but way below. And what we figured was a document database is actually the most amenable to supporting a wide area of access patterns as long as we can keep enhancing the storage layer, by the way, is called Doc DB, so I'll just use that term for now. So what we realized in the Doc DB layer is that there's a number of access patterns that we have to

optimize, and we have to leverage these access patterns in the corresponding query layers above. Right? The advantage of a common docdb layer below each is that the advantages of 1 start flowing into the other. For example, on the YCQL API, we have the ability to store a lot of data for node. Like, 1 of our users actually tried loading 20 terabytes of data compressed per node. And then at that density level, tried to do, you know, hundreds of thousands of operations per second and,

had tried to kill a node, add a node, expand the cluster, so on. Right? All of that seamlessly flows into the Y SQL side. Right? And Y SQL side has, for example, features such as secondary indexes

and constraints, which we added to the YCQL side. So developers

coming in with a Cassandra knowledge and wanting to build those type of apps can actually use secondary indexes,

unique constraints,

transactions,

a document data type, a JSONB data type, and so on. And the Y SQL folks, the Postgres folks wanting to do scale can actually leverage a Cassandra like scale. So it really marries the 2 at the layer below. Now

what is another unique advantage that's often overlooked is the fact that we

internally distinguish between a single Rokey access pattern and a distributed access pattern. So what this means to the end user is that, like, if you went to a Google Cloud, you would put your most critical transactional workloads on Google Spanner. But Google Spanner uses atomic clocks. It's very expensive and has a lot

of limitations.

So you wouldn't put, like, use cases which have a ton of data in Spanner. You'd probably move it to something like a Bigtable. Right? So Yugabyte brings both into the same database as just 2 different table types. So that's that's really another huge advantage that the end user gets. Now as far as,

as far as the challenges, I think that's that's actually an interesting question. I think the challenge always comes down to is twofold. Right? Like, first part is the addition of so many features into something that's core at the lower layer should not destabilize whatever exists. Right? So that means and especially in something as fundamental as database, it's almost like a breach of trust if we build a feature that breaks some something else and loses data. Right? So so that means that the onus on testing is incredibly high. We have a super massive elaborate

pipeline to test our product for for every single feature matrix. Like, we, in fact, go the distance of having a CICD pipeline, which we're very proud of, that bids for spot instances the minute somebody uploads a diff, a code,

diff

for code review. So the minute they upload their changes, we automatically bid for spot instances

and run spark based parallel test like thousands and thousands of tests in parallel. And before the review is done or the even the reviewer gets to it, sometimes

the results of what happened by running all this wide array of tests are out. We had to invest in doing, thread based sanitizer, address sanitizer. We had to invest in, CLANG and Mac and Linux and all sorts of different environments to build in a Kubernetes Docker, so on and so forth. We have to do, we do Jepsen based testing. We do, deterministic failures, nondeterministic. So we have a like, it's a it's a very, very elaborate pipeline. So that's a big

onus. I mean but we we still we we some of us actually enjoy working on that stuff, believe it or not. So

so, so that it works out as a team. So that's 1 part. The second part is people often ask us what we're going to do for compatibility

with, like, for example, Apache Cassandra or with Postgres. Right? So the way we think about it is slightly different.

We will

do the compatibility

slowly. Like, that's not a concern for us. What is more important is enabling users to be able to build the type of applications they want to in the here and now instead of chasing

versions. So we're not going after lift and shift of an application. We're going after lift and shift of an application developer.

So a user that's familiar with Apache Cassandra

but really wants secondary indexes. I just wish I had JSON. I just wish I could do a couple of transactions

database that is not a new paradigm to them. Right? So similarly, the Postgres folks, like, all of this is great, but I just wish I had the scale or I had the So those are the things that we're going after. So, yeah, yeah, I think, I don't know if that gives a fair idea. No. That's definitely useful. And

to your point about the testing

and the CICD pipeline,

it definitely sounds quite impressive, and it also sounds quite expensive.

It is. That's why, like, the bidding for this had we're like, we had to a specific project to do spot instance bidding. So the funny thing is this type of I mean, like and and you're right. The expenses add up very quickly, and we have to keep prioritizing how to keep the price down. So every time somebody puts up a diff, actually, there is something that goes and finds out what the bidding rate is and then uses that bidding rate to

spin up instances

in the cloud by bidding at that price. So the and the price is often far lower than what you would if you played at a 247 rate. And, we do this parallelized testing and then shut down the instances automatically. And the other part is this entire infrastructure is cloud neutral, so we can run it on any cloud we want depending on where the cost goes. So so, yes, it is expensive, but we have

invested a lot to keep the price down. And 1 of the other core elements to discuss

in any database project is

the issues that come about from the operational characteristics of it. And from the conversation so far, it definitely sounds like you're paying a strong focus on that aspect of the project, but wondering if you can just talk through a bit of the

considerations that somebody who's interested in using and deploying Yugabyte should be thinking about, and some of the steps that are involved in actually getting it deployed in a production capacity

and maybe going from a small scale proof of concept use case on a single node and then scaling that out to multiple instances or multiple data centers? Yeah. Absolutely. So,

so we support, like, I mean,

most of the popular ways of deploying. So you go by the runs on and and we've taken special care to have make it have no external dependencies. So it runs on bare metal, VMs, and containers, Kubernetes to to the works. As far as where you can deploy, obviously, you can deploy it in any managed Kubernetes,

like, whether you're managing it yourself or a cloud provider is managing it. You can deploy it on all the public clouds, and we have a number of integrations with, for example, CloudFormation and TerraForm and all of the various ways. So that's just the raw act of deploying the the database. Right? Now how you wanna deploy it, like, you could do the our most

commonly deployed multi node paradigm is a multi zone deployment in a single region. So that's by far the most common. We're increasingly starting to see a lot of multi region

hybrid hybrid meaning across clouds or on premise and on a on a public cloud, these type of deployments.

So that's as far as the range of of deployments go. Right? So then the the next piece is,

we have a whole platform. So there's even in the multi data center deployments,

you could do it with 3 data centers where you you're your 0 failure, you don't have to touch the thing. You can survive an entire data center failure, zone failure, region failure, or you could do 2 data center deployments with asynchronous replication 1 way or bidirectional, right, which is a multi master. So so we've seen all of this, and and we have a third thing, which is a read replica. We actually have a user that's deployed a gigabyte cluster, 11 way replicated across various geographies of the world so that they can get ultra low latency access with a reasonably high number of operations per second. Right? So so we're seeing all of this,

come to just

come to fruition on the deployment aspect. Right? Now as far as rolling it out and running it in production and so on, 1 of the core value props we give is, high availability,

which means that if a node dies, you don't have to worry. At some point, come in and replace the node. Right? So so that that makes things

much easier already.

We also support, like, a variety of other things that you'd expect, like, you know, encryption at rest, encryption on the wire, the security side of things, authentication, role based access, all of that in Chilera so that the data is secure. Right? So so that you can do secure

like, day 1 secure deployments rather than kinda do it as an afterthought or, like, or so on and so forth. And then

further, we also support exporting observability

by exporting metrics through Prometheus. So we we have Prometheus ready metrics that can be scraped and put in, and so then you can set your own alerts and and so on and so forth on top of that and then monitor and observe what's going on with the database and get alerted. Right? Now, finally, we're a database that

is built for 0 downtime, so node replacement, and there's a number of our our users that do, like, AMI rehydration. Right? They wanna replace the entire node with a different OS with patches, for example. And similarly, rolling software upgrade so that you will be able to go 1 node after the other and upgrade your software while the database is running and the app gets no impact. Right? So there's also things like alter table and alter schema where you wanna add a column, drop a column, or do some other stuff. All of those are online as well where it just rolls through internally 1 node after the other. So we talked about a lot of these operational

aspects that it supports.

But to take this 1 notch higher and and all of this is in the open source, whatever I talk. Right? So now comes the commercial aspect. The the product that we have that's commercial is called the Yugabyte platform, and it is software that instantly converts

your

cloud account or your set of on premise machines or something into

a DBaaS. So effectively, it strings everything we talked about into software

where you can just, like, turnkey,

say, hey. I want to deploy it on these machines or or you go figure out the machines and go spin them up yourself.

You go figure out the security groups groups on AWS and make sure that it's restricted the right way and the right nodes have access to each other. I want a multiregion deployment with x nodes in this region, y nodes in that region, and the thing is gonna get the whole thing done for you. With a click of a button, you get, like, for example, encryption at rest with, integration

into a key management service.

You'll get alerting. You'll get, like, the ability to do software upgrades in a rolling fashion. So all of that is automated for you. So it's completely turnkey for folks that wanna run it. And so this has been very, very popular with some of our paying users that have graduated into wanting to manage this solution at scale or for their business critical applications. And this is a YouGovite platform. So it's it's all the stuff we talked about but bundled into a turnkey fashion and with an easy to use, you know, REST API UI, so on and so forth. And what are the other operational

aspects of running 1 of these types of platforms is the consideration

of

good enough, but it doesn't actually solve the problem where you introduce an error and you need to be able to restore from a certain point of time. And I'm curious how you approach that, particularly given the fact that you're able to scale to these large number of nodes and large volumes of data. Yeah. No. Absolutely. I think you've raised a good point. I should have covered it in the first place, but thanks for raising that. Backups is absolutely essential for the the application corruption that you mentioned, but also from the perspective of at the end of the day, we're a newer database and people want the peace of mind that you have the data backed up and you you can bring it into a different cluster or you can export it to a different system.

Because we're scalable, we do a distributed backup.

So the way this works is, I mean, I'm gonna explain it in very simple terms. Like, there is a lot of files in a lot of nodes.

We pretty much keep around a copy of the file without disturbing it as a backup cut. It's called a snapshot in cluster snapshot.

And then we take all of these frozen set of files across a variety of different nodes and then copy them into a target. And when you need to restore it, you can just get these files back appropriately split with the appropriate replicas on the different nodes in order to recreate the cluster. I mean, you don't even need the number of nodes in the source cluster to be the same as the destination. You can back up to, like, say, an s 3 and then,

restore to, like, a GCP cluster. You could do all of these kind of things. With the platform addition, which is the commercial side we talked about, you can even do nightly backup. So you can just say, I wanna backup on some frequency. You can set, like, a cron schedule, and the thing is gonna keep backing up for you. And you can do a 1 click restore which says, like, hey. Go to this s 3 bucket which holds a backup, and you just restore it for me into this cluster, and it will just do the whole thing for you. And for a lot of the open source infrastructure

components that are backed by a business,

1 of the common patterns for managing the business model is withholding some of these different types of

enterprise features such as backups and change data capture as a means of driving revenue to that commercial offering.

And you mentioned before that everything that we've been discussing so far aside from that hosted platform is available in the open source release, and it looks like that's been since version 1 dot 13. Wondering if you can talk through your reasoning and motivation for including all of those, what what might be considered advanced

features into that open source project? Yeah. That's a great question. So it's a it's yeah. It's version 1 dot 3. I think it was, like, towards the first half of some the first half of last year.

The reason for doing so, like, primarily is that, like, our ambition as a project is to become as fundamental

as a MySQL or Postgres. Right? So, like, we want to become another very fundamental piece of infrastructure for the Internet for all apps being built the cloud. So we wanna become the default database for the cloud. Right? So and as developers of database and as users of database, we ourselves have personally felt this pain a lot where, like, a couple of features that you really need are are held back. And it might be a weekend project, but you can't choose that database anymore. And if it's a really critical project, you probably will end up, like, paying for support anyway because you want the peace of mind. So what we decided was that it's better to have long term greed, not short term greed. So we we do want to become big. We do want to become popular, but not at the expense of developers really understanding and using the project is the first point. Well, the second point is when we communicated this to our our community of users, they pretty much were quick to point out, hey. Postgres and and I guess MySQL don't really hold back on features like backups or security or encryption or or what have you. And yet you say you wanna become like them, but you're not really, like, you know, doing the same thing. So at that point, we kinda decided, yeah, this makes sense. And when we looked at our our paying customers, our enterprise customers, they were mostly paying for the convenience. And, like, it's because in the cloud, everybody is busy building so many apps and without knowing which ones will succeed. And those that succeed just take off like a rocket and need massive scale, that the

the manageability of the whole thing and somebody the the the way to take care of all of these deployments without having to have people

babysit each 1 of them is like a much bigger value to them than the actual enterprise features that were held back. And secondly,

with the world warming up to cloud, with the world warming up to like like, our our thesis is, like, if you look at very popular

database companies and database prod like, products like Amazon Aurora or MongoDB, what you find is that they are all open source at the core. Like,

Amazon Aurora is really a managed service built on top of Postgres and MySQL, which are fully open.

MongoDB

reached where it did, like, with Atlas and the managed service on top of an open source core database, which is the Mongo database. Right? I mean, obviously, Mongo has gone the other way and shut the doors to the community. I'm guessing they think they don't need that anymore. But to us, it's a long game. To us,

the power of how deep you can get embedded in in people trying to build apps on you is is actually the most important and rewarding thing. And we feel that once we get there, there'll be a lot of opportunity to monetize. Right? So And in focusing

your

efforts on the long game, it seems to go against some of the accrued tribal knowledge of how best to run an open source company, at least as far as what's been put forth as best practice within maybe the past 5 to 10 years. And I'm wondering what you have seen as some of the feedback

either from your community or from some of the other companies that you've interacted with as far as how that decision has played. And I guess I'm most curious about cases where you've had people trying to convince you that you made the wrong move and that you should

go back to trying to withhold some of these features as a means of trying to drive revenue?

Actually, funnily, we've had the opposite. Like, we've had our enterprise customers tell us and paying ones at that, like those that were paying, tell us that this is a great move because they could end up paying us anyway for the convenience of the platform and for support because we are going after mission critical workloads. So,

I guess, in some sense,

it is true that you would need to withhold stuff if you are not running in the most mission critical, like if you are an add on to or

a plus 1 type of infrastructure. But us being a core type of infrastructure, people want the transparency, love the transparency, and in fact, use us more because of the transparency, if anything. Right? So, the community of users,

our enterprises, they're all, like, completely behind this thing. In fact, ever since we made the change, our community has grown like crazy. I'd say almost a 10 x, in in just, like, in less than a year. Right? And, we've seen enterprises also get pulled with strong interest and come and tell us, like, hey. This this this is the right move. We want an open database. Right? And and that's because specifically in the area of databases,

a lot of people are wary of, like, you know, players like Oracle where it's it's very close. They don't know what's going on. They don't know how it'll whether it'll work in the cloud or what the exact value is or how the feature works or so on and so forth, but something that will live for a longer time that has a community backing, that has wider testing by virtue of the wider community using it, the transparency, and the fact that it's mission critical, they will come and pay. So I think that's the feedback we've gotten

across the board. Like, I don't know if they're I think before we took the decision, there was a lot of second guessing and people trying to convince us. But having seen what happened since we took the decision, we're, like, not too many people have have said anything to us at all. And in terms of your plans going forward, I'm wondering what you have in store for the future of both the technical and business aspects of incubate. Absolutely. So let's do the business aspect because at at our stage, we're more focused on the technical, so the business is relatively quicker. On the business side, we

are we just announced the beta of our cloud, the Yugabyte cloud, because a lot of the smaller companies, the small and medium sized businesses, the fast growing companies, they really don't want to deal with even the platform on their side. They're like, you just take care of the whole thing, and we'll just, like, deploy the database. Give us an endpoint. We'll use it. You scale. You manage. You upgrade. You do everything. Right? So we're seeing that as, like, 1 of the drivers currently. That's like a like a big thing that's happening. On the second side, there's a number of, feature asks and number of asks of, like, going into a number of different clouds, different ways of deployment to, making all of those easier. So some of that work is also in progress. Right? So, like,

overall, that's the the set on the the business side. On the

technical side, like, we our aim our vision is to become the default database for the cloud. Right? Like, any cloud application you're building, pick a database that is ready for the cloud. So we're seeing a natural affinity from a number of related projects, and we wanna position ourselves as 1 of the best databases for these projects. So there's, for example,

ORMs

that are traditionally or so far have worked on a single node, but we have the opportunity

to

change the way these ORMs and and even JDBC works fundamentally so that it can it is aware of a cluster of nodes of topology awareness for multidata center and so on and so forth. So we're working with, projects like Spring and Go and so on and so forth in order to bring this to fruition.

Then there's the GraphQL community.

A lot of interest from the GraphQL community because it's a modern paradigm for building applications.

And,

GraphQL itself is high performance and it's stateless and scalable, but you need a database that's stateless, scalable, and as well. So there's a lot of resonance on that side. The PostgreSQL community is also pretty interested because it is a it is everything Postgres, but with it's great for if you need and the cloud and scalability and so on. We're also a great fit for Kubernetes because we are a multi cloud and hybrid deployment ready database where we don't have any external dependencies.

We we have built features so that we work natively in Kubernetes. So with Kubernetes taking off on the promise of multi cloud and hybrid cloud, there's also pulling us along with it, which is which is great. And finally, when you're building modern microservices,

there's a lot of messaging systems, so like, Kafka, like, for communicating between microservices.

And,

a a huge ask for of people is, like, can you give us a change data stream? We've talked about the BCM and other things. So can you give us a stream of data that has changed in the database so that we can communicate between microservices

and know what changed. Right? Subscribe to these changes. So that's another area. So these are just a few areas. There's a number of other areas where there's interest, but you can look to us,

making good intros, good features, good integrations into each of these ecosystems. So we we'd like to have a simple message around why Ucobite is a great database for

YouGoviteDB

or your position in the overall landscape

of data management or any of your other aspects of your business or your work on the platform

different sub elements that we could probably spend a whole other episode talking about in great detail,

but I think we've done a good job of the overview. So for anybody who does want to follow along with the work that you're doing or get in touch, I'll have you add your preferred contact information to the show notes. And as a final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.

I think the the the 1 big gap that at least I see, and it's not directly in data management but is related, is the is networking in Kubernetes. Like, that comes to mind, by the way,

specifically when you're trying to run stateful workloads in Kubernetes. Like, first off, there's a lot of discussion on whether you should run stateful workloads in Kubernetes or not, but the answer is really irrelevant because everybody's doing it anyway. So the answer is yes. Now given that the answer is yes and an increasing number of people are doing it, the Kubernetes ecosystem is scrambling to mature,

how how to run stateful inside Kubernetes.

However,

Kubernetes is really strong at multi cloud, but the networking prevents multi cloud deployments.

So so 1 consistent ask that we get at gigabyte or we've seen a number of this ask come up is how do you stitch multiple Kubernetes clusters running possibly in completely different regions or even clouds

together using Yugabyte. Right? And we've actually had a bunch of these deployments where you have 3 Kubernetes clusters, and Yugabyte spans all 3 of them and keeps 1 copy of the data, 1 replica of the data in each of these clusters. Now the most annoying thing about this by far is that each 1 is like a craftsman solution. There's really like, you gotta figure out which cloud, what is the target source cloud, what are the different clouds, how does the networking work, how do you figure out, how do you route. So that's 1 part that I think is a gap that could keep getting better over time. So that's that's 1 piece.

The second piece is there is a lot of,

lot of interest in,

in, like, serverless as a technology.

So it remains to be seen how serverless and,

databases, like open source databases specifically, will end up playing. There's a number of serverless open source technologies. There's a number of databases like Yougovyte that are great and that'll work well inside containers.

How the the tool will work together and whether it can go down to a 0 cost because of the slow start problem, like or will it only be for

scaling as the workload increases? I think those are things that that remain to be seen. So that's another open problem that that I would see. Yeah. I think, I can't think of much else. I think these are 22 big areas. I mean, these are areas we're thinking about too. So Yeah. Those are definitely 2 pretty substantial problems to try and solve. So I think that's plenty.

Well, thank you very much for taking the time today to join me and discuss the work that you've been doing with Youcubite. It's definitely a very interesting platform, and the more that I've learned in our conversation today, the more I wanna look into it further. So thank you for all of your efforts on that front, and I hope you enjoy the rest of your day. Thank you for, for having me online. Really enjoyed it. Great set of questions, great discussion, and have a good day yourself.

Listening. Don't forget to check out our other show, podcast.init@pythonpodcastdot

com to learn about the Python language, its community, and the innovative ways it is being used.

And visit the site at data engineering podcast cast.com to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it. Email hosts at data engineering podcast.com

with your story. And to help other people find the show, please leave a review on Itunes and tell your friends and coworkers.