Summary
Businesses often need to ingest data from their customers in order to power the services that they provide. Each new source that they integrate with brings another custom set of ETL tasks to maintain. In order to reduce the friction involved in supporting new data transformations, David Molot and Hassan Syyid built the Hotglue platform. In this episode they describe the data integration challenges facing many B2B companies, how their work on the Hotglue platform simplifies those efforts, and how they have designed the platform to make these ETL workloads embeddable and self-service for end users.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Modern Data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days. Datafold helps Data teams gain visibility and confidence in the quality of their analytical data through data profiling, column-level lineage and intelligent anomaly detection. Datafold also helps automate regression testing of ETL code with its Data Diff feature that instantly shows how a change in ETL or BI code affects the produced data, both on a statistical level and down to individual rows and values. Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows. Go to dataengineeringpodcast.com/datafold today to start a 30-day trial of Datafold. Once you sign up and create an alert in Datafold for your company data, they will send you a cool water flask.
- This episode of Data Engineering Podcast is sponsored by Datadog, a unified monitoring and analytics platform built for developers, IT operations teams, and businesses in the cloud age. Datadog provides customizable dashboards, log management, and machine-learning-based alerts in one fully-integrated platform so you can seamlessly navigate, pinpoint, and resolve performance issues in context. Monitor all your databases, cloud services, containers, and serverless functions in one place with Datadog’s 400+ vendor-backed integrations. If an outage occurs, Datadog provides seamless navigation between your logs, infrastructure metrics, and application traces in just a few clicks to minimize downtime. Try it yourself today by starting a free 14-day trial and receive a Datadog t-shirt after installing the agent. Go to dataengineeringpodcast.com/datadog today to see how you can enhance visibility into your stack with Datadog.
- Your host is Tobias Macey and today I’m interviewing David Molot and Hassan Syyid about Hotglue, an embeddable data integration tool for B2B developers built on the Python ecosystem.
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you start by describing what you are building at Hotglue?
- What was your motivation for starting a business to address this particular problem?
- Who is the target user of Hotglue and what are their biggest data problems?
- What are the types and sources of data that they are likely to be working with?
- How are they currently handling solutions for those problems?
- How does the introduction of Hotglue simplify or improve their work?
- What is involved in getting Hotglue integrated into a given customer’s environment?
- How is Hotglue itself implemented?
- How has the design or goals of the platform evolved since you first began building it?
- What were some of the initial assumptions that you had at the outset and how well have they held up as you progressed?
- Once a customer has set up Hotglue what is their workflow for building and executing an ETL workflow?
- What are their options for working with sources that aren’t supported out of the box?
- What are the biggest design and implementation challenges that you are facing given the need for your product to be embedded in customer platforms and exposed to their end users?
- What are some of the most interesting, innovative, or unexpected ways that you have seen Hotglue used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while building Hotglue?
- When is Hotglue the wrong choice?
- What do you have planned for the future of the product?
Contact Info
- David
- @davidmolot on Twitter
- Hassan
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
- Thank you for listening! Don’t forget to check out our other show, Podcast.__init__ to learn about the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers
- Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat
Links
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. When you're ready to build your next pipeline and want to test out the projects you hear about on the show, you'll need somewhere to deploy it. So check out our friends over at Linode. With their managed Kubernetes platform, it's now even easier to deploy and scale your workflows or try out the latest Helm charts from tools like Pulsar, Pachyderm, and Dagster. With simple pricing, fast networking, object storage, and worldwide data centers, you've got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today. That's L-I-N-O-D-E, and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show. This episode of the Data Engineering Podcast is sponsored by Datadog, a unified monitoring and analytics platform built for developers, IT operations teams, and businesses in the cloud age.
Datadog provides customizable dashboards, log management, and machine-learning-based alerts in one fully integrated platform so you can seamlessly navigate, pinpoint, and resolve performance issues in context. Monitor all of your databases, cloud services, containers, and serverless functions in one place with Datadog's 400+ vendor-backed integrations. If an outage occurs, Datadog provides seamless navigation between your logs, infrastructure metrics, and application traces in just a few clicks to minimize downtime. Try it yourself today by starting a free 14-day trial and receive a Datadog t-shirt after installing the agent.
Go to dataengineeringpodcast.com/datadog today to see how you can enhance visibility into your stack with Datadog. Your host is Tobias Macey. And today, I'm interviewing David Molot and Hassan Syyid about Hotglue, an embeddable data integration tool for B2B developers built on the Python ecosystem. So David, can you start by introducing yourself?
[00:02:04] Unknown:
I'm David, and I'm someone who's been working on business development and entrepreneurship kind of things for the last 5 years, I would say.
[00:02:11] Unknown:
Yep. I'm Hassan. I'm the technical cofounder at Hotglue, and I've been doing professional software development for about 5 years.
[00:02:21] Unknown:
And going back to you, David, do you remember how you first got involved in the area of data management?
[00:02:25] Unknown:
I really wasn't until I met Hassan, and he kinda brought me on to Hotglue. So it's really his project from the start. It was kind of his side project. And when he was looking to kinda productize Hotglue, he brought me into the data management kind of industry, so he's really, I would say, the person that has more knowledge in this space.
[00:02:44] Unknown:
So Hassan, how did you get involved in data management?
[00:02:47] Unknown:
Actually, the way I got involved is I was working with a startup, and they had problems with their data management infrastructure. They were actually trying to onboard new customers and their data integration pipeline had a lot of holes in it that caused traction issues, like they were having trouble getting their first, like, 5, 10 customers because of a failure in their data integration pipeline. So that's kind of how the idea of Hotglue came.
[00:03:12] Unknown:
You mentioned, David, that you joined up with Hassan after the project had already been conceived to help him productize it. I'm wondering if you guys can maybe talk a bit about how you came to meet. Is this the first time you've worked together, or do you have previous experience building things?
[00:03:29] Unknown:
Yeah. So this is the first time we've worked together on something. We met in college. Hassan is actually my former roommate's best friend, which is how we kind of got to know each other. So we hadn't worked on any projects before, but we definitely knew each other and, you know, had mutual friends and things like that. So that's how we met. And then for me, I've worked kind of in, like, the social media marketing space before. In high school, I was part of an Instagram account that had a few hundred thousand followers. So that was something I've done before. And then I was also super involved in a couple other entrepreneurial efforts that were just surrounding other, like, social media marketing things.
[00:04:03] Unknown:
And so digging more into the product and what you're building, can you give a bit of an overview about what Hotglue is and some of the motivation for starting a business to address the particular problem that you're building towards?
[00:04:16] Unknown:
Hotglue allows any developer to create an end-to-end pipeline in a matter of minutes, really, and allows them to utilize the full functionality of the Python ecosystem while also not having to get too down in the weeds. So we are built for developers, and it's by developers, obviously. And we're really striving to provide people with the opportunity to not only extract information and put it where it needs to go, but transform it in the ways they'd like to, and have it ready for their product to ingest. So if I wanted to give a quick example, I would say if you're building accounting software, you need to go get your customers' data from QuickBooks, Salesforce, and Xero maybe. The problem is that some of them have custom objects and some of their data is in different formats, so you'd use Hotglue to extract that information and then transform it into whatever data format you want, and then ingest it into your own product after you output it to S3 or PostgreSQL or wherever you're looking to pick up your data.
[00:05:15] Unknown:
Who are the target end users of Hotglue? You mentioned that it's for developers, but what are the types of companies and engineers that are most likely to need this type of solution? And what are the biggest data problems that they're facing?
[00:05:30] Unknown:
What we've come to realize is really that it's kind of taking away the job of a data engineer. So really any engineer can handle the data integration problem using Hotglue. You have a JupyterLab notebook inside of Hotglue that allows you to define transformation scripts. So that allows these software engineers, who may not necessarily be data engineers for their companies but just regular software engineers, to do the ETL pipeline. When you look at the companies that would use this product, really any B2B company that deals with data can benefit from Hotglue, but the real problem we're solving right now, especially for our initial customers, is traction issues. So when a customer only supports a QuickBooks pipeline and then they have to manipulate it and change it because a new customer comes with custom objects or something like that, it tends to be a problem and is a huge tech debt for them. So what we really do is provide people the opportunity to support more sources and have all those pipelines maintained, without the hassle of working on that tech debt. So it's a traction problem where they're able to provide more sources than maybe a competitor, and they're able to pull in customers for that reason and don't have to deal with managing the pipelines.
[00:06:37] Unknown:
You mentioned an example of a QuickBooks data pipeline where you have the end user's customer. It's interesting talking in the B2B space because there are your customers, and then there are the customers of your customers, so terminology gets a little confused. But for the case of your customer, they might want to allow their end users to take their QuickBooks information and load it into your customer's platform to maybe get an analysis view of their banking information or their finances. And then maybe they also want to support a different system for pulling in their banking information, so maybe go directly to Bank of America or support local banks. So I'm wondering if you can talk to some of the ways that manifests in terms of the complexity that your customers are dealing with.
[00:07:33] Unknown:
That's actually one of the main things that Hotglue is designed to solve. One customer might have the same type of data in a different spot. So how do you make sure that each of your customers has a similar experience? The way we solve that is by creating this concept of flows. The idea of a flow is that it's any type of data that your product needs. So like your example, a flow could be invoices. Now your invoices could be in a different platform than another customer's invoices. So we create a list of supported sources for invoices, so it could be QuickBooks and Xero, for example.
Now when a customer comes in to import their data and they say they want to import invoices, it asks them, where are your invoices? Are they in QuickBooks or Xero? And based on what they say there, it'll import the data from the corresponding place and ingest it into the application.
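The flow concept described here can be pictured as a mapping from a data type to the sources allowed to supply it. The following is a minimal Python sketch under that reading; `Flow`, `FLOWS`, and `resolve_source` are hypothetical names made up for illustration and are not Hotglue's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Flow:
    """A kind of data the product needs, plus the sources that can supply it."""
    name: str
    supported_sources: List[str] = field(default_factory=list)


# Hypothetical registry with one flow, mirroring the invoices example.
FLOWS: Dict[str, Flow] = {
    "invoices": Flow("invoices", ["quickbooks", "xero"]),
}


def resolve_source(flow_name: str, chosen_source: str) -> str:
    """Return the normalized source name, or raise if the flow does not support it."""
    flow = FLOWS[flow_name]
    source = chosen_source.strip().lower()
    if source not in flow.supported_sources:
        raise ValueError(
            f"{chosen_source!r} is not a supported source for {flow.name!r}; "
            f"choose one of {flow.supported_sources}"
        )
    return source
```

An end user who answers "QuickBooks" when asked where their invoices live would resolve to `"quickbooks"`, while an unsupported answer is rejected before any import is attempted.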
[00:08:26] Unknown:
What are some of the types of tools or solutions that they might be using, and what are some of the sharp edges or missing pieces that make this so complicated for being able to handle bringing in these new data sources for their end users?
[00:08:41] Unknown:
Yeah. That's a great question. So the number one thing we see is that these small development teams that start off building a product, they decide they want to support one source. So it might be QuickBooks or it might be something like Salesforce. And so they'll build a standard integration that, you know, hooks in directly to their product. And then problems start arising as soon as they start getting customers because data is very different than what they expected it to be like. And so that sort of leads them to scramble for, okay, what should we do? Should we move this completely out of our product? Do we expand the internal data integration thing we've already written? And so the result kind of ends up being either they have some professional services team that cleans up the data and creates a unique pipeline for every customer they have, which is pretty expensive for a small team. Right? They don't have the resources to just outlay developers to go onboard customers when they're still, you know, doing performance work on the product. Or what also ends up happening is they build their own solution that can scale by integrating several other tools. So it might be that they use something like Stitch, so they can use that to actually move the data around and into their own data warehousing system. And then they'll use some other tool for transformations. Maybe they write something of their own that runs on AWS. But the big problem with these solutions is they become bigger and bigger as they get more customers and they support more integrations.
And it just becomes some tech debt that these guys have to keep managing as they grow as a company. The real trouble here is for small companies, that's not great because it's not something that impacts their core business value, but it takes away from the engineering team's time to build out the core product.
[00:10:24] Unknown:
And so for people who are adopting Hotglue, how does it help to simplify the work that they have to do for being able to allow their end users to onboard new data sources and provide them the flexibility and free up the resources that are currently being consumed by trying to custom craft ways to bring in these various different types of data?
[00:10:47] Unknown:
Like David said, the idea is to make everything as simple as possible to the point where a standard developer can handle it, and it doesn't have to be somebody who's so well versed in one of these ETL tools or knows a lot about, you know, orchestrating workflows for moving data around. So Hotglue manages all of the running of the workflows and creation of the flows. It's just a matter of deciding what sources you wanna use, and Hotglue has, you know, a ton of sources that work out of the box. And then what do you want to do with that data? And so Hotglue handles running any transformation scripts, and we give developers the flexibility to choose how they want to write them. So our platform is built entirely on Python and allows them to choose, okay, I use a small amount of data, maybe I want to use Pandas to do some light data transformation, or maybe I have a lot of data and I need to use Spark to handle a lot of data transformation.
Either way, they don't have to worry about how do I deploy these changes in the ETL script? How do I actually orchestrate this? How do I get it to run? Hotglue handles all of that, and that's why, you know, the complexity for the developer is pretty small.
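To make the idea of a transformation script concrete, here is a minimal sketch that renames source-specific invoice fields into one common shape. It uses only the standard library rather than Pandas or Spark so that it runs anywhere. The `FIELD_MAPS` table, the `transform` function, and every field name in it are assumptions for illustration, not the real QuickBooks or Xero schemas and not a Hotglue interface.

```python
from typing import Dict, Iterable, List

# Hypothetical per-source field mappings: target field -> source field.
# The source field names are illustrative only.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "quickbooks": {"invoice_id": "DocNumber", "total": "TotalAmt", "date": "TxnDate"},
    "xero": {"invoice_id": "InvoiceNumber", "total": "Total", "date": "Date"},
}


def transform(source: str, records: Iterable[dict]) -> List[dict]:
    """Rename source-specific fields into one common invoice shape."""
    mapping = FIELD_MAPS[source]
    out = []
    for record in records:
        # Missing source fields become None instead of raising.
        row = {target: record.get(src) for target, src in mapping.items()}
        # Totals may arrive as strings or numbers; normalize to float.
        row["total"] = float(row["total"]) if row["total"] is not None else None
        row["source"] = source
        out.append(row)
    return out
```

Whatever engine is used, the product downstream only ever sees the common shape, which is what lets one ingestion path serve customers whose data lives in different systems.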
[00:11:53] Unknown:
And so digging more into Hotglue itself, can you talk a bit about how it's implemented and the steps that are involved in getting it integrated into your customer's environment and into their workflow?
[00:12:06] Unknown:
When a user first comes on to Hotglue, they're asked to create a series of flows. And like we discussed earlier, flows are any type of data that they need their users to import. And then they'll go through and configure what sources they want their users to import from. So, like, for invoices, they might use QuickBooks and Xero, or for sales data, they might use Salesforce and HubSpot. Once they do that, they can create these transformation scripts inside of our product, or they can use sample transformation scripts that we've already written. And from there, what they do is they embed a widget into their web app. And the widget's goal is to make a very nice UI inside of their product that handles all the complexities of sending them to OAuth portals or grabbing API key credentials securely or, like, S3 access, stuff like that.
It's all working out of the box. It's very similar to the way Stripe Checkout is designed and the motivation behind it. Instead of dealing with payment information directly, you have Stripe do it for you, keep everything secure, and get it to their servers.
[00:13:11] Unknown:
In terms of the architecture of the system, how has it evolved since you first began working on it, and what are some of the assumptions that you had going into this project that have either been updated or invalidated or challenged in the process of building it out and bringing on customers?
[00:13:29] Unknown:
Yeah. I would say there's a couple things that have really changed since we began talking to customers. The first is that we were initially just a transformation platform. We were looking to just do transformations and make it easier for companies to ingest some data, but then we realized that there are so many people in the market who do a piece of the puzzle, and we wanted to do the whole puzzle. And so we decided to expand out to the entire ETL pipeline from end to end. And then the second thing, I would say the biggest realization we had, is that we initially believed that people would prefer an on-prem model. So we thought that people would, you know, appreciate the security of an on-prem model versus a cloud-based solution. But after talking with a lot of people, they were like, yeah, we'd much rather have a cloud-based solution. So we switched from offering an on-prem and cloud-based solution to just offering a cloud-based solution for now. And so I would say those are the two biggest assumptions that we had challenged and eventually had to change our focus on.
[00:14:24] Unknown:
Once a customer has Hotglue set up in their environment, what's the process for actually exposing it to their end users and integrating it into their application and their offering?
[00:14:36] Unknown:
Yeah. So they have two options. They can use the widget that I described earlier, and all they've got to do to put that in is embed it into their own web app. And then using some simple JavaScript, they can show the widget to their customers, and the widget handles sending all the data to Hotglue. Or they can call our APIs directly, and that allows them to have a little bit more freedom regarding calling jobs, so they can schedule jobs on, like, a weekly, monthly, or daily basis, or they can call them programmatically through our API, and they can also link data sources programmatically so that they can offer their own interface.
[00:15:10] Unknown:
In terms of the capabilities that Hotglue offers for being able to integrate with external systems, I know that there are some pre-built integrations that your customers can just take and use. But for the case where they want to be able to offer an integration with a system that you haven't already worked with, what are their options for being able to provide custom capabilities and hooking into the rest of the machinery that Hotglue provides?
[00:15:38] Unknown:
That's part of the way Hotglue was designed, to allow our users to create their own taps and targets. Since we're built on top of the Singer spec for taps and targets, it's very easy for our users to go and create their own. And we have built-in functionality to allow them to import their own targets and sources so that they can do that. Or we also offer the ability to commission us to create certain sources if they don't have the ability to do that themselves.
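For readers unfamiliar with the Singer spec mentioned here: a tap is a program that writes JSON messages, one per line, to standard output, and a target reads them from standard input. The three core message types are `SCHEMA`, `RECORD`, and `STATE`. Below is a minimal sketch of a tap for a single stream; the `invoices` stream, its two fields, and the `last_id` bookmark layout are assumptions chosen for illustration, and the helper names `tap_messages` and `run_tap` are hypothetical.

```python
import json
import sys
from typing import Iterable, Iterator, TextIO


def tap_messages(stream: str, rows: Iterable[dict]) -> Iterator[dict]:
    """Yield Singer messages for one stream: SCHEMA first, then RECORDs, then STATE."""
    yield {
        "type": "SCHEMA",
        "stream": stream,
        "schema": {
            "type": "object",
            "properties": {"id": {"type": "integer"}, "total": {"type": "number"}},
        },
        "key_properties": ["id"],
    }
    last_id = None
    for row in rows:
        yield {"type": "RECORD", "stream": stream, "record": row}
        last_id = row["id"]
    # STATE lets a later run resume; the layout of "value" is up to the tap.
    yield {"type": "STATE", "value": {"bookmarks": {stream: {"last_id": last_id}}}}


def run_tap(rows: Iterable[dict], out: TextIO = sys.stdout) -> None:
    """Write one JSON message per line, the form a Singer target reads on stdin."""
    for message in tap_messages("invoices", rows):
        out.write(json.dumps(message) + "\n")
```

Because the contract is just lines of JSON on a pipe, a custom tap written this way can in principle be paired with any target that follows the same spec.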
[00:16:11] Unknown:
Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours or days. Datafold helps data teams gain visibility and confidence in the quality of their analytical data through data profiling, column-level lineage, and intelligent anomaly detection. Datafold also helps automate regression testing of ETL code with its Data Diff feature that instantly shows how a change in ETL or BI code affects the produced data, both on a statistical level and down to individual rows and values. Datafold integrates with all major data warehouses as well as frameworks such as Airflow and dbt and seamlessly plugs into CI workflows.
Go to dataengineeringpodcast.com/datafold today to start a 30-day trial of Datafold. Once you sign up and create an alert in Datafold for your company data, they'll send you a cool water flask.
With the Singer ecosystem, I know that there is some variability in terms of the quality or capabilities of the available plugins. And I'm curious what your experience has been working within that ecosystem and some of the ways that you have identified the plugins that are well structured and well maintained, and how you help to guide your customers in terms of either selecting or implementing high quality options for the taps and targets?
[00:17:39] Unknown:
Yeah. So that's definitely been something that's been discussed in the community a lot. Internally, what we're doing is we kind of verify that the taps on our system are working correctly and that they operate at a high level of quality. And if they don't, what we tend to do is we actually fork the open source versions, and we support our own version that runs alongside that. So sometimes they have compatibility issues, and that's something that we've addressed. And then other times it's just, like, bad documentation of the tap itself. So we aim to kind of alleviate that by offering our own forks of that. In that way, the open source community benefits from our efforts too.
[00:18:18] Unknown:
I know that there are other projects that are trying to improve the viability of the Singer ecosystem. The one that comes to mind most notably is Meltano, but then there's also another project, Airbyte, that is at least in some regards able to use some of the Singer plugins. I'm wondering if you have had any interaction with some of the other players in the ecosystem and just the engagement that you have with that overall community?
[00:18:45] Unknown:
Yeah. We've definitely looked at what Airbyte's been doing. We're pretty familiar with that. We haven't directly talked to the folks behind Airbyte, but we have been part of that open source community. I'm not sure that the Singer spec is something that is lacking. I think most people agree that the Singer spec itself isn't the problem. It's more that the people who create the taps aren't following the spec completely. And so I think the way Airbyte handles it is they kind of build stuff on top of the taps to make some of the data about what a certain tap supports programmatically consumable, because some taps support some features that other ones don't, and that's not very well documented. So that's something that we definitely support, and I think that's something that we have to deal with internally as well.
[00:19:26] Unknown:
In terms of your work of building an ETL engine that is consumable by business to business companies, what are some of the biggest design and implementation challenges that you're facing in terms of making sure that your product is accessible and usable for your customers and then providing useful abstractions or interface options for their end users?
[00:19:55] Unknown:
Yeah. I think one of the big things you learn pretty quickly when dealing with data integration in the B2B space is that customers have very different data needs. Especially bigger customers, they have data that's more spread out. So making sure that we have a standard that makes sense and can scale to different sizes of customers is important. We've set rules internally that kind of standardize the process a little bit more because the way things are architected right now, there's not a whole lot of standards that people follow. And so by making standards within Hotglue and being a fully end-to-end system, we're able to enforce some things that make these data integration pipelines simpler. The other problem there is scalability.
So from the beginning of Hotglue's creation, we've always been focused on scalability. And part of that means that when we have customers that are onboarding somebody who has data that's a little bit more spread out, they might need to do custom things, like maybe data that we thought was in one spot is actually in, like, 3 different places and needs to be combined. So giving our customers the granularity to be able to treat one customer differently than others while still operating the same way is something that's been a struggle. But I think that by keeping that goal in mind, it's going to be very useful to a whole wide range of people.
[00:21:10] Unknown:
In terms of working with b to b customers, what are some of the other interesting or unique complexities that they're dealing with, and what are some of the unexpected aspects of that problem domain that you've encountered while working on this product?
[00:21:26] Unknown:
I think something we found interesting, use case wise, is we've had some VCs reach out to us, not only to maybe provide us funding, but also because they're looking to build internal tools to keep track of their investments. So, you know, they have investments and they wanna hook up with their QuickBooks accounts and stuff like that and see what's going on with their sales and their accounting and all that. So they'll come to us and say, we're building an internal tool, and we'd love to use Hotglue to consistently update our information on how our investments are doing. So we never really approached that use case or thought about that use case, which has been pretty cool. For us, it's always been anomaly detection software, accounting or other marketing analysis software, things like that is what we thought the main target would be. But the VC one has been interesting, and I know it's been a few VCs now that have reached out to us about that kind of thing, and so we thought that was pretty interesting.
[00:22:14] Unknown:
And for products that have embedded Hotglue and are offering it as a means of onboarding their customers' data, what does the experience look like for the end user? How do they go about selecting the data sources that they want to work with and validating that any transformations being applied are meeting their expectations? And what is the overall experience of interacting with your tool via a second party that they're actually aiming to interact with, who has used your tool to facilitate that operation?
[00:22:53] Unknown:
The main way that our users have been exposing Hotglue to their customers is through our widget. And so the challenge there is making sure that the widget is something that kind of adapts to the styles of our customers. So a lot of the time, our customers' users don't actually realize they're in another product when they're connecting their data. And that's kind of the goal. Right? So the goal is that connecting your data should be a seamless process, and our goal really is to get it to a self-serve state. So instead of having to deal with a developer that's going through and connecting your data manually, you should have the granularity to be able to do that yourself and also decide what data you wanna expose and also, like you said, verify that the data is being ingested properly. So those are all things that we provide through the widget. Like, we show jobs that have run and whether there's some issue with the sync or if some data gets out of sync or if, like, data sources get disconnected, like, OAuth credentials can sometimes expire.
So all those sorts of things are made available through our widget, and the beauty of that is none of these developers have to deal with introducing that to their users themselves. They kinda trust Hotglue to do that, and we've had a really good response to that.
[00:24:06] Unknown:
And in terms of the use cases that you've seen, what are some of the most interesting or innovative or unexpected ways that you've seen it employed?
[00:24:14] Unknown:
Yeah. I'd point back to what I mentioned, like the VCs. We didn't really think about how internal tools could benefit from something like this, but at the end of the day it makes a lot of sense. And also we recently talked to someone who builds accounting anomaly detection software. So they're not really working with the idea of bringing data in and analyzing it, but really matching data. So they'd have multiple, as we mentioned earlier, flows, and those flows would just have to be matching data within their own software. We didn't think people might use Hotglue to compare streams of data and make sure they're the same. We thought, you know, they would bring in streams of data to analyze that data. Another way that we've seen people want to use Hotglue is not only bringing data in, but taking data out of their platforms. So sometimes they want to write a tap that connects to their own internal data and allows users to export
[00:25:04] Unknown:
it so that they can see it themselves. So maybe it's something as simple as exporting their data back into a CSV or something more complicated, like exporting it back to some CRM or something like that.
[00:25:16] Unknown:
For people who are running these import and export operations, is this something that's typically just going to be a one-off operation where somebody will load all of their data into this new company's tool one time? Or is it something that is also going to frequently be a scheduled operation where you want to maintain a synchronized copy of all of their Salesforce data or all of their QuickBooks information operating on some periodic schedule?
[00:25:44] Unknown:
The use cases we've seen so far have almost all been a recurring use where they need to keep data in sync. Actually, I can't think of somebody who's using Hotglue right now that's used it as a one-off thing, especially because a lot of times users need to keep their data in more than one place. So maybe they need to keep the data in Salesforce, and they enter data in Salesforce, but our customers' platforms need access to the data as it becomes available. Right? So keeping it in sync is definitely the predominant use case.
[00:26:15] Unknown:
In terms of your own experience of building the Hotglue product and bringing on customers and talking to end users about it, what are some of the interesting or unexpected or challenging lessons that you've learned in that process?
[00:26:28] Unknown:
I would kinda say that brand is everything in this space. It comes across that way at least. So, you know, when we're talking to customers, we are a start up. We're two younger people. And trusting people with your data is a really important thing. Even though a lot of people's core product doesn't include the idea of data integration, having the correct data is extremely important. So what we realized is that going after partnerships and having other people basically sell the product for us, people who already have a brand name in the space or consult about setting up workflows and stuff like that, is definitely the way to go. So I would say that's been a really interesting challenge that I didn't think we'd run into. It's this whole idea that if you build a better mousetrap, the world is not gonna beat a path to your door. The fact that we really have to focus on showing that we're a brand that can be trusted, more than just having the best product out there, I think that's been a big challenge for us.
[00:27:22] Unknown:
And for people who are working on this use case of wanting to enable their end users to onboard data from other systems, what are the cases where Hotglue is the wrong choice and they might be better served either building something internally or using a different system or platform?
[00:27:39] Unknown:
Hotglue is specifically focused on the developers that are building a product. And usually, that use case is pretty broad, and Hotglue can handle the data. The specific instances where I would say Hotglue is not the choice is, like, financial data. Plaid is a great example of something that has really standardized data that was not standard before. And if you're looking to do that, it might be something that comes before Hotglue. So Hotglue isn't always the best way to extract the data, and that's why we offer the ability to create custom taps. But in terms of transforming data and offering a nice experience to users for getting the data in there, I would say Hotglue is usually a good route for B2B developers that are building a new product.
[00:28:23] Unknown:
As you continue to build Hotglue, what are some of the features or capabilities or improvements that you have planned for the near to medium future?
[00:28:32] Unknown:
One of the major things we're doing is allowing our users to export their own data. So if a user imports data into one of our customers' platforms, they might want to export that data later on. It's not something we fully support yet, so we're trying to get to a stage where that's something that's pretty common. The other thing is we've had some users ask for, like, real time synchronization. That's not something we offer, but that's something we're definitely looking to do. And beyond that, it's just making sure that we're scalable and that we support technologies that allow for speedy transformations of large amounts of data. So stuff like Spark.
[00:29:09] Unknown:
Are there any other aspects of the Hotglue project or the business to business data integration space or just the overall experience of building out a business that we didn't discuss yet that you'd like to cover before we close out the show?
[00:29:23] Unknown:
Yeah. I mean, I think something that I just wanna mention is that we're both undergraduate college students, or were undergraduate college students. So it's just been really interesting, like, in a pandemic, in college, building out a company. I guess it's something that we've both talked about and wanted to share with people, which is that we know how hard it can be when you build a business and it's in a pandemic and you have to realize how hard you have to work. I'm kind of rambling here, but basically, we just wanted to say that you should always be confident in yourself and keep believing even when it doesn't seem like it's working, because I was someone who knew nothing about data integration before I started working on this product, and now I would say I'm pretty well versed. And I still feel like I make mistakes every day, but that's how you learn. So that's just for anyone who's interested in entrepreneurship.
You know, it's just a nice lesson to hear, and it doesn't really matter when you start doing it, or if you have all the knowledge in the world, just keep trying, keep learning.
[00:30:19] Unknown:
In terms of the business itself, how are you approaching sustainability of it? Are you taking funding, or is this something that you're just looking to bootstrap and just support as something that you're interested in engaging with and you're not looking for hypergrowth?
[00:30:34] Unknown:
We're definitely looking for hypergrowth. We believe that there's definitely an opportunity here to grow quite rapidly, and we have been completely bootstrapped so far. We've talked with some VCs about potentially getting some funding, but more than anything, the funding would come with the idea of a network and more users, more than the importance of money for us. So we are looking to grow pretty substantially. We believe that there's a serious market here for what we're doing. So, yeah, I think we're eventually going to take funding at some point. We're not exactly sure when, and it depends on how fast we grow. But we've definitely been in conversation with a few VCs about starting to speed up our process.
[00:31:11] Unknown:
Well, for anybody who wants to follow along with you and get in touch or keep up to date with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as a final question, I'd like to get your perspectives on what you see as being the biggest gap in the tooling or technology that's available for data management today.
[00:31:29] Unknown:
Yeah. I would say that kind of goes into why we started Hotglue. As young developers, when we look at the integration space, especially from the perspective of a developer who's building a product, you'd think that there was a product out there already that could handle simple integration pipelines. The comparison we like to make a lot when we explain Hotglue is that Hotglue is kind of like Stripe for data integration, and I really think that's true. When Stripe started, it wasn't the first payment processing solution. What they did is they just appealed to developers, and they made something that was just better. And that's what we're aiming to do. We want to make data integration something that developers that are building products don't have to spend so much time on, spend so much money on, and we want to make it something that's simple, that any developer can do. Abstracting that away is really the goal of this company, and we think that it really falls in line with what a lot of these companies we've talked to have said.
These people aren't, you know, experts on building data integration software, so they shouldn't be building it. They should leave that to somebody who can do that, and in a scalable way. Right? A lot of the ETL solutions that are already out there are really, really expensive. Their starting prices aren't designed for the scale-up model that startups need. Hotglue's are. And we aim to help these people bring their products to life. That's the goal.
[00:32:50] Unknown:
Well, thank you very much to the both of you for taking the time today to join me and share the work that you're doing at Hotglue and your experience of building up this business, particularly given the interesting times that we are in. Definitely appreciate the time and energy that you've put into this problem domain. It's definitely an important one. So I definitely wanna wish you the best of luck in that, and I hope you enjoy the rest of your day. Thank you so much. Appreciate it.
[00:33:18] Unknown:
Thank you for listening. Don't forget to check out our other show, Podcast.__init__ at pythonpodcast.com, to learn about the Python language, its community, and the innovative ways it is being used. And visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction to Hotglue
David and Hassan's Backgrounds
Overview of Hotglue
Target Users and Use Cases
Simplifying Data Integration
Implementation and Integration
Design and Implementation Challenges
User Experience and Use Cases
Lessons Learned and Business Strategy
When Hotglue is Not the Right Choice
Future Plans and Features
Building a Business During a Pandemic
Approach to Sustainability and Growth
Biggest Gaps in Data Management Tools