Summary
The global economy depends on complex and dynamic networks of supply chains powered by sophisticated logistics. This requires a significant amount of data to track shipments and the operational characteristics of materials and goods. Roambee is a platform that collects, integrates, and analyzes all of that information to provide businesses with the critical insights they need to stay running, especially in a time of constant change. In this episode Roambee CEO Sanjay Sharma shares the types of questions that companies are asking about their logistics, the technical work that his team does to provide ways to answer those questions, and how they approach the challenge of data quality in its many forms.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their new managed database service you can launch a production ready MySQL, Postgres, or MongoDB cluster in minutes, with automated backups, 40 Gbps connections from your application hosts, and high throughput SSDs. Go to dataengineeringpodcast.com/linode today and get a $100 credit to launch a database, create a Kubernetes cluster, or take advantage of all of their other services. And don’t forget to thank them for their continued support of this show!
- Atlan is the metadata hub for your data ecosystem. Instead of locking your metadata into a new silo, unleash its transformative potential with Atlan’s active metadata capabilities. Push information about data freshness and quality to your business intelligence, automatically scale up and down your warehouse based on usage patterns, and let the bots answer those questions in Slack so that the humans can focus on delivering real value. Go to dataengineeringpodcast.com/atlan today to learn more about how Atlan’s active metadata platform is helping pioneering data teams like Postman, Plaid, WeWork & Unilever achieve extraordinary things with metadata and escape the chaos.
- Prefect is the modern Dataflow Automation platform for the modern data stack, empowering data practitioners to build, run and monitor robust pipelines at scale. Guided by the principle that the orchestrator shouldn’t get in your way, Prefect is the only tool of its kind to offer the flexibility to write code as workflows. Prefect specializes in glueing together the disparate pieces of a pipeline, and integrating with modern distributed compute libraries to bring power where you need it, when you need it. Trusted by thousands of organizations and supported by over 20,000 community members, Prefect powers over 100MM business critical tasks a month. For more information on Prefect, visit dataengineeringpodcast.com/prefect.
- Data engineers don’t enjoy writing, maintaining, and modifying ETL pipelines all day, every day. Especially once they realize 90% of all major data sources like Google Analytics, Salesforce, Adwords, Facebook, Spreadsheets, etc., are already available as plug-and-play connectors with reliable, intuitive SaaS solutions. Hevo Data is a highly reliable and intuitive data pipeline platform used by data engineers from 40+ countries to set up and run low-latency ELT pipelines with zero maintenance. Boasting more than 150 out-of-the-box connectors that can be set up in minutes, Hevo also allows you to monitor and control your pipelines. You get: real-time data flow visibility, fail-safe mechanisms, and alerts if anything breaks; preload transformations and auto-schema mapping precisely control how data lands in your destination; models and workflows to transform data for analytics; and reverse-ETL capability to move the transformed data back to your business software to inspire timely action. All of this, plus its transparent pricing and 24*7 live support, makes it consistently voted by users as the Leader in the Data Pipeline category on review platforms like G2. Go to dataengineeringpodcast.com/hevodata and sign up for a free 14-day trial that also comes with 24×7 support.
- Your host is Tobias Macey and today I’m interviewing Sanjay Sharma about how Roambee is using data to bring visibility into shipping and supply chains.
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what Roambee is and the story behind it?
- Who are the personas that are looking to Roambee for insights?
- What are some of the questions that they are asking about the state of their assets?
- Can you describe the types of information sources and the format of the data that you are working with?
- What are the types of SLAs that you are focused on delivering to your customers? (e.g. latency from recorded event to analytics, accuracy, etc.)
- Can you describe how the Roambee platform is implemented?
- How have the evolving landscape of sensor and data technologies influenced the evolution of your service?
- Given your support for customer-created integrations and user-generated inputs on shipment updates, how do you manage data quality and consistency?
- How do you approach customer onboarding, and what is your approach to reducing the time to value?
- What are the most interesting, innovative, or unexpected ways that you have seen the Roambee platform used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Roambee?
- When is Roambee the wrong choice?
- What do you have planned for the future of Roambee?
Contact Info
Closing Announcements
- Thank you for listening! Don’t forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers.
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Links
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. Atlan is the metadata hub for your data ecosystem. Instead of locking your metadata into a new silo, unleash its transformative potential with Atlan's active metadata capabilities. Push information about data freshness and quality to your business intelligence, automatically scale up and down your warehouse based on usage patterns, and let the bots answer those questions in Slack so that the humans can focus on delivering real value. Go to dataengineeringpodcast.com/atlan today, that's a t l a n, to learn more about how Atlan's active metadata platform is helping pioneering data teams like Postman, Plaid, WeWork, and Unilever achieve extraordinary things with metadata.
When you're ready to build your next pipeline or want to test out the projects you hear about on the show, you'll need somewhere to deploy it. So check out our friends at Linode. With their new managed database service, you can launch a production ready MySQL, Postgres, or MongoDB cluster in minutes with automated backups, 40 gigabit connections from your application hosts, and high throughput SSDs. Go to dataengineeringpodcast.com/linode today and get a $100 credit to launch a database, create a Kubernetes cluster, or take advantage of all of their other services. And don't forget to thank them for their continued support of this show. Your host is Tobias Macey. And today, I'm interviewing Sanjay Sharma about how Roambee is using data to bring visibility into shipping and supply chains. So, Sanjay, can you start by introducing yourself?
[00:01:39] Unknown:
Good afternoon. My name is Sanjay Sharma, and I'm the CEO of Roambee. We are headquartered here in Silicon Valley, California.
[00:01:46] Unknown:
And do you remember how you first got started working in data? Yes.
[00:01:50] Unknown:
My previous startup was in the RFID space. And while there were a lot of merits to this technology, companies never had a line item budget to use RFID in their processes. So, obviously, for that company, our business model was outcome based. So we would go in, take a manual process, deploy our RFID platform, the hardware, and start deriving data, translating data into events, and finally, events into some business outcomes. And then the way we made money was basically proving that the ROI of RFID was much better than that of the previous process.
So, obviously, it's all hard. Right? I mean, you want to prove value to the customer, the before-and-after savings. It's always hard to be an outcome based business. But after a few implementations, we figured out, you know, what the secret sauce was to prove to the customer, you know, how RFID could save money. And a piece of that was sort of our licensing fees for the platform and the services. So while other companies were, you know, just licensing their platforms, we actually ended up being probably one of the few successful RFID companies at the time that were profitable and growing and making money.
That's a little bit of a journey in translating data into revenue.
[00:03:21] Unknown:
In terms of Roambee, can you give a bit of an overview about what it is that you're doing there, some of the story behind how you got it started, and why you decided that this was a problem area that you wanted to spend your time and focus on?
[00:03:34] Unknown:
Yeah. It goes back to the previous startup. Right? So with RFID, what we were solving was delivering visibility inside the four walls of the enterprise. Whether it was monitoring of inventory, whether it was monitoring movement of assets, or any other problem statement the customer had within the facility, within the four walls. After the company got acquired, you know, we started looking at, you know, what the next problem would be. And after talking to many enterprises, we found out that the real problem, and a bigger problem, was, you know, not having visibility outside the four walls of the enterprise. And initially, we thought, like, you know, the enterprises would know where their goods and assets are, and all we have to do is take all of that data and translate it into some very interesting efficiency outcomes.
But it was quite surprising when we started talking to the customers. They didn't even know, you know, where their goods or assets were once they left their dock door or left their facility. And that's sort of where the journey of Roambee started. Roambee is actually a data science company. But we think that in order to deliver very interesting analytics that can impact our customers' supply chains, it is very important to collect highly granular and accurate data. And that's sort of how we got into bringing the hardware component to our service. And our thesis is, if there is a sensor or a device that's available in the market, we will bring that device into our ecosystem.
But if there is no device that solves the use case, we will go build one. So we have a hardware engineering team. We constantly look at this ecosystem. And, obviously, one thing that I learned when you walk back, or top down, from data analytics to data collection: collecting 10, 20, you know, pieces of data for a shipment moving from, I don't know, Shanghai to Hamburg, that's very easy. But when you are talking about millions of shipments, 500,000,000 bins going through the supply chain, or 11,000,000 trailers moving from point A to point B delivering items, it's a completely different ballgame.
And that's where the enterprise grade sensing devices come into play that can, you know, self-heal. If there is a problem, we can basically make sure it has a heartbeat. And if it doesn't have a heartbeat, we can basically send a heartbeat, make it come alive. But at any point of time, collect a good amount of data that we can then use and extrapolate, using third party data streams as well, to basically deliver a business outcome. Right? And I can give you some examples. Right? So our customer basically is moving vaccine products, say, for example, you know, from the USA to Bangladesh.
Now if that vaccine is sitting at Dubai airport in room temperature for 5 hours, the efficacy of the vaccine is dramatically reduced, from one year to maybe 4 months, which impacts how these vaccines are going to be consumed by the individuals in Bangladesh. So collecting that data granularly and delivering value, delivering actionable intelligence for our customers, is what Roambee does, and does very well.
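The excursion scenario Sanjay describes boils down to summing the time a shipment spends outside its temperature range. This is an illustrative sketch, not Roambee's implementation; the reading format and the 8 °C threshold (a common cold-chain limit) are assumptions for the example.

```python
from datetime import datetime, timedelta

def excursion_minutes(readings, threshold_c=8.0):
    """Total minutes a shipment spent above a temperature threshold.

    readings: list of (ISO-8601 timestamp, temperature in Celsius),
    assumed sorted by time. Each reading's temperature is treated as
    holding until the next reading arrives.
    """
    total = timedelta()
    for (t0, temp0), (t1, _) in zip(readings, readings[1:]):
        if temp0 > threshold_c:  # interval [t0, t1) was out of range
            total += datetime.fromisoformat(t1) - datetime.fromisoformat(t0)
    return total.total_seconds() / 60

readings = [
    ("2022-06-01T10:00:00", 5.0),
    ("2022-06-01T10:15:00", 22.0),  # sitting on the tarmac
    ("2022-06-01T15:15:00", 22.0),
    ("2022-06-01T15:30:00", 6.0),
]
print(excursion_minutes(readings))  # 315.0 minutes above 8 C
```

A downstream rule could then map cumulative excursion minutes to a revised shelf life, which is the kind of actionable signal described above.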
[00:07:13] Unknown:
So as far as the personas of the people that are looking to a solution like Roambee, I'm wondering if you can give some of the kind of broad categories of what their positions are, the type of work or industry that they're working in, and some of the questions that they're looking to be able to answer about the different kinds of assets and resources that they are in charge of managing.
[00:07:41] Unknown:
I break these personas into a few flavors. Right? So, obviously, what comes to mind when you put a sensor on your goods and assets is that they must be high value, and, hence, they need to be secured. So the natural low hanging fruit for our customers is: can I basically use this technology to secure my products as they move from point A to point B? And secure means, you know, security from theft, security from getting misplaced, security across, you know, the many handshakes that happen within the supply chain, making sure there is an audit trail. So right off the bat, the individuals that we talk to are security personnel who are responsible for managing security within the supply chain. So that's sort of one persona.
The second persona is the supply chain logistics persona. These are individuals who are looking at optimizing the supply chain. And the first thing is, they don't even know where to optimize. Right? So the first thing they want to do is basically fix these sensors on various goods and assets and light up their supply chain. And when they light this up, and we basically do a good job of that, they identify the glitches that are there in the supply chain. And then these glitches become objectives, either to be improved or to be fixed. Right? And then there is a third persona, which is basically the business owner or the finance owner, either it's a CFO or it's a GM of a business unit.
And their problem statement is, okay. Now that I know in real time where these glitches are and I have confidence that my team will fix them, how much more can I derive in forecasting? So today, as an example, right, demand forecasting is a very nontrivial problem statement every customer faces. Can Roambee deliver some very interesting demand signals or inventory shock signals or delay signals that I can basically extrapolate from and adjust my inventory or adjust my delivery efficiencies? Right? And it's not a 1 month or a 2 month engagement. Right? It's iterative. You are continuously feeding data, training the models, and making sure you are getting closer to the truth.
And that's what Roambee helps enterprises with. Right? All our customers are Global 1000, Global 2000 kind of companies. So the value we deliver, you know, has a very exponential benefit, not only in terms of either increasing the top line or improving the bottom line, but also there are intangibles where we are delivering this data to the field force, which can now act on it. Right? Whether it's a warehouse manager or a logistics manager, which never used to be the case in a technology category where, you know, you are using barcodes and EDI and some of those things, because they were never enabled to make decisions.
Now with real time visibility, you can empower some of this field workforce to make decisions, and make them quickly.
[00:11:15] Unknown:
In terms of the types of information sources that you're working with, you mentioned barcodes and EDI. I don't know if you can expand on what that acronym stands for when I hand it back over to you. But I'm wondering what are the sources of information that you're dealing with and some of the structure that that information takes as you are, you know, collecting it at the source and then managing it into, you know, the raw stage of your data infrastructure?
[00:11:42] Unknown:
The dinosaur era of data collection when it comes to real time visibility is the barcode. An analogy I can give you is a typical FedEx or DHL package. Right? You know where it is only when it gets scanned in one of their hubs. And by the time you know it, the state or the movement of your courier or your documents has changed. Okay? So it's time delayed. You can't act on it. And most often, because it's a manual process, it's also error prone. EDI, think about it as old school email communication. Right? So when I, let's say, Lenovo or Apple or one of those companies, basically hand over my container of products or air cargo or pallets.
I hand it over to a transporter. I tell my transporter, hey, every time you basically, you know, have a meaningful update, you have to send it through electronic communication. It's like an email, but it's got its own language. You have to send it through an email-like format, and that's what EDI means: electronic data interchange. And what we are doing is we are basically replacing some of these old school technologies with real time visibility. So for example, if you were to put a tracker on your shipment, of course, you will get a barcode scan from the FedEx guys or the DHL guys, but we can do better. We can tell you every 15 minutes, you know, as granular as that, where it is. We can tell you the health of the shipment. Is it hot or cold? Was it mishandled?
Did somebody tamper? Is it separated from its other packages and cartons? So there is a lot more you can gain out of this real time visibility, and this then becomes extremely actionable.
[00:13:38] Unknown:
As far as the types of latencies that you are aiming for in this kind of real time approach, I'm wondering what are the delays that your customers are willing to accept, going from the barcode stage of, I just hope to be able to get some information, because hopefully it actually gets scanned at the right places and doesn't just skip a terminal, to, I'm using Roambee, so I can expect a latency of whatever that time delay is. And some of the other types of guarantees that you're focused on providing, whether it's in terms of accuracy of the information, how you manage things like data quality, and being able to manage some of the kind of provenance of that information: you know, I can verify that this sensor was processed through this port because I have some signature of the scanning device that registered the scan event, or something like that?
[00:14:35] Unknown:
Yeah. Absolutely. So if you look at it, right, I mean, there are various flavors of customer requirements. So if you're a coffee filter company, of course, you don't need to know every 15 minutes where your shipments are. But if you're a pharmaceutical company, you do. Okay? Now the coffee filter company, or a retailer who is making, you know, bath products and delivering to the Costcos and the Walmarts of the world, their problem statement is, did the product get delivered on time in full, and did it get delivered in a quality condition?
For example, one of the retailers I was talking to was saying, you know, I'm shipping 300 pallets to, for example, Costco. And Costco says, I received only 290, or sometimes they say, 10 were broken. Now I'm in a receivables dispute with Costco, which is elongating my order to cash cycle. So I don't get paid for a long time. And if I'm operating at a 2, 3% margin, it is impacting my cash flows. It will be impacting other KPIs from a finance perspective. And if I'm a $1,000,000,000 company, even, like, 2 days of savings on the order to cash cycle is huge.
So they come to Roambee and say, Roambee, I don't want you to come and tell me whether the shipment is on Highway 101 or, you know, I-5. All I need you to tell me, with a high level of granularity and confidence, is that this shipment, these 300 pallets, was indeed delivered. It was delivered on a Wednesday at 2 o'clock, when it was supposed to be delivered. It was in great quality. It was not tampered with. And if I can get that information with a high level of confidence from Roambee, I can share it not only with the likes of Costco, but I can use it to automatically trigger an invoicing process in my SAP.
So the meaning of tracking is now completely different for a retailer compared to a customer that was using us for security, compared to a customer that was using us to improve or maintain the efficacy of a pharma product. So there are various aspirations the customer uses us for. And what we like to tell our customers is, basically, this is a journey. Okay? You start with collecting data first, which basically tells you, you know, where the issues are. Then you basically use that data to start making some improvements. And our customers come to us, and on day 1, they would want all kinds of alerts. You know? For example, the truck stopped for longer than it was scheduled. The truck took a left turn. The truck is 50 miles away from the destination.
But once they start seeing the confidence in the data, most times our customers will come and say, Roambee, can you not send me the 5,000 alerts? Can you send me the 5 alerts that really matter? This means that the customer has now graduated to exception handling. And that's sort of one part of the journey. Right? So we start sending them only the problem alerts that matter, and that's sort of where AI, ML, and all of these interesting technologies come into play. Then the customer starts thinking about, hey, you have given me 5 alerts, but you have given them to me as and when they occurred. Can you start predicting some issues or disruptions in my supply chain?
So now, because our data collection is extremely savvy and granular, we are able to predict a spoilage risk, for example. We are able to predict the theft risk, able to predict the damage risk. We can do a better ETA on when it's going to be delivered to a Best Buy store in Florida, much better than anybody else. So that's the next level of graduation of the customer. And the customer then starts saying to us, hey, I love the fact that you're now delivering some predictions. Can you now integrate with my planning system?
And it could be an ERP. It could be a warehouse management system. And if you do that, we would be able to get a visualization of what was planned and what is happening on the ground from Roambee. And now we are in a better position to compare and start playing with some interesting knobs in the business that we are in. Right? And it could be demand forecasting, or it could be vendor managed inventory, or it could be as simple as customer satisfaction. And then finally, the final state that our customers look for is, having lived through all of this efficiency data, can I start modeling it? So, you know, most times you hear the buzzword digital twins.
Can I basically build a digital twin of one or more processes within my supply chain and feed that digital twin with Roambee-like sensor based data? And if I'm able to do that, I can model a much more efficient supply chain network. So that's sort of how we graduate our customers from the very basic, I'll tell you where things are, to actually being more efficient. And our ambition is to enable autonomy in our customers' supply chains. It's easier said than done. Right? I mean, autonomous means self healing, contextual, dynamic.
And we think we are far ahead of the pack because we are using sensors and deriving some sensor intelligence to enable various signals within the customer supply chain. So that hopefully gives you some idea of how we are unpacking the data journey for our customers.
[00:20:35] Unknown:
Data engineers don't enjoy writing, maintaining, and modifying ETL pipelines all day every day, especially once they realize that 90% of all major data sources like Google Analytics, Salesforce, AdWords, Facebook, and spreadsheets are already available as plug and play connectors with reliable, intuitive SaaS solutions. Hevo Data is a highly reliable and intuitive data pipeline platform used by data engineers from over 40 countries to set up and run low latency ELT pipelines with zero maintenance. Boasting more than 150 out of the box connectors that can be set up in minutes, Hevo also allows you to monitor and control your pipelines. You get real time data flow visibility with fail safe mechanisms and alerts if anything breaks, preload transformations and auto schema mapping to precisely control how data lands in your destination, models and workflows to transform data for analytics, and reverse ETL capability to move the transformed data back to your business software to inspire timely action.
All of this, plus its transparent pricing and 24/7 live support, makes it consistently voted by users as the leader in the data pipeline category on review platforms like G2. Go to dataengineeringpodcast.com/hevodata today and sign up for a free 14 day trial that also comes with 24/7 support. In terms of the implementation of the Roambee platform, I'm wondering if you can break it down into the kind of sensor level capabilities and integrations that you're working with, the data infrastructure and architectural elements that you're providing, and some of the types of integrations that you have to build to be able to support the downstream use cases that your customers are looking to power?
[00:22:12] Unknown:
Obviously, the first layer in the architecture is the hardware abstraction layer. This is something we learned through a lot of failures. Right? You've got fast throughput data coming from these sensors. You need to really have, like, a limitless messaging pipeline that is, one, listening to this data and acting on that data. So how can you basically build this limitless message pipeline that is processing messages at speed? So our first, bottom layer, I would call the hardware abstraction layer, because we can translate any message. So even if it's a proprietary device, a non-Roambee sensor, we should be able to translate that message, parse it, and push it into our pipeline so that we can process it very fast.
For that, we've been basically, you know, using, you know, things like Kubernetes and a Kafka architecture, you know, also dockerizing our container based services at scale. Right? Once you have collected, translated, extrapolated, and indexed this data, you're now trying to make sense of it: are there any meaningful events that I can break out of it? And maybe the data is an event in itself, or maybe the event has to be derived out of multiple streams of data. And that's the decision making that sort of happens. I think about it like the eventing layer. Right? And then you start translating those events into signals.
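A hardware abstraction layer like the one described can be sketched as a registry of per-device parsers that normalize every raw payload into one canonical message before it enters the pipeline. This is only an illustration of the pattern, not Roambee's code; the device types, payload formats, and field names are all invented for the example.

```python
import json

# Registry mapping a device type to a parser that emits one canonical shape.
PARSERS = {}

def parser(device_type):
    """Decorator that registers a parser for one (hypothetical) device type."""
    def register(fn):
        PARSERS[device_type] = fn
        return fn
    return register

@parser("native-json")
def parse_native(raw):
    # Hypothetical native device: JSON payload.
    msg = json.loads(raw)
    return {"device_id": msg["id"], "ts": msg["ts"],
            "lat": msg["lat"], "lon": msg["lon"], "temp_c": msg["temp"]}

@parser("thirdparty-csv")
def parse_csv(raw):
    # Hypothetical proprietary device: comma-separated payload.
    device_id, ts, lat, lon, temp = raw.split(",")
    return {"device_id": device_id, "ts": ts,
            "lat": float(lat), "lon": float(lon), "temp_c": float(temp)}

def normalize(device_type, raw):
    """Translate any supported device's payload into the canonical message."""
    return PARSERS[device_type](raw)

print(normalize("thirdparty-csv", "bee42,2022-06-01T10:00:00,25.25,55.36,21.5"))
```

In a real deployment the normalized messages would then be published to a topic (for example in Kafka) for the eventing layer to consume; adding a new sensor vendor means adding one parser, with everything downstream unchanged.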
And think about it: the signal could be, as I said, a demand shock signal. A signal could be an I-delivered-on-time-in-full signal. A signal could be a delivery confirmation. A signal could be noncompliance with a temperature threshold. A signal could be, your shipment is going in the wrong direction. Right? So over a period of time, we think we will build 500, 600, 700 signals that are derived from sensor and sensorless data. And sensorless data could mean weather data, it could mean traffic data, it could mean ocean data, it could mean just a variety of other signals that basically we can bring. And maybe we have 5, 6, 7 weak signals, but the combination of 7 weak signals could be a derived, high confidence event that basically we can deliver to our customers.
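The idea of fusing several weak signals into one higher-confidence signal can be illustrated with a simple weighted score. The signal names, weights, and threshold below are made up for the sketch; a production system would learn or tune them rather than hard-code them.

```python
# Hypothetical weak signals and weights; each confidence is in [0, 1].
WEIGHTS = {
    "port_congestion": 0.3,
    "severe_weather": 0.3,
    "route_deviation": 0.4,
}

def delay_risk(signals, threshold=0.5):
    """Fuse weak signals into one delay-risk verdict.

    signals: mapping of signal name -> confidence in [0, 1].
    Returns (score, fired), where fired means the combined
    evidence crossed the alert threshold.
    """
    score = sum(WEIGHTS[name] * conf for name, conf in signals.items())
    return round(score, 3), score >= threshold

print(delay_risk({"port_congestion": 0.2}))   # one weak signal alone: no alert
print(delay_risk({"port_congestion": 0.8,
                  "severe_weather": 0.7,
                  "route_deviation": 0.6}))   # several together: alert
```

This is what lets the platform send "the 5 alerts that really matter" instead of 5,000: no single weak signal fires an alert, but their combination can.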
So that's sort of how I see this architecture play out. Our integration capabilities are extremely consumable through webhooks. You know, we think integration should not take months. It should take days. We have, you know, very classic standard plugins when it comes to SAP or Oracle or one of those planning systems that our customers have, so they can basically take advantage of it. So that's how our architecture is laid out. And obviously, you know, we have some enterprise applications on top of it. We think it's best to deliver flexibility to the customer when it comes to the dashboards and the visualization of this data: it can either reside in an SAP kind of a system, or it could be in some kind of a BI tool. Or if the customer wants to have it live within the Roambee ecosystem, that would work as well. So that's how we basically deploy our solution.
It's running on any PaaS because it's completely dockerized. So we run on Azure, the classic Amazon, IBM Bluemix, and Red Hat. But there are also customers who want us to run on a private cloud. And having this architected in a manner that we can pick up these containers and deploy them on any data center or bare metal makes it easy from a DevOps perspective to manage this very well globally.
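For the webhook integrations mentioned above, a consumer on the customer's side might look like the sketch below: verify an authenticity signature over the delivery, then trigger the downstream workflow. The header scheme, shared secret, and payload fields are assumptions for illustration, not Roambee's actual API.

```python
import hashlib
import hmac
import json

SECRET = b"shared-webhook-secret"  # hypothetical per-customer secret

def verify_signature(body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 signature over the raw request body."""
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

def handle_webhook(body: bytes, signature_hex: str):
    """Validate one webhook delivery; return the parsed event or None."""
    if not verify_signature(body, signature_hex):
        return None  # reject forged or corrupted deliveries
    event = json.loads(body)
    if event.get("type") == "delivery_confirmed":
        # Here the customer would trigger their own workflow,
        # e.g. kick off invoicing in the ERP.
        pass
    return event

body = json.dumps({"type": "delivery_confirmed", "shipment": "SH-1"}).encode()
sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
print(handle_webhook(body, sig)["type"])  # delivery_confirmed
```

The design choice matches what Sanjay describes later: the platform only *triggers* the customer's existing workflow, so the handler stays thin and the workflow logic stays where it already lives.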
[00:26:06] Unknown:
In terms of the evolution of the ecosystem around the types of sensors that you're working with, the formats that the data packets are generated in from those different sensors, the volume of shipments, and the types of analysis that companies are looking to perform on their supply chain logistics, I'm wondering how those shifting targets have influenced the way that you think about the kind of design and focus of Roambee, and where you have been targeting your investment to be able to kind of stay ahead of the curve and provide value to your customers as their capabilities and goals are constantly in flux as well?
[00:26:49] Unknown:
It applies to all the silos or modules that bring this service together. So let's start with the sensor. Right? We think, at scale, the sensor should be easy to use. What I mean by that is easy and smart. Right? So our sensors today are purpose-built for monitoring of these goods and assets. What I mean by that is, the sensor is smart enough to know the journey has begun. The sensor is smart enough to know the journey has ended. The sensor is smart enough to know it's on land, in the air, or on the ocean, and possibly reconfigure itself properly. The sensor is smart enough to know, hey, I'm going to die. I have only x amount of battery life, so give me a boost.
So that's the part of the sensing capability that's very smart on the sensor side. But when it comes to processing or decision making, we can do some of it at the edge, or we can push it to the cloud. And that's sort of where, you know, we bring that flexibility of decision making. The second part is basically starting to think about, once this data gets collected into our cloud stack, how it is going to be massaged or how it could basically be filtered through. And there are some very interesting industry standard filters available that we use, but there are some proprietary ones as well. Right? And a lot of it is about learning. Unfortunately, these devices are not as powerful as our iPhones.
The iPhone you charge daily; these devices have to live for 90, 100 days on a single charge, transmitting at a 15, 20 minute interval or even a 1 hour interval. So the use cases are very different when you start thinking about mobility in the supply chain. The other piece I also see is that customers have a very diverse IT ecosystem. Right? You've got Wi-Fi, you've got Bluetooth, then you've got an ERP, you've got a transportation management system, you've got a yard management system. And the customer is basically now wanting to take advantage of some of the signals we deliver. For example, we have a customer who said, Roambee, can you tell us when the shipments are 50 miles away from my destination, so that I can take that feed?
I can start thinking about what is the open dock door that I can assign you to. So when you come in, you are not spending 7, 8 hours waiting for deliveries. You know exactly which gate, which dock door you are going to go to. And by doing that, I'm compressing my transportation cost, for example. There are many, many scenarios that our customers come up with. The ones we embrace are standard and repeatable.
[00:29:41] Unknown:
The others are something that we empower our broader partner ecosystem to build on. In that situation that you're giving an example of, where your customer says, I want you to trigger an alert when you're within a particular range of the destination so that I can do some appropriate routing, is that a situation where there would then be bidirectional communication, from you to them and then them sending information back to you so that you can feed it into some other destination systems? Or are you largely dealing with a 1 way feed of information, and then they do the processing on their end to figure out what that routing looks like?
[00:30:20] Unknown:
Yeah. I think 70, 80% of the time, while we can operate bidirectionally, it's single direction. So think about it. We are triggers to their workflows. So these workflows are already defined. We don't want to win the workflow business. We don't claim to displace any of them. But what we do, and do very well, is trigger these workflows. As an example, right, I can only invoice if I deliver. So, Roambee, can you tell me confidently I have delivered? Okay. So that's a trigger that I can give you. Another trigger you talked about, and I gave an example. Right? 50 miles away from destination. That's a trigger. So these are interesting triggers that allow our customers to trigger some of those workflows which have been built over many years and which have been used by their field force for many, many years. So we don't want to change any of that. Right?
But then there are other triggers that we will enable the customers with. For example, can you basically detect there is a tamper? And if there is a tamper, there is a completely different escalation process that needs to be handled. Now interestingly, in the barcode world, the tampering was never detected, or it was detected only at the end of the journey. So there was no escalation procedure to follow through. But now with Roambee, the customers are basically reinventing themselves, because now we are giving more triggers that they could not even get out of their existing infrastructure.
And that basically helps the customer, you know, what we all call digitally transform themselves to be more agile and more proactive in some of their decision making through this complex supply chain.
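As an aside for data-minded listeners: a proximity trigger like the 50-miles-from-destination alert described above boils down to a great-circle distance check against a known destination point. The sketch below is a hypothetical illustration, not Roambee's actual implementation; the coordinates, threshold, and function names are all invented.

```python
import math

EARTH_RADIUS_MILES = 3958.8

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def proximity_trigger(position, destination, threshold_miles=50.0):
    """Fire the customer's dock-assignment workflow once the shipment
    crosses inside the alert radius."""
    return haversine_miles(*position, *destination) <= threshold_miles

# A ping ~300 miles from a hypothetical Los Angeles warehouse does not
# fire the trigger; a ping a few miles out does.
warehouse = (34.05, -118.24)
print(proximity_trigger((37.34, -121.89), warehouse))  # False
print(proximity_trigger((34.10, -118.30), warehouse))  # True
```

In practice the trigger would be evaluated on each incoming sensor ping and would call out to the customer's workflow system rather than print.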
[00:32:05] Unknown:
Another interesting element of the problem that you're describing, for the case where you wanna be able to trigger based on a particular time or distance to destination, is that that implies that you have to have information about what the destinations are, what the rate of travel is, and being able to incorporate some sources of information that aren't strictly tied to the specific asset or mode of transportation that you're dealing with. And so I'm wondering if you can talk to some of the platform capabilities that you've had to build in order to allow your customers to provide you with some of those additional values, and some of the other sources of information that you're pulling in, whether it's things like weather data, climate information, political events, to be able to add in some of your predictive capabilities. Oh, okay. I can see that there is a traffic event that's happening. So there was an accident on the highway, so I'm going to let you know ahead of time that your shipment is going to be delayed by 3 hours. Just some of those other data feeds and sources of input that you're building to be able to have that kind of richer context around the alerting and triggering and predictive capabilities.
[00:33:14] Unknown:
It was interesting. When we started about 5, 6 years ago, we wanted our customers to enter the origins and destinations. Okay? And they loved it, okay, because they were only doing 10 shipments. And then 10 became a 100, and they said, why don't you integrate with our back end and pull some data, the origin and destination, from that shipment? And then a few months later, you know, the customer said, even I don't know, you know, until the last minute when the shipment is leaving, where it's destined to. Okay? Or sometimes that information is fed in a very time delayed manner. So we don't expect our customers to give us the origins and destinations.
Okay. We are in the business of visibility, so we know when our device moves out of a location. We know, through a lot of interesting ML and things like that, that that's been the origin and the shipment has started. And let's say from that origin, based on the patterns that we have seen in the last 6, 7, 10 months, the origin to destination pairs, let's say there are 8 origin destination pairs. So when it starts, there are 8 possible destinations the shipment could go to. But as the journey starts progressing, we start basically shortlisting those origin destination pairs, because we know which are the possible destinations.
It's basically getting shortlisted to a point where now we have absolute clarity, through our technology, that the destination is 100% point b and not point c. So this is all built into our system. You're using some very interesting probability theories here to start deciding where this shipment is destined to. So that's 1. And second is verification. Right? So once we basically make some of these estimations, at some point of time when the data feed of origins and destinations comes from the ERP into our system, we are able to verify, self correct, and make it more automated. Mind you, we are basically talking about not 1 to 5 shipments. We are talking about 10,000 shipments. We are talking about, you know, we worked with a very interesting pharmacy company that basically delivers cancer medication.
This medication, from the time it was developed, needs to be delivered to the patient within 8 hours. So there is absolutely no room for error. You basically make sure that your ETA is very, very calibrated, and it's within minutes of what was forecasted. So those are the kind of applications we are a very strong fit for. Now from a data perspective, right, most companies think that this is a very big data play. IoT is generating tons and tons of data, and we need to put in all these big data lakes and things like that. But it's interesting. After we started looking at and consuming this data, we found that data has its own expiry dates.
Okay. So for example, you know, the sensor said the temperature has gone up by 1 degree. Now that 1 degree exception or excursion has probably a life of 2 hours: either you act within 2 hours or you don't. But after that, that data is meaningless. So once you start, you know, putting some expiry dates around the various kinds of data streams you get, and then you start deriving some of those contextual signals, you're dealing with a small pool of data for our customers. So that's how we basically view the world of data, IoT data, and other data streams in our ecosystem.
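The "data has expiry dates" idea can be made concrete with a per-signal time-to-live. The sketch below is only a hedged illustration under assumed TTL values; the event kinds and lifetimes are hypothetical, not Roambee's actual configuration.

```python
from dataclasses import dataclass

# Hypothetical lifetimes, in seconds: a temperature excursion is only
# actionable for ~2 hours, while a delivery confirmation stays useful
# long enough to drive downstream steps like invoicing.
TTL_SECONDS = {
    "temperature_excursion": 2 * 3600,
    "tamper": 24 * 3600,
    "delivered": 7 * 24 * 3600,
}

@dataclass
class SensorEvent:
    kind: str
    created_at: float  # Unix timestamp of the reading

    def is_live(self, now: float) -> bool:
        """An event is live while its age is within its kind's TTL."""
        ttl = TTL_SECONDS.get(self.kind, 3600)  # default: 1 hour
        return now - self.created_at <= ttl

def prune(events, now):
    """Drop events whose actionable window has passed."""
    return [e for e in events if e.is_live(now)]

events = [SensorEvent("temperature_excursion", 0.0), SensorEvent("tamper", 0.0)]
# Three hours later the excursion has expired, but the tamper alert is live.
print([e.kind for e in prune(events, now=3 * 3600.0)])  # ['tamper']
```

Pruning this way is what keeps the working data pool small even when the raw sensor stream is large.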
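The destination shortlisting described a moment ago can be framed as a Bayesian update: start with a prior over historical origin-destination pairs, then reweight candidates as each ping shows progress toward or away from them, dropping candidates whose probability collapses. This is a toy sketch of that idea, not Roambee's algorithm; the logistic likelihood, coordinates, and cutoff are invented for illustration.

```python
import math

def dist(a, b):
    # Rough planar distance in coordinate degrees; a production system
    # would use proper geodesic distances.
    return math.hypot(a[0] - b[0], a[1] - b[1])

def update_posterior(posterior, prev_pos, cur_pos, cutoff=0.02):
    """Reweight candidate destinations by progress made toward each one,
    renormalize, and drop candidates that fall below the cutoff."""
    scored = {}
    for dest, p in posterior.items():
        progress = dist(prev_pos, dest) - dist(cur_pos, dest)
        # Logistic likelihood: moving toward a destination scores > 0.5.
        scored[dest] = p * (1.0 / (1.0 + math.exp(-20.0 * progress)))
    total = sum(scored.values())
    scored = {d: p / total for d, p in scored.items()}
    return {d: p for d, p in scored.items() if p >= cutoff}

# Three historical destinations with a uniform prior; two pings heading
# steadily toward the first one concentrate the posterior on it.
candidates = {(0.0, 1.0): 1 / 3, (1.0, 0.0): 1 / 3, (-1.0, 0.0): 1 / 3}
posterior = update_posterior(candidates, (0.0, 0.0), (0.0, 0.2))
posterior = update_posterior(posterior, (0.0, 0.2), (0.0, 0.5))
print(max(posterior, key=posterior.get))  # (0.0, 1.0)
```

The verification step mentioned above then maps onto comparing this converged estimate against the origin-destination feed that eventually arrives from the ERP.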
[00:37:12] Unknown:
In addition to that, you also have capabilities for customers to be able to build and provide their own integrations for different downstream data systems that they wanna be able to work with. And in addition to the sensor based capabilities for adding context to a shipment, there are options for user generated or user input data. And I'm wondering how you manage some of the quality controls around those kinds of destination integrations and some of the user inputs that are outside of the control and ownership of Roambee?
[00:37:47] Unknown:
It's hard. And it starts with the physical addresses themselves. Okay? So we are working with a chemical company, and the chemical company said, here are the 60,000 physical addresses that we deliver to. Now when you consume those 60,000 physical addresses, they're all wrong. Right? They are missing a digit, or they are in a place that's not identified by the map. So now you need to start cleaning this data, because this data is so important, because you've got to geofence this data. And geofencing in our business is very key. If you put the right geofence, you can get a lot of accurate events. But if your physical address is here, and Google Maps or any of the other mapping technologies says down the street, and my sensor is going in a third direction entirely, which 1 is accurate?
So it is very easy to manually move these dots and say, this is my new dot and this is my new physical address. But you're dealing with 6, 7 million records. How can you do that manually? So you almost need an engine that basically starts looking at, actually, where did the sensor go, and where did the sensor go not 1 time but 10 times? What does the map lookup look like? What is the address that was given? And combine that into 1 single source of truth. Okay. And do that a million times, over and over again. And then apply geofencing automatically.
Right? So the entire world of tracking today is, you know, drawing these orthogonal lines or boundaries around the area you want to geofence. Geofencing is a nontrivial problem. How do you apply geofences on 60,000 addresses automatically? Okay? And geofencing is also only as good as the signal strength of your tracker. So if you make the geofence too small, okay, your tracker might give you a false positive saying it is in, and sometimes it is not in. Right? Even though the physical goods might be still within a warehouse. So your geofencing has to be elastic, and that's sort of where ML comes in. Right? So you start with a bigger geofence, because you want to capture all of the dots that are there in that region, and you expand and contract until the geofence finds its right shape and size.
And to do that, you know, a few million times over without human intervention is again nontrivial. So that's how we basically up our game when it comes to location accuracy. We up our game in terms of ensuring that when we say the goods and assets are within a warehouse or within a local address, we have a high level of confidence in that information that we deliver to the customers. So that's how we basically do that. We, of course, integrate with any of the third party tools, whether it's a planning tool of the customer's, or sometimes it's also integrated with third party cloud analytics tools that are available to the customer.
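One way to picture the "elastic geofence" idea: center the fence on where the sensor has actually been observed to stop, then size the radius to cover most historical stops, with a floor so that weak GPS fixes inside the building don't read as "outside". This is a hedged sketch; the coverage fraction, radius floor, and distance approximation are assumptions, not Roambee's method.

```python
import math

def centroid(points):
    """Corrected site location: mean of observed sensor stop points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def meters(a, b):
    # Equirectangular approximation; adequate at warehouse scale.
    lat = math.radians((a[0] + b[0]) / 2)
    dx = math.radians(b[1] - a[1]) * math.cos(lat) * 6371000.0
    dy = math.radians(b[0] - a[0]) * 6371000.0
    return math.hypot(dx, dy)

def fit_geofence(stops, coverage=0.95, floor_m=50.0):
    """Circle covering roughly `coverage` of historical stops,
    never smaller than floor_m."""
    center = centroid(stops)
    dists = sorted(meters(center, p) for p in stops)
    idx = min(len(dists) - 1, int(coverage * len(dists)))
    return center, max(dists[idx], floor_m)

# Stops clustered around a hypothetical warehouse, plus one GPS outlier;
# a ~70% coverage target keeps the fence tight around the cluster.
stops = [(37.0000, -122.0000), (37.0002, -122.0001),
         (36.9999, -122.0002), (37.0100, -122.0300)]
center, radius_m = fit_geofence(stops, coverage=0.7)
```

Re-fitting as new stops arrive is what lets the fence expand and contract toward its "right shape and size", and the centroid doubles as the corrected dot for a bad postal address.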
[00:40:58] Unknown:
Prefect is the data flow automation platform for the modern data stack, empowering data practitioners to build, run, and monitor robust pipelines at scale. Guided by the principle that the orchestrator shouldn't get in your way, Prefect is the only tool of its kind to offer the flexibility to write workflows as code. Prefect specializes in gluing together the disparate pieces of a pipeline and integrating with modern distributed compute libraries to bring power where you need it, when you need it. Trusted by thousands of organizations and supported by over 20,000 community members, Prefect powers over 100,000,000 business critical tasks a month. For more information on Prefect, go to dataengineeringpodcast.com/prefect today. That's Prefect!
As far as the onboarding process, supply chain logistics is a very complex and interdependent area and requires a lot of moving pieces and integrations to be able to fully realize the potential for a solution like Roambee, or to understand what is the actual impact on my business and my shipment capabilities and my, you know, business logistics. And so I'm curious how you think about the overall onboarding process: doing a proof of concept with a customer, figuring out what is the actual targeted use case that we want to invest in, being able to identify, is this actually going to be useful for this user? Do they have the appropriate capabilities on their end to be able to take advantage of the capabilities that we're providing? And then being able to demonstrate the overall utility of Roambee in the shortest time possible?
[00:42:40] Unknown:
From our service perspective, a lot of the use case, product, solution fit happens during the sales process. So we exactly know what is going on. For example, you know, we are working with 1 of the largest retailers in Europe, who used to move products through trucks. And now with the fuel cost gone up and the truck journeys becoming unpredictable, they are moving to railcars. Now when you move to railcars, yeah, you get a much more predictable ETA. But railcars, you know, if a railcar is misplaced, you're talking about entire loads misplaced. Right? So, a few things that we do and do very well on the onboarding side, which is a combination of making the solution very simple. So it starts again with the hardware. Right? The hardware is always on. So when I ship this hardware to the customer, it is preconfigured.
You know, all of the black magic that goes into the hardware is all ready to go for that use case. The customer doesn't even have to push a button to turn it on. Because imagine, you've got a warehouse operator who wants to stick this on their pallet, and they wanna do that 10,000 times. You don't want to be at the mercy of the operator to know if the button was turned on or off on the device. So our devices are always on. You can only shut them off with a hammer or by throwing them in water. So that's an example of simplicity. Right? So we bring a lot of simplicity in our technology, in our solution.
Then the second part is the people. Right? So part of the onboarding exercise is to train the people. And training the people is through, you know, we have a Roambee University. We sort of, you know, make this more certification driven. So we will basically bring some personas into a virtual environment and train them and certify them. So that's the second part. And the third part is the process. Supply chain, through my experience, is all about SOPs. Okay. Can you give me a Word document with these 10 steps that I have to do all day long? And that's what we have. So we have invested in a lot of technology that allows us to build these SOPs out for the customers, so that we can give them to them and make them a part of their process.
And if you do these 3 things very well, there's a very little friction when it comes to customers embracing this new solution and realizing the benefits from the deployment.
[00:45:18] Unknown:
In your experience of building Roambee and working with your customers and working in this space of supply chain management, particularly in this, shall we say, interesting time of the past couple of years, with the pandemic and the changes in how workers are orienting themselves around how they think about the value and the types of jobs that they want to do, which introduces challenges around staffing levels, what are some of the most interesting or innovative or unexpected ways that you've seen Roambee used to be able to navigate this time?
[00:45:52] Unknown:
From an internal perspective, we also got hammered by the short supply of semiconductors. Right? So we had to innovate our own supply chain in making the devices. The demand is so huge right now that we are not able to, you know, deliver these services to our customers fast enough. So we had to take parts of our own device manufacturing supply chain into our own hands, and that was 1 interesting exercise. The second thing that we learned, from the customer, is every use case is very different. So how do you basically derive what parts of these use cases can be repeatable, and either bring it into a process or bring it into a technology for automation?
And it could be as simple as fixing this device. Okay. There are probably 10 different ways of doing this. Okay. You talk to the customer. So earlier, we all became very anxious, because every customer wants to apply this device in a very different fashion. And then we started applying science behind it. Right? What are all the possible ways that could happen? So if your shipment is circular, how do you deploy it? If it's a rectangle, how do you deploy it? If it's a wooden crate, how do you deploy it? If it's a container? So really making it a cookie cutter approach was something that we benefited from, by early access to these customers, by doing a lot of field visits and understanding the customer's problem statement. Right?
And lastly, in the very unique ways the customer is using this data. Sometimes the customer is using our data to really rate transporter behavior and transporter performance, which we didn't think was the value that we were delivering. Then there are some customers who are rating route quality. So if I were to ship this on 280 versus I ship this on highway 101, you know, what is the quality of these routes? So can I score these routes? And based on the product, can I pick 1 route or the other? It is okay to have a longer route, but a better route. Right? So that's something that we didn't realize we were built for, but the customers are using it. Then the customer is also wanting to look at load and unload times. Right? So in our technology, when your shipment enters a geofence, we deem it as delivered.
And the customer said, that's not good enough. It may be in a geofence, but it's still waiting to be unloaded. So can you basically give me the dwell time between arrival and unload? And that's sort of where our technology basically pushed the envelope based on the customer ask. Right? So these are some very interesting pieces of feedback that we got from the customers and brought back, into making them part of our onboarding process, or part of our technology and application, to make it just super simple. Right? Our customers constantly tell us, if you deliver value with no mouse clicks, that's the biggest benefit. But if you want to, you know, have me click 1, 2, 3 times in a warehouse that gets really mad after 3 o'clock in the afternoon, we might have visibility value, but it's not helping the field force that's actually doing this.
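The arrival-versus-unload distinction maps naturally onto a small event-log computation: dwell time is the gap between the geofence-entry event and the first unload event. Below is a minimal sketch under assumed event names ("geofence_enter", "unload_start"); the real signal derivation is of course richer than this.

```python
from datetime import datetime

def dwell_time(events):
    """Gap between entering the destination geofence and the start of
    unloading. Returns None if either event is missing from the log."""
    entered = unloaded = None
    for ts, kind in sorted(events):  # order events chronologically
        if kind == "geofence_enter" and entered is None:
            entered = ts
        elif kind == "unload_start" and entered is not None:
            unloaded = ts
            break
    if entered is not None and unloaded is not None:
        return unloaded - entered
    return None

log = [
    (datetime(2022, 6, 1, 14, 5), "geofence_enter"),
    (datetime(2022, 6, 1, 16, 35), "unload_start"),
]
print(dwell_time(log))  # 2:30:00
```

Aggregating this per dock door or per carrier is what turns a raw visibility feed into the load/unload performance metric the customer asked for.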
[00:49:08] Unknown:
In your own experience of working at Roambee and working in this space, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:49:18] Unknown:
Challenging lessons, there are quite a few. You know, because our entire business hinges on the sensors, the sensors have to be certified by country. Every country has its own laws. It has to work on almost any type of communication medium, and cell tower infrastructure is very different in different parts of the world. So just dealing with this was, well, we are a technology company. We are not prepared to deal with a lot of this paperwork. So, you know, obviously, that's 1 challenge and 1 surprise that came with running the business. The second was, basically, how do you charge the customers?
Right? So, obviously, our charging model is very much like a talk time model. So, you know, my family is on a 5,000 minute talk time plan. Whether I use it or not, I pay for 5,000 minutes. And that's sort of how we do this for our customers. But I think there is a better way, which is usage based. Right? So we are embarking on some ideas, working with the customer, to make it usage based. And lastly, can we basically bring the cost of the solution down so that it is on all packages and all shipments? Obviously, you know, we can't be a chip company overnight to build a system on a chip. But we feel that if the industry can go in the direction of a system on a chip, plus an antenna, plus a battery, all combined together in a disposable form factor for sub $1, I think there is no reason why every box shouldn't have this kind of technology.
I call that a challenge, but I also call that an opportunity for the industry in general to rally behind the transformation of supply chain.
[00:51:02] Unknown:
For people who are running a business that has significant supply chain requirements, or who are trying to gain better visibility into the supply chain that they have, what are the cases where Roambee is the wrong choice?
[00:51:14] Unknown:
I think Roambee is the wrong choice when you're looking to monitor vehicles and not shipments. So most times, we basically get compared with a Fleetmatics solution. Right? They say, hey, but that vendor is offering a $10 OBD-II device on a truck. We are more granular. So if you are looking to monitor a shipment within a truck, within a container, within a plane, within a railcar, we are it. Right? But if you're looking to just track the vehicle itself or the carrier itself, then we are not the 1. Second is, basically, if you are looking for high volume monitoring, we are the right company. So most times, you know, a customer will come and say, I've got a problem lane from, I don't know, from California to Rhode Island, and I only want to do that lane. And I do 10 shipments on that lane. And that's sort of what my universe looks like.
We are not the right guys. We might fail in very small implementations, but we are classically built for, and thrive on, chaotic multi-tier supply chains. You know, there's a lot of movement, a lot of volume of shipments moving between various nodes. And the third part is last mile. Right? So there are a lot of last mile delivery solutions. If the last mile delivery solution does not have a requirement for condition monitoring, we are probably not the right company to address that use case. You are better off, you know, having solutions that monitor the driver through a driver app and combine that with telematics of the truck.
And by doing that, you can get that last mile visibility at a much cheaper cost than Roambee.
[00:53:02] Unknown:
As you continue to build out Roambee's capabilities and evolve the platform, evolve the company, and continue to explore this drastically evolving space of logistics and supply chain management and supply chain analysis, what are some of the things you have planned for the near to medium term, or any projects that you're particularly excited to engage with?
[00:53:24] Unknown:
I think the 2 ambitions we have, 1 in the short term, is addressing spoilage in a big way. So we recently won an award from the FDA. Our solution is not only low cost, but actually has a lot of promise when it comes to farm-to-fork kinds of implementations. And part of that is also about choosing the right packaging for moving these products. And because we monitor a lot of things around the package, not only its health and its handling, but also its temperature, humidity, its tilt and shock and tamper, and many of these parameters, we think we are the company most suited to deliver a packaging recommendation for our customers.
And if the right packaging recommendation can take 50% of the risk of spoilage off the table, then the remaining 50% is in transit. And if we can solve that problem through granular visibility and real time intelligence, I think we can solve some of the glaring problems in the industry, not just on the food and fresh produce side, but also, you know, on the automotive side, on the retail side, on the pharmaceutical side. The second ambition, which is a little bit longer term, our thesis is the supply chain of tomorrow. And when I say tomorrow, it could take 3, 5, 7 years. It is going to be autonomous. And we are seeing a lot of early steps being taken by academia, by companies like us, by large companies like the SAPs, to go in that direction.
And we feel that Roambee is uniquely positioned to enable that autonomousness in the supply chain through a portfolio of these real time signals that we can derive from the movement of goods and assets, and to translate those signals to bring that autonomousness. So I feel we are just basically scratching the surface right now, quite excited about where we are. And in the tech stack where we are positioned, obviously, a lot of learning. So we always reach out to, you know, industry veterans. We have a customer advisory board. We also bring a lot of, you know, advisers in place from time to time, because the last thing we wanna do is step on a landmine without knowing we are going south. So a lot of humility, a lot of transparency as a culture in the company, to learn and be hungry and to commercialize some of these technologies for our customers.
[00:56:05] Unknown:
Are there any other aspects of the work that you're doing at Roambee or the overall space of supply chain analytics and logistics that we didn't discuss yet that you'd like to cover before we close out the show? The 1 thing, you know, that we are watching in this category and this space is satellite communication.
[00:56:23] Unknown:
All of the IoT technology today works on cellular. Is satellite communication the next kind of technology upgrade for a lot of IoT sensing? The second part that we are also watching very closely is energy harvesting. At some point, can we basically bring perpetual energy to some of these devices? And maybe it's not applicable to the Roambee use case. But when you start talking about disaster management and things like that, you almost need to have, you know, some form of perpetual energy in your IoT stack. And then lastly, can we basically take data that we have collected, and many others have collected, in a federated manner, and can we start thinking about the impact and how it's going to propagate in the network? So if you have, you know, 3 ships of avocados stuck at the Suez Canal, can you basically take that event and look at an impact on pricing, look at an impact on just the availability of avocados in 1 or many regions around the world? So I think there are quite a few things that are unsolved, but I think, you know, the industry generally is quite excited.
And these problems are coming upfront, which never used to happen. These problems are coming to us in a much more accelerated fashion because of the broken supply chain ecosystem we live in today. And, you know, COVID has just accelerated some of that brokenness.
[00:57:51] Unknown:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you and your team are doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today. I think it's cleansing of data.
[00:58:10] Unknown:
A lot of companies have done a lot of work on deriving value from the data. But I think the data cleansing, data scrubbing part of this chain is underserved. We wish we could have more technology looking at that. And lastly, simulating. Right? So 1 is cleaning. And if you've cleaned it well, can you basically, you know, multiply it a few more times so that you can use that data for simulation? I think these 2 pieces would really solve sort of the, you know, pain points that data engineers and scientists are impacted with.
[00:58:49] Unknown:
Thank you very much for taking the time today to join me and share the work that you and your team at Roambee are doing on being able to bring better visibility and autonomy to supply chain and logistics. It's definitely a very important and increasingly complex area. I appreciate the efforts that you're making to make it a more tractable and maintainable problem. So thank you again for taking the time today to join me, and I hope you have a good rest of your day. Thanks, Tobias. Bye bye.
[00:59:24] Unknown:
Don't forget to check out our other shows: Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used, and the Machine Learning Podcast, which helps you go from idea to production with machine learning. Visit the site at dataengineeringpodcast.com to subscribe to the show. If you've learned something or tried out a product from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction to Sanjay Sharma and Roambee
The Journey from RFID to Roambee
Personas and Use Cases for Roambee
Data Sources and Real-Time Visibility
Roambee's Architecture and Integration
Evolving Sensor Technology and Customer Needs
Data Accuracy and Geofencing Challenges
Onboarding and Implementation
Innovative Uses and Lessons Learned
Future Plans and Industry Trends