Making Wind Energy More Efficient With Data At Turbit Systems

Hello, and welcome to the Data Engineering Podcast, the show about modern data management.

What are the pieces of advice that you wish you had received early in your career of data engineering? If you hand a book to a new data engineer, what wisdom would you add to it? I'm working with O'Reilly on a project to collect the 97 things that every data engineer should know, and I need your help. Go to dataengineeringpodcast.com/97things to add your voice and share your hard-earned expertise.

And when you're ready to build your next pipeline or want to test out the projects you hear about on the show, you'll need somewhere to deploy it. So check out our friends over at Linode. With their managed Kubernetes platform, it's now even easier to deploy and scale your workflow, so try out the latest Helm charts from tools like Pulsar, Pachyderm, and Dagster. With simple pricing, fast networking, object storage, and worldwide data centers, you've got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode, that's L-I-N-O-D-E, today and get a $60 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show.

You listen to this show to learn and stay up to date with what's happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For more opportunities to stay up to date, gain new skills, and learn from your peers, there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to dataengineeringpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today.

Your host is Tobias Macey. And today, I'm interviewing Michael Tegtmeier about Turbit, a machine learning powered platform for performance monitoring of wind farms. So, Michael, can you start by introducing yourself?

Hi. Yeah, of course. So I'm Michael, and I'm the founder and CEO of Turbit Systems. We are basically a data analytics platform for wind turbines, and we have built some tools to make the maintenance and operation of wind farms more efficient. And, yeah, I'm looking forward to a great conversation.

And do you remember how you first got involved in the area of data management?

Yeah, of course. So my education is as a physicist. So I was somehow always dealing with data at the university when we made some experiments, and then you also had to write some programs to analyze data, of course. So that's one part where I got confronted with some data, let's say, and also maybe some super complicated data.

And in my studies, I wrote my bachelor's thesis about some measurements I did on wind turbines. As you may know, wind turbines need to be directed into the wind in order to generate power. What I was doing in that bachelor's thesis was making measurements with a laser system. It's called lidar, so light detection and ranging. With that laser system, you could see the wind direction and the wind speed in front of the turbine. At that time, that was somehow the vision: to control the turbine before the wind is actually at the turbine. So see what's coming in before the turbine and then have some control algorithms that turn the turbine in the right direction and pitch the rotor blades to the right angles before the wind is actually at the turbine. But the measurement itself was, of course, again with a lot of data, and you had to match the data of this new lidar system with the data from the turbine, like when is the turbine running, what wind speeds were measured with the lidar system, and what wind speeds were maybe measured with some other systems, like some anemometers on top of the turbine.

And so, to continue the story up until I get to Turbit: later on, I was still doing some physics. I did my master's in laser physics, but this time in pulse shaping, temporal and spatial pulse shaping. We did this by splitting a laser beam into its parts, into its different frequencies. Then you could control the polarization and the amplitude of each frequency. And then you put the laser together again, and you could have a laser pulse that's formed in the time domain. So, like, there's a lot of energy coming in the beginning of the laser pulse and then maybe later a little bit more. What we were doing there is that we were trying to get electrons out of the atoms, and we didn't really know how to push the electron, to put it in some easier words, maybe. So the electron is moving in the atom, and at some point in time, you need to push the electron out. We didn't really know how to form that pulse, so we did this with trial and error, basically with a genetic algorithm. And that was the first time where I saw the power of such algorithms.

I got super interested in what you could do with data analytics and, let's say, the first idea of machine learning. It's not really machine learning, but it's a first algorithm where a computer is searching and finding something out that you don't know about. And then later you try, as a physicist, to understand: okay, what was going on there? Why did it work? Why were we able to get the electron out of the atom just with this form of pulse?

And, yeah, that was quite interesting. And then later, with my knowledge about wind energy, I was thinking about what to do in life after studying. I was looking for something that is also, it maybe sounds stupid, making the world a little bit better. And I came to renewable energies, and I found wind energy most interesting because parts are turning, you have a lot of data, it's international, you can go around the world, and it has everything that you need for your brain to have some interesting things to work on.

So, yeah, that's also how I decided to found Turbit Systems, and actually this is quite an interesting story, maybe. The first time I got on a turbine was with these lidar measurements, and I got quite dizzy, like some sort of seasick. A turbine tower is, like, 100 meters high, and when you're at the top, whether the turbine is switched off or running, there's a lot of movement of the tower. And if you cannot look outside because you are inside of the tower, you get seasick. And I was thinking, okay, if there's so much vibration due to the wind, then, of course, you need to see some wind direction also in that movement. Then, together with some mates from the university, I was looking into that problem more deeply. And I found out, yeah, there's a relation between the wind direction and the type of movement of the tower. That also meant that you could maybe see the wind direction more precisely. So we did some measurements, and this is how I came to found Turbit, actually. And then later, we became more of a data science company.

Yeah. It's definitely a very interesting problem domain because, as you said, wind energy is ubiquitous in terms of its availability around the world because the air is always moving. So it's something that can provide a lot of benefit, particularly for countries who are just starting to build out their renewable infrastructure. I know that Germany has been using wind energy fairly heavily for a number of years at this point. So I'm sure that that also helped in terms of access to be able to build out your product while being able to sort of remain local and do things within your home country.

Yeah, totally. Apart from Denmark, and I hope I'm not saying too much wrong here, Germany has been quite early in wind energy. And, of course, Germany has always been known as an engineering country, maybe. And, yeah, wind energy has been here at my home. I was born in Bremen, and we have a lot of wind turbines there and also around Berlin. In Brandenburg, there are many, many turbines. And I could see that they exist, let's say. And then, of course, Germany is a good country to build a new company for wind energy, I think, because of all the resources that you have here and the knowledge and the connections that you can potentially get. On the other hand, maybe Germany is also very special in the wind energy domain because of its history. And that's actually a good thing.

Wind turbines in Germany are owned by many, many people. There's this thing, which is called, so, energy for the people, let's say. Small cities invest with many people in a wind turbine, and then they profit from it financially. This concept you maybe don't see so much in other countries like the USA or China, where it's more that big manufacturers own big wind farms.

And so for Turbit Systems in particular, I know that one of the main focuses of the product that you're building out is to help improve the overall operating efficiency of the turbines, both individually and in aggregate. So I'm wondering if you can just talk a bit more about some of the ways that you're helping to optimize the output and some of the most problematic factors that contribute to performance degradation in wind turbines, both individually and in aggregate?

Mhmm. Yeah.

So basically, a wind turbine is like a plane. It has wings, which we call rotor blades, and they must be shaped in a very special way, and they must be angled into the wind while the turbine is turning, which we call pitch. And the turbine itself must be directed into the wind so that it is actually facing into the wind, which we call yaw. Both pitch and yaw need to be optimal in order to get all of the energy out of the wind.

So when I was talking about the lidar system and my bachelor's thesis, the goal was to correct the way the turbine is turning into the wind. The problem is that in big wind farms, you have other turbines in the wind park that are creating turbulence, and you maybe have a forest at some part of the wind park that is redirecting the wind in a weird way, let's say. And you want to make sure that the turbine behavior is always such that it gets the maximum possible power output, so that it's directed correctly into the wind and also with the pitch systems. But if you have a measurement of the wind on top of the nacelle, that's behind the rotor plane, then you always have some errors, and you want to be able to correct this. In addition to that, there's a very simple problem: sometimes the technicians that go up and put the anemometer that's measuring the wind direction on top of the nacelle do this with an error, sometimes of more than 5 to 10 degrees, and nobody's detecting that. And then you have a bad performance of the turbine.

So this is how we started, as I said, with the vibration measurements. But then later on, going more into the data analytics part, you get a lot of information from the turbine, or potentially get it. The turbine is logging a lot of data like wind speed, wind direction, temperature of the outside air, temperatures of the gearbox, a lot of data, up to 500 different values. And up to the past 2 years, nobody really analyzed these huge datasets.

Another thing that we found out is that sometimes the turbine is operating in a throttled mode that nobody knows about. Sometimes because of regulations, because of noise regulations, the turbine should not produce much power, or is producing less power than it actually could. And sometimes these turbines go into these noise modes without anybody knowing it. So we figured, okay, let's do some general analysis of the normal behavior of a turbine, and let's look if we can find something where the turbine is not behaving in a normal way. And that's totally a data analytics problem. We maybe don't have all of the domain knowledge of one particular turbine, how it should turn and how it should behave, but we can look at the data and look for abnormalities. And with that example that I was talking about, you can understand that if the turbine is producing half the energy that it could, then, of course, this is a huge economic factor. If you find these data points and these turbines that are not producing enough energy, then you clearly have a value that you can give to your customers.

And so you mentioned that at least up until the last couple of years, a lot of this data that was being collected with the systems that are embedded into the turbines was being ignored or not analyzed in any great detail. I'm wondering what the current state of the art is as far as being able to analyze the performance of the turbines and correct for errors and do any sort of preventive maintenance to reduce downtime?

Yeah. So up to now, the standard, at least in Germany, is to have 10 minute average values of different measurements at the turbine, for instance, wind speed and power output of the turbine. And, basically, this data has been logged in the past just because of regulations. For instance, if the turbine is shut down because of too much energy in the grid, then in this case you have the data to see, okay, there has been such an amount of wind around this event, and you would have generated so and so much energy without this grid shutdown. And that's basically why people were logging data. But now people are also starting to understand that you can do more with the data. So more data is logged in the newer turbines, and there are more sensors, and the sensors can potentially log not only 10 minute average values, but also maybe second values or sub-second values. So, potentially, you can get more data than you could get in the past.
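To make the difference between 10 minute averages and second values concrete, here is a minimal sketch that averages 1 Hz samples into 10 minute windows. The sample values and the function name are made up for illustration, and this is not a description of how any particular turbine logger works.

```python
from collections import defaultdict
from statistics import mean


def ten_minute_averages(samples, window_s=600):
    """Average (timestamp_seconds, value) samples into fixed windows.

    Returns a dict mapping each window's start time to the mean of its values.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        window_start = (ts // window_s) * window_s
        buckets[window_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical 1 Hz wind-speed readings in m/s: 20 minutes with a 30 second gust.
samples = [(t, 8.0) for t in range(0, 1200)]
samples[300:330] = [(t, 20.0) for t in range(300, 330)]

averages = ten_minute_averages(samples)
peak = max(value for _, value in samples)
print(averages)  # the gust barely moves the first 10 minute average
print(peak)      # but it is clearly visible in the raw second values
```

The short gust is almost invisible in the averaged series, which is one reason higher-resolution data can reveal behavior that the 10 minute values hide.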

And, yeah, it's a physical system. The turbine is a machine, and you can grab a topic and then look into the data in detail and see if you can optimize something there. So just to give that example again, where the power of the turbine is reduced because of some regulations and nobody is noticing it, maybe I can explain a little bit more how we do it. We basically try to find datasets where we definitely know that the turbine is behaving in a good way. We filter out these datasets, and we call them our training dataset. And then we train neural networks on this dataset. We have to think about, okay, what physical system makes sense? What is the input of that black box formula, and what's the output? The input for the power output can be, of course, the wind speed, but the energy that is contained in the wind is also dependent on the density of the air, and the density is dependent on the temperature, for instance. So if you have a time series of wind speed and of temperatures of the outside air, then you can use these 2 values as an input to simulate the power output.
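As a rough illustration of the physics mentioned here, the sketch below computes air density from temperature with the ideal gas law and then the kinetic power of the wind passing through the rotor area. The fixed standard pressure, the 100 meter rotor diameter, and the function names are assumptions for illustration. This is the power contained in the wind, not the electrical output of a real turbine, which is always lower.

```python
import math

R_DRY_AIR = 287.05  # J/(kg*K), specific gas constant of dry air


def air_density(temp_c, pressure_pa=101_325.0):
    """Density of dry air in kg/m^3 from the ideal gas law (pressure assumed constant)."""
    return pressure_pa / (R_DRY_AIR * (temp_c + 273.15))


def wind_power_w(wind_speed_ms, temp_c, rotor_diameter_m):
    """Kinetic power of the wind through the rotor disc: 0.5 * rho * A * v^3."""
    area = math.pi * (rotor_diameter_m / 2) ** 2
    return 0.5 * air_density(temp_c) * area * wind_speed_ms ** 3


# Same wind speed, different outside temperature (hypothetical 100 m rotor).
cold = wind_power_w(8.0, -10.0, 100.0)
warm = wind_power_w(8.0, 30.0, 100.0)
print(cold / warm)  # colder, denser air carries noticeably more power
```

Because power scales with the cube of the wind speed and linearly with density, both inputs matter when simulating what a turbine should produce.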

And if you have a dataset where you know, okay, the turbine is behaving correctly, then you can train a neural network on that behavior, and then you can simulate with new datasets how that turbine should have behaved in that physical scenario. And then you can make comparisons.
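The train, simulate, and compare idea can be sketched in a few lines. Note that this sketch deliberately swaps the neural network for a much simpler stand-in, a binned average of power per wind speed, so that it stays self-contained. The data, the bin width, and the 80 percent tolerance are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_binned_power_curve(training, bin_width=1.0):
    """Learn mean power per wind-speed bin from known-good (wind_ms, power_kw) pairs."""
    bins = defaultdict(list)
    for wind, power in training:
        bins[int(wind // bin_width)].append(power)
    return {b: mean(powers) for b, powers in bins.items()}


def expected_power(curve, wind, bin_width=1.0):
    """Simulated power for a wind speed, or None if that bin was never seen in training."""
    return curve.get(int(wind // bin_width))


def underperformance(curve, observations, tolerance=0.8):
    """Return (wind, measured, expected) where measured power is below tolerance * expected."""
    flagged = []
    for wind, power in observations:
        exp = expected_power(curve, wind)
        if exp is not None and power < tolerance * exp:
            flagged.append((wind, power, exp))
    return flagged


# Hypothetical known-good training data, then new data with one throttled reading.
training = [(5.2, 400.0), (5.7, 440.0), (8.1, 1500.0), (8.6, 1580.0)]
curve = fit_binned_power_curve(training)
new_data = [(5.5, 410.0), (8.3, 760.0)]
flagged = underperformance(curve, new_data)
print(flagged)
```

The second reading produces about half of what the learned behavior predicts for that wind speed, which is the kind of unnoticed throttling described earlier.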

You can add some more information like status logs and other European data and service data from the maintenance companies and mix everything together and create a value out of that.

And then as far as the types of data that you're able to access from the sensors and the control systems in the turbine, what are some of the challenges that you're dealing with as far as just the data collection? And what is the level of variability between different turbines and different manufacturers?

Yeah. For a good while, there have been companies on the market that have specialized exactly in that problem because every turbine is somehow a prototype. Let's say you buy a turbine from manufacturer A, and you put it on your specific site, and then you have an additional contract for data management with another company. So you can imagine how many potential combinations of manufacturers and data collecting computers there are on the market. And that means that there's a huge variety in the datasets. We also had to learn that in the beginning. You cannot assume that if you have one turbine type the data always looks the same, because you don't know if it's been generated by the same type of system. So the best way to deal with that problem is to look at each and every turbine as one system and not make cross correlations too early. Let's say you have one turbine type and you want to make cross correlations over many other turbines of the same type: you're better set if you have a specific model for each and every turbine. And that also means, again, that you have a lot of machine learning models and a lot of data that you need to train on. There are a lot of scalability problems, let's say, that you have to look into.

And then, of course, there are the standard data problems. You have data gaps. You have data points that are weird, like an outside temperature of 1,000 degrees, so you need to handle that. Or a constant temperature of minus 10 during summer, degrees Celsius, of course, which doesn't make sense either. So you need to clean your dataset. I think every data scientist knows how problematic that can be. And this has really been a challenge, to build some automated systems that clean these datasets that have a great variety.

And then in terms of the actual collection of the data, how are you handling getting it from the turbines? And how much of the information are you processing or filtering at the collection point versus how much are you bringing back into your core service layer for being able to do more aggregate analysis across multiple turbines?

Yeah. I think in general, as a data science company, you have to keep in mind that if you delete data, you delete information. So if you say, okay, I don't trust this data point because it has 1,000 degrees Celsius outside air temperature, you can ask yourself, okay, why is that so? Is it because of the real turbine control system, or maybe it's a sensor? Maybe it is a calculation error during data collection. And you want to know that, because maybe that's the problem that the turbine has. Maybe the temperature sensor gives you weird values, and because of that, the turbine is shutting down. So with data cleaning, you need to be quite careful if you want to throw away data; that's a big point. What we basically do is mark data as not trustworthy, let's say, and then we can later reanalyze it, because maybe that's there because of a sensor error. So we basically get everything that we can get, and then later we flag data as trustworthy or not. And so, to answer your question, I think most of the preparation of the data is done on the database that we have.
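The flag-instead-of-delete approach described above could look roughly like the following. This is only a sketch: the plausibility limits, the length of a "stuck" run, the flag names, and the `Reading` type are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Reading:
    timestamp: int
    outside_temp_c: float
    flags: set = field(default_factory=set)  # empty set means "trusted so far"


def flag_readings(readings, min_c=-50.0, max_c=60.0, stuck_run=6):
    """Mark suspicious readings in place; nothing is ever removed."""
    run = 1
    for i, reading in enumerate(readings):
        if not (min_c <= reading.outside_temp_c <= max_c):
            reading.flags.add("out_of_range")
        # Count how long the exact same value has been repeating.
        if i > 0 and reading.outside_temp_c == readings[i - 1].outside_temp_c:
            run += 1
        else:
            run = 1
        if run >= stuck_run:
            for earlier in readings[i - run + 1 : i + 1]:
                earlier.flags.add("stuck_value")
    return readings


temps = [12.1, 1000.0, 12.4, -10.0, -10.0, -10.0, -10.0, -10.0, -10.0, 13.0]
readings = flag_readings([Reading(t, temp) for t, temp in enumerate(temps)])
trusted = [r for r in readings if not r.flags]
print(len(readings), len(trusted))
```

Because the flagged rows are still present, a later job can go back and ask whether the 1,000 degree value points to a broken sensor rather than just discarding it.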

Today's episode of the Data Engineering Podcast is sponsored by Datadog, a SaaS-based monitoring and analytics platform for cloud-scale infrastructure, applications, logs, and more. Datadog uses machine learning based algorithms to detect errors and anomalies across your entire stack, which reduces the time it takes to detect and address outages and helps promote collaboration between data engineering, operations, and the rest of the company. Go to dataengineeringpodcast.com/datadog today to start your free 14 day trial. And if you start a trial and install Datadog's agent, they'll send you a free t-shirt.

And then as far as the overall system architecture of Turbit, how have you designed the overall pipeline of being able to go from collection of that remote data at each of the individual turbines into your central dashboarding and analysis for your customers, and just the overall life cycle of data as it propagates from the control systems in the turbine through to the analysis that you're delivering to your customers?

Mhmm. So, basically, we get the data in different time periods, sometimes in real time, sometimes every hour, sometimes every day. It depends on the customer and whatever the customer has set up in his turbine. Then this data is logged into the database. And then we have different jobs running on the database: jobs cleaning the data and flagging data, jobs that train the models, and then jobs that generate simulation data. Then we compare the simulated data with the real measured data, and we have jobs that detect abnormalities in these datasets.

And then finally, one has to ask oneself, okay, what is really the value to the customer? Is it detecting abnormalities, or is it detecting an error? And what does detecting an error mean? In the best case, it is something like a real action point that you can give to your customer. For instance: okay, gearbox temperature has been too high for the past 2 months, so you'd better send out the service team to check why that is. Or maybe you can even tell the customer why the temperature is so high.
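One simple way to turn a comparison of measured and simulated values into an action point like the gearbox example is to look for sustained exceedance rather than single outliers. The following is a minimal sketch under that assumption; the threshold, the minimum run length, and the daily values are invented for illustration.

```python
def sustained_exceedance(measured, expected, threshold, min_run):
    """Find runs where measured - expected stays above threshold.

    Returns inclusive (start, end) index pairs for runs of at least min_run samples.
    """
    events, start = [], None
    pairs = list(zip(measured, expected))
    for i, (m, e) in enumerate(pairs):
        if m - e > threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_run:
                events.append((start, i - 1))
            start = None
    # Close a run that is still open at the end of the series.
    if start is not None and len(pairs) - start >= min_run:
        events.append((start, len(pairs) - 1))
    return events


# Hypothetical daily gearbox temperatures (deg C) against the simulated normal value.
expected = [60.0] * 10
measured = [61.0, 70.0, 61.0, 62.0, 69.0, 70.0, 71.0, 70.0, 69.0, 68.0]
events = sustained_exceedance(measured, expected, threshold=5.0, min_run=3)
print(events)
```

The single hot day at index 1 is ignored, while the persistent deviation from index 4 onward becomes one event that could be reported to the operator.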

And this last part, I think, is the most important part because there you really need to understand your customers. You really need to understand what's the problem that you're really solving. And I think, as a data scientist, sometimes you need to focus more on your customers than on what you generate as datasets. You really need to understand what you are delivering to your customer.

And on that point too, how much of a feedback cycle are you able to build with the Turbit system? As far as being able to determine some of these turbine misalignments, are you able to then feed that back into the turbine itself to be able to automate some of that correction? Or does it require generating a notification to your customer who's managing the turbine and the wind farms to then be able to do their own maintenance or operations as far as bringing the turbines into alignment and things like that?

Yeah. So if we generate some action points for our customers, they basically get an email or a pop-up message in our web tool or in the app, and then they can understand, okay, I have something to solve here. Then they can put it in their own schedule and solve the problem. And after that, they can give us feedback. There are some basic questions like how helpful has this been to you, or how relevant has this been to you, so that in the next iteration we can flag these detected events and understand, okay, if we show this kind of error, how relevant was it to the customer, or how good were we with the prediction, so that we can then improve the way we do things or use that labeled data to retrain other neural networks to do some optimizations.

And then the other question too, as far as being able to build useful notifications, is having the necessary domain knowledge of how the turbines work and the atmospheric conditions that contribute to different performance outcomes. And I know that you mentioned that you have some background of doing research and working with turbines. But what are some of the other ways that you're incorporating some of that domain knowledge into your product to ensure that you're able to provide the most value to your customers?

Yeah. I think there are some things that you really need to know, that you have to learn also as a data scientist. Let's just give an example. The turbine has a limited power output. So if there's a lot of wind, the turbine will never produce more power than, let's say, 3 megawatts, and it depends on the manufacturer and turbine type. If you don't know that, then you might think, oh, maybe the turbine is not generating enough power. This is just an example of some domain knowledge that you need to have in order to train the networks correctly and to draw the right conclusions from your data and data analytics.
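The rated power limit can be illustrated with an idealized power curve. Only the 3 megawatt figure comes from the conversation; the cut-in, rated, and cut-out wind speeds and the cubic ramp below are hypothetical textbook-style values, not data for any real turbine type.

```python
def expected_power_kw(wind_ms, cut_in=3.0, rated_wind=12.0, cut_out=25.0, rated_kw=3000.0):
    """Idealized power curve: zero below cut-in, cubic ramp, flat at rated power, zero above cut-out."""
    if wind_ms < cut_in or wind_ms >= cut_out:
        return 0.0
    if wind_ms >= rated_wind:
        return rated_kw  # the turbine is limited here, however strong the wind
    ramp = (wind_ms ** 3 - cut_in ** 3) / (rated_wind ** 3 - cut_in ** 3)
    return rated_kw * ramp


for wind in (2.0, 8.0, 12.0, 20.0, 30.0):
    print(wind, round(expected_power_kw(wind)))
```

A model or analyst that extrapolates the cubic growth past the rated wind speed would wrongly conclude that the turbine is underperforming at 20 m/s, which is the mistake described above.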

And then sometimes there are also problems that you cannot really know about if you don't have 20 years of experience as a turbine technician. In these cases, we have a network of other companies that we work with, and we can give them that problem, and then they can analyze it. Together with our customers, they can then make the decisions about what to do next with that kind of very special problem.

And then as far as the work that you're doing to build out this product, what are you finding to be some of the most challenging aspects of building an analytics solution for the wind energy sector?

I think handling so many different data sources was and is the biggest problem. And the second is the quality of your data. You really want to build a data lake and not a data sump. I don't know if that's the correct word. But, yeah, you want to have a good data pool, and that's really hard with so many different data sources that you cannot really trust. And then maybe another thing is that making things scalable is hard. You have connections to many different turbines, and Internet connections are breaking down very often. This is really a huge problem.
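One common way to cope with connections that break down often is a store-and-forward buffer on the sending side. The sketch below shows the idea only; it is not a description of how Turbit or any turbine data logger handles this, and the class, the `send` callback, and the buffer size are all hypothetical.

```python
from collections import deque


class StoreAndForward:
    """Queue records locally and deliver them in order once the link works again."""

    def __init__(self, send, max_buffer=10_000):
        self._send = send
        # When the buffer is full, the oldest record is dropped to bound memory.
        self._buffer = deque(maxlen=max_buffer)

    def submit(self, record):
        self._buffer.append(record)
        self.flush()

    def flush(self):
        """Try to send everything pending; stop at the first connection failure."""
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except ConnectionError:
                return False
            self._buffer.popleft()  # only discard after a successful send
        return True

    @property
    def pending(self):
        return len(self._buffer)


# Simulated flaky link: records are only accepted while link["up"] is True.
link = {"up": False}
received = []


def send(record):
    if not link["up"]:
        raise ConnectionError("link down")
    received.append(record)


buffer = StoreAndForward(send)
buffer.submit({"t": 1, "wind_ms": 7.9})
buffer.submit({"t": 2, "wind_ms": 8.1})
pending_while_down = buffer.pending

link["up"] = True
buffer.submit({"t": 3, "wind_ms": 8.4})
print(pending_while_down, buffer.pending, [r["t"] for r in received])
```

Records collected during an outage arrive late but in order, so the database ends up with a gap-free series whenever the buffer was large enough to cover the outage.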

And are there any particular technologies that you've been able to lean on to help with some of that scalability problem in terms of being able to handle the data collection and ensure that you're able to get reliable throughput?

Yeah. We're working together with a company called Swarm64, and they basically enabled us to handle a lot of data in real time. And with real time, I really mean real time, like 1 second or sub-second values. They help us to solve that scalability problem if you get more data, even so much data that you cannot handle it with usual databases any longer. What we also want to achieve is to give feedback to the turbine in real time. For instance, that could be that you have one turbine standing at the front of the wind park, and it's getting a wind gust, and that gust is moving through the wind park. Then the first turbine is telling the other turbines, okay, there's a wind gust coming, and you'd better behave like this or like that. This information is sent back to Turbit, the algorithms are giving the best way to yaw and pitch the other turbines in the wind park, and all that is happening in real time. And for that, you really need to handle a lot of data very fast.

And for cases where you have maybe some sort of weather system coming through an area, are you then also able to feed that information to other installations of turbines that might be in the path of the weather event in terms of being able to improve their energy output or maybe throttle them so that it prevents potential damage if there are especially high wind gusts or things like that?

Yeah, of course. If there's a very momentary wind gust coming to the wind park, you could potentially do that. If there's a huge weather system coming, that's mainly the job of the grid operators, because they need to shut down some turbines in advance because they know, okay, we are going to produce a lot of energy, and that's too much for the grid, so let's better shut down some of the turbines. And that's actually happening quite often in Germany, especially in the north and at the seaside. There are some turbines that are shut off 50% of the time, and nobody's using the energy that the turbine could potentially generate during these times.

That's another interesting aspect to this system: the energy storage and energy distribution capability. I'm wondering how that factors into some of the decision making that you provide to the turbine operators as far as ways to ensure that they aren't generating excess energy that's going to just get dumped or generating excess energy that is going to potentially overload their grids or storage systems, and ways that you're able to maybe bring that information into the overall equation, or some of the other external data sources that you're able to rely on to feed into your models.

I mean, yeah, you're right. It's quite interesting, and there are tons of topics and problems that you could potentially solve. This particular problem that you're mentioning right now, we're not solving at the moment. As far as I know, there are other companies around that have specialized in predicting the weather and predicting the power output for the grid operators, and then you can trade in the day-ahead auctions for electricity. That's totally another problem that you want to solve there. For us, it's more the operation of the turbine: if you have the turbine running, let it run the best way it can. I mean, yeah, I see a lot of potential in analyzing that kind of data as well. And we're also getting weather data from a third party source. But this is more because we want to understand the operation of the turbine better and make the operation and the service maintenance of that turbine better.

And then as far as your overall experience of building out Turbit Systems, both from the technical and business aspects, what have you found to be some of the most interesting or unexpected or challenging lessons learned in the process?

Yeah. I think, coming back to what I said earlier on the last question, it's focus, especially when you're starting a company. It's quite hard, also as a scientist: you have so many ideas, and you know that so many things could potentially work out. But in order to bring something to the market, you really need to focus, and you really need to understand what problem you're solving. You need to concentrate on maybe one problem first and do that the best way you can. And then later on, you can add more problems that you solve. I think that was the biggest lesson of the past years.

And as you look toward the near to medium term of what you're building out both technically and in the business, what are some of the things that you have planned that you're most excited about, or overall trends in the energy sector or technology capabilities that you're looking forward to trying to incorporate or take advantage of?

Yeah. I'm basically very excited about how many other problems there are in this data that you could potentially solve. And the more we are growing, and the more we are able to handle and to manage all these different problems, the more I'm looking forward to it because, yeah, it's really fun. And basically, it's really the real time control algorithms for the turbine that fascinate me the most. I think there's great potential in the real time operation of the turbines. But sometimes it's the basic things that give the most value. It's sometimes technically not so fancy, but you're just solving a basic problem, and that has a great value for your customers. And sometimes those are the better things.

Well, for anybody who wants to get in touch with you or follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And then as a final question, I would just like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.

I think the biggest gap is really the handling of how to clean the data. You have great packages like, let's say, TensorFlow, with which you can train models easily, and you can do almost everything with that. But, I don't know if it's possible, there's nothing like a general solution for cleaning datasets. I would wish that there's some sort of solution for that. Maybe it doesn't exist because it's too complicated. But I would be super happy if there's a package that does that for you.

Yeah. I'm sure that plenty of people would be happy to see that as well.

Yeah.

Alright. Well, thank you very much for taking the time today to join me and discuss the work that you've been doing with Turbit Systems. It's definitely a very interesting problem domain and an interesting technical solution that you're building for it. So I appreciate all the time and energy you've put into that, and I hope you enjoy the rest of your day.

Thank you very much. I enjoyed it.

Thank you for listening. Don't forget to check out our other show, Podcast.__init__ at pythonpodcast.com, to learn about the Python language, its community, and the innovative ways it is being used. And visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
