
EP11: The Potential & Pitfalls of AI in the Consumer Goods Industry

Lasse Holmstedt, VP of Engineering at Alloy.ai, joins the Shelf Life Podcast to discuss the dynamic role of generative AI, large language models and machine learning within the consumer goods sector.

Transcript

Abby:

Lasse, welcome to Shelf Life. Lasse, can you get us started by telling us a little bit about your background and the work you've been doing with AI?

Lasse: 1:57

Of course, and thank you so much for having me. I think you covered a lot of the intro right there. I've been a VP of Engineering now at Alloy for the last three years, and in the software industry overall for about 15. Before that, at Nokia, what I worked on was being a lead for one of the first Android Maps projects, where we shipped that to, I don't know, like 30 million people or thereabouts. That was really exciting. But also shipping several mobile phones and smartphones back when Nokia still did them. It was really exciting at that time. I met the founders of Alloy around 2016, very soon after the company was founded, and then, yeah, the rest is history. I've been at Alloy ever since.

To explain the ways we use AI today at Alloy, I think I need to expand a little bit on what we do at a high level at Alloy, to begin with.

So I think, fundamentally, I would say we're a platform, we're a data platform for the supply chain, with the vision of really connecting it end-to-end, all the way from manufacturing to the POS, e-commerce and the consumer. And so that requires you to look at things like forecasting, external signals and the like on the POS end to determine what's going to happen next, but also understanding how you get the data to be clean to begin with, whether it's from manufacturing or from the POS side or from your ERP, and kind of making sense of all of that.

So there's a lot of use cases for advanced algorithms, machine learning, deep learning and the like, really across the whole spectrum, and the use cases that we cover and the audience really focus a lot on the sales teams and kind of revenue operations, sales operations and also demand planning teams, to name a few examples of things that we focus on and who our current users are. But to expand a little bit on the details of how Alloy works, it's roughly like this: we're essentially ingesting and harmonizing data from hundreds of different unique sources, really globally. There's a heavy focus on North America from our business perspective. A lot of our clients are there, but we roll out Alloy globally for some of our customers. Yeah, so what we do is essentially pull the data from hundreds of different integrations, and as we do that, we have to make sense of both the POS side and the ERP side of things and then really build all of that into a singular data model together.

This is no small feat of engineering. We've worked on this for seven years now, and there's a lot of need for, again, yeah, deep learning models and the like, a lot of, let's say, non-generative AI use cases that we have leveraged now for several years already. Then, on the user end, as, let's say, a sales leader looking for actionable insights into what you should do about all of this data, we're offering metrics like phantom inventory, lost sales and, of course, sell-through forecasts that we generate ourselves to make the data actionable. So then, going back to the question of how we use AI, there's already a lot of those different use cases in there and it's really, I think, fair to say that we use it pervasively across the platform.

So, for example, as we ingest data in from these different POS sources, we have to do product matching, because, let's say, Walmart data is different from the Target data, from the Amazon data and so forth.

They keep products in different ways. There's the Amazon ASINs, there's the Walmart UPCs, Target DPCIs, right. And then multiply that by like 800, right? So you need to have very flexible algorithms to do the matching across that whole spectrum with full automation and with confidence that it actually works. And so that's a great example, I think, of a deep learning algorithm that we use at Alloy, where we do this sort of cross and transitive matching of products across all of these retailers, looking at the brand's taxonomy, looking at all of the retailer taxonomies, and then iterating through the layers and layers of additional insights, transforming the data as we go and then taking that data in and matching those products from the retailers to the brands and vice versa, and then, ultimately, the use case that we are able to serve there for the end user is giving them omni-channel visibility into their entire sales or inventory across all of the channels that they sell through.

That's a lot to take in right there, but so that's just again one of those use cases.
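
To make the cross-retailer matching problem a bit more concrete, here is a minimal sketch in Python. The product descriptions, identifiers and the simple string-similarity heuristic are all hypothetical; Lasse describes Alloy's production approach as a deep learning model over brand and retailer taxonomies, which this toy example does not attempt to reproduce.

```python
# Illustrative sketch only: the same product carries different identifiers
# at each retailer (ASIN, UPC, DPCI, ...), so matching has to fall back on
# descriptive attributes. All names and IDs below are made up.

from difflib import SequenceMatcher

brand_catalog = {
    "CH005": "Choco Crunch Cereal 500g Family Size",
}

retailer_feeds = {
    "amazon":  {"B07XYZ1234": "Choco Crunch Family Size Cereal, 500 g"},
    "walmart": {"049000012345": "Choco Crunch Cereal 500g"},
    "target":  {"123-45-6789": "Cereal, Choco Crunch, Family Size"},
}

def normalize(text: str) -> str:
    """Lowercase and strip punctuation so descriptions compare fairly."""
    return "".join(c for c in text.lower() if c.isalnum() or c.isspace())

def similarity(a: str, b: str) -> float:
    """Crude string similarity in [0, 1]; a real system would use learned models."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def match_products(threshold: float = 0.5):
    """Propose brand-SKU <-> retailer-ID matches above a confidence threshold."""
    matches = []
    for sku, brand_desc in brand_catalog.items():
        for retailer, items in retailer_feeds.items():
            for retailer_id, retailer_desc in items.items():
                score = similarity(brand_desc, retailer_desc)
                if score >= threshold:
                    matches.append((sku, retailer, retailer_id, round(score, 2)))
    return matches

if __name__ == "__main__":
    for row in match_products():
        print(row)  # (sku, retailer, retailer_id, score)
```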

Abby: 6:39

Fantastic, so you touched on a lot of different examples of AI there. I'm particularly interested in one I think you briefly mentioned: generative AI. I know there's been a lot of hype in the industry around generative AI in particular. Can we just take a step back, and do you mind explaining to our listeners what exactly we mean by generative AI and how it works?

Lasse: 6:57

Yeah, there's definitely been a lot of buzz around AI in the last year or so, and it's interesting because these technologies have existed already for some time, the large language models particularly, but they have really kicked off and made it big this year. There's been an incredible amount of progress. There's this one quote I really like from Arthur C. Clarke that goes just about like this: any sufficiently advanced technology is indistinguishable from magic, and I feel that's a little bit how the media uses the word AI, because the results of some of these technologies are so incredible, particularly with these large language models and generative AI. So ChatGPT in particular is getting a lot of press, I think deservedly so, and I think it's on a lot of people's minds how exactly this works. So I can try to take a stab at explaining it a little bit and maybe take some of the magic out, because, at the heart, generative AI is really a set of technologies that, well, plainly generates stuff. It sounds kind of cheap, but that is ultimately what it is, and that stuff could be text, which is the case, of course, for Google Bard, GPT and the like, or it could be images, like for DALL-E, for example, or music, 3D models, it could be PowerPoint presentations, it could be really anything. But that's roughly what that term means at a very high level, right, and the underpinning technology behind generative AI as a term is neural networks. And so, to explain neural networks really briefly, they are essentially probabilistic algorithms that take inputs, so the text that you write as a user, perhaps a question like "what were my lost sales for product A", and then they produce outputs that they are trained to produce. Again, it could be text, images, music, whatever, and the training is this one-time process that is used to, in effect, tune the algorithm and tune those, let's say, probabilities that the algorithm generates based on the training data.

Lasse: 9:07

So a very simple example of a neural network would be, for example, to detect if there are cats in an image. And this is not super far-fetched, because if you think about CAPTCHAs, those kind of, like, a little bit irritating human checks all over the internet, often you are asked as a human to check if there are cats, traffic lights, motorcycles and the like in these pictures. That's all part of neural network training; the data that you provide in CAPTCHAs is actually used to train image recognition algorithms. But so a simple neural network could be like this: give me an answer, yes or no, if there are cats in this picture, and then the algorithm will take a number of inputs, like every single pixel in an image, and the output is just a yes or no.

It's a Boolean, a true-or-false statement, in a word. And a large language model, or an LLM, is not really any different, they are really just trained with a huge amount of content online. The model is massively more complex than this cat image recognizer, and the training is biased towards assisting the user by really doing one thing: predicting the most likely next word, or token, in the sequence. They use the context that the user provided, meaning the text that you wrote as a user, and perhaps additional context that the model has, let's say, from the platform it's available in. And so, to give an example from the supply chain, you might have the model trained to answer questions on inventory levels, out-of-stocks, sales opportunities and the like.

But so that's kind of taking the example from those, I don't know, cat image recognition algorithms and scaling the neural networks there up into large language models that are massively more complicated and very powerful and can generate really accurate human-readable text, right. But to go back and stress a couple of the points from earlier that are really key for understanding how these models work: all that these language models really do is not magic, but rather predicting the next word in the sequence as they generate those, let's say, paragraphs, sentences or entire documents that the user might ask them to create, and doing so in a way that assists the user. And so that might be a little bit eye-opening and remove some of the magic around this technology. There's no question that I think the tech is incredibly powerful, but it helps to understand, I think, some of the issues around the tech as well, which we might want to dig into a little bit more.
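
As a rough illustration of the "predict the most likely next word" idea, here is a toy sketch in Python. It uses simple bigram counts rather than a neural network, so it only shows the shape of next-token prediction, not how a real LLM is built; the training text is made up.

```python
# A toy "next most likely word" predictor, to take some of the magic out.
# Real LLMs use deep neural networks with billions of parameters; this
# bigram counter only illustrates the core idea: given the context so far,
# pick the most probable next token.

from collections import defaultdict, Counter

training_text = (
    "lost sales for product a were high "
    "lost sales for product b were low "
    "inventory for product a was out of stock"
)

# "Training": count which word tends to follow which.
follows = defaultdict(Counter)
tokens = training_text.split()
for current_word, next_word in zip(tokens, tokens[1:]):
    follows[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Return the most likely next token seen after `word` in training."""
    candidates = follows.get(word)
    return candidates.most_common(1)[0][0] if candidates else "<unknown>"

def generate(prompt: str, length: int = 6) -> str:
    """Greedily extend a prompt one most-likely word at a time."""
    words = prompt.split()
    for _ in range(length):
        words.append(predict_next(words[-1]))
    return " ".join(words)

print(generate("lost"))  # "lost sales for product a were high"
```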

Abby: 11:52

Yeah, absolutely. It still does sound a little bit like magic to me, but I can recognize that of course there must be limitations with any new technology. So, yeah, do you mind talking to us a little bit about what those kinds of shortcomings of these LLM platforms are?

Lasse: 12:06

Yeah, in the context of our domain, so the supply chain in particular, we have an incredible amount of data that we need to work with, and in order for our users at Alloy to make the best possible decisions, we need that data to be clean and accurate. This is a big focus for us internally, and we've already spent a lot of time building algorithms and models that allow us to provide that data in raw format and also in these more, let's say, insightful, predictive versions that help the user understand what's going to happen next. And so, as a part of understanding the tech around LLMs, we put the large language models to the test as well, and what we discovered, and I think we're not the only tech organization here, is that there's a lot of complexity with the LLMs when you start to deal with numbers. They are not good today at doing math, and they are not really built for that. The name kind of says it: large language models. They are really good with language, and this is a hundred percent true. They are incredible at producing English or Japanese or Finnish or whatever language you name; a human language is just as easy for them as a programming language. But asking them to do math, the technology is not there yet, and the kind of problems that you might run into are simple ones. Let's say you ask for the largest possible number, so "what were the largest lost sales instances in my region?" It's not going to work out too well. Sometimes it's just going to pick the longest number, and the longest number might be the one with the most decimal points.

This is the kind of problem that we have seen, and so, while there are ways to work around this with technology, it's not something that you necessarily want to rely on when you're making business-critical decisions. There are just better technologies for that type of job. So what I want to say is this is not a silver bullet. The other thing, and this is, I think, more pervasive for LLMs with any use case, is their tendency to hallucinate, making things up. What that means is you're asking a question, let's use the same "what were my lost sales in California for product A" again as an example, and we have not given the LLM, the language model, the data about sales or lost sales or anything.

It should not be able to answer this question. It should say "I don't know." But given the right circumstances, occasionally what you will see is that ChatGPT or Bard or Llama 2 or any number of these off-the-shelf LLMs will just make answers up, and that's not too great. And so again, the same thing: 100% accuracy when making business-critical decisions is just super important, and the problem there doesn't go away by pushing more data to the language model either. In fact, in some cases it might make things worse.

So I think the rule of thumb here, that at least we have discovered so far, is that when you need to make decisions, whether it's math or otherwise, within your platform or within your business that require 100% accuracy or close to it, the LLM is not necessarily the right tool for that. Now, can it help? Absolutely, and there's a lot of ways in which they are really, really powerful, but it's not the foundation that you can build, let's say, mathematical algorithms on, and it's also really, really expensive to run, just computationally and therefore also just in terms of pure dollars.
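
To illustrate the "longest number versus largest number" pitfall and the workaround of keeping the math on the platform side, here is a small, hypothetical sketch in Python. The figures are invented, and the comparison-by-text-length is just a stand-in for how a purely language-oriented model can go wrong.

```python
# A small illustration of the "largest number" pitfall, and the simple fix:
# keep the arithmetic outside the language model. The figures are made up.

lost_sales_by_store = {
    "Store 12": 9.0,           # short string, small value
    "Store 47": 1234.5,        # the true maximum
    "Store 83": 3.1415926535,  # longest string, smallest value
}

# What a purely text-oriented model can end up doing: comparing the
# *textual* representations, where "longest" is not "largest".
longest_looking = max(lost_sales_by_store, key=lambda s: len(str(lost_sales_by_store[s])))

# What a business decision actually needs: numeric comparison done in
# ordinary code (or SQL / a query engine), not inside the LLM.
actual_largest = max(lost_sales_by_store, key=lost_sales_by_store.get)

print(longest_looking)  # "Store 83" -- misleading
print(actual_largest)   # "Store 47" -- correct
```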

Abby: 15:48

So the fact that these LLMs aren't good at math seems like it would limit how they're applied to supply chain problems in particular. I'm understanding that you're saying you have to make sure you're using these LLMs in the right way, right? Because sometimes you might get an answer, but it might not be the right answer. So is it fair to say that this limits how you can apply these technologies in the supply chain space specifically, or are there creative ways that we can use them?

Lasse: 16:12

There are absolutely creative ways that we can make the most use of those. And so, if you think about those couple of key facets of these large language models that I was highlighting earlier, looking at the aspect of generating text and assisting the user, the models have truly been trained with that assistance aspect in mind.

So, of course, the number one thing that then kind of makes sense is, yeah, really looking at, well, building an assistant of some sort, where you get to speak in English, or again in Finnish or Japanese or Polish or whatever language you speak, and then the language model has the conversation with the platform that you work with.

Let's take, yeah, Alloy as an example. You can just go ahead and ask the question, and this is true today, this is how we integrated OpenAI into Alloy: you can go ahead and ask "what were my lost sales for product CH005", that being your SKU ID, and you'll get the answer in a few seconds.

You could build a dashboard manually for this, and that's probably what the power user would do. They probably have a whole lot of other questions that they know exactly how to get the answers to within a platform like Alloy. But for the user who just dropped in to the platform, it's going to be a lot harder for them to go about their business that way, reading documentation and all of that, than just going ahead and asking the question in plain English. And so I think there's real power to that in democratizing technology, making it so that, ultimately, the data that you make decisions on is truly the asset that is democratized, and the whole team, not only the people who are really tech-savvy, gets to have those answers that they're looking for as they're doing their work, whether it's business analysts or sales leaders.
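
A hedged sketch of the assistant pattern described above: the language model's only job is to turn a plain-English question into a structured query, while the platform runs the numbers itself. This is not Alloy's actual OpenAI integration; the `ask_llm_for_intent` stub and the sample data are hypothetical stand-ins.

```python
# Sketch of the assistant pattern: the LLM extracts intent, the platform
# does the math against its own clean data. Everything below is made up.

import json

def ask_llm_for_intent(question: str) -> dict:
    """Stand-in for an LLM call that extracts a structured intent as JSON."""
    # A real implementation would prompt a model to emit something like:
    return json.loads(
        '{"metric": "lost_sales", "product": "CH005", "aggregation": "sum"}'
    )

# Hypothetical in-platform data; in practice this lives in the data warehouse.
lost_sales = {"CH005": [120.0, 90.5, 33.25], "CH006": [10.0]}

def answer(question: str) -> str:
    intent = ask_llm_for_intent(question)
    values = lost_sales.get(intent["product"], [])
    total = sum(values)  # the math happens here, not inside the LLM
    return f"Lost sales for {intent['product']}: {total:,.2f}"

print(answer("What were my lost sales for product CH005?"))
# Lost sales for CH005: 243.75
```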

Abby: 18:05

Yeah, absolutely. I think that's particularly exciting from a software perspective, as you say, helping new users or less technical teams really have access to that data without having to understand how to configure all these complicated insights in some data analysis tool. Being able to just ask those questions in plain English, I think, as you say, really opens us up to more users being able to access these data insights. And so what about data security? I know that's a big concern that we hear about a lot of the time, particularly in CPGs. A lot of companies, you know, closely guard their data. It's kind of, I guess, concerning or worrying, right, to think about just handing over large sets of this data to these platforms and not knowing how it's going to be used. So how can our listeners think about that when it comes to data security and also leveraging these new technologies?

Lasse: 18:56

This is really interesting because there's been a lot of development, even in the last few weeks really. In the beginning of the year, ChatGPT, the one really getting the most press here, didn't have a SOC 2 audit done. They essentially said, implicitly or explicitly, that they were using all of the data for training, and that put your data ultimately at real risk. Because imagine the situation where a third party like ChatGPT uses your data, uses your questions, to train their model. The next time that the model is trained, and that's an intense process, so it doesn't happen all the time, but the next time it is trained, it's going to be learning things about your business that only very few people should actually have access to, and that's just a massive business risk. Your competitors, in the worst case, could be just going ahead and asking those questions about your business. ChatGPT, for example, now has this enterprise scheme where they say they will not use the data anywhere for training, and only for compliance reasons do they keep the data on their servers. I think this is a fantastic step forward, but we take it a little bit further than that, and our implementation is not to actually send any of the numerical data under any circumstances to any of these off-the-shelf models that we don't control. And so we kind of kill two birds with one stone, if you will, where we don't have to worry about the accuracy problems, we do the math and we keep the math on our side, but we also don't have to worry, and our customers don't have to worry, about the compliance and legality problems and potentially exposing the information to competitors, maybe some months or a year later when this retraining does happen. But so there's definitely a lot of risks there.

When you, let's say as a CIO, are looking at AI strategy and evaluating the options of whether to build or buy: if you are building, you probably don't have the sort of staffing to go ahead and retrain the models yourself. This costs tens of millions of dollars. ChatGPT's latest version, GPT-4, cost about $100 million to train. Few companies will have the budget to do something of that sort. So the chances are higher that you would be leveraging some of these off-the-shelf models, and so you have to be really careful about all of the legalese, different jurisdictions doing different things and potentially persisting your data and maybe using it. It's easier to kind of skip that part altogether and, yeah, look at software, look at platforms that say we are never submitting sensitive data to these platforms. It's just the easier conversation to have.
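
One way to picture the "never send the numerical data" idea is a prompt builder that only ever includes the user's question and non-sensitive metadata, plus a guard that refuses to ship anything that looks like raw figures. The field names and the guard below are hypothetical, a sketch of the principle rather than any vendor's implementation.

```python
# Sketch: only the question and schema metadata go to an external model;
# a crude guard blocks prompts that appear to contain raw business figures.

import re

def build_prompt(question: str, schema_fields: list[str]) -> str:
    """Compose an LLM prompt from the user's question and schema only."""
    return (
        "You translate supply-chain questions into a query plan.\n"
        f"Available fields: {', '.join(schema_fields)}\n"
        f"Question: {question}"
    )

def assert_no_raw_figures(prompt: str) -> None:
    """Refuse to send prompts that contain large numbers (hypothetical rule)."""
    if re.search(r"\b\d{4,}(\.\d+)?\b", prompt):
        raise ValueError("Prompt appears to contain raw figures; refusing to send.")

schema = ["product_id", "retailer", "week", "lost_sales", "on_hand_units"]
prompt = build_prompt("Where are my biggest out-of-stocks this month?", schema)
assert_no_raw_figures(prompt)    # passes: only schema and question are included
# send_to_external_model(prompt) # hypothetical call to an off-the-shelf LLM
```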

Abby: 21:52

So, more broadly, how should companies be thinking about leveraging not just generative AI, but AI in general? We think about strategy and how companies are making sure that they're capitalizing on all these opportunities. What would you recommend, particularly for CPG companies and companies in the supply chain space? How can they be thinking about building this into their business strategy?

Lasse: 22:13

Yeah, I think it's really important to have a holistic perspective on, yeah, essentially AI as a whole, machine learning and the key enabler technologies that then ultimately enable the IT departments to deliver as much value as possible for those stakeholder teams. Because, if you look at LLMs, for example, they have only made it big this year, and while the pace of progress is really intense, there are going to be a lot more robust, a lot more performant models in 2024 and 2025 than what we have today. And meanwhile there are a lot of more mature technologies out there that I think would be beneficial to look at seriously, and not just at the LLM side of the spectrum.

So, looking at what I said earlier about deep learning and machine learning and, let's say, forecasting algorithms, I think one of the key enablers really for sales teams is accurate forecasts, and accurate forecasts are a great way of then getting to, whether it's more accurate shipment plans or just more actionable sales outcomes. But it all starts with clean data. In order for you to produce accurate forecasts, whether it's with Alloy or whether it's with, let's say, SAP IBP or some other technology, all of those different forecasting algorithms and platforms rely ultimately on the data being there and on the data being clean. And so our focus has really been very heavily on making the data as clean as possible, because that is the core enabler for all of the future AI steps that are going to be taken in the next years. And as the LLMs progress, I'm sure that the math problems and a lot of the performance problems that we see today, the cost issues, will eventually disappear, but these models need to integrate with your data warehouse and your single source of truth, whether it's for your demand data or your selling data, what have you.
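
As a small illustration of why clean data comes first, here is a hypothetical sketch of the kind of hygiene checks, gaps in weekly history, duplicate rows, impossible values, that have to pass before any forecasting algorithm can be trusted. The sample rows and rules are invented.

```python
# Minimal data-hygiene checks a forecasting pipeline might run first.
# The sample rows and thresholds are illustrative only.

from datetime import date, timedelta

sales_rows = [
    {"product": "CH005", "week": date(2023, 9, 4),  "units": 120},
    {"product": "CH005", "week": date(2023, 9, 11), "units": 98},
    {"product": "CH005", "week": date(2023, 9, 25), "units": -4},  # suspicious
    {"product": "CH005", "week": date(2023, 9, 25), "units": -4},  # duplicate
]

def find_issues(rows):
    issues = []
    seen = set()
    weeks = sorted({r["week"] for r in rows})
    # Gaps in the weekly history break most time-series models.
    for earlier, later in zip(weeks, weeks[1:]):
        if (later - earlier) > timedelta(days=7):
            issues.append(f"missing week(s) between {earlier} and {later}")
    for r in rows:
        key = (r["product"], r["week"])
        if key in seen:
            issues.append(f"duplicate row for {key}")
        seen.add(key)
        if r["units"] < 0:
            issues.append(f"negative units for {key}")
    return issues

for issue in find_issues(sales_rows):
    print(issue)
```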

Abby: 24:11

And I think that the AIs will be limited by the data cleanliness at the end of the day. So I think what you're talking about is really that companies need to be, I've heard the term, "AI-ready", right, even if they're not necessarily building solutions around these technologies today. There's a lot to be done in terms of making sure that the data, and your data architecture and the way you store and model that data, is ready to be able to make use of these systems in the future.

Lasse: 24:37

Absolutely. I think, still today, many companies are still looking at moving into a single-source-of-truth data warehousing solution, looking at really commoditizing and democratizing the data as a part of the decision-making process. We see that, and we help our customers with that, but there's a lot of companies out there that are not doing this in the same sort of way and not following suit. And especially with these large language models and generative AI, as that technology develops further, the companies who already have that clean data will be able to take better advantage of the market. I think that's just the facts. We are not there today, not with all pieces of the LLM technologies, but we'll get there soon enough, and so next year and the year after will be very, very interesting indeed.

Abby: 25:23

Absolutely. Really excited to see how things develop in the coming weeks and months. That's all we have time for today, Lasse. Thank you so much for joining us here, and thank you for all the insight.

Lasse: 25:32

Thank you for having me.