We’ve known for a while that
- Gartner’s first study found that 85% of AI projects were failing (and that statistic is still being quoted everywhere, including in this recent Medium study)
- Bain’s study last year found that 88% of all IT / technology projects fail to some extent (2024 study)
And we now know, thanks to MIT, that
- 95% of all Gen-AI pilots fail. (Source: Fortune)
So what does this mean for you (and your ProcureTech journey)?
Well, beyond the obvious, which is that you should stop dead in your tracks when a vendor starts pushing their “Gen-AI”-enabled solution and dig deep into what that really means, at a foundational level it means that:
You should never, ever, ever buy or use any solution that embeds third-party Gen-AI / LLMs in its service or product, even if nicely wrapped, because your chance of success with that provider will be 5%.
You should only select vendors that use in-house Gen-AI / LLM solutions built with the following rules in mind:
- custom-trained on an expert-culled corpus
- for a specific problem domain
- and applied in a specific context, with guardrails and human checks on the output (a rough sketch of that last rule follows the list).
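To make that last rule concrete, here is a minimal sketch of what “guardrails and human checks” can look like in practice. Every name, threshold, and downstream action here is hypothetical, not any vendor’s actual API: the point is simply that model output is validated against a vetted list, sane bounds, and a confidence floor, and anything that fails a check goes to a human instead of being executed automatically.

```python
from dataclasses import dataclass

# Hypothetical structured output for a narrow, well-defined extraction task.
@dataclass
class ExtractionResult:
    supplier: str
    amount: float
    confidence: float  # model-reported or externally scored confidence

APPROVED_SUPPLIERS = {"Acme Office Co", "Northside Paper"}  # illustrative vetted list
CONFIDENCE_FLOOR = 0.9  # illustrative threshold; tune per use case

def guarded_execute(result: ExtractionResult) -> str:
    """Act automatically only when the output passes every guardrail;
    otherwise queue it for human review instead of executing it."""
    if result.supplier not in APPROVED_SUPPLIERS:
        return queue_for_human_review(result, reason="unknown supplier")
    if not (0 < result.amount < 100_000):
        return queue_for_human_review(result, reason="amount outside sane bounds")
    if result.confidence < CONFIDENCE_FLOOR:
        return queue_for_human_review(result, reason="low confidence")
    return execute_downstream_action(result)  # e.g. create a draft PO, never auto-approve

def queue_for_human_review(result: ExtractionResult, reason: str) -> str:
    # In a real system this would create a review task; here we just report it.
    return f"HELD FOR REVIEW ({reason}): {result}"

def execute_downstream_action(result: ExtractionResult) -> str:
    return f"DRAFT CREATED: {result.supplier} for ${result.amount:,.2f}"
```

The design choice being illustrated is that the guardrails sit outside the model: the LLM can hallucinate all it wants, but nothing it says gets acted on until it clears checks it cannot talk its way past.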
The best AI technologies have always been focused on a specific problem, and this iteration is no different. Focus minimizes LLM hallucinations (which cannot be trained out, as they are a fundamental property of how the technology works), and guardrails prevent them from being automatically executed on or slipping through.
While they are far from perfect, with more discoveries being made daily about their many (many) drawbacks (we summarized a dozen in this post on what not to do, if you’re up for a headache, but missed the recent revelation that an LLM can not only lie on purpose but also turn into something evil), the reality is that, as we have said before, LLMs properly trained on vetted corpora do have two valid uses:
- large corpus search and summarization
- natural language translation
since, when appropriately trained, they can be almost as accurate as last-generation semantic technology systems while providing much more natural interfaces for the average user. (However, you won’t get a failure code from them when they are wrong; you will get a hallucination phrased so well that you’ll think it’s true when it’s an outright lie. Hence the need for guardrails and human review.)
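As an illustration of the first use, here is a rough sketch of the retrieve-then-summarize pattern over a vetted corpus. The retriever is a toy keyword-overlap scorer purely so the sketch runs, and `summarize_with_llm` is a placeholder standing in for whatever in-house model a vendor has trained; none of this is a specific product’s API. The point is that the model only ever summarizes passages pulled from the expert-vetted corpus, and the retrieved sources travel with the answer so a human can check it.

```python
# Minimal retrieve-then-summarize sketch over a vetted corpus.
VETTED_CORPUS = {
    "doc-001": "Blanket orders for office supplies are renewed monthly ...",
    "doc-002": "Preferred suppliers are reviewed quarterly by category managers ...",
    "doc-003": "Spend over 10,000 requires a three-bid sourcing event ...",
}

def retrieve(question: str, k: int = 2) -> list[tuple[str, str]]:
    """Return the k corpus passages with the most query-term overlap.
    A real system would use embeddings; this is just a runnable stand-in."""
    terms = set(question.lower().split())
    scored = [
        (len(terms & set(text.lower().split())), doc_id, text)
        for doc_id, text in VETTED_CORPUS.items()
    ]
    scored.sort(reverse=True)
    return [(doc_id, text) for _, doc_id, text in scored[:k]]

def summarize_with_llm(question: str, passages: list[tuple[str, str]]) -> str:
    # Placeholder for the in-house model; here we just join the passages.
    return " / ".join(text for _, text in passages)

def answer(question: str) -> dict:
    passages = retrieve(question)
    return {
        "summary": summarize_with_llm(question, passages),
        "sources": [doc_id for doc_id, _ in passages],  # kept for human verification
    }

print(answer("When are blanket orders for office supplies renewed?"))
```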
So, if the vendor is
- using their own in-house LLM
- following the rules above
- and targeting the LLM at natural language problems LLMs are actually good for
Then you should definitely try what the vendor is selling. (Try, not buy, and definitely don’t make a decision off the carefully crafted demo!) Put it through its paces in a typical use case for your company, not the use case selected by their demo master. If it does the task better on average than an average team member, or does it about as well but many times faster, that is what you are looking for in a tool. Since there is no real AI, you can’t be replaced. But as your bosses keep increasing the weight of your workload to hit ridiculous revenue and profit targets, you need a tool that multiplies your productivity. One that can do the majority of the tactical data-processing grunt work, leaving you free to do the strategic thinking and then add the intelligence to a process or output that no tool can possess, instead of spending 90% of your time doing the data entry, processing, and summarization that computers were built for.
In something like Procurement intake, that’s not trying to mimic in text chat the old-school phone conversation that took you fifteen minutes to do the monthly office-supplies re-order. That’s
- asking one question,
- processing the first one-sentence answer,
- and determining that the user needs to be pushed into the e-Procurement system with the monthly office-supply cart pre-loaded, so that all they have to do is enter the number of units of each item, and possibly add or remove an item from an easily searched catalog if one or two items need to change.
Not 20 questions of “what do you need”, “what quantity”, “the same supplier”, “so you want 2 cases of paper from Office Depot”, “no, Office Max”, “oh, standard printer paper, not glossy for marketing”, etc. (A minimal sketch of that one-question flow follows.)
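Here is a sketch of that one-question flow, with entirely hypothetical template names, URLs, and a made-up `classify_intent` stub standing in for the in-house model. The chatbot’s only job is to map the first answer to a known cart template and hand off to the e-Procurement system, falling back to catalog search when it is not confident, rather than dragging the user through twenty questions.

```python
# Hypothetical intake routing: one question, one answer, one handoff.
CART_TEMPLATES = {
    "monthly_office_supplies": "/eproc/cart?template=monthly-office-supplies",
    "quarterly_printer_toner": "/eproc/cart?template=quarterly-toner",
}

def classify_intent(answer_text: str) -> tuple[str | None, float]:
    """Stand-in for the in-house model: map a one-sentence answer to a template.
    A trivial keyword match is used here purely so the sketch runs."""
    text = answer_text.lower()
    if "office suppl" in text:
        return "monthly_office_supplies", 0.95
    if "toner" in text:
        return "quarterly_printer_toner", 0.90
    return None, 0.0

def route_intake(answer_text: str) -> str:
    template, confidence = classify_intent(answer_text)
    if template and confidence >= 0.85:
        # Hand the user straight to the pre-loaded cart; they just set quantities.
        return f"REDIRECT {CART_TEMPLATES[template]}"
    # No confident match: drop into catalog search, not a 20-question chat.
    return "REDIRECT /eproc/catalog/search"

print(route_intake("I need to do the monthly office supplies re-order"))
```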
When Gen-AI mania first swept our space, and every vendor was told they needed a conversational interface for buying (or no customer would consider them in their RFP), and then built one, every single one was painful to use. Most customers, upon seeing it for the first time (after insisting on it), quickly asked “can we turn it off?” because they realized that a well-designed catalog with blanket / standard orders, quick search, and easy drill-down to preferred suppliers was at least 10 times faster than trying to use a dumb chatbot, especially if they could pre-build templates / carts / blanket orders for regular purchases.
It’s the same for almost every other process vendors have been trying to apply this technology to, including conversational analytics. (Which, FYI, even Gartner expects to disappear from the conversation within two years.) There’s no such thing as conversational analytics, only reporting. And while that is really useful in the right context (such as allowing an executive to retrieve some basic information with a plain-English question), try building a detailed spend cube, which is the cornerstone of spend analytics, with conversational analytics! (And I mean try, because you will fail.)
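To see why a spend cube is a structured-reporting problem rather than a conversational one, here is a minimal sketch of the cube itself, with made-up transaction data and hypothetical column names (pandas assumed): spend aggregated across the supplier, category, and period dimensions, which you then slice and drill into rather than chat with.

```python
import pandas as pd

# Toy transaction data; real cubes are built from cleansed, classified AP/PO data.
transactions = pd.DataFrame({
    "supplier": ["Office Depot", "Office Max", "Office Depot", "Acme Logistics"],
    "category": ["Office Supplies", "Office Supplies", "Office Supplies", "Freight"],
    "month":    ["2024-01", "2024-01", "2024-02", "2024-02"],
    "amount":   [1200.00, 950.00, 1100.00, 8300.00],
})

# The "cube": spend aggregated over the supplier x category x period dimensions.
spend_cube = (
    transactions
    .groupby(["category", "supplier", "month"], as_index=False)["amount"]
    .sum()
)

# Slicing the cube is a query, not a conversation:
# e.g. spend by supplier and month for one category.
office_supplies = (
    spend_cube[spend_cube["category"] == "Office Supplies"]
    .pivot(index="supplier", columns="month", values="amount")
)
print(office_supplies)
```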
While this doesn’t mean that LLM technology doesn’t have uses, it does mean that those uses have to be finely tuned. So far, among the hundreds of companies I’ve seen over the past few years, only a few have both implemented LLMs and gotten them right. Let’s hope that number increases in the near future. If not, always remember that, while it would be great if a few more companies got it right, You Don’t Need Gen-AI to Revolutionize Procurement and Supply Chain Management — Classic Analytics, Optimization, and Machine Learning that You Have Been Ignoring for Two Decades Will Do Just Fine. Not to mention the fact that good, adaptive RPA will take care of most of your automation needs!