I’ve been enjoying Ben Lorica’s podcast and newsletter content since he was back at O’Reilly Media. A recent Gradient Flow newsletter entry, “A pragmatic guide to enterprise search that works,” was full of gold. Two statements that really struck me were:
“This reality forces a shift in focus from the AI model to the data foundation.” GIGO (garbage in, garbage out) applies just as well to the data that you’re siccing LLMs on. Organizational data engineering and data quality capabilities can provide large ROI.
“Enterprise success is not measured by open-domain accuracy but by reliability within a specific, messy, and private context. … The only way to solve this is to stop looking at external leaderboards and start building your own internal evaluation suite.”
Emphases mine.
I heard Ben encapsulate the second point in a sweet sound bite: “Evals are IP.” That applies well beyond enterprise search to other AI-engineered applications. Hamel Husain has been doing a great job of leading the charge on this topic.
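For a rough sense of how small an internal evaluation suite can start, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `EvalCase` structure, the `run_suite` helper, the `stub_search` stand-in for a real search system, and the sample questions are assumptions, not anything taken from the newsletter. The substring check is a deliberately crude pass criterion; a real suite would likely need something richer.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    """One internal test question plus the terms a good answer must mention."""
    question: str
    must_contain: tuple[str, ...]


def run_suite(answer_fn: Callable[[str], str], cases: Sequence[EvalCase]) -> float:
    """Return the fraction of cases whose answer contains every required term."""
    if not cases:
        return 0.0
    passed = 0
    for case in cases:
        answer = answer_fn(case.question).lower()
        if all(term.lower() in answer for term in case.must_contain):
            passed += 1
    return passed / len(cases)


# Hypothetical cases; in practice these come from your own private, messy context.
CASES = [
    EvalCase("What is the PTO carryover limit?", ("5 days",)),
    EvalCase("Who approves vendor contracts over $50k?", ("procurement",)),
]


def stub_search(question: str) -> str:
    """Stand-in for the real system under test (an assumption for this sketch)."""
    canned = {
        "What is the PTO carryover limit?": "Employees may carry over up to 5 days of PTO.",
    }
    return canned.get(question, "I could not find that.")


if __name__ == "__main__":
    print(f"pass rate: {run_suite(stub_search, CASES):.0%}")
```

The cases, not the harness, are the part worth protecting: they encode what “correct” means inside one specific organization, which is why they accumulate value as they grow.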
Full disclosure: this is a lightly edited version of content originally posted on my (gated) LinkedIn feed.