Speech to Action App: Practical A.I. on the Lakehouse using Transformers

Sathish Gangichetty
6 min read · Mar 30, 2022


If you’re reading this, you probably get the hype behind the data lakehouse these days! But have you thought about what needs to happen for the lakehouse to deliver on the AI promise? Pretty much every data company in the world claims it can bring your AI ambitions to life. Usually, this turns into a pipe dream because their offerings are too narrowly focused or too fragmented. Narrow focus is generally good for building strength in one area, but far too often companies stretch their brand by making expansive promises beyond what their offerings can truly do well. Conventional data warehouses/platforms masquerading as lakehouse offerings that claim to power AI are a case in point.

Therefore, it becomes imperative to show that the tech can not just deliver on the promise of industrialized AI, but also accommodate a modern consumption pattern: one predicated on making insights inclusive for all levels of users, minimizing the friction of consuming the intelligence these platforms generate. In other words, what most companies want out of their AI initiatives is “simplified access to powerful insights for everyone”. A functional, feature-complete lakehouse offering should natively provide this capability, while remaining flexible enough to accommodate a new range of simplified consumption patterns never seen before. The details of the implementation, although deeply important, matter less than democratizing this power without losing the flexibility and control to innovate. With that, let’s jump into characterizing our problem statement.

Today, if you are a business user, the typical response you get from tech teams around any analysis you need for a decision is something like: “go to this dashboard, pick an X on the drop-down menu, look at this chart, click another button, and then export to Excel”. If you need something else, well, you go to another dashboard and track down the right widget… you get where I’m going with this. What if there were a way to deliver the power of the lakehouse to the average business consumer without driving them into the “maze of consumption” (a.k.a. confusion)?

So let’s use that as the business case and build an application that serves as a template for how to build apps on the lakehouse. To truly qualify as an app that showcases the promise of the lakehouse, we need to tap into what are traditionally considered alternative data sources (i.e., unstructured data such as text and audio) and use them to democratize enterprise assets (expert know-how of anything from writing queries to model building).

Isn’t it funny that we call it unstructured data when it is the kind of structure we, as humans, are naturally programmed to understand? This is why children pick up language before they learn to read a table. But, as with many things in tech, we formalize definitions based on a distorted view of what’s possible with limited technology, i.e., tables. Reality as we perceive it draws no line between structured and unstructured data, so your data platforms should follow suit, one would think? At least, that’s how data platforms should evolve, leaving behind the ones that still create silos or are too immature to deal with all kinds of data.

Thankfully, that’s changing with the lakehouse. It brings about a mindset change where unstructured data is as much a first-class resident as structured data.

Now, let’s walk through an example application of how an average business user can take advantage of the collective power of their enterprise’s data brain trust simply by relying on a lakehouse platform. Make no mistake, the most advanced, multi-cloud, simple, and open lakehouse offering on the planet today is Databricks, so we’ll use the Databricks Lakehouse Platform to advance our case.

What would it take to generate meaningful business insights using the lakehouse by simply talking or texting?

At this point, it’s worth mentioning that the kernel of the idea came from a discussion with a colleague about the possibility of generating meaningful business insights from the lakehouse by simply talking or texting. If that’s the minimum entry bar for someone to participate in the data & AI revolution, we wondered, how far could someone get with the lakehouse? We collaborated on it, and he went on to document his findings here for generating SQL from speech. Give it a read; it’s a fascinating area of AI research.

It became clear to me that it’s possible to build a fairly mature pattern by relying heavily on the Databricks feature set. The entire exercise opened my eyes to how powerful Databricks truly is and why the lakehouse pattern will leave previous-generation data architectures in the dust. This is because functionality defeats everything else (including politics). Think about the naysayers during the internet revolution, the smartphone revolution, and even the cloud revolution. Eventually, everything in the ecosystem catches up when the ability to execute on the core idea exists. This is why the lakehouse shines: it pushes function over constraints (i.e., the drawbacks of the data lake and the warehouse).

So, without further ado, let’s see a demo of what this application can do. We can then work backwards through how I went about building it at a later time.

Example App on the Lakehouse: Truly democratizing generated intelligence requires a mindset change in how we source, integrate, and consume data and AI.

This clearly accomplishes what we set out to do. It highlights the following salient capabilities:

  • A minimalistic interface with speech or text inputs that an average business user can leverage from any device, a computer or a mobile phone (iOS or Android) of their choice, thanks to Flutter.
  • A demonstrable way to build on the knowledge of the hive, the data experts who know what it takes to beat data into shape for a useful insight/result. We simply use task annotation as a way to map a request to the appropriate “action” (job/query/workflow) in the backend using AI; a sketch of this matching step follows the overview below.
  • An async response loop that manifests itself via Slack (in this example). This can be any communication mode of your choice!
  • A thoughtful, user-first usage pattern that waits to see whether the user indeed wants the action to be performed (also sketched below). The cost of rejecting an action is just one ML Serving inference call, thereby saving unnecessary compute wherever the matched intent doesn’t fit the user’s utterance.
  • A way to methodically tame the complexity of natural language using ASR and semantic search, powered by modern transformer architectures like Sentence-BERT. Sure, with the explosion of transformer models, you can experiment with other model variants thanks to model zoos such as Hugging Face, PyTorch Hub, and TF Hub. This doesn’t mean language-to-action nirvana, but it provides an approachable AI-augmentation alternative until there’s a reliable AI-only strategy. The core technology that makes all of this possible is the connected feature set that Databricks provides via Databricks SQL, Notebooks, MLflow, and ML Serving. I found this to be a truly simple and amazing trick, to the point where you can easily make your own!
  • An ability to tap the broad set of Databricks APIs, across Workflows, AutoML, SQL, and more.
  • A way to collect both user queries and feedback by writing audio/text to files on the lakehouse and then using Delta to process them efficiently (a minimal sketch also follows below). While this isn’t immediately obvious here, it is the core secret to keeping the data flywheel that feeds the AI monster spinning, so that the quality of the service keeps getting better over time. Again, this is only possible because of the lakehouse paradigm. Data + AI = Promise Materialized.
High-level overview of how everything comes together. Notice all the Databricks lakehouse components that power the simple, minimalistic user experience.
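
To make the matching step concrete, here’s a minimal sketch of the utterance-to-action path: transcribe speech with an off-the-shelf ASR model, then pick the closest annotated action with Sentence-BERT embeddings. The model names, the action catalog, and the similarity threshold are illustrative assumptions, not the exact choices behind the app.

```python
# Sketch: speech -> text -> closest annotated action.
# Assumes `transformers` and `sentence-transformers` are installed.
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

# Off-the-shelf models; swap in whatever fits your latency/accuracy budget.
asr = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical catalog: each annotation describes a backend job/query/workflow.
actions = {
    "daily_sales_job": "total sales by region for a given day",
    "churn_scoring_job": "score active customers for churn risk",
    "inventory_refresh_job": "refresh the inventory dashboard tables",
}
action_ids = list(actions)
action_vecs = encoder.encode(list(actions.values()), convert_to_tensor=True)

def utterance_to_action(wav_path: str, threshold: float = 0.6):
    """Transcribe speech; return (action_id, score), with action_id None below threshold."""
    text = asr(wav_path)["text"]
    query_vec = encoder.encode(text, convert_to_tensor=True)
    scores = util.cos_sim(query_vec, action_vecs)[0]
    best = int(scores.argmax())
    score = float(scores[best])
    return (action_ids[best] if score >= threshold else None), score

print(utterance_to_action("user_utterance.wav"))
```

The threshold is what lets the app ask the user to confirm instead of blindly firing a job when the match is weak.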
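The confirm-before-you-spend loop can then be a couple of plain REST calls. A hedged sketch follows: the serving endpoint URL and payload shape depend entirely on how the matching model is deployed in your workspace, while `/api/2.1/jobs/run-now` is the standard Databricks Jobs API endpoint for triggering a saved job.

```python
# Sketch: propose an action via one cheap ML Serving call, and only trigger
# the mapped (expensive) job after the user confirms the intent.
import os
import requests

HOST = "https://<workspace-url>"  # placeholder workspace URL
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

def propose_action(utterance: str) -> dict:
    """One ML Serving inference call; this is the full cost of a rejection."""
    resp = requests.post(
        f"{HOST}/model/<model-name>/<version>/invocations",  # placeholder path
        headers=HEADERS,
        json={"inputs": [utterance]},  # payload shape depends on the served model
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

def run_if_confirmed(job_id: int, user_confirmed: bool):
    """Trigger the mapped job only after the user says yes."""
    if not user_confirmed:
        return None  # nothing spent beyond the single inference call above
    resp = requests.post(
        f"{HOST}/api/2.1/jobs/run-now",
        headers=HEADERS,
        json={"job_id": job_id},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```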
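Finally, the data flywheel from the last bullet can be as simple as landing raw utterances and feedback as files on the lakehouse and appending them to a Delta table for downstream processing. The path and table name below are placeholders, and the snippet assumes it runs in a Databricks notebook where `spark` is predefined.

```python
# Sketch: batch-process raw utterance/feedback files into a Delta table.
from pyspark.sql import functions as F

raw = spark.read.json("/lakehouse/raw/app_feedback/")  # placeholder landing path

(raw.withColumn("ingested_at", F.current_timestamp())
    .write.format("delta")
    .mode("append")
    .saveAsTable("speech_app.user_feedback"))  # placeholder table name
```

Each new batch of files gets folded into the same table, so retraining or evaluating the matching model becomes a plain query over that history.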

Clearly, as you may have already figured, what I show here only scratches the surface of the range of possibilities. If you can imagine it, you can build it on the lakehouse! My personal takeaway is that the lakehouse paradigm is indeed the way of the future. Thanks to the lakehouse pioneers, many empty promises in AI/ML can now see the light of day, because the foundational data issues can be handled reliably across a wide variety of data types and formats. ML models can then be built natively on this data, prepared in open data formats, without the need for moving or duplicating data.

If practical AI/ML is important to you and you are moving data around because your data platform cannot support AI/ML natively, ask yourself: is it worth creating an ever-thickening blanket of proprietary tech debt that is waiting to choke your future self as you embark on your roadmap initiatives? Play the game so that your future self moves ahead and thanks you for the choices you make today! Be kind to yourself and lead the change for a better tomorrow! Onwards and upwards, until next time! 🚀

To connect, please reach out to me on LinkedIn.
