Techzine Talks on Tour

Techzine Talks on Tour is a podcast series recorded on location at the events Coen and Sander attend all over the world. A spin-off of the successful Dutch series Techzine Talks, this new English series aims to reach new audiences.

Each episode is an approximately 30-minute discussion that Coen or Sander has with a high-level executive of a technology company. The episodes are single-take affairs, and we don't (or hardly) edit them afterwards, apart from polishing the audio up a bit of course. This way, you get an honest, open discussion where everyone speaks their mind on the topic at hand.

These topics vary greatly, as Coen and Sander attend a total of 50 to 60 events each year, ranging from open-source events like KubeCon to events hosted by Cisco, IBM, Salesforce and ServiceNow, to name only a few. With a lot of experience in many walks of IT life, Coen and Sander always manage to produce an engaging, in-depth discussion on general trends, but also on technology itself.

So follow Techzine Talks on Tour and stay in the know. We might just tell you a thing or two you didn't know yet, but which might be very important for your next project or for your organization in general. Stay tuned and follow Techzine Talks on Tour.

How to bring AI into your Business (Jayesh Govindarajan, Salesforce)

May 17, 2024 • Coen or Sander • Season 1 • Episode 4

Unlock the secrets of integrating AI in enterprise organisations with Jayesh Govindarajan, the SVP AI at Salesforce. Many organizations struggle with how to fit genAI into their business. Salesforce is helping its customers by offering many integrated genAI features, but it also gives its customers the power to build their own AI features.

Govindarajan takes us behind the scenes of Salesforce's AI journey, revealing the complex innovation of embedding AI features that elevate the user experience. We talk about AI deployment, the pivotal role of contextual data, and why developers are turning into prompt engineers to exploit the vast capabilities of large language models. We also discuss the role of the Copilot chatbot, which is something that many users don't favor. How does Salesforce expect chatbots to become more popular?

Salesforce created many AI-ready features on its platform. Copilot plays a significant role, but there are also many AI-specific features that can be activated by a click of a button. For example, an auto-generated reply on a support ticket. Users on the platform also have the option to use Copilot at any given moment.

Organizations that want to use their own models or build their own AI features can use the new Einstein 1 Studio. Salesforce also launched Prompt Builder, which helps users build effective prompts, something that can be really hard. According to Govindarajan, they have made a lot of progress in making prompt engineering a lot easier.

Salesforce is one of the companies that has already made huge steps in AI, but it believes it's only at step one or two of a ten-step journey.

Speaker 1: 0:25

This is Coen. We're reporting from Salesforce TDX, the developer conference of Salesforce, and we're here with Jayesh Govindarajan. He's SVP at Salesforce for AI. Welcome, Jayesh.

Speaker 2: 0:39

Thank you, Coen. It's a pleasure to be here.

Speaker 1: 0:40

Yeah, nice that you took the time to meet with us. You've been actively developing Salesforce AI tools within all the different Salesforce products. Can you tell me a bit about how difficult it is to put AI features within your products? I think many corporations are challenged by that at the moment, and Salesforce is one of the bigger ones, so more capable of creating these features. But what do you tell customers that are looking for AI?

Speaker 2: 1:14

Yeah, it's been quite an amazing year, year and a half, with all the generative AI technologies coming out with such amazing capabilities, and you're absolutely right: wrapping that into a product that's useful for enterprise users as well as consumer users is a challenge. At Salesforce, we've been looking at it from the perspective of three big waves when it comes to bringing AI to enterprises. The first one is the predictive wave, which is building predictive technologies in the context of the products that customers build and use, things like predicting whether a case is going to be closed, or whether a deal is going to be closed or not. That's one family of AI that Salesforce has been working on for the past decade or so. We do about a billion predictions a day on that front.

Speaker 2: 2:07

And then about a year and a half ago, as everyone knows, a new technology came out: large language models, based on transformer architectures that Salesforce had been using in the past for predictive AI. So we jumped onto the generative applications of AI in the enterprise as well. From the perspective of how these technologies are used, that was the second wave for us: how can we bring that technology in a way that's assistive to an enterprise user? That spans several personas. Salesforce users fall under many different personas. There are salespeople, there are service professionals, and bringing that together is something that we've been working on for the past year, year and a half.

Speaker 1: 2:57

Okay, how hard is it to create Gen AI features within a SaaS solution? Because I think many of your customers are challenged by that. Do they do it themselves? Do they look for solutions like Salesforce or maybe use Vertex AI? What is your take on that?

Speaker 2: 3:14

So last year, when we started to look at generative AI, many of our customers had played around with ChatGPT, which is more in the consumer domain. I've spoken with a lot of CEOs and CIOs, and my teams connect with a lot of our customers.

Speaker 2: 3:32

Their first question was: okay, I've got a license for OpenAI, what do I do next with it? How do I bring this to my users? A lot of last year was about customers and companies kicking the tires, so to speak, running some prototypes, creating proofs of concept using this technology. But the thing we realized is that to bring all of this together, you need to bring in three things. The first thing is contextual data, a deep understanding of exactly what you want to do, what job you want to either automate or assist a user to go get done, and then codify that into a set of prompts, instructions, if you will, and then let the LLM come back with responses. Of course, very quickly we realized that a lot of these LLMs by themselves have lots of challenges.

Speaker 2: 4:29

They hallucinate, and the space of responses that come back is varied. To be able to get that working in the enterprise, it's really important to have two things. One is the context of what job you're trying to have the LLM do, and second, the data that is needed for that context to work appropriately. And if you look at what we've been announcing today at TDX, a lot of that is bringing all of those elements together for a new breed of developers, prompt engineers, to be able to bring that together into applications.
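As a rough illustration of the idea described above, a prompt that combines a definition of the job with the contextual data it needs could be assembled like this. This is a minimal sketch, not Salesforce's implementation: the template text, the `build_grounded_prompt` function and the record fields are all hypothetical.

```python
from string import Template

# Hypothetical prompt template: instructions for the job, plus slots
# for the contextual data the model needs to do that job.
PROMPT = Template(
    "You are a support agent. Draft a reply to the customer.\n"
    "Customer name: $name\n"
    "Case summary: $summary\n"
    "Only use the facts listed above."
)


def build_grounded_prompt(record: dict) -> str:
    """Fill the template with fields from a (hypothetical) CRM record."""
    return PROMPT.substitute(name=record["name"], summary=record["summary"])


prompt = build_grounded_prompt(
    {"name": "Ada", "summary": "Invoice was charged twice."}
)
print(prompt)
```

The point of the sketch is only that the instructions and the data travel together in one prompt, so the model does not have to guess at facts it was never given.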

Speaker 1: 5:03

Okay, yesterday I talked to Silvio. He's your chief scientist.

Speaker 2: 5:08

Yes.

Speaker 1: 5:08

He told me how hard it is to create your own LLM. It's really, really difficult, I must say. I think nobody should try it themselves. Organizations should only try if they're really big.

Speaker 2: 5:24

Like Salesforce or Google or whatever, absolutely.

Speaker 1: 5:26

Because it's just too hard. But I also think the whole prompt engineering thing is going to be difficult for many organizations.

Speaker 2: 5:35

Yeah, let's take it at the very bottom and then build our way up to the top.

Speaker 2: 5:39

Silvio is, of course, a fantastic partner to us. He works pretty closely with our teams to go build up these things and bring them into production, and he's absolutely right. Building an LLM from scratch is not for the faint of heart. It requires not just using a large language model which predicts the next word, but then you have to teach it how to follow instructions, which is a training job and requires data. Then there's another layer that gets built on top of that, called reinforcement learning with human feedback, which is also fairly technical and requires investment to go get done. But once you've done that, or once you have access to an LLM, a lot can be done with what's called zero-shot learning, which is in essence prompting, so long as you're able to pack the prompt with a set of instructions that are viable and actually get the job done with the data that's needed by the LLM to go craft the response.

Speaker 2: 6:40

I think we know that LLMs perform amazing tasks, amazing feats. It is a challenge, although not as hard a challenge as building your own model. If you're able to bring together data, a definition of the job and context, it is certainly doable for anyone that has an understanding of what they're trying to achieve to use the tools that we're producing to become a prompt engineer. So it's not as hard, certainly not as hard as building a model yourself.

Speaker 1: 7:15

And what we hear from organizations that are dipping their toes in the water with LLMs and with prompting is that many models don't really listen that well to the prompt you give. Within the prompt you give different rules for what the output should look like, but many of those rules get ignored during the generation of the output. How big is your challenge in creating the right prompts and getting the right results? Because most organizations now say we need to have a complete conversation with the LLM to get to the right result. But in the ideal world you want to send a prompt and you want the best result directly.

Speaker 1: 8:06

Is it something you have been playing with, or is it challenging?

Speaker 2: 8:12

We have been playing with that a lot, and there are two sets of products or studios that we launched here at TDX. One is called the Prompt Builder, which is the one-shot scenario, and the other one is the Copilot, which is more conversational. I do think there is a place for both. I think there is a set of problems or challenges that enterprises face which can be resolved by a single prompt. Those are usually what we call single-turn interactions.

Speaker 2: 8:48

If you're able to express your need in a single sentence, that's a single-turn interaction, but it must have all the contextual data needed for the LLM to actually produce the result that it does produce, which is why we bring Data Cloud into the picture. The second is a longer conversation where you are not sure of what you want. For example, when you say something like "build me a sales plan", that's an interactive activity by definition.

Speaker 2: 9:17

Why is that? Because the space of what is a good sales plan is really in your head, so you need to prompt it multiple times, have a conversation with the system for it to be able to give you a plan that you like. Second, it's very personalized. As a salesperson, the way you sell, or as a service person, the way you resolve scenarios, is very different, so it makes sense for a copilot-like thing to emerge. On the other hand, if all you really want is a fantastic email to be drafted based on a single ask, it should be possible to do so.

Speaker 2: 9:51

So I do think that the world of enterprise applications will be split into these two parts, with a simple, single prompt being able to build generative AI applications that are deeply embedded in the flow of work, where they're not even seen, so they're in the back end. It's basically a prompt that's doing the activity, but in the front end it looks like any other UI, except it has generative capabilities built in. That is basically the one-shot approach that you're talking about. And then there's going to be a family of use cases which are going to need a deeper back and forth with a user, which is where copilots come in, and things like planning, reasoning, understanding conversational state, and being able to orchestrate actions on your behalf are where all of that comes together. So I do believe we are going into a future of enterprise applications where both of these interaction mediums will be used significantly.

Speaker 1: 10:47

So you're saying there are two ways: the one-shot and the whole conversational. Isn't it a challenge for you to create the Prompt Builder with a one-shot? Because if you give it so many rules, you need to make sure the output is a one-shot and it's perfect. But do you sometimes communicate multiple times with an LLM, or do you optimize the whole process so that you can send one request to the LLM and get your output right at once?

Speaker 2: 11:13

Yeah, this is a great question. I think, as customers use the system, our goal is to meet customers where they are. There are many customers who have armies of developers who are phenomenal at building applications. They have a deep and keen sense for what problem they're trying to solve and what persona they're trying to solve it for, and when they have that, we want to just give them the tools to be able to unleash whatever interactive or non-interactive experience they might need.

Speaker 2: 11:42

So, in essence, what developers are doing with the tools we give them is crafting or tuning the prompts manually. Just like there's a development lifecycle where you build, you test, you take it to your users to run a pilot, you come back and you iterate on an application, in much the same way you iterate on a generative application, except, instead of writing code, you're writing prompts and tuning prompts. Where we're going with this is we GA'd a feature yesterday for feedback, which is that we're trying to collect data on interactions which are both positive and negative, and what you want to do is have the prompts tune themselves based on positive interactions and on where it doesn't work as well.

Speaker 2: 12:32

So if you want me to draw a vector for where we're going, in the future, I think a lot of the prompt tuning will likely be based on feedback from users and utilization, but for now, the state of the art is a set of really smart developers for whom we've built amazing tools to be able to do this. It is a development lifecycle where you build a prompt, you build a generative AI application around that prompt, you plug in your data sources through Data Cloud, you plug in your action frameworks through your runtime environments, and you test and then you tweak the prompts to get to the right results. Once you've done so, you understand the variance in how customers will interact with that prompt and then you account for that in the application that you build, much like you would in any other application.

Speaker 1: 13:17

You've been building this application over the past year. The trial and error part of creating a prompt, has that become shorter or is it still a challenge? How fast do you build a prompt now versus six months ago or something?

Speaker 2: 13:34

Oh, you know, it's exponentially better now. We understand the limits of the system a lot better. We understand how to craft applications and keep them specific to one or two tasks, so that the variance in the output is lower. Definitely, we've learned a ton and it's gotten significantly better, not only because...

Speaker 1: 13:53

Are you happy with the pace now, or do you think it should still be a bit better?

Speaker 2: 14:03

It should be better, and I think this is where tools come in. I think the space of innovation and what customers do with the tools we give them is almost infinite. There's always room for improvement and I think the improvement comes in many levels and layers. LLMs themselves are not staying put. They're improving considerably and, over time, hallucinations are reducing a lot. We've been on this journey for a year and a half, and if you were to ask me what's happened: at the lowermost layer, which is the LLMs, there have been phenomenal improvements in the models and their ability to comprehend, their ability to orchestrate actions and their ability to produce output with citations, which is explaining themselves. All of these are new. And then there's the tooling that's been built around it, which is bringing in data, bringing in context. So definitely, it's gotten a whole lot easier and it's about to get easier still with the tools that we're building.

Speaker 1: 15:03

Okay, if you look at the other part, the copilot part you just talked about, there has been some research showing that users are not very fond of using a chat-based assistant to get a response from an AI. That was usually on a consumer level, and the expectation is that business users are even less willing to use a chat agent. But it's kind of the way of the future for now for AI. At least that's what everybody is focusing on. Microsoft is focusing on that, you guys are focusing on that, Google is focusing on that. Do you have plans or an idea how you will get users more involved in using a copilot chat agent within your applications?

Speaker 2: 15:48

That's a great question. I think you're right in that it'll take some time for users to become used to this new modality of interacting with applications. Up until now, most applications that you see out there are two-dimensional, with a user interface like buttons and drop-downs and menus and all that kind of stuff to go get your work done. Now we are introducing this notion of an assistant where you can talk to it to get jobs done. So certainly to some users it's daunting to be given a text box and to express what they want to get done. I think that user interface will evolve. You're starting to see some of that with Einstein Copilot, where we are launching ways to guide the user to go get their job done, much like you would in a traditional interface with buttons and such, but it's in the context of what you're able to express, which is almost infinite.

Speaker 2: 16:46

I think users will need to get used to having a really smart intern that can get a lot of the work done. But you need to know how to construct your job description, if you will, effectively to have the intern go do an amazing job of it. So if you think about how people prompt the system, converse with the system, I think it'll take some time for users to get used to having an intern to do what they've traditionally done self-service, using a user interface. But when that works out, I think it'll be a lot more powerful than the interfaces we have now, because you can express fairly complex tasks that you wouldn't be able to do otherwise in a static user interface. I call it static because user interfaces are predefined, whereas copilots are not.

Speaker 1: 17:39

Yeah, is that also why you chose the two-route option, one having a copilot and one where your admins can develop AI functions with the click of a button? Basically, they can place them in every application. They can add information or add buttons to perform AI tasks.

Speaker 2: 17:57

Exactly. I think the path to fully copilot-driven users and use cases is not going to be immediate. I think there's a lot of value to be captured in that journey towards having an assistant that can do things for you. And even when that arrives, there are certain use cases where, quite simply, the AI is invisible. You don't even need to prompt it. It's proactive in the sense that if something changed in your database and a row was added, you could proactively run a generative AI component in the background without the user knowing it and bring back the results. Very much the same technology, but it's hidden, it's unseen, and it doesn't require an interaction. The interaction is driven by the system or the machine.

Speaker 1: 18:48

We've come a long way in the last year in how LLMs are developed. New LLMs are announced almost every week, and they always claim they're better than the ones before. What do you expect for the upcoming year in this Gen AI process? Do you expect everything to get smaller and faster and better, or do you think it will die down a bit and we will stay at the current level and it will take more time to improve?

Speaker 2: 19:16

Looking at the trajectory of where LLMs were a year and a half ago and where they are now, I think what we're seeing is the emergence of large language models as going beyond just language, into planning, into orchestration, into reasoning. I think that is a completely new frontier that we've brought to Einstein Copilot. If I were to peek into the future, I think the world sort of splits into two parts. Simple, basic, language-oriented tasks, such as "write me an email", "summarize this document for me", or "write me a document on this topic with these sources", are very language-specific tasks, and already we are seeing lots of small models tackle this problem quite well. Where they are lacking is the reasoning and planning element that is baked into what we call large frontier models, like those from OpenAI, Anthropic and others.

Speaker 2: 20:25

So this sort of bifurcation is naturally happening, which tells me that there will be a place for small language models which are task-specific, and large frontier models that bring all of these elements together with reasoning and planning, and it's going to be up to enterprises. One of the things that we've done in our platform is we give customers a choice of what models they want to bring, and if customers don't care, we make the choice for them based on the task that they're trying to get done.

Speaker 2: 20:51

For example, if there is a prompt with a button in the back and all it's doing is generating an email, we will quite simply use the most inexpensive model to go get that done. However, if you now want to generate a big sales plan which is based on many different data sets together, then for something like that we need reasoning and planning, and we would switch the engine underneath to a more sophisticated model. I think that world is emerging where you see those two things break up in that way.
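The routing described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the task labels, the model tier names and the `choose_model` function are hypothetical, and the interview does not say how Salesforce implements the switch.

```python
from typing import Optional

# Hypothetical task labels for simple, language-oriented work.
SIMPLE_TASKS = {"draft_email", "summarize_document"}


def choose_model(task: str, preferred: Optional[str] = None) -> str:
    """Honor the customer's model choice if given, else pick a tier by task."""
    if preferred is not None:
        return preferred
    if task in SIMPLE_TASKS:
        # Cheap, task-specific model for plain language generation.
        return "small-inexpensive-model"
    # Anything needing reasoning or planning goes to a larger model.
    return "large-frontier-model"


print(choose_model("draft_email"))
print(choose_model("build_sales_plan"))
print(choose_model("draft_email", preferred="customer-model"))
```

A real router would probably weigh cost, latency and data sensitivity as well; the sketch only captures the split between language-only tasks and tasks that need reasoning and planning.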

Speaker 1: 21:25

One of the bigger components of your AI vision is the whole trust model, because one of the biggest challenges for organizations, or maybe they're a bit scared, is the fact that the data might end up in a model or the data might leak in different ways. You don't want your company secrets spilled everywhere.

Speaker 2: 21:47

Absolutely.

Speaker 1: 21:48

You guys built the trust model. Can you explain a bit how that works?

Speaker 2: 21:52

Yeah, it was one of the first things we built.

Speaker 2: 21:56

Last year, when we saw LLMs coming into the enterprise, the very first thing we looked at was trust, and how we can bring this to our customers in a trusted manner.

Speaker 2: 22:11

Working with our partners on the LLM, we have a zero data retention policy, which is basically being able to prompt and then have the engine forget that the prompt ever came in, which means the data is completely safe. It's not used to train any models of any sort. Now, customers, on the one hand, don't want the data to be used for training, but, on the other hand, they want to use the data to be able to generate great prompts, which is exactly what we allow them to do. We do not use that data for training, but we do use it with the customer's inputs for grounding the LLM based on the request that came in, all of which happens on Salesforce's secure private infrastructure, which is exactly what the trust layer is. It enables customers to bring in their data with confidence and to ground the LLM in their data with confidence, knowing that that data is not going to bleed into a model.

Speaker 1: 23:09

But you also mask it right, so it never gets even sent to the model?

Speaker 2: 23:13

Absolutely. We mask it and we run toxicity checks on the output that comes back before it goes back to the customer.
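The two steps mentioned here, masking data before it reaches the model and checking the output before it goes back to the user, can be illustrated with a toy sketch. This is not the Einstein Trust Layer: the email regex, the word blocklist and the `guarded_call` wrapper are simplified, hypothetical stand-ins, and a production system would rely on far more robust detection.

```python
import re
from typing import Callable

# Simplified pattern for email addresses (an assumption, not exhaustive).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
# Placeholder blocklist standing in for a real toxicity classifier.
BLOCKLIST = {"idiot", "stupid"}


def mask(text: str) -> str:
    """Replace email addresses with a placeholder before calling the model."""
    return EMAIL.sub("[EMAIL]", text)


def is_toxic(text: str) -> bool:
    """Flag text containing any blocklisted word."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return not words.isdisjoint(BLOCKLIST)


def guarded_call(prompt: str, llm: Callable[[str], str]) -> str:
    """Mask the prompt, call the model, and withhold toxic responses."""
    response = llm(mask(prompt))
    return "[response withheld]" if is_toxic(response) else response


# Usage with a stand-in "model" that just echoes its input.
print(guarded_call("Reply to jane@example.com about her invoice", lambda p: p))
```

Because the model is passed in as a plain function, the same wrapper applies whichever model sits behind it.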

Speaker 1: 23:20

Okay, and you can also bring your own model, right? Is it then a shared responsibility that the data stays out of the model? Because you can't control whether people add their own commercial model, or add consumer ChatGPT, for example.

Speaker 2: 23:38

Yeah, so we allow customers to bring in their own model. That's exactly right. Choice is paramount in a space that's moving so fast. Most customers that use bring-your-own-model are large customers of Salesforce who have the sophistication to test what models they bring onto the platform. Second, we will not bring any model onto the platform that doesn't align with our values of trust and privacy. We've brought in Anthropic. They're a partner. We've worked with them, we've invested in them as a company. We understand how they're approaching their model building, and their models run literally on our infrastructure. The data does not leave our infrastructure. So that is a responsibility we take with partners that bring in models. The models that are certified on the platform are ones that adhere to our values.

Speaker 1: 24:30

Is the infrastructure thing an issue for you guys? If a lot of customers switch to Einstein AI, do you have enough infrastructure to serve that? Because I can imagine everybody wants to be an NVIDIA partner nowadays. That is for a reason, I guess.

Speaker 2: 24:46

Yeah, we have. This couldn't have come at a better time for us. Salesforce has been on a journey to leverage some of the best clouds that are available today. Amazon is a trusted partner for us and we are a preferred partner for Amazon, which means that as we scale with our customers, we bring them along in the journey. Scale is not something that Salesforce is new to. As I said, on the predictive stack, we work closely with our partners and run about a billion predictions a month. We do expect GPUs to be a challenge, but we expect that when that challenge arrives we will meet it by working with our partners and having the right plans in place to be able to deliver that.

Speaker 1: 25:39

A bit of a harder question: is this just the beginning? Can we expect Salesforce to do a lot more in AI in the next year, or do you think the biggest step is already made?

Speaker 2: 25:48

Not a hard question at all. In fact, if you were to ask me, this is probably step one or step two of a 10-step journey. I think you will see amazing things happen in this space. I'm an optimist, a cautious one at that, but from what we can see, I think AI is going to be something that powers consumer and enterprise applications for a long time to come. The trajectory and the scale of innovation in this space in just the last year is unbelievable, and it's been mostly at the substrate layer. But now you're starting to see how companies like Salesforce, and others as well, to be honest, are bringing that through an entire stack that's evolving around it, including making innovations on the substrate layer. So I think this is just the beginning.

Speaker 1: 26:45

Okay, and we do hear some, I don't want to call them complaints, but challenges for some smaller companies. A lot of the new AI initiatives are at a price point where they can be beneficial for larger organizations, but for smaller organizations it is sometimes a bit of a big investment. Do you think in the long run, when the technology matures, it will become more accessible for smaller companies, that the prices will go down and efficiency will come up from the models as well?

Speaker 2: 27:17

Again, I think our focus is always to meet our customers where they are. We work with a lot of small customers as well as large customers at Salesforce, and the goal here is to build experiences that even the small customers look at and go, "it's worth paying for". I think what we are trying to do is generate tremendous value for the end user and then capture a part of that value. That's sort of where we are at.

Speaker 2: 27:51

From a price point perspective, absolutely. We talked about this a little bit earlier, where I think the future is going to be mixed large language models, which are not just a singular expensive model, but a mix of expensive models when they're needed and cheaper ones where they're not. I think when you bring these systems together, the cost to serve will go down significantly. But even today, I don't think small customers need to wait. I think we have a stack where they can come on board. They can start experimenting. We give them a bunch of credits that they can get started with, and if they derive the value by building the way we are working with them to go build, I don't think price is a problem, because they're deriving significant value from what they're building for their customers.

Speaker 1: 28:37

Okay, thank you. I think that will be something that we're looking into in the future, when you maybe have some examples of smaller companies that get benefits out of the AI, because I think that's something a lot of companies are wondering how to get that value pretty fast before it becomes like a cost issue.

Speaker 2: 28:58

Absolutely.

Speaker 1: 28:59

Okay, I want to thank you for your time.

Speaker 2: 29:02

Thank you, Coen. It's a pleasure being here.

Speaker 1: 29:07

Okay, thank you for listening. Be back for the next episode.