Profound

S4 E18 - Joseph Enochs - Embracing AI in the Enterprise

July 23, 2024 John Willis Season 4 Episode 18

In this episode, I speak with Joseph Enochs, Managing Director of AI/ML and Emerging Technologies at Enterprise Vision Technologies. Known for his extensive background in DevOps and digital transformation, Joseph shares his remarkable journey transitioning into the AI domain.

Joseph begins by recounting how his interest in AI was sparked, notably influenced by the foundational concepts of W. Edwards Deming and the subsequent developments in DevOps. He details the pivotal moments that led him to pursue a master's degree in AI, highlighting the importance of continuous learning and the foresight needed to anticipate technological trends.

The discussion covers the evolution and integration of AI within large enterprises, emphasizing the challenges and strategies for incorporating AI into existing systems. Joseph explains the significance of vector databases, context windows, and the roles of orchestrators and agents in enhancing AI capabilities. He also delves into the practical applications of AI in business, such as improving call center efficiency and automating complex tasks.

In the final segment, Joseph offers practical advice on how to start learning about AI, emphasizing the importance of experimentation and validation of assumptions. The episode concludes with a reflection on the profound impact AI is set to have on the future of technology and business.

John Willis: [00:00:00] Hey, this is John Willis. So this is a little bit of laziness, a little bit of organization, but right now it's probably going to go up on the Profound podcast. At some point, I'll actually be creating an Attention Is All You Need podcast, and I still haven't gotten that all straightened out. There's been a couple of podcasts we've done that have been focused on AI and for now are just resting in Profound, but we'll get that all cleaned up at some point. But I got a great guest.

He has been my mentor in AI. As some of you know, I'm writing a book about the history of AI, and I've been sort of transitioning my day job to be doing a lot of RAG stuff, but sort of, you know, working my way up the chain, trying to understand more of this. And I'll talk about how I met this gentleman and what an amazing mentor he's been for me over the past year.

So Joseph, you want to introduce yourself? 

Joseph Enochs: Hello. Yeah. And, you know, [00:01:00] just for a minute before we get started, actually, John has been a mentor of mine for well over a decade in the DevOps space. So it's exciting for me to hear some of the kind words that John has shared about our AI journey together.

And yeah, so I'm Joseph Enochs. I'm the director of emerging technologies and artificial intelligence and machine learning at Enterprise Vision Technologies. And I've been working with the organization for well over a decade now, trying to bring innovation to enterprise organizations. And we support about a hundred very large enterprises.

And you know, it has been a journey from, you know, DevOps and transformation into cloud. And now over the last several years, I've been building out this machine learning, data science, and now what they're calling data intelligence practice. So thank you for having me, John. 

John Willis: Ah, it's amazing.

So about a year ago, maybe a little over a year ago, I don't know. We decided down, Alan Shimel at devops.com [00:02:00], TechStrong, sponsored what was indeed called the first DevOps Gen AI hackathon, and we all went down to Boca Raton, and I invited Gene, you know, he was a little busy, but he said, hey, do you mind if with this guy, Joseph. And me and Joseph had met, but like, you know, I'm lucky if I remember my kids' names these days.

I'm getting old. You know, I didn't really remember him. And you know, I was like, yeah, sure. And the minute Joseph got there, he asked, he said, John, do you mind if I throw up a couple of slides? I was like, yeah, go ahead. And I'm like, holy crap. This guy knows this stuff. Like all of us were like, I mean, Patrick was there.

He had a head start on a lot of us, but Joseph, it was like, holy mackerel, this guy really knows some stuff, and it really helped us get the whole, you know, to the extent that we've got a, you know, small emergent community doing DevOps and Gen AI, I think Joseph and Patrick probably, you know, [00:03:00] have sort of helped guide us.

Joseph really has a strong background, which sort of goes into the first topic I wanted to cover, which is, so you're a DevOps guy. You're going to DevOps Days. You're watching my presentation, Gene's presentation, Patrick's, and you're banging it out as a, you know, big scale consultant with all these DevOps projects, doing everything.

And somehow you, this is the thing that I think is so amazing, I'll let you tell the timeline, but you decide you need to go back and get a master's degree in AI before all this insanity started. So tell us that story. Because I think it's not just amazing timing. You had some insight that drew you to make that decision.

Joseph Enochs: I think it goes back, you know, to your point, to the early days of DevOps, when I was watching, you know, Gene and you guys had started, you know, [00:04:00] DevOps, and The Phoenix Project came out. And then I watched your talk on Deming to DevOps, which really got me excited about research, right? And, you know, in that talk, if folks haven't seen it, you've got to go back and watch it, right?

Where you really step through the provenance from all the way back hundreds of years of, you know, the foundations of DevOps and thinking about entropy and value stream and theory of constraints. And you stepped all the way through to modern times. And that really got me thinking on the journey just of, of how to learn, right?

And at that point in time, I really adopted that in sort of my life journey, like the why and the aim of what I'm doing: set out this aim and then conduct the research. So around that same time frame, 2013, 2014, I started hearing more about, you know, obviously GPUs and things were big, mainly in the gaming and crypto space.

But I also happened upon a [00:05:00] guy named Ray Kurzweil at the time, and he had been making these interesting predictions about AI for a long period of time, but I really hadn't connected the dots between how the progression in hardware and the progression in AI were really happening.

And when I watched Jensen Huang's discussions about the GPU, and then I watched Ray Kurzweil's discussions about the trajectory of AI, it really made me think about the conversations that I had heard from yourself and from Gene about sort of this preordained world of DevOps, where we try to eliminate entropy out of systems.

We focus on the aim, we focus on the goal, and then eventually, if we can get sort of that human element of the system figured out, then we can make progress as a group. And that sort of got me on this journey of researching, you [00:06:00] know, artificial intelligence. And then to your point, we were still doing a lot with DevOps, and Kubernetes, obviously, was open sourced.

And we were working a lot with that. The SRE movement started. A lot of exciting things there. And our company, we actually acquired another consulting firm that was doing DevOps, you know, consulting around a lot of containerization and things of that nature. So I was able then, because we acquired a company to do that, to kind of allow those folks to take the reins with our DevOps and DevSecOps practices.

And it's been great, and it allowed me to really, to your point, focus on the data and data intelligence practice. And that was the 2019 timeframe. And at the time I was looking around, to your point, for something like, you know, an executive education that I could do part time and at night while still working, right, doing my day job. [00:07:00]

And I found a program, an initial program to really remediate my skills, at Columbia. And it was primarily focused on Python programming and applied machine learning. But the prerequisite for that was math that I hadn't touched in, let's just say, a number of years. So I literally, before I could even start on that journey, had to go back, to your point, and do a self inventory on like, hey, do I have the time to do this?

I have, you know, personal commitments to family, personal job commitments and whatnot. Can I really do this? Right? Can I sign myself up to do it? And I started thinking, and more and more data points would come out, kind of like, again, the early days of, let's say, containerization, like containers and then Kubernetes.

And then you start seeing, you know, VMware was acquiring [00:08:00] Heptio. And I'm like, well, you know what? Containers, Kubernetes, this is going to eat the world kind of thing, when you start seeing these things happen. And all of these same sort of things continued to happen with data and data science. And I'm like, this has got, you know, so many data points have hit that this is coming and I have got to make this commitment.

So I really locked down and, to your point, made the focus on remediating my skills for math. I took that first jump into the Python and applied machine learning and started coding different coding tasks, you know, small coding tasks, and then moved up to larger coding tasks, remediated my math.

And then to your point, once I had gone through that program, I went through a, you know, like you said, a degree completion program at the same Columbia Engineering, for machine learning and data science and AI, you know, [00:09:00] artificial intelligence. So I went through that program after that. But to your point, it was a long journey of soul searching, a long journey of doing a personal inventory.

But once, you know, all the boxes started aligning, I said, you know what? I have to go after this. And I will say, you know, through the pandemic, since everybody kind of went remote, it made it less challenging for me to be able to kind of sit through and go through, because all education at that time had primarily transitioned to online.

John Willis: Yeah. And you had said, so Jensen is the NVIDIA founder, right? And you said it was something that you saw him, so you said Kurzweil, and Kurzweil is amazing. Like, you turned me onto him and he's going to be a big part of my book as well. But Jensen, obviously, history of AI, Jensen's in my book too, but you said there was some, you saw him in a presentation or something and you just realized this is going to hit.

Do you remember what that was? [00:10:00] 

Joseph Enochs: Well yeah, really, with NVIDIA itself, you know, there had been a big push at the time for GPUs. There was a GPU kind of crunch. And to your point, he was talking about, like, this GPU crunch, this is just the, you know, like, they have a concept of crossing the chasm. I'm sure people have heard of that, right?

Where you have, you know, your innovators and your early adopters and sort of this crossing the chasm, right? And he started talking in those terms about sort of these GPU-enabled databases and artificial intelligence. And he was talking about, at the time, that they were crossing the chasm.

Right. And he's like, this is going to become mainstream. And one of the things that he, as a CEO said, he's like, and because we are crossing the chasm here on AI, I'm going to make an investment in Nvidia that [00:11:00] we are going to identify throughout our value chain, all of the individual components that we rely on.

And we're going to start, you know, creating an end-to-end vertical. And, you know, that's when, to your point, CUDA and the networking and every piece in their, you know, design process, et cetera, they started identifying what were core components of their business. And then at that talk, when I started recognizing, you know, NVIDIA is so serious about this AI, they are so strategic about it, that the future of their entire existence had gone kind of from gaming into this AI realm, that he is making this whole big bet that the vertical market is going to all be based on GPUs and AI.

That's one of the, to your point, one of the defining moments, because that was in alignment with what Kurzweil had predicted on artificial general intelligence, which, by the way, [00:12:00] appears to be coming true. His 

John Willis: whole podcast unto itself. But right. But those two things were aligning.

Joseph Enochs: The hardware was actually making it happen and, to your point, there were all of these waypoints in the underlying components of AI that were coming true. And again, that one presentation that he did, where he really started talking about how they're changing the entire company for a vertical strategy. 

That's what got me on, you know, sort of that final straw to say that this is coming. 

John Willis: You know, because I mean, like, especially, you know, one of the things that we both see, we both go to a lot of hackathons together and we've been having a lot of fun just figuring this out, sharing notes and stuff.

But a lot of the people that are deep into this have no enterprise background. You know, and I think what makes you so interesting, you know, where I'm trying to follow in your [00:13:00] footsteps, is we understand the enterprise. We understand the banks. We understand the utility companies. We understand.

And for us to be able to walk in with similar knowledge to what these kids have, these brilliant kids, but, you know, if we can even just stay on par with them, they have no understanding of what we know about how a large bank works and, you know, back to, you know, even the Deming, the entropy stuff.

So I think to me that's been one of the most exciting things, watching you and trying to follow your lead. So I thought another thing would be interesting. So yeah, it's fascinating, your timing. You know, I feel so fortunate that we ran into each other at the time when I was trying to figure out, you know, like, is this for me?

And now, you know, I'm pretty much all in. So, you know, one thing I get to ask, because Joseph's been through the training, right? So, like, all [00:14:00] these things that are sort of new to most of us, like, what is a large language model?

What is a vector? What is a, you know, he's allowed me to sort of sit down over coffee or a drink and, you know, he'll break out a pad and just show me. And I thought it might be fun to have you sort of walk through some of this at, you know, a first level, for listeners who are, you know, struggling with like, what are these things really?

So, you know, everybody knows what an LLM is by now. Everybody knows sort of what LangChain is, right? But I thought we'd take it one level deeper, and, you know, just something I've been using as an acronym, really just to help DevOps, you know, our DevOps tribe, if you will, be better prepared for understanding all this stuff, because, you know, we're actually working on a paper called Dear CIO, which maybe at the end we can go back to, but it's like warning the CIO that you're going to own all this stuff in the [00:15:00] enterprise.

And so there's going to be a large technical debt. And, you know, I feel like my role, as somebody who's been involved in sort of, you know, I hate to sound hokey, but the greater good that DevOps brings to our community, is helping enterprises, protecting the enterprise, helping the enterprise sort of not fall into vicious technical debt cycles. So I came up with an acronym called LORMA, basically to try to use what we successfully did with the LAMP stack, right? Maybe, you know, for no other reason but to make it easier for people to comprehend what all this stuff is. So I call it LORMA: the L for language model management, O for orchestration management, R for RAG, M for model management, and A for agent management.

So I thought maybe we'd walk through, you know, sort of a little bit lower than, you know, GPT-4, you know, things that people can Google. [00:16:00] So let's start with the L. You know, tell us a little bit about maybe the difference between these. I guess one of the interesting coverage right now is small language models versus large language models.

What do we need to know as DevOps practitioners about these things other than what we can Google, you know, or ask ChatGPT about? 

Joseph Enochs: And you know what, you can and you should ask ChatGPT and, you know, Claude and all these models, and get access to them. But I think there's really two distinctions, right?

There's sort of what they call the frontier models, right? And then you have really your foundation models. And the frontier models are primarily the ones that, to your point, you're talking about. That's going to be your Claude, that's going to be, you know, the models coming from OpenAI, you know, all of these main models that [00:17:00] are really a black box underneath the covers.

But you can get API access to them or you have a chat interface to them. So those are the primary models that everybody's heard of. But there's another section of models that are these foundation models, and these foundation models are from other labs that are then giving open source for the actual, you know, the parameters and open weights.

Now, there's an important distinction there. You can have a large language model, or a small language model, which we'll talk about, where they give you sort of the structure of the model, but they don't give you the underlying weights. So what does that mean? Well, you have kind of the shell, but when they train these models, they put millions and millions and trillions of documents through these things.

And it builds up this, let's say, map, reduce knowledge of language understanding. [00:18:00] And those are the weights to the models. So even though you may release just the model itself, if you don't release the weights, you're kind of, you know, you're missing some pieces to that. Some of these models are open model, right?

Like open source and open weight. And then with the combination of those things, you really have good insight into what this model can do and is doing. 

John Willis: So just to sort of level set. So the foundational models are like GPT-4, frontier models, but the models... Okay. And then, because the foundational models are...

Joseph Enochs: Well, I would say that, you know, foundation models are sort of all-encompassing. The frontier ones are the ones that you can't get access to. Okay. Then your general foundation models are ones that you can get, let's say, open source and open weight, right?

So it really depends. Like, let's [00:19:00] say Meta's Llama 3 model or Llama 2 model. This is a model where you can see various pieces of that model. Or you look at Microsoft, who has released the Phi models, P-H-I models, right? The Phi models. These are models with various licenses and structures to them where you can actually see the model and you can actually see the weights and you can actually really get underneath the covers of how this model works.

And I would say those are the two big distinctions, the black box models or the frontier models, and then your general purpose foundation models, which come, to your point, in the large language models and the small models. And you have sort of the large language models, like we talked about, Llama 3 70B, right?

Which would be a large language model. And then you look at a small language model like the Phi mini model, which only has like 3.8 billion parameters, right? That can run on your cell phone. 

John Willis: And so what's an example of [00:20:00] a model that is open, but doesn't provide the weights? 

Joseph Enochs: Yeah. Yeah. I think in that scenario, let's say the models that are coming out of AWS, right? Or like Cohere, right? For instance, Cohere will give you access to a model, but you won't necessarily see the weights to that model, right? Some models are coming out that'll run on a virtual machine, and they are, you know, a large language model that's there.

But you don't actually have access to the underlying weights. So there's a few of them that are coming out, again, like the one that I mentioned from Cohere, where you can see the model, but you won't necessarily be able to see the underlying weights. You can run like 

John Willis: Ones you can run on prem, versus like the Anthropic, OpenAI, basically.

You call those with APIs, but some of these other models you can actually run on your own [00:21:00] machines, but you don't have the weights, right? And then, oh, and you're saying the Llama ones, you can run them anywhere and they do make the weights public.

Joseph Enochs: Yes, however, there's a lot of caveats here from version to version to version, different sort of licensing terms, right? Meaning, is it going to be what they call commercially viable? Meaning that, okay, I'm going to give you this. You can see underneath the covers, and if you use it for personal use, it's fine.

If you use it for usage on up to, let's say, a thousand customers, it's fine. But if you look down in the fine print of some of these models, and again, it changes from release to release, you need to contact them in order to get permission to leverage the model. Even though you may have access to the underlying [00:22:00] components, you may need to check in with the people who have published the model to actually know if you can use it for commercial viability.

John Willis: And where does Mistral fall into this? 

Joseph Enochs: Yeah, they're primarily in the open realm, and they are starting to offer some models where you can get an API for it, right? They started off offering their models as just, to your point, completely open source, open weight. It was exciting. People were training on it.

Then they released a mixture of experts model. And then, and then they started saying, Hey, you know what? We want to actually run this in our environment. So we're going to have an API so that you can hit it and offer more of a, you know, as a service type model, but they primarily fit in the more open weight and source you know, components.

John Willis: Got it. That's a great overview. And so now what are these orchestrators and why are they important? 

Joseph Enochs: Well, when you have [00:23:00] a large language model, you know, you may want to have this model understand more intent, or basic intent, or do basic tasks. And these models, they're okay if you ask them one thing; they can probably provide you a good answer.

But these models have not been trained on all of the data, you know, in the world. They've been trained on publicly available data, but they have not been trained on your private data. So one of the things about taking a model and let's say it's, it's an internal task. I have a user who's asking a large language model something.

It doesn't have access to a certain amount of private data, but if I give it access to this data and put it sort of in the context window, you know, then the large language model, with its emergent behaviors, can use those internal documents to respond back with the specific task that it is intended to do.

So what's [00:24:00] so important about these orchestrators is that they allow you to create functions and tools and give access to different pieces of data, so that you can give behaviors to these various large language models and make them do things that would be very difficult for them to do in just a traditional chat interface.

John Willis: And so it's like LangChain and LlamaIndex. 

Joseph Enochs: Yeah. LangChain, LlamaIndex, DSPy. Just recently Databricks, through their acquisition of Mosaic, has created their data intelligence platform, and they have orchestrators that they're building. But I think for us, LangChain was one of the very first ones to kind of come on the scene.

And at that time, you know, the context window for OpenAI was like a 2K context window. And when you're trying to do retrieval augmented [00:25:00] generation, which is, like, a simple task of taking documents, retrieving them, and then responding back with them, it was really important that you had this orchestrator with a retriever, like in LangChain, that would pull these chunks into the context window and then respond back to you.
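A minimal sketch of the retriever step described above: rank document chunks against the question, then pack as many as fit into a fixed context budget before building the prompt. This is an illustration under stated assumptions, not LangChain's API. The names `overlap_score` and `build_prompt` are hypothetical, word overlap stands in for vector similarity, and whitespace-separated words stand in for tokens.

```python
def overlap_score(query, chunk):
    """Count lowercase words shared by query and chunk (stand-in for vector similarity)."""
    q_words = set(query.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words)


def build_prompt(query, chunks, budget_tokens=2000):
    """Pack the best-matching chunks into a prompt without exceeding the budget.

    Words are used as a rough proxy for tokens (an assumption for illustration).
    """
    ranked = sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)
    picked = []
    used = len(query.split())  # the question itself consumes part of the window
    for chunk in ranked:
        cost = len(chunk.split())
        if used + cost > budget_tokens:
            continue  # this chunk does not fit; try a smaller one
        picked.append(chunk)
        used += cost
    context = "\n".join(picked)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

With a 2K window, the budget is what forces the retriever to choose: only the chunks most relevant to the question make it into the prompt that the model sees.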

So LangChain was one of the very first that sort of came out on the scene. And it's got a really good name. And then LlamaIndex really came on. I think they took some lessons learned from the pedigree of LangChain, and they really started focusing on that RAG aspect, meaning that retriever aspect, and being able to do SQL retrievers and different tools and retrievers, and really being able to create an ecosystem of being able to pull data into the large language models.

And then there's D-S-P-Y. People call it DSPy. It came out more as, when you are a user [00:26:00] and you're asking a large language model something, a lot of times the users will use, you know, let's say, language that is broken language or something that's not a clear question.

And so there's this concept of query rewriting, meaning taking what the user has and putting it into a format that the large language model can see and understand better. So what the DSPy group did was they created this signature component, where what you can do is you can actually see all the questions and answers that people are asking, and sort of think of it like, you know, a music board where you have like bass and treble and reverb. You have these little knobs with the prompt, and what it can do is, when you have all these users sending in prompts, it can figure out the tasks that people are really asking, and then it can translate that [00:27:00] into these signatures so that when a user types something in, maybe it's written poorly, it hits DSPy and it transforms that into a signature.

That's a really well written prompt that goes into the large language model so that it can respond very well to you. So it's gotten a lot of traction lately. But again, these orchestration tools are very important for capturing the intent of the user and making the large language model do a task that's appropriate for whatever that intent is that the user wants to get out of the large language model.

John Willis: And, you know, you said it, LangChain is an appropriate name. So the idea is that, you know, if we take sort of ChatGPT, right, it's pretty straightforward, right? I mean, it's gotten better and obviously they've added more bells and whistles, but I put in a question, that question gets translated into a format [00:28:00] that's compatible with the large language model itself, and then through the magic of, maybe we'll get to, you know, embeddings and vectors and all that, it really sort of finds the most approximate answer, right?

But to your point, when the task needs to be more of a chain, like, hey, I really would like to go out and get some data from SQL, I'd like to move that in, and then I'd like to ask a question on top of that, that's where these orchestration engines sort of enhance the prompting mechanism, if you will.

Yeah, 

Joseph Enochs: We start at the very, you know, to your point, basic question and answer, which the bots are pretty good at. Again, if you want to move to private data, we have to give access to that private data. And I think you're alluding to what they're calling RAG, right? Retrieval Augmented Generation.

And that Retrieval Augmented Generation [00:29:00] requires that you be able to pull in that data. We really started initially with the vector databases. But over time, people realized, well, I have data that's structured data as well. I have data in graph databases as well. And actually, some of these formats can be more easily queried.

And so these orchestrators, you can have a router function. Let's say you have a question that comes in, and then there's a router function, which is an LLM. And it says, Hey, based on this question, where should I get this data? And then you can have tools in the background that says, well, if it's something about looking up, let's say, ITSM incidents, then I pull it right out of this incident table in the CMDB.

If it's something about a policy, well, I pull that right out of the vector store that has all of my 20,000 policies and procedures, right? If it's something off of an ontology for a particular task, I pull [00:30:00] that with Cypher from a graph database. And that gets into more advanced sort of RAG-type topics.
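The router idea above can be sketched as a function that inspects the question and picks a data source, then hands the question to the matching tool. In the conversation the router is itself an LLM; here simple keyword rules stand in for that call, and the tool functions are hypothetical stubs that only describe what they would query. None of these names come from a real orchestration library.

```python
def route(question):
    """Pick a data source for the question (keyword rules stand in for an LLM router)."""
    q = question.lower()
    if "incident" in q or "ticket" in q:
        return "sql"      # structured ITSM data, e.g. an incident table
    if "depends on" in q or "related to" in q:
        return "graph"    # relationship questions suit a graph database
    return "vector"       # default: policies and procedures in a vector store


def query_sql(question):
    return f"[sql] would query the incident table for: {question}"


def query_vector(question):
    return f"[vector] would search policy embeddings for: {question}"


def query_graph(question):
    return f"[graph] would run a graph query for: {question}"


# Hypothetical tool registry keyed by the router's decision.
TOOLS = {"sql": query_sql, "vector": query_vector, "graph": query_graph}


def answer(question):
    """Route the question, then call the selected tool."""
    return TOOLS[route(question)](question)
```

Keeping the routing decision separate from the tools means the rules can later be swapped for an LLM call without touching the retrieval code.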

And we can step into that. In addition, you have a concept where you have multiple steps. So let's say something as simple as I want to look at sentiment for customers, right? Which is the standard use case. If I ask an LLM, hey, what's the sentiment of, you know, all of these documents here that people have written, that might be hard for an LLM to do.

Whereas if I ask it two steps, something like, hey, read through all of these and give me a summary of all of these documents and then the second step could be read through all of these summaries and give me a sentiment, right? In that scenario, since I have two pieces, it may make it easier for the large language model to do a summarization task [00:31:00] and then a classification task instead of trying to ask it to do both.

And this is where these chains really start adding a lot of value, because I can make it where I can have sort of an atomic unit of, hey, LLM, do this one thing, and then take the output of that one thing and pass it to the input of this thing, and really break these complex tasks down into very simple tasks, to your point, which are chains, right?

And hence the name LangChain. Right, 
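A small sketch of the two-step chain just described: summarize each document first, then classify sentiment over the summaries, with the output of step one feeding the input of step two. Both steps would be LLM calls in practice; the stand-ins below (first sentence as the summary, a tiny word list as the classifier) and all names are assumptions for illustration only.

```python
POSITIVE = {"great", "fast", "helpful", "love"}
NEGATIVE = {"slow", "broken", "rude", "terrible"}


def summarize(doc):
    """Step 1 stand-in for an LLM summarization call: keep the first sentence."""
    return doc.split(".")[0].strip()


def classify(summaries):
    """Step 2 stand-in for an LLM classification call: tally sentiment words."""
    score = 0
    for summary in summaries:
        words = set(summary.lower().replace(",", " ").split())
        score += len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def run_chain(docs):
    """Chain the steps: the summaries from step 1 are the input to step 2."""
    summaries = [summarize(d) for d in docs]
    return classify(summaries)
```

Each step does one simple thing, which is the point of the chain: the classifier never sees the raw documents, only the shorter summaries.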

John Willis: right, right. Yeah. So yeah, the chains and then, and then just quickly, the context window is like you explain that a little bit, like it's, is it, you don't need to go too deep on it, but like, if that's, I mean, this is going to be a terrible way to explain it, but sort of a buffer of information.

That gets put into the large language model. 

Joseph Enochs: Yeah. So when we were first training transformers, there was this sort of pivotal paper that came out, I'm sure, to your point, your podcast, Attention Is All You [00:32:00] Need, right? There was this attention mechanism. And if you look at it, you'll look at these models that come out.

Like I mentioned, these open source models, if you go look at, you know, At hugging faces leaderboard for these open source models, you'll see these models come out as like instruct model and chat models. The instruct model is really kind of the root model that is created and inside of that root model. Is this this sort of attention window or attention mechanism and that's the way that these things are trained and early days what it was is you would take let's say this this chunk size and let's say 2000 tokens, right?

Or 2000 words. And what we would do is we would train it on. Okay, okay. Here's a token. Predict the next token. Here's 100 tokens. Predicts the next set of tokens. Here's 1000 tokens. Predict the next set of tokens. All the way up to here's 1, [00:33:00] 999 tokens. Predict the 2000th token. And because of that training, those early days of training, that's what created the context window.

Because the the chunks that of text that it was used to train on were were broken up into these positional embeddings that were 2000 chunks wide. So if you put 1000 words in, it could only predict back. Another thousand words to you because that was question and answer. And that's where the original context window came about.

And what ended up happening is they started training these models on larger and larger batches, right? And then you get a 16K context window. And then what ended up happening was they started figuring out, wait, mathematically, we can actually do some math here to make it so that, even though our original instruction tuning was done at 2K or [00:34:00] 5K or 10K or 16K, we can create these new attention mechanisms that can work around that.

And that's where RoPE, rotary position embeddings, and ALiBi and Ring Attention, these different mechanisms, really created a mathematical way so that we could have very, very long context windows. And you can see, like, Ring Attention for the open world model, where they can do, you know, a million tokens and they can do needle in a haystack.

And, you know, they actually test this. They take some weird piece of text that they put 500,000 words back, and then they ask it a question and it can pull it out. And that's because they have actually figured out, through these new attention mechanisms, how to expand that context window.

But it was really originally based on these, these chunks of, of training data that we utilize to train these transformers. 
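To make the training setup described above concrete, here is a toy sketch of how one chunk of text yields next-token examples, with the longest prefix being one token shorter than the window. It assumes whole words as tokens and builds the pairs explicitly; real training uses subword tokens and processes all prefixes of a chunk in parallel.

```python
from typing import List, Tuple


def next_token_examples(
    tokens: List[str], context_window: int
) -> List[Tuple[List[str], str]]:
    """Build (prefix, next_token) pairs from one training chunk."""
    chunk = tokens[:context_window]  # anything past the window is cut off
    return [(chunk[:i], chunk[i]) for i in range(1, len(chunk))]


tokens = "the cat sat on the mat today".split()
examples = next_token_examples(tokens, context_window=5)
for prefix, target in examples:
    print(prefix, "->", target)
```

With a window of 5, the longest example uses 4 tokens to predict the 5th, which mirrors the "1,999 tokens, predict the 2,000th" description.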

John Willis: So [00:35:00] the attention paper, you've always said this, and, you know, I think anybody who sort of goes to maybe the second level of trying to understand this stuff sees how important that paper was, Attention Is All You Need, by mostly Google.

And the way I understand it, and I'd love for you to sort of clean up my explanation of it, is they found that certain combinations of words, or certain words in a sentence or in a chunk, and a chunk really, in layman's terms, is just a paragraph, right? It's some set of words defined by, you know, whoever is the creator of, in this case, the vector, I guess we'll say, to keep it that simple for now.

But they found that, and I always like the sentence, I went to the bank to get my money, so the bank and the money would basically be words that create a level of sort of self-attention that takes it in a [00:36:00] direction that wouldn't go to river or river bank or something like that. Right. And I guess you're saying that that's what created a little bit of a shortcut from having to evaluate every word, putting higher precedence or weights on certain words that will have meaning in a sentence, rather than the other words that sort of could take you down the wrong path.

Joseph Enochs: Yeah, yeah, absolutely right. And like we talked about in that training, you know, they start with a single word or a single token, and then two words and then three words and four words.

And we'll just make it easier, to your point, in layman's terms, and think about a word. If I give you one word, I want you to predict the next word. It's actually a lower level, tokens, but if I give you three words, then it's probably easier for you to predict the fourth word.

And we call it, like, puzzle pieces, right? Imagine that original context window, [00:37:00] the full puzzle, was 2,000 puzzle pieces, right? And if I gave you, you know, 1,999 puzzle pieces, it would probably be very easy for you to predict that 2,000th puzzle piece, right? And so in the example that you talked about, if I gave something about a river bank, right, if I included the word river as one of those puzzle pieces, then for the next token, it would know that you were talking about, you know, a river bank, versus if you're talking about money and a bank, it would be associated with the money in the bank, right?

And as it trained on those things, there's a technical underlying component of that. There's actually a dictionary of, let's say, every word or every token. And it's actually predicting that word from the dictionary, and they have another sort of piece that's called the query, the key and the value.

So the query is the query that I'm asking. [00:38:00] And then what it's trying to do is look at all of the keys, every token, every possible element in the dictionary. And then it's mathematically calculating the value for the next token to predict. So it's using all of its training, based on your query, all the puzzle pieces that you've given it before, looking through all of the elements in, you know, the alphabet, if you will, to predict that next character.
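For readers who want the mechanics behind the query, key and value analogy, here is a minimal sketch of scaled dot-product attention as defined in the Attention Is All You Need paper: softmax(QK^T / sqrt(d_k)) V, shown for a single query. In the paper's formulation the keys and values come from the tokens already in the context, and a separate output layer scores the vocabulary; the tiny vectors below are made-up illustration values.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(
    query: Vector, keys: List[Vector], values: List[Vector]
) -> Tuple[Vector, Vector]:
    """Return (attention weights, weighted sum of values) for one query."""
    d_k = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k) for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Toy context of two tokens; the query lines up with the first key.
weights, output = attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
print(weights, output)
```

The weights always sum to 1, and the token whose key best matches the query contributes most to the output, which is the "higher weight on certain words" idea from the bank and money example.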

And that's where that sort of query, key and value, that's part of the transformer architecture, actually does what we call this autoregressive response. Let's say you gave it five words, or five puzzle pieces: it predicts the sixth puzzle piece. Then it uses those six puzzle pieces to predict the seventh puzzle piece, and so on and so forth, which is what they call the autoregressive nature of these large [00:39:00] language models.
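The autoregressive loop itself is simple enough to sketch. The "model" below is a hypothetical lookup table keyed on the last token, purely so the loop can run; a real model conditions on the whole sequence and samples from a probability distribution.

```python
from typing import Dict, List

# Hypothetical toy "model": maps the last token to the next one.
NEXT_TOKEN: Dict[str, str] = {
    "I": "went",
    "went": "to",
    "to": "the",
    "the": "bank",
    "bank": "<eos>",
}


def predict_next(tokens: List[str]) -> str:
    return NEXT_TOKEN.get(tokens[-1], "<eos>")


def generate(prompt: List[str], max_new_tokens: int) -> List[str]:
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        nxt = predict_next(tokens)
        if nxt == "<eos>":
            break
        tokens.append(nxt)  # each prediction becomes part of the next input
    return tokens


print(generate(["I", "went"], max_new_tokens=10))
```

Each predicted piece is appended to the input before the next prediction, which is exactly the "five pieces predict the sixth, six predict the seventh" pattern.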

John Willis: That's awesome. It's really good. So we've gotten it. So we did L for language model, O for orchestration, sort of LangChain, LlamaIndex, DSPy, and you've mentioned RAG, right? So we talked about RAG being, you know, retrieval-augmented generation, which means it could come from really sort of anywhere, right?

Structured data, unstructured data. But most of the conversation around RAG today is about vector databases. And so, you know, because I've been spending a lot of time working with MongoDB, you know, shout out MongoDB, they're one of my clients.

Joseph Enochs: Yeah, they have a great, great tool in Atlas.

John Willis: Yeah, Atlas Vector Search, right? And so sometimes, you know, like with all this stuff, and I think even you say this happens to you sometimes, you say, okay, I think I got it now, man. And then somebody will hit you with a question, or if you're [00:40:00] like me, I'll wake up in the middle of the night going, wait a minute, what happens after that?

And you're like, I know nothing, you know. So that was interesting. I think when you throw RAG into something like LangChain, which is sort of the 101, a lot of people start with a simple RAG with LangChain and, you know, maybe create their own little chatbot.

And now you've got a chatbot that works against your own data. Like, I learned a lot of this by just taking my Deming book, you know, Deming's Journey to Profound Knowledge, and throwing it into a RAG and then creating a little chatbot and figuring it out, because I knew my data.

Right. But when you do that, you know, that's sort of one example, which is a vectorized version, as opposed to going out to, like, Slack or API calls or SQL and throwing those in the context window. But on the vector database, right, there's kind of an interesting whole set of dynamics that happen, right?

You [00:41:00] literally have to take data. The 101 of this usually is take a PDF, like what I did with my book. Take my book, turn it into a PDF. I go through some process to vectorize. We already talked about chunks, so I kind of split up my book in a certain way into these chunks. But what is the whole vector thing?

And can you give a, you know, we're trying not to go to the fourth level. We're trying to stay at the tip of the third level, but a little more than the second level conversation of these things.

Joseph Enochs: In the simplest terms, a vector for natural language is a mathematical representation of the text, the language.

And before you actually get to this embedding model, they've done a lot of training on a model so that it can understand these words. Right. And at the end state, you kind [00:42:00] of get this black box, and it's words in and numbers out. But the beauty of that is that it's a one-to-one representation, meaning I can go from words to these numbers, and then I can pass those numbers back through this model and get the words. Right?

So these embedding models are, you know, just one of those amazing inventions of humanity. It's like if we look back at the, you know, sort of industrial revolution, electricity and steam engines, these embedding models are one of the things, kind of like the transformer, if you will, that is one of these inventions that will transcend time.

It's pretty amazing, right? You put in something and you get something known out. It kind of is like a compression algorithm, if you will. But the beauty of that is when you put it into these numbers, it can [00:43:00] sit, like, in a matrix, right? And that's exactly what it is: words in, and you get a row of numbers out, or a vector, and depending on the embedding model that you choose, you're going to get a fixed number of values out.

And that's what they call the dimensions of the model. Right. And so it can be like 384, 512, 1024. You know, if you look at OpenAI, they've got like 3K, they've even got some embeddings that are up to, you know, 8K, if you will. But what it is, is even if you put, let's say, one word in, you're going to get out 384 numbers that are representing that particular embedding of that one word.

So if you were to look at it in a table, let's say I have a document, like you mentioned your book, and let's say I have [00:44:00] 1,000 words of that document and I break it up into 100-word chunks. That would be basically 10 of these vectors that I've created. And then what I would do is I would say, table: column one is the ID, zero through nine. And then I would have 100 words of text. And then, let's say it was 384 dimensions, I'll have 384 numbers, comma-separated values, sitting in, you know, column three of this table, and I would have ten of those. And that representation would be the embedding of that actual document.
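The table described above can be sketched in a few lines: 1,000 words in 100-word chunks gives ten rows, IDs 0 through 9, each with its text and a 384-number vector. The `toy_embed` function is a hypothetical placeholder that derives numbers from a hash; it carries no meaning the way a trained embedding model does and is here only to show the shape of the data.

```python
import hashlib
from typing import Dict, List, Union

Row = Dict[str, Union[int, str, List[float]]]


def chunk_words(text: str, chunk_size: int) -> List[str]:
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)
    ]


def toy_embed(text: str, dims: int = 384) -> List[float]:
    """Placeholder embedding: deterministic numbers from a hash, NOT semantic."""
    values: List[float] = []
    counter = 0
    while len(values) < dims:
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        values.extend(b / 255.0 for b in digest)  # 32 values per digest
        counter += 1
    return values[:dims]


def build_table(text: str, chunk_size: int = 100, dims: int = 384) -> List[Row]:
    return [
        {"id": i, "text": chunk, "embedding": toy_embed(chunk, dims)}
        for i, chunk in enumerate(chunk_words(text, chunk_size))
    ]


document = " ".join(f"word{i}" for i in range(1000))  # stand-in 1,000-word document
table = build_table(document)
print(len(table), table[0]["id"], table[-1]["id"], len(table[0]["embedding"]))
```

In a real system the third column would come from an embedding model, and the rows would live in a vector database rather than a Python list.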

So the beauty of that is that that's, again, a mathematical representation of those hundred words. So let's say I took an SEC document, right? A document about Apple or NVIDIA. And I took that filing and I broke it up into these same chunks.

Well, there's going to be talk about their financials, there's [00:45:00] going to be talk about the board of directors, there's going to be talk about the CEO, there's going to be talk about different things in each one of these chunks, right? Well, if I take a question and I ask about the CEO, it's going to take my question and convert it into that same mathematical representation.

It's going to pass it through that embedding model, and then what RAG does is it executes a similarity search. There are multiple ways of doing the similarity search. They call it cosine similarity or inner product, outer product, MMR. There's a lot of different ways of doing it. But what's going to happen is it's going to pull back documents that have the word CEO in them. Those are the ones that are going to rate the highest, right?

Those are the ones that are going to exist in vector space closest to my question about the CEO. It's going to pull those relevant chunks back and give them to the LLM, and the large language model is going to read, who's the CEO?

It's going to see these however many chunks I give it, and one of them says [00:46:00] the CEO is Jensen Huang and he stated this on the call, et cetera, et cetera. The large language model is now going to have context on how to answer that question on who's the CEO. And that's really the foundation of retrieval-augmented generation and how these embeddings and vector databases actually work.
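Here is a minimal sketch of that retrieval step: embed the question the same way as the chunks, rank chunks by cosine similarity, and paste the top results into the prompt. The bag-of-words `embed` function and the tiny vocabulary are assumptions made so the example runs without a model; a trained embedding model matches on meaning, not just shared words. The chunk texts are invented placeholders.

```python
import math
import re
from collections import Counter
from typing import List, Tuple

VOCAB = ["ceo", "revenue", "board", "directors", "call", "quarter"]


def embed(text: str, vocab: List[str] = VOCAB) -> List[float]:
    """Toy embedding: counts of vocabulary words (a stand-in for a real model)."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [float(counts[w]) for w in vocab]


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: List[str], k: int = 1) -> List[Tuple[float, str]]:
    q = embed(question)  # the question goes through the same embedding
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question: str, retrieved: List[Tuple[float, str]]) -> str:
    context = "\n".join(chunk for _, chunk in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    "Revenue grew this quarter on data center sales.",
    "The CEO spoke on the earnings call.",
    "The board of directors added two members.",
]
top = retrieve("Who is the CEO?", chunks, k=1)
print(build_prompt("Who is the CEO?", top))
```

Only the retrieved chunks reach the model, which is how a large corpus can be queried through a limited context window.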

John Willis: And, you know, one of the things that I try to explain to people is, you know, when he talks about dimensions, think about a two-dimensional, or just a Cartesian plane, you know, x and y, right? And if you had the word dog, right, the position on that plane would maybe be five, eight, you know, x is five, y is eight, right?

And food might be, you know, x is seven, y is nine, right? And then it would just be sort of like, you know, what's the distance? So when he talks about similarity searches, on a two-dimensional plane, what would [00:47:00] be the distance between those? And maybe instead of dog, food, maybe dog, cat, that would be a lot further away than food.

But when we talk about these 384 or 1536 or 8000 dimensions, you know, let's go from two dimensions to three dimensions: dog would be x equals five, y equals eight, and z equals, you know, 11. But now we're talking about, and this is where I think what these people came up with is just so frigging brilliant.

Like, what if we went to 1,500 or 384 dimensions, right? So cat would be, you know, whatever, the 384 values usually show up as floating point numbers visually in a GUI that shows you this. But those will be all the positions. So literally, the word cat would have, you know, 384 [00:48:00] numbers.

Which are just representing, instead of x, y, or x, y, z, the x1 to xn positions in the vector, and then it's really the same math, right? It's, like you said, cosine similarity, if that's the choice, which seems to be the most common, prevalent use case.

Joseph Enochs: Everybody starts there, right? Okay.

Right. 
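The "same math in any number of dimensions" point can be shown with one distance function. The sketch below uses straight-line (Euclidean) distance because it matches the Cartesian-plane picture; cosine similarity generalizes the same way. The dog coordinates come from the example above, food is read as (7, 9), and the 3-D and 384-D values are made up for illustration.

```python
import math
from typing import Sequence


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance; identical code for 2, 3 or 384 dimensions."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two dimensions: dog at (5, 8), food at (7, 9).
print(distance([5.0, 8.0], [7.0, 9.0]))

# Three dimensions: dog's z = 11 is from the example, food's z is hypothetical.
print(distance([5.0, 8.0, 11.0], [7.0, 9.0, 11.0]))

# 384 dimensions: nothing changes except the length of the lists.
a_384 = [0.0] * 384
b_384 = [0.0] * 383 + [3.0]
print(distance(a_384, b_384))
```

The length check is also why a question has to be embedded with the same model as the stored chunks: vectors of different dimensions cannot be compared.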

John Willis: Yeah. No. And the other thing, I think, too, which is interesting, that I didn't realize at first, is just one of the things that the orchestrators do for you is, you know, like he said, somebody's got to convert your question into the same dimension. So if I wanted to ask a question, or I'm hoping to get something related to dog food, without an orchestrator, I have to take my question and I have to put it into the same dimension vector that my vector database is in, which is 384, right? Which has dog food. So the orchestrators, I didn't realize they'll actually [00:49:00] do that conversion for you under the covers, and turn a question into the embedding format, the high-dimensional vector that's consistent with the dimension of the vector database, right?

Is that correct, or?

Joseph Enochs: Yeah, how this manifested for us, right, as a community, is it really, unfortunately, started because of that context window dilemma, right? You know, I could have given an LLM a document and said, hey, you know, what's the price of dog food, right? The problem is that if that document was more than 2,000 words long, in the original days, I would have never been able to put the document in there to begin with.

Right? So we had to create a work around and that work around really became known as retrieval augmented generation. And why is that? It's because I could take a large amount of documents, which they call a corpus of documents, right? And I could [00:50:00] put that into this vector database. And then when I asked a question to your point, I could take my question and search for all the relevant chunks and bring only back the number of relevant chunks that my model can handle, right?

And if it was about dogs and cats or this or that, it would pull back relevant chunks and I wouldn't get this error like, you know, you're out of tokens, right? So that's what really kicked it off in the beginning, context length and the limits on what we could do with a large language model. But it really started becoming more than that, because not everybody just wants a simple chat interface, right?

Chat with your documents. People want to be able to ask questions about standard operating procedures, right? They want to ask different things about this type of document or that type of document. Let's say insurance. Somebody wants auto insurance or life insurance or, you know, these types of things. So people were creating [00:51:00] different types of document stores or, you know, corpuses of data.

And then in a RAG, you can have multiple retrieval-augmented generation sources for it to pull data from, to your point. So it was really the orchestrators themselves that helped us make our large language models and our, let's say, agents, if you will, or our tools, more general purpose, because we could have different vector databases.

We could have different routing functions, and we could give the LLM choices: if somebody asks about this type of document, I'm going to pull it from this vector database, and I'm only going to pull back the relevant chunks from that vector database. Whereas if I'm asking something about, like I mentioned before, incident management, maybe I pull that out of the incident management table in my ServiceNow instance, for example.

But that's where, again, to your point, the power of these orchestrators [00:52:00] really comes through. It's really based on, again, intent and action and tasks that I want to accomplish.
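A routing function of the kind described above can be sketched as a lookup from intent to data source. Everything here is hypothetical: the two search functions stand in for a vector store query and a ticketing-system query, and `classify_intent` uses keyword matching where an orchestrator would typically ask the LLM to choose.

```python
from typing import Callable, Dict, List, Tuple


def search_policies(question: str) -> List[str]:
    # Stand-in for a similarity search against a policy vector store.
    return [f"policy chunk relevant to: {question}"]


def search_incidents(question: str) -> List[str]:
    # Stand-in for a query against an incident management table.
    return [f"incident record relevant to: {question}"]


ROUTES: Dict[str, Callable[[str], List[str]]] = {
    "policy": search_policies,
    "incident": search_incidents,
}


def classify_intent(question: str) -> str:
    """Stand-in for an LLM routing decision: simple keyword match."""
    q = question.lower()
    if "incident" in q or "outage" in q:
        return "incident"
    return "policy"


def fetch_context(question: str) -> Tuple[str, List[str]]:
    route = classify_intent(question)
    return route, ROUTES[route](question)


print(fetch_context("What is the travel expense policy?"))
print(fetch_context("Which incident caused last night's outage?"))
```

Adding a new data source means adding one entry to `ROUTES` and teaching the classifier about it, which is what makes the same agent more general purpose.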

John Willis: And I think some people get confused too, because, like, okay, now I get all this, but now can't I just go into Claude and load, you know, a five megabyte PDF in there, or ChatGPT, you know, load PDFs in? I mean, I think Claude will give you what, five, 10 megabytes.

I don't know. I forget what it is. But, you know, so then why do I need all this RAG stuff? But then one of the things I've learned a lot from you about is, you know, there's a whole, I'll call it, sort of engineering process to get more accurate answers.

So, you know, we've sort of gone up the chain of, okay, first it was, can we load data in the context window, the context window was too small, so let's create vector databases, so we chunk it up in some reasonable sense, maybe every thousand characters or every five hundred characters, and, oh, [00:53:00] good, now we've solved this.

So the context window has grown, so can't we just load our whole, you know, book or whatever, or 10-K, you know, SEC filing or whatever, into Claude and ask questions? But then comes the difference, like, the work you have to do, which you spend a lot of time on and have taught me a lot about: getting your data loaded in the vector database in a way that creates much more accurate answers. And, for example, where the answers are high consequence if they're wrong.

Joseph Enochs: Yeah. I mean, you know, over time, if we look at, you know, technologists, we're all technologists, but we ultimately want to create business value, right?

And so that goes towards, you know, where do we get the business value? Let's say we're at a call center and we've got a product that's really exciting, and a lot of people are calling into the call center, and I've got, you know, [00:54:00] 500 people in the call center, and all of a sudden our new product comes in and, you know, we've got a million people calling in to check on this, and our call center's overwhelmed.

Well, if I can develop an AI agent that can handle, let's say, simple things like you're talking about, I can ask a simple question, it does simple retrieval-augmented generation and pulls back, what's the policy for this, or what's the policy for that? Well, I don't have to burden my call center with that volume of calls.

So now, to your point, if I take, you know, car insurance or auto insurance, and I have these multiple sets of data and more advanced retrieval-augmented generation, I can pull back specific data pieces. And now again, there's less pressure on my call center. Well, then I can have various tasks, like I want to actually do something, right, I want to create a new policy or things of that nature.

This is where sort of ReAct [00:55:00] agents come in. And in that scenario, that's where I start chaining various things together, chaining documents together. And what happens is, to your point, in that scenario, I can't just paste one document into Claude or whatever and ask it, like, how do I create a policy on this document?

It would get confused in the steps that it needs to do. But since the humans get involved and they say, oh, if it's an insurance policy, first look up the username, and then, based on the username, look at which policies they have. And then once you have the policies they have, get the terms of their policy, and then validate with the customer.

I see here that you have this auto insurance policy and this auto insurance policy. Which policy are you asking about? Oh, I'm asking about policy number two, right? Oh, and policy number two, and then the LLM can pull back all the specifics of policy number two, what the policy limits are, all those various details, right?

[00:56:00] And then say, hey, you know, if you want to extend this policy, you could look up the options for extending the policy. Those are behaviors that start gaining business value, again, because now I'm pulling pressure away from my call center. But that's not something that a Claude or an OpenAI can do natively, right?
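The human-defined steps in that insurance walkthrough (look up the user, list their policies, ask which one, then fetch the details) can be sketched as ordinary functions that an agent would call as tools. The records, names and limits below are invented for illustration, and the control flow is hard-coded where an agent framework would let the LLM decide which step to take next.

```python
from typing import Dict, List, Optional

# Hypothetical policy records keyed by username and policy number.
POLICIES: Dict[str, Dict[int, Dict[str, object]]] = {
    "alice": {
        1: {"type": "auto", "limit": 50000},
        2: {"type": "auto", "limit": 100000},
    }
}


def list_policies(username: str) -> List[int]:
    return sorted(POLICIES.get(username, {}))


def policy_details(username: str, number: int) -> Dict[str, object]:
    return POLICIES[username][number]


def handle_request(username: str, chosen: Optional[int] = None) -> str:
    numbers = list_policies(username)                 # step 1: look up the user
    if not numbers:
        return "No policies found."
    if len(numbers) > 1 and chosen is None:           # step 2: validate with the customer
        return f"You have policies {numbers}. Which one are you asking about?"
    number = chosen if chosen is not None else numbers[0]
    if number not in numbers:
        return f"Policy {number} was not found."
    details = policy_details(username, number)        # step 3: pull the specifics
    return f"Policy {number} is an {details['type']} policy with a limit of {details['limit']}."


print(handle_request("alice"))
print(handle_request("alice", chosen=2))
```

Each step is small and checkable on its own, which is what makes the overall behavior easier to validate than a single prompt over one pasted document.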

So this is where, to your point, the enterprises gaining business value have a lot more need to have accurate responses. They have more need to be able to validate what the LLM is doing. And then, to your point, we have tasks that we can chain together. And with that, there's a lot of different strategies.

And I'll just give you one example of a strategy that we do: we can break the documents up into different factoids, right? Or smaller pieces of data, like, you know, a sentence or a paragraph or a full [00:57:00] document. And in that retrieval, maybe the sentence, and it's actually more sparse than this, the math behind it.

But just, you know, to keep it at a higher level, if I'm looking at sentences, maybe I can pull back sentences very well, but I need a specific chunk that has the policy. And then from that, I can pull back the chunk to answer the question, or I need the full page to get the understanding of policy limits, but it may have been more difficult for me to find out, to your point, how to get that actual page.

But through that process of, oh, I'm getting auto, oh, I'll get this sentence, that sentence refers to this chunk, I read that chunk, oh, that chunk is, you know, 85 to 90 percent this document, I pull back that specific document. Now I can really have a high degree of certainty that I'm responding back to the customer in an accurate way.
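The sentence-to-chunk-to-document strategy described above can be sketched as a small index where each sentence points to its parent chunk and each chunk points to its parent document. The data and the word-overlap scoring are assumptions for illustration; a production system would match sentences by embedding similarity.

```python
import re
from typing import Dict, List, Set

# Hypothetical index: sentence -> parent chunk -> parent document.
SENTENCES: List[Dict[str, str]] = [
    {"text": "Auto policy limits are listed in section four.", "chunk_id": "c2"},
    {"text": "Life insurance requires a medical exam.", "chunk_id": "c7"},
]
CHUNKS: Dict[str, Dict[str, str]] = {
    "c2": {"text": "Section four: auto policy limits and deductibles.", "doc_id": "auto_policy"},
    "c7": {"text": "Section two: life insurance eligibility.", "doc_id": "life_policy"},
}
DOCUMENTS: Dict[str, str] = {
    "auto_policy": "FULL AUTO POLICY TEXT (placeholder)",
    "life_policy": "FULL LIFE POLICY TEXT (placeholder)",
}


def words(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def small_to_big(question: str) -> Dict[str, str]:
    """Match on the smallest unit, then walk up to the chunk and the document."""
    q = words(question)
    best = max(SENTENCES, key=lambda s: len(q & words(s["text"])))
    chunk = CHUNKS[best["chunk_id"]]
    return {
        "sentence": best["text"],
        "chunk": chunk["text"],
        "document": DOCUMENTS[chunk["doc_id"]],
    }


print(small_to_big("What are my auto policy limits?"))
```

Small units are easy to match precisely, and the parent links recover the larger context needed to answer with confidence.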

John Willis: Yeah, no, that's, that's great. And I guess to sort of end up a little bit is, you know, the sort of last piece is [00:58:00] agents. You, you mentioned a little bit about the agents. They seem to, like, for me, like, okay, I'm learning everything I can about RAG and sort of getting the language model orchestration.

I skipped observability, but we can maybe do that some other time. You know, observability for AI is a whole other interesting conversation. But the agents, you know, Patrick has created a really nice workshop on that recently. And I guess I didn't really understand, you know, exactly what was going on here.

And I guess I still kind of don't. I've run through it, I've gotten a couple to work, but it seems like they're just looping on questions. I know that's a very terrible way to describe it, but they seem to be loops on, you know, queries and responses. Can you kind of expand on that a little bit?

Joseph Enochs: Yeah, I think, you know, if we pull back [00:59:00] to our DevOps world and our value stream world, imagine you have an outcome. Right. And then in the value stream, we have our value-added tasks. So think of each step along the way of an agent chain as a value-added activity, right? And what do we do?

We can't throw things over the fence, right? Because what happens, right? It's like, you know, software throwing it over to operations, right? You know, it's not going to work. So the same exact thing happens in these agentic frameworks. You start with an outcome, you break down all the value-added tasks.

Each one of those value-added tasks can be an individual agent. Now you have to have communication from agent one to agent two to agent three, to the final outcome. Some of those agents may need to be recursive loops to check the work of a previous agent, right? But that's kind [01:00:00] of how these agentic frameworks are coming about.

And you'll see LangGraph, you'll see CrewAI, right? You'll see many of these agent frameworks. MetaGPT is an agent framework. There's a lot of, you know, obviously OpenAI has their frameworks as well. Azure OpenAI has their API frameworks. But at the basic level, all an agent is really doing is the value-added tasks of a value stream.

And then the more context that you can give from agent one to agent two to agent three, the less likely they are to hallucinate, because they do hallucinate. But I really liken it to DevOps, right? We have to know what the agent in front of us needs to receive in order to do their work.

And if the agent before it doesn't give it that context, there's a high likelihood that [01:01:00] it's going to do the task incorrectly. It's just like people, right? Now again, we can stitch these frameworks together. We can make them as complex as we want to, but in reality, if we don't build them correctly, they will go off the rails, and they will go off the rails quickly, because they're not people, right?

And it's programmatic, meaning it's only, let's say, a DAG. It goes from step one to step two to step three, and then it's done. There's no way that, you know, the agents can get on the phone and sort of call each other unless you program that in. But at a high level, that's what these agent frameworks are meant to do.

They're meant to handle complex tasks, just like people.
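The value-stream picture above, where each agent receives context from the one before it and a reviewing agent checks earlier work, can be sketched as a linear pipeline over a shared context dictionary. The three agents are hypothetical plain functions standing in for LLM-backed steps, and this does not use any agent framework's API.

```python
from typing import Callable, Dict, List

Context = Dict[str, str]
Agent = Callable[[Context], Context]


def draft_agent(ctx: Context) -> Context:
    return {**ctx, "draft": f"Report on {ctx['topic']}"}


def review_agent(ctx: Context) -> Context:
    # Checks the previous agent's work before anything moves downstream.
    approved = ctx["topic"] in ctx["draft"]
    return {**ctx, "approved": "yes" if approved else "no"}


def publish_agent(ctx: Context) -> Context:
    # Fails loudly if the upstream context it needs was never handed over.
    if ctx.get("approved") != "yes":
        raise ValueError("publish_agent did not receive an approved draft")
    return {**ctx, "published": ctx["draft"]}


def run_pipeline(agents: List[Agent], ctx: Context) -> Context:
    for agent in agents:  # step one, step two, step three, then done
        ctx = agent(ctx)
    return ctx


result = run_pipeline([draft_agent, review_agent, publish_agent], {"topic": "Q2 churn"})
print(result["published"])
```

Dropping `review_agent` from the list makes `publish_agent` raise, which is the code-level version of throwing work over the fence without the context the next step needs.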

John Willis: Got it. Yeah. Okay. I think we did a pretty, just, folks, if you're thoroughly confused right now, you know, Joseph's gone through this a number of times with me, and each time I'm like, okay, now I get it a little more. So I think it's like anything else.

You've got to work with this stuff. [01:02:00] And I'd like to hear you comment on this. I think one of the things that's really interesting is for people who don't have sort of, you know, proper AI training, or don't think in a non-deterministic way.

You know, Patrick was doing a workshop, we were in Amsterdam, on his agent stuff, right? And he had demoed CrewAI, and somebody had asked, well, how does it know what to do? Who tells it what to do? You know, he defined, like, an operations person, a developer, you know, different sort of, I guess, I don't know if you call them personalities, but whatever.

And so, the crew definition, and I could see this person struggling with, like, how is it knowing to be that crew member? And I think that's the hardest part. Like, sometimes I [01:03:00] find myself, you know, trying to step out of my, you know, 40 years of hardened, calcified thinking, very deterministically, about, you know, if A, then B, if C, then whatever, right?

Like, we think everything works through a bunch of case or if statements, and this is a whole world where, you know, I think you call it sort of model programming versus, you know, just classic programming, right?

Joseph Enochs: Yeah, I think, to your point, if we break it down in the simplest terms, we can tell it through a system prompt. You know, you have a user prompt and a system prompt, and in the system prompt, we can tell the AI some instructions, right? You can even experiment with this at home in ChatGPT. You [01:04:00] can say, hey, ChatGPT, you're the best accountant in the world and you know everything about accounting documents.

I'm going to give you this document, tell me what it means. Or, you're a data scientist, right? Read through this data and do complex data analysis tasks and output the standard format for complex data analysis tasks. And because you told it that it's a data scientist, and it's been trained on general knowledge, right, on all of the best practices for a data scientist, it would know the data science life cycle.

And if you say, I'm going to give you some data, output some, you know, attributes for data science, that's how it's going to know how to do it. And all a crew is, is a series of prompts that give personality to a large language model along that chain.

So again, if I'm doing a task about accounting or a task about reporting, maybe the first task [01:05:00] is, read through this data and clean up this data so that an analyst can do exploratory data analysis on the data, right? And you're outputting a table for an analyst to do exploratory data analysis.

So I drop in a file, it hits that LLM or agent with the prompt that you've told it what it is, and it outputs a table. Now the output of that table goes to the input of another prompt. It can be the same LLM, right? But just imagine now I've given a new system prompt that says you're an expert at exploratory data analysis.

Grab all the important features out of this exploratory data analysis and, based on those features, output a heat map, right? It's going to then read all of that data and it's going to do its exploratory data analysis on features. And then it's going to spit out a heat map, as an example. And as we go through that same sort of thing in the value stream that we talked about, ultimately, we [01:06:00] want to get a report to an executive.

Well, we take that report to the executive and we step back through those value-added tasks. We try to put those value-added tasks into natural language about the personality of the person who would actually do it. And then we figure out the inputs and the outputs of each one of those agents. And that builds, to your point, this crew of agents that work together to accomplish this task.
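
The chain described here, where each step is a system prompt that gives the model a persona and each output becomes the next input, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tooling discussed in the episode: `Agent`, `run_crew`, and `fake_llm` are hypothetical names, the three persona prompts are made up for the example, and `fake_llm` is a stand-in for a real model call so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List

# An LLM call is modeled as a function: (system_prompt, user_prompt) -> text.
LLM = Callable[[str, str], str]


@dataclass
class Agent:
    """One step in the chain: a system prompt that gives the model a persona."""
    name: str
    system_prompt: str

    def run(self, llm: LLM, user_input: str) -> str:
        # The persona goes in the system prompt, the data in the user prompt.
        return llm(self.system_prompt, user_input)


def run_crew(agents: List[Agent], llm: LLM, initial_input: str) -> str:
    """Feed each agent's output into the next agent as its user prompt."""
    data = initial_input
    for agent in agents:
        data = agent.run(llm, data)
    return data


def fake_llm(system_prompt: str, user_prompt: str) -> str:
    """Hypothetical stand-in for a real model: it only tags the input with the persona."""
    persona = system_prompt.split(".")[0]
    return f"[{persona}] processed: {user_prompt}"


# Illustrative personas (assumptions), loosely following the clean -> analyze -> report flow.
crew = [
    Agent("cleaner", "You are a data engineer. Clean this data and output a table for an analyst."),
    Agent("analyst", "You are an expert at exploratory data analysis. Pick the important features and describe a heat map."),
    Agent("reporter", "You are an executive report writer. Summarize the findings for an executive."),
]

report = run_crew(crew, fake_llm, "raw_sales.csv contents")
print(report)
```

Swapping `fake_llm` for a function that calls a real model would leave the rest unchanged, which is the point of the design: the same LLM can sit behind every step, and only the system prompt and the input differ.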

John Willis: Perfect. Well, cool, man. I think it was a good baptism by fire for a lot of people who are just maybe hearing about this. I mean, everybody's heard some of this stuff, right? Maybe a lot more than I assumed. But I could share a couple resources too at the end. Yeah, we'll put those in the show notes. So you send us those to put in the show notes.

So, I guess the thing I would say is, how do people find you, Joseph?

Joseph Enochs: Yeah, you can look me up on LinkedIn. Joseph Enochs on LinkedIn, for Enterprise Vision Technologies. I write a [01:07:00] newsletter for Countdown to AGI. I don't release it every month, but I will get back on track.

We're trying to take in feedback on the format for upcoming articles. So subscribe to the newsletter and give us feedback on any formats that you want. But yeah, check me out on LinkedIn. I'd love to connect and love talking about AI. And John, I can't thank you enough for having me on your show.

And 

John Willis: That's great. I think people are going to learn a lot of stuff. We'll do some more of these too. Maybe when I get the proper Attention Is All You Need, we can do, you know, kind of a recurring thing, where we can sort of dive deeply into some of the other stuff you shared with me, which is just mind-blowing. And the other thing I totally agree with is, you know, I had a networking startup a while back, and I did a bunch of Coursera classes on, you know, deep networking, and [01:08:00] you start realizing how smart those guys were when they were building, you know, literally all the networking stuff and, you know, routing protocols.

And it was just like, you know, I mean, like.

Joseph Enochs: Postel and all those RFCs and, you know, to your point, routing protocols, TCP and IP. Yeah.

John Willis: Yeah. Like MPLS and all that, you know, like, how da Vinci-like. But then you start learning about the stuff that led us up to this generative AI, right? It's just, you know, the ideas that people came up with that solve these incredibly interesting problems, you know, in a way, like you said, the whole vector thing, turning sort of physical things like words or [01:09:00] sounds or, you know, pixels into these mathematical, you know, matrices, and then being able to just spit them right back out as they were. It's just, it's like ridiculously clever.

Joseph Enochs: And I would say the exciting thing, you know, again, if your audience is sitting back, waiting, saying, when should I jump in? You know, we are early, but you need to jump on this train at some point.

John Willis: I agree. 

Joseph Enochs: Because this train is really speeding up. And you know, at some point you're going to be sitting at the train station and it's going to fly by. And like I mentioned, if you look at the first industrial revolution, second industrial revolution, third, we are really at the cusp of this industrial revolution where AI is going to lead us, you know, down this [01:10:00] pathway.

And, you know, a hundred years ago with the automobile, you know, 200, 300 years ago with the steam engines and electricity, this is that same time, where, you know, the Edisons of the world, right, the Teslas of the world were doing things with electricity or the automobile or the steam engine. We're exactly in that situation right now with AI, and you really should, you know, dedicate yourself. Maybe you as a person, right? Give yourself a weekly task, and, you

John Willis: know, 

Joseph Enochs: Yeah, and maybe hold yourself accountable, like, this week I'm going to research one thing about AI, and then check in on yourself, right? Did you give yourself a green, a red, or, you know, a yellow? And definitely I recommend doing that.

John Willis: Denial, like, to do some corny cliche, like, denial is a river you just don't want to be on right now.

I mean, we can talk about the [01:11:00] dangers and the ethics, and there's a lot there, and there's a lot of things like, you know, this whole AGI conversation. Are we there? Are we close? You know, I just saw a podcast with Melanie Mitchell, whose book I think is the best book so far.

I've read now 12 books on the history of AI. Melanie Mitchell's book titled AI is the best book, and I just saw her with Lex Fridman and she says 100 years. And so the mileage varies on the experts on this count, but the point is, who the F cares really at this point? To your point, just start. You're not going to eliminate humankind by doing one task a week and learning something about AI.

And, you know, the train is gone, the river is flowing. It's not stagnant. You know, just trying to make all the arguments of why this doesn't work and you can't use it. I mean, you know, Linux, [01:12:00] Linux in the enterprise.

I mean, I go back so far that I remember a time where large banks would tell me, we'll never run Linux in production in a bank, you know, we'll never run cloud in a bank. We'll never

Joseph Enochs: virtualize. It's going to be bare metal. We're never going to run a container.

John Willis: That's right. Yeah. It's just like, trust me, you know, you will be running AI. So, you know, again, I'm not dismissing the dangers, the hallucinations. It's a different paradigm. You know, again, Joseph and I, this summer you'll see a paper a bunch of us have written as part of IT Revolution called Dear CIO, where it's, you know, telling the CIO the dangers of ignoring this.

So, you know, we're trying to track it as responsibly as we can, but I agree with you. Do something each week. Even if you want to make a defensive argument [01:13:00] against it, at least go in and do the homework. Don't be one of these, like, all these brilliant people I know that are just making these arguments against this stuff and they're not even trying it, right? That's not the right way to approach this. So, anyway.

Joseph Enochs: Yeah, because those things, you know, whenever you find things that you disagree with, or discourse, right, that should motivate you even more to validate. I mean, that's the way I do it myself.

I always, to your point, play devil's advocate, and I'm like, but you know what? I need to really figure this out. And a lot of times, maybe from week to week, my assumptions may have changed based on things that, you know, the government has released, or some large entity that had been thinking thousands of hours on this problem and said, actually, no, we've tested it.

We validated it. It's specifically, precisely this. This is what the risks are. This is how we mitigate the risks. And under these circumstances, it's good enough [01:14:00] for these types of activities. And I'm like, see, there I was assuming that they'd never figure this out, but somebody was actually thinking about it.

So to your point, I would say, validate those assumptions and find out for yourself. Whenever you have this discourse, use that to motivate yourself, whether it's hallucinations or whether it's toxicity or whatever it is.

John Willis: Yeah, I can attribute it all the way

Joseph Enochs: back

John Willis: to Deming, after all. Well, Joseph, thank you so much, man.

It's been great. So

Joseph Enochs: As always, thank you, John. Looking forward to doing more of these, and it's great. And as always, thank you for everything that you do for the community. I very much appreciate you.

John Willis: I got you, buddy. All right.