
Ultimately, GenAI has the potential to create jobs in areas like data science and cybersecurity, but only if it is integrated responsibly. However, the consensus today is that AI should remain just a tool for analysts, not a direct decision-maker, until reliability and trust in its outputs significantly improve.

About Brian Fuller

Brian A. Fuller is Director of Operations for the Ridge College of Intelligence Studies and Applied Sciences at Mercyhurst University, a position he assumed in December 2019. As the Director of Operations, Fuller supports all operations related to the academic curriculum and Ridge College activities. This includes working as the Director of the Center for Intelligence Research, Analysis and Training (CIRAT) and Director of the Innovation Entente Lab (IEL). Previously, he served as a Senior Open Source Intelligence (OSINT) instructor for the Department of the Army’s OSINT Office, where he was charged with overseeing the Army’s OSINT training program for the Midwest and Rocky Mountain regions. He trained Army intelligence professionals at the strategic, tactical, and special operations levels. He managed the training curriculum, personnel, financial, and administrative affairs of the program while participating as a subject matter expert in the intelligence community’s OSINT program and operational development working groups, ensuring the continued growth of the discipline and associated tradecraft and technologies.

[00:00:00] Welcome to Needle Stack. I'm your host, Robert Vamosi. And I'm your co-host, AJ Nash. And our guest today is Brian Fuller, Executive Director of the Center for Intelligence Research, Analysis and Training, the CIRAT, at Mercyhurst University in Erie, Pennsylvania. Previously, he served as a senior open source intelligence instructor at the Department of the Army's OSINT office.

[00:00:40] Brian, welcome, and let me know if I missed anything in that intro. I also serve as an adjunct professor teaching our open source intelligence and managed attribution courses in our intelligence studies program. And I'm very excited to be here today, and thank you for having me. Yeah. Well, thanks for being on today, Brian.

[00:00:59] I appreciate it. Usually we do a prep call, for anybody who doesn't know, and we all get together. I didn't get to do it this time. Rob was able to prep without me. So Brian and I are actually meeting for the first time today. This poor guy is stuck with me live. But an amazing background.

[00:01:11] Listen, I'm very familiar with Mercyhurst. I've worked with people who graduated from the program. It's a great program. I recommend it to lots of people. If you wanna get into intel and you're not coming outta the military space, for instance, if you wanna take the academic route, Mercyhurst is fantastic.

[00:01:23] Your background, obviously, having been in, you know, the Army for 24 years, I believe, is also great, so I appreciate you being on. It's gonna be fun to have a conversation with a real intel professional. And then I'm gonna have a whole bunch of arguments with you, I think.

[00:01:34] 'cause I know what we're talking about today, and you and I have some differing opinions, which will be fun. But you know, that's how we learn, right? So listen, I wanna jump right in. All right. So we're intel pros, right? This is our background. And OSINT, all-source, you know, multi-source intelligence, right?

[00:01:49] For, I don't know, as long as I've been in intelligence now, 30-something years, the government's been trying to get rid of intel people. I remember getting to NSA in 1999 and literally, first month at the agency, being told, you know, enjoy it while it lasts. You're all gonna be replaced. I was like, what? Like, I just, I'm here.

[00:02:05] I just got here. What are you talking about? You know, I'm an airman, fresh out of, you know, training. And they're like, yeah, yeah, you're all gonna be replaced. Machines are gonna do all your jobs. You know, tens of thousands of intel people are gonna go away. And decades later, that hasn't happened. You know, intel has just grown. But here we are again at another moment where that's coming up, and now it's AI and generative AI.

[00:02:21] Oh, that's it, that's the end of intel. It's all gonna be done by machines. And I know you are an expert in this area, so I wanna ask you your thoughts, and we'll kind of get an open round on this, on where you think AI fits into, you know, intelligence, real intelligence, government work, intelligence professionalism, and what you see as the future in that area.

[00:02:37] So first, let me put everybody's mind at ease. Gen AI is not going to replace the analyst. Alright. It's been nice having you on, Brian. Let's have a good, no, keep going. But it is going to replace the analyst who doesn't know how to leverage Gen AI. It will. And what I mean by that is, Gen AI is a tool.

[00:03:05] Just like any other tool or platform or database we have. Think of Gen AI, we look at it, as a research assistant, right? It's another member of your team. I look at it as a way to enhance the efficiency of doing analysis and providing more timely, relevant, and more robust information in answering your decision-maker's requirements or your operational needs.

[00:03:33] So if I told you you could spend five of your seven days working on a project actually analyzing the information and putting it into actionable intelligence products, you wouldn't tell me, you're crazy, no, I'd rather waste all my time doing the research collection, having to find the info, identify all the links and patterns, find the actionable nuggets of information, then pulling it all in and doing analysis.

[00:04:01] So of those seven days, you might get two days to actually do the analysis side, because you've also got the production side, right? So now you're taking time off the back end. But if I told you I can give you a capability that will allow you to retrieve the information, analyze the, or not analyze, but retrieve it.

[00:04:23] Put it into products, or put that data, normalize that data, in a way that you can analyze it. Spend your time analyzing it, identifying the intelligence gaps, filling those intelligence gaps, spending more time analyzing it. And then you also didn't have to spend hardly any time actually creating the products.

[00:04:43] And then you're doing your editing of the information and then sending it off to your decision maker. That's what we're getting at with Gen AI as a capability. So you're talking about the traditional 80-20 problem that, I mean, we used to talk about, yeah, forever ago, right? And for those who don't know what I'm saying, when I say 80-20 problem, every intel pro I've ever worked with will tell you the same issue we had is we spent 80% of our time on collection, on just getting things together, organization.

[00:05:10] And then we only have like 20% of our time, maybe, to actually do analysis, because you got collection on the front end and vetting and all this organizational stuff and the de-duping and all the normalization. And then of course, like you said, you have the production on the back end. We actually have to do the writing and the editing, et cetera.

[00:05:22] And you have this small window in the middle where you can actually do the analysis piece. So what you're saying is what we've all wanted to do forever, which is flip that ratio. What if I only had to spend 20% of my time on the collecting part and the writing part and the authoring, you know, the administrivia as a lot of us would call it, and I can spend 80% of my time actually on the thought?

[00:05:38] In theory, that would be fantastic. We've all wanted that for a long time, so I'm excited to hear that. But here's a question I have. So you said, hey, think of it like a research assistant, like a teammate, which is great. Most of us work on intel teams, you know, whether in the agencies or in private sector, you know, 5, 6, 10 people on the team.

[00:05:57] And you're saying that this could act as one of those assistants. Doesn't that still end up replacing somebody, you know, the junior people, et cetera? And my concern, I'll be really honest, is the economics, right? So if it weren't for economics, I don't think I'd worry about much of any of these things.

[00:06:12] But what I have seen is, while the academic discussion is, no, no, this won't replace people, it'll make people better, you'll be superhuman: instead of one person's job, you can do the work of 10 people. So now you have a 10-person team that can do the work of a hundred people. When money gets involved,

[00:06:26] the flip side is, well, we don't need a hundred people to do all the work. How about we just have one person do the work of 10 people and just fire the other nine? And we're already seeing that. I mean, there's massive layoffs in industry across the board. Companies are looking to get rid of people and use AI to do that.

[00:06:38] They want efficiency so they have a better bottom line. So they're more focused on bean counting and getting the numbers as opposed to whether there's quality. It's about quantity over quality, and I worry that's coming to the government space as well, because until there's a catastrophic result, until you have intel that results in a bad decision,

[00:06:55] there's no way to show for sure if this is a bad idea or not, but boy, the bottom line sure gets better in a hurry. So how do we protect against that and make sure that the intel, the importance of making sure that it's the right intel, the right content, the right people at the right time to make the right decisions, continues to be the most important part, and not just efficiency and numbers and bean counting and money?

[00:07:14] So that's a great question, and the answer is Gen AI, and this is why. Holy cow. It really is. That's not the answer I was expecting to hear. I'm really excited to hear what comes next on this one. Right. So if I'm a decision maker, I may have, you know, a very specific requirement in mind based on my company's operations or my agency's operations, right?

[00:07:41] And that's because those are the resources I am provided and limited to, is to focus on that. But now, if I have five analysts and I can have two analysts with a Gen AI working on this operational focus or requirement, I can expand my ability. In my company, I can expand those operations and increase my revenue, either by breaking into different markets:

[00:08:10] I can increase my revenue by looking into doing more competitive business intelligence to determine where else I am not selling my product, where else, you know, these markets are happening. I can use it to get ahead of the markets. I can use it to identify my supply chain risk and analysis, executive protections, right?

[00:08:33] All this stuff that five analysts can't really cover, but I can have two analysts cover it with Gen AI. Now I can take those other three analysts and reallocate them to look at and provide intelligence, or even quicker intelligence, out of that. Which increases the efficiency of my operations, which increases my ability to produce more, sell more, create more, supply more, and do all of that.

[00:09:00] The things I can't do right now because I have a certain budget I have to operate off of. So now, for the same budget, I can actually expand into these other areas that are only going to increase my company's bottom line. With the government, right? The government takes all their requirements from, you know, the president, the commander in chief, on down to the Secretary of Defense, on down to all these agency directors.

[00:09:32] Their mission is focused in one area or a couple areas, but they should be looking globally, right? Again, the resources keep them focused on what's important at that time. They're not looking as far down the road, or they're not looking outside of that left and right limit where they should be.

[00:09:51] This is going to help 'em to do that, 'cause now you can reallocate those same resources to expanding the collection requirements, or the priority intelligence requirements in the government can be expanded now. No decision maker, no CEO, no COO, no director, nobody is gonna say, I don't wanna make more money, or, I don't want more intelligence requirements fulfilled.

[00:10:16] This gives you that ability. So that's why we're not going, the bean counters are not going to take away those positions. I wanna say this in the most respectful way. I think that's a great vision, I do. And I wanna believe that that's gonna be the dominating opinion. I don't see any evidence that supports it right now.

[00:10:38] I think the evidence right now supports the opposite, and maybe it'll go that way eventually, but it seems like the first plan is, let's get as few people working as possible and then, from there, figure out how we can expand, and maybe they'll bring some back. But it seems like in business, and in the government right now for that matter,

[00:10:51] the goal is strictly to get to the bare bones, because the easiest way to show improvement in efficiency, improvement in profit, improvement in ROI is have as few people as possible, have as little labor cost as possible. So it seems like the idea right now is cut all the way to the bone, then figure out what the minimal number of people can do with these technologies, and then maybe they'll add some more people as they go.

[00:11:11] So it sounds like what we're betting on is really gonna come down to the nature of how leaders make decisions and which direction they see it going, right? 'Cause these seem to be contrasting opinions, right? There's your opinion, which I like, by the way, and I would love to see people take that as the predominant opinion.

[00:11:27] I just don't see any evidence today that that's what's happening. Now, you talk to a lot of people I don't talk to. So are you seeing people who are more aligned with that? Maybe in academia, maybe on the government side, as opposed to what, you know, I tend to see mostly, which is a lot of reduction of force all over the place.

[00:11:43] Hundreds of thousands of people in tech and intel have lost their jobs. Tens of thousands certainly in government have already, and it seems like there's just a push to do more and more. Companies have massive profits and then turn around and dump a bunch of people and keep investing more in the AI, and it just keeps going that direction.

[00:11:57] I'm not seeing people see it as the opportunity to multiply and advance and, you know, exponentially grow their productivity, as much as the opportunity to do as much as they've currently been doing with a lot less cost, as the way to a larger profit. Because again, there's also the consumer side; you can only produce so much, 'cause people can only consume so much.

[00:12:16] So, you know, is it more optimistic from your viewpoint? Are you having better conversations, happier ones than I am on this subject, which is possible? Certainly I am, but right now it's apples and oranges, and here is why. You are talking about seeing the development of AI, of Gen AI, and the problem is all these cuts and everything are happening and people are losing their jobs, but it has nothing to do with Gen AI, because the applied side of Gen AI hasn't been integrated in these companies yet.

[00:12:44] It hasn't been developed to a level where it can come in and start doing the same work as your analyst. It's not there yet, and here is why. Because there is a lot of lack of trust in Gen AI right now, because OpenAI, ChatGPT, there are models that are developed as a kind of concept to provide the ability to use an AI for day-to-day kind of activities, fun stuff, return things, you know, go out on the open web and bring back information and all of that.

[00:13:24] It's not being designed for the applied application side of an on-premise solution that's specifically tailored to the needs of a company, a government agency, law enforcement. It's been developed as an AI capability by a company pushing it out as part of their agenda, right, and showing what they can do with AI.

[00:13:50] Where it needs to be, before companies and agencies and everybody else are gonna incorporate it as part of their operations: they need it developed in a way that it doesn't hallucinate, it doesn't fall for disinformation, it can be trusted data. And right now, a lot of the capabilities that are out there can't do that.

[00:14:13] However, we are on the cusp of that, and that's why we're talking today. Because here at Mercyhurst, we have developed a Gen AI workflow for on-prem, an on-premise solution. We've developed some intellectual property on how to provide an on-premise solution that takes away all of the risks that are currently associated with using an AI that goes out on the open web to retrieve the information.

[00:14:40] And there are tools being developed, like Data Squared's, that are also helping to alleviate that by using more knowledge-based graphs that are gonna provide more accurate intelligence. That's not in there yet. I've talked to all these companies, I've talked to these CEOs, I've talked to these directors of government agencies, and they all love this idea and say, yes, this is something we would use, but it's not there yet, because the trust isn't there yet.

[00:15:11] So when you say you're seeing all of these job losses and you're seeing all of these cuts, I would not attribute that to a Gen AI capability. I attribute it to a lot of the geopolitical things that are happening out there right now, and a lot of the political climate that's happening, not Gen AI. We'll talk in a year, and if things go right and in a year you're still saying that, you're still seeing that, then okay, we've got a discussion to have.

[00:15:40] But as of right now, I don't think you can attribute any of that to Gen AI, as Gen AI is not being utilized yet at the decision-maker level, which is why we need what we've developed here at Mercyhurst and why we need what some of the vendors out there are developing with some of these tools.

[00:15:59] It's the applied side of it now, right? The Gen AI development and engineering, the AI, is there. The Gen AI used as an applied intelligence capability is what's needed, and we've developed that here at Mercyhurst. So if you wanna talk about the solution to how we can get this safely into the workplace and into government agencies and everything else to be used, even on the classified systems, we can do that.

[00:16:30] And I think it's gonna create jobs by doing this, not take away jobs, because you need the analysts who are gonna be able to prompt it, who are gonna be able to identify the information coming back, who are gonna have to drive the APIs, who are gonna have to control the data lake. Gen AI should never send a product to a decision maker.

[00:16:56] Let's put that out there right now. It should always go through a live analyst, a human analyst, before it goes to that decision maker. You're gonna need all of this. You're not gonna cut those people. I'm telling you, the first company that decides they're going to allow Gen AI to directly answer collection requirements or intelligence requirements and send them to a C-suite level executive:

[00:17:23] yeah, that company's gonna fail in a year. They will, but they're gonna make a lot of money first, just to note. And there's companies already working that path. They don't care if they fail. They wanna make the money first. And I think you're right. And I think it makes a good point, that people should be cautious if you work with a company who says they're doing it.

[00:17:39] And I said this a couple years ago: when companies say, we don't have intel people, we don't have that, we're gonna automate this, be careful what you're paying them. 'Cause you're buying things that look and smell like intel right up until something explodes on you, because they're not giving you intel. They're giving you facsimiles.

[00:17:51] They're giving you what a parrot, you know, if a parrot has a conversation with you, it's not having a conversation. It might seem like one, but it doesn't know what the hell it's talking about. And neither does AI on a lot of those things. So, sorry Rob, I know you had something you wanted to ask. Well, I wanted to stay with that idea.

[00:18:02] The assistants that I'm working with, et cetera, et cetera, if you view them as like interns or a junior direct report, you have to work with them to a degree. And I just keep hearing the mantra, garbage in, garbage out. And it's very possible that you could be working with an AI that's working off of bad data, and then you have to spend that time unraveling where it went wrong.

[00:18:30] And correcting it. And so it doesn't actually save me time in many ways. I mean, in theory, in the future it will, but as you were saying about data lakes and stuff, sometimes they can be poisoned, either intentionally or unintentionally. And it's like we still have to get through a period of this where we're training, where I don't feel comfortable turning even the collection over to an AI at this point.

[00:18:59] And I agree. And you know, you look at it as, AI has to be prompted, right? Who's prompting the AI? AI prompting AI? I mean, that doesn't make any sense. The prompt is all. And Rob, to your point, the way and the process in which the AI returns that information, or does the collection, is only gonna be as good as how it was prompted, and then what it returns is only gonna be as good as what it's prompted to do.

[00:19:32] Which brings up another big question, right? Let's talk about the intel cycle. You know, if your AI is only trained to go out, retrieve data, bring that data back, and then put it into a report, right, what good is that? What good is that to you if it can't do structured analytical techniques, if it can't put it into a data visualization chart, right?

[00:20:02] So who knows what a structured analytical technique is, right? I know about 200 of them. Unless I sit there and ask that AI to give me a SWOT or a PESTLE analysis, it's not gonna return it that way. It's just gonna gimme a bunch of data. So your garbage in, garbage out is exactly right. The reports, right?

[00:20:26] What if I want it built into a knowledge graph? If I don't know what a knowledge graph is, the AI doesn't know what the knowledge graph is, or the AI may know a hundred knowledge graphs. I've gotta be able to prompt it, to tell it how to return it in that knowledge graph. Who is doing that

[00:20:42] prompting? It's not the decision maker that's pushing in the requirements. It is the analyst, who is interpreting what the decision maker wants and then translating it over to prompts into the AI, then getting it back from the AI, making sure it goes into a product that's been vetted, but also retranslating it into the way that decision maker wants it.

[00:21:05] And then getting it to the decision maker. So Rob, to your point, you can spend a lot of your time having to unravel or having to redo the work of AI. Where you don't have to do that is when you have somebody that's knowledgeable on how it works, that's controlling it, which is the human analyst. So I think that's interesting.
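
To make that translation step concrete, here is a minimal sketch of what "analyst as prompter" can look like in practice. The template wording, the JSON keys, and the function name are illustrative assumptions, not Rover's actual interface; the point is that the analyst, not the decision maker, encodes the structured analytical technique and the output shape into the prompt.

```python
# A hypothetical prompt template: the analyst names the SAT (here SWOT)
# and pins down the return format so the model can't just dump raw data.
SWOT_TEMPLATE = """You are assisting an intelligence analyst.
Requirement from the decision maker: {requirement}

Return a SWOT analysis as JSON with exactly these keys:
"strengths", "weaknesses", "opportunities", "threats".
Each key maps to a list of short, sourced bullet points; cite a source
identifier for every bullet. If a quadrant has no supportable findings,
return an empty list rather than speculating."""

def build_swot_prompt(requirement: str) -> str:
    """Translate a plain-language requirement into an explicit SAT prompt."""
    return SWOT_TEMPLATE.format(requirement=requirement.strip())

print(build_swot_prompt(
    "Assess competitor X's position in the mid-market segment."
))
```

The same pattern would apply to a PESTLE analysis or a knowledge-graph return: the technique lives in the prompt, which is exactly why the analyst who knows 200 of them stays in the loop.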

[00:21:27] But we're talking about the here and now, right? The current generation. Because you're right, right now, listen, I spend a ton of time. I never thought I was gonna be a prompt engineer, but apparently I am, accidentally, now. Because I openly admit, you know, a year ago I was very much against AI, AI in intelligence.

[00:21:41] I've advocated against it regularly. But as you said, at some point, you know, it's not that it's gonna replace people, but people who don't know how to use these types of tools will be replaced. And I don't wanna be replaced. I was like, all right, I gotta start figuring out what these tools do.

[00:21:53] And I've learned a lot. And as Rob said, there's a lot of times I finish a project, I'm like, man, it took just as long as if I'd done it on my own. But I keep doing it. I'm investing the time, 'cause I'm thinking, well, I'm teaching the system. I'm also learning how to do things at the same time.

[00:22:07] And you know, I keep believing that at some point it'll pay off. At some point I'll start seeing a diminishing use of my time trying to make this thing do what I need it to do, and then I can spend more time in other places. So there's some faith in that. And it takes time, 'cause we're in the now generation.

[00:22:20] Moving forward, you know, eventually, yeah. The theory at least, and I've read a fair amount on this, is that the AI is gonna teach other AI. You know, everybody's saying, oh no, it's not gonna replace you, but it's gonna replace junior people. Well, if you replace junior people, where do you get the next generation of mid and senior people?

[00:22:33] Well, it's because they think the junior AI will then grow into becoming a mid AI that'll teach the next junior AI, and then the mid grows into a senior. And so over time, generationally, they're able to do this themselves. They're able to learn and to run it without people, more and more. And so you have less and less people.

[00:22:49] And so my concern with that is a couple of things. First of all, the hallucinations right now are just comically bad still, no matter how much you work with them. Some AIs are better than others. I'm not gonna advocate for one company over another, but I will say there's one I've completely abandoned,

[00:23:01] 'cause the hallucinations were just impossible to get past. And there's a couple that do it better, but that's a problem. And it's a real problem because if you're dealing with people who don't know it's hallucination. I'll give an example. Not meant to be political, it just happens to be a very public example.

[00:23:14] There was a document that went to Congress about health. It's from our, I don't remember which department it was, Department of Health perhaps, something like that. And it cited all these studies and all these documents, and none of it was real. And so it became a bit of a scandal in Congress about, you know, you pushed this fake information.

[00:23:29] I actually don't think the people that pushed it were trying to push fake information. I just think they offloaded it to AI and didn't bother to check anything, 'cause it all looks good, all these, you know, cites and sources and all that stuff, and it's junk. And you have lawyers saying, I never wrote this paper.

[00:23:40] This doesn't exist. None of this stuff's real. But it looked real. It looked real enough to get past experts and professionals and get presented to Congress as something that should change government policy that would affect millions of people. And I fear that we're gonna see more and more of that, because you have these hallucinations, and then,

[00:23:55] on top of that, and I promise I'm getting to a question, mis-, dis-, and malinformation, right? So it's, again, it's gonna be what the data is in there. So what happens if we have people that are involved in this who decide they have an opinion they wanna push, and they want to get in and poison the database?

[00:24:09] They wanna poison the algorithms, right? And as a user, you don't know that, right? So in intel, we used to talk about, like, getting rid of bias, things like that. But you have a whole team, right? You have peer reviews, you have senior review, et cetera. If you know Bob is absolutely against Russia,

[00:24:23] and so everything he's gonna see through a lens of, it must be Russia, it must be Russia, you have people to check that. But if that becomes the bias that's built into the database that everyone's using, your whole office is now biased, or your whole agency is now biased, or your whole government is now biased.

[00:24:36] So how are we gonna manage that, and who's gonna be the trusted agents for this? We have a world where people don't really trust each other. We certainly don't trust the AI. But how do you trust the people that are building into it, and how do we avoid this future where it's AI teaching AI, and it's, you know, based on misinformation, and so you've just changed what reality is?

[00:24:54] Sorry, there's a lot of information there, I know. Well, that's a great question and I have your answer. If you say generative AI again, I'm hanging up. First of all, disinformation and hallucinations: two different problems. Absolutely true. Yep. Disinformation is a, you know, government-wide problem in being able to better control that or put regulations on that.

[00:25:24] But it's also a problem in the private sector, because, you know, you're going to pull that information out and you're trusting that information, right? You're trusting your analyst as a subject matter expert, and you're absolutely right. People may have bias, or they may be lazy and not vet that information properly, or not have the right processes in place.

[00:25:46] So one, we've gotta figure out a way to identify disinformation. We've gotta figure out a way to mitigate disinformation, and processes on how to do that. That might be a discussion for another podcast. We are working with the Security Executive Council on risk and security here on how to provide a disinformation process.

[00:26:08] And we've spent a year on this, and we finally published the paper with them on how to combat disinformation in the big data world. Now, the second part of that is the hallucination, and that's the AI side of it. And that is going to be, you know, the AI can't differentiate disinformation, so it's gonna pull it back unless it's been properly trained on how to vet it.

[00:26:32] But again, you've gotta prompt it, right? You've gotta have the analyst putting a process in place. But the large language models and how they're developed and trained, those are the things that are causing, that are falling for, the hallucinations. It's not the Gen AI; the software isn't falling for the hallucinations.

[00:26:53] It depends on what large language model the software is built on, as to whether or not that software has the ability to properly execute the prompts and where it pulls the information from, which gets to AJ's question. The way we are mitigating it is, we are building a large data lake on premise, right?

[00:27:20] So we're building this data lake of trusted data from vendors that we have used, whose data we have vetted, that we would recommend to users. We've contacted them, we have set up APIs, so we are pulling from their APIs into our data lake. And that's an on-premise data lake, on our servers, right?

[00:27:48] It's kind of like a mini data farm, but we've pulled in that data from these APIs. And this goes to creating jobs too. We have data scientists that are responsible for ensuring the data was properly pulled. We have data scientists that are the ones that go on there and make sure the APIs are

[00:28:08] on the timeframe and the schedule they're supposed to pull the data on, they're pulling the data in. And then we have cybersecurity students, experts, that are making sure that there was no compromise to the network during what's supposed to be a one-way API pull, right? And it all works flawlessly, and it's a team effort.

[00:28:28] And we have this robust data lake now, right, on premise. And then, aside from that, we have our intellectual property server, where we have the data that's internal to us, that we have either created or used from other projects that we've brought in. So now we have a data lake, and we have what I call the mini data lake, or the IP data lake.
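
As a rough illustration of the vendor-fed, on-premise lake described above, here is a minimal sketch of a one-way API pull that catalogs the vendor and pull time on every record. The endpoint URL, table layout, and library choices are assumptions for the sketch, not Mercyhurst's implementation.

```python
import json
import sqlite3
import urllib.request
from datetime import datetime, timezone

def init_lake(conn: sqlite3.Connection) -> None:
    conn.execute("""CREATE TABLE IF NOT EXISTS records (
        id INTEGER PRIMARY KEY,
        vendor TEXT NOT NULL,      -- which licensed feed the row came from
        pulled_at TEXT NOT NULL,   -- when the scheduled one-way pull ran
        payload TEXT NOT NULL      -- the vendor record, stored verbatim
    )""")

def pull_vendor(conn: sqlite3.Connection, vendor: str, url: str) -> int:
    """One-way pull: read from the vendor API, write into the lake."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        items = json.load(resp)
    now = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO records (vendor, pulled_at, payload) VALUES (?, ?, ?)",
        [(vendor, now, json.dumps(item)) for item in items],
    )
    conn.commit()
    return len(items)

conn = sqlite3.connect("data_lake.db")
init_lake(conn)
# Hypothetical licensed feed; each vendor runs on its own pull schedule,
# as described above.
pull_vendor(conn, "vendor-a", "https://api.example-vendor.com/v1/records")
```

Cataloging vendor and timestamp on every row is what makes the later vetting, auditing, and purging steps possible.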

[00:28:54] So we have a software, we're now using a software called Rover, which is our Gen AI software. So when my analysts prompt Rover for what they need, Rover then goes out into the data lake of trusted data and it goes into our IP data lake, and then it pulls both of the data back to answer the prompt, right, or to pull the information back.
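
A minimal sketch of that dual-store retrieval step might look like the following. The store names, file paths, and naive keyword match are stand-ins; the real Rover presumably does far richer retrieval, but the shape is the same: a prompt triggers a lookup in both the vendor lake and the IP lake, and the merged, origin-tagged hits feed the generation step.

```python
import sqlite3

# Illustrative stand-ins for the two on-prem stores described above.
STORES = {"vendor_lake": "data_lake.db", "ip_lake": "ip_lake.db"}

def retrieve(term: str) -> list[dict]:
    """Pull matching records from both stores, tagged with their origin."""
    hits = []
    for name, path in STORES.items():
        conn = sqlite3.connect(path)
        rows = conn.execute(
            "SELECT payload FROM records WHERE payload LIKE ?",
            (f"%{term}%",),
        )
        hits += [{"store": name, "payload": r[0]} for r in rows]
        conn.close()
    return hits

# The merged, origin-tagged context is what the generation step sees;
# the model answers from retrieved, cataloged data, not the open web.
```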

[00:29:17] And we've trained it to do structured analytical techniques. We've trained it to do data knowledge graphs. We've trained it to do, you know, all of the products and how we need products built, so then we can prompt it to return the data however we want it. Well, yeah, then the issue became, well, that's great, but now the tools that are out there that we use, that you push the data into, like everybody's familiar with i2, right?

[00:29:45] Right. So how do you do that? Now, the analyst was having to manually take whatever was created by Rover, the Gen AI software, and then they still had to upload it into these programs to build these graphs and these, you know, data analysis types of products. And then they would have to pull that back out if they were trying to put it back into the Gen AI software to ask it to fill in some things.

[00:30:15] Well, we've now figured out that you can actually build, what we're doing is building, APIs from the Gen AI software to feed directly into the tools. So now you can prompt it, when it returns the data, to have it build an i2 chart. It'll then take that, put it into a CSV file, and then automatically push that CSV file into i2.

[00:30:40] So now the analyst is sitting there looking at an analytical product for them to do the analysis off of, that may have taken, you know, days, weeks, even months to build some of this out. So it's all being done on premise, right? It doesn't leave my trusted network. It's not leaving that, and so we're working to do that.
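
A hedged sketch of that hand-off: the Gen AI return is flattened into a links CSV that a charting tool can ingest. The column layout below is generic; a real i2 import would follow its own entity-and-link schema, and the final push would go through the tool's API or a watched import folder.

```python
import csv

# Illustrative output of the Gen AI step: entities and relationships.
links = [
    {"from": "Acme Corp", "to": "J. Doe", "relation": "director_of"},
    {"from": "J. Doe", "to": "Shell Co Ltd", "relation": "owns"},
]

with open("chart_import.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["from", "to", "relation"])
    writer.writeheader()
    writer.writerows(links)
# From here, an API call or a watched import folder pushes the CSV into
# the charting tool, so the analyst starts from a built chart rather
# than a blank canvas.
```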

[00:31:02] But one of the other things we learned is, when it's pulling back the data from inside the data lake, from the APIs being fed from our vendors, oftentimes in that data there are links to the original raw data, right? Well, if I send my Gen AI software, my Gen AI capability, out there onto the open web, that's where it now becomes exposed to potentially pulling in bad information, opening up myself for being hacked, my network for being exposed, disinformation, misinformation, you name it, right?

[00:31:37] Dirty data. So how do I mitigate that, right? Because I still want it to go out. I may want it to go out and pull that information, bring that raw data back, because the data that was originally put in the data lake may not have had the same, you know, focus or agenda as what I'm looking at it for.

[00:31:55] So I may need that raw data to analyze, where the analyst still comes in to determine this. The AI can't determine this. But it goes out to that link live on the web, pulls that information back, and then provides it as a return, but it's marked. Everything we do has the sources, right, which is something that's not being done a lot either.

[00:32:16] It gives you the sources of where it pulled all of the information, so it'll mark it. If it pulled something live from the web like that, and not out of the data lake, we'll know. Well, how do we do that safely? Because I want it to go out and pull that piece of information and bring it back.
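
A minimal sketch of that source marking, with an illustrative record shape: every returned item carries an origin tag so the analyst can see at a glance whether it was served from the vetted lake or fetched live from the open web.

```python
# Illustrative record shape: lake hits carry a lake_id, live-web hits don't.
def mark_sources(results: list[dict]) -> list[dict]:
    for r in results:
        r["origin"] = "data_lake" if r.get("lake_id") else "live_web"
    return results

report = mark_sources([
    {"lake_id": 4412, "text": "Vendor-vetted filing excerpt."},
    {"lake_id": None, "url": "https://example.org/raw", "text": "Raw page."},
])
for item in report:
    print(item["origin"], "-", item["text"])
```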

[00:32:38] Now, the next question is, well, what happens if you're a smaller company, or you're a company that doesn't have the ability to build this robust data lake? We're working, again, with what we've developed, called a virtual airlock, which is our intellectual property to this whole process, and that allows for a privatized version of our Gen AI software, with Rover, to be deployed on a client's network, for them to use the same process to be able to go out and access the data from their own internal database.

[00:33:05] But they'll use a Google Cloud feature to go through and allow Rover to go into their own privatized cloud, and then, through our virtual airlock, it will allow them to come into our data lake, right? So now those that we're working with and partnered with that can't create their own data lake, they can get into my data lake,

[00:33:28] just like I do with an analyst, right? And now it'll return it back to their analyst on their end that prompted it to do that. So all I have to do is compromise them, then. No, you're not compromising anybody. No, no, I'm saying, if I can compromise one of your trusted partners, there's a connection there, there's somebody outside who now has access to your virtual airlock. If I can compromise them, I'm in, and I can poison your database, and you won't know, because it's coming through a trusted source.

[00:33:53] Well, that's where the virtual airlock is set up to mitigate that. So the answer is no, you can't do that, not through what we're doing. Believe me, we've exhausted that one, which is awesome, because there are multiple firewalls built into this that mitigate for that. You can't access me; you can't get through.

[00:34:17] So, oh, Brian, challenge accepted. Oh my goodness. AJ, I would love to do this. I would love, when we get this finalized, to have you try. I definitely wanna see it, and it wouldn't be me, but there's others, obviously. I mean, listen, we'd have to find a technology that can't be hacked. I'm being obviously a little coy when I say this.

[00:34:35] No, and I hope it's successful. Obviously, anybody who says, well, we got security and it's airlocked and it's locked off and that's the solution, you go, well, okay, so that's what I'm gonna breach then. That's what everybody compromises, and then you're back where you started. I assume you have audit and you have all sorts of other ways to make sure if data's manipulated; I'm sure there's a lot of trails and whatever.

[00:34:51] So you'd have to revert if somebody poisons it, 'cause somebody's going to poison this data lake. Like, I think we should accept that that'll happen someday. I'd be stupid, and anybody would be, to say it's a hundred percent. Of course nobody says that. I can say it is the solution to this, and until someone does do that, I can say right now, with what we've done with the proof of concepts and how we're doing this with the testing, it hasn't been compromised yet.

[00:35:20] But the great thing is, nobody can push data into my data lake. I control the entire data lake. The only thing that's happening is, if AJ, you know, Incorporated is using my privatized version and you request data or you prompt Rover to retrieve something, it's gonna retrieve it from your on-prem database and it's gonna retrieve it from mine.

[00:35:48] You can't push data; you can only retrieve data outta my data lake. Okay? It's a one-way street. So if Rover does that, it pulls from your data lake and it pulls from my data lake; does Rover then keep my data and add it to your database, to your lake? So Rover is an on-prem deployed software. Rover does not collect

[00:36:10] data. Rover is the Gen AI software capability. You have to house Rover on your network. You control it. And it's built specifically for your needs. It's tailored. So earlier in the conversation, when I said a lot of AI is being built to the agenda of the companies that are developing it and pushing it out based on what they think everybody needs: what we're doing is building it tailored to what you need specifically.
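
One concrete way to picture the one-way property claimed for the airlock, sketched here with SQLite's read-only mode standing in for the real network controls and multiple firewalls (which this sketch does not model): the client's connection simply has no code path that writes to the shared lake.

```python
import sqlite3

def airlock_query(term: str) -> list[str]:
    """Retrieve-only access: the connection itself cannot write."""
    # mode=ro opens the database read-only; any INSERT or UPDATE on this
    # connection raises sqlite3.OperationalError.
    conn = sqlite3.connect("file:data_lake.db?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT payload FROM records WHERE payload LIKE ?",
            (f"%{term}%",),
        )
        return [r[0] for r in rows]
    finally:
        conn.close()
```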

[00:36:37] And Rover has engineers that'll come on premise, spend six weeks at your company helping you to set all of this up and develop it and train it. I hate the word train, 'cause it's not a model, but it's gonna develop the software for your specific needs, right? So that's the difference there.

[00:36:56] And that's where I think we get to that trust level of executives finally trusting and wanting to use a Gen AI software, 'cause we're reducing all the risks. But again, you still have to have people to do this, right? You still have to have people that know how to do it. You still have to have people that know what the company needs, how to retrieve it, how to make sure that the information is being properly curated into a product that can be sent to a decision

[00:37:21] maker. Rover should never be directly sending information to a decision maker; that should always be vetted. The whole process needs to be vetted, and you need to have somebody that knows how to do all of this on both the back end and the front end. So I'm concerned about the, I'm gonna go back to the data lake part of it. You're building a trusted data lake

[00:37:42] that's per your criteria. Someone else is building a trusted data lake over here and saying that we only have vetted information in our data lake. And then maybe a government builds one, but then maybe that government changes leaders and suddenly all these words that were in the data lake have to go away because of a new agenda of the new leadership.

[00:38:09] I guess what I'm concerned about is, back in the day, when we had like the Encyclopedia Britannica, there was an esteemed committee that decided what made it in and what was left out. When we were creating the Bible from all the stories, there was an esteemed committee that decided. Whereas with AI, it seems like we're incredibly balkanized, where, you know, UC Berkeley could have one and Harvard could have one, and Mercyhurst could have one.

[00:38:35] Who do I know to trust? So that comes down to, who do you trust right now to do your analytical products, right? Look, the data lake that analysts are pulling from in the private sector currently exists. It's called the World Wide Web, right? It's called the deep web.

[00:38:58] It's called the dark web. On the government side, they're doing OSINT, but they have classified networks, right, that are closed off. So this software can be deployed on those, because the software is gonna pull from the same data that those analysts are currently pulling from on those classified servers.

[00:39:21] I will tell you, from my time in the military, this happened all the time. What was allowed, you know, to be collected on this month may not be allowed to be collected on the next month, and the terminology changes. And it was the engineers or the intel personnel that were responsible for that,

[00:39:41] that would have to go in and purge that database for that stuff. But with data science the way it is today, you can quickly do all of that. On the Rover side, with this, or on the Gen AI side for what we're doing, they would be responsible for helping you to go in and purge that. But I don't think it's a genuine concern, because it comes to the prompt, right?

[00:40:09] It comes to the prompt. So if you, yeah, hold on a sec. Hold on a sec. I gotta challenge that, Brian. So I'm sorry, I will let you finish your thought, but I gotta challenge that. So you said, you know, in the intel space, this happens all the time and people have to go back and change databases.

[00:40:22] Listen, I did 19 years in the intelligence community. That is not my experience. You know, some things get pulled or kept based on collection, based on, you know, legalities, some of those things. And we certainly do, you know, some analysis and determine, oh, this is US persons, we gotta pull it out.

[00:40:35] That kind of a thing. Well, sure, that happens, but not whole terminologies, right? You know, what is truth and what is fact seems to be negotiable in some cases now, and that is, I think, the bigger concern. So, you know, to make it really simple for my small brain: there's a massive technology, lots of power, very capable.

[00:40:55] And I believe technology is something we all eventually adapt and use, 'cause it makes our lives better. But I'm gonna go to the simplest of technologies. We used to have an abacus, right? And then we had a calculator. Huge, you know, math jump forward, obviously. And I don't know anybody who doesn't use a calculator.

[00:41:08] If you're so great with math, you don't need one, I applaud you. I use calculators all the time. But when I put in two plus two on any calculator in the world, it gives me four. There you go. If two plus two starts giving me five, we have problems. But if all the calculators say two plus two is now five, we have a change in reality.

[00:41:24] And that's what we're looking at as a possible concern I have. And this is a technology. This isn't just gonna change two plus two to five. It is changing, or has the potential, theoretically, if people want to manipulate it, to change reality, to change what is truth at a systemic level that is much wider spread.

[00:41:42] And it is fundamental to so many important things in our world, whether it's healthcare, whether it's science, whether it's politics, geopolitics, wars, you know, all the big, big things we have to worry about. And so that's, I think, the concern I have, and I think Rob was going down that path.

[00:41:57] I don't wanna speak for him, but I think that's the concern when we're talking about this. It's not about, well, we always have databases that people play with, and we're all, you know, subjected to whatever's on the internet. That's true. And you're always subjected to what you have access to. But I've never been in a position before where I've worried that I'm gonna end up giving intelligence that's completely based on a false premise from the start.

[00:42:15] I mean, I did collection, I've done HUMINT, I've done SIGINT. We always had to vet. You know, if you hear something in your ears, you don't assume they're telling the truth either. We have to vet it and double-check it. But this is a different scenario, because people can push an agenda through these data lakes that feed these AIs, and it becomes truth.

[00:42:32] People are very quick to believe the machine, which is sad. Something comes on the screen and comes back, and they go, that's real. You know, it's the old joke, you know, I found it on the internet. But people are just quick to offload to these things, and so it's like having the whole world say two plus two is five.

[00:42:46] If millions of people are saying that, billions of people are saying that, I guess two plus two does become five. But we have a real problem on our hands. So I hate to be difficult on this one. No, I agree. But again, it's not the Gen AI software that's causing that problem.

[00:43:06] It's the people and the data lake. This is where you have to build your trusted data lake from vetted, trusted data, right? I pay licenses to vendors. I pay for that license

[00:43:32] to get that trusted data, right? Because they've put in the effort to curate it. They're telling me it's good data. They have an algorithm that they run that does all of this, right? And it comes from experience of understanding and using those. But I will tell you, I've used a data provider that burned me. And guess what? The next day, they were no longer my data provider.

[00:43:52] So yes, it is a concern, but it shouldn't be a concern if you have properly vetted the vendors and have that you are building your data lake into, if you open up the internet to build your data lake and you're pulling in raw data off the internet. Y shame on you. You're, it's not a, it's, it's a da, it's a poor data lake.

[00:44:12] Right. It's a swamp at that point. It's not even a lake, it's a swamp. But to Rob's point about truth, like, who's the arbiter of truth, right? So, you know, he mentioned committees, which I think is a great point I hadn't thought about. You know, Mercyhurst obviously is one of the leaders in this space.

[00:44:27] You're talking about some really impressive things. Are you working with other universities? Is there a consortium of universities? Who's gonna be the arbiter of truth? And of course, truth, we're always having a problem with. Listen, there's people that attack universities now, there's people that attack hard science at this point. But who is? Are we in a position like that? You know, Mercyhurst might be leading the way.

[00:44:44] Are there other universities you're working with? Is there a consortium? Is there an intent to try to create an overwhelming understanding of what truth is, to help validate the data in these places? Because it's the foundation for everything. Again, this could be the salvation of humanity or our destruction, I suppose.

[00:44:59] So is that happening now? Do you know if it's happening? Are you guys doing some of that stuff? Oh, I think Gen AI on the applied side, I've been asked this question. When it comes to doing it in the world that we're in, with the intel, whether it's competitive business, whether it's strategic, tactical, law enforcement, there aren't a lot of people out there to do that vetting of what we've got, because there are no other CIRATs out there.

[00:45:33] There's no one else out there doing this. Now, when it comes to the processes and capabilities of the actual software, absolutely. We have others outside the university, working for vendors or working for private companies, or experts in the AI field, that are helping us to vet the software and the capabilities internally.

[00:45:59] We have a computer information science department, right? That faculty is not in the CIRAT, but they're experts in teaching AI, right? And they're more on the development side and engineering side of AI than they are on the applied side. So they'll come in and tell me, you're absolutely crazy with that, or, this is a better way of doing that.

[00:46:20] Or, here's the solution that will help you to get there quicker. And then they come in, and that faculty, those experts, they're the ones that are the sanity check for what we're doing. And the vendors: I have pulled together a conglomerate of vendors that are helping to feed my data lake, or are helping to develop the Gen AI and understand what we're trying to get at.

[00:46:41] And I talk with them a lot and get their feedback. And then I go to these conferences, and I talk at these conferences, and I get feedback from the experts in the audience, right, or the other speakers that are at these conferences. But I'll tell you, there's not a lot of people out there that are at the point we're at, and there's not a lot of people out there that are focusing on the applied side.

[00:47:01] They're focusing on the development side and the engineering side of it and creating it. They're not focusing on how to use it. I don't care how it's created. I really don't care how the LLM is developed. I don't care how the software works on the back end. What I care about is, how can I utilize this as a practitioner?

[00:47:24] How can my analysts utilize this? How can I prove that Gen AI can be an asset in the intel cycle? I was with a couple guys from Data Squared, and we were kicking the can on where in the intel cycle Gen AI now falls, right? Where does it fall in the process of the intel cycle, and does it become its own step in the intel cycle?

[00:47:49] Right. And it's a great discussion to have, depending on what your view is of how to use it. But their engineer was talking about where it falls in there from more of a process view and not an applicability view, whereas me and a couple of the others were looking at it from the practitioner side, as applicability.

[00:48:10] Where do I use it as a tool to help increase the efficiency, timeliness, and relevance of the information, the data, and building it into actionable intelligence to get to my decision maker in time to utilize it? And how do I use it to beat my competitors, right? How can I use it to beat my competitors?

[00:48:29] There's a Gen AI race between China and the United States right now, right? There is. But it's, again, on the development side. Let's develop, develop, develop. That's great, but if you build me a flying car and I never actually get in it and fly it from Erie, Pennsylvania, to Pittsburgh, what's the point of having a flying car, right?

[00:48:56] So what's the point of having AI if you don't have an applicability for it? Where we've gotten ahead of the curve is the applicability side and using Gen AI. So as it's being developed, we're also developing the applicability of it. And then, as the technology increases and continues to grow, it's helping us to better our processes.

[00:49:17] But in no way do I honestly believe you're gonna see a significant cut in jobs from the use of Gen AI as an applicability. I just don't. I mean, you look at all the vendors: how many jobs were lost when all these vendors stood up and started doing CAI, commercially available intelligence, and now selling it, selling OSINT information to the government, selling it to companies?

[00:49:46] Analysts weren't cut because of that. The analysts were still there, because they had to communicate what they needed. They had to go in and use it, pull it in, put it into, you know, a product. And I think the same thing's gonna happen with Gen AI. I mean, for what it's worth, I think that might be apples and oranges.

[00:50:01] But I'll leave it be for now, 'cause neither of us knows the future on that one. I wanna revisit it, like, we'll come back and take a look and see, you know, where we are in a year. I think it's interesting. I'd love to be back in a year to discuss. I would too. I'm hopeful that it's not me talking to some AI that's teaching your classes now. But so listen, I'm gonna ask one question.

[00:50:18] Rob warned you before we got on that I talk a long time. So we are long on time, but I'm gonna ask you one more anyway, and then we'll get outta here. You're my last meeting of the day. So, oh, thank God. It's always good to schedule me last, for anybody who knows me. So, last question. So we talked about a lot of stuff, you know, good, bad, and unknown, really, at this point.

[00:50:36] You're an expert on this, man. Like, where do you see us in, you know, 1, 2, 3, 5 years down the road in terms of Gen AI, in terms of academia, in terms of industry, in terms of government? You know, where do you see this going, and, you know, what do you see as the future looking like for all of us with this? I see, within a year, I think we have a Gen AI capability that's being implemented and explored, both in the private sector and government, in how to

[00:51:04] utilize it as an on-premise solution, mitigating the current risks that are associated. Where do I see us in three years? I believe in three years, the issues we have right now with the large language models and how they return data, a lot of the hallucinations: if we can implement a process for identifying disinformation, those large language models can be trained

[00:51:30] on that process, so they can help to mitigate and filter out that information better and a lot more accurately. And five years from now, I do believe the current risks associated with disinformation, hallucinations, misinformation, agendas, biases, creating a swamp instead of a data lake, I do believe that will all be mitigated.

[00:51:56] But I also think, with that, we're gonna see threats that we're not even thinking about to our networks, and how to expose that Gen AI, how to access it and use it as a threat. I think five years from now, there's gonna be things we didn't even think about, because of the creation of those mitigation factors.

[00:52:16] I hate to say it, but the bad guys, they evolve as the technology evolves as well. So I think we're gonna be having the conversation about those specific cyber threats, not necessarily about the process of the AI, the issues with how the AI processes the data. I think it's gonna be more of the direct threats and

[00:52:37] risks to the network. All right. Well, thank you very much, Brian, for engaging with us today. It's been a spirited discussion, and to your point, yeah, we'd like to have you back on the show in the future to see how this is evolving, 'cause it is an evolving story. You can learn more about Brian in our show notes, and you can also find transcripts there, as well as other episodes.

[00:52:59] Where can you find that? You can find that at authentic8.com/needlestack; that's authentic with the number eight, dot com slash needlestack. And there's a comment button there, so please let us know what you think about this show and other shows. AJ and I do read those comments, so please leave some notes for us about what we can do better in the future and what you'd like to hear more of.

[00:53:23] And also share your thoughts on social media. We're on Mastodon, Bluesky, we're in all the popular places, and you can find us @needlestackpod. Also be sure to subscribe wherever you're listening to us or watching us today, 'cause that's also important. We appreciate our audience, and we'd love to hear from you over time.

[00:53:42] So, AJ, unless you have anything else? No, I just wanna thank you, Brian. I appreciate it, especially because, you know, I can ask a lot of pointed questions, but I appreciate you, you know, going along with this discussion and not ducking questions, and really having a great, open discussion.

[00:53:55] It's tough, right? I'm thrilled you're doing the work you're doing, and Mercyhurst is a great program, and I'm thankful to have the opportunity. And I do wanna talk to you more about this and learn, and I'm hopeful that your view of the future is correct. You know, nobody knows till the future shows up.

[00:54:08] There's a lot of opportunity here, so thanks. I really appreciate you taking the time to come on and talk with us today and share with the audience what's going on at Mercyhurst, and your opinions and your point of view and your educated thoughts on this subject. It's gonna be interesting for quite a while.

[00:54:20] Well, it's been my pleasure. Thank you. These were great questions, a great discussion. If it's a one-way discussion, it's not very entertaining, right? It wouldn't be a good podcast. But I'll tell you, you both have an open invitation, a year from now, to come and do this podcast again, live, from the CIRAT here at Mercyhurst.

[00:54:37] Ooh, I love it. All right. Maybe we'll try to speed that up to six months, 'cause things are going fast. I'd like to really see you guys. So we'll see how that works out, but I really appreciate it. I'm sure we'll have you on again. I'm gonna ping you offline too; I'd like to talk more. So thanks again for being here.

[00:54:51] Thanks, everybody, for watching and listening. As Rob said, we appreciate all of you being here. And for everybody here, I'm gonna go ahead and just say we're gonna close it out here today. This has been another episode of Needle Stack.
