Microsoft's Security Copilot is getting some degree of agency, allowing the underlying AI model to interact more broadly with the company's security software to automate various tasks.
Security Copilot showed up in 2023 promising automatic triage of security incidents in Microsoft Defender XDR.
At a press event on March 20 at Microsoft's San Francisco office, Vasu Jakkal, corporate vice president of security, compliance, identity, and management at Microsoft, revealed an expanded flight plan for Security Copilot, which is now assisted by 11 task-specific AI agents that interact with products like Defender, Purview, Entra, and Intune.
"We are in the era of agentic AI," Jakkal said. "I'm sure everywhere you go, you hear agents and agents and agents. What are these agents? So, well, agents are all around us."
Jakkal recalled a conversation in which a colleague asked her, "What is an agent?" Her reply was, "That's a great question" – and she then moved on without answering it.
That established the pattern for the event – questions like "in what ways have agents failed when deployed?" and "what's the cost of running this in compute resources?" tended to go unanswered.
But Jakkal did say that of the 11 Security Copilot agents introduced, five come from Microsoft Security partners.
The Microsoft-made agents include:
Phishing Triage Agent in Microsoft Defender, for sorting phishing reports.
Alert Triage Agents in Microsoft Purview, for triaging data loss prevention and insider risk alerts.
Conditional Access Optimization Agent in Microsoft Entra, for monitoring and preventing identity and policy issues.
Vulnerability Remediation Agent in Microsoft Intune, for prioritizing vulnerability remediation.
Threat Intelligence Briefing Agent in Security Copilot, for curating threat intelligence.
Microsoft Security partners have also contributed to the agent pool:
Privacy Breach Response Agent (OneTrust), for distilling data breaches into reporting guidance.
Network Supervisor Agent (Aviatrix), for doing root cause analysis on network issues.
SecOps Tooling Agent (BlueVoyant), for assessing security operations center controls.
Alert Triage Agent (Tanium), for helping security analysts prioritize alerts.
Task Optimizer Agent (Fletch), for forecasting and prioritizing threat alerts.
The eleventh agent resides in Microsoft Purview Data Security Investigations (DSI), an AI-based service designed to help data security teams deal with data exposure risks.
Essentially, these agents use the natural language capabilities of generative AI to automate the summarization of high-volume data like phishing warnings or threat alerts so that human decision makers can focus on signals deemed to be the most pressing.
This fits with Jakkal's thesis that the security landscape is changing faster than people can handle, making it necessary to rely on non-deterministic macros, or AI agents in more modern jargon.
"You look at this threat landscape, the speed, the scale, and the sophistication is increasing dramatically," she said. "From last year when we were seeing 4,000 attacks per second, we're seeing 7,000 attacks per second. That translates to 600 million attacks a day."
Jakkal said the initial iteration of Security Copilot has already helped organizations deal with high-velocity threats.
"For security teams using it, we've seen a 30 percent reduction in mean time to respond," she said, without elaborating on the cost of that improvement. "That means the time it takes them to respond to security incidents. We've seen early career talent, people who really wanted security but didn't know how to get started, being 26 percent faster, 35 percent more accurate. And even for seasoned professionals, we've seen them get 22 percent faster and 7 percent more accurate."
Intrigued by the ways in which AI agents might go wrong, The Register chatted with Tori Westerhoff, director of AI safety and security red teaming at Microsoft, about what her team had learned during the development of these agents.
Westerhoff expressed confidence in Microsoft's overall approach to AI security, pointing to a blog post last year on the subject and noting that the AI models already come with guardrails and that her team has done a lot of work to limit cross-prompt injection.
"We've been pushing this to product devs so that they're building with the awareness of how cross-prompt injection works," she said.
Pressed to provide an example of false positive rates or related metrics for failures that emerged during the development of Security Copilot agents, Westerhoff said: "So I think in terms of specific product operations, I can't talk through those," but she allowed that Microsoft's red team does work with product developers prior to launch on AI hallucinations and hardening agentic systems.
She went on to explain: "I think you're asking, 'Hey, what's the thing that's going to go wrong with this?' And I think the beauty of my team is that we work through those things and try to find any soft spots for any high-risk GenAI before launch, well, before it actually gets to customers."
So, nothing to worry about.
Nick Goodman, product architect for Security Copilot, showed off how the Phishing Triage Agent in Defender worked.
[Screenshot: Microsoft Defender Phishing Triage Agent]
"Everybody has phishing solutions," he explained. "Even despite phishing solutions, we train our employees to report phishing. And they do. Lots and lots of reports. Ninety-five percent of them are false positives. They each take about 30 minutes. And so our analysts spend most of their time triaging false positives. That's what this agent is going to help us with."
At the same time, the customer still has to help the agent. Goodman showed how one company-wide email was flagged as a true positive – an actual phishing message – based on its characteristics, like language urging rapid action.
Goodman said the message, despite its appearance and spammy language, was actually a legitimate HR communique. "The agent can't know that because it lacks my context," he said. "So it flags it for my review."
Goodman went ahead and changed the classification of the message from suspected phishing to legitimate, and this instructed the agent how to do better next time. "This is learning," he said. "It's applied for this agent going forward, but only to me. It's not shared with Microsoft, not shared with other customers. There's no foundational model training happening. This is my context. This is literally all I have to do to start training the system, very much the same way you would train a human analyst."
But without the salary, benefits, or desk occupancy. Asked how much Microsoft expects this system might save in labor costs, Goodman replied: "I don't have any studies we can share with you. Our standard for studies that we publish is pretty high."
Goodman said that customers are using Security Copilot for this sort of phishing triage already.
Asked whether Microsoft has a sense of the false positive rate out of the box compared to after training, Goodman said: "The input false positive rate is driven by human behavior [based on what people report]. The output rate, like the percentage triaged, I don't have numbers to share. We're in the evaluation period with customers right now."
Ojas Rege, SVP and general manager of privacy and data governance at OneTrust, showed off how the company's Privacy Breach Response Agent might help corporate privacy officers deal with data breach reports.
"If you have a data breach, in your blast radius analysis, you have a set of privacy obligations that you have to meet," he explained. "The challenge is that those breach notification regulations differ by every state, by every country, they're very complex and they're fragmented, and sometimes the notification [windows] are really short, 72 hours."
That's where the summarization capability of generative AI models comes into play. OneTrust's agent will construct a prioritized list of recommendations for the privacy or compliance officer to deal with, based on its analysis of data from OneTrust's regulatory research database.
"The agent's not going to notify the regulatory authority," said Rege. "The agent's doing all the background work, but the human has to actually do the notification."
Asked about the possibility of hallucination, Rege replied that the chances of hallucination are very narrow and that there's also an audit log that links to specific regulations, so the agent's recommendations can be confirmed.
Microsoft's agents are here to help. You'll just need to check their work. ®