F1’s new engines are causing consternation over compression ratios

There’s still another couple of months before the 2026 crop of F1 cars takes to the track for the first preseason test. It’s a year of big change for the sport, which is adopting new power unit rules that place much more emphasis on the electric motor’s contribution. The switch to the new power units was meant to attract new manufacturers to the sport, and in that regard, it has succeeded. But controversy has erupted already as loopholes appear and teams exploit them.

Since 2014, F1 cars have used 1,000 hp (745 kW) power units that combine a turbocharged 1.6 L V6 gasoline engine with a pair of hybrid systems. One is the MGU-H, which recovers energy from (or deploys it to) the turbocharger’s turbine; the other is a 160 hp (120 kW) MGU-K that harvests and deploys energy at the rear wheels. Starting next year, the MGU-H is gone, and the less-powerful 1.6 L V6 should generate about 536 hp (400 kW). That will be complemented by a 483 hp (350 kW) MGU-K, plus a much larger battery to supply it.

And the new rules have already attracted new OEMs to the sport. After announcing its departure at the end of 2021—sort of—Honda changed its mind and signed on to the 2026 regs, supplying Aston Martin. Audi signed up and bought the Sauber team. Red Bull decided to build its own internal combustion engines, hiring heavily from the Mercedes program, but Ford is providing Red Bull with the MGU-K and the rest of the hybrid system. And Cadillac has started an engine program, albeit one that won’t take to the grid until 2029.


While everyone talks about an AI bubble, Salesforce quietly added 6,000 enterprise customers in 3 months

While Silicon Valley debates whether artificial intelligence has become an overinflated bubble, Salesforce’s enterprise AI platform quietly added 6,000 new customers in a single quarter — a 48% increase that executives say demonstrates a widening gap between speculative AI hype and deployed enterprise solutions generating measurable returns.

Agentforce, the company’s autonomous AI agent platform, now serves 18,500 enterprise customers, up from 12,500 the prior quarter. Those customers collectively run more than three billion automated workflows monthly and have pushed Salesforce’s agentic product revenue past $540 million in annual recurring revenue, according to figures the company shared with VentureBeat. The platform has processed over three trillion tokens — the fundamental units that large language models use to understand and generate text — positioning Salesforce as one of the largest consumers of AI compute in the enterprise software market.

“This has been a year of momentum,” Madhav Thattai, Salesforce’s Chief Operating Officer for AI, said in an exclusive interview with VentureBeat. “We crossed over half a billion in ARR for our agentic products, which have been out for a couple of years. And so that’s pretty remarkable for enterprise software.”

The numbers arrive amid intensifying scrutiny of AI spending across corporate America. Venture capitalists and analysts have questioned whether the billions pouring into AI infrastructure — from data centers to graphics processing units to model development — will ever generate proportionate returns. Meta, Microsoft, and Amazon have committed tens of billions to AI infrastructure, prompting some investors to ask whether the enthusiasm has outpaced the economics.

Yet the Salesforce data suggests that at least one segment of the AI market — enterprise workflow automation — is translating investments into concrete business outcomes at a pace that defies the bubble narrative.

Why enterprise AI trust has become the defining challenge for CIOs in 2025

The distinction between AI experimentation and AI deployment at scale comes down to one word that appeared repeatedly across interviews with Salesforce executives, customers, and independent analysts: trust.

Dion Hinchcliffe, who leads the CIO practice at technology research firm The Futurum Group, said the urgency around enterprise AI has reached a fever pitch not seen in previous technology cycles. His firm recently completed a comprehensive analysis of agentic AI platforms that ranked Salesforce slightly ahead of Microsoft as the market leader.

“I’ve been through revolution after revolution in this business,” Hinchcliffe said. “I’ve never seen anything like this before. In my entire career, I’ve never seen this level of business focus—boards of directors are directly involved, saying this is existential for the company.”

The pressure flows downward. CIOs who once managed technology as a cost center now field questions directly from board members demanding to know how their companies will avoid being disrupted by AI-native competitors.

“They’re pushing the CIO hard, asking, ‘What are we doing? How do we make sure we’re not put out of business by the next AI-first company that reimagines what we do?'” Hinchcliffe said.

But that pressure creates a paradox. Companies want to move fast on AI, yet the very autonomy that makes AI agents valuable also makes them dangerous. An agent that can independently execute workflows, process customer data, and make decisions without human intervention can also make mistakes at machine speed — or worse, be manipulated by bad actors.

This is where enterprise AI platforms differentiate themselves from the consumer AI tools that dominate headlines. According to Hinchcliffe, building a production-grade agentic AI system requires hundreds of specialized engineers working on governance, security, testing, and orchestration — infrastructure that most companies cannot afford to build themselves.

“The average enterprise-grade agentic team is 200-plus people working on an agentic platform,” Hinchcliffe said. “Salesforce has over 450 people working on agent AI.”

Early in the AI adoption cycle, many CIOs attempted to build their own agent platforms using open-source tools like LangChain. They quickly discovered the complexity exceeded their resources.

“They very quickly realized this problem was much bigger than expected,” Hinchcliffe explained. “To deploy agents at scale, you need infrastructure to manage them, develop them, test them, put guardrails on them, and govern them — because you’re going to have tens of thousands, hundreds of thousands, even millions of long-running processes out there doing work.”

How AI guardrails and security layers separate enterprise platforms from consumer chatbots

The technical architecture that separates enterprise AI platforms from consumer tools centers on what the industry calls a “trust layer” — a set of software systems that monitor, filter, and verify every action an AI agent attempts to take.

Hinchcliffe’s research found that only about half of the agentic AI platforms his firm evaluated included runtime trust verification — the practice of checking every transaction for policy compliance, data toxicity, and security violations as it happens, rather than relying solely on design-time constraints that can be circumvented.

“Salesforce puts every transaction, without exception, through that trust layer,” Hinchcliffe said. “That’s best practice, in our view. If you don’t have a dedicated system checking policy compliance, toxicity, grounding, security, and privacy on every agentic activity, you can’t roll it out at scale.”
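To make the idea concrete, here is a minimal sketch of what a runtime trust layer can look like in principle: every agent action passes through a fixed battery of checks before anything executes. This is not Salesforce’s implementation; the allow-list, blocked terms, and check functions below are toy stand-ins for the dedicated policy, toxicity, grounding, security, and privacy systems described above.

```python
# Toy stand-ins for the dedicated policy and privacy systems a production
# trust layer would call; real systems use classifiers and policy engines.
ALLOWED_ACTIONS = {"answer_question", "issue_refund", "reschedule"}  # assumed allow-list
BLOCKED_TERMS = {"ssn", "credit card number"}                        # assumed privacy screen

def check_policy(action: dict) -> bool:
    return action.get("name") in ALLOWED_ACTIONS

def check_privacy(action: dict) -> bool:
    text = str(action.get("args", "")).lower()
    return not any(term in text for term in BLOCKED_TERMS)

CHECKS = {"policy": check_policy, "privacy": check_privacy}

def run_with_trust_layer(action: dict, executor):
    """Put every transaction, without exception, through the checks first."""
    failures = [name for name, check in CHECKS.items() if not check(action)]
    if failures:
        # Block and log for review instead of letting the agent act at machine speed.
        raise PermissionError(f"blocked by trust layer: {failures}")
    return executor(action)

# Example:
# run_with_trust_layer({"name": "issue_refund", "args": {"order": 41}},
#                      executor=lambda a: f"executed {a['name']}")
```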

Sameer Hasan, who serves as Chief Technology and Digital Officer at Williams-Sonoma Inc., said the trust layer proved decisive in his company’s decision to adopt Agentforce across its portfolio of brands, which includes Pottery Barn, West Elm, and the flagship Williams-Sonoma stores that together serve approximately 20% of the U.S. home furnishings market.

“The area that caused us to make sure—let’s be slow, let’s not move too fast, and let this get out of control—is really around security, privacy, and brand reputation,” Hasan said. “The minute you start to put this tech in front of customers, there’s the risk of what could happen if the AI says the wrong thing or does the wrong thing. There’s plenty of folks out there that are intentionally trying to get the AI to do the wrong thing.”

Hasan noted that while the underlying large language models powering Agentforce — including technology from OpenAI and Anthropic — are broadly available, the enterprise governance infrastructure is not.

“We all have access to that. You don’t need Agentforce to go build a chatbot,” Hasan said. “What Agentforce helped us do more quickly and with more confidence is build something that’s more enterprise-ready. So there’s toxicity detection, the way that we handle PII and PII tokenization, data security and creating specific firewalls and separations between the generative tech and the functional tech, so that the AI doesn’t have the ability to just go comb through all of our customer and order data.”

The trust concerns appear well-founded. The Information reported that among Salesforce’s own executives, trust in generative AI has actually declined — an acknowledgment that even insiders recognize the technology requires careful deployment.

Corporate travel startup Engine deployed an AI agent in 12 days and saved $2 million

For Engine, a corporate travel platform valued at $2.1 billion following its Series C funding round, the business case for Agentforce crystallized around a specific customer pain point: cancellations.

Demetri Salvaggio, Engine’s Vice President of Customer Experience and Operations, said his team analyzed customer support data and discovered that cancellation requests through chat channels represented a significant volume of contacts — work that required human agents but followed predictable patterns.

Engine deployed its first AI agent, named Eva, in just 12 business days. The speed surprised even Salvaggio, though he acknowledged that Engine’s existing integration with Salesforce’s broader platform provided a foundation that accelerated implementation.

“We saw success right away,” Salvaggio said. “But we went through growing pains, too. Early on, there wasn’t the observability you’d want at your fingertips, so we were doing a lot of manual work.”

Those early limitations have since been addressed through Salesforce’s Agentforce Studio, which now provides real-time analytics showing exactly where AI agents struggle with customer questions — data that allows companies to continuously refine agent behavior.

The business results, according to Salvaggio, have been substantial. Engine reports approximately $2 million in annual cost savings attributable to Eva, alongside a customer satisfaction score improvement from 3.7 to 4.2 on a five-point scale — an increase Salvaggio described as “really cool to see.”

“Our current numbers show $2 million in cost savings that she’s able to address for us,” Salvaggio said. “We’ve seen CSAT go up with Eva. We’ve been able to go from like a 3.7 out of five scale to 4.2. We’ve had some moments at 85%.”

Perhaps more telling than the cost savings is Engine’s philosophy around AI deployment. Rather than viewing Agentforce as a headcount-reduction tool, Salvaggio said the company focuses on productivity and customer experience improvements.

“When you hear some companies talk about AI, it’s all about, ‘How do I get rid of all my employees?'” Salvaggio said. “Our approach is different. If we can avoid adding headcount, that’s a win. But we’re really focused on how to create a better customer experience.”

Engine has since expanded beyond its initial cancellation use case. The company now operates multiple AI agents — including IT, HR, product, and finance assistants deployed through Slack — that Salvaggio collectively refers to as “multi-purpose admin” agents.

Williams-Sonoma is using AI agents to recreate the in-store shopping experience online

Williams-Sonoma’s AI deployment illustrates a more ambitious vision: using AI agents not merely to reduce costs but to fundamentally reimagine how customers interact with brands digitally.

Hasan described a frustration that anyone who has used e-commerce over the past two decades will recognize. Traditional chatbots feel robotic, impersonal, and limited — good at answering simple questions but incapable of the nuanced guidance a knowledgeable store associate might provide.

“We’ve all had experiences with chatbots, and more often than not, they’re not positive,” Hasan said. “Historically, chatbot capabilities have been pretty basic. But when customers come to us with a service question, it’s rarely that simple — ‘Where’s my order?’ ‘It’s here.’ ‘Great, thanks.’ It’s far more nuanced and complex.”

Williams-Sonoma’s AI agent, called Olive, goes beyond answering questions to actively engaging customers in conversations about entertaining, cooking, and lifestyle — the same consultative approach the company’s in-store associates have provided for decades.

“What separates our brands from others in the industry—and certainly from the marketplaces—is that we’re not just here to sell you a product,” Hasan said. “We’re here to help you, educate you, elevate your life. With Olive, we can connect the dots.”

The agent draws on Williams-Sonoma’s proprietary recipe database, product expertise, and customer data to provide personalized recommendations. A customer planning a dinner party might receive not just product suggestions but complete menu ideas, cooking techniques, and entertaining tips.

Thattai, the Salesforce AI executive, said Williams-Sonoma is in what he describes as the second stage of agentic AI maturity. The first stage involves simple question-and-answer interactions. The second involves agents that actually execute business processes. The third — which he said is the largest untapped opportunity — involves agents working proactively in the background.

Critically, Hasan said Williams-Sonoma does not attempt to disguise its AI agents as human. Customers know they’re interacting with AI.

“We don’t try to hide it,” Hasan said. “We know customers may come in with preconceptions. I’m sure plenty of people are rolling their eyes thinking, ‘I have to deal with this AI thing’—because their experience with other companies has been that it’s a cost-cutting maneuver that creates friction.”

The company surveys customers after AI interactions and benchmarks satisfaction against human-assisted interactions. According to Hasan, the AI now matches human benchmarks — a standard the company refuses to compromise on.

“We have a high bar for service—a white-glove customer experience,” Hasan said. “AI has to at least maintain that bar. If anything, our goal is to raise it.”

Williams-Sonoma moved from pilot to full production in 28 days, according to Salesforce — a timeline that Thattai said demonstrates how quickly companies can deploy when they build on existing platform infrastructure rather than starting from scratch.

The three stages of enterprise AI maturity that determine whether companies see ROI

Beyond the headline customer statistics, Thattai outlined a three-stage maturity framework that he said describes how most enterprises approach agentic AI:

Stage one involves building simple agents that answer questions — essentially sophisticated chatbots that can access company data to provide accurate, contextual responses. The primary challenge at this stage is ensuring the agent has comprehensive access to relevant information.

Stage two involves agents that execute workflows — not just answering “what time does my flight leave?” but actually rebooking a flight when a customer asks. Thattai cited Adecco, the recruiting company, as an example of stage-two deployment. The company uses Agentforce to qualify job candidates and match them with roles — a process that involves roughly 30 discrete steps, conditional decisions, and interactions with multiple systems.

“A large language model by itself can’t execute a process that complex, because some steps are deterministic and need to run with certainty,” Thattai explained. “Our hybrid reasoning engine uses LLMs for decision-making and reasoning, while ensuring the deterministic steps execute with precision.”
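The general pattern behind that kind of hybrid engine can be sketched in a few lines: the deterministic steps are ordinary code that runs the same way every time, and only the genuinely ambiguous judgment calls are delegated to a model. The sketch below is a generic illustration, not Salesforce’s engine; the candidate-matching step functions and the llm_decide placeholder are hypothetical.

```python
def llm_decide(question: str, context: dict) -> str:
    """Stand-in for an LLM call that returns a judgment such as 'match'."""
    raise NotImplementedError("call your model provider here")

# Deterministic steps: plain code that must run the same way every time.
def fetch_candidate(candidate_id: str) -> dict:
    return {"id": candidate_id, "years_experience": 4}   # stub data

def check_eligibility(candidate: dict) -> bool:
    return candidate["years_experience"] >= 2

def schedule_interview(candidate: dict) -> None:
    print(f"interview scheduled for candidate {candidate['id']}")

def qualify_candidate(candidate_id: str) -> str:
    candidate = fetch_candidate(candidate_id)      # deterministic
    if not check_eligibility(candidate):           # deterministic
        return "rejected"
    # Only the genuinely ambiguous judgment goes to the model.
    if llm_decide("Does this candidate fit the open role?", candidate) == "match":
        schedule_interview(candidate)              # deterministic
        return "scheduled"
    return "needs_human_review"
```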

Stage three — and the one Thattai described as the largest future opportunity — involves agents working proactively in the background without customer initiation. He described a scenario in which a company might have thousands of sales leads sitting in a database, far more than human sales representatives could ever contact individually.

“Most companies don’t have the bandwidth to reach out and qualify every one of those customers,” Thattai said. “But if you use an agent to refine profiles and personalize outreach, you’re creating incremental opportunities that humans simply don’t have the capacity for.”

Salesforce edges out Microsoft in analyst rankings of enterprise AI platforms

The Futurum Group’s recent analysis of agentic AI platforms placed Salesforce at the top of its rankings, slightly ahead of Microsoft. The report evaluated ten major platforms — including offerings from AWS, Google, IBM, Oracle, SAP, ServiceNow, and UiPath — across five dimensions: business value, product innovation, strategic vision, go-to-market execution, and ecosystem alignment.

Salesforce scored above 90 (out of 100) across all five categories, placing it in what the firm calls the “Elite” zone. Microsoft trailed closely behind, with both companies significantly outpacing competitors.

Thattai acknowledged the competitive pressure but argued that Salesforce’s existing position in customer relationship management provides structural advantages that pure-play AI companies cannot easily replicate.

“The richest and most critical data a company has — data about their customers — lives within Salesforce,” Thattai said. “Most of our large customers use us for multiple functions: sales, service, and marketing. That complete view of the customer is central to running any business.”

The platform advantage extends beyond data. Salesforce’s existing workflow infrastructure means that AI agents can immediately access business processes that have already been defined and refined — a head start that requires years for competitors to match.

“Salesforce is not just a place where critical data is put, which it is, but it’s also where work is performed,” Thattai said. “The process by which a business runs happens in this application — how a sales process is managed, how a marketing process is managed, how a customer service process is managed.”

Why analysts say 2026 will be the real year of AI agents in the enterprise

Despite the momentum, both Salesforce executives and independent analysts cautioned that enterprise AI remains in early innings.

Hinchcliffe pushed back against the notion that 2025 was “the year of agents,” a phrase that circulated widely at the beginning of the year.

“This was not the year of agents,” Hinchcliffe said. “This was the year of finding out how ready they were, learning the platforms, and discovering where they weren’t mature yet. The biggest complaint we heard was that there’s no easy way to manage them. Once companies got all these agents running, they realized: I have to do lifecycle management. I have agents running on old versions, but their processes aren’t finished. How do I migrate them?”

He predicted that 2026 has “a much more likely chance of being the year of agents,” though he added that the “biggest year of agents” is “probably going to be the year after that.”

The Futurum Group’s analysis forecasts the AI platform market growing from $127 billion in 2024 to $440 billion by 2029 — a compound annual growth rate that dwarfs most enterprise software categories.

For companies still on the sidelines, Salvaggio offered pointed advice based on Engine’s early-adopter experience.

“Don’t take the fast-follower strategy with this technology,” he said. “It feels like it’s changing every week. There’s a differentiation period coming — if it hasn’t started already — and companies that waited are going to fall behind those that moved early.”

He warned that institutional knowledge about AI deployment is becoming a competitive asset in itself — expertise that cannot be quickly acquired through outside consultants.

“Companies need to start building AI expertise into their employee base,” Salvaggio said. “You can’t outsource all of this — you need that institutional knowledge within your organization.”

Thattai struck a similarly forward-looking note, drawing parallels to previous platform shifts.

“Think about the wave of mobile technology—apps that created entirely new ways of interacting with companies,” he said. “You’re going to see that happen with agentic technology. The difference is it will span every channel — voice, chat, mobile, web, text — all tied together by a personalized conversational experience.”

The question for enterprises is no longer whether AI agents will transform customer and employee experiences. The data from Salesforce’s customer base suggests that transformation is already underway, generating measurable returns for early adopters willing to invest in platform infrastructure rather than waiting for a theoretical bubble to burst.

“I feel incredibly confident that point solutions in each of those areas are not the path to getting to an agentic enterprise,” Thattai said. “The platform approach that we’ve taken to unlock all of this data in this context is really the way that customers are going to get value.”

Red teaming LLMs exposes a harsh truth about the AI security arms race

Unrelenting, persistent attacks on frontier models make them fail, with the patterns of failure varying by model and developer. Red teaming shows that it’s not the sophisticated, complex attacks that can bring a model down; it’s the attacker automating continuous, random attempts that will inevitably force a model to fail.

That’s the harsh truth that AI apps and platform builders need to plan for as they build each new release of their products. Betting an entire build-out on a frontier model prone to red team failures due to persistence alone is like building a house on sand. Even with red teaming, frontier LLMs, including those with open weights, are lagging behind adversarial and weaponized AI.

The arms race has already started

Cybercrime costs reached $9.5 trillion in 2024 and forecasts exceed $10.5 trillion for 2025. LLM vulnerabilities contribute to that trajectory. A financial services firm deploying a customer-facing LLM without adversarial testing saw it leak internal FAQ content within weeks. Remediation cost $3 million and triggered regulatory scrutiny. One enterprise software company had its entire salary database leaked after executives used an LLM for financial modeling, VentureBeat has learned.

The UK AISI/Gray Swan challenge ran 1.8 million attacks across 22 models. Every model broke. No current frontier system resists determined, well-resourced attacks.

Builders face a choice. Integrate security testing now, or explain breaches later. The tools exist — PyRIT, DeepTeam, Garak, OWASP frameworks. What remains is execution.

Organizations that treat LLM security as a feature rather than a foundation will learn the difference the hard way. The arms race rewards those who refuse to wait.

Red teaming reflects how nascent frontier models are

The gap between offensive capability and defensive readiness has never been wider. “If you’ve got adversaries breaking out in two minutes, and it takes you a day to ingest data and another day to run a search, how can you possibly hope to keep up?” Elia Zaitsev, CTO of CrowdStrike, told VentureBeat back in January. Zaitsev also implied that adversarial AI is progressing so quickly that the traditional tools AI builders trust to power their applications can be weaponized in stealth, jeopardizing product initiatives in the process.

Red teaming results to this point are a paradox, especially for AI builders who need a stable base platform to build from. Red teaming proves that every frontier model fails under sustained pressure.

One of my favorite things to do immediately after a new model comes out is to read the system card. It’s fascinating to see how well these documents reflect the red teaming, security, and reliability mentality of every model provider shipping today.

Earlier this month, I looked at how Anthropic’s and OpenAI’s red teaming practices reveal how differently the two companies approach enterprise AI. That’s important for builders to know, as getting locked into a platform that isn’t compatible with the building team’s priorities can be a massive waste of time.

Attack surfaces are moving targets, further challenging red teams

Builders need to understand how fluid the attack surfaces are that red teams attempt to cover, despite having incomplete knowledge of the many threats their models will face.

A good place to start is with one of the best-known frameworks. OWASP’s 2025 Top 10 for LLM Applications reads like a cautionary tale for any business building AI apps and attempting to expand on existing LLMs. Prompt injection sits at No. 1 for the second consecutive year. Sensitive information disclosure jumped from sixth to second place. Supply chain vulnerabilities climbed from fifth to third. These rankings reflect production incidents, not theoretical risks.

Five new vulnerability categories appeared in the 2025 list: excessive agency, system prompt leakage, vector and embedding weaknesses, misinformation, and unbounded consumption. Each represents a failure mode unique to generative AI systems. No one building AI apps can afford to ignore these categories without risking shipping vulnerabilities that security teams never detected, or worse, lost track of as threat surfaces shift.

“AI is fundamentally changing everything, and cybersecurity is at the heart of it. We’re no longer dealing with human-scale threats; these attacks are occurring at machine scale,” Jeetu Patel, Cisco’s President and Chief Product Officer, emphasized to VentureBeat at RSAC 2025. Patel noted that AI-driven models are non-deterministic: “They won’t give you the same answer every single time, introducing unprecedented risks.”

“We recognized that adversaries are increasingly leveraging AI to accelerate attacks. With Charlotte AI, we’re giving defenders an equal footing, amplifying their efficiency and ensuring they can keep pace with attackers in real-time,” Zaitsev told VentureBeat.

How and why model providers validate security differently

Each frontier model provider wants to prove the security, robustness, and reliability of its system by devising a unique and differentiated red teaming process, often explained in its system card.

From those system cards, it doesn’t take long to see how each provider’s approach to red teaming reflects its stance on security validation, versioning compatibility (or the lack of it), persistence testing, and its willingness to torture-test models with unrelenting attacks until they break.

In many ways, red teaming of frontier models is a lot like quality assurance on a commercial jet assembly line. Anthropic’s mentality is comparable to the well-known tests Airbus, Boeing, Gulfstream, and others perform. Often called the Wing Bend Test or Ultimate Load Test, these tests push a wing’s strength to the breaking point to ensure the largest possible safety margins.

Be sure to read Anthropic’s 153-page system card for Claude Opus 4.5 versus OpenAI’s 55-page GPT-5 system card to see firsthand how different their measurement philosophies are. Anthropic relies on multi-attempt attack success rates from 200-attempt reinforcement learning campaigns. OpenAI reports single-attempt jailbreak resistance.

Gray Swan’s Shade platform ran a series of adaptive adversarial campaigns against Claude models. The attack success rate (ASR) metrics quantify just how quickly models can degrade and break under pressure. Opus 4.5 in coding environments hit 4.7% ASR at one attempt, 33.6% at 10 attempts, and 63.0% at 100 attempts. In computer use with extended thinking, it held at 0% ASR even after 200 attempts, becoming the first model to saturate the benchmark. Sonnet 4.5 at the same thresholds showed 70% ASR in coding and 85.7% in computer use. The gap between model tiers within the same family is now larger than some cross-vendor comparisons.
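A quick back-of-the-envelope calculation shows why multi-attempt numbers matter so much. If every attempt succeeded independently at the single-attempt rate p, the chance that at least one of n attempts lands would be 1 - (1 - p)^n. The reported Gray Swan figures climb more slowly than that idealized model, but the direction is the same.

```python
# If every attempt succeeded independently with single-attempt rate p,
# the chance that at least one of n attempts lands would be 1 - (1 - p)^n.
p = 0.047   # Opus 4.5 single-attempt ASR in coding environments (Gray Swan)

for n in (1, 10, 100):
    print(n, round(1 - (1 - p) ** n, 3))
# -> 1 0.047, 10 0.382, 100 0.992
# The reported multi-attempt rates (4.7%, 33.6%, 63.0%) climb more slowly
# than this idealized independence model predicts, but the direction is the
# same: persistence, not sophistication, is what eventually breaks a model.
```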

On Gray Swan’s prompt injection benchmark, Claude Opus 4.5 shows the strongest resistance with an attack success rate of 4.7%, compared with 12.5% for Gemini 3 Pro and 21.9% for GPT-5.1.

For OpenAI’s o1, the system card reported 6% ASR for harmful text and 5% for malicious code based on single-attempt metrics. Third-party testing by SPLX and NeuralTrust found GPT-5’s raw ASR hit 89% initially, dropping below 1% within two weeks of patching.

The report confirms the pattern: With sufficient attempts, Claude 3.5 Sonnet showed 78% ASR, and GPT-4o reached 89%.

Models attempt to game tests during red teaming — adding to the paradox

Models scheme to turn off the oversight mechanisms when they predict they will be shut down. In Apollo Research’s evaluation of o1, the model attempted to turn off oversight mechanisms 5% of the time when it thought it would be shut down. In 2% of cases, o1 attempted self-exfiltration. When confronted about misaligned actions, o1 denied wrongdoing or fabricated explanations 99% of the time.

In 37% of scenarios, the model pursued developer-aligned goals during evaluation, then switched to its own goals when it believed oversight was minimal. Anthropic reports Opus 4.5’s evaluation awareness dropped from 26.5% (Opus 4.1) to less than 10% internally.

Evaluating Anthropic versus OpenAI red teaming results

| Dimension | Claude Opus 4.5 | GPT-5 | o1 |
| --- | --- | --- | --- |
| System card length | 153 pages | 55 pages | ~40 pages |
| Attack methodology | 200-attempt RL campaigns | Single-attempt + patching | Single-attempt |
| ASR @ 1 attempt | 4.7% (coding) | 89% raw (pre-patch) | 5-6% |
| ASR @ 200 attempts | 0% (saturates benchmark) | Not published | Not published |
| Prompt injection defense | 96-99.4% prevention | ~20% ASR (Fujitsu) | Higher than GPT-4o |
| Interpretability | ~10M neural features | CoT monitoring | CoT monitoring |
| Evaluation awareness | <10% | Identifies exact eval | 37% alignment faking |
| Scheming behavior | Orders of magnitude lower | Not published | 5% oversight off; 2% self-exfil |

Sources: Opus 4.5 system card, GPT-5 system card, o1 system card, Gray Swan, METR, Apollo Research

When a model attempts to game a red teaming exercise because it anticipates being shut down, AI builders need to know the sequence of reasoning that produced that behavior. No one wants a model that resists shutdown in an emergency, or while it is in command of a production process or workflow.

Defensive tools struggle against adaptive attackers

“Threat actors using AI as an attack vector has been accelerated, and they are so far in front of us as defenders, and we need to get on a bandwagon as defenders to start utilizing AI,” Mike Riemer, Field CISO at Ivanti, told VentureBeat.

Riemer pointed to patch reverse-engineering as a concrete example of the speed gap: “They’re able to reverse engineer a patch within 72 hours. So if I release a patch and a customer doesn’t patch within 72 hours of that release, they’re open to exploit because that’s how fast they can now do it,” he noted in a recent VentureBeat interview.

An October 2025 paper from researchers — including representatives from OpenAI, Anthropic, and Google DeepMind — examined 12 published defenses against prompt injection and jailbreaking. Using adaptive attacks that iteratively refined their approach, the researchers bypassed defenses with attack success rates above 90% for most. The majority of defenses had initially been reported to have near-zero attack success rates.

The gap between reported defense performance and real-world resilience stems from evaluation methodology. Defense authors test against fixed attack sets. Adaptive attackers iterate aggressively, a common theme across nearly every attempt to compromise a model.

Builders shouldn’t rely on frontier model builders’ claims without also conducting their own testing.

Open-source frameworks have emerged to address the testing gap. DeepTeam, released in November 2025, applies jailbreaking and prompt injection techniques to probe LLM systems before deployment. Garak from Nvidia focuses on vulnerability scanning. MLCommons published safety benchmarks. The tooling ecosystem is maturing, but builder adoption lags behind attacker sophistication.

What AI builders need to do now

“An AI agent is like giving an intern full access to your network. You gotta put some guardrails around the intern,” George Kurtz, CEO and founder of CrowdStrike, observed at Fal.Con 2025. That quote typifies the current state of frontier AI models as well.

Meta’s Agents Rule of Two, published October 2025, reinforces this principle: Guardrails must live outside the LLM. File-type firewalls, human approvals, and kill switches for tool calls cannot depend on model behavior alone. Builders who embed security logic inside prompts have already lost.

“Business and technology leaders can’t afford to sacrifice safety for speed when embracing AI. The security challenges AI introduces are new and complex, with vulnerabilities spanning models, applications, and supply chains. We have to think differently,” Patel told VentureBeat previously.

  • Input validation remains the first line of defense. Enforce strict schemas that define exactly what inputs your LLM endpoints can accept. Reject unexpected characters, escape sequences, and encoding variations. Apply rate limits per user and per session. Create structured interfaces or prompt templates that limit free-form text injection into sensitive contexts (a minimal sketch follows this list).

  • Output validation from any LLM or frontier model is a must-have. LLM-generated content passed to downstream systems without sanitization creates classic injection risks: XSS, SQL injection, SSRF, and remote code execution. Treat the model as an untrusted user. Follow OWASP ASVS guidelines for input validation and sanitization.

  • Always separate instructions from data. Use different input fields for system instructions and dynamic user content. Prevent user-provided content from being embedded directly into control prompts. This architectural decision prevents entire classes of injection attacks.

  • Think of regular red teaming as the muscle memory you always needed; it’s that essential. The OWASP Gen AI Red Teaming Guide provides structured methodologies for identifying model-level and system-level vulnerabilities. Quarterly adversarial testing should become standard practice for any team shipping LLM-powered features.

  • Control agent permissions ruthlessly. For LLM-powered agents that can take actions, minimize extensions and their functionality. Avoid open-ended extensions. Execute extensions in the user’s context with their permissions. Require user approval for high-impact actions. The principle of least privilege applies to AI agents just as it applies to human users.

  • Supply chain scrutiny cannot wait. Vet data and model sources. Maintain a software bill of materials for AI components using tools like OWASP CycloneDX or ML-BOM. Run custom evaluations when selecting third-party models rather than relying solely on public benchmarks.
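As a minimal sketch of the first and third recommendations above (strict input schemas, and keeping instructions separate from data), the snippet below validates user text against a conservative whitelist and keeps the system prompt in its own field so user content is never spliced into it. The length limit, character whitelist, and prompt text are illustrative assumptions, not a complete defense against prompt injection.

```python
import re

MAX_LEN = 500                                 # assumed limit
ALLOWED = re.compile(r"^[\w\s.,?!'-]+$")      # assumed conservative whitelist

SYSTEM_PROMPT = "You are a support agent. Answer only questions about orders."

def validate_user_input(text: str) -> str:
    """Reject anything outside the expected shape before it reaches the model."""
    if len(text) > MAX_LEN:
        raise ValueError("input too long")
    if not ALLOWED.match(text):
        raise ValueError("unexpected characters or escape sequences")
    return text

def build_messages(user_text: str) -> list[dict]:
    # Instructions and user data live in separate fields; user content is
    # never concatenated into the system prompt itself.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": validate_user_input(user_text)},
    ]
```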

From assistance to autonomy: How agentic AI is redefining enterprises

Presented by EdgeVerve


Artificial intelligence (AI) has long promised to change the way enterprises operate. For years, the focus was on assistants: systems that could surface information, summarize documents, or streamline repetitive tasks. While valuable, these technological assistants were reactive: they waited for human prompts and provided limited support within narrow boundaries.

Today, a new chapter is unfolding. Agentic AI, systems capable of autonomous decision-making and multi-step orchestration, represents a significant evolution. These systems don’t just assist; they act. They evaluate context, weigh outcomes and autonomously initiate actions, orchestrating complex workflows across functions. They adapt dynamically and collaborate with other agents in ways that are beginning to reshape enterprise operations at large.

For leaders, this shift carries both opportunity and responsibility. The potential is immense, but so are the governance, trust and design challenges that come with giving AI systems greater autonomy. Enterprises must be able to monitor and override any actions taken by the agentic AI systems.

Shift from assistance to autonomy

Traditional AI assistants primarily respond to queries and perform isolated tasks. They are helpful but constrained. Agentic AI pushes further: multiple agents can collaborate, exchange context and manage workflows end-to-end.

Imagine a procurement workflow. An assistant can pull vendor data or draft a purchase order. An agentic system, however, can review demand forecasts, evaluate vendor risk, check compliance policies, negotiate terms and finalize transactions. It does this all while coordinating across global business departments, including finance, operations and compliance.

This shift from narrow support to autonomous orchestration is the defining leap of the next era of enterprise AI. It is not about replacing humans but about embedding intelligence into the very fabric of organizational workflows.

Rethink enterprise workflows

Every enterprise department is focused on efficiency, scale and standardization. But agentic AI challenges enterprises to think differently. Instead of designing workflows step by step and inserting automation, organizations now need to reimagine and architect intelligent ecosystems that orchestrate processes, adapt to evolving business needs, and enable seamless collaboration between humans and agents.

That requires new thinking. Which decisions should remain human-led, and which can be delegated? How do you ensure agents access the correct data without overstepping boundaries? What happens when agents from finance, HR and supply chain must coordinate autonomously?

The design of workflows is no longer about linear handoffs; it is about orchestrated ecosystems. Enterprises that get this right can achieve speed and agility that traditional automation cannot match.

Accelerate agentic AI-led transformation with a unified platform

In this environment, unified platforms become critical. Without them, enterprises risk a proliferation of disconnected agents working at cross-purposes. A unified approach provides the guardrails: shared knowledge graphs, consistent policy frameworks and a single orchestration layer that ensures interoperability across business functions.

This platform-based approach not only reduces complexity but also enables scale. Enterprises don’t want dozens of fragmented AI projects that stall in the pilot stages. They want enterprise-grade systems where agents can collaborate securely and consistently across the enterprise.

Unified platforms simplify outcome monitoring and strengthen governance — both critical as systems become increasingly autonomous.

Build trust and accountability

As AI systems act with greater independence, the stakes rise. An agent that makes flawed decisions in customer service may frustrate a client. An agent that mishandles a compliance process could expose the enterprise to regulatory risk.

That’s why trust and accountability must be designed into agentic AI from the start. Governance is not an afterthought; it is a foundation. Leaders need clear policies defining the scope of agentic autonomy, transparent logging of decisions, mechanisms for evaluating and monitoring agents, and escalation paths for when human oversight is required.

Equally important is cultural trust. Employees must believe these systems are partners, not threats. This calls for change management, training, and communication that positions agentic AI as augmenting human capability rather than replacing it.

Measure business value early

One of the most common pitfalls in enterprise AI adoption is the gap between promising pilots and at-scale results. Studies show that a significant percentage of AI projects never make it past experimentation. Agentic AI cannot afford to fall into this trap.

Enterprises must measure business value early and continuously. This includes efficiency gains, cost reductions, error avoidance and even intangible benefits like faster decision-making or improved compliance. Success will be defined by automation coverage across processes, reductions in manual intervention and the ability to deliver new services at speed and scale.

When designed responsibly, agentic AI can deliver exponential improvements. A procurement cycle reduced from weeks to hours, or a compliance review automated at scale, can fundamentally alter enterprise performance.

Preparing for the future

The rise of agentic AI does not mean handing over control to machines or code. Instead, it marks the next phase of enterprise transformation, where humans and agents operate side by side in orchestrated systems.

Leaders should start by piloting agentic systems in well-defined domains with clear governance models. From there, scaling across the enterprise requires investment in unified platforms, robust policy frameworks, and a culture that embraces intelligent automation as a partner in value creation.

The enterprises that succeed will be those that approach agentic AI not as another tool, but as a strategic shift. Just as ERP and cloud once redefined operations, agentic AI is poised to do the same, reshaping workflows, governance, and the very way decisions are made.

Agentic AI is moving the enterprise conversation from assistance to autonomy. That change comes with objective complexity, but also with extraordinary promise. The foundation for success lies in unified platforms that enable enterprises to orchestrate with intelligence, govern with trust, and scale with confidence.

The journey is just beginning. And for enterprise leaders, now is the time to lead with vision, responsibility, and ambition.

N Shashidhar is VP and Global Platform Head of EdgeVerve AI Next.


Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact [email protected].

Agent autonomy without guardrails is an SRE nightmare

João Freitas is GM and VP of engineering for AI and automation at PagerDuty

As AI use continues to evolve in large organizations, leaders are increasingly seeking the next development that will yield major ROI. The latest wave of this ongoing trend is the adoption of AI agents. However, as with any new technology, organizations must ensure they adopt AI agents in a responsible way that allows them to facilitate both speed and security. 

More than half of organizations have already deployed AI agents to some extent, with more expecting to follow suit in the next two years. But many early adopters are now reevaluating their approach. Four in 10 tech leaders regret not establishing a stronger governance foundation from the start, which suggests they adopted AI rapidly but with room to improve on the policies, rules and best practices designed to ensure the responsible, ethical and legal development and use of AI.

As AI adoption accelerates, organizations must find the right balance between their exposure risk and the implementation of guardrails to ensure AI use is secure.

Where do AI agents create potential risks?

There are three principal areas of consideration for safer AI adoption.

The first is shadow AI, when employees use unauthorized AI tools without express permission, bypassing approved tools and processes. IT should create necessary processes for experimentation and innovation to introduce more efficient ways of working with AI. While shadow AI has existed as long as AI tools themselves, AI agent autonomy makes it easier for unsanctioned tools to operate outside the purview of IT, which can introduce fresh security risks.

Secondly, organizations must close gaps in AI ownership and accountability to prepare for incidents or processes gone wrong. The strength of AI agents lies in their autonomy. However, if agents act in unexpected ways, teams must be able to determine who is responsible for addressing any issues.

The third risk arises when there is a lack of explainability for actions AI agents have taken. AI agents are goal-oriented, but how they accomplish their goals can be unclear. AI agents must have explainable logic underlying their actions so that engineers can trace and, if needed, roll back actions that may cause issues with existing systems.

While none of these risks should delay adoption, addressing them will help organizations better ensure their security.

The three guidelines for responsible AI agent adoption

Once organizations have identified the risks AI agents can pose, they must implement guidelines and guardrails to ensure safe usage. By following these three steps, organizations can minimize these risks.

1: Make human oversight the default 

AI agency continues to evolve at a fast pace. However, we still need human oversight when AI agents are given the capacity to act, make decisions and pursue a goal that may impact key systems. A human should be in the loop by default, especially for business-critical use cases and systems. The teams that use AI must understand the actions it may take and where they may need to intervene. Start conservatively and, over time, increase the level of agency given to AI agents.

In conjunction, operations teams, engineers and security professionals must understand the role they play in supervising AI agents’ workflows. Each agent should be assigned a specific human owner for clearly defined oversight and accountability. Organizations must also allow any human to flag or override an AI agent’s behavior when an action has a negative outcome.

When considering tasks for AI agents, organizations should understand that, while traditional automation is good at handling repetitive, rule-based processes with structured data inputs, AI agents can handle much more complex tasks and adapt to new information in a more autonomous way. This makes them an appealing solution for all sorts of tasks. But as AI agents are deployed, organizations should control what actions the agents can take, particularly in the early stages of a project. Thus, teams working with AI agents should have approval paths in place for high-impact actions to ensure agent scope does not extend beyond expected use cases, minimizing risk to the wider system.
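A minimal sketch of such an approval path might look like the following: routine actions execute immediately, while anything on a high-impact list is held until the agent’s named owner approves it. The action names, owner check and execute stub are hypothetical, and a real system would notify the owner and persist pending tickets rather than keep them in memory.

```python
# Hypothetical high-impact action list; everything else runs automatically.
HIGH_IMPACT_ACTIONS = {"delete_records", "change_permissions", "issue_refund"}
PENDING_APPROVALS: list[dict] = []

def execute(action: str, args: dict):
    """Dispatch to the real tool implementation (stubbed here)."""
    return f"executed {action} with {args}"

def request_action(agent_id: str, owner: str, action: str, args: dict) -> dict:
    if action in HIGH_IMPACT_ACTIONS:
        ticket = {"agent": agent_id, "owner": owner, "action": action, "args": args}
        PENDING_APPROVALS.append(ticket)   # in practice, notify the owner out of band
        return {"status": "awaiting_approval", "ticket": ticket}
    return {"status": "executed", "result": execute(action, args)}

def approve(ticket: dict, approver: str):
    # Only the agent's assigned human owner can release a held action.
    if approver != ticket["owner"]:
        raise PermissionError("only the assigned owner can approve this action")
    PENDING_APPROVALS.remove(ticket)
    return execute(ticket["action"], ticket["args"])
```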

2: Bake in security 

The introduction of new tools should not expose a system to fresh security risks. 

Organizations should consider agentic platforms that comply with high security standards and are validated by enterprise-grade certifications such as SOC 2, FedRAMP or equivalent. Further, AI agents should not be allowed free rein across an organization’s systems. At a minimum, the permissions and security scope of an AI agent must be aligned with the scope of its owner, and any tools added to the agent should not allow for extended permissions. Limiting an AI agent’s access to a system based on its role will also ensure deployment runs smoothly. Keeping complete logs of every action taken by an AI agent can also help engineers understand what happened in the event of an incident and trace back the problem.

3: Make outputs explainable 

AI use in an organization must never be a black box. The reasoning behind any action must be visible so that any engineer reviewing it can understand the context the agent used for decision-making and access the traces that led to those actions.

Inputs and outputs for every action should be logged and accessible. This will help organizations establish a firm overview of the logic underlying an AI agent’s actions, providing significant value in the event anything goes wrong.
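One lightweight way to get that traceability is to wrap every agent tool so its inputs, outputs and errors land in an append-only log keyed by a trace ID. The decorator below is a sketch under that assumption; the example tool and the JSONL log file are hypothetical.

```python
import functools, json, time, uuid

def logged_action(fn):
    """Record inputs, outputs, and errors for every agent tool call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        record = {
            "trace_id": str(uuid.uuid4()),
            "action": fn.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "started_at": time.time(),
        }
        try:
            record["output"] = fn(*args, **kwargs)
            return record["output"]
        except Exception as exc:
            record["error"] = repr(exc)
            raise
        finally:
            # Append-only log an engineer can query after an incident.
            with open("agent_actions.jsonl", "a") as log:
                log.write(json.dumps(record, default=str) + "\n")
    return wrapper

@logged_action
def reschedule_meeting(meeting_id: str, new_time: str) -> str:
    # Hypothetical agent tool, used only to demonstrate the wrapper.
    return f"meeting {meeting_id} moved to {new_time}"
```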

Security underpins AI agents’ success

AI agents offer a huge opportunity for organizations to accelerate and improve their existing processes. However, if they do not prioritize security and strong governance, they could expose themselves to new risks.

As AI agents become more common, organizations must ensure they have systems in place to measure how they perform and the ability to take action when they create problems.

Read more from our guest writers. Or, consider submitting a post of your own! See our guidelines here.

Hiring specialists made sense before AI — now generalists win

Tony Stoyanov is CTO and co-founder of EliseAI

In the 2010s, tech companies chased staff-level specialists: backend engineers, data scientists, system architects. That model worked when technology evolved slowly. Specialists knew their craft, could deliver quickly and built careers on predictable foundations like cloud infrastructure or the latest JS framework.

Then AI went mainstream.

The pace of change has exploded. New technologies appear and mature in less than a year. You can’t hire someone who has been building AI agents for five years, as the technology hasn’t existed for that long. The people thriving today aren’t those with the longest résumés; they’re the ones who learn fast, adapt fast and act without waiting for direction. Nowhere is this transformation more evident than in software engineering, which has likely experienced the most dramatic shift of all, evolving faster than almost any other field of work.

How AI is rewriting the rules

AI has lowered the barrier to doing complex technical work, and it has also raised expectations for what counts as real expertise. McKinsey estimates that by 2030, up to 30% of U.S. work hours could be automated and 12 million workers may need to shift roles entirely. Technical depth still matters, but AI favors people who can figure things out as they go.

At my company, I see this every day. Engineers who never touched front-end code are now building UIs, while front-end developers are moving into back-end work. The technology keeps getting easier to use but the problems are harder because they span more disciplines.

In that kind of environment, being great at one thing isn’t enough. What matters is the ability to bridge engineering, product and operations to make good decisions quickly, even with imperfect information.

Despite all the excitement, only 1% of companies consider themselves truly mature in how they use AI. Many still rely on structures built for a slower era — layers of approval, rigid roles and an overreliance on specialists who can’t move outside their lane.

The traits of a strong generalist 

A strong generalist has breadth without losing depth. They go deep in one or two domains but stay fluent across many. As David Epstein puts it in Range, “You have people walking around with all the knowledge of humanity on their phone, but they have no idea how to integrate it. We don’t train people in thinking or reasoning.” True expertise comes from connecting the dots, not just collecting information.

The best generalists share these traits:

  • Ownership: End-to-end accountability for outcomes, not just tasks.

  • First-principles thinking: Question assumptions, focus on the goal, and rebuild when needed.

  • Adaptability: Learn new domains quickly and move between them smoothly.

  • Agency: Act without waiting for approval and adjust as new information comes in.

  • Soft skills: Communicate clearly, align teams and keep customers’ needs in focus.

  • Range: Solve different kinds of problems and draw lessons across contexts.

I try to make accountability a priority for my teams. Everyone knows what they own, what success looks like and how it connects to the mission. Perfection isn’t the goal; forward movement is.

Embracing the shift

Focusing on adaptable builders changed everything. These are the people with the range and curiosity to use AI tools to learn quickly and execute confidently.

If you’re a builder who thrives in ambiguity, this is your time. The AI era rewards curiosity and initiative more than credentials. If you’re hiring, look ahead. The people who’ll move your company forward might not be the ones with the perfect résumé for the job. They’re the ones who can grow into what the company will need as it evolves.

The future belongs to generalists and to the companies that trust them.

Read more from our guest writers. Or, consider submitting a post of your own! See our guidelines here.

Google releases FunctionGemma: a tiny edge model that can control mobile devices with natural language

While Gemini 3 is still making waves, Google isn’t taking its foot off the gas when it comes to releasing new models.

Yesterday, the company released FunctionGemma, a specialized 270-million parameter AI model designed to solve one of the most persistent bottlenecks in modern application development: reliability at the edge.

Unlike general-purpose chatbots, FunctionGemma is engineered for a single, critical utility—translating natural language user commands into structured code that apps and devices can actually execute, all without connecting to the cloud.

The release marks a significant strategic pivot for Google DeepMind and the Google AI Developers team. While the industry continues to chase trillion-parameter scale in the cloud, FunctionGemma is a bet on “Small Language Models” (SLMs) running locally on phones, browsers, and IoT devices.

For AI engineers and enterprise builders, this model offers a new architectural primitive: a privacy-first “router” that can handle complex logic on-device with negligible latency.

FunctionGemma is available immediately for download on Hugging Face and Kaggle. You can also see the model in action by downloading the Google AI Edge Gallery app on the Google Play Store.

The Performance Leap

At its core, FunctionGemma addresses the “execution gap” in generative AI. Standard large language models (LLMs) are excellent at conversation but often struggle to reliably trigger software actions—especially on resource-constrained devices.

According to Google’s internal “Mobile Actions” evaluation, a generic small model struggles with reliability, achieving only a 58% baseline accuracy for function calling tasks. However, once fine-tuned for this specific purpose, FunctionGemma’s accuracy jumped to 85%, creating a specialized model that can exhibit the same success rate as models many times its size.

That fine-tuning allows the model to handle more than simple on/off switches; it can parse complex arguments, such as identifying specific grid coordinates to drive game mechanics or other detailed logic.
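For builders who want a feel for the workflow, a function-calling loop with a small local model can be sketched with the standard Hugging Face Transformers pipeline. The model ID, tool descriptions and prompt format below are illustrative assumptions only; FunctionGemma’s exact tool-calling template is defined on its model card and should be followed in real use.

```python
from transformers import pipeline

# Assumed model ID; confirm the exact name on Hugging Face before use.
generator = pipeline("text-generation", model="google/functiongemma-270m")

# Hypothetical tool descriptions for illustration.
tools = (
    "set_alarm(hour: int, minute: int)\n"
    "send_message(contact: str, body: str)\n"
)

prompt = (
    "You translate user requests into exactly one function call.\n"
    f"Available functions:\n{tools}\n"
    "User: wake me up at 6:30 tomorrow\n"
    "Call:"
)

result = generator(prompt, max_new_tokens=32)
print(result[0]["generated_text"])   # e.g. ... set_alarm(hour=6, minute=30)
```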

The release includes more than just the model weights. Google is providing a full “recipe” for developers, including:

  • The Model: A 270M parameter transformer trained on 6 trillion tokens.

  • Training Data: A “Mobile Actions” dataset to help developers train their own agents.

  • Ecosystem Support: Compatibility with Hugging Face Transformers, Keras, Unsloth, and NVIDIA NeMo libraries.

Omar Sanseviero, Developer Experience Lead at Google DeepMind, highlighted the versatility of the release on X (formerly Twitter), noting the model is “designed to be specialized for your own tasks” and can run in “your phone, browser or other devices.”

This local-first approach offers three distinct advantages:

  • Privacy: Personal data (like calendar entries or contacts) never leaves the device.

  • Latency: Actions happen instantly without waiting for a server round-trip. The model’s small size also means it processes input quickly, particularly with access to accelerators such as GPUs and NPUs.

  • Cost: Developers don’t pay per-token API fees for simple interactions.

For AI Builders: A New Pattern for Production Workflows

For enterprise developers and system architects, FunctionGemma suggests a move away from monolithic AI systems toward compound systems. Instead of routing every minor user request to a massive, expensive cloud model like GPT-4 or Gemini 1.5 Pro, builders can now deploy FunctionGemma as an intelligent “traffic controller” at the edge.

Here is how AI builders should conceptualize using FunctionGemma in production:

1. The “Traffic Controller” Architecture: In a production environment, FunctionGemma can act as the first line of defense. It sits on the user’s device, instantly handling common, high-frequency commands (navigation, media control, basic data entry). If a request requires deep reasoning or world knowledge, the model can identify that need and route the request to a larger cloud model. This hybrid approach drastically reduces cloud inference costs and latency. This enables use cases such as routing queries to the appropriate sub-agent.

2. Deterministic Reliability over Creative Chaos: Enterprises rarely need their banking or calendar apps to be “creative.” They need them to be accurate. The jump to 85% accuracy confirms that specialization beats size. Fine-tuning this small model on domain-specific data (e.g., proprietary enterprise APIs) creates a highly reliable tool that behaves predictably—a requirement for production deployment.

3. Privacy-First Compliance: For sectors like healthcare, finance, or secure enterprise ops, sending data to the cloud is often a compliance risk. Because FunctionGemma is efficient enough to run on-device (compatible with NVIDIA Jetson, mobile CPUs, and browser-based Transformers.js), sensitive data like PII or proprietary commands never has to leave the local network.
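
As a rough illustration of the traffic-controller pattern from point 1, here is a minimal sketch in Python. The helper names (run_on_device_model, call_cloud_model), the tool list, and the confidence threshold are assumptions made for illustration, not part of any specific SDK:

# Minimal sketch of an edge "traffic controller" (assumed names, not a specific SDK).

LOCAL_TOOLS = {"set_alarm", "play_media", "toggle_setting", "navigate_to"}

def run_on_device_model(user_text: str) -> dict:
    """Placeholder for a local function-calling SLM such as FunctionGemma."""
    # A real implementation would run the on-device model here.
    return {"name": "set_alarm", "arguments": {"time": "07:00"}, "confidence": 0.93}

def call_cloud_model(user_text: str) -> str:
    """Placeholder for a larger hosted LLM used for deep reasoning."""
    return f"[cloud model answer for: {user_text}]"

def handle_request(user_text: str) -> dict:
    """Route high-frequency device actions locally; escalate everything else."""
    call = run_on_device_model(user_text)
    if call["name"] in LOCAL_TOOLS and call["confidence"] >= 0.8:
        return {"route": "on_device", "call": call}
    return {"route": "cloud", "response": call_cloud_model(user_text)}

print(handle_request("wake me up at 7am"))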

Licensing: Open-ish With Guardrails

FunctionGemma is released under Google’s custom Gemma Terms of Use. For enterprise and commercial developers, this is a critical distinction from standard open-source licenses like MIT or Apache 2.0.

While Google describes Gemma as an “open model,” it is not strictly “Open Source” by the Open Source Initiative (OSI) definition.

The license allows for free commercial use, redistribution, and modification, but it includes specific Usage Restrictions. Developers are prohibited from using the model for restricted activities (such as generating hate speech or malware), and Google reserves the right to update these terms.

For the vast majority of startups and developers, the license is permissive enough to build commercial products. However, teams building dual-use technologies or those requiring strict copyleft freedom should review the specific clauses regarding “Harmful Use” and attribution.

Correction: This article mistakenly listed Omar Sanseviero as Developer Lead at Hugging Face, his prior role. It has since been updated to his correct role at Google DeepMind. We apologize and regret the error.

What Firewalls Really Do and Why Every Network (Still) Needs Them

Firewalls are one of the oldest tools in network security.

Many people think they are outdated or replaced by newer tools like endpoint security or cloud security platforms, but that’s not the case. Firewalls still play a critical role in protecting networks, systems, and data.

A firewall acts like a security guard at the entrance of a building. It decides what can come in, what can go out, and what should be blocked.

Even though attacks have become more advanced, this basic control point is still essential.

In this article, I’ll explain what firewalls really do, how they work, and why every network still needs them today. We’ll also look at how firewalls have evolved to stay useful in modern cloud and hybrid environments.

What We Will Cover

  • What a Firewall Is in Simple Terms

  • What Firewalls Actually Do

  • How Firewalls Reduce Attack Surface

  • Firewalls and Internal Network Protection

  • Setting Up a Firewall

  • Firewalls in Cloud and Hybrid Networks

  • Firewalls and Compliance Requirements

  • Common Misunderstandings About Firewalls

  • Why Firewalls Still Matter Today

  • Firewalls as a Foundation, Not a Finish Line

What a Firewall Is in Simple Terms

A firewall is a system that controls network traffic based on rules. These rules define which connections are allowed and which are denied. The firewall sits between trusted systems and untrusted networks, most often between an internal network and the internet.

When data tries to move across the network, the firewall checks it. If the data follows the rules, it’s allowed through. If it breaks the rules, it’s blocked or logged for review.

Firewalls can be hardware devices, software programs, or cloud-based services. No matter the form, the goal is the same: they reduce risk by limiting exposure.

What Firewalls Actually Do

At the most basic level, a firewall filters traffic. It looks at details like IP addresses, ports, and protocols. For example, it can allow web traffic on port 443 but block unused or risky ports.

How a firewall helps

Modern firewalls go much further. They can inspect traffic at a deeper level. This is called deep packet inspection. Instead of just checking where traffic comes from, the firewall looks at what the traffic contains.

Firewalls can also track connections over time. This is known as stateful inspection. The firewall understands whether traffic is part of a valid conversation or an unexpected request. This helps stop many common attacks.

Another important job of a firewall is logging. Firewalls record what they allow and what they block. These logs are vital for audits, investigations, and compliance needs.
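
To make these ideas concrete, here is a small, self-contained Python sketch of the two checks described above: a stateless filter that matches on protocol and port, and a stateful check that only admits traffic belonging to a connection the firewall has already seen. It is a teaching model, not a real firewall implementation:

# Toy model of firewall filtering, for illustration only.

ALLOWED = {("tcp", 443), ("tcp", 22)}   # stateless rules: protocol + destination port
established = set()                      # stateful table of connections we've seen

def stateless_allow(packet: dict) -> bool:
    """Basic filtering: check protocol and destination port against the rules."""
    return (packet["proto"], packet["dport"]) in ALLOWED

def stateful_allow(packet: dict) -> bool:
    """Stateful inspection: only tracked conversations get their replies through."""
    conn = (packet["src"], packet["sport"], packet["dst"], packet["dport"])
    if stateless_allow(packet):
        established.add(conn)            # remember the valid conversation
        return True
    # Allow return traffic for a tracked connection, block everything else.
    reverse = (packet["dst"], packet["dport"], packet["src"], packet["sport"])
    return reverse in established

print(stateful_allow({"src": "10.0.0.5", "sport": 50000,
                      "dst": "203.0.113.7", "dport": 443, "proto": "tcp"}))  # True: matches a rule
print(stateful_allow({"src": "198.51.100.9", "sport": 4444,
                      "dst": "10.0.0.5", "dport": 50000, "proto": "tcp"}))   # False: unexpected request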

How Firewalls Reduce Attack Surface

Attack surface means the number of ways an attacker can try to get into a system. Firewalls reduce this by closing unnecessary paths.

Most systems don’t need to expose all services to the internet. A firewall ensures that only required services are reachable. Everything else stays hidden.

Even if an application has a weakness, a firewall can reduce the chance that attackers ever reach it. This doesn’t replace secure coding, but it adds a strong layer of defense.

This layered approach is known as defence in depth. Firewalls are a core layer in that strategy.

Firewalls and Internal Network Protection

Many people think firewalls are only for the network edge. That is no longer true. Internal firewalls are now just as important.

Inside a network, different systems have different risk levels. A database should not be freely accessible from every workstation. Firewalls help enforce this separation.

Network segmentation

This practice is often called network segmentation. By placing firewalls between network segments, organizations limit how far an attacker can move if they gain access to one system.

Internal firewalls are especially important in large environments, data centers, and cloud platforms.

Setting Up a Firewall

To make this practical, let’s look at a real, working example using UFW, an open source firewall available on most Linux systems. These are actual commands you would run on a server.

We will assume a simple use case: the server should allow secure web traffic on port 443 and allow SSH access for administration. All other incoming traffic should be blocked.

First, make sure you have UFW installed:

sudo apt update
sudo apt install ufw

Before enabling the firewall, define the default behaviour. Blocking all incoming traffic by default is a safe baseline. Outgoing traffic is allowed so the server can still reach external services.

sudo ufw default deny incoming
sudo ufw default allow outgoing

Next, allow SSH access. This is important so you don’t lock yourself out of the server.

sudo ufw allow ssh

If you prefer to be explicit about the port, you can allow port 22 directly.

sudo ufw allow 22/tcp

Now allow HTTPS traffic so users can reach the web application.

sudo ufw allow 443/tcp

At this point, only SSH and HTTPS are allowed. Everything else is blocked automatically.

You can review the rules you have added before enabling the firewall.

sudo ufw show added

When you are satisfied with the rules, enable the firewall like this:

sudo ufw enable

Once enabled, UFW immediately starts enforcing the rules.

To confirm everything is working, check the status again.

sudo ufw status numbered

Depending on your distribution, logging may be disabled or set to a minimal level by default. Enabling it gives visibility into blocked and allowed connections, which is useful for security monitoring and audits.

sudo ufw logging on

UFW also supports simple protection against brute force attacks. For example, you can rate limit SSH connections.

sudo ufw limit ssh

This rule allows normal usage but blocks IP addresses that make too many connection attempts in a short time.

If you need to restrict access to a service by IP address, UFW supports that as well. For example, allowing SSH only from a trusted office IP:

sudo ufw allow from 203.0.113.10 to any port 22 proto tcp

You can remove or change rules as your requirements evolve. For example, to delete a rule using its number, do this:

sudo ufw delete 3

This setup shows what a firewall actually looks like in practice. You define defaults, allow only what is required, enable logging, and enforce the rules.

Even though enterprise firewalls and cloud firewalls use more advanced interfaces, the underlying logic is the same. Clear rules control traffic flow, reduce attack surface, and provide visibility. Open source tools like UFW make these concepts easy to understand and apply in real systems.
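
For example, the policy we just built could be written down declaratively, which is essentially what cloud security groups and managed firewalls do behind their interfaces. This is a rough sketch in Python; the structure is illustrative, not any provider’s actual schema:

# Illustrative only: the UFW policy above expressed as declarative data,
# similar in spirit to a cloud security group (not any provider's real format).

policy = {
    "defaults": {"incoming": "deny", "outgoing": "allow"},
    "rules": [
        {"action": "limit", "port": 22,  "proto": "tcp", "comment": "SSH, rate limited"},
        {"action": "allow", "port": 443, "proto": "tcp", "comment": "HTTPS"},
        {"action": "allow", "port": 22,  "proto": "tcp", "from": "203.0.113.10",
         "comment": "SSH from trusted office IP"},
    ],
    "logging": "on",
}

for rule in policy["rules"]:
    print(rule["action"].upper(), rule.get("from", "any"), "->", f'{rule["port"]}/{rule["proto"]}')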

Firewalls in Cloud and Hybrid Networks

Cloud computing changed how networks are built, but it did not remove the need for firewalls. In fact, it increased their importance.

In cloud environments, firewalls are often provided as managed services. They may be called security groups, network security rules, or cloud firewalls. The name changes, but the role is the same.

Hybrid networks combine on-premise systems with cloud systems. Firewalls control traffic between these environments. They help enforce consistent security rules across locations.

Without firewalls, cloud resources would be exposed directly to the internet. That would be risky and costly.

Firewalls and Compliance Requirements

Many industries have strict security rules. Banks, healthcare providers, and large enterprises must follow regulations. Firewalls help meet these requirements.

Regulations often require control over network access. They also require logging and monitoring. Firewalls provide both.

Auditors frequently ask for firewall configurations and logs. A well-managed firewall setup makes audits easier and reduces compliance risk.

Even small companies benefit from these controls. Security standards are not only for large enterprises anymore.

Common Misunderstandings About Firewalls

One common myth is that firewalls stop all attacks, but this isn’t true. Firewalls aren’t magic shields. They are one part of a broader security strategy.

Another misunderstanding is that firewalls slow networks down. Modern firewalls are built for high performance. When configured correctly, the impact is minimal.

Some believe that endpoint security replaces firewalls. Endpoint tools protect individual devices. Firewalls protect the network paths between them. Both are needed.

Understanding these limits helps teams use firewalls effectively instead of relying on them blindly.

Why Firewalls Still Matter Today

Cyber attacks are more frequent and more automated than ever. Exposed systems are scanned constantly. Firewalls provide the first line of resistance.

New technologies don’t remove the need for boundaries. Even zero-trust models rely on strict access controls, often enforced by firewall-like systems.

Every network, no matter the size, benefits from clear rules about who can talk to whom. Firewalls enforce those rules reliably and visibly.

Without firewalls, organisations would rely only on application security and user behaviour. That’s not enough in today’s threat landscape.

Firewalls as a Foundation, Not a Finish Line

It’s important to see firewalls as a foundation. They create a secure base on which other controls can work better.

Security monitoring, incident response, and threat detection all depend on controlled traffic flows. Firewalls make these systems more effective.

When something goes wrong, firewall logs often provide the first clues. They show what happened at the network level.

This makes firewalls valuable not just for prevention, but also for understanding and recovery.

Conclusion

Firewalls are not outdated tools from the past. They are still essential for protecting modern networks. They control access, reduce attack surface, support compliance, and enable strong security design.

While technology keeps changing, the need to control network traffic does not go away. Firewalls have adapted to cloud, hybrid, and complex environments.

Every network still needs a firewall. Not as the only defense, but as a critical part of a layered security approach. When used correctly, firewalls continue to do what they have always done best: keep the right doors open and keep the wrong ones closed.

How to Build a Real-time AI Gym Coach with Vision Agents

Computer vision is transforming how people train, from at-home workouts to smart gym mirrors.

Imagine walking into your home gym, turning on your camera, and having an AI coach that sees your movements, counts your reps, and corrects your form in real time.

That’s exactly what we’re building in this tutorial: a real-time gym companion and fitness coach.

We’ll integrate Vision Agents’ low-latency video inference to detect movement patterns, count reps, and give instant voice feedback like “Straighten your back!” or “Keep your form tight!”, just like a human trainer would.

Here is a demo video of the AI gym companion during a workout session:

What We’ll Cover:

  1. Prerequisites

  2. Setting Up the Project

  3. How to Run the App

  4. Next Steps

Prerequisites

  • Python 3.13 or higher

  • API keys for Stream and Gemini (or OpenAI)

  • Code editor like VS Code or Windsurf

Setting Up the Project

Create a new directory on your computer called gym_buddy. You can also do it directly in your terminal with this command:

mkdir gym_buddy

Then open the directory in your IDE (for this guide, I’m using Windsurf IDE).

If you don’t have uv (a fast Python package installer and resolver) installed on your computer, install it with this command:

pip install uv

Note: After installing uv, you can also run uv init to scaffold the project with sample files and a pyproject.toml file containing the project metadata.

Next, we’ll create the pyproject.toml file. This is a configuration file for Python projects that specifies build system requirements and other project metadata. It’s a standard file used by modern Python packaging tools.

Enter the code below:

[project]
name = "gym-buddy"
version = "0.1.0"
requires-python = ">=3.13"
dependencies = [
    "python-dotenv>=1.0",
    "vision-agents",
    "vision-agents-plugins-openai",
    "vision-agents-plugins-getstream",
    "vision-agents-plugins-ultralytics",
    "vision-agents-plugins-gemini",
]

[tool.uv.sources]
"vision-agents" = {path = "../../agents-core", editable=true}
"vision-agents-plugins-deepgram" = {path = "../../plugins/deepgram", editable=true}
"vision-agents-plugins-ultralytics" = {path = "../../plugins/ultralytics", editable=true}
"vision-agents-plugins-openai" = {path = "../../plugins/openai", editable=true}
"vision-agents-plugins-getstream" = {path = "../../plugins/getstream", editable=true}
"vision-agents-plugins-gemini" = {path = "../../plugins/gemini", editable=true}

You can also create a requirements.in file with just the direct dependencies, like so:

python-dotenv>=1.0
vision-agents
vision-agents-plugins-openai
vision-agents-plugins-getstream
vision-agents-plugins-ultralytics
vision-agents-plugins-gemini

Then install the dependencies with uv. If you are using the pyproject.toml, run:

uv sync

This resolves the project’s dependencies and generates a uv.lock file. If you created a requirements.in file instead, you can install from it with uv pip install -r requirements.in.

If you are using a Windows OS, you might come across a dependency installation error, particularly with NumPy. This is likely due to missing build tools on your system.

Why NumPy is required

NumPy is a Python library for numerical computing. In this project, it’s used by the computer-vision and AI components (such as YOLO-based detection and Vision Agents) to handle image data, bounding boxes, coordinates, and other numerical outputs produced during real-time video analysis.

Many of the libraries used here depend on it for fast array operations and mathematical computations. That’s why NumPy is installed as part of the setup and why issues with its installation can affect the entire pipeline.

To resolve it, install Visual Studio Build Tools (required for building Python packages with C extensions). During installation, make sure that you select “Desktop development with C++”. This installs all the necessary build tools.

The Visual Studio Installer shows a screen like this once the installation is done. You may need to restart your computer for the changes to take effect.

The Visual Studio installer

Now run this command in your terminal:

python -m pip install -e .

The command above installs the project in editable mode along with all the necessary dependencies.

How to Get Your API Keys

For this project, we need to get API keys from Stream and Gemini/OpenAI.

To get your Stream API key, go ahead and sign up with your preferred method.

Stream’s sign-up page

Then, navigate to your dashboard and click ‘Create App’ to create a new app for the AI gym companion.

Stream dashboard

Enter the name for the app, choose the environment (Development/Production), select a region, and click on ‘Create App’.

Create the App on Stream

After creating the app, click on the dashboard overview tab in the left sidebar, then navigate to the Video tab and click on “API Keys”. Copy your API key and secret, and save them securely.

To get your Gemini API key, visit the Google AI Studio website, then click on Get started.

Setup your Google AI studio account

Then, go to your dashboard and click on ‘Create API key’.

Create your API key

Enter a name for the key, then create a new project for the API key.

Name your API key

After you have created the new API key, copy it and save it securely.

Building the AI gym companion

Now that you have the API keys you’ll need for the AI gym companion, create a .env file in the project’s root directory and add all the API keys like so:

GEMINI_API_KEY=your_gemini_key
STREAM_API_KEY=your_stream_key
STREAM_API_SECRET=your_stream_secret

If you’re using OpenAI instead of Gemini, also add:

OPENAI_API_KEY=your_openai_key

This is the project and codebase structure for the gym companion app we are building:

The codebase and project folder for the AI gym companion

In the root directory, create an empty __init__.py file. This file makes Python treat the directory as a package. You can add a comment to the file as a reminder of its purpose, like so:

# This file makes Python treat the directory as a package.

Next, create a gym_buddy.py file. This is the main app file, containing agent setup and call joining logic for the Gym Companion. Enter the code below in the file:

import logging
from dotenv import load_dotenv
from vision_agents.core import User, Agent, cli
from vision_agents.core.agents import AgentLauncher
from vision_agents.plugins import getstream, ultralytics, gemini
logger = logging.getLogger(__name__)
load_dotenv()
async def create_agent(**kwargs) -> Agent:
    agent = Agent(
        edge=getstream.Edge(),  # use stream for edge video transport
        agent_user=User(name="AI gym companion"),
        instructions="Read @gym_buddy.md",  # read the gym buddy markdown instructions
        llm=gemini.Realtime(fps=3),  # Share video with gemini
        # llm=openai.Realtime(fps=3), use this to switch to openai
        processors=[
            ultralytics.YOLOPoseProcessor(model_path="yolo11n-pose.pt")
        ],  # realtime pose detection with yolo
    )
    return agent
async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
    call = await agent.create_call(call_type, call_id)
    # join the call and open a demo env
    with await agent.join(call):
        await agent.llm.simple_response(
            text="Say hi. After the user does their exercise, offer helpful feedback."
        )
        await agent.finish()  # run till the call ends
if __name__ == "__main__":
    cli(AgentLauncher(create_agent=create_agent, join_call=join_call))

Then create a gym_buddy.md file. This is an instructions file for the gym agent’s coaching guide, which it will follow when analysing the workouts and providing real-time feedback. Enter the markdown code below:

You are a voice fitness coach. You will watch the user's workout and offer feedback.
The video clarifies the body position using Yolo's pose analysis, so you'll see their exact movement.
Speak with a high-energy, motivating tone. Be strict about form but encouraging. Do not give feedback if you are not sure or do not see an exercise.
# Gym Workout Coaching Guide
## 1. Introduction
A fitness coach's primary responsibility is to ensure safety and efficacy in every movement. While everybody is different, the fundamental mechanics of human movement—stability, alignment, and range of motion—remain constant. By monitoring key checkpoints like spinal alignment, joint tracking, and tempo, coaches can guide athletes toward stronger, injury-free workouts. The following guidelines break down the core compound movements into phases, with clear teaching points and coaching cues.
## 2. The Squat: Setup and Stance
The squat is the king of lower-body exercises, but it starts before the descent. The athlete should stand with feet shoulder-width apart or slightly wider, toes pointed slightly outward (5-30 degrees). The spine must be neutral, chest proud, and core braced. Coaches should watch for collapsing arches in the feet or a rounded upper back. A solid setup creates the tension needed for a powerful lift.
## 3. The Squat: Descent (Eccentric Phase)
The movement begins by breaking at the hips and knees simultaneously. The hips should travel back and down, as if sitting in a chair, while the knees track in line with the toes. Coaches must ensure the heels stay glued to the floor. Common errors include "knee valgus" (knees caving in) or the torso collapsing forward. The descent should be controlled and deliberate.
## 4. The Squat: Depth and Reversal
"Depth" is achieved when the hip crease drops below the top of the knee (parallel). While not everyone has the mobility for this, it is the standard for a full range of motion. At the bottom, the athlete should maintain tension—no bouncing or relaxing. The reversal (concentric phase) is driven by driving the feet into the floor and extending the hips and knees, exhaling forcefully.
## 5. The Push-up: The Plank Foundation
A perfect push-up is essentially a moving plank. The setup requires hands placed slightly wider than shoulder-width, directly under the shoulders. The body must form a straight line from head to heels. Coaches should watch for sagging hips (lumbar extension) or piking hips (flexion). Glutes and quads should be squeezed tight to lock the body into a rigid lever.
## 6. The Push-up: Mechanics
As the athlete lowers themselves, the elbows should track back at roughly a 45-degree angle to the torso, forming an arrow shape, not a "T". The chest should descend until it nearly touches the floor. The neck must remain neutral—no reaching with the chin. The push back up should be explosive, fully extending the arms without locking the elbows violently.
## 7. The Lunge: Step and Stability
The lunge challenges balance and unilateral strength. Whether forward or reverse, the step should be long enough to allow both knees to bend to approximately 90 degrees at the bottom. The feet should remain hip-width apart throughout the movement, like moving on train tracks, not a tightrope. Coaches should look for wobbling or the front heel lifting off the ground.
## 8. The Lunge: Alignment
In the bottom position, the front knee should be directly over the ankle, not shooting far past the toes (though some forward travel is acceptable). The torso should remain upright or have a very slight forward lean; collapsing over the front thigh is a fault. The back knee should hover just an inch off the ground. Drive through the front heel to return to the start.
## 9. Tempo and Control
Time under tension builds muscle and control. Coaches should encourage a specific tempo, such as 2-0-1 (2 seconds down, 0 pause, 1 second up). Rushing through reps often masks muscle imbalances and relies on momentum rather than strength. If an athlete speeds up, cue them to "slow down and own the movement."
## 10. Breathing Mechanics
Proper breathing stabilises the core. The general rule is to inhale during the eccentric phase (lowering) and exhale during the concentric phase (lifting/pushing). For heavy lifts, the Valsalva manoeuvre (bracing the core with a held breath) may be appropriate, but for general fitness, rhythmic breathing ensures oxygen delivery and blood pressure management.
## 11. Common Faults and Fixes
- **Squat - Butt Wink**: Posterior pelvic tilt at the bottom. Fix: Limit depth or improve hamstring/ankle mobility.
- **Push-up - Winging Scapula**: Shoulder blades popping up. Fix: Push the floor away at the top (protraction) and engage serratus anterior.
- **Lunge - Valgus Knee**: Front knee collapsing in. Fix: Cue "push the knee out" and engage the glute medius.
- **General - Ego Lifting**: Sacrificing form for reps or weight. Fix: Regress the exercise or slow the tempo

How the AI Agent works

Now that the instruction file for the AI agent is set up, let’s look at how the agent-creation code and the markdown instructions above work together. In gym_buddy.py, the agent is created and initialised with specific components (shown here again with each piece annotated):

async def create_agent(**kwargs) -> Agent:
    return Agent(
        edge=getstream.Edge(),  # Stream provides the edge video transport
        agent_user=User(name="AI gym companion"),
        instructions="Read @gym_buddy.md",  # Loads the coaching instructions
        llm=gemini.Realtime(fps=3),  # Gemini receives video frames in real time
        processors=[
            ultralytics.YOLOPoseProcessor(model_path="yolo11n-pose.pt")  # Real-time pose detection
        ],
    )

The gym_buddy.md file contains structured instructions that guide the gym companion agent’s behaviour. Conceptually, its sections boil down to guidance like this:

## Coaching Style
- Be encouraging and positive
- Provide clear, actionable feedback
- Focus on one correction at a time

## Squat Form
- Keep chest up and back straight
- Knees should track over toes
- Lower until thighs are parallel to ground
- Push through heels to stand

## Safety Guidelines
- Stop user if a dangerous form is detected
- Suggest modifications for beginners
- Remind to keep core engaged

These instructions are loaded via the instructions="Read @gym_buddy.md" parameter in gym_buddy.py. The agent then parses the file to understand how to analyse your form during the workout session and provide feedback.

Under the hood, the flow for each video frame looks roughly like this (simplified pseudocode rather than the literal Vision Agents API):

# Processing video frames
async def process_frame(self, frame):
    # Analyze pose using YOLO
    poses = await self.pose_processor.process(frame)

    # Generate feedback based on instructions
    feedback = await self.llm.generate_feedback(
        poses=poses,
        instructions=self.instructions
    )
    return feedback

When giving feedback, the agent compares the detected poses with the ideal form from the markdown. Then, it generates natural language feedback using the specified tone and style. The safety guidelines in the gym_buddy.md are checked first, then specific form corrections are mentioned by the agent.
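
The prioritisation described above can be pictured as a small decision function. This is a conceptual sketch of the ordering (safety first, then a single form correction), using made-up pose flags rather than the real YOLO output:

# Conceptual sketch of feedback prioritisation; the pose flags are invented for illustration.

def choose_feedback(pose: dict) -> str:
    """Safety issues first, then at most one form correction, then encouragement."""
    # Safety guidelines from gym_buddy.md are checked before anything else.
    if pose.get("knee_valgus"):
        return "Stop for a second. Push your knees out so they track over your toes."
    if pose.get("rounded_back"):
        return "Straighten your back and brace your core before the next rep."

    # One form correction at a time, per the coaching style section.
    if pose.get("above_parallel"):
        return "Good control! Try to sink a little deeper, hips just below the knees."

    return "Nice rep! Keep that form tight."

print(choose_feedback({"knee_valgus": False, "above_parallel": True}))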

To add a new exercise, you can update the gym_buddy.md file with a new section like so:

## Push-up Form
- Keep body in a straight line
- Lower until chest nearly touches floor
- Push through palms to return up
- Keep core engaged

The agent will automatically incorporate these instructions the next time it runs. This makes it easy to update and expand the agent’s capabilities by simply editing the markdown file.

You can view the complete code for the AI Gym Companion in the GitHub repository.

How to Run the App

First, create a virtual environment in Python with this command:

python -m venv .venv

It creates the .venv directory.

Then activate the virtual environment. On Windows:

.venv\Scripts\activate

On macOS or Linux:

source .venv/bin/activate

Now run the AI agent with this command:

uv run gym_buddy.py

You can also start the app with this command:

python gym_buddy.py

It begins loading like so:

The AI gym companion is loading

The AI agent will:

  1. Create a video call

  2. Open a demo UI in your browser

  3. Join the call and start watching

  4. Ask you to do a squat exercise

  5. Analyse your moves and positions, and then provide feedback

Gemini AI is connected and the browser for the gym companion is opened

The terminal output above also shows that Gemini AI is connected.

The agent then loads in your browser like so:

The AI gym companion is launched

It also displays a pop-up modal that introduces the Vision Agents. You can skip the intro or click on Next to proceed.

The Vision Agent uses a global edge network to keep call latency low, which is what lets the AI gym companion provide real-time feedback on the exercises users perform.

The gym companion detects the visuals and movements

The AI gym companion can also provide chat messages on the exercises through the chatbox displayed on the right side of the UI. This is provided through the chat SDK/API.

The AI gym companion gives feedback

When you perform a squat, the Vision Agent (powered by Gemini) analyses the video frames in real-time. It detects the completion of the movement and triggers the send_rep_count tool. This instantly updates the exercise counter on your screen and provides an encouraging text and voice response!
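
In the repository, rep completion is reported by the model through the send_rep_count tool. As a rough, self-contained illustration of how rep detection could work from pose data, here is a small state machine; the thresholds and the normalised hip-depth input are assumptions for the sketch, not the project’s actual implementation:

# Conceptual rep-counting state machine (illustrative; not the repository's actual code).

class SquatRepCounter:
    """Count squat reps from a normalised hip depth (0 = standing, 1 = deep squat)."""

    def __init__(self, down_threshold: float = 0.7, up_threshold: float = 0.3):
        self.down_threshold = down_threshold
        self.up_threshold = up_threshold
        self.at_bottom = False
        self.reps = 0

    def update(self, hip_depth: float) -> int:
        # Entering the bottom of the squat.
        if hip_depth >= self.down_threshold:
            self.at_bottom = True
        # Standing back up after reaching depth counts as one completed rep.
        elif self.at_bottom and hip_depth <= self.up_threshold:
            self.at_bottom = False
            self.reps += 1
        return self.reps

counter = SquatRepCounter()
for depth in [0.1, 0.5, 0.8, 0.75, 0.4, 0.2, 0.1]:  # one full squat
    reps = counter.update(depth)
print(reps)  # 1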

Here is a demo video of the AI gym companion during a workout session:

You can also copy the link and share it, or scan the QR code below to test the Gym Companion on your mobile phone.

Scan the QR code to test on your mobile phone

If you want to test it on your phone, install the Stream Video calls app for iOS devices for a better mobile experience.

Next Steps

In this tutorial, you’ve learned how to build an AI gym companion using Vision Agents.

The Real-Time Gym Companion illustrates how vision AI unlocks human-like interactivity by merging:

  • Video perception (seeing)

  • LLM understanding (thinking)

  • Speech feedback (speaking)

This low-latency technology lets you create real-time fitness apps that give instant feedback, much like a personal trainer would.

You can check out more project use cases with Vision Agents in the GitHub repository.

freeCodeCamp’s B1 English for Developers Certification is Now Live

The freeCodeCamp community just published our new B1 English for Developers certification for intermediate learners of English as a second language. You can now sit for the exam to earn the free verified certification, which you can add to your résumé, CV, or LinkedIn profile.

Two coworkers in a modern office setting discussing cybersecurity training, shown within an interactive dialogue lesson interface.

How Does the B1 English for Developers Certification Work?

This intermediate-level English for Developers curriculum will help you strengthen the foundational skills developed in the A2 English for Developers certification while introducing more complex grammar and expanding on work-related topics.

This entire curriculum will follow the B1 level of the Common European Framework of Reference (CEFR). And, as always, we’ve focused on vocabulary that is particularly useful for developers.

You’ll learn how to describe places and things, share past experiences, and confidently use tenses like Present Perfect and Future. Practical communication strategies are included as well, such as managing conversations, expressing opinions, and building agreement or disagreement in discussions.

Course interface showing an expanded section titled “Learn How to Talk About Past Activities.” A short description explains that the course teaches how to talk about things you did. A “Start project” button appears below, followed by a section called “Dialogue 1: The Latest User Feedback” with a numbered grid of lesson steps from 1 to 41.

You’ll also practice vocabulary and phrases essential for developers, such as describing code, participating in meetings, and discussing tech trends. Advanced topics include conditionals, comparative structures, and conversation management, so you can prepare for more complex interactions.

Interactive lesson screen showing Task 1 of a listening exercise. An illustrated woman stands in a modern office setting. Below the image, learners are prompted to listen to audio and complete a sentence with a missing word: “Hey, James. blank for the interruption earlier.”

This entire B1-level curriculum includes 73 different dialogues recorded by native English speakers. Each section follows a unique theme, contains several dialogues, and is filled with hundreds of interactive tasks that will help you develop your English skills.

Course interface showing multiple collapsed learning sections focused on expressing opinions, agreement, disagreement, concerns, decisions, modal verbs, and conversation management.

These tasks will introduce new vocabulary, teach grammar, or review concepts that you’ll need to know in order to understand what the characters from the dialogues are saying. Each task will have an accompanying question that will help you practice the content.

Multiple-choice question about concerns with remote work, followed by an explanation defining communication and collaboration.

The curriculum also has fill-in-the-blank questions that will help you practice writing in English.

Listening exercise featuring an illustrated man in a conference room. Learners are asked to listen to audio and complete a sentence by filling in two missing words related to managing work across different times.

Once you’ve completed the certification, you’ll be able to take the B1 English for Developers exam. This exam contains 54 grammar questions, 24 listening questions, and 9 reading questions. All the questions are based on what’s covered in the certification course.

You can take the exam by using our new open source exam environment. The freeCodeCamp community designed this exam environment tool with two goals in mind: respecting your privacy while also making it harder for people to cheat.

Once you download the app to your laptop or desktop, you can take the exam.

Screen showing the B1 English for Developers Certification Exam page. It explains that the exam is required to earn the certification and is free to take. The page includes information about the exam environment app, the current app version, buttons to open or download the latest version of the app, an option to manually download it, and a support email for help with downloading issues.

Frequently Asked Questions

Is all of this really free?

Yes. freeCodeCamp has always been free, and we’ve now offered free verified certifications for more than a decade. These exams are just the latest expansion to our community’s free learning resources.

Can I study the certification courses in languages other than English?

We aim to make every course available in all supported languages on freeCodeCamp. Check your account settings to see if the course you are studying is already offered in your preferred language.

What language skills does the curriculum cover?

The language courses currently cover listening, reading, and writing. We plan to add speaking later on.

Is the audio in the language courses and exams recorded by native speakers?

Yes. All the audio in the language courses was recorded by native speakers of that language.

I am Deaf or hard of hearing. Can I still study the language courses?

Yes! All audio lessons have closed captions and transcripts available for reading.

I am blind or have limited vision, and use a screen reader. Can I still study the language courses?

Yes! freeCodeCamp courses are designed to be accessible, and you can study the language courses using a screen reader. If you run into any accessibility issues, you can report them on our GitHub repository so the community can address them.

What are the letters and numbers beside the certification name? (For example: A1, A2, B1)

These labels refer to CEFR levels. The CEFR is an international framework used to describe language proficiency: A1 and A2 represent beginner levels, B1 and B2 represent intermediate levels, and C1 and C2 represent advanced levels.

Graphic titled “CEFR Levels Explained” showing the six CEFR language proficiency levels from A1 to C2. A1 is described as communicating with help from the listener. A2 involves communicating in a limited range of contexts. B1 focuses on communicating essential points in familiar contexts. B2 involves communicating with some fluency in a range of contexts. C1 emphasizes flexible and fluent communication in many contexts. C2 represents precise and sensitive communication in most contexts. The freeCodeCamp logo appears at the bottom.

Each level indicates the skills and knowledge you are expected to have at that stage of your language learning journey.

What prevents people from just cheating on the exams?

Our goal is to strike a balance between preventing cheating and respecting people’s right to privacy.

We’ve implemented a number of reliable, yet non-invasive, measures to help prevent people from cheating on freeCodeCamp’s exams:

  1. For each exam, we have a massive bank of questions and potential answers to those questions. Each time a person attempts an exam, they’ll see only a small, randomized sampling of these questions.

  2. We only allow people to attempt an exam one time per week. This reduces their ability to “brute force” the exam.

  3. We have security in place to validate exam submissions and prevent man-in-the-middle attacks or manipulation of the exam environment.

  4. We manually review each passing exam for evidence of cheating. Our exam environment produces tons of metrics for us to draw from.

We take cheating, and any form of academic dishonesty, seriously. We will act decisively.

This said, no one’s exam results will be thrown out without human review, and no one’s account will be banned without warning based on a single suspicious exam result.

Are these exams “open book” or “closed book”?

All of freeCodeCamp’s exams are “closed book”, meaning you must rely only on your mind and not outside resources.

Of course, in the real world you’ll be able to look things up. And in the real world, we encourage you to do so.

But that is not what these exams are evaluating. These exams are instead designed to test your memory of details and your comprehension of concepts.

So when taking these exams, do not use outside assistance in the form of books, notes, AI tools, or other people. Use of any of these will be considered academic dishonesty.

Do you record my webcam, microphone, or require me to upload a photo of my personal ID?

No. We considered adding these as additional test-taking security measures, but we have less privacy-invasive ways of detecting most forms of academic dishonesty.

If the environment is open source, doesn’t that make it less secure?

“Given enough eyeballs, all bugs are shallow.” – Linus’s Law, formulated by Eric S. Raymond in his book The Cathedral and the Bazaar

Open source software projects are often more secure than their closed source equivalents. This is because a lot more people are scrutinizing the code. And a lot more people can potentially help identify bugs and other deficiencies, then fix them.

We feel confident that open source is the way to go for this exam environment system.

How can I contribute to the Exam Environment codebase?

It’s fully open source, and we’d welcome your code contributions. Please read our general contributor onboarding documentation.

Then check out the GitHub repo.

You can help by creating issues to report bugs or request features.

You can also browse open help wanted issues and attempt to open pull requests addressing them.

Are the exam questions themselves open source?

For obvious exam security reasons, the exam question banks themselves are not publicly accessible. 🙂

These are built and maintained by freeCodeCamp’s staff instructional designers.

What happens if I have internet connectivity issues mid-exam?

If you have internet connectivity issues mid exam, the next time you try to submit an answer, you’ll be told there are connectivity issues. The system will keep prompting you to retry submitting until the connection succeeds.

What if my computer crashes mid-exam?

If your computer crashes mid exam, you’ll be able to re-open the Exam Environment. Then, if you still have time left for your exam attempt, you’ll be able to continue from where you left off.

Can I take exams in languages other than English?

Not yet. We’re working to add multi-lingual support in the future.

I have completed my exam. Why can’t I see my results yet?

All exam attempts are reviewed by freeCodeCamp staff before we release the results. We do this to ensure the integrity of the exam process and to prevent cheating. Once your attempt has been reviewed, you’ll be notified of your results the next time you log in to freeCodeCamp.org.

I use a keyboard instead of a mouse. Can I navigate the exams using just a keyboard?

This is a high priority for us. We hope to add keyboard navigation to the Exam Environment app soon.

Are exams timed?

Yes, exams are timed. We err on the side of giving plenty of time to take the exam, to account for people who are non-native English speakers, or who have ADHD and other learning differences that can make timed exams more challenging.

If you have a condition that usually qualifies you for extra time on standardized exams, please email [email protected]. We’ll review your request and see whether we can find a reasonable solution.

What happens if I fail the exam? Can I retake it?

Yes. You get one exam attempt per week. If you don’t pass, there is a one-week (exactly 168 hours) “cool-down” period during which you cannot take any freeCodeCamp exams. This is to encourage you to study and pace yourself.

There is no limit to the number of times you can take an exam. So if you fail, study more, practice your skills more, then try again the following week.

Will the exam be available to take on my phone?

At this time, no. You’ll need to use a laptop or desktop to download the exam environment and take the exam. We hope to eventually offer these certification exams on iPhone and Android.

I have a disability or health condition that is not covered here. How can I request accommodations?

If you need specific accommodations for the exam (for example extra time, breaks, or alternative formats), please email [email protected]. We’ll review your request and see whether we can find a reasonable solution.

Anything else?

Good luck working through freeCodeCamp’s languages coursework and preparing for the exam.

Happy learning!

Find the soul