Audit Every Network Request from Code Execution

AI agents that execute user-provided code face a critical security blind spot: untrusted code making network requests that bypass your monitoring. When your agent runs Python, JavaScript, or SQL on behalf of users, that code inherits the network privileges of your execution environment. Without explicit auditing, you've created a tunnel attackers can exploit for data exfiltration, credential theft, and lateral movement.

This article examines how adversaries exploit unrestricted network access, why traditional logging fails, and what defensive architectures actually work in production.

The Hidden Network in Your Sandbox

When user code runs in your infrastructure, it can reach any endpoint your execution environment can reach—including internal APIs, cloud metadata services, and third-party platforms. The problem isn't just that code can make requests; it's that these requests often occur through channels you aren't monitoring.

Consider a typical scenario: your agent runs user Python code in a container with internet access. The user submits code that downloads a "required" dependency. That package makes a request to http://169.254.169.254/latest/meta-data/iam/security-credentials/ to steal AWS credentials from the instance metadata service. Your logs show successful code execution. Your security team sees nothing.
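The metadata example shows why egress policy must be evaluated before a connection is ever opened. Below is a minimal, illustrative pre-connect check; the BLOCKED_NETWORKS list and is_blocked helper are assumptions for this sketch, not a complete policy:

```python
import ipaddress

# Illustrative pre-connect egress check: reject destinations in the
# metadata/link-local range and RFC 1918 space before dialing out.
BLOCKED_NETWORKS = [
    ipaddress.ip_network('169.254.0.0/16'),  # link-local, incl. 169.254.169.254
    ipaddress.ip_network('10.0.0.0/8'),      # RFC 1918
    ipaddress.ip_network('172.16.0.0/12'),
    ipaddress.ip_network('192.168.0.0/16'),
]

def is_blocked(host: str) -> bool:
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostname, not a literal IP: resolve it first, then re-check the
        # resolved address, or DNS rebinding will bypass the filter.
        return False
    return any(addr in net for net in BLOCKED_NETWORKS)
```

Note the hostname caveat: the check must run against the resolved address, otherwise an attacker-controlled DNS name that points at 169.254.169.254 slips straight through.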

This is trust boundary collapse—you've created an execution environment that sits at an ambiguous boundary between "user code" and "your infrastructure." Without network segmentation and request auditing, this boundary provides no protection.

Why Standard Logging Falls Short

Most observability stacks capture HTTP requests at the application layer—your agent's explicit API calls are logged and traced. But when user code runs inside your agent process, it can use its own HTTP clients, DNS resolution, and TLS handshakes that never touch your instrumentation.

DNS exfiltration demonstrates this gap. Attackers encode stolen data in subdomain queries like base64-data.attacker.com. These queries resolve through the container's DNS configuration, never appearing as HTTP requests in your application logs. Standard monitoring sees "DNS activity" at best, not the data theft occurring through sequential lookups.
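Detecting this pattern requires looking at the shape of queries, not just their destinations. A simple, illustrative heuristic flags long or high-entropy subdomain labels, which are characteristic of encoded data; the thresholds here are assumptions and would need tuning against real traffic:

```python
import math
from collections import Counter

def label_entropy(label: str) -> float:
    # Shannon entropy in bits per character; encoded data scores high.
    counts = Counter(label)
    total = len(label)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def looks_like_tunneling(qname: str, max_label: int = 40,
                         max_entropy: float = 3.8) -> bool:
    # Examine only the subdomain labels, not the registered domain itself.
    labels = qname.rstrip('.').split('.')[:-2]
    return any(len(l) > max_label or (len(l) > 12 and label_entropy(l) > max_entropy)
               for l in labels)
```

Sequential queries that trip this check, especially many of them to one domain, are the signature of data leaving through the resolver.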

Even when you see traffic, attribution fails. If your sandbox makes a request to an internal service, your logs show the container's identity—not which user, which session, or what code triggered it. During incident response, you cannot answer critical questions: when did this start, which users were affected, what data was accessed.

Implementing Request Interception

Effective auditing requires intercepting network calls before they leave the execution environment. This means operating below the application layer—through network namespaces, proxy injection, or kernel-level filtering.

import logging
from datetime import datetime, timezone

audit_log = logging.getLogger('sandbox.audit')

class AuditedExecutionEnvironment:
    # Egress allowlist enforced by the audit proxy.
    ALLOWED_HOSTS = {'api.github.com', 'pypi.org'}

    def __init__(self, session_id: str, user_id: str):
        self.session_id = session_id
        self.user_id = user_id
        # Local proxy that logs and filters every outbound request.
        self.proxy_port = self._start_audit_proxy()

    def _log_request(self, method: str, host: str, path: str):
        # Attribute every request to a specific user and session.
        audit_log.info({
            'event': 'sandbox_network_request',
            'session_id': self.session_id,
            'user_id': self.user_id,
            'method': method,
            'host': host,
            'path': path[:100],  # truncate to avoid logging sensitive payloads
            'timestamp': datetime.now(timezone.utc).isoformat(),
        })

    def execute(self, code: str) -> dict:
        # Route well-behaved HTTP clients through the audit proxy; the
        # network namespace catches clients that ignore proxy variables.
        restricted_env = {
            'HTTP_PROXY': f'http://localhost:{self.proxy_port}',
            'HTTPS_PROXY': f'http://localhost:{self.proxy_port}',
        }
        return self._run_in_namespace(code, restricted_env)

This provides three critical capabilities: request visibility before traffic leaves the sandbox, policy enforcement at the network layer, and audit logs tying every request to a specific user session.
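The proxy approach works even for HTTPS because clients disclose the destination host in the CONNECT request before the TLS handshake begins. A sketch of how an audit proxy like the one above might extract and check that host; handle_proxy_request is illustrative and omits logging, forwarding, and error handling:

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {'api.github.com', 'pypi.org'}

def handle_proxy_request(request_line: str) -> tuple[str, bool]:
    # HTTPS clients send 'CONNECT host:port HTTP/1.1' before the TLS
    # handshake; plain-HTTP clients send an absolute URL. Either way the
    # proxy learns the destination host before any payload leaves.
    method, target, _ = request_line.split(' ', 2)
    if method == 'CONNECT':
        host = target.rsplit(':', 1)[0]
    else:
        host = urlparse(target).hostname or ''
    return host, host in ALLOWED_HOSTS
```

The decision, and the attempt itself, would then be written to the audit log whether or not the request is forwarded.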

Production Architecture Patterns

Deploy multiple defensive layers for high-risk code execution:

Network Namespacing with Egress Filtering

Run user code in isolated network namespaces with explicit egress rules. Use iptables or eBPF to intercept all outbound connections. Block traffic to RFC 1918 addresses, link-local ranges, and cloud metadata endpoints by default.
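A default-deny policy of this shape can be rendered as iptables rules. The helper below is a hypothetical sketch that only builds the command strings; in production they would be applied inside the sandbox's network namespace (for example via ip netns exec), and the proxy-port rule assumes the audit proxy from earlier listens on loopback:

```python
BLOCKED_RANGES = [
    '10.0.0.0/8', '172.16.0.0/12', '192.168.0.0/16',  # RFC 1918
    '169.254.0.0/16',                                 # link-local / metadata
]

def egress_rules(proxy_port: int) -> list[str]:
    # Drop private and metadata ranges explicitly, then default-deny
    # everything except loopback traffic to the audit proxy.
    rules = [f'iptables -A OUTPUT -d {cidr} -j DROP' for cidr in BLOCKED_RANGES]
    rules.append(f'iptables -A OUTPUT -o lo -p tcp --dport {proxy_port} -j ACCEPT')
    rules.append('iptables -A OUTPUT -j DROP')
    return rules
```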

Transparent Proxy Injection

Inject proxy configuration via environment variables (HTTP_PROXY, ALL_PROXY) and LD_PRELOAD libraries that intercept connect() calls. This catches requests from any language runtime.

DNS Query Logging and Filtering

Deploy a local DNS resolver in the sandbox that logs all queries and applies domain-based policies. Block internal hostnames and known command-and-control domains. Log query patterns to detect DNS tunneling.
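The domain-based policy itself can be very small. An illustrative resolver check that denies internal zones and known-bad domains, including all of their subdomains; the zone names here are placeholders:

```python
# Illustrative blocklist: internal zones and known-bad domains.
BLOCKED_ZONES = {'internal', 'corp.example.com', 'attacker.com'}

def resolve_allowed(qname: str) -> bool:
    # Normalize, then match the zone itself or any subdomain of it.
    name = qname.rstrip('.').lower()
    return not any(name == z or name.endswith('.' + z) for z in BLOCKED_ZONES)
```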

Request Size and Rate Limits

Implement per-session caps on outbound data volume and connection frequency. Anomalous patterns—10,000 DNS queries or 500MB transfers—trigger automatic sandbox termination.

Attribution in Audit Logs

Every network event must include: session identifier, user identifier, timestamp, source code hash, and the code line responsible. Without this context, incident response cannot determine scope or notify affected users.
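As a sketch of what a fully attributed event might look like, the record below hashes the executing code so a logged request can be tied back to the exact submission; the field names are illustrative, not a fixed schema:

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class NetworkAuditEvent:
    session_id: str
    user_id: str
    timestamp: str
    host: str
    code_sha256: str  # hash of the exact code blob that was executing
    code_line: int    # best-effort line that issued the request

def attribute(code: str, **fields) -> NetworkAuditEvent:
    # Hashing the submission lets responders match events to code
    # without storing the code itself in every log line.
    digest = hashlib.sha256(code.encode()).hexdigest()
    return NetworkAuditEvent(code_sha256=digest, **fields)
```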

Building for Incident Response

Monitoring only matters if it enables action. Your audit pipeline should feed detection systems identifying anomalous patterns: unexpected ports, newly registered domains, unusual data volumes, or internal service access from sandboxed sessions.

When detection triggers, implement automated response: session termination, network isolation, and immediate security team notification. The goal isn't just logging—it's containing compromise before exfiltration completes.

Network request auditing from code execution environments is non-negotiable for production AI agents. The capability to run user code is too valuable to abandon, but the security model must treat every execution as potentially hostile. Implement transparent interception, enforce strict egress policies, and ensure every outbound request carries full attribution.
