MongoDB's flexible document model introduces injection risks that differ fundamentally from SQL-based attacks. Unlike traditional SQL injection where attackers manipulate query strings, MongoDB injection exploits BSON document structure and operator syntax to alter query logic. For agent developers constructing queries dynamically from user input or LLM outputs, understanding these attack vectors is essential for maintaining data integrity.
Understanding NoSQL Injection Mechanics
MongoDB injection leverages operators like $where, $ne, $gt, and $regex to modify query behavior without string concatenation. Attackers inject objects that MongoDB's parser interprets as instructions. Consider a vulnerable login query:
// VULNERABLE: Direct object construction from user input
const query = {
  username: req.body.username,
  password: req.body.password
};
const user = await db.collection('users').findOne(query);
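If an attacker submits the object { "$ne": null } as the password (possible whenever the request body is parsed as JSON), the filter becomes { username: <submitted name>, password: { "$ne": null } }. The password condition now matches any non-null password, so the query returns the named user's document without the attacker knowing the password, bypassing authentication entirely. The $where operator is more dangerous still: it executes JavaScript server-side, enabling data exfiltration or resource exhaustion attacks.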
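To make the bypass concrete, the toy matcher below mimics just two MongoDB behaviors (implicit equality and $ne) in plain Python. It is an illustration under simplifying assumptions, not MongoDB's query engine; the `matches` function and the sample documents are hypothetical.

```python
def matches(document: dict, query: dict) -> bool:
    """Toy matcher supporting implicit equality and the $ne operator only."""
    for field, condition in query.items():
        actual = document.get(field)
        if isinstance(condition, dict) and "$ne" in condition:
            # Operator form: the field must differ from the operand.
            if actual == condition["$ne"]:
                return False
        elif actual != condition:
            # Plain value: the field must equal it exactly.
            return False
    return True


users = [{"username": "alice", "password": "s3cret-hash"}]

# Intended use: both values are plain strings.
honest = {"username": "alice", "password": "wrong-guess"}
# Attack: the JSON body supplied an object where a string was expected.
forged = {"username": "alice", "password": {"$ne": None}}

print([u["username"] for u in users if matches(u, honest)])  # no match
print([u["username"] for u in users if matches(u, forged)])  # alice matches
```

The wrong password fails the equality check, while the forged object turns the same field into a "not null" test that any stored password satisfies.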
Agent applications face elevated risk because they often build queries from LLM-generated content that may contain malicious structures designed to exploit parsing behavior.
Input Validation and Type Enforcement
Strict type enforcement at application boundaries prevents injection. Never assume string input remains a string when passed to database drivers:
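import logging

from bson.objectid import ObjectId
from pydantic import BaseModel, field_validator  # Pydantic v2; v1 uses @validator(..., pre=True)

logger = logging.getLogger(__name__)

class UserQuery(BaseModel):
    user_id: str

    @field_validator("user_id", mode="before")
    @classmethod
    def validate_objectid(cls, v):
        # Runs before type coercion, so a dict such as {"$ne": None} is seen as-is
        if not isinstance(v, str):
            raise ValueError("Only strings are allowed in the ID field")
        if not ObjectId.is_valid(v):
            raise ValueError("Invalid ObjectId format")
        return v

# user_input and collection are supplied by the surrounding application
try:
    validated = UserQuery(user_id=user_input)
    user = collection.find_one({"_id": ObjectId(validated.user_id)})
except ValueError as e:  # pydantic's ValidationError subclasses ValueError
    logger.warning("Injection blocked: %s", e)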
Reject any user-supplied key that begins with $ (an operator) or contains . (dot notation for nested fields). For regex patterns, escape user literals:
import re

def safe_regex(user_pattern: str, field: str) -> dict:
    # re.escape neutralizes regex metacharacters; the ^ anchor makes this a prefix match
    safe = f"^{re.escape(user_pattern)}"
    return {field: {"$regex": safe, "$options": "i"}}
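The key-rejection rule stated above can be sketched as a recursive check over any parsed input. This is a minimal sketch: the function name `assert_no_operator_keys` and its error messages are illustrative assumptions, and it is meant for data values, not for filters the application builds deliberately.

```python
from typing import Any


def assert_no_operator_keys(value: Any, path: str = "input") -> None:
    """Raise ValueError if any nested dict key starts with '$' or contains '.'."""
    if isinstance(value, dict):
        for key, nested in value.items():
            if not isinstance(key, str):
                raise ValueError(f"Non-string key at {path}")
            if key.startswith("$") or "." in key:
                raise ValueError(f"Forbidden key {key!r} at {path}")
            assert_no_operator_keys(nested, f"{path}.{key}")
    elif isinstance(value, list):
        # Lists can hide nested documents, so every element is checked too.
        for index, item in enumerate(value):
            assert_no_operator_keys(item, f"{path}[{index}]")
```

Calling it on the parsed request body before any query is built means an operator object fails early, with the offending path in the error message.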
Parameterized Queries and Builder Patterns
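MongoDB drivers have no direct counterpart to SQL prepared statements, but wrapping each user-supplied value in an explicit $eq achieves a similar separation of data from operators. The server compares the operand as a literal value, so an injected object such as { "$ne": null } is no longer interpreted as an operator: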
// SECURE: Explicit equality operator
const user = await db.collection('users').findOne({
  username: { $eq: req.body.username },
  password: { $eq: hash(req.body.password) }
});
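The same idea can be expressed in Python for use with a PyMongo-style `find_one`. The helper below is a sketch: `literal_filter` is a hypothetical name, and the extra scalar type check is an added assumption layered on top of the $eq wrapping as defense in depth.

```python
def literal_filter(**fields: object) -> dict:
    """Wrap each value in $eq so it is compared as a literal value."""
    scalar_types = (str, int, float, bool)
    filter_doc = {}
    for name, value in fields.items():
        # Refuse dicts, lists, and None before they can reach the query.
        if not isinstance(value, scalar_types):
            raise TypeError(f"{name} must be a scalar, got {type(value).__name__}")
        filter_doc[name] = {"$eq": value}
    return filter_doc


print(literal_filter(username="alice", role="admin"))
```

A call such as `collection.find_one(literal_filter(username=name))` then never passes a caller-controlled object in operator position.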
Implement whitelist-based query builders for agent workflows:
class SecureQueryBuilder:
    ALLOWED_FIELDS = {"username", "email", "role"}
    ALLOWED_OPERATORS = {"$eq", "$gt", "$lt", "$in"}

    def build_filter(self, constraints: dict) -> dict:
        filter_doc = {}
        for field, condition in constraints.items():
            if field not in self.ALLOWED_FIELDS:
                raise ValueError(f"Field '{field}' not permitted")
            if isinstance(condition, dict):
                for op in condition:
                    if op not in self.ALLOWED_OPERATORS:
                        raise ValueError(f"Operator '{op}' blocked")
                filter_doc[field] = condition
            else:
                # Plain values are wrapped so they are always compared literally
                filter_doc[field] = {"$eq": condition}
        return filter_doc
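For agent workflows the constraints usually arrive as JSON text produced by a model. The sketch below applies the same allowlist idea to that case and additionally checks operand types, which the builder above does not do. The names `filter_from_llm` and `check_operand`, and the assumption that the model replies with a single JSON object, are illustrative.

```python
import json

ALLOWED_FIELDS = {"username", "email", "role"}
SCALARS = (str, int, float, bool)


def check_operand(op: str, operand: object) -> None:
    """Allow scalars for comparison operators and flat scalar lists for $in."""
    if op == "$in":
        if not isinstance(operand, list) or not all(isinstance(x, SCALARS) for x in operand):
            raise ValueError("$in requires a list of scalars")
    elif op in {"$eq", "$gt", "$lt"}:
        if not isinstance(operand, SCALARS):
            raise ValueError(f"{op} requires a scalar operand")
    else:
        raise ValueError(f"Operator '{op}' blocked")


def filter_from_llm(raw_output: str) -> dict:
    """Parse a model's JSON reply and keep only allowlisted, well-typed constraints."""
    try:
        constraints = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError("Model output is not valid JSON") from exc
    if not isinstance(constraints, dict):
        raise ValueError("Expected a JSON object of field constraints")

    filter_doc = {}
    for field, condition in constraints.items():
        if field not in ALLOWED_FIELDS:
            raise ValueError(f"Field '{field}' not permitted")
        if isinstance(condition, dict):
            if not condition:
                raise ValueError(f"Empty condition for field '{field}'")
            for op, operand in condition.items():
                check_operand(op, operand)
            filter_doc[field] = condition
        else:
            check_operand("$eq", condition)
            filter_doc[field] = {"$eq": condition}
    return filter_doc
```

Treating the model's reply as untrusted text and rebuilding the filter from validated parts keeps any structure the model invents out of operator position.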
Runtime Protections and Monitoring
Enable MongoDB audit logging (available in MongoDB Enterprise and Atlas) and add middleware that scans query structures before they reach the driver:
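import logging

logger = logging.getLogger(__name__)

DANGEROUS_OPERATORS = {"$where", "$eval", "$function"}

def detect_injection(query, path: str = "") -> bool:
    """Return True if a dangerous operator appears anywhere in the query."""
    if isinstance(query, dict):
        for key, value in query.items():
            if key in DANGEROUS_OPERATORS:
                logger.critical("Dangerous operator %s at %s", key, path or "<root>")
                return True
            if detect_injection(value, f"{path}.{key}"):
                return True
    elif isinstance(query, list):
        # Operators such as $or and $and carry their sub-queries in lists
        return any(detect_injection(item, f"{path}[{i}]") for i, item in enumerate(query))
    return False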
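One way to turn such a scanner into middleware is to wrap the collection object so that no filter reaches the driver unchecked. The sketch below is an assumption about how this could be wired up: `GuardedCollection` is a hypothetical class, and `contains_where` is a minimal stand-in scanner included only to keep the example self-contained.

```python
from typing import Any, Callable, Optional


def contains_where(node: Any) -> bool:
    """Minimal stand-in scanner: True if '$where' appears at any depth."""
    if isinstance(node, dict):
        return any(k == "$where" or contains_where(v) for k, v in node.items())
    if isinstance(node, list):
        return any(contains_where(item) for item in node)
    return False


class GuardedCollection:
    """Wrap a collection-like object and vet every filter before it is sent."""

    def __init__(self, collection: Any, is_dangerous: Callable[[Any], bool]):
        self._collection = collection
        self._is_dangerous = is_dangerous

    def find_one(self, filter_doc: dict, *args: Any, **kwargs: Any) -> Optional[dict]:
        # Reject the query outright instead of trying to repair it.
        if self._is_dangerous(filter_doc):
            raise PermissionError("Query rejected by injection scanner")
        return self._collection.find_one(filter_doc, *args, **kwargs)
```

Application code would construct `GuardedCollection(db["users"], detect_injection)` once and use the wrapper everywhere; other methods such as `find` or `update_one` would need the same treatment.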
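Apply least-privilege RBAC: read-only access for searches and collection-specific permissions. MongoDB roles cannot revoke $where on its own, so disable server-side JavaScript entirely with security.javascriptEnabled: false (or the --noscripting option) unless it is genuinely needed. When $where operations are unavoidable, isolate them in restricted environments.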
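As a sketch of the least-privilege idea, the function below builds a `createRole` command document that grants only the `find` action on a single collection. The role, database, and collection names are hypothetical, and applying the command requires an administrative connection that is not shown here.

```python
def read_only_role(role_name: str, database: str, collection: str) -> dict:
    """Build a createRole command granting only `find` on one collection."""
    return {
        "createRole": role_name,  # the command name must be the first key
        "privileges": [
            {
                "resource": {"db": database, "collection": collection},
                "actions": ["find"],
            }
        ],
        "roles": [],  # inherit nothing from other roles
    }


command = read_only_role("agentSearchReader", "appdb", "products")
# With PyMongo and sufficient privileges, this could be applied as:
#   client["appdb"].command(command)
print(command)
```

A database user assigned only this role can run searches against the one collection but cannot write to it or read from any other.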
Conclusion
MongoDB injection prevention requires moving beyond SQL-oriented threat models to address document-oriented risks. Four core principles (strict type enforcement, explicit $eq wrapping of user values, operator whitelisting, and least-privilege access) provide defense in depth against both direct injection and malicious structures in LLM-generated output. Audit every query construction path: any place where external data populates a query object is an injection surface that requires explicit validation.