Security First: What to Check When Using AI-Generated Code

AI-generated code isn't inherently less secure than human-written code. But it does introduce a specific failure mode: engineers sometimes apply less scrutiny to code they didn't write themselves, especially when it looks clean and compiles without errors.

Looking clean isn't the same as being secure. Generated code can contain vulnerabilities for the same reason human-written code can: the model learned its patterns from training data that includes insecure implementations, and without explicit constraints in the prompt, those patterns can show up in the output.

The solution isn't to distrust generated code categorically. It's to review it with the same rigor you'd apply to any code before it ships — and to know specifically what to look for.

Input validation and sanitization

Generated code that handles user input should be examined carefully. Does every input path have validation before the data is processed? Is input sanitized before it reaches a database query, a file system operation, or an external service call? Are there assumptions about input format that aren't enforced by the code?

SQL injection via string concatenation, path traversal via unsanitized file paths, XSS via unescaped output — these are common, well-understood vulnerabilities that still appear in generated code when the prompt doesn't explicitly specify input handling requirements. Make the requirements explicit in the prompt and review the output to verify they were implemented.
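
As a minimal sketch of what the safer versions of these three patterns can look like in review, assuming Python and its standard-library sqlite3 module: the `users` table, the function names, and the idea of an upload directory are all hypothetical, chosen only for illustration.

```python
import html
import sqlite3
from pathlib import Path


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user with a parameterized query (hypothetical `users` table)."""
    # The ? placeholder makes the driver treat username as data, never as SQL.
    # The pattern to reject in review: f"... WHERE name = '{username}'"
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cur.fetchone()


def resolve_upload(base_dir: Path, user_path: str) -> Path:
    """Resolve a user-supplied path, refusing anything outside base_dir."""
    base = base_dir.resolve()
    candidate = (base / user_path).resolve()
    # After ".." segments are resolved, the result must still be inside base.
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate


def render_comment(comment: str) -> str:
    """Escape user text before placing it in HTML element content."""
    return f"<p>{html.escape(comment)}</p>"
```

When reviewing generated code, the question is whether each input path has an equivalent of one of these guards, not whether it uses these exact helpers.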

Authentication and authorization

When generated code includes any access control logic, verify it independently. Is the authentication check in the right place — before the guarded code, not after? Is authorization granular enough — does it check what the authenticated user is allowed to do, not just that they're authenticated? Are there missing checks on edge cases: API routes that were added after the initial auth layer, admin functions accessible without admin verification?
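
The ordering and granularity questions above can be sketched as a small decorator. This is an illustration under assumptions, not a recommended framework: the `User` class, the `AuthError` exception, and the permission string are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from functools import wraps


class AuthError(Exception):
    """Raised when a request is unauthenticated or lacks a permission."""


@dataclass
class User:
    name: str
    permissions: set = field(default_factory=set)


def require_permission(permission: str):
    """Decorator: run the wrapped handler only if the user holds `permission`."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(user, *args, **kwargs):
            # Authentication: is there a user at all?
            if user is None:
                raise AuthError("authentication required")
            # Authorization: may this user do this specific thing?
            if permission not in user.permissions:
                raise AuthError(f"missing permission: {permission}")
            # Only now does the guarded code run.
            return handler(user, *args, **kwargs)
        return wrapper
    return decorator


@require_permission("admin:delete_user")
def delete_user(user: User, target: str) -> str:
    return f"{user.name} deleted {target}"
```

Both checks run before the handler body, and the second one is about a specific permission rather than mere login status. A route added later without the decorator is exactly the kind of gap the review should look for.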

Auth bugs are consistently among the highest-impact security vulnerabilities. They're also the category where code that looks correct most often hides logic that isn't.

Secrets and credentials

Generated code occasionally includes hardcoded credentials (API keys, passwords, tokens), especially when the prompt contains examples or asks the model for working sample code. Check every generated file for hardcoded secrets before committing.

This isn't unique to AI-generated code, but the speed of generation makes it easier to skip the check. Run a secrets scanner on generated code as part of your standard review process. It takes seconds and can catch a leak that would otherwise become a significant incident.
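
To make the idea concrete, here is a toy sketch of what a secrets scanner does: match each line against known credential shapes. The three patterns and the `scan_text` name are assumptions for illustration; a real scanner ships a far larger rule set and should be preferred in practice.

```python
import re

# Illustrative patterns only; real scanners cover many more credential formats.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hardcoded_assignment": re.compile(
        r"""(?:api_?key|secret|password|token)\w*\s*[:=]\s*["'][^"']{8,}["']""",
        re.IGNORECASE,
    ),
}


def scan_text(text: str):
    """Return (line_number, rule_name) for each line matching a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings
```

A line such as `password = os.environ["DB_PASSWORD"]` is not flagged by this sketch, because the value is read at runtime rather than written as a string literal.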

Dependency choices

When generated code introduces new dependencies, check them. Is the package actively maintained? Does it have known vulnerabilities? Is it the canonical choice for what it's doing, or is it a less common alternative that may have less security scrutiny?

Generated code tends to use well-known packages, which is generally fine. The risk is packages that were well-known when the model's training data was collected and have since had vulnerabilities disclosed. Run dependency scans on packages introduced via generated code the same way you'd run them on any new dependency.
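
A first step is knowing which packages a generated change actually introduced. The sketch below assumes dependencies are listed in a requirements.txt-style file and compares the file before and after the change; `package_names` and `new_dependencies` are hypothetical helper names, and the parsing deliberately ignores edge cases such as URL requirements.

```python
import re


def package_names(requirements_text: str) -> set:
    """Extract normalized package names from requirements.txt-style text."""
    names = set()
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name is everything before the first version/extras/marker character.
        name = re.split(r"[\s\[<>=!~;@]", line, maxsplit=1)[0]
        # Normalize as PEP 503 does: lowercase, runs of - _ . become one hyphen.
        names.add(re.sub(r"[-_.]+", "-", name).lower())
    return names


def new_dependencies(before: str, after: str) -> set:
    """Packages present after a change that were not present before it."""
    return package_names(after) - package_names(before)
```

The resulting set is the list to hand to whatever dependency scanner the team already runs, and the list to check by hand for maintenance status.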

Cryptography

Cryptographic code deserves especially careful review. Generated code that does anything involving encryption, hashing, key generation, or signature verification should be checked against current best practices. Deprecated algorithms, weak key sizes, incorrect IV handling, and missing integrity checks are all patterns that appear in generated crypto code, especially when the prompt doesn't specify strong requirements.

The rule here is simple: if the code does crypto, have someone who knows crypto verify it. Generated output can be a starting point, but crypto is not a domain where "it compiles" is sufficient validation.
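
One concrete instance of the deprecated-algorithm pattern is password storage with a bare, unsalted MD5 or SHA-1 digest. The sketch below shows the shape a reviewer would expect instead, using only Python's standard library. The iteration count is an assumption for illustration and should be checked against current guidance, and the example is no substitute for review by someone who knows crypto.

```python
import hashlib
import hmac
import secrets

# Assumption: this iteration count is illustrative. Check current guidance
# for PBKDF2-HMAC-SHA256 before relying on it.
ITERATIONS = 600_000


def hash_password(password: str) -> tuple:
    """Return (salt, derived_key) using a fresh random salt per password."""
    salt = secrets.token_bytes(16)  # from the OS CSPRNG, not the random module
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key and compare in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # hmac.compare_digest avoids the timing leak of a plain == comparison.
    return hmac.compare_digest(key, expected_key)
```

The things to look for in generated equivalents are the same three choices made here: a deliberately slow key-derivation function rather than a fast hash, a random per-password salt, and a constant-time comparison.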

The review stance to maintain

The most important security practice with AI-generated code is holding it to the same review standards as any other code. Generated code doesn't earn more trust because a tool produced it, and it doesn't earn less. It gets reviewed, it gets tested, and it meets the same bar before it ships.

Teams that keep that discipline don't encounter more security issues from AI-generated code than from any other source. Teams that let familiarity lower the bar do. The difference is practice, not tool choice.
