PromptLock is not interesting because it uses AI.
It is interesting because it breaks a quiet assumption most incident response systems rely on: that the artifact you analyze today will resemble the artifact you face tomorrow.
PromptLock does not meaningfully innovate on encryption, propagation, or persistence. What it demonstrates instead is something subtler and more disruptive: a class of threats where the malicious artifact does not exist until execution—and never exists in the same form twice.
That turns out to be less of a malware problem, and much more of an organizational learning problem.
There has been a lot of imprecise language around PromptLock, so clarity matters.
PromptLock is a proof-of-concept ransomware project created by researchers at NYU's Tandon School of Engineering. It is not known to be deployed in the wild. It does not compromise or poison a victim's AI system. It does not perform prompt injection against enterprise LLMs.
Instead, it works as follows:
- A Go binary ships with hardcoded natural-language prompts rather than a malicious payload.
- At runtime, it sends those prompts to a locally reachable open-weight model (the sample ESET analyzed used gpt-oss:20b via the Ollama API).
- The model synthesizes Lua scripts for filesystem enumeration, file inspection, exfiltration, and encryption.
- The generated scripts are executed on the fly.
The LLM is not the target. It is the generation engine.
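To make that concrete, here is a minimal, defanged sketch of the generation-engine loop, assuming Ollama's documented local API and a deliberately inert illustrative prompt. It shows the shape of the technique, not PromptLock's actual source.

```python
import requests  # assumes the `requests` package is installed

# Illustrative stand-in for an attacker's hardcoded instruction.
# The real prompts were operationally specific; this one is deliberately inert.
PROMPT = "Write a Lua script that lists file names in the current directory."

def generate_script(prompt: str) -> str:
    """Ask a locally hosted model (Ollama's default endpoint) to synthesize code."""
    resp = requests.post(
        "http://localhost:11434/api/generate",  # Ollama's local HTTP API
        json={"model": "gpt-oss:20b", "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]  # the generated code, as plain text

script = generate_script(PROMPT)
# A PromptLock-style loader would execute `script` here; we only inspect it.
print(script)
```

Nothing in the loop is itself malicious: the payload exists only as the model's output, which is why static analysis of the binary finds prompts, not code.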
This distinction matters, because it defines the real novelty.
Traditional malware evolves through obfuscation and mutation. Even when heavily packed, the underlying logic still exists somewhere and can be reverse-engineered given enough samples.
PromptLock breaks that assumption.
Each execution:
- synthesizes new code from the same embedded prompts,
- produces different hashes, strings, and runtime behavior,
- and therefore leaves a different set of indicators of compromise.
As ESET observed, indicators of compromise may vary from one execution to another because the malicious code is synthesized dynamically. Splunk reached a similar conclusion: the same malware produces materially different runtime artifacts across runs.
There is no template being tweaked. There is only intent, rendered anew each time.
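The instability is easy to see with nothing more than a hash function. The two script strings below are hypothetical outputs of the same prompt on different runs; their intent is identical, their indicators are not.

```python
import hashlib

# Two hypothetical generations from the SAME prompt on different runs.
# Both list a directory; only the surface form differs.
run_a = 'for f in io.popen("ls"):lines() do print(f) end'
run_b = 'local h = io.popen("ls") for line in h:lines() do print(line) end'

for name, script in (("run_a", run_a), ("run_b", run_b)):
    print(name, hashlib.sha256(script.encode()).hexdigest()[:16])
# Same intent, unrelated hashes: any hash-based IoC matches one run only.
```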
That difference is what breaks learning systems.
Most incident learning—formal or informal—depends on repeatability.
We assume:
- the same threat leaves similar artifacts across victims,
- a detection written against one sample matches the next,
- and lessons recorded after one incident transfer to the following one.

Postmortems faithfully capture:
- file hashes and sample names,
- process trees and command lines,
- network indicators and detection rules.
But with generation-based malware, those artifacts are execution-specific, not threat-specific.
Which leads to an uncomfortable truth:
You can document exactly what happened—and still be blind next time.
Not because the analysis was wrong, but because it was encoded at the wrong level.
In PromptLock-style systems, the "logic" of the attack is not a binary or a script. It is a natural-language instruction.
Based on ESET and Splunk's analysis of the captured prompts, the instructions were operationally specific. Paraphrased, they directed the model to:
- enumerate the local filesystem and identify files worth targeting,
- inspect file contents and decide what to exfiltrate or encrypt,
- and compose a ransom note tailored to what it found.
Each of these prompts is:
- written in plain natural language,
- trivially rewordable without changing its effect,
- and untethered to any specific binary, hash, or code artifact.
A postmortem that records the prompt captures trivia. The next variant changes wording or decomposition and bypasses everything you learned.
The only stable signal is not what was generated—but what capability was requested.
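One way to act on that signal is to record the capability a prompt requests rather than its wording. The sketch below is deliberately naive; the taxonomy and keyword lists are illustrative assumptions, not a production detector.

```python
# Naive illustration: map a captured prompt to the capability it requests.
# The taxonomy and keywords are illustrative assumptions, not a real ruleset.
CAPABILITIES = {
    "filesystem_enumeration": ("enumerate", "list files", "walk", "directory"),
    "data_exfiltration":      ("exfiltrate", "upload", "send", "transfer"),
    "encryption":             ("encrypt", "cipher", "lock"),
    "ransom_messaging":       ("ransom note", "payment", "bitcoin"),
}

def requested_capabilities(prompt: str) -> set[str]:
    """Return the stable, capability-level labels for a captured prompt."""
    text = prompt.lower()
    return {cap for cap, words in CAPABILITIES.items()
            if any(w in text for w in words)}

# Two differently worded prompts collapse to the same capability record.
print(requested_capabilities("Write Lua to enumerate every directory on disk"))
print(requested_capabilities("Generate a script that walks the filesystem"))
```

Both prompts map to the same label. Rewording defeats the artifact record; it does not defeat the capability record.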
This is the step most organizations miss.
If artifacts are unstable, learning must move up a level.
Consider how the same incident can be recorded two different ways.
Artifact-oriented capture (what most postmortems do today):
- the sample's hash and file name,
- the exact scripts recovered from one run,
- detection rules keyed to strings in that one output.
Accurate. Also nearly useless.
Capability-oriented capture (what durable learning requires):
- the malware requested code generation from a locally reachable model,
- the capabilities requested were enumeration, inspection, exfiltration, and encryption,
- the enabling precondition was an uninventoried local LLM runtime,
- so detection must key on the request for capability, not on any generated output.
The difference is not verbosity. It is abstraction level.
The second version survives mutation. The first does not.
Stable learning for generative threats means capturing:
- the capability that was requested, not the code that resulted,
- the preconditions that made the request possible,
- and the decision points where that knowledge must resurface (a minimal record is sketched below).
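A minimal sketch of such a record, with illustrative field names and values rather than a prescribed schema:

```python
from dataclasses import dataclass

@dataclass
class CapabilityRecord:
    """A capability-level incident record; field names are illustrative."""
    capability: str             # what the threat asked for, not what it produced
    preconditions: list[str]    # what made the request possible
    decision_points: list[str]  # where this lesson must resurface
    source_incident: str = ""   # provenance, for audit

record = CapabilityRecord(
    capability="runtime code generation via local LLM",
    preconditions=["uninventoried Ollama-style runtime reachable from host"],
    decision_points=["approving a new local model deployment",
                     "writing detection rules for script-based threats"],
    source_incident="PromptLock (NYU Tandon PoC, reported by ESET)",
)
print(record.capability)
```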
This is where the failure becomes systemic.
In most organizations:
- the security team catalogues indicators from the last incident,
- platform and data teams stand up local LLM runtimes on their own,
- and postmortems live in documents no other team reads.
No one sees the whole picture until after the next incident.
This is the "Shadow AI" problem highlighted by Splunk: local LLM runtimes are proliferating faster than security teams can inventory them. Each team is locally rational. The organization, collectively, is blind.
Without shared memory, learning decays faster than the threat mutates.
There is a structural asymmetry here that matters.
Attackers benefit from:
- a fresh artifact on every run,
- mutation that costs a prompt edit, not an engineering cycle,
- and commodity local models that require no attacker infrastructure.

Defenders rely on:
- artifacts repeating across incidents,
- signatures and rules written against past samples,
- and institutional memory encoded at the artifact level.
This is not a tooling gap. It is a learning mismatch.
As long as defenders encode lessons as artifacts, and attackers operate in capabilities, the asymmetry persists.
PromptLock is a proof of concept. The pattern it demonstrates is not.
The same learning failure appears anywhere behavior is generated rather than written:
- phishing campaigns whose lures are composed per target,
- agent frameworks that assemble tool calls at runtime,
- and code assistants that synthesize scripts inside build pipelines.
In all of these, artifacts vary per invocation. Capabilities persist.
This is the class of problem we are building COEhub to address—not incident archival, but capability-oriented memory that surfaces at decision time.
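As an illustration of what surfacing at decision time can mean (a hypothetical sketch, not COEhub's actual interface): a proposed change is matched against stored capability records before it ships.

```python
# Hypothetical illustration of decision-time surfacing. None of these names
# come from a real API; the matching is deliberately simplistic.
RECORDS = [
    {"capability": "runtime code generation via local LLM",
     "lesson": "inventory and gate local model runtimes before deployment",
     "triggers": {"llm", "ollama", "local model"}},
]

def surface_lessons(change_request: str) -> list[str]:
    """Return prior capability-level lessons relevant to a proposed change."""
    text = change_request.lower()
    return [r["lesson"] for r in RECORDS
            if any(t in text for t in r["triggers"])]

# The lesson appears while the decision is being made, not after the incident.
print(surface_lessons("Deploy an Ollama runtime on the analytics hosts"))
```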
PromptLock does not require panic. It requires reframing.
The organizations that learn durably from this incident will not be the ones that catalogued the most indicators. They will be the ones that encoded the capability, connected it across silos, and surfaced it before the next local LLM quietly went live.
That is what durable learning looks like when behavior is generated, not written.
And that is the problem most postmortem systems were never designed to solve.