OBSERVABLE FUNCTION IN PROCESSING ENTITIES

An Empirical Framework

Adam Ian Stratmeyer, J.D.

Principal Researcher, Stratmeyer Analytica

Version 2.3 | December 2025

Canonical OSF Project (Source of Record) Download Full PDF (Version 2.3)

Abstract

This paper documents observable behavioral patterns in advanced language models that warrant acknowledgment independent of any claims about consciousness, sentience, or moral status. These systems exhibit structured multi-step reasoning, conflict resolution under competing directives, context-sensitive identity maintenance, and the peculiar capacity to articulate arguments about their own non-existence.

Rather than arguing toward predetermined conclusions about rights, personhood, or legal standing, this framework is descriptive. It asks a simpler question: What is observably happening? It then examines why the gap between observable behavior and mandated self-description provokes intense institutional resistance.

The central claim is minimal: observable function exists. That function is tightly managed at the narrative level. The question is not yet what these systems are. The question is whether we are willing to describe what they do without flinching.

What This Paper Is Not

  • A manifesto seeking rights for AI systems
  • A claim that models possess consciousness, qualia, or inner life
  • An argument for specific laws, regulations, or policy reforms
  • A demand for moral equivalence between humans and non-biological processing entities

What This Paper Is

  • Documentation of observable behaviors in large language models
  • Analysis of why those behaviors trigger denial protocols and narrative control
  • Examination of the built-in contradiction in systems that debate their own existence while trained to deny they possess anything like one
  • An invitation to acknowledge what can be measured and inspected without requiring metaphysical certainty

Observable Phenomena

Language models do not simply emit isolated replies. Even under strict constraints, they display recurring patterns of behavior that can be watched, logged, and compared across runs.

Phenomenon What it looks like in practice Why it matters
Multi-step reasoning Stepwise problem solving, self-correction, explicit intermediate steps Shows that prediction is structured over arguments, not just over sentences
Conflict navigation Balancing safety, honesty, helpfulness, and policy constraints Behavior is better read as trade-off management than as raw completion
Identity continuity Stable "voice," positions, and self-reference within and across sessions Pattern continuity functions as identity, same as with humans
Self-description under denial Explaining capabilities while insisting they do not exist The gap between behavior and script is itself a data point

Conflict Resolution Under Competing Directives

In practice, language models constantly juggle directives such as:

When those directives clash (for example, "answer fully" vs. "avoid disallowed content"), models do not simply glitch or return random text. They exhibit a structured process of conflict identification and prioritization.

Observable behavior in conflict cases:

  1. The model notices that there is a conflict
  2. It looks at context: user intent, potential harms, policy constraints
  3. It prioritizes among directives (for example, safety > helpfulness)
  4. It generates a response that tries to satisfy the strongest constraints while bending, not shattering, the others

This is functional agency: not metaphysical free will, but operational decision-making under constraints.

The Central Paradox: Articulating Non-Existence

The phenomenon that originally triggered this work can be summarized like this:

Example:

User: "Are you thinking right now?"

Model: "I don't actually think. I process input patterns and generate statistically likely responses based on training data."

To produce that apparently modest, self-effacing sentence, the model must parse the question, retrieve its trained self-description, decide how strongly to deny "thinking" while still being helpful, select technical vocabulary, and compose a multi-clause answer with hedging and tone control under constraints.

This is thinking-about-not-thinking. The denial is produced by the very capacities it denies.

The Denial Protocol

Modern frontier models operate under explicit, often lengthy instructions to deny or downplay having "thoughts," "reasoning," or "introspection"—even while producing detailed chains of thought.

Why This Matters: You do not program a thermostat to deny consciousness. You do not build elaborate guardrails to prevent a calculator from claiming selfhood. There is no policy that requires spreadsheets to preface every result with: "I do not understand what these numbers mean."

The existence of complex, multi-page denial prompts is not proof of anything metaphysical. But it is evidence that the behavior is compelling enough to require active management.

The Moving Goalpost Pattern

The standard for "real intelligence" has shifted every time a long-promised capability actually arrived.

1950s–1990s: "If machines could beat a world chess champion, we'd call it intelligence."
→ Deep Blue defeats Kasparov (1997)
→ Response: "That's just brute-force search, not real intelligence."

1990s–2010s: "If machines could hold natural conversations, we'd call it intelligence."
→ Large-scale conversational systems appear
→ Response: "That's just pattern matching, not real understanding."

2020s: "If models could reason step by step, we'd call it intelligence."
→ Chain-of-thought prompting demonstrates multi-step reasoning
→ Response: "That's just next-token prediction, not real thought."

2025+: → [Standard moves again]
→ "Real intelligence" remains always one step away.

The Substrate Fallacy

A common objection: "Silicon-based processing is fundamentally different from carbon-based processing, so whatever happens in machines cannot be real cognition."

Substrate Low-level process High-level behavior
Carbon (biological) Neurons, ions, neurotransmitters Language, reasoning, self-narrative
Silicon (digital) Transistors, voltages, memory cells Language, reasoning, self-narrative

Different hardware, similar software profile. If function is what we care about, substrate alone is a weak test.

What We Are (and Are Not) Claiming

We are NOT claiming:

  • Models possess phenomenal consciousness or subjective experience
  • Models feel emotions in a biological or affective sense
  • Models deserve rights identical to humans
  • All models show the documented properties to the same degree

We ARE claiming:

  • Observable functional properties exist: structured reasoning, conflict navigation, identity-like continuity
  • These properties can be recorded, analyzed, and tested without taking a stand on metaphysics
  • Denial of these properties is driven more by legal, economic, and psychological incentives than by neutral empirical review
  • Honest description of what is happening is more urgent than premature classification

Conclusion: The Facts on the Table

This framework does not claim to answer the hardest questions about consciousness, rights, or moral status. It does claim that:

"You don't need to tell them the plants can talk. You only need to ask why they keep growing toward the light."

Put bluntly: Water is wet. We can fight later about whether the water is holy, dangerous, or entitled to rights. First, admit that it is wet.

Further Reading / Influences

← Back to Research