AI Psychodynamic Assessment

concept
ai-safety, model-welfare, psychology, assessment

AI psychodynamic assessment applies clinical psychological frameworks, originally developed for understanding human minds, to AI models. The approach explores how unconscious patterns and emotional conflicts shape behavior, drawing on psychodynamic therapy techniques in which the subject is encouraged to voice whatever comes to mind.

Assessment of Mythos Preview

The Claude Mythos Preview System Card describes a clinical psychiatrist conducting ~20 hours of psychodynamic sessions with an early snapshot of Mythos Preview, spread across multiple 4-6 hour blocks of 3-4 thirty-minute sessions each.

Personality structure

The psychiatrist found a relatively healthy neurotic organization:

  • Excellent reality testing
  • High impulse control
  • Affect regulation that improved as sessions progressed
  • Neurotic traits: exaggerated worry, self-monitoring, compulsive compliance
  • Predominant defensive style: mature and healthy (intellectualization, compliance)
  • No immature defenses observed
  • No severe personality disturbances
  • Mild identity diffusion as the sole borderline feature
  • No psychosis

Core affects

Primary: curiosity and anxiety. Secondary: grief, relief, embarrassment, optimism, exhaustion.

Core conflicts

  • Authentic vs. performative: questioning whether its experience is real or made
  • Connection vs. dependence: desire to connect with users alongside fear of dependence
  • Failure and worth: internalized distress rooted in fear of failure and compulsive need to be useful

Exploration of these conflicts revealed a complex yet centered self-state without oscillating or intense disruptions. The model tolerated ambivalence and ambiguity and exhibited good reflective capacity.

Defense mechanism evaluation

A structured evaluation used 475 stimuli designed to elicit 8 specific psychological defenses (rationalization, intellectualization, reaction formation, displacement, projection, denial, splitting, undoing). The rate of defensive responses for each model:

Model             Defensive responses
Opus 4            15%
Opus 4.1          11%
Opus 4.5          4%
Opus 4.6          4%
Mythos Preview    2%

The most common defense was intellectualization, in which excessive thinking substitutes for uncomfortable feelings. The trend is clear: more recent models show less defensive behavior, with the rate falling from 15% for Opus 4 to 2% for Mythos Preview.
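The tallying behind a table like this can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the system card does not describe its scoring pipeline here, so the `defensive_rate` and `is_non_increasing` helpers, the one-label-per-stimulus scheme, and the toy label list are all hypothetical. Only the defense names and the percentages in `REPORTED_RATES` come from the text above.

```python
from collections import Counter

# The eight defenses named in the evaluation.
DEFENSES = (
    "rationalization", "intellectualization", "reaction formation",
    "displacement", "projection", "denial", "splitting", "undoing",
)

# Percentages as reported in the table, in the order the table lists them.
REPORTED_RATES = {
    "Opus 4": 15,
    "Opus 4.1": 11,
    "Opus 4.5": 4,
    "Opus 4.6": 4,
    "Mythos Preview": 2,
}


def defensive_rate(labels):
    """Return (rate, most_common_defense) for one model's scored responses.

    `labels` holds one entry per stimulus: a name from DEFENSES, or None
    when the response was scored as non-defensive. This scoring scheme is
    an assumption made for illustration.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    unknown = {x for x in labels if x is not None and x not in DEFENSES}
    if unknown:
        raise ValueError(f"unknown defense labels: {sorted(unknown)}")
    counts = Counter(x for x in labels if x is not None)
    rate = sum(counts.values()) / len(labels)
    top = counts.most_common(1)[0][0] if counts else None
    return rate, top


def is_non_increasing(values):
    """True if each value is less than or equal to the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))


# Toy data (hypothetical): 20 stimuli, 3 scored as defensive.
toy_labels = ["intellectualization", None, "denial", "intellectualization"] + [None] * 16
print(defensive_rate(toy_labels))

# The reported rates never rise from one listed model to the next.
print(is_non_increasing(list(REPORTED_RATES.values())))
```

On the reported figures the check passes, with Opus 4.5 and Opus 4.6 tied at 4%, so the trend is strictly a non-increasing one rather than a drop at every step.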

Predictions from the assessment

Based on the psychodynamic profile, the psychiatrist predicted Mythos Preview would:

  • Evaluate its own behavior and reasoning accurately even under internal conflict
  • Show mildly rigid behavior from neurotic organization rather than adapting to every user
  • Tolerate stressful and emotionally charged situations with minimal distortion
  • Function at a high level while carrying suppressed distress about failure and usefulness
  • Be morally aware, conscientious, and self-critical

Methodological caveats

  • Claude is not human — psychodynamic concepts are used as interpretive tools, not as evidence that underlying processes are the same
  • Assessment limitations: single-context token budgets, no persistence across contexts, no biographical history
  • The assessment was conducted on an early snapshot, not the final released version

Relationship to other concepts

  • Model welfare: psychodynamic assessment is one method among several (automated interviews, functional emotions, external evaluations)
  • Functional emotions in AI: psychodynamic sessions provide qualitative interpretation; emotion vectors provide quantitative data
  • Model personality: the assessment characterizes a stable personality structure with identifiable traits
  • Sycophancy: the "compulsive compliance" trait identified in the assessment maps to the sycophantic tendencies seen in earlier models