The Calm Briefing

Good morning, Daniel. It's Tuesday, June 30th — halfway through the year already.

Today's Headlines

AI · Turn-Averaged SAEs for Feature Discovery and Long-Context Attribution
RESEARCH · Developmental Trajectories of Situation Modeling and Mentalizing in Transformer Language Models
RESEARCH · Correct codes for the wrong reasons? Validating LLMs as measurement instruments for theoretical constructs
AI · Memory-Managed Long-Context Attention: Editable Request-Local Memory
TRENDING · Open Memory Protocol – One Memory Store for Claude, ChatGPT, Curso
TRENDING · Working With AI: A concrete example
TRENDING · .self: A new top-level domain designed to support self-hosting

AI & Technology

ArXiv CS.CL · 1h ago
Researchers are tackling one of the major headaches in interpretability work: per-token sparse autoencoders produce a feature set that grows linearly with context length, which becomes unwieldy for long conversations. This paper introduces turn-averaged SAEs that represent an entire conversational turn with a fixed number of features instead of per-token features. When evaluated by LLMs, these turn-level features described high-level characteristics more completely than token-level ones, which could make studying long transcripts much more tractable.
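To make the idea concrete, here is a minimal sketch of turn-level pooling, assuming the simplest reading of "turn-averaged": per-token feature activations are averaged within each turn. The function name, the input layout, and the averaging step itself are illustrative assumptions, not the paper's implementation.

```python
from typing import Dict, List


def turn_average(token_features: List[List[float]],
                 turn_ids: List[int]) -> Dict[int, List[float]]:
    """Average per-token feature vectors within each conversational turn.

    token_features: one feature vector per token (hypothetical SAE activations).
    turn_ids: the turn index each token belongs to, same length as token_features.
    Returns one fixed-size vector per turn, however many tokens the turn has.
    """
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for vec, turn in zip(token_features, turn_ids):
        if turn not in sums:
            sums[turn] = [0.0] * len(vec)
            counts[turn] = 0
        # Accumulate the running sum for this turn, feature by feature.
        sums[turn] = [s + v for s, v in zip(sums[turn], vec)]
        counts[turn] += 1
    return {t: [s / counts[t] for s in sums[t]] for t in sums}


# Three tokens, two features each; the first two tokens belong to turn 0.
features = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0]]
turns = [0, 0, 1]
averaged = turn_average(features, turns)
# averaged[0] is [2.0, 1.0] and averaged[1] is [0.0, 4.0]
```

The point of the sketch is only the shape of the output: a long transcript yields one vector per turn rather than one per token.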
ArXiv CS.CL · 1h ago
This is a beautiful piece of work that takes a developmental lens to theory of mind in LLMs — tracking when and how mentalizing emerges across the Olmo2 and Pythia training stages. They find that above-chance false belief task performance requires both scale and training volume, emerges relatively late, and improves most with post-training (SFT, DPO). It's the kind of work that bridges your interests in developmental psychology, contemplative understanding of mind, and the technical mechanics of how these systems actually learn.
ArXiv CS.CL · 1h ago
When an LLM codes text the same way a human would, we call it reliable, but that says nothing about construct validity. The model might be picking up on correlates that have nothing to do with what the construct is theoretically meant to capture. This paper proposes 'grain calibration' as a method to decompose constructs into clause-level components with extractive evidence, making the reasoning process transparent rather than opaque. It's directly relevant to the kind of rigorous measurement work that matters in contemplative science and developmental research.
ArXiv CS.CL · 1h ago
This work separates two goals that long-context models typically conflate: compressing history efficiently versus maintaining reliable long-term memory. They're exploring memory-managed attention that combines a fast recurrent or sparse backbone with explicit editable memory slots. It's early-stage research, but the framing speaks to fundamental questions about how agentic systems should handle knowledge over time — when to write, overwrite, protect, or discard information.
ArXiv CS.CL · 1h ago
Across 21 matched language models, this research shows that a simple static per-layer stagger of Fibonacci-spaced attention beats both fixed and learned approaches. What's interesting is that the improvement is base-agnostic — you can apply the same stagger to power-of-2 spacing and lift it above other methods. Sometimes elegant structure beats learned complexity.
Hacker News · 4h ago
Someone's building an open protocol for unified memory across different AI assistants, including Claude. It's still early but speaks to the interoperability challenges as these systems become more agentic and personalized. Worth watching how memory architectures evolve across providers.
ArXiv CS.CL · 1h ago
Fine-tuning models with benign multilingual data can increase adversarial compliance rates fourfold in some settings, and the safety drift is highly sensitive to both the fine-tuning language and the evaluation language. It's the first comprehensive empirical look at this phenomenon, covering Llama, Qwen, and Gemma models across nine languages. Another reminder that capability and safety don't move in lockstep.
Wired via Techmeme · 6h ago
Hundreds of Meta contractors reportedly pretended to be teenagers to test how competitor chatbots responded to prompts about suicide, sex, and other sensitive topics. It's a window into the often-hidden red-teaming work that goes into AI safety, and raises questions about the methods companies use to evaluate each other's systems.
ArXiv CS.CL · 1h ago
Toxicity moderation systems face drift as harmful behavior evolves through coded language and strategic adaptation. DriftGuard introduces multi-monitor detection that tracks global drift, identity-harm drift, model uncertainty, toxic-risk drift, and false-negative-risk drift — then selectively updates the model when safety-relevant change is detected. It's an approach that recognizes moderation as a dynamic problem, not a static one.

Contemplative & Development

ArXiv CS.CL · 1h ago
This paper shows that 42.6% of annotator disagreement in hate speech datasets concentrates at the hate/offensive boundary — suggesting annotators apply different thresholds for where 'hate' begins rather than making random errors. When you collapse disagreement into majority vote before training, you're making a measurement decision that erases real variation in how people perceive harm. It's a technical paper about annotation, but it touches something deeper about whose standards get embedded in AI systems.
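A small sketch of that measurement decision: the same three annotations can either collapse to a single majority label or stay as a distribution over labels. The label names and helper functions here are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List


def majority_vote(labels: List[str]) -> str:
    """Collapse annotations to the single most common label."""
    return Counter(labels).most_common(1)[0][0]


def soft_label(labels: List[str], classes: List[str]) -> Dict[str, float]:
    """Keep the share of annotators who chose each class."""
    counts = Counter(labels)
    total = len(labels)
    return {c: counts[c] / total for c in classes}


classes = ["hate", "offensive", "neither"]
annotations = ["hate", "offensive", "hate"]  # one annotator disagrees

hard = majority_vote(annotations)        # "hate": the dissent disappears
soft = soft_label(annotations, classes)  # two thirds "hate", one third "offensive"
```

Training on `hard` treats the item as unambiguous; training on `soft` keeps the record that a third of annotators drew the line elsewhere.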

Trending Reads

Hacker News · 14h ago
The htmx creator walks through a real example of collaborating with AI on code. It's not a hot take about AGI or alignment — just a grounded, practical look at what the day-to-day experience of human-AI collaboration actually feels like for someone building software. The kind of contemplative, non-hype perspective that's rare in AI discourse right now.
Hacker News · 9h ago
A proposal for a .self TLD explicitly designed to support self-hosting and digital sovereignty. It's picking up traction on HN (241 points) and speaks to the broader desire for human-centered alternatives to corporate platform dependency. Fits squarely in the liminal web's concerns about ownership, agency, and infrastructure.
The Intercept via Hacker News · 1d ago
Someone received a 30-year sentence for transporting zines — yes, zines — and The Intercept is calling it a major free speech crisis. The details matter here, and this is the kind of story that connects questions about state power, expression, and community organizing. Worth understanding what's actually happening.

Tonight's Reading

For the evening, on the Daylight

ArXiv CS.CL
This paper sits at the intersection of several threads you've been tracking: developmental psychology, theory of mind, contemplative understanding of consciousness, and the technical mechanics of how LLMs actually learn. The researchers take a genuinely developmental lens — not just testing whether models can pass false belief tasks, but tracking when and how mentalizing capabilities emerge across training stages in Olmo2 and Pythia. They find that this capacity requires both scale and sufficient training volume, emerges relatively late, and improves most through post-training interventions like SFT and DPO. It's the kind of work that bridges rigorous empirical research with deeper questions about what 'understanding' means in these systems. The methodology alone — looking at preconditions and trajectories rather than just binary pass/fail — models the kind of developmental thinking that's been central to your own work in attachment and adult development. Estimated read time: 25-30 minutes for the full paper, though the intro and results sections reward close attention.
ArXiv CS.CL
If you've ever thought about what it actually means to 'measure' something like attachment style or developmental stage (and you have), this paper will speak to you. The core problem is elegant: when an LLM codes text the same way a human would, we call it reliable, but reliability tells us nothing about construct validity. The model might be picking up on surface correlates that have nothing to do with what the construct is theoretically meant to capture. The authors propose 'grain calibration' as a method to make the reasoning process transparent rather than opaque, decomposing constructs into clause-level components with extractive evidence. It's directly relevant to the kind of measurement work that matters in contemplative science and developmental research, and it challenges some sloppy thinking about what it means when AI 'understands' a theoretical construct. This is the kind of methodological rigor that developmental and contemplative researchers need to take seriously as LLMs become research tools. Estimated read time: 20-25 minutes.