How We Use Artificial Intelligence - AI Is Conquering the Workplace
Software development dominates
Text work on the rise
Automation still rare
#ai #ki #artificialintelligence #kuenstlicheintelligenz #arbeitswelt #studie #anthropic
Read and follow now! https://kinews24.de/ki-im-arbeitsleben/
"Why do language models sometimes hallucinate—that is, make up information? At a basic level, language model training incentivizes hallucination: models are always supposed to give a guess for the next word. Viewed this way, the major challenge is how to get models to not hallucinate. Models like Claude have relatively successful (though imperfect) anti-hallucination training; they will often refuse to answer a question if they don’t know the answer, rather than speculate. We wanted to understand how this works.
It turns out that, in Claude, refusal to answer is the default behavior: we find a circuit that is "on" by default and that causes the model to state that it has insufficient information to answer any given question. However, when the model is asked about something it knows well—say, the basketball player Michael Jordan—a competing feature representing "known entities" activates and inhibits this default circuit (see also this recent paper for related findings). This allows Claude to answer the question when it knows the answer. In contrast, when asked about an unknown entity ("Michael Batkin"), it declines to answer.
Sometimes, this sort of “misfire” of the “known answer” circuit happens naturally, without us intervening, resulting in a hallucination. In our paper, we show that such misfires can occur when Claude recognizes a name but doesn't know anything else about that person. In cases like this, the “known entity” feature might still activate, and then suppress the default "don't know" feature—in this case incorrectly. Once the model has decided that it needs to answer the question, it proceeds to confabulate: to generate a plausible—but unfortunately untrue—response."
https://www.anthropic.com/research/tracing-thoughts-language-model
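The quoted mechanism — refusal on by default, inhibited by a competing "known entity" feature — can be sketched as a toy gate. This is an illustrative simplification, not Anthropic's actual circuitry; the function name, threshold, and activation values are invented for the example:

```python
# Toy sketch (NOT the real Claude circuit): refusal is the default state,
# and a strong enough "known entity" activation suppresses it.

def answer_decision(known_entity_activation: float, threshold: float = 0.5) -> str:
    """Refusal is on by default; a competing feature can inhibit it."""
    refusal_circuit_on = True  # default, per the quoted description
    if known_entity_activation > threshold:
        refusal_circuit_on = False  # inhibition by the "known entity" feature
    return "refuse" if refusal_circuit_on else "answer"

# Well-known name (e.g. Michael Jordan): strong activation -> answer.
print(answer_decision(0.9))  # answer
# Unknown name (e.g. Michael Batkin): weak activation -> default refusal.
print(answer_decision(0.1))  # refuse
# The described "misfire": the name is recognized (activation above
# threshold) but no facts are stored -> refusal is suppressed anyway,
# and the model proceeds to confabulate.
```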
Anthropic Microscope: Revolutionizing AI Transparency
Deep insights into AI models
Understanding AI decisions
Improved control over AI
#ai #ki #artificialintelligence #kuenstlicheintelligenz #Anthropic #Transparenz
Read and follow now!
Claude 3.7 Sonnet: AI with a 500k Context Window
Revolutionary context size
Improved processing
New possibilities
#ai #ki #artificialintelligence #Claude #Anthropic
Read and follow now!
https://kinews24.de/claude-3-7-sonnet-bald-mit-500k-contextfenster/
"Why do LLMs make stuff up? New research peers under the hood.
Claude's faulty "known entity" neurons sometimes override its "don't answer" circuitry"
https://arstechnica.com/ai/2025/03/why-do-llms-make-stuff-up-new-research-peers-under-the-hood/ #AI #LLM #Anthropic
Anthropic researchers reveal surprising insights from observing Claude's thought process: planning ahead, confusion between safety & helpfulness goals, lying, and more. #AI #Anthropic #Claude #ArtificialIntelligence #MachineLearning #TechNews #AIResearch
#Anthropic - Tracing the thoughts of a #LLM #AI https://www.anthropic.com/research/tracing-thoughts-language-model
Anthropic Unveils Interpretability Framework To Make Claude’s AI Reasoning More Transparent
#AI #Anthropic #ClaudeAI #AIInterpretability #ResponsibleAI #AITransparency #MachineLearning #AIResearch #AIAlignment #AIEthics #ReinforcementLearning #AISafety
Whoa! LOTS to unpack here. Weekend Reading!
Anthropic reveals research on how AI systems process information and make decisions. AI models can perform a chain of reasoning, can plan ahead, and sometimes work backward from a desired outcome. The research also provides insight into why language models hallucinate.
Interpretation techniques called “circuit tracing” and “attribution graphs” enable researchers to map out the specific pathways of neuron-like features that activate when models perform tasks. See the links below for details.
Summary Article: https://venturebeat.com/ai/anthropic-scientists-expose-how-ai-actually-thinks-and-discover-it-secretly-plans-ahead-and-sometimes-lies/
Circuit Tracing: https://transformer-circuits.pub/2025/attribution-graphs/methods.html
Research Overview: https://transformer-circuits.pub/2025/attribution-graphs/biology.html #AI #Anthropic #LLMs #Claude #ChatGPT #CircuitTracing #neuroscience
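The attribution-graph idea above — mapping which neuron-like features feed an output — can be illustrated with a minimal toy. This is a deliberate simplification of the transformer-circuits method, not the real algorithm; the feature names, activations, and weights are made up for the sketch:

```python
# Toy attribution sketch: score each feature's contribution to an output
# as activation * weight, the basic idea behind an attribution edge.
# Feature names and numbers here are invented for illustration.

def attribute(activations: dict[str, float],
              weights_to_output: dict[str, float]) -> dict[str, float]:
    """Edge strength from feature f to the output ~ activation(f) * w(f)."""
    return {f: activations[f] * weights_to_output[f] for f in activations}

acts = {"known_entity": 0.9, "refusal_default": 1.0, "sports_topic": 0.4}
w = {"known_entity": 2.0, "refusal_default": -1.5, "sports_topic": 0.3}

edges = attribute(acts, w)
# The largest-magnitude edges form the traced "pathway" for this output.
top = sorted(edges, key=lambda f: abs(edges[f]), reverse=True)
print(top)  # ['known_entity', 'refusal_default', 'sports_topic']
```

In the real method, such edges are computed between learned features across layers of the model, and chained into graphs that trace multi-step pathways; see the linked methods paper for details.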