The Bayesian Brain and Predictive Processing: A Critique, #1.

(Medical Xpress, 2024)


TABLE OF CONTENTS

1. Introduction

2. Part A: Exposition

2.1 From Passive Reception to Active Inference

2.2 Mathematical Foundations: Bayesian Inference and the Brain

2.3 Core Principles of Predictive Processing: The Brain as a Prediction Machine

2.4 Empirical Evidence for Bayesian and Predictive Processing

2.5 Applications to Cognitive Phenomena

2.6 Applications to Psychopathology

2.7 Theoretical Implications and Unification

2.8 Summary

3. Part B: Critique

3.1 Critical Challenges to Bayesian Brain and Predictive Processing Frameworks

3.2 Computational Intractability and the Tractability Problem

3.3 Neural Implementation Mysteries

3.4 Empirical Challenges and Alternative Explanations

3.5 Conceptual and Theoretical Problems

3.6 Alternative Frameworks and Neglected Perspectives

3.7 Philosophical and Phenomenological Critiques

3.8 Evaluating Neural Evidence

3.9 Methodological Concerns

3.10 Integration Challenges

3.11 Constructive Paths Forward

4. Conclusion

Appendix: Computational Intractability in Bayesian Brain and Predictive Processing Frameworks

REFERENCES


The essay that follows will be published in three installments; this one, the first, contains sections 1-2.



1. Introduction

The Bayesian brain hypothesis and the predictive processing framework have become central to contemporary discussions of perception, cognition, and action. In this essay, we outline these influential approaches, exploring their mathematical roots, empirical grounding, and implications for neuroscience and cognitive science, while also noting the challenges they face. The Bayesian brain idea views the mind as a statistical inference system, continually updating its understanding of the world through probabilistic reasoning. Predictive processing extends this logic, proposing that the brain functions as a prediction engine that constantly works to minimize errors between its expectations and sensory input. Taken together, the two frameworks aim to provide a unified explanation of how perception, learning, and action interact, all expressed through the mathematics of inference and information theory. Yet, as will be argued, this unification—though elegant—raises questions about computational tractability, empirical specificity, and the limits of theoretical reach.

2. Part A: Exposition

2.1 From Passive Reception to Active Inference

Earlier models of perception tended to treat the brain as a passive receiver of sensory information, building up internal representations from the raw data of the senses. This bottom-up view, dominant throughout much of the twentieth century (Marr, 2010), imagined perception as a stepwise construction of complex mental images from simple features. However, such models struggled to explain persistent puzzles—why illusions occur, how prior knowledge shapes what we see, and how the brain produces coherent experience from fragmentary or noisy data.

In response, a very different idea took shape: the Bayesian brain hypothesis. It suggests that the brain is not a passive receiver at all, but an active inference machine, constantly forming and updating beliefs about the world (Knill and Richards, 1996; Rao and Ballard, 1999). Perception, on this view, is an inferential process—an ongoing negotiation between what we expect and what we actually sense, governed by the rules of probability.

Building on this foundation, Karl Friston and others developed the predictive processing framework, which claims that the brain’s central function is prediction itself (Friston, 2005, 2010; Clark, 2013, 2016). The brain, they argue, generates internal models that anticipate incoming sensory signals and adjusts those models when the predictions fail. In short, perception and learning are not so much about recording the world as about constantly guessing and correcting.

Over the past two decades, these ideas have inspired extensive empirical and theoretical work across neuroscience and psychology. Researchers have explored their implications for perception, action, attention, consciousness, and even mental illness. The result is a body of work that aspires to unify many aspects of mind under a single computational principle.

The purpose of this essay is to examine these claims systematically. We review the mathematical and conceptual foundations of the Bayesian brain and predictive processing approaches, survey the evidence that supports them, and finally consider the problems and limits that come with such a grand theoretical vision.

2.2 Mathematical Foundations: Bayesian Inference and the Brain

Bayes’s Theorem and Probabilistic Reasoning

At the core of the Bayesian brain hypothesis lies Bayes’s theorem, a simple yet remarkably powerful rule for updating beliefs when new information arrives. In its familiar form, the theorem states:

Bayes’s Theorem

P(H|E) = [P(E|H) × P(H)] / P(E)

Here, P(H|E) is the posterior probability—our revised belief about hypothesis H after seeing the evidence E. The term P(E|H) is the likelihood, expressing how probable the evidence would be if H were true. P(H) represents the prior belief before new evidence appears, and P(E) is the overall probability of the evidence itself.

Applied to perception, this framework suggests that the brain treats the world as a set of competing hypotheses and sensory inputs as evidence that weighs for or against each. The perceptual task, then, is to estimate which state of the world is most likely, given noisy and ambiguous data. This idea, developed in foundational work by Knill and Pouget (2004), recasts perception as an inferential process governed by probability rather than by deterministic feature extraction.
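To make the rule concrete, here is a minimal sketch of Bayes's theorem applied to two competing perceptual hypotheses. The function name `posterior` and all of the numbers are our own illustrative assumptions, not values taken from the studies cited; the point is only to show how a strong prior can outweigh evidence that mildly favors the alternative.

```python
def posterior(priors, likelihoods):
    """Apply Bayes's theorem to a discrete set of hypotheses.

    priors[h]      = P(H = h)
    likelihoods[h] = P(E | H = h) for the evidence actually observed
    """
    # Numerator of the theorem, P(E|H) * P(H), for each hypothesis.
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    # Denominator P(E): total probability of the evidence.
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}


# Hypothetical numbers: an ambiguously shaded patch that could be
# convex or concave. The prior favors "convex"; the noisy evidence
# slightly favors "concave".
priors = {"convex": 0.7, "concave": 0.3}
likelihoods = {"convex": 0.4, "concave": 0.6}

post = posterior(priors, likelihoods)
for hypothesis, probability in post.items():
    print(hypothesis, round(probability, 3))
```

With these assumed values the posterior still favors the convex interpretation (about 0.61 against 0.39), because the prior more than offsets the weaker likelihood.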

Hierarchical Bayesian Models

Real-world perception is far too complex to be handled by a single level of inference. Hierarchical Bayesian models therefore propose that the brain operates through layers of representation, where higher levels encode abstract, general patterns that generate predictions for the lower levels. Each level attempts to anticipate the one beneath it, while discrepancies—prediction errors—are passed upward to refine the higher-level models (Lee and Mumford, 2003; Yuille and Kersten, 2006).

This notion fits neatly with what is known about cortical organization. The brain’s anatomy reveals an abundance of feedback pathways: higher cortical areas send far more signals back down than they receive from below. Such asymmetry implies that perception involves substantial top-down influence, not merely bottom-up construction (Mumford, 1992; Felleman and Van Essen, 1991).

In this view, perception is an ongoing conversation between levels of the brain’s hierarchy. Each level proposes a prediction; each lower level either confirms it or sends back an error signal. Over time, the entire system converges on a stable interpretation of the world.

Predictive Coding

Predictive coding translates these mathematical ideas into a potential neural mechanism (Rao and Ballard, 1999; Friston, 2005). The story goes as follows: higher levels of the cortical hierarchy send predictions downward; lower levels compare those predictions with actual inputs; the mismatch between the two, the prediction error, is then transmitted upward to adjust the model.

This creates a continuous loop: predictions flowing down, errors flowing up. Distinct populations of neurons may even specialize in carrying these two kinds of information, with feedback connections carrying the predictions and feedforward connections carrying the error signals (Bastos et al., 2012).

Seen from this perspective, the brain’s apparent complexity begins to make sense as an adaptive network striving to minimize discrepancy between expectation and sensation. The process is dynamic and recursive, rather than linear. Each act of perception becomes a kind of hypothesis test, a negotiation between the brain’s best guess and the sensory evidence that corrects it.
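The loop of descending predictions and ascending errors can be illustrated with a deliberately simplified sketch. It assumes a single hidden cause `r`, a fixed linear generative mapping `w`, and plain gradient descent on squared error; these are our simplifications for illustration and are not the specific models of Rao and Ballard (1999) or Friston (2005).

```python
def predictive_coding(x, w, steps=50, lr=0.1):
    """Infer one hidden cause r for the input vector x.

    The top-down prediction for input i is w[i] * r; the bottom-up
    signal is the prediction error x[i] - w[i] * r.
    """
    r = 0.0          # initial guess held at the higher level
    history = []     # summed squared error at each iteration
    for _ in range(steps):
        predictions = [wi * r for wi in w]                      # flows down
        errors = [xi - pi for xi, pi in zip(x, predictions)]    # flows up
        history.append(sum(e * e for e in errors))
        # Adjust the higher-level estimate so as to reduce the error.
        r += lr * sum(wi * ei for wi, ei in zip(w, errors))
    return r, history


# Hypothetical input generated by a hidden cause of 1.5.
r, history = predictive_coding(x=[1.5, 3.0], w=[1.0, 2.0])
print("estimate:", round(r, 4))
print("first error:", history[0], "last error:", history[-1])
```

In this toy case the estimate settles on the value that generated the input and the error shrinks toward zero, which is the sense in which the system "converges on a stable interpretation."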

2.3 Core Principles of Predictive Processing: The Brain as a Prediction Machine

Predictive processing extends Bayesian reasoning into a comprehensive account of brain function (Clark, 2013, 2016; Hohwy, 2013). The central idea is that the brain’s primary role is to minimize prediction error—the mismatch between expected and actual sensory input. By reducing this discrepancy, the brain continually refines its internal model of the world, allowing it to anticipate future events more accurately.

This minimization occurs through two complementary strategies. Perceptual inference updates internal models to better match incoming sensory information, effectively changing beliefs to fit the world. Active inference, on the other hand, involves acting to align the world with predictions, thereby changing sensory input to match expectations. Together, these mechanisms unify perception and action within a single predictive framework. Perception “explains away” sensory data by improving predictions, while action generates sensory outcomes that confirm them (Friston et al., 2009).

Precision Weighting and Attention

A key feature of predictive processing is precision weighting—the system’s assessment of how reliable its predictions are relative to incoming sensory signals (Feldman and Friston, 2010). Predictions with high precision (low uncertainty) dampen the influence of errors, while low-precision predictions (high uncertainty) allow errors to drive significant model updates.

This mechanism naturally accounts for attention. Attention can be seen as selectively increasing the precision of prediction errors in specific domains, heightening sensitivity to unexpected events while reducing sensitivity to irrelevant information (Feldman and Friston, 2010; Hohwy, 2012). In essence, attention reflects the brain’s tuning of its error signals, aligning with information-theoretic perspectives that define attention as optimizing precision.
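A minimal sketch of precision weighting follows, under the simplifying assumption that both the prior and the sensory signal are Gaussian, so that precision is just inverse variance. The function name and the numbers are hypothetical; "attention" is modeled here simply as a higher sensory precision.

```python
def precision_weighted_update(prior_mean, prior_precision,
                              observation, sensory_precision):
    """Update a Gaussian belief using a precision-weighted prediction error."""
    error = observation - prior_mean
    # The gain is the share of total precision carried by the senses.
    gain = sensory_precision / (prior_precision + sensory_precision)
    posterior_mean = prior_mean + gain * error
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision


# Same prior and same observation, two different sensory precisions.
unattended, _ = precision_weighted_update(0.0, 4.0, 10.0, sensory_precision=1.0)
attended, _ = precision_weighted_update(0.0, 4.0, 10.0, sensory_precision=16.0)
print("unattended update:", unattended)
print("attended update:", attended)
```

The same prediction error moves the belief only a little when sensory precision is low and a great deal when it is high, which is the sense in which attention is said to amplify selected error signals.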

The Free Energy Principle

Karl Friston’s free energy principle extends predictive processing into a general theory of biological systems (Friston, 2010; Friston et al., 2017). According to this principle, all living organisms act to minimize variational free energy, an information-theoretic upper bound on surprise, where surprise is the negative log probability of sensory data under the organism’s internal model.

To reduce free energy, a system can:

  • Refine its predictions (perceptual inference),
  • Act to shape sensory input (active inference), and
  • Update its internal model through learning.

This principle unifies perception, action, and learning as complementary strategies for maintaining coherence with the environment (Friston, 2010).
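The relation between free energy and surprise can be checked numerically in a toy case. The sketch below uses the standard definition of variational free energy for a discrete hidden state, F = E_q[log q(z) - log p(x, z)], which is not spelled out in the text above; the two-state model and its probabilities are hypothetical.

```python
import math


def free_energy(q, prior, likelihood):
    """Variational free energy E_q[log q(z) - log p(x, z)] for discrete z."""
    return sum(
        qz * (math.log(qz) - math.log(prior[z] * likelihood[z]))
        for z, qz in q.items()
        if qz > 0
    )


# Hypothetical generative model: hidden state z, observation x = "wet ground".
prior = {"rain": 0.2, "dry": 0.8}          # p(z)
likelihood = {"rain": 0.9, "dry": 0.1}     # p(x | z)

evidence = sum(prior[z] * likelihood[z] for z in prior)   # p(x)
surprise = -math.log(evidence)
exact_posterior = {z: prior[z] * likelihood[z] / evidence for z in prior}

f_before = free_energy(prior, prior, likelihood)            # belief = prior
f_after = free_energy(exact_posterior, prior, likelihood)   # belief = posterior
print("surprise:", round(surprise, 4))
print("free energy before inference:", round(f_before, 4))
print("free energy after inference:", round(f_after, 4))
```

Free energy is never below surprise, and it equals surprise only when the belief q matches the exact posterior. Perceptual inference, on this reading, corresponds to moving q so that the bound tightens.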

2.4 Empirical Evidence for Bayesian and Predictive Processing

Psychophysical Studies

Many psychophysical experiments support Bayesian principles in perception.

Cue Integration: When multiple sensory cues provide information about the same property—such as depth from stereo disparity and texture gradients—humans combine them in a way that approximates statistically optimal Bayesian integration, weighting each cue by its reliability (Ernst and Banks, 2002; Alais and Burr, 2004).

Prior Effects: Prior expectations influence perception. The light-from-above prior, for example, leads an ambiguously shaded patch that is lighter at the top to be seen as convex and one that is lighter at the bottom to be seen as concave, because the visual system assumes that illumination comes from overhead (Mamassian and Goutcher, 2001). Motion perception also incorporates priors favoring slow, smooth movement, reflecting assumptions about natural motion in the environment (Weiss, Simoncelli, and Adelson, 2002).

Contour Integration: The brain integrates discrete edge elements into continuous contours according to the likelihood that these elements belong together, consistent with Bayesian predictions (Geisler et al., 2001).
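The reliability weighting described under Cue Integration can be written out in a few lines. This is a sketch of the standard inverse-variance rule for independent Gaussian cues, which is the benchmark such experiments compare human performance against; the depth estimates and variances below are hypothetical, not data from the cited studies.

```python
def combine_cues(estimates, variances):
    """Combine independent Gaussian cues, weighting each by its reliability."""
    reliabilities = [1.0 / v for v in variances]    # reliability = 1 / variance
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]    # weights sum to 1
    combined = sum(w * e for w, e in zip(weights, estimates))
    combined_variance = 1.0 / total
    return combined, combined_variance, weights


# Hypothetical depth estimates (cm): stereo disparity is the more reliable cue.
estimates = [10.0, 14.0]    # disparity, texture
variances = [1.0, 4.0]

combined, combined_variance, weights = combine_cues(estimates, variances)
print("weights:", weights)
print("combined estimate:", round(combined, 3))
print("combined variance:", round(combined_variance, 3))
```

Two features of the result matter for the empirical claim: the combined estimate lies closer to the more reliable cue, and its variance is lower than that of either cue alone.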

Neurophysiological Evidence

Neural studies provide direct evidence for predictive coding mechanisms.

Prediction Error Signals: Neurons in auditory cortex respond strongly to unexpected sounds but less to predictable ones, indicating that neural activity encodes prediction errors rather than raw input (Winkler et al., 1996; Garrido et al., 2009).

Hierarchical Processing: fMRI studies show prediction errors propagating upward through cortical hierarchies while predictions flow downward. Higher cortical areas respond more to surprising events, while lower areas encode the discrepancy between predicted and actual input (den Ouden, Kok, and de Lange, 2012).

Repetition Suppression: Reduced neural activity to repeated stimuli (repetition suppression) aligns with predictive coding, as expected stimuli generate smaller prediction errors (Summerfield et al., 2008; Kok et al., 2012).

Neuroimaging Studies

Neuroimaging studies further support hierarchical predictive processing.

Predictive Context Effects: The fusiform face area shows reduced activation to faces predicted by context, consistent with reduced prediction error (Egner et al., 2010).

Violation Responses: Unexpected violations of learned patterns activate prefrontal and parietal regions, reflecting hierarchical prediction error signaling (Bubic et al., 2010).

Precision Manipulation: Manipulating uncertainty systematically modulates neural responses, with increased uncertainty amplifying the processing of prediction errors (Hesselmann et al., 2010).

2.5 Applications to Cognitive Phenomena

Perception and Illusions

Predictive processing explains perceptual illusions as cases where strong expectations override ambiguous sensory input. The hollow-mask illusion, in which a concave mask appears convex, demonstrates how robust priors about facial structure dominate conflicting evidence (Gregory, 1980). Similarly, bistable phenomena like the Necker cube arise from competing interpretations producing comparable prediction errors, leading to perceptual alternation (Hohwy et al., 2008).

Action and Motor Control

Through active inference, motor control is recast as prediction-driven. The brain anticipates sensory consequences of movements, particularly proprioceptive feedback, and reflexes act to minimize the resulting prediction errors (Friston et al., 2009). This framework elegantly accounts for motor learning, adaptation, and sensorimotor integration. Disruptions in prediction or precision weighting can explain disorders such as apraxia or other motor impairments (Edwards et al., 2012).

Learning and Development

Learning involves updating the generative model to reflect causal regularities in the environment. Synaptic plasticity and structural changes reduce long-term prediction error (Friston, 2010). Development represents the refinement of generative models over time. Early in life, priors are imprecise, allowing rapid learning. As experience accumulates, models stabilize, slowing learning and increasing predictive confidence (Gopnik and Wellman, 2012).

Attention and Consciousness

Attention emerges from precision weighting, selectively amplifying prediction errors in relevant domains (Feldman and Friston, 2010). Bottom-up attention reflects unexpected error signals, while top-down attention represents goal-directed precision modulation. Conscious experience may correspond to high-level predictions that best account for lower-level sensory input, with disorders of consciousness arising from failures in hierarchical prediction or precision regulation (Hohwy, 2013; Clark, 2013).

2.6 Applications to Psychopathology

Schizophrenia and Psychosis

Psychotic symptoms can be understood as disturbances in prediction error signalling and precision weighting (Fletcher and Frith, 2009; Adams et al., 2013). Hallucinations can arise when internally generated predictions are assigned excessive precision, so that they are experienced as external events. Delusions can result from attempts to explain anomalous prediction errors. Dopamine appears to encode precision, and its dysregulation can lead to aberrant salience of irrelevant stimuli (Kapur, 2003; Corlett et al., 2009).

Autism Spectrum Disorders

Autism has been conceptualized as involving overly precise sensory prediction errors combined with imprecise higher-level predictions, leading to heightened sensory sensitivity and difficulty forming contextual expectations (Pellicano and Burr, 2012; Van de Cruys et al., 2014). This “hypo-priors” account helps explain detail-focused processing, social prediction difficulties, and challenges in interpreting environmental regularities.

Anxiety and Depression

Anxiety can reflect over-precise threat priors or excessive uncertainty, driving hypervigilance (Paulus and Stein, 2006). Depression may involve flattened precision for positive prediction errors, reducing responsiveness to rewarding outcomes and reinforcing low mood (Clark et al., 2018).

2.7 Theoretical Implications and Unification

Unifying Perception and Action

Predictive processing integrates perception and action into a single framework, where both aim to minimize prediction error (Friston et al., 2009). Perception updates internal models, and action modifies sensory input to meet predictions. Motor commands can thus be viewed as predicted proprioceptive states that reflexes implement, resolving traditional questions about sensorimotor coordination.

Connecting Neuroscience and Cognitive Science

The framework bridges computational, algorithmic, and neural levels (Marr, 1982), linking Bayesian inference, predictive coding algorithms, and hierarchical cortical structures. Minimizing prediction error aligns with minimizing free energy, connecting cognitive processes to both neural activity and thermodynamic principles (Friston, 2010).

Evolutionary and Developmental Perspectives

From an evolutionary perspective, minimizing surprise ensures survival, favoring accurate generative models (Friston et al., 2017). Development mirrors this process ontogenetically: early life emphasizes exploration and high learning rates, while later stages exploit stabilized models for efficient prediction and action.

2.8 Summary

The Bayesian brain hypothesis and predictive processing framework fundamentally shift how we understand cognition. By portraying the brain as an active, anticipatory system, these theories unify perception, action, learning, attention, and psychopathology. Empirical studies across psychophysics, neurophysiology, and neuroimaging consistently support prediction error minimization, hierarchical processing, and precision weighting.

Predictive processing provides a principled bridge between computational theory, neural implementation, and observable behavior, reframing classical problems in perception, motor control, development, and consciousness. Clinical conditions such as schizophrenia, autism, anxiety, and depression can be interpreted in terms of disrupted prediction or imbalanced precision weighting.

Overall, this theory portrays the brain as a proactive system, continuously striving to reduce surprise and maintain coherence with the environment, offering a unified, mathematically grounded framework that connects theory, neural dynamics, and lived experience.


