29th Annual Computational Neuroscience Meeting
CNS*2020, Online

Information-Theoretic Models in Psychology and Neuroscience

An online workshop for computational neuroscientists and mathematical psychologists taking place on 21st & 22nd July 2020. 

 

About

Information-theoretic models describe behavior and neural dynamics in intelligent agents. They have arisen through fruitful interactions between mathematical psychology, cognitive neuroscience and other fields; however, opportunities for such interactions are currently scarce. This workshop aims to fill this gap by bringing together researchers with different backgrounds but a common goal: to understand information processing in the human and animal brain.

The workshop will discuss information sampling, encoding and decoding during sensory processing, time perception and higher cognitive functions. It will review state-of-the-art techniques based on deep neural networks, probabilistic inference and dynamical systems, and present recent results that use these techniques to understand the biology and behavior of intelligent information processing.

 

The workshop will be of interest to members of the CNS community who are keen on model-driven explanations of sensory perception and higher cognition.

Speakers

  • Vijay Balasubramanian, University of Pennsylvania 

  • Peter Balsam, Columbia University

  • Beth Buffalo, University of Washington

  • Karl Friston, University College London

  • Randy Gallistel, Rutgers University

  • Larry Maloney, NYU

  • Earl Miller, MIT

  • Devika Narain, Erasmus MC

  • Bill Phillips, University of Stirling 

  • Jonathan Pillow, Princeton University

  • Dimitris Pinotsis, City, University of London & MIT

  • Tomas Ryan, Trinity College Dublin

  • Noga Zaslavsky, MIT

 

Schedule

Below is the list of talks to be held at the workshop, with the title and abstract of each.

Session 1: Tuesday 21st July 09:00 - 13:55 ET / 15:00 - 19:55 CET

Session 2: Wednesday 22nd July 09:00 - 13:55 ET / 15:00 - 19:55 CET

For any inquiries related to this workshop, please contact the organisers at pinotsis@mit.edu.

 

SESSION 1

9:00 ET / 15:00 CET - Karl Friston (UCL)


Deep Inference and Information Gain

In the cognitive neurosciences and machine learning, we have formal ways of understanding and characterising perception and decision-making; however, the approaches appear very different: current formulations of perceptual synthesis call on theories like predictive coding and the Bayesian brain hypothesis. Conversely, formulations of decision-making and choice behaviour often appeal to reinforcement learning and the Bellman optimality principle. On the one hand, the brain seems to be in the game of optimising beliefs about how its sensations are caused; on the other hand, our choices and decisions appear to be governed by value functions and reward. Are these formulations irreconcilable, or is there some underlying imperative that renders perceptual inference and decision-making two sides of the same coin?
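For orientation, the reconciliation at stake can be stated with the expected free energy of a policy, which separates into an information-gain term and a value term. The following is a sketch of the standard decomposition from the active inference literature; the notation follows that literature rather than the talk itself:

```latex
% Expected free energy of a policy \pi at time \tau (active inference):
G(\pi) \;=\; \mathbb{E}_{q(o_\tau, s_\tau \mid \pi)}
  \big[ \ln q(s_\tau \mid \pi) \;-\; \ln p(o_\tau, s_\tau \mid \pi) \big]
% which rearranges into epistemic and extrinsic parts:
G(\pi) \;=\;
  -\underbrace{\mathbb{E}_{q(o_\tau \mid \pi)}\,
     D_{\mathrm{KL}}\big[\, q(s_\tau \mid o_\tau, \pi) \,\big\|\, q(s_\tau \mid \pi) \,\big]}_{\text{expected information gain}}
  \;-\; \underbrace{\mathbb{E}_{q(o_\tau \mid \pi)} \ln p(o_\tau)}_{\text{expected value (prior preferences)}}
```

Minimising G then favours policies that maximise both expected information gain (the inferential imperative) and expected value under prior preferences (the decision-theoretic imperative), one concrete sense in which the two are two sides of the same coin.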





9:30 ET / 15:30 CET - Break


We will take a short break between speakers.




9:35 ET / 15:35 CET - Bill Phillips (Stirling)


The Magic of Neocortex: Pyramidal Cells that are Context-Sensitive Two-Point Processors as Seen by Three-Way Mutual Information Decomposition

Assuming life to be organised complexity, the theory of coherent infomax specifies the objective of hierarchical abstraction in neocortex as maximising transmission of coherent information and minimising transmission of irrelevant information. This is shown to be possible in networks of local processors with receptive fields (RFs) that convey the information to be processed and contextual fields (CFs) that specify the context within which it is to be processed. A two-point activation function uses CF input to modulate transmission of information about integrated RF input. Learning rules for the RF and CF synapses derived analytically from that objective are a refined version of the BCM rule (Bull. Math. Biol. 2011, 73, 344-372). Many neocortical pyramidal cells can operate as two-point processors in which apical input functions as a contextual modulator. Contextual modulation can be quantified using three-way mutual information decomposition. It is distinct from all four elementary arithmetic operators, and has two key distinctive properties: asymmetry between the effects of RFs and CFs, with RFs being dominant; and increasing then decreasing amounts of synergy with increases in RF strength. Decompositions of the output of a multicompartmental model of a layer 5b pyramidal cell confirm the identification of contextual modulation with apical input (Symmetry 2020, doi:10.3390/sym12050815). These findings have far-reaching implications for mental life (Neurosci. Consc. 2016, doi:10.1093/nc/niw015; Brain and Cognition 2017, 112, 39-53).

Joint work with Jim W. Kay (Glasgow), Jan M. Schulz (Basel) and Matthew E. Larkum (Berlin).
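The decompositions reported here use dedicated partial information estimators (see the cited Symmetry paper). Purely to illustrate the quantities involved, the following minimal sketch estimates the three mutual information terms for a toy two-point unit; the activation function, input distributions and parameter values are assumptions invented for this example, not the authors' model:

```python
# Toy two-point processor: context (CF) modulates the gain of the
# receptive-field (RF) drive. We estimate the three mutual information
# terms from samples; everything here is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

r = rng.choice([0.0, 1.0], n)         # RF input (the information to be processed)
c = rng.choice([-1.0, 1.0], n)        # CF input (the context)
drive = r * (1.0 + 0.8 * np.tanh(c))  # two-point activation: RF dominant, CF modulatory
y = (drive + 0.3 * rng.standard_normal(n) > 0.5).astype(int)

def H(*vs):
    """Joint entropy (bits) of discrete variables given as equal-length arrays."""
    _, counts = np.unique(np.stack(vs, axis=1), axis=0, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

ri = (r > 0).astype(int)
ci = (c > 0).astype(int)
I_yr = H(y) + H(ri) - H(y, ri)           # I(Y;R): output information about the RF
I_yc = H(y) + H(ci) - H(y, ci)           # I(Y;C): output information about the CF
I_yrc = H(y) + H(ri, ci) - H(y, ri, ci)  # I(Y;R,C): information about both together
print(f"I(Y;R)={I_yr:.3f}  I(Y;C)={I_yc:.3f}  I(Y;R,C)={I_yrc:.3f} bits")
print(f"interaction term I(Y;R,C)-I(Y;R)-I(Y;C) = {I_yrc - I_yr - I_yc:.3f} bits")
```

In this toy setting the asymmetry shows up directly: I(Y;R) comes out much larger than I(Y;C), even though the context measurably changes what the unit transmits.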





10:05 ET / 16:05 CET - Break


We will take a short break between speakers.




10:10 ET / 16:10 CET - Peter Balsam (Columbia)


Information, Anticipation and Dopamine

Adaptive behavior requires performing the right response in the right place at the right time. Stimuli in the environment provide information about when important events (e.g. rewards) are available as well as information about when they are not. These stimuli are informative to the extent that they signal a change in the rate at which events will occur. Information-theoretic approaches show that a stimulus will generate anticipatory responding to the extent that it reduces uncertainty about when rewards will occur. Inhibition of anticipation at other times is also regulated by this information. Inhibition and excitation are thus two sides of the same coin and the currency is temporal information. We show that this information modulates behavior in both Pavlovian conditioning and operant discrimination learning. Furthermore, this learning is extremely rapid and dopamine activity tracks this temporal information about reward availability.
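As a worked example of the temporal-information account (following Balsam & Gallistel's informativeness ratio, with made-up numbers rather than data from the talk), the information a stimulus conveys about reward timing can be expressed as the log ratio of the contextual reward interval to the CS-reward interval:

```python
# Sketch of informativeness: a CS is informative to the extent that the
# reward rate it signals differs from the contextual reward rate.
# The interval values below are illustrative assumptions.
import numpy as np

def informativeness_bits(context_interval, cs_interval):
    """Bits of timing information the CS conveys: log2(C/T), where C is the
    mean context reward interval and T the mean CS-to-reward interval."""
    return np.log2(context_interval / cs_interval)

C = 240.0   # mean interval between rewards in the context (s)
T = 10.0    # mean interval from CS onset to reward (s)
print(f"CS conveys ~{informativeness_bits(C, T):.2f} bits about reward timing")
```

On this account, anticipatory responding should emerge when the ratio C/T is large, independently of the absolute durations, and inhibition is expected when a stimulus signals a lower-than-contextual reward rate.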





10:40 ET / 16:40 CET - Break


We will take a short break between speakers.




10:45 ET / 16:45 CET - Randy Gallistel (Rutgers)


Using Cumulative Coding Cost to Analyze and Understand Acquisition and Extinction

The Kullback-Leibler divergence, D_KL(P‖Q), gives the average cost of encoding a datum from P using a code optimized for data from Q. When P ≡ Q, the cumulative coding cost (CCC) is distributed gamma(0.5, 1). In Pavlovian acquisition, P is the distribution of reinforcements given the CS, while Q is the distribution given the context, C, and P ≢ Q. In extinction, P is the current distribution and Q the pre-extinction distribution, and again P ≢ Q. The distribution of the subject's inter-response intervals (IRIs) during the CS separates out from the distribution in its absence in the course of acquisition. The IRI distribution separates out from the pre-extinction distribution in the course of extinction. The CCCs, computed reinforcement by reinforcement and response by response, with Bayesian estimates of the P and Q parameters, enable us to compare the growing strength of the evidence for a CS-C reinforcement-rate difference during acquisition with the growing strength of the evidence for a behavioural reaction to this difference, and likewise in extinction. The CCC comparisons suggest that subjects may themselves rely on the CCC in adjusting their behaviour to novel or changed circumstances.
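As a rough illustration of the quantities in play (not the talk's actual analysis, which updates Bayesian estimates of the P and Q parameters datum by datum), here is a minimal sketch that accumulates per-datum coding costs for exponentially distributed inter-reinforcement intervals with assumed rates:

```python
# Minimal sketch of a cumulative coding cost (CCC) computation, assuming
# exponentially distributed inter-reinforcement intervals with known rates.
# All rates and data here are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(1)

lam_cs, lam_ctx = 1 / 10.0, 1 / 60.0         # reward rates during CS vs. context (assumed)
intervals = rng.exponential(1 / lam_cs, 50)  # intervals observed under the CS

def extra_bits(x, lam_p, lam_q):
    """log2 P(x)/Q(x) for exponential densities: the per-datum saving from
    encoding x with the P-optimal code rather than the Q-optimal code."""
    return ((np.log(lam_p) - lam_p * x) - (np.log(lam_q) - lam_q * x)) / np.log(2)

ccc = np.cumsum(extra_bits(intervals, lam_cs, lam_ctx))
print(f"CCC after {len(intervals)} reinforcements: {ccc[-1]:.1f} bits")
```

Since the expected per-datum cost is exactly D_KL(P‖Q), the CCC grows roughly linearly once P and Q genuinely differ, furnishing steadily accumulating evidence of the rate change.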





11:15 ET / 17:15 CET - One Hour Break


We will take a one hour break before the next speaker.




12:15 ET / 18:15 CET - Noga Zaslavsky (MIT)


Efficient Compression and Human Semantic Systems

How do languages assign meanings to words? In this talk, I will argue that efficient data compression is a fundamental principle underlying human semantic systems. Specifically, I will argue that languages compress meanings into words by optimizing the Information Bottleneck (IB) tradeoff between the complexity and accuracy of the lexicon, which can be derived from Shannon’s Rate–Distortion theory. This proposal has gained substantial empirical support in a series of recent studies using cross-linguistic data from several semantic domains, such as terms for colors and containers. I will show that (1) semantic systems across languages lie near the IB theoretical limit; (2) the optimal systems explain much of the cross-language variation, and provide a theoretical explanation for why empirically observed patterns of inconsistent naming and soft category boundaries are efficient for communication; (3) languages may evolve through a sequence of structural phase transitions along the IB theoretical limit; and (4) this framework can be used to generate efficient naming systems from artificial neural networks trained for vision, providing a platform for testing the interaction between neural perceptual representations and high-level semantic representations. These findings suggest that efficient compression may be a major force shaping the structure and evolution of human semantic systems, and may help to inform AI systems with human-like semantics.
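To make the two coordinates of the IB tradeoff concrete, they can be computed directly from a naming system's distributions: complexity is the mutual information I(M;W) between speaker meanings and words, and accuracy is I(W;U) between words and the world states they describe. This minimal sketch uses invented toy distributions, not the cross-linguistic data from the talk:

```python
# Complexity and accuracy of a toy naming system on the IB plane.
# All distributions below are invented for illustration.
import numpy as np

def mutual_info(pxy):
    """I(X;Y) in bits from a joint probability table pxy."""
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return (pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum()

p_m = np.full(4, 0.25)                       # prior over 4 speaker meanings
q_w_m = np.array([[1, 0], [1, 0],            # deterministic 2-word lexicon:
                  [0, 1], [0, 1]], float)    # q(w|m), rows = meanings
p_u_m = np.eye(4) * 0.6 + 0.1                # p(u|m): each meaning is a fuzzy
                                             # belief over 4 world states

complexity = mutual_info(p_m[:, None] * q_w_m)            # I(M;W)
accuracy = mutual_info(q_w_m.T @ (p_m[:, None] * p_u_m))  # I(W;U)
print(f"complexity = {complexity:.2f} bits, accuracy = {accuracy:.2f} bits")
```

The IB-optimal systems maximize accuracy at each attainable level of complexity; the talk's claim is that attested lexicons lie near that frontier.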





12:45 ET / 18:45 CET - Break


We will take a short break between speakers.




12:50 ET / 18:50 CET - Larry Maloney (NYU)


Probability Distortion Maximizes Mutual Information

In decision-making under risk (DMR), participants' choices are based on probability values systematically different from those that are objectively correct. Similar systematic distortions are found in tasks involving judgments of relative frequency (JRF). These distortions limit performance in a wide variety of tasks, and an evident question is: why do we systematically fail in our use of probability and relative frequency information? We propose a Bounded Log-Odds Model (BLO) of probability and relative frequency distortion based on three assumptions: (1) log-odds: probability and relative frequency are mapped to an internal log-odds scale; (2) boundedness: the range of representations of probability and relative frequency is bounded, and the bounds change dynamically with the task; and (3) variance compensation: the mapping compensates in part for uncertainty in probability and relative frequency values. We compared human performance in both DMR and JRF tasks to the predictions of the BLO model as well as eleven alternative models, each missing one or more of the underlying BLO assumptions (factorial model comparison). The BLO model and its assumptions proved to be superior to any of the alternatives. In a separate analysis, we found that BLO accounts for individual participants' data better than any previous model in the DMR literature. We also found that, subject to the boundedness limitation, participants' choice of distortion approximately maximized the mutual information between objective task-relevant values and internal values, a form of bounded rationality.
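The log-odds core of the model can be sketched compactly: probabilities are transformed linearly on the log-odds scale, which already reproduces the familiar inverted-S distortion (overweighting small probabilities, underweighting large ones). The parameter values below are illustrative, and the sketch omits BLO's dynamic bounds and variance compensation:

```python
# Linear-in-log-odds core of the BLO model; gamma and p0 values are
# illustrative assumptions, not fitted estimates.
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def distort(p, gamma=0.6, p0=0.4):
    """Map objective p to a subjective probability: linear transform on the
    log-odds scale with slope gamma about the crossover point p0."""
    lo = gamma * logit(p) + (1 - gamma) * logit(p0)
    return 1 / (1 + np.exp(-lo))

for p in (0.01, 0.1, 0.4, 0.9, 0.99):
    print(f"objective p = {p:.2f} -> subjective p = {distort(p):.2f}")
```

With gamma < 1 the mapping pulls extreme probabilities toward the crossover point, which is the characteristic distortion pattern reported in the DMR and JRF literatures.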





13:20 ET / 19:20 CET - Break


We will take a short break between speakers.




13:25 ET / 19:25 CET - Tomas Ryan (Trinity College Dublin)


Memory and Instincts as a Continuum of Information Storage

Information must be encoded through plasticity of latent biological states in the brain. These learning-induced changes can be referred to as memory ‘engrams’. Recent studies employing novel molecular methodologies have implicated sparse populations of neurons as engram cells that contribute to the storage of distributed memory engrams. Memory engram technology has provided an unprecedented tool for the labelling and experimental manipulation of specific memory representations in the mouse: it integrates immediate early gene (IEG) labelling techniques with optogenetics to facilitate the activity-dependent tagging and reversible manipulation of components of specific memory engrams. Applying this methodology, experimental studies suggest that while short-term memories may be encoded as transient changes in neuronal excitability, long-term memories are formed by plasticity of microanatomical connectivity between engram cells. But memory is not the only form of information that guides adaptive behaviour; the other is instinct. While memory and instinct are encoded by very different processes and plasticity mechanisms, the resultant form of instincts and long-term memories may be the same: embedded, hard-wired connectivity patterns that store stable informational representations. Both memories and instincts enable the organism to make predictions about its environment that then change with experience. We propose that such ensembles can encode evolvable affordances of how animals interpret their environments.






 

SESSION 2
