What is information?

As the concept of information takes a more fundamental role in the theory, it becomes more of a starting point, and hence an undefined term for which some intuitive meaning has to be assumed. One could say that a bit is the answer to one yes/no question, but this is effectively circular: one now has to say what “one” and “question” mean. “Uncertainty” is similarly undefined - certainty and uncertainty are, at bottom, feelings that we experience, not concepts that we can derive from something else.

What is time?

Good question. Hopefully the next two sections will shed some light on this. We can discuss it some more in July.

How would one measure telepathic signals?

Do you mean, how would you measure the effect of such a (hypothesized) signal and hence get evidence for its existence, or how would you measure the signal itself, using something other than a human as a detector?

In what ways/models is the Markov Blanket physical and/or informational?
It is the title of the course; how are “Informational” and “Physical” similar and different?

Markov blankets (MBs) are defined for classical, causal processes that can be represented as directed acyclic graphs (DAGs), i.e. diagrams in which states are connected by unidirectional arrows and there are no cycles, meaning no circular (or backwards-in-time) causation. In this setting, the MB around some state (or set of states) X is the set of states that send arrows to X (“parents of X”), plus the set of states that receive arrows from X (“children of X”), plus any states that are parents of X’s children. Hence all causal inputs to and outputs from X traverse the MB states. Since the MB is a set of states, it is physical. Since it effectively encodes all information that X receives from and sends to the outside world, it is informational. We will see in the June session how a holographic screen functions as an MB, even though it is defined differently. Hopefully your second question will be answered by the end of this course.
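The definition above can be made concrete in a few lines of code. This is a toy sketch: the graph, the node names, and the `markov_blanket` helper are hypothetical, not from the course materials.

```python
# Minimal sketch: the Markov blanket of a node in a DAG is
# parents + children + co-parents (other parents of the children).
def markov_blanket(dag, x):
    """dag maps each node to the set of its parents; x is the target node."""
    parents = dag[x]
    children = {n for n, ps in dag.items() if x in ps}
    co_parents = {p for c in children for p in dag[c]} - {x}
    return parents | children | co_parents

# Hypothetical graph: A -> X, D -> X, X -> C, B -> C
dag = {
    "A": set(), "B": set(), "D": set(),
    "X": {"A", "D"},   # parents of X
    "C": {"X", "B"},   # child of X, with co-parent B
}
print(sorted(markov_blanket(dag, "X")))  # ['A', 'B', 'C', 'D']
```

Every arrow into or out of X passes through one of these four states, which is exactly the “all causal inputs and outputs traverse the MB” property.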

What is a black hole composed of? What gives it mass, e.g. photons, elements from the periodic table or what? How does it form from a star (composed of e.g. iron and other elements)?

Black holes (BHs) are a prediction of classical general relativity (GR). In GR, mass is curvature, so effectively tension, in spacetime. A BH is a region of spacetime in which the curvature is so large that light cannot escape. If spacetime were two-dimensional, it would look like this picture, which I got from the Math Stack Exchange. The x and y axes are spacetime; the z axis is energy, in the form of gravitational potential energy. Anything that falls into a BH increases the BH’s mass and hence its curvature. The standard astrophysical model is that BHs form when very massive stars run out of fuel, stop radiating energy, and collapse under their own gravity. There is also the idea that BHs form from anisotropies in the very early universe. Both of these ideas are classical. Observationally, BHs appear to exist in the centers of galaxies, including our own. These BHs are very large - e.g. the size of the whole solar system - so they have orders of magnitude more mass than the sun. There is a huge literature on these objects - start with the Wikipedia page and follow the links.

At around 12 minutes it is said that the logarithm is there because the number of states is enormous. I just wanted to comment that the logarithm is there because of the combinatorics: it allows one to combine the entropies of different systems in the appropriate way, which wouldn’t work without the fact that log(ab) = log(a) + log(b).

Yes - logs make entropy additive and so much easier to think about.
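A quick numerical sketch of that additivity, using S = k_B ln(Ω) for two independent systems (the Ω values are made up for illustration):

```python
import math

# Two independent systems have Omega1 * Omega2 joint microstates;
# the log turns that product into a sum, so entropies add.
k_B = 1.380649e-23  # J/K

omega1, omega2 = 10**20, 10**22   # illustrative state counts
S1 = k_B * math.log(omega1)
S2 = k_B * math.log(omega2)
S_joint = k_B * math.log(omega1 * omega2)

print(math.isclose(S_joint, S1 + S2))  # True
```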

So, to maintain the integrity of its markov blanket over time, a system must be able to constrain the set of its possible future states... gravitate toward the "good" states where it stays together, avoid the "bad" states where it falls apart. This essentially means it must limit its own entropy, or "resist" entropy somehow. What allows a system to do this? Darwinian fitness? Can that be physically measured?

Schrodinger introduced this idea of “resisting entropy” in his book “What is Life?” (1944). “Staying together” as an identifiable “thing” takes energy. Relatively simple, solid things like crystals can resist falling apart with chemical binding energy alone. Relatively complex, soft things like cells or organisms have to use metabolic energy from controlled chemical reactions. The energy that isn’t stored in chemical bonds is radiated back into the environment as heat or, via the relation dQ = TdS, as entropy. Hence they are called “dissipative systems” - that is, systems that dissipate entropy. This idea is the foundation of theories of self-organization. One can view all of biology as the study of self-organizing systems. Darwinian fitness is, technically, the probability of one’s genotype or phenotype (depending on the formulation) appearing in the next generation, so it depends on successful dissipation of entropy, at least for long enough to reproduce.

Was trying to figure out how adding one quantum of action to a system changes its number of possible states. Given S = k_B ln(Ω), where Ω is the number of states, we can get Ω = e^(S/k_B). Where can we go from here? Should we use the Wick rotation to try to replace Boltzmann’s constant with Planck’s constant? How would that look?

Adding action = finite energy in finite time doesn’t increase the number of possible states (i.e. doesn’t change the state space), it just changes the probability distribution of state occupancy. When we add energy, more energetic states have a higher probability of being occupied. Adding or subtracting possible states (changing the state space) typically involves adding or subtracting matter and hence particle degrees of freedom. This is where black holes are a limiting case - adding matter, or even energy in the form of photons, is adding curvature, which makes the horizon beyond which light can’t escape larger. This adds states (i.e. entropy) to the horizon, which is effectively the boundary between the BH and the outside world.
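A toy sketch of the occupancy point: fix a three-level state space and watch only the Boltzmann probabilities change with temperature. The level spacings are arbitrary, chosen purely for illustration.

```python
import math

# Three fixed energy levels, expressed as E/k_B in kelvin (made-up values).
levels = [0.0, 100.0, 200.0]

def occupancy(T):
    """Boltzmann distribution over the fixed levels at temperature T."""
    weights = [math.exp(-e / T) for e in levels]
    Z = sum(weights)
    return [w / Z for w in weights]

cold = occupancy(50.0)
hot = occupancy(500.0)

# Same three states either way; adding energy (raising T) just flattens
# the distribution, pushing probability into the higher-energy states.
print(len(cold) == len(hot))  # True: the state space is unchanged
print(hot[2] > cold[2])       # True: top level more occupied when hot
```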

To make an observation on a system we must add energy to lower its entropy. But it seems like adding energy can also increase entropy. What makes the difference? Is it a matter of perspective?

Yes. This will be the main theme of the July session. Whether a behavior is “informative” or “noise” depends on who is looking and how they look. Think about colliding a proton with an antiproton at the LHC. If I look with a complicated detector that measures outgoing particle momenta, charge, and spins, I get a lot of information to test the Standard Model. But if I just measure with a calorimeter, I just get heat, so from my perspective I’ve just increased the system’s entropy. This gets back to Boltzmann’s insight, which is that entropy is a measure of the number of states that I can’t distinguish with whatever measurements I am deploying.
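Boltzmann’s insight can be sketched numerically: the uncertainty left over after a measurement is the log of the number of microstates lumped into one resolvable bin. The eight-microstate setup below is a hypothetical example, not from the lecture.

```python
import math

# Hypothetical example: eight equally likely microstates.
# A fine-grained detector resolves each one; a calorimeter-like
# detector lumps them into bins of four.
def unresolved_entropy_bits(microstates_per_bin):
    """Residual uncertainty: log2 of the microstates hidden in one bin."""
    return math.log2(microstates_per_bin)

fine = unresolved_entropy_bits(1)    # every microstate distinguished
coarse = unresolved_entropy_bits(4)  # four microstates per resolvable bin

print(fine)    # 0.0 -> no residual uncertainty
print(coarse)  # 2.0 -> two bits left unresolved by the coarse detector
```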

Does the Wick rotation tell us that matter and energy somehow exist “90 degrees” perpendicular to our 3 dimensions of space? In “imaginary” or “complex” space?

In a sense, yes. Time is perpendicular to our 3 dimensions of space, and matter and energy (and we ourselves) exist in time. Wick is pointing to the intimate relationship between how we measure time and how we determine (or judge) that something maintains its identity through time, i.e. stays “the same thing” through time. To tell time by a clock, I have to be sure I’m looking at the same clock, set the same way, etc. We’ll get into this in July and August.

Dr. Fields, when you described Boltzmann’s theory of entropy as a measure of our uncertainty, did this not imply the presence of a subject who is uncertain? In other words, this definition was perspectival from the get-go (which would be a hint of quantum theory’s observer)? Or, perhaps, there was some kind of universal uncertainty hanging in the air? If it is the former, can we really talk about “the entropy of black hole C” and not “the entropy of subject Z evaluating black hole C”?
In Shannon’s theory it was originally the receiver who was uncertain about the sender’s next message. In the FEP, the entropy of the sensory states of System A can be minimized, while the entropy of System A’s beliefs can be maximized - this is all from the perspective of System A.
Who or what is uncertain about the entropy of a black hole, or even a gas in a chamber, and how does the state of such a subject affect the measurement of entropy? Can a photon be uncertain while moving at the speed of light? The Universe? A photo camera? What are the necessary requirements of an observer, then?

Yes. Entropy was thought of as objective, i.e. observer-independent, in classical physics because essentially everything was thought of as objective - there were only Galileo’s and then Einstein’s notions of relativity to take account of. But it did depend from the start on what measurements were made and what the observer knew beforehand (in Bayesian terms, what the observer’s priors were). A measurement always employs a communication channel, as Shannon recognized. We can think of the channel states as a Markov blanket or a boundary, as the FEP does - the observer is always on one side of this channel and the observed system (black hole, ideal gas, whatever) is on the other side. If the observer is to acquire information and hence change her uncertainty, she has to have a memory she can access. This is what photons don’t have - a photon can encode a memory for me, but it doesn’t encode a memory for itself.

When entropy is referred to as a type of uncertainty, (1) correct me if I’m wrong, but it seems to be talking about something like a threshold on the maximum information you could get out of a system before you start to change its thermodynamic properties. Surely there are sources of uncertainty, like a smudge on a lens, that have no relevance to the thermodynamics of the observed system. (2) What types of observation are on this threshold, where they require altering the thermodynamic properties of the system being observed? For instance, when Brownian motion was discovered with small bits of pollen jiggling in a liquid: (3) do these motions of the pollen, which seem to convey some amount of information about the micro-fluctuations of the fluid, have to balance this information gain with some sort of increase in entropy? It seems that there is no obvious limit to how long the pollen could dance this way, constantly displaying information, and no clear free energy for the pollen to be utilizing. My best guess is then that this does not qualify as information, despite its ruling out certain microstates that would not have pushed the pollen in the observed direction at the moment of observation, thus reducing uncertainty; (4) is this because the information has no predictive power? What am I missing here?

The pollen keeps dancing because it, and the liquid, are embedded in an environment at finite temperature and open to thermodynamic exchange with that environment. It would stop if the whole business was frozen to absolute zero. We can see the dancing because the system is being bathed in light (i.e. energy), some of which is reflected back to us. This is an example of how observation requires thermodynamic exchange, and of why the classical idea of “passive observation” doesn’t actually make sense. To get information about the pollen, we have to make observations over time (e.g. with a video camera) and spend energy writing the results to memory (e.g. the memory chip in the video camera, which requires DC power). This shows how we, as observers, also have to be open to thermodynamic exchange, as do all of our bits of laboratory apparatus. In the presentation, I simplified this by just considering a system and its environment (which in this case includes us, our apparatus, etc). In August or September we will get to an explicit picture of systems that have multiple components (e.g. us, apparatus, the rest of the environment) that exchange classical information. At either level of detail, though, getting and recording information requires thermodynamic exchange.

What does Wheeler mean when he says there are no laws?

Philosophers and historians of physics debate this question; a recent example is . My own view is that he is pointing out that the “classical reality” that “laws of physics” are supposed to describe is observer-relative. It’s well worth reading Wheeler’s paper, available at .

Is the interface equal to the channel in the Shannon and Weaver model? Is it the Markov blanket? Is the big problem that we don’t know who works the channel?

The interface/MB is indeed the channel. I’m not sure what you mean by “who works the channel” - Alice and Bob write on and read from it, just as we are doing with this Coda interface and the net as a channel.

I noticed this relationship and wanted to get your take on it. Basically, it/hbar = 1/(k_B T). Notice that temperature can be written as T = dE/dS, and S = k_B ln(Ω), so the right-hand side becomes 1/(k_B T) = (1/k_B) dS/dE = d ln(Ω)/dE. Hence d ln(Ω)/dE = it/hbar. The association of the time factor it in quantum mechanics with the variation of the logarithm of the number of accessible states with respect to energy (d ln(Ω)/dE) suggests that the evolution of quantum systems (which is often considered timeless or reversible in isolation) could also be understood in terms of a changing number of accessible states, hinting at a possible deep connection between quantum mechanics and the thermodynamic arrow of time. Is my reading correct? I just watched the first lecture and couldn’t help but notice this. I would love to discuss further via email or some other channel.

Nice observation. The relationship between QT and time is an active area - Rovelli’s papers arxiv:1812.03578 and 2010.05734 are good entry points. A key issue in this relationship is the observer-relativity of entropy or information. From this perspective, dS/dE can be read as the change in information flow to/from the observer (the boundary is informationally symmetric) as the energy expended to get information increases/decreases. Seeing the observer’s internal clock (time reference frame) as a bit counter couples this observer-relative information flow to observer-relative time.

What is meant by local free choice?
Does it just mean that we cannot always predict perfectly what will happen next, that our actions are based on uncertainty?

“Local” here means “at some boundary.” QT is a globally deterministic theory, i.e. any isolated system evolves unitarily. Isolated systems are unobservable by definition; they are the abstractions used to get the theory off the ground. If we think of some observer (Alice) and “everything else” (not-Alice), then the joint system comprising these two is isolated by definition. Alice doesn’t observe the joint system; she observes not-Alice (usually called Bob). Think of the boundary between Alice and Bob as an array of qubits, as in slide 16 of the talk. Alice and Bob interact by preparing and measuring these qubits. We can think of the qubits as spins, and the preparation and measurement as using the z-spin operator s_z. The z axis is the “up vs. down” direction. Local free choice is the freedom of Alice and Bob to each choose their own z-axis completely independently of each other. If they don’t have this freedom, they are entangled. We’ll discuss this more in the July session.
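For a spin-1/2 qubit, the standard textbook result is that if Alice prepares “up” along her freely chosen z-axis and Bob measures along an axis tilted by an angle theta from hers, he sees “up” with probability cos²(theta/2). A minimal sketch:

```python
import math

# Probability that Bob's outcome agrees with Alice's preparation,
# as a function of the angle between their independently chosen axes.
def p_agree(theta):
    return math.cos(theta / 2.0) ** 2

print(p_agree(0.0))                       # 1.0 -> same axis: perfect agreement
print(round(p_agree(math.pi), 12))        # 0.0 -> opposite axes
print(round(p_agree(math.pi / 2), 12))    # 0.5 -> orthogonal axes: coin flip
```

The point of local free choice is that theta itself is up to Alice and Bob: each picks an axis with no constraint from the other.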

Dr. Fields. When we say “uncertainty” with respect to entropy, what version of probability do we actually mean (In Sean Carroll’s terms.) One version of 50% of heads vs tails means – you throw a coin a 1000 times and get approximately 500 heads - empirical. Another one is the confidence in a belief, such as a prognosis of 50% chance of rain tomorrow in Rome – nobody will actually do a 1000 experiments in this case. Essentially – the second version of probability is inferential.

So when we say entropy is a measure of the observer’s uncertainty - you showed this wonderful example of a proton and an anti-proton colliding in the LHC, measured with a calorimeter vs. a complicated detector - it is about HOW we measure.

But if we are talking about inferential probability - a measure of confidence in a belief or a future measurement/answer/message - then are we not talking about the quality of the agent’s generative model? Untrained Chat-GPT has high entropy when asked a medical question. Chat-GPT trained with full access to all papers on PubMed would have lower entropy with respect to medical questions. Also, in order for Chat-GPT to be flexible and adaptable to new data published on PubMed, does it not need to have a high entropy of beliefs (broad distributions), so that it is able to update the model when new data (surprise) comes in with relatively high confidence that contradicts the priors? Could you please comment on this inferential aspect of “uncertainty”?

Good question, one that points to an active area with a long history, going back at least to Laplace. The “empirical” or “frequentist” idea of probability is generally taken to also be “objective” in the sense of observer-independent. The “inferential” or “Bayesian” (see ) idea of probability is “subjective” or observer-specific. The latter perspective works well with observer-relative conceptions of quantum states and operations; see Chris Fuchs, arxiv:1003.5209, or David Mermin, arxiv:1809.01639, for good introductions to this debate (both on the Bayesian side). I always prefer these observer-specific readings of entropy, information, measurement, or action. They depend on how the agent interacts with the world, i.e. what reference frames they have available and how they use them. You are right to call this the “quality” of the agent’s generative model. Chat-GPT is a nice example. The July session will focus on these questions of the reference frames used to act and make measurements, and how they also determine what data can be stored in a memory and accessed later.

Dr. Fields, if we consider Planck’s equation, with energy being proportional to frequency, and then consider energy’s relationship with entropy, is it fair to say that an increase in frequency (all other things being equal) will result in higher entropy?
If we apply this to a human brain – EEG measurements. Deep sleep is low frequency, while wakefulness is high frequency – gamma. Those who actually empirically measure EEG entropy (ApEn, SpEn, etc) observe that entropy increases when we wake up and even when we just open our eyes while being already awake.
Is it just frequency, or the variation in frequencies as well? Is it fair to say that with higher frequency/energy there is generally more noise and variability, leading to more uncertainty?

This is a nice connection to entropy measures in EEG - thanks for bringing it up. As discussed in the context of earlier questions, entropy and hence uncertainty are observer-relative concepts. The question “Whose uncertainty?” always arises. If Alice is measuring EEG, then a variable signal occupies more frequency states than a constant signal, and hence has higher entropy for Alice. This is independent of what the frequency of the signal is telling Alice about the energy and hence the entropy (dE = TdS) of the system being measured. Whether Alice views variation in the signal as “noise” or as informative depends on Alice’s measurement capabilities and priors - if she expects a constant signal but measures a variable one, with some nice, e.g. Gaussian structure in the variation, she may consider the variation “noise” around her expected constant value. But if she expects variation - e.g. she is looking for a phase-coded message in the signal - the variation is no longer noise, but what she’s after.
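The “variable signal has higher entropy for Alice” point can be sketched with Shannon entropy over binned readings. The sequences below are invented, standing in for binned frequency measurements.

```python
import math
from collections import Counter

# Shannon entropy (in bits) of a sequence of binned readings.
def shannon_entropy_bits(seq):
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

constant = [10] * 8                        # same bin every sample
variable = [8, 9, 10, 11, 12, 10, 9, 11]   # spread over several bins

print(shannon_entropy_bits(constant) == 0.0)  # True: one state occupied
print(shannon_entropy_bits(variable) > 0.0)   # True: more states, more entropy
```

Whether Alice calls the spread “noise” or “signal” is a separate question about her priors; the entropy number itself only counts occupied states.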

How do you relate the scattering matrix (quantum) to the transition matrix (classical) in Markov Blanket?

Good question, one that brings up a difference between the theories and how they are used. An MB is defined in a causal (directed) network. One can draw a closed boundary anywhere in such a network and define an MB as the set of states with incoming or outgoing arrows crossing the boundary, plus any states not yet counted that have arrows going to the states with incoming boundary-crossing arrows. So if you pick a node, you can define a sequence of larger and larger MBs around that node, each containing all the states of any smaller MBs around that node. In the limit, you get to the boundary of the whole network, outside of which there are no more nodes, so the process stops. The S-matrix takes essentially the opposite approach, starting with idealized “free particle” states at + and - infinity in both space and time from some “scattering center” where an interaction happens. The S-matrix is then the operator S: |in> → |out>. This formulation effectively assumes that “no one is looking” during the actual interaction, which happens “fast” compared to the processes (not represented by the S-matrix) of preparing the initial state and measuring the final state. The explicit causal network of the classical picture is replaced by “everything not explicitly forbidden is mandatory” (M. Gell-Mann, from T. H. White), with the standard representation being the hierarchy of increasingly complicated Feynman diagrams. The other key difference is unitarity and hence spacetime reversibility: all the Feynman diagrams also work with the arrows reversed.

Does a Markov boundary have symmetrical input/output bandwidth?

If “bandwidth” is defined by counting incoming and outgoing arrows, the answer is no - nodes can have high fan-in and low fan-out or the reverse. More subtle metrics, e.g. transfer entropy, can also be defined. The most general way of asking the question would be in terms of conservation of energy, in which case the answer is yes, provided there are no sources or sinks inside the MB. A source or sink is effectively a singularity - a node where the energy changes with a step function - so one could require causal networks with no singularities. This is a place where the quantum theory is much more straightforward - holographic boundaries are informationally symmetric by definition. Whether the agents on the two sides of the boundary can use the information in any interesting way depends on their computational capabilities, and so is not in general symmetric.

Does the holographic principle make it easier to compute any known physics problems?

The HP is the statement that interactions between finite systems have not just finite eigenvalues, but finitely encodable eigenvalues. Quantitatively, it sets an upper limit on the entropy associated with a boundary, S <= A/4, where A is the boundary area in Planck units, with equality only (by definition) for black holes. So the HP only gives you a number for BHs - e.g. it lets you associate an entropy with a mass via R = 2M. The HP is also the basis for holographic dualities like AdS/CFT, which effectively let you change the basis in which a calculation is done. There is a huge literature of applications of AdS/CFT.
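As a rough sanity check on S <= A/4, here is the Bekenstein-Hawking number for a solar-mass black hole. The constants are standard values; the result should be read as an order-of-magnitude estimate.

```python
import math

# Standard constants (SI), rounded.
G = 6.674e-11        # m^3 kg^-1 s^-2
c = 2.998e8          # m/s
hbar = 1.055e-34     # J s
M_sun = 1.989e30     # kg

R = 2 * G * M_sun / c**2      # Schwarzschild radius (the "R = 2M" above)
A = 4 * math.pi * R**2        # horizon area, m^2
l_P2 = G * hbar / c**3        # Planck length squared, m^2
S_over_kB = A / (4 * l_P2)    # dimensionless entropy S / k_B

print(f"S/k_B = {S_over_kB:.2e}")  # ~1e77 for one solar mass
```

This enormous number (compared to, say, the sun’s thermodynamic entropy of ~1e58 k_B) is why black holes are the limiting case of entropy per unit boundary area.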

Quote from 1st lecture: “The FEP tells us that all systems are agents doing science all the time”. This reminds me of a discussion I had with John Campbell; he said: “Science is simply a rediscovery of a method that nature has always used to generate and maintain existence by gathering knowledge” (i.e., Active Inference). From this, 3 questions:
1- Is science a method for generating and maintaining existence in the form of scientific theories and technologies?
2- Could we consider science as a specific socio-cultural practice aimed at securing our collective synchronization with the external environment?
3- If yes, and if we compare it with other collective synchronization strategies around the world, could we say that it is not the most effective synchronization strategy, given the environmental costs of the scientific enterprise?

In what you quote, I am using the word “science” very broadly, essentially as a synonym for active inference: science (in this usage) is the “practice” of probing the world in order to get new information from it. I suspect that John C. as you quote him was using “science” in the narrower meaning that distinguishes it from other strategies for interacting with the world, as you are in your question #3. In my broad usage, all synchronization strategies are variants or implementations of active inference so they are all “science.” The question of effectiveness can also be asked from multiple perspectives. From an active inference perspective, an effective procedure is one that uses minimal energy to construct a GM with good predictive power. Part of this strategy could involve molding the environment to make it more predictable. But this strategy can clearly get into local minima in which the environment looks more predictable in the short term, but is in fact less predictable in the long term.

Dr. Fields, may I ask a question on how you derive time from communications?

This citation from your paper with colleagues: “As discussed in §3.3 above, the idea of sequential measurement, and hence the idea of recordable time, is only physically meaningful for observers able to write data irreversibly to a classical memory. The action of writing to a memory sector Y defines an A-specific, local time QRF tA as illustrated in Fig. 5. The most natural unit of tA is the minimal time to write one bit, […].”

This is a description of an interval of time. However, is the concept of a point - synchrony (co-incidence) - not necessary for this definition? If so, then this seemingly new definition of time already depends on some sense of classical time - the most basic concept of time measurement, the point of synchrony.

Say the time to write one bit would actually mean a process that starts exactly when the encoder begins to write 1 bit and ends exactly when it stops. How, and who/what, would know exactly when such a co-incidence occurs? What is the definition of synchrony or co-incidence in your framework? Is it subject-independent or purely subjective? Is there a need for a meta-observer, judging when exactly the synchrony is established - starting and stopping the local stopwatch?

For a human subject in infancy, an act of touch by a mother can possibly be defined as a subjective analog of being exactly in the same time and space as mom - on the scale of the whole humans (A. Fotopoulou’s thought).

What about a virus, a rock, a chimp, a star? Until synchrony is formalized independently, can we really measure the exact interval it takes to write one bit or all data?

I think what you are pointing to here is the fundamentally stipulative nature of QRF definitions. I can only tell an external clock is ticking because I have an internal time QRF, so my use of an external time QRF, even an attosecond resolution light clock, depends on my internal QRF. There is nothing I can measure my internal ticks with (EEG or neural recording for example) that doesn’t (logically and physically) depend on the very internal clock I’m trying to measure. There is a broader issue of which this is an example: no system can reverse-engineer itself. A can be separable from B only if dim(A), dim(B) >> dim(H_AB), which is the (Hilbert-space) dimension of the A-B boundary. But dim(H_AB) is the upper limit of the information A can write to memory. So A doesn’t have the capacity to write anything but a coarse-grained model of itself to memory. Even this model is going to be inferred from interactions with B (since H_AB is the source of all incoming data) and hence will be, like my model of my own internal processes, an inference from observations of systems that A regards as similar to it.

I am not sure I understand the reasoning that underlies the calculation of rhodopsin’s action, or why Planck’s constant would be the minimal action for any process.

Planck’s constant is (in current theory) the minimal action by definition. It has units of energy x time, which are complementary, so cannot be measured simultaneously. But if you separately measure energy and time for some process, in many replicates so you can calculate means, you can compute a mean action. This is what I did for rhodopsin, assuming a minimal ((ln 2)k_B T) energy instead of measuring it. In fact the energy of visible light that rhodopsin responds to is close to this number. The response time is from measurements. Hence I can calculate the action and compare it to the value of hbar. The value of hbar in fundamental (Planck) units is one; the numerical values of 1 sec, 1 Joule, 1 meter, etc. are derived from these stipulated values - see .

Dr. Fields, if we go back to Alice asking Bob a question in Language N, then Alice’s local time seems to be a function of the language chosen to communicate with Bob? Say Alice is human. Saying out loud “up or down” can take 2 seconds, a tactile contact can take a second, while non-verbal communication via facial expression can take thirty milliseconds. All of these will be perceived by Bob, who comprehends all three languages. Within the body there are electrical, chemical, and other kinds of communication. How is this Babylon problem addressed with respect to time? Does your model assume that there is one and only one language Alice can use to talk to Bob (encode memory on the holographic screen)? If not, then Alice’s time is a function of Alice’s chosen language. Would you agree that the time in your model is not the same thing as Newton’s universal time, or Einstein’s relative time of system N moving with velocity V?

Yes. There are different time scales for different components and different component scales - e.g. even different molecules have different characteristic times for functionally-relevant configuration changes. It is indeed a complex business! Complex organisms and even single cells have many different interaction modalities with different time scales and different implementation compartments. We can analyze it down to compartments that implement single QRFs and hence just one “language.” We’ll discuss this in the generic case in July, after which you may want to pose an extension of this question.

Dr. Fields, a related question - can humans reduce all information quantitatively to bits, or is a qualitative differentiation absolutely necessary? Say Alice is hungry and thirsty - in her own body, these two messages may have similar intensity, but they are qualitatively different (M. Solms). Viscera communicating these messages to the hypothalamus must preserve the differentiation, and 10 times X is absolutely not the same thing as 10 times Y. They must be categorically different variables, as food will not satisfy thirst. The midbrain decision triangle will then prioritize thirst over hunger if it is of the same intensity. Is it a viable model to say that all communication can be reduced to bits without any categories?

You are right, a model that ignores the semantics is not viable. The system has to keep track of which QRF each outcome is produced by. This requires (effectively) a labeling scheme imposed by some “meta” component looking at sets of outcomes using its own reference frames. Hence interesting systems have hierarchical processing, with each “layer” doing active inference on its own inputs. We will scratch the surface of this in July.

Dr. Fields, with respect to the receiver Bob: when Alice writes a question to Bob and Bob has dyslexia, it will take him longer to read it than it takes Alice to encode it, so time is not symmetrical for Alice and Bob even when they use the same language. Let’s assume further that Alice continues to write on a limited-capacity holographic screen, while Bob is behind it, reading from the screen. At some point, then, Alice will fill up the entire screen with information and will not be able to write any more. Would that mean that the screen can also communicate to Alice that it is full, so the screen itself is another Bob?

Yes, A and B can have different time scales, and each will interact with the screen at their own scale. So Alice in your example will not stop “writing” but will overwrite her own last message. Letting t_A = t_B just makes the model easier to understand.
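The overwriting behavior described above can be pictured as a fixed-size buffer that Alice writes to cyclically. The following is only a toy sketch; the screen capacity, the bit encoding, and the class itself are invented for illustration and are not part of the formal theory:

```python
# Toy model of a finite-capacity "screen": Alice writes bits cyclically,
# overwriting her own oldest bits once the screen is full.
class Screen:
    def __init__(self, capacity):
        self.capacity = capacity        # number of bit "cells" on the screen
        self.cells = [None] * capacity
        self.head = 0                   # next cell Alice will write

    def write(self, bit):
        self.cells[self.head] = bit     # overwrites whatever was there
        self.head = (self.head + 1) % self.capacity

    def read(self):
        # Bob reads the whole screen, at his own time scale
        return list(self.cells)

screen = Screen(capacity=4)
for bit in [1, 0, 1, 1, 0, 0]:          # 6 writes onto a 4-cell screen
    screen.write(bit)

print(screen.read())                    # → [0, 0, 1, 1]: oldest bits overwritten
```

Note that nothing in this sketch signals "fullness" back to the writer; that would require the screen itself to act as a sender, which is exactly the question being raised.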

What happens when the agent becomes identical to the environment? When the model has been improved so much that it does not differ from the environment. Is this state even possible? Would it be thermodynamical equilibrium? Maximum entanglement?

Karl Friston has pointed out that the limit of the classical FEP, the limit of perfect prediction, corresponds to “generalized synchrony” between system and environment: each sends messages that the other can perfectly predict. This notion depends on the system and its environment being separated, and hence distinguished, somehow. In classical formulations, they are separated in ordinary 3D space: they occupy different locations. The quantum formulation is “background free,” meaning it assumes no spacetime embedding; system and environment do not have “different locations” even though they have different sets of degrees of freedom in the overall Hilbert (i.e. state) space. Here the limit of the FEP is maximal entanglement. This does not mean that the system and its environment are “identical” (they aren’t, since they have different degrees of freedom). It means that their joint state |SE> is not separable, or factorable, into a system state |S> and an environment state |E>. Their states are no longer conditionally independent. This is distinct from thermal equilibrium, which means that they have the same temperature while remaining separable, and that their interaction can be characterized as the exchange of thermal fluctuations, i.e. noise.
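The separability point can be made concrete with a small two-qubit calculation. For a pure two-qubit state with amplitudes a00, a01, a10, a11 (ordered |00>, |01>, |10>, |11>), the concurrence 2|a00·a11 − a01·a10| is a standard measure: it is zero exactly when the joint state factors as |S> ⊗ |E>, and 1 for a maximally entangled Bell state. The specific example states below are chosen for illustration:

```python
from math import sqrt

def concurrence(a00, a01, a10, a11):
    """Concurrence of a pure two-qubit state: 0 iff the state is
    separable (factors as |S> tensor |E>), 1 for maximal entanglement."""
    return 2 * abs(a00 * a11 - a01 * a10)

# Product state |0>|+>: separable, the joint state factors
product = (1 / sqrt(2), 1 / sqrt(2), 0.0, 0.0)
# Bell state (|00> + |11>)/sqrt(2): maximally entangled
bell = (1 / sqrt(2), 0.0, 0.0, 1 / sqrt(2))

print(concurrence(*product))  # → 0.0
print(concurrence(*bell))     # ≈ 1.0 (up to floating point)
```

The Bell state's reduced descriptions of |S> and |E> alone carry no information about the correlations, which is what "no longer conditionally independent" means here.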

Would you say there is an optimal degree of QRF overlap between agents for effective information transfer? In the diagram, a) shows no meaningful interaction and d) shows perfect overlap (in which case no new information can be transferred), so it seems that for an interesting interaction there must be both an overlapping region and a non-overlapping region: two agents who are either 1) looking at some of the same things and some different things, or 2) seeing the same region in two different but related ways (or perhaps you would say 1 and 2 are in some sense equivalent when talking about QRFs). Is there, then, a "best case" difference in reference frames if your goal is to maximize your own new and useful information? Reference frames that are too different might limit the agents' ability to communicate in the first place (they would lack a shared language of sorts), while reference frames that are too similar might veer toward redundancy (exchanged information can be assimilated without much adjustment or modification of one's QRF at all, and so is not really new).

I suspect that the answer to all general forms of this question is that it is undecidable. Some specific instances where undecidability can be proved will be discussed in the August session. This is clear in some cases; for example, “What is the most efficient way to do science?” is an example of your question. Undecidability leaves us with trying to find good heuristics, which are fiercely debated in the case of how to do science. A spin-off of your question is “How can a system (at least approximately) measure its own VFE?” How good are humans, for example, at assigning subjective probabilities? What is the “sense of certainty” and how is it implemented? These are questions for the October session.

After Shannon: “Alice can’t tell what the input means to the system.” But the models we build have semantic content, so presumably we infer what the input ‘means’ to the system?

Our models indeed incorporate hypotheses about meaning; we use these every day in conversation. In the theory as presented, an “agent” is just a physical system that is in a separable joint state with its environment. Interesting systems have some level of accessible event memory, which elementary particles lack. Hence “agency” is a distribution, not a bright line.

Why does the area of the Markov blanket outside the QRF bring in VFE? I thought that prediction errors from misaligned QRFs created VFE.

VFE can be defined across the whole boundary/MB - see, for example, Karl Friston’s arxiv:1906.10184. The QRF-associated VFE is the VFE that the system can do something about by improving its information processing.
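For readers who want numbers: for a discrete hidden-state space, the variational free energy for a fixed observation o is F = Σ_s q(s)[ln q(s) − ln p(o, s)], which always upper-bounds the surprisal −ln p(o), with equality when q is the exact posterior p(s|o). The probabilities below are made up purely for illustration:

```python
from math import log

def vfe(q, p_joint):
    """Variational free energy F = sum_s q(s) * [ln q(s) - ln p(o, s)]
    for one fixed observation o, over a discrete hidden state s."""
    return sum(qs * (log(qs) - log(ps))
               for qs, ps in zip(q, p_joint) if qs > 0)

# Two hidden states; p_joint[s] = p(o, s) for the observed o (made-up numbers)
p_joint = [0.08, 0.32]      # so p(o) = 0.4
q_bad = [0.5, 0.5]          # uninformed posterior belief
q_good = [0.2, 0.8]         # the exact posterior p(s|o) = p(o, s) / p(o)

surprisal = -log(sum(p_joint))
print(vfe(q_bad, p_joint) >= surprisal)                # → True: F bounds surprisal
print(abs(vfe(q_good, p_joint) - surprisal) < 1e-9)    # → True: bound is tight
```

The gap between F and the surprisal is exactly the KL divergence from q to the true posterior, which is the part of the VFE the system "can do something about" by improving its beliefs.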

Dr. Fields mentioned that “A may be seeing what B is doing as informative, noise for B may be signal for A depending on how A’s QRF is structured. A is seeing more information than B is trying to send (the reverse of the hidden variable case). There are variables in the other system that aren’t telling you anything about what the other system is doing." to which Dr. Friedman made the comparison of reading too much or too little between the lines.
Am I correct in understanding this implies there can be an asymmetric overlap between QRFs? I am wondering if you can say more about what this would actually look like in terms of human communication or information sharing - specifically if there are conditions such as communication disorders (I am thinking of autism or schizophrenia) in which an agent is sending “useful” or readable signals unintentionally. I understand information in quantum theory is not synonymous with Shannon information, or with information as described in sender-receiver signaling games, but presumably it is somehow related to both. There are signaling games in which the information being shared is described as having either a cost or a benefit (most often a cost to the sender and a benefit to the receiver) depending on what that information is and how it is shared - in other words, the payoff matrix for sender and receiver may be asymmetrical or even at odds even when involving joint coordination. Is it possible and/or worthwhile to frame that type of analysis in quantum theoretic terms? (Apologies for the long and involved question, feel free to give a more cursory answer if necessary.)

Slide 19 b) and c) show asymmetric overlap, which is probably very common as your discussion of signalling games suggests. Biological agents go even farther, using various kinds of intentional deception, coercion, etc. in communications. The scalar cost/benefit analysis is an approximation, since sending and receiving have, for example, opportunity costs (you’re not attending to other potentially more important signals) as well as thermodynamic costs, and from an architectural perspective, memory as well as processing costs, etc. These cost tradeoffs need to be understood to map out control flow and see how well the system is optimized for allocating attention to where it is most needed. We will get to some of this in October.

In lecture 2, on quantum information theory, you said you make no assumptions about ‘mechanics’, but you also make use of the concepts of energy and time.

Energy and time (as complementary variables) are unavoidable in any model of information flow. I am avoiding any assumptions about the motion of objects through space. The goal is to see, in September, how “objects” and “motion through space” emerge as descriptions of the flow through time of bit patterns on the boundary/MB. “Mechanics” in this case becomes a model used to predict what is going to happen next on the boundary.

I’m trying to make sure I understand the noiseless single-qubit communication channel between A and B. I think I appreciate how H_a and H_b are able to encode/decode a bit onto the qubit via a self-mapping represented by their unitary propagators, dually represented by the energetic cost of the cycle.
What I am having trouble appreciating is how A and B infer that they are communicating with one another through the channel. Specifically, given H_a’s act of preparation and measurement are equivalent, what communicates to A that B has acted on the channel between H_a’s measure/prepare cycle? I’m imagining a situation where A is preparing/measuring ‘up’ over time and halfway between these cycles B is preparing/measuring ‘down’. Given the duality of preparing/measuring, both A and B are constantly noticing the qubit in up or down state respectively.
Is the answer something to do with the energetic cycle of H_a assuming within its self-mapping that it is acting on the qubit in isolation, and if H_b was acting on the qubit simultaneously as described above that this would be appreciated by a change in entropy? And/or a change in energetic cost to perform the H_a cycle?
Conceptually, I would like to be able to appreciate the ‘Operator M’ slide in lecture 2 given the different situations of A acting on the channel in isolation, and A and B acting on the channel in both a non-entangled and entangled state. Hopefully my question will help expose what I am misunderstanding, thanks!

Very good question - what philosophers call the “problem of the external world” or the “problem of solipsism.” Inferring that there is another system on the other side of the boundary is a sophisticated bit of abstract reasoning. This could happen across a one-bit channel - a system that was the whole universe minus one bit might infer that the light it saw flashing on and off was an extra bit out there somewhere. The one-bit system wouldn’t be able to infer anything. It is interesting to ask where in biology (evolution or development, e.g. from infancy in humans) it becomes useful to “see” the external system as a system; this is distinct from the question of when it becomes useful to identify and track objects, which probably happens much earlier. Your question about entanglement is different but also interesting. If A and B are entangled, then neither has a conditionally-independent state, so to say that either “infers” anything sounds like a category mistake. This, however, is from a perspective that can measure/infer their entanglement. We’ll see in August that a system can’t measure entanglement across its own boundary, so A can’t know that it is entangled with B. The question of whether A can infer anything now looks a lot like what philosophers call the “problem of free will.”

Do you think the city that I live in could potentially be an agent with qualitative experience? That is, a Markov blanket with classical information and some memory functions, similar to the Carlo Rovelli time papers you referenced. Or is this type of thinking a bit magical?

The theory challenges us to think about these kinds of questions. How do we draw the boundary around a city, and how different are the answers for different ways of drawing the boundary? What is the interaction at that boundary? How does a city store information? What do we mean by “qualitative experience,” in the case of a city and even in our own case? I don’t see this as magical thinking but rather as very difficult thinking that we may not be cognitively well equipped to do.

Hey Chris! Keep up the amazing work! I feel very grateful!
I still have questions about the free energy principle for generic quantum systems. I am attempting to merge it with quantum Darwinism applied to subatomic particles. This is my current understanding, and I might have made mistakes in learning it; feel free to correct me:
Entanglement is statistical nonseparability of one system into many. https://arxiv.org/abs/2112.15242 https://www.youtube.com/watch?v=aeh9bSuSNrA
In quantum Darwinism we have pointer states: the quantum equivalents of the classical states of a system after decoherence has occurred through interaction with the environment, the states that survived a Darwinian natural selection. This leads to loss of quantum entanglement, transforming quantum information into classical information. Spacetime and gravity are part of the classical state. https://en.wikipedia.org/wiki/Quantum_Darwinism
Let's suppose quantum reference frame (QRF) A corresponds to organism A and QRF B to organism B. Let's suppose those QRFs' cone-cocone diagrams are implemented by brain dynamics that are fundamentally modeled by quantum Darwinism, describing the potential entangling of subatomic particles (or arbitrary quantum systems in general that can be classical pointer states), and approximated by various quantum field theory, neural field theory, or other classical or quantum models used in neuroscience, neurophysiology, neurobiology, biophysics, quantum biology, and consciousness studies overall. https://en.wikipedia.org/wiki/Quantum_mind#Quantum_brain_dynamics https://qri.org/blog/eigenbasis-of-the-mind
The leading theories of consciousness hold that the brain corresponds to a quantum system that mostly works classically. They suppose that the brain doesn't really do mainly pure quantum computing (on the level of particles) but instead mostly classical computing, because if it (the particles) were entangled, it would quickly locally decohere through too much environmental interaction (even though globally there's one universal wavefunction where nothing ever happens https://arxiv.org/abs/1210.8447 , and according to quantum cognition there's entanglement between mental processes and between organisms, which formalizes contextuality in communication https://en.wikipedia.org/wiki/Quantum_cognition ). This is also the reason why the quantum computers we currently build must be kept really close to absolute zero to maintain stable entanglement. https://en.wikipedia.org/wiki/Quantum_mind But quantum computing on the level of particles might still play a small role in brain function, with some empirically measured entanglement (nuclear proton spins of 'brain water' being entangled). https://iopscience.iop.org/article/10.1088/2399-6528/ac94be Therefore, according to quantum Darwinism, the brain is a physical quantum state of subatomic particles with a lot of internally separable classical pointer states.
My main question is:
When communication between those two systems (organisms) through entanglement happens, which is the basis for perception, do we put into a quantum superposition a system that is separable (QRF A) with another separable system (QRF B), making the resulting system nonseparable, resulting in quantum entanglement? Do all the previously separable states in the QRFs become nonseparable, because the two QRFs become one quantum superposition? Will that cause the entangled system to have the properties of quantum computing (time symmetry) instead of classical computing? How do these properties manifest in concrete situations? Contextuality of our language? Is there time symmetry in our language? What are the best empirical examples? Nonseparability of symbols in language, or of Bayesian beliefs present among cultures?
When it comes to sight, what does the classical content of our visual field (holographic screen) correspond to; what system(s) do we entangle with? Photons? Or the emitters of those photons (the sun)? Or the system they bounce off (a chair)? All of it, because without this photon-bouncing process, or without the object, or without the whole universe, there wouldn't be us, which is the fundamental contextuality of everything? But is it good enough pragmatically to make classical approximations to predict quantum states that behave classically, where time symmetry doesn't seem to be really utilized locally while being utilized globally everywhere? How does sound work? How do the other senses work?
Is the communication between the nested holographic quantum reference frames in our neuroanatomy also realized via the same kind of entanglement? Contextuality of mental processes? https://psyarxiv.com/6afs3
How do thoughts fit in? Classical information? Feelings? Ontologies? Fundamental Bayesian priors? Is the sense of isness that Heidegger or Buddhism talks about the most fundamental Bayesian prior? Are boundaries, distinctions, or any concrete structure in experience just symmetry breakings in the thermodynamic flow in hierarchical Bayesian networks? Are all qualia just different classical information corresponding to some entanglement of our organism with some other external physical system, encoding relational correlations with the environment? Can all qualia be explained in the quantum FEP? Does our phenomenal spacetime (an error-correcting code glued by entanglement? https://www.youtube.com/watch?v=8-ct1IlGUOw https://noetic.org/wp-content/uploads/2023/06/Conscious-Agents-Full-Proposal.pdf ) correlate to other quantum systems' relativistic, relational phenomenal spacetimes that are part of the classical pointer states we entangle with, so that we perceive a shared spacetime together? Do phenomenal states in which the sense of spacetime or existence stops existing (in disorders of consciousness, deep sleep, deep meditation, psychedelics) correspond to a loss of entanglement? Are all qualia classical? If all perception is relational entanglement with other quantum systems, would that mean that our holographic screen could "measure" the existence of, or correlate with, quantum states that don't behave classically? Are there "quantum qualia" corresponding to quantum information? Does that make sense mathematically?
Are elementary particles the best fundamental building blocks to use in quantum Darwinism, since quantum field theory models the universe as a system of interacting quantized fields, where particles are just excitations of the fields? What other fundamental building blocks can we use, and how do we properly define them? Excitations? Waves? Strings? Loops (from loop quantum gravity https://en.wikipedia.org/wiki/Loop_quantum_gravity )? Arbitrary quantum or classical information "graspable" by quantum reference frames in a quantum system that can behave classically? Undefined? And since the quantum field theory standard model assumes a background spacetime, and we want spacetime to be emergent classical information, which models can be used to understand the emergence of classical spacetime from the quantum? What other proposed QM+GR unifying fundamental physics theories or frameworks could be used to make spacetime emergent from a deeper principle, finding pointer states with classical spacetime in quantum mechanics? Do you have a favorite one? For example: the more entangled two quantum states are, the closer they are to each other geometrically; total entanglement increases as time passes, and where there is energy (therefore also mass) there is less entanglement https://www.youtube.com/watch?v=8-ct1IlGUOw , entanglement being the glue of spacetime https://www.youtube.com/watch?v=bxY1PK4wW1I , spacetime being a quantum error-correcting code https://www.youtube.com/watch?v=SW2rlQVfnK0 or the amplituhedron (encoding momenta of particles in twistors, i.e. a spacetime-free 4D complex plane?) https://www.youtube.com/watch?v=GL77oOnrPzY , or string theory? Could the mathematics of models that instead try to quantize classical spacetime be helpful here, like loop quantum gravity? https://en.wikipedia.org/wiki/Loop_quantum_gravity
How should we think about Bayesian networks minimizing free energy without appealing to some update function that assumes time, given that space and time are classical information embedded inside the network? Because spacetime is encoded as the entanglement error-correcting glue, do we just describe a time slice by the total amount of global entanglement, with interaction defined as all the possible paths in the free energy landscape, mathematically equivalent to quantum theory's default universal wavefunction: one fully entangled, globally spacetimeless system where spacetime only makes sense locally? Are the sense and measurement of locality relational? What is the nature of the entropic arrow of time toward greater entanglement in the universe?
What is the solution to the binding problem and the hard problem of consciousness? Is unified physical information processing in a quantum reference frame, implemented as a Markov blanket with various implementations in different physics and consciousness theories, an individualized observer implemented by neural synchrony? https://en.wikipedia.org/wiki/Holonomic_brain_theory#Deep_and_surface_structure_of_memory By quantum coherence (neuronal superpositions)? That would make the most sense in the quantum cognition formalism as superpositions of neural processes! https://en.wikipedia.org/wiki/Quantum_cognition https://en.wikipedia.org/wiki/Quantum_mind#David_Pearce https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7597170/ By some kind of alignment of self-organizing harmonic modes? https://twitter.com/AFornito/status/1577743997149511680 By topological segmentation in the electromagnetic field? https://qri.org/blog/digital-sentience https://www.youtube.com/watch?v=g0YID6XV-PQ Or possibly by a deeper structure beyond quantum field theory? https://qri.org/blog/digital-sentience https://www.youtube.com/watch?v=g0YID6XV-PQ Are there overlaps? Do the neural superpositions create topological segments?
I also like the model of phenomenal spacetime being an error-correcting code https://noetic.org/wp-content/uploads/2023/06/Conscious-Agents-Full-Proposal.pdf or a consequence of measurements of distances and causal structures https://qri.org/blog/pseudo-time-arrow https://qri.org/blog/hyperbolic-geometry-dmt
https://upload.wikimedia.org/wikipedia/commons/7/7c/Disciplines_mind_map.jpg
So, if we want to describe all natural scientific fields in one model using the quantum formulation of the free energy principle, or quantum information theory and fundamental physics overall, we can use it to formalize the fundamental relationality of information in any other scientific discipline that studies arbitrary quantum systems, which may or may not have classical correspondences in various models, where all models are useful approximations and evolutionary niches. All the mathematical models or differential equations used to predict dynamics in chemistry, biology, sociology, ecosystem science, astrophysics, and so on are different special cases or approximations of those underlying quantum dynamics in quantum reference frames with holographic screens for communication, which give rise to higher-order causal information processing with agentic free-energy-minimizing modeling in a nested hierarchical Bayesian graph, acting across scales. Can this be formalized as multiple cocone diagrams forming higher-order cocone diagrams via infomorphisms encoding higher-order semantic information: a cocone diagram whose nodes are themselves cocone diagrams, not just classifiers and semantics? Are the fundamental classifiers the nonreducible information processing in physics, corresponding to the simplest information processing of subatomic particles, or excitations, or loops, or strings, or something else, that together form collective information processing with shared context in a cocone diagram, which in turn forms higher-order cocone diagrams up to theoretical infinity? Is this the nature of the emergence of collective intelligences from collective intelligences?
From (inert) simple interacting subatomic particles or another fundamental entity (which through entanglement form the quantum error-correcting glue of spacetime), to atoms, to molecules, to cells, to brain regions, to brains, to organisms, to communities, to cultures, to nation states, to planetary systems, to galaxies, and so on: from lower-order quantum information theoretic goals that bind and form a unit together, to higher-order goals acquired via this nested fractal cocone collective Bayesian free-energy-minimizing information processing structure, where the classical or quantum differential equations describing them tend to get more and more complex with increasing structural information processing complexity. But in some cases we can locate simple laws acting as good enough approximations for their context, laws that can sometimes also be transposed across many contexts. Is that the convergent evolution of nature balancing between simplicity and complexity on all scales?

There are far too many questions to answer here in one go, so I will focus on quantum Darwinism (qDar), as this will help with some of the others. The central idea of qDar is the “environment as witness,” i.e. that the environment E of some system S “witnesses” the state of S by a process called “einselection” (this is W. Zurek’s terminology). Einselection is just interaction at a boundary: what is encoded on the boundary are the eigenvalues of the interaction Hamiltonian, in this case H_{SE}. Zurek points out that known physical interactions are distance dependent, so fixing the energy (the eigenvalue of the Hamiltonian) fixes the distance. This means that the environment E encodes a fixed position for the system S, not a superposition of positions. This is just an alternative way of thinking about a standard (see e.g. M. Schlosshauer’s textbook) collisional model of environmental decoherence. Now, let’s add the observer, A. Zurek points out that in practice, we observers always interact with the ambient environment, e.g. the ambient photon field. So there is another boundary of interest, the boundary between A and E, and another interaction H_{AE}. See the left side of the attached image, taken from arxiv:2303.16461. In the environment-as-witness model, A “reads” the encoding of S’s position (and other pointer states) from E via H_{AE}. So from A’s perspective, E is a channel from S to A. Now consider two observers, A1 and A2. Each interacts with a separate “sector” of E, call them E1 and E2 (see the right part of the attached image). These interactions have to be mutually nondisturbing for qDar to work. If they are, then we have a channel A1 - E1 - S - E2 - A2 from A1 to A2. This is a quantum channel. Now the question is: how do A1 and A2 know to look for an encoding of |S> in E1 and E2, respectively? Think of a cluttered laboratory.
How do A1 and A2 know to look for the encodings of the state of the Fluke voltmeter in E1 and E2, and not at the encodings of all the other positions and states that the ambient photon field encodes? Zurek assumes that they just know this “without prior agreement,” but obviously that won’t work: they have to both be looking at the Fluke voltmeter, not at or for something else. So they have to have a prior agreement, or some prior knowledge, of what to look for. This is established by classical communication, either now or at some time in the past. So there is also a classical channel A1 - E* - A2, where E* is some part of the shared environment other than E1 and E2. Now we have A1 and A2 sharing a quantum channel E1 - S - E2 and a classical channel E*. That’s a LOCC protocol. For the protocol to work, we have to assume that A1 and A2 are separable, i.e. have conditionally independent states.
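The core of einselection, that interaction with an environment "records" the system's state and kills the system's coherence, can be seen in a two-qubit toy calculation. This is only a cartoon of decoherence, not qDar's full multi-observer setup; the specific pre- and post-interaction states are chosen for illustration, and the amplitudes are real, so complex conjugation is omitted:

```python
# Toy "environment as witness": once E has recorded S's state, the
# off-diagonal (coherence) terms of S's reduced state vanish, leaving
# a classical mixture over pointer states.
from math import sqrt

def reduced_state(amps):
    """Partial trace over E of a pure two-qubit state.
    amps[i][k] is the (real) amplitude of |i>_S |k>_E."""
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            rho[i][j] = sum(amps[i][k] * amps[j][k] for k in range(2))
    return rho

# Before interaction: S in superposition, E in |0>, i.e. a product state
before = [[1 / sqrt(2), 0.0], [1 / sqrt(2), 0.0]]
# After einselection: E has recorded S's state, giving (|00> + |11>)/sqrt(2)
after = [[1 / sqrt(2), 0.0], [0.0, 1 / sqrt(2)]]

print(reduced_state(before))  # off-diagonals ≈ 0.5: S still coherent
print(reduced_state(after))   # off-diagonals 0: S decohered into pointer states
```

From A's perspective, "reading" E then amounts to classical inference on a state like the second one, which is why E can function as a channel from S to A.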

Do you agree or disagree with Deutsch and Wallace’s decision-theoretic approach to quantum probability? How could it be interpreted from the perspective of an agent and an environment who choose different reference frames (“up” directions) and try to align them?

I agree with the general thrust of treating all probabilities as agent-specific uncertainties about outcomes of future observations. The QBists make a much cleaner statement of this, at least for me; see C. Fuchs, arxiv:1003.5209 or D. Mermin, arxiv:1809.01639.
Despite Deutsch’s eloquence about its utility for understanding quantum computing, the whole “branching” business seems like a red herring to me. At the very least, many-worlds should make the past branch as much as the future; by treating branching as one-way, they are reifying classical memory in just the way that they claim not to be reifying classical states.

This may be a bit of a step-back moment, but can we have another look at the “Z” spin operator’s role in a non-geometric, non-spacetime condition? I appreciate pivots and axes and points where two vectors meet. But these conceptual “places” are just that: they seem to explain something directly linked to a spacetime way of locating, not always topologically.
And, in this presentation, the focus is on “multiple” channels. Multiple, based on categories, all using this “Z” spin operator that is NOT an axis (i.e., something around which the spinning to up or down transpires), not located using measures, yet existing so that measures CAN take place. So if NOT multiple AXES (plural), because we are NOT talking spacetime, but still something affording change (spin to position U or D), then something that affords but is NOT tangible? That would make perfect sense if “I want to get from here to there, and have nothing get in my informational way.” And then No-thing must also be accepted as A-thing, even if nothing is intangible. Nothing is then both tangible and intangible concurrently. If this is true, great. It means I can “pass through” zero, as, at least informationally, the screen does not always act as a barrier. If not true, can you explain why not, given the part that “Z” plays? Thanks!

The array of qubits on the boundary is initially just a set. The only stipulation is that they do interact, but other than that, there is no “location” for any qubit or even an “order” in which the qubits are arranged. The idea of measuring along a “z-axis” is a convention. Each qubit has two classically-distinguishable states which we can think of, heuristically, as “up” and “down” (or “left” and “right” or any other pair of opposites). Max Tegmark is fond of using a smiley and a frowny face to indicate the classically-distinguishable states. It is all an intangible abstraction.
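The "convention" point can be checked numerically: rotating both the state and the measurement basis by the same unitary leaves the Born-rule outcome probabilities unchanged, so nothing physical picks out one "z-axis" labeling over another. The sketch below uses a real rotation (a special case of a qubit unitary) and an arbitrary example state:

```python
# The "z-axis" is a labeling convention: relabeling state and basis
# together by the same unitary changes no measurable probabilities.
from math import cos, sin, sqrt

def born_probs(state, basis):
    """P(outcome k) = |<basis_k|state>|^2 for an orthonormal basis."""
    return [abs(b[0].conjugate() * state[0] + b[1].conjugate() * state[1]) ** 2
            for b in basis]

def rotate(vec, theta):
    """Real rotation by angle theta: a unitary acting on the qubit."""
    return (cos(theta) * vec[0] - sin(theta) * vec[1],
            sin(theta) * vec[0] + cos(theta) * vec[1])

state = (sqrt(0.7), sqrt(0.3))       # an arbitrary qubit state
z_basis = [(1, 0), (0, 1)]           # conventional "up"/"down" labels

theta = 0.81                         # any relabeling angle whatsoever
rotated_state = rotate(state, theta)
rotated_basis = [rotate(b, theta) for b in z_basis]

print(born_probs(state, z_basis))                 # ≈ [0.7, 0.3]
print(born_probs(rotated_state, rotated_basis))   # same probabilities
```

Only the two classically distinguishable outcomes and their probabilities are physical; "up," "down," smiley, and frowny are interchangeable names for them.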

How do quantum information tasks with a shared figure of merit or state, as a shared quantum reference frame, change under fuzzy superselection rules? For example, how does entanglement distillation with fluctuating spin and quantum computation with noisy time work with the fuzzy superselection rules?

See Bartlett et al.’s review of quantum reference frames in Rev. Mod. Phys. 2007 (arxiv:0610030) where they show how superselection rules can always be seen as stand-ins for the absence of a QRF, i.e. for an inability to measure some state component. Classical notions of fluctuations, noise, or fuzziness rely on the idea of a “random” or uninformative environment and hence on some measurement capability that yields a “noisy” statistical distribution. This requires building a model observer considerably more complicated than we’ve discussed. I’ve not attempted doing so explicitly. It might be an interesting exercise.

If Markov blankets are really holographic screens, might this imply a non-unitary, thermodynamically open universe? The concept of information transfer across such a screen is key in your lecture. This implies the universe must be transferring information ‘from’ something. This appears to comport with dark energy, at least internally. Additionally, this sets up a radical question, to what extent are the properties of living systems attributable to the universe itself? If the universe is an open, information-processing system, perhaps the drive toward greater internal order in biological systems is not an exception but rather resonates with the informational dynamics of the universe. This hints at a radical possibility - that life and mind are not flukes within a purely mechanical universe, but expressions of more fundamental, potentially consciousness-like, properties of reality.

What is a “thermodynamically open universe”? This term suggests to me a “universe” that is in thermodynamic interaction with something else, in which case it is not a “universe” as I am using this term. A universe in my usage of the term is isolated by definition: one could, for example, say that there is a unique isolated system - comprising “everything” - that is what “universe” refers to. This system does not obtain information from or give information to anything else.

In the 2nd lecture, you talk about how the FEP is the classical limit of unitarity. However, thermodynamically open systems, such as living systems, are typically considered non-unitary. Could you explain the relationship of the FEP to non-unitary systems?

The FEP is a classical limit of unitarity in the sense that the FEP drives all jointly-isolated pairs of open systems to become perfect predictors of each other’s behavior, i.e. toward what Karl Friston refers to as “generalized synchrony.” In the limit, such systems are distinguished from each other only by their spacetime embedding, so they become indistinguishable when an “objective” spacetime embedding imposed from outside the joint system is dropped as an assumption. When we consider just one of any such pair of jointly-isolated systems, the FEP drives that system to become a perfect predictor of the behavior of its interaction partner, i.e. its total physical environment.
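This drive toward mutual prediction can be caricatured in a few lines of code. The following is a toy sketch of my own (nothing from the lectures; the linear update rules and learning rate are invented for illustration): two scalar systems each descend a squared prediction error about the other - a crude stand-in for variational free energy - while also acting to conform to the partner’s prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

a, b = rng.normal(), rng.normal()       # the two systems' (scalar) states
a_model_of_b, b_model_of_a = 0.0, 0.0   # each system's model of the other
lr = 0.1                                # learning/action rate (arbitrary)

for _ in range(500):
    # Perception: each system updates its model to reduce prediction error.
    a_model_of_b += lr * (b - a_model_of_b)
    b_model_of_a += lr * (a - b_model_of_a)
    # Action: each system also drifts toward what its partner predicts of it.
    a += lr * (b_model_of_a - a)
    b += lr * (a_model_of_b - b)

# Both prediction errors contract geometrically, so in the limit each
# system is a perfect predictor of the other ("generalized synchrony").
print(abs(b - a_model_of_b) < 1e-6, abs(a - b_model_of_a) < 1e-6)  # → True True
```

In this linear toy each prediction error shrinks by a factor (1 − lr)² per step, so perfect prediction is approached only asymptotically, matching the statement above that it is a limit.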

Dr. Fields, could you kindly comment on the issues between the Unitarity Principle and General Relativity? E.g. as described here? Do you see this issue as real and fundamental or is it a misunderstanding? Thank you.

If spacetime and objects (e.g. standard model particles) are “objective” or fundamental, then QT and GR are inconsistent. This is one of the main drivers of models in which spacetime and hence GR are emergent from QT.

Dr. Fields, in your 5th lecture you described the origins of the metric sense (distance). I believe that developmentally, direction - up/down/left/right - is primary relative to “how far.” How would this correspond to your theory? Thank you.

Yes. Body-centered coordinates are roughly polar (i.e. two angles plus a radial distance) and also roughly projective.

Dr. Fields, you described human development in your 5th lecture, including object permanence. Katerina Fotopoulou describes something that happens prior to that - synchrony between a mother and a baby, mediated by touch: there is no way to touch when two humans are not synchronous in both time and space. Skin-to-skin contact right after birth results in actual EEG synchronization between mother and baby. Could you kindly comment on this concept of Alice and Bob realizing they are together/communicating (meeting at the holographic screen?) prior to a sense of time or space?

The object permanence work I referred to is heavily skewed toward vision due to use of the looking-time paradigm for measuring differential attention. Object permanence may well develop much earlier for touch, audition, and olfaction/taste. What an infant “realizes” in a metacognitive sense may be very limited. I’m not up on the latest on development of the self model - do you know the new literature on this?

“Interoperability is due to the fact that all information media have in common properties—the two counterfactuals I mentioned above*—that transcend most of their specific details (i.e., whether they are photons, transistors, the spins of an electron, neurons, or switches in a lamp). In all these cases, when interested in the information-processing abilities of these systems, we can abstract away their irrelevant details and simply talk about them as information media, considering their ‘information-carrying attributes’ only (e.g., up/ down for an arrow, on/ off for a lamp, and so on). Now, armed with these counterfactuals, you can understand why information looks like an abstraction and yet is grounded in (counterfactual) physical properties. When talking of a bit, we do not need to mention what physical system embodies it. What matters is that a bit is an information medium—entirely defined by its counterfactual properties, which hold irrespective of its physical details. What is the connection with physics, then? The key is that which physical systems are information media and which are not is established precisely by the physical laws that rule our universe. And the interoperability of information media is a counterfactual property of physical systems: it is a property of the physical world, just like the colour of the summer sky, the shape of rainbows, or the attractive interaction that holds between charges of opposite signs.”
(* not every system is an information medium. A good example is a memory in a computer that is full but cannot be erased: it is possible to read information out, but not write new information in (because no more space is available, and reset is not possible). It was an information medium once, but no longer. You could also have a case where information can be copied in, but not out. Have you ever tried to write something on the foam on top of a cappuccino or a beer? At first it looks possible. But the letters rapidly fade away, to the point that they can no longer be read. Neither of these two types of systems would be capable of carrying information, because they do not have enough counterfactual properties. They are not information media.)
— The Science of Can and Can’t: A Physicist’s Journey through the Land of Counterfactuals by Chiara Marletto
I would just like a clarification, Chris, based on Chiara’s comment… Is information carrying fundamentally different from information organizing (orienting)? Is the former quantum information, and the latter the space-time construct?

This passage from Marletto does seem entirely classical. The question of counterfactuals and “counterfactual definiteness” has a long history in QT and (especially) its associated philosophy. The core question is whether quantum systems have well-defined states when not being measured. In the formalism employed here, there is no such thing as an open system that is not interacting with something, and hence being measured by that thing. The deeper issue here, I think, is what level of cognitive complexity is required for a system to manipulate counterfactuals, which is a requirement for planning. I don’t know of any good answers to this question.

Where does a possible quantum gravity enter into the quantum information theoretic picture? For example, if by the free energy principle two agents align and become entangled across the holographic screen (e.g., they fall in love, metaphorically speaking), would the two of them feel a non-fungible attraction (or revulsion, if it’s not love) to the other’s image on the screen? Would this emotional gravitation follow post-entanglement? And could this hypothetical quantum gravity be the actor that maintains the homeostatic balance that Michael Levin has hinted at?

One option is that a quantum gravitational theory (i.e. Einstein’s equations) follows from a quantum information theory. How this might relate to the “feeling” of gravity (weight, acceleration) requires more of a psychological theory than I’m willing to speculate about at this point.

Dear Professor Fields, thank you for the awesome lectures! My question: Why does unitary dynamics (conservation of information, the “-1st law of thermodynamics” according to Prof. Susskind) drive systems to maximal entanglement (page 7 of your 2021 paper “FEP for generic quantum systems”)? The FEP allows a system to maintain its Markov blanket, to stay separable from the environment as long as possible (homeostasis, survival). Isn’t this a contradiction of the statement that the FEP drives systems to maximal entanglement? As t→infinity, biological systems will in any case no longer use the FEP (dead; the Markov blanket dissolves → entanglement).

Yes. The FEP drives systems to be better predictors, with perfect prediction as a limit, as discussed in a response above. It is predictive ability, and prediction-driven adaptive behavior, that preserves the integrity of the boundary, and hence staves off death. But the limit of perfect predictions (on the part of both interacting parties) is entanglement and hence the boundary becoming effectively meaningless. I increasingly think of the FEP as a theory of fluctuations away from a default state of entanglement.
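The tendency of generic unitary dynamics to produce near-maximal entanglement can be illustrated numerically; this is Page’s-theorem territory, not anything specific to the FEP papers. In the sketch below (dimensions and random seed are my own arbitrary choices), a Haar-random unitary applied to an initially unentangled joint state leaves a small subsystem almost maximally entangled with its environment:

```python
import numpy as np

rng = np.random.default_rng(1)

def haar_unitary(d, rng):
    # Haar-random unitary via QR decomposition of a complex Ginibre matrix.
    z = (rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    ph = np.diag(r) / np.abs(np.diag(r))
    return q * ph  # fix column phases so the distribution is Haar

d_s, d_e = 4, 16                     # 2 "system" qubits, 4 "environment" qubits
psi = np.zeros(d_s * d_e, dtype=complex)
psi[0] = 1.0                         # product (zero-entanglement) initial state
psi = haar_unitary(d_s * d_e, rng) @ psi   # generic unitary evolution

m = psi.reshape(d_s, d_e)
rho_s = m @ m.conj().T               # reduced density matrix of the system
evals = np.linalg.eigvalsh(rho_s)
evals = evals[evals > 1e-12]
entropy = -(evals * np.log2(evals)).sum()   # entanglement entropy, in bits

print(entropy)  # close to the 2-bit maximum for a 2-qubit subsystem
```

On this reading, near-maximal entanglement is the typical endpoint of generic unitary dynamics, and the FEP describes how long-lived systems stave off that endpoint rather than escape it.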

Dr. Fields, I was hoping to clarify the following. Here
https://www.youtube.com/watch?v=mitw2XcXY98&t=1s
you said (around minutes 24-26): “Entanglement is everywhere. [… and later] There are no boundaries (in physics, only in semantics - user interfaces).”
It seems that you are not saying every object is entangled with every other object; did I get that right?
When people say there is just one wave function for the entire universe, that would be equivalent to saying that everything in the universe is entangled. I think the concept of entanglement would then have no specificity, and it would become meaningless.
Clearly, in the lab, there can be two particles that are entangled and two that are not. Therefore, systems that are not entangled exist. Is that right? Can we be clear on that?
Empirically, we see boundaries everywhere, and not just semantic ones - the cell membrane, the skin, the blood-brain barrier. The destruction of these boundaries leads to death. How are they just semantic if skin is an organ and skin cells differ from other cells? It is very much physical and real. If there are no boundaries then there are no holographic screens, as there is no difference between Alice and Bob. Could you please clarify?

It is commonplace even in the physics literature to talk about entanglement as ontological and observer-independent. This however is not correct, as first pointed out (I think) by Paolo Zanardi in a Phys Rev Lett paper in 2000. Whether a state looks entangled depends on how it is measured. Think of a Bell state (|10> - |01>)/sqrt(2). This is entangled when measured in the basis (|0>, |1>), i.e. the ordinary lab basis that looks at one “particle” at a time. But if you could measure in the basis comprising the four orthogonal Bell states, which is a perfectly good basis for the Hilbert space, the Bell state would not look entangled - it would just have one basis component.

What’s happening here? A measurement in the Bell basis corresponds, in the lab, to a measurement made by observers that are themselves entangled in the (|0>, |1>) basis. So in this “ordinary” basis, we can move the entanglement around, from the “system” to the “observers” and back. Here the existence of entanglement looks like a fact, while whether a particular state is entangled is observer-relative.

Again, what’s happening? Picking a basis requires drawing a boundary. If we drew a boundary around the whole lab, we couldn’t say where the entanglement was - this is the “Wigner’s friend” scenario. But drawing a boundary is something that we do, intentionally or not. Your examples are of boundaries that we are “wired up” to draw, but we can (with some effort or formalism) ignore them. Take the scale of observation down to elementary particles or even atoms and these boundaries disappear.

It is in this sense that boundaries or holographic screens are “semantics” - they are structure added to the physics so that we can make sense of it. Since the added structure is not just useful but essential for getting on in the world, we regard it as real. But it’s not part of the “underlying physics”; it’s part of the process of observation.
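The basis-relativity of entanglement described above is easy to check numerically. In this minimal sketch (the numerics and Bell-basis ordering are my own, not from the lecture), the singlet carries a full bit of entanglement entropy relative to the one-particle-at-a-time factorization, yet is a single basis vector when expanded in the Bell basis:

```python
import numpy as np

# The singlet (|10> - |01>)/sqrt(2) in the lab basis, ordered |00>,|01>,|10>,|11>.
psi = np.array([0.0, -1.0, 1.0, 0.0]) / np.sqrt(2)

# Reduced state of the first qubit: partial trace over the second.
m = psi.reshape(2, 2)
rho_1 = m @ m.conj().T
evals = np.linalg.eigvalsh(rho_1)
entropy = -(evals * np.log2(evals)).sum()   # 1 bit: maximally entangled

# The same state expanded in the Bell basis: a single nonzero amplitude.
bell_basis = np.array([[1, 0, 0, 1],    # |Phi+>
                       [1, 0, 0, -1],   # |Phi->
                       [0, 1, 1, 0],    # |Psi+>
                       [0, -1, 1, 0]],  # |Psi-> (the singlet itself)
                      dtype=float) / np.sqrt(2)
coeffs = bell_basis @ psi

print(round(entropy, 6), np.round(np.abs(coeffs), 6))  # → 1.0 [0. 0. 0. 1.]
```

Of course the Bell basis is not a tensor product of one-qubit bases, which is exactly the point: “entangled or not” depends on the factorization, i.e. on where the boundary is drawn.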

Dr. Fields, what interpretation of Quantum Mechanics do you prefer? What do you think of Many Worlds?
Sean Carroll makes the point that only in Many Worlds does the Schrödinger equation stay unchanged at all times; in all other interpretations, parts of the wave function “collapse.” He also makes the point that, because (for example) a cat is entangled with its immediate environment (e.g. the air around it), every macroscopic object inevitably becomes entangled. He does say that there is just one wave function of the entire Universe; however, the act of entanglement of A and B leads to the split, and there are multiple universes at that point, whose state vectors are orthogonal in Hilbert space - they would never exist simultaneously.
It would be great to hear your take on this. Thank you

Many Worlds is the idea that every time I open my eyes, the entire universe splits in two, and you and everyone else are suddenly duplicated. This also happens every time an E. coli flicks its flagellum. This seems excessive to me, but it also seems inevitable if one is to regard classical information (or classical objects) as ontological. Hence the natural alternative is to drop this assumption, in which case one gets the picture employed in this course: classical information is only encoded on boundaries and boundaries are all ancillary to the physics. Or stated differently: classical information is all observer-relative, including the classical information that designates something as an “observer.” This view is close to “QBism” as developed by Chris Fuchs et al. and is a form of what’s now called “participatory realism” after John Wheeler’s idea of physical systems being “observer-participants” in the universe.

Dear Chris Fields and Active Inference peers:
I wanted to express my gratitude for the chance to take part in this incredible course.
The physics as information processing paradigm reflects insights I've encountered in philosophy, the arts, and the social and biological sciences. The possibility of a meta-language that encompasses all of them is truly exciting.
At this stage, I'd like to confirm if the summary I've written about the course so far is accurate.
Based on lecture 2, it appears that the question 'up or down' doesn't assume Alice and Bob have specific goals. Their primary objective is to find and effectively share answers over time across boundaries, which leads to complex systems arising from innumerable topologically-connected information channels. From these two points, I gather that information processing is the driving force behind systems, rather than the goals we attribute to them, such as Darwinian fitness. If these ideas are correct, then Darwin's observations of "endless forms most beautiful," the phylogenetic trees we’ve drawn since his time, what we now know about the origins of human sociocultural systems, and all non-biological systems, are instantiations of physics as information processing.
Thank you in advance for your feedback.
Ana Magdalena Hurtado, PhD
Professor of Human Evolutionary Ecology and Global Health, Arizona State University

Thanks, Ana, I’m glad you’ve found it valuable. I agree with your summary. If you replace “information processing” with “uncertainty minimization” as the “driving force,” you have a fair statement of the FEP. There is a growing literature on applications of this kind of thinking in evo-devo-eco biology, with the aim of building a theory structure for “development” that is scale-free. There is also experimental work (much of it computational), e.g. Mike Levin’s work showing that gene regulatory networks are capable of learning and storing memories (!). My recent papers in this area are all on my website chrisfieldsresearch.com, and have pointers in the references to lots of work by others. Happy to discuss any of this further by email.

Dr. Fields, in the discussion group for Lecture 5 you mentioned that one needs memory to have a clock/time. What memory is, exactly, would seem essential to define. In humans, we have an objective approach - the neurobiology of memory engrams and the various kinds of memory (procedural, episodic, etc.) - and we have subjective perception; however, memory itself - the concept - is an abstraction, a model. I am not sure one can define memory if time does not yet exist. What would that look like?
Then, if one needs time to define memory and one derives time from having memory, we may have circular definitions.
The bottom line is that if we claim to derive time and space, should we not make sure that we are not using concepts that rely on time and space, such as memory? Thank you.

Yes, I would agree that time and memory cannot be defined separately. I suspect that space and object identity also cannot be defined separately. I also suspect that this inter-dependence of physical and psychological notions tells us something important, at least that our educational system needs some revision.

Dr. Fields, you talk about decoherence here in the discussion of Lecture 5, at minute 22.

In which sense do you use the term? Some describe it as a loss of quantum coherence, an object becoming classical.

Sean Carroll states that what is required for decoherence is (a) the belief that there is only one wave function for the whole universe, and (b) that the rest of the universe is continuously bumping into us, interacting with us, so that if one is a big enough macroscopic object, one will become entangled with the rest of the universe, like it or not – and that entanglement is what we call decoherence.

These two descriptions do not seem to be saying the same thing. How is the classical object entangled with the rest of the universe? Thank you.

The terminology in this area is confusing, both because it gets tied up with interpretations of QT and because of sloppy usage. The idea is based on a thought experiment: suppose you had an isolated pure state |S>, and you suddenly exposed it to a large environment E, the state of which you haven’t or can’t measure and so don’t know. Your previously-isolated S now interacts with E, so you have some joint state |SE> or density ρ_SE. You can’t measure this since E is big, so you measure the S component, which you can only access as a density ρ_S = Tr_E(ρ_SE), where Tr_E is the partial trace operator that essentially averages over the states of E. This is a “mixed state”, i.e. a classical probability distribution over quantum states. Unless this distribution is a delta function, S and E are entangled (in your basis - see the response to the question about entanglement above).

Technically, decoherence is this transition from a pure to a mixed state of S. But this thought experiment doesn’t really make sense, or only makes sense (approximately) for a beam of particles hitting a target or something like that. It’s a heuristic idea. The underlying question, as always, is whether decoherence is observer-specific or “objective” in some sense. Carroll is making it objective - your state is effectively classical because it is not a pure state, but a mixed state due to your interaction with the rest of the universe.
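The pure-to-mixed transition in the thought experiment above can be sketched in a few lines. In this toy version (my construction, not from the discussion), the “environment” is a single extra qubit and the interaction is a CNOT, a minimal stand-in for E “measuring” S; the purity Tr(ρ_S²) of the reduced state drops from 1 to 1/2:

```python
import numpy as np

s = np.array([1.0, 1.0]) / np.sqrt(2)   # |S> = (|0> + |1>)/sqrt(2): pure state
e = np.array([1.0, 0.0])                # |E> = |0>: one-qubit "environment"
joint = np.kron(s, e)                   # initially a product state |SE>

# CNOT with S as control: the environment records S's basis information.
cnot = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
joint = cnot @ joint                    # now (|00> + |11>)/sqrt(2): entangled

# rho_S = Tr_E |SE><SE| : average over the unmeasured environment.
m = joint.reshape(2, 2)
rho_s = m @ m.conj().T
purity = np.trace(rho_s @ rho_s).real

print(round(purity, 6))  # → 0.5: S went from pure (purity 1) to maximally mixed
```

The same arithmetic scales up: the bigger and less measurable E is, the more completely the off-diagonal (coherence) terms of ρ_S are averaged away.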

Just a small correction to my question 46. I misspelled homeostatic when I typed “hemostatic”. If you corrected this small error of mine, it would make my question clearer to all readers that may happen upon it. Thank you, and cheers. It is a great online course for us lurkers!

Thanks. I can’t fix this, but hopefully read your question correctly.

Do boundaries arise when entities, connected by a shared intention at a particular moment (Alice), need to establish channels for information exchange with another group of entities in a different state at the same moment (Bob)?

Assuming an “entity” already involves assuming a boundary around that entity. Since the boundary itself defines the overall channel, this exists whenever the entity does. What can change is the “systems” that the entity identifies on its boundary and treats as system-specific channels. How multiple entities agree to recognize a “shared system” remains an open question. The question of how humans agree on shared meanings of words is a special case of this.

What happens when the shared intention within a system remains constant or changes over time? Does the composition of entities within systems change while boundaries remain the same, or do the entities stay the same while boundaries change? Can these states occur simultaneously while cycling between their combinations? Are these the types of semantics and scale-free-state concepts that entropy captures?

“Intention” suggests an agent with a sophisticated self-representation that interprets the agent’s actions in terms of apparent goals. This would presumably be context-dependent, so I’m not sure anything can be said in general. The boundary defines the entity as a physical system, so they change together. This differs from ordinary usage, where we in some cases treat time series of very different physical systems - e.g. human bodies - as “the same thing.” Understanding how we do this is an open cognitive science question. Entropy is just a number, so it really only captures interaction bandwidth.

Thank you for recommending during the discussion on the lecture about spacetime that we look into Karl Friston’s cognitive sciences research on the free energy principle. I came across an article that has addressed some of the questions I’ve had since beginning your course - “Variational ecology and the physics of sentient systems.” The authors state: “According to the FEP, the physics of sentient systems follows from the statistical mechanics of life.” I am curious to know how one would evaluate the validity of this approach from a physics-as-information-processing standpoint.

The current framework is completely consistent with that of the FEP, up to some subtleties about how boundaries are defined and the reliance of the classical version on a spacetime embedding to keep system and environment separate. See “A free energy principle for generic quantum systems,” written with Karl, on my website chrisfieldsresearch.com.

What is your perspective on this conclusion published in Melvin Vopson’s paper in AIP Advances last week - “We showed that the second law of infodynamics is universally applicable to any system containing information states, including biological systems and digital data. Remarkably, this indicates that the evolution of biological life tends in such a way that genetic mutations are not just random events as per the current Darwinian consensus, but instead undergo genetic mutations according to the second law of infodynamics, minimizing their information entropy.” (“The second law of infodynamics and its implications for the simulated universe hypothesis,” AIP Advances 13, 105308 (2023))

I don’t understand this sentence “the effect of creating a number of N information states is to form N additional information microstates superimposed onto the existing physical microstates” (para 2 of that paper) because I don’t know what it could mean to say that we “create new microstates”. How? Any system has the states it has, so this makes no sense to me. I agree that treating mutations as “random” has no explanatory power and is usually incorrect; there is now a huge literature on this. But I doubt that I would agree with this paper’s reasons for that claim, given the above.

If two compartments of Alice, A1 and A2, communicate with each other via a classical channel, but one or both of the compartments communicates with Bob via a quantum channel, does this mean that, for instance, A1 and Bob can momentarily share more information with each other than A1 shares with A2? That Bob and A1 can share a memory that A2 does not share?

Yes on both counts.

Given the classical requirement of the agent of the human genus to have engendered an Eco-Evo-Devo energy-matter solution that necessarily nests within a natural language scaffold, both leveraging and suppressing the bounds of human cognition, as the very Markov blanket/boundary or holographic screen that then must, by definition, extend to the species as an aggregate cognitive lens, to what extent do you regard the fact that current computational-neural information models all appear predicated on information processing - classical or quantum - that invariably presumes ever-leveraged gains at cost as an accentuated probability curve (evermore successful over/within space-time), when empirical econiche performance in all developmental milestones and cognitive-neuromuscular tasks conducted by human agents, is rote-dependent, of stochastic success rates and subject not to inherited DNA traits, yet rather agent numerosity-endowed, and observed as leveraged technology: the very vibrational quantum fabric of human language itself?

I unfortunately do not understand this question. However, it is clear that viewing physical interaction in quantum terms raises the question of what classical communication is very starkly. The question of what language is and how it works is a special case of this larger question about classical communication. Alexei Grinbaum concluded a paper on quantum communication with the suggestion that physics is actually “about language.” I have considerable sympathy with this view.

Hello, Prof Fields - I'm Ana Hurtado. During the live Discussion 5 session, I asked: "What is the goal?" Later on, I sent you a question through this system suggesting that biological fitness might be a distraction. From what I learned in your lectures, it appeared that the ‘goal seeking’ aspect of the topos quantum-bit information might be the driving force behind all systems in the universe. So, today I stumbled upon this interesting publication (Wong, M.L. et al. (2023) ‘On the roles of function and selection in evolving systems’, Proceedings of the National Academy of Sciences, 120(43). doi:10.1073/pnas.2310223120) about the law of functional information. It addresses what had me scratching my head (goal seeking) and sheds light on how it could actually work (something like - selecting 'adaptive' information variants). What are your thoughts on the last paragraph of the article? - “Given the ubiquity of evolving systems in the natural world, it seems odd that one or more laws describing their behaviors have not been more quickly forthcoming [though note the important contribution of Price (140)]. Perhaps the dominance of Darwinian thinking—the false equating of biological natural selection to “evolution” writ large—played some role. Yet that cannot be the whole story. A more deeply rooted factor in the absence of a law of evolution may be the reluctance of scientists to consider “function” and “context” in their formulations. A metric of information that is based on functionality suggests that considerations of the context of a system alters the outcome of a calculation, and that this context results in a preference for configurations with greater degrees of function. An asymmetric trajectory based upon functionality may seem antithetical to scientific analysis. 
Nevertheless, we conjecture that selection based on static persistence, dynamic persistence, and novelty generation is a universal process that results in systems with increased functional information.” This framework has the potential to provide a better explanation for many observations I've come across in hunter-gatherer societies, human evolutionary environments, and present-day societies. It particularly sheds light on the evolution of human life history and population health. Thank you for offering such an amazing course!

Thanks for this reference - you’re the second person to refer me to this paper in this past week! I have yet to review it thoroughly, but it seems very consonant with thinking driven by the FEP. Consider a system S and its environment E interacting at their mutual boundary. The FEP says that S behaves so as to decrease its measured VFE at the boundary. But the FEP also applies to E, whose environment is S. It says that E also acts to decrease VFE measured at the boundary. How? By building a better predictive model of S’s behavior, and by acting on S to get S to conform to its (E’s) model of S’s behavior. The environment of any active-inference agent, in other words, is also an active-inference agent. Modern Synthesis NeoDarwinism basically treats the environment as a noise source. But this is crazy - the environment of any organism is mostly other organisms, all of which are trying to do what the organism of interest is trying to do - get on in the world. The interaction between an organism and its environment is social, economic, political all the way down, at least to the level of cells, even in microbial mats. This is much closer to Margulis and Lovelock than to Dawkins.