These notes are written for my PhD student, a mathematician
who needs to know a little Statistical Mechanics for his thesis introduction. YMMV.
He has asked:
What is Statistical Mechanics?
Who is interested in it?
What is the relationship between the partition function and Observables?
...let's start with those questions.
%{{{ 1
\subsection{What? Why? (Click Me)}
Statistical Mechanics is part of Physics.
Physics might be characterised, in the large, as the
scientific exercise
(as opposed to involuntary reflex) of modelling the
observable physical world.
That is,
the representation of the physical world by something `simpler',
which nonetheless captures some of
the physical world's humanistically essential features.
There are various phases to this exercise, such as:
(i) deciding which toy is the model;
(ii) working out what the model itself does;
and
(iii) interpreting this behaviour
as a prediction for the physical world.
The simple toys at our disposal (such as real toys, and
systems of equations in mathematics)
either exist themselves in the physical world, or are abstractions
formulated by creatures living in the physical world.
In particular, scientists have had notable success summarizing large amounts of
observational data from the physical world with certain relatively simple
mathematical models.
A very successful such model is, reasonably, regarded as close to nature
itself; and hence fundamental.
Key to this is the expectation that such a model, pushed into an as yet
unobserved
(but suitably nearby) regime, will correctly predict the result of observations
subsequently made there.
There has been notable success too in this predictive aspect of Physics,
and great technological benefits have accrued.
...So where does Statistical Mechanics fit in?
"The fundamental laws necessary for the mathematical treatment of a large part of physics and the whole of chemistry are thus completely known, and the difficulty lies only in the fact that application of these laws leads to equations that are too complex to be solved."
Paul Dirac
In this quote Dirac points out that the problems of Physics do not end,
by any means, with the determination of fundamental principles.
They include such fundamental problems; and also problems of computation.
(Indeed the subject we are going to describe here was, in its original
historical development, assumed to lie on the fundamental side;
only a better understanding of its setting later showed otherwise.)
An example of the laws that Dirac is referring to would be Newton's laws,
which do a good job of determining the classical dynamics of a single particle
moving through a given force-field.
Two-body systems are also manageable, but after that, even though it may
well still be Newtonian (or some other well-understood) laws that apply in
principle, exact dynamics will simply not be computationally accessible.
At least some understanding of the modelling of many-body systems is needed in order to work with a number of important materials and systems (magnets, magnetic recording materials, LCDs, non-perturbative QFT, etc.). In each such case, the key dynamical components of the system are numerous, and interact with each other. Thus the force fields affecting the movement of one component are caused by the others; and when it moves, its own field changes, moving the others.
The solution:
The equilibrium Statistical Mechanical approach to such problems is to try to model only certain special types of observation that could be made on the system. One then models these observations by weighted averages over all possible instantaneous states of the system. In other words, dynamics is not modelled directly (questions about dynamics are not asked directly). As far as is appropriate, dynamics is encoded in the weightings -- the probabilities assigned to states.
It is most convenient to pass to an example. We shall choose a bar magnet. We shall assume that the metal crystal lattice is essentially fixed (the formation of the lattice is itself a significant problem, but we will have enough on our plate). The set of states of the system that we allow will be the possible orientations of the atomic magnetic dipoles (not their positions, which are fixed at the lattice sites). What next? %}}} %{{{ 2
\subsection{Classical reminders (Click Me)}
A good rule of thumb when analysing a physical system is: ``follow the energy''. (This raises many questions, all of which we ignore.) The kinetic energy of a system of $N$ point particles with masses $m_i$ and velocities $v_i$ is \[ E_{kin} = \sum_{i=1}^{N} \frac{1}{2} m_i v_i^2 \]

What can affect a particle's subsequent velocity, and hence change its kinetic energy? That is, what causes $\frac{dv}{dt}$ to be non-zero? A force: \[ F = m \frac{dv}{dt} \] Thus we also need to understand the forces acting on the particles. For example: if the particles are really pointlike then they interact pairwise via the Coulomb force \[ F_1 = \frac{q_1 q_2}{4 \pi \epsilon_0} \frac{\underline{r}_{12}}{r_{12}^3} = -F_2 \] Here $q_1,q_2$ are the charges (perhaps in coulombs); $\epsilon_0$ is a constant (depending on that unit choice); and $\underline{r}_{12} = \underline{r}_1 -\underline{r}_2$.

For a moment we can think of this as a force field created by the second particle, acting on any charged first particle. This is a conservative force field, meaning that there is a function $\phi(\underline{r})$ such that \[ F = - \nabla \phi \] The function $\phi(\underline{r})$ is part of the potential energy of the first particle. In other words its `total energy' is \[ E = \frac{1}{2} m v^2 + \phi \] In practice, since $\phi$ is only defined up to an additive constant, $E$ itself is not so significant as changes in $E$. %
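
For concreteness, here is a minimal numerical sketch in Python of the total energy of a small system of charged point particles: kinetic energy plus pairwise Coulomb potential energy, as in the formulas above. The function name, unit choices and any particular input arrays are illustrative assumptions, not part of the notes proper.
\begin{verbatim}
import numpy as np

EPS0 = 8.8541878128e-12  # vacuum permittivity (SI units)

def total_energy(m, q, r, v):
    """Kinetic plus pairwise Coulomb potential energy.

    m, q : length-N arrays of masses (kg) and charges (C)
    r, v : (N, 3) arrays of positions (m) and velocities (m/s)
    """
    kinetic = 0.5 * np.sum(m * np.sum(v ** 2, axis=1))
    potential = 0.0
    for i in range(len(m)):
        for j in range(i + 1, len(m)):
            r_ij = np.linalg.norm(r[i] - r[j])
            potential += q[i] * q[j] / (4 * np.pi * EPS0 * r_ij)
    return kinetic + potential
\end{verbatim}
Note that adding a constant to the potential term would shift $E$ uniformly without changing any force, which is the sense in which changes in $E$ matter more than $E$ itself.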
\subsection{Stats/Gibbs canonical distribution (Click Me)}
Notice that the system energy $E$ depends on the velocities and positions of all the atoms in the system. There are $10^{23}$ or so atoms in a handful of Earthbound matter, so we are not going to be able to keep track of them all (nor do we really want to). We would rather know about the bulk, averaged behaviour of the matter.

Let us call the inaccessible complete microscopic specification of all positions and velocities in the system a `microstate'. Then for each microstate $\sigma$ we know, in principle, the total energy $E(\sigma)$. We could ask: What is the probability $P$ of finding the system, at any given instant, in a specific microstate? Then we could compute an expected value for some bulk observation ${\mathcal O}$ by a weighted average over the microstates: \begin{equation} \label{initexpect} \langle {\mathcal O} \rangle = \sum_{\sigma} {\mathcal O}(\sigma) P(\sigma) \end{equation} In principle the probability $P$ could depend on every aspect of $\sigma$. This would make computation very hard. At the other extreme, $P$ could be independent of $\sigma$. But this turns out to be a problematic assumption for a number of Mathematical and Physical reasons. Another working assumption would be that two microstates are equally likely if they have the same energy; i.e. that $P$ depends on $\sigma$ just through $E$. That is, that $P$ depends only on the total energy of the system. Let us try this.

The next question is: How does $P$ depend on $E$? What is the function $P(E)$? If we have a large system, then we could consider describing it in two parts (left and right side, say), separated by some notional boundary, with the total microstate $\sigma$ being made up of $\sigma_L$ and $\sigma_R$. These halves are in contact, of course, along the boundary. But if the system is also in contact with other systems (so that energy is not required to be locally conserved), then it is plausible to assume that the states of the two halves are independent variables. In this case \[ P(\sigma) = P(\sigma_L) P(\sigma_R) \] as for such probabilities in general. Similarly, the total energy \[ E = E_L + E_R + E_{int} \] (where $ E_{int} $ is the interaction energy between the halves) is reasonably approximated by \[ E \sim E_L + E_R \]

(Why is this reasonable?!... Clearly the kinetic energy is localised in each of the two halves. The potential energy is made up of contributions from all pairs, including pairs with one in each half. But we assume that the pair potential is greater for pairs that are closer together; and that the boundary is a structure of lower dimension than the system overall. In this sense $E_{int}$ is localised in the boundary (pairs that are close together but in separate halves are necessarily close to the boundary); while being part of the overall potential energy, which is spread with essentially constant density over the whole system. Thus $E_{int}$ is a vanishing proportion of the whole energy for a large system. (We shall return to these core Physical assumptions of Statistical Mechanics later. They imply an intrinsic restriction in Statistical Mechanics to treating interactions that are, in a suitable sense, short-range. Fortunately this seems Physically justifiable.))

The $L$ and $R$ subsystems will each have their own `energy-only' probability function.
Thus we have something like \begin{equation} \label{ppp1} P(E_L+E_R) = P_L(E_L) P_R(E_R) \end{equation} In this expression $E_L$ and $E_R$ are independent variables, so \[ \frac{\partial P(E_L+E_R)}{\partial E_L} = \frac{\partial P(E_L+E_R)}{\partial E_R} \] so $ P_L'(E_L) P_R(E_R) = P_L(E_L) P_R'(E_R) $, so \[ \frac{P_L'(E_L)}{P_L(E_L)} = \frac{P_R'(E_R)}{P_R(E_R)} \] This separates: the left-hand side depends only on $E_L$ and the right-hand side only on $E_R$, so both must equal a constant. We write $-\beta$ for this constant of separation. We have $P'_L(E_L) = -\beta P_L(E_L)$ (and similarly for $R$). This is solved by a function of the form \[ P(E) = C \exp(-\beta E) \] where $C$ is any constant. In our case $C$ is determined by the normalisation \[ \sum_{\sigma} P(E(\sigma)) =1 \]

The separation constant $\beta$ is interesting, since it is the only thing (other than the form of the function itself) that connects the subsystems. We will see later that this connection corresponds (inversely) to a notion of temperature.
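
The functional-equation argument can be spot-checked numerically. The following Python sketch (the energy levels and the value of $\beta$ are arbitrary illustrative choices) verifies that if each half is given Boltzmann weights at the same $\beta$, the product of the halves' probabilities agrees with the Boltzmann distribution built from the total energy $E_L + E_R$.
\begin{verbatim}
import numpy as np

beta = 1.3                         # illustrative inverse temperature
E_L = np.array([0.0, 0.7, 1.1])    # illustrative energy levels, left half
E_R = np.array([0.2, 0.9])         # illustrative energy levels, right half

def boltzmann(E, beta):
    """Normalised weights P(E) = exp(-beta E) / sum_E' exp(-beta E')."""
    w = np.exp(-beta * np.asarray(E, dtype=float))
    return w / w.sum()

# product of independent probabilities for the two halves ...
P_joint = np.outer(boltzmann(E_L, beta), boltzmann(E_R, beta))

# ... equals the Boltzmann distribution of the combined system,
# whose energies are E_L + E_R over all combined microstates
E_tot = E_L[:, None] + E_R[None, :]
P_total = boltzmann(E_tot.ravel(), beta).reshape(E_tot.shape)

assert np.allclose(P_joint, P_total)
\end{verbatim}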
\subsection{Partition Function (Click Me)}
The normalisation function for our system \[ Z(\beta) =\sum_{\sigma} \exp(-\beta E(\sigma)) \] ($Z$ from the German \emph{Zustandssumme}, `sum over states') is called the partition function. That is, for given $\beta$, \[ P(E) = \frac{\exp(-\beta E)}{Z} \]

Recall that, by our derivation, $\beta$ represents the effect of thermal (energetic) contact with the universe of other systems. Our usual notion of the bulk effect of neighbouring systems on the energetics of a given system, at least where long-time-stable (equilibrium) properties are concerned, is the notion of temperature. Thus $\beta$ encodes temperature. How specifically does it do this? See later.

First we want to consider the pay-off for the analysis we have made so far. The idea was that we would be able to compute time-averaged bulk properties of the system. To produce a concrete example, we are going to need to make a concrete choice for $E$. If $S$ is the set of all possible instantaneous states of the system, then \[ E : S \rightarrow {\mathbb R} \] associates a real energy value to each state. We now formulate choices for $S$ and $E$ via a long series of simplifying, but not trivialising, assumptions.
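
Before specialising $E$, here is a minimal Python sketch of the pay-off: equation (\ref{initexpect}) with Boltzmann weights, for any finite state set, energy function and observable. The function names and the toy two-level example are purely illustrative assumptions.
\begin{verbatim}
import numpy as np

def thermal_average(states, energy, observable, beta):
    """<O> = sum_s O(s) exp(-beta E(s)) / Z, with Z the partition function."""
    E = np.array([energy(s) for s in states], dtype=float)
    O = np.array([observable(s) for s in states], dtype=float)
    weights = np.exp(-beta * E)
    Z = weights.sum()
    return (O * weights).sum() / Z

# toy example: a single two-level degree of freedom with energies 0 and 1
states = [0, 1]
energy = lambda s: float(s)
occupation = lambda s: float(s)   # observable: is the upper level occupied?

print(thermal_average(states, energy, occupation, beta=1.0))
# prints exp(-1)/(1 + exp(-1)), roughly 0.269
\end{verbatim}
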
[Figure: square lattice array of spins]
\subsection{The Potts Model (Click Me)}
To summarize, we have the $Q$-state Potts model partition function \[ Z = Z(G) = \sum_{s \in S} \exp\left(\beta \sum_{(i,j) \in G} \delta_{s(i),s(j)} \right) \] Here $S$ is the set of maps from the vertices of the graph $G$ to a fixed set of $Q$ spin `states', and the inner sum runs over the edges $(i,j)$ of $G$. Note that, fixing $Q$ ($Q=2$, say), this is simply a polynomial in \[ x= \exp(\beta) \] for each choice of graph $G$.

Let's start with an almost trivial example. If $Q=2$ and the graph is $K_2$ ($K_n$ is the complete graph on $n$ vertices) we have \[ Z(K_2) =2x+2 \] Notice that with this choice of energy function it does not matter exactly what form the $Q$ distinct spin `states' take, since the energy depends only on whether two spins are equal or not. For definiteness let us take the set \[ \{ 1,2,...,Q \} \]

Our case $Q=2$ coincides almost exactly with the famous Ising model. The only difference is that there it is conventional to take the set \[ \{ +1, -1 \} \] and then to replace $\delta_{s(i),s(j)}$ by the product $s(i)s(j)$, giving \[ Z_{Ising}(K_2) = 2x + 2x^{-1} \] This differs from the Potts answer by an overall factor, and by $\beta \rightarrow \beta/2$ (both here and for arbitrary $G$). We will see later that these changes are essentially trivial, and we will not trouble even to remark on them again thereafter.
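
As a sanity check on such small examples, here is a brute-force Python sketch of the $Q$-state Potts partition function for a graph given as an edge list (the vertex labelling and the tested value of $\beta$ are illustrative assumptions). It reproduces $Z(K_2)=2x+2$ for $Q=2$.
\begin{verbatim}
import itertools
import numpy as np

def potts_Z(edges, n_vertices, Q, beta):
    """Q-state Potts partition function, summing over all Q**n_vertices states."""
    Z = 0.0
    for s in itertools.product(range(Q), repeat=n_vertices):
        satisfied = sum(1 for (i, j) in edges if s[i] == s[j])  # equal-spin edges
        Z += np.exp(beta * satisfied)
    return Z

beta = 0.7                  # illustrative value
x = np.exp(beta)
# K_2: two vertices joined by a single edge, with Q = 2
assert np.isclose(potts_Z([(0, 1)], n_vertices=2, Q=2, beta=beta), 2 * x + 2)
\end{verbatim}
Of course the number of terms grows as $Q^{n}$ in the number of vertices $n$, so direct enumeration is hopeless for anything but tiny graphs; this is one reason the analytic structure of $Z$ is worth studying.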
\subsection{Physics in the partition function (Click Me)}
The partition function $Z$ is `just' a normalising factor, but \[ -\frac{d \ln Z}{d \beta} = \frac{1}{Z} \sum_{s\in S} E(s) \exp(-\beta E(s)) = \langle E \rangle \] which satisfies our definition of an observable (equation \ref{initexpect}). Indeed it is an important observable, called the `internal energy' (scale this by $1/N$, where $N$ is the number of spins in the system, for the `energy density' $U$). In light of this, we see that the analysis of $Z$ does contain Physics!

Suppose that our energy function is quantised, taking integer values: \[ E:S \rightarrow {\mathbb Z} \] (as it is in the Potts case). Then $Z$ is a polynomial (or at worst a Laurent polynomial) in $x = \exp(\beta)$. Accordingly the only interesting analytic structure it has is its zeros. How can the zeros of a polynomial reveal physics?...
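
Here is a small Python illustration of the identity above (the toy spectrum and the value of $\beta$ are arbitrary): a finite-difference derivative of $\ln Z$ with respect to $\beta$ reproduces, up to sign, the weighted average $\langle E \rangle$ computed directly.
\begin{verbatim}
import numpy as np

def log_Z(energies, beta):
    """ln Z for a finite list of state energies, with weights exp(-beta E)."""
    return np.log(np.sum(np.exp(-beta * np.asarray(energies, dtype=float))))

def mean_energy(energies, beta):
    """<E> as the Boltzmann-weighted average over states."""
    E = np.asarray(energies, dtype=float)
    w = np.exp(-beta * E)
    return (E * w).sum() / w.sum()

energies = [0.0, 1.0, 1.0, 2.0]   # toy spectrum
beta, h = 0.9, 1e-6

# -d(ln Z)/d(beta), by central finite difference, should equal <E>
numeric = -(log_Z(energies, beta + h) - log_Z(energies, beta - h)) / (2 * h)
assert np.isclose(numeric, mean_energy(energies, beta))
\end{verbatim}
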
[Figure: cubic lattice Ising model partition function]