Unlabeling combinatorial species (cycle index series)

#	User	Rating
1	tourist	3856
2	jiangly	3747
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3591
6	gamegame	3477
7	Benq	3468
8	Radewoosh	3462
9	ecnerwala	3451
10	heuristica	3431

#	User	Contrib.
1	cry	167
2	-is-this-fft-	162
3	Dominater069	160
4	Um_nik	158
5	atcoder_official	156
6	Qingyu	155
7	djm03178	151
7	adamant	151
9	luogu_official	150
10	awoo	147

Hi everyone!

In my previous blog, I wrote about how generating functions can be used to enumerated labeled species. In this blog, I want to continue the topic by writing about how one can account for different kinds of symmetries when counting different combinatorial structures. Ultimately, we will end up deriving and hopefully understanding the analogue of Pólya enumeration theorem in species.

Difficulty: ★★★★☆

Prerequisites:

Familiarity with combinatorial species (see my prev. blog), OR
Very good intuition with enumerative combinatorics, genfuncs and recap below.

I will try to use plain language rather than formulas as much as possible, as it seems to be the preferred format for readers.

Recap

Below is a very brief recap of the most important things from the previous article. I tried to keep it as informal as possible.

Previously on combinatorial species

Labeled species

The species $$$F$$$ is a rule that produces:

For each finite set $$$A$$$, a finite set $$$F(A)$$$ consisting of all possible structures defined by the species $$$F$$$ that can be built using the elements of $$$A$$$ as atoms;
For each bijection $$$\sigma : A \to B$$$, a bijection $$$F_\sigma : F(A) \to F(B)$$$ between the structures $$$F(A)$$$ and $$$F(B)$$$, that describes how structures change if we change one set of atoms for another.

When we use the term "atoms", we mean the units used to enumerate the species, such as the vertices of a graph, the pairs of brackets in a bracket sequence, etc. For example, you may see $$$F(A_2)$$$, $$$F(A_3)$$$ and $$$F(A_4)$$$ for the species of labeled trees below:

Labeled trees on sets $$$A_2 = \{{\color{red} \bullet}, {\color{blue} \bullet}\}$$$, $$$A_3 = \{{\color{red} \bullet}, {\color{blue} \bullet}, {\color{green} \bullet}\}$$$ and $$$A_4 = \{{\color{red} \bullet}, {\color{blue} \bullet}, {\color{green} \bullet}, {\color{goldenrod} \bullet}\}$$$.
The image by Júlio Reis is distributed under CC BY-SA 3.0 license.

Unlabeled structures

The theory of combinatorial species described in the previous blog post is based on an assumption that all atoms are pairwise distinguishable ("labeled"). I also very briefly mentioned the concept of unlabeled structures in the end of the previous blog post:

On the left and on the right are equivalent trees. In the middle is the representation of their unlabeled structure.

We say that two structures $$$\alpha, \beta \in F(A)$$$ are equivalent if one can be obtained from the other by re-ordering the atoms in $$$A$$$. For example, on the picture above all the trees on the bottom row are equivalent to each other, as well as all the trees on 3 rows above it.

All unlabeled ordered trees on $$$n \in \{1,2,3,4\}$$$ vertices.

An unlabeled structure is what we get by essentially erasing the "labels" from the elements of $$$F(A)$$$ and treating atoms as indistinguishable from each other. Formally, it is an equivalence class in $$$F(A)$$$ with regard to the equivalence relation defined above.

Species composition

Perhaps a single most powerful aspect of combinatorial species is the notion of species composition.

A $$$G$$$-structure is formed on each set of the atom partition, and an $$$F$$$-structure is formed on top of the partition itself.
The image by Brent Yorgey is distributed under CC BY-SA 3.0 license.

Informally, a composition of species $$$F$$$ and $$$G$$$ is a species $$$F \circ G$$$ whose combinatorial structure can be described as an $$$F$$$-structure whose atoms are substituted by instances of $$$G$$$-structures.

Function species is a set of cycles of rooted trees
An illustration from Combinatorial Species article by François Bergeron

Species composition allows us to think about one species as a combination of simpler species. Typical example of which is a species of permutations which can be interpreted as a species of sets of cycles due to cyclic presentation of permutations.

If $$$A(x)$$$ and $$$B(x)$$$ are the generating functions for species $$$A$$$ and $$$B$$$, then $$$A(B(x))$$$ is the generating function for their composition.

It is a fundamental result for labeled species. Unfortunately, there is no simple analogue with unlabeled structures, and deriving the composition formula in some reasonable formulation for them is the ultimate goal of this article.

How unlabeled structures work

Let's take a closer look on the equivalence classes that are formed when we enumerate unlabeled structures. What is common for the elements of an equivalence class? How to compute the size of an equivalence class? How to enumerate the elements of a given equivalence class, if a single representative is known?

Let $$$|A|=n$$$. Consider an object $$$o \in F(A)$$$ of the species $$$F$$$. If we use a permutation $$$\sigma : A \to A$$$ on the underlying set of atoms, it will produce a new object $$$F_\sigma(o)$$$. There is a total of $$$n!$$$ possible permutations $$$\sigma$$$ of $$$n$$$ atoms, and when we apply $$$F_\sigma$$$ to a specific object $$$o$$$ for all permutations $$$\sigma$$$, we obtain all the elements of its equivalence class defined above.

Claim 1. For every object $$$o'$$$ that may be produced as a result of applying $$$F_\sigma$$$ to $$$o$$$, the number of permutations $$$\sigma$$$ that produce $$$o'$$$ is the same as the number of permutations that produce $$$o$$$ itself.

Explanation. Let $$$\tau$$$ be a permutation such that $$$F_\tau(o) = o'$$$. Then for any permutation $$$\sigma$$$ such that $$$F_\sigma(o)=o$$$, there is a permutation $$$\tau \circ \sigma$$$ such that $$$F_{\tau \circ \sigma}(o) = F_\tau(F_\sigma(o)) = o'$$$. Same is true in the other direction.

The permutations $$$\sigma$$$ such that $$$F_\sigma(o)=o$$$ are called the automorphisms of the object $$$o$$$. So among $$$n!$$$ values of $$$F_\sigma(o)$$$, if we consider them as a multiset, each unique one is counted $$$|\operatorname{Aut}(o)|$$$ times, where $$$\operatorname{Aut}(o) = \{\sigma : F_\sigma(o) = o\}$$$ is the set of automorphisms of $$$o$$$.

This allows us to say that the size of the equivalence class of $$$o$$$ is exactly $$$\frac{n!}{|\operatorname{Aut}(o)|}$$$.

Burnside lemma

The result above is very useful, as now instead of counting the number of equivalence classes we instead can "distribute" the counting among all the elements, that is we may represent the total number of equivalence classes as

$$$ \sum\limits_{o \in F(A)} \frac{|\operatorname{Aut}(o)|}{n!}. $$$

In this way, the elements from the same equivalence class, when summed up together, will produce $$$1$$$.

On the other hand, the sum of $$$|\operatorname{Aut}(o)|$$$ can be reformulated as follows:

$$$ \sum\limits_{o \in F(A)} |\operatorname{Aut}(o)| = |\{(o, \sigma) : F_\sigma(o) = o\}| = \sum\limits_{\sigma \in S_n} |\operatorname{Fix}(\sigma)|, $$$

where $$$\operatorname{Fix}(\sigma)$$$ is the number of fixed points of $$$F_\sigma$$$, that is $$$\operatorname{Fix}(\sigma) = \{o \in F(A) : F_\sigma(o) = o\}$$$.

This gives the following alternative formula for the number of the equivalence classes:

$$$ \frac{1}{n!}\sum\limits_{\sigma \in S_n} |\operatorname{Fix}(\sigma)|, $$$

also known as the Burnside lemma. That being said, the number of unlabeled structure on $$$n$$$ atoms of the species equates to the average number of fixed points over all permutations of length $$$n$$$. This key observation allows us to apply key properties of expected values to compute the number of equivalence classes, in particular use the linearity of expected value.

Automorphisms and fixed points for compositions

To construct an $$$F \circ G$$$ structure, we first build an $$$F$$$-structure on a set $$$B = \{1, 2, \dots, k\}$$$, and then substitute the $$$i$$$-th of its atoms with a $$$G$$$-structure built on the set $$$A_i$$$. Then, we treat the disjoint union $$$A = A_1 \sqcup A_2 \sqcup \dots \sqcup A_k$$$ as the full set of atoms. So, an object $$$o$$$ of species $$$F \circ G$$$ can be described by an object $$$f \in F(B)$$$ and a family of objects $$$g_1, \dots, g_k$$$, where $$$g_i \in G(A_i)$$$.

Automorphisms of a given object

Consider a permutation $$$\sigma : A \to A$$$. What should hold for it to be an automorphism of $$$o$$$?

First of all, it should preserve the structure of $$$f$$$. The object $$$f$$$ is built on top of the sets $$$A_1, \dots, A_k$$$, and to preserve the structure of $$$f$$$, the permutation $$$\sigma$$$ should only map these sets to each other. In other words, it should map a set $$$A_i$$$ to a set $$$A_j$$$ such that $$$|A_i| = |A_j|$$$, and the function $$$\rho(i) = j$$$ defined from such mapping should be a permutations, which is, in turn, an automorphism of $$$f$$$.

Then, if we look on a transition from $$$A_i$$$ to $$$A_{\rho(i)}$$$, the permutation $$$\sigma$$$ should be consistent with the mapping $$$\rho$$$. In other words, if we restrict the domain of $$$\sigma$$$ from the whole $$$A$$$ to its subset $$$A_i$$$, the resulting mapping $$$\sigma_i$$$ should map the object $$$g_i$$$ into the object $$$g_{\rho(i)}$$$.

That being said, each permutation of interest $$$\sigma : A \to A$$$ can be described by a permutation $$$\rho : B \to B$$$, which is an automorphism of $$$f$$$, and a family of bijections $$$\sigma_i : A_i \to A_{\rho(i)}$$$, such that $$$\sigma_i$$$ maps $$$g_i$$$ to $$$g_{\rho(i)}$$$.

Fixed points of a given permutation

Such setting allows us to think about what should hold for an object $$$o \in (F \circ G)(A)$$$ to be a fixed point of $$$\sigma$$$?

The object $$$f$$$ must be a fixed point of $$$\rho$$$;
For each $$$\sigma_i : A_i \to A_{j}$$$ it must hold that $$$F_{\sigma_i}(g_i) = g_{j}$$$.

The first criteria is understandable and is in line with what we have studied so far. What about the second, though? When is it true?

Essentially, it means that in a cycle decomposition of the permutation $$$\rho$$$, for each cycle $$$i_1 \to i_2 \to \dots \to i_m \to i_1$$$ we're allowed to choose a single distinct element $$$g_{i_1}$$$ in such way that it is a fixed point of a permutation $$$\tau = \sigma_{i_m} \circ \dots \circ \sigma_{i_2} \circ \sigma_{i_1}$$$, and all the other ones will be uniquely defined by the mappings $$$\sigma_{i_1}, \sigma_{i_2}, \dots, \sigma_{i_{m-1}}$$$.

Objects $$$g_{i_1}, \dots, g_{i_m}$$$ on a cycle $$$i_1 \to i_2 \to \dots \to i_m \to i_1$$$

Let $$$|A_{i_1}|= \dots = |A_{i_m}|=t$$$. What do we get if we average the amount of fixed points $$$g_{i_1}$$$ of $$$\tau$$$ over all permutations $$$\tau$$$? It follows from the Burnside lemma that we get the number of distinct unlabeled structures of species $$$G$$$ on $$$t$$$ atoms. The neat part here is that for each cycle in $$$\rho$$$ it could be done independently, multiplying the amounts to get the average amount for a fixed permutation $$$\rho$$$.

Averaging over all permutations

In other words, to count the average amount of fixed points of $$$\sigma : A \to A$$$, we can do the following:

Consider a permutation $$$\rho : B \to B$$$;
Pick a fixed point of $$$F_\rho$$$ as an $$$F$$$-structure;
For each cycle of $$$\rho$$$, pick an unlabeled structure $$$g$$$ of $$$G$$$ and duplicate it on the whole cycle $$$g_{i_1}, \dots, g_{i_m}$$$;
Average the counts over all permutations $$$\rho$$$ by dividing with $$$k!$$$.

These considerations boil down to the following essential formula:

$$$ \boxed{(F \circ G)(x) = \sum\limits_{k=0}^{\infty} \frac{1}{k!} \sum\limits_{\rho \in S_k} |\operatorname{Fix}_F(\rho)| \prod\limits_{j=1}^k G(x^{j})^{\rho_j},} $$$

where $$$\rho_j$$$ is the number of cycles of length $$$j$$$ in $$$\rho$$$, and the ordinary generating functions $$$G(x)$$$ and $$$(F \circ G)(x)$$$ are defined as

$$$\begin{align} G(x) =& \sum\limits_{n=0}^\infty \frac{x^n}{n!} \sum\limits_{\sigma \in S_n} |\operatorname{Fix}_G(\sigma)|, \\ (F \circ G)(x) =& \sum\limits_{n=0}^\infty \frac{x^n}{n!} \sum\limits_{\sigma \in S_n} |\operatorname{Fix}_{F \circ G}(\sigma)|. \end{align}$$$

In other words, the functions are defined in such way that $$$[x^n]G(x)$$$ and $$$[x^n](F \circ G)(x)$$$ are the numbers of unlabeled $$$G$$$-structures and $$$(F \circ G)$$$-structures correspondingly on $$$n$$$ atoms, corresponding to OGFs defined for unlabeled structures in the previous article.

Intuitively, $$$G(x^j)$$$ is a generating functions for sets of $$$j$$$ equal unlabeled structures of species $$$G$$$, so we pick $$$|c|$$$ same $$$G$$$-structures for each cycle $$$c$$$ of length $$$|c|$$$ and then multiply the generating functions over all cycles, as it is done independently.

If you're not convinced yet, you can find a bit more technical and rigorous (but less intuitive) computation below to get the same result.

Derivation

Cycle index series

looking on the formula above, we may define a multivariate formal power series

$$$ Z_F(t_1, t_2, t_3, \dots) = \sum\limits_{k=0}^\infty \frac{1}{k!} \sum\limits_{\rho \in S_k} |\operatorname{Fix}_F(\rho)| t_1^{\rho_1} \dots t_k^{\rho_k}, $$$

which allows to rewrite $$$(F \circ G)(x)$$$ as the formal power series composition:

$$$ (F \circ G)(x) = Z_F(G(x), G(x^2), G(x^3), \dots). $$$

The multivariate power series $$$Z_F(t_1, t_2, t_3, \dots)$$$ is called the cycle index series of the species $$$F$$$ and it holds a tremendous amount of information about the structure of the species. In particular, we may note that the exponential generating function of the species is simply

$$$ F(x) = Z_F(x, 0, 0, \dots), $$$

As to count labeled structures it is sufficient to count the fixed points of identity permutations (for which all cycles have length $$$1$$$).

and the ordinary generating function of unlabeled structures of the species is nothing but

$$$ F(x) = Z_F(x, x^2, x^3, \dots), $$$

as we may perceive the species as a composition of itself with an "atom" species $$$x$$$.

This is the composition formula for the unlabeled structures that we sought for.

Why is it called a cycle index series?

The definition of a cycle index series above may be rewritten in terms of automorphisms:

$$$ Z_F(t_1, t_2, t_3, \dots) = \sum\limits_{k=0}^\infty \sum\limits_{o \in F(B_k)} \frac{1}{k!} \sum\limits_{\rho \in \operatorname{Aut}(o)} t_1^{\rho_1} \dots t_k^{\rho_k}, $$$

where $$$B_k = \{1, 2, \dots, k\}$$$.

Substituting $$$\frac{1}{k!} = \frac{|\operatorname{Aut}(o)|}{k!} \frac{1}{|\operatorname{Aut}(o)|}$$$, and recalling that the equivalence class of $$$o$$$ has $$$\frac{k!}{|\operatorname{Aut}(o)|}$$$ objects, we may rewrite the sum above as

$$$ Z_F(t_1, t_2, t_3, \dots) = \sum\limits_{k=0}^\infty \sum\limits_{o \in \widetilde{F}(B_k)} \frac{1}{|\operatorname{Aut}(o)|} \sum\limits_{\rho \in \operatorname{Aut}(o)} t_1^{\rho_1} \dots t_k^{\rho_k}, $$$

where $$$\widetilde{F}(B_k)$$$ is the set of representatives of unlabeled equivalence classes of $$$F(B_k)$$$.

For a given set of permutations that is closed under composition (so-called permutation group) $$$G$$$, the polynomial

$$$ Z(G)(t_1, t_2, \dots, t_k) = \frac{1}{|G|} \sum\limits_{\rho \in G} t_1^{\rho_1} \dots t_k^{\rho_k} $$$

is known more widely as the cycle index of $$$G$$$. That being said, the cycle index series $$$Z_F(t_1, t_2, \dots)$$$ is the formal sum of cycle indices over all unlabeled structures of the species $$$F$$$ of all possible sizes.

Atom colorings

Note that for in all discussions above we assumed that either all atoms are distinct from each other (labeled) or indistinguishable (unlabeled). In a bit more generic setting we may assume that each atom is colored in one of $$$s$$$ distinct colors.

In this case, the number of indistinguishable structures built on such set of atoms is $$$Z(G)(s, s, \dots, s)$$$, as we enumerated fixed points of $$$\rho$$$ by coloring all atoms on each cycle in the same color.

Correspondingly, if we want to also keep track of the total number of atoms in cycle index series, we should use

$$$ f(s) = Z_F(sx, sx^2, sx^3, \dots). $$$

Then, the coefficient $$$[x^n] f(s)$$$ will denote the number of $$$F$$$-structure on $$$n$$$ atoms, such that each atom is colored in one of $$$s$$$ indistinguishable colors.

Examples

There are two famous species $$$F$$$ with known composition formulas. Specifically, the cycles and sets.

Let's derive explicit form for their cycle index series, and from that obtain the operators $$$\operatorname{CYC}$$$ and $$$\operatorname{MSET}$$$:

Cycles

Consider a cycle of length $$$n$$$. What do its automorphisms look like? Its automorphisms are:

A permutation with this cycle as its cycle presentation;
Powers of this permutation.

This is a total of $$$n$$$ elements. When we raise the cycle to the power $$$k$$$, we obtain a set of $$$\gcd(k, n)$$$ of length $$$n/\gcd(n, k)$$$ each.

For example, raising single cyclic shift $$$[2,3,4,5,6,1]$$$ with cycle presentation $$$(1~2~3~4~5~6)$$$ to the power $$$3$$$, we get a permutation $$$[4,5,6,1,2,3]$$$ with a cycle presentation $$$(1~4)(2~5)(3~6)$$$. Note that all cycles of length $$$n$$$ are isomorphic to each other, meaning that there is a unique unlabeled cycle of length $$$n$$$ with the cycle index

$$$ Z(C_n)(t_1, t_2, \dots, t_n) = \frac{1}{n} \sum\limits_{k=1}^n t_{n/\gcd(k, n)}^{\gcd(k, n)} = \frac{1}{n} \sum\limits_{d | n} \varphi(\frac{n}{d})t_{n/d}^{d}. $$$

The last formula is due to the fact that there are $$$\varphi(\frac{n}{d})$$$ numbers $$$k$$$ such that $$$\gcd(k, n) = d$$$. If we sum it up over all $$$n$$$, we get

$$$ Z_C(t_1, t_2, \dots) = \sum\limits_{n=1}^\infty \frac{1}{n} \sum\limits_{d|n} \varphi(\frac{n}{d}) t_{n/d}^d, $$$

then substituting $$$k=\frac{n}{d}$$$, we get

$$$ Z_C(t_1, t_2, \dots) = \sum\limits_{k=1}^\infty \frac{\varphi(k)}{k} \sum\limits_{d=1}^\infty \frac{t_{k}^d}{d}, $$$

making it into

$$$ \boxed{Z_C(t_1, t_2, \dots) = \sum\limits_{k=1}^\infty \frac{\varphi(k)}{k} \log \frac{1}{1-t_k}} $$$

From this, the unlabeled composition formula of $$$\operatorname{CYC}$$$ species of cycles and arbitrary species $$$f$$$ with OGF $$$f(x)$$$ is:

$$$ \boxed{ \operatorname{CYC} f(x) = \sum\limits_{k=1}^\infty \frac{\varphi(k)}{k} \log\frac{1}{1-f(x^k)} } $$$

Sets

Now, in case of sets, we would get the same set no matter how we rearrange its elements. Again, there is a unique unlabeled set on $$$n$$$ vertices, and its automorphisms are all possible permutations, hence we have

$$$ Z(S_n)(t_1, t_2, \dots, t_n) = \frac{1}{n!} \sum\limits_{\rho \in S_n} \prod\limits_{c \in \rho}t_{|c|}. $$$

Assuming that there are $$$j_k$$$ cycles of length $$$k$$$, the formula above rewrites as

$$$ Z(S_n)(t_1, t_2, \dots) = \sum\limits_{j_1+2 j_2 + 3 j_3 + \cdots + n j_n = n} \prod\limits_{k=1}^n \frac{t_k^{j_k}}{k^{j_k} j_k!}. $$$

Indeed, the number of ways to make a permutation with given $$$j_1, \dots, j_n$$$ is

$$$ \frac{n!}{1^{j_1} j_1! 2^{j_2} j_2!\dots n^{j_n} j_n!}, $$$

as for $$$j_k$$$ cycles of length $$$k$$$, you divide by $$$k^{j_k}$$$ to not double-count cyclic shifts and by $$$j_k!$$$ to not double-count rearrangements of cycles.

On the other hand, if we sum it up over all $$$n$$$, we get

$$$ Z_S(t_1, t_2, \dots) = \sum\limits_{n=0}^\infty \sum\limits_{j_1+2 j_2 + 3 j_3 + \cdots + n j_n = n} \prod\limits_{k=1}^n \frac{t_k^{j_k}}{k^{j_k} j_k!}. $$$

Adding a new variable $$$y$$$ to track the sum of $$$j_1 + 2j_2 + \dots + n j_n$$$, we can rewrite it as product of sums:

$$$ Z_S(y, t_1, t_2, \dots) = \prod\limits_{k=1}^\infty \sum\limits_{j_k=0}^\infty \frac{(t_k y^k)^{j_k}}{k^{j_k} j_k!} = \prod\limits_{k=1}^\infty \exp\left(\frac{t_k y^k}{k}\right). $$$

Finally, substituting $$$y=1$$$, we get the sum over all possible $$$n$$$ as

$$$ \boxed{Z_S(t_1, t_2, \dots) = \exp\left(\sum\limits_{k=1}^\infty \frac{t_k}{k}\right)} $$$

and the composition formula with set species

$$$ \boxed{ \operatorname{MSET} f(x) = \exp\left(\sum\limits_{k=1}^\infty \frac{f(x^k)}{k}\right) } $$$

Cycle index composition

In a very generic setting, one may prove the general formula:

$$$ Z_{F \circ G}(t_1, t_2, \dots) = Z_F(Z_G(t_1, t_2, \dots), Z_G(t_2, t_4, \dots), Z_G(t_3, t_6, \dots), \dots). $$$

Derivation, explanation and proof of this formula is left to the reader as an exercise.

Rev.	By	When	Δ	Comment
en9	adamant	2023-01-27 23:12:13	867
en8	adamant	2023-01-27 14:23:05	173
en7	adamant	2023-01-27 13:27:52	13
en6	adamant	2023-01-27 13:25:29	348
en5	adamant	2023-01-27 13:21:39	0	(published)
en4	adamant	2023-01-27 13:21:32	115	(saved to drafts)
en3	adamant	2023-01-27 13:16:33	115
en2	adamant	2023-01-27 13:15:10	458	about colorings
en1	adamant	2023-01-27 05:33:52	22802	Initial revision (published)