
The pseudovariety $C o m * D$ viewed as a power pseudovariety

Jiří Kad’ourek

Department of Mathematics and Statistics, Masaryk University, Kotlářská 2, 61137 Brno, Czech Republic

https://doi.org/10.1142/S0218196725500043

Abstract

The objective of this paper is to show that the pseudovariety equation $P X = C o m * D$ admits a solution $U$. This solution $U$ is the pseudovariety put forward by Almeida in his book on finite semigroups and universal algebra as a prospective candidate for such a solution. The result resolves a long-standing open problem posed by Almeida in that book.

Communicated: J. Almeida


AMSC: 20M07, 68Q70

1. Introduction

The semidirect product on the level of pseudovarieties of semigroups has been widely studied in the past. See [2, Chap. 10] for a survey of results obtained until mid-nineties in this direction. Likewise the power operator on the level of pseudovarieties of semigroups has been the subject of intensive study in the past. See [2, Chap. 11] for an overview of results obtained by that time in this area. From time to time these two routes of research have met up with each other. One particular instance of such a fruitful encounter is represented by the study of locally commutative power pseudovarieties, whose results are summarized in [2, Sec. 11.9]. In this context, the pseudovariety $C o m * D$ has arisen, which has been previously the object of concern of researchers in the domain of semidirect products of pseudovarieties of semigroups. See [2, Sec. 10.7] in this respect, for example.

Recall that $C o m$ is the pseudovariety of all commutative semigroups, $D$ is the pseudovariety of all definite semigroups, that is, the pseudovariety of all semigroups whose idempotents act as right zeros, and $C o m * D$ is the pseudovariety of semigroups generated by the class of all semidirect products of commutative semigroups by definite semigroups. An effective characterization of the pseudovariety $C o m * D$ has been provided by Thérien and Weiss in [10]. Their result says that $C o m * D$ is the pseudovariety of semigroups determined by the pseudoidentity $e x f y e z f ≏ e z f y e x f$. See also [2, Sec. 10.7] in this regard.

Let $L C o m$ be the pseudovariety of all locally commutative semigroups. That is, $L C o m$ consists of all finite semigroups S having the property that, for every idempotent e in S, the submonoid $e S e$ of S is a commutative monoid. In other words, $L C o m$ is the pseudovariety of semigroups determined by the pseudoidentity $e x e y e ≏ e y e x e$. In this context, Almeida has proposed in [1] and in [2, Sec. 11.9] to consider the subpseudovariety $U$ of the pseudovariety $L C o m$ determined within $L C o m$ by the additional pseudoidentity ${(e f)}^{ω} e x f ≏ e x f$. It is easy to check that, within $L C o m$, the pseudoidentity ${(e f)}^{ω} e x f ≏ e x f$ is equivalent to the pseudoidentity $e x f {(e f)}^{ω} ≏ e x f$, which shows that the pseudovariety $U$ is self-dual. In fact, it is straightforward to infer from this that the pseudovariety $U$ consists of all finite semigroups S having the property that, for all idempotents $e, f$ in S, the subsemigroup $e S f$ of S is a commutative monoid which admits the element ${(e f)}^{ω}$ for its identity.
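For a small finite semigroup, the two pseudoidentities defining $L C o m$ and $U$ can be checked by brute force. The following Python sketch is only an illustration: the function names, the encoding of a semigroup as a list of elements together with a multiplication function, and the two sample semigroups (a two-element semilattice and the five-element Brandt semigroup) are assumptions made here and do not come from the paper.

```python
def idempotents(elems, mul):
    """Return the idempotents of a finite semigroup."""
    return [e for e in elems if mul(e, e) == e]


def omega(x, mul):
    """Return the idempotent power x^omega of x."""
    p = x
    while mul(p, p) != p:
        p = mul(p, x)
    return p


def prod(mul, first, *rest):
    """Multiply a nonempty sequence of elements from left to right."""
    for y in rest:
        first = mul(first, y)
    return first


def in_lcom(elems, mul):
    """Check the pseudoidentity e x e y e = e y e x e."""
    return all(
        prod(mul, e, x, e, y, e) == prod(mul, e, y, e, x, e)
        for e in idempotents(elems, mul) for x in elems for y in elems
    )


def in_u(elems, mul):
    """Check the LCom condition together with (ef)^omega e x f = e x f."""
    idem = idempotents(elems, mul)
    return in_lcom(elems, mul) and all(
        prod(mul, omega(mul(e, f), mul), e, x, f) == prod(mul, e, x, f)
        for e in idem for f in idem for x in elems
    )


# Hypothetical example 1: the two-element semilattice {0, 1} under min.
semilattice = [0, 1]

# Hypothetical example 2: the Brandt semigroup on 2x2 matrix units (i, j),
# with the zero element encoded as None.
brandt = [None] + [(i, j) for i in (0, 1) for j in (0, 1)]


def brandt_mul(x, y):
    if x is None or y is None or x[1] != y[0]:
        return None
    return (x[0], y[1])


print(in_u(semilattice, min))       # both conditions hold
print(in_lcom(brandt, brandt_mul))  # every local monoid is commutative
print(in_u(brandt, brandt_mul))     # second condition fails for e=(0,0), f=(1,1), x=(0,1)
```

In the Brandt example the product $e f$ is the zero, whereas $e x f = x$ is not, so the additional pseudoidentity fails although the semigroup is locally commutative.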

Furthermore, Almeida considered in [1] and in [2, Sec. 11.9] the power pseudovariety $P U$ . Recall that $P U$ is the pseudovariety generated by the class of all power semigroups $𝒫 (S)$ , where S are arbitrary semigroups from the pseudovariety $U$ . Having this pseudovariety in view, Almeida has shown in [1] and in [2, Sec. 11.9] that the pseudovariety $C o m * D$ provides an upper bound for the power pseudovariety $P U$ . That is, he has established the inclusion $P U \subseteq C o m * D$ , and he has done so by verifying directly that the power semigroups $𝒫 (S)$ , for all semigroups S in the pseudovariety $U$ , do satisfy the above-mentioned pseudoidentity $e x f y e z f ≏ e z f y e x f$ . He has also shown, in addition, that, for any pseudovariety $V$ of semigroups, the condition $P V \subseteq C o m * D$ is equivalent to the condition $V \subseteq U$ . Thus $U$ is the largest pseudovariety for which the inclusion $P U \subseteq C o m * D$ holds. However, the question whether the equality $P U = C o m * D$ actually holds here has been left open in [2, Sec. 11.9].

The purpose of this paper is to establish the missing inclusion $C o m * D \subseteq P U$ . This is the inclusion that is opposite relative to the one discussed above. In this way, it eventually emerges that, in reality, the equality $P U = C o m * D$ does hold. Therefore the pseudovariety $C o m * D$ is indeed a power pseudovariety. It is equal to the pseudovariety $P U$ , where $U$ is the pseudovariety introduced above.

In order to achieve the appointed goal, we shall take advantage of the tools furnished by formal language theory. It is a celebrated result in the theory of varieties of recognizable languages attributed to Eilenberg that there are mutually inverse order preserving one-to-one correspondences between the lattice of all pseudovarieties of semigroups and the lattice of all varieties of languages. See [3, Sec. VII.3] for Eilenberg’s original treatment of this subject. Having this in view, we shall let $𝒱$ be the variety of languages corresponding to the pseudovariety $C o m * D$ , and we shall let $𝒲$ be the variety of languages corresponding to the pseudovariety $P U$ . Then, in our efforts to establish the inclusion $C o m * D \subseteq P U$ , we shall be done if we can show that we have $𝒱 \subseteq 𝒲$ . This is exactly what we shall strive for in this paper.

The other tools from formal language theory that we shall make use of include Straubing’s wreath product principle, expounded in detail by Pin and Weil in [7], which makes it possible to underpin the languages appertaining to the variety $𝒱$ corresponding to the pseudovariety $C o m * D$ , and Straubing’s characterization of languages recognized by power semigroups, obtained in [9], which allows us to handle the languages belonging to the variety $𝒲$ corresponding to the power pseudovariety $P U$ .

2. Preliminaries

All semigroups appearing in this paper, with the exception of free semigroups, will be finite. For every semigroup S, we shall denote by $E (S)$ the set of all idempotents in S. We shall denote by $S^{1}$ the monoid which is equal to S provided that S already is a monoid, and which is obtained from S by adjoining an identity to it otherwise.

By a pseudovariety of semigroups we shall mean any class of finite semigroups which is closed under the formation of direct products of finite families of semigroups and under the formation of subsemigroups and homomorphic images of semigroups. Such pseudovarieties can be determined by collections of pseudoidentities, as has been the case in several instances in the previous section. We shall not need the concept of pseudoidentity in its full generality, and so we shall not embark on this subject here. We only mention that, in such a pseudoidentity $u ≏ v$, apart from letters like $x, y, z, \dots$ which are reserved for variables that may be substituted by arbitrary elements from a semigroup S in which the pseudoidentity $u ≏ v$ is evaluated, also letters like $e, f, g, \dots$ may appear which are reserved for constituents that may be substituted only by idempotents from $E (S)$.

We have met already in the previous section the pseudovariety $D$ of all definite semigroups. This pseudovariety consists of all finite semigroups S having the property that every idempotent in $E (S)$ acts as a right zero in S. Thus $D$ is the pseudovariety determined by the pseudoidentity $x e ≏ e$ . For every positive integer m, let $D_{m}$ be the pseudovariety determined by the pseudoidentity $x y_{1} y_{2} \dots y_{m} ≏ y_{1} y_{2} \dots y_{m}$ . Then it turns out that $D = ⋃_{m = 1}^{\infty} D_{m}$ . Later on, we shall also need the pseudovariety $K$ which is dual with respect to the pseudovariety $D$ . That is, $K$ consists of all finite semigroups S having the property that every idempotent in $E (S)$ acts as a left zero in S. Thus $K$ is the pseudovariety determined by the pseudoidentity $e x ≏ e$ . For every positive integer m, let $K_{m}$ be the pseudovariety determined by the pseudoidentity $x_{1} x_{2} \dots x_{m} y ≏ x_{1} x_{2} \dots x_{m}$ . Then it turns out that $K = ⋃_{m = 1}^{\infty} K_{m}$ .
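The defining conditions of $D$, $K$ and $D_{m}$ are easy to test on a concrete finite semigroup. The sketch below is a hedged illustration, not part of the paper: the helper names and the sample semigroup (nonempty words of length at most 2 over $\{a, b\}$, multiplied by concatenating and keeping the last two letters) are assumptions chosen for this example.

```python
from functools import reduce
from itertools import product


def idempotents(elems, mul):
    """Return the idempotents of a finite semigroup."""
    return [e for e in elems if mul(e, e) == e]


def is_definite(elems, mul):
    """Check x e = e for every element x and every idempotent e."""
    return all(mul(x, e) == e for e in idempotents(elems, mul) for x in elems)


def is_reverse_definite(elems, mul):
    """Check the dual condition e x = e."""
    return all(mul(e, x) == e for e in idempotents(elems, mul) for x in elems)


def in_d_m(elems, mul, m):
    """Check x y_1 ... y_m = y_1 ... y_m for all substitutions."""
    return all(
        mul(x, reduce(mul, ys)) == reduce(mul, ys)
        for x in elems for ys in product(elems, repeat=m)
    )


# Hypothetical example: words of length 1 or 2 over {a, b}.
words = ["a", "b", "aa", "ab", "ba", "bb"]


def suffix_mul(u, v):
    """Concatenate and keep the last two letters."""
    return (u + v)[-2:]


def prefix_mul(u, v):
    """Concatenate and keep the first two letters (the dual operation)."""
    return (u + v)[:2]


print(is_definite(words, suffix_mul))          # idempotents are right zeros
print(in_d_m(words, suffix_mul, 2))            # the D_2 pseudoidentity holds
print(in_d_m(words, suffix_mul, 1))            # the D_1 pseudoidentity fails
print(is_reverse_definite(words, prefix_mul))  # the dual semigroup lies in K
```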

We shall have to deal in this paper with semidirect products of pseudovarieties of semigroups. To start with, we shall have to work with semidirect products of individual semigroups of the form $S * T$ , where S and T are arbitrary semigroups. We shall observe here the custom introduced by Almeida at the beginning of [2, Sec. 10.1], according to which, even if T is viewed as a semigroup, if T happens coincidentally to be a monoid, then the underlying left action of T on S giving rise to the semidirect product $S * T$ should be left unitary. With that, if $V$ and $W$ are any pseudovarieties of semigroups, then the semidirect product $V * W$ of these two pseudovarieties is defined to be the pseudovariety of semigroups generated by the class of all semidirect products of the form $S * T$ , where S is a semigroup from $V$ and T is a semigroup from $W$ , observing the custom regarding the underlying left action of T on S mentioned above. It is easy to verify that this semidirect product $V * W$ consists, in fact, of all divisors of the semidirect products $S * T$ just described.

Aside from ordinary semigroups, we shall have to deal also with transformation semigroups. To begin with, let P be a finite set. We shall denote by $𝒯_{P}$ the monoid of all transformations of the set P. Note that we shall let the transformations from $𝒯_{P}$ act on the elements of the set P on the right. That is, given an element $p \in P$ and a transformation $τ$ of the set P, we shall denote by $p τ$ the result of the application of $τ$ on p. Therefore the composition of transformations from $𝒯_{P}$ should be interpreted accordingly, that is, it has to be read from the left to the right. Having this at hand, by a transformation semigroup we shall mean any pair $X = (P, S)$ where P is a finite set and S is a subsemigroup of the monoid $𝒯_{P}$ of all transformations of the set P. The set P will be called the underlying set of X and the semigroup S will be called the action semigroup of X.

In accordance with [3, Sec. I.10] or [2, Sec. 10.1], we introduce the operation of wreath product of transformation semigroups in the following manner. Let $X = (P, S)$ and $Y = (Q, T)$ be transformation semigroups. The wreath product $X \circ Y$ of these transformation semigroups is the transformation semigroup having for its underlying set the cartesian product $P \times Q$ and having for its action semigroup the semidirect product $S^{Q} * T$ of the semigroup $S^{Q}$ of all functions of Q into S by the semigroup T determined by the left action of T on $S^{Q}$ given by the following formula. For every transformation $τ \in T$ and for every function $f : Q \to S$ , the function $^{τ} f : Q \to S$ is defined by the rule $q (^{τ} f) = (q τ) f$ , for every $q \in Q$ , where functions are written on the right of their arguments. In order to keep the promise that $X \circ Y$ so determined should indeed be a transformation semigroup, it remains to observe that the action semigroup $S^{Q} * T$ may be viewed as a subsemigroup of the monoid $𝒯_{P \times Q}$ in the following way. For every function $f : Q \to S$ and for every transformation $τ \in T$ , the element $(f, τ)$ of $S^{Q} * T$ determines a transformation $ϱ_{(f, τ)}$ of the set $P \times Q$ sending every pair $(p, q)$ from $P \times Q$ to the pair $(p (q f), q τ)$ . Then the assignment $(f, τ) \mapsto ϱ_{(f, τ)}$ is an embedding of $S^{Q} * T$ into $𝒯_{P \times Q}$ . Therefore the semigroup $S^{Q} * T$ may indeed be treated as a subsemigroup of the monoid $𝒯_{P \times Q}$ . In this way, $X \circ Y$ becomes again a transformation semigroup.
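The formulas of this construction can be traced on small data. The following Python sketch is an illustration under stated assumptions: transformations of a set $\{0, \dots, n - 1\}$ are encoded as tuples acting on the right, a function $f : Q \to S$ is encoded as a tuple indexed by Q, and the concrete elements `x`, `y` are hypothetical. It checks that the assignment $(f, τ) \mapsto ϱ_{(f, τ)}$ respects products on these sample elements.

```python
from itertools import product


def compose(s, t):
    """Composition read from left to right: first s, then t."""
    return tuple(t[i] for i in s)


def act(tau, g):
    """Left action of tau on g in S^Q, given by q(^tau g) = (q tau) g."""
    return tuple(g[tau[q]] for q in range(len(tau)))


def wreath_mul(x, y):
    """Product (f, tau)(g, ups) = (f . ^tau g, tau ups) in S^Q * T."""
    f, tau = x
    g, ups = y
    shifted = act(tau, g)
    h = tuple(compose(f[q], shifted[q]) for q in range(len(tau)))
    return (h, compose(tau, ups))


def rho(x, pair):
    """The transformation of P x Q sending (p, q) to (p(qf), q tau)."""
    f, tau = x
    p, q = pair
    return (f[q][p], tau[q])


# Hypothetical data with P = {0, 1} and Q = {0, 1, 2}.
x = (((1, 0), (0, 0), (0, 1)), (1, 2, 2))
y = (((1, 1), (0, 1), (1, 0)), (2, 0, 1))

xy = wreath_mul(x, y)
agree = all(
    rho(xy, pq) == rho(y, rho(x, pq))
    for pq in product(range(2), range(3))
)
print(agree)
```

Applying first $ϱ_{x}$ and then $ϱ_{y}$ to a pair gives the same result as applying $ϱ_{x y}$, which is the homomorphism property on this sample.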

However, the operation of wreath product $X \circ Y$ of transformation semigroups $X = (P, S)$ and $Y = (Q, T)$ so introduced has a certain defect. It appears when the semigroup T is a monoid, but the transformation representing the identity of T is not equal to the identity on the set Q. In such a case, the left action of T on $S^{Q}$ determined above need not be left unitary. And this is in conflict with our previous requirements regarding the underlying left action of the semigroup T on $S^{Q}$ , if the semidirect product $S^{Q} * T$ has to be properly defined. Having this circumstance in view, we have introduced in [4] the concept of proper transformation semigroups. A transformation semigroup $X = (P, S)$ is said to be proper if either S is not a monoid, or S is a monoid and the identity of S acts as the identity on the set P. If we restrict the scope of the above definition of wreath products only to proper transformation semigroups, then the difficulty that we have encountered above disappears. And it is not too hard to verify that the wreath product $X \circ Y$ of two proper transformation semigroups X and Y is a proper transformation semigroup again.

Most frequent examples of proper transformation semigroups are the right regular representations of ordinary semigroups. For any semigroup S and for every element $s \in S$ , we may consider the transformation $ρ_{s} : S^{1} \to S^{1}$ which maps every element $t \in S^{1}$ to the element $t s$ . Then the mapping $S \to 𝒯_{S^{1}}$ given by the formula $s \mapsto ρ_{s}$ is an embedding of the semigroup S into $𝒯_{S^{1}}$ . In this way, a proper transformation semigroup $(S^{1}, S)$ arises, which is called the right regular representation of S. Thus, for any semigroups S and T, we may consider the wreath product $(S^{1}, S) \circ (T^{1}, T)$ of the right regular representations of these semigroups. This wreath product is again a proper transformation semigroup and its action semigroup $S^{T^{1}} * T$ is just the standard wreath product $S \circ T$ of the semigroups S and T. It is a familiar fact that, for any semigroups S and T and for every left action of T on S, where the custom regarding the left actions that has been introduced above is observed, the semidirect product $S * T$ determined by this left action can be embedded into the wreath product $S \circ T$ .

As a consequence, we see that, for any two pseudovarieties of semigroups $V$ and $W$ , the semidirect product $V * W$ of these pseudovarieties is equal to the pseudovariety generated by the class of all wreath products of the form $S \circ T$ , where S is a semigroup in $V$ and T is a semigroup in $W$ . In fact, this semidirect product $V * W$ can be obtained as the class of all divisors of the wreath products $S \circ T$ just described. Finally note that, on the other hand, the semidirect product $V * W$ contains all action semigroups $S^{Q} * T$ of the wreath products $X \circ Y$ of arbitrary proper transformation semigroups $X = (P, S)$ and $Y = (Q, T)$ , where the semigroup S belongs to $V$ and the semigroup T belongs to $W$ .

We close this section with a few remarks on power semigroups and power pseudovarieties. For any semigroup S, we denote by $𝒫 (S)$ the set of all subsets of S equipped with the usual multiplication. In this way, $𝒫 (S)$ becomes a semigroup, called the power semigroup of S. For any pseudovariety $V$ of semigroups, we let $P V$ be the pseudovariety generated by all semigroups of the form $𝒫 (S)$ , where S stands for arbitrary semigroups from $V$ . Note that, if $V$ contains a non-trivial monoid, then $P V$ can be actually obtained as the class of all semigroups T that divide some semigroup of the form $𝒫 (S)$ , where S is any semigroup in $V$ . The pseudovariety $P V$ arising in this way from $V$ is called the power pseudovariety of $V$ .
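As a small illustration of the power semigroup construction, the sketch below builds $𝒫 (S)$ for a three-element semigroup and multiplies subsets elementwise. The helper names and the choice of S (the cyclic group of order 3, written additively) are assumptions made for this example only.

```python
from itertools import combinations


def subsets(elems):
    """All subsets of a finite set, including the empty one, as frozensets."""
    elems = list(elems)
    return [
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    ]


def subset_mul(mul):
    """Lift a multiplication on S to the usual multiplication of subsets."""
    def product_of_subsets(X, Y):
        return frozenset(mul(x, y) for x in X for y in Y)
    return product_of_subsets


# Hypothetical example: S = {0, 1, 2} with addition modulo 3.
S = [0, 1, 2]


def add_mod3(x, y):
    return (x + y) % 3


PS = subsets(S)
pmul = subset_mul(add_mod3)

print(len(PS))
print(sorted(pmul(frozenset({0, 1}), frozenset({1, 2}))))
```

The empty set acts as a zero of $𝒫 (S)$ in this encoding, since a product with no factors on one side has no elements.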

3. Formal Language Theory

Let A be a finite set. Within the context of the theory of formal languages, we shall say that A is an alphabet. Elements of this set A will be called letters. By $A^{*}$ we shall denote the free monoid on the set A, and by $A^{+}$ we shall denote the free semigroup on the set A. Elements of $A^{*}$ will be called words in the alphabet A. By 1 we shall denote the empty word, that is, the identity of the monoid $A^{*}$ . By a language over the alphabet A we shall mean any set of words in the alphabet A, that is, any subset L of the free monoid $A^{*}$ . Later on, however, we shall have to deal with languages L that will be subsets of the free semigroup $A^{+}$ , that is, such languages L will not contain the empty word 1.

By an automaton we shall mean any triplet $𝒜 = (Q, A, \cdot)$ where Q is a finite set of states, A is an alphabet and ⋅ is a function $Q \times A \to Q$ . One can extend this function ⋅ to a function $Q \times A^{*} \to Q$ , also denoted by ⋅, by putting successively $q \cdot 1 = q$ and $q \cdot (w a) = (q \cdot w) \cdot a$ , for all $q \in Q$ , $w \in A^{*}$ and $a \in A$ .

Let $L \subseteq A^{*}$ be a language and let $𝒜 = (Q, A, \cdot)$ be an automaton. We shall say that the language L is recognized by the automaton $𝒜$ if there exist a state $q_{0} \in Q$, called the initial state, and a subset $F \subseteq Q$ of states, called the set of terminal states, such that $L = \{w \in A^{*} : q_{0} \cdot w \in F\}$. Languages that are recognized by some automaton are called recognizable languages.
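The extension of the transition function to words and the resulting notion of recognition can be sketched in a few lines of Python. The dictionary encoding of the transition function and the sample automaton, whose state records the parity of the number of occurrences of the letter a, are assumptions made for illustration.

```python
def run(delta, q, word):
    """Extended transition function: q . 1 = q and q . (wa) = (q . w) . a."""
    for a in word:
        q = delta[(q, a)]
    return q


def accepts(delta, q0, final, word):
    """Decide whether q0 . word lies in the set of terminal states."""
    return run(delta, q0, word) in final


# Hypothetical automaton over A = {a, b} with states 0 and 1.
delta = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}

print(accepts(delta, 0, {0}, "abab"))
print(accepts(delta, 0, {0}, "ab"))
```

With initial state 0 and terminal states $\{0\}$, this automaton recognizes the language of all words with an even number of occurrences of a, including the empty word.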

If we have to deal only with languages $L \subseteq A^{+}$, then a slightly different notion of recognizability of such languages by means of automata suggests itself. Let again $𝒜 = (Q, A, \cdot)$ be an automaton. Then we may say that the language L is recognized by the automaton $𝒜$ if there exist an initial state $q_{0} \in Q$ and a subset $F \subseteq Q$ of terminal states such that $L = \{w \in A^{+} : q_{0} \cdot w \in F\}$. But if a language $L \subseteq A^{+}$ is recognized by an automaton $𝒜$ in this latter sense, then it can also be recognized by an automaton $\bar{𝒜}$ in the former sense, where the automaton $\bar{𝒜}$ is only slightly adapted from $𝒜$. For languages $L \subseteq A^{+}$, we shall usually have in view just this latter altered notion of recognizability.

One can consider quite a lot of operations on the languages over the given alphabet A. In the first place these are the standard Boolean operations, namely the union $K \cup L$ and the intersection $K \cap L$ of any two languages $K, L \subseteq A^{*}$, and the complement $A^{*} \setminus L$ of a language $L \subseteq A^{*}$. Further operations that one may consider are the product $K L$ of two languages $K, L \subseteq A^{*}$ and the star $L^{*}$ of a language $L \subseteq A^{*}$. The product $K L$ is defined to be the language $K L = \{u w \in A^{*} : u \in K, w \in L\}$. The star $L^{*}$ is defined to be the submonoid of $A^{*}$ generated by the language L.

We proceed to introduce another family of languages, namely the so-called rational languages. The class of all rational languages over the alphabet A is defined to be the smallest subclass of the class of all languages over the alphabet A which contains all finite languages and is closed under the formation of unions of pairs of languages, of products of pairs of languages and of stars of languages. Now the classical Kleene’s theorem asserts that a language over an alphabet A is recognizable if and only if it is rational. See [5, Sec. 6.3], for example.

Given an automaton $𝒜 = (Q, A, \cdot)$ , every word $w \in A^{*}$ determines a transformation $ζ_{w}$ of the set Q given by the rule $q \mapsto q \cdot w$ , for every $q \in Q$ . The transformations $ζ_{w}$ of the set Q, for all words $w \in A^{*}$ , form a submonoid of the monoid $𝒯_{Q}$ of all transformations of the set Q. This submonoid, which we shall denote by $M (𝒜)$ , is called the transition monoid of the automaton $𝒜$ . Note that this monoid $M (𝒜)$ is generated by the set of all transformations of the set Q of the form $ζ_{a}$ , for arbitrary elements $a \in A$ . If we take in the above considerations only the transformations $ζ_{w}$ of the set Q such that $w \in A^{+}$ , then these transformations will form a subsemigroup of the monoid $𝒯_{Q}$ . This subsemigroup, which we shall denote by $S (𝒜)$ , is called the transition semigroup of the automaton $𝒜$ . Note also that then the pair $(Q, S (𝒜))$ forms a transformation semigroup. Later on, by the transition semigroup of the automaton $𝒜$ we shall often mean just this transformation semigroup.
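The transition semigroup $S (𝒜)$ can be computed by closing the generators $ζ_{a}$ under composition on the right. The sketch below assumes a dictionary encoding of the transition function and encodes each transformation of Q as a tuple of state indices; the function names and the two-state reset automaton are hypothetical.

```python
def transition_semigroup(states, alphabet, delta):
    """Return the set of all transformations zeta_w with w a nonempty word."""
    states = list(states)
    index = {q: i for i, q in enumerate(states)}
    # One generator zeta_a per letter, stored as a tuple of state indices.
    gens = [tuple(index[delta[(q, a)]] for q in states) for a in alphabet]
    elems = set(gens)
    frontier = list(elems)
    while frontier:
        s = frontier.pop()
        for g in gens:
            t = tuple(g[i] for i in s)  # zeta_w followed by zeta_a
            if t not in elems:
                elems.add(t)
                frontier.append(t)
    return elems


def transition_monoid(states, alphabet, delta):
    """Add the transformation of the empty word, which is the identity on Q."""
    identity = tuple(range(len(list(states))))
    return transition_semigroup(states, alphabet, delta) | {identity}


# Hypothetical automaton: the letter a resets to state 1, the letter b to state 0.
states = [0, 1]
delta = {(0, "a"): 1, (1, "a"): 1, (0, "b"): 0, (1, "b"): 0}

print(sorted(transition_semigroup(states, "ab", delta)))
print(sorted(transition_monoid(states, "ab", delta)))
```

For this automaton the semigroup $S (𝒜)$ consists of the two constant maps only, while the monoid $M (𝒜)$ contains the identity in addition, so the two notions differ here.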

From now on, we shall confine ourselves to languages $L \subseteq A^{+}$ . So let $L \subseteq A^{+}$ be such a language, let S be a finite semigroup, and let $φ : A^{+} \to S$ be a homomorphism of semigroups. We shall say that the language L is recognized by the homomorphism $φ$ if there exists a subset $P \subseteq S$ such that $L = φ^{- 1} (P)$ . If this is the case, then we shall also say that the language L is recognized by the semigroup S. It turns out that a language $L \subseteq A^{+}$ is recognized by a semigroup in this way if and only if L is recognized by an automaton, as described above. Indeed, if the language $L \subseteq A^{+}$ is recognized by a semigroup S and if $φ : A^{+} \to S$ is the underlying homomorphism of semigroups, then L is recognized by the automaton $𝒜_{S} = (S^{1}, A, \cdot)$ , where the function ⋅ is determined, for any $s \in S^{1}$ and any $a \in A$ , by the formula $s \cdot a = s φ (a)$ . On the other hand, if the language $L \subseteq A^{+}$ is recognized by an automaton $𝒜 = (Q, A, \cdot)$ , then L is recognized by the transition semigroup $S (𝒜)$ of this automaton.

The possibility to treat in this manner the notion of recognizability of languages over an alphabet A in terms of homomorphisms of the free semigroup $A^{+}$ into finite semigroups makes it possible to verify easily the following statements. If $L \subseteq A^{+}$ is a recognizable language, then its complement $A^{+} \setminus L$ is also a recognizable language. If $K, L \subseteq A^{+}$ are recognizable languages, then their union $K \cup L$ and their intersection $K \cap L$ are also recognizable languages.

It is also worth noticing that if $L \subseteq A^{+}$ is a language that can be recognized by a semigroup S and if T is a semigroup such that S divides T, then L can also be recognized by T.

Let again $L \subseteq A^{+}$ be a language. We shall say that this language L is recognized by a transformation semigroup $X = (Q, S)$ if there exist a homomorphism of semigroups $φ : A^{+} \to S$, a state $q_{0} \in Q$, called the initial state, and a subset $F \subseteq Q$ of states, called the set of terminal states, such that $L = \{w \in A^{+} : q_{0} φ (w) \in F\}$. As before, it turns out that a language $L \subseteq A^{+}$ is recognized by a transformation semigroup in the way just specified if and only if L is recognized by an automaton in the way described above. Namely, if the language $L \subseteq A^{+}$ is recognized by a transformation semigroup $X = (Q, S)$ and if $φ : A^{+} \to S$ is the underlying homomorphism of semigroups, then L is recognized by the automaton $𝒜_{X} = (Q, A, \cdot)$, where the function ⋅ is determined, for any $q \in Q$ and any $a \in A$, by the formula $q \cdot a = q φ (a)$. On the other hand, if the language $L \subseteq A^{+}$ is recognized by an automaton $𝒜 = (Q, A, \cdot)$, then L is recognized by the transition semigroup $(Q, S (𝒜))$ of this automaton.

Also the following notes on the various notions of recognizability will come in handy. If a language $L \subseteq A^{+}$ is recognized by a transformation semigroup $X = (Q, S)$ , then L is recognized by the transformation semigroup $(S^{1}, S)$ . The latter semigroup, which is the right regular representation of the semigroup S, is itself a proper transformation semigroup. It is also easy to see that a language $L \subseteq A^{+}$ is recognized by a semigroup S if and only if L is recognized by its right regular representation $(S^{1}, S)$ .

If $X = (Q, S)$ is a transformation semigroup and if $L \subseteq A^{+}$ is a language that can be recognized by the semigroup S, then L is a Boolean combination of languages over the alphabet A that can be recognized by X. Indeed, if $φ : A^{+} \to S$ is a homomorphism and if $P \subseteq S$ is a subset such that $L = φ^{- 1} (P)$, then it turns out that $L = ⋃_{τ \in P} ⋂_{q \in Q} \{w \in A^{+} : q φ (w) = q τ\}$. The latter languages are clearly recognized by X.

There are plenty of other operations on the languages over various alphabets A. For example, given a language $L \subseteq A^{+}$ and a word $u \in A^{+}$, one can define the left quotient $u^{- 1} L$ and the right quotient $L u^{- 1}$ of the language L relative to the word u as the languages $u^{- 1} L = \{w \in A^{+} : u w \in L\}$ and $L u^{- 1} = \{w \in A^{+} : w u \in L\}$. As previously, if $L \subseteq A^{+}$ is a recognizable language and $u \in A^{+}$ is any word, then the left quotient $u^{- 1} L$ and the right quotient $L u^{- 1}$ are recognizable languages, as well.

Another operation on languages over distinct alphabets A and B is that of taking inverse images of languages over B with respect to various homomorphisms of free semigroups $A^{+} \to B^{+}$ . Thus let $L \subseteq B^{+}$ be a language and let $ψ : A^{+} \to B^{+}$ be a homomorphism of free semigroups. Then one may consider the inverse image $ψ^{- 1} (L)$ of the language L with respect to $ψ$ . Then, of course, $ψ^{- 1} (L) \subseteq A^{+}$ . And once again, if L is a recognizable language over B, then $ψ^{- 1} (L)$ turns out to be a recognizable language over A.

Let $L \subseteq A^{+}$ be any language. We shall say that a congruence $\approx$ on the free semigroup $A^{+}$ saturates the language L if L is a union of certain classes of the partition $A^{+} / \approx$ . There exists the coarsest congruence $\sim_{L}$ on $A^{+}$ which saturates the language L. We will call this congruence $\sim_{L}$ the syntactic congruence of the language L. It is given, for all words $u, v \in A^{+}$ , by the formula

$$u \sim_{L} v \quad \text{if and only if} \quad \text{for all } p, q \in A^{*} : \; p u q \in L \Leftrightarrow p v q \in L .$$

In this way, the quotient semigroup $A^{+} / \sim_{L}$ arises. This quotient semigroup $A^{+} / \sim_{L}$ will be called the syntactic semigroup of the language L. In general, this semigroup $A^{+} / \sim_{L}$ need not be finite. However, it turns out that the language $L \subseteq A^{+}$ is recognizable if and only if its syntactic semigroup $A^{+} / \sim_{L}$ is finite. If this is the case, then the language L is recognized by its syntactic semigroup $A^{+} / \sim_{L}$. Moreover, whenever S is any semigroup such that L is recognized by S, then the syntactic semigroup $A^{+} / \sim_{L}$ divides S.
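The defining condition of the syntactic congruence quantifies over all contexts $p, q \in A^{*}$, so it cannot be tested exhaustively. The following Python sketch searches only contexts up to a chosen length; it can therefore exhibit a context separating two words, but a negative answer does not prove that the words are congruent. The function names, the length bound and the sample language are assumptions made for illustration.

```python
from itertools import product


def words_up_to(alphabet, n):
    """All words of length at most n over the alphabet, including the empty word."""
    return [
        "".join(w)
        for k in range(n + 1)
        for w in product(alphabet, repeat=k)
    ]


def separating_context(in_language, alphabet, u, v, n):
    """Return a pair (p, q) with |p|, |q| <= n such that exactly one of
    p u q and p v q lies in the language, or None if no such pair exists."""
    contexts = words_up_to(alphabet, n)
    for p in contexts:
        for q in contexts:
            if in_language(p + u + q) != in_language(p + v + q):
                return (p, q)
    return None


# Hypothetical language: nonempty words with an even number of letters a.
def even_a(w):
    return len(w) > 0 and w.count("a") % 2 == 0


print(separating_context(even_a, "ab", "a", "b", 2))
print(separating_context(even_a, "ab", "a", "aaa", 2))
```

For the sample language, the words a and b are already separated by the empty context, whereas no context of the tested lengths separates a from aaa.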

Now we are ready to recall Eilenberg’s celebrated theorem relating pseudovarieties of finite semigroups and varieties of recognizable languages to each other. Recall once again that we confine ourselves here to recognizable languages $L \subseteq A^{+}$, for various alphabets A. Within this framework, this theorem was already presented by Eilenberg in [3, Sec. VII.3]. Another exposition of this result has been provided by Pin in [6, Sec. 2.2]. The other version of this theorem, involving pseudovarieties of finite monoids and recognizable languages $L \subseteq A^{*}$, has been furnished, aside from [3] and [6], also by Lallement in [5, Sec. 6.5].

We begin with yet another definition. By a class of recognizable languages we shall mean any mapping $𝒞$ assigning to every alphabet A a set $A^{+} 𝒞$ of recognizable languages over A. (All of these languages are assumed to be subsets of $A^{+}$ .) If $𝒞$ and $𝒟$ are two classes of recognizable languages, then we shall write $𝒞 \subseteq 𝒟$ if, for every alphabet A, we have the inclusion $A^{+} 𝒞 \subseteq A^{+} 𝒟$ . In this way, a partial order on the family of all classes of recognizable languages is introduced.

By a variety of languages we shall mean any class $𝒱$ of recognizable languages such that the following conditions are satisfied:

(1)	For every alphabet A, the set $A^{+} 𝒱$ is a Boolean algebra, that is, $A^{+} 𝒱$ is closed with respect to the operation of set theoretical complement of languages (relative to the base set $A^{+}$ ) and with respect to the operations of set theoretical union and intersection of pairs of languages.
(2)	For every alphabet A, for every language L in $A^{+} 𝒱$ and for every element $a \in A$ , also the languages $a^{- 1} L$ and $L a^{- 1}$ belong to $A^{+} 𝒱$ .
(3)	Whenever A and B are alphabets and $ψ : A^{+} \to B^{+}$ is a homomorphism of free semigroups, then the preimage $ψ^{- 1} (L)$ of any language L from $B^{+} 𝒱$ belongs to $A^{+} 𝒱$ .

Let now $V$ be any pseudovariety of semigroups. We assign to $V$ a class of recognizable languages $𝒱$ in the following way. For every alphabet A, we let $A^{+} 𝒱$ be the set of all languages $L \subseteq A^{+}$ having the property that the syntactic semigroup $A^{+} / \sim_{L}$ belongs to $V$ . Then $𝒱$ is a variety of languages.

We may also proceed the other way round. Let $𝒱$ be any variety of languages. We assign to $𝒱$ a pseudovariety $V$ of semigroups as follows. We let $V$ be the pseudovariety generated by all syntactic semigroups of the form $A^{+} / \sim_{L}$ , where A is any alphabet and L is any language from $A^{+} 𝒱$ .

We may now formulate Eilenberg’s theorem announced above.

Theorem 3.1. The assignments $V \mapsto 𝒱$ and $𝒱 \mapsto V$ described above determine mutually inverse order preserving one-to-one correspondences between the lattice of all pseudovarieties of semigroups and the partially ordered family of all varieties of languages.

Of course, it hence ensues immediately that also the partially ordered family of all varieties of languages forms a lattice.

In the continuation of this section, we proceed to explain the aforementioned wreath product principle of Straubing, which makes it possible to handle languages recognized by semigroups that belong to the semidirect products of pairs of pseudovarieties of semigroups. The paper [8] of Straubing is usually quoted as the original source of these ideas. But it is not straightforward to identify this principle in that paper. Our exposition in this paper will closely follow that of Pin and Weil in [7, Sec. 3], where they have drawn it up for ordered semigroups. The unordered version of these results follows from it as a special case. Besides, in the course of the exposition that we offer here, we shall try to fix some shortcomings that have appeared in [7].

By a sequential transducer we shall mean a machine which can be formally described as any sextuplet $𝒯 = (Q, A, B, q_{0}, \cdot, *)$ , where Q is a finite set of states, A is an alphabet called the input alphabet, B is another alphabet called the output alphabet, $q_{0} \in Q$ is the initial state, ⋅ is a function $Q \times A \to Q$ called the transition function, and ∗ is a function $Q \times A \to B^{*}$ called the output function. Thus the triplet $𝒜 = (Q, A, \cdot)$ is an automaton, and hence the transition function ⋅ can be extended to a function $Q \times A^{*} \to Q$ , also denoted by ⋅, in the manner described previously. In addition, the output function ∗ can be extended to a function $Q \times A^{*} \to B^{*}$ , also denoted by ∗, by putting successively $q * 1 = 1$ and $q * (w a) = (q * w) ((q \cdot w) * a)$ , for all $q \in Q$ , $w \in A^{*}$ and $a \in A$ . The function realized by the given sequential transducer $𝒯$ is the function $ς : A^{*} \to B^{*}$ defined by the formula $ς (w) = q_{0} * w$ , for all words $w \in A^{*}$ . Finally, given two alphabets A and B, by a sequential function we shall mean any function $σ : A^{*} \to B^{*}$ which can be realized by a sequential transducer of the form $𝒯 = (Q, A, B, q_{0}, \cdot, *)$ . Having all that at hand, and arguing in quite the same way as in [7, Sec. 3.1], we arrive at the following result.

Theorem 3.2. Let A and B be any alphabets and let $σ : A^{*} \to B^{*}$ be a sequential function realized by a sequential transducer $𝒯 = (Q, A, B, q_{0}, \cdot, *)$ . Let $Y = (Q, T)$ be the transition semigroup of the underlying automaton $𝒜 = (Q, A, \cdot)$ . Let $Z = (Q, U)$ be any proper transformation semigroup such that $T \subseteq U$ . If $L \subseteq B^{+}$ is a language recognized by a proper transformation semigroup $X = (P, S),$ then the language $σ^{- 1} (L) \subseteq A^{+}$ is recognized by the wreath product $X \circ Z$ of the transformation semigroups X and Z.

Let A be an alphabet, let T be a finite semigroup, and let $φ : A^{+} \to T$ be a homomorphism of semigroups. We shall associate with this homomorphism $φ$ a sequential transducer $𝒯_{φ}$ which we shall construct in the following way. We set $B_{T} = T^{1} \times A$ . Then we put $𝒯_{φ} = (T^{1}, A, B_{T}, 1, \cdot, *)$ , where the transition function $\cdot : T^{1} \times A \to T^{1}$ is given, for each $t \in T^{1}$ and $a \in A$ , by the formula $t \cdot a = t φ (a)$ , and the output function $* : T^{1} \times A \to B_{T}^{*}$ is given, again for each $t \in T^{1}$ and $a \in A$ , by the formula $t * a = (t, a)$ . The sequential function $σ_{φ} : A^{*} \to B_{T}^{*}$ realized by this sequential transducer $𝒯_{φ}$ will be called the sequential function associated with the homomorphism $φ$ . Then it is easy to verify that, for any positive integer h and for arbitrary elements $a_{1}, a_{2}, \dots, a_{h} \in A$ , the equality

σ_{φ} (a_{1} a_{2} \dots a_{h}) = (1, a_{1}) (φ (a_{1}), a_{2}) (φ (a_{1} a_{2}), a_{3}) \dots (φ (a_{1} a_{2} \dots a_{h - 1}), a_{h})

holds.
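As a minimal sketch of the sequential function associated with a homomorphism, the Python code below computes $σ_{φ}$ letter by letter. The choice of T is an assumption made only for illustration: T is taken to be the two-element right-zero semigroup on the letters "a" and "b" (so that $x y = y$), the symbol `ONE` stands for the adjoined identity of $T^{1}$, and $φ$ sends each letter to itself.

```python
ONE = "1"  # hypothetical symbol for the adjoined identity of T^1


def mul(s, t):
    """Product in T^1 for the right-zero semigroup T: s t = t unless t = 1."""
    return s if t == ONE else t


def phi(a):
    """Illustrative homomorphism: every letter is sent to itself."""
    return a


def sigma_phi(word, phi, mul, one):
    """Return sigma_phi(word) as a list of letters (t, a) of B_T = T^1 x A."""
    t, letters = one, []
    for a in word:
        letters.append((t, a))  # output function: t * a = (t, a)
        t = mul(t, phi(a))      # transition function: t . a = t phi(a)
    return letters
```

With these assumptions, `sigma_phi("abba", phi, mul, ONE)` returns `[("1", "a"), ("a", "b"), ("b", "b"), ("b", "a")]`, in agreement with the displayed formula: each output letter pairs the image of the prefix read so far with the current input letter.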

Note that, when constructing the sequential transducer $𝒯_{φ}$ , in contrast to the opening passage of [7, Sec. 3.2], we do not assume here that the homomorphism $φ : A^{+} \to T$ is surjective. Consequently, the transition semigroup of the underlying automaton $𝒜_{φ} = (T^{1}, A, \cdot)$ need not be equal to the entire transformation semigroup $(T^{1}, T)$ . However, these properties are not exploited in the subsequent reasoning in [7, Sec. 3.2], and so we can manage without them.

Let now $X = (P, S)$ and $Y = (Q, T)$ be two proper transformation semigroups, let $X \circ Y = (P \times Q, S^{Q} * T)$ be the wreath product of these semigroups, and let $L \subseteq A^{+}$ be a language recognized by the transformation semigroup $X \circ Y$ . This means that there exist a state $(p_{0}, q_{0}) \in P \times Q$ , a subset $F \subseteq P \times Q$ and a homomorphism $η : A^{+} \to S^{Q} * T$ such that $L = {w \in A^{+} : (p_{0}, q_{0}) η (w) \in F}$ . Let $π$ denote the natural projection from $S^{Q} * T$ onto T, which is given by $π (f, t) = t$ , for all $f : Q \to S$ and all $t \in T$ . Then put $φ = π \circ η$ . Thus $φ$ is a homomorphism of $A^{+}$ into T. Now, as in the last paragraph but one, we may associate with $φ$ the sequential transducer $𝒯_{φ} = (T^{1}, A, B_{T}, 1, \cdot, *)$ and the sequential function $σ_{φ} : A^{*} \to B_{T}^{*}$ realized by this sequential transducer. In this situation, the arguments assembled in [7, Sec. 3.2] can be applied. It is, however, important to note in this connection that the identity of the monoid $T^{1}$ can be required to act as the identity on the set Q. The fact that this is a feasible requirement follows from our assumption that $Y = (Q, T)$ is a proper transformation semigroup. With this amendment, proceeding otherwise as in [7, Sec. 3.2], for the above language L we obtain the following statement.

Theorem 3.3. The language $L \subseteq A^{+}$ is a finite union of languages of the form $M \cap σ_{φ}^{- 1} (N),$ where $M \subseteq A^{+}$ is a language recognized by the homomorphism $φ$ and $N \subseteq B_{T}^{+}$ is a language recognized by the transformation semigroup X.

From Theorems 3.2 and 3.3 it is easy to deduce the following consequence. This result, which will be of crucial importance in the following section, also has its counterpart in [7, Sec. 3.2]. However, since the respective arguments in [7] must be viewed as essentially flawed, we provide a full proof below.

Corollary 3.4. Let $V$ and $W$ be two pseudovarieties of semigroups, let $𝒱$ and $𝒲$ be the varieties of languages associated with the pseudovarieties $V$ and $W,$ respectively, and let $𝒵$ be the variety of languages associated with the semidirect product $V * W$ of the pseudovarieties $V$ and $W$ . Then, for every alphabet A, the collection of languages $A^{+} 𝒵$ is the smallest Boolean algebra containing all languages from $A^{+} 𝒲$ and all languages of the form $σ_{φ}^{- 1} (N),$ where $φ : A^{+} \to T$ is a homomorphism of $A^{+}$ into a semigroup $T \in W,$ $σ_{φ}$ is the sequential function associated with $φ,$ and N is any language from $B_{T}^{+} 𝒱$ .

Proof. Clearly, the collection $A^{+} 𝒵$ is a Boolean algebra and it contains all languages from $A^{+} 𝒲$ . We shall show that it also contains all languages of the above form $σ_{φ}^{- 1} (N)$ . Let $𝒯_{φ} = (T^{1}, A, B_{T}, 1, \cdot, *)$ be the sequential transducer associated with $φ$ . Then $σ_{φ} : A^{*} \to B_{T}^{*}$ is the sequential function realized by this sequential transducer. Let $Y = (T^{1}, R)$ be the transition semigroup of the underlying automaton $𝒜_{φ} = (T^{1}, A, \cdot)$ . Then R is a subsemigroup of T, and the right regular representation $Z = (T^{1}, T)$ of the semigroup T is a proper transformation semigroup. Recall also that $T \in W$ . The language N is recognized by a proper transformation semigroup $X = (P, S)$ such that $S \in V$ . Now, applying Theorem 3.2, we come to the conclusion that the language $σ_{φ}^{- 1} (N)$ is recognized by the wreath product $X \circ Z$ . Consequently, this language $σ_{φ}^{- 1} (N)$ is recognized by the action semigroup of the wreath product $X \circ Z$ . But this action semigroup is equal to the semidirect product $S^{T^{1}} * T$ , and so it belongs to the pseudovariety $V * W$ . Therefore the language $σ_{φ}^{- 1} (N)$ indeed belongs to $A^{+} 𝒵$ .

On the other hand, we have seen in the previous section (see the last paragraph but one in that section) that the semidirect product $V * W$ of the given pseudovarieties is obtained as the class of all divisors of the action semigroups $S^{Q} * T$ of the wreath products $X \circ Y$ of arbitrary proper transformation semigroups $X = (P, S)$ and $Y = (Q, T)$ such that $S \in V$ and $T \in W$ . Furthermore, according to what we have seen previously in this section, every language over the alphabet A which is recognized by any divisor of such an action semigroup $S^{Q} * T$ must be recognized already by this action semigroup itself, and every language over the alphabet A which is recognized by such an action semigroup $S^{Q} * T$ is a Boolean combination of languages over the alphabet A that can be recognized by the initial wreath product $X \circ Y$ . Thus it remains to show that every language over the alphabet A which can be recognized by any wreath product $X \circ Y$ of the mentioned form belongs to the Boolean algebra generated by the languages from $A^{+} 𝒲$ and by the languages of the above form $σ_{φ}^{- 1} (N)$ . But this follows immediately from Theorem 3.3. □

Now we approach the characterization of languages which can be recognized by the semigroups in the power pseudovariety $P V$ , for any given pseudovariety of semigroups $V$ . The result that we are going to present here has been obtained by Straubing in [9].

Let A and B be two alphabets and let $𝜗 : A^{+} \to B^{+}$ be a homomorphism of semigroups. We say that this homomorphism $𝜗$ is length preserving if the inclusion $𝜗 (A) \subseteq B$ holds. Then, for every word $u \in A^{+}$ , the length of the word $𝜗 (u)$ is the same as that of u. Then the following statement holds.

Theorem 3.5. Let $V$ be any pseudovariety of semigroups, let $𝒱$ be the variety of languages associated with $V,$ and let $𝒲$ be the variety of languages associated with the power pseudovariety $P V$ . Then, for every alphabet B, the collection of languages $B^{+} 𝒲$ is the smallest Boolean algebra containing all languages of the form $𝜗 (M),$ where M is any language from $A^{+} 𝒱$ for some alphabet A, and $𝜗 : A^{+} \to B^{+}$ is a length preserving homomorphism.

We conclude this section with a characterization of languages that are recognized by commutative semigroups. That is, we shall characterize the languages belonging to the variety of languages $𝒞 o m$ that is associated to the pseudovariety $C o m$ of all commutative semigroups. This result comes from [6, Sec. 2.3].

Let A be an alphabet, let $w \in A^{+}$ be a word and let $a \in A$ be a letter. We denote by $| w |_{a}$ the number of occurrences of the letter a in the word w. For every positive integer n and for every $k \in {0, 1, 2, \dots, n - 1}$ , consider the language

K (a, k, n) = {w \in A^{+} : | w |_{a} \equiv k \pmod{n}} .

Then $K (a, k, n)$ is a recognizable language. Indeed, let $ℤ_{n}$ be the additive group of integers modulo n and let $φ_{a} : A^{+} \to ℤ_{n}$ be the homomorphism assigning to every word $w \in A^{+}$ the number $| w |_{a}$ taken modulo n. Then we have $K (a, k, n) = φ_{a}^{- 1} (k)$ . Since obviously $ℤ_{n} \in C o m$ , we hence see that $K (a, k, n) \in A^{+} 𝒞 o m$ .
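The recognition of $K (a, k, n)$ by $ℤ_{n}$ can be illustrated by a few lines of Python. This is merely a sketch; the function names `phi_a` and `in_K` are hypothetical, and words are represented as Python strings.

```python
def phi_a(word, a, n):
    """Image of word in Z_n: the number of occurrences of a, taken modulo n."""
    total = 0
    for letter in word:
        if letter == a:
            total = (total + 1) % n  # addition in Z_n
    return total


def in_K(word, a, k, n):
    """Membership of a non-empty word in K(a, k, n), the preimage of k."""
    return len(word) > 0 and phi_a(word, a, n) == k
```

For example, the word `"abaab"` contains three occurrences of `"a"`, so `in_K("abaab", "a", 0, 3)` holds whereas `in_K("abaab", "a", 1, 3)` does not.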

Furthermore, for every non-negative integer r, consider the language

L (a, r) = {w \in A^{+} : | w |_{a} = r} .

Then $L (a, r)$ is also a recognizable language. Indeed, let n be any positive integer greater than r and let $ℤ_{1, n}$ be the cyclic monoid of order $n + 1$ and period 1. This means that if g is the generator, then the elements of $ℤ_{1, n}$ are $1, g, g^{2}, \dots, g^{n}$ with the convention that $g^{n + 1} = g^{n}$ . Then let $ψ_{a} : A^{+} \to ℤ_{1, n}$ be the homomorphism assigning to every word $w \in A^{+}$ the element $g^{| w |_{a}}$ . Here $g^{0}$ is to be understood as 1. Then we have $L (a, r) = ψ_{a}^{- 1} (g^{r})$ . Since clearly $ℤ_{1, n} \in C o m$ , we see that $L (a, r) \in A^{+} 𝒞 o m$ .
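In the same spirit, the homomorphism $ψ_{a}$ amounts to a counter that saturates at n. The Python sketch below represents the element $g^{i}$ of $ℤ_{1, n}$ by its exponent i; the names `psi_a` and `in_L` and the default choice $n = r + 1$ are assumptions made for illustration.

```python
def psi_a(word, a, n):
    """Exponent i of psi_a(word) = g^i in the cyclic monoid Z_{1,n}."""
    exponent = 0
    for letter in word:
        if letter == a:
            exponent = min(exponent + 1, n)  # convention g^{n+1} = g^n
    return exponent


def in_L(word, a, r, n=None):
    """Membership of a non-empty word in L(a, r), the preimage of g^r."""
    n = r + 1 if n is None else n
    if n <= r:
        raise ValueError("n must be greater than r")
    return len(word) > 0 and psi_a(word, a, n) == r
```

The requirement $n > r$ matters here: if n were equal to r, all words with at least r occurrences of a would be sent to $g^{r}$, and the preimage of $g^{r}$ would be strictly larger than $L (a, r)$.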

From these notes and from the last corollary in [6, Sec. 2.3] we infer the following description of the recognizable languages forming the variety of languages $𝒞 o m$ that is associated to the pseudovariety $C o m$ of all commutative semigroups.

Proposition 3.6. For every alphabet A, the collection of languages $A^{+} 𝒞 o m$ is the smallest Boolean algebra containing all languages over A of the form K(a,k,n), where $a \in A,$ n is a positive integer, and $k \in {0, 1, 2, \dots, n - 1},$ and all languages over A of the form L(a,r), where $a \in A,$ and r is a non-negative integer.

4. The Equality of Pseudovarieties $P U = C o m * D$

As we have already explained in the introduction, the purpose of this paper is to establish the equality of pseudovarieties given in the heading of this section. The inclusion $P U \subseteq C o m * D$ has been verified already by Almeida in [1] and in [2, Sec. 11.9] using purely algebraic arguments. Thus it remains to establish the opposite inclusion $C o m * D \subseteq P U$ . This time, we shall do so by means of language-theoretic methods. We have recalled in Theorem 3.1 the celebrated result of Eilenberg, according to which there are mutually inverse order preserving one-to-one correspondences between the lattice of all pseudovarieties of semigroups and the lattice of all varieties of languages. As in the introduction, we shall let $𝒱$ be the variety of languages corresponding to the pseudovariety $C o m * D$ , and we shall let $𝒲$ be the variety of languages corresponding to the pseudovariety $P U$ . Then, in order to establish the inclusion $C o m * D \subseteq P U$ , we shall verify that $𝒱 \subseteq 𝒲$ . That is, we shall show that, for every alphabet A, we have the inclusion $A^{+} 𝒱 \subseteq A^{+} 𝒲$ .

Once again, let $𝒞 o m$ be the variety of languages corresponding to the pseudovariety $C o m$ . Let further $𝒟$ be the variety of languages corresponding to the pseudovariety $D$ , and let $𝒰$ be the variety of languages corresponding to the pseudovariety $U$ . Since clearly $U \subseteq P U$ , we have $𝒰 \subseteq 𝒲$ , which means that, for every alphabet A, we have the inclusion $A^{+} 𝒰 \subseteq A^{+} 𝒲$ . Next we turn to the pseudovariety $C o m * D$ and to its corresponding variety of languages $𝒱$ .

According to Straubing’s wreath product principle captured in Corollary 3.4, for every alphabet A, the collection of languages $A^{+} 𝒱$ is the smallest Boolean algebra containing all languages from $A^{+} 𝒟$ and all languages of the form $σ_{φ}^{- 1} (J)$ , where $φ : A^{+} \to T$ is a homomorphism of $A^{+}$ into a semigroup $T \in D$ , $σ_{φ}$ is the sequential function associated with $φ$ , and J is any language from $B_{T}^{+} 𝒞 o m$ . Since clearly $D \subseteq U$ , we have $𝒟 \subseteq 𝒰$ , which means that, for every alphabet A, we have the inclusion $A^{+} 𝒟 \subseteq A^{+} 𝒰$ . We have seen in the previous paragraph that we also have the inclusion $A^{+} 𝒰 \subseteq A^{+} 𝒲$ . Therefore, for every alphabet A, we get the inclusion $A^{+} 𝒟 \subseteq A^{+} 𝒲$ . Thus it remains to deal with the languages of the form $σ_{φ}^{- 1} (J)$ specified above. We need to show that these languages also belong to the collection $A^{+} 𝒲$ .

Thus let $φ : A^{+} \to T$ be a homomorphism of $A^{+}$ into a semigroup $T \in D$ , let $σ_{φ}$ be the sequential function associated with $φ$ , and let J be any language from $B_{T}^{+} 𝒞 o m$ . Since $T \in D$ and $D = ⋃_{m = 1}^{\infty} D_{m}$ , there exists a positive integer m such that $T \in D_{m}$ . Therefore the semigroup T satisfies the pseudoidentity $x y_{1} y_{2} \dots y_{m} ≏ y_{1} y_{2} \dots y_{m}$ . Recall that the sequential function $σ_{φ}$ is given, for every positive integer h and for arbitrary elements $a_{1}, a_{2}, \dots, a_{h} \in A$ , by the formula

σ_{φ} (a_{1} a_{2} \dots a_{h}) = (1, a_{1}) (φ (a_{1}), a_{2}) (φ (a_{1} a_{2}), a_{3}) \dots (φ (a_{1} a_{2} \dots a_{h - 1}), a_{h}) .

Consider any index $i \in {1, 2, \dots, h}$ such that $m < i$ . Then we have

\begin{matrix} (φ (a_{1} a_{2} \dots a_{i - 1}), a_{i}) & = & (φ (a_{1}) φ (a_{2}) \dots φ (a_{i - 1}), a_{i}) \\ & = & (φ (a_{i - m}) φ (a_{i - m + 1}) \dots φ (a_{i - 1}), a_{i}) \\ & = & (φ (a_{i - m} a_{i - m + 1} \dots a_{i - 1}), a_{i}) . \end{matrix}

This finding shows that, when computing the word $σ_{φ} (a_{1} a_{2} \dots a_{h})$ , apart from the initial m segments $a_{1}$ , $a_{1} a_{2}, \dots, a_{1} a_{2} \dots a_{m}$ of the word $a_{1} a_{2} \dots a_{h}$ , it is only the segments of the word $a_{1} a_{2} \dots a_{h}$ of length $m + 1$ that matter. Thus, in the word $σ_{φ} (a_{1} a_{2} \dots a_{h})$ , the initial letters $(1, a_{1})$ , $(φ (a_{1}), a_{2})$ , $(φ (a_{1} a_{2}), a_{3})$ , …, $(φ (a_{1} a_{2} \dots a_{m - 1}), a_{m})$ of the alphabet $B_{T}$ appear first, and they are followed only by letters of the form $(φ (a_{j} a_{j + 1} \dots a_{j + m - 1}), a_{j + m})$ , for all $j \in {1, 2, \dots, h - m}$ , of the alphabet $B_{T}$ .
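This sliding-window description of $σ_{φ}$ can be checked on a concrete definite semigroup. The Python sketch below rests on assumptions chosen only for illustration: T consists of the non-empty words of length at most `M` over A, with the product that concatenates two words and keeps the last `M` letters (so that T satisfies $x y_{1} y_{2} \dots y_{m} ≏ y_{1} y_{2} \dots y_{m}$ for $m = M$), the empty string stands for the identity of $T^{1}$, and $φ$ sends each letter to itself. The names `mul`, `sigma_phi` and `from_windows` are hypothetical.

```python
M = 2  # plays the role of m


def mul(s, t):
    """Product in T^1: concatenate and keep only the last M letters."""
    return (s + t)[-M:]


def sigma_phi(word):
    """Letters (t, a) of sigma_phi(word), computed by the transducer."""
    t, letters = "", []
    for a in word:
        letters.append((t, a))
        t = mul(t, a)  # phi(a) = a
    return letters


def from_windows(word):
    """The same letters, read off from prefixes and windows of length M + 1."""
    head = [(word[:i], word[i]) for i in range(min(M, len(word)))]
    tail = [(word[j:j + M], word[j + M]) for j in range(len(word) - M)]
    return head + tail
```

For the word `"abcab"` both functions return `[("", "a"), ("a", "b"), ("ab", "c"), ("bc", "a"), ("ca", "b")]`: the first `M` letters come from prefixes, and every later letter is determined by a segment of length `M + 1`.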

According to Proposition 3.6, the language J from $B_{T}^{+} 𝒞 o m$ is a Boolean combination of languages over the alphabet $B_{T} = T^{1} \times A$ of the form $K ((t, a), k, n)$ , where $(t, a)$ is a letter in $B_{T}$ , so that $t \in T^{1}$ and $a \in A$ , n is a positive integer and $k \in {0, 1, 2, \dots, n - 1}$ , and of languages of the form $L ((t, a), r)$ , where $(t, a)$ is again a letter in $B_{T}$ , so that $t \in T^{1}$ and $a \in A$ and r is a non-negative integer. We have to show that then the language $σ_{φ}^{- 1} (J)$ belongs to the collection of languages $A^{+} 𝒲$ . Since the operator $σ_{φ}^{- 1}$ commutes with the Boolean operations, we see that it will be enough if we show that the languages of the forms $σ_{φ}^{- 1} (K ((t, a), k, n))$ and $σ_{φ}^{- 1} (L ((t, a), r))$ belong to $A^{+} 𝒲$ . We already know from Theorem 3.2 that these languages are recognizable languages over the alphabet A.

We start with the languages of the form $σ_{φ}^{- 1} (K ((t, a), k, n))$ , where $t \in T^{1}$ , $a \in A$ , n is a positive integer, and $k \in {0, 1, 2, \dots, n - 1}$ . We shall see that, in this case, these languages actually belong already to the subset $A^{+} 𝒰$ of $A^{+} 𝒲$ . For this purpose, we shall show that the syntactic semigroups of these languages belong to the pseudovariety $U$ . We shall document this statement by verifying that these syntactic semigroups satisfy the pseudoidentities $e x e y e ≏ e y e x e$ and ${(e f)}^{ω} e x f ≏ e x f$ .

Lemma 4.1. The syntactic semigroup of the language $σ_{φ}^{- 1} (K ((t, a), k, n))$ satisfies the pseudoidentity $e x e y e ≏ e y e x e$ .

Proof. Let us denote briefly by $\sim_{K}$ the syntactic congruence of the language $σ_{φ}^{- 1} (K ((t, a), k, n))$ . Thus we wish to show that the syntactic semigroup $A^{+} / \sim_{K}$ satisfies the pseudoidentity $e x e y e ≏ e y e x e$ . So let $u, v \in A^{+}$ and $𝜀 \in A^{+}$ be arbitrary words such that the class $[𝜀] \sim_{K}$ is an idempotent in $A^{+} / \sim_{K}$ . We need to show that then we have $𝜀 u 𝜀 v 𝜀 \sim_{K} 𝜀 v 𝜀 u 𝜀$ . By the definition of the syntactic congruence $\sim_{K}$ , this means to check that, for arbitrary words $p, q \in A^{*}$ , we have

p 𝜀 u 𝜀 v 𝜀 q \in σ_{φ}^{- 1} (K ((t, a), k, n)) \Leftrightarrow p 𝜀 v 𝜀 u 𝜀 q \in σ_{φ}^{- 1} (K ((t, a), k, n)) .

Equivalently, this means to verify that, for any words $p, q \in A^{*}$ , we have

σ_{φ} (p 𝜀 u 𝜀 v 𝜀 q) \in K ((t, a), k, n) \Leftrightarrow σ_{φ} (p 𝜀 v 𝜀 u 𝜀 q) \in K ((t, a), k, n) .

Since the class $[𝜀] \sim_{K}$ is an idempotent in $A^{+} / \sim_{K}$ , we have $𝜀 \sim_{K} 𝜀^{2}$ , which says that, for arbitrary words $c, d \in A^{*}$ , we have $c 𝜀 d \in σ_{φ}^{- 1} (K ((t, a), k, n)) \Leftrightarrow c 𝜀^{2} d \in σ_{φ}^{- 1} (K ((t, a), k, n))$ , or, which is the same, $σ_{φ} (c 𝜀 d) \in K ((t, a), k, n) \Leftrightarrow σ_{φ} (c 𝜀^{2} d) \in K ((t, a), k, n)$ . But this entails that, in the formulae displayed above, we could replace every occurrence of the word $𝜀$ with an arbitrarily large positive power of $𝜀$ . Therefore we may assume, in addition, that the length $| 𝜀 |$ of the word $𝜀$ is greater than the value m drawn above.

Now we have to return to our previous considerations regarding the computation of the word $σ_{φ} (a_{1} a_{2} \dots a_{h})$ , for arbitrary elements $a_{1}, a_{2}, \dots, a_{h} \in A$ , and to apply them to the above words $σ_{φ} (p 𝜀 u 𝜀 v 𝜀 q)$ and $σ_{φ} (p 𝜀 v 𝜀 u 𝜀 q)$ . Since $| 𝜀 | > m$ , the first m letters of the alphabet $B_{T}$ in the words $σ_{φ} (p 𝜀 u 𝜀 v 𝜀 q)$ and $σ_{φ} (p 𝜀 v 𝜀 u 𝜀 q)$ occur within the initial segment $σ_{φ} (p 𝜀)$ of these words, and therefore they are the same. As far as the other letters of $B_{T}$ in the words $σ_{φ} (p 𝜀 u 𝜀 v 𝜀 q)$ and $σ_{φ} (p 𝜀 v 𝜀 u 𝜀 q)$ are concerned, we have seen that these letters arise from segments of the words $p 𝜀 u 𝜀 v 𝜀 q$ and $p 𝜀 v 𝜀 u 𝜀 q$ of length $m + 1$ . Once again, since $| 𝜀 | > m$ , such segments must appear as segments in one of the words $p 𝜀$ , $𝜀 u 𝜀$ , $𝜀 v 𝜀$ and $𝜀 q$ . In particular, if such a segment hits the word u in $p 𝜀 u 𝜀 v 𝜀 q$ , then it must be a segment of the word $𝜀 u 𝜀$ . But then it is also a segment of the same word $𝜀 u 𝜀$ in $p 𝜀 v 𝜀 u 𝜀 q$ . The same holds true of the segments of length $m + 1$ of the words $p 𝜀 u 𝜀 v 𝜀 q$ and $p 𝜀 v 𝜀 u 𝜀 q$ which hit the word v. This consideration reveals that the number of occurrences of each individual letter of $B_{T}$ in both words $σ_{φ} (p 𝜀 u 𝜀 v 𝜀 q)$ and $σ_{φ} (p 𝜀 v 𝜀 u 𝜀 q)$ is the same. Hence, by the definition of the language $K ((t, a), k, n)$ , we certainly have $σ_{φ} (p 𝜀 u 𝜀 v 𝜀 q) \in K ((t, a), k, n) \Leftrightarrow σ_{φ} (p 𝜀 v 𝜀 u 𝜀 q) \in K ((t, a), k, n)$ , for arbitrary words $p, q \in A^{*}$ , which yields that $𝜀 u 𝜀 v 𝜀 \sim_{K} 𝜀 v 𝜀 u 𝜀$ , as required. □
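The counting argument in this proof can be tested numerically on a small example. The Python sketch below is only an illustration under assumptions: T is taken to be the hypothetical "last `M` letters" semigroup (non-empty words of length at most `M`, with the product keeping the last `M` letters of the concatenation), $φ$ sends each letter to itself, and the words passed to `swap_preserves_counts` are arbitrary strings rather than representatives of idempotent classes.

```python
from collections import Counter

M = 2  # plays the role of m


def sigma_phi(word):
    """Letters (t, a) of sigma_phi(word); t is the suffix of length at most
    M of the prefix read so far, and the empty string stands for 1."""
    t, letters = "", []
    for a in word:
        letters.append((t, a))
        t = (t + a)[-M:]
    return letters


def swap_preserves_counts(p, eps, u, v, q):
    """Compare the letter counts of sigma_phi on the words
    p eps u eps v eps q and p eps v eps u eps q."""
    left = p + eps + u + eps + v + eps + q
    right = p + eps + v + eps + u + eps + q
    return Counter(sigma_phi(left)) == Counter(sigma_phi(right))
```

With `eps = "abc"`, whose length exceeds `M`, the call `swap_preserves_counts("ba", "abc", "a", "bb", "ab")` succeeds. The assumption $| 𝜀 | > m$ is really used: for `eps = "a"`, the call `swap_preserves_counts("", "a", "b", "cc", "")` fails, since the letter `("a", "b")` occurs only on one side.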

Lemma 4.2. The syntactic semigroup of the language $σ_{φ}^{- 1} (K ((t, a), k, n))$ satisfies the pseudoidentity ${(e f)}^{ω} e x f ≏ e x f$ .

Proof. As before, let us denote briefly by $\sim_{K}$ the syntactic congruence of the language $σ_{φ}^{- 1} (K ((t, a), k, n))$ . Thus we wish to show that the syntactic semigroup $A^{+} / \sim_{K}$ satisfies the pseudoidentity ${(e f)}^{ω} e x f ≏ e x f$ . So let $u \in A^{+}$ and $𝜀, ϰ \in A^{+}$ be arbitrary words such that the classes $[𝜀] \sim_{K}$ and $[ϰ] \sim_{K}$ are idempotents in $A^{+} / \sim_{K}$ . Let g be any positive integer such that the class $[{(𝜀 ϰ)}^{g}] \sim_{K}$ is an idempotent in $A^{+} / \sim_{K}$ . We need to show that then we have ${(𝜀 ϰ)}^{g} 𝜀 u ϰ \sim_{K} 𝜀 u ϰ$ . By the definition of the syntactic congruence $\sim_{K}$ , this means to check that, for arbitrary words $p, q \in A^{*}$ , we have

p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q \in σ_{φ}^{- 1} (K ((t, a), k, n)) \Leftrightarrow p 𝜀 u ϰ q \in σ_{φ}^{- 1} (K ((t, a), k, n)) .

Equivalently, this means to verify that, for any words $p, q \in A^{*}$ , we have

σ_{φ} (p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q) \in K ((t, a), k, n) \Leftrightarrow σ_{φ} (p 𝜀 u ϰ q) \in K ((t, a), k, n) .

The same arguments as previously show that, in the above formulae, we could replace every occurrence of any of the words $𝜀$ and ϰ with an arbitrarily large positive power of this word. Therefore we may assume, in addition, that the lengths $| 𝜀 |$ and $| ϰ |$ of the words $𝜀$ and ϰ are both greater than the value m drawn above. Furthermore, virtually the same reasoning also reveals that, in the above formulae, we could replace the word ${(𝜀 ϰ)}^{g}$ with an arbitrary positive power of this word. Therefore we may assume, in addition, that the exponent g is divisible by the positive integer n.

Once again, we have to return to our previous considerations regarding the computation of the word $σ_{φ} (a_{1} a_{2} \dots a_{h})$ , for arbitrary elements $a_{1}, a_{2}, \dots, a_{h} \in A$ , and to apply them to the above words $σ_{φ} (p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q)$ and $σ_{φ} (p 𝜀 u ϰ q)$ . Since $| 𝜀 | > m$ , the first m letters of the alphabet $B_{T}$ appearing in these two words occur within the initial segment $σ_{φ} (p 𝜀)$ of these words, and therefore they are the same. Thus we have to deal with the other appearances of the letters of the alphabet $B_{T}$ in these two words. We have seen that these letters arise from segments of the words $p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q$ and $p 𝜀 u ϰ q$ of length $m + 1$ . Since $| 𝜀 | > m$ and $| ϰ | > m$ , the segments of length $m + 1$ which may occur anywhere in the word $p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q$ , beyond those occurrences of such segments which appear already in the word $p 𝜀 u ϰ q$ , must be segments of the word ${(𝜀 ϰ)}^{g} 𝜀$ , not counting the occurrences of these segments in the last factor $𝜀$ of this word. Moreover, every such segment in the word ${(𝜀 ϰ)}^{g} 𝜀$ must be a segment of either of the words $𝜀 ϰ$ or $ϰ 𝜀$ . Hence, since the exponent g is divisible by n, the number of these occurrences of every such segment in the word ${(𝜀 ϰ)}^{g} 𝜀$ must be divisible by n. These considerations reveal that the numbers of occurrences of each individual letter of $B_{T}$ in the words $σ_{φ} (p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q)$ and $σ_{φ} (p 𝜀 u ϰ q)$ are congruent modulo n. Therefore, by the definition of the language $K ((t, a), k, n)$ , we have $σ_{φ} (p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q) \in K ((t, a), k, n) \Leftrightarrow σ_{φ} (p 𝜀 u ϰ q) \in K ((t, a), k, n)$ , for arbitrary words $p, q \in A^{*}$ , which yields that ${(𝜀 ϰ)}^{g} 𝜀 u ϰ \sim_{K} 𝜀 u ϰ$ , as required. □
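The congruence of letter counts modulo n can likewise be observed on a small example. The following Python sketch makes the same illustrative assumptions as before: T is the hypothetical "last `M` letters" semigroup, $φ$ sends each letter to itself, and the words supplied to `counts_congruent` are arbitrary strings of length greater than `M`, not representatives of idempotent classes.

```python
from collections import Counter

M = 2  # plays the role of m


def sigma_phi(word):
    """Letters (t, a) of sigma_phi(word); t is the suffix of length at most
    M of the prefix read so far, and the empty string stands for 1."""
    t, letters = "", []
    for a in word:
        letters.append((t, a))
        t = (t + a)[-M:]
    return letters


def counts_congruent(p, eps, kap, u, q, g, n):
    """Check that every letter of B_T occurs congruently often modulo n in
    sigma_phi(p (eps kap)^g eps u kap q) and in sigma_phi(p eps u kap q)."""
    left = p + (eps + kap) * g + eps + u + kap + q
    right = p + eps + u + kap + q
    cl, cr = Counter(sigma_phi(left)), Counter(sigma_phi(right))
    return all((cl[b] - cr[b]) % n == 0 for b in set(cl) | set(cr))
```

With `eps = "abc"`, `kap = "bca"` and `n = 3`, the check succeeds for `g = 3` and fails for `g = 1`, which shows why the exponent g is first arranged to be divisible by n.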

We continue by verifying that the languages of the form $σ_{φ}^{- 1} (L ((t, a), r))$ , where $t \in T^{1}$ , $a \in A$ and r is a non-negative integer, belong to $A^{+} 𝒲$ . We begin with a few preliminary considerations. So let $w \in σ_{φ}^{- 1} (L ((t, a), r))$ be an arbitrary word. This means that $σ_{φ} (w) \in L ((t, a), r)$ . However, by the definition of the language $L ((t, a), r)$ , this is equivalent to the condition that the letter $(t, a)$ of the alphabet $B_{T}$ has exactly r appearances in the word $σ_{φ} (w)$ . Assume, for the time being, that the length $| w |$ of the word w is greater than m. Let $r_{0}$ be the number of appearances of the mentioned letter $(t, a)$ in the initial segment of the word $σ_{φ} (w)$ of length m. Then we can write $r = r_{0} + \bar{r}$ for some non-negative integer $\bar{r}$ . From our earlier considerations we know that the appearances of the letter $(t, a)$ in the word $σ_{φ} (w)$ which do not occur in the initial segment of $σ_{φ} (w)$ of length m arise from some segments of the word w of length $m + 1$ . There may be several such pairwise distinct segments in the word w. So let $ξ$ be any of these segments of the word w of length $m + 1$ which give rise to the letter $(t, a)$ . Let $ξ$ have s occurrences in the word w. Then $s \leq \bar{r}$ . (We permit also the possibility that $s = 0$ .) Consider now all s occurrences of this segment $ξ$ in the word w. Some of these occurrences of $ξ$ may overlap in w or they may touch each other. Let $ϖ_{1}, ϖ_{2}, \dots, ϖ_{ℓ}$ be all maximal segments of the word w which are covered completely by the mentioned occurrences of the segment $ξ$ in w. Then $ℓ \leq s$ and $w = υ_{0} ϖ_{1} υ_{1} ϖ_{2} υ_{2} \dots υ_{ℓ - 1} ϖ_{ℓ} υ_{ℓ}$ for some words $υ_{0}, υ_{ℓ} \in A^{*}$ and $υ_{1}, υ_{2}, \dots, υ_{ℓ - 1} \in A^{+}$ . Let $\bar{A}$ be an alphabet disjoint with A whose letters are in a one-to-one correspondence with the letters of A. 
Consider next the language ${\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ . (If $s = 0$ , and so $ℓ = 0$ , then take the language ${\bar{A}}^{+}$ instead.) This language is a rational language over the alphabet $A \cup \bar{A}$ , and therefore, by Kleene’s theorem, it is a recognizable language.
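Since the language just introduced is given by a rational expression, it can be written down directly as a regular expression. In the Python sketch below, which is an illustration only, the letters of A are the lower-case letters `a` and `b`, the letters of $\bar{A}$ are represented by the corresponding upper-case letters, and the function name `block_language` is hypothetical.

```python
import re


def block_language(pieces, barred="AB"):
    """Compile the language bar^* w_1 bar^+ w_2 bar^+ ... bar^+ w_l bar^*,
    where bar ranges over the barred letters; for an empty tuple of pieces
    the language bar^+ is taken instead."""
    bar = "[" + re.escape(barred) + "]"
    if not pieces:
        return re.compile(bar + "+")
    body = (bar + "+").join(re.escape(w) for w in pieces)
    return re.compile(bar + "*" + body + bar + "*")


# Hypothetical example with the segment "ab" (so m + 1 = 2): the maximal
# covered segments are "abab" and "ab", accounting for s = 3 occurrences.
language = block_language(["abab", "ab"])
```

Here `language.fullmatch("BababAabA")` succeeds, while `language.fullmatch("ababab")` returns `None`, because consecutive covered segments must be separated by at least one barred letter.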

We shall have to deal with all possible languages over the alphabet $A \cup \bar{A}$ of the form just mentioned. Every such language is completely determined by the given $ℓ$ -tuple of words $(ϖ_{1}, ϖ_{2}, \dots, ϖ_{ℓ})$ . Recall that such $ℓ$ -tuples are characterized by the following properties. First of all, there is a word $ξ$ over the alphabet A of length $m + 1$ which gives rise to the letter $(t, a)$ of the alphabet $B_{T}$ . According to what we have already specified above, this means that, if $ξ = c_{0} c_{1} c_{2} \dots c_{m}$ where $c_{0}, c_{1}, c_{2}, \dots, c_{m} \in A$ , then $φ (c_{0} c_{1} \dots c_{m - 1}) = t$ and $c_{m} = a$ . Thereafter, the mentioned $ℓ$ -tuple $(ϖ_{1}, ϖ_{2}, \dots, ϖ_{ℓ})$ consists of non-empty words over the alphabet A such that the total number of occurrences of the segment $ξ$ in any of the words $ϖ_{1}, ϖ_{2}, \dots, ϖ_{ℓ}$ is equal to s and, at the same time, every letter of any of the words $ϖ_{1}, ϖ_{2}, \dots, ϖ_{ℓ}$ is covered by some occurrence of the segment $ξ$ in this word. Then certainly $ℓ \leq s$ and there are evidently only finitely many possibilities how such $ℓ$ -tuples $(ϖ_{1}, ϖ_{2}, \dots, ϖ_{ℓ})$ may look like. Therefore there are only finitely many languages of the form ${\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ which may arise in this way. (If $s = 0$ , and so $ℓ = 0$ , then there is only the language ${\bar{A}}^{+}$ which may so arise.) Let now $L_{ξ}^{(s)}$ be the finite union of all of these languages so determined. Then $L_{ξ}^{(s)}$ is again a recognizable language over the alphabet $A \cup \bar{A}$ .

We are going to show that the recognizable language $L_{ξ}^{(s)}$ just designated belongs to the collection of languages which the variety of languages $𝒰$ assigns to the alphabet $A \cup \bar{A}$ . We intend to attain this goal by checking that the syntactic semigroup of this language $L_{ξ}^{(s)}$ belongs to the pseudovariety $U$ . We shall achieve this aim by verifying that this syntactic semigroup satisfies the pseudoidentities $e x e y e ≏ e y e x e$ and ${(e f)}^{ω} e x f ≏ e x f$ .

For the sake of notational ease, let us denote briefly by $\sim_{L}$ the syntactic congruence of the language $L_{ξ}^{(s)}$ . Then the syntactic semigroup of the language $L_{ξ}^{(s)}$ is of the form ${(A \cup \bar{A})}^{+} / \sim_{L}$ . Recall that the language $L_{ξ}^{(s)}$ is a finite union of recognizable languages of the form ${\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ . Let $L_{1}, L_{2}, \dots, L_{z}$ be those finitely many languages of the form just mentioned whose union constitutes the language $L_{ξ}^{(s)}$ . Let $\sim_{L_{1}}, \sim_{L_{2}}, \dots, \sim_{L_{z}}$ be the syntactic congruences of these languages. Then the syntactic semigroups of these languages are of the forms ${(A \cup \bar{A})}^{+} / \sim_{L_{1}}$ , ${(A \cup \bar{A})}^{+} / \sim_{L_{2}}, \dots, {(A \cup \bar{A})}^{+} / \sim_{L_{z}}$ . Consider the subdirect product of these syntactic semigroups consisting of all z-tuples of the form $([w] \sim_{L_{1}}, [w] \sim_{L_{2}}, \dots, [w] \sim_{L_{z}})$ , where $w \in {(A \cup \bar{A})}^{+}$ . Then this subdirect product recognizes the language $L_{ξ}^{(s)}$ , and so the syntactic semigroup ${(A \cup \bar{A})}^{+} / \sim_{L}$ of this language $L_{ξ}^{(s)}$ is a canonical homomorphic image of the mentioned subdirect product of syntactic semigroups of the languages $L_{1}, L_{2}, \dots, L_{z}$ . This canonical homomorphism maps every z-tuple of the form $([w] \sim_{L_{1}}, [w] \sim_{L_{2}}, \dots, [w] \sim_{L_{z}})$ to the element $[w] \sim_{L}$ .

Lemma 4.3. The syntactic semigroup ${(A \cup \bar{A})}^{+} / \sim_{L}$ of the language $L_{ξ}^{(s)}$ satisfies the pseudoidentity $e x e y e ≏ e y e x e$ .

Proof. Let $𝜀 \in {(A \cup \bar{A})}^{+}$ and $u, v \in {(A \cup \bar{A})}^{+}$ be any words such that the class $[𝜀] \sim_{L}$ is an idempotent in the semigroup ${(A \cup \bar{A})}^{+} / \sim_{L}$ . We need to show that then we have $𝜀 u 𝜀 v 𝜀 \sim_{L} 𝜀 v 𝜀 u 𝜀$ . The element $[𝜀] \sim_{L}$ is the image under the above canonical homomorphism of the z-tuple $([𝜀] \sim_{L_{1}}, [𝜀] \sim_{L_{2}}, \dots, [𝜀] \sim_{L_{z}})$ . Since our syntactic semigroups are finite, there exists a positive integer $η$ such that the z-tuple $([𝜀^{η}] \sim_{L_{1}}, [𝜀^{η}] \sim_{L_{2}}, \dots, [𝜀^{η}] \sim_{L_{z}})$ is an idempotent. Thus, for every $i \in \{1, 2, \dots, z\}$ , the element $[𝜀^{η}] \sim_{L_{i}}$ is an idempotent, and so we have $𝜀^{η} \sim_{L_{i}} 𝜀^{2 η}$ . By the definition of the syntactic congruence $\sim_{L_{i}}$ , for all words $c, d \in {(A \cup \bar{A})}^{*}$ , we get $c 𝜀^{η} d \in L_{i} \Leftrightarrow c 𝜀^{2 η} d \in L_{i}$ . However, bearing in mind that the language $L_{i}$ is of the form ${\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ , we come to the following conclusion. If the word $𝜀$ contained some elements from the alphabet A and if, for some words $c, d \in {(A \cup \bar{A})}^{*}$ , we had $c 𝜀^{η} d \in L_{i}$ , then we would certainly have $c 𝜀^{2 η} d \notin L_{i}$ . However, this means that the above condition $c 𝜀^{η} d \in L_{i} \Leftrightarrow c 𝜀^{2 η} d \in L_{i}$ would be violated. Thus, whenever $𝜀 \notin {\bar{A}}^{+}$ , then, for any words $c, d \in {(A \cup \bar{A})}^{*}$ , we must have $c 𝜀^{η} d \notin L_{i}$ . Therefore, in particular, in such a case, for any words $p, q \in {(A \cup \bar{A})}^{*}$ , we must have $p 𝜀^{η} u 𝜀^{η} v 𝜀^{η} q \notin L_{i}$ and $p 𝜀^{η} v 𝜀^{η} u 𝜀^{η} q \notin L_{i}$ . However, by the definition of the syntactic congruence $\sim_{L_{i}}$ , this means that $𝜀^{η} u 𝜀^{η} v 𝜀^{η} \sim_{L_{i}} 𝜀^{η} v 𝜀^{η} u 𝜀^{η}$ .
This relation must hold for all $i \in \{1, 2, \dots, z\}$ . By the reasoning preceding this lemma, it follows that $𝜀^{η} u 𝜀^{η} v 𝜀^{η} \sim_{L} 𝜀^{η} v 𝜀^{η} u 𝜀^{η}$ . Since here the element $[𝜀] \sim_{L}$ is an idempotent, and so it is equal to the element $[𝜀^{η}] \sim_{L}$ , we infer that $𝜀 u 𝜀 v 𝜀 \sim_{L} 𝜀 v 𝜀 u 𝜀$ , as desired. Thus it remains to treat the case when $𝜀 \in {\bar{A}}^{+}$ .

We have to verify that, in this case, we also have $𝜀 u 𝜀 v 𝜀 \sim_{L} 𝜀 v 𝜀 u 𝜀$ . By the definition of the syntactic congruence $\sim_{L}$ , this means to check that, for any words $p, q \in {(A \cup \bar{A})}^{*}$ , we have $p 𝜀 u 𝜀 v 𝜀 q \in L_{ξ}^{(s)} \Leftrightarrow p 𝜀 v 𝜀 u 𝜀 q \in L_{ξ}^{(s)}$ . Thus assume that $p 𝜀 u 𝜀 v 𝜀 q \in L_{ξ}^{(s)}$ . Since $L_{ξ}^{(s)}$ is a finite union of languages of the form ${\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ , it hence ensues that we must have $p 𝜀 u 𝜀 v 𝜀 q \in {\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ for some $ℓ$ -tuple $(ϖ_{1}, ϖ_{2}, \dots, ϖ_{ℓ})$ . Since we now have $𝜀 \in {\bar{A}}^{+}$ , the three appearances of the segment $𝜀$ in the word $p 𝜀 u 𝜀 v 𝜀 q$ must fall into some of the factors of the form ${\bar{A}}^{*}$ or ${\bar{A}}^{+}$ in the language ${\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ . If $u \in {\bar{A}}^{+}$ , then the whole segment $𝜀 u 𝜀$ must fall into the same factor of the form ${\bar{A}}^{*}$ or ${\bar{A}}^{+}$ . But then, clearly, we get that $p 𝜀 v 𝜀 u 𝜀 q \in {\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ , and so $p 𝜀 v 𝜀 u 𝜀 q \in L_{ξ}^{(s)}$ . The same conclusion follows if $v \in {\bar{A}}^{+}$ . Thus we may further assume that $u \notin {\bar{A}}^{+}$ and $v \notin {\bar{A}}^{+}$ . 
Then there exist indices $i, j, k \in \{1, 2, \dots, ℓ\}$ satisfying $i \leq j < k$ such that the segment $𝜀 u 𝜀$ comes from the language ${\bar{A}}^{+} ϖ_{i} {\bar{A}}^{+} ϖ_{i + 1} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{j} {\bar{A}}^{+}$ and the segment $𝜀 v 𝜀$ comes from the language ${\bar{A}}^{+} ϖ_{j + 1} {\bar{A}}^{+} ϖ_{j + 2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{k} {\bar{A}}^{+}$ , and they are the leftmost and the rightmost positions of the factors of the form ${\bar{A}}^{+}$ in these two languages that the appearances of the segment $𝜀$ in the words $𝜀 u 𝜀$ and $𝜀 v 𝜀$ come from. But then, clearly, the word $p 𝜀 v 𝜀 u 𝜀 q$ is an element of the language

\begin{matrix} {\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{i - 1} {\bar{A}}^{+} ϖ_{j + 1} {\bar{A}}^{+} ϖ_{j + 2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{k} {\bar{A}}^{+} \\ ϖ_{i} {\bar{A}}^{+} ϖ_{i + 1} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{j} {\bar{A}}^{+} ϖ_{k + 1} {\bar{A}}^{+} ϖ_{k + 2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*} . \end{matrix}

However, this language is also one of the constituents of the language $L_{ξ}^{(s)}$ . Therefore we see that we again have $p 𝜀 v 𝜀 u 𝜀 q \in L_{ξ}^{(s)}$ . Similarly we can show that if $p 𝜀 v 𝜀 u 𝜀 q \in L_{ξ}^{(s)}$ , then also $p 𝜀 u 𝜀 v 𝜀 q \in L_{ξ}^{(s)}$ . This finding verifies that we indeed have $𝜀 u 𝜀 v 𝜀 \sim_{L} 𝜀 v 𝜀 u 𝜀$ , again as desired. □
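The two cases of this proof can be tried out on a toy instance. The following is only a minimal sketch under stated assumptions: the alphabet A is taken to be {a, b}, the letters of $\bar{A}$ are written in upper case, and the blocks $ϖ_{1}, \dots, ϖ_{ℓ}$ are assumed to be non-empty words over A (this section does not restate their definition). The names `constituent_regex` and `in_constituent` are hypothetical helpers, not notation from the paper.

```python
import re

def constituent_regex(blocks, bar_letters="AB"):
    """Regex for a language of the form  Ā* ϖ1 Ā+ ϖ2 Ā+ ... Ā+ ϖℓ Ā*.

    Assumption: barred letters are upper case, the blocks ϖi are
    non-empty lower-case words.
    """
    bar = "[" + re.escape(bar_letters) + "]"
    middle = (bar + "+").join(re.escape(b) for b in blocks)
    return re.compile(bar + "*" + middle + bar + "*")

def in_constituent(word, blocks):
    """Membership test for the constituent language given by `blocks`."""
    return constituent_regex(blocks).fullmatch(word) is not None

# Case 1: the pumped factor contains a letter of A, so pumping it
# changes the number of unbarred letters and leaves the language.
c, eps, d = "Aa", "bB", "BbaA"
print(in_constituent(c + eps + d, ["ab", "ba"]))       # True
print(in_constituent(c + eps * 2 + d, ["ab", "ba"]))   # False

# Case 2: eps lies in Ā+; swapping u and v lands in the constituent
# whose blocks are rearranged accordingly.
eps, u, v = "A", "ab", "ba"
print(in_constituent(eps + u + eps + v + eps, ["ab", "ba"]))  # True
print(in_constituent(eps + v + eps + u + eps, ["ba", "ab"]))  # True
```

The sketch does not check that the rearranged block sequence is again a constituent of $L_{ξ}^{(s)}$ ; that is the step the proof supplies.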

Lemma 4.4. The syntactic semigroup ${(A \cup \bar{A})}^{+} / \sim_{L}$ of the language $L_{ξ}^{(s)}$ satisfies the pseudoidentity ${(e f)}^{ω} e x f ≏ e x f$ .

Proof. To start with, let us recall the considerations made in the paragraph preceding Lemma 4.3 and the notation introduced there. Let $𝜀, ϰ \in {(A \cup \bar{A})}^{+}$ and $u \in {(A \cup \bar{A})}^{+}$ be any words such that the classes $[𝜀] \sim_{L}$ and $[ϰ] \sim_{L}$ are idempotents in ${(A \cup \bar{A})}^{+} / \sim_{L}$ . Let g be any positive integer such that the class $[{(𝜀 ϰ)}^{g}] \sim_{L}$ is an idempotent in ${(A \cup \bar{A})}^{+} / \sim_{L}$ . We need to show that then we have ${(𝜀 ϰ)}^{g} 𝜀 u ϰ \sim_{L} 𝜀 u ϰ$ . The element $[𝜀] \sim_{L}$ is the image of the z-tuple $([𝜀] \sim_{L_{1}}, [𝜀] \sim_{L_{2}}, \dots, [𝜀] \sim_{L_{z}})$ , and the element $[ϰ] \sim_{L}$ is the image of the z-tuple $([ϰ] \sim_{L_{1}}, [ϰ] \sim_{L_{2}}, \dots, [ϰ] \sim_{L_{z}})$ . Then there exists a positive integer $η$ such that the z-tuple $([𝜀^{η}] \sim_{L_{1}}, [𝜀^{η}] \sim_{L_{2}}, \dots, [𝜀^{η}] \sim_{L_{z}})$ is an idempotent, and the z-tuple $([ϰ^{η}] \sim_{L_{1}}, [ϰ^{η}] \sim_{L_{2}}, \dots, [ϰ^{η}] \sim_{L_{z}})$ is an idempotent. Thus, for every $i \in \{1, 2, \dots, z\}$ , the element $[𝜀^{η}] \sim_{L_{i}}$ is an idempotent, which means that we have $𝜀^{η} \sim_{L_{i}} 𝜀^{2 η}$ . By the definition of the syntactic congruence $\sim_{L_{i}}$ , for all words $c, d \in {(A \cup \bar{A})}^{*}$ , we get $c 𝜀^{η} d \in L_{i} \Leftrightarrow c 𝜀^{2 η} d \in L_{i}$ . However, since the language $L_{i}$ is of the form ${\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ , we can infer the following conclusion. If the word $𝜀$ contained some elements from the alphabet A and if, for some words $c, d \in {(A \cup \bar{A})}^{*}$ , we had $c 𝜀^{η} d \in L_{i}$ , then we would certainly have $c 𝜀^{2 η} d \notin L_{i}$ . But this would contradict the above condition $c 𝜀^{η} d \in L_{i} \Leftrightarrow c 𝜀^{2 η} d \in L_{i}$ .
Thus, whenever $𝜀 \notin {\bar{A}}^{+}$ , then, for any words $c, d \in {(A \cup \bar{A})}^{*}$ , we must have $c 𝜀^{η} d \notin L_{i}$ . Therefore, in particular, in such a case, for any words $p, q \in {(A \cup \bar{A})}^{*}$ , we must have $p {(𝜀^{η} ϰ^{η})}^{g} 𝜀^{η} u ϰ^{η} q \notin L_{i}$ and $p 𝜀^{η} u ϰ^{η} q \notin L_{i}$ . But, by the definition of the syntactic congruence $\sim_{L_{i}}$ , this means that ${(𝜀^{η} ϰ^{η})}^{g} 𝜀^{η} u ϰ^{η} \sim_{L_{i}} 𝜀^{η} u ϰ^{η}$ . This relation must hold for all $i \in \{1, 2, \dots, z\}$ . By the reasoning in the paragraph preceding Lemma 4.3, it follows that ${(𝜀^{η} ϰ^{η})}^{g} 𝜀^{η} u ϰ^{η} \sim_{L} 𝜀^{η} u ϰ^{η}$ . Since here the elements $[𝜀] \sim_{L}$ and $[ϰ] \sim_{L}$ are idempotents, and so they are equal to the elements $[𝜀^{η}] \sim_{L}$ and $[ϰ^{η}] \sim_{L}$ , respectively, we deduce that ${(𝜀 ϰ)}^{g} 𝜀 u ϰ \sim_{L} 𝜀 u ϰ$ , as desired. We have arrived at this conclusion under the assumption that $𝜀 \notin {\bar{A}}^{+}$ . Likewise, we could come to the same conclusion if we started from the assumption that $ϰ \notin {\bar{A}}^{+}$ . Thus it remains to deal with the case when $𝜀 \in {\bar{A}}^{+}$ and $ϰ \in {\bar{A}}^{+}$ .

We have to verify that, in this case, we also have ${(𝜀 ϰ)}^{g} 𝜀 u ϰ \sim_{L} 𝜀 u ϰ$ . By the definition of the syntactic congruence $\sim_{L}$ , this means to check that, for any words $p, q \in {(A \cup \bar{A})}^{*}$ , we have $p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q \in L_{ξ}^{(s)} \Leftrightarrow p 𝜀 u ϰ q \in L_{ξ}^{(s)}$ . Thus assume that $p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q \in L_{ξ}^{(s)}$ . Since $L_{ξ}^{(s)}$ is a finite union of languages of the form ${\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ , it hence ensues that we must have $p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q \in {\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ for some $ℓ$ -tuple $(ϖ_{1}, ϖ_{2}, \dots, ϖ_{ℓ})$ . Since we now have $𝜀 \in {\bar{A}}^{+}$ and $ϰ \in {\bar{A}}^{+}$ , we see that the whole segment ${(𝜀 ϰ)}^{g} 𝜀$ in the word $p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q$ must fall into a single factor of the form ${\bar{A}}^{*}$ or ${\bar{A}}^{+}$ in the language ${\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ . However, then it is evident that we also get $p 𝜀 u ϰ q \in {\bar{A}}^{*} ϖ_{1} {\bar{A}}^{+} ϖ_{2} {\bar{A}}^{+} \dots {\bar{A}}^{+} ϖ_{ℓ} {\bar{A}}^{*}$ , and so we obtain $p 𝜀 u ϰ q \in L_{ξ}^{(s)}$ . In a similar manner, it is possible to show that if $p 𝜀 u ϰ q \in L_{ξ}^{(s)}$ , then also $p {(𝜀 ϰ)}^{g} 𝜀 u ϰ q \in L_{ξ}^{(s)}$ . These findings assure that we indeed have ${(𝜀 ϰ)}^{g} 𝜀 u ϰ \sim_{L} 𝜀 u ϰ$ , again as desired. □

Let further ${\hat{L}}_{ξ}^{(s)}$ be the language over the alphabet A consisting of all words which one may obtain from the words of the language $L_{ξ}^{(s)}$ by replacing in them every occurrence of any letter from $\bar{A}$ by its respective letter from A. It is clear that then the language ${\hat{L}}_{ξ}^{(s)}$ comprises exactly those words over the alphabet A which contain at least s occurrences of the segment $ξ$ . Now we wish to single out, among these words, those which contain exactly s occurrences of the segment $ξ$ . That is, we intend to remove from ${\hat{L}}_{ξ}^{(s)}$ those words which contain more than s occurrences of $ξ$ . For this purpose, note that, up to now, we have carried out all our considerations with an arbitrary non-negative integer s. Thus we may replace this integer s everywhere in the foregoing reasoning with the integer $s + 1$ . In this way, we come up with the languages $L_{ξ}^{(s + 1)}$ and ${\hat{L}}_{ξ}^{(s + 1)}$ . Then, of course, the language ${\hat{L}}_{ξ}^{(s + 1)}$ comprises exactly those words over the alphabet A which contain at least $s + 1$ occurrences of the segment $ξ$ . And then the language ${\hat{L}}_{ξ}^{(s)} \setminus {\hat{L}}_{ξ}^{(s + 1)}$ consists precisely of those words over the alphabet A which contain exactly s occurrences of the segment $ξ$ .
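The passage from "at least s" to "exactly s" occurrences is a plain set difference, which can be sketched as follows. This is an illustration only: it assumes that occurrences of $ξ$ are counted by starting position, so that overlapping occurrences count separately, and the function names are hypothetical.

```python
def count_factor(word, xi):
    """Number of positions at which xi occurs as a factor of word.

    Assumption: overlapping occurrences are counted separately.
    """
    return sum(1 for i in range(len(word) - len(xi) + 1)
               if word.startswith(xi, i))

def at_least(word, xi, s):
    """Membership in the language of words with at least s occurrences of xi."""
    return count_factor(word, xi) >= s

def exactly(word, xi, s):
    """Membership in the difference of the 'at least s' and 'at least s + 1' languages."""
    return at_least(word, xi, s) and not at_least(word, xi, s + 1)

print(count_factor("ababab", "ab"))  # 3
print(exactly("ababab", "ab", 3))    # True
print(exactly("ababab", "ab", 2))    # False
```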

We have shown in Lemmas 4.3 and 4.4 that the syntactic semigroups of the languages $L_{ξ}^{(s)}$ over the alphabet $A \cup \bar{A}$ , for every non-negative integer s, belong to the pseudovariety $U$ . We have subsequently considered the languages ${\hat{L}}_{ξ}^{(s)}$ , again for any non-negative integer s, over the alphabet A, whose creation from the languages $L_{ξ}^{(s)}$ can also be expressed in the following terms. Let us consider the homomorphism $λ : {(A \cup \bar{A})}^{+} \to A^{+}$ of free semigroups determined by the requirements that $λ$ should map every element of A to itself and $λ$ should map every element of the disjoint copy $\bar{A}$ of A to its respective counterpart in A. Then $λ$ is a length-preserving homomorphism, and we have ${\hat{L}}_{ξ}^{(s)} = λ (L_{ξ}^{(s)})$ . By Theorem 3.5, we know that then ${\hat{L}}_{ξ}^{(s)}$ are again recognizable languages which are recognized by suitable semigroups from the power pseudovariety $P U$ . Therefore these languages belong to the collection of languages $A^{+} 𝒲$ . Of course, then also the language ${\hat{L}}_{ξ}^{(s)} \setminus {\hat{L}}_{ξ}^{(s + 1)}$ belongs to this collection $A^{+} 𝒲$ . We have seen above that the language ${\hat{L}}_{ξ}^{(s)} \setminus {\hat{L}}_{ξ}^{(s + 1)}$ consists precisely of those words over the alphabet A which contain exactly s occurrences of the segment $ξ$ . Recall that $ξ$ has been a word in the alphabet A of length $m + 1$ giving rise to the letter $(t, a)$ of the alphabet $B_{T}$ . There may be several such distinct words $ξ$ which give rise to the same letter $(t, a)$ in this way. So let $ξ_{1}, ξ_{2}, \dots, ξ_{μ}$ be all words enjoying this property. Let further $s_{1}, s_{2}, \dots, s_{μ}$ be any non-negative integers. Then for every $i \in \{1, 2, \dots, μ\}$ , we may consider the language ${\hat{L}}_{ξ_{i}}^{(s_{i})} \setminus {\hat{L}}_{ξ_{i}}^{(s_{i} + 1)}$ .
Then the language $⋂_{i = 1}^{μ} ({\hat{L}}_{ξ_{i}}^{(s_{i})} \setminus {\hat{L}}_{ξ_{i}}^{(s_{i} + 1)})$ consists of those words which, for every $i \in \{1, 2, \dots, μ\}$ , contain exactly $s_{i}$ occurrences of the segment $ξ_{i}$ . Denote this last language by ${\bar{L}}_{ξ_{1}, \dots, ξ_{μ}}^{s_{1}, \dots, s_{μ}}$ . This language then again belongs to the collection $A^{+} 𝒲$ . Let finally $\bar{r}$ be any non-negative integer. Consider all $μ$ -tuples $(s_{1}, s_{2}, \dots, s_{μ})$ of non-negative integers such that $s_{1} + s_{2} + \dots + s_{μ} = \bar{r}$ . Then we may form the language $⋃_{s_{1} + \dots + s_{μ} = \bar{r}} {\bar{L}}_{ξ_{1}, \dots, ξ_{μ}}^{s_{1}, \dots, s_{μ}}$ . This language, which we shall denote briefly by ${\bar{L}}^{(\bar{r})}$ , then also belongs to the collection $A^{+} 𝒲$ .
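The finite union over all $μ$ -tuples with sum $\bar{r}$ can be mirrored directly in code. The sketch below is a hedged illustration, again assuming that occurrences are counted by starting position. The helper names are hypothetical, and the words `xis` stand in for $ξ_{1}, \dots, ξ_{μ}$ without being derived from any actual homomorphism $φ$ .

```python
from itertools import product

def count_factor(word, xi):
    """Number of starting positions of xi in word (overlaps counted)."""
    return sum(1 for i in range(len(word) - len(xi) + 1)
               if word.startswith(xi, i))

def tuples_with_sum(mu, total):
    """All mu-tuples of non-negative integers whose entries sum to `total`."""
    return [t for t in product(range(total + 1), repeat=mu)
            if sum(t) == total]

def in_L_bar(word, xis, r_bar):
    """Membership in the union, over all tuples (s_1, ..., s_mu) with sum r_bar,
    of the languages 'exactly s_i occurrences of xi_i for every i'."""
    return any(
        all(count_factor(word, xi) == s for xi, s in zip(xis, t))
        for t in tuples_with_sum(len(xis), r_bar)
    )

print(tuples_with_sum(2, 2))                 # [(0, 2), (1, 1), (2, 0)]
print(in_L_bar("abab", ["ab", "ba"], 3))     # True: two 'ab' and one 'ba'
```

For pairwise distinct words `xis`, membership in the union is equivalent to the total number of occurrences of all the words in `xis` being equal to `r_bar`.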

It should now be clear that the language ${\bar{L}}^{(\bar{r})}$ consists precisely of those words $w \in A^{+}$ which enjoy the following property. We focus first on the words $w \in A^{+}$ whose length $| w |$ is greater than m. Then such a word w belongs to the language ${\bar{L}}^{(\bar{r})}$ if and only if the number of those appearances of the letter $(t, a)$ in the word $σ_{φ} (w)$ which do not occur in the initial segment of $σ_{φ} (w)$ of length m is equal to $\bar{r}$ . As far as the words $w \in A^{+}$ whose length $| w |$ does not exceed m are concerned, the following holds true. If $\bar{r} = 0$ , then all such words w belong to the language ${\bar{L}}^{(\bar{r})}$ . If, on the contrary, $\bar{r} > 0$ , then the language ${\bar{L}}^{(\bar{r})}$ contains no words w of this kind.

Let $r_{0}$ be a non-negative integer such that $r_{0} \leq m$ . There are only finitely many words u in the alphabet A of length $| u |$ equal to m. Let $M^{(r_{0})}$ be the set of all words u of this kind having the property that the number of appearances of the letter $(t, a)$ in the word $σ_{φ} (u)$ is equal to $r_{0}$ . Then $M^{(r_{0})}$ is a finite language over the alphabet A. Consider next the language ${\bar{M}}^{(r_{0})} = M^{(r_{0})} A^{+}$ . Then ${\bar{M}}^{(r_{0})}$ is a recognizable language. It is a familiar fact (see [6, Sec. 2.3]) that this language ${\bar{M}}^{(r_{0})}$ can be recognized by a semigroup from the pseudovariety $K$ . Since clearly $K \subseteq U$ , and as $U \subseteq P U$ , we see that this language ${\bar{M}}^{(r_{0})}$ can be recognized by a semigroup from the pseudovariety $P U$ , and hence it belongs to the collection of languages $A^{+} 𝒲$ . Note that this language contains no words w whose length $| w |$ does not exceed m. Take next any non-negative integer $\bar{r}$ and consider the language ${\bar{M}}^{(r_{0})} \cap {\bar{L}}^{(\bar{r})}$ . This language then also belongs to the collection $A^{+} 𝒲$ and it consists of all those words w in the alphabet A of length $| w |$ greater than m that have the property that, in the word $σ_{φ} (w)$ , there are $r_{0}$ appearances of the letter $(t, a)$ in the initial segment of this word of length m and there are $\bar{r}$ appearances of the letter $(t, a)$ in the rest of this word. Finally, for every non-negative integer r, consider all pairs $(r_{0}, \bar{r})$ of non-negative integers such that $r_{0} + \bar{r} = r$ . Then we may consider the language $⋃_{r_{0} + \bar{r} = r} ({\bar{M}}^{(r_{0})} \cap {\bar{L}}^{(\bar{r})})$ .
This last language then again belongs to the collection $A^{+} 𝒲$ and it consists of those words w in the alphabet A whose length $| w |$ is greater than m and which possess the property that, in the word $σ_{φ} (w)$ , there are exactly r appearances of the letter $(t, a)$ .

At last, given a non-negative integer r, we may consider the set ${\bar{N}}^{(r)}$ of all words v in the alphabet A of length $| v |$ not exceeding m and having the property that, in the word $σ_{φ} (v)$ , there are exactly r appearances of the letter $(t, a)$ . Then ${\bar{N}}^{(r)}$ is a finite language over the alphabet A, which can therefore be recognized by a nilpotent semigroup, and so it can be certainly recognized by a semigroup from the pseudovariety $P U$ . Hence this language belongs to the collection of languages $A^{+} 𝒲$ . Finally we may consider the language ${\bar{N}}^{(r)} \cup ⋃_{r_{0} + \bar{r} = r} ({\bar{M}}^{(r_{0})} \cap {\bar{L}}^{(\bar{r})})$ . This last language then again belongs to the collection $A^{+} 𝒲$ and it consists of all those words w in the alphabet A which possess the property that, in the word $σ_{φ} (w)$ , there are exactly r appearances of the letter $(t, a)$ . But this property means that $σ_{φ} (w) \in L ((t, a), r)$ . Consequently, the last-mentioned language is precisely our language $σ_{φ}^{- 1} (L ((t, a), r))$ . Therefore this language belongs to the collection $A^{+} 𝒲$ , as desired.

References

[1] J. Almeida, Locally commutative power semigroups and counting factors of words, Theoret. Comput. Sci. 108 (1993) 3–16.
[2] J. Almeida, Finite Semigroups and Universal Algebra (World Scientific, Singapore, 1994).
[3] S. Eilenberg, Automata, Languages and Machines, Vol. B (Academic Press, New York, 1976).
[4] J. Kad’ourek, About the power pseudovariety $P C S$ , Rocky Mountain J. Math. 51 (2021) 2045–2102.
[5] G. Lallement, Semigroups and Combinatorial Applications (John Wiley & Sons, New York, 1979).
[6] J.-E. Pin, Variétés de Langages Formels (Masson, Paris, 1984).
[7] J.-E. Pin and P. Weil, The wreath product principle for ordered semigroups, Comm. Algebra 30 (2002) 5677–5713.
[8] H. Straubing, Families of recognizable sets corresponding to certain varieties of finite monoids, J. Pure Appl. Algebra 15 (1979) 305–318.
[9] H. Straubing, Recognizable sets and power sets of finite semigroups, Semigroup Forum 18 (1979) 331–340.
[10] D. Thérien and A. Weiss, Graph congruences and wreath products, J. Pure Appl. Algebra 36 (1985) 205–215.