Operations On Languages

Introduction: Given languages L,L1, and L2, new languages may be defined using various operations. First, consider the set operations of union, intersection, set difference, and symmetric difference. These familiar set operations help define new languages as defined below:

Union:L1 ∪ L2 = {x | x ∈ L1 ∨ x ∈ L2}

Intersection: L1 ∩ L2 = {x | x ∈ L1 ∧ x ∈ L2}

Set difference:L1 \ L2 = {x | x ∈ L1 ∧x /∈ L2}

Symmetric difference: (A \ B) ∪ (B \ A)

Concatenation and exponentiation

The concatenation operation on two languages L1 and L2 performs the concatenation (juxtaposition) of strings from these languages. Exponentiation is n-ary concatenation. These ideas are basically quite simple. Suppose we have a finite-state machine M1 that accepts strings containing either an odd number of a’s or an even number of b’s. Suppose we have another finite-state machine M2 that accepts strings containing either an even number of a’s or an odd number of b’s. We can now build a new machine M whose runs are of the form xy such that xis accepted by M1 and y by M2. We can also build a new machine M∗1whose runs are of the form xx . . .x, where x is repeated some k ≥ 0 times, and x is a run of M1. The set of runs of these new machines can be captured using concatenation and exponentiation.

We now formally define the concatenation and exponentiation operations on languages:

Concatenation: L1 ◦ L2 = {xy | x ∈ L1 ∧ y ∈ L2}.

Note: Often, we will omit the ‘◦’ operator and write L1L2 instead of L1 ◦ L2.

Exponentiation: First of all, for any language L, define L0 = {ε}.

That is, the zero-th exponent of any language L is {ε}. Note that this means that even for the empty language ∅, ∅∗ = {ε}. (The reason for this convention will be clear momentarily).

Exponentiation may now be defined in one of two equivalent ways:

Definition-1 of Exponentiation: For all n ≥ 1,

Ln = {x1 . . .xn | xi ∈ L}.

Definition-2 of Exponentiation:

For n > 0, Ln = {xy | x ∈ L ∧ y ∈ Ln−1}.

In the second definition, note that we should define the basis case for the recursion, which is L0. We must put into L0 anything that serves as the identity element for string concatenation, which is ε. Hence, we define L0 = {ε} for any language, L.

Consider a simple example:

{a, aba} ◦ {ca, da} = {aca, ada, abaca, abada},

while

{ca, da}3 = {ca, da} ◦ {ca, da} ◦ {ca, da},

which is the set

{cacaca, cacada, cadaca, cadada, dacaca, dacada, dadaca, dadada}.

Consider another example where L1 (which models the runs of M1) is the set

L1 = {x | x is a sequence of odd number of a_s or even number of b_ s} and L2 (which models the runs of M2) is the set L2 = {x | x is a sequence of even number of a_s or odd number of b_ s}.Then, L1 ◦ L2 is a language in which each string consists of

• an odd number of a’s,

• an odd number of b’s,

• an odd number of a’s followed by an odd number of b’s, or

• an even number of b’s followed by an even number of a’s.

In addition, Lk1 is a language in which each string can be obtained asa k-fold repetition of a string from L1.

SKEDSOFT