THEORY OF COMPUTING AND MACHINE ARCHITECTURE
4. Formal Language Theory
Formal Language Theory
Formal language theory studies the syntactic and structural properties of languages used in computation. A formal language is a set of strings (sequences of symbols) formed from a finite alphabet. Formal languages are classified based on their complexity and the type of automaton that can recognize them:
v Regular Languages: Can be recognized by finite automata (both DFA and NFA). They are described by regular expressions and can be used to define patterns in text, like search queries.
v Context-Free Languages (CFL): Can be recognized by pushdown automata (PDA). These are more powerful than regular languages and can describe many programming languages' syntax. They are typically described by context-free grammars (CFG).
v Context-Sensitive Languages (CSL): More powerful than context-free languages. They can be recognized by linear-bounded automata (LBA) and are useful in describing some complex language structures.
v Recursively Enumerable Languages (REL): These are the most general class of languages that can be recognized by a Turing machine. They include languages that are decidable (where a machine can always give an answer) and undecidable languages (where no algorithm can always give an answer).
Key Concepts:
- Language Hierarchy:
- Regular Languages: Recognized by finite automata.
- Context-Free Languages: Recognized by pushdown automata.
- Context-Sensitive Languages: Recognized by linear-bounded automata.
- Recursively Enumerable Languages: Recognized by Turing machines.
- Grammars:
- Regular Grammar: Generates regular languages.
- Context-Free Grammar: Generates context-free languages.
- Context-Sensitive Grammar: Generates context-sensitive languages.
- Applications:
- Compiler design (lexical analysis, parsing).