编译原理课程设计实验5:语义分析和中间代码生成器
该实验实现了递归下降翻译器,对测试代码进行词法、语法和语义分析,并生成四元式中间代码。系统包含三个模块:词法分析识别各类符号并输出token序列;语法分析采用递归下降法,记录产生式推导过程;语义分析通过语法制导翻译生成四元式,处理控制流回填和数组访问。关键数据结构包括四元式表、属性结构体和临时变量管理,支持布尔表达式跳转、赋值语句和数组元素访问的翻译。测试案例展示了完整的分析流程,输出结果包含三阶
目录
一、实验要求
根据前面实验内容的Python代码(可以参考我的《编译原理课程设计》专栏,里面有实现词法分析、语法分析和语义分析和中间代码生成器-赋值表达式的源代码),添加功能实现递归下降翻译器。
注意
数据结构:
四元式:结构体
四元式序列:结构体数组
跳转语句的四元式的第 4 个域需回填。
翻译模式与步骤
1.测试:
输入要测试的代码到"test5.txt"文件中:
while(a<b)
if(c)
while(d>e)
x1=y1;
else
x2=y2;
x3[k]=y3[i,j]+a1*a2;
输出内容到控制台、"词法分析.txt"文件、"语法分析.txt"文件与"语义分析.txt"文件:
词法分析:
<20, ->
<81, ->
<111, a>
<49, ->
<111, b>
<82, ->
<17, ->
<81, ->
<111, c>
<82, ->
<20, ->
<81, ->
<111, d>
<47, ->
<111, e>
<82, ->
<111, x1>
<46, ->
<111, y1>
<84, ->
<15, ->
<111, x2>
<46, ->
<111, y2>
<84, ->
<111, x3>
<88, ->
<111, k>
<89, ->
<46, ->
<111, y3>
<88, ->
<111, i>
<83, ->
<111, j>
<89, ->
<41, ->
<111, a1>
<43, ->
<111, a2>
<84, ->
语法分析:
1) 按使用产生式过程
(1)stmts ⟶ stmt rest0
(2)stmt ⟶ while(m1 bool) m2 stmt1
(3)bool ⟶ equality
(4)equality ⟶ rel rest4
(5)rel ⟶ expr rop_expr
(6)expr ⟶ term rest5
(7)term ⟶ unary rest6
(8)unary ⟶ factor
(9)factor ⟶ loc
(10)loc ⟶ id resta
(11)resta ⟶ ε
(12)rest6 ⟶ ε
(13)rest5 ⟶ ε
(14)rop_expr ⟶ < expr
(15)expr ⟶ term rest5
(16)term ⟶ unary rest6
(17)unary ⟶ factor
(18)factor ⟶ loc
(19)loc ⟶ id resta
(20)resta ⟶ ε
(21)rest6 ⟶ ε
(22)rest5 ⟶ ε
(23)rest4 ⟶ ε
(24)stmt ⟶ if(bool) m1 stmt1 n else m2 stmt2
(25)bool ⟶ equality
(26)equality ⟶ rel rest4
(27)rel ⟶ expr rop_expr
(28)expr ⟶ term rest5
(29)term ⟶ unary rest6
(30)unary ⟶ factor
(31)factor ⟶ loc
(32)loc ⟶ id resta
(33)resta ⟶ ε
(34)rest6 ⟶ ε
(35)rest5 ⟶ ε
(36)rop_expr ⟶ ε
(37)rest4 ⟶ ε
(38)stmt ⟶ while(m1 bool) m2 stmt1
(39)bool ⟶ equality
(40)equality ⟶ rel rest4
(41)rel ⟶ expr rop_expr
(42)expr ⟶ term rest5
(43)term ⟶ unary rest6
(44)unary ⟶ factor
(45)factor ⟶ loc
(46)loc ⟶ id resta
(47)resta ⟶ ε
(48)rest6 ⟶ ε
(49)rest5 ⟶ ε
(50)rop_expr ⟶ > expr
(51)expr ⟶ term rest5
(52)term ⟶ unary rest6
(53)unary ⟶ factor
(54)factor ⟶ loc
(55)loc ⟶ id resta
(56)resta ⟶ ε
(57)rest6 ⟶ ε
(58)rest5 ⟶ ε
(59)rest4 ⟶ ε
(60)stmt ⟶ loc = expr ;
(61)loc ⟶ id resta
(62)resta ⟶ ε
(63)expr ⟶ term rest5
(64)term ⟶ unary rest6
(65)unary ⟶ factor
(66)factor ⟶ loc
(67)loc ⟶ id resta
(68)resta ⟶ ε
(69)rest6 ⟶ ε
(70)rest5 ⟶ ε
(71)stmt ⟶ loc = expr ;
(72)loc ⟶ id resta
(73)resta ⟶ ε
(74)expr ⟶ term rest5
(75)term ⟶ unary rest6
(76)unary ⟶ factor
(77)factor ⟶ loc
(78)loc ⟶ id resta
(79)resta ⟶ ε
(80)rest6 ⟶ ε
(81)rest5 ⟶ ε
(82)rest0 ⟶ m stmt rest01
(83)stmt ⟶ loc = expr ;
(84)loc ⟶ id resta
(85)resta ⟶ [ elist ]
(86)elist ⟶ expr rest1
(87)expr ⟶ term rest5
(88)term ⟶ unary rest6
(89)unary ⟶ factor
(90)factor ⟶ loc
(91)loc ⟶ id resta
(92)resta ⟶ ε
(93)rest6 ⟶ ε
(94)rest5 ⟶ ε
(95)rest1 ⟶ ε
(96)expr ⟶ term rest5
(97)term ⟶ unary rest6
(98)unary ⟶ factor
(99)factor ⟶ loc
(100)loc ⟶ id resta
(101)resta ⟶ [ elist ]
(102)elist ⟶ expr rest1
(103)expr ⟶ term rest5
(104)term ⟶ unary rest6
(105)unary ⟶ factor
(106)factor ⟶ loc
(107)loc ⟶ id resta
(108)resta ⟶ ε
(109)rest6 ⟶ ε
(110)rest5 ⟶ ε
(111)rest1 ⟶ , expr rest1
(112)expr ⟶ term rest5
(113)term ⟶ unary rest6
(114)unary ⟶ factor
(115)factor ⟶ loc
(116)loc ⟶ id resta
(117)resta ⟶ ε
(118)rest6 ⟶ ε
(119)rest5 ⟶ ε
(120)rest1 ⟶ ε
(121)rest6 ⟶ ε
(122)rest5 ⟶ + term rest5
(123)term ⟶ unary rest6
(124)unary ⟶ factor
(125)factor ⟶ loc
(126)loc ⟶ id resta
(127)resta ⟶ ε
(128)rest6 ⟶ * unary rest6
(129)unary ⟶ factor
(130)factor ⟶ loc
(131)loc ⟶ id resta
(132)resta ⟶ ε
(133)rest6 ⟶ ε
(134)rest5 ⟶ ε
(135)rest0 ⟶ ℇ
2) 按推导过程
(1) stmts
(2) stmt rest0
(3) while(m1 bool) m2 stmt1 rest0
(4) while(m1 bool) m2 stmt1 m stmt rest01
(5) while(m1 bool) m2 stmt1 m loc = expr ; rest01
(6) while(m1 bool) m2 stmt1 m id resta = expr ; rest01
(7) while(m1 bool) m2 stmt1 m id [ elist ] = expr ; rest01
(8) while(m1 bool) m2 stmt1 m id [ expr rest1 ] = expr ; rest01
(9) while(m1 bool) m2 stmt1 m id [ term rest5 rest1 ] = expr ; rest01
(10) while(m1 bool) m2 stmt1 m id [ unary rest6 rest5 rest1 ] = expr ; rest01
(11) while(m1 bool) m2 stmt1 m id [ factor rest6 rest5 rest1 ] = expr ; rest01
(12) while(m1 bool) m2 stmt1 m id [ loc rest6 rest5 rest1 ] = expr ; rest01
(13) while(m1 bool) m2 stmt1 m id [ id resta rest6 rest5 rest1 ] = expr ; rest01
(14) while(m1 bool) m2 stmt1 m id [ id ] = term rest5 ; rest01
(15) while(m1 bool) m2 stmt1 m id [ id ] = unary rest6 rest5 ; rest01
(16) while(m1 bool) m2 stmt1 m id [ id ] = factor rest6 rest5 ; rest01
(17) while(m1 bool) m2 stmt1 m id [ id ] = loc rest6 rest5 ; rest01
(18) while(m1 bool) m2 stmt1 m id [ id ] = id resta rest6 rest5 ; rest01
(19) while(m1 bool) m2 stmt1 m id [ id ] = id [ elist ] rest6 rest5 ; rest01
(20) while(m1 bool) m2 stmt1 m id [ id ] = id [ expr rest1 ] rest6 rest5 ; rest01
(21) while(m1 bool) m2 stmt1 m id [ id ] = id [ term rest5 rest1 ] rest6 rest5 ; rest01
(22) while(m1 bool) m2 stmt1 m id [ id ] = id [ unary rest6 rest5 rest1 ] rest6 rest5 ; rest01
(23) while(m1 bool) m2 stmt1 m id [ id ] = id [ factor rest6 rest5 rest1 ] rest6 rest5 ; rest01
(24) while(m1 bool) m2 stmt1 m id [ id ] = id [ loc rest6 rest5 rest1 ] rest6 rest5 ; rest01
(25) while(m1 bool) m2 stmt1 m id [ id ] = id [ id resta rest6 rest5 rest1 ] rest6 rest5 ; rest01
(26) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , expr rest1 ] rest6 rest5 ; rest01
(27) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , term rest5 rest1 ] rest6 rest5 ; rest01
(28) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , unary rest6 rest5 rest1 ] rest6 rest5 ; rest01
(29) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , factor rest6 rest5 rest1 ] rest6 rest5 ; rest01
(30) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , loc rest6 rest5 rest1 ] rest6 rest5 ; rest01
(31) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id resta rest6 rest5 rest1 ] rest6 rest5 ; rest01
(32) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + term rest5 ; rest01
(33) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + unary rest6 rest5 ; rest01
(34) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + factor rest6 rest5 ; rest01
(35) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + loc rest6 rest5 ; rest01
(36) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + id resta rest6 rest5 ; rest01
(37) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + id * unary rest6 rest5 ; rest01
(38) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + id * factor rest6 rest5 ; rest01
(39) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + id * loc rest6 rest5 ; rest01
(40) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + id * id resta rest6 rest5 ; rest01
语法分析完成,结果已保存到"语法分析.txt"文件中。
语义分析:
生成的四元式:
0: j<, a, b, 2
1: j, -, -, 11
2: jnz, c, -, 4
3: j, -, -, 9
4: j>, d, e, 6
5: j, -, -, 0
6: =, y1, -, x1
7: j, -, -, 4
8: j, -, -, 0
9: =, y2, -, x2
10: j, -, -, 0
11: -, x3, C, t1
12: *, k, w, t2
13: *, i, n2, t3
14: +, t3, j, t3
15: -, y3, C, t4
16: *, t3, w, t5
17: =[], t4[t5], -, t6
18: *, a1, a2, t7
19: +, t6, t7, t8
20: []=, t8, -, t1[t2]
"词法分析.txt"部分截图:
"语法分析.txt"部分截图:
"语义分析.txt"截图:
文法
语义动作,格式:{语义分析}
stmts⟶stmt
rest0
{rest0.inNextlist=stmt.nextlist}
{stmts.nextlist=rest0.nextlist}
rest0 ⟶m stmt
rest01
{backpatch(rest0.inNextlist, m.quad);
rest01.inNextlist=stmt.nextlist}
{rest0.nextlist=rest01.nextlist}
rest0 ⟶ℇ
{rest0.nextlist=rest0.inNextlist}
stmt⟶loc=expr ;
{if(loc.offset=null)
emit( ‘=,’ expr.place ‘, - ,’ loc.place);
else
emit(‘[]=,’ expr.place ‘, - ,’ loc.place ‘[’ loc.offset ‘]’ );
stmt.nextlist=makelist()}
stmt⟶if(bool) m1 stmt1 n else m2 stmt2 {backpatch(bool.truelist, m1.quad);
backpatch(bool.falselist, m2.quad);
stmt.nextlist=
merge(stmt1.nextlist, n.nextlist, m2.nextlist)}
stmt⟶ while(m1 bool) m2 stmt1
{backpatch(stmt1.nextlist, m1.quad);
backpatch(bool.truelist, m2.quad);
stmt.nextlist=bool.falselist;
emit( ‘j, -, -, ’ m1.quad)}
m⟶ℇ
{m.quad=nextquad}
n⟶ℇ
{n.nextlist=makelist(nextquad);
emit( ‘j, -, -, 0’)}
loc⟶id
resta
{resta.inArray=id.place}
{loc.place=resta.place;
loc.offset=resta.offset}
resta⟶[
elist
]
{elist.inArray=resta.inArray}
{resta.place=newtemp();
emit(‘-,’ elist.arry ‘,’ C ‘,’ resta.place);
resta.offset=newtemp();
emit(‘*, ’ w ‘,’ elist.offset ‘,’ resta.offset);
}
resta⟶ℇ
{resta.place=resta.inArray;
resta.offset=null}
elist ⟶expr
rest1
{rest1.inArray=elist.inArray;
rest1.inNdim=1;
rest1.inPlace=expr.place}
{elist.array=rest1.array;
elist.offset=rest1.offset}
rest1⟶ ,
expr
rest11
{t=newtemp();
m=rest1.inNdim+1;
emit(‘*,’ rest1.inPlace ‘,’ limit(rest1.inarray,m) ‘,’ t);
emit(‘+,’ t ‘,’ expr.place ‘,’ t);
rest11.inArray=rest1.inArray;
rest11.inNdim=m;
rest11.inNplace=t}
{rest1.array=rest11.array;
rest1.offset=rest11.offset}
rest1⟶ℇ
{rest1.array=rest1.inArray;
rest1.offset=rest1.inPlace}
bool ⟶ equality
{bool.truelist=equality.truelist
bool.falselist=equality.falselist }
equality ⟶ rel
rest4
{rest4.inTruelist=rel.truelist
rest4.inFalselist=rel.falselist}
{equality.truelist=rest4.truelist
equality.falselist=rest4.falselist}
rest4 ⟶ == rel rest41
rest4 ⟶ != rel rest41
rest4 ⟶ ℇ
{rest4.truelist=rest4.inTruelist
rest4.falselist=rest4.inFalselist}
rel ⟶ expr
rop_expr
{rop_expr.inPlace=expr.place}
{rel.truelist=rop_expr.truelist
rel.falselist=rop_expr.falselist}
rop_expr ⟶ <expr
{rop_expr.truelist=makelist(nextquad);
rop_expr.falselist=makelist(nextquad+1);
emit(‘j<,’ rop_expr.inPlace ‘,’ expr.place ‘, -’);
emit(‘j, -, -, -’)}
rop_expr ⟶ <=expr
{rop_expr.truelist=makelist(nextquad);
rop_expr.falselist=makelist(nextquad+1);
emit(‘j<=,’ rop_expr.inPlace ‘,’ expr.place ‘, -’);
emit(‘j, -, -, -’)}
rop_expr ⟶ >expr
{rop_expr.truelist=makelist(nextquad);
rop_expr.falselist=makelist(nextquad+1);
emit(‘j>,’ rop_expr.inPlace ‘,’ expr.place ‘, -’);
emit(‘j, -, -, -’)}
rop_expr ⟶ >=expr
{rop_expr.truelist=makelist(nextquad);
rop_expr.falselist=makelist(nextquad+1);
emit(‘j>=,’ rop_expr.inPlace ‘,’ expr.place ‘, -’);
emit(‘j, -, -, -’)}
rop_expr ⟶ ℇ
{rop_expr.truelist=makelist(nextquad);
rop_expr.falselist=makelist(nextquad+1);
emit(‘jnz,’ rop_expr.inPlace ‘, -, -’);
emit(‘j, -, -, -’)}
expr ⟶ term
rest5
{rest5.in=term.place}
{expr.place=rest5.place}
rest5⟶ +term
rest51
{rest51.in=newtemp();
emit(‘+,’ rest5.in ‘,’ term.place ‘,’ rest51.in)}
{rest5.place =rest51 .place}
rest5⟶ -term
rest51
{rest51.in=newtemp();
emit(‘-,’ rest5.in ‘,’ term.place ‘,’ rest51.in)}
{rest5.place =rest51 .place}
rest5⟶ ℇ
{rest5.place = rest5.in}
term⟶ unary
rest6
{rest6.in = unary.place}
{term.place = rest6.place}
rest6⟶ *unary
rest61
{rest61.in=newtemp();
emit(‘*,’ rest6.in ‘,’ unary.place ‘,’ rest61.in)}
{rest6.place = rest61 .place}
rest6⟶ /unary
rest61
{rest61.in=newtemp();
emit(‘/,’ rest6.in ‘,’ unary.place ‘,’ rest61.in)}
{rest6.place = rest61 .place}
rest6⟶ ℇ
{rest6.place = rest6.in}
unary⟶factor
{unary.place = factor.place}
factor⟶ (expr)
{factor.place = expr.place}
factor⟶loc
{if(loc.offset=null)
factor.place = loc.place
else {factor.place=newtemp();
emit(‘=[],’ loc.place ‘[’ loc.offset ‘]’ ‘, -,’ factor.place )}}
factor⟶num
{factor.place = num.value}
二、Python代码实现语义分析和中间代码生成器
# 种别码映射表
token_map = {
# 运算符
'+': 41, '-': 42, '*': 43, '/': 44, '%': 45, '=': 46,
'>': 47, '>=': 48, '<': 49, '<=': 50, '==': 51, '!=': 52,
'&&': 53, '||': 54, '!': 55, '++': 56, '--': 57,
# 输出和输入符号
'<<': 90, '>>': 91,
# 关键字
'int': 5, 'else': 15, 'if': 17, 'while': 20, 'double': 21,
'string': 22, 'char': 23, 'include': 24, 'using': 25,
'namespace': 26, 'std': 27, 'main': 28, 'return': 29,
'void': 30, 'iostream': 31, 'cin': 32, 'cout': 33, 'endl': 34
}
def Recognizestr(ch):
word = ""
while ch.isalnum() or ch == '_':
word += ch
ch = file.read(1)
# 关键字处理
code = token_map.get(word, None)
if code:
return ch, f'<{code}, ->'
return ch, f'<111, {word}>'
def RecognizeDigit(ch):
data = ""
while ch in {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '.'}:
data += ch
ch = file.read(1)
return ch, f'<100, {data}>'
def Recognizeop(ch):
op = ch
peek_ch = file.read(1)
# 处理双字符运算符
double_ops = {
'+': '+', '-': '-', '=': '=', '>': '=',
'<': '=', '!': '=', '&': '&', '|': '|',
}
# 特殊处理 << 和 >>
if op == '<' and peek_ch == '<':
op += peek_ch
ch = file.read(1)
elif op == '>' and peek_ch == '>':
op += peek_ch
ch = file.read(1)
# 处理其他双字符运算符
elif op in double_ops and peek_ch == double_ops[op]:
op += peek_ch
ch = file.read(1)
else:
ch = peek_ch
# 获取种别码
code = token_map.get(op, None)
if code:
return ch, f'<{code}, ->'
return ch, f'<{op}, {op}>'
def Recognizeoth(ch):
# 处理括号和单字符符号
if ch in {'(', ')', '[', ']', '{', '}', ';', ',', '#', '<', '>'}:
code_mapping = {
'(': 81, ')': 82, '[': 88, ']': 89,
'{': 86, '}': 87, ';': 84, ',': 83,
'#': 85, '<': 49, '>': 47, '<<': 90,
'>>': 91
}
code = code_mapping.get(ch, 0)
res = f'<{code}, ->' if code else f'<{ch}, {ch}>'
return file.read(1), res
# 处理逻辑运算符
if ch in {'&', '|'}:
next_ch = file.read(1)
if next_ch == ch:
code = 53 if ch == '&' else 54
return file.read(1), f'<{code}, ->'
return next_ch, '<error>'
return file.read(1), f'<{ch}, {ch}>'
import re
import os
from collections import namedtuple
# 定义Token类
class Token:
def __init__(self, type, value):
self.type = type
self.value = value
def __str__(self):
return f"{self.type}({self.value})"
# 四元式结构
class Quadruple:
def __init__(self, op, arg1, arg2, result):
self.op = op
self.arg1 = arg1
self.arg2 = arg2
self.result = result
def __str__(self):
return f"{self.op}, {self.arg1}, {self.arg2}, {self.result}"
# 属性信息结构
class Attribute:
def __init__(self, place=None, offset=None, array=None):
self.place = place # 存放值的变量名
self.offset = offset # 数组偏移量
self.array = array # 数组基地址
# 用于布尔表达式
self.truelist = [] # 真出口跳转链表
self.falselist = [] # 假出口跳转链表
# 用于控制流
self.nextlist = [] # 下一条语句跳转链表
# 临时变量,用于传递属性
self.inArray = None
self.inOffset = None
self.inPlace = None
self.inNextlist = None
self.inTruelist = None
self.inFalselist = None
self.inNdim = 0
self.quad = 0 # 用于记录当前四元式的位置
# 用于多维数组
self.dims = [] # 维度列表
# 标记是否包含多个索引
self.has_multiple_indices = False
# 语义分析器 - 实现语法制导翻译
class SemanticAnalyzer:
def __init__(self):
# 初始化四元式表
self.quad_table = []
# 初始化临时变量计数器
self.temp_count = 0
# 初始化数组信息 (假设所有数组都是二维的,第二维大小为n2)
self.array_info = {
'A': {'dimensions': 2, 'size': [None, None]},
'n2': 10, # 假设第二维大小为10
'w': 4 # 假设每个元素大小为4字节
}
def emit(self, op, arg1, arg2, result):
"""生成四元式并添加到四元式表"""
quad = Quadruple(op, arg1, arg2, result)
self.quad_table.append(quad)
return len(self.quad_table) - 1 # 返回四元式索引
def newtemp(self):
"""生成新的临时变量名"""
self.temp_count += 1
return f"t{self.temp_count}"
def nextquad(self):
"""返回下一个四元式的索引"""
return len(self.quad_table)
def makelist(self, i):
"""创建只包含i的链表"""
return [i] if i is not None else []
def merge(self, list1, list2):
"""合并两个链表"""
if list1 is None:
list1 = []
if list2 is None:
list2 = []
return list1 + list2
def backpatch(self, address_list, quad_index):
"""回填操作 - 将address_list中的所有地址回填为quad_index"""
for i in address_list:
if i < len(self.quad_table):
self.quad_table[i].result = str(quad_index)
def print_quadruples(self, output_file=None):
"""打印四元式表"""
output = []
for i, quad in enumerate(self.quad_table):
quad_str = f"{i}: {quad.op},\t{quad.arg1},\t{quad.arg2},\t{quad.result}"
output.append(quad_str)
print(quad_str)
if output_file:
try:
with open(output_file, 'w', encoding='utf-8') as f:
for line in output:
f.write(line + '\n')
except Exception as e:
print(f"保存四元式到文件时出错: {e}")
# 实际的语法和语义分析器
class CompleteSyntaxSemanticAnalyzer:
def __init__(self):
# 用于记录产生式步骤和推导步骤
self.production_steps = []
self.derivation_steps = []
self.step_count = 0
# 当前处理的tokens
self.tokens = []
self.current_token = None
self.token_index = 0
# 当前推导的句型
self.current_sentential = ["stmts"]
# 语义分析器
self.semantic = SemanticAnalyzer()
def add_production(self, production):
"""添加一条使用的产生式"""
self.step_count += 1
self.production_steps.append(production)
def add_derivation(self, sentential_form):
"""添加一个推导步骤"""
self.derivation_steps.append(" ".join(sentential_form))
def tokenize(self, text):
"""词法分析,将输入文本转换为token序列"""
tokens = []
# 定义token模式
patterns = [
(r'[ \t\n\r]+', None), # 忽略所有空白字符
(r'//.*', None), # 忽略注释
(r'while', 'WHILE'),
(r'if', 'IF'),
(r'else', 'ELSE'),
(r'[a-zA-Z_][a-zA-Z0-9_]*', 'ID'),
(r'[0-9]+', 'NUM'),
(r'\(', 'LPAREN'),
(r'\)', 'RPAREN'),
(r'\[', 'LBRACK'),
(r'\]', 'RBRACK'),
(r';', 'SEMICOLON'),
(r',', 'COMMA'),
(r'=', 'ASSIGN'),
(r'\+', 'PLUS'),
(r'-', 'MINUS'),
(r'\*', 'MUL'),
(r'/', 'DIV'),
(r'<', 'LT'),
(r'<=', 'LE'),
(r'>', 'GT'),
(r'>=', 'GE'),
(r'==', 'EQ'),
(r'!=', 'NE'),
(r'&&', 'AND'),
(r'\|\|', 'OR')
]
# 手动匹配每种模式
pos = 0
line = 1
column = 1
while pos < len(text):
match = None
# 尝试匹配每种模式
for pattern, token_type in patterns:
regex = re.compile(pattern)
m = regex.match(text, pos)
if m:
match = m
if token_type: # 如果不是要跳过的模式
value = m.group(0)
tokens.append(Token(token_type, value))
# 更新位置信息
matched_text = m.group(0)
newlines = matched_text.count('\n')
if newlines > 0:
line += newlines
column = len(matched_text) - matched_text.rindex('\n')
else:
column += len(matched_text)
pos = m.end()
break
if not match:
# 如果没有匹配到任何模式,报告错误并跳过当前字符
print(f"无法识别的字符: '{text[pos]}' 在行 {line} 列 {column}")
pos += 1
column += 1
return tokens
def get_token(self):
"""获取当前token"""
if self.token_index < len(self.tokens):
self.current_token = self.tokens[self.token_index]
self.token_index += 1
return self.current_token
else:
self.current_token = None
return None
def peek_token(self):
"""查看下一个token但不消耗它"""
if self.token_index < len(self.tokens):
return self.tokens[self.token_index]
return None
def match(self, expected_type):
"""匹配当前token类型"""
if self.current_token and self.current_token.type == expected_type:
token = self.current_token
self.get_token()
return token
else:
expected = expected_type
found = self.current_token.type if self.current_token else "EOF"
raise SyntaxError(f"语法错误: 期望 {expected},但得到 {found}")
def parse(self, text):
"""解析输入文本"""
try:
self.tokens = self.tokenize(text)
if not self.tokens:
print("警告: 没有识别到任何token,输入可能为空或仅包含空白字符")
return False
self.token_index = 0
self.get_token() # 初始化第一个token
self.production_steps = []
self.derivation_steps = []
self.step_count = 0
self.semantic = SemanticAnalyzer()
# 初始推导步骤
self.current_sentential = ["stmts"]
self.add_derivation(self.current_sentential)
stmts_attr = self.parse_stmts()
return True
except SyntaxError as e:
print(f"语法分析错误: {e}")
return False
except Exception as e:
print(f"解析过程中出现未知错误: {e}")
import traceback
traceback.print_exc()
return False
# 以下是语义分析器的解析函数,每个函数返回对应非终结符的属性
def parse_stmts(self):
"""解析stmts"""
# stmts ⟶ stmt rest0
self.add_production("stmts ⟶ stmt rest0")
# 更新推导
self.update_derivation("stmts", ["stmt", "rest0"])
# 语义动作
stmt_attr = self.parse_stmt()
rest0_attr = Attribute()
rest0_attr.inNextlist = stmt_attr.nextlist
rest0_attr = self.parse_rest0(rest0_attr)
# 为stmts创建属性
stmts_attr = Attribute()
stmts_attr.nextlist = rest0_attr.nextlist
return stmts_attr
def parse_rest0(self, inherited_attr):
"""解析rest0"""
rest0_attr = Attribute()
if self.current_token and self.current_token.type in ['ID', 'IF', 'WHILE']:
# rest0 ⟶ m stmt rest01
self.add_production("rest0 ⟶ m stmt rest01")
# 更新推导
self.update_derivation("rest0", ["m", "stmt", "rest01"])
# 语义动作
m_attr = self.parse_m()
self.semantic.backpatch(inherited_attr.inNextlist, m_attr.quad)
stmt_attr = self.parse_stmt()
rest01_attr = Attribute()
rest01_attr.inNextlist = stmt_attr.nextlist
rest01_attr = self.parse_rest0(rest01_attr)
# 为rest0创建属性
rest0_attr.nextlist = rest01_attr.nextlist
else:
# rest0 ⟶ ε
self.add_production("rest0 ⟶ ℇ")
# 更新推导
self.update_derivation("rest0", [])
# 语义动作
rest0_attr.nextlist = inherited_attr.inNextlist
return rest0_attr
def parse_m(self):
"""解析m - 标记当前位置用于回填"""
# m ⟶ ε
# 语义动作
m_attr = Attribute()
m_attr.quad = self.semantic.nextquad()
return m_attr
def parse_n(self):
"""解析n - 生成无条件跳转四元式"""
# n ⟶ ε
# 语义动作
n_attr = Attribute()
n_attr.nextlist = self.semantic.makelist(self.semantic.nextquad())
self.semantic.emit("j", "-", "-", "0") # 0会在回填时被替换
return n_attr
def parse_stmt(self):
"""解析stmt"""
stmt_attr = Attribute()
if self.current_token is None:
raise SyntaxError("语法错误: 意外的文件结束,期望一个语句")
if self.current_token.type == 'WHILE':
# stmt ⟶ while(m1 bool) m2 stmt1
self.add_production("stmt ⟶ while(m1 bool) m2 stmt1")
# 更新推导
self.update_derivation("stmt", ["while(m1", "bool)", "m2", "stmt1"])
# 语义动作
self.match('WHILE')
self.match('LPAREN')
m1_attr = self.parse_m()
bool_attr = self.parse_bool()
self.match('RPAREN')
m2_attr = self.parse_m()
stmt1_attr = self.parse_stmt()
# 语义动作
self.semantic.backpatch(stmt1_attr.nextlist, m1_attr.quad)
self.semantic.backpatch(bool_attr.truelist, m2_attr.quad)
stmt_attr.nextlist = bool_attr.falselist
self.semantic.emit("j", "-", "-", str(m1_attr.quad))
elif self.current_token.type == 'IF':
# stmt ⟶ if(bool) m1 stmt1 n else m2 stmt2
self.add_production("stmt ⟶ if(bool) m1 stmt1 n else m2 stmt2")
# 更新推导
self.update_derivation("stmt", ["if(bool)", "m1", "stmt1", "n", "else", "m2", "stmt2"])
# 语义动作
self.match('IF')
self.match('LPAREN')
bool_attr = self.parse_bool()
self.match('RPAREN')
m1_attr = self.parse_m()
stmt1_attr = self.parse_stmt()
n_attr = self.parse_n()
self.match('ELSE')
m2_attr = self.parse_m()
stmt2_attr = self.parse_stmt()
# 语义动作
self.semantic.backpatch(bool_attr.truelist, m1_attr.quad)
self.semantic.backpatch(bool_attr.falselist, m2_attr.quad)
stmt_attr.nextlist = self.semantic.merge(
self.semantic.merge(stmt1_attr.nextlist, n_attr.nextlist),
stmt2_attr.nextlist
)
elif self.current_token.type == 'ID':
# stmt ⟶ loc = expr ;
self.add_production("stmt ⟶ loc = expr ;")
# 更新推导
self.update_derivation("stmt", ["loc", "=", "expr", ";"])
# 语义动作
loc_attr = self.parse_loc()
self.match('ASSIGN')
expr_attr = self.parse_expr()
self.match('SEMICOLON')
# 语义动作 - 生成赋值四元式
if loc_attr.offset is None:
self.semantic.emit("=", expr_attr.place, "-", loc_attr.place)
else:
self.semantic.emit("[]=", expr_attr.place, "-", f"{loc_attr.place}[{loc_attr.offset}]")
stmt_attr.nextlist = self.semantic.makelist(None) # 空链表
else:
raise SyntaxError(f"语法错误: 无效的语句开始: {self.current_token.type}")
return stmt_attr
def parse_loc(self):
"""解析loc"""
# loc ⟶ id resta
self.add_production("loc ⟶ id resta")
# 更新推导
self.update_derivation("loc", ["id", "resta"])
# 语义动作
id_token = self.match('ID')
resta_attr = Attribute()
resta_attr.inArray = id_token.value
resta_attr = self.parse_resta(resta_attr)
# 为loc创建属性
loc_attr = Attribute()
loc_attr.place = resta_attr.place
loc_attr.offset = resta_attr.offset
return loc_attr
def parse_resta(self, inherited_attr):
"""解析resta"""
resta_attr = Attribute()
if self.current_token and self.current_token.type == 'LBRACK':
# resta ⟶ [elist]
self.add_production("resta ⟶ [ elist ]")
# 更新推导
self.update_derivation("resta", ["[", "elist", "]"])
# 语义动作
self.match('LBRACK')
elist_attr = Attribute()
elist_attr.inArray = inherited_attr.inArray
elist_attr = self.parse_elist(elist_attr)
self.match('RBRACK')
# 为数组访问生成四元式
# 这里处理多维数组访问
if elist_attr.has_multiple_indices:
# 二维数组处理
i_index = elist_attr.dims[0]
j_index = elist_attr.dims[1]
# 计算行偏移 (i * n2)
t1 = self.semantic.newtemp()
self.semantic.emit("*", i_index, "n2", t1)
# 计算总偏移 (i * n2 + j)
self.semantic.emit("+", t1, j_index, t1)
# 计算数组基地址
t2 = self.semantic.newtemp()
self.semantic.emit("-", elist_attr.inArray, "C", t2)
# 计算元素大小偏移
t3 = self.semantic.newtemp()
self.semantic.emit("*", t1, "w", t3)
# 设置结果
resta_attr.place = t2
resta_attr.offset = t3
else:
# 单维数组或简化处理
resta_attr.place = self.semantic.newtemp()
self.semantic.emit("-", elist_attr.array, "C", resta_attr.place)
resta_attr.offset = self.semantic.newtemp()
self.semantic.emit("*", elist_attr.offset, "w", resta_attr.offset)
else:
# resta ⟶ ε
self.add_production("resta ⟶ ε")
# 更新推导
self.update_derivation("resta", [])
# 语义动作
resta_attr.place = inherited_attr.inArray
resta_attr.offset = None
return resta_attr
def parse_elist(self, inherited_attr):
"""解析elist"""
# elist ⟶ expr rest1
self.add_production("elist ⟶ expr rest1")
# 更新推导
self.update_derivation("elist", ["expr", "rest1"])
# 语义动作
expr_attr = self.parse_expr()
rest1_attr = Attribute()
rest1_attr.inArray = inherited_attr.inArray
rest1_attr.inNdim = 1
rest1_attr.inPlace = expr_attr.place
rest1_attr.dims = [expr_attr.place] # 记录第一个维度的表达式
rest1_attr = self.parse_rest1(rest1_attr)
# 为elist创建属性
elist_attr = Attribute()
elist_attr.array = rest1_attr.array
elist_attr.offset = rest1_attr.offset
elist_attr.dims = rest1_attr.dims
elist_attr.has_multiple_indices = len(rest1_attr.dims) > 1
elist_attr.inArray = inherited_attr.inArray
return elist_attr
def parse_rest1(self, inherited_attr):
"""解析rest1"""
rest1_attr = Attribute()
rest1_attr.dims = inherited_attr.dims.copy() if hasattr(inherited_attr, 'dims') else []
if self.current_token and self.current_token.type == 'COMMA':
# rest1 ⟶ , expr rest1
self.add_production("rest1 ⟶ , expr rest1")
# 更新推导
self.update_derivation("rest1", [",", "expr", "rest1"])
# 语义动作 - 处理多维数组索引
self.match('COMMA')
expr_attr = self.parse_expr()
# 记录这个维度的表达式
rest1_attr.dims.append(expr_attr.place)
rest1_attr.inArray = inherited_attr.inArray
rest1_attr.array = inherited_attr.inArray
rest1_attr.has_multiple_indices = True
# 继续处理后续的维度
rest1_next = self.parse_rest1(rest1_attr)
# 合并结果
rest1_attr.array = rest1_next.array
rest1_attr.offset = rest1_next.offset
rest1_attr.dims = rest1_next.dims
else:
# rest1 ⟶ ε
self.add_production("rest1 ⟶ ε")
# 更新推导
self.update_derivation("rest1", [])
# 语义动作
rest1_attr.array = inherited_attr.inArray
rest1_attr.offset = inherited_attr.inPlace
return rest1_attr
def parse_bool(self):
"""解析bool"""
# bool ⟶ equality
self.add_production("bool ⟶ equality")
# 更新推导
self.update_derivation("bool", ["equality"])
# 语义动作
equality_attr = self.parse_equality()
# 为bool创建属性
bool_attr = Attribute()
bool_attr.truelist = equality_attr.truelist
bool_attr.falselist = equality_attr.falselist
return bool_attr
def parse_equality(self):
"""解析equality"""
# equality ⟶ rel rest4
self.add_production("equality ⟶ rel rest4")
# 更新推导
self.update_derivation("equality", ["rel", "rest4"])
# 语义动作
rel_attr = self.parse_rel()
rest4_attr = Attribute()
rest4_attr.inTruelist = rel_attr.truelist
rest4_attr.inFalselist = rel_attr.falselist
rest4_attr = self.parse_rest4(rest4_attr)
# 为equality创建属性
equality_attr = Attribute()
equality_attr.truelist = rest4_attr.truelist
equality_attr.falselist = rest4_attr.falselist
return equality_attr
def parse_rest4(self, inherited_attr):
"""解析rest4"""
rest4_attr = Attribute()
if self.current_token and self.current_token.type == 'EQ':
# rest4 ⟶ == rel rest41
self.add_production("rest4 ⟶ == rel rest41")
# 更新推导
self.update_derivation("rest4", ["==", "rel", "rest41"])
# 语义动作 - 这里简化处理
self.match('EQ')
self.parse_rel()
# 处理rest41...
elif self.current_token and self.current_token.type == 'NE':
# rest4 ⟶ != rel rest41
self.add_production("rest4 ⟶ != rel rest41")
# 更新推导
self.update_derivation("rest4", ["!=", "rel", "rest41"])
# 语义动作 - 这里简化处理
self.match('NE')
self.parse_rel()
# 处理rest41...
else:
# rest4 ⟶ ε
self.add_production("rest4 ⟶ ε")
# 更新推导
self.update_derivation("rest4", [])
# 语义动作
rest4_attr.truelist = inherited_attr.inTruelist
rest4_attr.falselist = inherited_attr.inFalselist
return rest4_attr
def parse_rel(self):
"""解析rel"""
# rel ⟶ expr rop_expr
self.add_production("rel ⟶ expr rop_expr")
# 更新推导
self.update_derivation("rel", ["expr", "rop_expr"])
# 语义动作
expr_attr = self.parse_expr()
rop_expr_attr = Attribute()
rop_expr_attr.inPlace = expr_attr.place
rop_expr_attr = self.parse_rop_expr(rop_expr_attr)
# 为rel创建属性
rel_attr = Attribute()
rel_attr.truelist = rop_expr_attr.truelist
rel_attr.falselist = rop_expr_attr.falselist
return rel_attr
def parse_rop_expr(self, inherited_attr):
"""解析rop_expr"""
rop_expr_attr = Attribute()
if self.current_token and self.current_token.type == 'LT':
# rop_expr ⟶ < expr
self.add_production("rop_expr ⟶ < expr")
# 更新推导
self.update_derivation("rop_expr", ["<", "expr"])
# 语义动作
self.match('LT')
expr_attr = self.parse_expr()
# 生成条件跳转四元式
rop_expr_attr.truelist = self.semantic.makelist(self.semantic.nextquad())
rop_expr_attr.falselist = self.semantic.makelist(self.semantic.nextquad() + 1)
self.semantic.emit("j<", inherited_attr.inPlace, expr_attr.place, "-")
self.semantic.emit("j", "-", "-", "-")
elif self.current_token and self.current_token.type == 'GT':
# rop_expr ⟶ > expr
self.add_production("rop_expr ⟶ > expr")
# 更新推导
self.update_derivation("rop_expr", [">", "expr"])
# 语义动作
self.match('GT')
expr_attr = self.parse_expr()
# 生成条件跳转四元式
rop_expr_attr.truelist = self.semantic.makelist(self.semantic.nextquad())
rop_expr_attr.falselist = self.semantic.makelist(self.semantic.nextquad() + 1)
self.semantic.emit("j>", inherited_attr.inPlace, expr_attr.place, "-")
self.semantic.emit("j", "-", "-", "-")
elif self.current_token and self.current_token.type == 'LE':
# rop_expr ⟶ <= expr
self.add_production("rop_expr ⟶ <= expr")
# 更新推导
self.update_derivation("rop_expr", ["<=", "expr"])
# 语义动作
self.match('LE')
expr_attr = self.parse_expr()
# 生成条件跳转四元式
rop_expr_attr.truelist = self.semantic.makelist(self.semantic.nextquad())
rop_expr_attr.falselist = self.semantic.makelist(self.semantic.nextquad() + 1)
self.semantic.emit("j<=", inherited_attr.inPlace, expr_attr.place, "-")
self.semantic.emit("j", "-", "-", "-")
elif self.current_token and self.current_token.type == 'GE':
# rop_expr ⟶ >= expr
self.add_production("rop_expr ⟶ >= expr")
# 更新推导
self.update_derivation("rop_expr", [">=", "expr"])
# 语义动作
self.match('GE')
expr_attr = self.parse_expr()
# 生成条件跳转四元式
rop_expr_attr.truelist = self.semantic.makelist(self.semantic.nextquad())
rop_expr_attr.falselist = self.semantic.makelist(self.semantic.nextquad() + 1)
self.semantic.emit("j>=", inherited_attr.inPlace, expr_attr.place, "-")
self.semantic.emit("j", "-", "-", "-")
else:
# rop_expr ⟶ ε
self.add_production("rop_expr ⟶ ε")
# 更新推导
self.update_derivation("rop_expr", [])
# 语义动作 - 在这里创建一个默认的条件测试
rop_expr_attr.truelist = self.semantic.makelist(self.semantic.nextquad())
rop_expr_attr.falselist = self.semantic.makelist(self.semantic.nextquad() + 1)
# 默认条件:非0为真
self.semantic.emit("jnz", inherited_attr.inPlace, "-", "-")
self.semantic.emit("j", "-", "-", "-")
return rop_expr_attr
def parse_expr(self):
"""解析expr"""
# expr ⟶ term rest5
self.add_production("expr ⟶ term rest5")
# 更新推导
self.update_derivation("expr", ["term", "rest5"])
# 语义动作
term_attr = self.parse_term()
rest5_attr = Attribute()
rest5_attr.inPlace = term_attr.place
rest5_attr = self.parse_rest5(rest5_attr)
# 为expr创建属性
expr_attr = Attribute()
expr_attr.place = rest5_attr.place
return expr_attr
def parse_rest5(self, inherited_attr):
"""解析rest5"""
rest5_attr = Attribute()
if self.current_token and self.current_token.type == 'PLUS':
# rest5 ⟶ + term rest51
self.add_production("rest5 ⟶ + term rest5")
# 更新推导
self.update_derivation("rest5", ["+", "term", "rest5"])
# 语义动作
self.match('PLUS')
term_attr = self.parse_term()
# 生成加法四元式
rest51_attr = Attribute()
rest51_attr.inPlace = self.semantic.newtemp()
self.semantic.emit("+", inherited_attr.inPlace, term_attr.place, rest51_attr.inPlace)
rest51_attr = self.parse_rest5(rest51_attr)
# 为rest5创建属性
rest5_attr.place = rest51_attr.place
elif self.current_token and self.current_token.type == 'MINUS':
# rest5 ⟶ - term rest51
self.add_production("rest5 ⟶ - term rest5")
# 更新推导
self.update_derivation("rest5", ["-", "term", "rest5"])
# 语义动作
self.match('MINUS')
term_attr = self.parse_term()
# 生成减法四元式
rest51_attr = Attribute()
rest51_attr.inPlace = self.semantic.newtemp()
self.semantic.emit("-", inherited_attr.inPlace, term_attr.place, rest51_attr.inPlace)
rest51_attr = self.parse_rest5(rest51_attr)
# 为rest5创建属性
rest5_attr.place = rest51_attr.place
else:
# rest5 ⟶ ε
self.add_production("rest5 ⟶ ε")
# 更新推导
self.update_derivation("rest5", [])
# 语义动作
rest5_attr.place = inherited_attr.inPlace
return rest5_attr
def parse_term(self):
"""解析term"""
# term ⟶ unary rest6
self.add_production("term ⟶ unary rest6")
# 更新推导
self.update_derivation("term", ["unary", "rest6"])
# 语义动作
unary_attr = self.parse_unary()
rest6_attr = Attribute()
rest6_attr.inPlace = unary_attr.place
rest6_attr = self.parse_rest6(rest6_attr)
# 为term创建属性
term_attr = Attribute()
term_attr.place = rest6_attr.place
return term_attr
def parse_rest6(self, inherited_attr):
"""解析rest6"""
rest6_attr = Attribute()
if self.current_token and self.current_token.type == 'MUL':
# rest6 ⟶ * unary rest61
self.add_production("rest6 ⟶ * unary rest6")
# 更新推导
self.update_derivation("rest6", ["*", "unary", "rest6"])
# 语义动作
self.match('MUL')
unary_attr = self.parse_unary()
# 生成乘法四元式
rest61_attr = Attribute()
rest61_attr.inPlace = self.semantic.newtemp()
self.semantic.emit("*", inherited_attr.inPlace, unary_attr.place, rest61_attr.inPlace)
rest61_attr = self.parse_rest6(rest61_attr)
# 为rest6创建属性
rest6_attr.place = rest61_attr.place
elif self.current_token and self.current_token.type == 'DIV':
# rest6 ⟶ / unary rest61
self.add_production("rest6 ⟶ / unary rest6")
# 更新推导
self.update_derivation("rest6", ["/", "unary", "rest6"])
# 语义动作
self.match('DIV')
unary_attr = self.parse_unary()
# 生成除法四元式
rest61_attr = Attribute()
rest61_attr.inPlace = self.semantic.newtemp()
self.semantic.emit("/", inherited_attr.inPlace, unary_attr.place, rest61_attr.inPlace)
rest61_attr = self.parse_rest6(rest61_attr)
# 为rest6创建属性
rest6_attr.place = rest61_attr.place
else:
# rest6 ⟶ ε
self.add_production("rest6 ⟶ ε")
# 更新推导
self.update_derivation("rest6", [])
# 语义动作
rest6_attr.place = inherited_attr.inPlace
return rest6_attr
def parse_unary(self):
"""解析unary"""
# unary ⟶ factor
self.add_production("unary ⟶ factor")
# 更新推导
self.update_derivation("unary", ["factor"])
# 语义动作
factor_attr = self.parse_factor()
# 为unary创建属性
unary_attr = Attribute()
unary_attr.place = factor_attr.place
return unary_attr
def parse_factor(self):
"""解析factor"""
factor_attr = Attribute()
if not self.current_token:
raise SyntaxError("语法错误: 意外的文件结束,期望一个因子")
if self.current_token.type == 'NUM':
# factor ⟶ num
self.add_production("factor ⟶ num")
# 更新推导
self.update_derivation("factor", ["num"])
# 语义动作
num_token = self.match('NUM')
factor_attr.place = num_token.value
elif self.current_token.type == 'LPAREN':
# factor ⟶ ( expr )
self.add_production("factor ⟶ ( expr )")
# 更新推导
self.update_derivation("factor", ["(", "expr", ")"])
# 语义动作
self.match('LPAREN')
expr_attr = self.parse_expr()
self.match('RPAREN')
factor_attr.place = expr_attr.place
elif self.current_token.type == 'ID':
# factor ⟶ loc
self.add_production("factor ⟶ loc")
# 更新推导
self.update_derivation("factor", ["loc"])
# 语义动作
loc_attr = self.parse_loc()
if loc_attr.offset is None:
factor_attr.place = loc_attr.place
else:
factor_attr.place = self.semantic.newtemp()
# 生成数组元素访问四元式
self.semantic.emit("=[]", f"{loc_attr.place}[{loc_attr.offset}]", "-", factor_attr.place)
else:
raise SyntaxError(f"语法错误: 无效的因子: {self.current_token.type}")
return factor_attr
def update_derivation(self, non_terminal, replacement):
"""更新当前推导句型,用replacement替换第一个出现的non_terminal"""
# 找到第一个出现的非终结符并替换
for i, symbol in enumerate(self.current_sentential):
if symbol == non_terminal:
# 替换non_terminal为replacement
new_sentential = self.current_sentential[:i] + replacement + self.current_sentential[i + 1:]
self.current_sentential = new_sentential
# 添加到推导步骤
if replacement: # 如果不是空生成式
self.add_derivation(self.current_sentential)
return True
return False
def print_result(self):
"""打印分析结果"""
print("\n1) 按使用产生式过程")
for i, step in enumerate(self.production_steps, 1):
print(f"({i}){step}")
print("\n2) 按推导过程")
for i, step in enumerate(self.derivation_steps, 1):
print(f"({i}) {step}")
def save_result(self, file_name):
"""保存分析结果到文件"""
try:
with open(file_name, 'w', encoding='utf-8') as f:
f.write("1) 按使用产生式过程\n")
for i, step in enumerate(self.production_steps, 1):
f.write(f"({i}){step}\n")
f.write("\n2) 按推导过程\n")
for i, step in enumerate(self.derivation_steps, 1):
f.write(f"({i}) {step}\n")
return True
except Exception as e:
print(f"保存文件时出错: {e}")
return False
# 读取测试文件
try:
with open("test5.txt", "r", encoding="utf-8") as file, open('词法分析.txt', 'w') as out_f:
print('词法分析:')
content = file.read()
file.seek(0) # 重置文件指针到开始位置
ch = file.read(1)
while ch:
if ch.isspace():
ch = file.read(1)
continue
res = ""
if ch.isalpha() or ch == '_':
ch, res = Recognizestr(ch)
elif ch.isdigit():
ch, res = RecognizeDigit(ch)
elif ch in {'+', '-', '*', '/', '%', '=', '>', '<', '!'}:
ch, res = Recognizeop(ch)
else:
ch, res = Recognizeoth(ch)
# 统一输出
print(res)
out_f.write(res + '\n')
print('语法分析:')
# 重置读取文件到开始的位置
file.seek(0)
code = file.read()
if not code.strip():
print("警告: test5.txt文件为空或只包含空白字符")
except Exception as e:
print(f"读取文件时出错: {e}")
code = "x=A[i,j];" # 默认测试用例
# 创建分析器并解析代码
analyzer = CompleteSyntaxSemanticAnalyzer()
success = analyzer.parse(code)
if success:
# 输出分析结果
analyzer.print_result()
if analyzer.save_result("语法分析.txt"):
print('\n语法分析完成,结果已保存到"语法分析.txt"文件中。')
else:
print("\n语法分析完成,但保存文件时出错。")
else:
print("\n语法分析失败。")
print('语义分析:')
print("\n生成的四元式:")
analyzer.semantic.print_quadruples("语义分析.txt")
三、总结
该实验实现了递归下降翻译器,对测试代码进行词法、语法和语义分析,并生成四元式中间代码。系统包含三个模块:词法分析识别各类符号并输出token序列;语法分析采用递归下降法,记录产生式推导过程;语义分析通过语法制导翻译生成四元式,处理控制流回填和数组访问。关键数据结构包括四元式表、属性结构体和临时变量管理,支持布尔表达式跳转、赋值语句和数组元素访问的翻译。测试案例展示了完整的分析流程,输出结果包含三阶段分析报告。该系统实现了从源代码到中间代码的完整翻译过程。

魔乐社区(Modelers.cn) 是一个中立、公益的人工智能社区,提供人工智能工具、模型、数据的托管、展示与应用协同服务,为人工智能开发及爱好者搭建开放的学习交流平台。社区通过理事会方式运作,由全产业链共同建设、共同运营、共同享有,推动国产AI生态繁荣发展。
更多推荐
所有评论(0)