Contents

I. Lab Requirements

II. Semantic Analyzer and Intermediate Code Generator in Python

III. Summary


I. Lab Requirements


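Building on the Python code from the earlier labs (see my 《编译原理课程设计》 column, which contains the source code for the lexical analyzer, the syntax analyzer, and the semantic analyzer and intermediate code generator for assignment expressions), add the functionality needed to implement a recursive-descent translator.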

Notes
Data structures:
Quadruple: a struct
Quadruple sequence: an array of structs
The fourth field of a jump quadruple must be backpatched.
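
A minimal sketch of these data structures in Python, where a small class stands in for the struct and a list for the array of structs. The names `Quad`, `quads`, `emit`, and `backpatch` are illustrative assumptions chosen for this example only.

```python
from dataclasses import dataclass


@dataclass
class Quad:
    """One quadruple: (op, arg1, arg2, result)."""
    op: str
    arg1: str
    arg2: str
    result: str


quads = []  # the quadruple sequence; a list index doubles as the quad's address


def emit(op, arg1, arg2, result):
    """Append a quadruple and return its index."""
    quads.append(Quad(op, arg1, arg2, result))
    return len(quads) - 1


def backpatch(indices, target):
    """Fill the fourth field of every listed jump with the target index."""
    for i in indices:
        quads[i].result = str(target)


# A jump whose destination is not known yet gets a placeholder fourth field.
pending = emit("j<", "a", "b", "-")
emit("j", "-", "-", "-")
# Once the destination (here: the next free index) is known, patch it in.
backpatch([pending], len(quads))
print(quads[pending])
```

Keeping the indices of unresolved jumps in a list is what lets a single pass generate code for forward jumps: the jump is emitted first and its target is filled in later.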
Translation scheme and steps
1. Test:
Put the code to be tested in the file "test5.txt":

while(a<b)
 if(c)
 while(d>e)
 x1=y1;
 else
 x2=y2;
x3[k]=y3[i,j]+a1*a2;


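The output is written to the console and to the files "词法分析.txt" (lexical analysis), "语法分析.txt" (syntax analysis), and "语义分析.txt" (semantic analysis):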

Lexical analysis:
<20, ->
<81, ->
<111, a>
<49, ->
<111, b>
<82, ->
<17, ->
<81, ->
<111, c>
<82, ->
<20, ->
<81, ->
<111, d>
<47, ->
<111, e>
<82, ->
<111, x1>
<46, ->
<111, y1>
<84, ->
<15, ->
<111, x2>
<46, ->
<111, y2>
<84, ->
<111, x3>
<88, ->
<111, k>
<89, ->
<46, ->
<111, y3>
<88, ->
<111, i>
<83, ->
<111, j>
<89, ->
<41, ->
<111, a1>
<43, ->
<111, a2>
<84, ->
Syntax analysis:

1) Productions in the order they are applied
(1)stmts ⟶ stmt rest0
(2)stmt ⟶ while(m1 bool) m2 stmt1
(3)bool ⟶ equality
(4)equality ⟶ rel rest4
(5)rel ⟶ expr rop_expr
(6)expr ⟶ term rest5
(7)term ⟶ unary rest6
(8)unary ⟶ factor
(9)factor ⟶ loc
(10)loc ⟶ id resta
(11)resta ⟶ ε
(12)rest6 ⟶ ε
(13)rest5 ⟶ ε
(14)rop_expr ⟶ < expr
(15)expr ⟶ term rest5
(16)term ⟶ unary rest6
(17)unary ⟶ factor
(18)factor ⟶ loc
(19)loc ⟶ id resta
(20)resta ⟶ ε
(21)rest6 ⟶ ε
(22)rest5 ⟶ ε
(23)rest4 ⟶ ε
(24)stmt ⟶ if(bool) m1 stmt1 n else m2 stmt2
(25)bool ⟶ equality
(26)equality ⟶ rel rest4
(27)rel ⟶ expr rop_expr
(28)expr ⟶ term rest5
(29)term ⟶ unary rest6
(30)unary ⟶ factor
(31)factor ⟶ loc
(32)loc ⟶ id resta
(33)resta ⟶ ε
(34)rest6 ⟶ ε
(35)rest5 ⟶ ε
(36)rop_expr ⟶ ε
(37)rest4 ⟶ ε
(38)stmt ⟶ while(m1 bool) m2 stmt1
(39)bool ⟶ equality
(40)equality ⟶ rel rest4
(41)rel ⟶ expr rop_expr
(42)expr ⟶ term rest5
(43)term ⟶ unary rest6
(44)unary ⟶ factor
(45)factor ⟶ loc
(46)loc ⟶ id resta
(47)resta ⟶ ε
(48)rest6 ⟶ ε
(49)rest5 ⟶ ε
(50)rop_expr ⟶ > expr
(51)expr ⟶ term rest5
(52)term ⟶ unary rest6
(53)unary ⟶ factor
(54)factor ⟶ loc
(55)loc ⟶ id resta
(56)resta ⟶ ε
(57)rest6 ⟶ ε
(58)rest5 ⟶ ε
(59)rest4 ⟶ ε
(60)stmt ⟶ loc = expr ;
(61)loc ⟶ id resta
(62)resta ⟶ ε
(63)expr ⟶ term rest5
(64)term ⟶ unary rest6
(65)unary ⟶ factor
(66)factor ⟶ loc
(67)loc ⟶ id resta
(68)resta ⟶ ε
(69)rest6 ⟶ ε
(70)rest5 ⟶ ε
(71)stmt ⟶ loc = expr ;
(72)loc ⟶ id resta
(73)resta ⟶ ε
(74)expr ⟶ term rest5
(75)term ⟶ unary rest6
(76)unary ⟶ factor
(77)factor ⟶ loc
(78)loc ⟶ id resta
(79)resta ⟶ ε
(80)rest6 ⟶ ε
(81)rest5 ⟶ ε
(82)rest0 ⟶ m stmt rest01
(83)stmt ⟶ loc = expr ;
(84)loc ⟶ id resta
(85)resta ⟶ [ elist ]
(86)elist ⟶ expr rest1
(87)expr ⟶ term rest5
(88)term ⟶ unary rest6
(89)unary ⟶ factor
(90)factor ⟶ loc
(91)loc ⟶ id resta
(92)resta ⟶ ε
(93)rest6 ⟶ ε
(94)rest5 ⟶ ε
(95)rest1 ⟶ ε
(96)expr ⟶ term rest5
(97)term ⟶ unary rest6
(98)unary ⟶ factor
(99)factor ⟶ loc
(100)loc ⟶ id resta
(101)resta ⟶ [ elist ]
(102)elist ⟶ expr rest1
(103)expr ⟶ term rest5
(104)term ⟶ unary rest6
(105)unary ⟶ factor
(106)factor ⟶ loc
(107)loc ⟶ id resta
(108)resta ⟶ ε
(109)rest6 ⟶ ε
(110)rest5 ⟶ ε
(111)rest1 ⟶ , expr rest1
(112)expr ⟶ term rest5
(113)term ⟶ unary rest6
(114)unary ⟶ factor
(115)factor ⟶ loc
(116)loc ⟶ id resta
(117)resta ⟶ ε
(118)rest6 ⟶ ε
(119)rest5 ⟶ ε
(120)rest1 ⟶ ε
(121)rest6 ⟶ ε
(122)rest5 ⟶ + term rest5
(123)term ⟶ unary rest6
(124)unary ⟶ factor
(125)factor ⟶ loc
(126)loc ⟶ id resta
(127)resta ⟶ ε
(128)rest6 ⟶ * unary rest6
(129)unary ⟶ factor
(130)factor ⟶ loc
(131)loc ⟶ id resta
(132)resta ⟶ ε
(133)rest6 ⟶ ε
(134)rest5 ⟶ ε
(135)rest0 ⟶ ℇ

2) Derivation steps
(1) stmts
(2) stmt rest0
(3) while(m1 bool) m2 stmt1 rest0
(4) while(m1 bool) m2 stmt1 m stmt rest01
(5) while(m1 bool) m2 stmt1 m loc = expr ; rest01
(6) while(m1 bool) m2 stmt1 m id resta = expr ; rest01
(7) while(m1 bool) m2 stmt1 m id [ elist ] = expr ; rest01
(8) while(m1 bool) m2 stmt1 m id [ expr rest1 ] = expr ; rest01
(9) while(m1 bool) m2 stmt1 m id [ term rest5 rest1 ] = expr ; rest01
(10) while(m1 bool) m2 stmt1 m id [ unary rest6 rest5 rest1 ] = expr ; rest01
(11) while(m1 bool) m2 stmt1 m id [ factor rest6 rest5 rest1 ] = expr ; rest01
(12) while(m1 bool) m2 stmt1 m id [ loc rest6 rest5 rest1 ] = expr ; rest01
(13) while(m1 bool) m2 stmt1 m id [ id resta rest6 rest5 rest1 ] = expr ; rest01
(14) while(m1 bool) m2 stmt1 m id [ id ] = term rest5 ; rest01
(15) while(m1 bool) m2 stmt1 m id [ id ] = unary rest6 rest5 ; rest01
(16) while(m1 bool) m2 stmt1 m id [ id ] = factor rest6 rest5 ; rest01
(17) while(m1 bool) m2 stmt1 m id [ id ] = loc rest6 rest5 ; rest01
(18) while(m1 bool) m2 stmt1 m id [ id ] = id resta rest6 rest5 ; rest01
(19) while(m1 bool) m2 stmt1 m id [ id ] = id [ elist ] rest6 rest5 ; rest01
(20) while(m1 bool) m2 stmt1 m id [ id ] = id [ expr rest1 ] rest6 rest5 ; rest01
(21) while(m1 bool) m2 stmt1 m id [ id ] = id [ term rest5 rest1 ] rest6 rest5 ; rest01
(22) while(m1 bool) m2 stmt1 m id [ id ] = id [ unary rest6 rest5 rest1 ] rest6 rest5 ; rest01
(23) while(m1 bool) m2 stmt1 m id [ id ] = id [ factor rest6 rest5 rest1 ] rest6 rest5 ; rest01
(24) while(m1 bool) m2 stmt1 m id [ id ] = id [ loc rest6 rest5 rest1 ] rest6 rest5 ; rest01
(25) while(m1 bool) m2 stmt1 m id [ id ] = id [ id resta rest6 rest5 rest1 ] rest6 rest5 ; rest01
(26) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , expr rest1 ] rest6 rest5 ; rest01
(27) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , term rest5 rest1 ] rest6 rest5 ; rest01
(28) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , unary rest6 rest5 rest1 ] rest6 rest5 ; rest01
(29) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , factor rest6 rest5 rest1 ] rest6 rest5 ; rest01
(30) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , loc rest6 rest5 rest1 ] rest6 rest5 ; rest01
(31) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id resta rest6 rest5 rest1 ] rest6 rest5 ; rest01
(32) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + term rest5 ; rest01
(33) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + unary rest6 rest5 ; rest01
(34) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + factor rest6 rest5 ; rest01
(35) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + loc rest6 rest5 ; rest01
(36) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + id resta rest6 rest5 ; rest01
(37) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + id * unary rest6 rest5 ; rest01
(38) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + id * factor rest6 rest5 ; rest01
(39) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + id * loc rest6 rest5 ; rest01
(40) while(m1 bool) m2 stmt1 m id [ id ] = id [ id , id ] + id * id resta rest6 rest5 ; rest01

Syntax analysis complete; the result has been saved to "语法分析.txt".
Semantic analysis:

Generated quadruples:
0: j<,    a,    b,    2
1: j,    -,    -,    11
2: jnz,    c,    -,    4
3: j,    -,    -,    9
4: j>,    d,    e,    6
5: j,    -,    -,    0
6: =,    y1,    -,    x1
7: j,    -,    -,    4
8: j,    -,    -,    0
9: =,    y2,    -,    x2
10: j,    -,    -,    0
11: -,    x3,    C,    t1
12: *,    k,    w,    t2
13: *,    i,    n2,    t3
14: +,    t3,    j,    t3
15: -,    y3,    C,    t4
16: *,    t3,    w,    t5
17: =[],    t4[t5],    -,    t6
18: *,    a1,    a2,    t7
19: +,    t6,    t7,    t8
20: []=,    t8,    -,    t1[t2]


[Screenshot: part of "词法分析.txt"]

[Screenshot: part of "语法分析.txt"]

[Screenshot: "语义分析.txt"]

Grammar with semantic actions (format: {semantic action})


stmts ⟶ stmt    {rest0.inNextlist = stmt.nextlist}
        rest0   {stmts.nextlist = rest0.nextlist}


rest0 ⟶ m stmt  {backpatch(rest0.inNextlist, m.quad);
                 rest01.inNextlist = stmt.nextlist}
        rest01  {rest0.nextlist = rest01.nextlist}

rest0 ⟶ ℇ       {rest0.nextlist = rest0.inNextlist}


stmt ⟶ loc = expr ;
        {if (loc.offset = null)
             emit('=,' expr.place ', - ,' loc.place);
         else
             emit('[]=,' expr.place ', - ,' loc.place '[' loc.offset ']');
         stmt.nextlist = makelist()}


stmt ⟶ if(bool) m1 stmt1 n else m2 stmt2
        {backpatch(bool.truelist, m1.quad);
         backpatch(bool.falselist, m2.quad);
         stmt.nextlist = merge(stmt1.nextlist, n.nextlist, stmt2.nextlist)}


stmt ⟶ while(m1 bool) m2 stmt1
        {backpatch(stmt1.nextlist, m1.quad);
         backpatch(bool.truelist, m2.quad);
         stmt.nextlist = bool.falselist;
         emit('j, -, -, ' m1.quad)}


m ⟶ ℇ   {m.quad = nextquad}


n ⟶ ℇ   {n.nextlist = makelist(nextquad);
         emit('j, -, -, 0')}
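
The `while` rule can be traced by hand for a single loop. The sketch below hard-codes the statement `while (a < b) x = y;` rather than parsing it, so `translate_while` and its tuple arguments are assumptions made for illustration; only `emit`, `makelist`, `backpatch`, and `nextquad` correspond to the helpers named in the scheme.

```python
quads = []


def nextquad():
    return len(quads)


def emit(op, a1, a2, res):
    quads.append([op, a1, a2, res])


def makelist(i):
    return [i]


def backpatch(lst, target):
    for i in lst:
        quads[i][3] = str(target)


def translate_while(cond, body):
    """Translate `while (lhs < rhs) dst = src;` following
    stmt -> while(m1 bool) m2 stmt1 and return stmt.nextlist."""
    lhs, rhs = cond
    dst, src = body
    m1 = nextquad()                       # m1.quad: start of the condition
    truelist = makelist(nextquad())
    falselist = makelist(nextquad() + 1)
    emit("j<", lhs, rhs, "-")             # true exit, target still unknown
    emit("j", "-", "-", "-")              # false exit, target still unknown
    m2 = nextquad()                       # m2.quad: start of the loop body
    emit("=", src, "-", dst)              # stmt1: an assignment
    stmt1_nextlist = []                   # assignments leave no pending jumps
    backpatch(stmt1_nextlist, m1)
    backpatch(truelist, m2)
    emit("j", "-", "-", str(m1))          # jump back to the condition
    return falselist                      # stmt.nextlist = bool.falselist


nextlist = translate_while(("a", "b"), ("x", "y"))
backpatch(nextlist, nextquad())           # done by rest0 when a statement follows
for i, q in enumerate(quads):
    print(f"{i}: {', '.join(q)}")
```

The false exit of the condition is the only jump the loop cannot resolve by itself, which is why it is handed upward as `stmt.nextlist` and patched by whoever knows what comes after the loop.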


loc ⟶ id        {resta.inArray = id.place}
      resta     {loc.place = resta.place;
                 loc.offset = resta.offset}


resta ⟶ [       {elist.inArray = resta.inArray}
        elist ] {resta.place = newtemp();
                 emit('-,' elist.array ',' C ',' resta.place);
                 resta.offset = newtemp();
                 emit('*, ' w ',' elist.offset ',' resta.offset)}


resta ⟶ ℇ       {resta.place = resta.inArray;
                 resta.offset = null}


elist ⟶ expr    {rest1.inArray = elist.inArray;
                 rest1.inNdim = 1;
                 rest1.inPlace = expr.place}
        rest1   {elist.array = rest1.array;
                 elist.offset = rest1.offset}

rest1 ⟶ , expr  {t = newtemp();
                 m = rest1.inNdim + 1;
                 emit('*,' rest1.inPlace ',' limit(rest1.inArray, m) ',' t);
                 emit('+,' t ',' expr.place ',' t);
                 rest11.inArray = rest1.inArray;
                 rest11.inNdim = m;
                 rest11.inPlace = t}
        rest11  {rest1.array = rest11.array;
                 rest1.offset = rest11.offset}


rest1 ⟶ ℇ       {rest1.array = rest1.inArray;
                 rest1.offset = rest1.inPlace}
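
The quadruples emitted for an array reference compute a base part and a scaled offset. The sketch below works through that arithmetic with concrete numbers. It assumes row-major storage, that `w` is the element width, that `limit(array, m)` returns the size of the m-th dimension, and that `C` is a constant determined by the array's lower bounds (zero when every index starts at 0). These readings follow the usual textbook scheme and are not spelled out in this post; the function names are hypothetical.

```python
def element_offset(indices, dims):
    """Fold the indices the way rest1 does:
    offset = offset * limit(array, m) + next_index."""
    offset = indices[0]
    for m in range(1, len(indices)):
        offset = offset * dims[m] + indices[m]
    return offset


def element_address(base, c, w, indices, dims):
    """Mirror the two quadruples emitted by resta, then add them up."""
    place = base - c                              # '-, array, C, place'
    scaled = w * element_offset(indices, dims)    # '*, w, offset, scaled'
    return place + scaled                         # what place[scaled] refers to


# y3[i, j] with i = 2, j = 3 in a 5 x 10 array of 4-byte elements at base 1000
addr = element_address(base=1000, c=0, w=4, indices=[2, 3], dims=[5, 10])
print(addr)
```

Only the sizes of the second and later dimensions enter the offset, which is why the generated code refers to `n2` but never to the size of the first dimension.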


bool ⟶ equality     {bool.truelist = equality.truelist;
                     bool.falselist = equality.falselist}


equality ⟶ rel      {rest4.inTruelist = rel.truelist;
                     rest4.inFalselist = rel.falselist}
           rest4    {equality.truelist = rest4.truelist;
                     equality.falselist = rest4.falselist}


rest4 ⟶ == rel rest41
rest4 ⟶ != rel rest41
rest4 ⟶ ℇ           {rest4.truelist = rest4.inTruelist;
                     rest4.falselist = rest4.inFalselist}


rel ⟶ expr          {rop_expr.inPlace = expr.place}
      rop_expr      {rel.truelist = rop_expr.truelist;
                     rel.falselist = rop_expr.falselist}


rop_expr ⟶ < expr   {rop_expr.truelist = makelist(nextquad);
                     rop_expr.falselist = makelist(nextquad+1);
                     emit('j<,' rop_expr.inPlace ',' expr.place ', -');
                     emit('j, -, -, -')}


rop_expr ⟶ <= expr  {rop_expr.truelist = makelist(nextquad);
                     rop_expr.falselist = makelist(nextquad+1);
                     emit('j<=,' rop_expr.inPlace ',' expr.place ', -');
                     emit('j, -, -, -')}


rop_expr ⟶ > expr   {rop_expr.truelist = makelist(nextquad);
                     rop_expr.falselist = makelist(nextquad+1);
                     emit('j>,' rop_expr.inPlace ',' expr.place ', -');
                     emit('j, -, -, -')}


rop_expr ⟶ >= expr  {rop_expr.truelist = makelist(nextquad);
                     rop_expr.falselist = makelist(nextquad+1);
                     emit('j>=,' rop_expr.inPlace ',' expr.place ', -');
                     emit('j, -, -, -')}


rop_expr ⟶ ℇ        {rop_expr.truelist = makelist(nextquad);
                     rop_expr.falselist = makelist(nextquad+1);
                     emit('jnz,' rop_expr.inPlace ', -, -');
                     emit('j, -, -, -')}
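
All five `rop_expr` rules share one shape: record the indices of the next two quadruples, then emit a conditional jump followed by an unconditional one, both with an empty target. A small sketch of that shape, in which the function name `translate_rop_expr` and its parameters are assumptions for illustration:

```python
quads = []


def emit(op, a1, a2, res):
    quads.append((op, a1, a2, res))


def translate_rop_expr(in_place, op=None, rhs=None):
    """Emit the jump pair and return (truelist, falselist), the indices
    of the two jumps whose targets still have to be backpatched."""
    truelist = [len(quads)]         # makelist(nextquad)
    falselist = [len(quads) + 1]    # makelist(nextquad + 1)
    if op is None:
        # rop_expr -> epsilon: test the value itself
        emit("jnz", in_place, "-", "-")
    else:
        # rop_expr -> < expr | <= expr | > expr | >= expr
        emit("j" + op, in_place, rhs, "-")
    emit("j", "-", "-", "-")
    return truelist, falselist


t1, f1 = translate_rop_expr("a", "<", "b")   # condition a < b
t2, f2 = translate_rop_expr("c")             # bare condition c
print(quads, t1, f1, t2, f2)
```

The lists must be computed before the two `emit` calls, because `nextquad` changes as soon as a quadruple is appended.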

expr ⟶ term        {rest5.in = term.place}
       rest5       {expr.place = rest5.place}


rest5 ⟶ + term     {rest51.in = newtemp();
                    emit('+,' rest5.in ',' term.place ',' rest51.in)}
        rest51     {rest5.place = rest51.place}


rest5 ⟶ - term     {rest51.in = newtemp();
                    emit('-,' rest5.in ',' term.place ',' rest51.in)}
        rest51     {rest5.place = rest51.place}


rest5 ⟶ ℇ          {rest5.place = rest5.in}


term ⟶ unary       {rest6.in = unary.place}
       rest6       {term.place = rest6.place}


rest6 ⟶ * unary    {rest61.in = newtemp();
                    emit('*,' rest6.in ',' unary.place ',' rest61.in)}
        rest61     {rest6.place = rest61.place}


rest6 ⟶ / unary    {rest61.in = newtemp();
                    emit('/,' rest6.in ',' unary.place ',' rest61.in)}
        rest61     {rest6.place = rest61.place}


rest6 ⟶ ℇ          {rest6.place = rest6.in}


unary ⟶ factor     {unary.place = factor.place}


factor ⟶ (expr)    {factor.place = expr.place}


factor ⟶ loc       {if (loc.offset = null)
                        factor.place = loc.place
                    else {factor.place = newtemp();
                          emit('=[],' loc.place '[' loc.offset ']' ', -,' factor.place)}}


factor ⟶ num       {factor.place = num.value}
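
In a recursive-descent translator, the inherited attribute `in` of `rest5` and `rest6` becomes a function parameter and the synthesized attribute `place` becomes the return value. The sketch below illustrates this on a pre-split token list. It is a simplified assumption-laden version: `unary` is reduced to a bare identifier, and the function names are chosen for this example.

```python
quads = []
temp_count = 0


def newtemp():
    global temp_count
    temp_count += 1
    return f"t{temp_count}"


def parse_expr(tokens):
    """expr -> term rest5"""
    return rest5(tokens, parse_term(tokens))


def rest5(tokens, inherited):
    """rest5 -> + term rest5 | - term rest5 | epsilon"""
    if tokens and tokens[0] in ("+", "-"):
        op = tokens.pop(0)
        term_place = parse_term(tokens)
        t = newtemp()
        quads.append((op, inherited, term_place, t))
        return rest5(tokens, t)       # t plays the role of rest51.in
    return inherited                  # rest5.place = rest5.in


def parse_term(tokens):
    """term -> unary rest6 (unary is just an identifier here)"""
    return rest6(tokens, tokens.pop(0))


def rest6(tokens, inherited):
    """rest6 -> * unary rest6 | / unary rest6 | epsilon"""
    if tokens and tokens[0] in ("*", "/"):
        op = tokens.pop(0)
        unary_place = tokens.pop(0)
        t = newtemp()
        quads.append((op, inherited, unary_place, t))
        return rest6(tokens, t)       # t plays the role of rest61.in
    return inherited                  # rest6.place = rest6.in


place = parse_expr(["a", "-", "b", "-", "c", "*", "d"])
for q in quads:
    print(q)
```

Passing the partial result down through the parameter is what keeps `a - b - c` left-associative even though the grammar itself has been rewritten to be right-recursive.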

II. Semantic Analyzer and Intermediate Code Generator in Python

# Token-code mapping table
token_map = {
    # Operators
    '+': 41, '-': 42, '*': 43, '/': 44, '%': 45, '=': 46,
    '>': 47, '>=': 48, '<': 49, '<=': 50, '==': 51, '!=': 52,
    '&&': 53, '||': 54, '!': 55, '++': 56, '--': 57,
    # Stream output and input operators
    '<<': 90, '>>': 91,
    # Keywords
    'int': 5, 'else': 15, 'if': 17, 'while': 20, 'double': 21,
    'string': 22, 'char': 23, 'include': 24, 'using': 25,
    'namespace': 26, 'std': 27, 'main': 28, 'return': 29,
    'void': 30, 'iostream': 31, 'cin': 32, 'cout': 33, 'endl': 34
}


def Recognizestr(ch):
    word = ""
    while ch.isalnum() or ch == '_':
        word += ch
        ch = file.read(1)

    # Keyword handling
    code = token_map.get(word, None)
    if code:
        return ch, f'<{code}, ->'
    return ch, f'<111, {word}>'


def RecognizeDigit(ch):
    data = ""
    while ch in {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '.'}:
        data += ch
        ch = file.read(1)
    return ch, f'<100, {data}>'


def Recognizeop(ch):
    op = ch
    peek_ch = file.read(1)

    # Handle two-character operators
    double_ops = {
        '+': '+', '-': '-', '=': '=', '>': '=',
        '<': '=', '!': '=', '&': '&', '|': '|',
    }

    # Special-case << and >>
    if op == '<' and peek_ch == '<':
        op += peek_ch
        ch = file.read(1)
    elif op == '>' and peek_ch == '>':
        op += peek_ch
        ch = file.read(1)
    # Handle the other two-character operators
    elif op in double_ops and peek_ch == double_ops[op]:
        op += peek_ch
        ch = file.read(1)
    else:
        ch = peek_ch

    # Look up the token code
    code = token_map.get(op, None)
    if code:
        return ch, f'<{code}, ->'
    return ch, f'<{op}, {op}>'


def Recognizeoth(ch):
    # Handle brackets and single-character symbols
    if ch in {'(', ')', '[', ']', '{', '}', ';', ',', '#', '<', '>'}:
        code_mapping = {
            '(': 81, ')': 82, '[': 88, ']': 89,
            '{': 86, '}': 87, ';': 84, ',': 83,
            '#': 85, '<': 49, '>': 47, '<<': 90,
            '>>': 91
        }
        code = code_mapping.get(ch, 0)
        res = f'<{code}, ->' if code else f'<{ch}, {ch}>'
        return file.read(1), res

    # Handle logical operators
    if ch in {'&', '|'}:
        next_ch = file.read(1)
        if next_ch == ch:
            code = 53 if ch == '&' else 54
            return file.read(1), f'<{code}, ->'
        return next_ch, '<error>'

    return file.read(1), f'<{ch}, {ch}>'


import re
import os
from collections import namedtuple


# Token class
class Token:
    def __init__(self, type, value):
        self.type = type
        self.value = value

    def __str__(self):
        return f"{self.type}({self.value})"


# Quadruple structure
class Quadruple:
    def __init__(self, op, arg1, arg2, result):
        self.op = op
        self.arg1 = arg1
        self.arg2 = arg2
        self.result = result

    def __str__(self):
        return f"{self.op}, {self.arg1}, {self.arg2}, {self.result}"


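# Attribute record
class Attribute:
    def __init__(self, place=None, offset=None, array=None):
        self.place = place  # Name of the variable holding the value
        self.offset = offset  # Array offset
        self.array = array  # Array base address
        # For boolean expressions
        self.truelist = []  # Jump list for the true exit
        self.falselist = []  # Jump list for the false exit
        # For control flow
        self.nextlist = []  # Jump list to the next statement
        # Temporary fields used to pass attributes down
        self.inArray = None
        self.inOffset = None
        self.inPlace = None
        self.inNextlist = None
        self.inTruelist = None
        self.inFalselist = None
        self.inNdim = 0
        self.quad = 0  # Records the position of the current quadruple
        # For multi-dimensional arrays
        self.dims = []  # List of dimensions
        # Whether more than one index was given
        self.has_multiple_indices = False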


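# Semantic analyzer: implements syntax-directed translation
class SemanticAnalyzer:
    def __init__(self):
        # Initialize the quadruple table
        self.quad_table = []
        # Initialize the temporary-variable counter
        self.temp_count = 0
        # Initialize array information (assume every array is two-dimensional, with a second dimension of size n2)
        self.array_info = {
            'A': {'dimensions': 2, 'size': [None, None]},
            'n2': 10,  # Assume the second dimension has size 10
            'w': 4  # Assume each element is 4 bytes
        }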

    def emit(self, op, arg1, arg2, result):
        """Generate a quadruple and append it to the quadruple table."""
        quad = Quadruple(op, arg1, arg2, result)
        self.quad_table.append(quad)
        return len(self.quad_table) - 1  # Return the quadruple's index

    def newtemp(self):
        """Generate a new temporary variable name."""
        self.temp_count += 1
        return f"t{self.temp_count}"

    def nextquad(self):
        """返回下一个四元式的索引"""
        return len(self.quad_table)

    def makelist(self, i):
        """Create a list containing only i (an empty list if i is None)."""
        return [i] if i is not None else []

    def merge(self, list1, list2):
        """合并两个链表"""
        if list1 is None:
            list1 = []
        if list2 is None:
            list2 = []
        return list1 + list2

    def backpatch(self, address_list, quad_index):
        """回填操作 - 将address_list中的所有地址回填为quad_index"""
        for i in address_list:
            if i < len(self.quad_table):
                self.quad_table[i].result = str(quad_index)

    def print_quadruples(self, output_file=None):
        """打印四元式表"""
        output = []
        for i, quad in enumerate(self.quad_table):
            quad_str = f"{i}: {quad.op},\t{quad.arg1},\t{quad.arg2},\t{quad.result}"
            output.append(quad_str)
            print(quad_str)

        if output_file:
            try:
                with open(output_file, 'w', encoding='utf-8') as f:
                    for line in output:
                        f.write(line + '\n')
            except Exception as e:
                print(f"Error while saving quadruples to file: {e}")


# The actual syntax and semantic analyzer
class CompleteSyntaxSemanticAnalyzer:
    def __init__(self):
        # Records of the productions used and of the derivation steps
        self.production_steps = []
        self.derivation_steps = []
        self.step_count = 0

        # Tokens currently being processed
        self.tokens = []
        self.current_token = None
        self.token_index = 0

        # Sentential form of the current derivation
        self.current_sentential = ["stmts"]

        # Semantic analyzer
        self.semantic = SemanticAnalyzer()

    def add_production(self, production):
        """添加一条使用的产生式"""
        self.step_count += 1
        self.production_steps.append(production)

    def add_derivation(self, sentential_form):
        """添加一个推导步骤"""
        self.derivation_steps.append(" ".join(sentential_form))

    def tokenize(self, text):
        """词法分析,将输入文本转换为token序列"""
        tokens = []

        # Token patterns. Order matters because the first matching pattern wins:
        # keywords end with \b so they do not swallow identifier prefixes, and
        # two-character operators come before their one-character prefixes.
        patterns = [
            (r'[ \t\n\r]+', None),  # Skip all whitespace
            (r'//.*', None),  # Skip comments
            (r'while\b', 'WHILE'),
            (r'if\b', 'IF'),
            (r'else\b', 'ELSE'),
            (r'[a-zA-Z_][a-zA-Z0-9_]*', 'ID'),
            (r'[0-9]+', 'NUM'),
            (r'\(', 'LPAREN'),
            (r'\)', 'RPAREN'),
            (r'\[', 'LBRACK'),
            (r'\]', 'RBRACK'),
            (r';', 'SEMICOLON'),
            (r',', 'COMMA'),
            (r'==', 'EQ'),
            (r'!=', 'NE'),
            (r'<=', 'LE'),
            (r'>=', 'GE'),
            (r'=', 'ASSIGN'),
            (r'\+', 'PLUS'),
            (r'-', 'MINUS'),
            (r'\*', 'MUL'),
            (r'/', 'DIV'),
            (r'<', 'LT'),
            (r'>', 'GT'),
            (r'&&', 'AND'),
            (r'\|\|', 'OR')
        ]

        # Match the patterns by hand
        pos = 0
        line = 1
        column = 1

        while pos < len(text):
            match = None

            # Try each pattern in turn
            for pattern, token_type in patterns:
                regex = re.compile(pattern)
                m = regex.match(text, pos)
                if m:
                    match = m
                    if token_type:  # Not a pattern to be skipped
                        value = m.group(0)
                        tokens.append(Token(token_type, value))

                    # Update the position information
                    matched_text = m.group(0)
                    newlines = matched_text.count('\n')
                    if newlines > 0:
                        line += newlines
                        column = len(matched_text) - matched_text.rindex('\n')
                    else:
                        column += len(matched_text)

                    pos = m.end()
                    break

            if not match:
                # If no pattern matched, report an error and skip the current character
                print(f"Unrecognized character: '{text[pos]}' at line {line}, column {column}")
                pos += 1
                column += 1

        return tokens

    def get_token(self):
        """获取当前token"""
        if self.token_index < len(self.tokens):
            self.current_token = self.tokens[self.token_index]
            self.token_index += 1
            return self.current_token
        else:
            self.current_token = None
            return None

    def peek_token(self):
        """查看下一个token但不消耗它"""
        if self.token_index < len(self.tokens):
            return self.tokens[self.token_index]
        return None

    def match(self, expected_type):
        """匹配当前token类型"""
        if self.current_token and self.current_token.type == expected_type:
            token = self.current_token
            self.get_token()
            return token
        else:
            expected = expected_type
            found = self.current_token.type if self.current_token else "EOF"
            raise SyntaxError(f"Syntax error: expected {expected}, but got {found}")

    def parse(self, text):
        """解析输入文本"""
        try:
            self.tokens = self.tokenize(text)
            if not self.tokens:
                print("警告: 没有识别到任何token,输入可能为空或仅包含空白字符")
                return False

            self.token_index = 0
            self.get_token()  # Load the first token

            self.production_steps = []
            self.derivation_steps = []
            self.step_count = 0
            self.semantic = SemanticAnalyzer()

            # Initial derivation step
            self.current_sentential = ["stmts"]
            self.add_derivation(self.current_sentential)

            stmts_attr = self.parse_stmts()
            return True
        except SyntaxError as e:
            print(f"Syntax analysis error: {e}")
            return False
        except Exception as e:
            print(f"Unknown error during parsing: {e}")
            import traceback
            traceback.print_exc()
            return False

    # The parsing functions below each return the attributes of their nonterminal

    def parse_stmts(self):
        """解析stmts"""
        # stmts ⟶ stmt rest0
        self.add_production("stmts ⟶ stmt rest0")

        # Update the derivation
        self.update_derivation("stmts", ["stmt", "rest0"])

        # Semantic actions
        stmt_attr = self.parse_stmt()
        rest0_attr = Attribute()
        rest0_attr.inNextlist = stmt_attr.nextlist
        rest0_attr = self.parse_rest0(rest0_attr)

        # Create the attributes for stmts
        stmts_attr = Attribute()
        stmts_attr.nextlist = rest0_attr.nextlist

        return stmts_attr

    def parse_rest0(self, inherited_attr):
        """解析rest0"""
        rest0_attr = Attribute()

        if self.current_token and self.current_token.type in ['ID', 'IF', 'WHILE']:
            # rest0 ⟶ m stmt rest01
            self.add_production("rest0 ⟶ m stmt rest01")

            # Update the derivation
            self.update_derivation("rest0", ["m", "stmt", "rest01"])

            # Semantic actions
            m_attr = self.parse_m()
            self.semantic.backpatch(inherited_attr.inNextlist, m_attr.quad)
            stmt_attr = self.parse_stmt()
            rest01_attr = Attribute()
            rest01_attr.inNextlist = stmt_attr.nextlist
            rest01_attr = self.parse_rest0(rest01_attr)

            # Create the attributes for rest0
            rest0_attr.nextlist = rest01_attr.nextlist
        else:
            # rest0 ⟶ ε
            self.add_production("rest0 ⟶ ℇ")

            # Update the derivation
            self.update_derivation("rest0", [])

            # Semantic actions
            rest0_attr.nextlist = inherited_attr.inNextlist

        return rest0_attr

    def parse_m(self):
        """解析m - 标记当前位置用于回填"""
        # m ⟶ ε
        # Semantic actions
        m_attr = Attribute()
        m_attr.quad = self.semantic.nextquad()
        return m_attr

    def parse_n(self):
        """解析n - 生成无条件跳转四元式"""
        # n ⟶ ε
        # Semantic actions
        n_attr = Attribute()
        n_attr.nextlist = self.semantic.makelist(self.semantic.nextquad())
        self.semantic.emit("j", "-", "-", "0")  # The 0 is replaced during backpatching
        return n_attr

    def parse_stmt(self):
        """解析stmt"""
        stmt_attr = Attribute()

        if self.current_token is None:
            raise SyntaxError("Syntax error: unexpected end of file, expected a statement")

        if self.current_token.type == 'WHILE':
            # stmt ⟶ while(m1 bool) m2 stmt1
            self.add_production("stmt ⟶ while(m1 bool) m2 stmt1")

            # Update the derivation
            self.update_derivation("stmt", ["while(m1", "bool)", "m2", "stmt1"])

            # Parse the components
            self.match('WHILE')
            self.match('LPAREN')
            m1_attr = self.parse_m()
            bool_attr = self.parse_bool()
            self.match('RPAREN')
            m2_attr = self.parse_m()
            stmt1_attr = self.parse_stmt()

            # Semantic actions
            self.semantic.backpatch(stmt1_attr.nextlist, m1_attr.quad)
            self.semantic.backpatch(bool_attr.truelist, m2_attr.quad)
            stmt_attr.nextlist = bool_attr.falselist
            self.semantic.emit("j", "-", "-", str(m1_attr.quad))

        elif self.current_token.type == 'IF':
            # stmt ⟶ if(bool) m1 stmt1 n else m2 stmt2
            self.add_production("stmt ⟶ if(bool) m1 stmt1 n else m2 stmt2")

            # Update the derivation
            self.update_derivation("stmt", ["if(bool)", "m1", "stmt1", "n", "else", "m2", "stmt2"])

            # Parse the components
            self.match('IF')
            self.match('LPAREN')
            bool_attr = self.parse_bool()
            self.match('RPAREN')
            m1_attr = self.parse_m()
            stmt1_attr = self.parse_stmt()
            n_attr = self.parse_n()
            self.match('ELSE')
            m2_attr = self.parse_m()
            stmt2_attr = self.parse_stmt()

            # Semantic actions
            self.semantic.backpatch(bool_attr.truelist, m1_attr.quad)
            self.semantic.backpatch(bool_attr.falselist, m2_attr.quad)
            stmt_attr.nextlist = self.semantic.merge(
                self.semantic.merge(stmt1_attr.nextlist, n_attr.nextlist),
                stmt2_attr.nextlist
            )

        elif self.current_token.type == 'ID':
            # stmt ⟶ loc = expr ;
            self.add_production("stmt ⟶ loc = expr ;")

            # Update the derivation
            self.update_derivation("stmt", ["loc", "=", "expr", ";"])

            # Parse the components
            loc_attr = self.parse_loc()
            self.match('ASSIGN')
            expr_attr = self.parse_expr()
            self.match('SEMICOLON')

            # Semantic actions: generate the assignment quadruple
            if loc_attr.offset is None:
                self.semantic.emit("=", expr_attr.place, "-", loc_attr.place)
            else:
                self.semantic.emit("[]=", expr_attr.place, "-", f"{loc_attr.place}[{loc_attr.offset}]")

            stmt_attr.nextlist = self.semantic.makelist(None)  # Empty list

        else:
            raise SyntaxError(f"Syntax error: invalid start of statement: {self.current_token.type}")

        return stmt_attr

    def parse_loc(self):
        """解析loc"""
        # loc ⟶ id resta
        self.add_production("loc ⟶ id resta")

        # Update the derivation
        self.update_derivation("loc", ["id", "resta"])

        # Semantic actions
        id_token = self.match('ID')
        resta_attr = Attribute()
        resta_attr.inArray = id_token.value
        resta_attr = self.parse_resta(resta_attr)

        # Create the attributes for loc
        loc_attr = Attribute()
        loc_attr.place = resta_attr.place
        loc_attr.offset = resta_attr.offset

        return loc_attr

    def parse_resta(self, inherited_attr):
        """解析resta"""
        resta_attr = Attribute()

        if self.current_token and self.current_token.type == 'LBRACK':
            # resta ⟶ [elist]
            self.add_production("resta ⟶ [ elist ]")

            # Update the derivation
            self.update_derivation("resta", ["[", "elist", "]"])

            # Semantic actions
            self.match('LBRACK')
            elist_attr = Attribute()
            elist_attr.inArray = inherited_attr.inArray
            elist_attr = self.parse_elist(elist_attr)
            self.match('RBRACK')

            # Generate quadruples for the array access
            # Multi-dimensional accesses are handled here
            if elist_attr.has_multiple_indices:
                # Two-dimensional array
                i_index = elist_attr.dims[0]
                j_index = elist_attr.dims[1]

                # Row offset (i * n2)
                t1 = self.semantic.newtemp()
                self.semantic.emit("*", i_index, "n2", t1)

                # Total offset (i * n2 + j)
                self.semantic.emit("+", t1, j_index, t1)

                # Array base address
                t2 = self.semantic.newtemp()
                self.semantic.emit("-", elist_attr.inArray, "C", t2)

                # Scale the offset by the element size
                t3 = self.semantic.newtemp()
                self.semantic.emit("*", t1, "w", t3)

                # Set the result
                resta_attr.place = t2
                resta_attr.offset = t3
            else:
                # One-dimensional array, or simplified handling
                resta_attr.place = self.semantic.newtemp()
                self.semantic.emit("-", elist_attr.array, "C", resta_attr.place)
                resta_attr.offset = self.semantic.newtemp()
                self.semantic.emit("*", elist_attr.offset, "w", resta_attr.offset)

        else:
            # resta ⟶ ε
            self.add_production("resta ⟶ ε")

            # Update the derivation
            self.update_derivation("resta", [])

            # Semantic actions
            resta_attr.place = inherited_attr.inArray
            resta_attr.offset = None

        return resta_attr

    def parse_elist(self, inherited_attr):
        """解析elist"""
        # elist ⟶ expr rest1
        self.add_production("elist ⟶ expr rest1")

        # Update the derivation
        self.update_derivation("elist", ["expr", "rest1"])

        # Semantic actions
        expr_attr = self.parse_expr()
        rest1_attr = Attribute()
        rest1_attr.inArray = inherited_attr.inArray
        rest1_attr.inNdim = 1
        rest1_attr.inPlace = expr_attr.place
        rest1_attr.dims = [expr_attr.place]  # Record the expression for the first dimension
        rest1_attr = self.parse_rest1(rest1_attr)

        # Create the attributes for elist
        elist_attr = Attribute()
        elist_attr.array = rest1_attr.array
        elist_attr.offset = rest1_attr.offset
        elist_attr.dims = rest1_attr.dims
        elist_attr.has_multiple_indices = len(rest1_attr.dims) > 1
        elist_attr.inArray = inherited_attr.inArray

        return elist_attr

    def parse_rest1(self, inherited_attr):
        """解析rest1"""
        rest1_attr = Attribute()
        rest1_attr.dims = inherited_attr.dims.copy() if hasattr(inherited_attr, 'dims') else []

        if self.current_token and self.current_token.type == 'COMMA':
            # rest1 ⟶ , expr rest1
            self.add_production("rest1 ⟶ , expr rest1")

            # Update the derivation
            self.update_derivation("rest1", [",", "expr", "rest1"])

            # Semantic actions: handle multi-dimensional array indices
            self.match('COMMA')
            expr_attr = self.parse_expr()

            # Record the expression for this dimension
            rest1_attr.dims.append(expr_attr.place)
            rest1_attr.inArray = inherited_attr.inArray
            rest1_attr.array = inherited_attr.inArray
            rest1_attr.has_multiple_indices = True

            # Continue with the remaining dimensions
            rest1_next = self.parse_rest1(rest1_attr)

            # Merge the results
            rest1_attr.array = rest1_next.array
            rest1_attr.offset = rest1_next.offset
            rest1_attr.dims = rest1_next.dims

        else:
            # rest1 ⟶ ε
            self.add_production("rest1 ⟶ ε")

            # Update the derivation
            self.update_derivation("rest1", [])

            # Semantic actions
            rest1_attr.array = inherited_attr.inArray
            rest1_attr.offset = inherited_attr.inPlace

        return rest1_attr

    def parse_bool(self):
        """解析bool"""
        # bool ⟶ equality
        self.add_production("bool ⟶ equality")

        # Update the derivation
        self.update_derivation("bool", ["equality"])

        # Semantic actions
        equality_attr = self.parse_equality()

        # Create the attributes for bool
        bool_attr = Attribute()
        bool_attr.truelist = equality_attr.truelist
        bool_attr.falselist = equality_attr.falselist

        return bool_attr

    def parse_equality(self):
        """解析equality"""
        # equality ⟶ rel rest4
        self.add_production("equality ⟶ rel rest4")

        # Update the derivation
        self.update_derivation("equality", ["rel", "rest4"])

        # Semantic actions
        rel_attr = self.parse_rel()
        rest4_attr = Attribute()
        rest4_attr.inTruelist = rel_attr.truelist
        rest4_attr.inFalselist = rel_attr.falselist
        rest4_attr = self.parse_rest4(rest4_attr)

        # Create the attributes for equality
        equality_attr = Attribute()
        equality_attr.truelist = rest4_attr.truelist
        equality_attr.falselist = rest4_attr.falselist

        return equality_attr

    def parse_rest4(self, inherited_attr):
        """Parse rest4."""
        rest4_attr = Attribute()

        if self.current_token and self.current_token.type == 'EQ':
            # rest4 ⟶ == rel rest41
            self.add_production("rest4 ⟶ == rel rest41")

            # Update the derivation
            self.update_derivation("rest4", ["==", "rel", "rest41"])

            # Semantic action (simplified): the right-hand rel is parsed and its
            # result discarded; rest4_attr.truelist and rest4_attr.falselist are
            # not assigned in this branch
            self.match('EQ')
            self.parse_rel()
            # Handle rest41...

        elif self.current_token and self.current_token.type == 'NE':
            # rest4 ⟶ != rel rest41
            self.add_production("rest4 ⟶ != rel rest41")

            # Update the derivation
            self.update_derivation("rest4", ["!=", "rel", "rest41"])

            # Semantic action (simplified): same limitation as the == branch
            self.match('NE')
            self.parse_rel()
            # Handle rest41...

        else:
            # rest4 ⟶ ε
            self.add_production("rest4 ⟶ ε")

            # Update the derivation
            self.update_derivation("rest4", [])

            # Semantic action: pass the inherited jump lists through unchanged
            rest4_attr.truelist = inherited_attr.inTruelist
            rest4_attr.falselist = inherited_attr.inFalselist

        return rest4_attr

    def parse_rel(self):
        """Parse rel."""
        # rel ⟶ expr rop_expr
        self.add_production("rel ⟶ expr rop_expr")

        # Update the derivation
        self.update_derivation("rel", ["expr", "rop_expr"])

        # Semantic action: the left operand's place is inherited by rop_expr
        expr_attr = self.parse_expr()
        rop_expr_attr = Attribute()
        rop_expr_attr.inPlace = expr_attr.place
        rop_expr_attr = self.parse_rop_expr(rop_expr_attr)

        # Build the attributes of rel
        rel_attr = Attribute()
        rel_attr.truelist = rop_expr_attr.truelist
        rel_attr.falselist = rop_expr_attr.falselist

        return rel_attr

    def parse_rop_expr(self, inherited_attr):
        """Parse rop_expr and emit its conditional and unconditional jump quadruples."""
        rop_expr_attr = Attribute()

        # Relational token type -> operator symbol as written in the source
        relops = {'LT': '<', 'GT': '>', 'LE': '<=', 'GE': '>='}
        token_type = self.current_token.type if self.current_token else None

        if token_type in relops:
            op = relops[token_type]
            # rop_expr ⟶ < expr | > expr | <= expr | >= expr
            self.add_production(f"rop_expr ⟶ {op} expr")

            # Update the derivation
            self.update_derivation("rop_expr", [op, "expr"])

            # Semantic action
            self.match(token_type)
            expr_attr = self.parse_expr()
            jump_op, arg1, arg2 = f"j{op}", inherited_attr.inPlace, expr_attr.place

        else:
            # rop_expr ⟶ ε
            self.add_production("rop_expr ⟶ ε")

            # Update the derivation
            self.update_derivation("rop_expr", [])

            # Semantic action: with no relational operator, test the operand
            # itself (non-zero means true)
            jump_op, arg1, arg2 = "jnz", inherited_attr.inPlace, "-"

        # Emit the conditional jump (true exit) followed by an unconditional
        # jump (false exit); both targets stay "-" until they are backpatched
        rop_expr_attr.truelist = self.semantic.makelist(self.semantic.nextquad())
        rop_expr_attr.falselist = self.semantic.makelist(self.semantic.nextquad() + 1)
        self.semantic.emit(jump_op, arg1, arg2, "-")
        self.semantic.emit("j", "-", "-", "-")

        return rop_expr_attr

    def parse_expr(self):
        """Parse expr."""
        # expr ⟶ term rest5
        self.add_production("expr ⟶ term rest5")

        # Update the derivation
        self.update_derivation("expr", ["term", "rest5"])

        # Semantic action: term's place is inherited by rest5
        term_attr = self.parse_term()
        rest5_attr = Attribute()
        rest5_attr.inPlace = term_attr.place
        rest5_attr = self.parse_rest5(rest5_attr)

        # Build the attributes of expr
        expr_attr = Attribute()
        expr_attr.place = rest5_attr.place

        return expr_attr

    def parse_rest5(self, inherited_attr):
        """Parse rest5, the tail of an additive expression."""
        rest5_attr = Attribute()

        # Additive token type -> operator symbol
        addops = {'PLUS': '+', 'MINUS': '-'}
        token_type = self.current_token.type if self.current_token else None

        if token_type in addops:
            op = addops[token_type]
            # rest5 ⟶ + term rest51 | - term rest51
            self.add_production(f"rest5 ⟶ {op} term rest5")

            # Update the derivation
            self.update_derivation("rest5", [op, "term", "rest5"])

            # Semantic action
            self.match(token_type)
            term_attr = self.parse_term()

            # Emit the quadruple; its result temporary becomes the left operand of rest51
            rest51_attr = Attribute()
            rest51_attr.inPlace = self.semantic.newtemp()
            self.semantic.emit(op, inherited_attr.inPlace, term_attr.place, rest51_attr.inPlace)
            rest51_attr = self.parse_rest5(rest51_attr)

            # Build the attributes of rest5
            rest5_attr.place = rest51_attr.place

        else:
            # rest5 ⟶ ε
            self.add_production("rest5 ⟶ ε")

            # Update the derivation
            self.update_derivation("rest5", [])

            # Semantic action
            rest5_attr.place = inherited_attr.inPlace

        return rest5_attr

    def parse_term(self):
        """Parse term."""
        # term ⟶ unary rest6
        self.add_production("term ⟶ unary rest6")

        # Update the derivation
        self.update_derivation("term", ["unary", "rest6"])

        # Semantic action: unary's place is inherited by rest6
        unary_attr = self.parse_unary()
        rest6_attr = Attribute()
        rest6_attr.inPlace = unary_attr.place
        rest6_attr = self.parse_rest6(rest6_attr)

        # Build the attributes of term
        term_attr = Attribute()
        term_attr.place = rest6_attr.place

        return term_attr

    def parse_rest6(self, inherited_attr):
        """Parse rest6, the tail of a multiplicative expression."""
        rest6_attr = Attribute()

        # Multiplicative token type -> operator symbol
        mulops = {'MUL': '*', 'DIV': '/'}
        token_type = self.current_token.type if self.current_token else None

        if token_type in mulops:
            op = mulops[token_type]
            # rest6 ⟶ * unary rest61 | / unary rest61
            self.add_production(f"rest6 ⟶ {op} unary rest6")

            # Update the derivation
            self.update_derivation("rest6", [op, "unary", "rest6"])

            # Semantic action
            self.match(token_type)
            unary_attr = self.parse_unary()

            # Emit the quadruple; its result temporary becomes the left operand of rest61
            rest61_attr = Attribute()
            rest61_attr.inPlace = self.semantic.newtemp()
            self.semantic.emit(op, inherited_attr.inPlace, unary_attr.place, rest61_attr.inPlace)
            rest61_attr = self.parse_rest6(rest61_attr)

            # Build the attributes of rest6
            rest6_attr.place = rest61_attr.place

        else:
            # rest6 ⟶ ε
            self.add_production("rest6 ⟶ ε")

            # Update the derivation
            self.update_derivation("rest6", [])

            # Semantic action
            rest6_attr.place = inherited_attr.inPlace

        return rest6_attr

    def parse_unary(self):
        """Parse unary."""
        # unary ⟶ factor
        self.add_production("unary ⟶ factor")

        # Update the derivation
        self.update_derivation("unary", ["factor"])

        # Semantic action
        factor_attr = self.parse_factor()

        # Build the attributes of unary
        unary_attr = Attribute()
        unary_attr.place = factor_attr.place

        return unary_attr

    def parse_factor(self):
        """Parse factor."""
        factor_attr = Attribute()

        if not self.current_token:
            raise SyntaxError("语法错误: 意外的文件结束,期望一个因子")

        if self.current_token.type == 'NUM':
            # factor ⟶ num
            self.add_production("factor ⟶ num")

            # Update the derivation
            self.update_derivation("factor", ["num"])

            # Semantic action
            num_token = self.match('NUM')
            factor_attr.place = num_token.value

        elif self.current_token.type == 'LPAREN':
            # factor ⟶ ( expr )
            self.add_production("factor ⟶ ( expr )")

            # Update the derivation
            self.update_derivation("factor", ["(", "expr", ")"])

            # Semantic action
            self.match('LPAREN')
            expr_attr = self.parse_expr()
            self.match('RPAREN')
            factor_attr.place = expr_attr.place

        elif self.current_token.type == 'ID':
            # factor ⟶ loc
            self.add_production("factor ⟶ loc")

            # Update the derivation
            self.update_derivation("factor", ["loc"])

            # Semantic action
            loc_attr = self.parse_loc()

            if loc_attr.offset is None:
                # Plain variable: use its place directly
                factor_attr.place = loc_attr.place
            else:
                factor_attr.place = self.semantic.newtemp()
                # Emit the quadruple that reads the array element into a temporary
                self.semantic.emit("=[]", f"{loc_attr.place}[{loc_attr.offset}]", "-", factor_attr.place)

        else:
            raise SyntaxError(f"语法错误: 无效的因子: {self.current_token.type}")

        return factor_attr

    def update_derivation(self, non_terminal, replacement):
        """Update the current sentential form: replace the first occurrence of non_terminal with replacement."""
        # Find the first occurrence of the non-terminal and replace it
        for i, symbol in enumerate(self.current_sentential):
            if symbol == non_terminal:
                # Splice replacement in place of non_terminal
                new_sentential = self.current_sentential[:i] + replacement + self.current_sentential[i + 1:]
                self.current_sentential = new_sentential

                # Record a derivation step
                if replacement:  # only when this is not an ε-production
                    self.add_derivation(self.current_sentential)

                return True

        return False

    def print_result(self):
        """Print the analysis results."""
        print("\n1) 按使用产生式过程")
        for i, step in enumerate(self.production_steps, 1):
            print(f"({i}){step}")

        print("\n2) 按推导过程")
        for i, step in enumerate(self.derivation_steps, 1):
            print(f"({i}) {step}")

    def save_result(self, file_name):
        """Save the analysis results to a file."""
        try:
            with open(file_name, 'w', encoding='utf-8') as f:
                f.write("1) 按使用产生式过程\n")
                for i, step in enumerate(self.production_steps, 1):
                    f.write(f"({i}){step}\n")

                f.write("\n2) 按推导过程\n")
                for i, step in enumerate(self.derivation_steps, 1):
                    f.write(f"({i}) {step}\n")
            return True
        except Exception as e:
            print(f"保存文件时出错: {e}")
            return False


# Read the test file
try:
    with open("test5.txt", "r", encoding="utf-8") as file, open('词法分析.txt', 'w', encoding='utf-8') as out_f:
        print('词法分析:')
        content = file.read()
        file.seek(0)  # Reset the file pointer to the start
        ch = file.read(1)
        while ch:
            if ch.isspace():
                ch = file.read(1)
                continue

            res = ""
            if ch.isalpha() or ch == '_':
                ch, res = Recognizestr(ch)
            elif ch.isdigit():
                ch, res = RecognizeDigit(ch)
            elif ch in {'+', '-', '*', '/', '%', '=', '>', '<', '!'}:
                ch, res = Recognizeop(ch)
            else:
                ch, res = Recognizeoth(ch)
            # Print each token and write it to the output file in the same format
            print(res)
            out_f.write(res + '\n')

        print('语法分析:')
        # Rewind the file to the start, then read the whole source for the parser
        file.seek(0)
        code = file.read()
        if not code.strip():
            print("警告: test5.txt文件为空或只包含空白字符")
except Exception as e:
    print(f"读取文件时出错: {e}")
    code = "x=A[i,j];"  # Default test case

# Create the analyzer and parse the code
analyzer = CompleteSyntaxSemanticAnalyzer()
success = analyzer.parse(code)

if success:
    # Print the analysis results
    analyzer.print_result()
    if analyzer.save_result("语法分析.txt"):
        print('\n语法分析完成,结果已保存到"语法分析.txt"文件中。')
    else:
        print("\n语法分析完成,但保存文件时出错。")
else:
    print("\n语法分析失败。")

print('语义分析:')
print("\n生成的四元式:")
analyzer.semantic.print_quadruples("语义分析.txt")

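III. Summary

This experiment implements a recursive-descent translator that runs lexical, syntax, and semantic analysis on the test code and generates quadruple intermediate code. The system has three modules. Lexical analysis recognizes each kind of symbol and outputs the token sequence. Syntax analysis uses recursive descent and records both the productions used and the derivation steps. Semantic analysis generates quadruples through syntax-directed translation, handling backpatching for control flow and array access. The key data structures are the quadruple table, the attribute structure, and the temporary-variable allocator, which together support translating boolean-expression jumps, assignment statements, and array element accesses. One limitation remains: the `==` and `!=` branches of rest4 are only parsed in simplified form and do not yet produce jump lists. The test case exercises the whole pipeline from source code to intermediate code, and the output covers all three phases.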
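
The backpatching step mentioned above can be sketched in isolation. The following is a minimal, self-contained illustration, not the article's actual semantic module: `QuadTable`, `translate_while`, the `backpatch` method, and the starting label 0 are all assumptions made for this example. Only the method names `emit`, `nextquad`, and `makelist` are taken from the parser code. The sketch hand-translates `while (a < b) x = y;` in the same order the parser would.

```python
class QuadTable:
    """Hypothetical stand-in for the semantic helper used by the parser."""

    def __init__(self, start=0):
        self.start = start   # label of the first quadruple (assumed to be 0)
        self.quads = []      # each entry is [op, arg1, arg2, result]

    def nextquad(self):
        """Return the label the next emitted quadruple will get."""
        return self.start + len(self.quads)

    def emit(self, op, arg1, arg2, result):
        """Append one quadruple to the table."""
        self.quads.append([op, arg1, arg2, result])

    @staticmethod
    def makelist(label):
        """Create a jump list holding a single quadruple label."""
        return [label]

    def backpatch(self, labels, target):
        """Fill the 4th field of every listed jump quadruple with target."""
        for label in labels:
            self.quads[label - self.start][3] = target


def translate_while(table):
    """Hand-translate `while (a < b) x = y;` and return the loop's exit list."""
    m1 = table.nextquad()                        # start of the condition
    truelist = table.makelist(table.nextquad())
    falselist = table.makelist(table.nextquad() + 1)
    table.emit("j<", "a", "b", "-")              # true exit, target unknown yet
    table.emit("j", "-", "-", "-")               # false exit, target unknown yet
    m2 = table.nextquad()                        # start of the loop body
    table.emit("=", "y", "-", "x")
    table.emit("j", "-", "-", m1)                # jump back to the condition
    table.backpatch(truelist, m2)                # condition true -> loop body
    return falselist                             # patched once the exit is known


table = QuadTable()
nextlist = translate_while(table)
table.backpatch(nextlist, table.nextquad())      # loop exit = next free label
for label, quad in enumerate(table.quads, table.start):
    print(label, tuple(quad))
```

With these assumptions the script prints four quadruples: `j<` targets label 2, the false-exit jump targets label 4, and the last jump returns to label 0. The jump targets are written as "-" first because the body's start and the loop's exit are not known when the condition is translated, which is why the 4th field has to be filled in afterwards.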
