oracle 00283,ORA-00283: recovery session canceled due to errors
某客户的核心数据库存储问题导致数据库重启后无法正常启动,根据客户反馈最开始在启动数据库时报错控制文件IO错误,如下:Sun Mar 15 11:59:37 2015Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_arc1_630876.trc:ORA-00204: error in reading (block 1, # bloc
某客户的核心数据库存储问题导致数据库重启后无法正常启动,根据客户反馈最开始在启动数据库时
报错控制文件IO错误,如下:
Sun Mar 15 11:59:37 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_arc1_630876.trc:
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '/xxx/xxxx/control01.ctl'
ORA-17500: ODM err:ODM ERROR V-41-4-2-43-6 No such device or address
Sun Mar 15 11:59:37 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_arc1_630876.trc:
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '/xxx/xxxx/control01.ctl'
ORA-17500: ODM err:ODM ERROR V-41-4-2-43-6 No such device or address
Sun Mar 15 11:59:37 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_arc1_630876.trc:
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '/xxx/xxxx/control01.ctl'
ORA-17500: ODM err:ODM ERROR V-41-4-2-43-6 No such device or address
Sun Mar 15 11:59:37 2015
Master background archival failure: 204
Sun Mar 15 11:59:49 2015
Termination issued to instance processes. Waiting for the processes to exit
Sun Mar 15 15:40:09 2015
Starting ORACLE instance (normal)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
SunMar1511:59:372015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_arc1_630876.trc:
ORA-00204:errorinreading(block1,# blocks 1) of control file
ORA-00202:controlfile:'/xxx/xxxx/control01.ctl'
ORA-17500:ODMerr:ODMERRORV-41-4-2-43-6Nosuchdeviceoraddress
SunMar1511:59:372015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_arc1_630876.trc:
ORA-00204:errorinreading(block1,# blocks 1) of control file
ORA-00202:controlfile:'/xxx/xxxx/control01.ctl'
ORA-17500:ODMerr:ODMERRORV-41-4-2-43-6Nosuchdeviceoraddress
SunMar1511:59:372015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_arc1_630876.trc:
ORA-00204:errorinreading(block1,# blocks 1) of control file
ORA-00202:controlfile:'/xxx/xxxx/control01.ctl'
ORA-17500:ODMerr:ODMERRORV-41-4-2-43-6Nosuchdeviceoraddress
SunMar1511:59:372015
Masterbackgroundarchivalfailure:204
SunMar1511:59:492015
Terminationissuedtoinstanceprocesses.Waitingfortheprocessestoexit
SunMar1515:40:092015
StartingORACLEinstance(normal)
上述的问题本质上都跟控制文件有关系,替换掉损坏的控制文件就行。当替换掉控制文件之后,在open数据库时发现报如下错误:
Sun Mar 15 16:10:48 2015
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_626734.trc:
ORA-00600: internal error code, arguments: [kcrfr_update_nab_2], [0x70000038F8C94E0], [2], [], [], [], [], []
Abort recovery for domain 0
Sun Mar 15 16:10:49 2015
Aborting crash recovery due to error 600
Sun Mar 15 16:10:49 2015
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_626734.trc:
ORA-00600: internal error code, arguments: [kcrfr_update_nab_2], [0x70000038F8C94E0], [2], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...
Sun Mar 15 16:10:49 2015
Trace dumping is performing id=[cdmp_20150315161049]
Sun Mar 15 16:12:35 2015
Shutting down instance: further logons disabled
Sun Mar 15 16:12:35 2015
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
SunMar1516:10:482015
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_626734.trc:
ORA-00600:internalerrorcode,arguments:[kcrfr_update_nab_2],[0x70000038F8C94E0],[2],[],[],[],[],[]
Abortrecoveryfordomain0
SunMar1516:10:492015
Abortingcrashrecoveryduetoerror600
SunMar1516:10:492015
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_626734.trc:
ORA-00600:internalerrorcode,arguments:[kcrfr_update_nab_2],[0x70000038F8C94E0],[2],[],[],[],[],[]
ORA-600signalledduring:ALTERDATABASEOPEN...
SunMar1516:10:492015
Tracedumpingisperformingid=[cdmp_20150315161049]
SunMar1516:12:352015
Shuttingdowninstance:furtherlogonsdisabled
SunMar1516:12:352015
该错误本质上是因为redo的问题,即有redo log损坏。通过在RMAN进行recover,发现报如下类似错误:
Sun Mar 15 16:47:59 2015
Beginning crash recovery of 2 threads
parallel recovery setup failed: using serial mode
Sun Mar 15 16:47:59 2015
Started redo scan
Sun Mar 15 16:47:59 2015
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_299470.trc:
ORA-00313: open failed for members of log group 5 of thread 2
ORA-00312: online log 5 thread 2: '/xxx/xxxx/redo05a.log'
ORA-17503: ksfdopn:4 Failed to open file /xxx/xxxx/redo05a.log
ORA-17500: ODM err:File does not exist
Sun Mar 15 16:47:59 2015
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_299470.trc:
ORA-00313: open failed for members of log group 4 of thread 1
ORA-00312: online log 4 thread 1: '/xxx/xxxx/redo04a.log'
ORA-17503: ksfdopn:4 Failed to open file /xxx/xxxx/redo04a.log
ORA-17500: ODM err:File does not exist
Sun Mar 15 17:03:03 2015
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_299470.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 2009344 change 14160745159583 time 03/15/2015 11:56:29
ORA-00334: archived log: '/xxx/xxxx/redo04b.log'
Sun Mar 15 17:03:03 2015
Abort recovery for domain 0
Sun Mar 15 17:03:03 2015
Aborting crash recovery due to error 354
Sun Mar 15 17:03:03 2015
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_299470.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 2009344 change 14160745159583 time 03/15/2015 11:56:29
ORA-00312: online log 4 thread 1: '/xxx/xxxx/redo04b.log'
ORA-354 signalled during: ALTER DATABASE OPEN...
Sun Mar 15 17:08:02 2015
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
SunMar1516:47:592015
Beginningcrashrecoveryof2threads
parallelrecoverysetupfailed:usingserialmode
SunMar1516:47:592015
Startedredoscan
SunMar1516:47:592015
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_299470.trc:
ORA-00313:openfailedformembersofloggroup5ofthread2
ORA-00312:onlinelog5thread2:'/xxx/xxxx/redo05a.log'
ORA-17503:ksfdopn:4Failedtoopenfile/xxx/xxxx/redo05a.log
ORA-17500:ODMerr:Filedoesnotexist
SunMar1516:47:592015
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_299470.trc:
ORA-00313:openfailedformembersofloggroup4ofthread1
ORA-00312:onlinelog4thread1:'/xxx/xxxx/redo04a.log'
ORA-17503:ksfdopn:4Failedtoopenfile/xxx/xxxx/redo04a.log
ORA-17500:ODMerr:Filedoesnotexist
SunMar1517:03:032015
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_299470.trc:
ORA-00354:corruptredologblockheader
ORA-00353:logcorruptionnearblock2009344change14160745159583time03/15/201511:56:29
ORA-00334:archivedlog:'/xxx/xxxx/redo04b.log'
SunMar1517:03:032015
Abortrecoveryfordomain0
SunMar1517:03:032015
Abortingcrashrecoveryduetoerror354
SunMar1517:03:032015
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_299470.trc:
ORA-00354:corruptredologblockheader
ORA-00353:logcorruptionnearblock2009344change14160745159583time03/15/201511:56:29
ORA-00312:onlinelog4thread1:'/xxx/xxxx/redo04b.log'
ORA-354signalledduring:ALTERDATABASEOPEN...
SunMar1517:08:022015
上述过程大致是客户之前的处理过程。我在18点左右介入之后,进行了相关的操作。我最开始尝试在利用RMAN 进行恢复,发现报错:
RMAN> recover database;
Starting recover at 15-MAR-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=3268 instance=xxxx2 devtype=DISK
starting media recovery
media recovery failed
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 03/15/2015 18:39:26
ORA-00283: recovery session canceled due to errors
RMAN-11003: failure during parse/execution of SQL statement: alter database recover if needed
start
ORA-00283: recovery session canceled due to errors
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 1788672 change 14160744248478 time 03/15/2015 11:54:46
ORA-00312: online log 4 thread 1: '/xxx/xxxx/redo04a.log'
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
RMAN>recoverdatabase;
Startingrecoverat15-MAR-15
usingtargetdatabasecontrolfileinsteadofrecoverycatalog
allocatedchannel:ORA_DISK_1
channelORA_DISK_1:sid=3268instance=xxxx2devtype=DISK
startingmediarecovery
mediarecoveryfailed
RMAN-00571:===========================================================
RMAN-00569:===============ERRORMESSAGESTACKFOLLOWS===============
RMAN-00571:===========================================================
RMAN-03002:failureofrecovercommandat03/15/201518:39:26
ORA-00283:recoverysessioncanceledduetoerrors
RMAN-11003:failureduringparse/executionofSQLstatement:alterdatabaserecoverifneeded
start
ORA-00283:recoverysessioncanceledduetoerrors
ORA-00354:corruptredologblockheader
ORA-00353:logcorruptionnearblock1788672change14160744248478time03/15/201511:54:46
ORA-00312:onlinelog4thread1:'/xxx/xxxx/redo04a.log'
从上面的错误来看,初步可以判断redo04a.log文件已经损坏,而且是block 1788672的问题。为了验证该block是否损坏,我通过类似如下的dump 命令进行dump,发现报错:
alter system dump logfile 'xxx' scn min xxxx scn max xxxx;
1
altersystemdumplogfile'xxx'scnminxxxxscnmaxxxxx;
由此判断,该block损坏无疑。 由于客户的需求是尽可能快的将数据库拉起来,因此对应redo损坏的情况之下。
通常只能进程不完全恢复并强制打开,这里我使用了如下的参数:
*._allow_resetlogs_corruption=TRUE
*._allow_error_simulation=TRUE
1
2
*._allow_resetlogs_corruption=TRUE
*._allow_error_simulation=TRUE
在open resetlogs之前,我已经将redo备份,resetlogs打开时,发现数据库报错如下:
Sun Mar 15 19:43:36 2015
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sun Mar 15 19:43:36 2015
SMON: enabling cache recovery
Sun Mar 15 19:43:37 2015
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Sun Mar 15 19:43:37 2015
ORA-01555 caused by SQL statement below (SQL ID: 5wc2915k44m38, Query Duration=0 sec, SCN: 0x0ce1.0e2d8971):
Sun Mar 15 19:43:37 2015
select user#,type# from user$ where name=:1
Sun Mar 15 19:43:37 2015
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_667814.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 25 with name "_SYSSMU25$" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 667814
ORA-1092 signalled during: alter database open resetlogs...
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
SunMar1519:43:362015
MTTRadvisoryisdisabledbecauseFAST_START_MTTR_TARGETisnotset
SunMar1519:43:362015
SMON:enablingcacherecovery
SunMar1519:43:372015
Instancerecovery:lookingfordeadthreads
Instancerecovery:lockdomaininvalidbutnodeadthreads
SunMar1519:43:372015
ORA-01555causedbySQLstatementbelow(SQLID:5wc2915k44m38,QueryDuration=0sec,SCN:0x0ce1.0e2d8971):
SunMar1519:43:372015
selectuser#,type# from user$ where name=:1
SunMar1519:43:372015
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_667814.trc:
ORA-00704:bootstrapprocessfailure
ORA-00604:erroroccurredatrecursiveSQLlevel1
ORA-01555:snapshottooold:rollbacksegmentnumber25withname"_SYSSMU25$"toosmall
Error704happenedduringdbopen,shuttingdowndatabase
USER:terminatinginstanceduetoerror704
InstanceterminatedbyUSER,pid=667814
ORA-1092signalledduring:alterdatabaseopenresetlogs...
从日志来看,大致判断可能是_SYSSMU25$ 回滚段的问题,因此尝试先通过如下隐含参数屏蔽回滚段:
_corrupted_rollback_segments=_SYSSMU25$
_offline_rollback_segments=_SYSSMU25$
1
2
_corrupted_rollback_segments=_SYSSMU25$
_offline_rollback_segments=_SYSSMU25$
屏蔽回滚段之后,尝试打开数据库,发现错误依旧,通过10046 trace跟踪,发现递归SQL在如下的block上执行失败:
PARSING IN CURSOR #3 len=43 dep=1 uid=0 oct=3 lid=0 tim=37951056727245 hv=1682066536 ad='8cb74a90'
select user#,type# from user$ where name=:1
END OF STMT
PARSE #3:c=0,e=372,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=37951056727243
BINDS #3:
kkscoacd
Bind#0
oacdty=01 mxl=32(03) mxlc=00 mal=00 scl=00 pre=00
oacflg=18 fl2=0001 frm=01 csi=852 siz=32 off=0
kxsbbbfp=1126d4b70 bln=32 avl=03 flg=05
value="XDB"
EXEC #3:c=0,e=465,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=37951056727780
WAIT #3: nam='db file sequential read' ela= 1582 file#=1 block#=282 blocks=1 obj#=44 tim=37951056729421
WAIT #3: nam='db file sequential read' ela= 6642 file#=1 block#=91 blocks=1 obj#=22 tim=37951056736126
1
2
3
4
5
6
7
8
9
10
11
12
13
14
PARSINGINCURSOR#3 len=43 dep=1 uid=0 oct=3 lid=0 tim=37951056727245 hv=1682066536 ad='8cb74a90'
selectuser#,type# from user$ where name=:1
ENDOFSTMT
PARSE#3:c=0,e=372,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=37951056727243
BINDS#3:
kkscoacd
Bind#0
oacdty=01mxl=32(03)mxlc=00mal=00scl=00pre=00
oacflg=18fl2=0001frm=01csi=852siz=32off=0
kxsbbbfp=1126d4b70bln=32avl=03flg=05
value="XDB"
EXEC#3:c=0,e=465,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,tim=37951056727780
WAIT#3: nam='db file sequential read' ela= 1582 file#=1 block#=282 blocks=1 obj#=44 tim=37951056729421
WAIT#3: nam='db file sequential read' ela= 6642 file#=1 block#=91 blocks=1 obj#=22 tim=37951056736126
通过dump file 1 block 91,发现该block上第2个ITL确认存在一个活跃事务。原本计划直接bbed提交该事务,但是当我编译好bbed之后,查看发现该block为一个cluster block.
对于cluster block的事务修改,相对复杂一些,我的博客有文章描述,大家可以参考,这里不多说。考虑到生产库使用bbed有一定的风险,我并没有使用bbed。
接着使用undo_management参数启动数据库,然后强制open数据库,发现错误变成如下:
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_778430.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [3297], [238525189], [3297], [238091117], [], [], []
Sun Mar 15 20:50:52 2015
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Sun Mar 15 20:50:53 2015
Redo thread 1 internally disabled at seq 1 (CKPT)
Sun Mar 15 20:50:53 2015
ARC1: Archiving disabled thread 1 sequence 1
Sun Mar 15 20:50:54 2015
Trace dumping is performing id=[cdmp_20150315205054]
Sun Mar 15 20:50:54 2015
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_778430.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [3297], [238525189], [3297], [238091117], [], [], []
Sun Mar 15 20:50:54 2015
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 778430
ORA-1092 signalled during: alter database open resetlogs...
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_778430.trc:
ORA-00600:internalerrorcode,arguments:[kclchkblk_4],[3297],[238525189],[3297],[238091117],[],[],[]
SunMar1520:50:522015
Instancerecovery:lookingfordeadthreads
Instancerecovery:lockdomaininvalidbutnodeadthreads
SunMar1520:50:532015
Redothread1internallydisabledatseq1(CKPT)
SunMar1520:50:532015
ARC1:Archivingdisabledthread1sequence1
SunMar1520:50:542015
Tracedumpingisperformingid=[cdmp_20150315205054]
SunMar1520:50:542015
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_778430.trc:
ORA-00600:internalerrorcode,arguments:[kclchkblk_4],[3297],[238525189],[3297],[238091117],[],[],[]
SunMar1520:50:542015
Error600happenedduringdbopen,shuttingdowndatabase
USER:terminatinginstanceduetoerror600
InstanceterminatedbyUSER,pid=778430
ORA-1092signalledduring:alterdatabaseopenresetlogs...
从错误来看,我们就可以知道,这应该是SCN的问题。如果要手工推进SCN,那么level应该待遇3297*4才行,由于这里的238091117/1024/1024/1024小于1,因此推进scn时,level=3297*4+2 就差不多了。 这里我再次进行了10046 trace,发现了如下信息:
WAIT #5: nam='db file sequential read' ela= 1021 file#=1 block#=400 blocks=1 obj#=0 tim=37953716489772
EXEC #5:c=0,e=2969,p=1,cr=2,cu=3,mis=1,r=1,dep=1,og=4,tim=37953716490098
STAT #5 id=1 cnt=1 pid=0 pos=1 obj=0 op='UPDATE UNDO$ (cr=2 pr=1 pw=0 time=1542 us)'
STAT #5 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=2 pr=0 pw=0 time=11 us)'
WAIT #1: nam='row cache lock' ela= 71 cache id=3 mode=0 request=5 obj#=0 tim=37953716490369
WAIT #1: nam='db file sequential read' ela= 5783 file#=2 block#=25 blocks=1 obj#=0 tim=37953716496201
........
........
GLOBAL CACHE ELEMENT DUMP (address: 700000011f91498):
id1: 0x19 id2: 0x20000 obj: US#2 block: (2/25)
lock: SL rls: 0x0000 acq: 0x0000 latch: 3
flags: 0x41 fair: 0 recovery: 0 fpin: 'ktuwh02: ktugus'
bscn: 0x0.0 bctx: 0 write: 0 scan: 0x0 xflg: 0 xid: 0x0.0.0
lcp: 0 lnk: [NULL] lch: [70000023bf4bc20,70000023bf4bc20]
seq: 3 hist: 143:0 208 352
LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
flg: 0x00080000 state: READING mode: EXCL
pin: 'ktuwh02: ktugus'
addr: 70000023bf4bb10 obj: INVALID cls: UNDO HEAD bscn: 0xce1.e379b05 ---这里的bscn即scn值
GCS SHADOW 700000011f91508,1 sq[70000035fb339f8,70000035fb339f8] resp[70000035fb339d0,0x19.20000] pkey 4294950914
grant 1 cvt 0 mdrole 0x21 st 0x40 GRANTQ rl LOCAL
master 1 owner 1 sid 0 remote[0,0] hist 0x106
history 0x6.0x4.0x0.0x0.0x0.0x0. cflag 0x0 sender 0 flags 0x0 replay# 0
disk: 0x0000.00000000 write request: 0x0000.00000000
pi scn: 0x0000.00000000
msgseq 0x0 updseq 0x0 reqids[1,0,0] infop 0x0
GCS RESOURCE 70000035fb339d0 hashq [70000038cbc6658,70000038cbc6658] name[0x19.20000] pkey 4294950914
grant 700000011f91508 cvt 0 send 0,0 write 0,0@65535
flag 0x0 mdrole 0x1 mode 1 scan 0 role LOCAL
disk: 0x0000.00000000 write: 0x0000.00000000 cnt 0x0 hist 0x0
xid 0x0000.000.00000000 sid 0 pkwait 59s
pkey 4294950914
hv 0 [stat 0x0, 1->1, wm 32767, RMno 0, remxxx 0, dom 0]
kjga st 0x4, step 0.0.0, cxxx 2, rmno 0, flags 0x0
lb 0, hb 0, myb 6147, drmb 6147, apifrz 0
GCS SHADOW 700000011f91508,1 sq[70000035fb339f8,70000035fb339f8] resp[70000035fb339d0,0x19.20000] pkey 4294950914
grant 1 cvt 0 mdrole 0x21 st 0x40 GRANTQ rl LOCAL
master 1 owner 1 sid 0 remote[0,0] hist 0x106
history 0x6.0x4.0x0.0x0.0x0.0x0. cflag 0x0 sender 0 flags 0x0 replay# 0
disk: 0x0000.00000000 write request: 0x0000.00000000
pi scn: 0x0000.00000000
msgseq 0x0 updseq 0x0 reqids[1,0,0] infop 0x0
kjbmbassert [0x19.20000]
*** 2015-03-15 20:54:54.385
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kclchkblk_4], [3297], [238525189], [3297], [238091117], [], [], []
Current SQL statement for this session:
alter database open resetlogs
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
WAIT#5: nam='db file sequential read' ela= 1021 file#=1 block#=400 blocks=1 obj#=0 tim=37953716489772
EXEC#5:c=0,e=2969,p=1,cr=2,cu=3,mis=1,r=1,dep=1,og=4,tim=37953716490098
STAT#5 id=1 cnt=1 pid=0 pos=1 obj=0 op='UPDATE UNDO$ (cr=2 pr=1 pw=0 time=1542 us)'
STAT#5 id=2 cnt=1 pid=1 pos=1 obj=34 op='INDEX UNIQUE SCAN I_UNDO1 (cr=2 pr=0 pw=0 time=11 us)'
WAIT#1: nam='row cache lock' ela= 71 cache id=3 mode=0 request=5 obj#=0 tim=37953716490369
WAIT#1: nam='db file sequential read' ela= 5783 file#=2 block#=25 blocks=1 obj#=0 tim=37953716496201
........
........
GLOBALCACHEELEMENTDUMP(address:700000011f91498):
id1:0x19id2:0x20000obj:US#2 block: (2/25)
lock:SLrls:0x0000acq:0x0000latch:3
flags:0x41fair:0recovery:0fpin:'ktuwh02: ktugus'
bscn:0x0.0bctx:0write:0scan:0x0xflg:0xid:0x0.0.0
lcp:0lnk:[NULL]lch:[70000023bf4bc20,70000023bf4bc20]
seq:3hist:143:0208352
LISTOFBUFFERSLINKEDTOTHISGLOBALCACHEELEMENT:
flg:0x00080000state:READINGmode:EXCL
pin:'ktuwh02: ktugus'
addr:70000023bf4bb10obj:INVALIDcls:UNDOHEADbscn:0xce1.e379b05---这里的bscn即scn值
GCSSHADOW700000011f91508,1sq[70000035fb339f8,70000035fb339f8]resp[70000035fb339d0,0x19.20000]pkey4294950914
grant1cvt0mdrole0x21st0x40GRANTQrlLOCAL
master1owner1sid0remote[0,0]hist0x106
history0x6.0x4.0x0.0x0.0x0.0x0.cflag0x0sender0flags0x0replay# 0
disk:0x0000.00000000writerequest:0x0000.00000000
piscn:0x0000.00000000
msgseq0x0updseq0x0reqids[1,0,0]infop0x0
GCSRESOURCE70000035fb339d0hashq[70000038cbc6658,70000038cbc6658]name[0x19.20000]pkey4294950914
grant700000011f91508cvt0send0,0write0,0@65535
flag0x0mdrole0x1mode1scan0roleLOCAL
disk:0x0000.00000000write:0x0000.00000000cnt0x0hist0x0
xid0x0000.000.00000000sid0pkwait59s
pkey4294950914
hv0[stat0x0,1->1,wm32767,RMno0,remxxx0,dom0]
kjgast0x4,step0.0.0,cxxx2,rmno0,flags0x0
lb0,hb0,myb6147,drmb6147,apifrz0
GCSSHADOW700000011f91508,1sq[70000035fb339f8,70000035fb339f8]resp[70000035fb339d0,0x19.20000]pkey4294950914
grant1cvt0mdrole0x21st0x40GRANTQrlLOCAL
master1owner1sid0remote[0,0]hist0x106
history0x6.0x4.0x0.0x0.0x0.0x0.cflag0x0sender0flags0x0replay# 0
disk:0x0000.00000000writerequest:0x0000.00000000
piscn:0x0000.00000000
msgseq0x0updseq0x0reqids[1,0,0]infop0x0
kjbmbassert[0x19.20000]
***2015-03-1520:54:54.385
ksedmp:internalorfatalerror
ORA-00600:internalerrorcode,arguments:[kclchkblk_4],[3297],[238525189],[3297],[238091117],[],[],[]
CurrentSQLstatementforthissession:
alterdatabaseopenresetlogs
bscn: 0xce1.e379b05 将该scn进行转换,我们可以发现:0xce1 为3297,e379b05为238525189. 与上述报错信息一致。同时我发现这里使用了第2号回滚段,如下:
id1: 0x19 id2: 0x20000 obj: US#2 block: (2/25)
1
id1:0x19id2:0x20000obj:US#2 block: (2/25)
因此,尝试继续使用隐含参数屏蔽这第2号回滚段,并尝试打开数据库,但是错误依旧。看来还是需要调整SCN才行,如下:
Sun Mar 15 21:23:20 2015
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_774222.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [3297], [238958669], [3297], [238091118], [], [], []
Sun Mar 15 21:23:20 2015
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_774222.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
ORA-00600: internal error code, arguments: [kclchkblk_4], [3297], [238958669], [3297], [238091118], [], [], []
Sun Mar 15 21:23:21 2015
Errors in file /oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_774222.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [3297], [238958669], [3297], [238091118], [], [], []
Sun Mar 15 21:23:21 2015
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 774222
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
SunMar1521:23:202015
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_774222.trc:
ORA-00600:internalerrorcode,arguments:[kclchkblk_4],[3297],[238958669],[3297],[238091118],[],[],[]
SunMar1521:23:202015
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_774222.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
ORA-00600:internalerrorcode,arguments:[kclchkblk_4],[3297],[238958669],[3297],[238091118],[],[],[]
SunMar1521:23:212015
Errorsinfile/oracle/app1/oracle/admin/xxxx/udump/xxxx2_ora_774222.trc:
ORA-00600:internalerrorcode,arguments:[kclchkblk_4],[3297],[238958669],[3297],[238091118],[],[],[]
SunMar1521:23:212015
Error600happenedduringdbopen,shuttingdowndatabase
USER:terminatinginstanceduetoerror600
InstanceterminatedbyUSER,pid=774222
首先我尝试了在会话级别设置:
alter session set events '10015 trace name adjust_scn level 13190';
1
altersessionsetevents'10015 trace name adjust_scn level 13190';
发现alter database open失败,尝试使用*._minimum_giga_scn参数,但是在启动的时候,提示说该参数不支持。从此判断,该环境可能是安装了比较新的PSU,Oracle将该参数废弃掉了,这么说前面的10015 event根本就没起作用。 无奈只能通过oradebug手工修改SCN来启动数据库了,如下:
SQL> startup mount pfile='/tmp/gb.ora';
ORACLE instance started.
Total System Global Area 1.5032E+10 bytes
Fixed Size 2110096 bytes
Variable Size 5704256880 bytes
Database Buffers 9311354880 bytes
Redo Buffers 14663680 bytes
Database mounted.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug DUMPvar SGA kcsgscn_
kcslf kcsgscn_ [7000000100122A8, 7000000100122D8) = 00000000 00000005 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 07000000 ...
SQL>
SQL> oradebug poke 0x7000000100122A8 4 3300
BEFORE: [7000000100122A8, 7000000100122AC) = 00000000
AFTER: [7000000100122A8, 7000000100122AC) = 00000CE4
SQL> oradebug DUMPvar SGA kcsgscn_
kcslf kcsgscn_ [7000000100122A8, 7000000100122D8) = 00000CE4 00000005 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 07000000 ...
SQL>
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
SQL>startupmountpfile='/tmp/gb.ora';
ORACLEinstancestarted.
TotalSystemGlobalArea1.5032E+10bytes
FixedSize2110096bytes
VariableSize5704256880bytes
DatabaseBuffers9311354880bytes
RedoBuffers14663680bytes
Databasemounted.
SQL>oradebugsetmypid
Statementprocessed.
SQL>oradebugDUMPvarSGAkcsgscn_
kcslfkcsgscn_[7000000100122A8,7000000100122D8)=0000000000000005000000000000000000000000000000000000000000000000000000000000000007000000...
SQL>
SQL>oradebugpoke0x7000000100122A843300
BEFORE:[7000000100122A8,7000000100122AC)=00000000
AFTER:[7000000100122A8,7000000100122AC)=00000CE4
SQL>oradebugDUMPvarSGAkcsgscn_
kcslfkcsgscn_[7000000100122A8,7000000100122D8)=00000CE400000005000000000000000000000000000000000000000000000000000000000000000007000000...
SQL>
修改SCN之后,顺利打开了数据库,但是数据库很快就crash掉,如下是日志信息:
Sun Mar 15 21:47:31 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
......
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
Sun Mar 15 21:47:33 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600: internal error code, arguments: [6006], [1], [], [], [], [], [], []
QMNC started with pid=32, OS id=520520
Sun Mar 15 21:47:35 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
ORA-00600: internal error code, arguments: [6006], [1], [], [], [], [], [], []
Sun Mar 15 21:47:35 2015
ORACLE Instance xxxx2 (pid = 22) - Error 600 encountered while recovering transaction (44, 26) on object 47098.
Sun Mar 15 21:47:35 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600: internal error code, arguments: [6006], [1], [], [], [], [], [], []
Sun Mar 15 21:47:36 2015
LOGSTDBY: Validating controlfile with logical metadata
Sun Mar 15 21:47:36 2015
LOGSTDBY: Validation complete
Sun Mar 15 21:47:36 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
Sun Mar 15 21:47:36 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
Sun Mar 15 21:47:36 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sun Mar 15 21:47:37 2015
ORACLE Instance xxxx2 (pid = 22) - Error 600 encountered while recovering transaction (48, 25).
Sun Mar 15 21:47:37 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sun Mar 15 21:47:39 2015
Completed: alter database open
Sun Mar 15 21:47:39 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
Sun Mar 15 21:47:39 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
Sun Mar 15 21:47:39 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sun Mar 15 21:47:40 2015
ORACLE Instance xxxx2 (pid = 22) - Error 600 encountered while recovering transaction (65, 7).
Sun Mar 15 21:47:40 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sun Mar 15 21:47:40 2015
Trace dumping is performing id=[cdmp_20150315214740]
Sun Mar 15 21:47:41 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j009_790778.trc:
ORA-12012: error on auto execute of job 524
ORA-01552: cannot use system rollback segment for non-system tablespace 'xxx_ADMIN'
ORA-06512: at "SYS.xxx_LOGINHISTORY", line 3
ORA-06512: at line 1
Sun Mar 15 21:47:41 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j009_790778.trc:
ORA-12012: error on auto execute of job 524
ORA-01552: cannot use system rollback segment for non-system tablespace 'xxx_ADMIN'
ORA-06512: at "SYS.xxx_LOGINHISTORY", line 3
ORA-06512: at line 1
Sun Mar 15 21:47:41 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j002_475534.trc:
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
Sun Mar 15 21:47:41 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
Sun Mar 15 21:47:41 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
Sun Mar 15 21:47:41 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sun Mar 15 21:47:42 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j008_586068.trc:
ORA-12012: error on auto execute of job 526
ORA-01552: cannot use system rollback segment for non-system tablespace 'xxx_ADMIN'
ORA-06512: at "SYS.xxx_SEG_xxx", line 3
ORA-06512: at line 1
Sun Mar 15 21:47:42 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j009_790778.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
ORA-12012: error on auto execute of job 524
ORA-01552: cannot use system rollback segment for non-system tablespace 'xxx_ADMIN'
ORA-06512: at "SYS.xxx_LOGINHISTORY", line 3
ORA-06512: at line 1
Sun Mar 15 21:47:42 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j002_475534.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
Sun Mar 15 21:47:43 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j009_790778.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
ORA-12012: error on auto execute of job 524
ORA-01552: cannot use system rollback segment for non-system tablespace 'XXXX_ADMIN'
ORA-06512: at "SYS.XXXX_LOGINHISTORY", line 3
ORA-06512: at line 1
Sun Mar 15 21:47:43 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j009_790778.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 239, file 1, block 1674 (2)
Sun Mar 15 21:47:43 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j002_475534.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '/xxx/xxxx/redo02a.log'
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
Sun Mar 15 21:47:43 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j002_475534.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
ORA-06512: at "xxxx.PKG_XXXXX", line 126
ORA-06512: at line 3
Sun Mar 15 21:47:43 2015
Errors in file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_pmon_565700.trc:
ORA-00474: SMON process terminated with error
Sun Mar 15 21:47:43 2015
PMON: terminating instance due to error 474
Sun Mar 15 21:47:47 2015
Dump system state for local instance only
System State dumped to trace file /oracle/app1/oracle/admin/xxxx/bdump/xxxx2_diag_377122.trc
Sun Mar 15 21:47:48 2015
Instance terminated by PMON, pid = 565700
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
SunMar1521:47:312015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
......
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
SunMar1521:47:332015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600:internalerrorcode,arguments:[6006],[1],[],[],[],[],[],[]
QMNCstartedwithpid=32,OSid=520520
SunMar1521:47:352015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
ORA-00600:internalerrorcode,arguments:[6006],[1],[],[],[],[],[],[]
SunMar1521:47:352015
ORACLEInstancexxxx2(pid=22)-Error600encounteredwhilerecoveringtransaction(44,26)onobject47098.
SunMar1521:47:352015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600:internalerrorcode,arguments:[6006],[1],[],[],[],[],[],[]
SunMar1521:47:362015
LOGSTDBY:Validatingcontrolfilewithlogicalmetadata
SunMar1521:47:362015
LOGSTDBY:Validationcomplete
SunMar1521:47:362015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
SunMar1521:47:362015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
SunMar1521:47:362015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600:internalerrorcode,arguments:[4137],[],[],[],[],[],[],[]
SunMar1521:47:372015
ORACLEInstancexxxx2(pid=22)-Error600encounteredwhilerecoveringtransaction(48,25).
SunMar1521:47:372015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600:internalerrorcode,arguments:[4137],[],[],[],[],[],[],[]
SunMar1521:47:392015
Completed:alterdatabaseopen
SunMar1521:47:392015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
SunMar1521:47:392015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
SunMar1521:47:392015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600:internalerrorcode,arguments:[4137],[],[],[],[],[],[],[]
SunMar1521:47:402015
ORACLEInstancexxxx2(pid=22)-Error600encounteredwhilerecoveringtransaction(65,7).
SunMar1521:47:402015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600:internalerrorcode,arguments:[4137],[],[],[],[],[],[],[]
SunMar1521:47:402015
Tracedumpingisperformingid=[cdmp_20150315214740]
SunMar1521:47:412015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j009_790778.trc:
ORA-12012:erroronautoexecuteofjob524
ORA-01552:cannotusesystemrollbacksegmentfornon-systemtablespace'xxx_ADMIN'
ORA-06512:at"SYS.xxx_LOGINHISTORY",line3
ORA-06512:atline1
SunMar1521:47:412015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j009_790778.trc:
ORA-12012:erroronautoexecuteofjob524
ORA-01552:cannotusesystemrollbacksegmentfornon-systemtablespace'xxx_ADMIN'
ORA-06512:at"SYS.xxx_LOGINHISTORY",line3
ORA-06512:atline1
SunMar1521:47:412015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j002_475534.trc:
ORA-00600:internalerrorcode,arguments:[kdsgrp1],[],[],[],[],[],[],[]
SunMar1521:47:412015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
SunMar1521:47:412015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
SunMar1521:47:412015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_smon_774386.trc:
ORA-00600:internalerrorcode,arguments:[4137],[],[],[],[],[],[],[]
SunMar1521:47:422015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j008_586068.trc:
ORA-12012:erroronautoexecuteofjob526
ORA-01552:cannotusesystemrollbacksegmentfornon-systemtablespace'xxx_ADMIN'
ORA-06512:at"SYS.xxx_SEG_xxx",line3
ORA-06512:atline1
SunMar1521:47:422015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j009_790778.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
ORA-12012:erroronautoexecuteofjob524
ORA-01552:cannotusesystemrollbacksegmentfornon-systemtablespace'xxx_ADMIN'
ORA-06512:at"SYS.xxx_LOGINHISTORY",line3
ORA-06512:atline1
SunMar1521:47:422015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j002_475534.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
ORA-00600:internalerrorcode,arguments:[kdsgrp1],[],[],[],[],[],[],[]
SunMar1521:47:432015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j009_790778.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
ORA-12012:erroronautoexecuteofjob524
ORA-01552:cannotusesystemrollbacksegmentfornon-systemtablespace'XXXX_ADMIN'
ORA-06512:at"SYS.XXXX_LOGINHISTORY",line3
ORA-06512:atline1
SunMar1521:47:432015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j009_790778.trc:
ORA-00604:erroroccurredatrecursiveSQLlevel1
ORA-08102:indexkeynotfound,obj# 239, file 1, block 1674 (2)
SunMar1521:47:432015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j002_475534.trc:
ORA-00339:archivedlogdoesnotcontainanyredo
ORA-00334:archivedlog:'/xxx/xxxx/redo02a.log'
ORA-00600:internalerrorcode,arguments:[kdsgrp1],[],[],[],[],[],[],[]
SunMar1521:47:432015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_j002_475534.trc:
ORA-00600:internalerrorcode,arguments:[],[],[],[],[],[],[],[]
ORA-06512:at"xxxx.PKG_XXXXX",line126
ORA-06512:atline3
SunMar1521:47:432015
Errorsinfile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_pmon_565700.trc:
ORA-00474:SMONprocessterminatedwitherror
SunMar1521:47:432015
PMON:terminatinginstanceduetoerror474
SunMar1521:47:472015
Dumpsystemstateforlocalinstanceonly
SystemStatedumpedtotracefile/oracle/app1/oracle/admin/xxxx/bdump/xxxx2_diag_377122.trc
SunMar1521:47:482015
InstanceterminatedbyPMON,pid=565700
从上述日志信息来看,主要出现了如下几个错误:
ORA-00600 [6006],ORA-00600 [4137],ORA-00600 [kdsgrp1]
对于前面2个错误,很明显是Oracle SMON进程在进行利用回滚段进行事务rollback时失败导致,如下:
ORACLE Instance xxxx2 (pid = 22) – Error 600 encountered while recovering transaction (44, 26) on object 47098.
ORACLE Instance xxxx2 (pid = 22) – Error 600 encountered while recovering transaction (48, 25).
因此,不难看出,数据库中还有部分的回滚段存在活跃事务。
对于ORA-00600 [kdsgrp1]错误,通常是出现在Index上,即Index数据和表的数据不一致,一般来说可以通过重建解决。
其次,对于后面的ORA-08102: index key not found, obj# 239, file 1, block 1674 (2) 错误,主要是job调用出现,因此
我们可以暂时屏蔽job的调度。
对于ORA-08102错误,我的博客几年前也写过相关的文章,本质上也是Index block中的相关键值不存在导致。
与其如此,最后我感觉将数据库的所有回滚段都屏蔽掉,并重建数据库undo 表空间,如下是获取回滚段的命令:
strings system01.dbf | grep _SYSSMU | cut -d $ -f 1 | sort -u
经过整理,发现该库存在大约2600个回滚段,我了个去,先不管这么多,重启实例后,重建undo表空间:
SQL> conn /as sysdba
Connected to an idle instance.
SQL> startup upgrade pfile='/tmp/gb2.ora';
ORACLE instance started.
Total System Global Area 1.5032E+10 bytes
Fixed Size 2110096 bytes
Variable Size 5704256880 bytes
Database Buffers 9311354880 bytes
Redo Buffers 14663680 bytes
Database mounted.
Database opened.
SQL> create undo tablespace undotbs11 datafile '/xxx/xxxx/undotbs11_01.dbf' size 100m;
Tablespace created.
SQL> create undo tablespace undotbs22 datafile '/xxx/xxxx/undotbs22_01.dbf' size 100m;
Tablespace created.
SQL> drop tablespace undotbs1 xxxluding contents and datafiles;
Tablespace dropped.
SQL> drop tablespace undotbs2 xxxluding contents and datafiles;
Tablespace dropped.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
SQL>conn/assysdba
Connectedtoanidleinstance.
SQL>startupupgradepfile='/tmp/gb2.ora';
ORACLEinstancestarted.
TotalSystemGlobalArea1.5032E+10bytes
FixedSize2110096bytes
VariableSize5704256880bytes
DatabaseBuffers9311354880bytes
RedoBuffers14663680bytes
Databasemounted.
Databaseopened.
SQL>createundotablespaceundotbs11datafile'/xxx/xxxx/undotbs11_01.dbf'size100m;
Tablespacecreated.
SQL>createundotablespaceundotbs22datafile'/xxx/xxxx/undotbs22_01.dbf'size100m;
Tablespacecreated.
SQL>droptablespaceundotbs1xxxludingcontentsanddatafiles;
Tablespacedropped.
SQL>droptablespaceundotbs2xxxludingcontentsanddatafiles;
Tablespacedropped.
最后重启数据库实例,让客户将关键核心的配置表导出,先进行业务恢复,如果需要数据,直接从库中抽取。
这里要补充一点,该库约为5TB多一点,虽然有备份,但是恢复时间太长,如果有个dataguard是多么的重要啊!
魔乐社区(Modelers.cn) 是一个中立、公益的人工智能社区,提供人工智能工具、模型、数据的托管、展示与应用协同服务,为人工智能开发及爱好者搭建开放的学习交流平台。社区通过理事会方式运作,由全产业链共同建设、共同运营、共同享有,推动国产AI生态繁荣发展。
更多推荐



所有评论(0)