存储掉电强制拉库引起ORA-01555和ORA-01189/ORA-01190故障处理---惜分飞
机房存储突然掉电导致Oracle数据库访问存储异常,数据库报出大量的ORA-27072: File I/O error,Linux-x86_64 Error: 5: Input/output error,ORA-15081: failed to submit an I/O operation to a disk等错误,实例直接crash
Wed Aug 27 07:11:53 2025
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_m000_17596.trc:
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 6297632
Additional information: -1
WARNING: Read Failed. group:1 disk:0 AU:3075 offset:16384 size:16384
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc:
ORA-00202: control file: \'+DG/xff/controlfile/current.284.918834897\'
ORA-15081: failed to submit an I/O operation to a disk
WARNING: failed to read mirror side 1 of virtual extent 0 logical extent 0 of
file 284 in group [1.2747812198] from disk DG_0000 allocation unit 3075 reason error;
if possible, will try another mirror side
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_m000_17596.trc:
ORA-00202: control file: \'+DG/xff/controlfile/current.284.918834897\'
ORA-15081: failed to submit an I/O operation to a disk
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc:
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 16384
WARNING: Write Failed. group:1 disk:0 AU:3080 offset:49152 size:16384
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc:
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 16384
WARNING: Write Failed. group:1 disk:0 AU:3075 offset:49152 size:16384
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc:
ORA-15080: synchronous I/O operation to a disk failed
WARNING: failed to write mirror side 1 of virtual extent 0 logical
extent 0 of file 284 in group 1 on disk 0 allocation unit 3075
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc:
ORA-15080: synchronous I/O operation to a disk failed
WARNING: failed to write mirror side 1 of virtual extent 0 logical extent 0
of file 283 in group 1 on disk 0 allocation unit 3080
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: \'+DG/xff/controlfile/current.283.918834897\'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: \'+DG/xff/controlfile/current.284.918834897\'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ckpt_6165.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: \'+DG/xff/controlfile/current.283.918834897\'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: \'+DG/xff/controlfile/current.284.918834897\'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
CKPT (ospid: 6165): terminating the instance due to error 221
Wed Aug 27 07:11:53 2025
ORA-1092 : opitsk aborting process
存储恢复之后,尝试open数据库报ORA-00333错误(该错误一般是由于redo写丢失导致)
Wed Aug 27 16:36:32 2025
ALTER DATABASE OPEN
This instance was first to open
Beginning crash recovery of 2 threads
parallel recovery started with 31 processes
Started redo scan
Incomplete read from log member \'+DG/xff/onlinelog/group_2.287.918834905\'. Trying next member.
Incomplete read from log member \'+DG/xff/onlinelog/group_2.288.918834911\'. Trying next member.
Incomplete read from log member \'+DG/xff/onlinelog/group_2.287.918834905\'. Trying next member.
Abort recovery for domain 0
Aborting crash recovery due to error 333
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_10257.trc:
ORA-00333: redo log read error block 1275904 count 5721
Abort recovery for domain 0
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_10257.trc:
ORA-00333: redo log read error block 1275904 count 5721
ORA-333 signalled during: ALTER DATABASE OPEN...
1
现场人员使用隐含参数,尝试直接拉库操作报ORA-00704 ORA-01555错误,导致拉库失败
1
Wed Aug 27 16:47:11 2025
ALTER DATABASE RECOVER database until cancel
Media Recovery Start
started logmerger process
Parallel Media Recovery started with 32 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER database until cancel ...
Wed Aug 27 16:47:56 2025
ALTER DATABASE RECOVER CONTINUE DEFAULT
Media Recovery Log +DG
Wed Aug 27 16:47:56 2025
Errors with log +DG
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_pr00_24154.trc:
ORA-00308: cannot open archived log \'+DG\'
ORA-17503: ksfdopn:2 Failed to open file +DG
ORA-15045: ASM file name \'+DG\' is not in reference form
ORA-308 signalled during: ALTER DATABASE RECOVER CONTINUE DEFAULT ...
ALTER DATABASE RECOVER CANCEL
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_pr00_24154.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: \'+DG/xff/datafile/system.279.918834827\'
Slave exiting with ORA-1547 exception
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_pr00_24154.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: \'+DG/xff/datafile/system.279.918834827\'
ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ...
Wed Aug 27 16:48:09 2025
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 25330611827626
Resetting resetlogs activation ID 3307041102 (0xc51d714e)
Deleted Oracle managed file +DG/xff/onlinelog/group_1.285.918834899
Deleted Oracle managed file +DG/xff/onlinelog/group_1.286.918834901
Deleted Oracle managed file +DG/xff/onlinelog/group_2.287.918834905
Deleted Oracle managed file +DG/xff/onlinelog/group_2.288.918834911
Wed Aug 27 16:48:28 2025
Deleted Oracle managed file +DG/xff/onlinelog/group_3.289.918834917
Deleted Oracle managed file +DG/xff/onlinelog/group_3.290.918834923
Deleted Oracle managed file +DG/xff/onlinelog/group_4.293.918835035
Deleted Oracle managed file +DG/xff/onlinelog/group_4.294.918835037
Wed Aug 27 16:48:48 2025
Deleted Oracle managed file +DG/xff/onlinelog/group_5.295.918835041
Deleted Oracle managed file +DG/xff/onlinelog/group_5.296.918835047
Deleted Oracle managed file +DG/xff/onlinelog/group_6.297.918835055
Wed Aug 27 16:48:58 2025
Deleted Oracle managed file +DG/xff/onlinelog/group_6.298.918835061
Wed Aug 27 16:49:10 2025
Setting recovery target incarnation to 3
Wed Aug 27 16:49:10 2025
This instance was first to open
Picked broadcast on commit scheme to generate SCNs
Wed Aug 27 16:49:10 2025
Assigning activation ID 3598492411 (0xd67ca2fb)
Thread 2 opened at log sequence 1
Current log# 4 seq# 1 mem# 0: +DG/xff/onlinelog/group_4.294.1210265317
Current log# 4 seq# 1 mem# 1: +DG/xff/onlinelog/group_4.293.1210265323
Successful open of redo thread 2
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Aug 27 16:49:10 2025
SMON: enabling cache recovery
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x1709.be1eb3b1):
select ctime, mtime, stime from obj$ where obj# = :1
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_23787.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 20 with name \"_SYSSMU20_1295954159$\" too small
Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_ora_23787.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 20 with name \"_SYSSMU20_1295954159$\" too small
Error 704 happened during db open, shutting down database
USER (ospid: 23787): terminating the instance due to error 704
Instance terminated by USER, pid = 23787
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (23787) as a result of ORA-1092
现场进行了一系列尝试操作,最后我接手数据库之时报错为:ORA-01190 ORA-01110,无法recover,也无法重建controlfile,陷入了死局
Completed: ALTER DATABASE MOUNT
Sat Aug 30 10:03:20 2025
ALTER DATABASE OPEN
This instance was first to open
Abort recovery for domain 0
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_6435.trc:
ORA-01190: control file or data file 1 is from before the last RESETLOGS
ORA-01110: data file 1: \'+DG/xff/datafile/system0829.dbf\'
ORA-1190 signalled during: ALTER DATABASE OPEN...
Sat Aug 30 00:56:32 2025
NOTE: Loaded library: System
SUCCESS: diskgroup DG was mounted
Errors in file /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_17302.trc:
ORA-01189: file is from a different RESETLOGS than previous files
ORA-01110: data file 2: \'+DG/xff/datafile/sysaux.280.918834827\'
ORA-1503 signalled during: create controlfile reuse database xff noarchivelog noresetlogs
对于这种情况,通过Oracle recovery check脚本可以直接发现异常(WRONG RESETLOGS)

使用Oracle Recovery Tools小工具实现快速恢复

再尝试重建ctl成功

然后修改数据库scn信息,顺利open数据库

后续建议客户逻辑迁移该库


