
[Brocade]SAN Troubleshooting(11)- Common SAN Issue Part1 — Segmented Fabrics

Before troubleshooting fabric segmentation, let's first review the fabric merge procedure for redundant fabrics. With redundant fabrics, you can merge a fabric by taking it offline and redirecting I/O to the other fabric. Existing I/O operations are not affected; however, during the merge, the hosts operate in degraded mode without redundant data path protection. With proper planning, you can minimize downtime. After completing and verifying the fabric merge, bring the first fabric back online by restoring its I/O paths. You can then repeat the merge process for the second fabric.

In the following procedure, the SAN consists of fabric A and a redundant fabric B. Each of these fabrics is merged with a SAN consisting of fabrics C and D.

1. Identify and resolve any issues that can cause fabric segmentation.
2. Verify that each fabric provides a redundant path to all attached devices.
3. Verify that paths are open to each device that must remain online during the merge.
4. Select fabrics for merging, for example, fabric A with fabric C.
5. Close all active paths on the fabric selected for merging and prepare devices for downtime.
For example, use multipathing software to redirect I/O by performing a failover to the alternate path.
6. Verify that the fabric selected for merging has no I/O activity.
7. Connect the selected fabrics (fabric A and fabric C).
8. Verify that the newly merged fabric contains all switches and that the zoning has merged correctly.
9. Restore I/O operations on the new fabric from the multipathing software console.
10. Verify that paths are open and restored for each device.
11. Ensure that all paths and I/O operations have been restored.
Repeat this procedure to merge fabric B with fabric D.


Fabric segmentation can occur for a variety of reasons:

1. Wrong Product License.
2. Zoning conflicts
3. Admin Domain conflict
4. Incompatible Switch Parameters
5. Domain ID overlap
6. Access Control Lists (ACLs)

Useful Commands

The following commands can help pinpoint the root cause of fabric segmentation:

— switchshow
— licenseShow
US_SAN01:FID128:admin> licenseshow

bbbQ9ydRbzcccRAP:

Obsolete license

RyyQb9SdcSzzRch:

Obsolete license

cSRzcccQRcdVS0d0:

FICON_CUP license

7YAJSMNLLaTGTCfmfSJgJ97FAFPHC3QCB73MG:

Extended Fabric license

Fabric Watch license

Performance Monitor license

Trunking license

Adaptive Networking license

Enhanced Group Management license

raMStaDPQQXDfBKf9fRtBH4RCS4tWAWZBJ7DH:

Server Application Optimization license

MTXSXLFrYM7GGrtfWtXZtK9KTTaLgAEHrY3MmFEAFrSB:

Advanced FICON Acceleration (FTR_AFA) license

Capacity 1

Consumed 1

Configured Blade Slots 9

MTXSXLFrYM7GGrtfWt9ZtK9KTTaLgAJHrY3MmFEAPH3B:

10 Gigabit Ethernet (FTR_10G) license

Capacity 2

Consumed 2

Configured Blade Slots 4,9

MTXSXLFrYM7GGrtfWttZtK9KTTaLgAJHrY3MmFEAZHSB:

Advanced Extension (FTR_AE) license

Capacity 2

Consumed 2

Configured Blade Slots 4,9

US_SAN01:FID128:admin>

CA_SAN01:FID128:admin> licenseshow

RceR9dzzQdSdfSAH:

Obsolete license

SeQS99QSRyTfRTAj:

Obsolete license

Sbc9cSyezRTedAdR:

FICON_CUP license

SYTXKGLJAXDaQEPJaXY4ZNrNXJNXRNLDBACBD:

Extended Fabric license

Fabric Watch license

Performance Monitor license

Trunking license

Adaptive Networking license

Enhanced Group Management license

Server Application Optimization license

7rXtYJ4tHTRStPCgFGXCPHJMN9ZQQCMgTrDBggFAMYJA:

Advanced FICON Acceleration (FTR_AFA) license

Capacity 1

Consumed 1

Configured Blade Slots 9

PSSH7J3gHrGZMHGJ7TYHmmHAYLSWHHgEtZXfrfGABSTA:

10 Gigabit Ethernet (FTR_10G) license

Capacity 2

Consumed 2

Configured Blade Slots 4,9

aBTPf4CTNAH47RSCJrSTSHFFXETQBJQLM3LEaRHA4NfA:

Advanced Extension (FTR_AE) license

Capacity 2

Consumed 2

Configured Blade Slots 4,9

CA_SAN01:FID128:admin>
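
Comparing two licenseshow listings by eye is error-prone. The following is a minimal sketch, assuming the output has been saved as text and that every feature line ends with the word "license", as in the listings above. The function names are hypothetical, and the sample strings are shortened excerpts with the Trunking entry removed from one side purely for illustration.

```python
def parse_licenses(text):
    """Return the set of feature names found in licenseshow-style text."""
    features = set()
    for line in text.splitlines():
        line = line.strip()
        # Key lines end with ":"; feature lines end with the word "license".
        if line.endswith(" license") and line != "Obsolete license":
            features.add(line[: -len(" license")])
    return features


def license_diff(a_text, b_text):
    """Return (features only in A, features only in B), each sorted."""
    a, b = parse_licenses(a_text), parse_licenses(b_text)
    return sorted(a - b), sorted(b - a)


# Shortened, illustrative excerpts (not the full listings).
us_output = """\
cSRzcccQRcdVS0d0:
FICON_CUP license
7YAJSMNLLaTGTCfmfSJgJ97FAFPHC3QCB73MG:
Extended Fabric license
Trunking license
"""
ca_output = """\
Sbc9cSyezRTedAdR:
FICON_CUP license
SYTXKGLJAXDaQEPJaXY4ZNrNXJNXRNLDBACBD:
Extended Fabric license
"""

only_us, only_ca = license_diff(us_output, ca_output)
print("Only on US_SAN01:", only_us)
print("Only on CA_SAN01:", only_ca)
```

A non-empty difference is only a hint: whether a missing feature actually prevents the merge depends on the feature and the Fabric OS release.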

— fabstatsshow
SAN01:FID128:admin> fabstatsshow

Description Count Port Timestamp

————————— —— —— —————-

Domain ID forcibly changed: 0

E_Port offline transitions: 11 60 Thu Apr 18 16:22:29 2013

Reconfigurations: 9 60 Thu Apr 18 16:22:29 2013

Segmentations due to:

Loopback: 0

Incompatibility: 0

Overlap: 0

Zoning: 0

E_Port Segment: 0

Licensing: 0

Disabled E_Port: 0

Platform DB: 0

Sec Incompatibility: 0

Sec Violation: 0

ECP Error: 0

Duplicate WWN: 0

Eport Isolated: 0

AD header conflict: 0

DomainID offset conflict: 0

McData SafeZone conflict: 0

VF AD conflict: 0

MSFR/RD H&T WWN conflict: 0

ETIZ Incompatibility: 0

ESC detected conflict: 0

'<' - Denotes the type of event that occurred last.

QFMTLSAN01:FID128:admin>
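
With this many counters, it helps to filter for the non-zero ones. The sketch below assumes the fabstatsshow output has been captured as text and that a non-zero row carries its count first, followed by port and timestamp, as the E_Port offline transitions row does above. The function name and the sample Zoning row are hypothetical.

```python
def segmentation_counts(text):
    """Map each 'Segmentations due to' reason to its count."""
    counts = {}
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Segmentations due to"):
            in_section = True
            continue
        if not in_section:
            continue
        # Split on the first colon only; timestamps contain colons too.
        name, sep, rest = line.partition(":")
        tokens = rest.split()
        if sep and tokens and tokens[0].isdigit():
            counts[name.strip()] = int(tokens[0])
    return counts


# Illustrative sample; the non-zero Zoning row is made up.
sample = """\
Reconfigurations: 9 60 Thu Apr 18 16:22:29 2013
Segmentations due to:
Loopback: 0
Incompatibility: 0
Overlap: 0
Zoning: 2 60 Thu Apr 18 16:22:29 2013
Licensing: 0
"""

counts = segmentation_counts(sample)
nonzero = {reason: n for reason, n in counts.items() if n > 0}
print(nonzero)
```

Any reason with a non-zero count points to the matching section of this article (zoning, licensing, domain ID overlap, and so on).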

1. Wrong Product License.

A Fabric license is required for switches to merge and form a fabric.
2. Zoning Conflicts

When two fabrics merge, the zoning information from the previously separate fabrics is merged as far as possible into the new fabric. If the zoning information is inconsistent, it cannot be merged and the fabric segments.
One solution is to make sure the zoning information on both switches is consistent before bringing up the ISL.
Another solution, and the easiest, is to clear the configuration on the conflicting switch and let that switch absorb the zone information when it becomes part of the fabric:
1. Open an SSH session to the switch you are adding.
2. Log in with your user ID and password, then disable the switch with the switchdisable command.
3. Disable the active configuration with the cfgdisable command.
4. Issue the cfgclear command to clear all zoning information.
5. Issue the cfgsave command to save the changes.
6. Issue the switchenable command to enable the switch.

Typically, the following conditions cause a zone conflict:
- Multiple active zoning configurations. To resolve this, disable one of the zone configurations.
- Zone name or alias conflict. Resolving this can be very time consuming in a large fabric: you need to make sure no name is duplicated across zone objects, including aliases, zones, and cfgs.
- Zone content conflict
This is where the definition name and type match, but the content is different.
In the following example, the fabric segments because of mismatched zone content:
Switch 1

defined: cfg1
zone1:
10:00:00:90:69:00:00:8a;
10:00:00:90:69:00:00:8b

Switch 2
defined: cfg1
zone1:
10:00:00:90:69:00:00:8c;
10:00:00:90:69:00:00:8d
enabled: irrelevant

In this case, the administrator must determine which zone is correct and either update or delete the incorrect one. Once the fabrics merge, the correct zone is propagated to all switches in the fabric.
In summary, when you encounter a zone conflict, you can either correct the conflict or clear the zoning information on either the existing fabric or the new switch. Before changing anything, save the cfgShow output and run configUpload on both switches. To clear the zoning configuration, you can use cfgClear followed by cfgDisable.
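
A zone content conflict like the one in the example can be spotted before connecting the ISL by comparing the two zone databases offline. This is a minimal sketch, assuming each database has already been reduced to a mapping of zone name to member WWNs (for instance by hand from saved cfgShow output). The function name is hypothetical, and members are compared as sets, which is a simplification.

```python
def zone_conflicts(zones_a, zones_b):
    """Return names defined on both switches whose member sets differ."""
    return sorted(
        name
        for name in zones_a.keys() & zones_b.keys()
        if set(zones_a[name]) != set(zones_b[name])
    )


# The two definitions of zone1 from the example above.
switch1 = {"zone1": ["10:00:00:90:69:00:00:8a", "10:00:00:90:69:00:00:8b"]}
switch2 = {"zone1": ["10:00:00:90:69:00:00:8c", "10:00:00:90:69:00:00:8d"]}

print(zone_conflicts(switch1, switch2))
```

Zones that exist on only one side are not reported, since a name that is defined once can be merged without a content conflict.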

Other useful commands and tools include switchShow, which helps you identify the conflict. In the sample output below, port 0 is a segmented E_Port:
Port Media Speed State

=========================
0 id N2 Online E-Port 10:00:00:60:69:80:04:c6 segmented,(Trunk master)
1 id N2 No_Light
2 id N2 No_Light
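
On a switch with many ports, the segmented links can be picked out of saved switchShow output with a small filter. The sketch assumes the simple column layout shown above, where the port number is the first field; directors and newer releases print additional columns, so the field position is an assumption. The function name is hypothetical.

```python
def segmented_ports(text):
    """Return port numbers whose switchShow row mentions 'segmented'."""
    ports = []
    for line in text.splitlines():
        fields = line.split()
        # Port rows start with a number; header and separator rows do not.
        if fields and fields[0].isdigit() and "segmented" in line.lower():
            ports.append(int(fields[0]))
    return ports


sample = """\
Port Media Speed State
=========================
0 id N2 Online E-Port 10:00:00:60:69:80:04:c6 segmented,(Trunk master)
1 id N2 No_Light
2 id N2 No_Light
"""

print(segmented_ports(sample))
```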
3. Admin Domain (AD) Conflict

Admin Domains maintain continuity of service for Fabric OS features and operate in mixed-release Fabric OS environments. High availability is supported with some backward compatibility.

1. An AD database merge succeeds if the receiver has no AD database, or if the receiver's AD database exactly matches both the defined and effective configurations of the local AD database.
2. If the AD database merge fails, the E_Port is segmented with an "AD conflict" error code.

Identify AD conflicts
1. Use the switchShow and errShow commands to identify the cause.
2. When a merge failure is caused by an AD conflict, the switchShow output shows the affected E_Ports.
3. The switch error log and fabstatsshow also record AD-related conflicts.
4. You can also use ad --show to display the AD definitions. As mentioned, the AD databases must match exactly in AD numbers, members, and zone databases (including the root zone database).

Solution
To resolve AD conflicts, perform the following actions on both fabrics from AD255:
- Resolve any differences by editing the AD configuration in one fabric to match the other.
- Use ad --add, ad --remove, and ad --apply from AD255.
- Disable and re-enable the E_Port.

If the segmented fabrics once shared a common AD configuration:
- Disable all E_Port connections between the two fabrics.
- Decide which AD database you want to keep.
- Use ad --add, ad --remove, and ad --apply from AD255.
- Clear the AD database on the other fabric using ad --clear and ad --apply.
- Use ad --show to verify the AD configuration.
- Enable the E_Ports.
4. Incompatible Switch Parameters including Fabric and Port parameters

Fabric Parameters Conflict

The fabric parameter settings in the system configuration must be the same on every switch in the fabric. The fabric segments if any of these parameters differ, so check all of them before merging fabrics. Use configShow to check the parameters:
SAN01:FID128:admin> configshow |grep fabric.ops

fabric.ops.BBCredit:16
fabric.ops.E_D_TOV:2000
fabric.ops.R_A_TOV:10000
fabric.ops.bladeFault_on_hwErrlevel:0
fabric.ops.dataFieldSize:2112
fabric.ops.max_hops:7
fabric.ops.mode.fcpProbeDisable:0
fabric.ops.mode.isolate:0
fabric.ops.mode.longDistance:0
fabric.ops.mode.noClassF:0
fabric.ops.mode.pidFormat:1
fabric.ops.mode.tachyonCompat:0
fabric.ops.mode.unicastOnly:0
fabric.ops.mode.useCsCtl:0
fabric.ops.vc.class.2:2
fabric.ops.vc.class.3:3
fabric.ops.vc.config:0xc0
fabric.ops.vc.linkCtrl:0
fabric.ops.vc.multicast:7
fabric.ops.wan_tov:0
SAN01:FID128:admin>
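
Because every fabric.ops value has to match, a quick diff of the configshow output from both switches is a useful pre-merge check. The following is a sketch that assumes the key:value line format shown above; the function names are hypothetical, and the E_D_TOV value of 5000 in the second sample is made up to show a mismatch.

```python
def parse_config(text, prefix="fabric.ops."):
    """Collect key/value pairs from configshow-style lines with the prefix."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(prefix) and ":" in line:
            key, _, value = line.partition(":")
            params[key] = value.strip()
    return params


def config_mismatches(a_text, b_text):
    """Map each differing or missing key to its (A value, B value) pair."""
    a, b = parse_config(a_text), parse_config(b_text)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


switch_a = """\
fabric.ops.BBCredit:16
fabric.ops.E_D_TOV:2000
fabric.ops.R_A_TOV:10000
fabric.ops.vc.config:0xc0
"""
# Hypothetical second switch with a different E_D_TOV.
switch_b = """\
fabric.ops.BBCredit:16
fabric.ops.E_D_TOV:5000
fabric.ops.R_A_TOV:10000
fabric.ops.vc.config:0xc0
"""

print(config_mismatches(switch_a, switch_b))
```

A key that is present on only one switch is reported with None for the other side.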

Port Parameters Conflict

Port parameters must be consistent on both ends of the ISL. Use the portCfgShow command to check them; see the sample output below:
SAN01:FID128:admin> portcfgshow

.., output truncated,…

Ports of Slot 4 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

——————+—+—+—+—+——+—+—+—+——+—+—+—+——+—+—+—
Speed AN AN AN AN AN AN AN AN AN AN AN AN AN AN AN AN

Fill Word 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

AL_PA Offset 13 .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Trunk Port ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON

Long Distance .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

VC Link Init .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Locked L_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Locked G_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Disabled E_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Locked E_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

ISL R_RDY Mode .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

RSCN Suppressed .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Persistent Disable .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

LOS TOV enable .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

NPIV capability ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON

NPIV PP Limit 126 126 126 126 126 126 126 126 126 126 126 126 126 126 126 126

QOS E_Port AE AE AE AE AE AE AE AE AE AE AE AE .. .. .. ..

EX Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Mirror Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Rate Limit .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Credit Recovery ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON

Fport Buffers .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Port Auto Disable .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

CSCTL mode .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Fault Delay 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Ports of Slot 4 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

——————+—+—+—+—+——+—+—+—+——+—+—+—+——+—+—+—

Speed AN AN AN AN AN AN AN AN AN AN AN AN AN AN AN AN

Fill Word 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

AL_PA Offset 13 .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Trunk Port ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON

Long Distance .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

VC Link Init .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Locked L_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Locked G_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Disabled E_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Locked E_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

ISL R_RDY Mode .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

RSCN Suppressed .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Persistent Disable .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

LOS TOV enable .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

NPIV capability ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON ON

NPIV PP Limit 126 126 126 126 126 126 126 126 126 126 126 126 126 126 126 126

QOS E_Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

EX Port .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Fport Buffers .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Port Auto Disable .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Fault Delay 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

where AE:QoSAutoEnable, AN:AutoNegotiate, ..:OFF, NA:NotApplicable, ??:INVALID,

LM:L0.5

.., output truncated,…

The following list shows the command used to change each port parameter:

Parameter: Port Speed
Command: portcfgspeed
Comments: Speed - Displays AN for auto speed negotiation mode, or a specific speed of 1, 2, 4, or 8 Gbits/sec. This value is set by the portcfgspeed command.

Parameter: Reset to Defaults
Command: portcfgdefault

Parameter: Port Type (L_Port)
Command: portcfglport
Comments: Locked L_Port - Displays ON when the port is locked to L_Port only. Displays (..) or OFF when L_Port lock mode is disabled and the port behaves as a U_Port. This value is set by the portcfglport command.

Parameter: Port Type (E_Port disabled)
Command: portcfgeport
Comments: Disabled E_Port - Displays ON when the port is not allowed to be an E_Port. Displays (..) or OFF when the port is allowed to function as an E_Port. This value is set by the portcfgeport command.

Parameter: Port Type (E or F_Port only)
Command: portcfggport
Comments: Locked G_Port - Displays ON when the port is locked to G_Port only. Displays (..) or OFF when G_Port lock mode is disabled and the port behaves as a U_Port. This value is set by the portcfggport command.

Parameter: Long Distance and VC Link Init
Command: portcfglongdistance
Comments: Long Distance - Displays (..) or OFF when long distance mode is off; otherwise, displays long distance levels as shown below. This value is set by the portcfglongdistance command.
• LE - The link is up to 10 km
• LD - The distance is determined dynamically
• LS - The distance is determined statically by user input

Parameter: ISL R_RDY Mode
Command: portcfgislmode
Comments: ISL R_RDY Mode - Displays ON when ISL R_RDY mode is enabled on the port. Displays (..) or OFF when ISL R_RDY mode is disabled. This value is set by the portcfgislmode command.

Parameter: QoS
Command: portcfgqos
Comments: QOS E_Port - Displays ON when Quality of Service (QoS) is enabled on the port. Displays (..) or OFF when QoS is disabled. By default, QoS is enabled by best effort based on availability of buffers. This value is set by the portcfgqos command.

Parameter: Credit Recovery
Command: portcfgcreditrecovery
Comments: Credit Recovery - Displays ON when Credit Recovery is enabled on the port or (..) or OFF when disabled. This value is set by the portcfgcreditrecovery command. The credit recovery feature is enabled by default, but only ports configured as long distance ports can utilize this feature.

Port Parameters Conflict — FCIP

The FCIP tunnel configuration on the GbE ports must be identical on both sides. Use switchShow and portshow fciptunnel to verify the port settings, including FCIP tunnel compression, fastwrite, tape pipelining, IPSec, and IKE configuration. Below is an example FCIP setup:

Canadian site configuration
CA_SAN01:FID128:admin> portshow fciptunnel all

——————————————————————————-

Tunnel Circuit OpStatus Flags Uptime TxMBps RxMBps ConnCnt CommRt Met

——————————————————————————-

4/12 — Up -ft— 51d23h 0.00 0.00 7 — —

4/22 — Up -f— 51d23h 8.05 116.71 6 — —

——————————————————————————-

Flags: tunnel: c=compression f=fastwrite t=Tapepipelining F=FICON T=TPerf

circuit: s=sack

CA_SAN01:FID128:admin> portshow fciptunnel 4/22

——————————————-

Tunnel ID: 4/22

Tunnel Description: FCIP XGE0 Disk

Admin Status: Enabled

Oper Status: Up

Compression: Off

Fastwrite: On

Tape Acceleration: Off

TPerf Option: Off

IPSec: Disabled

Remote WWN: Not Configured

Local WWN: 10:00:00:05:33:19:dc:22

Peer WWN: 10:00:00:05:1e:e3:c7:12

Circuit Count: 3

Flags: 0x00000000

FICON: Off

CA_SAN01:FID128:admin>

US Site configuration

US_SAN01:FID128:admin> portshow fciptunnel all

——————————————————————————-

Tunnel Circuit OpStatus Flags Uptime TxMBps RxMBps ConnCnt CommRt Met

——————————————————————————-

4/12 — Up -ft— 51d23h 0.00 0.00 6 — —

4/22 — Up -f— 51d23h 74.79 5.49 5 — —

——————————————————————————-

Flags: tunnel: c=compression f=fastwrite t=Tapepipelining F=FICON T=TPerf

circuit: s=sack

US_SAN01:FID128:admin> portshow fciptunnel 4/22

——————————————-

Tunnel ID: 4/22

Tunnel Description: FCIP XGE0 Disk

Admin Status: Enabled

Oper Status: Up

Compression: Off

Fastwrite: On

Tape Acceleration: Off

TPerf Option: Off

IPSec: Disabled

Remote WWN: Not Configured

Local WWN: 10:00:00:05:1e:e3:c7:12

Peer WWN: 10:00:00:05:33:19:dc:22

Circuit Count: 3

Flags: 0x00000000

FICON: Off

US_SAN01:FID128:admin>
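
The two listings above can be checked against each other mechanically. The sketch below assumes the per-tunnel output has been saved as text in the "Name: value" form shown. The function names and the list of settings to compare are hypothetical choices for illustration, and the Compression value in the third sample is made up to show a mismatch.

```python
MUST_MATCH = ("Compression", "Fastwrite", "Tape Acceleration", "IPSec")


def parse_tunnel(text):
    """Turn 'Name: value' lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, since WWN values contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def tunnel_mismatches(local_text, remote_text, keys=MUST_MATCH):
    """Map each differing setting to its (local, remote) pair."""
    local, remote = parse_tunnel(local_text), parse_tunnel(remote_text)
    return {
        key: (local.get(key), remote.get(key))
        for key in keys
        if local.get(key) != remote.get(key)
    }


ca_site = """\
Tunnel ID: 4/22
Compression: Off
Fastwrite: On
Tape Acceleration: Off
IPSec: Disabled
Local WWN: 10:00:00:05:33:19:dc:22
"""
us_site = """\
Tunnel ID: 4/22
Compression: Off
Fastwrite: On
Tape Acceleration: Off
IPSec: Disabled
Local WWN: 10:00:00:05:1e:e3:c7:12
"""
# Hypothetical misconfigured peer with compression switched on.
bad_site = us_site.replace("Compression: Off", "Compression: On")

print(tunnel_mismatches(ca_site, us_site))
print(tunnel_mismatches(ca_site, bad_site))
```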

5. Domain ID overlap/Conflict

Normally, domain IDs are assigned automatically. However, once a switch is online, its domain ID cannot change, because that would change the port addressing and potentially disrupt critical I/O. To resolve a domain ID overlap, perform a switchDisable followed by a switchEnable on the joining switch. This lets the joining switch obtain a new domain ID as it comes online: the principal switch of the fabric allocates the next available domain ID to it. Using insistent domain ID (IDID) mode is recommended. When IDID mode is enabled, the current domain setting for the switch is insistent; that is, the same ID is requested during switch reboots, power cycles, CP failovers, firmware downloads, and fabric reconfiguration.

- IDID is required in FICON environments.
- IDID is recommended in HP-UX and AIX environments, because the hardware path or PID is associated with the domain ID.
- Changing a domain ID can affect port zoning entries. Before changing a switch's domain ID, check whether any port zoning entries exist for devices on that switch.
With an insistent domain ID set, the switch address is not changed automatically when a switch with a duplicate address is added to the fabric. Instead, fabrics that use insistent domain IDs require an operator's explicit action to change a switch address. Fabric binding and the insistent domain ID are normally configured only when a switch is installed or reinstalled.
To change the Insistent Domain ID mode and the domain ID:
1. switchdisable
2. configure

admin> configure

Configure…
Fabric parameters (yes, y, no, n): [no] y
Domain: (1..239) [6] 10
…,… output truncated
Insistent Domain ID Mode (yes, y, no, n): [y]

3. switchenable
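
Before connecting two fabrics, it is worth checking for duplicate domain IDs up front. The sketch below assumes you have already collected a mapping of switch name to domain ID for each fabric (for example from fabricshow on each side). The function name, switch names, and IDs are all hypothetical.

```python
def domain_overlaps(fabric_a, fabric_b):
    """Map each duplicated domain ID to (fabric A switch, fabric B switch)."""
    b_by_id = {domain: name for name, domain in fabric_b.items()}
    return {
        domain: (name, b_by_id[domain])
        for name, domain in fabric_a.items()
        if domain in b_by_id
    }


# Hypothetical inventories of the two fabrics to be merged.
fabric_a = {"swA1": 1, "swA2": 6}
fabric_c = {"swC1": 6, "swC2": 10}

print(domain_overlaps(fabric_a, fabric_c))
```

Each reported pair needs one of its switches moved to a free domain ID with the configure steps above before the ISL is brought up.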

6. Access Control Lists (ACLs)

Verify the security policy. The policy may be set up to restrict which switches can join the fabric, so make sure the new switch is added to the SCC policy ACL.

SAN01:FID128:admin> secpolicyshow

____________________________________________________
ACTIVE POLICY SET
SCC_POLICY
WWN DId swName
—————————————————
10:00:00:05:33:19:dc:22 100 CA_SAN01
10:00:00:05:1e:e3:c7:12 200 US_SAN01
____________________________________________________
DEFINED POLICY SET
SCC_POLICY
WWN DId swName
—————————————————
10:00:00:05:33:19:dc:22 100 CA_SAN01
10:00:00:05:1e:e3:c7:12 200 US_SAN01
SAN01:FID128:admin>
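
To confirm that a joining switch is already listed, the active policy members can be extracted from saved secpolicyshow output. This is a sketch under two assumptions: the text follows the layout above, and the active set contains only SCC_POLICY entries. The function name and the WWN of the new switch are hypothetical.

```python
import re

WWN_RE = re.compile(r"^(?:[0-9a-f]{2}:){7}[0-9a-f]{2}$", re.IGNORECASE)


def active_policy_wwns(text):
    """Return the WWNs listed under the ACTIVE POLICY SET section."""
    members = set()
    in_active = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("ACTIVE POLICY SET"):
            in_active = True
        elif stripped.startswith("DEFINED POLICY SET"):
            in_active = False
        elif in_active:
            fields = stripped.split()
            if fields and WWN_RE.match(fields[0]):
                members.add(fields[0].lower())
    return members


sample = """\
ACTIVE POLICY SET
SCC_POLICY
WWN DId swName
10:00:00:05:33:19:dc:22 100 CA_SAN01
10:00:00:05:1e:e3:c7:12 200 US_SAN01
DEFINED POLICY SET
SCC_POLICY
WWN DId swName
10:00:00:05:33:19:dc:22 100 CA_SAN01
"""

new_switch_wwn = "10:00:00:05:1e:aa:bb:cc"  # hypothetical joining switch
members = active_policy_wwns(sample)
print(new_switch_wwn in members)
```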
