Unified MPLS: Advanced Scaling for Core and Edge Networks
  • Abstract:

    Service Providers (SPs) are striving to become
    'Experience Providers' while offering many residential and/or commercial
    services.
    Many SPs have to build agile Next-Gen Networks (NGN) that can optimally deliver the 'Any Play' promise.
    However,
    as networks continue to get bigger, fatter and richer,
    some of the conventional wisdom of designing IP/MPLS networks is no
    longer sufficient.

    This session introduces a 'Cisco Validated Design' for
    building Next-Gen Networks' Core and Edge. It briefly discusses the
    technologies integral to such a design and focuses on their implementation
    using IOS-XR platforms (CRS-1/3/X and ASR 9000). It looks at the
    scaling designs and properties of IP, MPLS, the IGP and BGP, as well as
    the protection mechanisms IP/LDP FRR and MPLS-TE FRR.

    This is intended to cover:
    + Unicast routing + MPLS design
    + Fast Restoration
    + Topology Dependency
    + Test Results
    + Case Study

    Trend:
    + Networks becoming larger
     - Quad-play (Video, Voice, Data & Mobility)
     - Merger & Acquisition
     - Growth
    + Exponential bandwidth consumption
     - Business Services
     - Mobile
    + MPLS in the Access
     - Seamless MPLS
     - MPLS-TP
    + BGP ASN consolidation
     - Single ASN offering to customers

    NGN Requirements:
    + Large Network
     - 2000+ routers, say
    + Multi-Play Services Anywhere in network
     - Service Instantiation happens anywhere
    + End-to-End Visibility
     - v4/v6 Uni/Multicast based Services
    + Fast Convergence or Restoration
     - Closer to Zero loss, the better :-)
    + Scale & Performance

    Solution Overview:
    + Unicast Routing + MPLS - Divide & Conquer
     1. Isolate IGP domains
     2. Connect IGP domains using BGP
    + Fast Restoration - Leverage FRR
     1. IP FRR (IGP LFA & BGP PIC)
     2. MPLS FRR (LDP FRR & TE FRR)
    + Topological Consideration - Choose it right
     1. PoP Design
     2. ECMP vs. Link-Bundling
    + Services - Scale

    Routing + MPLS Design

    Must Provide:
    + PE-to-PE Routes (and Label Switched Paths)
     - PE needs /32 routes to other PEs
     - PE placement shouldn't matter
    + Single BGP ASN

    Conventional Wisdom Says:
    + Advertise infrastructure (e.g. PE) routes in IGP
    + Advertise infrastructure (e.g. PE) labels in LDP
    + Segment IGP domains (i.e. ISIS L1/L2 or OSPF Areas)

    Conventional Wisdom Not Good Enough:
    + Large IGP database size a concern
     - For fast(er) convergence
    + Large IGP domain a concern
     - For Network Stability
    + Large LDP database a concern

    'Divide & Conquer' - Game Plan:
    + Disconnect & Isolate IGP domains
     - No more end-to-end IGP view
    + Leverage BGP for infrastructure (i.e. PE) routes
     - Also for infrastructure (i.e. PE) labels

    'Divide & Conquer' - End Result:

    image
    Example - 'PE31' Reachability:
    + Control Plane Flow - RIB/FIB Table View
    + Data Plane Flow - PE11 to PE31 Traffic View

    Divide & Conquer - Summary:
    1. IGP is restricted to carry only internal routes
     - Non-zero or L1 area carries only routes for that area
     - Backbone carries only backbone routes (the ISIS backbone would carry both L1 and L2 routes, since L1->L2 (or L1->L1) redistribution cannot be avoided yet, but OSPF non-zero<->zero area redistribution can be)
    2. PE redistributes its loopback into IGP as well as iBGP+Label
    3. PE peers with its local ABRs using iBGP+label
     - ABRs act as Route-reflectors
     - ABRs reflect only infrastructure (i.e. PE) routes
     - RRs also in the backbone
    4. ABR, as RR, changes the BGP Next-hop to itself
     - On every BGP-advertised route
    5. PEs separately peer using iBGP for Services (VPN, say)
     - Dedicated RRs for IPv4/6, VPNv4/6, L2VPN, etc.
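
    Steps 2-4 above can be sketched in IOS-XR configuration. This is a minimal, hypothetical fragment (ASN 100, policy name, and addresses are illustrative, not from the original design guide):

```
route-policy PASS
  pass
end-policy
!
router bgp 100
 address-family ipv4 unicast
  ! step 2: PE advertises its loopback into iBGP with a label (RFC 3107)
  network 10.0.0.31/32 route-policy PASS
  allocate-label all
 !
 ! steps 3/4: on the ABR, PEs are RR clients and the next-hop is reset to self
 neighbor 10.0.0.31
  remote-as 100
  update-source Loopback0
  address-family ipv4 labeled-unicast
   route-reflector-client
   next-hop-self
```

    The label stacking falls out of this: the BGP label identifies the remote PE end-to-end, while the per-domain IGP/LDP label only has to reach the local ABR.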

    Divide & Conquer - End Result:

    image
    Example - 'L3VPN Services'
    + PE11 sends L3VPN traffic for an L3VPN prefix "A" to PE31

    Take-Away:
    + Higher Network scale is attainable
     - 1000s of routers
    + BGP and MPLS Label Stacking are key

    Fast Restoration:
    + Business Services demanding faster restoration
     - Against link or node failures
    + "Service Differentiator" for many operators
    + Faster Restoration is driving towards 0 loss
     - ~50ms restoration may be good enough for many :-)
     - Requirements influence Complexity and Cost
    + Fast Restoration is optimal with "Local Protection"
     - pre-compute and pre-install alternate path
     - no need for remote nodes to know about the failure

    + Fast Restoration of Services i.e. BGP Prefixes
     - BGP Prefix Independent Convergence (PIC)
    + Fast Restoration of BGP next-hops i.e. IGP Prefixes
     - IP FRR (LFA) with LDP FRR (or RSVP-TE FRR)
    + Fast Convergence (FC) of IP routing protocols is key and still required

    Fast Convergence

    IGP Prefixes:
    + Remember that FRR is intended for temporary restoration
    + Fast Convergence (FC) is key for IP routing protocols
    + The faster the routing convergence, the faster the permanent restoration
     - <1 sec restoration is possible
    + Routing convergence happens at the process level and hence depends on the platform processor
     - Restoration time cannot be guaranteed

    + Detect Link/node down event as fast as possible (MUST for FRR)
     - BFD, Layer2 protocol keep-alives, Alarms, IGP fast hellos, Proactive Protection
    + Generate the link state event - LSP/LSA generation is optimized
    + Propagate the changes in the network as soon as possible - Flooding and pacing are optimized
    + Recalculate the paths (run SPF) as soon as possible - Support of incremental SPF and optimized for full SPF
    + Install the new routes in the routing/forwarding table with Prefix Prioritization (MUST for FC)
     - CRITICAL: IPTV SSM sources
     - HIGH: Most Important PE's
     - MEDIUM: All other PE's
     - LOW: All other prefixes
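
    On IOS-XR, a prioritization along these lines can be expressed with IS-IS SPF prefix priorities. A hypothetical sketch (the process name and tag values are invented; prefixes would be tagged at their origin):

```
router isis core
 address-family ipv4 unicast
  spf prefix-priority critical tag 100   ! e.g. IPTV SSM sources
  spf prefix-priority high tag 200       ! most important PEs
  spf prefix-priority medium tag 300     ! other PEs
```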

    IGP Tuning for FC:

    OSPF Tuning:
    + OSPF Event Propagation
     - timers pacing flood value
     - timers pacing retransmission value
     - default values are 33 msec/66 msec
    + OSPF Subsecond Hellos Configuration:
     - ip ospf dead-interval minimal hello-multiplier value
     - Value-range 3-20
    + OSPF LSA Generation Exponential Backoff
     - timers throttle lsa all lsa-start lsa-hold lsa-max
     - timers lsa arrival timer
    + OSPF SPF Exponential Backoff
     - Timers throttle spf spf-start spf-hold spf-max
     - All LSA/SPF values are in ms
    Note: MinLSArrival (timers lsa arrival) must be <= lsa-hold
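
    Putting the OSPF knobs above together in one fragment (illustrative IOS configuration; the timer values are examples, not recommendations):

```
router ospf 1
 timers pacing flood 15
 timers throttle lsa all 50 200 5000    ! lsa-start lsa-hold lsa-max (ms)
 timers lsa arrival 100                 ! MinLSArrival, must be <= lsa-hold
 timers throttle spf 50 200 5000        ! spf-start spf-hold spf-max (ms)
!
interface GigabitEthernet0/0
 ip ospf dead-interval minimal hello-multiplier 4   ! 4 hellos per second
```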

    IS-IS Tuning:
    + IS-IS hello interval / Hello Multiplier
     - isis hello-interval { seconds | minimal }
     - isis hello-multiplier value
     - Value-range 3-20
    + IS-IS LSP-Generation Exponential Backoff
     - lsp-gen-interval lsp-max lsp-start lsp-hold
     - lsp-max - (sec) lsp-hold - (msec) lsp-start - (msec)
    + IS-IS Event Propagation
     - lsp-interval value
     - Default rate - one LSP every 33 ms
    + Fast LSP Flooding
     - fast-flood lsp-number (Previously ip fast-convergence)
    + IS-IS SPF Exponential Backoff
     - spf-interval spf-max spf-start spf-hold
     - <spf-max> - (sec) <spf-start> - (msec) <spf-hold> - (msec)
     - prc-interval prc-max prc-start prc-hold
     - <prc-max> - (sec) <prc-start> - (msec) <prc-hold> - (msec)
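
    The IS-IS knobs above combine similarly (illustrative IOS configuration; the values are examples only):

```
router isis
 fast-flood 10
 lsp-gen-interval 5 50 200    ! lsp-max (s) lsp-start (ms) lsp-hold (ms)
 spf-interval 5 50 200        ! spf-max (s) spf-start (ms) spf-hold (ms)
 prc-interval 5 50 200        ! prc-max (s) prc-start (ms) prc-hold (ms)
!
interface GigabitEthernet0/0
 isis hello-interval minimal
 isis hello-multiplier 4
 isis lsp-interval 10         ! one LSP every 10 ms (default 33 ms)
```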


    B-)
  • http://www.cisco.com/web/ME/connect2013/saudiarabia/pdf/fixed_mobile_convergence_by_gary_day.pdf

    IPFRR: Network Availability and Simplicity

    Fast Reroute Requirements:

    Convergence: Impact of Outage on Video

    image
    Video artifacts. With a slice error, we can see the image (a) as a viewer would see it and (b) with the parts in error highlighted. With a blocking or pixelization error (c), the effect occurs when a loss occurs in either an I- or P-frame. (Source material copyright the Society of Motion Picture Television Engineers.)

    image
    If we have a 10 ms outage, the best we can expect is a 33 ms disruption, because we lose only a B-frame; the worst case is far longer, because we got unlucky and lost an I-frame. The worst-case loss for low motion is longer than the worst-case loss for high motion, because there are more frames in high motion.

    Convergence:

    image
    + Assume a flow from A to B
    + T1: when L dies, the best path is impacted
     - loss of traffic
    + T2: when the traffic reaches the destination again through the computed next best path.
     - If fast reroute technologies are used, this may happen well before the network convergence
     - Once the network converges, a next best path is computed
    + Loss of Connectivity: T2 - T1, called "convergence" hereafter
    + Traffic can be restored long before the convergence time if fast reroute technology is used

    Fast Convergence & Fast Reroute:
    + Minimize network downtime/traffic loss
     - "Classical" Convergence > 1 sec.
     - Fast Convergence < 1 sec.
     - Fast Re-Route < 50-100 msec.
    + Support all types (Link, Node or SRLG) of IP/MPLS restoration mechanisms.
    + Keep it simple and straight.
    + Keep it cost effective (both capex/opex)

    Classical and Fast Convergence:
    + Detection (link or node aliveness, routing updates received)
    + Walk through routing DB's
    + State propagation (routing updates send)
    + Compute primary path & label
    + Download to HW FIB
    + Switch to newer path

    Fast Reroute - Repair Path Precomputed; additional actions beyond Classical and Fast Convergence:
    + Pre-compute repair path (offline calculation)
    + Download to HW FIB (offline calculation)
    + Switch to repair path

    BGP PIC (Prefix Independent Convergence):
    + What is it, and why?
    image
    + PIC is the ability to restore forwarding without resorting to per prefix operations.
    + Loss Of Connectivity does not increase as network grows (one problem less).

    BGP Recursion:
    + show ip route 110.1.0.0
     * 10.0.0.3, from 10.0.0.3, 00:01:20 ago
    + show ip route 10.0.0.3
    + show ip cef 110.1.0.0

    Non Optimal: Flat FIB:
    + Each BGP FIB entry has its own local outgoing interface (OIF) information
    + Forwarding plane must directly recurse on local OIF information
    + FIB changes can take a long time, dependent on the number of prefixes

    Right Architecture: Hierarchical FIB:
    + Pointer indirection between BGP and IGP entries allows immediate leveraging of the IGP convergence, and immediate update of the multipath BGP path-list at IGP convergence
    + Only the parts of the FIB actually affected by a change need to be touched
    + Used in newer IOS and IOS-XR (all platforms); enables Prefix Independent Convergence
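
    The pointer indirection described above can be sketched in a few lines of Python. This is a toy model, not router code; all names and values are invented:

```python
# Toy model of a hierarchical FIB: many BGP prefixes share ONE IGP
# path-list object, so a single IGP update repoints all of them at
# once -- no per-prefix work, hence "prefix independent".

class IGPPathList:
    def __init__(self, paths):
        self.paths = paths          # resolved outgoing interfaces


class BGPEntry:
    def __init__(self, nhop, igp_pl):
        self.nhop = nhop            # BGP next-hop address
        self.igp_pl = igp_pl        # pointer to shared IGP path-list


# Thousands of prefixes all recurse on the same IGP path-list object
pl = IGPPathList(["Gi0/0 via P1"])
fib = {f"110.{i}.0.0/16": BGPEntry("10.0.0.3", pl) for i in range(3)}


def lookup(prefix):
    return fib[prefix].igp_pl.paths[0]


# IGP converges: one in-place update fixes every dependent BGP prefix
pl.paths = ["Gi0/1 via P2"]
```

    A flat FIB would instead copy the OIF into every prefix entry, so the same event would cost one rewrite per prefix.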

    Failure in the Core:
    + Address failures "in the core" where the recursive BGP path stays intact.
     - Failures covered are P-PE link or P node failures that trigger a change of the IGP path to the BGP next-hop.

    image
    + IGP convergence on PE1 leads to a modification of the RIB path to PE3.
     - BGP Dataplane Convergence is finished assuming the new path to the BGP nhop is leveraged immediately

    Hierarchical FIB:
    + FIB Leaf: group of prefixes
    + BGP Path-List: list of best ECMP BGP nhops and list of alternate BGP nhops
    + IGP Path-List: list of ECMP IGP paths
    + Adjacency: OIF and immediate nhop

    + As soon as the IGP converges (~200 ms), the IGP PL memory is updated, and hence all child BGP PLs leverage the new path immediately
    + Optimum convergence, Optimum Load-Balancing, Excellent Robustness

    PE Node Failure:
    + Addresses a change in the BGP path
    + i.e. a change to a different BGP next-hop due to a PE node failure, which normally would require network wide BGP best-path re-computation and path withdrawing

    image
    + BGP Dataplane Convergence is kicked in on PE1 and immediately redirects the packets via PE4 using a pre-calculated alternate (repair) path.

    + PE1 has primary and backup path
     - Primary via PE3
     - Backup via PE4 best external route

    + IGP propagates loss of PE3's /32 host route across the core to remote PEs

    + PE1 detects loss of PE3's /32 host route in IGP
     - CEF immediately swaps forwarding destination label from PE3 to PE4 using backup path
    + BGP NHT sends a "delete" notification to BGP which triggers BGP Control-Plane Convergence
     - BGP on PE1 computes a new bestpath later, choosing PE4

    PE-CE Link Failure:
    + PE1 and PE3 are the reacting points
    + Enhancement to the MPLS VPN BGP Local Convergence feature
    + Improvement by calculating a backup/alternate path in advance
    + When primary link PE3 - CE2 fails:
     - Data Plane: The traffic is sent to the backup/alternate path
     - Control Plane: PE1 is expected to converge to start using PE4's label to send traffic to 110.x.0.0/24

    + PE3 has primary and backup path
     - Primary via directly connected PE3-CE2 link
     - Backup via PE4 best external route
    + What happens when PE3-CE2 link fails?

    + CEF (via BFD or link layer mechanism) detects PE3-CE2 link failure
    + CEF immediately swaps to repair path label
     - Traffic shunted to PE4 and across PE4-CE2 link

    + PE3 withdraws route via PE3-CE2 link
    + Update propagated to remote PE routers

    + BGP on remote PEs selects new best path
    + New best path is via PE4
    + Traffic flows directly to PE4 instead of via PE3

    Loop Free Alternate (LFA) Key Concepts

    Why Not Just Use Fast Convergence:
    + ISIS/OSPF and CEF can be very fast!
     - 200 ms can be achieved on high-end platforms.
    + But...
     - It runs at the process level, so no time limit is guaranteed
     - Performance depends on tuning and platform implementation

    What is an LFA?
    + Stands for Loop Free Alternate
     - A node other than the primary next hop
    + Provides local protection for unicast traffic in pure IP (and MPLS/LDP) networks in event of a single failure, whether link, node, or shared risk link group (SRLG)
    + Traffic is redirected to the LFA almost immediately after failure
    + An LFA makes its forwarding decisions without knowledge of the failure
     - LFA must not use the failed element to forward the traffic
     - LFA must not use the protecting node to forward traffic
     - LFA must not cause loop
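
    The basic loop-free condition behind these rules (from RFC 5286) says neighbor N of node S is a loop-free alternate toward destination D iff dist(N, D) < dist(N, S) + dist(S, D). A toy check on a made-up 4-node topology:

```python
# Illustrative LFA check (RFC 5286 inequality 1).  The topology is
# hypothetical: S-N1, S-N2, N1-D links of cost 1; N2 reaches D only
# through S.  dist[x][y] = shortest-path cost from x to y.
dist = {
    "S":  {"S": 0, "N1": 1, "N2": 1, "D": 2},
    "N1": {"S": 1, "N1": 0, "N2": 2, "D": 1},
    "N2": {"S": 1, "N1": 2, "N2": 0, "D": 3},
    "D":  {"S": 2, "N1": 1, "N2": 3, "D": 0},
}


def is_lfa(s, n, d):
    """True if neighbor n of s is a loop-free alternate toward d."""
    return dist[n][d] < dist[n][s] + dist[s][d]
```

    Here N1 qualifies as an LFA for D (1 < 1 + 2), but N2 does not (3 < 1 + 2 is false): N2's own shortest path to D runs back through S, so using it would loop the traffic.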

    Per-Link LFA Protection:
    + Goal is to bypass failed link and reach primary node via alternative way
    + Main idea: we know a good path exists from the primary next-hop node to all destinations, so if we can bypass the failed link and deliver traffic to the router that was the primary path's next hop before the failure, we know that router can forward it onward

    Per Link LFA Limitations:
    Per-Link LFA Does Not Work in Some Cases

    image
    B-)
  • Scope of Orchestration:

    VMS 1.0.2 Services:

    image

    VMS 2.0 Services (Added):
    + 4000 Series as CPE
    + Intrusion Prevention (IPSv)

    VMS 2.1 Services (Added): Cloud VPN "as a Service"

    VMS 2.2 Services:

    image

    Delivering services to the branch:

    Today's approaches: Rack and Stack
    Good:
    + Best in breed
    + Customer choice
    + Modular build-out
    Drawbacks:
    + Environmental (space/power/wiring)
    + Onsite + complex installation
    + Truck rolls

    Integrated Branch Solution:
    Benefits:
    + Fully integrated solution
    + No truck roll
    + Simpler environmental
    Drawbacks:
    + Reduced customer choice
    + Upfront hardware investment
    + Software inter-dependencies

    What is vBranch Orchestration:
    + Centrally orchestrated branch-level NFV solution
    + Central portal infrastructure
    + NFV orchestrator - NCS
    + VNF EMS / NMS / Controller - choice
    + Elastic Services Controller @ branch
     - GUI + local life-cycle management
    + x86 capability at the branch

    image

    Customer Experience in Brief:

    image

    Self-Service User and Operator Portals - Customizable:

    image

    Cisco Virtual Managed Services
    Cloud VPN and Cloud MPLS Packages:

    image

    Application Policy Model and Instantiation:

    image
    All forwarding in the fabric is managed through the application network profile
    + IP addresses are fully portable anywhere within the fabric
    + Security and forwarding are fully decoupled from any physical or virtual network attributes
    + Devices autonomously update the state of the network based on configured policy requirements

    Cisco ACI Introduces Logical Network Provisioning of Stateless Hardware:

    image

    TWO TYPES OF LANGUAGES:

    Infrastructure Language <- Human Translator -> App Language

    Infrastructure Language:
    + VLAN
    + IP Address
    + Subnets
    + Firewalls
    + Quality of Service
    + Load Balancer
    + Access Lists

    App Language:
    + Application Tier Policy and Dependencies
    + Security Requirements
    + Service Level Agreement
    + Application Performance
    + Compliance
    + Geo Dependencies

    APIC-EM: Common Policy Model from Branch to Data Center:

    image

    Ultra Service Platform: From Physical to Virtualized Mobile Networks:

    image

    Agile Carrier Ethernet - ACE:

    image
    Minimal but "Sufficient" distributed control plane on network nodes
    Centralized intelligence on the SDN service controller

    + Transport: Autonomic self-deployed and self-protected, dynamic, ECMPs, flexible traffic engineering
    + Services: SDN + BGP for service, programmable

    + Autonomic Networking
     - Virtual Out of Band Channel Autonomic Control Plane
     - Secure & Zero Touch deployment
     - Auto IP / IP unnumbered
    + Segment Routing
     - Reduced Protocols
     - Application Integration
     - TI-LFA
     - Simplified TE
    + SDN Orchestration
     - NSO / Tail-F for Service and static Label provisioning
     - XRv for central control plane
     - Open SDN Controller and WAE as add-ons for SR TE

    Autonomic Networking: Secure, Plug-n-Play:
    + Plug-n-Play: A new node uses a v6 link-local address to build adjacency with existing nodes; no initial configuration is required
    + Secure: The new node is authenticated using its ID, and then builds an encrypted tunnel with its adjacent nodes
    + Always-on VOOB: Consistent reachability between the controller and network devices over a Virtual Out-of-band management VRF. Even with user misconfiguration, the VOOB will remain up

    Transport Evolution with Segment Routing (SR):
    + Application Enabled Forwarding
     - Each engineered application flow is mapped on a path
     - A path is expressed as an ordered list of segments
     - The network maintains segments
    + Simple: fewer protocols, less protocol interaction, less state
     - No requirement for RSVP, LDP
    + Scale: fewer label databases, fewer TE LSPs
     - Leverage MPLS services & hardware
    + Forwarding based on Labels with simple ISIS/OSPF extension
    + 50msec FRR service level guarantees
    + Leverage multi-services properties of MPLS

    Millions of Applications flows ->
    A path is mapped on a list of segments ->
    The network only maintains segments
    No application state

    The state is no longer in the network but in the packet
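
    A toy sketch of that idea: the packet carries its path as an ordered list of segments, and each node holds only per-segment state (its own SID), never per-flow state. The SIDs and node names below are invented:

```python
# Toy segment-routing forwarding: the path lives in the packet header
# as an ordered segment list; the network only knows the segments.

# Hypothetical node SIDs (the only state the "network" keeps)
FORWARDING = {16001: "R1", 16002: "R2", 16003: "R3"}


def forward(packet):
    """Consume segments front-to-back, visiting the node each SID names."""
    hops = []
    while packet["segments"]:
        sid = packet["segments"].pop(0)   # pop the active segment
        hops.append(FORWARDING[sid])
    return hops


# One engineered flow: steer via R1 then R3, bypassing R2
pkt = {"payload": "data", "segments": [16001, 16003]}
```

    A different flow just puts a different segment list in its packets; the nodes' state is unchanged.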

    ACE Transport: Unified MPLS with Segment Routing
    Unified MPLS with SR <- Simplified MPLS Transport:
    • Isolated network domains with common IP/MPLS technology using segment routing
    • Autonomic: auto-discovery, plug-n-play
    • Intra-domain routing: shortest-path, TI-FRR, anycast node SID for node redundancy
    • Inter-domain routing: SDN controlled inter-domain end-to-end routing
    • Backward compatible: with existing unified MPLS networks, LDP/RSVP-TE, RFC 3107

    image

    BGP Prefix Independent Convergence:
    BGP PIC affects prefixes under IPv4 and VPNv4 address families. For those prefixes, BGP calculates an additional second best path, along with the primary best path. (The second best path is called the backup/alternate path.) BGP installs the best and backup/alternate paths for the affected prefixes into the BGP RIB. The backup/alternate path provides a fast reroute mechanism to counter a singular network failure.
    • route-policy X
        set path-selection backup 1 advertise install
      end-policy
      !
      router bgp 65123
       address-family vpnv4 unicast
        additional-paths receive
        additional-paths send
        additional-paths selection route-policy X
    B-)
  • deepakarora1984.blogspot.com/search/label/UMMT-Seamless%20MPLS

    50 ms Switch-over Time:

    Since we work on the IP protocol, the first place to look is the RFCs.

    RFC 3469, "Framework for Multi-Protocol Label Switching (MPLS)-based Recovery", states:

    Fastest MPLS recovery is assumed to be achieved with protection switching and may be viewed as the MPLS LSR switch completion time that is comparable to, or equivalent to, the 50 ms switch-over completion time of the SONET layer.

    That is, the fastest protection switching... how fast? Within 50 ms, equivalent to SONET.

    Is there any other source requiring IP to switch within 50 ms? So far, I have not seen one.

    Actually, the RFC that is the MPLS-TP standard explicitly references G.841, but some may say that we mostly do not use MPLS-TP, so the general MPLS RFC, RFC 3469, comes first.

    So let's look at ITU-T G.841: Types and characteristics of SDH network protection architectures.

    It states in several places that various kinds of detection must complete within 50 ms, otherwise further mechanisms are triggered in cascade. My guess is that the 50 ms switch requirement exists to keep those mechanisms from being triggered unnecessarily, which would significantly impact services.

    Regarding the definition, it is written as follows (three excerpts from different sections, focusing on the term "switch completion time", which must not exceed 50 ms):

    “ switch completion time: The interval from the decision to switch to the completion
    of the bridge and switch operation at a switching node initiating the bridge request. “

    “ Protection switch completion time excludes the detection time necessary to initiate
    the protection switch and the hold-off time. ”

    “Switch time – In a ring with no extra traffic, all nodes in the idle state (no detected failures,
    no active automatic or external commands, and receiving only Idle K-bytes), and with less
    than 1200 km of fiber, the switch (ring and span) completion time for a failure on a single
    span shall be less than 50 ms.”

    In summary, the 50 ms does not include the detection time or the hold-off timer. Of particular interest here is the detection time, for which we might use BFD or the fault-propagation mechanism of DWDM equipment, for example; per the standard, it is not counted within the 50 ms.

    However, when testing pass/fail with a tester, what is usually measured is the duration of packet loss, which includes the detection time.

    It is not wrong to set whatever target you like, but it is wrong to claim that target is mandated by the standard.

    So which mechanism ensures that the switch-over time (or switch time, switch completion time) stays within 50 ms?

    The answer is FRR... So what is FRR? Does it have to be an RSVP-TE tunnel?


    Actually, FRR rests on a simple principle: pre-compute a backup next-hop. On carrier-grade equipment, that means downloading it into the forwarding hardware in advance, ready to be activated at any moment. If you wait for the failure before computing a new next-hop and updating, the switch will not be fast (slower than 50 ms, if we take the SONET/SDH spec as the benchmark).

    Simply doing FRR, whether BGP FRR, IP FRR, or LDP FRR, matches the SONET/SDH standard for switch-over time.

    But if detection takes 3 minutes before the switch... even with a switch completion time under 50 ms, or even 1 ms, it is still not OK, is it?

    Looking into the detection time of SONET/SDH, SDH detection may take up to 10 ms or more, so detection plus switching may take 60 ms or more in total.

    www.sonet.com/EDU/upsr.htm

    "When a fault occurs the node is allowed 10mS to detect the failure and 50mS to make the switch. This is standard for all SONET systems."

    www.mapyourtech.com/entries/general/blog-post

    According to GR-253 and G.841, a network element is required to detect AIS and initiate an APS within 10 ms. B2 errors should be detected according to a defined algorithm, and more than 10 ms is allowed. This means that the entire time for both failure detection and traffic restoration may be 60 ms or more (10 ms or more detect time plus 50 ms switch time).

    For mobile networks today, one important protocol is SCTP, the protocol used to carry signaling. If SCTP goes down, it is a rather serious issue.

    SCTP detects failures as follows:


    (อ้างอิง: rimmon-essentials.blogspot.com/2008/10/sctp-failure-detection-time.html)

    SCTP's multi-homing failure detection time depends on three tunable parameters:

    RTO.min (minimum retransmission timeout)
    RTO.max (maximum retransmission timeout), and
    Path.Max.Retrans (threshold number of consecutive timeouts that must be exceeded to detect failure).

    RFC2960 recommends these values:

    RTO.min - 1 second
    RTO.max - 60 seconds
    Path.Max.Retrans - 5 attempts per destination address

    If the timer expires for the destination address, set RTO = RTO * 2 ("back off the timer").
    The maximum value discussed (RTO.max) may be used to provide an upper bound to this doubling operation.

    Since Path.Max.Retrans = 5 attempts, this translates to a failure detection time of at least 63 seconds (1 + 2 + 4 + 8 + 16 + 32).
    In the worst-case scenario, taking the maximum of 60 seconds, the failure detection time is 360 seconds (6 * 60).

    In another example, where the following parameters are used,

    RTO.min - 100ms
    RTO.max - 400ms
    Path.Max.Retrans - 4 attempts

    Then,
    Max. failure detection time = (1 + PMR)* RTO.max = 5*400 = 2,000ms
    Min. failure detection time = 100 + 200 + 400 + 400 + 400 = 1,500ms
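
    The arithmetic above can be sketched as a small helper. This is an illustrative model of the RFC 2960 retransmission back-off, not an SCTP implementation:

```python
# Rough model of SCTP multi-homing failure-detection time: the RTO
# doubles on each consecutive timeout (capped at RTO.max), and failure
# is declared after Path.Max.Retrans + 1 consecutive timeouts.

def detection_time_bounds(rto_min, rto_max, path_max_retrans):
    """Return (min, max) failure-detection time, in the units of the RTOs."""
    # Best case: RTO starts at RTO.min and backs off (doubling, capped)
    rto, t_min = rto_min, 0
    for _ in range(path_max_retrans + 1):
        t_min += rto
        rto = min(rto * 2, rto_max)
    # Worst case: every timeout already runs at RTO.max
    t_max = (path_max_retrans + 1) * rto_max
    return t_min, t_max


# RFC 2960 defaults (seconds): 1, 60, 5  -> 63 s .. 360 s
# Tuned example (milliseconds): 100, 400, 4 -> 1500 ms .. 2000 ms
```

    With the recommended defaults, signaling failover takes minutes, which is why the tuned, much shorter RTO values are used in mobile deployments.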

    Cr: P'Pae@Nokia
    B-)