Uncovering Cisco ACI and Software-Defined-Networking (SDN) solutions

Leon Lai
31 min read · Apr 13, 2021


After publishing a few stories that deep-dive into data center technologies, it may be time to publish a document that unveils Software-Defined Networking, a technology that looks complicated to newcomers.

Before going into the details of the technologies, I would like to point out that this document is not intended to go in-depth or to serve as a reference design document for either the Cisco ACI or the VMware NSX SDN technologies. If you want to go more in-depth, you can visit the following reference design documents from the vendors:

What is Software-Defined-Networking (SDN)?

From Wikipedia:

Software-defined networking (SDN) technology is an approach to network management that enables dynamic, programmatically efficient network configuration in order to improve network performance and monitoring, making it more like cloud computing than traditional network management.[1] SDN is meant to address the fact that the static architecture of traditional networks is decentralized and complex while current networks require more flexibility and easy troubleshooting. SDN attempts to centralize network intelligence in one network component by disassociating the forwarding process of network packets (data plane) from the routing process (control plane). The control plane consists of one or more controllers, which are considered the brain of the SDN network where the whole intelligence is incorporated. However, the intelligent centralization has its own drawbacks when it comes to security,[1] scalability and elasticity[1] and this is the main issue of SDN.

Picture from https://www.amazon.com/Adorable-Brain-Cartoon-Emoji-Sticker/dp/B073GC5QW7
Picture from https://www.dreamstime.com/illustration/nervous-system.html
photo from https://depositphotos.com/vector-images/blood-vessel.html
Photo from https://welcometochina.com.au/the-evolution-of-high-speed-rail-in-china-7726.html
Photo from http://www.iridetheharlemline.com/2012/04/19/taking-a-visit-to-the-occ-metro-norths-operations-control-center/
Photo from https://www.networkrail.co.uk/stories/signals-explained/
Photo from https://www.rediff.com/money/slide-show/slide-show-1-union-budget-railway-25-interesting-facts-about-the-indian-railways/20140707.htm
Photo from VMWare HK

History of SDN — Introducing the fathers of SDN

Martin Casado, Nick McKeown, Scott Shenker

Martin Casado (Photo from Google)
Nick McKeown (Photo from Google)
Scott Shenker (Photo from Google)

Quote from Wikipedia

Martín Casado is a Spanish-born American software engineer, entrepreneur, and investor. He is a general partner at Andreessen Horowitz, and was a pioneer of software-defined networking, and a co-founder of Nicira Networks.

Casado attended Stanford University from 2002 to 2008,[citation needed] earning both his Masters and PhD in computer science.[5] While at Stanford, he began development of OpenFlow,[6] His PhD thesis, “Architectural Support for Security Management in Enterprise Networks,” under advisors Nick McKeown, Scott Shenker and Dan Boneh, was published in 2008.[5]

Career

In 2007, Casado co-founded Nicira Networks along with McKeown and Shenker, a Palo Alto, California based company working on network virtualization. Along with McKeown and Shenker, Casado promoted software-defined networking.[6] His PhD work at Stanford University led to the development of the OpenFlow protocol, which was promoted using the term software-defined networking (SDN). McKeown and Shenker co-founded the Open Networking Foundation (ONF) in 2011 to transfer control of OpenFlow to a not-for-profit organization.[8]

In July 2012, VMware acquired Nicira for $1.26 billion.[9][10] At VMware he was made a fellow and held the positions chief technology officer (CTO) for networking and security and general manager of the Networking and Security Business Unit.[11]

Casado was named one of Business Insider’s 50 most powerful people in enterprise tech in 2012,[12] and was featured in Silicon Valley’s Business Journal’s “Silicon Valley 40 Under 40” in 2013.[1] Casado was a 2012 recipient of the Association for Computing Machinery (ACM) Grace Murray Hopper Award for helping create the Software Defined Networking movement.[13]

In 2015 Casado, McKeown and Shenker received the NEC C&C Foundation award for SDN and OpenFlow.[14] In 2015, he was selected for Forbes’ “Next Gen Innovators 2014.” [15] Casado left VMware and joined venture capital firm Andreessen Horowitz in February 2016 as its ninth general partner.[16][17][18] Andreessen Horowitz had been one of the investors in Nicira, contributing $17.7 million to the start-up venture.[10]

An overview of the OpenFlow concept is shown below:

Photo from overlaid.net

How Cisco and VMware cooperated in the early days, before the cloud-native era

In the early days, around 2006–2009, when the adoption of VMware ESXi and vCenter was becoming more popular, these two companies maintained a very good relationship and joined R&D efforts to develop the Nexus 1000V virtual switch for LAN switching of virtualized workloads hosted on hypervisors like VMware ESXi. At that time, VMware allowed the customer to use either the VMware Virtual Distributed Switch (VDS) or the Cisco Nexus 1000V virtual switch, where the Nexus 1000V could provide more advanced features such as NetFlow telemetry, EtherChannel, Private VLAN, and SPAN ports that VMware’s VDS could not.

Architectural Diagram of Nexus 1000V virtualized switch components covering VSM, VEM and VSG (Photo from https://www.cisco.com/c/en/us/products/collateral/interfaces-modules/virtual-security-gateway-nexus-1000v-series-switch/deployment_guide_c07-647435.html)

ASBIS: Virtualization Aware Networking — Cisco Nexus 1000V

https://www.slideshare.net/ASBISSK/asbis-virtualization-aware-networking-cisco-nexus-1000v

Limited features of the early virtual switch in VMware ESXi 4.0

Information from “VMware ESXi 4.0 Update 1 Release Notes”
Information from https://www.slideshare.net/ASBISSK/asbis-virtualization-aware-networking-cisco-nexus-1000v

The insights and market trends that Cisco observed from the adoption of the Nexus 1000V were the need for policy enforcement between application segments, where the policy elements should go beyond traditional elements like IP addressing and VLANs, and the need for a highly scalable, holistic DC switching fabric. Cisco harnessed this important experience in the development of ACI.

The acquisition war for Nicira Networks

However, the relationship between these two companies changed in 2012, when VMware and Cisco kick-started a war to acquire the SDN company Nicira. Eventually VMware won the competition at USD $1.26 billion. VMware quickly absorbed Nicira into its product portfolio and shipped the first release of NSX in 2013. Afterwards, the cooperation between VMware and Cisco started to break down. VMware had pictured the future of its Software-Defined Data Center (SDDC) clearly, so it acted to isolate Cisco from its software-based virtualized network domain. VMware kept improving the features of VDS and pushed customers to migrate from the Nexus 1000V to VDS. Eventually, in 2015, End of Availability (EOA) was announced for the Nexus 1000V, and finally, in 2020, VMware announced the “Discontinuation of third party vSwitch program.”

How Cisco responded to VMware’s actions — the Cisco ACI solution

Before losing the battle to acquire Nicira Networks, Cisco had already prepared and funded a new start-up company called Insieme. In 2013, Cisco chose to acquire Insieme for USD $863 million and folded it into the Cisco Data Center Networking Business Unit. Insieme named its SDN solution Application Centric Infrastructure (ACI). Physically, an ACI system consists of essential components in different roles, including an Application Policy Infrastructure Controller (APIC) cluster, Nexus 9000 series switches, and even a Virtual Edge (we will cover the position of this Virtual Edge later). The Nexus 9000 series switches can be classified into different essential roles, including spine node, leaf node, remote leaf node (for a remote DC without a full-scale architecture), and border node / L3- or L2-out gateway device, as well as other optional roles like IPN and ISN nodes, depending on the scale and scenario of the ACI deployment. Cisco Nexus 9000 series switches are designed to support two modes: the first is an NX-OS-based CLI whose syntax is similar to the one running on other Nexus platforms like the Nexus 7000 and Nexus 5000 series; the other is called ACI mode, where N9K switches that support it come with an ACI-compatible ASIC chipset and firmware pre-installed on the switch motherboard. Logically, Cisco positions the ACI fabric as one very big network switch, emphasizing its highly stretchable 40Gbps/100Gbps spine-leaf architecture. From an architectural point of view, Cisco emphasizes that ACI is an application-policy-driven programmable fabric for the cloud-native data center.

If you want to know more details about the spine-leaf architectural design for SDN, you may refer to the following information:

The spine-leaf architecture is a proven, scalable design for telecommunication networks that was proposed by Charles Clos in 1953. If you want to know more about the Clos architecture, please refer to the following:

For more about the fundamental concepts of Cisco ACI, you may visit the following:

The following illustrates the components of Cisco ACI.

The building blocks of Cisco ACI
  1. An APIC Cluster
  • At least 3 x APIC controller nodes. An APIC controller node is a Cisco UCS server appliance. It automates and manages the ACI fabric, policy enforcement, and health monitoring. More APIC nodes within a cluster does not mean more availability; it only increases scalability.

2. Cisco Nexus 9000 Series spine / leaf switches and L3Out Switch for Cisco ACI

  • Spine (aggregation) switches are used to connect to all leaf switches and are typically deployed at the end or middle of the row. Spine switches do not connect to other spine switches. Spines serve as a backbone interconnect for leaf switches. Generally, spines only connect to leaves, but when integrating Cisco Nexus 9000 switches with an existing environment, it is acceptable to connect other switches, services, or devices to the spines. Many veterans note that the spine is conceptually similar to the fabric module present in a chassis-based switch like the Nexus 7000.
  • Leaf (access) switches are what provide devices access to the fabric (the network of spine and leaf switches) and are typically deployed at the top of the rack. Generally, devices connect to the leaf switches. Devices can include servers, Layer 4–7 services (firewalls and load balancers), and WAN or Internet routers. Leaf switches do not connect to other leaf switches (unless running vPC in standalone NX-OS mode). However, every leaf should connect to every spine in a full mesh. Some ports on the leaf are used for end devices (typically 10 Gb), and some ports are used for the spine connections (typically 40 Gb or 100 Gb).

Virtual Port Channel (vPC) in ACI

https://www.cisco.com/c/dam/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/aci-guide-vpc.pdf

  • Border leaf switches are a pair of Nexus 9000 switches running the ACI image. The border leaf enables standard Layer 3 technologies to connect to external networks. These can be Layer 3 connections to an existing network, WAN routers, firewalls, mainframes, or any other Layer 3 device. Border leaf switches within the Cisco ACI fabric provide connectivity to the external Layer 3 devices. Cisco ACI supports Layer 3 connections using static routing (IPv4 and IPv6) or dynamic routing protocols. To keep the ACI fabric setup simple, the border leaf switch usually runs only static routing to external networks, and therefore it may require additional Layer 3 routing devices to interconnect the ACI fabric with external network devices like firewalls, branch network routers, the Internet, and even load balancers.
Diagram from https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/1-x/ACI_Best_Practices/b_ACI_Best_Practices/b_ACI_Best_Practices_chapter_010010.html
  • An L3Out connection, often implemented with a pair of Nexus switches running the standalone Nexus Operating System (NX-OS), defines how Cisco ACI connects to outside networks using Layer 3 routing, covering route exchange between Cisco ACI and the external routers and how dynamic routing protocols are used between the Cisco ACI border leaf switch and the external routers. It also covers the forwarding behavior between internal and external endpoints and the way policy is enforced for the traffic flowing between them.

3. Active/Active DC Node: Inter-Pod Network (IPN) for Multi-Pod Design or Intersite Network (ISN) for Multi-Site

  • IPN — The Multi-Pod solution is an evolution of the stretched-fabric use case. Multiple pods provide increased fault isolation in the control plane along with infrastructure cabling flexibility. As the name indicates, it connects multiple Cisco Application Policy Infrastructure Controller (APIC) pods using a Layer 3 Inter-Pod Network (IPN).
Slide from CiscoLive https://www.ciscolive.com/c/dam/r/ciscolive/apjc/docs/2016/pdf/BRKACI-3502.pdf
  • ISN — the different APIC domains are interconnected through a generic Layer 3 infrastructure, generically called the Intersite Network (ISN). The ISN requires plain IP-routing support to allow the establishment of site-to-site VXLAN tunnels. This requirement means that the ISN can be built in an arbitrary way, ranging from a simple, single router device (two are always recommended, for redundancy) to a more complex network infrastructure spanning the world.
Slide from CiscoLive https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2018/pdf/BRKDCN-2657.pdf

4. Cisco ACI Multi-Site Orchestrator (MSO) Cluster — 3 x Nodes

Cisco ACI Multi-Site architecture (Image from https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739609.html)

A Cisco ACI Multi-Site Orchestrator (MSO) cluster can be deployed as hardware appliances or as VMs on hypervisors. This component is the intersite policy manager. It provides single-pane management, enabling you to monitor the health-score state for all the interconnected sites. It also allows you to define, in a centralized place, all the intersite policies that can then be pushed to the different APIC domains for rendering on the physical switches building those fabrics. It thus provides a high degree of control over when and where to push those policies, allowing the tenant and change-domain separation that uniquely characterizes the Cisco ACI Multi-Site architecture.

5. Cisco Cloud APIC

The ACI Cloud APIC (Image from https://www.cisco.com/c/en/us/products/collateral/cloud-systems-management/application-policy-infrastructure-controller-apic/datasheet-c78-739715.html)

The Cisco Cloud APIC extends ACI coverage to Amazon Web Services (AWS) and Microsoft Azure. It is available on the AWS Marketplace as an AMI image. A single instance of the Cisco Cloud APIC can provide networking, visibility, and policy-translation functionality for workloads deployed across multiple AWS regions and availability zones. This enables IT organizations to simplify their operations and governance in multicloud environments. The solution eases application deployment across any location and any cloud.

How logical building blocks construct the ACI fabric (Diagram from Cisco.com)

Cisco ACI positions itself as an SDN solution. Like other SDN vendors, ACI offers northbound APIs for management access via CLI, Python, and the web GUI, and a southbound API for control-plane access to the switch fabric via its OpFlex protocol. (OpFlex: An Open Policy Protocol White Paper)
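To make the northbound API idea concrete, here is a minimal sketch of authenticating to APIC over REST and listing tenants with Python. The /api/aaaLogin.json and /api/class/fvTenant.json endpoints are standard APIC REST paths; the hostname and credentials are placeholders, and certificate verification is disabled only for lab illustration.

```python
# Minimal sketch: authenticate to APIC and list tenants over the northbound REST API.
# Assumes an APIC reachable at https://apic.example.com with admin credentials;
# adjust host, credentials, and certificate handling for a real environment.
import requests

APIC = "https://apic.example.com"
session = requests.Session()
session.verify = False  # lab sketch only; use proper CA verification in production

# Log in: APIC returns a token cookie (APIC-cookie) that the session reuses automatically.
login_payload = {"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}}
resp = session.post(f"{APIC}/api/aaaLogin.json", json=login_payload)
resp.raise_for_status()

# Query all tenant objects (class fvTenant) from the policy object model.
tenants = session.get(f"{APIC}/api/class/fvTenant.json").json()
for mo in tenants.get("imdata", []):
    print(mo["fvTenant"]["attributes"]["name"])
```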

Cisco highlights that ACI is a Policy-Driven Data Center Solution (Photo from Cisco Press)

Like other hardware-based networking vendors such as Juniper and Huawei, the Cisco ACI switching fabric harnesses a 40Gbps/100Gbps VXLAN overlay and Layer 3 virtual anycast gateway techniques to build a robust and highly stretchable Layer 3 / Layer 2 domain. Indeed, having a VXLAN overlay is not in itself unique among SDN vendors. So what gives Cisco a unique position in the SDN market? The answer is its leading Policy-Object-Model (POM) concept and the special ASE (ACI Spine Engine) and ALE (ACI Leaf Engine) ASIC chipsets installed on the N9K switch motherboard.

Nexus 9300 Platform Hardware Architecture to illustrate how ALE / ASE ASIC perform Policy Enforcement
ACI Policy Model Logical Constructs Overview (Photo from https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/policy-model-guide/b-Cisco-ACI-Policy-Model-Guide.html)
Association of Endpoint Groups with Access Policies (Photo from https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/policy-model-guide/b-Cisco-ACI-Policy-Model-Guide.html)

The following YouTube video delivers a very good introduction to the Cisco ACI policy model:

The development experience from the Nexus 1000V virtual switch — micro-segmentation:

From the very beginning of the Nexus 1000V virtual switch’s development, the ACI team realized clearly that security compliance and micro-segmentation between workload groups would be crucial features of a future cloud-native SDN solution. Meanwhile, Cisco ACI has positioned itself as a robust and highly stretchable policy-driven data center fabric. Therefore, ensuring consistent security policy control even as the data center fabric is expanded tremendously is a crucial point that differentiates the value of Cisco ACI from its competitors.

The following diagram illustrates how the initial Nexus 1000V switch performs micro-segmentation across workload groups.

Nexus 1000v Workload Segmentation (Photo from https://www.cisco.com/c/en/us/td/docs/switches/datacenter/vsg/sw/4_2_1_VSG_1_2/vsg_configuration/guide/VSG_config/vsg_config_intro.html)

The Cisco Nexus 1000V uses the Virtual Security Gateway (VSG) to segment traffic between zones. Traffic between zones is controlled by the VSG’s ACL policy rules, which check the following network attributes and apply either a permit or a deny action. The VSG blocks traffic by default (implicit deny, the same as contracts between ACI EPGs).

5 Tuples of VSG Supported Attributes
Access Control Policy by ACL Model in VSG on Nexus 1000V (Photo from https://www.cisco.com/c/en/us/products/collateral/interfaces-modules/virtual-security-gateway-nexus-1000v-series-switch/deployment_guide_c07-647435.html)

ACI — Policy-Driven-Model SDN

In Cisco ACI, workloads with common characteristics are classified into End Point Groups (EPGs) according to the ACI policy model. For example, web servers are classified into a “Web EPG”, database servers into a “Database EPG”, and DHCP, DNS, and IPAM servers into a “Shared Services EPG”. Policy enforcement between EPGs is controlled by policy rules called “contracts”. By default, EPG contracts use a whitelist model with an implicit deny for traffic that does not match any contract. In contrast, ACI also supports “taboo contracts”, a blacklist model with an implicit permit for traffic that does not match any contract.

The policy-driven model of Cisco ACI can feel complicated to a network administrator provisioning APIC via the web GUI. From a DevOps perspective, however, the ACI policy-driven model is good for developers using the Cisco ACI API, since it provides a standard policy model for developing data center network use cases.
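As a hedged sketch of how that object model looks to a developer, the snippet below expresses a tenant with Web and DB EPGs plus a contract as nested managed objects and POSTs them to APIC. The class names (fvTenant, fvCtx, fvBD, fvAp, fvAEPg, vzBrCP, fvRsBd, fvRsCons, fvRsProv) are from the ACI management information model as I understand it; the tenant/EPG names and the reuse of the authenticated `session` and `APIC` from the earlier login sketch are assumptions for illustration, and the contract is left without subjects/filters for brevity.

```python
# Hedged sketch: push a tenant with a VRF, bridge domain, two EPGs and a contract.
# Reuses `session` and `APIC` from the login sketch above; names are illustrative.
tenant = {
    "fvTenant": {
        "attributes": {"name": "demo-tenant"},
        "children": [
            {"fvCtx": {"attributes": {"name": "demo-vrf"}}},        # VRF
            {"fvBD": {                                              # bridge domain bound to the VRF
                "attributes": {"name": "demo-bd"},
                "children": [{"fvRsCtx": {"attributes": {"tnFvCtxName": "demo-vrf"}}}],
            }},
            {"vzBrCP": {"attributes": {"name": "web-to-db"}}},      # contract (subjects/filters omitted)
            {"fvAp": {
                "attributes": {"name": "demo-app"},
                "children": [
                    {"fvAEPg": {                                    # consumer EPG
                        "attributes": {"name": "Web-EPG"},
                        "children": [
                            {"fvRsBd": {"attributes": {"tnFvBDName": "demo-bd"}}},
                            {"fvRsCons": {"attributes": {"tnVzBrCPName": "web-to-db"}}},
                        ],
                    }},
                    {"fvAEPg": {                                    # provider EPG
                        "attributes": {"name": "DB-EPG"},
                        "children": [
                            {"fvRsBd": {"attributes": {"tnFvBDName": "demo-bd"}}},
                            {"fvRsProv": {"attributes": {"tnVzBrCPName": "web-to-db"}}},
                        ],
                    }},
                ],
            }},
        ],
    }
}

# POST against the policy universe (uni); APIC merges the objects into its management tree.
resp = session.post(f"{APIC}/api/mo/uni.json", json=tenant)
resp.raise_for_status()
```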

Policy Object Model for ACI (From https://developer.cisco.com/docs/aci/#!introduction/object-model)
How Traffic Control between EPG by Contracts (Diagram from https://sdn.systemsapproach.org/index.html)

In the ACI design, APIC controllers program and download policy to leaf nodes only where it is required, to use the ALE ASIC’s memory resources more efficiently. However, ACI allows policy to be programmed and downloaded to leaf nodes in the following modes (a hedged example of how these settings are expressed via the API follows the list):

  • Pre-provision: Specifies that EPG policies (for example, VLAN, VXLAN binding, contracts, or filters) are downloaded to a leaf switch software even before a hypervisor is attached to the VDS, thereby pre-provisioning the configuration on the switch. Switches are selected based on AEP presence.
  • Immediate: Specifies that EPG policies are downloaded to the associated leaf switch software upon hypervisor attachment to VDS. LLDP or OpFlex permissions are used to resolve the hypervisor to leaf node attachments.
  • On Demand: Specifies that a policy is pushed to the leaf node only when a pNIC attaches to the hypervisor connector and a VM is placed in the port group (EPG).
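Below is a hedged sketch of where these immediacy settings live in the object model: the EPG-to-VMM-domain association. The class fvRsDomAtt and the attribute names resImedcy (resolution immediacy) and instrImedcy (deployment immediacy) are quoted from memory of the ACI object model and should be verified against the APIC API reference; the domain, tenant, and EPG names are the illustrative ones assumed earlier.

```python
# Hedged sketch: associate an EPG with a VMM domain and choose how eagerly policy
# is resolved and programmed on the leaves. Attribute names (resImedcy, instrImedcy)
# are from memory of the fvRsDomAtt class -- verify before use. Reuses `session`/`APIC`.
epg_domain_binding = {
    "fvRsDomAtt": {
        "attributes": {
            "tDn": "uni/vmmp-VMware/dom-demo-vmm",  # target VMM domain (assumed name)
            "resImedcy": "pre-provision",           # or "immediate" / "lazy" (on demand)
            "instrImedcy": "immediate",             # when to program the hardware
        }
    }
}

resp = session.post(
    f"{APIC}/api/mo/uni/tn-demo-tenant/ap-demo-app/epg-Web-EPG.json",
    json=epg_domain_binding,
)
resp.raise_for_status()
```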
From CiscoLive (https://www.ciscolive.com/c/dam/r/ciscolive/apjc/docs/2018/pdf/BRKACI-2504.pdf)
Document from https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2018/pdf/BRKACI-2300.pdf

Between EPGs, Cisco ACI supports the following actions for contracts:

From https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-743951.html

From an enforcement point of view for contracts between EPGs, in most scenarios Cisco tries to perform policy enforcement in the ingress direction on the leaf node, to make traffic forwarding within the ACI fabric more efficient.

From https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-743951.html

APIC controllers communicate with a software component called the Policy Element Manager to program the ACI policy model, covering EPGs and contracts, into the ASIC of the switch.

How APIC communicates with the N9K leaf switch ASIC to program the policy model

The following document illustrates more details about ACI contracts:

Macro-segmentation on ACI (Service Graph)

In Cisco ACI, security policy enforcement between EPGs is ordinarily controlled by ACI contracts. A contract is not a firewall or an IPS, because policy enforcement is stateless and does not provide advanced security features like threat prevention. For compliance requirements that need interception or inspection by a firewall, IPS, or load balancer between EPGs, the macro-segmentation features of ACI are required.

Macro-segmentation by service graphs in ACI (Slide from https://www.cisco.com/c/dam/m/de_de/events/2016/techdays-juni/pdfs/Tag1-ACI-Intro-FW1-2.pdf)

Cisco ACI offers three management models for the service graph:

● Network policy mode (or unmanaged mode): In this mode, Cisco ACI configures only the network portion of the service graph on the Cisco ACI fabric, which means that Cisco ACI doesn’t push configurations to the L4-L7 device.

● Service policy mode (or managed mode): In this mode, Cisco ACI configures the fabric and the L4-L7 device VLANs, and the APIC administrator enters the L4-L7 device configurations through APIC.

● Service manager mode: In this mode, the firewall or load-balancer administrator defines the L4-L7 policy, Cisco ACI configures the fabric and the L4-L7 device VLANs, and the APIC administrator associates the L4-L7 policy with the networking policy.

Among these three modes, network policy mode (or unmanaged mode) is the most popular. ACI provides the capability to insert Layer 4 through Layer 7 (L4-L7) functions using an approach called a service graph. One of the main features of the service graph is Policy-Based Redirect (PBR).

With PBR, the Cisco ACI fabric can redirect traffic between security zones to L4-L7 devices, such as a firewall, Intrusion-Prevention System (IPS), or load balancer, without the need for the L4-L7 device to be the default gateway for the servers or the need to perform traditional networking configuration such as Virtual Routing and Forwarding (VRF) sandwiching or VLAN stitching. Cisco ACI can selectively send traffic to L4-L7 devices based, for instance, on the protocol and the Layer 4 port. Firewall inspection can be transparently inserted in a Layer 2 domain with almost no modification to existing routing and switching configurations.
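To picture what a PBR destination looks like in the policy model, here is a heavily hedged sketch of a service redirect policy pointing at a firewall's data interface. The classes vnsSvcRedirectPol / vnsRedirectDest and their ip / mac attributes are quoted from memory of the L4-L7 (vns) object model; treat this as an illustration of the idea and verify against Cisco's PBR white paper and API reference. The IP/MAC values and tenant name are placeholders, and the authenticated `session` / `APIC` from the earlier sketch are reused.

```python
# Hedged sketch: a PBR policy that redirects matched traffic to a firewall's
# data interface. Class and attribute names are from memory -- verify before use.
pbr_policy = {
    "vnsSvcRedirectPol": {
        "attributes": {"name": "redirect-to-fw"},
        "children": [
            {"vnsRedirectDest": {
                # IP and MAC of the L4-L7 device's data leg (placeholders)
                "attributes": {"ip": "10.1.1.10", "mac": "00:50:56:AA:BB:CC"}
            }}
        ],
    }
}

# PBR policies live under a tenant; the tenant name here is the one assumed earlier.
resp = session.post(f"{APIC}/api/mo/uni/tn-demo-tenant.json", json=pbr_policy)
resp.raise_for_status()
```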

Some firewall vendors have integrated their solutions with Cisco ACI for macro-segmentation use cases. For example, both Fortinet and Palo Alto Networks have developed very good policy-element synchronization tools / plug-ins that integrate their firewall solutions with Cisco ACI.

The following documents provide a basic illustration of macro-segmentation of Cisco ACI with third-party platforms:

Slides from https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2017/pdf/BRKACI-2307.pdf

FortiGate Connector for Cisco ACI

Fortinet Connector for Cisco ACI

FortiGate provides L4 — L7 service insertion and automation within ACI. The FortiGate Connector is a device package installed on Cisco ACI that contains XML metadata describing Fortinet’s security services and can be easily uploaded to the Cisco APIC controller. This joint solution streamlines traffic to supported FortiGate appliances and assigns security policies on command for data center workloads.

How to install ACI Device Package on Cisco APIC
Cisco APIC to push policy to Fortigate via Fortigate Connector API for Cisco ACI
Provisioning and Service Insertion of Fortigate VDOM by Fortigate Connector in Cisco APIC

Palo Alto Networks Firewall Integration with Cisco ACI

Diagram from Palo Alto https://docs.paloaltonetworks.com/vm-series/9-1/vm-series-deployment/set-up-a-firewall-in-cisco-aci/palo-alto-firewall-integration-with-cisco-aci-overview

Meanwhile, Palo Alto has also published a whitepaper on integrating the Palo Alto firewall with Cisco ACI. Palo Alto highlights that it can support both east-west policy enforcement between ACI EPGs and north-south policy enforcement between users and the applications. The integration is quite broad and supports the following deployment options.

East/West Control: Traffic forwarding in a one-arm PBR integration (Diagram from Palo Alto Design Document
East/West Control: GoTo Integration Mode (Diagram from Palo Alto Design Document)
East/West Control: Traffic forwarding in a GoThrough integration

East/West Policy Enforcement

  • Policy-Based Redirect mode, with one-arm PBR between ACI EPGs
  • GoTo mode, where ACI routes traffic between ACI EPGs / bridge domains and the Palo Alto firewall serves as the routing gateway for the endpoints
  • GoThrough mode between ACI EPGs, where the Palo Alto firewall enforces policy between endpoints residing on the same subnet but in different EPGs
Slides from Palo Alto

North South Policy Enforcement

North/South Control: Firewall used as the external router in an L3Out deployment
  • The Palo Alto firewall can integrate with the ACI border leaf as an L3Out connection in ACI design practice. The firewall serves as the external router for policy enforcement of inbound / outbound traffic of the ACI fabric. In this mode, the Palo Alto firewall can even act as the L3Out gateway, replacing a standard L3Out device like a Nexus 93K or Nexus 92K.
Slides from Palo Alto

Reference Architecture Guide for Cisco ACI of Palo Alto can be downloaded at following:

https://www.paloaltonetworks.com/apps/pan/public/downloadResource?pagePath=/content/pan/en_US/resources/guides/cisco-aci-reference-architecture-guide

Slides from Palo Alto

Cisco and Palo Alto developed a plugin that installs on Panorama, the centralized provisioning platform for Palo Alto firewalls. The plugin polls APIC for endpoint information and changes in the ACI environment and provides that information to Panorama. Panorama then distributes firewall configurations to the corresponding Palo Alto firewalls. The plugin can work with a maximum of 16 APIC clusters.
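The kind of data such a plugin consumes can be pictured with a simple polling loop against the APIC endpoint class fvCEp (client endpoint). This is not the Palo Alto plugin's code, only a hedged illustration of the IP-to-EPG mapping a firewall manager could turn into dynamic address groups; it reuses the authenticated `session` and `APIC` from the earlier sketch.

```python
# Hedged illustration (not the Panorama plugin itself): periodically poll APIC for
# learned endpoints (class fvCEp) and print their IP, MAC and the EPG encoded in the DN.
import time

def poll_endpoints(interval_seconds: int = 60) -> None:
    while True:
        data = session.get(f"{APIC}/api/class/fvCEp.json").json()
        for mo in data.get("imdata", []):
            attrs = mo["fvCEp"]["attributes"]
            # The endpoint's DN encodes tenant/app-profile/EPG, e.g.
            # uni/tn-demo-tenant/ap-demo-app/epg-Web-EPG/cep-<MAC>
            print(attrs.get("ip"), attrs.get("mac"), attrs.get("dn"))
        time.sleep(interval_seconds)

# poll_endpoints()  # uncomment to run the polling loop
```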

Active/Active DC Deployment in Cisco ACI

The following diagrams illustrate the different deployment scenarios when an active/active DC is designed with Cisco ACI.

Journey of Active/Active DC for ACI https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2020/pdf/BRKACI-2125.pdf
Photo from https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-737855.html

An active/active data center design is almost always one of the mandatory requirements of a sizeable application infrastructure. Since ACI version 1.0, launched in 2014, Cisco has evolved a series of practices for designing and deploying Cisco ACI-based active/active data centers. Multi-Pod and Multi-Site are the two most popular deployment modes of Cisco ACI in the current releases (4.x or the latest 5.x).

https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2019/pdf/BRKACI-2003.pdf

A Multi-Pod deployment consists of a single APIC cluster domain, with 3 to 7 APIC nodes in the cluster, that can manage up to 12 pods across the Multi-Pod deployment. The APIC cluster spans the ACI domain across the DCs, and each DC is usually called a “Pod”. Cisco emphasizes that workloads residing in different Pods retain the same functionality as long as they are under the same ACI fabric. Since Multi-Pod is managed by a single APIC cluster, an outage of the APIC cluster means an outage of the ACI services; therefore, Multi-Pod is usually regarded as a single fault domain.

https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-737855.html

Each Pod is locally a spine-leaf VXLAN architecture. Within the Pod there is an IS-IS / COOP / MP-BGP domain, where IS-IS acts as the underlay routing between spine and leaf for VXLAN VTEP establishment, COOP (Council Of Oracle Protocol) acts as the endpoint address learning mechanism, and MP-BGP routing is the mechanism by which external L3 prefixes are injected into the ACI Pods and ACI subnets are announced out via L3Out.

Pods are interconnected by IP networks called Inter-Pod Network (IPN) nodes that link to the spines of each DC. The mandatory link requirements are that the round-trip time between Pods must be 50 msec or less and that the MTU on the WAN networks used for the IPN must be above 9000 bytes, most likely 9150 bytes or above.

Currently, the IPN is not managed by APIC and runs in NX-OS mode. Leaves in different Pods can establish VTEPs over the underlay for seamless east/west traffic flow, through underlay IS-IS to OSPF routing redistribution at the IPN layer. Spines establish VTEPs over the IPN for the MP-BGP EVPN protocol to exchange endpoint addresses across Pods, because endpoint address learning across Pods is beyond the scope of COOP.

For the L3Out routes injected into each Pod, the EIGRP, OSPF, or BGP routing protocols are the options for integrating external L3Out routable subnets from the border leaf into the Multi-Pod control plane. Spine nodes in each Pod can serve as MP-BGP VPNv4 route reflectors (RRs) to distribute routing information within the fabric. For outbound traffic from ACI toward the external L3Out, if EIGRP or OSPF is used, the endpoints in each Pod will prefer their local L3Out path. If BGP is used, the L3Out in ACI or the external routers in a Pod have the flexibility to adjust the preferred path with standard BGP attributes like AS-path prepending or local preference. If the L3Out of a Pod suffers an outage, endpoints in that Pod will fail over their L3Out path to a remote Pod via the IPN.

Multi-Site

https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739609.html

A Multi-Site deployment consists of two or more APIC cluster domains, and the policies, objects, and settings between the APIC cluster domains are orchestrated by the ACI Multi-Site Orchestrator (MSO), delivered as virtual machines or appliances in cluster mode, or even as orchestrator instances in a public cloud (3 x MSO instances form an MSO cluster). An ACI Multi-Site setup eliminates the single-fault-domain issue of Multi-Pod by creating additional APIC clusters in other sites. The functionality available across DCs in Multi-Pod and Multi-Site deployments is not identical. A Cisco Multi-Site setup can be treated as a cloud-native architecture whose virtual ACI domain can be extended to public clouds like AWS and Azure through orchestration by MSO.

MSO Deployment Options can be Physical Appliance, VM or VM on AWS / Azure (From https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739609.html)
RTT Latency Figures between MSO Node is 150 msec or less (https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739609.html)

Unlike a Multi-Pod setup connected via the IPN, in a Multi-Site setup the sites are interconnected by the Intersite Network (ISN). The link requirements for the ISN are looser than for the IPN: it can transport over any IP-routable links, with a latency requirement between MSO and the sites of less than 1 second. However, the MTU requirement is essentially the same as for the IPN (jumbo frames), because the data plane between sites is still VXLAN. In addition, if the MSO nodes are distributed across sites, the RTT latency between MSO nodes must be 150 msec or less to keep the system setting database in sync between MSO nodes.

Between sites, the control plane is based on MP-BGP EVPN and the data plane is based on VXLAN encapsulation. The spines and the ISN use OSPF routing to establish the IP-level underlay over which MP-BGP EVPN establishes TEP connections, covering the Overlay Unicast TEP (O-UTEP) and the Overlay Multicast TEP (O-MTEP).

MSO orchestrates namespace conversion through the management plane / API between sites, while the spines use MP-BGP EVPN as the control plane for the actual namespace conversion between sites (https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2020/pdf/BRKACI-2125.pdf)
(https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2020/pdf/BRKACI-2125.pdf)
Spines are responsible for keeping track of the COOP endpoint database in each site, and they use MP-BGP EVPN to exchange the COOP endpoint database across sites.

Starting from ACI 4.0(1), ACI Multi-Site supports spine-level VTEP data encryption across sites by unifying what used to be siloed IPsec / MACsec settings and the overlay VTEPs under MSO. MSO serves as the orchestrator that dispatches encryption keys to the spines for payload encryption.

CloudSec — an innovation that combines hardware encryption and the software overlay under a unified control plane, eliminating the need to manage the overlay tunnel and data encryption as separate silos (https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2020/pdf/BRKACI-2125.pdf)

As mentioned in the previous chapter, ACI has a powerful Policy-Object-Model with EPGs and contracts downloaded for hardware-based, ASIC-level policy enforcement. To keep the policy enforcement engine highly scalable without hitting hardware scalability limits, especially in a highly scalable Multi-Site setup, Cisco has introduced mechanisms such as Preferred Groups and vzAny to simplify the design and reduce the number of contracts between EPGs. This also reduces the complexity of the policy mapping between EPGs.

ACI Scaling Limits between Sites (https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2020/pdf/BRKACI-2125.pdf)

Stateful In/Out Traffic Path between DCN Fabric

In Multi-Site, L3Out routes learned on the border leaf switches are advertised into the fabric via MP-BGP. Before ACI version 4.2(1), because the L3Out in each site is managed by an individual APIC cluster, endpoints deployed in a given site could communicate with the external network domain only through a local L3Out connection.

Cisco ACI Release 4.2(1) introduces a new functionality named “Intersite” L3Out, removing the restriction shown above. All the considerations made in the remaining part of this section assume the deployment of at least a local L3Out per site.

Cisco ACI Multi-Site and L3Out connectivity (pre–Release 4.2(1)) (https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739609.html#ConnectivitytotheexternalLayer3domain)
Intersite L3Out Improve asymmetric path traffic when workloads or subnets failover to remote sites (https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2020/pdf/BRKACI-2125.pdf)

Active/Active DC with External Firewall Deployment

Standard Firewall Cluster Setup

In any active/active DC design there is usually a firewall cluster in each DC, in either active/active or active/passive mode. The firewall cluster in each DC operates with a discrete control plane, so the traffic session databases are not synchronized with each other. Therefore, asymmetric paths for inbound / outbound traffic flows between the DC fabric and the external network must be eliminated; otherwise, the firewall cluster at either site will drop the traffic due to its per-site stateful inspection behavior.

https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2020/pdf/BRKACI-2125.pdf

In its ACI whitepapers, Cisco recommends a number of setup scenarios, including the following options, to ensure symmetric inbound/outbound traffic flow in an ACI Multi-Site setup.

  1. Build a firewall cluster in each site, in either active/active or active/passive mode. Each individual endpoint within the subnet will use its new local L3Out after a site failover takes place for that endpoint. The APIC administrator should turn on the “Host Route Advertisement” option for the corresponding subnet. However, the drawback of this setup is that injecting host routes from ACI into the L3Out creates a large number of route entries, which consumes a large amount of system memory on the corresponding external network devices. Therefore, this option may not be practical in a scalable design.
  2. Build a cross-site firewall cluster in active/passive mode. All endpoints within the subnet treat the L3Out of one DC as the primary path. Endpoints will still use their original L3Out even after a site failover takes place, thanks to the intersite L3Out feature introduced in ACI 4.2(1). However, this option restricts the ACI Multi-Site (or even Multi-Pod) design to using the L3Out of one DC only, which limits the inbound/outbound bandwidth of the ACI domain.
https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2020/pdf/BRKACI-2125.pdf
https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2020/pdf/BRKACI-2125.pdf

Replace Local Firewall Cluster by Cross-Site Firewall Cluster

To ease the asymmetric traffic flow issue mentioned above, a cross-site firewall cluster seems to be an ideal and scalable setup. Recently, firewall vendors like Fortinet and Palo Alto Networks have introduced such techniques in their recent firmware versions.

Palo Alto Network Firewall — HA Clustering

Palo Alto Networks introduced an HA clustering feature in its recent PAN-OS 10.0 release: a number of Palo Alto Networks firewall models now support session state synchronization among firewalls in a high-availability (HA) cluster of up to 16 firewalls, depending on the model. The HA cluster peers synchronize sessions to protect against failure of the data center or of a large security inspection point with horizontally scaled firewalls. In the case of a network outage or a firewall going down, the sessions fail over to a different firewall in the cluster. Such synchronization is especially helpful in the following use cases.

Fortigate Firewall — Fortigate Session Life Support Protocol (FGSP)

The FortiGate Session Life Support Protocol (FGSP) is a proprietary HA solution for only sharing sessions between two entities and is based on a peer-to-peer structure. The entities could be standalone FortiGates or an FGCP cluster. If one of the peers fails, session failover occurs and active sessions fail over to the peer that is still operating. This failover occurs without any loss of data. Also, the external routers or load balancers will detect the failover and re-distribute all sessions to the peer that is still operating. FortiGates in both entities must be the same model and must be running the same firmware. FGSP supports up to 16 peer FortiGates.

Multi-Pod or Multi-Site? Which one should you go for?

Being the largest networking giant in the industry, Cisco has done well and produced a large number of whitepapers to elaborate the concepts and use cases of the two ACI active/active DC techniques. Network architects are usually asked which one is optimal and best in terms of TCO for an active/active DC setup.

https://www.cisco.com/c/dam/en/us/td/docs/switches/datacenter/aci/aci_multi-site/sw/1x/fundamentals/Multi-Pod_vs_Multi-Site_Infographic.pdf

However, when we dig into the philosophy of the ACI Multi-Pod and Multi-Site designs and compare the concepts and terminology of ACI with a public cloud platform like Azure, we realize that Multi-Pod essentially harnesses the concept of an “Availability Zone” while Multi-Site harnesses the concept of a “Region”.

https://docs.microsoft.com/en-us/azure/availability-zones/az-overview
https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2020/pdf/BRKACI-2125.pdf

To go deeper into advanced architectural design of Cisco ACI, network architects can combine Multi-Pod and Multi-Site into one setup, thus enhancing the overall availability of the DCN infrastructure.

https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739609.html

Virtualized ACI Leaf Switch on Hypervisor

ACI Virtual Edge (AVE)

Like the Nexus 1000V virtual switch before it, the Cisco ACI team developed a virtual switch for the hypervisor layer called the ACI Virtual Edge (AVE). It targets hypervisors from different vendors like VMware, Linux KVM, and Microsoft Hyper-V, but at the moment of writing this document, AVE runs on the VMware platform only.

The Cisco datasheet introduces AVE as follows:

“Cisco ACI Virtual Edge is virtual network edge offering of the Cisco ACI portfolio and the next generation of Cisco Application Virtual Switch software. Virtual Edge extends the Cisco ACI policy model, security, and visibility to virtual infrastructure and provides policy consistency for the virtual domain.

Virtual Edge extends the Cisco ACI policy model to existing infrastructure, providing investment protection. It eliminates the VLAN configuration burden in blade and Fabric Interconnect deployments, reducing OpEx and time to value. Virtual Edge also enables Cisco ACI to be extended to bare metal clouds and offers consistent policies across on-premises and cloud applications.”

Diagram from https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2019/pdf/BRKDCN-2044.pdf
Diagram from https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2019/pdf/BRKDCN-2044.pdf

The AVE feature list from the datasheet follows. After scanning the datasheet, the major feature of AVE is to extend the ACI policy model, including EPGs and contracts, down to the hypervisor level. The ACI spine-leaf architecture with its VXLAN functions can also be extended to the hypervisor layer by AVE.

Information from https://www.cisco.com/c/en/us/products/collateral/switches/application-centric-infrastructure-virtual-edge/datasheet-c78-740249.html

What did VMware do about Cisco AVE?

As mentioned before, the good relationship between Cisco and VMware broke down after VMware’s acquisition of Nicira in 2012. In response to Cisco ACI’s SDN strategy, VMware issued a document, “Cisco ACI AVE/VMM Mode Support in a VMware environment (57780)”.

In the document, VMware stated that VMM/AVE (both VMM and AVE together? or either one?) was developed outside of any formal partner program and therefore is not supported by VMware. Consequently, for support requests that relate directly to the ACI VMM/AVE component, VMware will request that the Cisco VMM/AVE component be removed. From the document, it looks like VMware tries to treat VMM and AVE as one thing rather than two individual matters.

From https://kb.vmware.com/s/article/57780

For details of ACI VMM integration with VMware, please refer to the following:

From https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2019/pdf/PSODCN-2555.pdf

At the time of writing this paper, Cisco ACI VMM integration can work with a variety of hypervisor and container platforms, ranging from VMware and Microsoft to Red Hat OpenShift, Kubernetes, and even RancherOS.

How to create VMM Domain and Integration with VCenter

How did Cisco ACI respond to VMware’s point of view?

In response to VMware’s point of view, Cisco issued the following document.

Cisco Support Statement for Cisco ACI with the VMware Hypervisor Suite:

https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/kb/Support-Statement-Cisco-ACI-Virtual-Edge.html

In the document, Cisco stated that ACI VMM and ACI AVE are different things: VMM leverages VMware’s public API to map the VMware-side VDS port-group attributes to the Cisco ACI-side EPG attributes. For Cisco AVE, however, Cisco does not say much to counter VMware’s point of view. In the market, not many VMware vCenter users actually dare to deploy Cisco ACI AVE on their VMware hypervisors. Without ACI AVE, Cisco cannot intercept and control east-west traffic flows when both workloads reside on the same hypervisor. (This is actually why VMware tries to eliminate Cisco ACI from the hypervisor layer.)

VMware’s statement makes it difficult for Cisco ACI to extend its Policy Object Model and EPG / contract policy enforcement into the VMware hypervisor, especially when customers ask for micro-segmentation of workloads under the same EPG, because ACI contract / policy enforcement cannot extend to the port-group level in the VMware ESXi VDS. Therefore, customers need to look for an alternative approach to complete the story of application segmentation for the data center network.

Cisco Tetration

Nowadays, determining an application segmentation policy is not easy. It requires a holistic view of the application dependency graph to visualize the whole picture.

Cisco claims that the Cisco Tetration platform addresses workload and application security challenges by providing micro-segmentation and behavior-based anomaly detection capabilities across a hybrid cloud infrastructure.

One of the major use cases for Tetration is data center workload protection, covering the following functions:

Workload protection: The Cisco Tetration platform enables holistic workload protection for multicloud data centers by using:

◦ Allow list-based segmentation for implementing a zero-trust model

◦ Behavior baselining, analysis, and identification of deviations for processes

◦ Detection of common vulnerabilities and exposures associated with the software packages installed on the servers

◦ The ability to act proactively, such as quarantining servers when vulnerabilities are detected and blocking communication when policy violations are detected

Tetration can be deployed on-premises or on a public cloud like AWS.

Tetration collects telemetry data from a variety of sensors, including software sensors installed on operating systems, ERSPAN sessions from hypervisors, and NetFlow / IPFIX sensors on compatible network devices like Cisco routers and switches. The latest Nexus 9000 switches can act as embedded hardware sensors, so Tetration can collect data from the embedded network sensors installed on compatible Nexus 9000 switches.
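To make the idea behind application dependency mapping concrete, here is a toy, vendor-neutral sketch that reduces observed flow records into an allow-list of group-to-group rules. This is not Tetration code or its API; the flow records and IP-to-group mapping are hard-coded assumptions purely for illustration of what flow telemetry can be distilled into.

```python
# Toy, vendor-neutral sketch of application dependency mapping: aggregate observed
# flow records into an allow-list of (source group, destination group, port) tuples
# that could seed segmentation policy. NOT Tetration code -- illustration only.
from collections import defaultdict

# Each flow record: (src_ip, dst_ip, dst_port). In practice these would come from
# software sensors, ERSPAN, or NetFlow/IPFIX exports; here they are hard-coded.
flows = [
    ("10.0.1.10", "10.0.2.20", 3306),   # web -> db
    ("10.0.1.11", "10.0.2.20", 3306),   # web -> db
    ("10.0.3.5", "10.0.1.10", 443),     # lb -> web
]

# Assumed mapping of IPs to workload groups (normally derived from inventory or labels).
group_of = {
    "10.0.1.10": "Web", "10.0.1.11": "Web",
    "10.0.2.20": "DB",
    "10.0.3.5": "LB",
}

allow_list = defaultdict(int)
for src, dst, port in flows:
    allow_list[(group_of[src], group_of[dst], port)] += 1

for (src_grp, dst_grp, port), hits in sorted(allow_list.items()):
    print(f"permit {src_grp} -> {dst_grp} tcp/{port}  (seen {hits} flows)")
```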

https://www.cisco.com/c/en/us/products/collateral/data-center-analytics/tetration-analytics/datasheet-c78-737256.html
https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2018/pdf/PSOACI-4591.pdf
https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2018/pdf/PSOACI-4591.pdf

Tetration can visualize what is running in the data center.

https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2018/pdf/PSOACI-4591.pdf
https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2018/pdf/PSOACI-4591.pdf
https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2018/pdf/PSOACI-4591.pdf

Summary

I hope this document can serve as an overview and help demystify vendor-based SDN solutions, in particular Cisco Application Centric Infrastructure (ACI).
