Cite
Huawei Cloud Stack 8.3.1 Solution Description
Additional Readings
1. Overview
- FusionSphere OpenStack
- consolidate resources in each physical DC
- ManageOne
- manages multiple DCs
- Services
- Compute services
- Elastic Cloud Server (ECS)
- Bare Metal Server (BMS)
- Image Management Service (IMS)
- Auto Scaling (AS)
- Storage services
- Elastic Volume Service (EVS)
- Block storage
- Scalable File Service (SFS)
- NFS, CIFS
- Object Storage Service (OBS 3.0)
- HTTP, HTTPS
- Scalable File Service Turbo (SFS Turbo)
- NAS
- Network services
- Virtual Private Cloud (VPC)
- Source Network Address Translation (SNAT)
- Elastic IP (EIP)
- Elastic Load Balance (ELB)
- Distribute traffic across ECSs
- Network ACL
- Security service for VPCs
- Virtual Private Network (VPN)
- Direct Connect
- Dedicated network
- VPC Endpoint (VPCEP)
- Secure and private channel to VPCs without EIPs
- Cloud Connect (CC)
- Connect VPCs in different regions
- CloudDNS
- Enterprise Networking Service (ENS)
- Security services
- Database Audit Service (DBAS)
- Key Management Service (KMS)
- Web Application Firewall (WAF)
- Host Security Service (HSS)
- Cloud Firewall 2.0 (CFWforHCS)
- Anti-DDoS
- Cloud Bastion Host (CBH)
- Cloud Secret Management Service (CSMS)
- Data Encryption Workshop (DEW)
- SecMaster
- Platform Bastion Host (PBH)
- DR (Disaster Recovery) and backup services
- Volume Backup Service (VBS)
- EVS disks backup
- Cloud Server Backup Service (CSBS)
- eBackup, OceanProtect
- Cloud Server Disaster Recovery (CSDR)
- Cloud Server High Availability (CSHA)
- Volume High Availability (VHA)
- Container services
- Cloud Container Engine (CCE)
- K8S service
- Software Repository for Container (SWR)
- Manage container images
- Application services
- Simple Message Notification (SMN)
- ROMA Connect
- Message, data, API, device
- Distributed Cache Service (DCS)
- In-memory cache compatible with Redis
- Application Performance Management (APM)
- Performance monitoring
- Application Operations Management (AOM)
- Health monitoring
- Log Tank Service (LTS)
- Logs
- ServiceStage
- Astro Zero
- CodeArts services
- …
- Enterprise Intelligence (EI) services
- …
- Database services
- GaussDB
- Distributed relational database
- Data Replication Service (DRS)
- Database migration, real-time synchronization
- Relational Database Service (RDS)
- Management services
- …
- IoT services
- …
- Enterprise application services
- …
- Common components
- Linux Virtual Server (LVS)
- Nginx
- Network Time Protocol (NTP)
- HAProxy
- API Gateway
- TaskCenter
- DNS
- Service Detail Record (SDR)
- Cloud Configuration Service (CCS)
- Deploy Management Kit (DMK)
- GaussDB
- EulerOS
- Cloud management
- ManageOne
- eSight
- FusionCare
- HCS ServiceLink
- …
2. Application Scenarios
- Principles for regions or global zone deployments
- Global zone managed by ManageOne and Identity and Access Management (IAM)
- Region
- Physical data center
- Each region completely independent
- High fault tolerance and stability
- Cannot change region of created resources
- AZ
- Multiple in a single DC, physically isolated but interconnected through internal networks
- Physical location using independent power supplies and networks
- AZs are independent
- Resource pool
3. Architecture
4. System Security
- System security
- Virtualization management layer
- Difficult to track malicious users (because on-demand and self-service allocation of resources)
- North-south/Vertical traffic - between external and internal
- East-west/Horizontal traffic - between internal
5. Infrastructure and Resource Pools
- Infrastructure
- OpenStack
- Nova (compute resource management)
- Manage compute resource
- Cinder (block storage management)
- Persistent block storage
- Swift (object storage management)
- Scalable, redundant storage system
- Glance (image management)
- VM image management
- Keystone (identity management)
- Authentication, service rules management, token management
- Implements OpenStack identity API
- Heat (service orchestration)
- Ceilometer (telemetry)
- Ironic (bare metal provisioning)
- APIs for physical machines with no OS installed
- Service OM
- Operation and maintenance
- Virtualized pool
- Connects KVM compute nodes
- Bare metal server pool
- Connects bare metal server nodes
- Block storage pool
- Connects block storage devices
- Cloud management
- ManageOne structure
- VDC tenant model
- Tenant
- Enterprise or subsidiary, isolated
- VDC
- Department or subsidiary, up to 5 levels
- Resource space
- Can be assigned to specific user
- Quota
- Can be unlimited to all within whole ManageOne/VDC
- User group
- Authorization
6. Cloud Management
- Cloud Management
- Architecture/Management
- Service Builder
- Create ECS, BMS, CCE
- Bind an EIP
- BMS can create IMS, EVS, ELB, VPC
- Approval process management
- VDC Quotas
- 1:1.2 (headcount:quantity)
- Based on CPUs and RAM
- Based on SLAs (e.g. SATA, SSD)
- Pricing for quotas
- Operation
- O&M user groups
- Administrators
- SecurityAdministrators
- AuditManagers
- Read Only User Group
- North User Group
- Custom
- Monitoring
- “Dimensions”, e.g. region, resource pool, AZ, cluster/host group, etc.
- “Indicator”, e.g. CPU usage, supports aggregation
- “Canvas”, UI for viewing metrics
- Alarm severities, event types, modify based on scenarios
- Dashboard
- UI elements and functions
- Data source → Elasticsearch (alarms, performance, capacity, resources, services) → data set
- Visualize by charts
- Combined into dashboards
- Use for monitoring, demonstration
- Big Data Application Monitoring
- HBase
- Column-based distributed storage system
- GaussDB 200
- Relational database for large-scale parallel processing
- Hive
- Data warehouse built on Hadoop
- Business Application Monitoring
- Data of middleware, containers, cloud resources, servers, storage devices, network devices
- Data provided to data analysis platform, resource management, alarm warehouse
- Aggregation algorithm
- Computing module → stats, e.g. topologies, resources, alarms
- Cloud Service Monitoring
- cAdvisor
- URL test
- ≤ 500 URL test tasks, ≤ 5 tasks for each tenant application, ≤ 200 tasks for a test point
- Metrics threshold
- Provided to Elasticsearch for further operation
- Resource Management
- Configuration Management Database (CMDB)
- Manages data of devices and systems
- Producer, consumer, automatic discovery
- Topology Management
- To manage network, automatic discovery
- Physical, Virtual, Business topology
- Topology objects
- Nodes
- Identify a managed device
- Physical nodes
- Virtual nodes (not managed)
- Links
- Connection between nodes
- Physical links
- Virtual links (not managed)
- Groups
- Divide into smaller network groups by some criteria
- Automatic Jobs (AutoOps)
- Agile O&M automation platform
- Manage ECS, BMS, host machine, management VM
- e.g. patch installation in batch, password change in batch
- Resource Analysis
- Resource pool
- VDC
- Scenario-specific analysis
- Capacity
- Bottleneck
- Idleness
- My Reports
- Drill up, down, through; periodic report
- Health Assurance
- Health check
- Log management
- Run logs → SFTP server
- Tenant operation logs, management operation logs → Elasticsearch server
- Troubleshooting
- Call chain
- Certificates
- Query, replacement, revocation, registration
- Certificate Authority (CA)
- Root, own public key
- Issuing, managing certificates
- Class A → human-machine interaction
- Class B → interaction between different solutions (e.g. ManageOne, FusionSphere)
- Class C → interaction between different (same? doc typo?) solutions (e.g. ManageOne component)
- Accounts
- Account query, management
- Authorization records
- Account request
- Password management
- Data Backup
- SFTP, SSHv2
- Full/Incremental backup
- System Management
- System integration
- SNMP (TCP/IP)
- User management
- License management
- CA service
- Root CA
- Self-signed, no need to be verified by other CAs
- Same subject and issuer
- Multiple subordinate CAs
- Signed by root/other subordinate CAs
- Different subject and issuer
- Certificate Revocation List (CRL)
- Certificate chain
- Public key infrastructure (PKI)
- One-way/Two-way TLS
- …
- Task center
- SNMP Alarm API
- Integration Gateway
- RemoteNotifyService
- Personal Settings
- Broadcast Message
- Personalized Customization
- Operations Command Center
- Workspace, resource management, data analytics, etc.
- Multi-Cloud Management
- Cloud federation
- CloudGateway
- VPN
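The CA notes in this section (a self-signed root has the same subject and issuer; subordinate CAs have a different subject and issuer and form a certificate chain) can be illustrated with a minimal sketch. It only follows issuer names; it does not verify signatures or consult a CRL, and the `Cert` class and all certificate names are hypothetical, not ManageOne objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    """Toy certificate: only the two name fields from the notes."""
    subject: str
    issuer: str


def is_root(cert):
    # A root CA certificate is self-signed: subject and issuer are the same.
    return cert.subject == cert.issuer


def build_chain(leaf, known_certs):
    """Follow issuer links from a leaf certificate up to a self-signed root."""
    by_subject = {c.subject: c for c in known_certs}
    chain = [leaf]
    seen = {leaf.subject}
    current = leaf
    while not is_root(current):
        parent = by_subject.get(current.issuer)
        if parent is None:
            raise ValueError(f"issuer not found: {current.issuer}")
        if parent.subject in seen:
            raise ValueError("issuer loop detected")
        chain.append(parent)
        seen.add(parent.subject)
        current = parent
    return chain


# Hypothetical names for illustration only.
root = Cert("Example Root CA", "Example Root CA")
sub = Cert("Example Subordinate CA", "Example Root CA")
leaf = Cert("service.example", "Example Subordinate CA")

chain = build_chain(leaf, [root, sub])
print(" -> ".join(c.subject for c in chain))
```

A real validator would also check each signature against the issuer's public key and reject any certificate listed in the CRL.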
7. Compute Services
- Compute Services
- ECS
- Consists of vCPUs, memory, EVS disks
- Works with VPC, CSBS, EIP
- Virtualization
- No secondary virtualization
- No audio adapters
- Controlled by console via combined API, or directly by API
- Infra controlled by OpenStack
- BMS
- Compared to ECS
- No feature loss
- No performance loss
- Have exclusive resources
- Have local storage
- Non-permanent, use EVS for that (high stability and reliability, can be used for backup also)
- SPOF (single point of failure)
- Stable I/O, high throughput
- Compared to physical server
- Have EVS, VPC, remote, monitoring, etc
- Can use image
- High-speed layer 2 horizontal networking by creating a dedicated VLAN sub-interface in the BMS OS
- Usually used for core database, high-performance computing scenario, security-demanding scenario (data isolation, controllability, traceability)
- IMS
- Public, private, shared images
- AS
- ECS as instance in AS group
- AS policy can trigger a scaling action, by
- Alarm
- Periodic
- Scheduled
8. Storage Services
- Storage Services
- EVS, EVS Cloud Block Storage
- Elastic attaching/detaching
- Various disk types
- Scalability (≤ 64 TB)
- Snapshot
- Shared disk
- e.g. relational database, < 1 ms, 2000-20000 IOPS/TB, RAID 6
- e.g. data warehouse, 1-3 ms, 500-4000 IOPS/TB, RAID 5
- e.g. deployment/test, 10-20 ms, 5-25 IOPS/TB, RAID 6
- NFS, CIFS, DPC
- SFS
- NFS, CIFS, DPC, FNS&DPC
- OBS 3.0
- No space limitation
- HTTPS
- ACL
- Object
- Data and metadata, stored in buckets
- Name as UTF-8 key up to 1024 char
- Metadata
- System-defined metadata
- Date, content-length, last-modified, ETag
- Custom metadata
- Using "/" as folder and stored as a key
- Buckets
- Containers for objects
- Max 100 buckets; no restriction on the number or total size of objects
- Parallel File System (PFS)
- Access Keys (AK/SK)
- Access key ID (AK), secret access key (SK)
service_name.region0_id.external_global_domain_name
- SFS Turbo
- Scalable, high performance NAS, used in ECS, BMS, CCE, CCI
- Can share across AZs
- NFSv3 only
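The OBS note that "/" acts as a folder while everything is still stored as a key can be sketched with a plain in-memory list of keys. This is only an illustration of prefix-and-delimiter listing over a flat key space; `list_folder` and the sample keys are hypothetical and this is not the OBS API.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Return (objects, subfolders) found directly under `prefix`."""
    objects = []
    folders = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the folder placeholder key itself
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something deeper exists: report only the first path segment.
            folders.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(folders)


# Hypothetical bucket contents: a flat set of keys, no real directories.
keys = [
    "logs/",
    "logs/2024/app.log",
    "logs/2024/db.log",
    "logs/readme.txt",
    "index.html",
]

print(list_folder(keys))
print(list_folder(keys, prefix="logs/"))
```

The folder view is derived entirely from the key names, which is why a "folder" can exist as nothing more than a key ending in "/".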
9. Network Services
- VPC
- Concepts
- EIP
- Subnets
- CIDR block
- Layer 3
- By default, different subnets can communicate in same VPC
- Security groups, ACLs
- Route tables
- Gateways
- VPN
- NAT, SNAT, DNAT
- Direct connect
- VPC peering
- DHCP
- …
- Constraints
- 1200 compute nodes
- same subnet layer 2, different subnet layer 3
- 1 NAT gateway, consumes 1 IP
- 1 SNAT rule, 20 EIPs
- 200 DNAT for 1 NAT gateway
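The constraint "same subnet layer 2, different subnet layer 3" can be checked with Python's standard `ipaddress` module. The VPC CIDR block, the two /24 subnets, and the helper name `path_between` are assumptions chosen for illustration.

```python
import ipaddress

# Hypothetical VPC CIDR block split into /24 subnets.
vpc = ipaddress.ip_network("192.168.0.0/16")
subnets = list(vpc.subnets(new_prefix=24))[:2]  # 192.168.0.0/24, 192.168.1.0/24


def path_between(ip_a, ip_b, nets):
    """Report whether two addresses share a subnet (layer 2) or not (layer 3)."""
    a = ipaddress.ip_address(ip_a)
    b = ipaddress.ip_address(ip_b)
    for net in nets:
        if a in net and b in net:
            return "layer 2 (same subnet)"
    return "layer 3 (routed between subnets)"


print(path_between("192.168.0.10", "192.168.0.20", subnets))
print(path_between("192.168.0.10", "192.168.1.20", subnets))
```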
- EIP
- Static IP on extranet (WAN/LAN)
- Directly accessed through Internet
- Not exposed to the resource
- Mapped to instance by NAT
- Dedicated/Shared bandwidth
- Virtual IP Address (VIP)
- ELB
- Listener
- Checks for connection requests using a protocol and port for connections from clients to the load balancer, and a protocol and port for connections from the load balancer to backend cloud server
- Algorithms
- Weighted round robin
- Weighted least connections
- Source IP hash
- Must connect to same server
- Sticky session
- For round robin and least connections
- Types
- HTTP cookie
- Max 1 hour
- Application cookie
- Fixed 24 hour
- Source IP address
- Not applicable for proxy or NAT
- Supported only by TCP and UDP listeners
- Max 1 hour
- Health check
- TCP, UDP, ICMP, HTTP, HTTPS
- Backend server group
- Slow start
- Backend server exits when
- Slow start duration elapses
- It becomes unhealthy
- Only works with HTTP/HTTPS backends
- Only works with round robin
- Priority group
- Traffic distributed only to activated high priority group
- If servers in activated high-priority group < minimum available servers
- Activate the priority group with the highest priority
- If servers in activated high-priority group >= minimum available servers AND priority > activated priority group
- Deactivate priority group
- Advantages
- High availability
- Security
- Performance
- Flexibility
- Low cost
- Easy upgrade
- Work together with AS to achieve capacity expansion
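Two of the ELB algorithms listed above, weighted round robin and source IP hash, can be sketched as follows. The notes do not say how ELB implements them, so the smooth weighted round-robin variant and the SHA-256 hash used here are assumptions, and the server names and weights are hypothetical.

```python
import hashlib


def smooth_weighted_round_robin(weights, count):
    """Return `count` picks; each server is chosen in proportion to its weight."""
    current = {server: 0 for server in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(count):
        # Every round, each server gains its weight ...
        for server, weight in weights.items():
            current[server] += weight
        # ... and the current leader is picked and pushed back by the total.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


def pick_by_source_ip(client_ip, servers):
    """Map a client IP to one server; the same IP always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


weights = {"ecs-a": 2, "ecs-b": 1}  # hypothetical backend servers
picks = smooth_weighted_round_robin(weights, 6)
print(picks)

servers = sorted(weights)
print(pick_by_source_ip("203.0.113.7", servers))
```

Because source IP hash already pins a client to one server, sticky sessions are only needed for the round robin and least connections algorithms, as noted above.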
- Network ACL
- Security service for VPCs
- Controls access to subnets
- Supports whitelists and blacklists
- Based on inbound and outbound ACL rules associated with subnets
- FWaaS v2
- Permit, Deny, Reject
- By source IP, destination IP, source port, destination port
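Network ACL rules match on source IP, destination IP, source port, and destination port, and end in Permit, Deny, or Reject. A minimal sketch of that matching is shown below; first-match-wins ordering and the default-deny fallback are assumptions not stated in the notes, and the `Rule` class is hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str                      # "permit", "deny", or "reject"
    src: str = "0.0.0.0/0"           # source CIDR
    dst: str = "0.0.0.0/0"           # destination CIDR
    src_port: Optional[int] = None   # None matches any port
    dst_port: Optional[int] = None

    def matches(self, src_ip, dst_ip, sport, dport):
        return (
            ipaddress.ip_address(src_ip) in ipaddress.ip_network(self.src)
            and ipaddress.ip_address(dst_ip) in ipaddress.ip_network(self.dst)
            and self.src_port in (None, sport)
            and self.dst_port in (None, dport)
        )


def evaluate(rules, src_ip, dst_ip, sport, dport, default="deny"):
    """Return the action of the first matching rule (assumed ordering)."""
    for rule in rules:
        if rule.matches(src_ip, dst_ip, sport, dport):
            return rule.action
    return default  # assumed fallback when nothing matches


rules = [
    Rule("permit", src="10.0.0.0/24", dst_port=443),
    Rule("reject", dst_port=22),
]

print(evaluate(rules, "10.0.0.5", "192.168.1.10", 50000, 443))
print(evaluate(rules, "172.16.0.9", "192.168.1.10", 50000, 22))
print(evaluate(rules, "172.16.0.9", "192.168.1.10", 50000, 80))
```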
- VPN
- VPN gateway, remote gateway
- 1-to-n (to remote gateway)
- VPN connection with IPsec
- RFC 2409
- IKE protocol
- RFC 4301
- IPsec architecture
- Applications
- VPC ←> local data center
- VPC ←> multiple local data centers
- VPC ←> VPC (cross-region)
- Direct Connect
- Basic
- VM-based
- Enhanced
- Hardware switches
- VPC ←> virtual gateway, virtual interface ←(virtual connection)> direct connection zone ←(physical connection)> local data center
- Same applications as VPN
- VPC Endpoint (VPCEP)
- Gateway endpoint services
- Only for cloud services
- Interface endpoint services
- Private services
- Applications
- Access cloud services behind VPC which is connected via VPN/Direct Connect
- Communicate between separate VPCs in a region
- Cloud Connect (CC)
- Connect between VPCs across regions
- CloudDNS
- Private DNS
- A, CNAME, AAAA, MX, TXT, SRV, NS, SOA
- PTR
- Reverse resolution
- Time-to-live (TTL)
- Cache
- Enterprise Networking Service (ENS)
- Networking between clouds
- Architecture
- Orchestration layer
- ENS console
- Global controller
- Calculates and distributes resources to local controllers
- Local controllers
- Manage local connection gateways, delivers configurations, interacts with APIs in the same region
- Connection gateways
- Connects networks across regions and resource pools
10. Security Services
11. DR and Backup Services
12. Container Services
- Cloud Container Engine (CCE)
- Kubernetes services
- Integrated with ECS, VPC/EIP/ELB, EVS/OBS/SFS
- Multi-AZ and multi-region disaster recovery to ensure high availability
- Kubernetes Certified Service Providers (KCSPs)
- Architecture
- Developer → (Console, Kubectl, Kubernetes APIs) ←> (Master → (Node 1, …, Node N))
- Helm charts
- Redundancy
- 5 master nodes → tolerates 2 failures
- 3 master nodes → tolerates 1 failure
- Auto scaling
- Application Operations Management (AOM)
- Microservices
- Istio service mesh
- Application Performance Management (APM)
- CI/CD
- Kubernetes Workload (abstract model of a group of pods)
- Deployment
- Pods are completely independent, functionally identical
- Auto scaling, rolling upgrade
- e.g. Nginx, WordPress
- StatefulSet
- Stable persistent storage
- Ordered deployment and deletion
- e.g. MySQL-HA, etcd
- DaemonSet
- All or some nodes run one pod
- e.g. Ceph, Fluentd, Prometheus Node Exporter
- Job
- One-time task that runs to completion
- Cron job
- Runs Jobs on a recurring schedule
- Orchestration Template
- Deploy and manage multi-container applications
- Namespace
- Collection of resources and objects
- Multiple in a single cluster, data isolated
- Service
- Expose a group of applications on pods as networked services
- ClusterIP, NodePort, LoadBalancer
- Layer-7 Load Balancing (Ingress)
- Network Policy
- Use label selectors to simulate traditional segmented networks and control traffic
- ConfigMap
- Key-value store
- Secret
- Can be used as volume or environment variable
- Label, Label Selector
- Annotation
- Like label but without strict naming rules
- PersistentVolume (PV)
- PersistentVolumeClaim (PVC)
- Pods consume node resources, request CPU and memory
- PVCs consume PV resources, request data volumes (specific size and access mode)
- Auto Scaling - HPA (Horizontal Pod Autoscaling)
- ReplicationController
- Affinity
- Containers are scheduled onto the nearest node
- Reduce performance loss due to slow routing
- Anti-Affinity
- To achieve higher availability
- Avoid SPOF
- Node Affinity
- Affinity labels to schedule pods to specific nodes
- Node Anti-affinity
- Anti-affinity labels to prevent pods from being scheduled to specific nodes
- Pod Affinity
- Pod Anti-affinity
- Resource Quota
- Resource Limit (LimitRange)
- Add resource limit to a namespace
- ENV
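The redundancy figures in the CCE notes (5 master nodes survive 2 failures, 3 survive 1) are consistent with a majority-quorum rule. The sketch below assumes that rule; the notes themselves give only the two data points, and the function names are illustrative.

```python
def quorum(nodes):
    """Smallest strict majority of `nodes` members."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many members can fail while a majority is still available."""
    return nodes - quorum(nodes)


for n in (3, 4, 5):
    print(n, "masters -> quorum", quorum(n), ", tolerates", tolerated_failures(n))
```

Under this assumption a fourth master adds no extra fault tolerance over three, which is one reason odd node counts are common.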
- Istio-based Application Service Mesh (ASM)
- Microservice governance
- Works with Application Performance Management (APM)
- Software Repository for Container (SWR)
- Full lifecycle management
- Push, pull, deletion
- Private image repository and access control
- Permission for read, write, edit
- Large scale image distribution acceleration
- Image pull acceleration technology
- Automatic deployment update through triggers
13. Application Services
- Simple Message Notification (SMN)
- One-to-multiple message subscriptions and notifications
- Publisher → SMN (topics) → Subscriber
- SMTP, SMPP3_4, CMPP2_x, CMPP3_x
- Topic
- Uniform Resource Name (URN) for identifying a topic
- Subscriber
- Phone
- HTTPS
- Message template
- ROMA Connect
- Fast Data Integration (FDI)
- Service Integration (API Connect) (APIC)
- Message Integration (Message Queue Service) (MQS)
- Kafka
- RocketMQ
- Device Integration (LINK)
- Message Queuing Telemetry Transport (MQTT)
- Composite Application
- Distributed Cache Service (DCS)
- Online, distributed, fast in-memory cache service, compatible with Redis
- Instance Types
- Single-node Redis
- Low system overhead, high QPS
- Fault recovery (< 30s)
- Out-of-the-box usability, no data persistence
- Low-cost
- Master/Standby Redis
- Data persistence, high reliability
- Data synchronization
- Automatic master/standby switchover
- Disaster Recovery (DR) policies
- Proxy Cluster Redis
- Linux Virtual Server (LVS) and proxies to achieve high availability
- Redis Cluster
- Sharding
- 16384 slots per Redis Cluster
- CRC16(key) % 16384 gives the hash slot, which determines the Redis node
- Read/Write Splitting Redis
- Implemented on server, proxies distinguish between read/write and forward request
- Suitable for high read concurrency and few write requests
- ServiceStage
- Application Operations Management (AOM)
- Log Tank Service (LTS)
- Application Performance Management (APM)
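The Redis Cluster sharding rule noted under DCS (16384 slots, CRC16 % 16384) can be reproduced with the standard library. This sketch assumes DCS follows the open-source Redis Cluster specification, which uses the CRC16 XMODEM variant and hashes only the `{...}` hash tag when a key contains one; `binascii.crc_hqx` with an initial value of 0 computes that CRC variant.

```python
import binascii

SLOTS = 16384


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that is hashed (Redis Cluster hash-tag rule)."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            return key[start + 1:end]  # non-empty {tag}: hash only the tag
    return key


def key_slot(key) -> int:
    """Map a key to one of the 16384 hash slots."""
    data = key.encode("utf-8") if isinstance(key, str) else key
    return binascii.crc_hqx(hash_tag(data), 0) % SLOTS


print(key_slot("123456789"))
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

Each node in the cluster owns a range of slots, so the slot number decides which node serves the key; keys sharing a hash tag land in the same slot and can be used together in multi-key operations.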
14. CodeArts Services
15. Database Services
- Relational Database Service (RDS)
- MySQL only
- Performance monitoring system, multi-level security measures, professional database management platform
- GaussDB
- Data Replication Service (DRS)