· ☕ 3 min read
https://istio.io/latest/docs/ops/configuration/traffic-management/tls-configuration/
Sidecars
Sidecar traffic has a variety of associated connections. Let’s break them down one at a time.
Sidecar proxy network connections
- External inbound traffic: This is traffic coming from an outside client that is captured by the sidecar. If the client is inside the mesh, this traffic may be encrypted with Istio mutual TLS. By default, the sidecar will be configured to accept both mTLS and non-mTLS traffic, known as PERMISSIVE mode. The mode can alternatively be configured to STRICT, where traffic must be mTLS, or DISABLE, where traffic must be plaintext. The mTLS mode is configured using a PeerAuthentication resource.
- Local inbound traffic: This is traffic going to your application service, from the sidecar. This traffic will always be forwarded as-is. Note that this does not mean it’s always plaintext; the sidecar may pass a TLS connection through. It just means that a new TLS connection will never be originated from the sidecar.
- Local outbound traffic: This is outgoing traffic from your application service that is intercepted by the sidecar. Your application may be sending plaintext or TLS traffic. If automatic protocol selection is enabled, Istio will automatically detect the protocol. Otherwise you should use the port name in the destination service to manually specify the protocol.
- External outbound traffic: This is traffic leaving the sidecar to some external destination. Traffic can be forwarded as is, or a TLS connection can be initiated (mTLS or standard TLS). This is controlled using the TLS mode setting in the trafficPolicy of a DestinationRule resource. A mode setting of DISABLE will send plaintext, while SIMPLE, MUTUAL, and ISTIO_MUTUAL will originate a TLS connection.
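The mode rules above can be summarized in a few lines of Python. This is only a sketch of the behavior as the text describes it, not Istio's implementation; the function names `inbound_accepts` and `outbound_originates_tls` are hypothetical.

```python
# Sketch of the mode semantics described above (not Istio code).
# Function names are hypothetical and exist only for illustration.

PEER_AUTH_MODES = {"PERMISSIVE", "STRICT", "DISABLE"}
ORIGINATING_TLS_MODES = {"SIMPLE", "MUTUAL", "ISTIO_MUTUAL"}


def inbound_accepts(peer_auth_mode: str, is_mtls: bool) -> bool:
    """Would external inbound traffic be accepted under this PeerAuthentication mode?"""
    if peer_auth_mode not in PEER_AUTH_MODES:
        raise ValueError(f"unknown PeerAuthentication mode: {peer_auth_mode}")
    if peer_auth_mode == "PERMISSIVE":
        return True          # both mTLS and non-mTLS traffic are accepted
    if peer_auth_mode == "STRICT":
        return is_mtls       # traffic must be mTLS
    return not is_mtls       # DISABLE: traffic must be plaintext


def outbound_originates_tls(tls_mode: str) -> bool:
    """Does the sidecar originate TLS for this DestinationRule TLS mode?"""
    if tls_mode == "DISABLE":
        return False         # plaintext is sent
    if tls_mode in ORIGINATING_TLS_MODES:
        return True          # a TLS connection is originated
    raise ValueError(f"unknown TLS mode: {tls_mode}")


if __name__ == "__main__":
    for mode in sorted(PEER_AUTH_MODES):
        print(mode, inbound_accepts(mode, is_mtls=True), inbound_accepts(mode, is_mtls=False))
```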
The key takeaways are:
· ☕ 1 min read
https://istio.io/v1.4/docs/tasks/security/authentication/mtls-migration/
Ensure that your cluster is in PERMISSIVE mode before migrating to mutual TLS. Run the following command to check:
In PERMISSIVE mode, the Envoy sidecar relies on the ALPN value istio to decide whether to terminate the mutual TLS traffic. If your workloads (without Envoy sidecar) have enabled mutual TLS directly to the services with Envoy sidecars, enabling PERMISSIVE mode may cause these connections to fail.
· ☕ 2 min read
SPIFFE
The old-school, official SPIFFE method:
https://blog.envoyproxy.io/securing-the-service-mesh-with-spire-0-3-abb45cd79810
Workload
A workload is a single piece of software, deployed with a particular configuration for a single purpose; it may comprise multiple running instances of software, all of which perform the same task. The term “workload” may encompass a range of different definitions of a software system, including:
- A web server running a Python web application, running on a cluster of virtual machines with a load-balancer in front of it.
- An instance of a MySQL database.
- A worker program processing items on a queue.
- A collection of independently deployed systems that work together, such as a web application that uses a database service. The web application and database could also individually be considered workloads.
SPIFFE ID
A SPIFFE ID is a string that uniquely and specifically identifies a workload. SPIFFE IDs may also be assigned to intermediate systems that a workload runs on (such as a group of virtual machines). For example, spiffe://acme.com/billing/payments is a valid SPIFFE ID.
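As a rough illustration, the example ID above can be split into its parts with the standard library. The term "trust domain" for the host-like part comes from the SPIFFE specification rather than from the text above, and `parse_spiffe_id` is a hypothetical helper that performs only basic checks, not full spec validation.

```python
from typing import Tuple
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id: str) -> Tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, path).

    Hypothetical helper: checks only the scheme and the presence of a
    trust domain; it does not enforce every rule of the SPIFFE spec.
    """
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError("a SPIFFE ID must start with spiffe://")
    if not parts.netloc:
        raise ValueError("a SPIFFE ID must name a trust domain")
    if parts.query or parts.fragment:
        raise ValueError("a SPIFFE ID must not carry a query or fragment")
    return parts.netloc, parts.path


if __name__ == "__main__":
    trust_domain, path = parse_spiffe_id("spiffe://acme.com/billing/payments")
    print(trust_domain, path)
```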
· ☕ 2 min read
x-forwarded-client-cert
x-forwarded-client-cert (XFCC) is a proxy header which indicates certificate information of part or all of the clients or proxies that a request has flowed through, on its way from the client to the server. A proxy may choose to sanitize/append/forward the XFCC header before proxying the request.
The XFCC header value is a comma (",") separated string. Each substring is an XFCC element, which holds information added by a single proxy. A proxy can append the current client certificate information as an XFCC element, to the end of the request’s XFCC header after a comma.
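A minimal sketch of reading such a header follows. The text above only says that elements are comma-separated; the `Key=Value` pairs joined by semicolons, the double-quoted values, and the `By`/`Hash`/`Subject` keys in the sample are assumptions taken from Envoy's documented XFCC format, and the sample values themselves are made up.

```python
from typing import List, Tuple


def split_outside_quotes(text: str, sep: str) -> List[str]:
    """Split text on sep, ignoring separators inside double quotes."""
    parts, buf = [], []
    in_quotes = False
    escaped = False
    for ch in text:
        if escaped:
            buf.append(ch)
            escaped = False
        elif ch == "\\" and in_quotes:
            buf.append(ch)
            escaped = True      # keep the next character literally
        elif ch == '"':
            in_quotes = not in_quotes
            buf.append(ch)
        elif ch == sep and not in_quotes:
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))
    return parts


def parse_xfcc(header: str) -> List[List[Tuple[str, str]]]:
    """Return one list of (key, value) pairs per XFCC element.

    Pairs are kept as a list because a key may appear more than once.
    """
    elements = []
    for raw_element in split_outside_quotes(header, ","):
        pairs = []
        for item in split_outside_quotes(raw_element, ";"):
            if not item.strip():
                continue
            key, _, value = item.partition("=")
            pairs.append((key.strip(), value.strip().strip('"')))
        elements.append(pairs)
    return elements


if __name__ == "__main__":
    # Hypothetical header: two proxies each appended one element.
    sample = ('By=spiffe://acme.com/frontend;Hash=abc123;Subject="CN=client,OU=billing",'
              'By=spiffe://acme.com/backend;Hash=def456')
    for element in parse_xfcc(sample):
        print(element)
```

The last element in the returned list is the one appended most recently, since each proxy appends to the end of the header.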
· ☕ 2 min read
https://istio.io/latest/docs/ops/common-problems/network-issues/#double-tls
Double TLS (TLS origination for a TLS request)
When configuring Istio to perform TLS origination, you need to make sure that the application sends plaintext requests to the sidecar, which will then originate the TLS.
TLS Origination
TLS origination occurs when an Istio proxy (sidecar or egress gateway) is configured to accept unencrypted internal HTTP connections, encrypt the requests, and then forward them to HTTPS servers that are secured using simple or mutual TLS. This is the opposite of TLS termination where an ingress proxy accepts incoming TLS connections, decrypts the TLS, and passes unencrypted requests on to internal mesh services.
· ☕ 1 min read
How Does the CPU Manager Work?
When CPU manager is enabled with the “static” policy, it manages a shared pool of CPUs. Initially this shared pool contains all the CPUs in the compute node. When a container with integer CPU request in a Guaranteed pod is created by the Kubelet, CPUs for that container are removed from the shared pool and assigned exclusively for the lifetime of the container. Other containers are migrated off these exclusively allocated CPUs.
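The bookkeeping described above can be modeled with a toy pool. This is a simplified sketch, not the Kubelet's algorithm: the class and method names are hypothetical, CPUs are picked in plain numeric order, and returning CPUs to the shared pool on removal is an assumption drawn from "for the lifetime of the container".

```python
from typing import Dict, FrozenSet, Set


class StaticCpuPool:
    """Toy model of the shared pool under the "static" policy (hypothetical names)."""

    def __init__(self, num_cpus: int) -> None:
        self.shared: Set[int] = set(range(num_cpus))   # initially all CPUs
        self.exclusive: Dict[str, Set[int]] = {}       # container -> pinned CPUs

    def add_container(self, name: str, cpu_request: float, guaranteed: bool) -> None:
        """Pin CPUs only for integer requests in Guaranteed pods."""
        wants_exclusive = guaranteed and cpu_request >= 1 and float(cpu_request).is_integer()
        if not wants_exclusive:
            return                                     # runs on the shared pool
        count = int(cpu_request)
        if count > len(self.shared):
            raise RuntimeError("not enough CPUs left in the shared pool")
        cpus = set(sorted(self.shared)[:count])        # simplification: lowest IDs first
        self.shared -= cpus                            # others are migrated off these CPUs
        self.exclusive[name] = cpus

    def remove_container(self, name: str) -> None:
        """Assumed behavior: exclusive CPUs return to the shared pool."""
        self.shared |= self.exclusive.pop(name, set())

    def cpus_for(self, name: str) -> FrozenSet[int]:
        """CPUs a container may currently run on."""
        return frozenset(self.exclusive.get(name, self.shared))


if __name__ == "__main__":
    pool = StaticCpuPool(4)
    pool.add_container("web", 0.5, guaranteed=True)    # fractional request: shared
    pool.add_container("db", 2, guaranteed=True)       # integer request: exclusive
    print(sorted(pool.cpus_for("db")), sorted(pool.cpus_for("web")))
```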
· ☕ 1 min read
Memory Manager Goals
- Satisfy a pod's memory needs with the fewest possible NUMA nodes: Offer guaranteed memory (and hugepages) allocation over a minimum number of NUMA nodes for containers (within a pod).
- In the long run, have all containers in a pod run on as few NUMA nodes as possible: Guaranteeing the affinity of memory and hugepages to the same NUMA node for the whole group of containers (within a pod). This is a long-term goal which will be achieved along with PR #1752 and the implementation of the hintprovider.GetPodLevelTopologyHints() API in the Memory Manager.
· ☕ 2 min read
K8s Memory Manager
Requirement
Your Kubernetes server must be at or later than version v1.21. To check the version, enter kubectl version.
To align memory resources with other requested resources in a Pod Spec:
- the CPU Manager should be enabled and proper CPU Manager policy should be configured on a Node. See control CPU Management Policies;
- the Topology Manager should be enabled and proper Topology Manager policy should be configured on a Node. See control Topology Management Policies.
Starting from v1.22, the Memory Manager is enabled by default through the MemoryManager feature gate.
· ☕ 2 min read
Topology Manager Scopes and Policies
Topology Manager provides two distinct knobs: scope and policy.
The scope defines the granularity at which you would like resource alignment to be performed (e.g. at the pod or container level). And the policy defines the actual strategy used to carry out the alignment (e.g. best-effort, restricted, single-numa-node, etc.).
Topology Manager Scopes
The Topology Manager can deal with the alignment of resources in a couple of distinct scopes:
· ☕ 1 min read
kubectl debug
https://kubernetes.io/docs/tasks/debug-application-cluster/debug-running-pod/#ephemeral-container
https://towardsdatascience.com/the-easiest-way-to-debug-kubernetes-workloads-ff2ff5e3cc75
Process Namespace Sharing
https://towardsdatascience.com/the-easiest-way-to-debug-kubernetes-workloads-ff2ff5e3cc75
kubectl debug -it some-app --image=busybox --share-processes --copy-to=some-app-debug
· ☕ 1 min read
➜ 2305 pidstat -t -p 2305 1
Linux 5.4.0-74-generic (labile-T30) 2021-06-24 _x86_64_ (2 CPU)
18:37:39    UID  TGID   TID  %usr %system %guest %wait  %CPU CPU  Command
18:37:40  64055  2305     -  1.00    2.00  14.00  3.00 17.00   1  qemu-system-x86
18:37:40  64055     -  2305  1.00    1.00   0.00  3.00  2.00   1  |__qemu-system-x86
18:37:40  64055     -  2307  0.00    0.00   0.00  0.00  0.00   0  |__qemu-system-x86
18:37:40  64055     -  2312  0.00    0.00   0.00  0.00  0.00   0  |__IO mon_iothread
18:37:40  64055     -  2313  0.00    0.00   9.00  5.00  9.00   1  |__CPU 0/KVM
18:37:40  64055     -  2314  0.00    2.00   5.00  5.00  7.00   0  |__CPU 1/KVM
18:37:40  64055     -  2316  0.00    0.00   0.00  1.00  0.00   1  |__SPICE Worker
18:37:40  64055     - 76701  0.00    0.00   0.00  0.00  0.00   0  |__worker
perf kvm stat live
18:42:23.185815
Analyze events for all VMs, all VCPUs:
VM-EXIT              Samples Samples%  Time%  Min Time      Max Time   Avg time
MSR_WRITE                648   60.67%  5.04%    0.47us   181016.17us   715.22us ( +- 44.28% )
HLT                      207   19.38% 92.36%    2.05us  3684299.87us 40991.17us ( +- 43.93% )
EXTERNAL_INTERRUPT        85    7.96%  1.43%    0.34us    43275.57us  1540.36us ( +- 50.18% )
PREEMPTION_TIMER          78    7.30%  0.25%    0.66us    12804.28us   294.09us ( +- 69.79% )
PENDING_INTERRUPT         49    4.59%  0.93%    0.60us    84909.22us  1735.07us ( +- 99.87% )
PAUSE_INSTRUCTION          1    0.09%  0.00%    0.82us        0.82us     0.82us ( +- 0.00% )
Total Samples:1068, Total events handled time:9187522.33us.
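As a quick sanity check on the table, the Samples% column appears to be each exit reason's share of the total sample count. The short Python sketch below recomputes it from the Samples values shown above; treating the column this way is an assumption based on the numbers, not on perf documentation.

```python
# Recompute the Samples% column from the Samples column of the output above.
# Assumption: Samples% is simply samples / total samples.

samples = {
    "MSR_WRITE": 648,
    "HLT": 207,
    "EXTERNAL_INTERRUPT": 85,
    "PREEMPTION_TIMER": 78,
    "PENDING_INTERRUPT": 49,
    "PAUSE_INSTRUCTION": 1,
}


def sample_shares(counts):
    """Return {exit reason: share of total samples, formatted as a percentage}."""
    total = sum(counts.values())
    return {name: f"{100 * count / total:.2f}%" for name, count in counts.items()}


if __name__ == "__main__":
    print("total samples:", sum(samples.values()))
    for name, share in sample_shares(samples).items():
        print(f"{name:<20} {share:>7}")
```

Note that HLT accounts for only about a fifth of the exits but most of the handled time, which fits a guest that spends much of the interval halted.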