· ☕ 1 minute
Memory Manager Goals
- Use the fewest NUMA nodes needed to satisfy a pod's memory requests: Offer guaranteed memory (and hugepages) allocation over a minimum number of NUMA nodes for containers (within a pod).
- In the long term, run all containers of a pod on as few NUMA nodes as possible: Guaranteeing the affinity of memory and hugepages to the same NUMA node for the whole group of containers (within a pod). This is a long-term goal which will be achieved along with PR #1752 and the implementation of the hintprovider.GetPodLevelTopologyHints() API in the Memory Manager.
· ☕ 2 minutes
K8s Memory Manager
Requirements
Your Kubernetes server must be at or later than version v1.21. To check the version, enter kubectl version.
To align memory resources with other requested resources in a Pod Spec:
- the CPU Manager should be enabled and proper CPU Manager policy should be configured on a Node. See control CPU Management Policies;
- the Topology Manager should be enabled and proper Topology Manager policy should be configured on a Node. See control Topology Management Policies.
Starting from v1.22, the Memory Manager is enabled by default through the MemoryManager feature gate.
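As a rough sketch of the version requirement above, the shell function below compares a Kubernetes version string against a minimum major.minor. The function name version_at_least and the sample version strings are illustrative assumptions and are not part of kubectl; on a real cluster the version would come from the output of kubectl version.

```shell
#!/bin/sh
# Succeeds (exit status 0) when version $1 (e.g. "v1.22.3") is at least
# the major.minor given in $2 (e.g. "1.21").
# version_at_least is a hypothetical helper written for this note.
version_at_least() {
    have=${1#v}                          # strip a leading "v"
    have_major=${have%%.*}               # text before the first dot
    rest=${have#*.}                      # text after the first dot
    have_minor=${rest%%.*}               # minor component
    have_minor=${have_minor%%[!0-9]*}    # drop suffixes such as "+" in "21+"
    want_major=${2%%.*}
    want_minor=${2#*.}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Sample version strings (assumed values, not taken from a real cluster).
# With cluster access and jq installed, one could instead use something like:
#   kubectl version -o json | jq -r '.serverVersion.gitVersion'
for v in v1.20.15 v1.21.0 v1.22.3; do
    if version_at_least "$v" 1.21; then
        echo "$v: meets the v1.21 requirement"
    else
        echo "$v: older than v1.21"
    fi
done
```

The same helper can be called with 1.22 as the second argument to check whether the MemoryManager feature gate is on by default.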
· ☕ 2 minutes
Topology Manager Scopes and Policies
Topology Manager provides two distinct knobs: scope and policy.
The scope defines the granularity at which you would like resource alignment to be performed (e.g. at the pod or container level). And the policy defines the actual strategy used to carry out the alignment (e.g. best-effort, restricted, single-numa-node, etc.).
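To make the two knobs concrete, here is a minimal sketch that writes a kubelet configuration fragment setting both of them. The output path and the chosen values (pod scope, single-numa-node policy) are assumptions for illustration; on a real node these fields would be merged into the existing kubelet configuration file.

```shell
#!/bin/sh
# Write an example KubeletConfiguration fragment with both Topology Manager knobs.
# The file path is hypothetical and only used for this illustration.
config_file="${TMPDIR:-/tmp}/kubelet-topology-example.yaml"

cat > "$config_file" <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Scope: granularity of the alignment, either "container" or "pod".
topologyManagerScope: pod
# Policy: strategy used for the alignment, one of
# none, best-effort, restricted, single-numa-node.
topologyManagerPolicy: single-numa-node
EOF

# Show the generated fragment.
cat "$config_file"
```

The same settings can also be passed as kubelet flags, --topology-manager-scope and --topology-manager-policy.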
Topology Manager Scopes
The Topology Manager can deal with the alignment of resources in a couple of distinct scopes:
· ☕ 1 minute
kubectl debug
https://kubernetes.io/docs/tasks/debug-application-cluster/debug-running-pod/#ephemeral-container
https://towardsdatascience.com/the-easiest-way-to-debug-kubernetes-workloads-ff2ff5e3cc75
Process Namespace Sharing
https://towardsdatascience.com/the-easiest-way-to-debug-kubernetes-workloads-ff2ff5e3cc75
kubectl debug -it some-app --image=busybox --share-processes --copy-to=some-app-debug
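Process namespace sharing can also be declared in the Pod spec itself through the shareProcessNamespace field, without going through kubectl debug. The sketch below only writes such a manifest to a file; the pod name, container names, images, and file path are assumptions made up for this example.

```shell
#!/bin/sh
# Write a Pod manifest in which all containers share one process namespace.
# Names, images, and the output path are hypothetical.
manifest="${TMPDIR:-/tmp}/shared-pid-pod.yaml"

cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: some-app-shared
spec:
  shareProcessNamespace: true   # containers can see each other's processes
  containers:
  - name: app
    image: nginx
  - name: debug
    image: busybox
    command: ["sleep", "3600"]
    stdin: true
    tty: true
EOF

echo "wrote $manifest"

# With cluster access, the manifest could then be applied and inspected:
#   kubectl apply -f "$manifest"
#   kubectl exec -it some-app-shared -c debug -- ps ax
```

From the debug container, ps should then list the processes of the app container as well.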
· ☕ 1 minute
➜ 2305 pidstat -t -p 2305 1
Linux 5.4.0-74-generic (labile-T30) 2021年06月24日 _x86_64_ (2 CPU)
18时37分39秒 UID TGID TID %usr %system %guest %wait %CPU CPU Command
18时37分40秒 64055 2305 - 1.00 2.00 14.00 3.00 17.00 1 qemu-system-x86
18时37分40秒 64055 - 2305 1.00 1.00 0.00 3.00 2.00 1 |__qemu-system-x86
18时37分40秒 64055 - 2307 0.00 0.00 0.00 0.00 0.00 0 |__qemu-system-x86
18时37分40秒 64055 - 2312 0.00 0.00 0.00 0.00 0.00 0 |__IO mon_iothread
18时37分40秒 64055 - 2313 0.00 0.00 9.00 5.00 9.00 1 |__CPU 0/KVM
18时37分40秒 64055 - 2314 0.00 2.00 5.00 5.00 7.00 0 |__CPU 1/KVM
18时37分40秒 64055 - 2316 0.00 0.00 0.00 1.00 0.00 1 |__SPICE Worker
18时37分40秒 64055 - 76701 0.00 0.00 0.00 0.00 0.00 0 |__worker
perf kvm stat live
18:42:23.185815
Analyze events for all VMs, all VCPUs:
VM-EXIT Samples Samples% Time% Min Time Max Time Avg time
MSR_WRITE 648 60.67% 5.04% 0.47us 181016.17us 715.22us ( +- 44.28% )
HLT 207 19.38% 92.36% 2.05us 3684299.87us 40991.17us ( +- 43.93% )
EXTERNAL_INTERRUPT 85 7.96% 1.43% 0.34us 43275.57us 1540.36us ( +- 50.18% )
PREEMPTION_TIMER 78 7.30% 0.25% 0.66us 12804.28us 294.09us ( +- 69.79% )
PENDING_INTERRUPT 49 4.59% 0.93% 0.60us 84909.22us 1735.07us ( +- 99.87% )
PAUSE_INSTRUCTION 1 0.09% 0.00% 0.82us 0.82us 0.82us ( +- 0.00% )
Total Samples:1068, Total events handled time:9187522.33us.
· ☕ 1 minute
Website docs
Java | OpenTelemetry
Manual Instrumentation | OpenTelemetry
OpenTelemetry Client Design Principles | OpenTelemetry
Instrumentation Examples | OpenTelemetry
OpenTelemetry to Jaeger Transformation | OpenTelemetry
Java | OpenTelemetry
GitHub - open-telemetry/opentelemetry-java-docs
Getting Started | OpenTelemetry
- https://stackoverflow.com/questions/68739774/add-logs-to-spans-using-otel-instrumentation-with-jaegar-backend/68739794#68739794
GitHub
- Capture request and response bodies · Issue #1062 · open-telemetry/opentelemetry-specification · GitHub
- Modelling HTTP client response body capture · Issue #1284 · open-telemetry/opentelemetry-specification · GitHub
- Capture request and response bodies · Issue #1317 · open-telemetry/opentelemetry-java-instrumentation · GitHub
- OpenTelemetry Logging Overview | OpenTelemetry
- opentelemetry-java-instrumentation/instrumentation/spring/starters/otlp-exporter-starter at main · open-telemetry/opentelemetry-java-instrumentation · GitHub
- opentelemetry-java-instrumentation/README.md at main · open-telemetry/opentelemetry-java-instrumentation · GitHub
· ☕ 1 minute
images and layers
https://docs.docker.com/storage/storagedriver/#images-and-layers
· ☕ 2 minutes
https://grafana.com/docs/grafana/latest/variables/variable-types/global-variables/
$__range
Currently only supported for Prometheus and Loki data sources. This variable represents the range for the current dashboard. It is calculated by to - from. It has a millisecond and a second representation called $__range_ms and $__range_s.
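The "to - from" arithmetic can be sketched in a few lines of shell. The from and to timestamps below are made-up epoch-millisecond values for a six-hour dashboard window; Grafana computes these variables itself, so this only illustrates how the millisecond and second forms relate.

```shell
#!/bin/sh
# Hypothetical dashboard time range, as epoch milliseconds (six hours apart).
from_ms=1624500000000
to_ms=1624521600000

# $__range is the dashboard range: to - from.
range_ms=$((to_ms - from_ms))    # value of $__range_ms
range_s=$((range_ms / 1000))     # value of $__range_s

printf '%s = %s\n' '$__range_ms' "$range_ms"
printf '%s = %s\n' '$__range_s' "$range_s"
```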
$__interval
You can use the $__interval variable as a parameter to group by time (for InfluxDB, MySQL, Postgres, MSSQL), Date histogram interval (for Elasticsearch), or as a summarize function parameter (for Graphite).