images and layers
https://docs.docker.com/storage/storagedriver/#images-and-layers
https://grafana.com/docs/grafana/latest/variables/variable-types/global-variables/
Currently only supported for Prometheus and Loki data sources. This variable represents the range for the current dashboard. It is calculated as to - from, and it has a millisecond and a second representation called $__range_ms and $__range_s.
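As an illustration (the metric name is hypothetical), these range variables can be interpolated straight into a Prometheus query, e.g. to compute the average request rate over the dashboard's whole time window:

```promql
# total increase over the dashboard range, divided by the range in seconds
sum(increase(http_requests_total[$__range])) / $__range_s
```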
You can use the $__interval variable as a parameter to group by time (for InfluxDB, MySQL, Postgres, MSSQL), as the date histogram interval (for Elasticsearch), or as a summarize function parameter (for Graphite).
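A sketch of the group-by-time usage, assuming an InfluxDB data source (the measurement and field names are made up). Grafana replaces $__interval with an interval suited to the panel's time range, and $timeFilter is Grafana's time-range macro for InfluxDB:

```sql
SELECT mean("usage_idle")
FROM "cpu"
WHERE $timeFilter
GROUP BY time($__interval)
```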
https://iximiuz.com/en/posts/prometheus-metrics-labels-time-series/
Side note 1: Despite being born in the age of distributed systems, every Prometheus server node is autonomous. I.e., there is no distributed metric storage in the default Prometheus setup, and every node acts as a self-sufficient monitoring server with local metric storage. It simplifies a lot of things, including the following explanation, because we don’t need to think of how to merge overlapping series from different Prometheus nodes 😉
https://iximiuz.com/en/posts/prometheus-functions-agg-over-time/
Almost all the functions in the aggregation family accept just a single parameter - a range vector. It means that the over time part, i.e., the duration of the aggregation period, comes from the range vector definition itself.
The only way to construct a range vector in PromQL is by appending a bracketed duration to a vector selector, e.g. http_requests_total[5m]. Therefore, an <agg>_over_time() function can be applied only to a vector selector, meaning the aggregation will always be done using raw scrapes.
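For instance, using the selector from the text, the bracketed duration is what defines the aggregation window:

```promql
# per-series average of the raw samples scraped in the last 5 minutes;
# the [5m] on the range vector supplies the "over time" period
avg_over_time(http_requests_total[5m])
```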
https://en.wikipedia.org/wiki/Cache_coherence
Theoretically, coherence can be performed at the load/store granularity. However, in practice it is generally performed at the granularity of cache blocks.[3]
https://www.geeksforgeeks.org/cache-coherence/
Cache coherence is the discipline that ensures that changes in the values of shared operands are propagated throughout the system in a timely fashion.
http://tutorials.jenkov.com/java-concurrency/cache-coherence-in-java-concurrency.html
Ticket spinlock is the spinlock implementation used in the Linux kernel prior to 4.2. A lock waiter gets a ticket number and spins on the lock cacheline until it sees its ticket number. At that point, it becomes the lock owner and enters the critical section.
Queued spinlock is the new spinlock implementation used in the 4.2 Linux kernel and beyond. A lock waiter goes into a queue and spins in its own cacheline until it becomes the queue head. At that point, it can spin on the lock cacheline and attempt to get the lock.
https://easyperf.net/blog/2018/09/04/Performance-Analysis-Vocabulary
The majority of modern CPUs, including Intel’s and AMD’s, don’t have a fixed frequency at which they operate. Instead, they have dynamic frequency scaling: in Intel’s CPUs this technology is called Turbo Boost, in AMD’s processors it’s called Turbo Core. There is a nice explanation of the term “reference cycles” in this Stack Overflow thread:
Having a snippet A to run in 100 core clocks and a snippet B in 200 core clocks means that B is slower in general (it takes double the work), but not necessarily that B took more time than A since the units are different. That’s where the reference clock comes into play - it is uniform. If snippet A runs in 100 ref clocks and snippet B runs in 200 ref clocks then B really took more time than A.
https://easyperf.net/blog/2018/09/04/Performance-Analysis-Vocabulary
https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/analyze-performance/custom-analysis/custom-analysis-options/hardware-event-list/instructions-retired-event.html
The Instructions Retired is an important hardware performance event that shows how many instructions were completely executed.
Modern processors execute many more instructions than the program flow needs. This is called speculative execution.
Instructions that were “proven” as indeed needed by the program execution flow are “retired”.
In the Core Out-of-Order pipeline, leaving the Retirement Unit means that the instructions are finally executed and their results are correct and visible in the architectural state as if they executed in order.
Let us assume a ‘classic RISC pipeline’, with the following five stages: instruction fetch, instruction decode, execute, memory access, and register write back.
Each stage requires one clock cycle and an instruction passes through the stages sequentially. Without pipelining, in a multi-cycle processor, a new instruction is fetched in stage 1 only after the previous instruction finishes at stage 5, so the number of clock cycles it takes to execute an instruction is five (CPI = 5 > 1); in this case, the processor is said to be subscalar. With pipelining, a new instruction is fetched every clock cycle by exploiting instruction-level parallelism: since one could theoretically have five instructions in the five pipeline stages at once (one instruction per stage), a different instruction completes stage 5 in every clock cycle, and on average the number of clock cycles it takes to execute an instruction is 1 (CPI = 1); in this case, the processor is said to be scalar.
http://web.eece.maine.edu/~vweaver/projects/perf_counters/retired_instructions.html
Retired instruction counts on x86 in general also include at least one extra instruction each time a hardware interrupt happens, even if only user space code is being monitored. The one exception to this is the Pentium 4 counter.
Another special case is rep-prefixed string instructions. Even if the instruction repeats many times, it is only counted as one instruction.