kernel
tty
· ☕ 3 min
Jobs SIGHUP Default action: Terminate. Possible actions: Terminate, Ignore, Function call. When a hangup is detected, the UART driver sends a SIGHUP signal to the entire session. Normally, this kills all…
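
The three possible actions listed above map onto the three signal dispositions a process can choose. The following is a minimal sketch, assuming a Unix-like system and Python's standard `signal` module; the names `received` and `on_hangup` are made up for illustration, and the process sends SIGHUP to itself rather than relying on a real terminal hangup.

```python
import os
import signal

received = []  # signal numbers seen by the handler (illustrative name)


def on_hangup(signum, frame):
    """'Function call' action: run a handler instead of terminating."""
    received.append(signum)


# Action "Function call": install a handler, then send SIGHUP to ourselves.
signal.signal(signal.SIGHUP, on_hangup)
os.kill(os.getpid(), signal.SIGHUP)

# Action "Ignore": the signal is discarded, the handler is not called.
signal.signal(signal.SIGHUP, signal.SIG_IGN)
os.kill(os.getpid(), signal.SIGHUP)

# Action "Terminate" (the default): restored here but not triggered,
# because sending SIGHUP now would end this process.
signal.signal(signal.SIGHUP, signal.SIG_DFL)

print("handler calls:", len(received))
```

Only the first self-sent signal reaches the handler; the second is dropped because the disposition was switched to ignore before it was sent.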

Netfilter and IPTable and conntrack
· ☕ 1 min
Tables↓/Chains→                PREROUTING  INPUT  FORWARD  OUTPUT  POSTROUTING
(routing decision)             -           -      -        ✓       -
raw                            ✓           -      -        ✓       -
(connection tracking enabled)  ✓           -      -        ✓       -
mangle                         ✓           ✓      ✓        ✓       ✓
nat (DNAT)                     ✓           -      -        ✓       -
(routing decision)             ✓           -      -        ✓       -
filter                         -           ✓      ✓        ✓       -
security                       -           ✓      ✓        ✓       -
nat (SNAT)                     -           ✓      -        -       ✓
Incoming packets destined for the local system: PREROUTING -> INPUT
Incoming packets destined to another host: PREROUTING -> FORWARD -> POSTROUTING
Locally generated packets: OUTPUT -> POSTROUTING
Connection Tracking https://arthurchiao.
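
The three packet paths above can be written down as a small lookup. This is a sketch for illustration only: the chain order comes from the excerpt, the per-chain table lists follow the standard iptables layout, and the names `CHAIN_PATHS`, `TABLES_BY_CHAIN` and `traversal` are invented here, not part of any netfilter API.

```python
# Chain sequence for each kind of packet, as listed in the excerpt.
CHAIN_PATHS = {
    "incoming_local": ["PREROUTING", "INPUT"],
    "incoming_forwarded": ["PREROUTING", "FORWARD", "POSTROUTING"],
    "locally_generated": ["OUTPUT", "POSTROUTING"],
}

# Tables consulted on each chain, in evaluation order
# (assumption: standard iptables table/chain layout).
TABLES_BY_CHAIN = {
    "PREROUTING": ["raw", "mangle", "nat (DNAT)"],
    "INPUT": ["mangle", "filter", "security", "nat (SNAT)"],
    "FORWARD": ["mangle", "filter", "security"],
    "OUTPUT": ["raw", "mangle", "nat (DNAT)", "filter", "security"],
    "POSTROUTING": ["mangle", "nat (SNAT)"],
}


def traversal(kind):
    """Return the (chain, table) pairs a packet of this kind passes through."""
    return [
        (chain, table)
        for chain in CHAIN_PATHS[kind]
        for table in TABLES_BY_CHAIN[chain]
    ]


for chain, table in traversal("incoming_forwarded"):
    print(f"{chain:12} {table}")
```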

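Kernel Synchronization Primitives
· ☕ 5 min
What are synchronization primitives? Now that shared memory and multi-process/multi-thread runtime designs have become mainstream, have you ever wondered how processes and threads synchronize with each other? Most people know that the programming languages we use…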

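The Programmer's Parallel Universe: A Simple Introduction to eBPF System-Level Tracing
· ☕ 5 min
Linus Torvalds in 1991. The programmer's parallel universe. Programmers live in two worlds. One is the coding world, where we easily believe that we have considered everything and written all the code that is needed. The other is the runtime world, where we find that, no matter…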

bpftool
· ☕ 1 min
bpftool
bpftool prog
$ ./execsnoop &
$ bpftool prog
9: kprobe  name syscall__execve  tag f66477d6a4dd923d  gpl
    loaded_at 2021-03-05T16:52:05+0800  uid 0
    xlated 4064B  jited 2321B  memlock 8192B  map_ids 11
10: kprobe  name do_ret_sys_exec  tag 3a66f7b49f929a2e  gpl
    loaded_at 2021-03-05T16:52:05+0800  uid 0
    xlated 480B  jited 314B  memlock 4096B  map_ids 11
The prog show subcommand lists all programs (not just those that are perf_event_open() based):

eBPF API
· ☕ 1 min
User Space API
# strace -ebpf ./execsnoop
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=4, map_flags=0, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0}, 120) = -1 EPERM (Operation not permitted)
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=4, map_flags=0, inner_map_fd=0, map_name="events", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0}, 120) = 3
bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=508, insns=0x7fbdc7157000, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 3, 18), prog_flags=0, prog_name="syscall__execve", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL,

tracepoint - Linux Kernel Tracepoints
· ☕ 1 min
List tracepoints by perf
root@worknode5:/home/labile# perf list tracepoint
List of pre-defined events (to be used in -e):
  tcp:tcp_destroy_sock [Tracepoint event]
  tcp:tcp_probe [Tracepoint event]
  tcp:tcp_rcv_space_adjust [Tracepoint event]
  tcp:tcp_receive_reset [Tracepoint event]
  tcp:tcp_retransmit_skb [Tracepoint event]
  tcp:tcp_retransmit_synack [Tracepoint event]
  tcp:tcp_send_reset [Tracepoint event]
Metric Groups:
  sock:inet_sock_set_state [Tracepoint event]
Metric Groups:
By BCC tplist
sudo tplist-bpfcc -v 'tcp:*'
bpftrace
bpftrace -l 't:sock:*'
By kernel filesystem
root@worknode5:/home/labile# sudo ls /sys/kernel/debug/tracing/events
alarmtimer btrfs compaction devlink exceptions filelock ftrace

System-Level Tracing with eBPF Tools: Getting Started with bpftrace
· ☕ 1 min
Introduction to bpftrace. Basic bpftrace usage. List the kernel functions that can be traced, using open as the keyword:
$ bpftrace -l '*open*'
tracepoint:syscalls:sys_exit_open_tree
tracepoint:syscalls:sys_enter_open
...
kprobe:vfs_open
kprobe:tcp_try_fastopen
...
Trace all sys_enter_open() system calls:
$ bpftrace -e 'tracepoint:syscalls:sys_enter_open{ printf("%s %s\n", comm,str(args->filename)); }' | grep vi
Then, in another…

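Kernel - Page Frame Reclaiming
· ☕ 3 min
Page Frame reclaiming. As we saw earlier, Linux tends to use as much memory as possible for the Page Cache. This forces us to consider how to reclaim memory before it runs out. The problem is that the program that reclaims memory may itself also…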

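Kernel - Memory Addressing
· ☕ 1 min
CPU Cache. A cache has two write policies. write-through: writes go to the Cache and to Main Memory synchronously. write-back: writes do not go to Main Memory synchronously, until the CPU issues a flush instruction or receives a FLUSH…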
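
The difference between the two write policies can be shown with a toy model. This is only a sketch, not a description of real hardware: the `ToyCache` class and its fields are invented for illustration, main memory is a plain dictionary, and a write-back flush happens only when `flush()` is called explicitly.

```python
class ToyCache:
    """Toy cache in front of a dict that stands in for main memory."""

    def __init__(self, memory, write_back):
        self.memory = memory          # shared "main memory"
        self.write_back = write_back  # True: write-back, False: write-through
        self.lines = {}               # cached address -> value
        self.dirty = set()            # addresses not yet written to memory

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_back:
            self.dirty.add(addr)       # defer the memory update
        else:
            self.memory[addr] = value  # update memory in the same step

    def flush(self):
        """Write every dirty line back to main memory."""
        for addr in sorted(self.dirty):
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()


wt_memory = {0x10: 0}
wt = ToyCache(wt_memory, write_back=False)
wt.write(0x10, 42)
print("write-through, memory after write:", wt_memory[0x10])

wb_memory = {0x10: 0}
wb = ToyCache(wb_memory, write_back=True)
wb.write(0x10, 42)
print("write-back, memory before flush:", wb_memory[0x10])
wb.flush()
print("write-back, memory after flush:", wb_memory[0x10])
```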

Kernel - Memory Area
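· ☕ 1 min
Memory Area Management. Using the buddy system algorithm to allocate large blocks of memory is reasonable, but for small blocks it wastes space. Slab Allocator. Building a memory allocation algorithm on top of the buddy system algorithm would be very low…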
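
A rough calculation shows why small requests waste space when served directly by a buddy allocator. The sketch below assumes 4 KiB pages and that every request is rounded up to a power-of-two number of pages; `PAGE_SIZE`, `buddy_block_size` and `wasted_bytes` are illustrative names, not kernel symbols.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB page frames


def buddy_block_size(request_bytes):
    """Size of the smallest power-of-two block of pages that fits the request."""
    if request_bytes <= 0:
        raise ValueError("request must be positive")
    pages = -(-request_bytes // PAGE_SIZE)  # ceiling division
    order = (pages - 1).bit_length()        # smallest order with 2**order >= pages
    return (1 << order) * PAGE_SIZE


def wasted_bytes(request_bytes):
    """Bytes allocated but not requested."""
    return buddy_block_size(request_bytes) - request_bytes


for size in (100, 4096, 3 * 4096):
    print(size, buddy_block_size(size), wasted_bytes(size))
```

Under these assumptions a 100-byte request occupies a whole page, which is the kind of waste a slab-style allocator is meant to avoid.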