Sholck

不积跬步,无以至千里.不积小流,无以成江海

0%

程序分析利器perf

程序分析利器perf

简介

全称为performance counter,是一个linux内核实现的框架来进行性能分析,覆盖了硬件CPU/PMU和(软件计数器,监测点)之类的特性。

本文重点介绍perf的使用和实践,不对原理性知识进行讲解


背景知识

硬件

cpu流水线

cache

tick

时钟周期,假设一台处理器的处理频率为2GHZ,那么时钟周期为2GHZ,一次tick的时间间隔为1/2ns

软件

clock tick

时钟滴答,在经过多个时钟周期后,机器内部会产生一次时钟滴答,即一次时钟中断,用来对任务进行管理。

tracepoints

追踪点,由ftrace框架实现,一些函数在内核中被特殊标记,在运行到这些函数时进行函数记录并记录在内核环形缓冲区,消耗时间很小,并可以通过文件节点打印出来,详见/sys/kernel/debug/tracing

原理

perf通过进行事件采样来对性能进行分析,事件根据类型分为几类:

  1. Kernel PMU event
  2. Hardware event 比如 branch-miss
  3. Hardware cache event dTLB和iTLB相关,即数据转换后备缓存区和指令转换后备缓冲区
  4. software event 比如 cpu-migrations
  5. tracepoints

缩写

  1. IPC:每个时钟周期内的指令数
  2. PMU:performance monitoring units

使用

安装

1
sudo apt-get install perf-tools-unstable

命令

参数

常用的参数如下:

  1. -e 指定事件
  2. -p 指定pid

perf help

通过perf --help打印全部的命令,perf 的全部选项命令如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
bench           General framework for benchmark suites   基准测试
test Runs sanity tests. 检查系统是否支持perf的全部功能

list List all symbolic event types
top System profiling tool. 实时热点事件分析
stat Run a command and gather performance counter statistics
record Run a command and record its profile into perf.data
report Read perf.data (created by perf record) and display the profile
probe Define new dynamic tracepoints 动态追踪点

trace strace inspired tool 同strace功能
kmem Tool to trace/measure kernel memory properties kernel内存
lock Analyze lock events
mem Profile memory accesses
sched Tool to trace/measure scheduler properties (latencies) 调度子系统
kvm Tool to trace/measure kvm guest os

timechart Tool to visualize total system behavior during a workload
evlist List the event names in a perf.data file 列出record文件的所有事件

perf bench

系统基准测试,通过对4个子系统的运行来评估系统性能

sched

包括进程调度和IPC进程通信(socket和pipe)

1
2
3
4
5
6
7
8
9
10
11
12
13
kernel_tool_test_C/ $ perf bench sched all                                                 # Running sched/messaging benchmark...      >>20个进程对通信
# 20 sender and receiver processes per group
# 10 groups == 400 processes run

Total time: 0.031 [sec]

# Running sched/pipe benchmark...
# Executed 1000000 pipe operations between two processes

Total time: 6.182 [sec]

6.182868 usecs/op
161737 ops/sec
mem

内存copy性能评估,包括两种方式memcpy和memset,结果显示为吞吐率

1
2
3
4
5
kernel_tool_test_C/ $ perf bench mem all                                                             # Running mem/memcpy benchmark...
# function 'default' (Default memcpy() provided by glibc)
# Copying 1MB bytes ...

25.699013 GB/sec
futex

锁压力评估

perf list

打印系统支持的事件 sudo perf list, 之后在 perf 的其他命令中可以-e 指定事件

这些事件中hw,PMU, cache属于硬件, sw属于内核计数器,Tracepoints event属于ftrace捕捉的内核追踪点

软件evnet
1
2
3
4
5
6
7
8
9
10
11
alignment-faults                                   [Software event]
bpf-output [Software event]
context-switches OR cs [Software event]
cpu-clock [Software event]
cpu-migrations OR migrations [Software event]
dummy [Software event]
emulation-faults [Software event]
major-faults [Software event]
minor-faults [Software event]
page-faults OR faults [Software event]
task-clock [Software event]
硬件event
1
2
3
4
5
6
7
8
branch-instructions OR branches                    [Hardware event]
branch-misses [Hardware event]
bus-cycles [Hardware event]
cache-misses [Hardware event]
cache-references [Hardware event]
cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]
ref-cycles [Hardware event]
cache evnet
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
L1-dcache-load-misses                              [Hardware cache event]
L1-dcache-loads [Hardware cache event]
L1-dcache-stores [Hardware cache event]
L1-icache-load-misses [Hardware cache event]
LLC-load-misses [Hardware cache event]
LLC-loads [Hardware cache event]
LLC-store-misses [Hardware cache event]
LLC-stores [Hardware cache event]
branch-load-misses [Hardware cache event]
branch-loads [Hardware cache event]
dTLB-load-misses [Hardware cache event]
dTLB-loads [Hardware cache event]
dTLB-store-misses [Hardware cache event]
dTLB-stores [Hardware cache event]
iTLB-load-misses [Hardware cache event]
iTLB-loads [Hardware cache event]
node-load-misses [Hardware cache event]
node-loads [Hardware cache event]
node-store-misses [Hardware cache event]
node-stores [Hardware cache event]
PMU event
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
branch-misses OR cpu/branch-misses/ [Kernel PMU event]
bus-cycles OR cpu/bus-cycles/ [Kernel PMU event]
cache-misses OR cpu/cache-misses/ [Kernel PMU event]
cache-references OR cpu/cache-references/ [Kernel PMU event]
cpu-cycles OR cpu/cpu-cycles/ [Kernel PMU event]
cycles-ct OR cpu/cycles-ct/ [Kernel PMU event]
cycles-t OR cpu/cycles-t/ [Kernel PMU event]
el-abort OR cpu/el-abort/ [Kernel PMU event]
el-capacity OR cpu/el-capacity/ [Kernel PMU event]
el-commit OR cpu/el-commit/ [Kernel PMU event]
el-conflict OR cpu/el-conflict/ [Kernel PMU event]
el-start OR cpu/el-start/ [Kernel PMU event]
instructions OR cpu/instructions/ [Kernel PMU event]
intel_bts// [Kernel PMU event]
intel_cqm/llc_occupancy/ [Kernel PMU event]
intel_cqm/local_bytes/ [Kernel PMU event]
intel_cqm/total_bytes/ [Kernel PMU event]
intel_pt// [Kernel PMU event]
mem-loads OR cpu/mem-loads/ [Kernel PMU event]
mem-stores OR cpu/mem-stores/ [Kernel PMU event]
...
tracepoint evnet
1
2
3
4
5
alarmtimer:alarmtimer_cancel                       [Tracepoint event]
alarmtimer:alarmtimer_fired [Tracepoint event]
alarmtimer:alarmtimer_start [Tracepoint event]
alarmtimer:alarmtimer_suspend [Tracepoint event]
block:block_bio_backmerge [Tracepoint event]

perf top

打印说明

可以实时生成和显示系统中事件计数的函数排名,新版内核可以查看函数对应的汇编指令

  1. 第一列 Overhead指事件比例

  2. 第二列 DOS (Dynamic Shared Object) 可以为kernel,应用程序perf,动态链接库libc-2.23.so,模块

  3. 第三列 DOS类型 [k]表示对象属于内核或者模块, [.]表示对象为二进制程序,为应用程序或者动态链接库

  4. 第四列 符号名,没有symbols的显示为地址

ubuntu服务器上执行如下,默认为cycles,即cpu周期占用比例

1
2
3
4
5
6
7
8
9
10
11
12
13
Samples: 2K of event 'cycles', Event count (approx.): 507514780
Overhead Shared Object Symbol
4.11% [kernel] [k] delay_tsc
2.78% [kernel] [k] kstat_irqs
2.51% perf [.] 0x00000000000a4784
2.32% libc-2.23.so [.] _int_malloc
2.03% [kernel] [k] _find_next_bit.part.0
1.84% [kernel] [k] ixgbe_read_reg
1.52% [kernel] [k] vsnprintf
1.31% [kernel] [k] kallsyms_expand_symbol.constprop.1
1.31% [kernel] [k] update_blocked_averages
1.28% [kernel] [k] format_decode
1.25% perf [.] 0x0000000000099837

手机上执行如下

1
2
3
4
5
6
7
8
9
10
11
12
 PerfTop:     418 irqs/sec  kernel:72.2%  exact:  0.0% [4000Hz cycles:ppp],  (all, 8 CPUs)   
---------------------------------------------------------------------------------------------

14.42% [kernel] [k] 0xffffff8008bcb854 >>地址
9.80% [kernel] [k] 0xffffff8008fc6acc
9.08% [kernel] [k] 0xffffff8008fc6a68
1.90% [kernel] [k] 0xffffff80080820a8
0.94% [kernel] [k] 0xffffff8008301c00
0.84% perf [.] 0x0000000000088250
0.75% [kernel] [k] 0xffffff8008bf9928
0.63% [kernel] [k] 0xffffff8008fa5550
0.55% perf [.] 0x00000000000d87b8

因为权限设置,无法获取symbols,因为无法将地址转化为函数

1
2
3
4
5
6
7
Failed to open [kernel.kallsyms], continuing without symbols
Warning:
Kernel address maps (/proc/{kallsyms,modules}) are restricted.

Check /proc/sys/kernel/kptr_restrict.

Kernel samples will not be resolved.
常用命令
参数说明
-e指定事件
-d指定屏幕刷新间隔
-K不显示内核或者模块对象
-U不显示上层应用或者动态链接库
-p对指定的进程和其创建的线程采样

perf stat

对一个命令或者程序进行事件统计

打印说明
1
2
3
4
5
6
7
8
9
10
11
12
13
14
kernel_tool_test_C/ $ sudo perf stat ./test-run
.....
Performance counter stats for './test-run':

1.206903 task-clock (msec) # 0.000 CPUs utilized >>CPU利用率
4 context-switches # 0.003 M/sec >>上下文切换
1 cpu-migrations # 0.829 K/sec >>CPU调度迁移
119 page-faults # 0.099 M/sec >>缺页异常
2,804,835 cycles # 2.324 GHz >>处理器周期数
1,627,455 instructions # 0.58 insn per cycle >>指令数
338,915 branches # 280.814 M/sec >>分支指令数
14,436 branch-misses # 4.26% of all branches >>预测错误的分支指令数

12.012281176 seconds time elapsed
常用命令

-e和-p为通用选项

参数说明
-r重复指定命令n次,结果显示标准偏差和期望的比值
-d打印更多信息,包括L1和LLC(L3,三级缓存,数据共享)数据缓存
-d -d打印更多信息, 包括dTLB和iTLB

perf record

运行命令并将其事件记录在per.data中,之后通过perf report来展示。一般需要加上-g参数来进行函数调用显示

1
2
3
4
5
:kernel_tool_test_C/ $ sudo perf record -g ls 

[11:39:23] Makefile perf.data perf.data.old README.md src test-run
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data (1 samples) ]

perf report

读取per.data文件并给出分析显示,可以看出函数的调用逻辑顺序

perf probe

perf probe可针对内核和ELF进行动态 tracepoint的追踪,更加详细见 man perf-probe

增加动态追踪点
1
2
kernel_tool_test_C/ $ sudo perf probe -x test-run test_blk                                           Added new event:
probe_test:test_blk (on test_blk in /work/chejian/test/kernel_tool_test_C/test-run)
抓取动态追踪点
1
sudo perf record -e probe_test:test_blk -gaR ./test-run
删除动态追踪点
1
kernel_tool_test_C/ $ sudo perf probe -d test_blk                                                     Removed event: probe_test:test_blk
列出动态追踪点
1
2
3
kernel_tool_test_C/ $ sudo perf probe -l                                                             
probe:schedule (on schedule) >>内核
probe_test:test_blk (on test_blk@src/modules/strace/strace.c in /work/chejian/test/kernel_tool_test_C/test-run) >>ELF程序

perf trace

同strace,包含抓取syscall的功能

1
2
3
4
5
6
7
8
9
10
sudo perf trace -o output.txt  ./test-run

11993.175 ( 0.003 ms): test-run/267526 rt_sigaction(sig: CHLD, act: 0x7ffc7d5ad570, oact: 0x7ffc7d5ad610, sigsetsize: 8) = 0
11993.180 ( 0.012 ms): test-run/267526 open(filename: 0x6031c0, flags: RDWR ) = 4
11993.195 ( 0.005 ms): test-run/267526 stat(filename: 0x6031c0, statbuf: 0x7ffc7d5ad828 ) = 0

11993.203 ( 0.023 ms): test-run/267526 mmap(len: 32, prot: READ|WRITE, flags: SHARED, fd: 4 ) = 0x7fdbf0da0000 >>共享内存映射地址, fd为 4

11995.233 ( 0.002 ms): test-run/267526 close(fd: 4 ) = 0
11995.239 ( 0.009 ms): test-run/267526 munmap(addr: 0x7fdbf0da0000, len: 32 ) = 0

-s-S选项和strace 下的 -c -w功能类似,都是对调用次数和消耗时间的统计

1
2
3
4
5
6
7
8
9
10
11
12
 Summary of events:

test-run (269586), 8 events, 3.0%

syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)

--------------- -------- --------- --------- --------- --------- ------

nanosleep 1 4995.994 4995.994 4995.994 4995.994 0.00%
write 2 0.020 0.008 0.010 0.012 18.33%
clone 1 0.000 0.000 0.000 0.000 0.00%

但是发现write时的字符串参数没有具体打印是啥,只是显示了一个地址,因此strace的write参数可以更好的确定源代码位置

perf kmem

内核内存的分配统计,一般为slab和page存取两类,操作为record和stat搭配使用

slab

slab的解析分为两部分,一部分为--caller导致的Callsite部分,一部分为–alloc导致的Alloc Ptr地址打印部分,具体的打印解释如下:

  1. Callsite,表示谁调用了kmalloc来进行slab申请
  2. Total_alloc/Per 表示调用者一共分配的内存和每次平均分配的内存
  3. Total_req/Per 表示调用者一共申请的内存和每次平均申请的内存
  4. Hit 表示调用者调用kmalloc的次数
  5. Ping-pong 多cpu架构kfree和kmalloc的共享内存对应的cpu不同,这看做是一次ping-pong,理想是越小越好。
  6. Frag 碎片率,为(实际分配-申请)/实际分配
  7. Alloc Ptr 为申请的地址
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
kernel_tool_test_C/ $ sudo perf kmem record --slab ls                                                 

Makefile output.svg output.txt output.txt.old perf.data perf.data.old README.md src test-run
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 2.225 MB perf.data (12859 samples) ]

kernel_tool_test_C/ $ sudo perf kmem --alloc --caller --slab stat
---------------------------------------------------------------------------------------------------------
Callsite | Total_alloc/Per | Total_req/Per | Hit | Ping-pong | Frag
---------------------------------------------------------------------------------------------------------
proc_reg_open+32 | 64/64 | 40/40 | 1 | 0 | 37.500%
apparmor_file_alloc_security+5c | 608/32 | 456/24 | 19 | 0 | 25.000%
ext4_readdir+91e | 64/64 | 48/48 | 1 | 0 | 25.000%
ext4_htree_store_dirent+3e | 800/66 | 698/58 | 12 | 0 | 12.750%
....

---------------------------------------------------------------------------------------------------------
Alloc Ptr | Total_alloc/Per | Total_req/Per | Hit | Ping-pong | Frag
---------------------------------------------------------------------------------------------------------
0xffff995a4bb4c000 | 1024/1024 | 640/640 | 1 | 0 | 37.500%
0xffff995b83910f80 | 128/64 | 88/44 | 2 | 0 | 31.250%
0xffff995c561c3ec0 | 96/32 | 72/24 | 3 | 0 | 25.000%
page(to do)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
kernel_tool_test_C/ $ sudo perf kmem record --page --live ls

#
# GFP flags
# ---------
# 002152c0: I|F|NWR|NR|C|NMA|NT: __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_NOTRACK
# 01000200: NW|NWR: GFP_NOWAIT|__GFP_NOWARN
# 014000c0: K: GFP_KERNEL
# 014200ca: HUM: GFP_HIGHUSER_MOVABLE
# 014280ca: HUM|Z: GFP_HIGHUSER_MOVABLE|__GFP_ZERO
# 015080c0: KAC|Z: GFP_KERNEL_ACCOUNT|__GFP_ZERO
# 015200c2: HU|AC: GFP_HIGHUSER|__GFP_ACCOUNT
# 017080c0: KAC|Z|NT: GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK
# 01c200ca: HUM|WR: GFP_HIGHUSER_MOVABLE|__GFP_WRITE

---------------------------------------------------------------------------------------------------------
Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
---------------------------------------------------------------------------------------------------------
120 | 30 | 0 | UNMOVABL | KAC|Z|NT | pte_alloc_one
84 | 21 | 0 | RECLAIM | HUM|Z | handle_mm_fault
40 | 10 | 0 | RECLAIM | HUM | wp_page_copy
32 | 8 | 0 | RECLAIM | HUM | handle_mm_fault
20 | 5 | 0 | UNMOVABL | NW|NWR | tlb_next_batch.isra.42
20 | 5 | 0 | UNMOVABL | KAC|Z | __pmd_alloc
12 | 3 | 0 | UNMOVABL | K | __pollwait
8 | 2 | 0 | UNMOVABL | KAC|Z | __pud_alloc
8 | 1 | 1 | UNMOVABL | I|F|NWR|NR|C|NMA|NT | new_slab
4 | 1 | 0 | UNMOVABL | KAC|Z|NT | pgd_alloc
4 | 1 | 0 | RECLAIM | HUM|WR | __page_cache_alloc
4 | 1 | 0 | RECLAIM | HUM|Z | wp_page_copy
4 | 1 | 0 | UNMOVABL | HU|AC | pipe_write
---------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------------------
PFN | Total alloc (KB) | Hits | Order | Mig.type | GFP flags | Callsite
---------------------------------------------------------------------------------------------------------
23073150 | 8 | 2 | 0 | UNMOVABL | KAC|Z | __pmd_alloc
28346272 | 8 | 1 | 1 | UNMOVABL | I|F|NWR|NR|C|NMA|NT | new_slab
17327464 | 4 | 1 | 0 | UNMOVABL | KAC|Z|NT | pte_alloc_one

perf lock(todo)

内核锁事件分析

使用:perf lock {record|report|script|info}

条件: 内核需要使能CONFIG_LOCKDEP和CONFIG_LOCK_STAT

1
sudo perf lock record ls                                                                             tracepoint lock:lock_acquire is not enabled. Are CONFIG_LOCKDEP and CONFIG_LOCK_STAT enabled?

perf mem

内存访问分析,内存操作分为两类,为load和store,对应取和存,对应的事件为cpu/mem-loads,ldlat=30/Pcpu/mem-stores/P

1
2
3
4
5
6
7
8
9
kernel_tool_test_C/ $ sudo perf mem record ./test-run

Samples: 22 of event 'cpu/mem-loads,ldlat=30/P', Event count (approx.): 4449
Overhead Samples Local Weight Memory access Symbol Shared Object Data Symbol Data Object Snoop TLB access Locked
12.00% 1 534 LFB hit [k] filemap_map_pages [kernel.kallsyms] [k] 0xffffd75181feaaa0 [kernel.kallsyms] None L1 or L2 hit No
11.80% 1 525 LFB hit [k] filemap_map_pages [kernel.kallsyms] [k] 0xffffd751809b0888 [kernel.kallsyms] None L1 or L2 hit No
9.78% 1 435 L3 miss [k] tty_write [kernel.kallsyms] [k] 0xffff995caac199c0 [kernel.kallsyms] N/A L1 or L2 hit No
9.44% 1 420 LFB hit [k] unmap_page_range [kernel.kallsyms] [k] 0xffffd7518094fca0 [kernel.kallsyms] None L1 or L2 hit No
9.42% 1 419 L1 hit [k] page_add_file_rmap [kernel.kallsyms] [k] 0xffffd75181fe0d58 [kernel.kallsyms] None L1 or L2 hit Yes

perf sched

调度器分析

使用:perf sched {record|latency|map|replay|script}

抓取调度相关事件
1
kernel_tool_test_C/ $ sudo perf sched record ./test-run
查看调度延迟
1
2
3
4
5
kernel_tool_test_C/ $sudo perf sched latency -s runtime
----------------------------------------------------------------------------------------------------
Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | Maximum delay at |
----------------------------------------------------------------------------------------------------
test-run:(3) | 1.436 ms | 7 | avg: 0.020 ms | max: 0.068 ms | max at: 9477856.501406 s

各列的含义如下:

  1. Task 进程名
  2. Runtime ms 运行时间
  3. Switches 进程切换次数
  4. Average delay 调度平均延迟
  5. Maximum delay ms 调度最大延迟,是影响较大的因素
  6. Maximum delay at 调度延迟发生时间点
打印抓取的调度事件

各列的含义为:进程名, 进程号, 所在的cpu,时间戳, 调度事件

1
2
3
kernel_tool_test_C/ $ sudo perf sched script
swapper 0 [005] 9477854.475353: sched:sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_sched next_pid=8 next_prio=120
rcu_sched 8 [005] 9477854.475357: sched:sched_switch: prev_comm=rcu_sched prev_pid=8 prev_prio=120 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120
复现调度场景

重放record文件中的调度场景,但是当进程有sleep函数时重现就会卡主

1
kernel_tool_test_C/ $ sudo perf sched replay    >>复现调度打印卡住,暂无发现实践意义
调度迁移(上下文切换)

perf sched map可以打印任务在cpu之间直观的调度状态

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
kernel_tool_test_C/ $ sudo perf sched map -f
D0 A0 H0 G0 *R0 L0 .
D0 A0 H0 G0 R0 L0 . *S0
D0 A0 H0 G0 R0 L0 *T0 S0
D0 A0 H0 G0 R0 L0 T0 *.4
D0 A0 H0 *P0 R0 L0 T0 .
D0 A0 H0 P0 R0 L0 T0 *S0
D0 A0 H0 P0 R0 L0 *. S0
D0 *U0 H0 P0 R0 L0 . S0
*Q0 U0 H0 P0 R0 L0 . S0
Q0 U0 H0 *G0 R0 L0 . S0
Q0 U0 *. G0 R0 L0 . S0
Q0 U0 . G0 R0 *V0 . S0
Q0 *A0 . G0 R0 V0 . S0
Q0 A0 . G0 R0 V0 *W0 S0

*标示调度事件发生所在的cpu,.表示CPU处于IDEL状态 ,一个正常的map可以显示一个任务的调度会尽力给到闲置的cpu,而不是忙碌的cpu

perf evlist

显示record文件中的events

1
2
3
4
5
6
7
8
9
kernel_tool_test_C/ $ sudo perf evlist                                                               sched:sched_switch
sched:sched_stat_wait
sched:sched_stat_sleep
sched:sched_stat_iowait
sched:sched_stat_runtime
sched:sched_process_fork
sched:sched_wakeup
sched:sched_wakeup_new
sched:sched_migrate_task

perf timechart

一个通过抓取事件转化为直观的视图svg文件来显示调度器,CPU, IO等事件的工具

1
2
kernel_tool_test_C/ $ sudo perf timechart record -I -T ./test-run   
kernel_tool_test_C/ $ sudo perf timechart -p test-run Written 12.0 seconds of trace to output.svg. >>需要将文件拖入到浏览器