How to log and replay the data plane
This tool runs tcpdump on distributed nodes and analyzes the collected data.
Here are the instructions on how to use it:
1. configure
Write a configuration file (.rb) in the dpm directory and set CONFIG_FILE in the Makefile to its name.
2. make install
3. monitor packets
There are several ways for users to monitor packets and other information.
A. Add the ToDump element wherever you want to monitor within the Click configuration.
See the sample ClickGenerator.rb
B. Run tcpdump to monitor packets at each node.
Use mon.pl to start tcpdump and collect information from every node in node_list at intervals.
Usage: mon.pl [--options=string] [--inteval=t]
--options options passed to tcpdump. The format is exactly the same as tcpdump's own.
--inteval=t make mon.pl collect data from all the nodes every t seconds. By default, t = 60.
Example: ./mon.pl --options="tcp src 128.112.139.108"
When your experiments are done, use "make stop" to stop tcpdump on all the nodes.
Note: We will check all the *.dat files in the directory you give us, so please make sure your tcpdump files end with ".dat", are in one directory, and that there are no other *.dat files in this directory.
4. record packet loss
Add this command to your uml startup script: "/host/home/#{node.slice}/mon/script/record_loss.pl --host=#{node.slice}@#{node.tap0.ipaddr} --port=#{$click_port} --dir=/host/home/#{node.slice}/mon/data &"
See the sample UmlNetGenerator.rb
5. collect data
Use collect.pl to collect the tcpdump file at each node to the central node.
Usage: collect.pl [--dir=dir] [--options=string] [--interval=t] [--hash=num] [--online]
--dir the directory to read. Every *.dat file in it is treated as an original tcpdump file, so make sure your tcpdump files end with ".dat".
--options further restrict the group of packets to process. Note that this option is only useful with filter on.
--hash=num use a hash function to select a group of packets to monitor (Trajectory Sampling).
num% of the packets will be selected.
--online collect real-time data
--interval=t This is useful only when --online is on.
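The idea behind --hash can be illustrated with a minimal Ruby sketch. This is not the code collect.pl uses: the helper name sampled?, the use of CRC32, and the choice of key bytes are all assumptions made for illustration. The point is that every node applies the same deterministic hash to fields that stay constant along the path, so all nodes select the same num% of packets.

```ruby
require 'zlib'

# Hypothetical helper, not part of collect.pl.
# key:     bytes that are identical at every hop (for example the IP ID plus
#          addresses and ports). Fields that change per hop, such as the TTL,
#          must be left out, or the nodes would disagree.
# percent: the value that would be passed as --hash=num.
def sampled?(key, percent)
  Zlib.crc32(key) % 100 < percent
end

# Every node evaluates the same function on the same key,
# so the keep/drop decision is identical everywhere.
keys = (1..1000).map { |i| "58443-#{i}" }
picked = keys.select { |k| sampled?(k, 10) }
puts "selected #{picked.size} of #{keys.size} packets"
```

Because the decision depends only on the key, a packet that is selected at one node is selected at every node it crosses, which is what lets the per-node records be joined into a path afterwards.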
6. analyze data
Usage: analyze.pl [--nam] [--path] [--graph=filename]
--nam generate the trace file for Nam Animator to replay the traffic on the network
--path show every path taken by the selected group of packets. It also generates a group of files route0.pac, route1.pac, … The file for the i-th path stores the general information of all the packets that traveled along that path. These files are also the input files of "analyze.pl --graph".
--graph draw a graph of the number of packets on each path. Run "analyze.pl --path" first, then "analyze.pl --graph". This generates path.gp; adjust the x/y axes in that file as needed and run "gnuplot path.gp" to get the graph.
--loss show the packets lost at each node
The following example shows how we use this tool to monitor packets and analyze their behavior.
In this example, we want to watch a specific group of packets and see the detailed behavior of each packet.
1. monitor
Add the ToDump element in Click in the following places:
The entry of mac_table
The output of uml
The output of fea
See ClickGenerator.rb for the details.
Note that we use SNAPLEN 28 to avoid capturing the whole packet.
2. collect data
./collect.pl --options="port 12349"
Note that the path is exactly where we store our packet record in Click configuration file. We will collect all the files in this directory, so make sure there are no other files in this directory.
We don't use the --hash option because we need the detailed view of each packet and hashing may cause an unclear view of certain behavior.
We can also use the --online option to get information during the experiment. Online animation is not available yet; only some real-time statistics are provided.
3. analysis
./analyze.pl --pid=58443
packet 58443 from tap0.sttl.12349 to tap0.wash.60820
1177191392.27063: at sttl, ttl 64
=> 1177191392.27063: at sttl, ttl 63
=> 1177191393.25804: at snva, ttl 62
=> 1177191392.28829: at losa, ttl 61
=> 1177191392.29975: at hstn, ttl 60
=> 1177191392.35433: at atla, ttl 59
=> 1177191392.40571: at wash, ttl 58
ok
packet 58443 from tap0.wash.60820 to tap0.sttl.12349
1177191403.64068: at wash, ttl 64
=> 1177191403.64068: at wash, ttl 63
=> 1177191403.65184: at nycm, ttl 62
=> 1177191403.67035: at wash, ttl 61
=> 1177191403.67153: at nycm, ttl 60
=> 1177191403.67358: at chin, ttl 59
=> 1177191403.71921: at ipls, ttl 58
=> 1177191404.7694: at kscy, ttl 57
=> 1177191404.77816: at dnvr, ttl 56
=> 1177191403.80561: at sttl, ttl 55
ok
Note: "ok" means that the packet was successfully delivered to its destination.
There are two other possible results, linkdrop (packets dropped on the way) and nodedrop (packets dropped at a node). These are not fully implemented yet.
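The second trace above passes through wash and nycm twice before moving on. A short Ruby sketch can flag such revisits automatically. It assumes the hop lines have exactly the format shown in the sample output; the function names are hypothetical and not part of analyze.pl. It relies on line order rather than timestamps, since the timestamps in the sample are not monotonic from node to node.

```ruby
# Matches lines such as "=> 1177191403.65184: at nycm, ttl 62";
# the leading "=>" is optional so the first hop line matches too.
HOP_LINE = /\A(?:=>\s*)?(\d+\.\d+): at (\w+), ttl (\d+)\z/

# Turn the text printed for one packet into a list of hop records.
def parse_hops(text)
  text.each_line.map { |line| HOP_LINE.match(line.strip) }.compact.map do |m|
    { time: m[1].to_f, node: m[2], ttl: m[3].to_i }
  end
end

# Collapse consecutive records at the same node (entry and exit of one visit).
def node_sequence(hops)
  hops.map { |h| h[:node] }.chunk_while { |a, b| a == b }.map(&:first)
end

# Nodes that the packet visited more than once, in order of first appearance.
def revisited_nodes(hops)
  seq = node_sequence(hops)
  seq.select { |node| seq.count(node) > 1 }.uniq
end

trace = <<~TRACE
  1177191403.64068: at wash, ttl 64
  => 1177191403.64068: at wash, ttl 63
  => 1177191403.65184: at nycm, ttl 62
  => 1177191403.67035: at wash, ttl 61
  => 1177191403.67153: at nycm, ttl 60
  => 1177191403.67358: at chin, ttl 59
TRACE
puts "revisited: #{revisited_nodes(parse_hops(trace)).join(', ')}"
```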
./analyze.pl --path
route 0: total packets 4377
sttl=>dnvr=>kscy=>ipls=>chin=>nycm=>wash
route 1: total packets 37
sttl
route 2: total packets 11
sttl=>kscy=>ipls=>chin=>nycm=>wash
route 3: total packets 3
sttl=>kscy=>ipls=>chin=>nycm
route 4: total packets 4
sttl=>kscy=>ipls=>chin
route 5: total packets 4
sttl=>kscy=>chin
route 6: total packets 1
kscy
route 7: total packets 2503
sttl=>snva=>losa=>hstn=>atla=>wash
route 8: total packets 9
sttl=>snva=>losa=>atla=>wash
route 9: total packets 4
sttl=>snva=>losa=>wash
route 10: total packets 523
wash
route 11: total packets 4619
wash=>nycm=>chin=>ipls=>kscy=>dnvr=>sttl
route 12: total packets 14
wash=>nycm=>chin=>ipls=>kscy=>sttl
route 13: total packets 2
wash=>chin=>ipls=>kscy=>sttl
route 14: total packets 2
chin=>ipls=>kscy=>sttl
route 15: total packets 4
chin=>kscy=>sttl
route 16: total packets 1
kscy=>sttl
route 17: total packets 5
wash=>nycm=>chin=>ipls=>kscy
route 18: total packets 2692
wash=>atla=>hstn=>losa=>snva=>sttl
route 19: total packets 11
wash=>atla=>losa=>snva=>sttl
route 20: total packets 1
wash=>losa=>snva=>sttl
route 21: total packets 2
wash=>nycm=>wash
route 22: total packets 3
wash=>nycm=>wash=>nycm=>chin=>ipls=>kscy=>dnvr=>sttl
route 23: total packets 6
wash=>nycm=>chin=>ipls=>kscy=>dnvr
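To compare routes, the listing above can be loaded into a small structure. The following Ruby sketch assumes the output format is exactly as shown (a "route N: total packets M" line followed by one path line); parse_routes and complete_routes are hypothetical helper names, not part of the tool.

```ruby
# Parse the text printed by "analyze.pl --path" into route records.
def parse_routes(text)
  routes = []
  text.each_line do |line|
    line = line.strip
    if (m = /\Aroute (\d+): total packets (\d+)\z/.match(line))
      routes << { id: m[1].to_i, packets: m[2].to_i, path: [] }
    elsif !line.empty? && routes.any? && routes.last[:path].empty?
      # The first non-empty line after a header is that route's path.
      routes.last[:path] = line.split('=>')
    end
  end
  routes
end

# Routes that start at src and end at dst (single-node paths are excluded).
def complete_routes(routes, src, dst)
  routes.select do |r|
    r[:path].length > 1 && r[:path].first == src && r[:path].last == dst
  end
end

listing = <<~LISTING
  route 0: total packets 4377
  sttl=>dnvr=>kscy=>ipls=>chin=>nycm=>wash
  route 1: total packets 37
  sttl
  route 7: total packets 2503
  sttl=>snva=>losa=>hstn=>atla=>wash
LISTING

done = complete_routes(parse_routes(listing), 'sttl', 'wash')
total = done.sum { |r| r[:packets] }
done.each do |r|
  share = 100.0 * r[:packets] / total
  puts format('route %d: %d packets (%.1f%% of delivered)', r[:id], r[:packets], share)
end
```

Paths that stop short of the destination, such as route 1 in this excerpt, are left out of the total, so the percentages describe only the packets that reached wash.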
./analyze.pl --nam > trace.nam
Now you have a Nam trace file, and you can use nam to see an animation of what happened in your network experiment.
Freeze/replay the data plane (Click).
This function is specialized for VINI.
------------------------------------
* Case 1: XORP and Click
Step 1: monitor
monitor/xorp_mon.pl - use tcpdump to catch all the packets XORP sends to Click
Usage:
xorp_mon.pl [--dumpfile=filename]
Description:
--dumpfile=filename - the file tcpdump writes to. By default, it's "xorp_dump.dat".
Use "make stop" to stop tcpdump on all the nodes when you have finished your experiment.
Step 2: script generation
make replay_gen - generate the replay script at each node
Step 3: replay
Start VINI with router="none"
make replay_xorp
------------------------------------
* Case 2: Quagga and Click
Step 1: monitor and log
Add this command to your uml startup script: "~/mon/script/RouteLog $path"
The path is where we store our log.
Step 2: replay
Start VINI with router="none"
make replay_route
------------------------------------
* Case 3: Enhance the performance by forwarding packets in Click
Add this command to your uml startup script: "~/mon/script/Route2Click $path $hostname $click_port" (does not work well)
Example:
We modify UmlNetGenerator.rb:
when "Quagga"
  # we add here
  # for Click forwarding
  if node.click_forwarding
    f.print <<EOF
/home/#{node.slice}/mon/scripts/Route2Click /home/#{node.slice}/mon/routes #{node.slice}@#{node.tap0.ipaddr} #{$click_port} &
sleep 200
EOF
  # for logging the routing table in the kernel
  else
    f.print <<EOF
/home/#{node.slice}/mon/scripts/RouteLog /home/#{node.slice}/mon/routes &
EOF
  end
  f.print <<EOF
echo start quagga
echo 1 > /proc/sys/net/ipv4/ip_forward
cp -R /home/#{node.slice}/demo/#{node.slice}@#{node.dnsname}/quagga* /usr/local/quagga/etc/ &
chown -R quagga.quagga /usr/local/quagga/ &
/usr/local/quagga/sbin/zebra -d -f /usr/local/quagga/etc/quagga_zebra.cfg &
chown -R quagga.quagga /var/run/quagga &
EOF
Note:
One error that might occur in Case 3 is:
"The system has no more ptys. Ask your system administrator to create more."
Solution:
This entry needs to be added to the host's fstab:
devpts /dev/pts devpts gid=4,mode=620 0 0
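Before starting an experiment, the presence of this entry can be checked with a few lines of Ruby. This is only a sketch: the function name is hypothetical, and it checks the text of an fstab file rather than what is actually mounted.

```ruby
# Return true if the given fstab text contains an active devpts entry
# mounted on /dev/pts. Comment lines and trailing comments are ignored.
def devpts_entry?(fstab_text)
  fstab_text.each_line.any? do |line|
    fields = line.sub(/#.*/, '').split
    fields.length >= 3 && fields[1] == '/dev/pts' && fields[2] == 'devpts'
  end
end

sample = "devpts /dev/pts devpts gid=4,mode=620 0 0\n"
puts devpts_entry?(sample) ? 'devpts entry found' : 'devpts entry missing'
```

On a real host the text would come from something like File.read('/etc/fstab').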