======================
Emulab Source Tree Map
======================

This file roughly documents the contents of our source tree as of
April 2003. Some of the entries in here are per-script; others cover a
group of scripts, in which case the documentation inside the individual
scripts should be sufficient explanation. The end of the file also has
some overview-ish stuff about abstractions and things like that.

[This file maintained by testbed-ops@emulab.net]

For the big picture and some details, read the OSDI'02 paper, in
doc/papers/netbed-osdi02* and on the Web.

Accounts
 - unix accounts
 - unix group management (per-proj and per-group)
 - ssh key distribution
 - sfs key distribution
 - account permissions (web only, ron/wa, root/non-root, etc.)
 - emulab permissions
    - control of hardware/hw config.
    - administrative control
    - hierarchical organization
    - delegation at all levels
    - trust models and their security impact

Assign (resource allocation algorithms)
 - the Testbed Mapping Problem: read draft of our upcoming CCR paper in
   doc/papers
    - NP-hard
    - in some ways, a constraint satisfaction problem
       - but more, because not all satisfactory solutions are equal
    - time constraints: we're an interactive system, and need to perform
      on interactive timescales
       - a few seconds max to get a good answer
    - variation in wide area
       - soft matching
       - complicated further by the fact that we can combine the unknown
         (wide-area link) with something we control (traffic shaping)
 - Emulab solution
    - many "valid" solutions, but the difference between a near-optimal
      and a random valid soln. is huge and important
    - sim. annealing core
       - highly optimized
       - clever domain-specific tricks
    - main purpose is to conserve scarce resources (nodes, interswitch
      bandwidth, soon special hw like GigE)
    - lots of parameters, not always clear how to tune them
 - Netbed solution
    - typically no exact match, just some that may be closer than others
       - very fuzzy matching
    - genetic algo. core
       - not as highly developed yet, but meets our needs
    - main purpose is to find a real-world overlay that matches the
      supplied topology as closely as possible

Capture/console (node consoles - "'zero-penalty' remote research")
 - serial line consoles to nodes replace kbd/vga
 - fine-grained access control
    - changes quickly when node changes "ownership"
 - simple, secure remote access
    - ACLs, authenticated ssl tunnel program + standard telnet client

CD-ROM (remote node mgmt/robustness, adding nodes to the system)
 - simple to add a node
 - fallback boot method (CD-ROM) when disk is hosed
 - path for self-update and disk reimaging
 - goal is to reduce need for human intervention whenever possible

Database (centralized store for persistent shared system state)
 - lots o' stuff here
 - most stuff falls into one of several categories
    - semi-permanent hw setup info (wires, ifaces, nodes, outlets)
    - current hardware configs (reservations, ifaces, vlans, etc.)
    - semi-permanent sw setup info (disk images, OS's, etc.)
    - current sw setup (traffic shaping, trafgen, routing, etc.)
    - virtualized expt info (topology, config, etc.)
    - administrative info (users, groups, projects, etc.)
    - misc. config bits and logging
 - sw engineering issues
    - db schema must match sw build

IXP (special hw resources)
[not released due to Intel license restrictions]
 - use as testbed infrastructure
    - traffic shaping
 - use for experimentation
    - shared facil. gives more people access, increases usage
    - emulab is good environment w/many tools

Event system (distributed event coordination/communication)
 - "Elvin" publish/subscribe system underneath (imported from elsewhere)
 - used in several directions
    - emulab to nodes/programs
    - nodes to emulab
    - programs on emulab server to each other
    - can be nodes to nodes too
 - delay agent
    - coordinated control of traffic shaping
    - changes can initiate anywhere
       - automatic timed changes from emulab
       - manual changes from emulab server or a node
    - allows for reactive traffic shaping, trace playback, etc.
 - nsetrafgen
    - control of NSE simulators and their traffic generation
 - program agent
    - start/stop arbitrary program
    - timed or manual, and allows reactivity
 - event scheduler
    - controls timed events
    - may be submitted a priori or during a run
 - stated uses it heavily, but is described elsewhere
 - tevc/tevd
    - simple command line client for use on any server or node
 - trafgen
    - traffic generation via TG toolkit
    - patched to allow control via events

install (emulab cluster site configuration tools)
 - for making more emulabs
 - mostly automated install process
 - FreeBSD "port"/"meta-port"-style install script
    - installs dependencies as needed
    - performs emulab-specific install tasks
    - one for configuring a "boss" node (secure server)
    - one for configuring an "ops" node (public server)

ipod/apod (node control without power control hardware)
 - "ICMP Ping-Of-Death" and its big brother, "Authenticated
   Ping-Of-Death"
 - reboot pingable but hung node without external intervention
 - adds robustness and greater control
 - especially important where the only other alternative is a human

Libraries (Software engineering?)
 - shared constants
 - common interfaces
 - database routines and abstractions
 - important for robust, maintainable software

OS tools (disk images, etc.)
 - management of disk contents
 - image creation
    - imagezip
       - lots of cool tricks here - read the frisbee paper
 - image distribution/installation
    - frisbee
       - lots to say here... read the paper in USENIX'03 and doc/papers
 - growdisk
    - partition management on heterogeneous nodes
 - deltas
    - deprecated
 - dump/restore
    - with our incredible disk image tools, it is way faster to just
      reload the disk instead of checking it first
 - tarfile installation
    - easy changes without forcing a customized disk image

PXE/DHCP - node boot process
 - automatic database-driven control of nodes
 - can't assume anything about the disk
 - node always boots off of PXE so we get control
    - talk to the database (via bootinfo)
    - may be told to boot a tftp kernel or a specific partition
 - tftp kernels (often with memory file systems) used for:
    - disk image creation/installation
    - NetBoot
    - OSKit kernels
 - in emulab disk images, nodes self-configure using a pull model
    - see also TMCD
 - progress monitored by stated

Security
 - always conscious of threat model
 - segregate public server (ops)
 - limited shells on secure server
 - secure server trusted by all nodes
    - emulab performs config tasks on behalf of user
 - plasticwrap/paperbag
    - transparently run commands on secure server
 - suexec during web execution adds extra layer of security and
   permission checks
 - lastlogs
    - track logins on servers and nodes, report into main db
 - giving away root on the nodes causes issues
 - passwords
    - we enforce good ones via checkpass/cracklib
    - have expirations

Sensors - monitor nodes
 - healthd
    - temperature, etc.
 - slothd
    - activity measurements
    - detect tty, network, cpu activity and report it
    - low overhead
    - agile
       - extremely low latency in detecting new activity in an idle node
       - higher latency okay for detecting beginning of inactivity
    - when it's active, stay out of the way...

TBSetup - core of testbed software
 - primary focus: expt config tasks
    - and auxiliary functions necessary for expt config stuff
 - assign_wrapper
    - interface between db data representation and resource allocation
      algorithms. Calls the solver and uses the output to set up the
      database state that drives the rest of the process.
 - batch daemon
    - core of a pretty typical batch system
    - allows for more automation
    - submit expt even when no resources are avail., runs later
 - checkports
    - ?
 - console reset/setup
    - control console access (see also capture section)
 - db2ns
    - dump our db data rep back into an ns file
 - eventsys start/control
    - start up event schedulers for each expt
    - see event section
 - exports setup
    - control access to files via NFS on nodes
    - create an /etc/exports file based on current node "ownership" and
      group membership
    - controls access to all home dirs, proj dirs, and group dirs
 - frisbeelauncher
    - wrapper to set up a frisbee server when trying to load a disk
 - libaudit
    - track requests for certain control actions
 - libtbsetup
    - see libraries section
 - libtestbed
    - see libraries section
 - mkgroup/mkproj, rmgroup/rmproj, rmuser
    - manage users, groups, and projects (sync unix world to match db)
 - named_setup
    - set up dns subdomains for each expt
    - create aliases for each node that are consistent across swapins
 - node_control
    - change node sw setup params (boot params, startup)
 - node_reboot
    - reboot a node as gracefully as possible
    - try 'ssh reboot', IPOD, then power cycle, as needed.
 - node_update
    - push mounts/accounts changes to nodes
 - nscheck
    - syntax check an ns file for use in emulab
 - os_load
    - start a frisbee disk reload
 - os_select
    - configure node boot params
 - os_setup
    - major part of expt config
    - db says what nodes should be running, so make it happen
    - may load disks, then reboots nodes and waits for them to come up
 - portstats
    - diag. tool for switch port counters
 - power
    - power control program
 - ptopgen
    - generate description of currently available hw
 - reload_daemon
    - first-cut node manager
    - reload disks when nodes get freed
 - resetvlans
    - clear any vlans made up of a set of nodes
 - routecalc
    - generate shortest path routes for a topology
 - sched_reload
    - set up a disk reload for later
 - sched_reserve
    - set up a node to go to an expt when freed
 - setgroups
    - update unix groups file with current membership
 - sfskey update
    - sync live sfskey config with db config
 - snmpit
    - SNMP switch control
    - supports multiple switch types
    - configures VLANs into "links" and "LANs" in topologies
    - read other switch data (i.e. for portstats)
 - startexp/endexp
    - begin/end experiments - wrappers called from web
    - start takes a "new" expt and an ns file
       - prerun it and swap it in, and send mail, leaving "active" expt
    - end takes an expt that is "new", "swapped", "active", or
      "terminated"
       - swap out if needed, and tbend it, then clean up the last bits
 - staticroutes
    - take db topology info and pass it to routecalc to generate static
      shortest-path routes for the expt. Save result in db.
 - swapexp
    - called from web
    - swap in, out, or restart an expt.
    - performs some checks, some locking, and calls tbswap or tbrestart
 - tbprerun
    - parse an ns file into the database, fully preparing it for swapin
 - tbswap
    - swap an expt in or out
    - performs a long list of sw/hw setup tasks
 - tbend
    - end an expt that has been swapped out
    - clean out virtual state
 - tbreport
    - dump a report of the experiment's configuration (virt and phys)
 - tbresize
    - older interface for rudimentary expt editing
    - add nodes to an expt, either unconnected or in a LAN
 - tbrestart
    - restart an expt without completely swapping out and back in
    - restart event system, reset ready/startup/boot status, port cntrs
 - vnode_setup
    - called from os_setup
    - configures multiplexed virtual nodes
    - mechanism: ssh runs a script on the disk
 - wanassign/wanlinksolve (see assign section)
 - wanlinkinfo
    - display info on wide-area nodes from db
 - checkpass
    - see security section
 - ns2ir - The Parser
    - similar to/based on ns parser
    - rewrote methods to put info into database
    - performs emulab-specific checks
    - we supply a library that ns files use to get access to
      emulab-specific commands

Testsuite (regression testing - software engineering?)
 - automated system runs lists of tests in different modes
    - modes are levels of reality
 - used for regression testing ("did we break something?")
    - and development ("does this new thing work?")
 - test mode (aka frontend mode):
    - all scripts run like normal, but whenever something would have
      touched hardware, assume it succeeded, and return
    - doesn't touch nodes/switches, etc., but does all the db changes
 - full mode:
    - reserve some nodes from the testbed
    - set up "redirect" for certain critical daemons
       - set up an alternate db, make our nodes the only free ones
       - run alternate daemons (or live daemons use alt. db for our nodes)
    - entire system runs like normal, but off of a separate installed set
      of scripts
 - very flexible
    - tests can modify db, run arbitrary scripts
 - simple to use in normal case
    - check that normal expt path runs w/o errors
 - work in progress:
    - use full mode to verify accuracy/precision of traffic shaping
    - some parts may evolve into a set of tests that we run quickly after
      swapping in, before turning the expt over to the user

TMCD - Testbed Master Control Daemon
 - Server for node self-configuration
    - provides controlled access to the database
    - supports a pull model
    - receives various reports/messages from nodes
 - TMCC - Testbed Master Control Client
    - currently supported on FreeBSD and Linux, and ported to OpenBSD
    - tool for nodes<->emulab communication
    - part of a set of node initialization scripts
 - Node self-configuration process
    - report "I'm alive"
    - update config scripts (currently via sup)
    - run the config, which sets up:
       - interfaces, accounts, mounts, agents, startup programs, testbed
         daemons, installs tarfiles/rpms/etc, starts ntp, traffic
         shaping, virtual nodes, routing (gated/ospf and static/manual
         routes), hostname, /etc/hosts, IPOD/APOD, sfs, etc.
    - used on local nodes and widearea nodes, as well as inside jails

Tools (built for emulab, but useful outside of it too)
 - pcapper
    - traffic visualization tool
    - realtime tcl/tk graph of packets/throughput
    - categorized by traffic types

Visualization
 - graphical view of topologies in the database

Web Interface
 - Main configuration/administrative interface
 - Manage projects, groups, users
    - edit user info, ssh keys, sfs keys, etc.
    - push account updates to nodes
 - Control nodes/experiments
    - start/end/swap expts
    - control nodes, delays, etc.
    - NetBuild GUI for creating expts/nsfiles
    - node status/monitoring
 - Get info about Emulab/Netbed
    - even download a CD, and get a key to join Netbed
    - all the documentation - tutorials, FAQs, etc.
    - publications, photos, some of our users, etc.
 - manage project data
    - disk images, custom OS's, etc.
 - for admins etc, also provides web db access and cvs web access

Stated ("state-dee") - node state management daemon
 - listens for node state events
    - performs triggered actions
    - watches for problems/timeouts
    - sends notifications at times
    - updates the database with current state
 - watches how nodes reboot, reload, etc
    - several "state machines" (operational modes) define what is correct
    - each node is somewhere in some state machine always
    - reports successful boots, reloads, etc.

Netbed Wide-area nodes
 - Most emulab abstractions have netbed wide-area counterpart
    - same methods/abstractions/tools used in LAN or WAN environment
    - easy to switch from a wide-area run to an emulated run (or
      simulated)
 - Boot process a little different
 - Many parallels to local area case
    - SFS instead of NFS for shared homedirs
    - Can set up links as tunnels with 192.168.* addresses
    - Accounts same (except for rootness)
    - Traffic generation

Simulated Nodes
 - many nodes simulated inside NSE on a single phys. node
 - can interact with real network
 - traffic gen can happen inside
 - links, etc. all work like normal
 - Due to NS limitations/abstractions, lots of things in the real world
   don't have a parallel here

Multiplexed Nodes
 - many nodes run on one physical node, and appear as many individual
   nodes
 - Implemented with "jail" on FreeBSD, or "____" on Linux
 - Goal to be as close to normal physical nodes as possible
 - creates lots of issues with multiplexing of virtual links onto
   physical links
    - routing, demultiplexing, etc

Cross-cutting Abstractions
 - Four different environments
    - Emulab/emulation (dedicated phys.) nodes, wide-area nodes,
      simulated nodes, and multiplexed ("virtual") nodes
    - can mix and match in same expt
    - in many cases, same expt can run in any (or several) of the
      environments with few or no changes
 - Nodes
    - Emulated/emulab: dedicated physical nodes in a cluster
       - get root, can reboot, serial console, total control of node
       - including OS, disk imaging, etc.
    - Widearea: shared nodes, geographically distributed
       - get an account (non-root, typically)
       - sometimes get a jail / "virtual server"
       - less control (of OS, rebooting, etc.)
    - Simulated: nodes inside of an NS simulator
       - nodes are simulated, don't run an OS, etc.
       - functionality programmed via NS models
    - Multiplexed: jails / virtual servers on cluster nodes
       - Almost as real as emulation nodes
       - allows bigger scale, risks potential for side-effects
       - same level of control as emulation nodes
 - Links
    - Emulated/emulab:
       - completely controllable network characteristics
       - including LAN speeds or shaped links
       - isolated control network
       - very realistic, predictable, repeatable
    - Widearea:
       - network is the real/raw internet
       - tunnels are optionally configured
       - no separate control network
       - completely realistic, but unpredictable
    - Simulated:
       - links inside NSE (NS Emulator)
       - NSE does shaping
       - real and sim worlds can talk to each other
    - Multiplexed:
       - Same capabilities as normal emulated/emulab links
       - some tricks involved to get everything to work right

---EOF---