|
Main
Reading
SWAT
Cycling
Rants
Travel
Meetings
Wiki Stuff
|
|
Work
This page collects notes on papers, books, etc, and some websites. Most work related, notable web reading is captured via del.icio.us tagging. This is essentially a list of readings that might be good cites at some point.
2k8/8
''The Voice of the Turtle: Whatever Happened to AI?" Lenat. AI Magazine, Summer 2008.
''Electric Elves: What Went Wrong and Why." Tambe, Bowring, Pearce, Varakantham, Scerri, Pynadath. AI Magazine, Summer 2008.
2k8/5
A First Course in Database Systems. Chapter 5, The Database Language SQL. Ullman and Widom, 2007.
- Familiarizing w/ SQL.
- Subsections: Simple queries, subqueries, duplicates, aggregation, modifications, schema definitions, views, nulls and outerjoins, recursion.
The Next Mainstream Programming Language: A Game Developer's Perspective. Sweeney. POPL 2006 (presentation).
- Some interesting discussion about the codebase for a typical game.
- Three main segments of code: Gameplay simulation, numeric computation, shading
- Gameplay: High-level, object oriented; tens of thousands of objects, each touching a handful when it is updated every 1/30--1/60th of a second.
- Numeric: Physics, path planning, sound, scene graphs; low level, high perforance; essentially functional---taking one large input and making a smaller, different output
- Shading: Highly specialized, running on the GPU; extremely parallel already
- The two major issues languages need to address: Concurrency, reliability
- It turns out that the two often go hand-in-hand
- Effective, reliable modularization needed in large products is very similar to guarantees needed for good concurrency
- Also need to be able to support parallel object hierarchies, i.e. a new game reconfiguring a game engine's framework
- Willing to trade some performance for ease-of-coding, reliability
- Much more static and run time checking, guarantees, semantics
- Interesting changes to integer data types recommended, performance cost feasibility shown
- Lots of focus on features of Haskell, but doesn't like the syntax
- Also, questions lazy evaluation, preferring lenient evaluation
An Integrated Framework for Enabling Effective Data Collection and Statistical Analysis with ns-2. Cicconetti, Mingozzi, Stea. WNS2 2006.
- Important goals in network simulation w/ ns-2:
- Not requiring people to write a lot of ad hoc, offline analysis code
- Reducing simulation run time by reducing logging, encapsulating results
- May want to watch just some traffic, not all the background traffic
- Logging may be especially bad for performance in distributed setting
- Describes ns2measure package, containing two main components
- C++ classes for collecting data in ns-2
- Stat class connects to probes, collects histograms
- Stat object is global, can be called from any code; arguments are metric identifier, numerical identifier (e.g. a flow), and the sample value
- Scripts & programs for doing post-analysis
- Scripts take set of measures of interest, compute averages and confidence intervals
- Manage independent replications---run a number of separate simulations, collecting the sample from each one, then taking the mean of those
- Supports stats terminating simulations, e.g. stopping replications once confidence interval is within given bounds
- Some interesting measures: Size of routing tables, size of contention windows, throughput, loss rate, queue length, end-to-end delay
- Some interesting statistics: Histograms, means, counts, time averaged
An Object-Oriented Representation for Efficient Reinforcement Learning.
Diuk, Cohen, Littman, ICML 2008.
- Presents a rule and object-oriented representation for reinforcement learning.
- Simple object oriented paradigm used to describe world
- Rule conditions are states in that world
- Actions manipulate attributes and objects
- Close relationships w/ Guestrin's relational RL
- Compresses the state space a great deal, and generalizes
- E.g., walls are basically the same, and I should learn about all of them anytime I interact with one of them
- Great example taken from Atari's Pitfall, played through an emulator
- Future work: More generalization, object hierarchies
VLDB review---distributed asynchronous continuous query checkpointing.
VLDB review---external flash memory sorting.
2k7/12
IP Performance Metrics (IPPM) for spatial and multicast. Stephan, Liang, Morton. draft-ietf-ippm-multimetrics-05.txt
2k7/11
An Ethernet Address Resolution Protocol. Plummer. http://www.ietf.org/rfc/rfc826.txt
- Apparently still current?
2k7/9
HOLSR: A Hierarchical Proactive Routing Mechanism for Mobile Ad Hoc Networks. Villasenor-Gonzalez, Ge, Lamont. IEEE Communications, July 2005.
- Basic point 1: Ad hoc networks, and perhaps especially proactive ones due to the state they must keep and flood around the network, do not scale well. One way to address this is to break the space up, so not all nodes are keeping track of all nodes.
- Basic point 2: Current protocols are not well designed for heterogenous networks, consisting of nodes with different radio range, throughput, memory, etc
- Hierarchical OLSR runs several OLSR networks, each associated with a layer in a hierarchy defined by a group of nodes sharing a particular communications band
- For example, small sensors may form clusters at the bottom, WiFi a mid layer, and satellite the top layer
- Nodes with clusters under them must know all the details of that network, but other networks need not. This improves scalability; such nodes generally associated with more powerful computing platforms anyway
- Does not require a particular addressing scheme
Wiring Vietnam. Tambini. Scarecrow Press, 2007.
- Documents very early sensor networks/fields deployments and development throughout the Vietnam war.
- Good look at operations, structure, and successful uses of sensor networks.
- Some talk of current developments and usages.
2k7/8
Allowing Errors in Speech Over Wireless LANs. Chakeres, Dong, Belding-Royer, Gersho, Gibson. Workshop on Applications and Services in Wireless Networks (ASWN), 2004.
- Basic point:
- 802.11 MAC discards packets with any error in them.
- For some applications, like speech and voice, this is unnecessary; they can tolerate some error
- What can be modified to enable this, and how well does it perform?
- Varying amounts of protected bytes are required
- MAC & IP Headers have to be protected, otherwise packet may not be delivered correctly
- * Some parts of the application data must also be protected, such as sequence numbers, frame counts, metadata, and other identifiers
- Accomplished here by altering the MAC to protect parts of data (corresponding to VOIP traffic)
- Could be done by modifying MAC to protect header, allowing IP checksum over its header, and using UDP Lite or similar package to protect header and parts of payload
- Performance is not overwhelmingly better here though
- Perhaps it would be over multiple hops or lesser quality links?
- Congestion is notably eased, however, by requiring less retransmits upon discard
Generalized MANET Packet/Message Format. Clausen, Dearlove, Dean, Adjih. draft-ietf-manet-packetbb-08.
- NOTE: Version 9 now exists, but does not change any significant content except noting that child protocols may choose how to handle malformed packets.
- Defines a general format for control information packets, intended to be used by MANET routing protocols, etc.
- Based around blocks of addresses and TLV structures
- Comments to MANET group are here.
MANET Neighborhood Discovery Protocol (NHDP). Clausen, Dearlove, Dean, OLSRv2 Design Team. draft-ietf-manet-nhdp-04.txt
- Outlines a relatively simple protocol for nodes to determine their two-hop neighborhood.
- Done by local broadcasting all known neighbors; receiving nodes then know their neighbors' neighbors
- Based on PacketBB, using address blocks and TLVs to convey information
- HELLO messages may carry other information, e.g. MPR info
- Some provision for quality information, but no specifics, left up to implementation
- Possibilities: Using announced times for next message to count missing messages, link layer info, etc
- Comments to MANET group are here.
A Gossip Protocol to Support Service Discovery with Heterogenous Ontologies in MANETs. Nedos, Singh, Cunningham, Clarke. Trinity College Dublin Tech Report TCD-CS-2006-34.
- Concerned more with sharing and aligning ontologies than necessarily with matching for discovery.
- Ontologies are spread by gossiping---each node opportunistically transmits a random portion of its knowledge, which is then aligned by receiving nodes and eventually forwarded on
- This is important for MANET settings, which perhaps have many different but related ontologies in use
- Also solves (in)persisntence of references and availability of resources---links for retrieval, CPU for alignment
- Services described as input/output concept sets; some special annotation required, not very clear
- Not totally clear here how services are actually found; data seems to be in "ontologies" and locally searched
Host Extensions for IP Multicasting. Deering. RFC 1112
- Coordination and forwarding between routers is not addressed in this RFC, just host requirements.
- Multicast groups are identified by Class D IP addresses, those with 1110 as their high order bits
- This ranges from 224.0.0.0 to 239.255.255.255
- 224.0.0.0 is guaranteed to not be assigned
- 224.0.0.1 is assigned to all IP hosts, including gateways
- pg 3, interesting figure, note placement of IP-to-local mapping:
| |
| Upper-Layer Protocol Modules |
|__________________________________________________________|
--------------------- IP Service Interface -----------------------
__________________________________________________________
| | | |
| | ICMP | IGMP |
| IP |______________|______________|
| Module |
| |
|__________________________________________________________|
---------------- Local Network Service Interface -----------------
__________________________________________________________
| | |
| Local | IP-to-local address mapping |
| Network | (e.g., ARP) |
| Modules |_____________________________|
| (e.g., Ethernet) |
| |
- Multicast datagrams sent using same "Send IP" operation as unicast
- Desirable but not necessary extensions for sending: Multicast TTL, identification of interface to use, inhibit local delivery of datagram
- Must recognize group IP addresses and route out interfaces appropriately
- "The IP source address of the outgoing datagram must be one of the individual addresses corresponding to the outgoing interface."
- And it only goes out one interface
- To receive, must provide join & leave group functions
- Host components of IGMP version 1:
- Routers issue Host Membership Query messages to the all-hosts group
- For each group it belongs to, the host sets a timer for a random interval; when that timer expires, it issues a Host Membership Report for that group
- Membership reports are addressed to the group being reported; if a host overhears a membership report for one of its groups, it kills that timer and squelches its own report
- Hosts also generate membership reports upon joining a group (actually, one, two, or more reports, in case some are dropped)
- This is clearly all messier on a MANET
User Datagram Protocol. Postel. RFC 768.
Internet Protocol: DARPA Internet Program Protocol Specification. Postel. RFC 791.
- Discusses the basic IP model and packet format.
Broadcasting Internet Datagrams in the Presence of Subnets. Mogul. RFC 922.
- Discusses addressing of various classes of broadcast messages, with slightly more attention on subnet issues.
- Does not indicate what source should be chosen for multi-interface broadcasts.
Broadcasting Internet Datagrams. Mogul. RFC 919.
- Discusses addressing of various classes of broadcast messages.
- Does not indicate what source should be chosen for multi-interface broadcasts.
Comparison of Broadcasting Techniques for Mobile Ad Hoc Networks. Williams, Camp. MOBIHOC 2002.
- Broadcasting is a necessary component of many unicast protocols, frequently used for disseminating topology or route information
- No 802.11 RTS/CTS for broadcast traffic
- Must jitter scheduling of broadcast packets; otherwise we all talk at the same time
- Should discard queued packets if receive duplicate
- How feasible/easy is this?
- Alternatively, keep the packet at the network layer as long as possible
- Four families of techniques: Simple Flooding, Probability Based Methods, Area Based Methods, Neighbor Knowledge Methods
- Simple flooding: Rebroadcast all packets
- Probabilistic: Rebroadcast with pre-determined probability
- Counter-Based Scheme: Probability I'll gain additional coverage is inversely proportional to number of times I've received packet; base retransmit prob on that
- Area Based Methods: Distance-Based (based on signal strength), Location-Based(based on coordinates); only retransmit if covering significant additional (geographic) area
- Neighbor Knowledge Methods; key distinguishing characteristic: self-election or explicit upstream pointing
- Self-pruning: Compare neighbor lists; if equivalent, don't retransmit
- Scalable Broadcast Algorithm: Examine neighbor list w/ neighbor list from received node; forward if differ; check each time a dupe is received
- Dominant Pruning: Difference w/ MPR?
- MPR: Choose neighbors best covering 2-hop neighborhood
- Ad Hoc Broadcast Protocol: Same as MPR, except done per-packet, removes from consideration nodes in neighborhood known to have received same transmission, assumse relay if no info received (i.e. no HELLOs exchanged yet)
- CDS-Based Broadcast Algorithms: Similar to AHBP, but takes into account coverage of MPRs
- LENWB: Self-elect to rebroadcast based on priority, which is based on number of neighbors
- Simulation properties: Node density, mobility, traffic rates
- In sparse networks, performance approachs flooding; should do better as density increases; best case is Minimum Connected Dominating Set
- Set of nodes are connected, all non-set nodes are within one-hop of at least one member of the set
- Remember: Minimizing transmissions is not only thing here; ensuring delivery is also important
- Four studies: W/ no MAC, with congestion, with mobility, combined density, congestion, mobility
- Study 2: Small and fixed packet size (as with most broadcast traffic), varying packet rates
- MCDS only increases with network area, not density
- Protocols minimizing redundant transmission deliver the most in congested networks
- In high mobility, explicit selection can falter when selected nodes are gone. Can lower HELLO frequency, but that increases overall transmissions. Can work out that implicit schemes work better.
- Have to adjust wait-for-duplicate intervals to account for congestion
- Good references
Simplified Multicast Forwarding for MANET. Macker, SMF Design Team. draft-ietf-manet-smf05.
Review for WIDM 2007. Comments in email.
IP over 802.16 Problem Statement and Goals. Lee et al, IETF Internet Draft (Feb 2007 ver)
- 802.16 stops, as usual, at physical and link layers. WiMAX Forum and WiBro defining network architecture above those layers.
- 802.16 MAC is point-to-point and connection oriented, which complicates this
- Each node has a 48bit MAC address, including base station
- Nodes have no connection upon entry to the network; must create them
- Uplink is always unicast
- PDU (Protocol Data Unit) contains no source and destination addresses; only CID (Connection ID)
- IP Specific Subpart of 802.16 Packet Convergence Sublayer supports IP over point-to-point connection
- This draft includes a good terminology section
Applied Cryptography. 2nd edition. Schneier. Wiley & Sons. Chapters 1--8, 10, 24.
- Chapter 1: Foundations
- Chapter 2: Protocol Building Blocks
- Chapter 3: Basic Protocols
- Chapter 7: Key Length
- Chapter 8: Key Management
- Chapter 10: Using Algorithms
- Chapter 24: Example Implementations
2k7/7
Policy-Based Network Management for NeXt Generation Spectrum Access Control. Perich. IEEE Symposium on New Frontiers in Dynamic Spectrum Access Networks, 2007.
- Need to overcome harmful interference from malfunctioning devices or malicious users
- Defines an OWL-based policy language for various capabilities of software defined radios
- Uses SWI-Prolog on desktops, ad hoc reasoner in radios (maybe just XML parsing?)
- SWRL also used
- Policies and configuration decisions pushed around using NETCONF
Highly Distributed XQuery with DXQ. Fernandez, Jim, Onose, Morton, Simeon. SIGMOD 2007.
- Ad hoc programs for distributed data processing are expensive to maintain, hard to deploy, and require hand optimization of the distributed data processing operations
- Distributed XQuery processing can alleviate much of that
- Especially as more data is captured, exchanged, and manipulated in XML, this is very natural
- Enable easier development of reliable, extensible, efficient distributed data-intensive applications
- DXQ provides for XQuery dissemination, remote invocation of XQuery programs, and optimization and placement of distributed XQueries
- Examples demonstrated include DNS and Narada
Performance of Routing Protocols in Large-Scale Mobile Wireless Ad Hoc Networks. Zhang, Riley. IEEE Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOT 2005)
- Uses GTNets to evaluate AODV on very large networks under various parameters
- 10000 to 50000 nodes
- Increased mobility, by either faster speed or shorter pause times
- Increased traffic load
- Metrics: Packet delivery ratio, end-to-end latency, control overhead, average hop count
- Results:
- Within that range of topology size, there is little effect on packet delivery ratio; it's already hard enough to begin with
- Would be nice to show that statistically the difference is insignificant
- Traffic load has substantial effect on packet delivery ratio, presumably due to collisions
- Speed has little effect on packet delivery ratio; pause time has significant effect
- Pause time has greatest effect on end-to-end delay of any parameter
- Not clear how control overhead grows; appears linear, but rises sharply at 50k nodes
- MAC layer packet loss is very sensitive to mobility
- Includes short description of AODV and various possible enhancements, though it doesn't do much to look at those enhancements
- General lack of solid statistics here, i.e. confidence or even number of trials
Edutella: A P2P Networking Infrastructure Based on RDF. Nejdl, et al. WWW 2002.
- Available at http://www2002.org/CDROM/refereed/597/
- A lack of formal, general metadata specifications and search capabilities is fragmenting P2P infrastructure and use, instead of enabling universal P2P infrastructure to be developed and deployed.
- Makes heavy use of JXTA, a set of XML based P2P protocols from Sun
- P2P infrastructure enables organizations to share costs
- Keyword search in filenames or keys is not sufficient to manage more exacting query needs.
- Not clear how much better the metadata search is compared to the filename search; it's still proximity matches over natural strings for things like books by title.
- Supporting mapping between schemas is important for large P2P networks containing subnetworks that may have there own versions or extensions.
- Several query levels are defined, characterizing what kinds of queries a node may answer.
- Conjunctive formulas
- Conjunction and disjunction
- Conjunction, disjunction, and negation of literals; this is essentially datalog
- Transitive closure and linear recursive query definitions
- Arbitrary recursive query definitions
- Aggregation (COUNT, AVG, MIN, MAX); this level is more a characteristic that may be added to any of the other levels
- No discussion on query propagation
- Local query answering provided by rather standard transformation to instance matching in datalog or DBs
Toward Distributed Service Discovery in Pervasive Computing Environments. Chakraborty, Joshi, Yesha, Finin. IEEE Transactions on Mobile Computing, February 2006.
- Two important components: Discovery architecture, service matching mechanism
- Broadcast wastes processing time, network resources along paths that may not even lead to a service, nodes don't have enough memory to cache descriptions
- Metrics: Average response time, average response hops, discovery efficiency, and average network load
- References to work on cache replacement policies
- Routes responses back along request forwarding routes, uses AODV when those routes have gone stale
- Doesn't seem to define a concrete message model
- Do requests get routed to just one provider? How is that managed?
- What about explicit multicast support?
- Service grouping and selective forwarding based on given ontology
Maximizing the Probability of Delivery of Multipoint Relay Broadcast Protocol in Wireless Ad Hoc Networks with a Realistic Physical Layer. Ingelrest, Simplot-Ryl. Mobile Ad-hoc and Sensor Networks (MSN 2006)
- Previous analysis of MPR was based on unit disk graph, which yields unrealistic results
- As long range links have same probability in the model, they are generally chosen since they have the most coverage
- This leads to many failures in actuality, since success may not be very probable
- Often better to choice more relays and shorter links to provide higher probability of success
- Low error rates can be addressed with correcting codes; otherwise more or re- transmissions are required
- Goal: Maximize coverage, minimize # or relays
- Accomplished by changing MPR step 2 heuristic to one of three (third is best):
- Weight coverage of relay by the probability of it receiving the message
- Probability of successful transmit to relay weighted by average probability of successful relay
- Do the same, but don't just set a node as covered when a relay has been chosen; choose redundant relays until node is covered beyond some threshold probability of success
- Simulations done using approximation of lognormal shadowing model
- Unit disk model results in optimizing over hops
- Lognormal shadowing results in optimizing over expected throughput
- Chooses somewhat more relay nodes, but has much higher coverage under more realistic model
- Good references to follow up on in this
'Multipoint Relaying for Flooding Broadcast Messages in Mobile Wireless Networks. Qayyum, Viennot, Laouiti. 2002 Hawaii International Conference in System Sciences.
- Introduces MPR as a technique for controlling the amount of traffic produced in a network-wide broadcast
- techniques must tradeoff amount of control traffic generated versus savings (don't spend more than saved!)
- Very simple: Pick a minimal set of neighbor nodes that cover the 2-hop neighborhood
- 2-hop neighborhood easily exchanged/determined w/ HELLO messages
- Basic problem is NP Hard; reduction from dominating set used in proof here
- Simple heurisitic used here: Choose relay nodes in order of number of 2-hop nodes covered, after choosing those required to reach 2-hops (single path to 2nd node)
- Heuristic is log n optimal
- Works well in smiulations w/ unit disk graph under varying probabilities of success
- Much less traffic, though less redundancy and therefore higher probability of failure (important)
- Good references to follow up on
Opinion: Is It Time To Replace SMTP? Crocker. IP Journal; Volume 10, number 2 (June 2007).
- Key point: SMTP hasn't changed in a long time. It's good at what it does, but should it do more?
- Rather than evolving SMTP, people have moved toward more centralized, non-standard solutions for other comms modalities like forums, instant messaging and wikis
- But, mail is definitely different; if you're not looking for/expecting communication, forums don't work; need to be able to push messages
- How do those concepts and usage models relate to mail?
More ROAP. IETF Journal, IETF 68, May 2007, volume 3, issue 1
- Network Management Research Group (nmrg) and European Network of Excellence for the Management of Internet Technologies and Complex Services (EMANICS) interest in uncertainty and probabilistic approaches as well as ontologies for network management
- Separating identity and location is a huge, looming issue
- Important to reduce the information associated with locators so that they reflect primarily network topology
- Need to be able to aggregate them efficiently
- Important features: Multi-homing across multiple providers, ease of provider switching and locator renumbering, mobility, roaming, session resilience, and traffic engineering
- Clean slate approach is pretty questionable, how could you ever get it moving?
- BGP is unusual in that it works w/ TCP's send queue, compressing state by removing obsoleted data waiting in the send queue
- This has to be done on a host-by-host basis (as opposed to data/message) to avoid slowing uncongested hosts
- Difficult to make richer endpoints in the face of simple devices such as RFID tags
Improving 802.11 Range with Forward Error Correction. Riemann, Winstein. Technical Report MIT-CSAIL-TR-2005-011, 2005.
- Ethernet is restricted by delay, not signal power; combined with CSMA/CD, means congestion causes errors, not noise.
- In wireless settings lots of things can cause noise; adding FEC redundancy can recover from bit errors in received packets
- FEC makes even more sense for 1-way links as well as when latency is important; doesn't make sense for 2-way links when latency does not matter
- According to authors, at the time only Intersil Prism 2/2.5/3 chipset cards have a method to retrieve frames with errors
- MadWifi for Atheros might now as well
- Used modified Host AP driver to catch erroneous frames
- Used broadcast to avoid retransmits (most 802.11 cards retransmit 4 times)
- Scrambles the message to mitigate problems with contiguous sequences of errors, which would overwhelm that coding block.
- Get distances up to half a mile plain, and 0.87 of a mile w/ Reed-Solomon coding, but half a 200mW transmitter.
- Notes that RoofNet did not have nearly the same sucess; hypothesises that this is because line-of-sight links are fundamentally different from urban multi-path.
The Google File System. Ghemawat, Gobioff, Leung. SOSP 2003.
- Design for failure; file system is built on commodity hardware, some percentage of which is bound to have a fault at some point and/or fail permanently, and at large scale that means a lot of devices will fail.
- Examine assumptions---very large files and predominantly append-only writes implies different strategies for disk access, block structure, atomicity guarantees, and optimization.
- Focus on sustained bandwidth rather than low latency.
- Support short read/write operations in random places, but don't optimize for them.
- Non-standard operations:
- Snapshot: Copy a file or directory tree at low cost
- Record Append: Allow concurrent, atomic appending to a file. Simplifies apps by removing locking logic. Useful for multi-way merge & producer/consumer queues with many producers.
- Clients get metadata from central master server; talk directly to chunkservers to perform reads/writes
- Chunkservers ultimately have final say over what they're storing, and there are many cases such as failure where the master cannot control it.
- Therefore, use soft state on the master rather than trying to keep a consistent view.
- Master state periodically written to disk in direct memory form; no need to parse upon reload.
- Once operation log grows too big, it is checkpointed in this fashion.
- Consistency model is somewhat simplified compared to DB semantics
- Consistent: All clients will always see the same data, regardless of replica is used
- Defined: Consistent and clients will see data in entirety
Art of Computer Systems Performance Analysis, The. Jain, Wiley Press, 1991. Chapters 1--3, 13.
- Chap 1: Basic overview of performance evaluation.
- Chap 2: Common mistakes.
- Chap 3: Techniques & Metrics.
- Chap 13: System comparison using sample data.
2k7/6
Introduction to RF Propagation. Seybold. Wiley Press, 2005. Chapters 1--4, 6, 7, 12.
- Chap 1: Basic applications and introduction to RF modeling.
- Chap 2: Electromagnetic physics overview.
- Chap 3: Antenna fundamentals.
- Chap 4: Link budgets.
- Chap 6: Atmospheric effects.
- Chap 7: Near-earth propagation models.
- Chap 12: RF safety.
Experimental Evaluation of Ad Hoc Routing Protocols. Borgia. IEEE PerCom Wkshps 2005
- Compares AODV and OLSR in outdoor and indoor settings; UNIK-OLSR and UU-AODV implementations
- Claim: Current routing technology has an ad hoc horizen of 2--3 hops and 10--20 nodes, beyond which it doesn't function very well
- Relatively small number of nodes used, 8 Linux laptops of varying kinds
- Test inside an office/academic building, out on a field
- Metrics: Overhead, delay, packet delivery ratio
- Overhead: OLSR 200--1200Bps; AODV 200--400 Bps
- AODV 19--20 seconds to establish first path; OLSR 8
- AODV delays are significantly longer, ~1--2 seconds versus ~200msec OLSR
- Outdoor 3 hop, OLSR goes to 1 sec delay, AODV drops 50% more packets
- Decent set of references in this.
ISM-Band and Short Range Device Antennas. Loy, Sylla. Texas Instruments Application Report SWRA046A. 2005
- Reviews basic antenna and propagation models
Path Loss Model for Wireless Applications at 3500 MHz. Joseph, Roelens, Martens. IEEE (where?)
- Quickly presents a measurement study for a WiMAX-type transmitter and develops a path loss model
- Compares the model to some existing ones, but does not seem to go back and compare to new data again
A Performance Comparison of Multi-Hop Wireless Ad Hoc Network Routing Protocols. Broch, Maltz, Johnson, Hu, Jetcheva. MOBICOM 98
- Required for simulating MANETs:
- Node mobility
- Spatial diversity, varying collision domains
- Radio propagation model supporting propagation delay
- Propagation typically $1/r^2$ at short distance (r is distance) and $1/r^4$ at longer distances
- Crossover is usually ~100m for outdoor low-gain antennas roughly 1.5m above ground plane in the 1--2GHz band
- Capture effects (continue receiving as long as new packets RSS > -10dB from current)
- Carrier sense
- General improvements:
- Periodic broadcasts and response packets are artificially jittered to avoid synchronization
- Routing information queued for transmission ahead of ARP and data packets
- Most used link breakage detection relying on MAC layer
- Good summary here of DSDV, TORA, DSR, AODV; some optimizations introduced:
- DSDV: Hop by hop distance vector routing protocol; guarantees loop freedom
- TORA: Prioritizes quickly found routes over the best routes
- Uses IMEP (Internet MANET encapsulation protocol); does some aggregation of packets, guarantees in-order delivery for control traffic
- Very sensitive to the interval chosen to wait for aggregation
- DSR: Reactive source routing; some optimizations:
- Supports unidirectional routes, though 802.11 does not
- First queries neighbors to see if they're destination or have route
- Sniffs for routes in promiscuous mode
- Caching multiple redundant routes, to use on failure
- AODV: Combines DSR and DSDV in a sense
- Use link layer link detection instead of HELLO messages
- Shorten timeouts on route requests
- Notes on experiment setup:
- Chose a rectangular workspace to force longer routes than in a square space
- Includes a lot of runs---70 movement patterns
- Speeds from 20m to 1m per second
- Constant bit rate sources
- None of these protocols do load balancing, and scene gets congested, so investigation limited to small packets
- Claim: Difficult to compare routing protocols directly using TCP, since it is also adapting to the network
- Metrics: Packet delivery ration, routing overhead, path optimality
- Delivery rate is ultimately what matters most to many applications
- Routing overhead effects scalability, functionality in congested or low-bandwidth environments, efficiency of power consumption; may exacerbate congestion/collisions, fill up queues.
- Should include #bytes, not just messages in this---some have big headers
- Path optimality effects delay, loss, bandwidth
- Interesting points from results:
- DSDV suffers because it does not cache backup routes
- Claim: broadcast packets are inherently less reliable because they cannot use DCF
Mobile Ad hoc Networking and the IETF. Chakeres and Macker. Mobile Computing and Communications Review, vol 1, #2.
- Briefly overviews the state of the IETF MANET WG just after it was refocused on OLSR, DYMO, and SMF
- OLSRv2: Simplified version of OLSR, a proactive routing protocol
- More extensibility in packets, aggregation of IP addresses
- DYMO: Reactive protocol, designed to be simple and easy to implement
- SMF: Basic multicast forwarding service, with interface to add in routing information
- Main parts: Duplicate packet detection (DPD), relay set selection (RSS)
- Supports both MPR and CDS set algorithms
- Notes some related work in new autoconf group, OSPF (OSPF-MANET), and 6lowpan
Large-Scale Network Simulations With GTNetS. Riley. Winter Simulation Conference 2003.
- Briefly touches on overall design of GTNetS.
- Describes optimizations to reduce memory footprint, typically what constrains topology the most
- FIFO event queues---though it's not clear to me why you'd do this any other way...
- Abstract packet queues---calculating some events to eliminate the actual marker
- Timer buckets---grouping events into a less detailed time-resolution to save list size
- No routing tables---compute on demand, cache at sources
- No routing tables or computation at leaf nodes or for local broadcast destinations
- Lots of control over logging, built in statistics collection (obviating data collection)
2k7/5
Empirical Methods for Artificial Intelligience. Cohen. MIT Press, 1995. Chapters 1--3.
- Covers basic statistics, experiment design, and classical hypothesis testing.
2k7/4
Dude, get up to date!
2k7/3
TAG: A Tiny AGgregation Service for Ad-Hoc Sensor Networks. Madden, Franklin, Hellerstein, Hong. OSDI 2002.
OWLDB: Integrating Ontologies and Relational Databases. Auer, Ives. Pre-print draft.
- Extensive comments in email.
Aurora: A New Model and Architecture for Data Stream Management. Abadi, Carney, Cetintemel, Cherniak, Convey, Lee, Stonebraker, Tatbul, Zdonik. VLDB Journal, 2003.
This paper outlines a very comprehensive stream processing system. In addition to the basic operators and data management mechanisms (buffers, archival storage, etc), it incorporates a heavy focus on resource utilization and quality of service optimization. In addition to data flow, users also specify resource limits and QoS graphs based on time, values, and other properties. The system then employs a variety of mechanisms to meet resource limits while optimizing QoS.
As presented, Aurora is a centralized stream processor, with some mention of future work on distributing the system. More than being "just" a stream processor, however, it presents a lot of great concepts for application messaging. It's interesting to consider all messaging as streams, and apply Aurora concepts to those streams. Examples include squelching messages that have current recipients (moving drop boxes downstream), prioritizing messages based on application metrics (value-based QoS graphs), and aggregating messages when specifics are not needed.
The notion of train scheduling probably fits in well with generating bursts of network traffic in a distributed Aurora system. Processing and then shipping out tuples in batches will improve aggregate costs in transmitting the tuples.
It's interesting that the Aurora scheduling process consumes as much time or more than the actual data processing. More discussion on how that overhead varies---by data, by network, etc---would be worthwhile. It's not completely clear from the graphs presented how much of the costs are fixed, how much are data load dependent, and how much depend on network size.
Models and Issues in Data Stream Systems. Babcock, Babu, Datar, Motwani, Widom. Invited talk, ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2002.
This paper outlines several models for streaming databases, with a focus on continuous queries. It frames much of the discussion in terms of approximation, where relaxation is done either by discarding data (e.g, sliding windows over recent data, or sampling) or aggregation (e.g. statistics and synopsis techniques). Other interesting points include punctuation, control markers inserted into data for upstream consumption (i.e. everything after this point is day > 10), optimization over query plans, and queries referencing discarded history.
If declarative routing style networking were to be extended to incorporate higher, application and domain layer data and policies, it would most likely have to more explicitly incorporate data stream mechanisms such as those described here. The current work seems to include aspects of update data models and sliding windows (timeouts on data), but would have to be more flexible and explicitly defined to support the wide range of data models (i.e. append) and queries (i.e. averages over periods of time) that applications my require.
Also of interest is that the close similarities between database streams and networking. Sliding windows, aggregations, and other approximations may seem somewhat new for databases, but are very natural and accepted from a networking perspective. However, there are some things here which networking can in turn perhaps draw from. For example, different data models, such as append versus update, imply possibilities for different delivery strategies and optimizations (i.e. deleting a packet if not delivered before its update window, or replacing queued items made obsolete by newly delivered data).
It's also interesting to consider all of a system's networking from a data stream perspective. That might provide a unified framework for tasks such as cost based, cross-layer optimization of networking strategies and quality of service. Examples include expediting messages deemed important by the domain, reducing all traffic in the presence of dwindling power supplies, trading off delay versus traffic in aggregating smaller messages, etc.
BAA07-07: WNaN Adaptive Network Development (WAND). DARPA. 2007.
- Call to develop ultra-large (tens of thousands of nodes), densely connected, adaptive networks over inexpensive wireless hardware (developed under BAA06-26: WANN).
- Huge networks, mission aware, network planning, infrastructure utilization, distributed computation, multicast, cacheing, cross layer, policy reasoning, delay tolerant, flexible and maintainable (easily updated).
Long Term Knowledge Retention Workshop Summary. Lubell, Rachuri, Subrahmanian, Regli. NISTIR 7386, December 2006.
- Review of workshop held at NIST in 2006.
- There are real business cases & stories around for engineering archiving, including examples from legal, operational, and development perspectives.
- Same old story: STEP is great for solid geometry, not so much other data. Need to capture simulations, intent, test data, process plans, etc.
- Need package formats and specifications.
- Don't worry about capturing everything in some easily accessible format. It's more important to capture as much as possible than to do it well but only get a little bit. The "digital archeology" approach to archiving.
Forensic Analysis for Epidemic Attacks in Federated Networks. Xie, Sekar, Reiter, Zhang. IEEE ICNP 2006.
- Describes a system for information sharing between autonomous systems (in the Internet sense) in order to more effectively analyze and trace attacks.
- Key point revolves around statistical sampling of connections in the network, moonwalks, to backtrace nodes likely to have been involved in the epidemic.
- ASs may optionally provide information to their neighbors to help with this process. It's in their benefit to do so, as they will also be better able to analyze the attack.
2k7/2
Service Oriented Architecture (SOA) Primer: Service Oriented Computing, SOA Components, Web Services, and Service Orchestration. Mongan. Draft preliminary background study, 2006.
Active Networking: One View of the Past, Present, and Future. Smith. IEEE Transactions on Systems, Man, and Cybernetics, February 2004.
- Great review of development of active networks.
- Interesting point: Typical users don't see the network as dumb. Routing, caches, DNS, etc, all just gets lumped in there.
- Basic code loading approaches: On demand; in advance; per packet. ** Related distinguishers: Active over packets or flows? Upgrades by application developers or network administrators?
- Basic tradeoff: Flexibility vs safety vs performance.
- Interesting point: Not all nodes need to be active. Could place at lower bandwidth links, AS boundaries, etc.
- Interesting combination: Powerful services installed easily on a node, most of them offering information rather than forwarding decisions, and a small per-packet glue language to tie them together and make decisions through the network.
- PLAN packet language employs interesting, severe restrictions to ensure properties such as termination, byte-code compiling for performance, framework resource controls for fairness on node.
- Much of active networks has since been rolled into projects such as PlanetLab. Overlay networks in general can be seen as an ad-hoc form of active networking. Using kernel hooks for userspace routing programs is another (popular for MANETs). ** Example from router development/experimentation: NRL's SMF implementation, which uses IPQueue module in IPTables to install its own code and rewrite kernel's IP sequencing.
- A lot of active networks concepts are also being deployed in the mainstream, but in an ad-hoc fashion. Firewalls, IPTables, etc. Mostly from the services/stack direction rather than capsules.
- Related work under other names, e.g. composable protocols (DARPA SAPIENT), extensible routers.
- Mobile agents from AI community basically combo of overlay & active networking. Different tradeoff in flexibility vs performance: Don't do routing, but mobility useful for large application layer tasks.
- Encapsulation very useful, e.g. to allow encryption as possible services.
- Maybe capsule approaches make the most sense in low-bandwidth, high latency settings (e.g. IPN DTN), where retransmits are costly, and decisions may vary widely.
Active Network Vision and Reality: Lessens from a Capsule-Based System. Wetherall. SOSP 1999.
- In this work, capsules do not carry code. Instead, they carry an identifier uniquely associated with a service. If that code is not loaded at a node when a packet arrives, the code is requested of the previous forwarding node. If it's already not available there, the packet is dropped.
- Relies on Java language & security model, simple techniques such as killing long-running threads, to provide resource guarantees.
- API is very restricted. Deployment of upgrades to basic active framework sounds almost as problematic as traditional router upgrade problem. Mitigating factor is that packets can reason on upgrades and possibly degrade gracefully in their absence.
- It's unfortunate that packets of different types cannot share information. It would perhaps be useful if soft state was world-readable, but only writable by packets in that service class.
- It seems somewhat misleading to refer to this as a capsule based approach. My impression is that their choice of Java maded per-capsule code completely infeasible. In addressing the size issue, they wound up moving to basically a router plugin architecture into which packets are demultiplexed and plugins are easily fetched on demand. Technically you could ship new code with every packet, but it would require the fetch after receiving it.
- Capsule code in a domain specific language is likely to be much smaller and faster. This implies that it might be feasible to have a powerful, full blown language implementing services, and a small packet language to glue them together. Seems to be what PLAN does.
- You might be able to argue that packets hardly ever change, so why not just have them demultiplex into a set of services?
- Regardless, it would be nice if services were more composable, which they aren't really here.
- No generic setup here for collecting local information and allowing packets to use it, such as bitrates, drop rates, neighbors, etc.
- Some interesting points here about amortizing the fetch cost. Also, interesting numbers on how much processing time can be allocate per packet at different places in the network.
Implementing Declarative Overlays. Loo, Condie, Hellerstein, Maniatis, Roscoe, Stoica. ACM SOSP 2005
This paper presents P2, an extension of the "Declarative Routing" work to better support more sophisticated network protocols. In particular, the aim is essentially to enable rapid network overlay prototyping, debugging, and evaluation. A version of Datalog extended with implicit message passing/tuple storage and many functions and features for protocols, such as tuple-triggering timers, is developed as the language for describing these protocols. A flexible dataflow framework with Datalog operators, timers, network handlers, and other components is then generated to conduct actual execution.
One issue here is that the pool of functions required outside of the tuple semantics in order to implement many protocols could grow large and quite arbitrary. Already this paper introduces a number of functions and features not incorporated in the original Declarative Routing publications. This might especially be the case for protocols closely tied to application and domain specific logic. However, for many protocols this may not be the case. A review and sample implementations of many distinct protocols may be necessary to determine the seriousness of this concern. It is also worth pointing out that much of this may be obviated by supplying a small language or making it easier to deploy new functions. However, this definitely reduces the elegance of the Datalog-based semantics & language.
The implementation discussed here is based around a single-thread, follow-to-completion execution model. This may have to change in order to support worthwhile goals such as supporting simultaneous execution of several overlay variants, with data and processing sharing between them. In addition to the implementation concerns, it's not clear what additional features will be required, e.g. stronger locking or other transaction semantics. Such features may also be necessary to implement reactive protocol components, which seem to have great potential to block the processing of an entire sequence and waste node resources (CPU, memory, network).
As with the declarative routing work, some sort of tuple aggregation scheme to reduce the amount of generated traffic and save on packet overhead also seems to be a requirement for practical use. Some of this can be implemented through careful orchestration within the Datalog specification, however that again moves the language away from its clean, logic-based semantics. On a related note, it seems difficult in this scheme to implement protocols which ride control traffic on top of data traffic, and this is not something which can be done within the language.
Another note is that the work here makes several significant uses of negation. It seems that an ability to provide an ordering over rules is required to make effective and non-ambiguous use of negation as failure, which seems to be what's in play here. There also seem to be similar concerns & interactions regarding the delete operator, which might when combined with negation-as-failure allow for a very strong and possibly unwieldy negation.
Concerns about the difficulty of programming in the language are probably a minor issue. Every language has a learning curve, and rule-based programming certainly takes some getting used to. The opportunities for verification and automated optimization, beyond the concise nature of the specifications, probably outweigh this concern.
On a different topic, it would be interesting to incorporate a transformation engine into the dataflow framework in order to interoperate with other implementations. This may also require some aggregation scheme or coding in the rulebase. However, the dataflow framework and automated composition does seem well suited for simply including such transformations as another element to be inserted, and is a definite plus for the approach.
Declarative Routing: Extensible Routing with Declarative Queries. Loo, Hellerstein, Stoica, Ramakrishnan. SIGCOMM 2005
- See slides from my presentation reviewing this paper. Other comments:
- Really interesting; rule based specifications of routing protocols compiled down into DB-style operator strands in a dataflow execution framework.
- Rules implicitly trigger communication between nodes to exchange neighbors, paths, etc.
- Very interesting; some serious questions about overhead in messaging, necessity to incorporate some sort of aggregation, etc.
- Related to communication planning framework interests? Add more hardware reasoning in?
Loo, Ives, Smith. NSF FIND proposal.
Design and Implementation of an Intentional Naming System, The. Adjie-Winoto, Schwartz, Balakrishnan, Lilley. SOSP99
Presents a system for descriptive naming & late binding as an overlay network service in a LAN environment. Names are written in a custom hierarchical attribute/value pair format. Routers aggregate the names into a tree which incoming queries are pushed through to determine matching destinations.
Definitely oriented toward a medium sized network. Every node needs to know about every other node. The name trees aggregate some information, but memory use and lookup processing could be substantial. An important consideration to keep in mind is that for networks with low latency (e.g. office settings) and rich descriptions, processing power could easily become a dominant bottleneck rather than the network. The paper investigates this somewhat in regard to their system, but makes apparently conflicting statements at two points, and doesn't seem to look at all issues, e.g. the (relatively) expensive sounding name extraction required in their scheme for every periodic update.
The naming scheme is slightly odd; it has unexpected semantics. For example, it seems that if I request a "printer, color" it will match against "printer." That's almost useful for a human user looking at directories, although they'll quickly be overwhelmed by small, irrelevant descriptions. It's completetly useless for a machine, e.g. if a print job is being forwarded that actually requires a color print. It also enables malicious users to receive or take all messages by posting ill-formed (short) descriptions.
The paper also discusses several times the ability of the system to keep sessions alive despite node failure; messages will simply be directed to another service. This is true with "session" in the sense of the overall run of the system. It is not true in the more common sense of two machines carrying out a conversation. Any evolved state would be lost when the service died, and the sequence would have to start over with the next host unless carefully crafted otherwise, which may not be possible in all cases (e.g. in sharing credentials).
Although perhaps out of scope for this, a spanning tree backbone overlay is perhaps not ideal. It would concentrate traffic along the backbone, wasting underutilized network resources on other links, artificially capping the network capacity, as well as possibly over-taxing host processing and power resources. Further, in this sort of relatively small, wireless environment, there is near ubiquitous support for single-hop IP multicast, which is an opportunity to amortize transmit costs by sharing messages as well as cheaply create redundant delivery paths.
One notable oversight in the paper though is that it discusses at several points the ability to route on application layer metrics such as geographic location or load. However, it is never discussed in more detail. Efficently representing, disseminating, and reasoning on the required data and criteria sounds problematic, especially for very dynamic features such as load.
2k7/1
Corona: A High Performance Publish-Subscribe System for the World Wide Web. Ramasubramanian, Peterson, Sirer.
Corona is a system for distributing the polling required in a sensing loop. Its primary motivation is reducing the amount of Internet traffic spent checking web sites and news feeds for updates, while at the same time decreasing the latency at which updates are discovered. Further, the implemented system is trying to achieve this without modifying existing infrastructure (web servers) or user client software (web browsers, news readers, chat programs).
There are some issues with the proposed system, but it's not immediately clear how to resolve them. For example, updates without source-provided timestamps are forwarded to the primary owner, who then assigns an ordering. This sounds possible to do incorrectly, as well as potentially overwhelming that node. However, for the data (web sites) discussed, such sources are probably relatively rare.
The centralized user updating is probably mostly an implementation issue, though one that should be addressed. There seems to be no hurdle against creating a more distributed system, other than utilizing existing and widely installed software.
Also, interestingly, the paper focuse on analyzing the reduction in demand on the sources. No mention is made of the affect on the network as a whole, e.g. bandwidth consumption. It's not clear if there's a reduction in traffic across the network once all of the update propagations and other system messages are factored in. This could easily be the case, but it's not clear. The answer may also depend a great deal on how close users are to their owners.
Another small point is that there's a place here, and newsfeed reading in general, to apply richer models toward learn polling frequencies. Corona simply maintains an average of observed update intervals. It would be interesting and beneficial to learn a richer polling ruleset, e.g. "Don't poll over the weekend because nothing happens, but poll every minute on weekdays."
Virtual Ring Routing: Network Routing Inspired by DHTs. Caesar, Castro, Nightingale, O'Shea, Rowstron. SIGCOMM 2006.
- VRR is a combination ad hoc routing protocol and distributed hash table overlay.
- It routes and forwards over location independent addresses; it does not require an underlying IP layer (for either addressing or forwarding).
- Nodes periodically send out beacons to determine local neighborhood.
- Nodes maintain a list of physical neighbors, as well as a small set of nearby virtual neighbors in the ID space.
- Forwarding is along the physical neighbors toward the known node w/ID closest to the target ID. Notably, this is down in greedy fashion. That's not so rare for some DHTs, notably CAN seems similar, but now the underlying multihop network layer routing as also being done in the same fashion.
- Despite this, the comprehensive-sounding testing in the paper indicates it's a winner.
- Possible lesson from this is that more often than not, by the time you establish the best route, you could have already delivered the message along any rout.
- It's not clear how this fares under links w/ high latency, packet loss, or heterogeneity.
- Also not clear how brittle the protocol is and what the bounds are on node state, among other things.
Towards an Internet-Scale XML Dissemination Service. Diao, Rizvi, Franklin. VLDB 2004.
This paper presents a distributed publish/subscribe system based on XML messages and querying. The expressiveness supported by the XPath based query language and engine is fairly expressive, particularly as compared to most publish/subscribe engines. Much of the paper focuses on optimizing evaluation of XML messages such that as much work as possible may be shared between router components as well as queries. It also includes a scheme for propagating query generalization and aggregations up a broadcast tree to minimize downstream traffic.
The paper neither focuses on nor provides definite notes on construction of the underlying broadcast tree. Data source and user linking is discussed, but not how the router overlay is created. This is arguably outside the scope of this paper, but is an important component of such a system. In the absence of other evidence, the assumptions here seem to lean toward a fairly static and managed, infrastructure-type environment.
A scheme is presented for attaching data sources and users, as opposed to routers. However, the given scheme is fairly centralized. Although this service is handling less traffic than the message routers, in a truly Internet-scale system, it could still be substantial. This is particularly true if users and data sources are frequently connecting and disconnecting, which is quite likely for many applications of interest (e.g. users reading/monitoring streams during work breaks). Registration is also a single point of failure.
Although it is not discussed, it seems possible that there could be redundant/alternate registration services around the network. Given the current decision making on attaching sources and users, as discussed in the next several points, these decisions do not need to be coordinated. The major factor that might have to be coordinated is load balancing across the system, but this is not currently addressed at all. Coordination of ID ranges could be done a priori. It might therefore be straightforward to introduce multiple, fairly autonomous registration points into the network.
The decision making in attaching sources and users to the network also seems somewhat simplistic. Sources are assigned to routers based on data rates, router bandwidth, and topological distance. This assignment happens once, when the source joins, and is based on advertised data rates and bandwidth. Changes in the system as well as malicious or simply incorrect advertisements might have significant effects on the system. It's also not clear how (network) topological distance is determined. Similarly, it's not clear how criteria for users, such as general location in the routing fabric, is determined.
A good bit of time is also spent on techniques for aggregating user queries and partitioning them across the router fabric. This is a centralized process in this paper. That introduces another single point of failure, a heavily loaded service, and hinders the system's ability to adapt to changing requests and network environments. However, it is singled out as future work to be followed up on.
Another issue that doesn't seem to be addressed by either the partitioning or registration schemes is the "Britney Spears" effect. Partioning is based solely on the queries and data schemas, without taking into account network and processor load balancing issues. Registration is noted as taking into account available bandwidth and topology, not load. Although it may use a source's published data rate and routers' published bandwidth capacities when a source is incorporated, but this does not consider its effect on that region of the network, nor the user demands placed upon that flow. Nothing is presented to specifically spread the work for large data sources & queries across the system.
Mesh-Based Content Routing Using XML. Snoeren, Conley, Gifford. ACM SOSP 2001.
This paper presents work on XML-based, reliable publish/subscribe systems. Notably, the focus is less on high data rates and large quantities of nodes, and more on reliability. The major content in the paper is a scheme for utilizing redundant links and correctly ordering and merging received data. It is an interesting use of application layer routing and overlay networking to provide services not otherwise supported by the network (namely, reliability), but seems oriented toward toward systems with stable overlay & underlying network configurations, small numbers of nodes, significant networking infrastructure, and procedural/deployment coordination.
From an applications' perspective, one of the major features of the system is data streams encoded in XML and the use of XPath/XQuery. Some time is spent discussing and demonstrating the use of simple compression schemes to reduce the large bandwidth costs this incurs. This is a common argument, that the costs of using XML in a network setting can be significantly reduced by techniques such as compression, and benefits such as improved application integration easily outweigh those costs. An often overlooked point is that this assumes that network latency dominates processing time. There are settings where this is not the case. Some are relatively uncommon, e.g. satellite uplinks and sensor networks. Others are not; most PDAs and smartphones might be challenged to keep up with link capacities on Wi-Fi, Bluetooth, and other mediums. Even on high powered desktop and server machines, large numbers of nodes (e.g. on Internet scale overlays) could easily swamp available processing power.
Another smaller point is that the reordering of packets along internal links should be justified further. Some sort of retransmission scheme seems reasonable, but it seems feasible and potentially worthwhile from a throughput perspective to forward nodes as processed, rather than processing and forwarding in order. This is particularly true as other nodes may have already filled in those packets for all of a nodes' children and retransmissions may be unnecessary.
Some of the basic assumptions are also more or less suspect depending on the setting, in particular the assumption of link independence. Wireless networks and most home and office settings do not fall under that assumption. This, combined with the mobility points below, orients the system toward supporting servers with significant infrastructure (i.e. multiple independent Internet connections ISPs) in applications with high reliability requirements.
Coordination of the system also seems likely to require substantial background infrastructure & policies. The need to coordinate Application Number identifiers ( ANs) across all roots may be difficult in some applications and networks, and increasingly so as there are more of them. Merging streams also seems problematic in practice, although the paper spends some time on it. Assigning usable identifiers and allowing nodes to flexibly work off split or merged streams does not seem to be well supported.
Construction and adaptation of the mesh is a definite weak point however. Most problematic, the given scheme seems likely to swamp root and other nodes with requests and network traffic and is not prepared to effectively operate in a changing network. There is no scheme for adapting to changing latencies, which is only marginally supported anyway. Changes in connectivity or node availability could also be handled more efficiently, e.g. by attempting to determine the cause of a loop rather than completely disconnecting the rejoining node. In addition, the given algorithm does not seem to guarantee given levels of resilience as it does not check the resilience of potential parents' and the independence of potential links.
A Scalable Distributed Information Management System. Yalagandula, Dahlin. SIGCOMM 2004.
This paper presents a system for disseminating information in large scale networks. It is focused on distributing "network weather" data, but is generic enough to be useful for a wide range of applications. Central to the approach is rendering explicit the routing trees constructed by the routing procedures of distributed hash table algorithms, and using those trees to aggregate, summarize, and disseminate data.
One moderately detailed point not discussed in the paper is how aggregation functions are disseminated. Given the implementation notes included in the paper, it's likely that the aggregation functions are a set of pre-distributed Java functions which the aggrfunc parameter of the install() function indexes in some fashion. Support for run-time description, dissemination, and execution of aggregation functions is likely to have a huge effect on the applications that may be supported by the system. Here, unlike in many systems, different data is treated very differently, in a fashion defined by the aggregation function. It's not geared toward generic data dissemination. Introducing new applications and data into a running system therefore requires dynamic distribution of some, potentially fairly arbitrary, executable function.
More generally, the aggregation functions mean that applications are limited in how they may use data. For example, a system with a type installed that aggregates the average of a particular data type would not be able to easily switch and determine the min, max, and median. A new aggregation function and tuple structure would have to be created, disseminated around the network, and then run long enough to collect useful data. This is not a system for doing relatively free-form network investigation (e.g. to analyze an attack on the fly), or for running a wide range of different applications at different times---it seems geared toward long running, mature services, not user-initiated and interactive applications.
On a related note, under this system there is no way to collect data about other nodes, except through the aggregates. For example, unless a data type were pre-arranged to support it, there is no way to ask for data from or about a specific node or nodes.
The aggregation scheme also introduces some amount of fragility and effectiveness to the system. A noteworthy feature of Chord, for example, was its ability to function, albeit inefficiently, simply by knowing local nodes in the ID space. The DHT trees (finger tables) are not necessary for correct behavior. Here they're much more necessary; without them no data will get around the network.
The system also has many knobs to support application-directed tuning; proper use of them is probably required for effective use. This is largely a good thing, and a key feature of the system. However, as mentioned briefly in the paper, it would be useful to have a layer abstracting and possibly learning much of this, such that applications are less involved in the specifics of shipping data around. For example, the ability to force a reaggregation after a reconfiguration is useful, but requires that application developers worry about the state of the network/overlay, how data gets passed around, and how to make a decision about the reaggregation. Such features also sound likely to be abused by application developers, similar to the abuse of UDP to avoid (TCP's) congestion control on the Internet.
Finally, the system does not seem to lend itself toward some common applications. In addition to network monitoring, this work also forms the basis for distributed file system and data replication schemes. However, this approach seems to require that many nodes hold much information. In a straight DHT-based file store, no intermediary node between a query and the key owning node need know anything about that file. In this scheme, it seems that nodes in the tree will know at least about the key's existence, are involved in computing some aggregation function to fetch the relevant data (e.g. concatenating lists of files held by child nodes), or are possibly storing the relevant data, depending on which update()/probe() depth strategy is in use. Further, the proposed schemes for finding the nearest source of the data soom to make too much of an assumption that local nodes in ID space are local on the network, and likely to generate much network traffic in any case as increasing levels of the tree are probed.
Querying the Internet with PIER. Huebsch, Hellerstein, Lanham, Loo, Shenker, Stoica. VLDB 2003.
This paper presents PIER, a system for massively distributed relational database storage & querying. PIER builds on distributed hash tables to distribute & locate relational tuples across the network. The paper presents several algorithms for spreading query computations and data shipping, and the various performance results.
One of the primary motivating applications given for PIER is networking monitoring, e.g. intrusion and attack detection. Using a PIER type system for this sounds likely to generate much redundant network traffic. For examaple, querying every 15 seconds over the last 30 seconds' worth of a particular data type will involve shipping much of the same data around. This could be addressed in the monitoring application, but that's contrary to the intent of providing a basic data service, such that the application doesn't have to worry about details like that. A streaming, continuous query approach might not have the same problem.
However, the continuous approach would have its own similar issues for this application. Opposite the above problem, it might not be necessary to regularly ship data around, e.g. to investigate in detail a specific, infrequent or unique problem rather than continuous monitoring. PIER seems more appropriate for the former task. The PIER approach also seems more robust to monitor node failure, i.e. switching to a new monitor host will not require restoring history/state or waiting to collect sufficient new data.
Many of the experiments in this paper could perhaps be presented and/or conducted more rigorously. For example, many of the simulation results make no note of the number of trials conducted, and imply that given data is based on a single trial. These one-shot data points should be justified more, or multiple trials conducted & their statistical significance evaluated.
Also of note would be some discussion about the underlying networking setup. In some respects these are application details, but they do matter a great deal and could bear significant effect on observed performance. For example, is TCP used and are connections continually setup/torn down or long-lived? There are good reasons for both. If TCP is not used, how is reliability assurance & fragmentation performed? The paper mentions "dropping packets" at one point, but it's not clear what this means exactly. These choices may also have strong bearing on the accuracy of their simulation.
On a related note, the paper briefly looks into simulation over network topologies other than a complete graph. It's unclear whether cross traffic remains unmodeled in these simulations. Clearly there are topologies where congestion & collisions matter a great deal, e.g. over a (topological) bridge. It would be interesting to look at performance over specific topologies, e.g. if data and query nodes are seperated by bridge links. But, this of course tends toward a new or extended line of work investigating data placement & query host optimization to reduce latency & bandwidth.
A key detail not discussed in this paper is how all nodes storing tuples of a given relation are located and communicated with---how namespaces are used and the (high level) conduct of the multicast() operation. Following reference 18, it behaves largely as expected---namespaces (relations) are mapped onto ranges in the DHT identifier space. One interesting point here is that mapping namespaces onto high order bits could swamp that region w/ storage requests and queries on popular relations. This is a particular problem if nodes cover relatively large segments of the identifier space and node IDs aren't randomly assigned, but based on IP address or other persistent identifier. Such a scheme makes some attacks more difficult, but dooms some nodes to handling popular data & queries.
From a load balancing perspective, it makes some sense to map relations, presumably much more common than resource IDs (primary keys), onto low order bits or otherwise spread them around the network. Unfortunately, tuples then become much much harder to find. The system also most likely incurs higher latency or aggregate network bandwidth costs as a result of not grouping relations. However, effectively performing some form of load balancing does seem to be a worthwhile goal for this kind of system.
Knowledge Plane for the Internet, A. Clark, Partridge, Ramming, Wroclawski. SIGCOMM03.
- Vision paper on the need for networking to become more self managing and autonomous.
- Current Internet requires an awful lot of management, more and more things configured by hand (e.g. many routers are manually configured in order to implement policy decisions).
- Only by being more autonomous can it truly adapt at reasonable time scales.
- Need to treat this is a knowledge-based problem, ship data around, support users, and make actionable decisions.
- Machines can do a lot to help users diagnose frequently opaque problems ("Your game doesn't work because your ISP is blocking multiast traffic to that port.").
- It's no longer efficient enough to have all the intelligence at the network edges. The core needs to know what shoud be happening, because it's the one in a place to detect problems and do something about them.
- Need to coordinate across regions and the globe.
- Must handle incomplete, out of date, inconsistent, and malicious data.
- Information economies---services compete to provide data/analysis. Among many others, interesting issues here of providence---how do I know everyone's not just repackaging the same data?
- Data and knowledge based routing.
- Where does it run, where does the KP live?
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. Stoica, Morris, Karger, Kaashoek, Balakrishnan. SIGCOMM 2001.
This paper, and the Chord work, are trying to solve one basic problem: Given a key, find the node responsible for that key in a large network of nodes. A primary, and reasonable, assumption is that there are too many keys and nodes for any node to keep track of many of either. Furthere, there is no natural hierarchical or other structure to the network, as imposed by services such as DNS. How the key and the responsible node are used is up to the application, but typically involves that node storing data associated with that key.
Chord provides a relatively simple solution to this problem, based on hashing keys and nodes onto a one-dimensional space and then hopping through nodes in that space to search for responsible nodes. It provides asymptotic bounds logarithmic in the number of nodes for both storage per node and search hops.
A significant shortcoming that Chord appears to have is the difficulty of using partial keys. This makes it difficult to support tasks such as searching for data by keywords. Even with simple keywords, at first glance it seems difficult to assign a structure to the keywords such that similar (subset) searches produce similar (i.e. commonly prefixed) keys and some mechanism adopted to search the Chord network for matching keys (e.g. to find all nodes within a key range). This sounds even more difficult for more structured data and queries, such as finding all recent data or data containing particular values. However, neither of these tasks is what Chord is intended for, although they are closely related, keyword searching in particular.
It's also not exactly clear how practical Chord would be on real networks, both Internet or otherwise. For example, the frequent outages and partitions on mobile networks sound problematic, although the authors do describe some preliminary ideas on replication and detecting partitions. The protocol would also become more complicated and likely generate more network traffic if it did more to route more efficiently in terms of network latency in addition to search hops.
Looking Up Data in P2P Systems. Balakrishnan, Kaashoek, Karger, Morris, Stoica.
- Centralization is bad in most case---single point of failure, can't cope with high load, wasted resources under light load, singe attack point.
- Broadcasting out requests will also never scale.
- Manually constructing hierarchy requires infrastructure, some nodes take more load, require more management, failure at higher levels can be catastrophic.
- Key issues for DHTS are mapping keys to nodes in a load-balanced way; forwarding a lookup for a key to an appropriate node; building routing tables.
- Guaranteeing that data which is on the network will be found is critical.
- The hashtable interface presents a simple, common viewpoint upon which many applications may be built.
- Summarizes the major DHT work up to that point, identifies areas for future work.
- CAN: d-dimensional space, broken up into zones managed by nodes. Keys map into coordinate space, requests greedily forwarded toward containing zone.
- Chord: Ring of virtual identifiers, forwarding along ring, finger tables maintained that boost performance logarithmically, but are not necessary.
- Pastry: Virtual ring, sounds somewhat similar to Chord.
- Tapestry: Again similar, but focuses on trying to optimize network travel.
- Future work: Handle frequent node joins and departures; fault tolerance; proximity routing; handling malicious nodes; richer search/keys.
Outdoor Experimental Comparison of Four Ad Hoc Routing Algorithms. Gray, et al. MSWiM 2004.
- Describes construction of and results from a testbed for outdoor/indoor/simulated routing comparison.
- Constructs common routing algorithm software environment so the same implementation can be used across each. Uses mobility & connectivity traces from outdoor experiments in indoor testing.
- Each node generates a local beacon, which serves to build some picture of the connectivity, regardless of what the routing tables say. Further, for many applications this sort of traffic is also necessary (i.e. broadcasting GPS coordinates).
- Each node also generates random amounts of dummy application traffic, from the host to a randomly chosen other host. Intervals and size are on a normal distribution. This traffic emulates application traffic.
- Four algorithms were studied:
- APRL: Each node puts out a beacon including its routing table. The first determined route for a node is used, without any notion of "shortest path." Routing loops are avoided by not using a route until a dummy packet is successfully sent to a destination and returned.
- STARA-S: With some probability, a neighboring node is chosen as the next hop. Acknowledgements are sent back from the destination. Probabilities are adjusted based on the delay, such that long routes are used less frequently. This can lead to an exploration problem. Messages to determine these times are sent periodically.
- AODV: Upon required a route to a destination, AODV sends out a request. Any node receiving this that knows of a route sends back a response. Otherwise it forwards the request one. Regular beacons are sent out to determine neighbors; messages are sent out if a node disappears, to invalidate routes.
- ODMRP: Very similar to AODV, except that data is sent along with request messages. This is good when data is small, bad otherwise. Also, it maintains a notion of a multicast group, and routes to forward messages to members of that group. These forwarding groups time out periodically.
- Results from this paper are questionable; their analysis is funny in places.
- However, they note that ODRMP does very well, probably largely because shares data and control packets and the data is small. APRL and STARA either do not adapt frequently enough, or clog the network.
- Metrics used in their studies include:
- Communication efficiency: How many control and other packets are generated per message (not including fragmentation; all packets here are smaller than MTU).
- End to end latency: How long it takes for messages to be delivered. To all hosts in multicast case. Some stuff here about trying to sync clocks based on GPS data, which isn't accurate enough, so they also try to adjust for skew afterward.
- Hop count: How many nodes a message is routed through before reaching the destination. Not clear what this means if several routes are used (e.g. redundant broadcasting).
- Message delivery ratio: How many messages actually arrive at their intended destinations.
2k6/12
Optimized Link State Routing Protocol for Ad Hoc Networks. Jacquet, Muhlethaler, Clausen, Laouiti, Qayyum, Viennot. 2001. IEEE where?
- The original OLSR paper.
- OLSR is, of course, a proactive link-state based algorithm.
- It is based on determination of multi-point relays, which handle broadcast traffic for groups of nodes and whom act as the route backbones.
- Geared toward large, dense networks with a large number of nodes communicating with a large number of nodes, and the talking pairs change over time.
- Each node knows its two-hop neighborhood.
- Routing decisions are based on hop count. Routing along MPRs does not necessarily decrease latency?
On the Credibility of Manet Simulations. Andel, Yasinsac. IEEE Computer, July 2006.
- Network simulation is good & important, but there's not nearly enough rigor used in practice.
- Results are not reproducible, mis-interpreted or analyzed, and models don't reflect reality.
- Difficult to determine where to sample at; e.g. one protocol may dominate another at 10 & 100Mbps, while the second protocol is better at intermediate values. How do you find that? Exhaustive testing is time consuming.
- Simulators and their components (e.g. fading modules) vary widely & produce different results.
- People don't generate traffic in realistic ways, often using a single connection at a fixed bit rate. Need to model usage requirements more closely, e.g. need to have application-layer connections coming up in down, include ARP, changing traffic, etc.
- Need to thoroughly document test parameters for it to be useful. Put critical metadata in paper, complete information on web.
- Need to determine how many trials are necessary to be statistically valid. Some interesting notes & references on how to do this.
- Most MANET studies use random mobility models---but how often do people actually do this?
Defining Network Capacity. Chimento, Ishac. draft-ietf-ippm-bw-capacity-04.
- Note read of earlier version listed below.
- The following are thoughts passed on to the IPPM group (here):
- There should perhaps be more emphasis or explanation that this is really about path capacity. The notions defined here are "network capacity" from the perspective of one host to another specific host along specific routes, not from the perspective of the network or even across all hosts or all routes to one host. In particular, MANET literature often discusses network capacity in the sense of the entire network, how many bits can be moving forward at one time between all hosts. This is particularly interesting in that setting because of the conflicts in the following point.
- pg 7, definition 3.3, IP Layer Path Capacity. This definition assumes that the links are independent and do not affect each other. The minimum capacity along the path does not necessarily provide a lower bound on the path's capacity. For example, this is the case on multi-hop wireless networks, where each link may very well have reasonable capacity, but communication (forwarding) on each link hinders communication on the successor and predecessor links as they can't both be live. Note also that definitions which account for this probably take care of congestion as well. The note about congestion in this definition is odd, although perhaps it's really trying to say that this definiton does not address the above phenomenon of adjacent links affecting each other? In that case, this definition perhaps has some value, but a metric which accounts for that would be more intuitive and practically useful.
- pg 9, discussion 3.2, Hardware Duplicates. This argument at least needs more motivation. It should point out that IP packets do have IDs and can conceivably be identified as duplicates (do most implementations set sensible IDs for non-fragmented packets or is it ignored?). Wouldn't most (all?) implementations simply pass duplicate packets up the stack? I would not have previously thought IP included any checking for this. In that case, it's not immediately obvious why these are not included in the network capacity but corrupted data is. Granted, it reduces the amount of data that can be transmitted. But so do corrupt packets. That line of thought seems to rely on an argument about measuring transport layer bits, which is not the goal of this draft. In short, I don't necessarily believe this definition should be changed, but it's clear it will be a sticking point for most readers and at a minimum needs to be stronger. Also, the closing sentence should read "reflect the duplication with a lower value" to be slightly more clear, rather than "the lower value."
Wireless Sensor Navigation for Emergency Navigation. Tseng, Pan, Tsai. IEEE Computer, July 2006.
- Discusses adapting TORA (temporally ordered routing algorithm) to handle physical routing of people in an emergency (e.g. escape in a forest fire, building fire, etc).
- Seems based largely on potential fields.
- Includes some notion of hazardous regions, essentially making path cost not just a factor of length but hazard as well.
- Mentions Great Duck Island and Firebug.
- Not clear or well written at all.
2k6/11
Efficient Reasoning in EL+. Baader, Lutz, Suntisvaraporn.
- EL+ is EL concepts with general concept and role inclusion.
- Can handle transitive roles, role hierarchies and right identities.
- This language is rich enough for many applications.
- It remains tractable.
Measurement Study of Vehicular Internet Access Using In Situ Wi-Fi Networks, A. Bychovsky, Hull, Miu, Balakrishnan, Madden. Mobicom 2006.
- Studies the ability of wi-fi in cars driving in residential areas to connect to encountered, unplanned, open access points.
- While driving in large cities (this paper reports on Boston), enough access points are encountered to provide useful coverage.
- Typically in range of such acces points long enough to transfer useful amounts of information, about 216KB over TCP.
- Throughput not affected much by speed, though obviously the overall time of connection shortens as you drive out of range. Median of 30KB/s, fairly typical of broadband upload speeds in US.
- Association times improved by a small optimization of DHCP, remembering leases granted by access points and requesting them again if still within lease bounds.
- Some interesting points about actually running the test, e.g. problems with starting up scripts causing time lags due to the limited systems.
Measurement-based Characterization of 802.11 in a Hotspot Setting. Rodrig, Reis, Mahajan, Wetherall, Zahorjan. SIGCOMM 2005 Workshop.
- Studies wi-fi data collected around access points at a reasonably large conference.
- Finds that only 40% of wi-fi data is spent transmitting original data packets. 35% spent on retransmissions, 15% acknowledgements, 10% management traffic.
- Retransmissions are common---28% of data transmissions, 46% of the time due to decremented bit rate.
- Most transmissions are done at 1Mbps, which actually only makes contention worse. Switching bit rates is common and very frequent. Most cards strongly prefer the highest bit rate.
- Placed monitors external to but near access points. Collected most data with tcpdump. Some hacks to MadWiFi to get data on errors.
- Key point: Airtime dominated by 1Mb traffic.
Framework for IP Performance Metrics. Paxson, Almes, Mahdavi, Mathis. RFC 2330
- Wire time vs. host time; imperfect clocks.
- Why it is recommended to avoid thinking about Internet properties in probabilistic terms.
- Metrics must be concrete & well defined; repeatable; unbiased; useful; not create artificial goals.
- Includes definitions for hosts, routers, paths, clouds.
- IMMP documents are based around metric definitions of bit prefixes, e.g. M=1000, not 1024.
- Many different measurement methodologies exist:
- Direct measurement with generated traffic.
- Calculating a metric from other metrics, e.g. given propagation delay for each link, work out propagation delay for the path.
- Estimation of a metric from more aggregated metrics, e.g. estimating propagation delay from packet delays & sizes.
- Estimating a metric at the current time based on past and current data for that and other metrics.
One-way Active Measurement Protocol (OWAMP) Requirements. Shalunov, Teitelbaum. RFC 3763.
- Increasing availability of high precision time sources (GPS, CDMA), enables accurate one-way active measurement.
- But, note that for many metrics, such as loss, highly accurate timing is not critical anyway.
- Need to define:
- Test stream initiation.
- Exchange of test packets.
- Retrieval of test information.
- Poisson streams.
- Keep raw data so user can generate any statistics desired.
- Should disconnect test controls and actual test execution.
- Cannot allow control to specify construction of arbitrary packets for testing, due to security concerns.
- Have to ensure as best as possible that tests cannot be arbitrarily boosted, e.g. by watching for OWAMP traffic and giving that priority.
IP Performance Metrics (IPPM) for Spatial and Multicast. Stephan, Liang, Morton. draft-ietf-ippm-multimetrics-02.
- Typographical comments for this draft are at Reading.IPPMMultimetrics.
- Other standards define metrics for point-to-point communication, possibly over several links. This defines them for point-to-group communications.
- Spatial metrics: One of the hosts involved is neither the sender nor the receiver (e.g. a link in a multi-hop path).
- It is useful to decompose paths and determine statistics along each segment. For example, this can help determine where delay and jitter are occurring.
- One-to-group metrics: The measured packet is sent by one host and received by several destinations.
- Does not address group-to-one and group-to-group metrics, but hopes those defined here may be used to build those.
- Notes that the placement of instrumentation effects the results---you may see more before a buffer than after.
- Makes heavy use of 1-way metrics, which require a great deal of clock synchronization.
- Notes basic problem in passive measuring---packets don't have timestamps necessarily.
- Having the multicast group address optional may be useful for applying the tests to broadcast tests.
Two-way Active Measurement Protocol (TWAMP), A. Babiarz, Hedayat, Krzanowski, Yum. draft-ietf-ippm-twamp-01.
- Defines extensions to One-Way Active Measurement Protocol to handle a Two-Way test.
- This would be applicable, e.g., if clocks cannot be synced well.
Defining Network Capacity. Chimento, Ishac. draft-ietf-ippm-bw-capacity-03.
- The term "bandwidth" is overloaded. Hence, this document defines "capacity."
- Have to differentiate from theoretical link capacity versus usable capacity, e.g. after overhead bits, errors, etc.
- Must avoid sampling in a way that causes bias. For example, sampling at a frequency near a multiple of some underlying phenomena would be a problem.
- Have to include IP fragments in measurement even if full packet cannot be reassembled, as these are useful IP data. Some tests differ on this.
- However, packets partially received when the test timeout occurs are to be tossed. This has some ramifications for biasing introduced by relationships between packet size and timeouts that must be kept in mind.
Tailoring OWL for Data Intensive Ontologies. Calvanese, Giacomo, Lembo, Lenzerini, Rosati. DL2006.
- Same as below, but some more details, lower bounds on reasoning (LOGSPACE).
DL-Lite: Practical Reasoning for Rich DLs. Calvanese, Giacomo, Lenzerini, Rosati, Vetere. DL2004.
- Introduces simple but usable DL including existential qualification, limited negation, etc.
- Orients reasoning procedures around data & reasoning over queries.
- Reasoning taken as two-step process:
- Transform query into a new query, possibly in a different language.
- Evaluate over ABOX.
- For DL-LITE, the original query can be transformed into an equivalent conjunctive query to execute in a relational DB, taking advantage of its optimizations, memory structure, etc.
JCISE review.
Design Framework for Highly Concurrent Systems, A. Welsh, Gribble, Brewer, Culler. Published where?
- Highly concurrent server desgin largely based on thread or event models.
- Threads. Pros: Comfortable programming model, good tools, wrap blocking calls, utilize multiple processors. Cons: Lock contention, OS overhead leading to degradation in performance.
- Events. Pros: Less overhead, no degradation at maximum levels. Cons: Difficult programming model, can't utilize parallelism (e.g. multiple processors).
- These are actually ends of a spectrum and have to combine both to work most effectively.
- Introduces design patterns based around Tasks, Thread Pools, and Queues.
- Wrap: Associate a thread pool with a queue to create a Task Handler; eliminate degradation, utilize parallelism.
- Pipeline: Break tasks up into Stages in order to utilize downtime, different task latencies, utilize cache, etc.
- Combine: Share threads in a thread pool between low-latency Tasks to minimize number of threads.
- Replicate: Duplicate Task Handler to utilize parallelism.
- Note: Have to determine thread count threshold for machine, OS I/O request thresholds, etc. in order to tune well.
- Interesting comments in passing on use of economic models for scheduling/allocation.
2k6/10
Essential Guide to RF and Wireless, The. Weisman. 2nd Edition. Book.
- Pretty good intro to basic RF principles, antennas, cellular technology, etc. Not so much on wi-fi. The "Here's another equation you won't care about." shtick wears thin fast, but it disappears as the book goes on. Good enough that I read this cover to cover. Interesting tidbits:
- Time modulation.
- Commercial avionics run near 900Mhz, ergo rules on cell phone use.
- Cell towers usually have 3 antennas per 120 degree sector: 1 to transmit, 2 to receive and cancel multipath effects.
Architecture and Evaluation of an Unplanned 802.11b Mesh Network. Bicket, Aguayo, Biswas, Morris. Mobicom 2006.
|
|