Internet measurements: fault detection, identification, and topology discovery

Renata Teixeira
Renata Cruz Teixeira, CNRS and UPMC Sorbonne Universités


Troubleshooting faults or performance disruptions in today's Internet is at best frustrating. When users experience performance or connectivity problems, there is little they can do; the most common responses are to reboot or to call the provider's hotline. Network providers have more information with which to diagnose problems in their networks, but their troubleshooting is often largely manual and ad hoc. Automatically troubleshooting network faults or performance disruptions requires monitoring capabilities deployed at end-hosts (in or close to the customer's premises). End-host monitoring is necessary to detect the problems that affect end-users. In addition, when problems occur outside the control of the network administrator or end-user performing the troubleshooting, end-host monitoring is the only way to identify where the problem lies.

This talk will present recent advances in end-to-end Internet measurement methods to support fault diagnosis. Over the past years, we have worked on the first two steps in network troubleshooting: detecting that a problem has occurred and identifying the cause of the fault or disruption. In particular, our research has focused on designing measurement methods that improve troubleshooting accuracy and efficiency. We will cover both passive and active techniques to detect faults. Active probing relies on sending a probe message and waiting for the response, whereas passive monitoring observes users' incoming and outgoing traffic and tracks the status of active TCP connections or UDP flows. For fault identification, we will focus on the two basic techniques that use end-to-end measurements: traceroute and network tomography. Traceroute sends probes to a destination with an increasing Time-To-Live (TTL) to force each router along the path to send an ICMP error message, which reveals the IP address of the router interface that issued it. Network tomography refers to the practice of inferring unknown network properties from measurable ones. It correlates end-to-end measurements of a set of paths to infer the links responsible for a fault. We will focus on the challenges of applying these techniques in today's Internet. One key input for fault identification (and for a number of other tasks in networked systems and applications) is the topology of the underlying network. This talk will end with a discussion of techniques to measure Internet topologies.
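To make the tomography idea concrete, here is a minimal sketch of how end-to-end path observations might be correlated to infer faulty links. It is not the method presented in the talk. It assumes a simple binary model (a path fails if and only if at least one of its links has failed) and uses a greedy heuristic that picks the smallest set of links explaining all failed paths. The function name `locate_faulty_links`, the link labels, and the example paths are all hypothetical.

```python
from typing import Dict, List, Set


def locate_faulty_links(paths: Dict[str, List[str]],
                        path_ok: Dict[str, bool]) -> Set[str]:
    """Greedy binary tomography sketch (assumed model, not the talk's algorithm).

    paths   maps a path name to the list of links it traverses.
    path_ok maps a path name to True if the end-to-end probe succeeded.
    """
    # Assumption: every link on a working path is itself working.
    good_links = {link
                  for name, links in paths.items() if path_ok[name]
                  for link in links}

    # Each failed path must contain a faulty link among its remaining candidates.
    unexplained = {name: set(links) - good_links
                   for name, links in paths.items() if not path_ok[name]}

    faulty: Set[str] = set()
    while unexplained:
        # Count how many still-unexplained failed paths each candidate link is on.
        counts: Dict[str, int] = {}
        for candidates in unexplained.values():
            for link in candidates:
                counts[link] = counts.get(link, 0) + 1
        if not counts:
            break  # inconsistent measurements: a failed path has no candidate link

        # Pick the link that explains the most failures (ties broken by name).
        best = max(sorted(counts), key=lambda link: counts[link])
        faulty.add(best)
        unexplained = {name: c for name, c in unexplained.items()
                       if best not in c}
    return faulty


# Hypothetical measurements from a few monitors.
example_paths = {
    "p1": ["a-b", "b-c"],
    "p2": ["a-b", "b-d", "d-e"],
    "p3": ["f-d", "d-e"],
    "p4": ["f-d", "d-g"],
}
example_status = {"p1": True, "p2": False, "p3": False, "p4": True}

suspects = locate_faulty_links(example_paths, example_status)
print(sorted(suspects))
```

In this example, the working paths p1 and p4 exonerate links a-b, b-c, f-d, and d-g, and the only link shared by both failed paths is d-e, so the sketch reports it as the single suspect. Measurements in practice are noisy and not perfectly synchronized, so a deployed system would need to tolerate inconsistent observations rather than rely on this idealized model.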


Renata Teixeira received the B.Sc. degree in computer science and the M.Sc. degree in electrical engineering from Universidade Federal do Rio de Janeiro, Brazil, in 1997 and 1999, respectively, and the Ph.D. degree in computer science from the University of California, San Diego, in 2005, for which she was awarded the Department of Computer Science and Engineering Ph.D. Dissertation Award 2005. During her Ph.D., Teixeira worked at AT&T Labs in Florham Park. She is currently a Researcher with the Centre National de la Recherche Scientifique (CNRS) at LIP6, Université Pierre et Marie Curie, Paris, France. She is a member of the steering committee of the ACM Internet Measurement Conference and of the editorial boards of IEEE/ACM Transactions on Networking and ACM SIGCOMM Computer Communication Review. Her research interests are in computer networks, with emphasis on measurement, analysis, and management of IP networks.