Tuesday, February 19, 2008

Untraceable (2008)

Untraceable follows an FBI cyber crimes investigator as she attempts to track down a spree killer who posts live videos of his victims being tortured and killed on the Internet. As if that was not bad enough, the victims are killed faster as more people visit the Web site.

The title comes from the fact that the FBI investigator, played by Diane Lane, can neither track down the killer nor shut down his Web site.

So how did the suspect hide and prevent the FBI from bringing his site down? The movie describes it this way:


"The site's IP keeps changing constantly. Each new address is an exploited server. It is running a mirror of the site. The site's Russian main server uses a low TTL so that your computer constantly queries the name server's record. And that is how it gives you a new address so consistently. There are thousands of exploited servers on the Internet, so he is not going to run out of victims anytime soon. But he is accessing these servers so quickly; he has got to be running his own botnet. I mean, we are black holing these IPs. Every time we shut one mirror down another one pops up."

What this technical monologue describes, with surprising accuracy and correct pronunciation, is fast-flux DNS. Let me explain how it works in a little more detail.

DNS, or the Domain Name System, is the set of servers--sometimes known as name servers--that turn human-readable domain names, such as www.killwithme.com, into numeric Internet addresses, such as 64.37.182.110. These mappings--known as DNS records--include a mechanism that tells the requester how long the mapping is valid. That mechanism is known as time-to-live, or TTL.
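To make the TTL idea concrete, here is a minimal Python sketch of a client-side cache that honors it. The names DnsRecord and TtlCache are made up for illustration, and time is passed in as a plain number of seconds so the behavior is easy to follow; a real resolver is considerably more involved.

```python
from dataclasses import dataclass


@dataclass
class DnsRecord:
    name: str      # human-readable domain name
    address: str   # numeric Internet address it maps to
    ttl: int       # seconds the mapping may be reused


class TtlCache:
    """Hypothetical client-side cache that forgets a mapping once its TTL runs out."""

    def __init__(self):
        self._entries = {}  # name -> (address, expires_at)

    def put(self, record, now):
        self._entries[record.name] = (record.address, now + record.ttl)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if now >= expires_at:
            # TTL expired: the client has to ask the name server again.
            del self._entries[name]
            return None
        return address


cache = TtlCache()
cache.put(DnsRecord("www.killwithme.com", "64.37.182.110", ttl=60), now=0)
still_cached = cache.get("www.killwithme.com", now=59)
after_expiry = cache.get("www.killwithme.com", now=60)
```

With a 60-second TTL the cached answer is reused for a minute and then dropped, which is exactly the window a low TTL is meant to shrink.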

Bot herders, the nefarious operators of botnets, figured out that you could use a low TTL to avoid having a botnet or phishing site shut down. To do this, these lawless vagabonds create DNS records that map a single domain to hundreds or thousands of IP addresses. Combined with a low TTL, which causes the address mappings to update as often as once per minute, this makes it possible to deploy a phishing site or botnet controller across thousands of mirrors--computers with copies of the Web site or controller application--while the ISPs' security staff play whac-a-mole trying to knock the servers off the Net.
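The whac-a-mole effect can be sketched as a toy simulation. Everything here is an assumption for illustration: FastFluxZone and whac_a_mole are invented names, the mirrors use documentation-only addresses, and the rotation is a simple round-robin by TTL window rather than anything a real botnet is known to use.

```python
class FastFluxZone:
    """Toy name server for one domain backed by many mirrors (illustrative only)."""

    def __init__(self, mirrors, ttl=60):
        self.mirrors = list(mirrors)
        self.ttl = ttl          # seconds each answer stays valid
        self.blocked = set()    # mirrors the defenders have knocked offline

    def block(self, address):
        self.blocked.add(address)

    def resolve(self, now):
        live = [m for m in self.mirrors if m not in self.blocked]
        if not live:
            return None
        window = int(now // self.ttl)  # a new answer every TTL window
        return live[window % len(live)], self.ttl


def whac_a_mole(zone, minutes):
    """Each minute a client resolves the name, then defenders block that mirror."""
    seen = []
    for minute in range(minutes):
        answer = zone.resolve(now=minute * 60)
        if answer is None:
            break  # every mirror is gone and the site is finally down
        address, _ttl = answer
        seen.append(address)
        zone.block(address)
    return seen


zone = FastFluxZone([f"192.0.2.{i}" for i in range(1, 6)], ttl=60)
seen = whac_a_mole(zone, minutes=3)
still_up = zone.resolve(now=180) is not None
```

After three takedowns the name still resolves, because each block only removes one of the mirrors behind it. Scale the pool to thousands of exploited servers and the defenders never catch up.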

Although the screenwriters got the description of fast-flux correct, in the scenario they presented it would not have prevented the FBI from tracking down the source of the videos. What the screenwriters missed in their logic was the fact that the videos were live, not pre-recorded. A pre-recorded video would have been extremely difficult to track down unless the investigators knew exactly when it was seeded to the mirrors; had the video been seeded into a peer-to-peer network for distribution, it would have made the source almost impossible to find.

With live video, on the other hand, a network stream would have to originate, in real time, from the physical location where the event is taking place. To track down the source of a live video, the FBI could have started with a single mirror of the Web site and worked backwards based on the network traffic being sent to it. As you can see from the diagram below, even if the killer hid behind multiple layers of servers, a properly trained investigator would still have been able to determine the origin of the video by tracing the network traffic from node to node.
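The node-to-node trace described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the addresses are documentation-only placeholders, the traffic rates are invented, and the trace_upstream helper simply follows the heaviest inbound stream at each hop, which is one plausible heuristic and not a description of any actual FBI procedure.

```python
# Hypothetical inbound traffic per node: node -> [(source, bytes per second)].
INBOUND = {
    "203.0.113.10": [("198.51.100.7", 1_500_000), ("198.51.100.99", 20_000)],
    "198.51.100.7": [("192.0.2.44", 1_500_000), ("203.0.113.77", 5_000)],
    "192.0.2.44": [("192.0.2.200", 1_500_000)],
}


def trace_upstream(inbound_flows, start, max_hops=32):
    """Walk backwards from a mirror, following the heaviest inbound stream."""
    path = [start]
    current = start
    for _ in range(max_hops):
        # Ignore nodes already visited so a loop cannot trap the trace.
        candidates = [
            (src, rate)
            for src, rate in inbound_flows.get(current, [])
            if src not in path
        ]
        if not candidates:
            break  # nothing feeds this node, so treat it as the origin
        current = max(candidates, key=lambda flow: flow[1])[0]
        path.append(current)
    return path


path = trace_upstream(INBOUND, "203.0.113.10")
origin = path[-1]
```

Starting from one mirror, the trace hops through each relay until it reaches a node that receives no video stream of its own, which is where the camera has to be.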


The investigator would have used data generated by a tool known as NetFlow. NetFlow works by extracting information from network packets that are received by a router's interface and creating records that describe the unique flows. For the layman, flows are groups of similar packets from the same source and destination that are sent and received during the same period of time. For the more advanced reader, flows are based on the 5-tuple, which is source address and port, destination address and port, and protocol. The start time of a flow is set when the first packet is seen, and an aging timer is used to determine the end time: each time the router sees a new packet it resets the aging timer, and if the timer reaches zero before another packet is seen, the flow is considered complete. For TCP, the end time is also determined when a session teardown is initiated with FIN/FIN-ACK packets.
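For readers who prefer code, here is a rough Python sketch of the flow accounting just described: packets are grouped by 5-tuple, an aging timer closes idle flows, and a TCP FIN closes a flow immediately. The Flow and FlowCache names and the 15-second timeout are assumptions chosen for the example; this is not how any particular router implements NetFlow.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    key: tuple         # (src, sport, dst, dport, proto)
    start: float       # time the first packet was seen
    last_seen: float   # time the most recent packet was seen
    packets: int = 0
    bytes: int = 0


class FlowCache:
    """Toy flow table keyed on the 5-tuple (illustrative only)."""

    def __init__(self, inactive_timeout=15.0):
        self.inactive_timeout = inactive_timeout  # assumed aging timer, in seconds
        self.active = {}
        self.exported = []

    def expire(self, now):
        # A flow is complete once its aging timer runs out with no new packets.
        idle = [k for k, f in self.active.items()
                if now - f.last_seen >= self.inactive_timeout]
        for key in idle:
            self.exported.append(self.active.pop(key))

    def observe(self, now, src, sport, dst, dport, proto, size, fin=False):
        self.expire(now)
        key = (src, sport, dst, dport, proto)
        flow = self.active.get(key)
        if flow is None:
            flow = self.active[key] = Flow(key, start=now, last_seen=now)
        flow.last_seen = now  # seeing a packet resets the aging timer
        flow.packets += 1
        flow.bytes += size
        if proto == "tcp" and fin:
            # TCP teardown ends the flow without waiting for the timer.
            self.exported.append(self.active.pop(key))


cache = FlowCache(inactive_timeout=15.0)
video = ("192.0.2.200", 5004, "192.0.2.44", 5004, "udp")
cache.observe(0, *video, size=1000)
cache.observe(5, *video, size=1000)
cache.observe(30, *video, size=500)  # 25 s gap: the earlier flow has aged out
cache.observe(31, "192.0.2.200", 40000, "192.0.2.44", 80, "tcp", size=60, fin=True)
```

In this run the first two UDP packets are exported as one completed flow, the third packet starts a fresh flow with the same 5-tuple, and the TCP packet carrying FIN is exported right away.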

The live video would have produced an easily identifiable flow that could have been used to track the network location of the creator and subsequently their physical location. With a little router command line magic, it could have been done in real time. Whether the FBI could have mobilized fast enough to save the victim and catch the bad guy is another issue, but the bad guy would definitely have been traceable.
