Intro
Performing a forensic analysis means collecting and analyzing data in order to find evidence that could lead you to a specific breach.
Although forensics is usually considered a post-mortem activity, in the IT realm this aspect is less marked than in other forensic environments. If we are investigating a homicide, for example, we arrive when everything is already over, and we just have to collect cold evidence. On the other hand, when we are running an IT forensic investigation we cannot be sure that the event lies in the past: some traces and events related to the breach could still be alive and running, as could the breach itself.
When performing an IT forensic investigation we should be able to collect cold evidence such as hard disks, logs and whatever else we may need, as well as live “warm” data.
This part of the job requires, among other abilities, being able to sniff and analyze your network, collect packet captures and deeply inspect their content in order to find what is unusual or clearly a symptom of a breach.
What forensic means
Throughout history, we have seen a great development of forensic instruments and branches: from pathology to entomology, from engineering to psychology, several branches have developed, and we see new ones emerging as science and technology evolve.
No matter which branch or technique we are using, the aim is always the same: to analyze a crime scene in detail in order to find the truth using any available physical evidence.
Since this is quite a new science, definitions and rules are not well established in every field: some areas are still developing, such as computer and digital forensics, while others are at an advanced stage of development, pathology for example.
Computer forensics is still a new branch of the forensic world, so we have to face some aspects:
- Computer forensics is in its early or development stages
- It is different from other forensic sciences as digital, and not physical, evidence is examined
- There is little theoretical knowledge on which to base empirical hypothesis testing
- Designations are not entirely professional
- There is a lack of proper training
- There is no standardization of tools
- It is still more of an “Art” than a “Science”
Working in the cyber area nevertheless requires a fairly standard approach, common to every forensic investigation. There are some “simple” rules that any forensic investigator should follow:
Identifying the crime
If there is no crime there is no need for a forensic investigation, although it is always possible to use a forensic approach proactively to find areas where a crime could be committed.
In any case, whether proactive or reactive, an investigation has to start with an object, a target; this is necessary in order to define the instruments and tools to be used. Suspecting a database manipulation is quite different from suspecting the use of pornographic material in the work environment, and so the tools used to gather evidence will differ too.
Gathering the evidence
This is the core of any forensic activity: evidence is what we will work on, but in computer forensics that evidence is partially physical and mostly digital. This requires an extra effort in order not to pollute the evidence through our own gathering.
The process should be non-destructive and should preserve both the data collected and the environment we are working on.
There are several good reasons to be cautious. Usually we will operate in a working environment, and the evidence could be stored on production systems. Stopping or harming those environments can sometimes be worse than the crime itself, so it is strongly suggested not to take home the entire database and related servers of an e-commerce company: bringing its activity to a halt would not be appreciated. Clearly, we will have to find the best tradeoff between collecting data and preserving the work environment.
Another good reason to be cautious is that digital data can be easily modified, and this could invalidate our effort. Avoiding any change to the evidence is always necessary, but here we need to be extra cautious.
We should also consider that legal requirements sometimes force us to respect specific policies and rules; think of the EU privacy laws, which make it difficult to handle sensitive and personal data properly even during a forensic investigation.
Whenever possible we should touch the original evidence as little as possible, working on duplicates instead.
Building a chain of custody
Forensic data usually needs to be moved from the crime scene to a safer location where the forensic expert can analyze it. The chain of custody is fundamental whenever we move evidence, whether it is physical, such as an HDD or USB key, or digital, such as logs, scan reports or files. We have to be sure that nobody can tamper with our evidence.
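One way to make a custody record tamper-evident is to chain its entries with hashes, so that altering an earlier entry breaks every later link. The following is only a minimal sketch of that idea: the field names, the sample handlers and the function names are assumptions made for illustration, not a standard or legally recognized format, and truncating the end of the log would not be detected by this check alone.

```python
import hashlib
import json


def add_custody_entry(log, evidence_id, handler, action, timestamp):
    """Append an entry whose hash covers its content and the previous entry's hash."""
    prev_hash = log[-1]["entry_hash"] if log else "0" * 64
    entry = {
        "evidence_id": evidence_id,
        "handler": handler,
        "action": action,
        "timestamp": timestamp,
        "prev_hash": prev_hash,
    }
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    entry["entry_hash"] = hashlib.sha256(payload).hexdigest()
    log.append(entry)
    return entry


def verify_custody_log(log):
    """Return True if every entry's hash and back-link are intact."""
    prev_hash = "0" * 64
    for entry in log:
        body = {key: value for key, value in entry.items() if key != "entry_hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if entry["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(payload).hexdigest() != entry["entry_hash"]:
            return False
        prev_hash = entry["entry_hash"]
    return True


# Hypothetical hand-over of one piece of evidence between two analysts.
custody_log = []
add_custody_entry(custody_log, "HDD-001", "analyst A", "seized on site", "2024-01-10T09:00:00Z")
add_custody_entry(custody_log, "HDD-001", "analyst B", "received at lab", "2024-01-10T14:30:00Z")
print(verify_custody_log(custody_log))
```

In practice such a log would complement, not replace, the signed paper or electronic forms required by the jurisdiction we are working in.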
Analyzing the evidence
Finally, we can analyze the evidence we gathered. Again, most of the considerations from the previous points apply: we should never modify the original data; any modification should be reported in detail, along with any change of status; the analysis should be non-destructive when possible; and we should use the appropriate tools and course of action for the data we are analyzing.
The 3 A’s
- Acquire evidence without modification or corruption
- Authenticate that the recovered evidence is the same as the originally seized data
- Analyze data without any alterations
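The "Authenticate" step is commonly supported by comparing cryptographic hashes of the original and of the working duplicate. A minimal sketch follows, assuming SHA-256 as the digest; the function names and the throwaway files are illustrative only, and a real acquisition would hash the seized medium through a write blocker.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that large disk images do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_faithful_copy(original_path, copy_path):
    """True when both files have the same SHA-256 digest."""
    return sha256_of_file(original_path) == sha256_of_file(copy_path)


# Demonstration with throwaway files standing in for an evidence image and its duplicate.
with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "evidence.img")
    duplicate = os.path.join(workdir, "evidence_copy.img")
    for path in (original, duplicate):
        with open(path, "wb") as handle:
            handle.write(b"raw disk image bytes")
    print(is_faithful_copy(original, duplicate))
```

Recording the digest at acquisition time and again before each analysis session gives a simple way to show that the working copy has not changed.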
Forensic and the network
Computer forensic evidence can be collected from a multitude of sources: data can be found on HDDs, USB keys, CD/DVD-ROMs, RAM, ROM, CMOS memory and so on. We should also consider that data can be presented in raw form (think of a stream of bytes on a drive), as files, as metadata, as processes (SQL queries, for example) and so on. But a complex network environment offers other sources too, such as protocols and the network itself.
If the network itself can be analyzed in a forensic investigation, why should we do this, and what could we find?
The network is the communication backbone over which our data flows, and thus it can be used in a crime to modify, access, copy or tamper with data and systems. Being able to analyze this backbone is mandatory in any forensic investigation.
To do so, we should understand network protocols and the various objects we can find in a network, such as:
- Computers
- Network devices such as printers, scanners and so on
- Mobile computing devices such as tablets, phones, laptops …
- Network equipment such as routers, switches, taps, WAPs (Wireless Access Points), firewalls, proxies, IDS …
- …
But we also have to relate them to the network protocols at the various layers, the operating systems used, the application environment, the authentication environment, the management and security platforms and so on.
This is quite a complicated effort since, usually, there is neither complete and clear documentation nor a single source of information. We, as forensic investigators, need to create for ourselves a map of the environment in which we will perform our investigation.
Activities such as listing all the active processes on a machine, or all the ports currently in use, are commonly performed with tools or built-in commands of the various operating systems, and are fundamental to understanding and mapping the environment.
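As an illustration of listing the ports in use, on Linux the kernel exposes the TCP socket table as text in /proc/net/tcp. The sketch below parses that format; it is Linux-specific, assumes a little-endian machine for the address decoding, and the sample row (including its inode and counters) is made up for the example.

```python
import socket
import struct

# A few of the kernel's TCP state codes as they appear in the "st" column.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}


def decode_proc_net_address(hex_address):
    """Decode an 'IP:PORT' field such as '0100007F:0277'."""
    ip_hex, port_hex = hex_address.split(":")
    # The IPv4 address is printed as a native-endian 32-bit integer;
    # little-endian is assumed here (x86 and most ARM systems).
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_proc_net_tcp(text):
    """Return (local, remote, state) tuples for every row after the header."""
    sockets = []
    for line in text.strip().splitlines()[1:]:
        fields = line.split()
        local = decode_proc_net_address(fields[1])
        remote = decode_proc_net_address(fields[2])
        state = TCP_STATES.get(fields[3], fields[3])
        sockets.append((local, remote, state))
    return sockets


# Made-up sample in the /proc/net/tcp layout: one socket listening on 127.0.0.1:631.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0277 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345 1 0000000000000000 100 0 0 10 0
"""

for local, remote, state in parse_proc_net_tcp(SAMPLE):
    print(local, remote, state)
```

On a live Linux system the same function can be fed the content of /proc/net/tcp, keeping in mind that any command run on the suspect machine alters its state and should be documented.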
Network analysis is fundamental to trace the whole incident: how the attack began, which intermediate devices it passed through and who the victim was.
To achieve this goal we should collect evidence from logs, firewalls, internetworking devices and files, and be able to run a network scan that depicts the location and status of every device present on the network.
Once the data is collected, we should also be able to analyze it in detail, using the appropriate tools.
The first type of analysis is at the physical layer where we usually use:
- Sniffers, which put NICs in promiscuous mode so that they can be used to collect digital evidence at the physical layer
- SPAN ports and hardware taps, which help sniffing in a switched network
Sniffers collect traffic from the network and transport layers as well as from the physical and data-link layers.
Investigators should configure the size of the frames the sniffer captures: most sniffers capture only the first 68 bytes of each Ethernet frame by default, so it is advisable to configure them to collect up to 65535 bytes per frame.
The de facto standard for saving the data gathered from the network is a tcpdump (libpcap) capture file, with a “*.dump” extension.
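The configured frame size (the "snapshot length") is recorded in the capture file itself, so we can check it before trusting a capture. The sketch below reads the 24-byte global header of a classic libpcap file; it assumes the microsecond-resolution format with magic number 0xA1B2C3D4 and does not handle pcapng. The sample header is built by hand for the example.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic libpcap, microsecond timestamps


def read_pcap_global_header(data):
    """Parse the 24-byte global header found at the start of a classic pcap file."""
    if len(data) < 24:
        raise ValueError("truncated pcap header")
    if struct.unpack("<I", data[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", data[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    # magic, major, minor, timezone offset, timestamp accuracy, snaplen, link type
    _, major, minor, _, _, snaplen, linktype = struct.unpack(endian + "IHHiIII", data[:24])
    return {"version": (major, minor), "snaplen": snaplen, "linktype": linktype}


# Hand-built header: version 2.4, snaplen 65535, link type 1 (Ethernet).
sample_header = struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
print(read_pcap_global_header(sample_header))
```

A snaplen of 68 in this header would tell us that payloads were truncated at capture time and cannot be recovered from the file.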
At the Data-Link Layer we usually check
- The MAC address, which is part of the data-link layer and is associated with the hardware of a computer
- The ARP table of a router, which comes in handy for investigating network attacks, as it contains IP addresses associated with their respective MAC addresses
- The DHCP database, which also provides a means of determining the MAC address associated with the computer in custody: the DHCP server maintains a list of recent queries along with the MAC address and IP address
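One simple check on ARP data is to look for a single MAC address that answers for several IP addresses, which can hint at ARP poisoning (although multihomed hosts or routers can legitimately do the same). The sketch below assumes the ARP table has already been exported as (IP, MAC) pairs; the addresses are invented for the example.

```python
from collections import defaultdict


def macs_claiming_multiple_ips(arp_entries):
    """Map each MAC address that appears with more than one IP to the sorted list of those IPs."""
    ips_by_mac = defaultdict(set)
    for ip, mac in arp_entries:
        ips_by_mac[mac.lower()].add(ip)
    return {mac: sorted(ips) for mac, ips in ips_by_mac.items() if len(ips) > 1}


# Invented table: the same MAC answers for the gateway (.1) and for host .50.
arp_table = [
    ("192.168.1.1", "aa:bb:cc:00:00:99"),
    ("192.168.1.50", "AA:BB:CC:00:00:99"),
    ("192.168.1.20", "aa:bb:cc:00:00:20"),
]
print(macs_claiming_multiple_ips(arp_table))
```

Cross-checking the flagged MAC address against the DHCP database is then a natural next step to find out which machine it belongs to.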
At the Network and transport layer we usually collect:
Authentication logs:
- Shows accounts related to a particular event
- The IP address of the authenticated user gets stored in this log file
Application logs :
- Application logging is meant for the storage of auditing information, which includes information produced by application activity
- Web server logs help identify the system which was used as a means to commit the crime
Operating System logs:
- The operating system maintains a log of events such as errors, system reboots, shutdowns, security policy changes, and user and group management
Before enabling logging, however, one should bear in mind what to log; otherwise the result can be an over-collection of data that makes it difficult to trace the critical event.
Network device logs:
- Network devices such as routers and firewalls are usually configured to send a copy of their logs to a remote server, as the memory of these devices is limited
The logs from network devices can be used as evidence for a particular investigation on that network.
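To illustrate how authentication logs tie accounts and IP addresses to an event, the sketch below counts failed logins per source address. It assumes OpenSSH-style syslog lines; other services and platforms log in different formats, and the sample lines, host name and addresses are invented.

```python
import re
from collections import Counter

# Matches OpenSSH messages such as
# "Failed password for [invalid user] NAME from ADDRESS port N ssh2".
FAILED_LOGIN = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+) port \d+")


def failed_logins_by_source(lines):
    """Count failed password attempts per source IP address."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


SAMPLE_LOG = [
    "Jan 10 03:12:45 gw sshd[1234]: Failed password for root from 203.0.113.5 port 52144 ssh2",
    "Jan 10 03:12:47 gw sshd[1234]: Failed password for invalid user admin from 203.0.113.5 port 52150 ssh2",
    "Jan 10 08:01:02 gw sshd[2210]: Accepted password for alice from 192.0.2.17 port 40022 ssh2",
    "Jan 10 09:30:11 gw sshd[2290]: Failed password for bob from 198.51.100.9 port 41000 ssh2",
]
print(failed_logins_by_source(SAMPLE_LOG).most_common())
```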
Some of this data requires administrative privileges to be collected, while some relies on tools that help gather, read and analyze it. Since we will have to correlate data coming from different sources in order to reconstruct a process, it is mandatory to align all timestamps. If the clocks are not already synchronized, we should collect the time settings of each device providing a log and take any time gaps into account in order to normalize all the references.
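The normalization described above can be sketched as a conversion of every device-local timestamp to UTC, corrected by the clock skew we measured for that device. The timestamp format, the offsets and the skew values below are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def normalize_timestamp(local_text, utc_offset_hours, clock_skew_seconds=0.0):
    """Convert a device-local log timestamp to UTC, correcting measured clock skew.

    clock_skew_seconds is how far the device clock runs ahead of the
    reference clock (negative if it runs behind).
    """
    naive = datetime.strptime(local_text, "%Y-%m-%d %H:%M:%S")
    aware = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return aware.astimezone(timezone.utc) - timedelta(seconds=clock_skew_seconds)


# Hypothetical firewall logging in UTC+1 with a clock 30 seconds fast,
# and a server logging in UTC with an accurate clock.
firewall_event = normalize_timestamp("2024-01-10 15:00:30", utc_offset_hours=1, clock_skew_seconds=30)
server_event = normalize_timestamp("2024-01-10 14:00:02", utc_offset_hours=0)
print(firewall_event.isoformat(), server_event.isoformat())
print(server_event - firewall_event)
```

Once normalized, the two events turn out to be two seconds apart, whereas the raw log lines suggest a gap of about an hour.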
Wireshark, the standard tool for deep packet inspection
If you search on Google for a network sniffer or protocol analyzer, one of the first hits will be Wireshark (www.wireshark.org).
Wireshark is an open source network packet analyzer. Without any special hardware or reconfiguration, it can capture live data going in and out over any of your box’s network interfaces: Ethernet, WiFi, PPP, loopback, even USB. Typically it’s used as a forensics tool for troubleshooting network problems like congestion, high latency, or protocol errors — but you don’t want to wait until your network is in trouble to learn how to use it.
Wireshark has some features not found in many of the other free sniffers that are available, such as conversation reassembly and a capture/display filter syntax that is more advanced than most.
On Windows, Wireshark requires the WinPcap drivers, which are very simple to set up. WinPcap can be downloaded from www.winpcap.org. The WinPcap drivers enable a greater degree of access and control to the network communications at the packet level than is available by going through the Windows network drivers. For this reason, many third-party utilities that do heavy packet manipulation make use of WinPcap.
Profiling normal network traffic is not the goal of Wireshark: it is a tool to help you recognize aberrant behavior when you are trying to track down the source of a problem. Unfortunately, there is no quick-and-easy path to tracking down the root cause of high latency or slow throughput.
Sure, if there is a zombie machine on your network infected by a trojan, you may easily flag it as an infected spam bot when you see it initiate thousands of SMTP connections per hour — and detecting viruses and malware is an important forensic task. But tracking down why one of your database servers is always a little bit slower than the other could involve quite a bit more digging and analysis.
That’s why taking some time to profile your network traffic under what we’ll assume are normal operating conditions is valuable. You can get a feel for how often WiFi clients see TCP retransmissions; if the rate doubles or triples, and you have not added any more machines, then you may need to look to see whether any one client is behaving differently than the others, or if you are simply seeing signal degradation. When diagnosing bandwidth woes, if one of your local servers is consistently timing out and dropping connections, it could be an application-level problem. But if Wireshark’s logs reveal that it is the remote server resetting connections, then you need to make a phone call.
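The comparison against a baseline can be reduced to simple arithmetic once the counters have been read from the capture statistics. The sketch below assumes we already have the number of retransmitted and total TCP segments for a baseline capture and for a current one; the counts and the factor of 2 ("the rate doubles") are illustrative.

```python
def retransmission_rate(retransmitted_segments, total_segments):
    """Fraction of TCP segments that were retransmissions."""
    if total_segments == 0:
        return 0.0
    return retransmitted_segments / total_segments


def deviates_from_baseline(baseline_rate, current_rate, factor=2.0):
    """True when the current rate is at least `factor` times the baseline."""
    if baseline_rate == 0:
        return current_rate > 0
    return current_rate >= factor * baseline_rate


# Invented counters: 1.2% retransmissions under normal conditions, 3.1% today.
baseline = retransmission_rate(120, 10000)
today = retransmission_rate(310, 10000)
print(deviates_from_baseline(baseline, today))
```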
Here, the tutorials at the Wireshark site are an invaluable aid. The wiki has some basic network troubleshooting pages, as well as links to resources hosted elsewhere.
Wireshark includes a lot of features that will help you analyze your network when you are tracking down the source of your problem. For example, you can run statistical comparisons between two saved packet captures; this allows you to perform a capture when you are experiencing the problem, and compare it against a data set you collect as a control group when things are running smoothly. Likewise, you can collect and compare captures from two different machines, for instance on different network segments or with different configurations.
This is also why it is so helpful that there are builds of Wireshark available for the proprietary operating systems: when chasing down a performance problem, you may need to collect data from every source. Last but not least, although Wireshark is always referred to as a network analyzer, the truth is it can analyze other things as well, including USB traffic and even Unix socket connections between applications. So even if you master your TCP/IP traffic this weekend, you may still have room to explore.
Wireshark can also capture wireless control traffic. To capture the control frames, the system must support monitor mode on the card.
Its availability depends on the platform, the driver and libpcap. On most Linux systems it is possible to put the card into monitor mode with iwconfig or, more easily, with the airmon-ng script (for example, airmon-ng start wlan0). On Windows, the AirPcap adapters from Riverbed allow the capture of full raw wireless traffic.
NOTE
Wireshark is packet-centric (not data-centric)
Wireshark doesn’t work well with large network capture files (you can turn all packet coloring rules off to increase performance).
Since Wireshark is so powerful, we may wonder whether it can do everything we need. The answer is “almost”. Where Wireshark is limited, we can find or write extensions that allow deeper and further analysis.
There are several plugins and add-ons for Wireshark on the internet; of course the very first can be found on the Wireshark site, and are the ones crafted by Riverbed Technology, the main sponsor of the Wireshark project.
Plugins, add-ons and collateral software allow the forensic analyst to analyze the data presented by Wireshark in depth. There are, in fact, several reasons to extend Wireshark's already astonishing capabilities.
Let’s present some.
Deep diving protocols:
Although the standard version of Wireshark is able to process the most widely used protocols, there may be some specific areas that are not covered, or where the features provided by Wireshark are not exhaustive.
The basic way to address this kind of need is to find or build a component able to analyze the specific protocol you are interested in.
Wireshark captures raw network data and then shows you the individual components of each packet; it also allows you to create your own components to read the data. Those components are called dissectors.
A dissector is able to read a specific sequence inside your frame and identify the protocol and its components. It can then pass any encapsulated protocol to the appropriate dissector for further analysis.
Dissectors:
Each dissector decodes its part of the protocol, and then hands off decoding to subsequent dissectors for an encapsulated protocol.
Every dissection starts with the Frame dissector, which dissects the packet details of the capture file itself (e.g. timestamps). From there it passes the data on to the lowest-level data dissector, e.g. the Ethernet dissector for the Ethernet header. The payload is then passed on to the next dissector (e.g. IP) and so on. At each stage, details of the packet will be decoded and displayed.
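The hand-off between dissectors can be illustrated with a few lines of Python. This is not Wireshark's dissector API (real dissectors are written in C or Lua against libwireshark); it is only a sketch of the chaining idea, with made-up function names, lookup tables and a hand-built sample frame.

```python
import socket
import struct


def format_mac(raw):
    return ":".join(f"{byte:02x}" for byte in raw)


def dissect_frame(data, tree):
    # The frame dissector records capture-level details, then hands off.
    tree.append(("frame", {"length": len(data)}))
    dissect_ethernet(data, tree)


def dissect_ethernet(data, tree):
    dst, src, ethertype = struct.unpack("!6s6sH", data[:14])
    tree.append(("eth", {"dst": format_mac(dst), "src": format_mac(src), "type": ethertype}))
    handler = ETHERTYPE_DISSECTORS.get(ethertype)
    if handler:
        handler(data[14:], tree)  # pass the payload to the next dissector


def dissect_ipv4(data, tree):
    header_length = (data[0] & 0x0F) * 4
    total_length = struct.unpack("!H", data[2:4])[0]
    protocol = data[9]
    tree.append(("ip", {
        "src": socket.inet_ntoa(data[12:16]),
        "dst": socket.inet_ntoa(data[16:20]),
        "proto": protocol,
    }))
    handler = IP_PROTOCOL_DISSECTORS.get(protocol)
    if handler:
        handler(data[header_length:total_length], tree)


def dissect_tcp(data, tree):
    src_port, dst_port = struct.unpack("!HH", data[:4])
    tree.append(("tcp", {"src_port": src_port, "dst_port": dst_port}))


# Lookup tables that decide which dissector handles an encapsulated protocol.
ETHERTYPE_DISSECTORS = {0x0800: dissect_ipv4}
IP_PROTOCOL_DISSECTORS = {6: dissect_tcp}

# Hand-built sample frame: Ethernet + IPv4 + TCP headers, no payload.
ethernet = struct.pack("!6s6sH", bytes.fromhex("aabbccddeeff"), bytes.fromhex("112233445566"), 0x0800)
ipv4 = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                   socket.inet_aton("192.168.1.10"), socket.inet_aton("192.168.1.20"))
tcp = struct.pack("!HHIIBBHHH", 49152, 80, 0, 0, 0x50, 0x02, 65535, 0, 0)

tree = []
dissect_frame(ethernet + ipv4 + tcp, tree)
for layer, fields in tree:
    print(layer, fields)
```

Each function decodes only its own header and uses a lookup table to choose the next dissector, which mirrors the way a new protocol can be supported by registering one more handler.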
Dissection can be implemented in two possible ways. One is to have a dissector module compiled into the main program, which means it’s always available. Another way is to make a plugin (a shared library or DLL) that registers itself to handle dissection.
There is little difference in having your dissector as either a plugin or built-in. On the Windows platform you have limited function access through the ABI exposed in libwireshark.def, but that is mostly complete.
The big plus is that your rebuild cycle for a plugin is much shorter than for a built-in one. So starting with a plugin makes initial development simpler, while the finished code may make more sense as a built-in dissector.
Monitoring/tracing tools
The following tools can process the libpcap-format files that Wireshark and TShark produce or can perform network traffic capture and analysis functions complementary to those performed by Wireshark. Some of them are useful for very specific needs or tasks; sometimes the same results can be obtained with some digging and work in Wireshark itself, but some of these tools will make your life easier.
In brackets you will find the program license and the supported operating systems, in bold the ones we can’t live without.
- Cap’r Mak’r generates new pcaps for various protocols
- Chaosreader Extracts data streams from TCP connections and writes each stream to a file (GPL, Windows, various UN*Xes)
- CloudShark Ability to view and analyze captures in a browser, annotate and tag them, and share them with a URL.
- Cookie Cadger Helps identify information leakage from applications that utilize insecure HTTP GET requests.
- Driftnet It is a program which listens to network traffic and picks out images from TCP streams it observes (GPL, Linux)
- Ettercap Allows for sniffing of machines in a switched network LAN[1] (GPL, BSD/Linux/Solaris)
- HUNT Allows for sniffing of machines in a switched network LAN as well as providing a very easy to use API to modify the intercepted frames before they are forwarded. Intercept and Modify. (GPL, Linux)
We should remember that, in a switched environment, Wireshark can analyze only the traffic presented to the sniffed network interface (direct and broadcast traffic). To inspect all the traffic we should use a SPAN port, or use tools like Ettercap or HUNT to perform classic MAC-poisoning sniffing.
Both of those tools can do a lot of damage on the network, so it is strongly suggested to pay close attention, especially if you are collecting data in a live production environment. Although it could be of some use to use the router's MAC address in order to explore the incoming/outgoing internet traffic, the performance of the network would probably drop to an unacceptably low level.
- ipsumdump summarizes TCP/IP dump files into a self-describing ASCII format easily readable by humans and programs (uses the Click modular router).
This tool is particularly useful when data needs to be presented and explained to people who are not so familiar with the forensic analysis we are performing.
- Moloch is an open source, large scale IPv4 packet capturing (PCAP), indexing and database system.
- Mu DoS converts any packet into a DoS generator
- Ntop Network top – tool that lets you analyze network traffic statistics (GPL, FreeBSD/Linux/Unix)
- Online PCAP to MSC chart Generator generates MSC arrow diagram charts from PCAP files.
- p0f versatile passive OS fingerprinting and many other tricks (Freeware, BSD/Linux/Win32/…).
P0f is a tool that utilizes an array of passive traffic fingerprinting mechanisms to identify the players behind any incidental TCP/IP communications (often as little as a single normal SYN) without interfering in any way. Version 3 is a complete rewrite of the original codebase, incorporating a significant number of improvements to network-level fingerprinting, and introducing the ability to reason about application-level payloads (e.g., HTTP).
Some of p0f’s capabilities include:
Highly scalable and extremely fast identification of the operating system and software on both endpoints of a vanilla TCP connection – especially in settings where NMap probes are blocked, too slow, unreliable, or would simply set off alarms.
Measurement of system uptime and network hookup, distance (including topology behind NAT or packet filters), user language preferences, and so on.
Automated detection of connection sharing / NAT, load balancing, and application-level proxying setups.
Detection of clients and servers that forge declarative statements such as X-Mailer or User-Agent.
The tool can be operated in the foreground or as a daemon, and offers a simple real-time API for third-party components that wish to obtain additional information about the actors they are talking to.
Common uses for p0f include reconnaissance during penetration tests; routine network monitoring; detection of unauthorized network interconnects in corporate environments; providing signals for abuse-prevention tools; and miscellaneous forensics.
- PacketShark™ A handheld hardware tap for 100% on-field capturing of Ethernet packets at wire speed; store captured data using an external storage device (SD memory card) and analyze using wireshark
- pcap_diff compares pcap files for received, missing or altered packets.
- Prelude Another network intrusion detection system (GPL, BSD/Linux/Unix)
- RRDtool is “a system to store and display time-series data (i.e. network bandwidth, machine-room temperature, server load average)”. (GPL, various UN*Xes) Many RRDtool-based applications are listed on the RRD World page.
- Snort Network intrusion detection system (GPL, BSD/Linux/Unix/Win32)
- SplitCap A pcap file splitter.
- tcpflow Extracts data streams from TCP connections and writes each stream to a file (GPL, UN*X/Windows)
- Tele Traffic Tapper Graphical traffic-monitoring tool; can also read saved capture files (BSD style?, BSD/Linux)
- TPCAT will analyze two packet captures (taken on each side of the firewall as an example) and report any packets that were seen on the source capture but didn’t make it to the destination (GPLv2, any OS with Python and pcapy)
- VisualEther Protocol Analyzer generates sequence diagrams from Wireshark PDML output (Win32)
- Xplico A network forensic analysis tool (GPL, Linux only)
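To give a feel for the passive measurements mentioned for p0f in the list above, the network distance of a host can be roughly estimated from the TTL of a packet we merely observe. The sketch below is a simplified heuristic, not p0f's actual algorithm or signature format: it assumes the sender started from one of a few commonly used initial TTL values.

```python
# Initial TTL values commonly used by IP stacks (an assumption of this heuristic).
COMMON_INITIAL_TTLS = (32, 64, 128, 255)


def estimate_initial_ttl_and_distance(observed_ttl):
    """Guess the sender's initial TTL and the number of hops the packet crossed."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    initial = min(ttl for ttl in COMMON_INITIAL_TTLS if ttl >= observed_ttl)
    return initial, initial - observed_ttl


for ttl in (57, 128, 250):
    print(ttl, estimate_initial_ttl_and_distance(ttl))
```

Because nothing is sent to the observed host, this kind of estimate can be made from a saved capture without touching the network under investigation.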
Traffic generators
These tools will either generate traffic and transmit it, retransmit traffic from a capture file, perhaps with changes, or permit you to edit traffic in a capture file and retransmit it.
- Bit-Twist includes bittwist, to retransmit traffic from a capture file, and bittwiste, to edit a capture file and write the result to another file (GPL, BSD/Linux/OSX/Windows)
- Cat Karat – Easy packet generation tool that allows you to build custom packets for firewall or target testing and has integrated scripting ability for automated testing. (Windows)
- D-ITG – (Distributed Internet Traffic Generator) is a platform capable of producing traffic at packet level, accurately replicating appropriate stochastic processes for both IDT (Inter Departure Time) and PS (Packet Size) random variables (exponential, uniform, cauchy, normal, pareto, …).
- Mausezahn Mausezahn is a free fast traffic generator written in C which allows you to send nearly every possible and impossible packet.
- Nemesis is a command-line network packet crafting and injection utility. Nemesis can natively craft and inject ARP, DNS, ETHERNET, ICMP, IGMP, IP, OSPF, RIP, TCP and UDP packets. (GPL, BSD/Linux/Solaris/Mac OSX/Win32)
- Network Expect is a framework that allows to easily build tools that can interact with network traffic. Following a script, traffic can be injected into the network, and decisions can be taken, and acted upon, based on received network traffic. An interpreted language provides branching and high-level control structures to direct the interaction with the network. Network Expect uses libwireshark for all packet dissection tasks. (GPL, BSD/Linux/OSX)
- Network Traffic Generator Client/Server based TCP/UDP traffic generator (GPL, BSD/Linux/Win32)
- Ostinato is a network packet and traffic generator and analyzer with a friendly GUI. It aims to be “Wireshark in Reverse” and thus become complementary to Wireshark. It features custom packet crafting with editing of any field for several protocols: Ethernet, 802.3, LLC SNAP, VLAN (with Q-in-Q), ARP, IPv4, IPv6, IP-in-IP a.k.a IP Tunneling, TCP, UDP, ICMP, IGMP, MLD, HTTP, SIP, RTSP, NNTP, etc. It is useful for both functional and performance testing. (GPL, Linux/BSD/OSX/Win32)
- Scapy Scapy is a powerful interactive packet manipulation program (in Python). It is able to forge or decode packets of a wide number of protocols, send them on the wire, capture them, match requests and replies, and much more. (GPL, BSD/Linux/OSX)
A more exhaustive collection of traffic generators can be found at:
Capture file anonymization
These tools can be used to “anonymize” capture files, replacing fields such as IP addresses with randomized values. This is a mandatory requirement when presenting data to unauthorized people. We should consider that in many countries IP addresses are considered “sensitive” data, so processing and showing them may be governed by privacy legislation.
- AnonTool from the CRAWDAD archive of wireless traffic.
- The bittwiste tool from Bit-Twist.
- The Crypto-PAn tool.
- The pktanon tool from the Karlsruhe Institute of Technology Institute of Telematics.
- The SCRUB-tcpdump tool.
- The tcpdpriv tool from the Internet Traffic Archive.
- The tcprewrite tool from tcpreplay.
- The TraceWrangler tool.
There’s a categorized list of anonymization tools at the CAIDA site http://www.caida.org/tools/taxonomy/anontaxonomy.xml
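The basic idea of replacing IP addresses consistently can be sketched with a keyed hash: the same address always maps to the same pseudonym, so conversations remain recognizable, while the mapping cannot be rebuilt without the key. This is only an illustration and does not reproduce any of the tools listed above; unlike Crypto-PAn it does not preserve network prefixes, and truncating the digest to 32 bits means two addresses could occasionally collide. The key and addresses are placeholders.

```python
import hashlib
import hmac
import ipaddress


def pseudonymize_ipv4(address, secret_key):
    """Map an IPv4 address to a stable pseudonymous IPv4 address using HMAC-SHA256."""
    packed = ipaddress.IPv4Address(address).packed
    digest = hmac.new(secret_key, packed, hashlib.sha256).digest()
    # Use the first four bytes of the digest as the replacement address.
    return str(ipaddress.IPv4Address(digest[:4]))


key = b"replace-with-a-random-secret"  # placeholder key for the example
for original in ("192.168.1.10", "192.168.1.20", "192.168.1.10"):
    print(original, "->", pseudonymize_ipv4(original, key))
```

The key must be kept as carefully as the original capture, since anyone holding it can test candidate addresses against the pseudonyms.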
Capture file repair
These tools attempt to repair damaged capture files as much as can be done.
- pcapfix can repair corrupted or truncated capture files.