Wednesday, 9 January 2013

Port Scanning the World


Or just “large sized networks”


Intro


In his spare time Tiago leads a Portugal-based research team. Recently they have undertaken a research project to port scan the world. In this blog post Tiago takes a look at how they approached this and the key lessons learnt so far.

The overall aim of the project is to have an automated system that scans the entire world, with a super fast querying system that delivers a real-time dashboard of incoming scans.

What sets our project apart from previous attempts to scan the world is how we collect, store and interrogate the data in conjunction with known exploits.

By combining the output with collected vulnerability data we can then deliver the following:
  • A list of vulnerable machines that we report to the different CERT teams/ responsible parties.
  • A list of machines that have been compromised and have backdoors on them.
  • Statistical data on the countries with most vulnerable machines.
  • Statistical data on the services that are running out there.


Scope

The initial proof of concept and design used the IP range assigned to Portugal. This provided a discrete scope that was not too small, yet still manageable at around 5.8 million IP addresses.

Approach

To deliver this initial proof of concept we first had to identify the test population and then configure a suitable scan. Once we had the data we then had to determine how best to analyse it. The following section outlines how we achieved this.

Information Acquisition and Service Decisions

Initially the team obtained all the IP ranges assigned to Portugal. Acquiring this information was relatively easy, as the assigned range information can be obtained from RIPE’s FTP server. We then converted these ranges into CIDRs using a set of scripts that will be published soon on the 7E GitHub page.
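The team's own conversion scripts are not shown here, so the following is only a minimal sketch of the idea. It assumes the input is in the pipe-separated "delegated" statistics format that the RIRs publish (registry|country|type|start|count|date|status) and uses Python's standard ipaddress module. The function name ranges_to_cidrs and the sample lines are hypothetical.

```python
import ipaddress


def ranges_to_cidrs(lines, country="PT"):
    """Turn delegated-stats IPv4 records for one country into CIDR blocks."""
    cidrs = []
    for line in lines:
        fields = line.strip().split("|")
        # Skip comments, the version header, summary lines and other countries.
        if len(fields) < 7 or fields[1] != country or fields[2] != "ipv4":
            continue
        first = ipaddress.IPv4Address(fields[3])
        # The fifth field is a count of addresses, not a prefix length,
        # and it is not always a power of two.
        last = first + (int(fields[4]) - 1)
        for network in ipaddress.summarize_address_range(first, last):
            cidrs.append(str(network))
    return cidrs


if __name__ == "__main__":
    # Illustrative records only (documentation address space, not real data).
    sample = [
        "ripencc|PT|ipv4|192.0.2.0|256|20100101|allocated",
        "ripencc|PT|ipv4|198.51.100.0|768|20100101|assigned",
    ]
    for cidr in ranges_to_cidrs(sample):
        print(cidr)
```

Because a record can cover a number of addresses that is not a power of two, one record may expand into several CIDR blocks, which is why the result is built with summarize_address_range rather than a single prefix calculation.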

We then determined which set of services we wanted to scan for. The services to be scanned were chosen by identifying the most common services where vulnerabilities are found, and ports that are frequently found open during security testing engagements.

The following table shows the ports and protocols that we decided to scan for.

ID  Port Number  TCP/UDP  Service
1   80           TCP      http
2   443          TCP      https
3   8080         TCP      http alternative
4   21           TCP      FTP
5   22           TCP      SSH
6   23           TCP      Telnet
7   53           UDP      DNS
8   445          TCP      Samba
9   139          TCP      Samba
10  161          UDP      SNMP
11  1900         UDP      UPNP
12  2869         TCP      UPNP
13  5353         UDP      MDNS
14  137          TCP      Netbios
15  25           TCP      SMTP
16  110          TCP      POP3
17  143          TCP      IMAP
18  3306         TCP      Mysql
19  5900         TCP      VNC Server
20  17185        UDP      VoIP
21  3389         TCP      Rdesktop
22  8082         TCP      TR 069
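As a small illustration of how a table like this can drive a scan, the sketch below keeps the port list as data and renders it in the T:/U: form that Nmap's -p option accepts for mixed TCP and UDP port lists. The names SCAN_PORTS and nmap_port_spec are made up for this example, and this is not necessarily how the team fed the ports to Nmap.

```python
# (port, protocol) pairs copied from the table above.
SCAN_PORTS = [
    (80, "tcp"), (443, "tcp"), (8080, "tcp"), (21, "tcp"), (22, "tcp"),
    (23, "tcp"), (53, "udp"), (445, "tcp"), (139, "tcp"), (161, "udp"),
    (1900, "udp"), (2869, "tcp"), (5353, "udp"), (137, "tcp"), (25, "tcp"),
    (110, "tcp"), (143, "tcp"), (3306, "tcp"), (5900, "tcp"), (17185, "udp"),
    (3389, "tcp"), (8082, "tcp"),
]


def nmap_port_spec(ports):
    """Render (port, protocol) pairs as an Nmap -p argument such as T:21,22,U:53."""
    tcp = sorted(port for port, proto in ports if proto == "tcp")
    udp = sorted(port for port, proto in ports if proto == "udp")
    parts = []
    if tcp:
        parts.append("T:" + ",".join(str(port) for port in tcp))
    if udp:
        parts.append("U:" + ",".join(str(port) for port in udp))
    return ",".join(parts)


if __name__ == "__main__":
    print(nmap_port_spec(SCAN_PORTS))
```

Note that Nmap only honours the U: part of such a specification when a UDP scan (-sU) is requested alongside the TCP scan type.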


Scanning 

We then used Nmap to scan for these services, which finally gave us a SYN scan of Portugal. For the full trials and tribulations of making this happen see Tiago’s personal blog.


As you can see from the following screenshots, there was a huge amount of data created.






This provided us with more data than we could handle, so we needed to find a way to interrogate it. For this we built a web application that would let us query the data. After several iterations we finally settled on the following approach and used MongoDB rather than MySQL.





Improvements

Analysing the Nmap scans, we noticed that the -sS scan itself was fast, but that -sV and DNS resolution were really slowing things down. So we decided to create a methodology for our port scans.
This was based on splitting the scan. First we completed a FAST scan against the IP range. Then we extracted the hosts that had ports open, and finally we ran the SLOW part of the scan only on those open ports.

Phase 1
nmap -v --randomize-hosts -sS -iL CIDRS-PORTUGAL.txt -p 21 --open -n -Pn -T5 --host-timeout 3000ms --min-hostgroup 400 --min-parallelism 100 --disable-arp-ping -oA PORT21-OPEN-TOBEFILTERED

Phase 1.1 – Filter IP addresses with open ports
cat PORT21-OPEN-TOBEFILTERED.gnmap | grep -w "21/open" | awk '{print $2}' > PORT21-OPEN

Phase 2
nmap -vvv -d -sV -p 21 -iL PORT21-OPEN -Pn -n --host-timeout 30s --disable-arp-ping --min-parallelism 100
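The Phase 1.1 filter can also be done in Python, which is handy if the results are going straight into a database. The sketch below assumes Nmap's grepable (.gnmap) output, where each host line carries a "Ports:" field of comma-separated port/state/protocol entries. The function name hosts_with_open_port is hypothetical.

```python
def hosts_with_open_port(gnmap_lines, port):
    """Return the IPs whose grepable-output line lists the given port as open."""
    wanted = str(port)
    hosts = []
    for line in gnmap_lines:
        # Only host lines with a Ports: field are of interest; this skips
        # comment lines and "Status: Up" lines.
        if not line.startswith("Host: ") or "Ports: " not in line:
            continue
        host = line.split()[1]
        # The Ports: field ends at the next tab (or at the end of the line).
        ports_field = line.split("Ports: ", 1)[1].split("\t", 1)[0]
        for entry in ports_field.split(","):
            parts = entry.strip().split("/")
            if len(parts) >= 2 and parts[0] == wanted and parts[1] == "open":
                hosts.append(host)
                break
    return hosts


if __name__ == "__main__":
    with open("PORT21-OPEN-TOBEFILTERED.gnmap") as handle:
        for ip in hosts_with_open_port(handle, 21):
            print(ip)
```

Unlike the grep version, this checks the port number and state as separate fields, so port 21 on one host cannot be confused with, say, port 121 on another.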



Things worked much better with this process and scans were coming in fast and with consistent results. We then started importing more and more sources of information into our database, including vulnerabilities, exploits and some cuckoo sandbox results.

Next on the list for improvements was UDP scanning.

Nmap has a UDP scan mode. However, when we attempted to scan anything with it, it would run for weeks and weeks and never finish. UDP is picky. There are plenty of issues when scanning UDP, and lots of references explain why.


So we went to our labs, and started messing around with other options including using Scapy. The following table demonstrates the tests we did. These tests were for 1 IP and 1 port only.

Test           Lab time results  Internet time results
Nmap UDP mode  25s 506ms         41s 050ms
Scapy v.1      2s 702ms          2s 522ms
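The Scapy script itself is not reproduced in this post. To give a feel for what a single "1 IP, 1 port" UDP test involves, here is a rough sketch that uses only Python's standard socket module instead of Scapy. The function name udp_probe, the default payload and the timeout are assumptions made for illustration.

```python
import socket


def udp_probe(host, port, payload=b"\x00", timeout=1.0):
    """Send one UDP datagram and classify the port from what comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            # connect() on a UDP socket sends nothing; it only fixes the peer
            # so that ICMP errors can be reported back on this socket.
            sock.connect((host, port))
            sock.send(payload)
            sock.recv(4096)
            return "open"            # the service answered
        except socket.timeout:
            return "open|filtered"   # silence: open but quiet, or filtered
        except ConnectionRefusedError:
            return "closed"          # ICMP port unreachable came back


if __name__ == "__main__":
    print(udp_probe("127.0.0.1", 53))
```

This shows why UDP is picky: silence is ambiguous, so every unanswered probe costs a full timeout, and whether a "closed" result is reported at all depends on the operating system and on ICMP not being filtered or rate limited along the way.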


The script we wrote for Scapy was way faster and worked well. With this success we then proceeded to do mass tests (1 port against 5.8M IP Addresses) and these were the results:

Test       Time results  Notes
Nmap       4+ weeks      Never finished
Scapy v.1  1 week        Python+Scapy only
Scapy v.2  3 days        Python+Scapy+multithreading
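The gain from v.1 to v.2 comes from not waiting on one host at a time. The sketch below shows that general idea with a thread pool from Python's standard library. It is not the team's Scapy v.2 code: scan_many is a hypothetical helper, and the probe argument stands in for whatever function sends a single probe and returns a result.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_many(targets, probe, workers=64):
    """Run probe(target) across a thread pool and map each target to its result."""
    targets = list(targets)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with targets.
        results = list(pool.map(probe, targets))
    return dict(zip(targets, results))


if __name__ == "__main__":
    # Stand-in probe for demonstration; a real one would send a packet
    # and spend most of its time waiting for a reply or a timeout.
    def fake_probe(ip):
        return "open" if ip.endswith(".1") else "open|filtered"

    print(scan_many(["192.0.2.1", "192.0.2.2"], fake_probe, workers=2))
```

Threads help here because each probe is almost entirely waiting on the network. For millions of targets the list would need to be fed in chunks rather than held in memory as a single dictionary.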

We were very happy with these results, but we still weren’t satisfied. However, we had reached the limit of our knowledge at the time when it came to mass UDP scanning. So we asked the Twitterverse for some help, and in came HD Moore to give us a hand. He sent us some of his code and explained how his UDP scanner worked. Using the same test approach his scanner finished in 4 minutes and 45 seconds. However, this approach requires the tool to be preconfigured with sample UDP packets for the port to be scanned, whereas Scapy v.2 was capable of scanning for all ports (more slowly, but still faster than Nmap!).
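To illustrate what "preconfigured with sample UDP packets" means in practice, the sketch below builds one such protocol-specific probe, a standard DNS query for port 53, and keeps it in a per-port lookup table. This is not HD Moore's code. The names build_dns_query, UDP_PROBES and payload_for, the query name and the query ID are all placeholders chosen for the example.

```python
import struct


def build_dns_query(name, query_id=0x1234):
    """Build a minimal DNS query (type A, class IN) for the given name."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


# One ready-made payload per port that is to be scanned.
UDP_PROBES = {
    53: build_dns_query("example.com"),
}


def payload_for(port):
    """Return the canned probe for a port, or None if none is configured."""
    return UDP_PROBES.get(port)


if __name__ == "__main__":
    print(payload_for(53).hex())
```

A real service is far more likely to answer a well-formed request for its own protocol than an empty datagram, which makes a reply a clear signal. The trade-off is the one described above: a port with no entry in the table has no probe to send.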

Key Points:

One of the key points we have taken from this is that tools are good, but you have to invest time and effort to tailor them to deliver the results you need. So next time you grab your security tool of choice, don’t just accept the way it works. Get coding and make it do what you need.

This is a brief overview of PTcoreSec’s project and we have only covered a few of the key points in the process.  As with any large project covering a new topic, it did not run this smoothly.  To find out about the problems the team faced and how they overcame them visit Tiago’s blog. 



Part Two:

Watch out for part two of this blog. We will analyse and explain all the different components of the scanning system. We will take a look at the different technologies  used, how they work and how they allow for an automated system. This will enable you to conduct large scale scans, store data and most importantly query the data fast and maybe just do this:


