One of the goals of the Dutch Honeypots Project is to collect data, and since our inception we have been working on the design of our data collection infrastructure. To start, we have chosen to run a few basic honeypot services (Kippo, Dionaea and p0f). Beginning with simple services lets us focus on getting the logging backend right before adding complexity.
To collect logs centrally, we chose Splunk: it offers many advanced features and will allow us to integrate (possibly live) data into this website. Furthermore, almost any type of log (text files, log files, databases, etc.) can be shipped to Splunk by running the Splunk Universal Forwarder application on the honeypot. The forwarder sends the data to the Splunk Indexer, which parses the logs. We also store the logs on the Indexer and keep an archive.
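As a rough illustration of this setup, the forwarder on a honeypot is driven by two small configuration files. The sketch below is not our actual configuration; the hostname, port, log paths and sourcetypes are placeholders, and real locations depend on how each honeypot is installed:

```
# outputs.conf on the honeypot: tells the Universal Forwarder
# where to send events (hostname/port are placeholders; 9997 is
# Splunk's conventional receiving port)
[tcpout]
defaultGroup = indexer_group

[tcpout:indexer_group]
server = splunk-indexer.example.org:9997
```

```
# inputs.conf on the honeypot: which log files to monitor
# (paths and sourcetypes are illustrative)
[monitor:///var/log/kippo/kippo.log]
sourcetype = kippo

[monitor:///var/log/dionaea/dionaea.log]
sourcetype = dionaea
```

With stanzas along these lines, new lines written to the monitored files are forwarded to the Indexer as they appear, which is what makes near-live views on the website feasible.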
The following diagram shows a global overview of our current infrastructure:
This infrastructure is now in place. Splunk itself still needs to be configured for the most part, so that we can generate useful overviews and crunch through all the data.
Our next step is to expand the number of honeypot services, in line with the goals set out on the Project page. Once more services are running, we will also increase the number of honeypots so we can collect more data.