Article Info

Validation of Individual Identification Through Decision Tree Packet Header Profiling

Khairul Osman, T'Ng Qi Feng, Hairee Izzam Mohd Noor, Noor Hazfalinda Hamzah, Gina Francesca Gabriel
dx.doi.org/10.17576/apjitm-2022-1102-08

Abstract

The sharp rise in cybercrime that has accompanied users' growing dependence on the Internet has heightened the concern of digital forensic examiners with the footprints that perpetrators leave in virtual environments. Suspect identification nevertheless remains a major challenge in network forensics because data transmission across a network is largely anonymous. This study uses the decision tree classification approach to characterise users by their behavioural web navigation patterns, based on the meta-data of captured network packets (Destination IP, Protocol, Port Source, and Port Destination). A total of 95,795,379 network packet headers were collected from 96 subjects. The packet header meta-data were statistically profiled to generate digital fingerprints intended to link each subject's actions on the network to their identity. CHAID decision tree modelling using Destination IP, Unique protocols, and a combination of the two together with Port source and Port destination yielded accuracies of 4.07%, 6.34%, and 6.36%, respectively. The modelling could not, however, create a reliable decision tree for the Port source and Port destination variables. A validation study on all the combined variables produced a similar accuracy of 6.36%, indicating that the model is reproducible. Despite this reproducibility, the proposed method is not yet sufficiently accurate for suspect identification, and further enhancement to improve its accuracy is required.
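
To make the profiling idea concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes a handful of hypothetical toy records of the form (user, destination IP, protocol), computes the Pearson chi-square statistic that CHAID relies on when choosing a split, and scores a single-level rule that assigns each category to its most frequent user. Full CHAID also merges categories and uses adjusted p-values, which this sketch omits. All names and data here are illustrative assumptions. For context, guessing uniformly among 96 subjects would be correct about 1.04% of the time if the classes were balanced.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (user, destination_ip, protocol).
# These are illustrative values, not data from the study.
PACKETS = [
    ("u1", "10.0.0.1", "TCP"),
    ("u1", "10.0.0.1", "TCP"),
    ("u1", "10.0.0.2", "UDP"),
    ("u2", "10.0.0.2", "UDP"),
    ("u2", "10.0.0.2", "UDP"),
    ("u2", "10.0.0.3", "TCP"),
    ("u3", "10.0.0.3", "TCP"),
    ("u3", "10.0.0.3", "TCP"),
]

USER, DEST_IP, PROTOCOL = 0, 1, 2  # field positions in each record


def chi_square(rows, col):
    """Pearson chi-square statistic between field `col` and the user label."""
    observed = Counter((r[col], r[USER]) for r in rows)
    cat_totals = Counter(r[col] for r in rows)
    user_totals = Counter(r[USER] for r in rows)
    n = len(rows)
    stat = 0.0
    for cat, cat_n in cat_totals.items():
        for user, user_n in user_totals.items():
            expected = cat_n * user_n / n
            stat += (observed[(cat, user)] - expected) ** 2 / expected
    return stat


def fit_one_level(rows, col):
    """Map each category of field `col` to its most frequent user."""
    by_cat = defaultdict(Counter)
    for r in rows:
        by_cat[r[col]][r[USER]] += 1
    return {cat: counts.most_common(1)[0][0] for cat, counts in by_cat.items()}


def accuracy(rows, col, rule):
    """Fraction of records whose user matches the rule's prediction."""
    hits = sum(1 for r in rows if rule.get(r[col]) == r[USER])
    return hits / len(rows)


# Chance level for 96 equally likely subjects (balanced-class assumption).
CHANCE_96 = 1 / 96

if __name__ == "__main__":
    rule = fit_one_level(PACKETS, DEST_IP)
    print("chi-square (Destination IP):", chi_square(PACKETS, DEST_IP))
    print("accuracy (Destination IP):", accuracy(PACKETS, DEST_IP, rule))
    print("chance with 96 subjects:", round(CHANCE_96 * 100, 2), "%")
```

The accuracy here is measured on the same records used to build the rule, so it is optimistic. A validation step like the one described above would score the rule on packets that were held out from fitting.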

Keywords

digital forensic, decision tree, digital fingerprint, user identification

Area

Cyber Security and Digital Forensic