Interrogative elements and verb-noun ranking for criminal chatting forensics
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2011 |
Subjects: | |
Online Access: | http://psasir.upm.edu.my/id/eprint/25959/1/FSKTM%202011%2012R.pdf |
Summary: | The rapid development of computer and Internet technology, in cyberspace as well as in global real-world communication, has brought a tremendous increase in cyber-crime. Chat is an easy and fast way to communicate interactively without face-to-face conversation. It incorporates various types of animation, including expressions of human emotion, which are factored in during the chat session. The challenge faced in this research is that chatters develop their own language, one in which speed prevails over correct spelling through short-form words; this contributes to greater interactivity, but the resulting language is unstructured and colloquial. Furthermore, a chat utterance is built from a simple sentence that normally contains only one clause. Hence, without the subject or the object of a full sentence structure, each word can carry variable meanings in a criminal conversation. This research therefore aims to uncover the meaning behind the text of the messages. Preprocessing is the cleaning stage that precedes the actual processes. Criminal identification is the first process and requires three steps. First, tokenization automatically assigns each lexical item a corresponding serial number in every suspect’s and victim’s utterance. Second, each lexical item is tagged with the interrogative elements together with its Part-of-Speech (POS); in this process, the combination of interrogative elements and verb-noun ranking is considered in the experiment. Third, criminal investigation uses the Protégé criminal ontology to investigate all the evidence behind the text of the utterances. Finally, the report is produced in the Digital Evidence Form (Casey, 2004), and the validation of, and satisfaction with, the methodology implemented in the research are assessed by a forensic lawyer. 
The chatting corpus consists of 3,098 suspects’ and victims’ utterances with 16,278 words, collected from nine criminal chatting cases. For criminal identification, two identification processes are considered: one performed by the system and one by an expert. The results obtained from the system and the expert are almost the same. A sign test on the difference between the number of interrogative words extracted by the system and by the expert shows that the system is able to function as an identifier of the interrogative elements, which extracts the verb-noun ranking in criminal forensics. Furthermore, 40 respondents are evaluated using interpolated precision. The interpolated precision shows that all of the interrogative elements achieve high average percentages, with the why and the how elements scoring highest. In addition, the COps prototype system is produced to investigate the words behind the text. About 128 respondents from three qualification backgrounds investigate 5,175 words (31.8% of the words in the criminal chatting corpus), and recall and precision values are measured. The interpolated precision shows that the respondents’ backgrounds play a key role in the criminal investigation experiment. Finally, the criminal chatting evidence, as well as the methodology implemented in the research, is validated by a forensic lawyer. |
---|
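
The summary reports results as interpolated precision but does not say which interpolation scheme the thesis uses. As a rough illustration only, the sketch below assumes the standard 11-point scheme from information retrieval, where the interpolated precision at a recall level is the highest precision observed at that recall level or any higher one. The function name `interpolated_precision` and its parameters are hypothetical and do not come from the thesis.

```python
def interpolated_precision(ranked_relevance, total_relevant, recall_levels=None):
    """Interpolated precision for one ranked list (hypothetical helper).

    ranked_relevance: booleans, True where the item at that rank is relevant.
    total_relevant: number of relevant items in the gold standard (must be > 0).
    recall_levels: recall points to report; defaults to 0.0, 0.1, ..., 1.0
                   (assumed 11-point scheme, not confirmed by the thesis).
    """
    if recall_levels is None:
        recall_levels = [i / 10 for i in range(11)]

    # Record (recall, precision) after each position in the ranking.
    points = []
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
        points.append((hits / total_relevant, hits / rank))

    # Interpolate: best precision at any recall >= the requested level.
    result = []
    for level in recall_levels:
        candidates = [p for r, p in points if r >= level - 1e-12]
        result.append(max(candidates, default=0.0))
    return result


# Toy ranking: relevant items at ranks 1 and 3, two relevant items overall.
scores = interpolated_precision([True, False, True, False], total_relevant=2)
```

Averaging these per-level values over respondents or over interrogative elements would give one curve per group, which is one plausible way to compare groups such as the three respondent backgrounds mentioned in the summary.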