A new document classification algorithm against malicious data leakage attacks

Kesenek, Yahya; Ozcelik, Ibrahim; Kaya, Fmrah

dc.contributor.authors	Kesenek, Yahya; Ozcelik, Ibrahim; Kaya, Fmrah
dc.date.accessioned	2023-01-24T12:09:02Z
dc.date.available	2023-01-24T12:09:02Z
dc.date.issued	2022
dc.identifier.issn	1300-1884
dc.identifier.uri	http://dx.doi.org/10.17341/gazimmfd.641580
dc.identifier.uri	https://hdl.handle.net/20.500.12619/99763
dc.description	This publication has been uploaded as open access to the open academic archive system, in a version that complies with copyright, in accordance with the Sakarya University Open Science and Open Academic Archive Directive, which was prepared under University Senate Decision No. 05 of session 543 dated 12/12/2019 on the basis of Articles 4/c, 12/c, 42/c and 42/d of Higher Education Law No. 2547, published in Official Gazette No. 17506 dated 06.11.1981.
dc.description.abstract	Nowadays it is important to store sensitive data securely and to restrict its use to authorized people or institutions. In general, Data Leakage Prevention (DLP) solutions ignore malicious attacks on documents and rely on algorithms based on fingerprinting and regular expressions. However, content-based attacks succeed in evading those algorithms. In this paper, an algorithm that is robust against malicious content-based attacks and independent of the attack executed is proposed. Transposition, sentence structure alteration, modification, and obfuscation attacks are considered within the scope of the paper. N-gram, character n-gram, k-skip-n-gram, and LSA methods are used in the feature extraction step to obtain better classification results under attack. The extracted features are passed to a Vote Classifier consisting of Support Vector Machine, Random Forest, and Multi-Layer Perceptron classifiers. Additionally, the effect of applying Spell-Correction at different steps of the algorithm, which is effective against modification attacks, is evaluated.
dc.language	English
dc.language.iso	eng
dc.publisher	GAZI UNIV, FAC ENGINEERING ARCHITECTURE
dc.relation.isversionof	10.17341/gazimmfd.641580
dc.subject	Engineering
dc.subject	Data leakage prevention
dc.subject	malicious DLP
dc.subject	DLP
dc.subject	advanced persistent threat
dc.subject	structural evasion
dc.subject	APT
dc.title	A new document classification algorithm against malicious data leakage attacks
dc.type	Article
dc.identifier.volume	37
dc.identifier.startpage	1639
dc.identifier.endpage	1654
dc.relation.journal	JOURNAL OF THE FACULTY OF ENGINEERING AND ARCHITECTURE OF GAZI UNIVERSITY
dc.identifier.issue	3
dc.identifier.doi	10.17341/gazimmfd.641580
dc.identifier.eissn	1304-4915
dc.contributor.author	Kesenek, Yahya
dc.contributor.author	Ozcelik, Ibrahim
dc.contributor.author	Kaya, Fmrah
dc.relation.publicationcategory	Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights.openaccessdesignations	gold
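
The abstract names character n-grams and k-skip-n-grams among the feature extraction methods. The following is a minimal illustrative sketch of those two feature types, not the authors' implementation. It assumes the common definition of a k-skip-n-gram as an n-token subsequence that skips at most k tokens in total; the function names `char_ngrams` and `skip_ngrams` and the sample phrases are hypothetical.

```python
from itertools import combinations


def char_ngrams(text, n):
    """Return the contiguous character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def skip_ngrams(tokens, n, k):
    """Return the set of k-skip-n-grams of a token list.

    Assumed definition: every ordered n-token subsequence whose members
    skip at most k tokens in total. With k = 0 these are ordinary n-grams.
    """
    grams = set()
    for start in range(len(tokens)):
        # Tokens eligible for the remaining n - 1 positions after `start`.
        window = tokens[start + 1:start + n + k]
        for rest in combinations(window, n - 1):
            grams.add((tokens[start],) + rest)
    return grams


# Hypothetical sensitive phrase and a variant with one inserted filler token.
original = "insurance claim number".split()
attacked = "insurance xx claim number".split()

plain_bigrams = skip_ngrams(original, 2, 0)
attacked_bigrams = skip_ngrams(attacked, 2, 0)
attacked_skip_bigrams = skip_ngrams(attacked, 2, 1)
```

In this toy case, the inserted token removes the bigram ("insurance", "claim") from the ordinary bigrams of the altered phrase, while the 1-skip bigrams of the altered phrase still contain every bigram of the original. This suggests why skip-grams may help when text is padded or lightly rearranged; how the paper combines these features with LSA and the classifiers is not shown here.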