Using natural language processing techniques and deep learning algorithms for detecting spam on social networks

Bakir, Rezan

dc.contributor.advisor	Erbay, Hasan
dc.contributor.author	Bakir, Rezan
dc.date.accessioned	2023-10-02T20:56:14Z
dc.date.available	2023-10-02T20:56:14Z
dc.date.issued	2022
dc.identifier.uri	https://tez.yok.gov.tr/UlusalTezMerkezi/TezGoster?key=RsTBl6RWK25OBMIKtIgYYREmr3bHwnO4HypNxC2tInCDu3dFXmQ6ju_P1RqZL1E1
dc.identifier.uri	https://hdl.handle.net/20.500.12587/18284
dc.description.abstract	Spam detection on social networks, regarded as a short text classification problem, is a challenging task in natural language processing because of the sparsity and ambiguity of the text. One of the most important tasks in solving the problem is finding a powerful text representation. Traditional word embedding models solve the data sparsity problem by representing words with dense vectors, but they have limitations that prevent them from handling some problems effectively. One of the most common limitations of traditional word embedding methods is the "out of vocabulary" problem, which arises because the model cannot provide any vector representation for words that are not in its dictionary. Another problem is context independence: these models give only one vector for each word, regardless of the word's position in the sentence. To overcome these problems, contextualized natural language processing models were adopted together with deep learning techniques. One of the main goals of natural language processing is to develop a meaningful representation of words that strengthens the ability to capture word senses and similarities in different contexts. Accordingly, this thesis proposes different models to address the sparsity and other limitations of short texts on social networks in order to detect spam messages effectively. Tested on three benchmark datasets, the proposed models were found to achieve high classification accuracy and to outperform existing state-of-the-art methods for detecting spam messages on social networks.	en_US
dc.description.abstract	Spam detection on social networks, considered a short text classification problem, is a challenging task in natural language processing due to the sparsity and ambiguity of the text. One of the key tasks in addressing this problem is finding a powerful text representation. Traditional word embedding models solve the data sparsity problem by representing words with dense vectors, but these models have limitations that prevent them from handling some problems effectively. The most common limitation of traditional word embedding methods is the "out of vocabulary" problem, in which they fail to provide any vector representation for words that are not in the model's dictionary. Another problem these models face is context independence: they output just one vector for each word, regardless of the position of the word in the sentence. To overcome these problems, we relied on contextualized natural language processing models in combination with deep learning techniques. One of the main goals of natural language processing is to develop a meaningful representation of words that improves the ability to capture word senses and similarities in different contexts. Consequently, in this thesis, we propose different models to handle the sparsity and other limitations of short text on social networks in order to detect spam messages effectively. The results obtained on three benchmark datasets show that our proposed methods achieve high accuracy and outperform the existing state-of-the-art methods for detecting spam on social networks.	en_US
dc.language.iso	tur	en_US
dc.publisher	Kırıkkale Üniversitesi	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol	en_US
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Doğal dil işleme tekniklerini ve derin öğrenme algoritmalarını kullanarak sosyal ağlarda spam tespiti	en_US
dc.title.alternative	Using natural language processing techniques and deep learning algorithms for detecting spam on social networks	en_US
dc.type	doctoralThesis	en_US
dc.contributor.department	KKÜ, Graduate School of Natural and Applied Sciences, Department of Computer Engineering	en_US
dc.contributor.institutionauthor	Bakir, Rezan
dc.identifier.startpage	1	en_US
dc.identifier.endpage	121	en_US
dc.relation.publicationcategory	Thesis	en_US
dc.identifier.yoktezid	755156	en_US

Files in this item:

Name: 755156.pdf
Size: 4.989Mb
Format: PDF
Description: Full Text


This item appears in the following collection(s).

Doctoral Thesis Collection [25]
