Image hashes are short, unique, fixed-length strings that represent the essence of digital images. Hashing methods generally produce binary codes, each of which represents an image or a construct associated with it. The main purpose of extracting image hashes is to make images easier and more efficient to process and compare; hash codes are typically generated for searching large-scale databases, for data compression, and for security, and they offer advantages such as fast search and compact storage. Traditional hashing methods usually learn the individual hash functions independently and ignore the relationships between them, even though these relationships can significantly improve retrieval precision. Among traditional methods there are also approaches that learn the hash code sequentially, but they require complex optimization and cannot be applied directly to deep networks, so more effective deep hashing methods are needed. To overcome these limitations and develop a more effective hashing method, this work draws inspiration from the principles of deep reinforcement learning. Reinforcement learning is a powerful tool for making effective decisions in complex and uncertain environments; it is usually modelled as a Markov Decision Process (MDP) and solved with reinforcement learning algorithms that help the agent discover the action strategies needed to complete its task successfully. In this branch of artificial intelligence and machine learning, an agent acting in an uncertain and complex environment performs actions, receives rewards or penalties according to their outcomes, and learns which actions to take in order to maximize the cumulative reward. In this study, a novel method that combines deep learning and reinforcement learning is proposed for generating hash codes for images. Unlike classical machine learning methods, deep learning can extract image features on its own, and deep reinforcement learning represents the problem of an agent learning behaviour through interaction with its environment. Within this framework, instead of learning each hash function independently, the hash functions are modelled as a sequential decision process in which errors made by earlier functions are corrected during learning; by taking the relationships between different hash functions into account, this approach aims to increase retrieval precision. Recurrent neural networks (RNNs) are a special type of artificial neural network that can process sequential data and capture dependencies over time: by remembering information from previous time steps, an RNN can use that information to produce the output of the current step. In this way, the goal is a hashing approach that can be applied effectively on deep networks without the complex optimization required by traditional methods. The Actor-Critic method is used as the policy method. Experiments on the widely used CIFAR-10, NUS-WIDE, and MIRFlickr datasets demonstrate the effectiveness of the approach.
In the digital world, processing and comparing images quickly and efficiently has become a necessity, and this need is especially evident for search, data compression, and security in large-scale databases. Image hashes are short, unique, fixed-length strings of characters that represent the essence of digital images and are an ideal way to meet these requirements: they provide a concise representation of an image, so searching for or comparing a specific image becomes much faster and more efficient; they allow large datasets, which are otherwise difficult and costly to store and process, to be represented more compactly, reducing storage and processing costs; and they can be used to verify the authenticity of an image or to detect unauthorized access. Hash codes are used to determine the similarity or dissimilarity of digital images and enable rapid searches in large databases. Traditional hashing methods, however, usually learn the different hash functions independently and ignore the relationships between them, which can negatively affect the accuracy and efficiency of the resulting hash codes. Among traditional methods there are also approaches that learn the hash code sequentially, but these require complex optimization procedures and cannot be applied directly to deep networks. Deep neural networks are well suited to processing complex and large datasets, yet because traditional hashing methods cannot be integrated into such networks, their effectiveness remains limited.
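To make the idea concrete, the snippet below sketches a classical average hash (aHash): it reduces an image to a fixed-length binary code and compares two codes by Hamming distance. It is only an illustration of what a binary image code and a Hamming-distance comparison look like, not the learned hashing method proposed in this study; the file names are hypothetical.

```python
from PIL import Image
import numpy as np

def average_hash(path, hash_size=8):
    # Shrink the image to hash_size x hash_size grayscale pixels
    img = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = np.asarray(img, dtype=np.float32)
    # One bit per pixel: 1 if brighter than the mean, else 0 (64 bits by default)
    return (pixels > pixels.mean()).flatten().astype(np.uint8)

def hamming_distance(code_a, code_b):
    # Number of differing bits; a small distance suggests visually similar images
    return int(np.count_nonzero(code_a != code_b))

# Hypothetical usage:
# query_code = average_hash("query.jpg")
# db_code = average_hash("candidate.jpg")
# print(hamming_distance(query_code, db_code))
```

Learned hashing methods replace this hand-crafted rule with hash functions optimized from data, which is where the limitations discussed above arise.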
To overcome the limitations of traditional hashing methods and develop a more effective one, this work draws inspiration from the principles of deep reinforcement learning. Reinforcement learning is a powerful tool for effective decision-making in complex and uncertain environments: an agent learns how to behave in an environment, adapting and developing optimal strategies through the rewards and punishments it receives. Deep reinforcement learning has made significant progress in artificial intelligence and machine learning in recent years and is used effectively in complex decision-making processes; it enables an agent to learn how to behave within an environment to achieve certain goals, developing optimal strategies by exploiting environmental conditions and interactions. Essentially, the agent learns how to act so as to maximize (or minimize) a specific objective. This learning usually occurs through the rewards the agent receives from states, actions, and the environment: when the agent observes a state, it chooses an action depending on that observation and receives a reward or punishment as a result, and the aim is to optimize this observation-action-reward cycle so that the long-term reward is maximized. Deep reinforcement learning has been used successfully in various application areas, for example in game theory and strategy games (e.g. AlphaGo), robotic systems, the optimization of financial algorithms, automation processes, and even medical decision support systems, where it can perform a range of tasks on complex, large datasets without human intervention. The learning process generally requires powerful computational resources and improves in direct proportion to the ability to learn from large datasets: as the agent explores its environment, it learns through trial and error to choose the most appropriate actions, and this process becomes more efficient over time with the help of various optimization algorithms. For hashing, deep reinforcement learning offers several advantages. It allows hash functions to be learned more accurately and effectively, increasing the accuracy of the resulting hash codes; it manages complex optimization processes more efficiently, making the hash functions easier to learn and apply; and it can be integrated naturally with deep neural networks, which speeds up the processing of large datasets and increases the overall efficiency of hashing. While traditional hashing methods have the limitations discussed above, new-generation hashing methods inspired by deep reinforcement learning principles overcome them, enabling digital images to be processed and compared more accurately, quickly, and efficiently, so that users and institutions can manage and use image data more effectively.
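The observation-action-reward cycle described above can be written as a generic interaction loop. The sketch below assumes a hypothetical `env` with `reset()`/`step()` and an `agent` with `select_action()`/`update()`; it illustrates the general loop and the discounted return, not the specific training procedure of this study.

```python
def run_episode(env, agent, gamma=0.99):
    """One pass of the observation-action-reward cycle with a discounted return."""
    state = env.reset()
    transitions, done = [], False
    while not done:
        action = agent.select_action(state)           # policy chooses an action
        next_state, reward, done = env.step(action)   # environment responds
        transitions.append((state, action, reward))
        state = next_state
    # Discounted return G_t = r_t + gamma * G_{t+1}, accumulated backwards
    returns, g = [], 0.0
    for _, _, reward in reversed(transitions):
        g = reward + gamma * g
        returns.insert(0, g)
    agent.update(transitions, returns)                # improve the policy from feedback
    return sum(r for _, _, r in transitions)
```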
One of the main components of the model is the deep learning network. This network consists of a feature representation network that extracts features from the images and a policy network that converts the images into binary codes. The policy network combines a recurrent neural network (RNN) with the actor-critic method and acts as the agent. Recurrent neural networks are a special type of artificial neural network that can work with sequential data and capture dependencies over time: by remembering information from previous time steps, an RNN can use that information to produce the output of the current step. Thanks to these properties, they are widely used in many application areas such as language modelling, time series analysis, and speech recognition. The architecture of an RNN includes feedback connections in its hidden layers, and these connections process information through weight matrices and activation functions. However, RNNs have difficulty learning long-term dependencies, which can lead to problems such as vanishing or exploding gradients. To overcome these difficulties, improved RNN architectures such as Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU) have been developed. LSTMs learn long-term dependencies more effectively by using a cell state and gate mechanisms, greatly reducing the vanishing-gradient problem, while GRUs work similarly to LSTMs but have a simpler structure and control the flow of information through update gates. These improved architectures increase the performance of RNNs on more complex data with long-term dependencies, and RNNs and their derivatives are now applied successfully in fields such as natural language processing, financial forecasting, and speech recognition. The actor-critic method is an approach frequently used in reinforcement learning and is an extension of policy gradient methods. It consists of two main components: the actor learns and applies the policy that determines which action to choose in a given state, while the critic learns a value function that evaluates how good the actions chosen by the actor are; this evaluation provides the feedback mechanism used to improve the actor's policy. The biggest advantage of the actor-critic method is that it combines the strengths of value-based and policy-based methods, yielding more efficient and stable learning, and it is known for its effective performance in continuous action spaces and large state spaces. It forms the basis of many deep reinforcement learning algorithms and has been applied successfully in fields such as robotics, gaming, and financial modelling. The other main component of the approach is the reward system, which shapes future decisions based on the results of previous ones: a reward function evaluates the agreement between the generated hash codes and the ground-truth labels. The entire network is trained by optimizing this reward function, and the rewards evaluate the actions taken by the agent, encouraging correct actions and penalizing incorrect ones.
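As an illustration of how an RNN-based actor-critic policy can emit a binary code one bit at a time, the PyTorch sketch below couples a GRU cell with an actor head (bit probabilities) and a critic head (state value). The layer sizes, the choice of a GRU cell, and the overall wiring are assumptions made for illustration, not the exact architecture reported in this study.

```python
import torch
import torch.nn as nn

class HashPolicyNet(nn.Module):
    """Illustrative sketch: emit a hash code bit by bit with an actor-critic head."""
    def __init__(self, feat_dim=4096, hidden_dim=512, code_length=32):
        super().__init__()
        self.code_length = code_length
        self.rnn = nn.GRUCell(feat_dim, hidden_dim)  # sequential decision state
        self.actor = nn.Linear(hidden_dim, 2)        # logits for choosing bit 0 or 1
        self.critic = nn.Linear(hidden_dim, 1)       # value of the current state

    def forward(self, features):
        # features: [batch, feat_dim] vectors from the feature representation network
        h = features.new_zeros(features.size(0), self.rnn.hidden_size)
        bits, log_probs, values = [], [], []
        for _ in range(self.code_length):
            h = self.rnn(features, h)                          # update the agent's state
            dist = torch.distributions.Categorical(logits=self.actor(h))
            bit = dist.sample()                                # action: the next hash bit
            bits.append(bit)
            log_probs.append(dist.log_prob(bit))
            values.append(self.critic(h).squeeze(-1))
        return (torch.stack(bits, dim=1),       # [batch, code_length] binary code
                torch.stack(log_probs, dim=1),  # used for the policy (actor) loss
                torch.stack(values, dim=1))     # used for the value (critic) loss
```

In a typical actor-critic setup, the sampled log-probabilities would be weighted by an advantage (reward minus the critic's value estimate) to form the actor loss, while the critic is regressed toward the observed rewards.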
In this way, the performance and accuracy of the model are increased. The proposed deep reinforcement learning-based hashing approach works with a sequential learning strategy, continuously improving overall accuracy by correcting the errors produced by past actions; optimized with reward functions and this sequential strategy, the method converts images into binary codes efficiently and accurately. In this study, various hashing methods were examined on the widely used CIFAR-10, NUS-WIDE, and MIRFlickr datasets. CIFAR-10 contains a total of 60,000 color images from 10 classes, each 32x32 pixels in size; a query set of 1,000 randomly selected images was used in the experiments. NUS-WIDE consists of approximately 270,000 images, and the experiments were carried out using 1,000 images selected from 21 concepts. MIRFlickr contains 25,000 images, of which 1,000 random images were designated as the query set. The experiments were implemented with the PyTorch framework, and the initial parameters of the network were initialized with a VGG-19 model trained on the ImageNet dataset. The results of this study show that deep learning and deep reinforcement learning techniques provide significant advantages over traditional methods in hashing operations. The contribution of deep learning models to hashing performance heralds important future developments in this field: as artificial intelligence and machine learning technologies advance, image hashing methods are expected to be further optimized and extended, their role in the management and analysis of larger and more complex datasets will grow, and new application areas will be discovered.
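Retrieval quality for such hash codes is commonly reported as mean average precision (mAP) over Hamming ranking. The sketch below shows one common way to compute it for a single-label setting such as CIFAR-10; the exact evaluation protocol (top-k cutoffs and the multi-label relevance rule needed for NUS-WIDE and MIRFlickr) is an assumption, not taken from this study.

```python
import numpy as np

def mean_average_precision(query_codes, db_codes, query_labels, db_labels):
    """mAP over Hamming ranking for single-label data (illustrative only)."""
    aps = []
    for q_code, q_label in zip(query_codes, query_labels):
        dists = np.count_nonzero(db_codes != q_code, axis=1)  # Hamming distances
        order = np.argsort(dists)                             # rank database by distance
        relevant = (db_labels[order] == q_label).astype(np.float64)
        if relevant.sum() == 0:
            continue
        precision_at_k = np.cumsum(relevant) / np.arange(1, len(relevant) + 1)
        aps.append(float((precision_at_k * relevant).sum() / relevant.sum()))
    return float(np.mean(aps))
```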