Abstract
A major issue in fish farms is net cage wear: holes and biofouling of the nets result in fish escapes and a negative impact on fish quality. To minimize fish losses, fisheries employ divers to inspect net cages on a weekly basis. Aquaculture companies are looking for ways to maximize profit, and reducing maintenance costs is one of them. Kefalonia Fisheries spends 250 thousand euros yearly on diver expenses for net cage maintenance. This work presents the design, fabrication, and control of Kalypso, an inexpensive autonomous underwater vehicle intended for inspection of net cages at Kefalonia Fisheries S.A. in Greece. Its main body is 3D-printed, and its eight-thruster configuration grants it six degrees of freedom. The main objective of the vehicle is to limit maintenance costs by increasing inspection frequency. The design and fabrication of the vehicle, as well as its electronics and software architecture, are presented. In addition, the forces affecting Kalypso, mobility realization, navigation, and modeling are described, along with a flow simulation and the experimental results. The proposed design is adaptable and durable while remaining cost-effective, and it can be used for both manual and automatic operations.
Abstract
Getting students motivated and interested in their education can be challenging in any classroom setting, and even more so in an online learning environment. In this context, educational robotics (ER) has demonstrated numerous advantages in the educational environment, not only by facilitating teaching but also by enabling the cultivation of manifold skills, including creativity, problem-solving, and teamwork. Meanwhile, many methods have been developed with the aid of technology to improve the teaching process and boost students' ability to learn. Blended learning is one approach that integrates conventional classroom methods with digital resources in an effort to foster students' creativity. But how can blended learning be combined with robotics? The objective of this paper is to evaluate the impact of employing an underwater vehicle, called the educational underwater vehicle (EDUV), in conjunction with a dedicated programming learning platform within a programming course offered at the high school level. The platform was used by students in secondary education, and a survey based on two questionnaires was conducted before and after they used the underwater vehicle's platform. The survey included 112 Greek participants, 64 males and 48 females, aged 14–18 years. The experimental results show an increase in their motivation and creativity. In other words, they are more engaged in the classroom and the lesson becomes more enjoyable. More specifically, the survey revealed that most participants are familiar with computers but have limited knowledge of robotics and programming. After training on the EDUV platform, participants showed a significant increase in correct responses for Python and Blockly environments, with an average of 50.7% in four programming-related questions. The platform also reduced “do not know” replies, which means that the students' self-esteem increased.
The paired-sample t-test showed that the EDUV platform positively influenced participants' perceptions of robotics and motivated them to further their education. In this paper, the related work is discussed, and the architecture of the vehicle is analyzed, along with its integration with the online platform. In addition, the methodology followed is explained and divided into steps. Finally, the experimental results are discussed. Instructions, 3D models, and code can be found in the GitHub repository https://github.com/MariosVasileiou/EDUV.
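The paired-sample t-test mentioned above compares each participant's answer before and after an intervention. The following is a minimal sketch of how the test statistic is computed, not the authors' analysis code; the function name and the scores are hypothetical and chosen only for illustration.

```python
import math


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired-sample t-test."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    # The test works on the per-participant differences.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean / math.sqrt(var / n)
    return t, n - 1


# Hypothetical questionnaire scores (not data from the paper).
before = [2, 3, 1, 4, 2]
after = [4, 4, 3, 5, 3]
t, dof = paired_t_statistic(before, after)
print(f"t = {t:.3f} with {dof} degrees of freedom")
```

The resulting t value would then be compared against the t distribution with the reported degrees of freedom to obtain a p-value.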
Abstract
This paper introduces the El Greco Platform, a Python programming platform for distance learning that employs an educational robot. This website allows prospective learners to remotely control El Greco, a social humanoid robot designed to be cost-effective, simple to construct, and appropriate for use in education. El Greco is capable of performing multiple tasks, including combined movements. These robot capabilities can be programmed using either Python code or the Blockly library, which adds an editor to an application that visualizes coding concepts as interlocking blocks. Programming a robot appears to be a significantly more effective and creative method for students to learn a programming language. This educational tool was designed primarily for use by students and allows anyone to learn Python while controlling a robot for free. The El Greco Platform features gamification elements that increase the enjoyment and engagement of the learning experience while reinforcing the concepts taught. The survey results on students aged 13–18 revealed that the El Greco Platform captivated the study participants and positively affected their attitudes toward programming and robotics. In addition, it significantly impacted their comprehension of programming and motivated them to seek additional opportunities to expand their knowledge of robotics and programming.
Abstract
Underwater vehicles deployed in net cages at aquaculture facilities are commonly used to examine the deterioration of nets and the accumulation of biofouling. A robotic system for repairing damage has the potential to decrease the expenses associated with employing divers while reducing their risk of injury. This study details the development, fabrication, and simulation of a cost-effective underwater manipulator, denoted as MURA, which can be readily incorporated into submersible vehicles. The Kalypso unmanned underwater vehicle (UUV) is used in this study. MURA is highly modular, so the end-effector tool can be changed easily. Additionally, its low cost makes it a viable option for integration with any underwater vehicle. Three end-effectors were tested: one designed for disposing of fish morts, another intended for removing litter from net cages in fisheries, and a third for repairing net tears. This study outlines the MURA design, including the arm’s fabrication and constituent components. In addition, the modeling of the manipulator is presented, accompanied by a water flow simulation of the three manipulators. Finally, the experimental findings are analyzed and evaluated. These include the field experiments performed at Kefalonia Fisheries, along with the time needed to complete each task. For instance, the capture of fish morts typically takes approximately 30 s, covering the entire process from initial targeting to actual capture. Similarly, mending tears in a net takes approximately 70 s on average, covering the stages of initial identification and subsequent detachment. The suggested design is adaptable and durable while remaining affordable when used in aquaculture.
Abstract
Nowadays, the number of people who use either digital applications or machines is increasing exponentially. Therefore, trustworthy verification schemes are required to ensure security and to authenticate the identity of an individual. Since traditional passwords have become more vulnerable to attack, adopting new verification schemes is now necessary. Biometric traits have gained significant interest in this area in recent years due to their uniqueness, ease of use and development, user convenience, and security. Biometric traits cannot be borrowed, stolen, or forgotten like traditional passwords or RFID cards. Fingerprints represent one of the most utilized biometric factors. Contrary to popular opinion, fingerprint recognition is not an inviolable technique. Given that biometric authentication systems are now widely employed, fingerprint presentation attack detection has become crucial. In this review, we investigate fingerprint presentation attack detection by highlighting the recent advances in this field and addressing all the disadvantages of using fingerprints as a biometric authentication factor. Both hardware- and software-based state-of-the-art methods for distinguishing real fingerprints from artificial ones are thoroughly presented and analyzed, to help researchers design more secure biometric systems.
Abstract
Nowadays, while steganography is the main means of illegal secret communication, the need to detect steganographic content, and especially stego images, is becoming more pressing. Since multimedia content can be easily spread over the internet and more complicated steganography algorithms in different domains (i.e., spatial and transform) are utilized, the task of identifying stego images becomes very difficult. Early steganalysis methods deploy statistical attacks on stego images, while more recent ones use deep learning techniques. The latter mainly utilize convolutional neural networks and show promising results. In this paper, we propose a novel method to identify stego images derived from two different steganographic algorithms, S-UNIWARD (Spatial-UNIversal WAvelet Relative Distortion) and WOW (Wavelet Obtained Weights), for various embedding rates. The proposed method initially utilizes a dilated convolutional neural network as a feature extractor, and afterwards the extracted feature vector trains a random forest classifier. More specifically, it is shown that in steganalysis a dilated convolutional neural network can be an excellent feature extractor and that the traditional softmax layer can be replaced by another machine learning classifier. Extensive experiments were conducted, and the proposed model was also compared against state-of-the-art convolutional neural networks utilized in spatial image steganalysis and against other feature extraction methods. Results showed that the proposed method achieves high classification accuracy and outperforms other analogous steganalysis approaches.
Abstract
Slope and slant correction of offline handwritten word images are two of the major pre-processing steps in document image processing, because they reduce the variations in writing, thereby making further processing much easier. This paper presents novel slope and slant correction methods that are applied to handwritten words in three different scripts, namely Devanagari, Bangla, and Roman. The language dependency and the computational complexity of state-of-the-art approaches to word-level slope and slant correction are addressed here. A new technique for approximate core region detection is introduced for skew detection, and then linear regression is recursively applied to de-skew the word image. For slant correction, a novel cost function over the vertical projection of the de-skewed image is designed and optimized to fix the uniform slant angle of text words. A new …
Abstract
Isolating non-text components from the text components present in handwritten document images is an important but less explored research area. To address this issue, this paper presents an empirical study on the applicability of various Local Binary Pattern (LBP) based texture features to this problem. The paper also proposes a minor modification to one of the variants of the LBP operator to achieve better performance in the text/non-text classification problem. The feature descriptors are then evaluated, using five well-known classifiers, on a database made up of images from 104 handwritten laboratory copies and class notes of various engineering and science branches. Classification results reflect the effectiveness of LBP-based feature descriptors in text/non-text separation.
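For readers unfamiliar with LBP, the sketch below computes the basic 3x3 operator and the 256-bin histogram that typically serves as the texture feature. It illustrates the standard operator only, not the modified variant proposed in the paper, which the abstract does not specify; the function names and the neighbour ordering are assumptions made for this example.

```python
def lbp_code(image, r, c):
    """Basic 3x3 LBP code of the pixel at row r, column c (0..255)."""
    center = image[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    # The ordering is a convention; it only has to be used consistently.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Tiny grayscale patch given as a list of rows (illustrative values only).
patch = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
features = lbp_histogram(patch)
```

A classifier would then be trained on such histograms, one per image region.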
Abstract
Steganalysis and steganography are two sides of the same coin. Steganography tries to hide messages in plain sight, while steganalysis tries to detect their existence or, even more, to retrieve the embedded data. Both steganography and steganalysis have received a great deal of attention, especially from law enforcement. While cryptography is being outlawed or limited in many countries, cyber criminals and even terrorists are extensively using steganography to avoid being arrested with encrypted incriminating material in their possession. Therefore, understanding the ways that messages can be embedded in a digital medium (in most cases, digital images) and knowing the state-of-the-art methods to detect hidden information are essential in exposing criminal activity. Digital image steganography is growing in use and application. Many powerful and robust methods of steganography and steganalysis …
Abstract
Slanted text has been demonstrated to be a salient feature of handwriting. Its estimation is a necessary preprocessing task in many document image processing systems in order to improve the required training. This paper describes and evaluates a new technique for removing the slant from historical document pages that avoids the segmentation procedure into text lines and words. The proposed technique first relies on slant angle detection from an accurate selection of fragments. Then, a slant removal technique is applied. However, the presented slant removal technique may be combined with any other slant detection algorithm. Experimental results are provided for four document image databases: two historical document databases, the TrigraphSlant database (the only database dedicated to slant removal), and a printed database in order to check the precision of the proposed technique.
Abstract
The Special Issue “Document Image Processing” in the Journal of Imaging aims at presenting approaches that contribute to accessing the content of document images. These approaches are related to low-level tasks such as image preprocessing, skew/slant correction, binarization, and document segmentation, as well as high-level tasks such as OCR, handwriting recognition, word spotting, or script identification. This special issue brings together 12 papers that discuss such approaches. The first three articles deal with historical document preprocessing. The work by Hanif et al. [1] aims at removing bleed-through using a non-linear model, and at reconstructing the background by an inpainting approach based on non-local patch similarity. The paper by Almeida et al. [2] proposes a new binarization approach that includes a decision-based process for finding the best threshold for each RGB channel. In the paper by Kavallieratou et al. [3], a segmentation-free approach based on the Wigner-Ville distribution is used to detect the slant of a document and correct it. Once a document image is preprocessed, a next step, described in the paper by Ghosh et al. [4], consists in separating text components from non-text ones using a classifier based on LBP features. Subsequent steps may consist in recognizing text components or searching with word queries. In the paper by Nashwan et al. [5], a holistic approach for the recognition of printed Arabic words is proposed, coupled with an efficient dictionary reduction. In the work by Nagendar et al. [6], it is shown that using a query-specific fast Dynamic Time Warping distance improves the Direct Query Classifier …
Abstract
We survey the AI research carried out in Greece recently. We concentrate on the case of linked geospatial data, an area with significant practical importance, very interesting research results, and implemented systems developed by a Greek research team.
Abstract
A technique appropriate for extracting textual information from documents with complex layouts, such as newspapers and journals, is presented. It is a combination of a foreground analysis and a text localization method. The first is used to segment the page into text and non-text blocks, whereas the second is used to detect text that may be embedded inside images, charts, diagrams, tables, etc. Detailed experiments on two public databases showed that mixing layout analysis and text localization techniques can lead to improved page segmentation and text extraction results.
Abstract
Digital forensics is a field of Computer Science that focuses on the acquisition, preservation, and analysis of digital evidence in a way that makes this evidence suitable for presentation in a court of law. Forensic investigators follow a standard set of procedures. One major and difficult problem is the correct identification of file types. Criminals often hide evidence in a digital device by changing the file type. It is very common for a child predator to try to hide image files with immoral content in order to fool police authorities. In this paper, we examine a methodology for file type identification, which uses computational intelligence techniques for feature selection and classification. This methodology was applied to the three most common image file types (jpg, png, and gif). In order to ascertain the method’s accuracy, different machine learning classifiers were utilized. A three-stage process involving feature extraction (Byte Frequency Distribution), feature selection (genetic algorithm), and classification (decision tree, support vector machine, neural network, logistic regression, and k-nearest neighbor) was examined. Experiments were conducted on files altered from a digital forensics perspective, and the results are presented. The examined methodology showed, in most cases, very high accuracy in file type identification.
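The first stage of the process above, the Byte Frequency Distribution, can be sketched in a few lines. This is an illustrative example rather than the authors' implementation: the function name is hypothetical, and normalizing counts by file length is an assumption. The genetic-algorithm feature selection and the classifiers are not shown.

```python
from collections import Counter


def byte_frequency_distribution(data):
    """Return a 256-element list with the relative frequency of each byte value."""
    if not data:
        return [0.0] * 256
    # Iterating over a bytes object yields integers in the range 0..255.
    counts = Counter(data)
    total = len(data)
    return [counts[value] / total for value in range(256)]


# The feature vector depends only on the file content, not on its name or
# extension, which is why it can expose a renamed file.
sample = b"\x89PNG\r\n\x1a\n" + bytes(16)  # illustrative bytes, not a valid image
features = byte_frequency_distribution(sample)
print(len(features), round(sum(features), 6))
```

In practice the bytes would be read from disk, for example with `open(path, "rb").read()`, and the 256 values passed on to feature selection.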
Abstract
In this paper, a hybrid technique for complex layout analysis is presented. Morphological operations are applied to both the foreground and the background, in order to connect neighboring regions and detect separator lines and columns, respectively. Contour tracing is used for the extraction of shape and size information and for the classification of the connected components. Evaluated on the RDCL-2015 dataset, the method achieved state-of-the-art results in less than three seconds per page.
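To illustrate how a morphological operation can connect neighboring regions, the sketch below applies binary dilation with a square structuring element to a tiny binary image. It is a generic example under assumed parameters (the function name, the square element, and the radius are not taken from the paper).

```python
def dilate(binary, radius=1):
    """Binary dilation with a (2*radius+1) x (2*radius+1) square element."""
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c]:
                # Every foreground pixel switches on its whole neighbourhood,
                # clipped at the image border.
                for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        out[rr][cc] = 1
    return out


# Two foreground pixels separated by a one-pixel gap (1 = ink).
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
merged = dilate(page)
print(merged[1])  # the gap at column 2 is now filled
```

After dilation the two components touch, so a subsequent connected-component or contour-tracing step would treat them as one region.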
Abstract
In this paper, a feature based on statistical directional features is presented. Specifically, an improvement of a statistical feature, the edge hinge distribution, is attempted. Furthermore, different matching techniques are applied. For the evaluation, the Firemaker DB was used, which consists of samples from 250 writers, with 4 pages per writer. The suggested feature, the skeleton hinge distribution, achieved an accuracy of 90.8% using nearest neighbor with Manhattan distance for matching.
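The matching step named above, nearest neighbor with Manhattan distance, can be sketched as follows. The feature vectors here are short made-up histograms, not hinge distributions from the Firemaker DB, and the function names are hypothetical.

```python
def manhattan(p, q):
    """Manhattan (L1) distance between two equally long feature vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def nearest_writer(query, references):
    """Return the writer id whose reference vector is closest to the query.

    references maps a writer id to that writer's feature vector.
    """
    return min(references, key=lambda writer: manhattan(query, references[writer]))


# Illustrative three-bin distributions for two enrolled writers.
references = {
    "writer_1": [0.5, 0.5, 0.0],
    "writer_2": [0.1, 0.2, 0.7],
}
query = [0.2, 0.2, 0.6]
print(nearest_writer(query, references))
```

With real data, each vector would be the normalized hinge distribution extracted from a page of handwriting.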
Abstract
In this paper, the software of Kalypso, an underwater robot designed for fisheries is presented. The software that controls the robot's movements and data processing is a critical component of its operation, and this paper provides a detailed analysis of the algorithms used to ensure reliable performance. The results of this analysis provide valuable insights for researchers and engineers working to improve the efficiency and effectiveness of underwater robots in the context of fisheries management.
Abstract
Underwater vehicles can play an essential role in fish farming, and their construction can be a challenging aspect of their development owing to the necessity of waterproofing electronic components. This paper describes the construction and basic framework of a robotic vehicle named "Kalypso," which was developed to inspect nets in fish farms. Based on an algorithm, Kalypso can distinguish between clean areas on the net and areas that are either torn or covered in algae. It is also equipped with sensors to estimate position, depth, and temperature, as well as a leak sensor that detects water in the watertight housing. The watertight part of the robot includes cable plugs, some of which are detachable to facilitate the connection of various devices, such as sensors and lights. Kalypso has two cameras, one facing the front and another facing the bottom. Furthermore, the robot has underwater ultrasonic sensors, which measure the distance to its surroundings, and LED lights on the front for increased visibility in low-light environments. Several tests were carried out, both inside fish farm nets and in the open sea.
Abstract
A significant challenge in fish farms is net cage deterioration, which leads to fish escapes owing to holes and can have a severe influence on the fish's health due to biofouling. To reduce fish losses, fisheries employ divers on a weekly basis to inspect the net cages. Aquaculture companies are always seeking methods to increase their profits, and one of those is to cut down on maintenance costs. This paper is about the development of an unmanned underwater vehicle used for inspection in net cages at Kefalonia Fisheries in Greece. The vehicle has a 3D-printed body and a six-thruster configuration, featuring five degrees of freedom. The vehicle's primary goal is to reduce divers’ expenses by conducting inspections more frequently. In this work, the design, manufacture, and control are presented along with the experimental results and a flow simulation of the vehicle. The suggested design is versatile and robust while remaining affordable, and fisheries may easily adopt it due to its low price and simplicity of operation.
Abstract
Educational robotics is rooted in Constructionism and allows learners to investigate and discover new concepts. It would appear that learning a programming language while programming a robot is more motivating and productive than conventional methods. The El Greco platform is an educational platform built to teach Python. Users can control El Greco from any computer connected to the Internet thanks to the platform’s web-based interface. El Greco is a social humanoid robot built to be affordable and appropriate for use in education. Users can control El Greco either by entering Python code directly or through the Blockly library. The Blockly library embeds an editor in an application to represent coding concepts as interlocking blocks. Dedicated functions that control El Greco were created. The entered code can be executed on the website or by the robot. The user can view the result of code execution through a live-streaming window. The El Greco platform has been designed with students in mind but is available to anyone at no cost.
Abstract
Remotely Operated Underwater Vehicles (ROVs) used in net cages at fish farms are usually employed to inspect net wear and biofouling. Implementing a robotic system to repair the damage can reduce diver costs and can be lucrative for the aquaculture company. This paper presents the design, fabrication, and modeling of an affordable subsea robotic arm called IURA that can be integrated into any underwater vehicle. For this study, the Kalypso ROV was used. Although IURA is in its infancy, it features (i) high modularity, which allows the end-effector tool to be changed effortlessly, and (ii) low cost, which makes it feasible to integrate with any ROV. It is mostly 3D-printed and has two servo motors as joints, giving it 2 Degrees of Freedom (DoF). Two manipulators were tested, one to dispose of fish morts and another to remove objects from the net cages at fish farms. In this work, the design of IURA is presented along with the fabrication and sealing of the arm. Furthermore, the modeling of the robotic arm is presented together with a flow simulation of both end-effectors. Finally, the experimental results are discussed. The proposed design features customizability and sturdiness while retaining low cost, and it can be used for subsea operations at Kefalonia Fisheries’ net cages when mounted on the Kalypso ROV.
Abstract
This paper introduces the conception, design, fabrication, and control of a modular remotely operated underwater vehicle. It is an affordable submarine robot intended for inspection operations at shallow depths. It has a 3D-printed hull, and it can be adjusted according to the respective needs. It is equipped with six thrusters, which give it five Degrees of Freedom and enable increased maneuverability while it operates. The robot has marginally positive buoyancy, which allows it to surface in case of a malfunction, and the fins attached to the motors enhance its stability. The design, development, and construction as well as the components and electronics of the vehicle are presented. Moreover, the forces acting on the ROV and the buoyancy are introduced, along with the motion generation of the motors. The IMU sensor calibration is explained, and a dataset of the rotational movement was analyzed along 3 axes. The proposed design features low cost, customizability, and sturdiness, allowing users to configure and operate it themselves.
Abstract
In the digital era in which we live, trustworthy verification schemes are required to ensure security and to authenticate the identity of an individual. Traditional passwords have proved to be highly vulnerable to attacks, and adopting new verification schemes is necessary. Biometric factors have gained a lot of interest in recent years due to their uniqueness, ease of use, user convenience, and ease of deployment. However, recent research showed that even these unique authentication factors are not inviolable. Thus, it is necessary to employ new verification schemes that cannot be replicated or stolen. In this paper, we propose the utilization of steganography as a tool to provide unbreakable passwords. More specifically, we obtain a biometric feature of a user and embed it as a hidden message in an image. This image is then utilized as a password, the so-called StegoPass. Conversely, when a legitimate user or an attacker tries to unlock a device or an application, the same biometric feature is captured and embedded with the same steganography algorithm into the picture. The hash of the resulting stego image is produced in both cases, and if there is a complete match, the user is considered authenticated. To ensure that the proposed StegoPass cannot be replicated, we have conducted experiments with state-of-the-art deep learning algorithms. Moreover, it was examined whether Generative Adversarial Networks could produce exact copies of the StegoPass to fool the suggested method, and the results showed that the proposed verification scheme is extremely secure.
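The enroll-and-verify flow described above can be sketched as follows. This is a toy illustration, not the authors' scheme: the abstract does not name the steganography algorithm or the hash function, so least-significant-bit (LSB) embedding and SHA-256 are assumptions, the cover "image" is a plain byte string, and the biometric feature is assumed to be already reduced to a stable byte sequence. All function names are hypothetical.

```python
import hashlib
import hmac


def embed_lsb(cover, message):
    """Hide message bits in the least significant bit of successive cover bytes."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the message")
    stego = bytearray(cover)
    for index, bit in enumerate(bits):
        stego[index] = (stego[index] & 0xFE) | bit
    return bytes(stego)


def stego_hash(cover, feature):
    """Hash of the stego image obtained by embedding the feature in the cover."""
    return hashlib.sha256(embed_lsb(cover, feature)).hexdigest()


def verify(cover, captured_feature, enrolled_hash):
    """Re-embed the freshly captured feature and require an exact hash match."""
    candidate = stego_hash(cover, captured_feature)
    return hmac.compare_digest(candidate, enrolled_hash)


cover = bytes(range(64))       # stand-in for image pixel data
enrolled = stego_hash(cover, b"abc")   # enrollment with a made-up feature
print(verify(cover, b"abc", enrolled))  # same feature: match
print(verify(cover, b"abd", enrolled))  # different feature: no match
```

Because the comparison is on hashes, any single differing bit in the embedded feature changes the outcome, which is why the sketch assumes the captured feature is reproducible bit for bit.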
Abstract
The File Forgery Detection task is in its first edition in 2019. This year, it is composed of three subtasks: a) Forged file discovery, b) Stego image discovery, and c) Secret message discovery. The data set contained 6,400 images and pdf files, divided into 3 sets. There were 61 participants, and the majority of them participated in all the subtasks. This highlights the major concern the scientific community shows for security issues and the importance of each subtask. The subtasks received a) 8, b) 31, and c) 14 submissions, respectively. Although the datasets were small, most of the participants used deep learning techniques, especially in subtasks 2 and 3. The results obtained in subtask 3, which was the most difficult one, showed that there is room for improvement, as more advanced techniques are needed to achieve better results. The deep learning techniques adopted by many researchers are a first step in that direction and showed that they may provide a promising steganalysis tool to a digital forensics examiner.
Abstract
Currently, there is a plethora of wearable video devices that can easily collect data from a user's daily life. This fact has promoted the development of lifelogging applications for security, healthcare, and leisure. However, the retrieval of events that are not pre-defined is still a challenge, because it is impossible to have a potentially unlimited number of fully annotated databases covering all possible events. This work proposes an interactive and weakly supervised learning approach that is able to retrieve any kind of event using general and weakly annotated databases. The proposed system has been evaluated with the database provided by the Lifelog Moment Retrieval (LMRT) challenge of ImageCLEF (Lifelog2018), where it reached the first position in the final ranking.
Abstract
This paper presents an overview of the ImageCLEF 2019 lab, organized as part of the Conference and Labs of the Evaluation Forum - CLEF Labs 2019. ImageCLEF is an ongoing evaluation initiative (started in 2003) that promotes the evaluation of technologies for annotation, indexing and retrieval of visual data with the aim of providing information access to large collections of images in various usage scenarios and domains. In 2019, the 17th edition of ImageCLEF runs four main tasks: (i) a medical task that groups three previous tasks (caption analysis, tuberculosis prediction, and medical visual question answering) with new data, (ii) a lifelog task (videos, images and other sources) about daily activities understanding, retrieval and summarization, (iii) a new security task addressing the problems of automatically identifying forged content and retrieving hidden information, and (iv) a new coral task about …
Abstract
This paper presents an overview of the foreseen ImageCLEF 2019 lab that will be organized as part of the Conference and Labs of the Evaluation Forum - CLEF Labs 2019. ImageCLEF is an ongoing evaluation initiative (started in 2003) that promotes the evaluation of technologies for annotation, indexing and retrieval of visual data with the aim of providing information access to large collections of images in various usage scenarios and domains. In 2019, the 17th edition of ImageCLEF will run four main tasks: (i) a Lifelog task (videos, images and other sources) about daily activities understanding, retrieval and summarization, (ii) a Medical task that groups three previous tasks (caption analysis, tuberculosis prediction, and medical visual question answering) with newer data, (iii) a new Coral task about segmenting and labeling collections of coral images for 3D modeling, and (iv) a new Security task …
Abstract
This paper describes our contribution to the Lifelog Moment Retrieval (LMRT) challenge of ImageCLEF Lifelog2018. Lifelogging has tremendous potential in many applications. However, the wide range of possible moment events, along with the lack of fully annotated databases, makes this task very challenging. This work proposes an interactive and weakly supervised learning approach that can dramatically reduce the time to retrieve any kind of event in huge databases. The approach ranked first in the challenge.
Abstract
A system for extracting textual information from document images with complex layouts is presented. It is based on both layout analysis and text localization techniques. Layout analysis is first applied to segment the page into text and non-text blocks, and then text localization is used to detect text that may be embedded inside images, charts, diagrams, tables, etc. Detailed experiments on scanned Arabic newspapers showed that combining layout analysis and text localization methods can lead to improved page segmentation and text extraction results.
Abstract
This paper deals with a system built around a robot named El Greco that can be programmed over the internet, with livestreaming, from anywhere. The robot can move around a playroom and perform other functions, such as producing speech in many languages, recognizing voice commands, recognizing faces, perceiving its environment, performing combined movements, and providing information by searching the Internet. All these capabilities of the robot can be programmed using a friendly block programming platform that has been developed for use by students for educational purposes. The system was used by a number of students, and the results of two questionnaires, completed before and after its use, are reported.
Abstract
In this paper, the humanoid robot named El Greco is presented (design, hardware and software). The humanoids of the DARPA Robotics Challenge inspired us in many respects, while the goal was to keep the cost low, so that the resulting model is affordable and easy to build and can be used by young people. The robot is assembled from 3D-printed parts such as the limbs, head, body, legs and base. The design specifications and the problems encountered during development are described for each module of the humanoid. Further details and results are given for the open source software of El Greco.
Abstract
In order to convert a document image into an editable version, an OCR engine must identify and separate the non-text regions from the text regions of the image. In the present work, a technique is developed to classify the various text and non-text regions present in a document image. For that purpose, a modified version of the Histogram of Oriented Gradients (HOG) is used as a feature descriptor. A Multi-Layer Perceptron (MLP) is chosen from a pool of classifiers by comparing its recognition accuracy with that of two other well-known classifiers, viz. Random Forest (RF) and Naive Bayes (NB). The designed technique is evaluated on a dataset containing 862 images of manually extracted regions from two standard databases, namely the RDCL2015 dataset and the Media Team Document dataset. The proposed system has achieved 96.44% recognition accuracy and outperformed some of the state-of-the-art feature descriptors …
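As a rough illustration of the kind of feature a HOG-style descriptor is built from, the sketch below computes a magnitude-weighted histogram of gradient orientations for a single image cell. It is the plain building block only, not the modified descriptor of the paper: the function name, the 9-bin layout, the central-difference gradients and the L2 normalization are all assumptions made for illustration, and cell tiling, block normalization and the MLP classifier are omitted.

```python
import math

def orientation_histogram(gray, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `gray` is a list of rows of numeric pixel values. Orientations are
    folded into [0, 180) degrees and the result is L2-normalized.
    The bin count and normalization are assumed choices, not the paper's.
    """
    height, width = len(gray), len(gray[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    # Border pixels are skipped because central differences need both neighbours.
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gray[r][c + 1] - gray[r][c - 1]
            gy = gray[r + 1][c] - gray[r - 1][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against a rounding result of exactly 180.0.
            hist[int(angle // bin_width) % bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist
```

A region dominated by vertical strokes concentrates its energy in the first bin, whereas a flat background yields an all-zero vector; differences of this kind are what a classifier can exploit to separate text from non-text regions.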
Abstract
State-of-the-art writer identification systems use a variety of features and techniques to identify the writer of a handwritten text. In this paper, several statistical and model-based features are presented. Specifically, an improvement of a statistical feature, the edge hinge distribution, is attempted. Furthermore, the combination of this feature with a model-based feature that relies on a codebook of graphemes is explored. For the evaluation, the Firemaker DB was used, which consists of 250 writers with 4 pages per writer. The proposed statistical approach, the skeleton hinge distribution, achieved its best accuracy of 90.8%, while combining it with the codebook of graphemes reached 96%.
Abstract
This paper presents a novel procedure for localizing text in scene photos. It takes advantage of the fact that text must present some contrast with the background in order to be distinguished by the human eye. A binarization procedure is applied to create images appropriate for text detection. The connected components of the image are extracted, and heuristic rules are applied to identify areas containing text. Finally, overlaps are handled and false detections are rejected. The method is evaluated using natural scene images taken from the ICDAR2011 robust reading competition. The results are promising and some useful conclusions are drawn.
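The first stages of the pipeline described above (binarization, connected component extraction, heuristic filtering) can be illustrated with a minimal sketch. The fixed threshold, the 4-connectivity, and the size and aspect-ratio limits are assumptions chosen for illustration; the abstract does not state the actual binarization method or rules, and the overlap handling step is not shown.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Mark pixels darker than `threshold` as foreground (1). The value is assumed."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def connected_components(binary):
    """Return bounding boxes (top, left, bottom, right) of 4-connected foreground regions."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def looks_like_text(box, min_size=2, max_aspect=5.0):
    """Hypothetical rule: reject components that are tiny or extremely elongated."""
    top, left, bottom, right = box
    h, w = bottom - top + 1, right - left + 1
    if h < min_size or w < min_size:
        return False
    return max(h, w) / min(h, w) <= max_aspect

def text_candidates(gray):
    """Binarize, extract components, and keep those passing the heuristic rule."""
    return [b for b in connected_components(binarize(gray)) if looks_like_text(b)]
```

With these assumed rules, a compact dark blob survives while isolated specks and thin one-pixel lines are rejected as false detections.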
Abstract
Blindness is the condition of lacking visual perception due to physiological or neurological factors. Blind people do not have full perception of the surrounding environment, so navigating in an unknown environment and/or with obstacles en route can be a very difficult task. In this paper, a mobile information system is presented that acts as an electronic travel aid: it can guide a blind person along a route, inform the user about imminent obstacles in the path, and help with orientation. The current prototype consists of a mobile phone and the developed application.
Abstract
Ruling line removal is an important pre-processing step in document image processing. Several algorithms have been proposed for this task. However, it is important to be able to take full advantage of the existing algorithms by adapting them to the specific properties of a document image collection. In this paper, a system is presented for fine-tuning the parameters of ruling line removal algorithms, or otherwise adapting them to a specific document image collection, in order to improve the results. The application of our method to an existing line removal algorithm is presented.
Abstract
In this article, the GCDB (Greek Characters DataBase) system is presented. GCDB is a complete database system for storing images of unconstrained handwritten Greek characters. It is composed of three elements: a specialized input form, a database that contains the images of the filled-in forms, and the software that transfers the data from the forms into the database and retrieves them. The main purpose of this database system is to enable the automatic storage and organization of images of Greek symbols and letters for use by OCR (Optical Character Recognition) systems or other applications. The GCDB system is designed with future expansion in mind, providing an up-to-date database of Greek handwritten characters to cover the growing demands for offline character recognition of the Greek language.
Abstract
In this paper, we present a binarization technique specifically designed for historical document images. Existing methods for this problem focus on either finding a good global threshold or adapting the threshold for each area so as to remove smears, stains, uneven illumination, etc. We propose a hybrid approach that first applies a global thresholding method and then identifies the image areas that are more likely to still contain noise. Each of these areas is re-processed separately to achieve better binarization quality. We evaluate the proposed approach on different kinds of degradation problems. The results show that our method can handle hard cases, while documents already in good condition are not affected drastically.
Abstract
It is common for libraries to provide public access to historical and ancient document image collections. Such document images often require specialized processing in order to remove background noise and become more legible. In this paper, we propose a hybrid binarization approach for improving the quality of old documents using a combination of global and local thresholding. First, a global thresholding technique specifically designed for old document images is applied to the entire image. Then, the image areas that still contain background noise are detected and the same technique is re-applied to each area separately. Hence, we achieve better adaptability of the algorithm in cases where various kinds of noise coexist in different areas of the same image, while avoiding the computational and time cost of applying local thresholding to the entire image. Evaluation results based on a collection of historical document images indicate that the proposed approach is effective in removing background noise and improving the quality of degraded documents, while documents already in good condition are not affected.
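The two-pass idea (threshold globally, detect areas that still look noisy, re-apply the same technique to each such area) can be sketched as follows. This is a minimal illustration under stated assumptions: Otsu's method stands in for the global thresholding technique, which the abstract does not name, and the fixed block grid and the `max_ink_ratio` noise test are hypothetical choices rather than the detection rule of the paper. Pixel values are assumed to be integers in 0-255.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    n_dark, sum_dark = 0, 0
    for t in range(256):
        n_dark += hist[t]
        if n_dark == 0:
            continue
        n_bright = total - n_dark
        if n_bright == 0:
            break
        sum_dark += t * hist[t]
        mean_dark = sum_dark / n_dark
        mean_bright = (sum_all - sum_dark) / n_bright
        variance = n_dark * n_bright * (mean_dark - mean_bright) ** 2
        if variance > best_var:
            best_var, best_t = variance, t
    return best_t

def hybrid_binarize(gray, block=4, max_ink_ratio=0.5):
    """Global threshold first, then re-threshold blocks that are mostly ink.

    Returns a grid with 1 for ink and 0 for background. A block whose ink
    ratio exceeds `max_ink_ratio` (an assumed noise test) is treated as
    still containing background noise and is thresholded again on its own.
    """
    height, width = len(gray), len(gray[0])
    t_global = otsu_threshold([p for row in gray for p in row])
    ink = [[1 if p <= t_global else 0 for p in row] for row in gray]
    for r0 in range(0, height, block):
        for c0 in range(0, width, block):
            cells = [(r, c)
                     for r in range(r0, min(r0 + block, height))
                     for c in range(c0, min(c0 + block, width))]
            values = [gray[r][c] for r, c in cells]
            ink_ratio = sum(ink[r][c] for r, c in cells) / len(cells)
            # A uniform block has no local contrast, so the global result is kept.
            if ink_ratio <= max_ink_ratio or min(values) == max(values):
                continue
            t_local = otsu_threshold(values)
            for r, c in cells:
                ink[r][c] = 1 if gray[r][c] <= t_local else 0
    return ink
```

Because only the flagged blocks are processed a second time, the cost stays close to that of a single global pass, which is the trade-off the abstract highlights against applying local thresholding everywhere.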
Abstract
In this paper, the problem of music performer verification is introduced. Given a certain performance of a musical piece and a set of candidate pianists, the task is to examine whether or not a particular pianist is the actual performer. A database of 22 pianists playing pieces by F. Chopin on a computer-controlled piano is used in the presented experiments. An appropriate set of features that captures the idiosyncrasies of music performers is proposed. Well-known machine learning techniques for constructing learning ensembles are applied, and remarkable results are described in verifying the actual pianist, a very difficult task even for human experts.
Abstract
In this paper, we present a trainable approach to discriminate between machine-printed and handwritten text. An integrated system able to localize text areas and split them into text-lines is used. A set of simple and easy-to-compute structural characteristics that capture the differences between machine-printed and handwritten text-lines is introduced. Experiments on document images taken from the IAM-DB and GRUHD databases show a remarkable performance of the proposed approach, which requires minimal training data.
Abstract
This paper presents a character segmentation algorithm for unconstrained cursive handwritten text. The transformation-based learning method and a simplified variation of it are used to automatically extract rules that detect the segment boundaries. Comparative experimental results are given for a collection of multi-writer handwritten words. The achieved accuracy in detecting segment boundaries exceeds 82%. Moreover, limited training data can provide very satisfactory results.
Copyright Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted or mass reproduced without the explicit permission of the copyright holder.