This paper describes static analysis for languages with exception handling. A low-level intermediate representation that supports exceptions is proposed. Data-flow analyses for unreachable-code detection are described, and a general scheme of static analysis that takes exception-related paths into account is given. The algorithms were implemented as part of the Svace static analysis tool for the C++, Java, and Kotlin languages.
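As a minimal illustration of the unreachable-code detection mentioned above, the sketch below computes reachability over a control-flow graph whose edges include exception-related successors; blocks outside the reachable set are reported. The graph shape and block names are invented for the example and do not reflect Svace's actual intermediate representation.

```python
# Hypothetical sketch: unreachable-code detection on a CFG whose successor
# edges include exception-related paths (illustrative, not Svace's IR).

def reachable_blocks(cfg, entry):
    """cfg: dict mapping block -> list of successors, including
    successors reached via thrown exceptions."""
    seen, stack = set(), [entry]
    while stack:
        b = stack.pop()
        if b in seen:
            continue
        seen.add(b)
        stack.extend(cfg.get(b, []))
    return seen

def unreachable_blocks(cfg, entry):
    return set(cfg) - reachable_blocks(cfg, entry)

# "cleanup" is reachable only via the exception edge out of "call";
# "dead" has no incoming edges at all and is reported.
cfg = {
    "entry":   ["call"],
    "call":    ["ok", "cleanup"],   # second successor is the exceptional path
    "ok":      ["exit"],
    "cleanup": ["exit"],
    "dead":    ["exit"],
    "exit":    [],
}
print(sorted(unreachable_blocks(cfg, "entry")))  # ['dead']
```

Note that ignoring the exception edge would wrongly report `cleanup` as dead, which is exactly why exception-aware successors matter here.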
The paper describes an approach for improving the accuracy of general-purpose static symbolic-execution analysis of C# sources, based on tracking class fields that can take only one possible value. In addition, we propose detectors of forgotten readonly modifiers and of unused fields that use the data collected by the main analysis. The approach and detectors were implemented as part of the industrial static analyzer SharpChecker. The main analysis is performed at the AST level to reduce time and resource costs. The collected field values are used during the symbolic-execution phase, allowing it to use a concrete value instead of a symbolic one for a subset of class fields. As a result, we managed to noticeably improve the accuracy of several checkers, such as UNREACHABLE_CODE (improved by 7.57%) and DEREF_OF_NULL (improved by 1.33%), and to obtain new results in cases with forgotten readonly modifiers or unused fields. The achieved results allowed the analysis and detectors to be included in the main branch of SharpChecker and made available to users. The paper considers the detector algorithm in detail and provides examples of results on a set of open-source software.
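The idea of collecting single-valued fields at the AST level can be illustrated with a toy pass (this is a hypothetical sketch, not SharpChecker's implementation): record every constant assigned to each field; a field assigned exactly one constant can be treated as concrete during symbolic execution, and a field that is never read is a candidate "unused field" report.

```python
# Toy illustration of the field-fact collection described above.
from collections import defaultdict

def collect_field_facts(assignments, reads):
    """assignments: list of (field_name, constant_value) seen at the AST level;
    reads: set of field names that are ever read anywhere in the program."""
    values = defaultdict(set)
    for field, value in assignments:
        values[field].add(value)
    # Fields with exactly one observed value can be concretized.
    single_valued = {f: next(iter(vs)) for f, vs in values.items() if len(vs) == 1}
    # Fields assigned but never read are "unused field" candidates.
    unused = set(values) - reads
    return single_valued, unused

assignments = [("flag", True), ("mode", 1), ("mode", 2), ("debugName", "x")]
reads = {"flag", "mode"}
single, unused = collect_field_facts(assignments, reads)
print(single)  # {'flag': True, 'debugName': 'x'}
print(unused)  # {'debugName'}
```

A real analysis must also account for reflection, constructors, and assignments through `this`/`base`, which this sketch ignores.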
The paper is devoted to a scalable approach to detecting uses of disposed resources in C# source code, based on static symbolic execution. The resulting detector is implemented as part of the industrial analyzer SharpChecker, which performs scalable interprocedural path- and context-sensitive analysis. The evaluation of the developed detector shows a 70% true-positive ratio, allowing it to be included in the standard set of detectors and its functionality to be provided to users. The paper describes a detection algorithm that takes into account the limitations imposed by the existing SharpChecker infrastructure, its evaluation on a set of open-source programs containing 6 million LOC, and some examples of errors found in real projects.
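The core path-sensitive check can be illustrated with a toy state machine over the events of a single execution path (a hypothetical sketch, not SharpChecker's actual code): a resource enters a "disposed" state when `Dispose()` is called on it, and any later use on the same path is reported.

```python
# Minimal use-after-dispose check along one execution path.
def find_use_after_dispose(path_events):
    """path_events: list of (op, resource), op in {'alloc', 'dispose', 'use'}."""
    disposed = set()
    reports = []
    for i, (op, res) in enumerate(path_events):
        if op == "dispose":
            disposed.add(res)
        elif op == "use" and res in disposed:
            reports.append((i, res))        # use of a disposed resource
        elif op == "alloc":
            disposed.discard(res)           # reassignment yields a fresh resource

    return reports

path = [("alloc", "stream"), ("use", "stream"),
        ("dispose", "stream"), ("use", "stream")]
print(find_use_after_dispose(path))  # [(3, 'stream')]
```

In a real interprocedural analysis, the dispose/use events come from summaries of callees rather than a flat event list.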
Static taint analysis can be used to find various security weaknesses and vulnerabilities in programs by discovering dataflow paths from taint sources to taint sinks. In most cases, data is called "tainted" if it was obtained from an untrusted source without proper sanitization. In this paper we present a static taint analyzer, Irbis. It implements an analysis based on the IFDS (Interprocedural Finite Distributive Subset) dataflow problem, as well as various extensions aimed at improving the accuracy and completeness of the analysis. It supports different definitions of tainted data, which enables it to find such weaknesses as out-of-bounds buffer access, use of freed memory, hardcoded passwords, and data leaks, and to discover dataflow paths between user-defined sources and sinks. All source, sink, and propagator definitions are stored in JSON format and can be adjusted to meet users' needs. We compare analysis results on the Juliet Test Suite for C/C++ with several other analyzers, such as Infer, Clang Static Analyzer, and Svace. Irbis demonstrates 100% coverage on the taint-related subset of tests for the implemented CWEs, while suppressing all false positives using heuristics. We also show performance and the false-positive rate on real projects, with examples of real vulnerabilities that Irbis can detect.
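The configurable source/sink idea can be sketched as follows. The JSON shape and function names below are invented for illustration and are not Irbis's actual schema; the propagation itself is a plain fixpoint over dataflow edges rather than a full IFDS solver.

```python
# Toy taint propagation driven by a JSON-style source/sink configuration.
import json

config = json.loads("""
{
  "sources": ["read_input"],
  "sinks":   ["exec_query"],
  "propagators": ["concat"]
}
""")

def find_tainted_sinks(flows, config):
    """flows: list of (src_node, dst_node) dataflow edges.
    Returns the configured sinks that taint can reach."""
    tainted = set(config["sources"])
    changed = True
    while changed:                       # iterate to a fixpoint
        changed = False
        for src, dst in flows:
            if src in tainted and dst not in tainted:
                tainted.add(dst)
                changed = True
    return [s for s in config["sinks"] if s in tainted]

flows = [("read_input", "concat"), ("concat", "exec_query")]
print(find_tainted_sinks(flows, config))  # ['exec_query']
```

A real IFDS formulation additionally distinguishes calling contexts and tracks per-variable dataflow facts, which this toy omits.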
We introduce what is possibly the first approximation of programming-language metrics: a spectrum of over 70 unique, carefully gathered dimensions along which any two programming languages can be compared. Based on these metrics, one can evaluate one's own "best" language and demonstrate how vague notions such as "simplicity" and "ease of use", often found as arguments in language debates and advertisements, can be decomposed into clear and measurable pieces. We publish the collection as a completely separate open-source file (included here as an appendix) so that everyone can participate in eliciting new and interesting dimensions used in programming-language research, development, and use. The metrics can be used to compare languages, define requirements, create rankings, give tips to language designers, and simply provide a bird's-eye view of existing language features found in the wild.
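One way such per-dimension metrics could be turned into a personal ranking is a weighted sum, as sketched below. The dimension names, scores, and weights are invented for illustration; the paper's actual collection defines its own dimensions.

```python
# Hypothetical weighted ranking over per-dimension language metrics.
def rank_languages(scores, weights):
    """scores: {language: {dimension: value in [0, 1]}};
    weights: {dimension: personal importance}."""
    def total(lang):
        return sum(weights.get(d, 0) * v for d, v in scores[lang].items())
    return sorted(scores, key=total, reverse=True)

scores = {
    "LangA": {"tooling": 0.9, "learnability": 0.4},
    "LangB": {"tooling": 0.5, "learnability": 0.9},
}
# A reader who values learnability twice as much as tooling:
print(rank_languages(scores, {"tooling": 1.0, "learnability": 2.0}))
# ['LangB', 'LangA']
```

Changing the weights changes the ranking, which is precisely the point: "best" is decomposed into explicit, adjustable pieces.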
The creation of new generations of autonomous robotic complexes, recognition systems, and vision systems in general is impossible without modern computer technologies. This article presents models of a robot vision system based on Elbrus microprocessors. Models of detection, classification, and segmentation tasks were developed. The models are based on the number of arithmetic operations required to perform a forward pass. They take into account such features of Elbrus microprocessors as the number of execution units, pipelining, data prefetching, clock frequency, etc. Theoretical and experimental results were obtained on existing and upcoming Elbrus microprocessors. It is shown that Elbrus microprocessors can serve as the basis of an on-board vision system. The results obtained by the authors indicate the prospects of import substitution in the field of robotics.
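A back-of-the-envelope model in the spirit described above counts the arithmetic operations of a layer and divides by the processor's peak throughput. The formulas and parameter values below are generic illustrations, not the paper's exact model or real Elbrus figures.

```python
# Operation-count model for one convolution layer and a naive time estimate.
def conv_ops(h_out, w_out, c_in, c_out, k):
    # One multiply-accumulate = 2 arithmetic operations (multiply + add).
    return 2 * h_out * w_out * c_out * c_in * k * k

def forward_pass_seconds(total_ops, ops_per_cycle, clock_hz, efficiency=0.5):
    # `efficiency` crudely stands in for pipeline stalls, memory traffic, etc.
    return total_ops / (ops_per_cycle * clock_hz * efficiency)

ops = conv_ops(112, 112, 64, 64, 3)          # one 3x3 convolution layer
t = forward_pass_seconds(ops, ops_per_cycle=48, clock_hz=1.5e9)
print(ops, round(t * 1000, 2), "ms")          # 924844032 ops, ~25.69 ms
```

Summing such per-layer estimates over a whole network gives a forward-pass estimate that can be checked against measurements on the target processor.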
Our paper compares the accuracy of a vanilla ResNet-18 model with that of Clipped BagNet-33 and BagNet-33 models with adversarial training under different conditions. We performed experiments on images attacked by an adversarial sticker under various image transformations. An adversarial sticker is a small region of the attacked image within which pixel values can be changed arbitrarily, which can cause errors in the model's predictions. The transformations of the attacked images in this paper simulate the distortions that appear in the physical world when a change in perspective, scale, or lighting alters the image. Our experiments show that models from the BagNet family perform poorly on low-quality images. We also analyzed the effects of different types of transformations on the models' robustness to adversarial attacks and their tolerance of these attacks.
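The threat model described above can be made concrete with a few lines of NumPy: the attacker overwrites a small rectangular region of the image with arbitrary pixel values. The sticker size, position, and pixel values here are arbitrary examples.

```python
# Applying an "adversarial sticker": a patch whose pixels the attacker controls.
import numpy as np

def apply_sticker(image, sticker, top, left):
    """Overwrite an (h, w, c) region of `image` with `sticker` pixels."""
    h, w = sticker.shape[:2]
    patched = image.copy()
    patched[top:top + h, left:left + w] = sticker
    return patched

image = np.zeros((224, 224, 3), dtype=np.uint8)
sticker = np.full((32, 32, 3), 255, dtype=np.uint8)   # attacker-chosen pixels
patched = apply_sticker(image, sticker, top=96, left=96)
print((patched != image).sum())  # only the 32x32x3 patch differs
```

In a real attack the sticker contents are optimized against the model rather than fixed, but the pixel budget — a bounded region with unbounded per-pixel change — is the same.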
Identifying a person in a digital image using computer vision is a crucial task in this field. The presence of external objects, such as medical masks that cover part of the face, can drastically reduce recognition accuracy, increasing errors from 5% to 50% depending on the algorithm. This paper investigates the use of neural networks, in particular generative adversarial networks (GANs), to reconstruct the image of a face covered by a medical mask in order to improve face-recognition accuracy.
This research paper focuses on the use of computer vision in intelligent systems to analyze human contours. With the growth of technology across industries, there is a need to improve the efficiency of human-computer systems. The proposed method uses a video camera and computer software to detect a person in an image and process it using the OpenCV library and the C++ programming language. The paper reviews existing human-detection methods, analyzes an alternative method that uses computer vision, and develops a new method for human detection. Modifications include the use of the Kuwahara filter for image blurring and the Sobel operator for outline extraction. Applications of this technology include security at transportation hubs and in crowded areas, remote health monitoring, enhanced control at borders and secure facilities, and interactive advertising and entertainment.
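The Sobel outline-extraction step mentioned above can be sketched in pure NumPy (standing in for the paper's OpenCV/C++ implementation): convolve the grayscale image with the horizontal and vertical Sobel kernels and take the gradient magnitude.

```python
# Sobel gradient magnitude over the valid region of a grayscale image.
import numpy as np

KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
KY = KX.T   # vertical-gradient kernel is the transpose

def sobel_magnitude(gray):
    """gray: 2-D float array; returns gradient magnitude (valid region)."""
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            win = gray[i:i + 3, j:j + 3]
            gx = (win * KX).sum()
            gy = (win * KY).sum()
            out[i, j] = np.hypot(gx, gy)
    return out

# A vertical step edge produces a strong response along its column.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
mag = sobel_magnitude(img)
print(mag.max())  # 4.0
```

The Kuwahara blurring step applied before this in the paper's pipeline smooths regions while preserving edges, so the Sobel response stays sharp where it matters.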
The article proposes a new method for automatic data annotation for the problem of document-image segmentation using deep object-detection neural networks. Marked PDF files are considered as the initial data for annotation; the peculiarity of this format is that it includes hidden marks describing the logical and physical structure of the document. To extract them, a tool was developed that simulates the operation of a stack-based rendering machine according to the PDF format specification. For each page of the document, an image and an annotation in PASCAL VOC format are generated. The classes and coordinates of the bounding boxes are computed during the interpretation of the marked PDF file based on the marks. To test the method, a collection of marked PDF files was assembled, from which images of document pages and annotations for three segmentation classes (text, table, figure) were obtained automatically. A neural network with the EfficientDet D2 architecture was trained on these data. The model was tested on manually labeled data from the same domain, which confirmed the effectiveness of using automatically generated data for solving applied problems.
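The annotation-emission step can be sketched as follows: given the class names and bounding boxes recovered from the marked PDF, write a PASCAL VOC XML annotation for the page. The file names, image size, and box coordinates below are illustrative placeholders.

```python
# Emitting a PASCAL VOC annotation from recovered classes and bounding boxes.
import xml.etree.ElementTree as ET

def voc_annotation(filename, width, height, boxes):
    """boxes: list of (class_name, xmin, ymin, xmax, ymax) in pixels."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    for tag, val in (("width", width), ("height", height), ("depth", 3)):
        ET.SubElement(size, tag).text = str(val)
    for cls, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = cls
        bnd = ET.SubElement(obj, "bndbox")
        for tag, val in (("xmin", xmin), ("ymin", ymin),
                         ("xmax", xmax), ("ymax", ymax)):
            ET.SubElement(bnd, tag).text = str(val)
    return ET.tostring(root, encoding="unicode")

xml = voc_annotation("page_001.png", 1240, 1754,
                     [("table", 100, 200, 1100, 600),
                      ("figure", 100, 700, 1100, 1500)])
print(xml[:60])
```

One such XML file per rendered page image is exactly what common detection training pipelines, including those for EfficientDet, can consume.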
The world is moving towards alternative medicine and behavioural alteration for treating, managing, and preventing chronic diseases. In the last few decades, diagrammatic models have been extensively used to describe and understand the behaviour of biological organisms (biological agents) due to their simplicity and comprehensiveness. However, these models can only offer a static picture of the corresponding biological systems, with limited scalability. As a result, there is an increasing demand to integrate such formalisms into more dynamic forms that are more scalable and can capture complex time-dependent processes. In this paper, we introduce a generic disease model called the Communicating Stream X-Machine Disease Model (CSXMDM), developed on the basis of X-Machine and Communicating X-Machine theories. We conducted an experiment on modelling an actual disease using a case study of Type II Diabetes. The results of the experiment demonstrate that the proposed CSXMDM is capable of modelling chronic diseases.
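The essence of a stream X-machine — a finite-state control whose transitions also update a memory — can be shown in miniature. The states, inputs, and glucose numbers below are invented toy values loosely inspired by the diabetes case study, not the paper's actual CSXMDM model.

```python
# Minimal stream X-machine step: a transition maps (state, input) to a
# next state plus a function that updates the machine's memory.
def step(state, memory, inp, transitions):
    next_state, update = transitions[(state, inp)]
    return next_state, update(memory)

transitions = {
    ("normal",   "meal"):    ("elevated", lambda m: {**m, "glucose": m["glucose"] + 40}),
    ("elevated", "insulin"): ("normal",   lambda m: {**m, "glucose": m["glucose"] - 40}),
}

state, mem = "normal", {"glucose": 90}
for inp in ("meal", "insulin"):       # process an input stream
    state, mem = step(state, mem, inp, transitions)
print(state, mem)  # normal {'glucose': 90}
```

Communicating X-machines extend this picture by letting several such machines exchange their inputs and outputs, which is what makes time-dependent multi-organ processes expressible.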
The article presents the results of a study of the field material of the Sosva dialect of Mansi recorded in the village of Lombovozh. Its phonological analysis was performed using the modern data-processing system LingvoDoc. Comparing the vowel system obtained from this analysis with other phonetic systems of the Sosva dialect proposed by linguists of the 20th century led us to conclude that the phonetic system of this Mansi dialect requires further refinement. Thus, in the course of the study, inaccuracies in the interpretation of the field material from the village of Lombovozh were revealed, and a number of unique phonemes (o, ɛ, e, i, ə, u) were found whose F1 and F2 values are uncharacteristic of the generally accepted parameters of the International Phonetic Alphabet, which are based mainly on data from the analysis of European languages.
The article discusses research perspectives on the Tatar language based on the LingvoDoc platform. The digitalization of language study in modern linguistics allows us to move to a new level of describing language structure. Since the 1990s, large corpora containing millions of word forms have been created for all major European languages. Currently, this has been done not only for Russian but also for many national languages of Russia, such as Tatar, Bashkir, Udmurt, Mari, Moksha, Komi, etc. One of the recognized platforms in modern national linguistics is the LingvoDoc virtual laboratory, created at ISP RAS. This platform makes it possible to create, store, and analyze multilayer dictionaries, language materials, and dialects. The main functionality of LingvoDoc is used by more than 250 linguists who process their materials online; more than 1000 dictionaries and 300 text corpora in the national languages of the Russian Federation have already been collected. We consider the possibilities of this platform for studying the Tatar language. We believe that electronic corpora allow us to solve a variety of theoretical and practical problems of the language. At present, when the Tatar literary and everyday spoken language is actively used in all fields, it is very important to produce a complete description of its features, which will help create more accurate grammars and dictionaries. The relevance of the study stems from the need for a glossed corpus of texts in the Tatar language. As modern linguistic studies show, it is nowadays impossible to describe the state of a language or analyze its grammatical structure without such corpora, which corresponds to the world standards of modern science.
The LingvoDoc platform makes it possible to process a significant amount of material in a short time and to create corpora with glossing and resolved homonymy based on samples of Tatar literary, business, colloquial, and dialectal language.
This study investigates the linguistic competence of modern language models. Artificial neural networks demonstrate high quality in many natural-language-processing tasks; however, their implicit grammatical knowledge remains understudied. The ability to judge a sentence as grammatical or ungrammatical is regarded as a key property of human linguistic competence. We suppose that language models' grammatical knowledge likewise manifests itself in their ability to judge the grammaticality of a sentence. To test neural networks' linguistic competence, we probe their acquisition of number agreement between subject and predicate in Russian. A dataset consisting of artificially generated grammatical and ungrammatical sentences was created to train the language models. Automatic sentence generation allows us to test the acquisition of a particular language phenomenon while abstracting away from vocabulary and pragmatic differences. We use transfer learning of pre-trained neural networks. The results show that all the considered models demonstrate high accuracy and Matthews correlation coefficient values, which can be attributed to successful acquisition of predicate-agreement rules. Classification quality is reduced for sentences with inanimate nouns, which show nominative-accusative case syncretism. The complexity of the syntactic structure turns out to be significant for the Russian models and a model for Slavic languages, but it does not affect the error distribution of the multilingual models.
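The generation of minimal grammatical/ungrammatical pairs can be sketched as below. The Russian lexical items and the single template are toy examples; the paper's generator covers varied vocabulary and syntactic structures.

```python
# Toy generator of number-agreement minimal pairs for grammaticality probing.
import itertools

NOUNS = {"sg": "студент", "pl": "студенты"}   # 'student(s)', nominative subject
VERBS = {"sg": "читает",  "pl": "читают"}     # 'reads / read', present tense

def agreement_pairs():
    """Yield (sentence, label): 1 = grammatical, 0 = agreement violation."""
    for subj_num, verb_num in itertools.product(("sg", "pl"), repeat=2):
        sentence = f"{NOUNS[subj_num]} {VERBS[verb_num]} книгу"   # '... a book'
        yield sentence, int(subj_num == verb_num)

data = list(agreement_pairs())
for sentence, label in data:
    print(label, sentence)
```

Because each ungrammatical sentence differs from its grammatical twin only in the agreement feature, a classifier trained on such data cannot rely on vocabulary or pragmatics — only on the grammatical contrast.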
The influence of relative longitudinal position on the frequency characteristics of two interacting ships floating stationary in close proximity in head waves and shallow water is investigated in this paper. A CFD approach was adopted to simulate the dynamic behavior of the interacting ships. Numerical simulation of the "Aleksey Kosygin" and "Novgorod" ships floating in head waves at a small relative transverse distance was carried out using two scaled models. The effect of longitudinal separation on the frequency characteristics of both ships was studied. Heave, roll, and pitch RAOs for various cases were analyzed, and recommendations for relative longitudinal positions were made on the basis of the present analysis.
In this paper, the gas flow in the flow path of a turbomolecular vacuum pump was simulated using the Cercignani-Lampis (CL) model. The CL model was used as a new boundary condition when calculating transition probabilities. The test-particle method (a Monte Carlo method) was used in the simulation. The transition probabilities of molecules through the blade channel in the forward and reverse directions, the resulting transition probability, and the compression ratio were calculated.
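The test-particle idea can be illustrated with a heavily simplified sketch: launch particles into a 2-D channel, follow them through wall collisions, and count the fraction that exits at the far end. The sketch uses fully diffuse (cosine-law) re-emission rather than the CL scattering kernel, a flat channel rather than real blade geometry, and arbitrary dimensions — it shows only the Monte Carlo transmission-probability mechanism, not the paper's model.

```python
# Simplified test-particle estimate of channel transmission probability.
import math
import random

def transmission_probability(length, height, n=20000, seed=1):
    """Fraction of particles entering at x=0 that exit at x=length."""
    rng = random.Random(seed)
    def cosine_angle():
        # Cosine-law (diffuse) emission angle from the surface normal, 2-D.
        return math.asin(2 * rng.random() - 1)
    passed = 0
    for _ in range(n):
        x, y = 0.0, rng.uniform(0.0, height)
        phi = cosine_angle()
        dx, dy = math.cos(phi), math.sin(phi)     # inlet normal points along +x
        while True:
            if dy == 0.0:                          # flies parallel to the walls
                passed += dx > 0
                break
            wall_y = height if dy > 0 else 0.0
            t = (wall_y - y) / dy                  # flight time to the next wall
            x_hit = x + dx * t
            if dx > 0 and x_hit >= length:         # exits through the outlet
                passed += 1
                break
            if dx < 0 and x_hit <= 0.0:            # returns through the inlet
                break
            x, y = x_hit, wall_y
            phi = cosine_angle()                   # diffuse re-emission
            dx = math.sin(phi)
            dy = -math.cos(phi) if wall_y == height else math.cos(phi)
    return passed / n

# Longer channels transmit a smaller fraction of entering particles.
p_short = transmission_probability(length=1.0, height=1.0)
p_long = transmission_probability(length=4.0, height=1.0)
print(p_short, p_long)
```

Replacing the diffuse re-emission with sampling from the CL kernel, and adding the moving-blade geometry, turns this skeleton into the kind of simulation the paper describes; the forward and reverse transition probabilities then yield the compression ratio.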
ISSN 2220-6426 (Online)