The paper describes a static analyzer for finding defects in Scala programs. The proposed analysis scheme uses the JVM bytecode produced during compilation, which serves as input for the interprocedural static analyzer Svace. In contrast to the analysis of other languages supported by Svace, the approach described in this work does not require compiler modifications and therefore simplifies language support. It can also be used in static analyzers that aim to support a large number of programming languages.
The paper describes static analysis of maps in the Go language aimed at detecting null pointer dereference when a value is extracted from a map by key. The work has been done within the Svace static analyzer. We begin by introducing the Svace intermediate representation and algorithms. Then we describe the IR changes needed for modeling Go maps and their semantics. We explain how the intraprocedural analysis is performed and how the null dereference detector works. We then proceed to the summary-based interprocedural analysis and show evaluation results on a wide range of open-source projects.
Improving the reliability of critical real-time operating systems (RTOS) remains a relevant and demanding task. The use of detailed requirements provided by the developers opens new opportunities in this direction through the memory management facilities. In this paper we present a new approach to static memory allocation in real-time systems with robust memory space partitioning. We propose to design a static memory layout tool based on a formal description of the project's memory requirements. The proposed formal requirements are platform-agnostic and are based only on the needs of the application software. We introduce general concepts that allow a universal approach to building a static memory layout tool. We also describe the general scheme of the memory layout algorithm as well as the requirements that must be taken into account when each step of the algorithm is implemented. We tested our approach on real industrial projects and confirmed its versatility, adaptability and effectiveness.
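As a rough illustration of one step such a memory layout tool might perform, the sketch below places abstract memory blocks into a fixed region using a first-fit scheme with alignment constraints. The requirement format and all names (MemoryRequirement, place_blocks) are illustrative assumptions, not the tool described in the paper.

```python
# A minimal sketch of one placement step of a static memory layout algorithm:
# first-fit placement of application memory blocks into a fixed region,
# honoring size and alignment requirements.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class MemoryRequirement:
    name: str    # symbolic name of the block (e.g. a task stack or a shared buffer)
    size: int    # requested size in bytes
    align: int   # required alignment in bytes (power of two)

def place_blocks(reqs: List[MemoryRequirement],
                 region_base: int, region_size: int) -> Dict[str, int]:
    """Assign addresses sequentially (first fit); raise if the region overflows."""
    layout, cursor = {}, region_base
    # Placing larger blocks first tends to reduce alignment padding.
    for req in sorted(reqs, key=lambda r: r.size, reverse=True):
        cursor = (cursor + req.align - 1) & ~(req.align - 1)  # round up to alignment
        if cursor + req.size > region_base + region_size:
            raise MemoryError(f"region overflow while placing {req.name}")
        layout[req.name] = cursor
        cursor += req.size
    return layout

if __name__ == "__main__":
    reqs = [MemoryRequirement("task_a_stack", 0x2000, 0x1000),
            MemoryRequirement("shared_buf", 0x300, 0x40)]
    print(place_blocks(reqs, region_base=0x20000000, region_size=0x10000))
```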
This paper presents the implementation of static analysis for Visual Basic .NET (VB.NET) within the industrial tool SharpChecker. VB.NET analysis was integrated into SharpChecker by leveraging the Roslyn compiler framework, enabling static code analysis for VB.NET projects. The process involved building support for VB.NET projects, creating a comprehensive test suite, implementing a source code indexer, and adapting existing analyzers to support VB.NET syntax nodes and operations. Evaluation on translated tests and real-world projects demonstrated production-acceptable analysis quality, paving the way for improved maintenance of VB.NET projects. Additionally, the study highlighted SharpChecker’s capability for cross-language analysis, showcasing its ability to handle mixed C# and VB.NET projects efficiently.
Formal models of access control must be described in accordance with the requirements of the regulatory documents of FSTEC of Russia in order to ensure trust in certified information security tools when they implement the corresponding access control policies. The criteria that the description of each such model must meet were established in GOST R 59453.1-2021 “Information protection. Formal access control model. Part 1. General principles” to stimulate the development of formal access control models that are adequate to the operating conditions of modern information security tools. This standard also specifies additional criteria for cases where information security tools implement specific policies: discretionary access control (DAC), mandatory access control (MAC), role-based access control (RBAC), or mandatory integrity control (MIC). To simplify the process of describing a formal model, a draft of the new standard GOST R “Information protection. Formal access control model. Part 3. Recommendations on development”, scheduled for approval in 2024, was developed with the participation of the author. This new standard is important for the development of regulatory and methodological support in this area. It will also be useful in developing formal models for information security tools that are complex system software, such as an operating system (OS) or a database management system (DBMS). The article analyzes the results of the development of this draft standard, including the stages it recommends for describing a formal model. The first is the stage of describing the states of the corresponding abstract automaton. The second is the stage of describing the rules for transitions between states of the abstract automaton. The third is the stage of formulating and proving the safety conditions, together with the technologies and practical techniques used for this. In addition, the article provides examples of testing the recommendations set out in the draft standard when reworking the mandatory entity-role model of access and information flows security control in Linux-family operating systems (the MROSL DP-model), which serves as the scientific basis for the implementation of the PARSEC security subsystem of the Astra Linux OS, certified according to the highest protection classes and trust levels.
This article introduces a new approach to tricking perceptron-based neural networks with piecewise linear activation functions using basic linear algebra. By formulating the attack as a system of linear equations and inequalities, it demonstrates a streamlined and computationally efficient approach to generating diverse sets of adversarial examples. The algorithms for the proposed attack have been implemented in code that is accessible in an open-source repository. The study highlights the formidable challenge posed by the proposed attack methodology for contemporary neural network defenses, emphasizing the pressing need for innovative defense strategies. Through a comprehensive exploration of adversarial vulnerabilities, this research contributes to the advancement of adversarial robustness in machine learning, paving the way for the development of more reliable and trustworthy artificial intelligence systems in real-world applications.
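To make the linear-algebraic formulation more concrete, the following Python sketch attacks a toy one-hidden-layer ReLU network: within the activation region of the original input the network is affine, so an adversarial example can be sought as a solution of a system of linear inequalities (solved here with SciPy's linprog). The toy network, margin and perturbation bound are assumptions for illustration; the paper's exact formulation and tooling may differ.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Toy one-hidden-layer ReLU network with random weights (stand-in for a trained model).
d_in, d_hid, d_out = 8, 16, 3
W1, b1 = rng.normal(size=(d_hid, d_in)), rng.normal(size=d_hid)
W2, b2 = rng.normal(size=(d_out, d_hid)), rng.normal(size=d_out)

def forward(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

x0 = rng.normal(size=d_in)
y0 = int(np.argmax(forward(x0)))
target = (y0 + 1) % d_out
eps, margin = 0.5, 1e-3

# Within the activation region of x0 the network is affine: f(x) = A x + c.
act = (W1 @ x0 + b1 > 0).astype(float)
A = W2 @ (act[:, None] * W1)
c = W2 @ (act * b1) + b2

# Linear constraints (all in A_ub x <= b_ub form):
# 1) keep the ReLU activation pattern of x0 valid,
# 2) make the target logit exceed the original logit by a margin.
A_ub, b_ub = [], []
for i in range(d_hid):
    if act[i] > 0:                        # unit stays active: (W1 x + b1)_i >= 0
        A_ub.append(-W1[i]); b_ub.append(b1[i])
    else:                                 # unit stays inactive: (W1 x + b1)_i <= 0
        A_ub.append(W1[i]); b_ub.append(-b1[i])
A_ub.append(A[y0] - A[target]); b_ub.append(c[target] - c[y0] - margin)

bounds = [(x0[j] - eps, x0[j] + eps) for j in range(d_in)]  # bounded perturbation
res = linprog(np.zeros(d_in), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=bounds, method="highs")

if res.success:
    x_adv = res.x
    print("original class:", y0, "adversarial class:", int(np.argmax(forward(x_adv))))
else:
    print("no adversarial example exists in this activation region and box")
```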
The use of knowledge graphs in the construction of intelligent information and analytical systems makes it possible to effectively structure and analyze knowledge, process large volumes of data, improve the quality of such systems, and apply them in various domains such as medicine, manufacturing, trade, and finance. However, domain-specific knowledge graph engineering continues to be a difficult task, requiring the creation of specialized methods and software. One of the main trends in this area is the use of various information sources, in particular tables, which can significantly improve the efficiency of this process. This paper proposes an approach and a tool for the automated extraction of specific entities (facts) from tabular data and for populating a target knowledge graph with them, based on the semantic interpretation (annotation) of tables. The proposed approach is implemented in the form of a special processor included in the Talisman framework. We also present an experimental evaluation of our approach and a demonstration of domain knowledge graph development for the Talisman framework.
As a result of the research, mechanisms were proposed and tested for solving applied problems of importing, automatically processing, structuring and analyzing information based on components of the Talisman platform, with the goal of improving the efficiency of operating complex hardware systems. A subject area has been developed that makes it possible to address similar tasks in other applied areas (testing was carried out on the example of the energy and aviation industries). The results obtained confirm the hypothesis that machine learning methods can be effectively used in complex distributed trusted information and analytical systems (IAS) to solve a range of applied tasks in budget and commercial organizations, in production, and in the operation of complex hardware systems.
A prominent problem in memory dump analysis and virtual machine introspection is the semantic gap. The availability of debug symbols or knowledge of kernel data structure offsets is very important for retrieving high-level information from binary code. A set of information about the field offsets of kernel data structures is called an OS profile. Existing methods of generating such profiles are based on guest agents, debug symbols, source code compilation or binary analysis. Using only binary analysis makes it possible to do research with minimal knowledge about the analyzed guest OS. In this paper we present a novel approach to OS profile generation. It is based on system call tracing and on comparing data obtained via the application binary interface with data extracted from the expected locations of kernel structures. The advantage of this solution is its scalability to different guest systems. While other existing approaches use heuristics based on handling the Linux kernel functions that access the fields, the current approach suggests using heuristics that are similar across different OS families. We also suggest a method of describing heuristic algorithms for profile generation that makes them easier to understand and more resistant to changes between OS versions.
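The sketch below illustrates the core idea under stated assumptions: a value known from the system call ABI (for example, a PID returned by a traced call) is searched for at candidate offsets inside a dump of the expected kernel structure, and the candidates are intersected across several observations. The data, layout and names here are synthetic stand-ins, not the paper's implementation.

```python
import struct
from typing import List, Set, Tuple

def candidate_offsets(struct_bytes: bytes, expected: int, width: int = 4) -> Set[int]:
    """Offsets at which the expected value (e.g. a PID returned by a traced
    system call) appears in a dump of the candidate kernel structure."""
    hits = set()
    for off in range(0, len(struct_bytes) - width + 1, 4):
        if int.from_bytes(struct_bytes[off:off + width], "little") == expected:
            hits.add(off)
    return hits

def refine_profile(observations: List[Tuple[bytes, int]]) -> Set[int]:
    """Intersect candidate offsets across several traced system calls; with
    enough observations a single field offset usually remains."""
    result = None
    for dump, value in observations:
        hits = candidate_offsets(dump, value)
        result = hits if result is None else result & hits
    return result or set()

if __name__ == "__main__":
    # Synthetic "task structure" dumps: the PID field sits at offset 0x18.
    def fake_dump(pid: int) -> bytes:
        buf = bytearray(64)
        buf[0x18:0x1C] = struct.pack("<I", pid)
        buf[0x08:0x0C] = struct.pack("<I", 1234)   # unrelated constant field
        return bytes(buf)

    obs = [(fake_dump(4242), 4242), (fake_dump(577), 577)]
    print("candidate PID offsets:", refine_profile(obs))  # -> {24}
```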
The paper proposes an iterative method for extracting algorithms from binary code and constructing their high-level representation. In the proposed method, algorithms are reconstructed by analyzing dynamic slices. The method is based on an algorithm for tracking data flow in the forward and backward directions. Two levels of representation of the extracted algorithms are also proposed: a functional slice diagram and an algorithm execution diagram. The functional slice diagram is a structured representation of a slice and is lower-level than the algorithm execution diagram, which consists only of function models and their parameters. The proposed method for constructing algorithms and the ways of representing them increase the analyst's productivity in code security analysis and improve the quality of the obtained analysis results. The developed representations of algorithms can also be used to implement automatic code security analysis. In addition, the authors review existing approaches to extracting algorithms from binary code and the ways they are represented by static code analysis tools, and consider some of their shortcomings and limitations.
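A minimal sketch of the backward direction of such data-flow tracking over an execution trace is shown below; the trace format and names are illustrative assumptions rather than the authors' representation.

```python
from typing import List, Set, Tuple

# One trace record: (instruction index, defined variables, used variables).
TraceEntry = Tuple[int, Set[str], Set[str]]

def backward_dynamic_slice(trace: List[TraceEntry], criterion: Set[str]) -> List[int]:
    """Walk the execution trace backwards, keeping the instructions whose
    definitions reach the slicing criterion through the data flow."""
    live, kept = set(criterion), []
    for idx, defs, uses in reversed(trace):
        if defs & live:                   # this instruction produced a needed value
            kept.append(idx)
            live = (live - defs) | uses   # now its own inputs become relevant
    return list(reversed(kept))

if __name__ == "__main__":
    # Toy trace of:  a = input(); b = a + 1; c = 7; d = b * 2
    trace = [(0, {"a"}, set()),
             (1, {"b"}, {"a"}),
             (2, {"c"}, set()),
             (3, {"d"}, {"b"})]
    print(backward_dynamic_slice(trace, {"d"}))  # -> [0, 1, 3]
```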
The Python Package Index (PyPI) serves as the primary repository of projects for the Python programming language, and the package manager pip uses it by default. PyPI is a free and open-source platform: anyone can register a user on PyPI and publish their project, as well as examine its source code if necessary. The platform does not vet published projects; it only provides the possibility to report a malicious project via e-mail. Nonetheless, analysts repeatedly discover new malicious packages on PyPI nearly every month. Organizations working in the field of open repository security vigilantly monitor emerging projects. Unfortunately, this is not enough: some malicious projects are detected and removed only several months after publication. This paper proposes an automatic feature selection algorithm based on bigrams and code properties, and trains an ET classifier capable of reliably identifying certain types of malicious logic in code. The malicious code repositories MalRegistry and DataDog were used as the training sample. After training, the model was tested on the three latest releases of all existing projects on PyPI, and it succeeded in detecting 28 previously undiscovered malicious projects, the oldest of which had been available for almost one and a half years. The approach used in this work also allows for real-time scanning of published projects, which can be utilized for prompt detection of malicious activity. Additional focus is placed on methods that do not require an expert for feature selection and control, thereby reducing the burden on human resources.
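A hedged sketch of the general pipeline the abstract describes (bigram features, embedded feature selection, an Extra Trees classifier, assuming that is what "ET" refers to) might look as follows in scikit-learn; the corpus, token pattern and hyperparameters are placeholders, not the paper's actual configuration.

```python
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectFromModel
from sklearn.pipeline import Pipeline

# Tiny illustrative corpus: source snippets labelled 1 (malicious) / 0 (benign).
snippets = [
    "import base64, os; exec(base64.b64decode(payload))",
    "s = socket.socket(); s.connect((host, 4444)); subprocess.run(cmd)",
    "def add(a, b):\n    return a + b",
    "with open('config.json') as f:\n    cfg = json.load(f)",
]
labels = [1, 1, 0, 0]

# Token bigrams as features, an embedded selector to keep only the most
# informative ones, and an Extra Trees classifier on top.
pipeline = Pipeline([
    ("bigrams", CountVectorizer(ngram_range=(2, 2), token_pattern=r"[\w.]+")),
    ("select", SelectFromModel(ExtraTreesClassifier(n_estimators=100, random_state=0))),
    ("clf", ExtraTreesClassifier(n_estimators=300, random_state=0)),
])
pipeline.fit(snippets, labels)
print(pipeline.predict(["exec(base64.b64decode(data)); socket.connect((h, 4444))"]))
```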
Automation of security analysis processes plays an important role in software development, because it allows vulnerabilities to be detected and fixed at an early stage. This article presents the development outcomes of an automated fuzz-testing platform, as well as its integration with a platform for processing and storing the results of various security analysis tools. The developed platform integrates security analysis tools into a single testing system embedded in the continuous integration process. The proposed platform not only simplifies and speeds up the testing and analysis processes, but also increases the accuracy of vulnerability detection through results aggregation and the application of machine learning algorithms for marking and prioritizing detected errors. This approach allows developers to identify and correct vulnerabilities in a timely manner, contributing to the creation of more reliable and secure products.
The article considers PDF as a tool for storing and transferring documents. Special attention is paid to the problem of converting data from PDF back to its original format. The relevance of the study is due to the widespread use of PDF in electronic document management of modern organizations. However, despite the convenience of using PDF, extracting information from such documents can be difficult due to the peculiarities of information storage in the format and the lack of effective tools for reverse conversion. The paper proposes a solution based on the analysis of the text information from the output stream of the PDF format. This allows automatic recognition of text in PDF documents, even if they contain non-standard fonts, complex backgrounds, or damaged encoding. The research is of interest to specialists in the field of electronic document management, as well as software developers involved in creating tools for working with PDF.
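As an illustration of the general idea of reading text directly from PDF content streams, the following self-contained sketch decompresses stream objects and collects the literal strings passed to the Tj/TJ text-showing operators. It deliberately handles only the simplest case (FlateDecode streams, literal strings, simple encodings) and is not the converter described in the article.

```python
import re
import zlib

def extract_tj_strings(pdf_path: str) -> list:
    """Pull literal strings shown by Tj/TJ operators out of content streams.
    Escaped parentheses, hex strings and CID fonts are intentionally ignored."""
    with open(pdf_path, "rb") as f:
        data = f.read()
    texts = []
    for raw in re.findall(rb"stream\r?\n(.*?)endstream", data, re.DOTALL):
        try:
            content = zlib.decompress(raw)
        except zlib.error:
            content = raw                  # stream was not Flate-compressed
        # Matches "(string) Tj" and "[(chunk1) -20 (chunk2)] TJ".
        for m in re.findall(rb"\((.*?)\)\s*Tj|\[(.*?)\]\s*TJ", content, re.DOTALL):
            chunk = m[0] or b" ".join(re.findall(rb"\((.*?)\)", m[1]))
            texts.append(chunk.decode("latin-1", errors="replace"))
    return texts

if __name__ == "__main__":
    # The path is illustrative; point it at any simple PDF with uncompressed fonts.
    for line in extract_tj_strings("document.pdf"):
        print(line)
```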
This paper examines the application of weak supervision techniques to automate tax claim processing in the banking sector. Interest in process automation using machine learning and artificial intelligence techniques in the financial sector has grown significantly in recent years, driven by the desire to improve efficiency, accuracy and customer service. Previous research in financial process automation has often relied on traditional machine learning approaches that require large amounts of carefully labeled data. However, in the banking industry, and especially in specific tasks such as processing tax authority claims, data annotation faces significant challenges due to the need for highly skilled professionals and privacy issues. Our work therefore aims to fill this research gap by applying weak supervision, a technique that allows imprecise, inconsistent or incomplete data to be used to train models. This is particularly relevant for the banking sector, where data become outdated quickly and often have limited access due to regulatory constraints. Methodologically, to implement the idea of weak supervision, we used the Snorkel framework to create a training dataset using labeling functions developed together with Bank Point experts. This allowed us to significantly reduce the dependence on the time-consuming process of manual data labeling and to use large volumes of unlabeled documents. The results of the study showed that weak supervision approaches can significantly improve the efficiency of tax claim processing by creating models capable of classifying and interpreting different types of tax documents with high accuracy. In addition, weak supervision can accommodate the need for constant updating of data and legislation, making it preferable for the dynamically changing environment of the financial sector. Using weak supervision to automate responses to tax claims not only improves the quality of data processing, but also helps to reduce the workload of specialists, improving the overall efficiency of financial operations. These results could have an impact on the future application of machine learning in the financial sector, given the importance of innovative approaches in data-constrained environments.
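A minimal sketch of the kind of Snorkel-based setup described above is given below; the claim categories, keyword rules and sample texts are hypothetical placeholders standing in for the labeling functions developed with the bank's experts.

```python
import pandas as pd
from snorkel.labeling import labeling_function, PandasLFApplier
from snorkel.labeling.model import LabelModel

ABSTAIN, PENALTY, INFO_REQUEST = -1, 0, 1   # hypothetical claim categories

# Heuristic labeling functions over the claim text; the keywords are
# illustrative stand-ins for the experts' rules.
@labeling_function()
def lf_mentions_fine(row):
    text = row.text.lower()
    return PENALTY if "fine" in text or "penalty" in text else ABSTAIN

@labeling_function()
def lf_requests_documents(row):
    return INFO_REQUEST if "please provide" in row.text.lower() else ABSTAIN

df_train = pd.DataFrame({"text": [
    "A penalty of 5,000 has been assessed for late filing.",
    "Please provide copies of the invoices for Q2.",
    "Reminder: the declaration deadline is approaching.",
]})

# Apply the labeling functions and combine their noisy votes into
# probabilistic training labels with Snorkel's LabelModel.
L_train = PandasLFApplier(lfs=[lf_mentions_fine, lf_requests_documents]).apply(df_train)
label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L_train, n_epochs=200, seed=0)
print(label_model.predict(L_train))   # weak labels for a downstream classifier
```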
The article discusses the development of an algorithm for forming IT project teams. The source material was data from the digital footprints of IT students. A student's digital footprint is a constantly updated set of data that includes records of project disciplines, intermediate results in courses, and practical training. The paper provides an example of solving the team formation problem using graphs. An algorithm based on a graph model is proposed that builds a graph reflecting the interaction of students in past projects; teams are then formed based on the constructed graph. Two approaches to team formation within the graph model are proposed: one based on vertex clustering and one using graph traversal. To determine the best team, a graph of student communication is built with text tags representing technologies, programming languages, frameworks, etc. The algorithm was tested on data from students of the Mathematical Support and Administration of Information Systems IT department of the School of Computer Science, together with the requirements of a real project, and was compared with the spontaneous distribution of students across projects within the discipline. The algorithm makes it possible to estimate how successful such a distribution was. Since the creation of effective teams plays a key role in the successful implementation of projects, the proposed algorithm can be useful for teachers and project managers in the IT field. The developed algorithm is planned to be integrated into a web service for finding IT project executors.
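A small sketch of the clustering-based variant is shown below, using NetworkX community detection over a toy collaboration graph; the students, edge weights and skill tags are invented for illustration, and the graph-traversal variant is not shown.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Edges: students who worked together in past projects; weights count joint projects.
collaborations = [
    ("alice", "bob", 3), ("alice", "carol", 1), ("bob", "carol", 2),
    ("dave", "erin", 2), ("erin", "frank", 1), ("dave", "frank", 1),
    ("carol", "dave", 1),
]
G = nx.Graph()
G.add_weighted_edges_from(collaborations)

# Skill tags mined from the digital footprint (illustrative values).
nx.set_node_attributes(G, {
    "alice": {"python", "ml"}, "bob": {"java"}, "carol": {"python"},
    "dave": {"js", "react"}, "erin": {"python"}, "frank": {"devops"},
}, name="skills")

# Clustering-based approach: group students who already worked together
# into the same team by maximizing modularity of the collaboration graph.
teams = greedy_modularity_communities(G, weight="weight")
for i, team in enumerate(teams, 1):
    skills = set().union(*(G.nodes[s]["skills"] for s in team))
    print(f"team {i}: {sorted(team)}  covers: {sorted(skills)}")
```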
This paper presents an intelligent model based on the Pix2Pix conditional generative adversarial network that automates prediction of the recurrence of cervical malignancy in patients who have not yet undergone surgery. The implemented model accepts a pelvic MRI image as input and outputs the probability of tumor recurrence together with a generated image of the "post-operative" perspective. The presented model differs from its baseline by a loss function modified for the problem conditions and by replacing the standard generator with a U-Net convolutional neural network. Since the formulated problem belongs to the class of medical diagnostic tasks, the number of false negatives produced by the model was reduced to zero at the cost of a slight increase in the number of false positives. A comparative analysis of predicted and real postoperative images experimentally showed that the model not only accurately predicts disease recurrence, but also generates almost identical centers of tumor foci and their relative areas on the MRI image. The feasibility of modifying the basic version of Pix2Pix was confirmed by comparing the results of the two models using common quality metrics: precision, recall and their harmonic mean. The developed modification makes it possible to obtain predictions in the shortest possible time, allowing it to be used in real-time mode.
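The following PyTorch sketch shows one plausible shape of such a combined generator objective: an adversarial term, an L1 term tying the generated "post-operative" image to the reference, and a recurrence-classification term whose positive-class weight pushes false negatives down at the cost of more false positives. The classification head, the weights and the loss form are assumptions for illustration, not the authors' exact modification.

```python
import torch
import torch.nn as nn

adv_loss = nn.BCEWithLogitsLoss()            # adversarial term (real/fake)
l1_loss = nn.L1Loss()                        # pixel-wise term, standard in Pix2Pix
# pos_weight > 1 penalizes missed recurrences (false negatives) more heavily.
cls_loss = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([5.0]))

def generator_loss(disc_logits_fake, fake_img, real_img,
                   recurrence_logit, recurrence_label,
                   lambda_l1=100.0, lambda_cls=10.0):
    """Combined generator objective: adversarial + L1 + weighted recurrence term."""
    g_adv = adv_loss(disc_logits_fake, torch.ones_like(disc_logits_fake))
    g_l1 = l1_loss(fake_img, real_img)
    g_cls = cls_loss(recurrence_logit, recurrence_label)
    return g_adv + lambda_l1 * g_l1 + lambda_cls * g_cls

# Shape-only smoke test with random tensors standing in for network outputs.
loss = generator_loss(torch.randn(4, 1), torch.randn(4, 1, 256, 256),
                      torch.randn(4, 1, 256, 256), torch.randn(4, 1),
                      torch.randint(0, 2, (4, 1)).float())
print(loss.item())
```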
The work is devoted to the topical problem of diagnosing foot deformities, which have a high incidence among all age groups. Among the objective quantitative methods for diagnosing flatfoot, plantography, based on the assessment of prints of the plantar surface of the foot, has become widespread in clinical practice. The purpose of the study was to evaluate and analyze the effectiveness of computer vision methods for the automatic assessment of footprints. The study examines methods for automatic recognition and marking of foot photoplantograms using genetic algorithms and neural networks to construct control points of the foot, using the calculation of the indices of the longitudinal and transverse arches of the foot as an example. The results of calculating flatfoot indices from photoplantograms using manual and automatic marking were compared. It was found that the accuracy of automatic analysis of photoplantograms using genetic algorithms and neural networks is 92–97% relative to manual marking. At the same time, the time spent on manual marking exceeded the duration of automatic image analysis by a factor of 2–2.5. The results obtained confirm the possibility of optimizing the diagnostic process when conducting mass (screening) examinations of the condition of the arches of the feet.
The article deals with the formation of a domestic dataset of dermatoscopic images of patients' skin neoplasms: the requirements for metadata and photographs are formulated, and the datasets most popular in the scientific community for building machine learning models for the classification of dermatoscopic images are described. The architecture of the developed platform for collecting dermatoscopic images of skin neoplasms of patients from the Russian Federation is also described.
This article extensively reviews radio wave jamming methods, focusing on their application to disrupting drone signals. It explores the evolution of these techniques, from basic noise-based methods to more advanced systems that target specific communication protocols. The article analyzes key jamming types such as barrage, tone, sweep, and protocol-aware jamming, examining their mechanisms and efficacy. Each type is discussed in terms of its operational principles, benefits, and limitations, offering a comprehensive understanding of the impact these methods have on drone communications. The review also discusses contemporary counter-jamming strategies, such as frequency hopping, which are increasingly being used to enhance the resilience of drone systems against interference. In addition, the article emphasizes the significant role of software-defined radio (SDR) systems in developing and improving effective drone communication jamming solutions. The flexibility of SDR technology allows for the dynamic adaptation of jamming techniques, making it an important area of research. We aim to improve understanding of SDR-based jamming methods and their practical application by combining theoretical studies with hands-on experiments.
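For intuition, the sketch below generates complex-baseband bursts corresponding to three of the jamming types discussed: barrage (band-limited noise), tone (a single carrier) and sweep (a chirp). The sample rate and bandwidths are arbitrary illustrative values; real SDR-based jammers involve calibrated front ends and are subject to regulatory constraints.

```python
import numpy as np

fs = 1_000_000                      # sample rate, Hz (complex baseband for an SDR)
t = np.arange(0, 0.01, 1 / fs)      # a 10 ms burst

def barrage_jammer(bandwidth_hz: float) -> np.ndarray:
    """Wideband noise covering the whole band of interest."""
    noise = (np.random.randn(t.size) + 1j * np.random.randn(t.size)) / np.sqrt(2)
    # Crude band-limiting: zero out FFT bins outside +-bandwidth/2.
    spec = np.fft.fft(noise)
    freqs = np.fft.fftfreq(t.size, 1 / fs)
    spec[np.abs(freqs) > bandwidth_hz / 2] = 0
    return np.fft.ifft(spec)

def tone_jammer(offset_hz: float) -> np.ndarray:
    """A single carrier placed on (or near) the victim's center frequency."""
    return np.exp(2j * np.pi * offset_hz * t)

def sweep_jammer(f_start: float, f_stop: float) -> np.ndarray:
    """A chirp sweeping across the band during the burst."""
    k = (f_stop - f_start) / t[-1]
    return np.exp(2j * np.pi * (f_start * t + 0.5 * k * t**2))

# These complex-baseband bursts could be streamed to an SDR front end.
for sig in (barrage_jammer(200e3), tone_jammer(25e3), sweep_jammer(-100e3, 100e3)):
    print(sig.shape, "mean power:", np.round(np.mean(np.abs(sig) ** 2), 3))
```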
This paper addresses the challenge of single-object tracking on resource-constrained devices, a critical aspect for applications like autonomous drones and robotics. We propose an efficient real-time tracking system that leverages the strengths of transformer-based neural networks in combination with correlation filters. Our research makes several key contributions: first, we conduct a comprehensive analysis of existing object tracking algorithms, identifying their advantages and limitations in resource-constrained environments. Second, we develop a novel hybrid tracking system that seamlessly integrates both neural networks and traditional correlation filters. This hybrid system is designed with a switching mechanism based on perceptual hashing, which allows it to alternate between fast but less accurate correlation filters and slower but more accurate neural network-based algorithms. To validate our approach, we implement and test the system on the Jetson Orin platform, which is representative of edge computing devices commonly used in real-world applications. Our experimental results demonstrate that the proposed system can achieve significant improvements in tracking speed while maintaining high accuracy, thereby making it a viable solution for real-time object tracking on devices with limited computational resources. This work paves the way for more advanced and efficient tracking systems in environments where computational power and energy are at a premium.
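A hedged sketch of the switching logic only: a cheap difference hash compares the current tracked patch with the stored template, and the system falls back to the heavier tracker when the appearance drifts. The trackers themselves are placeholder objects here; the hash, threshold and interfaces are assumptions, not the authors' implementation.

```python
import numpy as np

def dhash(gray: np.ndarray, size: int = 8) -> int:
    """Difference hash of a grayscale patch: compare adjacent pixels of a
    downscaled image and pack the resulting bits into an integer."""
    h, w = gray.shape
    ys = np.arange(size) * h // size
    xs = np.arange(size + 1) * w // (size + 1)
    small = gray[np.ix_(ys, xs)].astype(np.int32)   # crude nearest-neighbour resize
    bits = (small[:, 1:] > small[:, :-1]).flatten()
    return sum(int(b) << i for i, b in enumerate(bits))

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

class HybridTracker:
    """Run the cheap correlation-filter tracker while the tracked patch still
    resembles the stored template (small hash distance); switch to the heavier
    transformer-based tracker when appearance drifts."""
    def __init__(self, fast_tracker, accurate_tracker, template: np.ndarray,
                 threshold: int = 12):
        self.fast, self.accurate = fast_tracker, accurate_tracker
        self.template_hash = dhash(template)
        self.threshold = threshold

    def update(self, frame: np.ndarray, last_box):
        x, y, w, h = last_box
        patch = frame[y:y + h, x:x + w]
        drifted = hamming(dhash(patch), self.template_hash) > self.threshold
        tracker = self.accurate if drifted else self.fast
        return tracker.update(frame, last_box), drifted

if __name__ == "__main__":
    class Dummy:                         # stand-in for real tracker implementations
        def __init__(self, name): self.name = name
        def update(self, frame, box): return box, self.name

    frame = (np.random.rand(240, 320) * 255).astype(np.uint8)
    trk = HybridTracker(Dummy("correlation"), Dummy("transformer"), frame[50:90, 60:120])
    print(trk.update(frame, (60, 50, 60, 40)))   # unchanged patch -> fast tracker
```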