The article presents a survey of software dynamic analysis methods. The survey focuses on methods that are supported by tools, aimed at software security verification, and applicable to system software. Fuzzing and dynamic symbolic execution techniques are examined in detail. Dynamic taint analysis is excluded because of the difficulty of gathering technical details of its implementations. The review of fuzzing and dynamic symbolic execution concentrates on the techniques used in the supporting tools rather than on the tools themselves, because the number of such tools already exceeds 100. Techniques for counteracting fuzzing are also surveyed.
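To illustrate the core idea behind the mutation-based fuzzing techniques covered by the survey, a minimal sketch follows; the target function parse_record and the seed corpus are hypothetical and serve only as an illustration, not as an example from any surveyed tool.

```python
import random

def parse_record(data: bytes) -> None:
    """Hypothetical target: rejects malformed input with ValueError."""
    if len(data) < 4:
        raise ValueError("record too short")
    length = int.from_bytes(data[:4], "little")
    if length != len(data) - 4:
        raise ValueError("length field mismatch")

def mutate(seed: bytes, max_mutations: int = 8) -> bytes:
    """Apply a few random bit flips and byte insertions to a seed input."""
    data = bytearray(seed)
    for _ in range(random.randint(1, max_mutations)):
        if data and random.random() < 0.5:
            pos = random.randrange(len(data))
            data[pos] ^= 1 << random.randrange(8)                  # flip one bit
        else:
            data.insert(random.randrange(len(data) + 1),
                        random.randrange(256))                     # insert a byte
    return bytes(data)

def fuzz(seeds, iterations=10_000):
    """Collect inputs that make the target raise an unexpected exception."""
    crashes = []
    for _ in range(iterations):
        candidate = mutate(random.choice(seeds))
        try:
            parse_record(candidate)
        except ValueError:
            pass                          # expected rejection of bad input
        except Exception as exc:          # unexpected failure -> potential bug
            crashes.append((candidate, exc))
    return crashes

if __name__ == "__main__":
    seed_corpus = [(8).to_bytes(4, "little") + b"ABCDEFGH"]
    print(f"found {len(fuzz(seed_corpus))} crashing inputs")
```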
This paper presents a summary of our experience in developing a deep packet inspection system based on full protocol decoding. The paper reviews the challenges encountered during implementation and provides a high-level overview of the solutions to them. The challenges fall into two groups. The first group concerns the fundamental tasks that must be addressed when implementing full protocol decoding systems. This includes correct protocol parsing, i.e., identifying and interpreting protocol headers and fields correctly. It is also necessary to process fragmented packets and reassemble fragments into the original message. Additionally, processing and analyzing encrypted traffic is a crucial task that may require specialized algorithms and tools. The second group of problems concerns optimizing full protocol decoding for high-speed traffic processing, as well as supporting new protocols and user-defined extensions. While there are open-source systems that address some of the primary issues of full protocol decoding, additional effort and specialized solutions may be needed to operate such systems efficiently and extend their functionality. Although implementing deep network traffic analysis tools based on full protocol decoding requires advanced hardware and software technologies, the benefits of such analysis are significant. The approach provides a more complete understanding of network traffic patterns and enables more effective detection and prevention of cyber-attacks. It also allows more accurate monitoring of network performance and identification of potential bottlenecks or other issues that may impact network efficiency. In this article, we also emphasize the importance of developing and implementing an appropriate system architecture to ensure the successful deployment of deep network traffic analysis tools based on full protocol decoding. Finally, we conducted an experiment in which several advanced optimizations were added to a system that had already solved the primary issues. These optimizations concern memory management and exploit the specifics of the traffic processing scheme. Based on the results, we observed a significant performance improvement in solving the secondary tasks described in this work.
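As a minimal sketch of the two fundamental tasks named above, field-level header parsing and reassembly of fragmented messages, the following example decodes a hypothetical fixed-layout protocol; it is an illustration only and not the decoding engine described in the paper.

```python
import struct
from collections import defaultdict

# Hypothetical 12-byte header: message id, fragment offset, total message
# length, payload length (unsigned 16/32-bit fields, network byte order).
HEADER = struct.Struct("!IHHI")

def parse_header(packet: bytes):
    """Split a raw packet into a parsed header dict and its payload."""
    if len(packet) < HEADER.size:
        raise ValueError("truncated header")
    msg_id, frag_offset, total_len, payload_len = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + payload_len]
    if len(payload) != payload_len:
        raise ValueError("truncated payload")
    return {"msg_id": msg_id, "offset": frag_offset, "total": total_len}, payload

class Reassembler:
    """Collect fragments per message id and emit the reassembled message."""
    def __init__(self):
        self._fragments = defaultdict(dict)        # msg_id -> {offset: payload}

    def feed(self, packet: bytes):
        header, payload = parse_header(packet)
        frags = self._fragments[header["msg_id"]]
        frags[header["offset"]] = payload
        assembled = b"".join(frags[off] for off in sorted(frags))
        if len(assembled) == header["total"]:      # all fragments present
            del self._fragments[header["msg_id"]]
            return assembled
        return None                                # still waiting for fragments
```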
The paper discusses the application of deep learning methods for detecting computer attacks in network traffic. The results of an analysis of relevant studies and reviews of deep learning applications for intrusion detection are presented. The most widely used deep learning methods are discussed and compared, and a classification of deep learning methods for intrusion detection is proposed. Current trends and challenges in applying deep learning methods to detect computer attacks in network traffic are identified. A CNN-BiLSTM neural network is synthesized to assess the applicability of deep learning methods for intrusion detection. The synthesized neural network is compared to a previously developed model based on the Random Forest classifier. The use of the deep learning method simplified the feature engineering stage, and the evaluation metrics of the Random Forest and CNN-BiLSTM models are close. This confirms the prospects of applying deep learning methods for intrusion detection.
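A minimal sketch of a CNN-BiLSTM architecture of the kind compared in the paper is shown below using tensorflow.keras; the layer sizes, the number of flow features, and the number of classes are illustrative assumptions rather than the authors' configuration.

```python
# Illustrative CNN-BiLSTM sketch; hyperparameters are assumptions, not the paper's.
from tensorflow.keras import layers, models

NUM_FEATURES = 78      # assumed number of flow features
NUM_CLASSES = 2        # assumed benign/attack labelling

model = models.Sequential([
    layers.Input(shape=(NUM_FEATURES, 1)),             # each flow as a 1-D feature sequence
    layers.Conv1D(64, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Bidirectional(layers.LSTM(64)),              # BiLSTM over convolutional feature maps
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```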
Big data management systems are in demand today in practically all industries, and they also form the foundation for training artificial intelligence. The use of heterogeneous poly-stores in big data systems means that tools within the same system have different data granularity and access control models. Harmonizing such components and implementing a common access policy is currently done manually by the security administrator. This leads to a growing number of configuration vulnerabilities, which in turn are a frequent cause of data leaks. An analysis of works on automation and analysis of access control in big data systems shows the lack of automation solutions for poly-store based systems. This paper poses the problem of automating the analysis of access control in big data management systems. The authors formulate the main contradiction: on the one hand, the requirement of scalability and flexibility of access control, and on the other hand, the growing burden on the security administrator, aggravated by the use of different data and access control models in the system components. To solve this problem, we propose a new automated method for analyzing security policies based on a graph model of data processing, which reduces the number of possible vulnerabilities resulting from incorrect administration of big data systems. The proposed method uses the data life cycle model of the system, its current settings, and the desired security policy. The use of a two-pass analysis (from data sources to recipients and back) makes it possible to solve two tasks: analyzing the access control system for possible vulnerabilities and checking the correctness of the business logic. The paper gives an example of analyzing the security policies of a big data management system using the developed software prototype and discusses the obtained results.
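The following schematic sketch illustrates the idea of a two-pass check over a data-flow graph; the components, roles, and policies are hypothetical, and the code is neither the data life cycle model nor the prototype described in the paper.

```python
# Schematic two-pass policy check over a hypothetical data-flow graph.
from collections import defaultdict

downstream = {                   # data-flow edges: component -> downstream components
    "source_db": ["etl"],
    "etl": ["analytics"],
    "analytics": ["dashboard"],
    "dashboard": [],
}
upstream = defaultdict(list)     # reverse edges for walking back towards sources
for src, dsts in downstream.items():
    for dst in dsts:
        upstream[dst].append(src)

local_access = {                 # roles granted access in each component's own settings
    "source_db": {"admin", "etl_svc"},
    "etl": {"etl_svc"},
    "analytics": {"analyst", "etl_svc"},
    "dashboard": {"analyst", "guest"},
}
desired_policy = {"dashboard": {"analyst"}}   # who SHOULD see the data at the recipient

def roles_reaching(node):
    """Forward view (source -> recipient): every role granted access somewhere
    on a path ending at `node` is treated as potentially able to see the data."""
    roles, stack, seen = set(), [node], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        roles |= local_access[current]
        stack.extend(upstream[current])
    return roles

for recipient, allowed in desired_policy.items():
    effective = roles_reaching(recipient)
    # Task 1: over-permissive configuration (possible vulnerability).
    if effective - allowed:
        print(f"{recipient}: unexpected roles {sorted(effective - allowed)}")
    # Task 2 (backward check): roles required by the business logic must not be
    # blocked in the components immediately upstream of the recipient.
    for role in allowed:
        blocked = [n for n in upstream[recipient] if role not in local_access[n]]
        if blocked:
            print(f"{recipient}: role '{role}' is blocked upstream at {blocked}")
```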
Several modern approaches to detecting defects in printed circuit boards based on automatic optical inspection are considered with the aim of designing our own inspection system. The importance of the inspection process is growing as the requirements imposed by modern production processes become stricter. Enterprises engaged in mass production of electronics strive to achieve high quality of all parts, assemblies, and finished products. The optical inspection system is one of the most important tools for automating the visual inspection of printed circuits. In addition to ensuring cost efficiency and product quality control, an automated inspection system can also collect statistical information to provide feedback to the production process. The review considers algorithms and methods for automated optical inspection of the conductive pattern on the surface of printed circuit boards in order to find the optimal method for detecting defects.
This paper considers the task of embedding computer visualization performed with the Vulkan API into OpenGL-based software complexes. A low-level hybrid approach to implementing the collaboration of the two APIs within the same application is described, as well as the organization and synchronization of access to shared resources. A technology is proposed that encapsulates the hybrid approach in a separate library module (VK-capsule) with a high-level interface that is dynamically linked to the executable module of the OpenGL complex (GL-visualizer). The paper describes methods for constructing the interface and connecting the VK-capsule that provide minimal intrusion into the GL-visualizer. Based on the proposed methods and technology, a prototype of a modular software complex implementing hybrid Vulkan-OpenGL visualization was developed. Testing of the created complex confirmed that the proposed solutions are adequate to the assigned task and can be used to extend the capabilities of visualization systems built on OpenGL.
A quite interpretable linear regression satisfies the following conditions: the signs of its coefficients correspond to the substantive meaning of the factors; multicollinearity is negligible; the coefficients are significant; and the approximation quality of the model is high. Previously, the QInter-1 program was developed to construct such models estimated by ordinary least squares. Given the initial parameters, it automatically generates a mixed 0-1 integer linear programming problem whose solution selects the most informative regressors. The mathematical apparatus underlying this program has been significantly extended over time: non-elementary linear regressions were developed, linear constraints on the absolute values of intercorrelations were proposed to control multicollinearity, and assumptions appeared about the possibility of constructing not only linear but also quasi-linear regressions. This article describes the second version of the program for constructing quite interpretable regressions, QInter-2. Depending on the initial parameters selected by the user, QInter-2 automatically formulates, for the LPSolve solver, the mixed 0-1 integer linear programming problem for constructing both elementary and non-elementary quite interpretable quasi-linear regressions. It is possible to specify up to nine elementary functions and to control such parameters as the number of regressors in the model, the number of digits after the decimal point in real numbers, the absolute contributions of variables to the overall determination, the number of occurrences of explanatory variables in the model, and the magnitude of intercorrelations. While working with the program, the user can also control the number of elementary and non-elementarily transformed variables, which affects the speed of solving the mixed 0-1 integer linear programming problem. The QInter-2 program is universal and can be used to construct quite interpretable mathematical dependencies in various subject areas.
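As a generic illustration of how regressor selection can be cast as a mixed 0-1 integer linear program, the sketch below shows a least-absolute-deviations formulation with binary inclusion variables; the actual objective and constraints generated by QInter-2 (sign restrictions, intercorrelation bounds, contribution bounds, and so on) differ from this simplified form.

```latex
% Generic sketch (not the exact QInter-2 formulation): selecting at most m
% regressors in a least-absolute-deviations regression as a mixed 0-1 LP.
\begin{align*}
\min_{\beta,\,z,\,u,\,v} \quad & \sum_{i=1}^{n} (u_i + v_i) \\
\text{s.t.} \quad & y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + u_i - v_i, && i = 1,\dots,n,\\
& -M z_j \le \beta_j \le M z_j, && j = 1,\dots,p,\\
& \sum_{j=1}^{p} z_j \le m, \qquad z_j \in \{0,1\}, \quad u_i,\, v_i \ge 0.
\end{align*}
```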
This paper presents the results of developing a numerical model of Lagrangian particle transport and of applying parallel computation methods to increase the efficiency of the model's software implementation. The model is a software package that computes the transport and deposition of aerosol particles, taking into account the properties of the particles and input data describing atmospheric conditions and the geometry of the underlying surface. The dynamic core, physical parameterizations, numerical implementation, and algorithm of the model are described. Initially, the model was used for computationally low-intensity problems. In this paper, given the need to apply the model to computationally intensive problems, we optimize the sequential software implementation and create parallel implementations based on the OpenMP, MPI, and CUDA technologies. Testing of the different implementations shows that optimizing the most computationally complex blocks of the sequential version reduces the execution time by 27%, while the use of parallel computing technologies provides substantially larger speedups. Under otherwise equal conditions, the use of OpenMP in the dynamic block of the model accelerated that block up to 4 times, MPI up to 8 times, and CUDA up to 16 times. Recommendations on the choice of parallel computing technology depending on the properties of the computing system are given.
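For reference, a generic form of a Lagrangian particle transport equation is given below; it is an illustrative assumption, and the exact dynamic core and parameterizations of the described model may differ.

```latex
% Generic Lagrangian particle displacement equation (illustrative only):
\frac{d\mathbf{x}_p}{dt} = \mathbf{u}\!\left(\mathbf{x}_p, t\right) + \mathbf{u}'(t) - w_s\,\mathbf{e}_z,
```
where $\mathbf{u}$ is the resolved wind field from the input atmospheric data, $\mathbf{u}'$ is a stochastic turbulent velocity fluctuation, and $w_s$ is the gravitational settling velocity of the particle.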
In this study, the object of analysis is the set of case morphemes of nouns whose identification draws on the semantics they mark. The range of meanings of these morphemes allows us to combine them into a group of semantic cases in the Vakh Khanty language, as opposed to the group of syntactic case markers. In the dialect under study, the category of nominal case is actively discussed in connection with controversial issues regarding the terminology used and the composition, number, morphemic status, and functional features of case markers. Using the latest field data on this dialect, collected in the village of Korliki in 2019, we compared the field data with data already known in Khanty studies, thus systematizing the case category of this dialect. The field data, comprising more than 10,000 words, were processed using the tools of the LingvoDoc platform.
Code comments are an essential part of software documentation. Many software projects suffer from low-quality comments that are often produced by copy-paste. In the case of similar methods, classes, etc., copy-pasted comments with minor modifications are justified. However, in many cases this approach degrades documentation quality and, subsequently, complicates maintenance and development of the project. In this study, we address the problem of detecting near-duplicate code comments, which can potentially improve software documentation. We have conducted a thorough evaluation of traditional string similarity metrics and modern machine learning methods. In our experiment, we use a collection of Javadoc comments from four industrial open-source Java projects. We found that LCS (Longest Common Subsequence) is the best similarity algorithm when both quality (precision 94%, recall 74%) and performance are taken into account.
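A minimal sketch of an LCS-based similarity measure of the kind evaluated in the paper follows; the tokenization and the normalization to [0, 1] are illustrative choices, not necessarily those used in the reported experiment.

```python
# Minimal LCS-based similarity score for a pair of comments.
# Tokenization and normalization to [0, 1] are illustrative choices.
def lcs_length(a: list[str], b: list[str]) -> int:
    """Classic dynamic-programming longest common subsequence length."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, tok_a in enumerate(a, 1):
        for j, tok_b in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if tok_a == tok_b \
                       else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def comment_similarity(c1: str, c2: str) -> float:
    """Similarity in [0, 1]: LCS length relative to the longer comment."""
    t1, t2 = c1.lower().split(), c2.lower().split()
    if not t1 or not t2:
        return 0.0
    return lcs_length(t1, t2) / max(len(t1), len(t2))

# Pairs whose similarity exceeds a threshold are reported as near-duplicates.
print(comment_similarity("Returns the user id.", "Returns the user name."))  # 0.75
```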
A mathematical model of the erosion of the sandy bank slope of a channel under the influence of a passing flood wave is formulated. The model includes the equation of motion of a quasi-steady hydrodynamic flow in the channel cross-section. The movement of the bottom and bank surface of the channel is determined from the solution of the Exner equation, which is closed by an original analytical model of bed-load sediment transport. The model takes into account the transit, gravitational, and pressure mechanisms of bed material movement and contains no phenomenological parameters. The movement of the free surface of the hydrodynamic flow is determined by interpolation of experimental data. The model accounts for changes in the average turbulent viscosity over the cross-section as the channel cross-section changes. The influence of the quasi-steady hydrodynamic flow on mass loss in the channel cross-section was studied. A criterion is introduced to determine the disequilibrium of the channel flow. It is shown that modeling channel deformations in this case requires taking into account a non-zero gradient of sediment transport along the channel axis. Numerical calculations demonstrate the qualitative and quantitative influence of these features on the determination of the turbulent flow viscosity and on the erosion of the channel bank slope. A comparison of the bank deformations obtained in the numerical calculations with known flume experimental data shows good agreement.
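For reference, the standard form of the Exner bed-evolution equation mentioned above is given below; the closure for the bed-load flux is the authors' original analytical model and is not reproduced here.

```latex
% Standard form of the Exner bed-evolution equation:
(1 - \lambda_p)\,\frac{\partial z_b}{\partial t} + \nabla \cdot \mathbf{q}_b = 0,
```
where $z_b$ is the bed surface elevation, $\lambda_p$ is the bed porosity, and $\mathbf{q}_b$ is the volumetric bed-load sediment flux.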
The study of the formal stability of equilibrium positions of a multiparametric Hamiltonian system in the generic case is traditionally carried out using its normal form under the condition that there are no resonances of small orders. In this paper, we propose a method for the symbolic computation of the condition for the existence of a resonance of arbitrary order in a system with three degrees of freedom. It is shown that, for each resonant vector, this condition can be represented as a rational algebraic curve. Using computer algebra methods, a rational parametrization of this curve is obtained for the case of an arbitrary resonance. A model example of a two-parameter system of pendulum type is considered.
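For reference, the standard resonance relation for a Hamiltonian system with three degrees of freedom, whose frequencies depend on the system parameters, is:

```latex
% Resonance relation of a given order for frequencies \omega_1, \omega_2, \omega_3:
k_1\,\omega_1 + k_2\,\omega_2 + k_3\,\omega_3 = 0,
\qquad (k_1, k_2, k_3) \in \mathbb{Z}^3 \setminus \{0\},
\qquad |k_1| + |k_2| + |k_3| = \text{order of the resonance}.
```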