Vol 30, No 6 (2018)
View or download the full issue: PDF (Russian)
7-24
Abstract
A data center is the most advanced form of providing computing resources when services must be delivered to a wide range of users. The paper discusses one of the approaches to building a data center: the concept of software-defined infrastructure. A software-defined data center infrastructure is one in which all key elements - computing resources, the network, data storage systems, etc. - are virtualized and provided to users as services with a specified quality. It is shown that implementing the proposed infrastructure allows each user to solve their tasks productively, in reasonable time and at an acceptable cost. The paper formulates general requirements for inter-departmental data processing centers, proposes methods for creating a software-defined data center (deployment of computing platforms that maximize hardware reuse, support the execution of programs from different classes of tasks, etc.), and describes implementations of infrastructures of this class in specific projects.
25-38
Abstract
This paper describes a new approach to dynamic code analysis. It combines dynamic symbolic execution and static code analysis with fuzzing to increase the efficiency of each component. During fuzzing we recover indirect function calls and pass that information to the static analysis engine, which improves static path detection in the control flow graph of the program. Detected paths are used in dynamic symbolic execution to construct inputs that will cover new paths during execution. These inputs are used by the fuzzing tool to improve test-case generation and increase code coverage. The proposed approach can be used for classic fuzzing, when the main goal is achieving high code coverage. It can also be used for targeted analysis of paths and code fragments in the program. In this case the fuzzing tool accepts a set of program addresses with potential defects and passes them to the static analysis engine. The engine constructs all paths connecting the program entry point to the given addresses. Finally, dynamic symbolic execution is used to construct the set of inputs that will cover these paths. Experimental results have shown that the proposed method can effectively detect different program defects.
39-68
Abstract
Many binary code analysis tools do not work directly with machine instructions, relying instead on an intermediate representation (IR) constructed from the binary code. In this paper, we first analyze the binary code analysis problems that benefit from such an IR and construct a list of requirements that an IR suitable for solving these problems must meet. Generally speaking, a universal binary analysis platform requires two principal components. The first is a retargetable instruction decoder that utilizes external specifications for describing target instruction sets. External specifications facilitate maintainability and allow quickly adding support for new instruction sets. We analyze some of the more common ISAs, including those used in microcontrollers, and from that produce a list of requirements for a retargetable decoder. We then survey existing multi-ISA decoders and propose our vision of a more generic approach, based on a multi-layered directed acyclic graph describing the decoding process in universal terms. The second component of an analysis platform is the architecture-neutral IR itself. We describe existing IRs of this kind and propose Pivot 2, an IR that is low-level enough to be easily constructed from decoded machine instructions and at the same time easy to analyze. The main features of Pivot 2 are explicit side effects, SSA variables, a simpler alternative to phi-functions, and an extensible elementary operation set at the core. The IR also supports machines with multiple memory address spaces. Finally, we propose a way to tie the decoder and the IR together to fit most binary code analysis tasks through abstract interpretation on top of the IR. The proposed scheme takes into account various aspects of target architectures that are overlooked in many other works, including pipeline specifics (handling of delay slots, hardware loop support, etc.), exception and interrupt management, and a generic address space model where accesses may have arbitrary side effects due to memory-mapped devices or other non-trivial behavior of the memory system.
69-88
Abstract
SDN technology makes it possible to implement service function chains efficiently, utilizing common resources and common management principles in virtual networks. The network is based on a connected undirected graph of physical links, usually referred to as the resource network connectivity topology (RNCT); graph nodes are network switches and hosts, and each host is connected to exactly one switch. Switches operate based on rule tables configured by a controller that operates independently of the network equipment. The configuration of network switches provides the transmission of packets from initial to final hosts depending on the values of packet parameters. The paper discusses the relationship between switch configurations and the paths created for transmitting packets, depending on RNCT properties. It is shown that, in general, not every configuration of every switch is verifiable. Testing abilities depend on the accepted hypotheses about switch operation. Two hypotheses are discussed in the paper: the switch hypothesis assumes that a switch's operation does not depend on the settings of other switches; the stronger rule hypothesis additionally assumes that the switch's operation according to a given rule does not depend on other rules in that switch's configuration. Section 2 contains preliminaries, while Section 3 is devoted to the relationship between switch rules and the sets of paths to be implemented. In Section 4, the problem of testing a switch configuration is considered under the rule hypothesis; a number of statements are established, in particular, necessary and sufficient conditions for the testability of a given rule of a given switch. Section 5 discusses and proves a necessary (but not sufficient) condition and a sufficient (but not necessary) condition for checking any switch configuration under the switch hypothesis. In conclusion, the problem of establishing necessary and sufficient conditions for verifiability of any switch configuration is discussed.
89-104
Abstract
The paper presents a model-based approach to conformance testing of Extensible Authentication Protocol (EAP) implementations. Conformance testing is the basic tool for ensuring interoperability between implementations of a protocol. Using UniTESK technology allows automating the verification of network protocols based on their formal models. Additionally applying mutation testing allows evaluating the robustness of the implementations when they receive incorrect packets. We applied the test suite to several implementations of EAP and present brief results. This approach has proved to be effective in finding several critical vulnerabilities and other specification deviations in the EAP implementations.
105-122
Abstract
In this article the online problem of scheduling parallel tasks on related computational clusters with processors of different capacities is studied in the average case. The objective is to build a schedule of N tasks on k clusters with w processors each, where task i, i ≤ N, requires time h_i on a cluster with nominal processor capacity and w_i ≤ w processors. We assume for all 1 ≤ i ≤ N that w_i is uniformly distributed on (0, w] and that h_i is uniformly distributed on (0, 1]. The processors on different clusters have different capacities v_1, ..., v_k. A task with nominal time h_i and w_i processors is computed in time h_i/v_j on cluster number j. Let the total volume of computations W be the sum of the computation volumes of all tasks. Let L be the minimal time by which all clusters have computed all the tasks assigned to them, where each task is assigned to one cluster. The expected value of the free computation volume E(V_sp) is used to analyze the quality of an algorithm, where V_sp = (∑_{1≤j≤k} v_j)·L − W. It is shown that for every algorithm for scheduling parallel tasks on related clusters E(V_sp) = Ω(w√N). An online scheduling algorithm, Limited Hash Scheduling, is proposed that distributes tasks to limited areas. This algorithm works in a closed-end mode and has an expected free computation volume of O(w√(N ln N)). The idea of the algorithm is to schedule tasks with similar processor requirements into separate areas on the clusters, each limited in time and in the number of processors allowed.
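For readability, the key quantities from this abstract can be written out in LaTeX notation (a sketch reconstructed from the plain-text formulas above; L is the common completion time, W the total computation volume, v_j the capacity of cluster j):

```latex
V_{sp} = \Bigl(\sum_{j=1}^{k} v_j\Bigr) L - W,
\qquad
\mathbb{E}(V_{sp}) = \Omega\bigl(w\sqrt{N}\bigr) \text{ for any algorithm},
\qquad
\mathbb{E}(V_{sp}) = O\bigl(w\sqrt{N \ln N}\bigr) \text{ for Limited Hash Scheduling}.
```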
123-142
Abstract
At present, big companies such as Amazon, Google, Facebook, Microsoft, and Yahoo! own huge datacenters with thousands of nodes. These clusters are used simultaneously by many users, who submit jobs containing one or more tasks. The task flow is usually a mix of short, long, interactive, and batch tasks with different priorities. The cluster scheduler decides on which server to run each task, where the task then runs as a process, container, or virtual machine. Scheduler optimizations are important as they provide higher server utilization, lower latency, improved load balancing, and fault tolerance. Achieving good task placement is hard: the problem has multiple dimensions and requires algorithmically complex optimizations, which increases placement latency and limits cluster scalability. In this paper we consider different cluster scheduler architectures and optimization problems.
143-160
Abstract
Memory errors in Linux kernel drivers are serious bugs that can lead to dangerous consequences, yet such errors are hard to detect. This article describes static verification, which aims at finding all errors under certain assumptions. Static verification of industrial projects such as the Linux kernel requires additional effort: limitations of current static verification tools make it impossible to analyze the Linux kernel as a whole, so we use a simplified, automatically generated environment model. This model introduces inaccuracy but makes verification feasible. In addition, we allow absent definitions for some functions, which results in incomplete ANSI C programs. The current work proposes an approach to revealing memory usage issues in such incomplete programs. Our static verification technique is based on Symbolic Memory Graphs (SMG) with extensions aimed at reducing the false alarm rate. We introduced an on-demand memory concept to simplify kernel API models and implemented it in the static verification tool CPAchecker. We also changed the precision of the CPAchecker memory model from bytes to bits, supported structure alignment similar to the GCC compiler, and implemented a predicate extension for SMG to improve the accuracy of the analysis. We verified Linux kernel versions 4.11.6 and 4.16.10 with the help of the Klever verification framework, with CPAchecker as the verification engine. Manual analysis of warnings produced by Klever revealed 78 real bugs in drivers. We have made patches to fix 33 of them.
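As an illustration of the class of memory errors such tools target (a minimal generic C sketch, not taken from the paper or from any actual driver; the names probe_device and device_ctx are hypothetical):

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative only: a typical "leak on error path" pattern of the kind
 * static verification tools look for in driver-style code. */
struct device_ctx {
    char name[32];
    int  *buffer;
};

static int probe_device(const char *name, size_t buf_len, struct device_ctx **out)
{
    struct device_ctx *ctx = malloc(sizeof(*ctx));
    if (ctx == NULL)
        return -1;

    ctx->buffer = malloc(buf_len * sizeof(*ctx->buffer));
    if (ctx->buffer == NULL)
        return -1;          /* BUG: ctx is leaked on this error path */

    strncpy(ctx->name, name, sizeof(ctx->name) - 1);
    ctx->name[sizeof(ctx->name) - 1] = '\0';
    *out = ctx;
    return 0;
}

int main(void)
{
    struct device_ctx *ctx = NULL;
    if (probe_device("demo", 16, &ctx) == 0) {
        free(ctx->buffer);
        free(ctx);
    }
    return 0;
}
```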
161-170
Abstract
The Linux operating system is a modern open operating system containing more than 10,000 configuration variables and a large variety of functional system elements that handle the processing of various kinds of tasks. The task is to create a version of the OS for a class of applied systems (medicine, biology, etc.). This task is solved by analyzing the basic functions of the OS kernel and choosing, from the variety of elements, those most suitable for the operational management of application functions. Based on them, a variability model is created from the basic characteristics of the OS, along with a model of the OS variant that includes the main functional elements of the OS kernel. These models are tested for the correctness of their identification and of their relationships with other elements. Then, using these models, the OS version is configured as a configuration file. This file is verified and undergoes comprehensive testing on a set of tests that check the correct functioning of the operating environment and the processing of tasks of applied systems. This paper discusses how to build a ready-made version of the operating system kernel from source. It covers the preparations, the necessary packages, the patches for them, and the ways of installing them. It then presents a method for configuring a system version assembled from source and configuring the kernel to run.
171-198
Abstract
It is intuitively clear that the search for scientific publications often has many characteristics of exploratory search. The purpose of this paper is to formalize this intuitive understanding, to explore which research tasks of scientists can be attributed to exploratory search, what approaches exist for solving the exploratory search problem in general, and how they are implemented in specialized search engines for scientists. We reviewed existing work on the information-seeking behavior of scientists and on the special variant of search called exploratory search. There are several types of search typical for scientists, and we show that most of them are exploratory. Exploratory search differs from ordinary information retrieval and demands special support from search systems. We analyzed seventeen current search systems for academics (from Google Scholar, Scopus, and Web of Science to ResearchGate) from the standpoint of exploratory search support. We found that most of them do not go far beyond simple information retrieval, and there is room for further improvement, especially in support for collaborative search.
199-220
Abstract
The spread of information is a fundamental process taking place on the Internet. Every day we can observe the publication of various information and its further dissemination through news articles and messages from ordinary users. Although the process itself can be observed explicitly, it is difficult to determine individual propagation paths. The growth of global information in all spheres of human life radically changes the speed and ways of disseminating information. In this review, we study models of information flows on the Internet and divide them into two groups: explanatory models, which assume the existence of an underlying network over which information propagates, and predictive models, which study the spread of individual information flows. Despite all the complexity, studying the important properties of information spread is necessary for understanding the general processes occurring in the modern information society.
221-236
Abstract
The search and classification of text documents are used in many practical applications and are key tasks of information retrieval. Methods of text search and classification are used in search engines, electronic libraries and catalogs, systems for collecting and processing information, online education, and many others. There are a large number of particular applications of these methods, but each such practical task is, as a rule, weakly formalized and narrow in scope, and therefore requires individual study and its own approach to the solution. This paper discusses the task of automatically searching for and typing text fragments containing biographical information. The key problem here is the multi-class classification of text fragments depending on the presence and type of biographical information contained in them. After reviewing the related work, the author concluded that neural network methods are promising and widely used for solving such problems. Based on this conclusion, the paper compares various architectures of neural network models, as well as basic text representation methods (Bag-of-Words, TF-IDF, Word2Vec), on a pre-assembled and labeled corpus of biographical texts. The article describes the steps involved in preparing a training set of text fragments, the methods chosen for text representation, and the classification methods chosen for solving the problem. The results of the multi-class classification of text fragments are also presented, and examples of automatic search for fragments containing biographical information are shown for texts that did not participate in model training.
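For context, the TF-IDF weighting mentioned above follows the standard textbook definition (not a formula specific to this paper):

```latex
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log \frac{N}{\mathrm{df}(t)}
```

where tf(t, d) is the frequency of term t in document d, N is the number of documents in the corpus, and df(t) is the number of documents containing t.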
237-258
Abstract
The rapid growth in the volume of information and the need for its comprehensive analysis lead to the development of new methods of working with multidimensional and spatio-temporal data. To work with such data, Qualitative Spatial Reasoning is often used to extract (produce) new knowledge based on facts established in one way or another. Despite the variety of formal analysis systems available, their sets of operators do not allow identifying complex spatio-temporal relationships between objects. Existing software tools, such as SparQ, GQR, QAT, and CLP(QS), are focused on analyzing a single aspect and are applicable mainly to interval analysis of time series of events and to drawing conclusions about certain spatial relationships. In practice, these tools are usually used to analyze simple relationships in simple scenes. The lack of interfaces limits their combined use with quantitative analysis tools, which are necessary, for example, to establish primary facts, and their use within the concepts of 4D (space-time), 5D (space-time and cost), and multi-D (multidimensional) modeling. This paper proposes a system of topological, metric, directional, and temporal operators for complex spatio-temporal analysis of dynamic scenes. This system allows combined usage of qualitative and quantitative analysis methods, which is essential not only for determining initial facts, but also for producing new knowledge based on these facts. The proposed system of operators is considered promising for problems of spatio-temporal (4D) modeling and planning of industrial projects, and particularly for specifying and detecting non-trivial conflicts in project schedules.
259-274
Abstract
The article discusses the capabilities of the open-source library SOWFA. This problem-oriented library, operating as part of the open-source package OpenFOAM v.2.4.0, is intended for solving wind energy problems. In connection with the construction of new wind farms in the Russian Federation (Ulyanovsk region, the Republic of Adygea), the issues of designing and modeling the operation of wind farms and wind turbines are currently relevant. The article describes the structure of the SOWFA library and some of its classes. Studying the dynamics of self-organized turbulent structures and estimating their size are important for maximizing the power generated by wind turbines and for analyzing the optimal placement of wind turbines in a wind farm. At the same time, it is necessary to study in detail the process of air ejection in the area of the wind farm - the displacement of two media, in which one medium, being under pressure, affects the other and carries it in the required direction. The ejection phenomenon plays a positive role, allowing the velocity deficit in the wake of a wind turbine to be recovered, and therefore affects the power output of the wind farm. The ejection phenomenon can be studied using the motion of solid particles. The article describes an example of adding a new KinematicCloud class, which describes a kinematic cloud of particles, to the ABLSolver solver, and an example of solving an applied wind energy problem for a model wind farm. The numerical domain for the model wind farm had the shape of a parallelepiped with given dimensions; the unstructured mesh contained 6 million cells. To determine the initial distribution of parameters, the neutral atmospheric boundary layer approximation was used, calculated with the precursor method implemented in the ABLSolver solver. The mathematical modeling of the flow parameters in the wind farm was done using the pisoFoamTurbine solver and the Actuator Line Model. In the course of the calculation, for a model wind farm with 12 wind turbines, the fields of averaged and fluctuating values were obtained for velocity, pressure, subgrid-scale viscosity, stress tensor, and vorticity. The article compares the values of the dimensionless horizontal velocity in two different sections with the values obtained in an experiment. The calculations were performed using the resources of the high performance cluster of the UniCFD web-laboratory at ISP RAS.
275-292
Abstract
The paper presents a multiscale approach to the simulation of two-phase flow processes in complex technical systems. The multiscale approach is based both on the division of the computational domain into subdomains with their own systems of equations and on the splitting of the initial system of equations into several subsystems, each valid at the corresponding scale under consideration. The problem of far-field acoustic noise calculation during the launch of a rocket vehicle is considered as an example of the possible use of the multiscale model. The case is studied taking into account noise suppression due to water supply into the gas jets of the propulsion system. Other areas of application of the multiscale model include cases from the oil and gas industry: killing gas-producing wells located at great depth, and killing oil wells with a high gas factor. The proposed multiscale mathematical model includes 5 sub-models: 1) gas dynamics of high-speed multicomponent gas mixture flows; 2) hydrodynamics of a two-phase mixture flow in a homogeneous approximation, accounting for the compressibility of the gas phase and mass exchange between the phases; 3) transport of the liquid-gas interface; 4) transport of a cloud of droplets and its interaction with a gas-liquid medium; 5) noise calculation in the far field using the Ffowcs Williams-Hawkings acoustic analogy. The model can be extended with additional sub-models, such as the Eulerian-Lagrangian jet atomization model. The sub-models can be implemented on the basis of open-source packages: OpenFOAM, Nektar++, ITHACA-FV. The acoustics library and the hybrid algorithm for compressible homogeneous two-phase flow are implemented as the libAcoustics and hybridCentralSolvers modules based on the open-source OpenFOAM package. The source code of the developed model is freely available through the GitHub project https://github.com/unicfdlab.
293-304
Abstract
Systems of polynomial equations are one of the most universal mathematical objects. Almost all problems of cryptographic analysis can be reduced to finding solutions of systems of polynomial equations; the corresponding direction of research is called algebraic cryptanalysis. In terms of computational complexity, systems of polynomial equations cover the entire range of possible options, from the algorithmic unsolvability of Diophantine equations to well-known efficient methods for solving linear systems. Buchberger's method [5] transforms a system of algebraic equations into a system of a special type, defined by a Gröbner basis of the original system, which allows eliminating dependent variables. The determination of a Gröbner basis relies on an admissible ordering on the set of terms; the set of admissible orderings on the set of terms is infinite, and even of continuum cardinality. The most time-consuming step in finding a Gröbner basis using the Buchberger algorithm is the processing of all S-polynomials, which form a system of generators of the K[X]-module of S-polynomials. This leads to the natural problem of finding a minimal system of generators of this kind. The existence of such a system of generators follows from Nakayama's theorem. An algorithm for constructing such a basis for any ordering is proposed.
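For reference, the S-polynomial used in Buchberger's algorithm has the standard textbook definition (this is not the construction proposed in the paper):

```latex
S(f, g) = \frac{\operatorname{lcm}\bigl(\mathrm{LT}(f), \mathrm{LT}(g)\bigr)}{\mathrm{LT}(f)}\, f
        - \frac{\operatorname{lcm}\bigl(\mathrm{LT}(f), \mathrm{LT}(g)\bigr)}{\mathrm{LT}(g)}\, g
```

where LT denotes the leading term with respect to the chosen admissible ordering.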
305-314
Abstract
We consider the problem of using processors with the ARMv8 architecture to speed up multimedia and digital signal processing algorithms when solving signal recovery problems in the filtering process. As an example, the implementation of a digital FIR filter with a linear phase-frequency response is considered. A formula for calculating the filter is proposed. The algorithm is optimized using vector SIMD instructions of the ARMv8 architecture. An implementation of the signal processing algorithm in C on a BCM2837 chip with an ARM Cortex-A53 processor is presented. The solution provided effective recovery of frequencies distorted during signal transmission in the audio range and demonstrates the efficiency of using mobile multicore ARMv8 processors for parallel data processing in solving complex computational problems. Experimental results show that the use of ARMv8 processors in signal filtering problems significantly speeds up multimedia and signal processing algorithms such as video encoding/decoding, 2D/3D graphics, games, sound and speech processing, image processing, and telephony.
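A minimal sketch of the kind of SIMD optimization described (not the paper's filter): a direct-form FIR inner loop vectorized with ARMv8 NEON intrinsics, assuming float samples, a tap count that is a multiple of 4, and an input buffer of len + taps - 1 samples.

```c
#include <arm_neon.h>
#include <stddef.h>

/* Illustrative sketch only: y[n] = sum_k h[k] * x[n + k],
 * with the inner loop vectorized using AArch64 NEON intrinsics. */
void fir_neon(const float *x, const float *h, float *y,
              size_t len, size_t taps)
{
    for (size_t n = 0; n < len; n++) {
        float32x4_t acc = vdupq_n_f32(0.0f);
        for (size_t k = 0; k < taps; k += 4) {
            float32x4_t xs = vld1q_f32(&x[n + k]);  /* 4 input samples    */
            float32x4_t hs = vld1q_f32(&h[k]);      /* 4 coefficients     */
            acc = vfmaq_f32(acc, xs, hs);           /* fused multiply-add */
        }
        y[n] = vaddvq_f32(acc);                     /* horizontal sum     */
    }
}
```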
315-328
Abstract
Numerical simulation plays an important role in the design of new hydraulic units. It makes it possible to minimize the number of expensive experimental tests of products and reduces development time. The flow in pipes, valves, regulators, and other hydraulic elements belongs to the class of internal incompressible flows. Standard numerical methods such as the finite volume method (FVM) and the finite element method (FEM) are already successfully used for modeling incompressible internal flows. However, in the case of domains with moving boundaries, these methods are hard to set up and sometimes inefficient, so there is a need to search for alternative methods for this class of problems. Requirements for new methods include acceptable accuracy and high computational efficiency. The aim of this study is an overview, testing, and comparison of different simulation methods for the simplest types of internal flow: the finite volume method, the particle finite volume method (PFVM), and the lattice Boltzmann method (LBM). Different shapes of circular pipes were considered: a straight pipe with constant cross-section, a step pipe (abrupt increase of the diameter), a backward-step pipe (abrupt decrease of the diameter), and an elbow pipe. The velocity and pressure fields, accuracy, and simulation time were compared. The following solvers were used in the study: pimpleFoam as the OpenFOAM implementation of FVM, XFlow as the implementation of LBM, and ParticlePimpleFoam as the OpenFOAM implementation of PFVM. Four values of the non-dimensional time step (Courant number) were considered for the PFVM and FVM methods: 1, 2, 5, and 10.
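For reference, the non-dimensional time step (Courant number) referred to above follows the usual definition (a standard formula, not specific to these solvers):

```latex
\mathrm{Co} = \frac{|u| \, \Delta t}{\Delta x}
```

where u is the local flow velocity, Δt the time step, and Δx the local cell size.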
329-340
Abstract
A semi-empirical mathematical model was developed to describe the outgassing process of polymeric coatings under open space conditions. A calculation was carried out to estimate the thickness of the film that arises as a result of non-equilibrium condensation of gaseous products on a sensitive surface. The source data were: a) experimentally obtained time functions of the polymeric coating mass loss at different temperatures; b) data on the temperature regime of the construction element; c) the temperature changes of the construction element according to its orientation towards the Sun. For the calculation, the geometry of the construction element and the sensitive surface is considered axisymmetric. As a result, numerical values of the average, maximum, and minimum film thickness were obtained for the whole operational period of the spacecraft.
341-366
Abstract
Functional programming plays a big role in modern computer science, and its importance is growing. This is not accidental: the approach helps to create better and more reliable software that is easy to reason about (both manually and automatically). However, these techniques are hardly used in tools that help design and model mission-critical systems. In this paper, we try to apply some techniques of functional programming to create a modeling system, in particular a simulation system for analysis of temporal behavioural properties of mission-critical systems. As a first step, we designed a representation of simulation time in terms of abstractions used in functional programming and studied its composability properties.
367-382
Abstract
The paper presents recent results on the way towards accurate and complete verification of industrial operating systems (OS). We consider here OSes that are either general-purpose or actively used in some industrial domain and have been developed and maintained for a significant time; we do not touch research OSes usually developed as a proof of concept. Although the stated goal of accurate and complete verification of an industrial OS is still unreachable, we consider its decomposition into tasks of verifying various functional OS components and various properties of theirs. The paper shows that many of these tasks can be solved with the help of various modern verification techniques and their combinations, and that the proposed methods can later be integrated into an approach to the final goal. The paper summarizes the experience of verifying various OS components and features in projects conducted at ISP RAS in recent years.
ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)