Vol 26, No 2 (2014)

Configurable Toolset for Static Verification of Operating Systems Kernel Modules

I. S. Zakharov, M. U. Mandrykin, V. S. Mutilin, E. M. Novikov, A. K. Petrenko, A. V. Khoroshilov

PDF (Rus)

5-42

Abstract

An operating system (OS) kernel is a critical software regarding to reliability and efficiency. Quality of a modern OSs kernel is high enough. Another situation is with kernel modules, e.g. device drivers, which due to various reasons have a significantly lower level of quality. One of the most critical and widespread bugs in kernel modules are violations of rules of correct usage of a kernel API. One can identify all such the violations in modules or prove their correctness with help of static verification tools which needs contract specifications describing formally obligations of a kernel and modules with respect to each other. The paper considers existing methods and toolsets for static verification of kernel modules of different OSs. It suggests a new method for static verification of Linux kernel modules that allows to configure checking at each of its stages. The paper shows how this method can be adapted for checking kernel components of other OSs. It describes an architecture of a configurable toolset for static verification of Linux kernel modules, which implements the proposed method, and demonstrates results of its practical application. Directions of further development are considered in conclusion

Graph learning by a set of automata

Igor Burdonov, Alexander Kosachev

PDF (Rus)

43-86

Abstract

Graph learning by automata is a basic task in many applications. Among these applications are verification and testing of software and hardware systems, network exploration including Internet and GRID basing on formal models. The system or network model, in the final analysis, is a transition graph with properties to be examined. In the recent years the size of the systems and networks in everyday use and, consequently, the size of their models and graphs to be learned grows constantly. Problems arise when graph exploration by a single automaton (computer) either requires a lot of time, or a lot of computer memory, or both. Therefore, there is a task of parallel and distributed graph exploration. This task is formalized as graph learning by a set of automata (a set of computers with sufficient total memory working in parallel). This paper covers the solution of this task. First of all the behavior of the automaton on the graph is formalized. Then the algorithm of the graph learning is described in detail. The estimation of the algorithm is O ( m + nD ). Some ways of possible optimization is discussed.

Description of hardware configurations of guest systems in QEMU emulator as separate text files

O. V. Goremykin

PDF (Rus)

87-96

Abstract

QEMU is an open-source full system emulator based on the dynamic binary translation approach. QEMU emulates a complete hardware environment including CPUs, peripheral devices (VGA cards, network interfaces, IDE controllers, sound and USB components, and other). Guest system hardware configuration initialized at QEMU startup. During initialization phase QEMU creates virtual CPUs, peripheral devices and memory hierarchy. Hardware configuration is a part of QEMU source code. Being written in C language hardware configuration can be difficult to edit. Also, QEMU must be recompiled after any change in hardware configuration like addition of a new device. A method of the guest system hardware configuration description as an external file is discussed in this paper. JSON format was selected for configuration files. QEMU has parser for this format readily available because JSON is used by QEMU Monitor Protocol. Each file describes hardware configuration of one board. It consists of sequence of JSON objects. Each object describes properties one peripheral device, bus, CPU or memory segment. Each object has a unique ID associated with it and can reference other objects by their IDs. An arbitrary board configuration can be described with this format. The proposed method was implemented in QEMU and tested on ARM guest systems. The method is generic and can be applied to any board with any CPU architecture.

Interprocedural taint analysis for LLVM-bitcode

V. K. Koshelev, A. O. Izbyshev, I. A. Dudina

PDF (Rus)

97-118

Abstract

Today the development cycle of many application classes requires a security analysis stage. Taint analysis is widely used to check programs for different security vulnerabilities. This paper describes static interprocedural flow, context, and object-sensitive taint analysis approach for C/C++ applications. Our taint analysis algorithm is based on the Flowdroid project’s approach, but in contrast to Flowdroid, which aims to analyze Java bytecode, our approach handles LLVM bitcode and pointer arithmetic. Primary drawback of the Flowdroid approach is a memory usage issue which arises during analysis of medium size applications (around 10 000 edges in the call graph). To achieve scalability of the approach, we suggest a set of heuristics which helps to significantly decrease memory usage of the algorithm. The testing of real-world applications shows that such heuristics make precise taint analysis suitable for the medium size programs. Using our approach, we implemented general taint analysis framework as an LLVM pass. Additional security checks (e.g. Use of Hard-coded Password, Information Exposure, etc.) can be implemented on top of this framework. We have also developed auxiliary passes which resolve targets of virtual calls and build interprocedural control flow graph according to the results.

Using Deterministic Replay for Software Fault Injection

P. M. Dovgalyuk, Y. V. Markin

PDF (Rus)

119-136

Abstract

This paper presents method of improving software fault injection by using deterministic replay. Fault injection and fuzzing are the methods of testing used for checking code coverage quality, improving error handling, and robustness testing. Fuzzing can hardly be applied for stateful communication protocols because of program state could change when restarting an application. The main idea of our method is to inject faults while replaying program deterministically. Deterministic replay requires program execution recording for latter replaying. Recorded log includes user input, incoming network packets, USB input, and hardware timers. During replay we read these events from the log and put them back into the simulator instead of reading inputs or receiving packets from the network. After injecting the fault in replay mode the program execution is different. It means that we should stop the replaying and start normal program execution from that program state. During the execution we simulate all hardware timers to make this mode switching imperceptible to the program. With the help of deterministic replay we can accelerate system initialization, eliminate non-deterministic data sources effect, and simplify environment setup, because the whole program execution before injecting fault is recorded. On the basis of the method the network fuzzer was built. The fuzzer modifies selected network packet saved during session recording and sends it back into the simulator. This phase is repeated from the same program state until the bug in the program was found.

Application of information technologies (genetic algorithms, neural networks, parallel calculations) in safety analysis of Nuclear Power Plants

Yu. B. Vorobyev, P. Kudinov, M. Jeltsov, K. Kööp, T. V. Nhat

PDF (Rus)

137-158

Abstract

This paper investigates important issues in three types of safety assessment methodologies commonly applied for Nuclear Power Plants (NPP). These methodologies are i) dynamic probabilistic safety assessment (DPSA) where application of genetic algorithm (GA) is shown to improve the efficiency of the analysis, ii) deterministic safety assessment (DSA) with meta model representation of the system using pre-performed computational fluid dynamics (CFD) code and iii) vulnerability search (e.g. identification of accident scenarios in an NPP) with application of neural network (NN). The use of advanced computational tools and methods such as genetic algorithms, neural networks and parallel computations improve the efficiency of safety analysis. To achieve the best effect, these advanced technologies are to be integrated with existing classical methods of safety analysis of the NPP.

Survey of polygonal surface simplification algorithms on GPU

V. I. Gonakhchyan

PDF (Rus)

159-174

Abstract

Rendering time depends on number of faces in polygonal mesh. Displaying large number of objects or objects with a lot of faces can result in performance degradation. One of the ways to overcome this problem is to use several levels of detail for polygonal mesh. Levels of detail with low quality are used when object is far away from camera and difference in number of faces is not apparent to viewer. Coarse meshes can be generated manually which is difficult and time consuming. For many applications (computer graphics, computer vision, finite element analysis) it's preferable to use automated methods for polygonal surface simplification to increase performance. Traditional methods for surface simplification are designed for CPU. GPU methods provide improvements in performance and have comparable visual quality. This survey covers algorithms on GPU. They are based on either shader programming or general purpose computing frameworks like OpenCL and CUDA. Shader methods are very restrictive. They are designed to render images in real-time. General purpose computing frameworks provide flexible APIs but require attention to hardware implementation details to avoid performance bottlenecks. Main features of algorithms are provided for comparison as the result. The most used operation for geometry simplification is edge collapse. It's difficult to avoid expensive data exchange between GPU and CPU. It's important to take into account computational model of GPU to increase number of stream processors working in parallel.

Effective spatio-temporal indexing methods for visual modeling of large industrial projects

V. A. Zolotov, V. A. Semenov

PDF (Rus)

175-196

Abstract

As opposed to traditional project planning and scheduling methods, 4D modeling technologies allow performing more comprehensive analysis taking into account both spatial and temporal factors. Such analysis enables to identify critical issues in early design and planning stages and to save significantly on the total project implementation costs. Unsurprisingly that there is a large interest to these emerging technologies and both free and commercial systems have been recently developed and applied in different industrial domains. However, being applied to large industrial projects and programs, most systems face significant problems relating to performance degradation and limited interactive capabilities what ultimately neglects the key benefits of visual 4D modeling technologies. In this paper a promising approach to rising up the performance and scalability of 4D modeling applications by means of indexing project data has been discussed. Three alternative indexing schemes have been proposed and described in conformity to the features of spatio-temporal coherence of arising dynamic scenes. These schemes have been investigated against the efficiency criteria to resolve typical requests connected with deployment and updates of indexes, reconstruction of scenes, analysis of visible objects, and animation of dynamic scenes. Conducted computational complexity analysis and obtained asymptotic estimates confirm the efficiency of the proposed indexing schemes and cause their introduction in industrial practice. Recommendations on their practical use have been also provided.

A combined method for verification of large-scale data models

V. A. Semenov, S. V. Morozov, D. V. Ilyin

PDF (Rus)

197-230

Abstract

The paper is addressed to the actual problem of verification of large-scale data models applied in various industrial areas and specified using popular general-purpose object-oriented languages, such as EXPRESS, UML/OCL. Main benefits of information modeling languages (high expressiveness, declarative nature, advanced set of syntactic units) negatively affect the process of automatic verification of the specifications. The known approaches are based on reduction of the original complex problem to some well-known mathematical statement and its solution by existing methods. The performed analytical survey of the existing methods for model verification demonstrates that they cannot be used for solving the problem because of their high computational complexity. A combined method for verification of large-scale data models is proposed in the paper. The method is based on sequential reduction to the several mathematical problem statements: linear programming, constraint satisfaction problem (CSP), Boolean satisfiability (SAT). Usage of the combined method allows to avoid efficiency issues peculiar to the known approaches. At the first step the polynomial complexity methods of integer linear programming are applied to the original large-scale problem and localize the search region for solution by detection of the necessary amount of objects. At the next steps constraints imposed onto relatively small groups of objects can be considered individually, which allows to reduce significantly dimension of the problem. The key problem of estimation of the number of instances intended for generation of correct object collection and its reduction to an integer linear programming problem is investigated in detail. The performed experiments demonstrate prospectivity of the combined computational strategy and efficiency of the proposed method for verification of large-scale data models. The work is supported by RFBR (grant 13-07-00390).

Lowing ambiguity level in object state estimation in a case-based control system

L. E. Karpov, V. N. Yudin

PDF (Rus)

231-244

Abstract

This work continues the publication series that is devoted to the investigation of methods of controlling complex objects basing on cases. In situation when it is hard or even impossible to use the exact mathematical model of object behavior the case-based reasoning control method becomes adequate. Instead of using the model we may use the accessible information about object under control state, controlling actions and their results. This means that we are using “cases”. In the theory of deduction this idea corresponds to three constituents of the case-based reasoning: description of a problem, solution or action applied, and outcome that is the result of applying the solution. This approach is based on separating of object states into classes, in each of which all the states are equivalent to each other. After that the object under control state is comparing with cases in the case base which was collected in advance. In order to choose the controlling action for the current case some metrics of similarity is used to find a case with similar initial and final states in the case base. The controlling action is extracted from this case then. The way of estimating of similarity of the current case and its precedent is critical for a system that uses the case-based reasoning technique. In previous publications the method was presented that used to separate cases from the case base into classes of equivalence. To estimate not fully described case the projection of classes onto the features space was used. Incompleteness in object description brings to the ambiguity in object estimation especially while one has the lack of time and resources. Classes in the intersection of which the object appears, are forming so called differential set of the object. If there is no feature that may separate the classes the choice of controlling action is hardly being made. The method offered by authors helps to low the ambiguity level on the basis of object prehistory investigation (previous states found and actions used). As a result of taking the prehistory into account the number of possible choices of actions is decreasing. The way of lowing the ambiguity level is described in the article. The notion of “prognostic” differential set of object states is presented. The final state estimation is produced on the basis of comparison of the current and prognostic sets.

Двусторонняя унификация программ и ее применение для задач рефакторинга

T. A. Novikova, V. A. Zakharov

PDF (Rus)

245-268

Abstract

It is generally accepted that to unify a pair of substitutions θ_1 and θ_2 means to find out a pair of substitutions η' and η'' such that the compositions θ_1 η' and θ_2 η'' are the same. Actually, unification is the problem of solving linear equations of the form θ_1 X=θ_2 Y in the semigroup of substitutions. But some other linear equations on substitutions may be also viewed as less common variants of unification problem. In this paper we introduce a two-sided unification as the process of bringing a given substitution θ_1 to another given substitution θ_2 from both sides by giving a solution to an equation Xθ_1 Y=θ_2. Two-sided unification finds some applications in software refactoring as a means for extracting instances of library subroutines in arbitrary pieces of program code. In this paper we study the complexity of two-sided unification for substitutions and programs. In Section 1 we discuss the concept of unification on substitutions and programs, outline the recent results on program equivalence checking that involve the concept of unification and anti-unification, and show that the problem of library subroutine extraction may be viewed as the problem of two-sided unification for programs. In Section 2 we formally define the concept of two-sided unification on substitutions and show that the problem of two-unifiability is NP-complete (in contrast to P-completeness of one-sided unifiability problem). NP-hardness of two-sided unifiability problem is established in Section 4 by reducing to it the bounded tiling problem which is known to be NP-complete; the latter problem is formally defined in Section 3. In Section 5 we define two-sided unification of sequential programs in the first-order model of programs supplied with strong equivalence (logic&term equivalence) of programs and proved that this problem is NP-complete. This is the main result of the paper.

A Threshold Cryptosystem in Secure Cloud Computations

N. P. Varnovskij, S. A. Martishin, M. V. Khrapchenko, A. V. Shokurov

PDF (Rus)

269-274

Abstract

Information security in cloud computing technology is actively investigated by the world scientific community. They uses the internet and the central remote servers to provide and maintain data as well as applications. This users' data files can be accessed and manipulated by any other users. So the problem of secure data storage and computation is actual. The modern studies in this field shows that the indicated problem is much more complex than any of the other information security problems, which are solved by well-known cryptographic methods. So, for example M. van Dijk and A. Juels in the paper "On the impossibility of cryptography alone for privacy-preserving cloud computing" described a mathematical model of the organization of cloud computing and proved that in the case of two users information protection is impossible. This result refutes the well-established point of view that the recently proposed by C. Gentry construction for fully homomorphic encryption solves at least theoretically, all the problems of information security in cloud computing. We offer an alternative model of cloud computing, in which the specified negative result does not holds. It differs from the above in the point that each subject interested in privacy, creates his own crypto server. From the point of view of users these cryptoservers are the part of the cloud. The methods of information protection, using threshold cryptosystem in this new model are investigated.

Syntactical characterization of nondeterministic logspace complexity class

D. A. Nosov

PDF (Rus)

275-296

Abstract

NL is defined as the class of languages recognizable by logspace nondeterministic Turing machines. One of the main unsolved problems in complexity theory is that of relation between classes P and NL. It is known that NL is contained in P, since there is a polynomial-time algorithm for 2-satisfiability, but it is not known whether NL = P or whether NL = L. A possible approach to these problems can be based on searching for an alternative suitable definition of the class NL. Taitslin et al. propose such a definition in terms of nondeterministic programs. The syntax of such programs is similar to that of usual computer programs. Each nondeterministic program takes a finite universal algebra as input. Taitslin et al. defined a class of languages recognizable by such programs and proved that this class is a subclass of NL. In the present paper, we slightly modify their syntactical definition. Namely, we modify a definition of nondeterministic program input and give a new definition of a language recognized by a given program. We prove that the class of languages recognizable by nondeterministic programs according to our definition is just NL.

Username
Password
	Remember me
Not a user? Register with this site Forgot your password?

User

Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Cookies policy