Preview

Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Advanced search
Vol 30, No 5 (2018)
View or download the full issue PDF (Russian)
7-30
Abstract
Formation of Informatics and aspects of computer software development, in particular, operating systems and information systems since the period of appearance of the first computers of 1948-1990 is considered. The program of informatization of Russia in 1992 and the development of the fundamentals of modern computer science or informatics and the intellectualization of the development of various types of software systems.
31-54
Abstract
Providing security for computer programs is one of the paramount tasks nowadays. Failures in operation of program software can lead to serious consequences and exploitation of vulnerabilities can inflict immense harm. Large corporations pay particular attention to the analysis of computer security incidents. Code-reuse attacks based on return-oriented programming are gaining more and more popularity each year and can bypass even modern operating system protections. Unlike common shellcode, where instructions are placed consequently in memory, ROP chain contains of several small instruction blocks (gadgets) and uses stack to chain them together, which makes analysis of ROP exploits more difficult. The main goal of this work is to simplify reverse engineering of ROP exploits. In this paper I propose the method for analysis of code-reuse attacks, which allows one to split chain into gadgets, restore the semantics of each particular gadget, and restore prototypes and parameters values of system calls and functions called during the execution of ROP chain. Parametrized types define gadget semantics. Each gadget type is defined by a postcondition (boolean predicate) that must always be true after executing the gadget. The proposed method was implemented as a program tool and tested on real ROP exploits found on the internet.
55-74
Abstract
Many buffer overrun errors in C programs are caused by erroneous string manipulations. These can lead to denial of service, incorrect computations or even exploitable vulnerabilities. One approach to eliminate such defects in the course of program development is static analysis. Existing static analysis methods for analyzing strings either produce many false positives, miss too many errors, scale poorly, or are implemented as a part of a proprietary software. To cover a significant amount of the real program defects it is necessary to detect errors that could happen only on a particular program path and cannot be defined by a single erroneous point. Also, it is essential to find misusage of library functions and user functions. The aim of this study is to develop a detection algorithm that will cover such cases, will produce at most 40% false warnings, will be applicable to any C programs without any additional restrictions, and will scale up to millions of lines of code. We have extended our approach of symbolic execution with state merging to support string manipulations. Also we have developed a string overflow detector based on our buffer overflow approach with integer indices. The new algorithm was implemented in the Svace static analyzer. As a result, the coverage of the buffer-overflow related testcases from the Juliet test suite has increased from 15.4% to 41.5% with zero false positives. Also we have compared our Juliet results to those of the Infer static analyzer. The basic Svace version (without string support) is on par with Infer except for the flow variant of complex loops, whereas string-related buffer overflows are not detected by Infer.
75-88
Abstract
The article discusses a new approach to obtaining additional information about the software module under study based on the preliminary software architecture recovery during the executable code analysis. As a result, it is possible to reduce the requirements for the resources spent by limiting the field of research, rational choice of priorities, and abstraction from secondary elements. The paper demonstrates the feasibility of restoring the software architecture in a two-step process: first, the separate components are isolated, and then their purposes and relationships are determined. An automated method for decomposing a software module is proposed, which allows allocating components corresponding to static libraries, classes, and their groups. This method is based on the functions clustering by the distances between them in the address space and on the call graph. A description of the implementation of the developed method as a plug-in for the IDA disassembler is given.
89-100
Abstract
This paper describes the developed platform for static analysis of binary code. The platform is developed based on interprocedural, flow-sensitive and context-sensitive analysis of the program. The machine-independent language REIL is used as an intermediate representation. In this representation basic data flow analyzes are developed and implemented - reaching definitions analysis, construction of DEF-USE and USE-DEF chains, analysis for deletion of dead code, value analysis, taint analysis, memory analysis and etc. The implemented approach for functions’ annotations allow propagating data between function calls, thereby making the context-sensitive analysis. The platform provides an API for using all implemented analyzes, which allows adding new analyzes as plugins.
101-108
Abstract
The paper proposes an approach to monitoring file operations through capturing virtual disk accesses in the emulator. This method allows obtaining information about file operations in the OS-agnostic manner but requires a separate implementation for each file system. An important problem for implementing this approach is the correct handling of changes in the file system. Operating systems that cache write requests can perform operations in any order. The authors have created a method for detecting read, write, create, delete and rename operations, and a module for QEMU, which monitors operations in the ext3 file system. The advantage of this method over others is that it does not interfere with the operation of the OS and does not depend on it. It is assumed that the QEMU module for file systems other than ext2/3 can be implemented using the methods described in this article.
109-122
Abstract
The article discusses ways to get the content of files, which are modified during the processing in the well-known open source dynamic analysis environment Drakvuf. Drakvuf initially implemented file saving functionality based on the use of undocumented mechanisms for working with the system cache. The author of this article proposes a new approach to obtaining the content of files on Microsoft Windows family systems using Drakvuf. The proposed approach is based solely on the use of the public interface of the kernel by the hypervisor and provides portability between different versions of the operating system. In the conclusion of the article, the advantages and disadvantages of both approaches are presented, and directions for further work are proposed.
123-146
Abstract
The article presents models and algorithms to support end-to-end quality control of complex software and hardware systems through the implementation of the software-controlled process of development and verification of formal models of requirements and architecture of such systems, Firstly, we give the analysis of scientific publications and the normative-methodical base in the field of development and application in practice of the model-based approach is given. We establish that least provided by model, algorithmic and software solutions are issues related to the development of a complete and correct set of requirements, as well as the formalization and verification of technical projects of software and hardware systems. To solve the existing problems, we propose to develop a special unified environment for the development, modeling and testing formal models of requirements and architecture of complex software and hardware systems. These models provide an optimal set of interconnected fUML diagrams presented in ALF notation and verified in the fUML virtual machine and using SMT/SAT solvers.
147-162
Abstract
The article describes TLA+ access control model specification for computer systems, ensuring compliance with the mandatory integrity and confidentiality monitoring requirements with considering memory-based covert channels. The distinctive feature of the model is taking into account the characteristics of the lifecycle of electronic documents and their operating procedure. To specify the access control model, Lamport's Temporal Logic of Actions language was chosen (TLA+). Its notation seems to be the closest to generally accepted mathematical notation and its expressive capabilities and tools allow describing and verifying systems specified as finite automata. The following actions are defined in the model: create/delete a subject, read, write, append (blind write), create/delete an object, grant/remove access rights, include an object, exclude a nested object, approve an object (document), archive an object (document), cancel an approved object (document), copy an object (document). The following invariants are also defined: the type invariant (includes checking the compliance of all fields of the object, the compliance of the subject type, the uniqueness of the subject and object identifiers) and the safety invariant (includes checking the confidentiality and integrity labels of the interacting subjects and objects, the correctness of rights assignment procedures).
163-176
Abstract
Requirements play an important role in the process of safety critical software development. To achieve reasonable quality and cost ratio a tool support for requirements management is required. The paper presents a formal definition of a metamodel that is used as a basis of Requality requirements management tool. An experience of implementation of the metamodel is discussed.
177-196
Abstract
This paper presents our experimental work on neural network models for entity-level adverse drug reaction (ADR) classification. Aspect-level sentiment classification, which aims to determine the sentimental class of a specific aspect conveyed in user opinions, have been actively studied for more than 10 years. In the past few years, several neural network models have been proposed to address this problem. While these models have a lot in common, there are some architecture components that distinguish them from each other. We investigate the applicability of neural network models for ADR classification. We conduct extensive experiments on various pharmacovigilance text sources including biomedical literature, clinical narratives, and social media and compare the performance of five state-of-the-art models as well as a feature-rich SVM in terms of the accuracy of ADR classification.
197-212
Abstract
A pure second-order scheme of quasi-characteristics based on a pyramidal stencil is applied to the numerical modelling of non-stationary two-phase flows through porous media with the essentially heterogeneous properties. In contrast to well-known other high-resolution schemes with monotone properties, this scheme preserves a second-order approximation in regions, where discontinuities of solutions arise, as well as monotone properties of numerical solutions in those regions despite of well-known Godunov theorem. It is possible because the scheme under consideration is defined on a non-fixed stencil and is a combination of two high-order approximation scheme solutions with different dispersion properties. A special criterion according to which, one or another admissible solution is chosen, plays a key role in this scheme. A simple criterion with local character suitable for parallel computations is proposed. Some numerical results showing the efficiency of present approach in computations of two-phase flows through porous media with strongly discontinuous penetration coefficients are presented.
213-234
Abstract
State-of-the-art models of dispersion of contamination in the urban environment and industrial areas employ a CFD approach in order to calculate turbulent characteristics of flow around buildings with complex geometry. The main area of application of these models is to facilitate licensing of potentially hazardous facilities and assessment of meteorological conditions in the urban environment. The usage of the most popular commercial CFD software with regard to modelling flows in the urban environment is significantly limited by the requirement for computational mesh refinement near the surface of the building in order to adequately resolve the characteristic scales in the viscous and buffer sublayers. On the other hand, models based on the traditional gaussian approach cannot take into account the complex aerodynamic effects in order to calculate turbulent characteristics of flow around buildings with complex geometry, including the subtleties concerning atmospheric emissions of gas-aerosol substances. Therefore, the authors developed a robust, highly specialized CFD-RANS model and a calculation code for modelling the atmospheric dispersion of contamination under conditions of a complex three-dimensional geometry that do not require mesh refinement. The authors verified this model using extensive database obtained both in the course of field experiments as well as of wind tunnel experiments. The verification results showed that the developed model satisfies the acceptance criteria for the quality of modelling along with foreign general-purpose codes and highly specialized codes.
235-248
Abstract
The paper shows the results of assessing the possibilities of computational fluid dynamics for predicting the motions of a ship with a moonpool and vertical water motions in a moonpool in regular head waves with a zero ship speed. A moonpool is a well which is used in different types of ships such as cable laying, drill and FPSO, survey, research and so on. This well is used for launching and lifting of different devices, divers, rescue bells, cables and risers, which are protected from outboard wind and waves. The results of the numerical simulation in the OpenFOAM software of heave and pitch motions of the DTMB 5415 model in regular head waves with and without ship speed show good agreement with the experimental data. The experiment was organized with a series 60 model, which was equipped with different moonpool shapes modules in regular head waves with a zero ship speed for determining heave and pitch RAOs and vertical water motions in the moonpool. The results do not show any influence of the moonpool for heave and pitch ship motions. These data are necessary for numerical simulation verification. The results of the numerical simulation of experimental research show good agreement, which means the good efficiency of computational fluid dynamics in heave and pitch motions and vertical water motions in moonpool calculation of a ship with a moonpool. Numerical simulation should be advised for calculations during the ship design process.
249-264
Abstract
Based on the RDF-storage, a software tool was developed for creating a knowledge base containing information about the CFD-calculations performed. The software is a set of scripts written in bash and python, which are published under the GNU GPL3 license. The tool is designed to support the user when making research studies that do not have a strict, pre-defined design of experiment or problem solving algorithm. To formalize the description of the calculation stored in the knowledge base, an ontology is created that serves as the information model for the calculation. As an auxiliary mechanism for carrying out an automated comparison of calculations with each other (a mechanism of "comparators" and "features") was developed and also described in the article. In addition to the systematized data storage, the complex provides the possibility of their automated and semi-automated analysis, including the presentation of a set of calculations in the form of an undirected graph, the construction of flat and spatial dependencies, the search for similar calculations, etc. The article gives examples of data processing results for the project on design of channel in the cylinder head of a piston engine.
265-288
Abstract
We describe our efforts towards building a tool that automatically verify high-level functional properties of Ethereum smart contracts against its formal specification that can be given using four different methods: an invariant over contract state or three different types of trace properties. A model of runtime system, the source code of smart contract together with its specification is translated into SMT-solver formula and checked for counter example. We tested the method on simplified version of notorious TheDAO smart-contract, called MiniDAO. Our proof-of-concept tool was able to find a functional property violation of MiniDAO in just several seconds. We believe that the proposed method is indeed useful and deserves deeper investigation.


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)