
Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Vol 30, No 2 (2018)
7-24
Abstract
Interpreted execution of computer programs, with its capabilities and advantages, is well covered in the computer science literature. Its key feature is reflection: the ability to access and modify the source code at run time. At the same time, interpreted execution has its shortcomings: lower performance; higher code fragility caused by the possibility of changing code at run time; more complicated static analysis; runtime-tied ecosystems. In some cases, such as embedded systems, runtimes and interpreted code are impractical or impossible, and compiled code with zero dependencies is the only option. Pure compiled execution can be treated as a paradigm directly opposite to reflection-powered interpretation. If the primary trait of interpreted execution is reflection, then pure compilation should cleanly separate development time from run time. This implies that no part of the translator is available at run time, that no runtime libraries are required, and, finally, that there is no dependence on implementation details such as variable names. While interpretation is widely popular, compiled execution can be a conscious choice not only for low-level applications but for other cases as well. The dichotomy between low-level languages and expressive reflection-enabled languages is a false one. It should be possible to create an expressive, purely compiled programming language, and such a language might be equally suitable both for system programming and for application development.
25-44
Abstract
This work is dedicated to parallelizing programs in especially difficult cases: when the algorithm used is purely sequential, no parallel alternatives to it exist, and its execution time is unacceptably high. Various methods of parallelizing software implementations of such algorithms and balancing the resulting computational load are considered, making it possible to obtain significant performance gains for application programs that use purely sequential algorithms. These methods are illustrated by their application to two algorithms used in a dynamic binary code analysis toolset. The main goal of this paper is to show that the use of a purely sequential algorithm in a software implementation does not necessarily make its sequential execution inevitable. The proposed methods of parallelizing implementations of such algorithms and balancing the resulting computational load can help to develop efficient parallel programs that fully utilize the hardware capabilities of modern computing systems.
45-64
Abstract
Relational programming is an approach that allows programs to be executed in different "directions" to obtain different behaviors from one relational specification. Developing relational programs directly is a complex task that requires certain skills from the developer. However, in many cases the required relational program can be obtained automatically from a functional program. In this paper, the problem of automatically converting functional programs into relational ones is considered. The main contribution of the paper is a method of converting typed functions into relational form, together with proofs of its static and dynamic correctness. This method can be applied to typed programs of a general kind. To describe these programs, a compact ML-like language (a subset of OCaml) is used, equipped with a Hindley-Milner type system with let-polymorphism. The paper also discusses the limitations of the proposed method, presents an implementation for a subset of the OCaml language, and evaluates the method on a number of realistic examples.
65-80
Abstract
This paper proposes a method for automated generation of machine instruction decoders for various processor architectures, mainly microcontrollers. Only minimal, high-level input from the user is required: a set of assembly instruction templates and a list of register names. The method utilizes the target architecture's assembler to reveal the mapping of assembly-level instructions onto their binary encodings by mutating variables in the templates. The recovered mapping is then used as the central part of an architecture-independent decoder. The developed tools make it possible to significantly simplify the support of a large number of different processor architectures, since the proposed file format does not require highly skilled operators. At the same time, automated generation of decoders is performed much faster than manual or semi-automatic development of a corresponding tool (manually describing instruction encodings in a dedicated language). A system based on the proposed method has been implemented and tested on a set of four microcontroller architectures: PIC16F877A, AVR, TriCore, H8/300H. In all cases, the decoding speed of our system is higher than that of publicly available standard tools.
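The core of the template-mutation idea can be sketched as follows: assemble a template with several operand values and XOR the resulting encodings to reveal which bits belong to that operand. The `assemble()` function below is a hypothetical stand-in with a toy encoding, not the paper's actual assembler interface.

```python
# Sketch of the template-mutation idea: assembling an instruction
# template with different operand values and XOR-ing the encodings
# exposes the bit positions of that operand's field.
# assemble() is a toy stand-in for a real target assembler.

def assemble(template: str, reg: int) -> int:
    # Toy 16-bit encoding: fixed opcode 0xA0 in the high byte,
    # register number in bits 4..7.
    return (0xA0 << 8) | (reg << 4)

def operand_bits(template: str, values: list[int]) -> int:
    """Return a mask of encoding bits that vary with the operand."""
    base = assemble(template, values[0])
    mask = 0
    for v in values[1:]:
        mask |= base ^ assemble(template, v)
    return mask

mask = operand_bits("mov r%d, 0", [0, 1, 2, 4, 8])
print(f"register field mask: {mask:#06x}")  # 0x00f0 for this toy encoding
```

Using one-hot operand values (1, 2, 4, 8) guarantees every bit of the field is toggled at least once, so the recovered mask covers the whole field.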
81-98
Abstract
Rendering of 3D scenes with a large number of objects is computationally intensive. Occlusion culling methods are used to decrease the number of objects handled. We consider interactive occlusion culling methods that exploit spatial and temporal coherence. We propose an algorithm that increases rendering performance by using occlusion checks implemented in software. We also propose a heuristic to determine the hierarchy level at which occlusion checking is most efficient. The algorithm is compared with an algorithm based on hardware occlusion queries. Checking for occlusion on the CPU avoids transmission overhead between the CPU and GPU and, as a result, improves rendering performance for 3D scenes with a large number of objects. Section 1 provides an overview of related work as well as the goals and structure of this paper. Section 2 describes the basic formulas used in software rasterization and visibility checks. Section 3 describes the proposed algorithm for removing invisible surfaces. Section 4 presents the results of comparing the performance of the proposed algorithm with the algorithm based on hardware occlusion queries. Section 5 summarizes the main conclusions.
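The essence of a software occlusion check can be illustrated with a tiny depth-buffer sketch (an illustration of the general technique, not the paper's exact algorithm): an object is culled if every pixel of its screen-space bounding rectangle is already covered by a nearer occluder.

```python
# Minimal software occlusion check: rasterize occluders into a small
# depth buffer, then test whether an object's screen-space rectangle
# is fully covered at a nearer depth (smaller z = closer to camera).

W, H = 8, 8
depth = [[1.0] * W for _ in range(H)]   # initialized to the far plane

def rasterize_occluder(x0, y0, x1, y1, z):
    for y in range(y0, y1):
        for x in range(x0, x1):
            depth[y][x] = min(depth[y][x], z)

def is_occluded(x0, y0, x1, y1, z):
    return all(depth[y][x] < z
               for y in range(y0, y1) for x in range(x0, x1))

rasterize_occluder(0, 0, 8, 8, 0.3)      # large near occluder
print(is_occluded(2, 2, 6, 6, 0.7))      # True: object lies behind it
print(is_occluded(2, 2, 6, 6, 0.1))      # False: object is in front
```

Because the whole test runs on the CPU, no round trip to the GPU is needed, which is exactly the overhead the paper's approach avoids.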
99-112
Abstract
The article considers the problem of synthesizing a self-checking integrated control circuit with optimization of structural redundancy, using the Boolean complement method up to the 2-out-of-4 constant-weight code. A method for determining the values of the control functions is developed that makes it possible to define them step by step; this ensures the testability of the corresponding modulo-two adders and of the tester circuit. In this case, don't-care values are introduced into the functions, which makes it possible to minimize the functions themselves and, accordingly, to simplify the check logic block. The method of constructing the control circuit by logical addition for the 2/4 code is universal, and, given that the 2/4 totally self-checking checker has a simple structure, it allows synthesizing embedded control circuits with structural redundancy not exceeding that of a control circuit based on duplication. Such a conclusion can be drawn from a large number of experiments on the use of constant-weight codes in the organization of self-checking control circuits.
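The 2-out-of-4 constant-weight code mentioned above consists of exactly the 4-bit words with two ones, so any single-bit error changes the weight and is detected. A small sketch to illustrate this property:

```python
# The 2-out-of-4 code contains the six 4-bit words of Hamming weight 2;
# a word with any other weight signals an error. Flipping a single bit
# of a codeword always changes its weight, so it is always detected.

from itertools import combinations

codewords = sorted(sum(1 << i for i in c) for c in combinations(range(4), 2))
print([f"{w:04b}" for w in codewords])    # the six 2/4 codewords

def error_detected(word: int) -> bool:
    return bin(word).count("1") != 2      # non-codeword -> error flagged

# every single-bit flip of every codeword is detected
assert all(error_detected(w ^ (1 << i))
           for w in codewords for i in range(4))
```

This weight-check is what the totally self-checking checker (the 2/4-TSC in the abstract) implements in hardware.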
113-148
Abstract
Authentication is associated with a scenario in which some party (the claimant) presents the identity of a principal and claims to be that principal. Authentication allows another party (the verifier) to make sure that this claim is legitimate. Authentication is widely used in systems controlling access to networks and computing resources. In this context, the Extensible Authentication Protocol (EAP), specified by the IETF in RFC 3748, is of considerable interest: it provides an effective mechanism for embedding various authentication methods, as well as EAP authentication methods proper, some of which have been standardized in IETF specifications. This article is a review of the Extensible Authentication Protocol (EAP) and its methods specified by the IETF. EAP provides an effective, flexible authentication mechanism that can easily be extended with new authentication methods. The variety of mechanisms used to implement the authentication service is shown. The work was performed with the support of the Russian Foundation for Basic Research, research grant № 16-07-00603 "The verification of security functionality of the EAP authentication protocol and evaluation of the robustness of its implementations against attacks".
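The extensibility described above rests on a very small packet format: per RFC 3748, every EAP packet carries a Code (Request, Response, Success or Failure), an Identifier, and a Length, and Request/Response packets add a Type byte that selects the authentication method. A minimal parser sketch:

```python
# Parse the EAP packet header defined in RFC 3748:
# Code (1 byte), Identifier (1 byte), Length (2 bytes, network order);
# Request and Response packets are followed by a Type byte and type data.

import struct

CODES = {1: "Request", 2: "Response", 3: "Success", 4: "Failure"}

def parse_eap(packet: bytes) -> dict:
    code, ident, length = struct.unpack(">BBH", packet[:4])
    info = {"code": CODES[code], "identifier": ident, "length": length}
    if code in (1, 2):                    # only Request/Response carry a Type
        info["type"] = packet[4]
        info["type_data"] = packet[5:length]
    return info

# EAP-Request/Identity (Type 1), identifier 7, total length 5
pkt = struct.pack(">BBHB", 1, 7, 5, 1)
print(parse_eap(pkt))
```

New methods extend the protocol simply by registering new Type values, which is why the header itself never changes.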
149-166
Abstract
Graphs are used as a data structure to represent large volumes of information in a form that is compact and convenient for analysis in many areas: bioinformatics, graph databases, static code analysis, etc. In these areas, it is necessary to evaluate queries on large graphs in order to determine dependencies between nodes. The answer to such a query is usually the set of all triples (A, m, n) for which there is a path in the graph from vertex m to vertex n such that the labels on the edges of this path form a string derivable from the nonterminal A in some context-free grammar. Such queries are evaluated using relational query semantics and context-free grammars, but conjunctive grammars form a broader class of grammars than traditional context-free ones. The use of conjunctive grammars in path querying allows us to formulate more complex queries to the graph and to solve a wider class of problems. It is known that path querying with relational query semantics and conjunctive grammars is an undecidable problem. In this paper, we propose a path querying algorithm that uses relational query semantics and conjunctive grammars and approximates the exact solution. The proposed algorithm is based on matrix operations, and our evaluation shows that its performance can be significantly improved by using a graphics processing unit.
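The context-free core of such matrix-based path querying can be sketched in a few lines (the paper extends this scheme to conjunctive grammars; here is only the standard context-free case, in pure Python for readability). Each nonterminal gets a boolean adjacency matrix, and rules in Chomsky normal form are applied via boolean matrix multiplication until a fixpoint:

```python
# Matrix-based context-free path querying (context-free core only).
# Grammar in CNF for { a^n b^n }: S -> A B | A S1, S1 -> S B, A -> a, B -> b.
# Graph: 0 -a-> 1 -a-> 2 -b-> 3 -b-> 4.

n = 5
edges = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]

def matmul(X, Y):                              # boolean matrix product
    return [[any(X[i][k] and Y[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def mor(X, Y):                                 # elementwise OR, change flag
    Z = [[X[i][j] or Y[i][j] for j in range(n)] for i in range(n)]
    return Z, Z != X

M = {nt: [[False] * n for _ in range(n)] for nt in ("S", "A", "B", "S1")}
for i, lbl, j in edges:                        # terminal productions
    M["A" if lbl == "a" else "B"][i][j] = True

changed = True
while changed:                                 # closure over the CNF rules
    changed = False
    for head, (l, r) in [("S", ("A", "B")), ("S", ("A", "S1")),
                         ("S1", ("S", "B"))]:
        M[head], ch = mor(M[head], matmul(M[l], M[r]))
        changed |= ch

print([(i, j) for i in range(n) for j in range(n) if M["S"][i][j]])
```

The pairs found are exactly those connected by a path labeled aⁿbⁿ; because everything reduces to boolean matrix products, the same loop maps naturally onto a GPU, which is the performance lever the paper exploits.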
167-194
Abstract
For a distributed system based on a directed graph without multiple edges and loops, the backtracing problem is considered: how to transfer a message from the final vertex of an arc to its initial vertex. The task is to create a structure on the graph that allows a message to be transmitted from the final vertex of an arc to its initial vertex in minimum time, i.e. along the shortest path. Such a structure is given at each vertex a by a mapping from the number of a vertex b to the number of the outgoing arc through which the shortest path from a to b passes. In particular, such a mapping makes it possible to simulate, in directed distributed systems, graph algorithms developed for undirected distributed systems. This increases the running time of such algorithms by at most a factor of k, where k does not exceed the diameter of the graph, k < n, and n is the number of vertices of the graph. Section 2 describes the asynchronous model of the distributed system used. Section 3 contains the basic definitions and notation, and Section 4 the statement of the problem. Section 5 describes two auxiliary subtree-correction algorithms, whose application makes it possible to construct spanning trees of shortest paths: an out-tree and an in-tree. Section 6 contains a description of various methods for transmitting messages over the graph. In Section 7, two algorithms are proposed for constructing, in the memory of the graph's root automaton, descriptions of the spanning out- and in-trees of shortest paths, and Section 8 presents the algorithms for constructing the required mapping based on them: a "fast" algorithm with T = O(n) and N = O(n), and an "economical" algorithm with T = O(n²) and N = O(1), where T is the running time of the algorithm and N is the number of messages simultaneously transmitted along an arc. In Section 9 it is proved that these time bounds cannot be improved. In Section 10, the "fast" algorithm is modified for a synchronous model with N = 1. The conclusion sums up and outlines directions for further research.
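The mapping described above can be computed centrally with a breadth-first search (the paper constructs it in distributed form; this sketch only shows what the resulting structure looks like): at vertex a, each reachable vertex b is mapped to the index of a's outgoing arc lying on a shortest path from a to b.

```python
# For vertex a, build the map b -> index of a's outgoing arc on a
# shortest path from a to b, via BFS on the directed graph.
# adj[v] lists the out-neighbours of v; the arc index is the position.

from collections import deque

def first_arc_map(adj, a):
    first = {}                          # b -> outgoing-arc index at a
    dist = {a: 0}
    q = deque()
    for k, v in enumerate(adj[a]):      # seed BFS with a's own arcs
        if v not in dist:
            dist[v] = 1
            first[v] = k
            q.append(v)
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                first[v] = first[u]     # inherit the first arc taken
                q.append(v)
    return first

adj = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(first_arc_map(adj, 0))            # {1: 0, 2: 1, 3: 0}
```

To backtrace an arc (u, v), vertex v simply forwards the message along the arc `first_arc_map(adj, v)[u]`, and every intermediate vertex does the same, so the message follows a shortest path back to u.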
195-214
Abstract
The article presents an ontology for the "software usability" domain. The authors provide a review of existing ontologies for the "user interface" and "software" domains. The article describes the advantages that using the ontology can give in the analysis and evaluation of software usability, e.g. an open data structure for easier data sharing. The paper presents a class diagram of the proposed ontology and a textual description of its classes, object properties and data properties. The presented diagrams contain the basic classes (device, session, user, region, variation, image) and the subclasses of the event class. Examples of questions that the ontology can answer are given, e.g. "What commands are issued by users in the region?" and "What variations of the region occur in the session?". The article also presents SPARQL queries that may be used with this ontology. A classification of possible ontology-based methods of usability analysis and evaluation is proposed, and some of them are described. An example of user activity data in the standard RDF/XML format is demonstrated. The described "Software Usability" ontology is designed for representing user activity data. Experiments have shown that the ontology can be used to analyze the usability of desktop applications for Windows operating systems, including the construction of heat maps. A heat map based on the collected user activity data is shown. In addition, the ontology can be used and extended by any researchers involved in the problems of software interface usability, in particular in experiments with different types of interfaces. The authors will welcome comments and suggestions on the refinement and development of the ontology from experts in the domain and are ready to cooperate.
215-250
Abstract
High-quality labeled corpora play a key role in building machine learning systems. Generally, creating such corpora requires human effort, so the annotation process is expensive and time-consuming. Two approaches that optimize annotation are active learning and crowdsourcing. Methods of active learning aim to find the most informative examples for the classifier. At each iteration, an example is chosen from the unlabeled set by an algorithm, it is given to the oracle (an expert) for labeling, and the classifier is retrained on the updated set of training examples. Crowdsourcing is widely used for tasks that cannot be automated and require human effort. To get the most out of crowdsourcing platforms, one needs to solve three problems. The first is quality: algorithms are needed that best recover the true labels from the available ones. One must also keep annotation cost in mind, since solving the quality problem by increasing the number of annotators per example is not always reasonable; this is the second problem. Third, rapid delivery of the labeled corpus is sometimes the decisive factor, and then time delays in task completion must be minimized. This paper surveys existing methods based on these approaches and techniques for combining them. The paper also describes systems that help to reduce the cost of annotation.
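The simplest baseline for the quality problem above is majority voting over redundant annotations (an illustrative sketch; the survey also covers more elaborate aggregation models such as weighting workers by reliability):

```python
# Majority-vote label aggregation: each item is labeled by several
# crowd workers, and the most frequent label is taken as the answer.

from collections import Counter

def majority_vote(annotations):
    """annotations: dict item -> list of labels from different workers."""
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in annotations.items()}

votes = {
    "doc1": ["pos", "pos", "neg"],
    "doc2": ["neg", "neg", "neg"],
}
print(majority_vote(votes))   # {'doc1': 'pos', 'doc2': 'neg'}
```

The cost/quality trade-off is visible even here: each extra vote per item improves the estimate but multiplies the annotation budget.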
251-262
Abstract
In this paper we present a contactless method for analyzing the cardiac activity of a person, based on recording and analyzing a ballistocardiogram signal. The measuring device registers microscopic movements of the body using a highly sensitive piezoelectric sensor. Because of the sensor's high sensitivity, the level of background noise is higher than the signal level, so mathematical methods are used for noise reduction. A Butterworth filter is used to extract the cardiac signal. This approach is more computationally efficient than machine learning-based methods and can be implemented on an edge computing node to which several sensors are connected. The quality of the signal obtained after filtering allows us to detect cardiac cycles. The heartbeat detection algorithm proposed in this paper is also computationally simple enough to be implemented at the edge node. After the preprocessing described above, the data are transmitted to the data center (cloud).
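The kind of edge-friendly beat detection described above can be illustrated with a deliberately simple sketch (not the authors' algorithm): after filtering, a sample is a beat candidate if it is a local maximum above an adaptive threshold derived from the signal statistics.

```python
# Toy heartbeat detector on a filtered signal: mark samples that are
# local maxima and exceed mean + k * standard deviation. Pure Python,
# cheap enough for an edge node; illustrative only.

def detect_beats(signal, k=2.0):
    mean = sum(signal) / len(signal)
    dev = (sum((x - mean) ** 2 for x in signal) / len(signal)) ** 0.5
    thr = mean + k * dev
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > thr
            and signal[i] >= signal[i - 1]
            and signal[i] > signal[i + 1]]

# synthetic filtered signal: flat noise floor with two sharp pulses
sig = [0.0] * 30
sig[10], sig[22] = 5.0, 4.5
print(detect_beats(sig))   # [10, 22]
```

The interval between detected indices then gives the heart rate, and only those compact results need to travel to the cloud.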
263-284
Abstract
In this paper, the particle finite element method (PFEM-2) for the simulation of axisymmetric flows of a viscous incompressible fluid is considered. The equations needed to describe axisymmetric flows with the particle finite element method were obtained under certain assumptions. The numerical model for axisymmetric flows of a viscous incompressible fluid was implemented in the Kratos open-source code on the basis of the existing application for two-dimensional problems. The model was validated on two problems. The first is the Poiseuille problem of viscous flow in a pipe. The numerical solution obtained by the particle finite element method is in satisfactory agreement with the analytical solution of this problem. The second test is the problem of a droplet impacting a deep pool of the same fluid. As comparison with analytical results is impossible, the particle finite element simulation results were compared with numerical results obtained using the open-source package Gerris (the Volume of Fluid method). The droplet-impact simulations produced by the two different codes are also in satisfactory agreement.
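The analytical solution used in the Poiseuille validation is the classical parabolic profile: for a pipe of radius R, dynamic viscosity μ, and axial pressure gradient dp/dx, the axial velocity is u(r) = −(dp/dx)(R² − r²)/(4μ). A small worked example (the parameter values here are illustrative, not the paper's):

```python
# Classical Poiseuille profile for laminar flow in a circular pipe:
# u(r) = -(dp/dx) * (R**2 - r**2) / (4 * mu), zero at the wall (no-slip),
# maximal on the axis.

def poiseuille_u(r, R=0.01, dpdx=-100.0, mu=1.0e-3):
    return -dpdx * (R**2 - r**2) / (4 * mu)

print(poiseuille_u(0.0))    # centerline (maximum) velocity, 2.5 m/s here
print(poiseuille_u(0.01))   # 0.0 at the wall
```

Comparing the simulated radial velocity profile against this closed form at several radii is what "satisfactory agreement" refers to in the abstract.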
285-300
Abstract
In this paper, we describe a variant of the Runge-Kutta Discontinuous Galerkin (RKDG) method for the numerical simulation of 2D gas dynamics flows on structured rectangular meshes. The RKDG algorithm was implemented in C++, building on the experience of the 1D implementation, and was verified on simple tests. The advantage of the RKDG method over the most popular finite volume method (FVM) is discussed: the three basis functions applied in the RKDG approach lead to a considerable decrease of the numerical dissipation rate with respect to FVM. The numerical code contains implementations of the Lax-Friedrichs, HLL and HLLC numerical fluxes, the KXRCF troubled-cell indicator, and the WENO_S and MUSCL slope limiters. Results of an acoustic pulse simulation on a rather coarse mesh with piecewise-linear approximation show good agreement with the analytical solution, in contrast to the FVM numerical solution. For the Sod problem, the results for the propagation of the shock wave, the rarefaction wave and the contact discontinuity illustrate the dependence of the scheme resolution on the choice of numerical fluxes, troubled-cell indicator and limiting technique. The numerical scheme with the MUSCL slope limiter has higher numerical dissipation than the one with the WENO_S limiter. It is shown that in some cases the KXRCF troubled-cell indicator does not detect non-monotonicity of the numerical solution.
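Of the fluxes listed, Lax-Friedrichs is the simplest: F(uL, uR) = ½(f(uL) + f(uR)) − ½α(uR − uL), where α bounds the maximum wave speed. A minimal sketch (applied to the scalar Burgers flux f(u) = u²/2 for illustration; the paper works with the gas dynamics equations):

```python
# Lax-Friedrichs numerical flux at a cell interface with left state uL
# and right state uR; alpha is an upper bound on the local wave speed.
# The -alpha*(uR - uL) term is exactly the numerical dissipation that
# limiters and better fluxes (HLL, HLLC) try to reduce.

def lax_friedrichs(uL, uR, f, alpha):
    return 0.5 * (f(uL) + f(uR)) - 0.5 * alpha * (uR - uL)

f = lambda u: 0.5 * u * u      # scalar Burgers flux, for illustration
print(lax_friedrichs(1.0, 0.0, f, alpha=1.0))   # 0.75 for this pair
```

When uL = uR, the dissipation term vanishes and the flux reduces to the exact physical flux f(u), as it should.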
301-316
Abstract
The main goal of modern hemodynamic simulation is the prediction of blood pressure in the arteries, as well as the study of the combined effect of various factors on the characteristics of the cardiovascular system. Quasi-one-dimensional models of blood flow through blood vessels are the most popular, as they make it possible to model blood flow in the entire vascular system. Since full-scale simulation of the cardiovascular system requires large computational costs, the problem of parallelizing the computation is relevant. In this paper, the efficiency of parallel algorithms was studied for the numerical simulation of blood flow in the quasi-one-dimensional approximation. For simplicity, we consider the simulation of blood flow in a single blood vessel. When constructing the parallel algorithm, the domain decomposition method was applied. In each subdomain, the problem at each time step splits into hyperbolic and parabolic subproblems. To solve the hyperbolic subproblem, an integro-interpolation method based on the MUSCL scheme is used. For time integration, the second-order Runge-Kutta and Adams-Bashforth methods are applied. The Crank-Nicolson method is used to solve the parabolic subproblem. At the junctions between subdomains, the interface conditions are nonlinear systems with three unknowns. These systems are solved using Newton's method. The time-cost structure of solving the test problem in serial mode was obtained using the AMD CodeAnalyst profiler. Computations were carried out on the cluster of the BMSTU "Applied Mathematics" (FN-2) department. The computational results showed that the time gain achieved by using the MPI library does not exceed a few percent compared to using the OpenMP technology. Given the simplicity of parallelizing algorithms with OpenMP, we can choose this technology, but using MPI makes the software package universal (it can work both on shared-memory and on distributed-memory systems).
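The Newton iteration used for the interface conditions follows the standard scheme x_{k+1} = x_k − J(x_k)⁻¹ f(x_k); a scalar sketch (the paper's interface systems have three unknowns, but the iteration is the same in form):

```python
# Scalar Newton's method: x <- x - f(x)/f'(x) until the step is small.
# For the interface systems with three unknowns, f(x)/f'(x) becomes
# solving a 3x3 linear system with the Jacobian at each step.

def newton(f, df, x0, tol=1e-12, itmax=50):
    x = x0
    for _ in range(itmax):
        dx = f(x) / df(x)
        x -= dx
        if abs(dx) < tol:
            break
    return x

root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
print(root)   # converges quadratically to sqrt(2)
```

Quadratic convergence keeps the per-step cost of the interface solve negligible next to the subdomain integration, which is why it suits the time-stepping loop.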


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)