Vol 27, No 2 (2015)
5-22
Abstract
Modern processors are based on multicore architectures. Such an approach improves the efficiency of processors by decreasing the cost of each processor while increasing its performance. When multicore processors are used in the nodes of an HPC cluster, node resources can be utilized in two ways: (1) a multiprocess program (n×1: running n processes on a node with one thread in each process); or (2) a multiprocess-multithreaded (MPMT) program (1×n: running one process on a node, inside which n threads work sharing the program data of the process). When multiple threads are used within a process, data is usually privatized inside the process to reduce critical sections. In this article we consider the second approach, which yields better results for the parallel application presented here because of the absence of critical sections. An API and a corresponding library have been developed and implemented for MPMT applications. The library allows developing parallel applications that use the MPI interface while running several threads inside each process. A parallel MPMT implementation of FT (Fast Fourier Transform) in Java has been developed and compared with the multiprocess version of FT. Tests on the implemented application show a 9.5-20% performance improvement. Profiling of the developed application shows that the bottleneck of MPMT FT lies mostly in the communication scheme between nodes; improving the communication scheme will bring better results.
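For illustration, a minimal sketch of the 1×n pattern described above follows. It uses only standard Java threads; the per-slice work is a stand-in for the FT kernel, and the MPI exchange between nodes is omitted, so it does not reproduce the authors' API.

    // A minimal sketch of the 1xn (MPMT) pattern: one process per node runs
    // n threads that share the process data, so no critical sections are
    // needed as long as each thread writes a disjoint slice.
    // transformSlice() is a hypothetical placeholder, not the authors' API.
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class MpmtSketch {
        static void transformSlice(double[] data, int from, int to) {
            for (int i = from; i < to; i++) {
                data[i] = data[i] * 2.0;   // stand-in for per-slice FT work
            }
        }

        public static void main(String[] args) throws InterruptedException {
            int nThreads = Runtime.getRuntime().availableProcessors();
            double[] shared = new double[1 << 20];      // data shared by all threads
            ExecutorService pool = Executors.newFixedThreadPool(nThreads);
            int chunk = shared.length / nThreads;
            for (int t = 0; t < nThreads; t++) {
                final int from = t * chunk;
                final int to = (t == nThreads - 1) ? shared.length : from + chunk;
                pool.submit(() -> transformSlice(shared, from, to));
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
            // here the single process would call MPI to exchange results with other nodes
        }
    }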
23-38
Abstract
This article focuses on practical questions of performing program analysis for applications executed using interpreters and virtual machines. Using the Dalvik virtual machine of the Android operating system as the target interpreter layer, we showcase the possibilities of program analysis through automatic extraction and profiling of execution data. The article describes functional modifications to the Dalvik virtual machine that allow it to generate extra statistics and data to be intercepted by a debugger client for further processing. We conclude with a brief overview of a set of experiments performed on standard Android applications, identifying several memory usage trends; lastly, we discuss possible future improvements to the implemented system.
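As an illustration of the interception side described above, the following hypothetical sketch shows a debugger client that reads line-oriented memory statistics from a (modified) VM over a socket; the port number and the record format are assumptions made for this sketch only.

    // Hypothetical debugger client: connects to a VM, reads line-oriented
    // memory statistics, and aggregates them. The port and the
    // "heapUsed:<bytes>" record format are assumptions for illustration only.
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.Socket;

    public class ProfilingClient {
        public static void main(String[] args) throws IOException {
            long maxHeapUsed = 0;
            try (Socket s = new Socket("localhost", 7001);        // assumed port
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    if (line.startsWith("heapUsed:")) {            // assumed record format
                        maxHeapUsed = Math.max(maxHeapUsed,
                                Long.parseLong(line.substring("heapUsed:".length()).trim()));
                    }
                }
            }
            System.out.println("peak heap usage observed: " + maxHeapUsed + " bytes");
        }
    }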
39-52
Abstract
This article presents a dynamic analysis approach to automatic detection of race conditions in multi-threaded Java applications. We use the static Java bytecode instrumentation framework Coffee Machine to avoid the overhead of dynamic instrumentation and to support a more extensive set of Java virtual machines. Our instrumentation routines target bytecode instructions related to multithreading and monitors, as well as basic data access and method call instructions. We perform data race detection with the ThreadSanitizer Offline tool using two distinct methods: happens-before relation tracking and lock set tracking. Our approach has been tested on a set of open source projects.
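The following toy example shows the kind of data race such instrumentation targets: one writer updates a shared counter under a lock while the other does not, so the lock sets of the two accesses have an empty intersection and no happens-before edge orders the writes. It is illustrative input for such a tool, not the tool itself.

    // Illustrative racy program: writer1 updates the shared counter under a
    // lock, writer2 does not, so lock-set analysis finds no common lock and
    // happens-before analysis finds no ordering between the two writes.
    public class RacyCounter {
        private int counter = 0;
        private final Object lock = new Object();

        void writer1() {
            synchronized (lock) { counter++; }   // monitorenter/monitorexit are instrumented
        }

        void writer2() {
            counter++;                           // unsynchronized access: reported as a race
        }

        public static void main(String[] args) throws InterruptedException {
            RacyCounter c = new RacyCounter();
            Thread t1 = new Thread(c::writer1);  // thread start/join are also instrumented
            Thread t2 = new Thread(c::writer2);
            t1.start(); t2.start();
            t1.join(); t2.join();
            System.out.println(c.counter);
        }
    }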
53-64
Abstract
This paper describes a static analysis tool for finding program entities, their metrics, and the relations between entities. Program entities are files/directories (physical structure) and classes/functions/methods/global variables (logical structure). Relations are connections between entities, such as calls, inheritance, aggregation, reading/writing, and inclusion. We describe the methods for developing such a tool for the C/C++ languages based on open source components: the LLVM/Clang compiler infrastructure and the GNU Binutils linker and archiver. We sketch the changes that had to be made to the Clang compiler and discuss the analysis methods implemented in the LLVM analyzer. We briefly present the results of testing our tool on Android OS.
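A hypothetical sketch of an entity/relation model of the kind described above is given below; the real tool extracts this information from Clang and Binutils, and all names here are illustrative.

    // Hypothetical entity/relation model: entities are files, directories,
    // classes, functions, methods and globals; relations connect pairs of
    // entities (call, inheritance, aggregation, read/write, inclusion).
    import java.util.ArrayList;
    import java.util.List;

    public class ProgramModel {
        enum EntityKind { FILE, DIRECTORY, CLASS, FUNCTION, METHOD, GLOBAL_VARIABLE }
        enum RelationKind { CALL, INHERITANCE, AGGREGATION, READ, WRITE, INCLUSION }

        record Entity(EntityKind kind, String name) { }
        record Relation(RelationKind kind, Entity from, Entity to) { }

        final List<Entity> entities = new ArrayList<>();
        final List<Relation> relations = new ArrayList<>();

        public static void main(String[] args) {
            ProgramModel m = new ProgramModel();
            Entity main = new Entity(EntityKind.FUNCTION, "main");
            Entity helper = new Entity(EntityKind.FUNCTION, "helper");
            m.entities.add(main);
            m.entities.add(helper);
            m.relations.add(new Relation(RelationKind.CALL, main, helper));
            System.out.println(m.relations);
        }
    }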
65-92
Abstract
The deterministic replay technique can be used for debugging, improving reliability and robustness, software development, and incident investigation (including reverse engineering of malware). The paper describes an implementation of deterministic replay of IA-32 based boards in QEMU. Another implementation of this technique in QEMU has been published previously, but it uses a significantly different approach. Deterministic replay implementation details and features substantially depend on the deterministic area, i.e., the part of the virtual machine whose execution is being replayed. For replay to be deterministic, the implementation must ensure that (1) all information flows across the borders of the deterministic area are logged and then replayed, and (2) there is no non-determinism inside the deterministic area. The proposed approach is called «Min VM» because it is based on a minimal deterministic area, while the former one can be called «Max VM», as it attempts to stretch the deterministic area to cover the whole virtual machine. The proposed approach has the advantages of lower time overhead in the logging phase and easier maintenance (because it is much easier to ensure determinism of a small deterministic area). On the other hand, its shortcoming is a larger log size, mostly because the deterministic area does not include hard disks, so all data flows from the disks are logged. This makes the replay log self-sufficient: an image of the original HDD is not needed to replay the execution. The implementation has been tested on popular operating systems: Windows XP, Windows 7, and GNU/Linux 3.12. The current implementation shows a 6-42% slowdown depending on the application code, compared with the 17-79% slowdown of the previous approach.
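The following conceptual sketch, written in Java for brevity, illustrates the record/replay contract at the border of the deterministic area: data entering the area is logged during recording and fed back from the log during replay. It is not QEMU code.

    // Conceptual record/replay contract at the deterministic-area border:
    // every non-deterministic input (e.g. a disk read result) is saved during
    // recording and returned verbatim from the log during replay.
    import java.util.ArrayDeque;
    import java.util.Deque;

    public class BorderLog {
        enum Mode { RECORD, REPLAY }

        private final Mode mode;
        private final Deque<byte[]> log = new ArrayDeque<>();

        BorderLog() { this.mode = Mode.RECORD; }            // recording run
        BorderLog(Deque<byte[]> recorded) {                 // replaying run
            this.mode = Mode.REPLAY;
            this.log.addAll(recorded);
        }

        // Called whenever data crosses into the deterministic area.
        byte[] crossBorder(byte[] liveData) {
            if (mode == Mode.RECORD) {
                log.addLast(liveData.clone());   // save the non-deterministic input
                return liveData;
            }
            return log.removeFirst();            // feed back exactly the recorded input
        }

        Deque<byte[]> recordedLog() { return log; }

        public static void main(String[] args) {
            BorderLog rec = new BorderLog();
            rec.crossBorder(new byte[] { 42 });              // recording phase
            BorderLog rep = new BorderLog(rec.recordedLog());
            System.out.println(rep.crossBorder(null)[0]);    // replay returns 42
        }
    }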
93-104
Abstract
The paper describes a method for detecting semantic errors that arise from incorrect copy-paste of code performed by the developer. The method consists of two basic parts. The first part detects code clones based on lexical analysis of the program. A sequence of tokens is constructed using the LLVM lexer, and then all pairs of maximal, non-intersecting matched token sequences are detected. The pairs of identical subsequences are then partially parsed to retain the constructs allowed by the programming language and to remove incomplete sequences. When the remaining subsequences are big enough, the second stage is applied to them. A Program Dependence Graph (PDG) is constructed for the corresponding function code, and then the subgraphs of the identical subsequences are considered. If two subgraphs have shared vertices, the outgoing edges of these vertices are analyzed. This allows detecting semantic errors with high accuracy. The described method is implemented for the LLVM/Clang compiler, so semantic mistakes are detected at program compile time and there is no need for a separate lexical and semantic analysis of the program. A number of widely used open source libraries and software systems were analyzed. The paper contains the list of detected semantic errors for Linux kernel 2.6 and Android 4.3. For these systems, the true positive rate achieved by our approach is above 65%.
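A simplified sketch of the first stage is given below: it finds a maximal pair of matching contiguous token runs in two token streams using classic longest-common-substring dynamic programming. The real tool operates on LLVM lexer tokens and finds all maximal non-intersecting pairs.

    // Simplified clone-detection kernel: longest matching contiguous token
    // run between two token streams (longest-common-substring DP).
    public class TokenCloneSketch {
        // Returns {startA, startB, length} of the longest matching token run.
        static int[] longestMatch(String[] a, String[] b) {
            int bestLen = 0, bestA = -1, bestB = -1;
            int[][] dp = new int[a.length + 1][b.length + 1];
            for (int i = 1; i <= a.length; i++) {
                for (int j = 1; j <= b.length; j++) {
                    if (a[i - 1].equals(b[j - 1])) {
                        dp[i][j] = dp[i - 1][j - 1] + 1;
                        if (dp[i][j] > bestLen) {
                            bestLen = dp[i][j];
                            bestA = i - bestLen;
                            bestB = j - bestLen;
                        }
                    }
                }
            }
            return new int[] { bestA, bestB, bestLen };
        }

        public static void main(String[] args) {
            String[] f = { "if", "(", "p", ")", "free", "(", "p", ")", ";" };
            String[] g = { "if", "(", "q", ")", "free", "(", "p", ")", ";" };
            int[] m = longestMatch(f, g);
            System.out.println("match of length " + m[2] + " at " + m[0] + "/" + m[1]);
        }
    }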
105-126
Abstract
In this paper a method for detecting memory violations is considered. The method is applied to program binaries and does not require debug information. It allows finding memory violations such as out-of-bounds reads from or writes to a buffer. The technique is based on dynamic analysis and symbolic execution. We present a tool implementing the method. We used this tool to find 11 bugs in both Linux and Windows programs, 7 of which were undocumented at the time this paper was written.
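The following heavily simplified sketch illustrates the bound check at the heart of such analyses: the feasible values of a symbolic buffer index are represented here by an interval, and an access is flagged if the interval escapes the buffer bounds. Real tools query an SMT solver over full path constraints instead.

    // Heavily simplified bound check: a symbolic index is modeled by the
    // interval of its feasible values; a possible out-of-bounds access is
    // reported if the interval escapes [0, size).
    public class BoundCheckSketch {
        record Interval(long min, long max) { }   // feasible values of a symbolic index

        static boolean mayViolate(Interval index, long bufferSize) {
            return index.min() < 0 || index.max() >= bufferSize;
        }

        public static void main(String[] args) {
            // e.g. index = one input byte (0..255) used to address a 128-byte buffer
            Interval idx = new Interval(0, 255);
            System.out.println("possible out-of-bounds: " + mayViolate(idx, 128));
        }
    }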
127-144
Abstract
Reverse debugging is a software development technique that effectively helps to fix bugs caused by nondeterministic program behavior. It allows inspecting past program states without rerunning the program. The paper describes an implementation of software reverse debugging based on deterministic replay in the QEMU emulator. We present ways to improve debugging performance by reducing the amount of saved data, using a copy-on-write snapshot format, and indexing/compressing the replay log. QEMU supports the common user interface for reverse debugging in the GDB debugger, which allows using the reverse-continue, reverse-nexti, reverse-stepi and reverse-finish commands. The time required to execute these commands depends on how frequently snapshots are taken while recording the replay log. This paper also presents an assessment of the snapshot frequency that gives the best performance.
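The following sketch illustrates the usual strategy behind such reverse commands: to stop at instruction N-1, load the nearest snapshot taken at or before that point and deterministically replay forward. The bookkeeping shown is schematic and does not mirror QEMU's data structures.

    // Schematic reverse-step planning: pick the nearest snapshot not after the
    // target instruction count, then replay forward the remaining instructions.
    import java.util.TreeMap;

    public class ReverseStepSketch {
        // maps instruction count -> snapshot identifier
        private final TreeMap<Long, String> snapshots = new TreeMap<>();

        void takeSnapshot(long instructionCount, String snapshotId) {
            snapshots.put(instructionCount, snapshotId);
        }

        // Returns {instruction count of the snapshot to load, instructions to replay}.
        long[] planReverseStep(long currentInstruction) {
            long target = currentInstruction - 1;
            Long snapAt = snapshots.floorKey(target);   // nearest snapshot not after target
            if (snapAt == null) throw new IllegalStateException("no earlier snapshot");
            return new long[] { snapAt, target - snapAt };
        }

        public static void main(String[] args) {
            ReverseStepSketch r = new ReverseStepSketch();
            r.takeSnapshot(0, "s0");
            r.takeSnapshot(1_000_000, "s1");
            long[] plan = r.planReverseStep(1_000_042);
            System.out.println("load snapshot at " + plan[0] + ", replay " + plan[1] + " instructions");
        }
    }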
145-160
Abstract
The paper presents a model-based approach to conformance testing of TLS implementations. It discusses the formal model of the TLS protocol and the structure of the test suite. The JavaTesK tool, based on UniTESK technology, was used to develop the test suite. A set of fuzz operators was developed for general data types and included in the test suite. We applied the test suite to several popular TLS client implementations and present brief results. The approach has proved its efficiency: various errors and vulnerabilities were found in all of the chosen TLS implementations.
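As an example of what a fuzz operator for general data types may look like, the sketch below corrupts a big-endian length prefix so that it disagrees with the actual payload; the message layout is an assumption made for illustration and is not taken from the paper's operator set.

    // Illustrative fuzz operator: overwrite a 2-byte big-endian length field
    // so that the declared length disagrees with the real payload length.
    import java.util.Arrays;

    public class LengthFuzzOperator {
        static byte[] mutateLength(byte[] message, int offset, int newValue) {
            byte[] mutated = Arrays.copyOf(message, message.length);
            mutated[offset] = (byte) ((newValue >>> 8) & 0xFF);
            mutated[offset + 1] = (byte) (newValue & 0xFF);
            return mutated;
        }

        public static void main(String[] args) {
            byte[] record = { 0x16, 0x03, 0x03, 0x00, 0x05, 1, 2, 3, 4, 5 }; // toy record
            byte[] fuzzed = mutateLength(record, 3, 0xFFFF);  // claim a huge payload
            System.out.println(Arrays.toString(fuzzed));
        }
    }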
161-172
Abstract
Algebraic models of programs considered in this paper generalize two models of programs introduced by A.A. Lyapunov and A.A. Letichevsky. The theory of these models focuses on the equivalence checking problem for program schemata, which are a formalization of imperative programs. We prove that this problem is decidable for a wide class of algebraic models of programs. Our decision techniques are based on the approach to the equivalence checking problem for finite state automata; the aim of this paper is to reveal this relationship.
173-188
Abstract
The article belongs to the theory of program schemes, which are objects designed for the analysis of formalized programs. The structure of such schemes is represented by graphs. Program schemes with procedures are considered. A subclass of program schemes, called primitive, is defined. A methodology for the construction of full systems of equivalent transformations (E.T.) is given. This methodology is then successfully applied to construct a full system of E.T. for the primitive subclass of balanced gateway program models with left cancellation. The construction is based on the known system of E.T. for similar program models without procedures. An auxiliary type of schemes, called multi-exit schemes, is used. Further research topics are given in the conclusion.
189-220
Abstract
The problem of parallel computation of the value of a function of a multiset of values recorded at the vertices of a directed strongly connected graph is considered. The computation is performed by automata located at the graph vertices. Each automaton has only local information about the graph: it "knows" only the arcs outgoing from the vertex it resides in, but it "does not know" where (to which vertices) those arcs go. The automata exchange messages with each other, transmitted along the graph arcs, which play the role of message transfer channels. The computation is initiated by a message coming from outside to the automaton located at the initial vertex of the graph. At the end of its work, this automaton sends the calculated function value outside. Two algorithms are proposed to solve this problem. The first algorithm analyzes the graph. Its purpose is to mark the graph by changing the states of the automata at the vertices. This marking is used by the second algorithm, which calculates the function value. The calculation is based on a pulsation algorithm: first, request messages spread from the automaton of the initial vertex over the graph and should reach every vertex, and then response messages are sent from each vertex back to the initial vertex. In fact, the pulsation algorithm calculates aggregate functions, for which the value of the function of a union of multisets is computed from the values of the function on these multisets. However, it is shown that any function F(x) has an aggregate extension; that is, it can be calculated as H(G(x)), where G is an aggregate function. Note that the marking of the graph does not depend on the function to be calculated. This means that the marking of the graph is carried out once; after that, it can be reused for calculating various functions. Since the automata located at different vertices of the graph work in parallel, both the graph marking and the function calculation are performed in parallel; this is the first feature of this work. The second feature is that the computations are performed on a dynamically changing graph: its arcs can disappear, reappear, or change their target vertices. The constraints imposed on the graph changes are as weak as possible while still allowing the problem to be solved in limited time. Estimates of the running time are given for both algorithms.
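A small worked example of the aggregate extension F(x) = H(G(x)) may help: the average of a multiset is not an aggregate function (averages of parts do not determine the average of the union), but G = (sum, count) is aggregate, and H(sum, count) = sum/count recovers the average. The sketch below is illustrative only.

    // Aggregate extension of the average: G(x) = (sum, count) is aggregate
    // (its value on a union is computed from its values on the parts), and
    // H(sum, count) = sum / count recovers F(x) = average(x).
    public class AggregateExtension {
        record SumCount(double sum, long count) {
            SumCount combine(SumCount other) {
                return new SumCount(sum + other.sum, count + other.count);
            }
            static SumCount of(double... values) {
                double s = 0;
                for (double v : values) s += v;
                return new SumCount(s, values.length);
            }
        }

        static double h(SumCount g) { return g.sum() / g.count(); }  // H

        public static void main(String[] args) {
            SumCount left = SumCount.of(1, 2, 3);     // values collected along one branch
            SumCount right = SumCount.of(10, 20);     // values collected along another
            System.out.println(h(left.combine(right)));  // average of the whole multiset: 7.2
        }
    }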
221-250
Abstract
Finite state transducers extend finite state automata to model functions on strings or lists. They may also be used as simple models of sequential reactive programs. Such programs operate in permanent interaction with the environment, receiving data (requests) from it. On receiving a piece of data, the program performs a sequence of actions. When certain control points are reached, the program outputs the current result of the computation as a response. It is significant that different sequences of actions may yield the same result. Therefore, the basic actions of a program may be viewed as generating elements of some appropriate semigroup, and the result of a computation may be regarded as the composition of the actions performed by the program. This paper offers an alternative technique for the analysis of finite state transducers over semigroups. To check the equivalence of two transducers we associate with them a Labeled Transition System (LTS). Each path in this LTS represents all possible runs of the two transducers on the same input word. Every node of the LTS keeps track of the states of both transducers reached after reading some input word and of the deficiency of the output words computed so far. If both transducers reach their final states and the deficiency of their outputs is nonzero, this indicates that they produce different images for the same word and, hence, are not equivalent. The nodes of the LTS that capture this effect are called rejecting nodes. Thus, the equivalence checking of two transducers is reduced to checking the reachability of rejecting nodes in the LTS. We show that one needs to analyze only a bounded fragment of the LTS to certify (un)reachability of rejecting nodes. The size of this fragment is polynomial in the size of the transducers if both are deterministic, and single-exponential if they are k-bounded. The same approach is applicable to checking the k-valuedness of transducers over semigroups.
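The sketch below illustrates the node structure of such an LTS for the special case of transducers with string outputs (the free monoid), which simplifies the paper's semigroup setting: a node stores the pair of states reached on a common input word together with the deficiency, i.e., the part of one output not yet matched by the other, and a node is rejecting when both states are final and the deficiency is nonzero, or when the outputs have already diverged.

    // LTS node for string-output transducers: a pair of states plus the
    // pending (unmatched) output of one transducer relative to the other.
    import java.util.Optional;

    public class LtsNodeSketch {
        // Exactly one of pending1/pending2 is non-empty while outputs are consistent.
        record Node(int state1, int state2, String pending1, String pending2) {
            boolean rejecting(boolean final1, boolean final2) {
                return final1 && final2 && (!pending1.isEmpty() || !pending2.isEmpty());
            }
        }

        // Extends a node with the outputs produced on one more input symbol.
        // Returns empty if the outputs diverge, which also certifies non-equivalence.
        static Optional<Node> step(Node n, int next1, int next2, String out1, String out2) {
            String total1 = n.pending1() + out1;
            String total2 = n.pending2() + out2;
            if (total1.startsWith(total2)) {
                return Optional.of(new Node(next1, next2, total1.substring(total2.length()), ""));
            }
            if (total2.startsWith(total1)) {
                return Optional.of(new Node(next1, next2, "", total2.substring(total1.length())));
            }
            return Optional.empty();   // outputs conflict: the transducers are not equivalent
        }

        public static void main(String[] args) {
            Node start = new Node(0, 0, "", "");
            // transducer 1 outputs "ab", transducer 2 outputs "a" on the same input symbol
            Node n = step(start, 1, 1, "ab", "a").orElseThrow();
            System.out.println("deficiency: \"" + n.pending1() + "\"");               // "b"
            System.out.println("rejecting if both states are final: " + n.rejecting(true, true));
        }
    }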
ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)